2741 Commits

Author SHA1 Message Date
Matthias Springer
6422546e99
[mlir][LLVM] Fix conversion of non-standard MLIR float types (#122634)
Certain non-standard float types were directly passed through in the
LLVM type converter, resulting in invalid IR or failed assertions:

```
mlir-opt: mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp:638: FailureOr<Type> mlir::LLVMTypeConverter::convertVectorType(VectorType) const: Assertion `LLVM::isCompatibleVectorType(vectorType) && "expected vector type compatible with the LLVM dialect"' failed.
```

The LLVM type converter should not define invalid type conversion rules
for such types. If there is no type conversion rule, conversion patterns
will not apply to ops with such operand types.
2025-01-12 15:17:12 +01:00
Kazu Hirata
129ec84574
[Conversion] Migrate away from PointerUnion::{is,get} (NFC) (#122421)
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2025-01-10 15:10:17 -08:00
Lukas Sommer
4adeb6cf55
[mlir][spirv] Add convergent attribute to builtin (#122131)
Add the `convergent` attribute to builtin functions and builtin function
calls when lowering SPIR-V non-uniform group functions to LLVM dialect.

---------

Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
2025-01-10 09:15:18 +01:00
Longsheng Mou
9190e1c0ef
[mlir][linalg] Handle reassociationIndices correctly for 0D tensor (#121683)
This PR fixes a bug where a value is assigned to a 0-sized
reassociationIndices, preventing a crash. Fixes #116043.
2025-01-10 09:23:50 +08:00
Andrea Faulds
7724be9728
[mlir][spirv] Do SPIR-V serialization in -test-vulkan-runner-pipeline (#121494)
This commit is a further incremental step toward moving the whole
mlir-vulkan-runner MLIR pass pipeline into mlir-opt (see #73457). The
previous step was b225b3adf7b78387c9fcb97a3ff0e0a1e26eafe2, which moved
all device passes prior to SPIR-V serialization into a new mlir-opt test
pass, `-test-vulkan-runner-pipeline`.

This commit changes how SPIR-V serialization is accomplished for Vulkan
runner tests. Until now, this was done by the Vulkan-specific
ConvertGpuLaunchFuncToVulkanLaunchFunc pass. With this commit, this
responsibility is removed from that pass, and is instead done with the
existing generic GpuModuleToBinaryPass. In addition, the SPIR-V
serialization step is no longer done inside mlir-vulkan-runner, but
rather inside mlir-opt (in the `-test-vulkan-runner-pipeline` pass).
Both of these changes represent a greater alignment between
mlir-vulkan-runner and the other GPU integration tests. Notably, the IR
shapes produced by the mlir-opt pipelines for the Vulkan and SYCL
runners are now much more similar, with both using a gpu.binary op for
the serialized SPIR-V kernel.

In order to enable this, this commit includes these supporting changes:

- ConvertToSPIRVPass is enhanced to support producing the IR shape where
a spirv.module is nested inside a gpu.module, since this is what
GpuModuleToBinaryPass expects.
- ConvertGPULaunchFuncToVulkanLaunchFunc is changed to remove its SPIR-V
serialization functionality, and instead now extracts the SPIR-V from a
gpu.binary operation (as produced by ConvertToSPIRVPass).
- `-test-vulkan-runner-pipeline` now attaches SPIR-V target information
required by GpuModuleToBinaryPass.
- The WebGPU pass option, which had been removed from mlir-vulkan-runner
in the previous commit in this series, is restored as an option to
`-test-vulkan-runner-pipeline` instead, so that the WebGPU pass
continues being inserted into the pipeline just before SPIR-V
serialization.
2025-01-09 17:58:51 +01:00
Pietro Ghiglio
cdd652eb28
[MLIR][GPU] Support bf16 and i1 gpu::shuffles to LLVMSPIRV conversion (#119675)
This PR adds support to the `bf16` and `i1` data types when converting
`gpu::shuffle` to the `LLVMSPV` dialect, by inserting `bitcast` to/from
`i16` (for `bf16`) and extending/truncating to `i8` (for `i1`).
2025-01-09 13:16:18 +01:00
Longsheng Mou
c1d01b2fc2
[mlir][tosa] Add missing verifier for tosa.pad (#120934)
This PR adds a missing verifier for `tosa.pad`, ensuring that the
padding shape matches [2*rank(shape1)] according to V1.0.0
Specification. Fixes #119840.
2025-01-08 10:45:59 +02:00
Matthias Springer
599c739905
[mlir][GPU] Add NVVM-specific cf.assert lowering (#120431)
This commit add an NVIDIA-specific lowering of `cf.assert` to to
`__assertfail`.

Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and
`getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can
be reused.
2025-01-06 12:00:11 +01:00
Matthias Springer
3ace685105
[mlir][Transforms] Support 1:N mappings in ConversionValueMapping (#116524)
This commit updates the internal `ConversionValueMapping` data structure
in the dialect conversion driver to support 1:N replacements. This is
the last major commit for adding 1:N support to the dialect conversion
driver.

Since #116470, the infrastructure already supports 1:N replacements. But
the `ConversionValueMapping` still stored 1:1 value mappings. To that
end, the driver inserted temporary argument materializations (converting
N SSA values into 1 value). This is no longer the case. Argument
materializations are now entirely gone. (They will be deleted from the
type converter after some time, when we delete the old 1:N dialect
conversion driver.)

Note for LLVM integration: Replace all occurrences of
`addArgumentMaterialization` (except for 1:N dialect conversion passes)
with `addSourceMaterialization`.

---------

Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
2025-01-03 16:11:56 +01:00
josel-amd
d622b66a82
Re-introduce Type Conversion on EmitC (#121476)
This PR reintroduces https://github.com/llvm/llvm-project/pull/118940
with a fix for the build issues on
cd9caf3aeed55280537052227f08bb1b41154efd
2025-01-02 14:58:15 +01:00
Matthias Gehre
df728cf1d7 Revert "[MLIR][SCFToEmitC] Convert types while converting from SCF to EmitC (#118940)"
This reverts commit 450c6b02d224245656c41033cc0c849bde2045f3.
2025-01-02 11:55:35 +01:00
josel-amd
450c6b02d2
[MLIR][SCFToEmitC] Convert types while converting from SCF to EmitC (#118940)
Switch from rewrite patterns to conversion patterns. This allows to
perform type conversions together with other parts of the IR. For
example, this allows to convert from index to emit.size_t types.
2025-01-02 11:36:23 +01:00
Ivan Butygin
0e23cb0cc5
[mlir][nfc] GpuToROCDL: Remove some dead code (#121403) 2024-12-31 20:39:31 +03:00
Ivan Butygin
018b32ca1f
Revert "[mlir][nfc] GpuToROCDL: Remove some dead code" (#121402)
Reverts llvm/llvm-project#121395
2024-12-31 18:55:00 +03:00
Ivan Butygin
0b08e095cc
[mlir][nfc] GpuToROCDL: Remove some dead code (#121395) 2024-12-31 18:54:41 +03:00
Jie Fu
fb1dbe24f2 [mlir] Remove extra ';' outside of a function (NFC)
/llvm-project/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp:51:2:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
};
 ^
/llvm-project/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp:97:2:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
};
 ^
2 errors generated.
2024-12-23 22:34:08 +08:00
Matthias Springer
df31fd8a36
[mlir] Fix use-after-return in #117513 (#120968)
Fix a use-after-return in #117513. Free-standing lambdas should not be
defined inside of the `LLVMTypeConverter` constructor because they go
out of scope.
2024-12-23 15:13:42 +01:00
Matthias Springer
3cc311ab86
[mlir][Transforms] Dialect Conversion: No target mat. for 1:N replacement (#117513)
During a 1:N replacement (`applySignatureConversion` or
`replaceOpWithMultiple`), the dialect conversion driver used to insert
two materializations:

* Argument materialization: convert N replacement values to 1 SSA value
of the original type `S`.
* Target materialization: convert original type to legalized type `T`.

The target materialization is unnecessary. Subsequent patterns receive
the replacement values via their adaptors. These patterns have their own
type converter. When they see a replacement value of type `S`, they will
automatically insert a target materialization to type `T`. There is no
reason to do this already during the 1:N replacement. (The functionality
used to be duplicated in `remapValues` and `insertNTo1Materialization`.)

Special case: If a subsequent pattern does not have a type converter, it
does *not* insert any target materializations. That's because the
absence of a type converter indicates that the pattern does not care
about type legality. Therefore, it is correct to pass an SSA value of
type `S` (or any other type) to the pattern.

Note: Most patterns in `TestPatterns.cpp` run without a type converter.
To make sure that the tests still behave the same, some of these
patterns now have a type converter.

This commit is in preparation of adding 1:N support to the conversion
value mapping. Before making any further changes to the mapping
infrastructure, I'd like to make sure that the code base around it (that
uses the mapping) is robust.
2024-12-23 13:27:39 +01:00
Kazu Hirata
9901906035 [mlir] Fix a warning
This patch fixes:

  mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp:535:13: error:
  'applyPatternsAndFoldGreedily' is deprecated: Use
  applyPatternsGreedily() instead [-Werror,-Wdeprecated-declarations]
2024-12-20 10:25:50 -08:00
Jacques Pienaar
09dfc5713d
[mlir] Enable decoupling two kinds of greedy behavior. (#104649)
The greedy rewriter is used in many different flows and it has a lot of
convenience (work list management, debugging actions, tracing, etc). But
it combines two kinds of greedy behavior 1) how ops are matched, 2)
folding wherever it can.

These are independent forms of greedy and leads to inefficiency. E.g.,
cases where one need to create different phases in lowering and is
required to applying patterns in specific order split across different
passes. Using the driver one ends up needlessly retrying folding/having
multiple rounds of folding attempts, where one final run would have
sufficed.

Of course folks can locally avoid this behavior by just building their
own, but this is also a common requested feature that folks keep on
working around locally in suboptimal ways.

For downstream users, there should be no behavioral change. Updating
from the deprecated should just be a find and replace (e.g., `find ./
-type f -exec sed -i
's|applyPatternsAndFoldGreedily|applyPatternsGreedily|g' {} \;` variety)
as the API arguments hasn't changed between the two.
2024-12-20 08:15:48 -08:00
Ivan Butygin
953b07febc
[mlir] AMDGPUToROCDL: RawBufferOpLowering fixes (#120642)
1. We can use `getNumElements()` only for memrefs with trivial layout.
2. Buffer ops expecting sizes in i32 but descriptor values can be either
i32 or i64, add appropriate casts. This implementation is not ideal as
it can overflow, but it's still better than generating broken IR.
2024-12-20 18:09:01 +03:00
Matthias Springer
eb6c4197d5
[mlir][CF] Split cf-to-llvm from func-to-llvm (#120580)
Do not run `cf-to-llvm` as part of `func-to-llvm`. This commit fixes
https://github.com/llvm/llvm-project/issues/70982.

This commit changes the way how `func.func` ops are lowered to LLVM.
Previously, the signature of the entire region (i.e., entry block and
all other blocks in the `func.func` op) was converted as part of the
`func.func` lowering pattern.

Now, only the entry block is converted. The remaining block signatures
are converted together with `cf.br` and `cf.cond_br` as part of
`cf-to-llvm`. All unstructured control flow is not converted as part of
a single pass (`cf-to-llvm`). `func-to-llvm` no longer deals with
unstructured control flow.

Also add more test cases for control flow dialect ops.

Note: This PR is in preparation of #120431, which adds an additional
GPU-specific lowering for `cf.assert`. This was a problem because
`cf.assert` used to be converted as part of `func-to-llvm`.

Note for LLVM integration: If you see failures, add
`-convert-cf-to-llvm` to your pass pipeline.
2024-12-20 13:46:45 +01:00
Matthias Springer
53d080c5b5
[mlir][Arith] Remove arith-to-llvm from func-to-llvm (#120548)
Do not run `arith-to-llvm` as part of `func-to-llvm`. This commit partly
fixes #70982.

Also simplify the pass pipeline for two math dialect integration tests.

Note for LLVM integration: If you see failures, add `arith-to-llvm` to your pass pipeline.
2024-12-20 10:14:04 +01:00
Matthias Springer
0693b9e9cc
[mlir][Vector] Clean up populateVectorToLLVMConversionPatterns (#119975)
Clean up `populateVectorToLLVMConversionPatterns` so that it populates
only conversion patterns. All rewrite patterns that do not lower to LLVM
should be populated into a separate greedy pattern rewrite.

The current combination of rewrite patterns and conversion patterns
triggered an edge case when merging the 1:1 and 1:N dialect conversions.

Depends on #119973.
2024-12-17 11:37:17 +01:00
Matthias Springer
8cd8b5079b
[mlir][Vector] Move mask materialization patterns to greedy rewrite (#119973)
The mask materialization patterns during `VectorToLLVM` are rewrite
patterns. They should run as part of the greedy pattern rewrite and not
the dialect conversion. (Rewrite patterns and conversion patterns are
not generally compatible.)

The current combination of rewrite patterns and conversion patterns
triggered an edge case when merging the 1:1 and 1:N dialect conversions.
2024-12-17 11:26:31 +01:00
Hugo Trachino
3cbc73f71e
[MLIR][Arith] Add CeilFloorDivExpandOpsPatterns to conversion to LLVM (Reland) (#118839)
When running `convert-to-llvm`, `ceildiv` and `floordiv` ops, which do not
have direct llvm conversion pattern, would not get lowered to llvm
dialect. This patch adds CeilFloorDivExpandOpsPatterns to both
`convert-to-llvm` and `arith-to-llvm` (deprecated) lowering those ops to
lower level arith ops which can be lowered to llvm using LLVM
conversion.

Reland of https://github.com/llvm/llvm-project/pull/117305 after
buildbot failures.
See:
https://lab.llvm.org/buildbot/#/builders/80/builds/7168
https://lab.llvm.org/buildbot/#/builders/130/builds/7036
https://lab.llvm.org/buildbot/#/builders/138/builds/7290

Added dependence to ArithTransforms in ArithToLLVM. In previous
discussion, it has been suggested to move the
CeilFloorDivExpandOpsPatterns to ArithUtils but I think linking
ArithTransforms makes more sense as otherwise :
* ArithToLLVM needs a new dependency to ArithUtils
* ArithUtils needs new dependency to ArithTransforms or move the
patterns as well which will create more dependencies
* It creates lots of code motion which makes it hard to review.
2024-12-16 16:15:13 +00:00
Adam Siemieniuk
4c597d42dc
[mlir][xegpu] Support boundary checks only for block instructions (#119380)
Constrains Vector lowering to apply boundary checks only to data
transfers operating on block shapes.

This further aligns lowering with the current Xe instructions'
restrictions.
2024-12-13 10:01:13 +01:00
Benoit Jacob
bdd365825d
[MLIR] Fix ComplexToStandard lowering of complex::MulOp (#119591)
A complex multiplication should lower simply to the familiar 4 real
multiplications, 1 real addition, 1 real subtraction. No special-casing
of infinite or NaN values should be made, instead the complex numbers
should be thought as just vectors of two reals, naturally bottoming out
on the reals' semantics, IEEE754 or otherwise. That is what nearly
everybody else is doing ("nearly" because at the end of this PR
description we pinpoint the actual source of this in C99 `_Complex`),
and this pattern, by trying to do something different, was generating
much larger code, which was much slower and a departure from the
naturally expected floating-point behavior.

This code had originally been introduced in
https://reviews.llvm.org/D105270, which stated this rationale:
> The lowering handles special cases with NaN or infinity like C++.

I don't think that the C++ standard is a particularly important thing to
follow in this instance. What matters more is what people actually do in
practice with complex numbers, which rarely involves the C++
`std::complex` library type.

But out of curiosity, I checked, and the above statement seems
incorrect. The [current C++
standard](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4928.pdf)
library specification for `std::complex` does not say anything about the
implementation of complex multiplication: paragraph `[complex.ops]`
falls back on `[complex.member.ops]` which says:
> Effects: Multiplies the complex value rhs by the complex value *this
and stores the product in *this.

I also checked cppreference which often has useful information in case
something changed in a c++ language revision, but likewise, nothing at
all there:
https://en.cppreference.com/w/cpp/numeric/complex/operator_arith3

Finally, I checked in Compiler Explorer what Clang 19 currently
generates:
https://godbolt.org/z/oY7Ks4j95
That is just the familiar 4 multiplications.... and then there is some
weird check (`fcmp`) and conditionally a call to an external `__mulsc3`.
Googled that, found this StackOverflow answer:
https://stackoverflow.com/a/49438578

Summary: this is not about C++ (this post confirms my reading of the C++
standard not mandating anything about this). This is about C, and it
just happens that this C++ standard library implementation bottoms out
on code shared with the C `_Complex` implementation.

Another nuance missing in that SO answer: this is actually
[implementation-defined
behavior](https://en.cppreference.com/w/c/preprocessor/impl). There are
two modes, controlled by
```c
#pragma STDC CX_LIMITED_RANGE {ON,OFF,DEFAULT}
```
It is implementation-defined which is the default. Clang defaults to
OFF, but that's just Clang. In that mode, the check is required:

https://en.cppreference.com/w/c/language/arithmetic_types#Complex_floating_types
And the specific point in the [C99
standard](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)
is: `G.5.1 Multiplicative operators`.

But set it to ON and the check is gone:
https://godbolt.org/z/aG8fnbYoP

Summary: the argument has moved from C++ to C --- and even there, to
implementation-defined behavior with a standard opt-out mechanism.

Like with C++, I maintain that the C standard is not a particularly
meaningful thing for MLIR to follow here, because people doing business
with complex numbers tend to lower them to real numbers themselves, or
have their own specialized complex types, either way not relying on
C99's `_Complex` type --- and the very poor performance of the
`CX_LIMITED_RANGE OFF` behavior (default in Clang) is certainly a key
reason why people who care prefer to stay away from `_Complex` and
`std::complex`.

A good example that's relevant to MLIR's space is CUDA's `cuComplex`
type (used in the cuBLAS CGEMM interface). Here is its multiplication
function. The comment about competitiveness is interesting: it's not a
quirk of this particular function, it's the spirit underpinning
numerical code that matters.

1bf5cd15c5/v8.0/include/cuComplex.h (L106-L120)
```c

/* This implementation could suffer from intermediate overflow even though
 * the final result would be in range. However, various implementations do
 * not guard against this (presumably to avoid losing performance), so we 
 * don't do it either to stay competitive.
 */
__host__ __device__ static __inline__ cuFloatComplex cuCmulf (cuFloatComplex x,
                                                              cuFloatComplex y)
{
    cuFloatComplex prod;
    prod = make_cuFloatComplex  ((cuCrealf(x) * cuCrealf(y)) - 
                                 (cuCimagf(x) * cuCimagf(y)),
                                 (cuCrealf(x) * cuCimagf(y)) + 
                                 (cuCimagf(x) * cuCrealf(y)));
    return prod;
}
```

Another instance in CUTLASS:
https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/complex.h#L231-L236

Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
2024-12-12 11:11:39 -05:00
Jefferson Le Quellec
81825687b4
[MLIR][GPUToLLVMSPV] Update ConvertGpuOpsToLLVMSPVOps's option (#118818)
## Description

This PR updates the `ConvertGpuOpsToLLVMSPVOps`'s option by replacing
the `index-bitwidth` with a boolean option `use-64bit-index` (similar to
the `ConvertGPUToSPIRV` option).

The reason for this modification is because the
`ConvertGpuOpsToLLVMSPVOps`:
> Generate LLVM operations to be ingested by a SPIR-V backend for gpu
operations

In the context of SPIR-V specifications only two physical addressing
models are allowed: `Physical32` and `Physical64`.

This change guarantees output sanity by preventing invalid or
unsupported index bitwidths from being specified.
2024-12-12 13:35:07 +01:00
arthurqiu
cf27e8ea04
[mlir][LLVM] Fix missing MLIRNVGPUDialect dependency for MLIRGPUToNVVMTransforms (#119306)
This patch adds the missing MLIRNVGPUDialect dependency for
MLIRGPUToNVVMTransforms, which comes from
[LowerGpuOpsToNVVMOps.cpp](7498eaa9ab/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp (L34))
2024-12-10 23:48:11 +01:00
lorenzo chelini
6e2e4d446c Revert "[MLIR][Arith] Add denormal attribute to binary/unary operations (#112700)"
This reverts commit 4a7b56e6e7dd0f83c379ad06b6e81450bc691ba6.

There is no agreement.
2024-12-10 04:18:20 +01:00
Matthias Gehre
1f932825f9
[MLIR][EmitC] arith-to-emitc: Fix lowering of fptoui (#118504)
`arith.fptoui %arg0 : f32 to i16` was lowered to
```
%0 = emitc.cast %arg0 : f32 to ui32
emitc.cast %0 : ui32 to i16
```
and is now lowered to
```
%0 = emitc.cast %arg0 : f32 to ui16
emitc.cast %0 : ui16 to i16
```
2024-12-05 14:50:35 +01:00
Kunwar Grover
a8f927161b
[mlir][Vector] Fix vector.extract lowering to llvm for 0-d vectors (#117731)
The current implementation of lowering to llvm for vector.extract
incorrectly assumes that if the number of indices is zero, the operation
can be folded away. This PR removes this condition and relies on the
folder to do it instead.

This PR also unifies the logic for scalar extracts and slice extracts,
which as a side effect also enables vector.extract lowering for n-d
vector.extract with dynamic inner most dimension. (This was only
prevented by a conservative check in the old implementation)
2024-12-04 17:26:53 +00:00
Andrea Faulds
0a2116f4f9
[mlir][spirv][vector] Support converting vector.from_elements to SPIR-V (#118540)
Closes #118098.
2024-12-04 17:42:06 +01:00
Andrzej Warzynski
52b9d0beb6 Revert "[MLIR][Arith] Add ExpandOps to convertArithToLLVM (#117305)"
Failing bot:
  * https://lab.llvm.org/buildbot/#/builders/138/builds/729

Also, not all discussions have been resolved:
  * https://github.com/llvm/llvm-project/pull/117305#discussion_r1861194201

This reverts commit 2c739dfd53fde0995f91c8a2c11ec803041bac86.
2024-12-04 10:39:14 +00:00
Hugo Trachino
2c739dfd53
[MLIR][Arith] Add ExpandOps to convertArithToLLVM (#117305)
Arith Floor and Ceil ops would not get lowered when running
--convert-arith-to-llvm.
2024-12-04 09:32:10 +00:00
Thomas Preud'homme
720864907d
[TOSA] Use attributes for unsigned rescale (#118075)
Unsigned integer types are uncommon enough in MLIR that there is no
operation to cast a scalar from signless to unsigned and vice versa.
Currently tosa.rescale uses builtin.unrealized_conversion_cast which
does not lower. Instead, this commit introduces optional attributes to
indicate unsigned input or output, named similarly to those in the TOSA
specification. This is more in line with the rest of MLIR where specific
operations rather than values are signed/unsigned.
2024-12-04 09:17:55 +00:00
Frank Schlimbach
79eb406a67
[mlir][mesh, MPI] Mesh2mpi (#104566)
Pass for lowering `Mesh` to `MPI`.
Initial commit lowers `UpdateHaloOp` only.
2024-11-28 09:38:38 +00:00
Victor Perez
a807bbea6f
[MLIR][GPUToLLVMSPV] Use llvm.func attributes to convert gpu.shuffle (#116967)
Use `llvm.func`'s `intel_reqd_sub_group_size` attribute instead of
SPIR-V environment attributes in the `gpu.shuffle` conversion pattern.
This metadata is needed to check the semantics of the operation are
supported, i.e., it has a constant width and its value is equal to the
sub-group size.

As the pass also converts `gpu.func` to `llvm.func`, adding a
discardable attribute of name `intel_reqd_sub_group_size` attribute to
the latter is enough for this pattern to work.

We no longer have a notion of "default" sub-group size, so this
attribute needs to be set in the parent function for `gpu.shuffle`
operations to be converted.

Drop dependency on the SPIR-V dialect as we no longer require creating
attributes from this dialect to lower `gpu.shuffle` instances.

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-11-27 15:04:38 +01:00
Krzysztof Drewniak
3359806817
[mlir][LLVM][MemRef] Lower assume_alignment with operand bundles (#117800)
Now that LLVM allows a operand bundle on assume calls to directly
specify alignment assumptions, change the lowering of
memref.assume_alignment to use that feature instead of the ptrtoint
method.

This makes LLVM's job easier and prevents issues when dealing with
cases where ptrtoint isn't a desired operation (like those with poiner
provenance)
2024-11-26 17:20:39 -06:00
Jakub Kuderski
f4d7586343
[mlir] Use llvm::filter_to_vector. NFC. (#117655)
This got recently added to SmallVectorExtras:
https://github.com/llvm/llvm-project/pull/117460.
2024-11-26 09:11:36 -05:00
lorenzo chelini
4a7b56e6e7
[MLIR][Arith] Add denormal attribute to binary/unary operations (#112700)
Add support for denormal in the Arith dialect (binary and unary
operations).
Denormal are attached to every operation, and they can be of three
different
kinds:

1) ieee, denormal are preserved and processed as defined by IEEE 754
rules.

2) preserve sign, a mode where denormal numbers are flushed to zero, but
the
sign of the zero (+0 or -0) is preserved.

3) positive zero, a mode where all denormal numbers are flushed to
positive zero
(+0), ignoring the sign of the original number.

Denormal refers to both the operands and the result. Currently only
lowering for
ieee is supported.
2024-11-26 11:58:43 +01:00
Andrzej Warzyński
1b2c8f104f
[mlir][linalg] Extract GeneralizePadOpPattern into a standalone transformation (#117329)
Currently, `GeneralizePadOpPattern` is grouped under
`populatePadOpVectorizationPatterns`. However, as noted in #111349, this
transformation "decomposes" rather than "vectorizes" `tensor.pad`. As
such, it functions as:
  * a vectorization _pre-processing_ transformation, not
  * a vectorization transformation itself.

To clarify its purpose, this PR turns `GeneralizePadOpPattern` into a
standalone transformation by:
  * introducing a dedicated `populateDecomposePadPatterns` method,
  * adding a `apply_patterns.linalg.decompose_pad` Transform Dialect Op,
  * removing it from `populatePadOpVectorizationPatterns`.

In addition, to better reflect its role, it is renamed as "decomposition"
rather then "generalization".  This is in line with the recent renaming
of similar ops, i.e. tensor.pack/tensor.unpack Ops in #116439.
2024-11-26 08:11:15 +00:00
Fabian Mora
7498eaa9ab
[mlir][LLVM] Add the ConvertToLLVMAttrInterface and ConvertToLLVMOpInterface interfaces (#99566)
This patch adds the `ConvertToLLVMAttrInterface` and
`ConvertToLLVMOpInterface` interfaces. It also modifies the
`convert-to-llvm` pass to use these interfaces when available.

The `ConvertToLLVMAttrInterface` interfaces allows attributes to
configure conversion to LLVM, including the conversion target, LLVM type
converter, and populating conversion patterns. See the `NVVMTargetAttr`
implementation of this interface for an example of how this interface
can be used to configure conversion to LLVM.

The `ConvertToLLVMOpInterface` interface collects all convert to LLVM
attributes stored in an operation.

Finally, the `convert-to-llvm` pass was modified to use these interfaces
when available. This allows applying `convert-to-llvm` to GPU modules
and letting the `NVVMTargetAttr` decide which patterns to populate.
2024-11-24 10:09:43 -05:00
Matthias Springer
a0ef12c642
[mlir][LLVM] LLVMTypeConverter: Tighten materialization checks (#116532)
This commit adds extra checks to the MemRef argument materializations in
the LLVM type converter. These materializations construct a
`MemRefType`/`UnrankedMemRefType` from the unpacked elements of a MemRef
descriptor or from a bare pointer.

The extra checks ensure that the inputs to the materialization function
are correct. It is possible that a user added extra type conversion
rules that convert MemRef types in a different way and the extra checks
ensure that we construct a MemRef descriptor only if the inputs are what
we expect.

This commit also drops a check around bare pointer materializations:
```
// This is a bare pointer. We allow bare pointers only for function entry
// blocks.
```
This check should not be part of the materialization function. Whether a
MemRef block argument is converted into a MemRef descriptor or a bare
pointer is decided in the lowering pattern. At the point of time when
materialization functions are executed, we already made that decision
and we should just materialize regardless of the input format.
2024-11-24 12:20:09 +09:00
Victor Perez
05fcdd555e
[MLIR][SPIRV-TO-LLVM] Support SPV_INTEL_split_barrier ops (#116648)
Add conversion to LLVM for `SPV_INTEL_split_barrier` operations via
conversion to SPIR-V built-ins.

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-11-22 10:22:07 +01:00
Dragan Mladjenovic
596bfb804b
[MLIR][AMDGPU] Support gpu::ShuffleMode::DOWN lowering in ROCDL (#106237) 2024-11-20 03:00:05 -06:00
Kareem Ergawy
fd3ff2007a
[flang][OpenMP] Add basic support to lower loop directive to MLIR (#114199)
Adds initial support for lowering the `loop` directive to MLIR.

The PR includes basic suport and testing for the following clauses:
 * `collapse`
 * `order`
 * `private`
 * `reduction`

Parent PR: #113911, only the latest commit is relevant to this PR.
2024-11-18 06:23:27 +01:00
Jie Fu
06011fee3a [mlir] Fix -Wsign-compare in ComplexToStandard.cpp (NFC)
/llvm-project/mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp:529:21:
 error: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
  529 |   for (int i = 1; i < coefficients.size(); ++i) {
      |                   ~ ^ ~~~~~~~~~~~~~~~~~~~
1 error generated.
2024-11-18 10:34:16 +08:00
Alexander Belyaev
18ee00323f
[mlir][complex] Add a numerically-stable lowering for complex.expm1. (#115082)
The current conversion to Standard in the MLIR repo is not stable for
small imag(arg).
2024-11-17 17:11:23 -08:00