llvm-project

Author	SHA1	Message	Date
Matthias Springer	6422546e99	[mlir][LLVM] Fix conversion of non-standard MLIR float types (#122634 ) Certain non-standard float types were directly passed through in the LLVM type converter, resulting in invalid IR or failed assertions: ``` mlir-opt: mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp:638: FailureOr<Type> mlir::LLVMTypeConverter::convertVectorType(VectorType) const: Assertion `LLVM::isCompatibleVectorType(vectorType) && "expected vector type compatible with the LLVM dialect"' failed. ``` The LLVM type converter should not define invalid type conversion rules for such types. If there is no type conversion rule, conversion patterns will not apply to ops with such operand types.	2025-01-12 15:17:12 +01:00
Kazu Hirata	129ec84574	[Conversion] Migrate away from PointerUnion::{is,get} (NFC) (#122421 ) Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.	2025-01-10 15:10:17 -08:00
Lukas Sommer	4adeb6cf55	[mlir][spirv] Add convergent attribute to builtin (#122131 ) Add the `convergent` attribute to builtin functions and builtin function calls when lowering SPIR-V non-uniform group functions to LLVM dialect. --------- Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>	2025-01-10 09:15:18 +01:00
Longsheng Mou	9190e1c0ef	[mlir][linalg] Handle reassociationIndices correctly for 0D tensor (#121683 ) This PR fixes a bug where a value is assigned to a 0-sized reassociationIndices, preventing a crash. Fixes #116043.	2025-01-10 09:23:50 +08:00
Andrea Faulds	7724be9728	[mlir][spirv] Do SPIR-V serialization in -test-vulkan-runner-pipeline (#121494 ) This commit is a further incremental step toward moving the whole mlir-vulkan-runner MLIR pass pipeline into mlir-opt (see #73457). The previous step was b225b3adf7b78387c9fcb97a3ff0e0a1e26eafe2, which moved all device passes prior to SPIR-V serialization into a new mlir-opt test pass, `-test-vulkan-runner-pipeline`. This commit changes how SPIR-V serialization is accomplished for Vulkan runner tests. Until now, this was done by the Vulkan-specific ConvertGpuLaunchFuncToVulkanLaunchFunc pass. With this commit, this responsibility is removed from that pass, and is instead done with the existing generic GpuModuleToBinaryPass. In addition, the SPIR-V serialization step is no longer done inside mlir-vulkan-runner, but rather inside mlir-opt (in the `-test-vulkan-runner-pipeline` pass). Both of these changes represent a greater alignment between mlir-vulkan-runner and the other GPU integration tests. Notably, the IR shapes produced by the mlir-opt pipelines for the Vulkan and SYCL runners are now much more similar, with both using a gpu.binary op for the serialized SPIR-V kernel. In order to enable this, this commit includes these supporting changes: - ConvertToSPIRVPass is enhanced to support producing the IR shape where a spirv.module is nested inside a gpu.module, since this is what GpuModuleToBinaryPass expects. - ConvertGPULaunchFuncToVulkanLaunchFunc is changed to remove its SPIR-V serialization functionality, and instead now extracts the SPIR-V from a gpu.binary operation (as produced by ConvertToSPIRVPass). - `-test-vulkan-runner-pipeline` now attaches SPIR-V target information required by GpuModuleToBinaryPass. 
- The WebGPU pass option, which had been removed from mlir-vulkan-runner in the previous commit in this series, is restored as an option to `-test-vulkan-runner-pipeline` instead, so that the WebGPU pass continues being inserted into the pipeline just before SPIR-V serialization.	2025-01-09 17:58:51 +01:00
Pietro Ghiglio	cdd652eb28	[MLIR][GPU] Support bf16 and i1 gpu::shuffles to LLVMSPIRV conversion (#119675 ) This PR adds support to the `bf16` and `i1` data types when converting `gpu::shuffle` to the `LLVMSPV` dialect, by inserting `bitcast` to/from `i16` (for `bf16`) and extending/truncating to `i8` (for `i1`).	2025-01-09 13:16:18 +01:00
Longsheng Mou	c1d01b2fc2	[mlir][tosa] Add missing verifier for `tosa.pad` (#120934 ) This PR adds a missing verifier for `tosa.pad`, ensuring that the padding shape matches [2*rank(shape1)] according to V1.0.0 Specification. Fixes #119840.	2025-01-08 10:45:59 +02:00
Matthias Springer	599c739905	[mlir][GPU] Add NVVM-specific `cf.assert` lowering (#120431 ) This commit adds an NVIDIA-specific lowering of `cf.assert` to `__assertfail`. Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and `getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can be reused.	2025-01-06 12:00:11 +01:00
Matthias Springer	3ace685105	[mlir][Transforms] Support 1:N mappings in `ConversionValueMapping` (#116524 ) This commit updates the internal `ConversionValueMapping` data structure in the dialect conversion driver to support 1:N replacements. This is the last major commit for adding 1:N support to the dialect conversion driver. Since #116470, the infrastructure already supports 1:N replacements. But the `ConversionValueMapping` still stored 1:1 value mappings. To that end, the driver inserted temporary argument materializations (converting N SSA values into 1 value). This is no longer the case. Argument materializations are now entirely gone. (They will be deleted from the type converter after some time, when we delete the old 1:N dialect conversion driver.) Note for LLVM integration: Replace all occurrences of `addArgumentMaterialization` (except for 1:N dialect conversion passes) with `addSourceMaterialization`. --------- Co-authored-by: Markus Böck <markus.boeck02@gmail.com>	2025-01-03 16:11:56 +01:00
josel-amd	d622b66a82	Re-introduce Type Conversion on EmitC (#121476 ) This PR reintroduces https://github.com/llvm/llvm-project/pull/118940 with a fix for the build issues on cd9caf3aeed55280537052227f08bb1b41154efd	2025-01-02 14:58:15 +01:00
Matthias Gehre	df728cf1d7	Revert "[MLIR][SCFToEmitC] Convert types while converting from SCF to EmitC (#118940 )" This reverts commit 450c6b02d224245656c41033cc0c849bde2045f3.	2025-01-02 11:55:35 +01:00
josel-amd	450c6b02d2	[MLIR][SCFToEmitC] Convert types while converting from SCF to EmitC (#118940 ) Switch from rewrite patterns to conversion patterns. This allows to perform type conversions together with other parts of the IR. For example, this allows to convert from index to emit.size_t types.	2025-01-02 11:36:23 +01:00
Ivan Butygin	0e23cb0cc5	[mlir][nfc] GpuToROCDL: Remove some dead code (#121403 )	2024-12-31 20:39:31 +03:00
Ivan Butygin	018b32ca1f	Revert "[mlir][nfc] GpuToROCDL: Remove some dead code" (#121402 ) Reverts llvm/llvm-project#121395	2024-12-31 18:55:00 +03:00
Ivan Butygin	0b08e095cc	[mlir][nfc] GpuToROCDL: Remove some dead code (#121395 )	2024-12-31 18:54:41 +03:00
Jie Fu	fb1dbe24f2	[mlir] Remove extra ';' outside of a function (NFC) /llvm-project/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp:51:2: error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi] }; ^ /llvm-project/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp:97:2: error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi] }; ^ 2 errors generated.	2024-12-23 22:34:08 +08:00
Matthias Springer	df31fd8a36	[mlir] Fix use-after-return in #117513 (#120968 ) Fix a use-after-return in #117513. Free-standing lambdas should not be defined inside of the `LLVMTypeConverter` constructor because they go out of scope.	2024-12-23 15:13:42 +01:00
Matthias Springer	3cc311ab86	[mlir][Transforms] Dialect Conversion: No target mat. for 1:N replacement (#117513 ) During a 1:N replacement (`applySignatureConversion` or `replaceOpWithMultiple`), the dialect conversion driver used to insert two materializations: * Argument materialization: convert N replacement values to 1 SSA value of the original type `S`. * Target materialization: convert original type to legalized type `T`. The target materialization is unnecessary. Subsequent patterns receive the replacement values via their adaptors. These patterns have their own type converter. When they see a replacement value of type `S`, they will automatically insert a target materialization to type `T`. There is no reason to do this already during the 1:N replacement. (The functionality used to be duplicated in `remapValues` and `insertNTo1Materialization`.) Special case: If a subsequent pattern does not have a type converter, it does not insert any target materializations. That's because the absence of a type converter indicates that the pattern does not care about type legality. Therefore, it is correct to pass an SSA value of type `S` (or any other type) to the pattern. Note: Most patterns in `TestPatterns.cpp` run without a type converter. To make sure that the tests still behave the same, some of these patterns now have a type converter. This commit is in preparation of adding 1:N support to the conversion value mapping. Before making any further changes to the mapping infrastructure, I'd like to make sure that the code base around it (that uses the mapping) is robust.	2024-12-23 13:27:39 +01:00
Kazu Hirata	9901906035	[mlir] Fix a warning This patch fixes: mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp:535:13: error: 'applyPatternsAndFoldGreedily' is deprecated: Use applyPatternsGreedily() instead [-Werror,-Wdeprecated-declarations]	2024-12-20 10:25:50 -08:00
Jacques Pienaar	09dfc5713d	[mlir] Enable decoupling two kinds of greedy behavior. (#104649 ) The greedy rewriter is used in many different flows and it has a lot of convenience (work list management, debugging actions, tracing, etc). But it combines two kinds of greedy behavior 1) how ops are matched, 2) folding wherever it can. These are independent forms of greedy, and combining them leads to inefficiency. E.g., cases where one needs to create different phases in lowering and is required to apply patterns in a specific order split across different passes. Using the driver one ends up needlessly retrying folding/having multiple rounds of folding attempts, where one final run would have sufficed. Of course folks can locally avoid this behavior by just building their own, but this is also a commonly requested feature that folks keep on working around locally in suboptimal ways. For downstream users, there should be no behavioral change. Updating from the deprecated API should just be a find and replace (e.g., `find ./ -type f -exec sed -i 's|applyPatternsAndFoldGreedily|applyPatternsGreedily|g' {} \;` variety) as the API arguments haven't changed between the two.	2024-12-20 08:15:48 -08:00
Ivan Butygin	953b07febc	[mlir] AMDGPUToROCDL: RawBufferOpLowering fixes (#120642 ) 1. We can use `getNumElements()` only for memrefs with trivial layout. 2. Buffer ops expecting sizes in i32 but descriptor values can be either i32 or i64, add appropriate casts. This implementation is not ideal as it can overflow, but it's still better than generating broken IR.	2024-12-20 18:09:01 +03:00
Matthias Springer	eb6c4197d5	[mlir][CF] Split `cf-to-llvm` from `func-to-llvm` (#120580 ) Do not run `cf-to-llvm` as part of `func-to-llvm`. This commit fixes https://github.com/llvm/llvm-project/issues/70982. This commit changes the way how `func.func` ops are lowered to LLVM. Previously, the signature of the entire region (i.e., entry block and all other blocks in the `func.func` op) was converted as part of the `func.func` lowering pattern. Now, only the entry block is converted. The remaining block signatures are converted together with `cf.br` and `cf.cond_br` as part of `cf-to-llvm`. All unstructured control flow is now converted as part of a single pass (`cf-to-llvm`). `func-to-llvm` no longer deals with unstructured control flow. Also add more test cases for control flow dialect ops. Note: This PR is in preparation of #120431, which adds an additional GPU-specific lowering for `cf.assert`. This was a problem because `cf.assert` used to be converted as part of `func-to-llvm`. Note for LLVM integration: If you see failures, add `-convert-cf-to-llvm` to your pass pipeline.	2024-12-20 13:46:45 +01:00
Matthias Springer	53d080c5b5	[mlir][Arith] Remove `arith-to-llvm` from `func-to-llvm` (#120548 ) Do not run `arith-to-llvm` as part of `func-to-llvm`. This commit partly fixes #70982. Also simplify the pass pipeline for two math dialect integration tests. Note for LLVM integration: If you see failures, add `arith-to-llvm` to your pass pipeline.	2024-12-20 10:14:04 +01:00
Matthias Springer	0693b9e9cc	[mlir][Vector] Clean up `populateVectorToLLVMConversionPatterns` (#119975 ) Clean up `populateVectorToLLVMConversionPatterns` so that it populates only conversion patterns. All rewrite patterns that do not lower to LLVM should be populated into a separate greedy pattern rewrite. The current combination of rewrite patterns and conversion patterns triggered an edge case when merging the 1:1 and 1:N dialect conversions. Depends on #119973.	2024-12-17 11:37:17 +01:00
Matthias Springer	8cd8b5079b	[mlir][Vector] Move mask materialization patterns to greedy rewrite (#119973 ) The mask materialization patterns during `VectorToLLVM` are rewrite patterns. They should run as part of the greedy pattern rewrite and not the dialect conversion. (Rewrite patterns and conversion patterns are not generally compatible.) The current combination of rewrite patterns and conversion patterns triggered an edge case when merging the 1:1 and 1:N dialect conversions.	2024-12-17 11:26:31 +01:00
Hugo Trachino	3cbc73f71e	[MLIR][Arith] Add CeilFloorDivExpandOpsPatterns to conversion to LLVM (Reland) (#118839 ) When running `convert-to-llvm`, `ceildiv` and `floordiv` ops, which do not have direct llvm conversion pattern, would not get lowered to llvm dialect. This patch adds CeilFloorDivExpandOpsPatterns to both `convert-to-llvm` and `arith-to-llvm` (deprecated) lowering those ops to lower level arith ops which can be lowered to llvm using LLVM conversion. Reland of https://github.com/llvm/llvm-project/pull/117305 after buildbot failures. See: https://lab.llvm.org/buildbot/#/builders/80/builds/7168 https://lab.llvm.org/buildbot/#/builders/130/builds/7036 https://lab.llvm.org/buildbot/#/builders/138/builds/7290 Added dependence to ArithTransforms in ArithToLLVM. In previous discussion, it has been suggested to move the CeilFloorDivExpandOpsPatterns to ArithUtils but I think linking ArithTransforms makes more sense as otherwise : * ArithToLLVM needs a new dependency to ArithUtils * ArithUtils needs new dependency to ArithTransforms or move the patterns as well which will create more dependencies * It creates lots of code motion which makes it hard to review.	2024-12-16 16:15:13 +00:00
Adam Siemieniuk	4c597d42dc	[mlir][xegpu] Support boundary checks only for block instructions (#119380 ) Constrains Vector lowering to apply boundary checks only to data transfers operating on block shapes. This further aligns lowering with the current Xe instructions' restrictions.	2024-12-13 10:01:13 +01:00
Benoit Jacob	bdd365825d	[MLIR] Fix `ComplexToStandard` lowering of `complex::MulOp` (#119591 ) A complex multiplication should lower simply to the familiar 4 real multiplications, 1 real addition, 1 real subtraction. No special-casing of infinite or NaN values should be made, instead the complex numbers should be thought of as just vectors of two reals, naturally bottoming out on the reals' semantics, IEEE754 or otherwise. That is what nearly everybody else is doing ("nearly" because at the end of this PR description we pinpoint the actual source of this in C99 `_Complex`), and this pattern, by trying to do something different, was generating much larger code, which was much slower and a departure from the naturally expected floating-point behavior. This code had originally been introduced in https://reviews.llvm.org/D105270, which stated this rationale: > The lowering handles special cases with NaN or infinity like C++. I don't think that the C++ standard is a particularly important thing to follow in this instance. What matters more is what people actually do in practice with complex numbers, which rarely involves the C++ `std::complex` library type. But out of curiosity, I checked, and the above statement seems incorrect. The [current C++ standard](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4928.pdf) library specification for `std::complex` does not say anything about the implementation of complex multiplication: paragraph `[complex.ops]` falls back on `[complex.member.ops]` which says: > Effects: Multiplies the complex value rhs by the complex value this and stores the product in this. 
I also checked cppreference which often has useful information in case something changed in a c++ language revision, but likewise, nothing at all there: https://en.cppreference.com/w/cpp/numeric/complex/operator_arith3 Finally, I checked in Compiler Explorer what Clang 19 currently generates: https://godbolt.org/z/oY7Ks4j95 That is just the familiar 4 multiplications.... and then there is some weird check (`fcmp`) and conditionally a call to an external `__mulsc3`. Googled that, found this StackOverflow answer: https://stackoverflow.com/a/49438578 Summary: this is not about C++ (this post confirms my reading of the C++ standard not mandating anything about this). This is about C, and it just happens that this C++ standard library implementation bottoms out on code shared with the C `_Complex` implementation. Another nuance missing in that SO answer: this is actually [implementation-defined behavior](https://en.cppreference.com/w/c/preprocessor/impl). There are two modes, controlled by ```c #pragma STDC CX_LIMITED_RANGE {ON,OFF,DEFAULT} ``` It is implementation-defined which is the default. Clang defaults to OFF, but that's just Clang. In that mode, the check is required: https://en.cppreference.com/w/c/language/arithmetic_types#Complex_floating_types And the specific point in the [C99 standard](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf) is: `G.5.1 Multiplicative operators`. But set it to ON and the check is gone: https://godbolt.org/z/aG8fnbYoP Summary: the argument has moved from C++ to C --- and even there, to implementation-defined behavior with a standard opt-out mechanism. 
Like with C++, I maintain that the C standard is not a particularly meaningful thing for MLIR to follow here, because people doing business with complex numbers tend to lower them to real numbers themselves, or have their own specialized complex types, either way not relying on C99's `_Complex` type --- and the very poor performance of the `CX_LIMITED_RANGE OFF` behavior (default in Clang) is certainly a key reason why people who care prefer to stay away from `_Complex` and `std::complex`. A good example that's relevant to MLIR's space is CUDA's `cuComplex` type (used in the cuBLAS CGEMM interface). Here is its multiplication function. The comment about competitiveness is interesting: it's not a quirk of this particular function, it's the spirit underpinning numerical code that matters. `1bf5cd15c5/v8.0/include/cuComplex.h (L106-L120)` ```c /* This implementation could suffer from intermediate overflow even though * the final result would be in range. However, various implementations do * not guard against this (presumably to avoid losing performance), so we * don't do it either to stay competitive. */ __host__ __device__ static __inline__ cuFloatComplex cuCmulf (cuFloatComplex x, cuFloatComplex y) { cuFloatComplex prod; prod = make_cuFloatComplex ((cuCrealf(x) * cuCrealf(y)) - (cuCimagf(x) * cuCimagf(y)), (cuCrealf(x) * cuCimagf(y)) + (cuCimagf(x) * cuCrealf(y))); return prod; } ``` Another instance in CUTLASS: https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/complex.h#L231-L236 Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>	2024-12-12 11:11:39 -05:00
Jefferson Le Quellec	81825687b4	[MLIR][GPUToLLVMSPV] Update ConvertGpuOpsToLLVMSPVOps's option (#118818 ) ## Description This PR updates the `ConvertGpuOpsToLLVMSPVOps`'s option by replacing the `index-bitwidth` with a boolean option `use-64bit-index` (similar to the `ConvertGPUToSPIRV` option). The reason for this modification is because the `ConvertGpuOpsToLLVMSPVOps`: > Generate LLVM operations to be ingested by a SPIR-V backend for gpu operations In the context of SPIR-V specifications only two physical addressing models are allowed: `Physical32` and `Physical64`. This change guarantees output sanity by preventing invalid or unsupported index bitwidths from being specified.	2024-12-12 13:35:07 +01:00
arthurqiu	cf27e8ea04	[mlir][LLVM] Fix missing MLIRNVGPUDialect dependency for MLIRGPUToNVVMTransforms (#119306 ) This patch adds the missing MLIRNVGPUDialect dependency for MLIRGPUToNVVMTransforms, which comes from [LowerGpuOpsToNVVMOps.cpp](`7498eaa9ab/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp (L34)`)	2024-12-10 23:48:11 +01:00
lorenzo chelini	6e2e4d446c	Revert "[MLIR][Arith] Add denormal attribute to binary/unary operations (#112700 )" This reverts commit 4a7b56e6e7dd0f83c379ad06b6e81450bc691ba6. There is no agreement.	2024-12-10 04:18:20 +01:00
Matthias Gehre	1f932825f9	[MLIR][EmitC] arith-to-emitc: Fix lowering of fptoui (#118504 ) `arith.fptoui %arg0 : f32 to i16` was lowered to ``` %0 = emitc.cast %arg0 : f32 to ui32 emitc.cast %0 : ui32 to i16 ``` and is now lowered to ``` %0 = emitc.cast %arg0 : f32 to ui16 emitc.cast %0 : ui16 to i16 ```	2024-12-05 14:50:35 +01:00
Kunwar Grover	a8f927161b	[mlir][Vector] Fix vector.extract lowering to llvm for 0-d vectors (#117731 ) The current implementation of lowering to llvm for vector.extract incorrectly assumes that if the number of indices is zero, the operation can be folded away. This PR removes this condition and relies on the folder to do it instead. This PR also unifies the logic for scalar extracts and slice extracts, which as a side effect also enables vector.extract lowering for n-d vector.extract with dynamic inner most dimension. (This was only prevented by a conservative check in the old implementation)	2024-12-04 17:26:53 +00:00
Andrea Faulds	0a2116f4f9	[mlir][spirv][vector] Support converting vector.from_elements to SPIR-V (#118540 ) Closes #118098.	2024-12-04 17:42:06 +01:00
Andrzej Warzynski	52b9d0beb6	Revert "[MLIR][Arith] Add ExpandOps to convertArithToLLVM (#117305 )" Failing bot: * https://lab.llvm.org/buildbot/#/builders/138/builds/729 Also, not all discussions have been resolved: * https://github.com/llvm/llvm-project/pull/117305#discussion_r1861194201 This reverts commit 2c739dfd53fde0995f91c8a2c11ec803041bac86.	2024-12-04 10:39:14 +00:00
Hugo Trachino	2c739dfd53	[MLIR][Arith] Add ExpandOps to convertArithToLLVM (#117305 ) Arith Floor and Ceil ops would not get lowered when running --convert-arith-to-llvm.	2024-12-04 09:32:10 +00:00
Thomas Preud'homme	720864907d	[TOSA] Use attributes for unsigned rescale (#118075 ) Unsigned integer types are uncommon enough in MLIR that there is no operation to cast a scalar from signless to unsigned and vice versa. Currently tosa.rescale uses builtin.unrealized_conversion_cast which does not lower. Instead, this commit introduces optional attributes to indicate unsigned input or output, named similarly to those in the TOSA specification. This is more in line with the rest of MLIR where specific operations rather than values are signed/unsigned.	2024-12-04 09:17:55 +00:00
Frank Schlimbach	79eb406a67	[mlir][mesh, MPI] Mesh2mpi (#104566 ) Pass for lowering `Mesh` to `MPI`. Initial commit lowers `UpdateHaloOp` only.	2024-11-28 09:38:38 +00:00
Victor Perez	a807bbea6f	[MLIR][GPUToLLVMSPV] Use `llvm.func` attributes to convert `gpu.shuffle` (#116967 ) Use `llvm.func`'s `intel_reqd_sub_group_size` attribute instead of SPIR-V environment attributes in the `gpu.shuffle` conversion pattern. This metadata is needed to check the semantics of the operation are supported, i.e., it has a constant width and its value is equal to the sub-group size. As the pass also converts `gpu.func` to `llvm.func`, adding a discardable attribute of name `intel_reqd_sub_group_size` attribute to the latter is enough for this pattern to work. We no longer have a notion of "default" sub-group size, so this attribute needs to be set in the parent function for `gpu.shuffle` operations to be converted. Drop dependency on the SPIR-V dialect as we no longer require creating attributes from this dialect to lower `gpu.shuffle` instances. --------- Signed-off-by: Victor Perez <victor.perez@codeplay.com>	2024-11-27 15:04:38 +01:00
Krzysztof Drewniak	3359806817	[mlir][LLVM][MemRef] Lower assume_alignment with operand bundles (#117800 ) Now that LLVM allows an operand bundle on assume calls to directly specify alignment assumptions, change the lowering of memref.assume_alignment to use that feature instead of the ptrtoint method. This makes LLVM's job easier and prevents issues when dealing with cases where ptrtoint isn't a desired operation (like those with pointer provenance)	2024-11-26 17:20:39 -06:00
Jakub Kuderski	f4d7586343	[mlir] Use `llvm::filter_to_vector`. NFC. (#117655 ) This got recently added to SmallVectorExtras: https://github.com/llvm/llvm-project/pull/117460.	2024-11-26 09:11:36 -05:00
lorenzo chelini	4a7b56e6e7	[MLIR][Arith] Add denormal attribute to binary/unary operations (#112700 ) Add support for denormal handling in the Arith dialect (binary and unary operations). A denormal mode is attached to every operation, and it can be one of three kinds: 1) ieee, where denormals are preserved and processed as defined by IEEE 754 rules. 2) preserve sign, a mode where denormal numbers are flushed to zero, but the sign of the zero (+0 or -0) is preserved. 3) positive zero, a mode where all denormal numbers are flushed to positive zero (+0), ignoring the sign of the original number. The denormal mode applies to both the operands and the result. Currently only lowering for ieee is supported.	2024-11-26 11:58:43 +01:00
Andrzej Warzyński	1b2c8f104f	[mlir][linalg] Extract `GeneralizePadOpPattern` into a standalone transformation (#117329 ) Currently, `GeneralizePadOpPattern` is grouped under `populatePadOpVectorizationPatterns`. However, as noted in #111349, this transformation "decomposes" rather than "vectorizes" `tensor.pad`. As such, it functions as: * a vectorization _pre-processing_ transformation, not * a vectorization transformation itself. To clarify its purpose, this PR turns `GeneralizePadOpPattern` into a standalone transformation by: * introducing a dedicated `populateDecomposePadPatterns` method, * adding an `apply_patterns.linalg.decompose_pad` Transform Dialect Op, * removing it from `populatePadOpVectorizationPatterns`. In addition, to better reflect its role, it is renamed as "decomposition" rather than "generalization". This is in line with the recent renaming of similar ops, i.e. tensor.pack/tensor.unpack Ops in #116439.	2024-11-26 08:11:15 +00:00
Fabian Mora	7498eaa9ab	[mlir][LLVM] Add the `ConvertToLLVMAttrInterface` and `ConvertToLLVMOpInterface` interfaces (#99566 ) This patch adds the `ConvertToLLVMAttrInterface` and `ConvertToLLVMOpInterface` interfaces. It also modifies the `convert-to-llvm` pass to use these interfaces when available. The `ConvertToLLVMAttrInterface` interfaces allows attributes to configure conversion to LLVM, including the conversion target, LLVM type converter, and populating conversion patterns. See the `NVVMTargetAttr` implementation of this interface for an example of how this interface can be used to configure conversion to LLVM. The `ConvertToLLVMOpInterface` interface collects all convert to LLVM attributes stored in an operation. Finally, the `convert-to-llvm` pass was modified to use these interfaces when available. This allows applying `convert-to-llvm` to GPU modules and letting the `NVVMTargetAttr` decide which patterns to populate.	2024-11-24 10:09:43 -05:00
Matthias Springer	a0ef12c642	[mlir][LLVM] `LLVMTypeConverter`: Tighten materialization checks (#116532 ) This commit adds extra checks to the MemRef argument materializations in the LLVM type converter. These materializations construct a `MemRefType`/`UnrankedMemRefType` from the unpacked elements of a MemRef descriptor or from a bare pointer. The extra checks ensure that the inputs to the materialization function are correct. It is possible that a user added extra type conversion rules that convert MemRef types in a different way and the extra checks ensure that we construct a MemRef descriptor only if the inputs are what we expect. This commit also drops a check around bare pointer materializations: ``` // This is a bare pointer. We allow bare pointers only for function entry // blocks. ``` This check should not be part of the materialization function. Whether a MemRef block argument is converted into a MemRef descriptor or a bare pointer is decided in the lowering pattern. At the point of time when materialization functions are executed, we already made that decision and we should just materialize regardless of the input format.	2024-11-24 12:20:09 +09:00
Victor Perez	05fcdd555e	[MLIR][SPIRV-TO-LLVM] Support SPV_INTEL_split_barrier ops (#116648 ) Add conversion to LLVM for `SPV_INTEL_split_barrier` operations via conversion to SPIR-V built-ins. Signed-off-by: Victor Perez <victor.perez@codeplay.com>	2024-11-22 10:22:07 +01:00
Dragan Mladjenovic	596bfb804b	[MLIR][AMDGPU] Support gpu::ShuffleMode::DOWN lowering in ROCDL (#106237 )	2024-11-20 03:00:05 -06:00
Kareem Ergawy	fd3ff2007a	[flang][OpenMP] Add basic support to lower `loop` directive to MLIR (#114199 ) Adds initial support for lowering the `loop` directive to MLIR. The PR includes basic support and testing for the following clauses: * `collapse` * `order` * `private` * `reduction` Parent PR: #113911, only the latest commit is relevant to this PR.	2024-11-18 06:23:27 +01:00
Jie Fu	06011fee3a	[mlir] Fix -Wsign-compare in ComplexToStandard.cpp (NFC) /llvm-project/mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp:529:21: error: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare] 529 | for (int i = 1; i < coefficients.size(); ++i) { | ~ ^ ~~~~~~~~~~~~~~~~~~~ 1 error generated.	2024-11-18 10:34:16 +08:00
Alexander Belyaev	18ee00323f	[mlir][complex] Add a numerically-stable lowering for complex.expm1. (#115082 ) The current conversion to Standard in the MLIR repo is not stable for small imag(arg).	2024-11-17 17:11:23 -08:00
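
For readers skimming the log, the arithmetic that commit bdd365825d argues for (4 real multiplications, 1 real addition, 1 real subtraction, with no NaN or infinity special-casing) can be sketched in plain C++. This is only an illustration of the formula, not the MLIR lowering pattern itself; `naiveComplexMul` is a hypothetical name and the `std::pair` representation is an assumption.

```cpp
#include <utility>

// Hypothetical helper (not part of MLIR): multiplies two complex numbers
// stored as (real, imaginary) pairs. It treats them as plain vectors of two
// reals and does no NaN/infinity special-casing.
std::pair<float, float> naiveComplexMul(std::pair<float, float> a,
                                        std::pair<float, float> b) {
  // Real part: 2 multiplications and 1 subtraction.
  float re = a.first * b.first - a.second * b.second;
  // Imaginary part: 2 multiplications and 1 addition.
  float im = a.first * b.second + a.second * b.first;
  return {re, im};
}
```

For example, multiplying (1, 2) by (3, 4) with this helper gives (-5, 10), matching (1 + 2i)(3 + 4i) = -5 + 10i.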

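
The `-Wsign-compare` diagnostic quoted in commit 06011fee3a comes from comparing an `int` loop index against the `size_t` returned by `size()`. One common way to avoid it is to give the index an unsigned type, as in the sketch below. The function `sumTail` and its body are hypothetical and only stand in for the flagged loop; the log does not show how the commit actually changed the code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the flagged loop: sums every coefficient except
// the first. The index is std::size_t, so it has the same signedness as
// coefficients.size() and the comparison no longer triggers -Wsign-compare.
double sumTail(const std::vector<double> &coefficients) {
  double sum = 0.0;
  for (std::size_t i = 1; i < coefficients.size(); ++i)
    sum += coefficients[i];
  return sum;
}
```

Because the loop starts at index 1, an empty or single-element vector falls straight through and the function returns 0.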

2741 Commits