## Description
This change introduces a new canonicalization pattern for the MLIR
Vector dialect that optimizes chains of insertions. The optimization
identifies when a vector is **completely** initialized through a series
of vector.insert operations and replaces the entire chain with a
single `vector.from_elements` operation.
Note that the new pattern does **not** apply when only **some**
elements are set (leaving the rest poison), as MLIR doesn't support
partially-poisoned vectors for now.
**New Pattern: InsertChainFullyInitialized**
* Detects chains of vector.insert operations.
* Validates that all insertions are at static positions, and all
intermediate insertions have only one use.
* Ensures the entire vector is **completely** initialized.
* Replaces the entire chain with a
single `vector.from_elements` operation.
**Refactored Helper Function**
* Extracted `calculateInsertPosition` from
`foldDenseElementsAttrDestInsertOp` to avoid code duplication.
## Example
```mlir
// Before:
%v1 = vector.insert %c10, %v0[0] : i64 into vector<2xi64>
%v2 = vector.insert %c20, %v1[1] : i64 into vector<2xi64>
// After:
%v2 = vector.from_elements %c10, %c20 : vector<2xi64>
```
It also works for multidimensional vectors.
```mlir
// Before:
%v1 = vector.insert %cv0, %v0[0] : vector<3xi64> into vector<2x3xi64>
%v2 = vector.insert %cv1, %v1[1] : vector<3xi64> into vector<2x3xi64>
// After:
%0:3 = vector.to_elements %cv0 : vector<3xi64>
%1:3 = vector.to_elements %cv1 : vector<3xi64>
%v2 = vector.from_elements %0#0, %0#1, %0#2, %1#0, %1#1, %1#2 : vector<2x3xi64>
```
---------
Co-authored-by: Yang Bai <yangb@nvidia.com>
Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
In the FoldArithToVectorOuterProduct pattern, a static cast to vector
type caused an assertion when a scalar type was encountered. It seems
the author meant to use a dyn_cast instead.
This NFC patch handles it by using dyn_cast.
Fold `broadcast(shape_cast(x))` into `broadcast(x)` if the type of x is
compatible with broadcast's result type and the shape_cast only adds or
removes leading unit dimensions.
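For example (an illustrative sketch with hypothetical shapes and value names):
```mlir
// Before: the shape_cast only prepends a leading unit dimension, and %x
// is broadcastable to the final result type.
%0 = vector.shape_cast %x : vector<4xf32> to vector<1x4xf32>
%1 = vector.broadcast %0 : vector<1x4xf32> to vector<2x3x4xf32>

// After:
%1 = vector.broadcast %x : vector<4xf32> to vector<2x3x4xf32>
```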
---------
Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
Co-authored-by: James Newling <james.newling@gmail.com>
This patch updates `vectorizeAsTensorUnpackOp` to support scalable
vectorization by requiring user-specified vector sizes for the _read_ operation
(rather than the _write_ operation) in `linalg.unpack`.
Conceptually, `linalg.unpack` consists of these high-level steps:
* **Read** from the source tensor using `vector.transfer_read`.
* **Transpose** the read value according to the permutation in the
`linalg.unpack` op (via `vector.transpose`).
* **Re-associate** dimensions of the transposed value, as specified by
the op (via `vector.shape_cast`).
* **Write** the result into the destination tensor via
`vector.transfer_write`.
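For a static case, the result might look roughly as follows (a hand-written sketch with hypothetical shapes and value names, not actual vectorizer output):
```mlir
// Unpacking a tiled tensor<2x3x8xf32> (inner_tiles = [8]) into
// tensor<16x3xf32>; the read-vector-sizes here would be [2, 3, 8].
%read = vector.transfer_read %src[%c0, %c0, %c0], %pad
    {in_bounds = [true, true, true]} : tensor<2x3x8xf32>, vector<2x3x8xf32>
%tr = vector.transpose %read, [0, 2, 1]
    : vector<2x3x8xf32> to vector<2x8x3xf32>
%sc = vector.shape_cast %tr : vector<2x8x3xf32> to vector<16x3xf32>
%write = vector.transfer_write %sc, %dst[%c0, %c0]
    {in_bounds = [true, true]} : vector<16x3xf32>, tensor<16x3xf32>
```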
Previously, the vector sizes provided by the user were interpreted as
write-vector sizes. These were used to:
* Infer read-vector sizes using the `inner_tiles` attribute of the unpack op.
* Deduce vector sizes for the transpose and shape cast operations.
* Ultimately determine the vector shape for the write.
However, this logic breaks when one or more tile sizes are dynamic. In such
cases, `vectorizeUnPackOpPrecondition` fails, and vectorization is rejected.
This patch switches the contract: users now directly specify the
"read-vector-sizes", which inherently encode all inner tile sizes - including
dynamic ones. It becomes the user's responsibility to provide valid sizes.
In practice, since `linalg.unpack` is typically constructed, tiled, and
vectorized by the same transformation pipeline, the necessary
"read-vector-sizes" should be recoverable.
This PR ensures parity in folding/canonicalizing of vector.broadcast
(from a scalar) and vector.splat. This means that by using
vector.broadcast instead of vector.splat (which is currently
deprecated), there is no loss in optimizations performed. All tests
which were previously checking folding/canonicalizing of vector.splat
are now done for vector.broadcast. The vector.splat canonicalization
tests are now in a separate file, ready for removal when, in the future,
we remove vector.splat completely.
This PR also adds a canonicalizer to vector.splat to always convert it
to vector.broadcast. This is to reduce the 'traffic' through
vector.splat.
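For example, the new canonicalization rewrites:
```mlir
// Before:
%0 = vector.splat %s : vector<4xf32>
// After:
%0 = vector.broadcast %s : f32 to vector<4xf32>
```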
There is a chance that this PR will break downstream users who create
or expect vector.splat. Changing all such logic to work with
vector.broadcast instead should fix that.
The folder `shape_cast(splat constant) -> splat constant` was first
introduced
[here](36480657d8 (diff-484cea976e0c96459027c951733bf2d22d34c5a0c0de6f577069870ef4588983R2600))
(Nov 2020). In that commit there is a comment to _Only handle splat for
now_. Based on that I assume the intention was to, at a later time,
support a general `shape_cast(constant) -> constant` folder. That is
what this PR does.
One minor downside: with this folder it is possible to end up with 2
large constants instead of 1 large constant and 1 shape_cast:
```mlir
func.func @foo() -> (vector<4xi32>, vector<2x2xi32>) {
  %cst = arith.constant dense<[1, 2, 3, 4]> : vector<4xi32> // 'large' constant 1
  %0 = vector.shape_cast %cst : vector<4xi32> to vector<2x2xi32>
  return %cst, %0 : vector<4xi32>, vector<2x2xi32>
}
```
gets folded with this new folder to
```mlir
func.func @foo() -> (vector<4xi32>, vector<2x2xi32>) {
  %cst = arith.constant dense<[1, 2, 3, 4]> : vector<4xi32> // 'large' constant 1
  %cst_0 = arith.constant dense<[[1, 2], [3, 4]]> : vector<2x2xi32> // 'large' constant 2
  return %cst, %cst_0 : vector<4xi32>, vector<2x2xi32>
}
```
Notes on the above case:
1) This only affects the textual IR; the actual values share the same
context storage (I've verified this by checking pointer values in the
`DenseIntOrFPElementsAttrStorage`
[constructor](da5c442550/mlir/lib/IR/AttributeDetail.h (L59))),
so there is no compile-time memory overhead from this folding. At the
LLVM IR level the constant is shared, too.
2) This only happens when the pre-folded constant cannot be dead-code
eliminated (i.e. when it has 2+ uses), which I don't think is common.
Implements the `inferResultRanges` method from the
`InferIntRangeInterface` interface for `vector.step`. The implementation
is similar to that of `arith.constant`, since the exact result values are
statically known.
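For instance (illustrative only):
```mlir
// The result is known to be exactly [0, 1, 2, 3], so the inferred
// range of its elements is [0, 3].
%0 = vector.step : vector<4xindex>
```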
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
Implements the `inferResultRanges` method from the
`InferIntRangeInterface` interface for `vector.transpose`. The result
ranges simply match the source ranges.
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
This patch extends the pattern that rewrites elementwise operations
whose inputs are all broadcast from the same shape so that it handles
mixed types, such as when the result and input types don't match, or
when the inputs have multiple types.
PR #150867 failed to check for the possibility of type mismatches when
rewriting splat constants. In order to fix that issue, we add support
for mixed-type operations more generally.
There is a pattern that rewrites
elementwise_op(broadcast(x1 : T to U), broadcast(x2 : T to U), ...) to
broadcast(elementwise_op(x1, x2, ...) : T to U).
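A minimal sketch of that rewrite (hypothetical shapes and value names):
```mlir
// Before: both operands are broadcast from vector<4xf32>.
%b0 = vector.broadcast %x : vector<4xf32> to vector<2x4xf32>
%b1 = vector.broadcast %y : vector<4xf32> to vector<2x4xf32>
%r = arith.addf %b0, %b1 : vector<2x4xf32>

// After: the elementwise op runs on the narrower type.
%s = arith.addf %x, %y : vector<4xf32>
%r = vector.broadcast %s : vector<4xf32> to vector<2x4xf32>
```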
This pattern did not, however, account for the case where a broadcast
constant is represented as a SplatElementsAttr, which can safely be
reshaped or scalarized but is not a `vector.broadcast` or `vector.splat`
operation.
This patch fixes this oversight, preventing premature broadcasting.
This did result in the need to update some linalg dialect tests, which
now feature a less-broadcast computation and/or more constant folding.
The crash is caused because, during IR transformation, the
vector-unrolling pass (using ExtractStridedSliceOp) attempts to slice an
input vector of higher rank using a target vector of lower rank, which
is not supported. Fixes #148368.
This PR uses `val.getDefiningOp<OpTy>()` to replace `dyn_cast<OpTy>(val.getDefiningOp())`, `dyn_cast_or_null<OpTy>(val.getDefiningOp())`, and `dyn_cast_if_present<OpTy>(val.getDefiningOp())`.
The Canonicalizer pass had a dependency on the UB dialect which it
shouldn't have. It also no longer needs to depend on the UB dialect
directly, since the Vector dialect (which uses the UB dialect for the
poison index operations introduced by 35df525) already declares this
dependency (878d3594).
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Extends the linalg vectorizer with a path to lower contraction ops
directly into `vector.contract`.
The direct rewriting preserves high-level op semantics and provides a
more progressive lowering compared to reconstructing the contraction
back from a multi-dimensional reduction.
The added lowering focuses on named linalg ops and leverages their well
defined semantics to avoid complex precondition verification.
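For illustration, a hand-written sketch of the direct rewrite for a matmul (transfer reads/writes omitted; shapes and value names hypothetical):
```mlir
// linalg.matmul ins(%A, %B) outs(%C) vectorizes directly into:
%0 = vector.contract {
    indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>,
                     affine_map<(d0, d1, d2) -> (d2, d1)>,
                     affine_map<(d0, d1, d2) -> (d0, d1)>],
    iterator_types = ["parallel", "parallel", "reduction"],
    kind = #vector.kind<add>
  } %vA, %vB, %vC : vector<4x8xf32>, vector<8x16xf32> into vector<4x16xf32>
```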
The new path is optional and disabled by default to avoid changing the
default vectorizer behavior.
In a later PR more shape_cast ops will appear. Specifically, broadcasts
that just prepend ones become shape_cast ops (i.e. volume-preserving
broadcasts are canonicalized to shape_casts). This PR ensures that
broadcast-like shape_cast ops fold at least as well as broadcast ops.
This is done by modifying patterns that target broadcast ops, to target
'broadcast-like' ops. No new patterns are added, the patterns that exist
are just made to match on shape_casts where appropriate.
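For example (an illustrative sketch), a fold like `extract(broadcast(x)) -> x` now also applies to the broadcast-like shape_cast form:
```mlir
// Broadcast-like: the shape_cast only prepends unit dimensions.
%0 = vector.shape_cast %v : vector<4xf32> to vector<1x1x4xf32>
// Folds to %v, just as if %0 were produced by vector.broadcast.
%1 = vector.extract %0[0, 0] : vector<4xf32> from vector<1x1x4xf32>
```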
This PR also includes minor code simplifications: use
`isBroadcastableTo` to simplify `ExtractOpFromBroadcast` and simplify
how broadcast dims are detected in `foldExtractFromBroadcast`. These are
NFC.
---------
Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
This patch deletes `vector.matrix_multiply` and `vector.flat_transpose`,
which are thin wrappers around the corresponding LLVM intrinsics:
- `llvm.intr.matrix.multiply`
- `llvm.intr.matrix.transpose`
These Vector dialect ops did not provide additional semantics or
abstraction beyond the LLVM intrinsics. Their removal simplifies the
lowering pipeline without losing any functionality.
The lowering chains:
- `vector.contract` → `vector.matrix_multiply` →
`llvm.intr.matrix.multiply`
- `vector.transpose` → `vector.flat_transpose` →
`llvm.intr.matrix.transpose`
are now replaced with:
- `vector.contract` → `llvm.intr.matrix.multiply`
- `vector.transpose` → `llvm.intr.matrix.transpose`
This was accomplished by directly replacing:
- `vector::MatrixMultiplyOp` with `LLVM::MatrixMultiplyOp`
- `vector::FlatTransposeOp` with `LLVM::MatrixTransposeOp`
Note: To avoid a build-time dependency from `Vector` to `LLVM`,
relevant transformations are moved from "Vector/Transforms" to
`Conversion/VectorToLLVM`.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Reapply attempt for: https://github.com/llvm/llvm-project/pull/148291
Fix for the build failure reported in:
https://lab.llvm.org/buildbot/#/builders/116/builds/15477
-----
This crash is caused by a mismatch between the distributed type
returned by `getDistributedType` and the intended distributed type for
forOp results.
Solution diff:
20c2cf6766
Example:
```mlir
func.func @warp_scf_for_broadcasted_result(%arg0: index) -> vector<1xf32> {
  %c128 = arith.constant 128 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %2 = gpu.warp_execute_on_lane_0(%arg0)[32] -> (vector<1xf32>) {
    %ini = "some_def"() : () -> (vector<1xf32>)
    %0 = scf.for %arg3 = %c0 to %c128 step %c1 iter_args(%arg4 = %ini) -> (vector<1xf32>) {
      %1 = "some_op"(%arg4) : (vector<1xf32>) -> (vector<1xf32>)
      scf.yield %1 : vector<1xf32>
    }
    gpu.yield %0 : vector<1xf32>
  }
  return %2 : vector<1xf32>
}
```
In this case the distributed type for the forOp result is
`vector<1xf32>` (the result is not distributed, but broadcast to all
lanes instead). However, in this case `getDistributedType` will return a
NULL type. Therefore, if the distributed type can be recovered from the
warpOp, we should always do that first before using
`getDistributedType`.
This enables memref.load/store + vector.load/store support for sub-byte
float types. Since the memref types don't matter for loads/stores, we
still use integer types of equivalent width for the memrefs, with a few
extra bitcasts needed around certain operations.
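For example, a load might be emulated roughly as follows (a sketch, assuming the built-in `f4E2M1FN` sub-byte float type):
```mlir
// The memref keeps an integer element type of equivalent width; a
// bitcast recovers the float value after the load.
%i = memref.load %mem[%idx] : memref<8xi4>
%f = arith.bitcast %i : i4 to f4E2M1FN
```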
There is no direct change needed for vector.load/store support. The
tests added for them are to verify that float types are
supported as well.
Propagating vector.extract when a dynamic position is present can cause
dominance issues and needs better handling. For now, disable propagation
if there is a dynamic position present.
This PR adds a new transformation that turns sequences of `vector.to_elements` and `vector.from_elements` into a binary tree of `vector.shuffle` operations.
(Related RFC:
https://discourse.llvm.org/t/rfc-adding-vector-to-elements-op-to-the-vector-dialect/86779).
Example:
```
%0:4 = vector.to_elements %a : vector<4xf32>
%1:4 = vector.to_elements %b : vector<4xf32>
%2:4 = vector.to_elements %c : vector<4xf32>
%3 = vector.from_elements %0#0, %0#1, %0#2, %0#3,
                          %1#0, %1#1, %1#2, %1#3,
                          %2#0, %2#1, %2#2, %2#3 : vector<12xf32>
==>
%0 = vector.shuffle %a, %b [0, 1, 2, 3, 4, 5, 6, 7] : vector<4xf32>, vector<4xf32>
%1 = vector.shuffle %c, %c [0, 1, 2, 3, -1, -1, -1, -1] : vector<4xf32>, vector<4xf32>
%2 = vector.shuffle %0, %1 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] : vector<8xf32>, vector<8xf32>
```
The algorithm leverages the structured extraction/insertion information
of `vector.to_elements` and `vector.from_elements` operations and builds
a set of intervals to determine the vector length that should be used at
each level of the tree to combine the level inputs in pairs.
There are a few improvements that can be implemented in the future, such
as shuffle mask compression to avoid unnecessarily large vector lengths
with poison values, but I decided to keep things "simpler" and spend
more time documenting the different steps of the algorithm so that
people can follow along.
When the result of an insert op is used as the destination of a
subsequent insert op at the same position, replace the destination of
the subsequent insert with the destination of the previous one. This is
sound because the subsequent insert fully overwrites that position, so
the previous insert cannot affect the final result.
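For example:
```mlir
// Before: the second insert fully overwrites position [0] of %v1.
%v1 = vector.insert %a, %v0[0] : i64 into vector<2xi64>
%v2 = vector.insert %b, %v1[0] : i64 into vector<2xi64>

// After: the second insert takes its dest directly from %v0.
%v2 = vector.insert %b, %v0[0] : i64 into vector<2xi64>
```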
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow for `ShapedType::isStaticDim` to be used in
ADT functions without having to wrap it in a lambda performing the
negation.
Also add the new functions to C and Python bindings.
Renames `populateVectorTransferCollapseInnerMostContiguousDimsPatterns`
as `populateDropInnerMostUnitDimsXferOpPatterns` and updates the
corresponding comments.
This addresses a TODO and makes the difference between these two
`populate*` methods clearer:
* `populateDropUnitDimWithShapeCastPatterns`,
* `populateDropInnerMostUnitDimsXferOpPatterns`.
Context:
`vector.transfer_read` always requires a padding value. Most of its
builders take no `padding` value and assume the safe value of `0`.
However, this should be a conscious choice by the API user, as it makes
it easy to introduce bugs.
For example, while making this patch I found several occasions where
the padding value was not getting propagated (a `vector.transfer_read`
was transformed into another `vector.transfer_read`). These bugs were
always caused by constructors that don't require specifying padding.
Additionally, using `ub.poison` as a possible default value is better:
it indicates the user "doesn't care" about the actual padding value,
while forcing users who do care to specify the padding semantics they
want.
With that in mind, this patch changes the builders of
`vector.transfer_read` to always take a `std::optional<Value> padding`
argument. The argument itself can never be omitted, but for convenience
users can pass `std::nullopt`, which pads the transfer read with
`ub.poison`.
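For example, passing `std::nullopt` yields IR equivalent to (a sketch with hypothetical value names):
```mlir
// Padding with poison makes the "don't care" semantics explicit.
%pad = ub.poison : f32
%v = vector.transfer_read %src[%c0], %pad {in_bounds = [true]}
    : tensor<16xf32>, vector<8xf32>
```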
---------
Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
This patch enforces a restriction in the Vector dialect: the non-indexed
operands of `vector.insert` and `vector.extract` must no longer be 0-D
vectors. In other words, rank-0 vector types like `vector<f32>` are
disallowed as the source or result.
EXAMPLES
--------
The following are now **illegal** (note the use of `vector<f32>`):
```mlir
%0 = vector.insert %v, %dst[0, 0] : vector<f32> into vector<2x2xf32>
%1 = vector.extract %src[0, 0] : vector<f32> from vector<2x2xf32>
```
Instead, use scalars as the source and result types:
```mlir
%0 = vector.insert %v, %dst[0, 0] : f32 into vector<2x2xf32>
%1 = vector.extract %src[0, 0] : f32 from vector<2x2xf32>
```
Note, this change serves three goals. These are summarised below.
## 1. REDUCED AMBIGUITY
By enforcing scalar-only semantics when the result (`vector.extract`)
or source (`vector.insert`) are rank-0, we eliminate ambiguity
in interpretation. Prior to this patch, both `f32` and `vector<f32>`
were accepted.
## 2. MATCH IMPLEMENTATION TO DOCUMENTATION
The current behaviour contradicts the documented intent. For example,
`vector.extract` states:
> Degenerates to an element type if n-k is zero.
This patch enforces that intent in code.
## 3. ENSURE SYMMETRY BETWEEN INSERT AND EXTRACT
With the stricter semantics in place, it’s natural and consistent to
make `vector.insert` behave symmetrically to `vector.extract`, i.e.,
degenerate the source type to a scalar when n = 0.
NOTES FOR REVIEWERS
-------------------
1. Main change is in "VectorOps.cpp", where stricter type checks are
implemented.
2. Test updates in "invalid.mlir" and "ops.mlir" are minor cleanups to
remove now-illegal examples.
3. Lowering changes in "VectorToSCF.cpp" are the main trade-off: we now
require an additional `vector.extract` when a preceding
`vector.transfer_read` generates a rank-0 vector.
RELATED RFC
-----------
* https://discourse.llvm.org/t/rfc-should-we-restrict-the-usage-of-0-d-vectors-in-the-vector-dialect