This PR builds on the previous PR, #146531, and enables scalable
vectorization for `batch_mmt4d` as well.
---------
Signed-off-by: Ege Beysel <beyselege@gmail.com>
Adds a `linalg-morph-ops` pass to convert an op from one representation to another:
named-op <--> category_op (elementwise, contraction, ..) <--> generic
e.g.
```mlir
%exp = linalg.exp ins(%A : tensor<16x8xf32>) outs(%B : tensor<16x8xf32>) -> tensor<16x8xf32>
```
After `mlir-opt -linalg-morph-ops=named-to-category ..`
```mlir
%0 = linalg.elementwise kind=#linalg.elementwise_kind<exp> ins(%arg0 : tensor<16x8xf32> ..
```
Note: this is a generalization of the existing passes:
`--linalg-generalize-named-ops` is the path `named-op --> generic-op`,
`--linalg-specialize-generic-ops` is the path `named-op <-- generic-op`.
email: quic_mabsar@quicinc.com
This patch updates `vectorizeAsTensorUnpackOp` to support scalable
vectorization by requiring user-specified vector sizes for the _read_ operation
(rather than the _write_ operation) in `linalg.unpack`.
Conceptually, `linalg.unpack` consists of these high-level steps (sketched below):
* **Read** from the source tensor using `vector.transfer_read`.
* **Transpose** the read value according to the permutation in the
`linalg.unpack` op (via `vector.transpose`).
* **Re-associate** dimensions of the transposed value, as specified by the op
(via `vector.shape_cast`).
* **Write** the result into the destination tensor via
`vector.transfer_write`.
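For illustration, here is a rough sketch of these steps for a hypothetical
`tensor<2x4x8x16xf32> -> tensor<16x64xf32>` unpack (the values `%src`, `%dest`,
`%pad` and `%c0` are assumed to be defined; shapes are illustrative and masking
is omitted):
```mlir
// Read the full source (outer-dims x inner-tiles layout).
%read = vector.transfer_read %src[%c0, %c0, %c0, %c0], %pad
    {in_bounds = [true, true, true, true]}
    : tensor<2x4x8x16xf32>, vector<2x4x8x16xf32>
// Interleave outer and inner tile dimensions.
%tr = vector.transpose %read, [0, 2, 1, 3]
    : vector<2x4x8x16xf32> to vector<2x8x4x16xf32>
// Collapse each (outer, inner) pair into one unpacked dimension.
%sc = vector.shape_cast %tr : vector<2x8x4x16xf32> to vector<16x64xf32>
// Write the result into the destination tensor.
%res = vector.transfer_write %sc, %dest[%c0, %c0]
    {in_bounds = [true, true]}
    : vector<16x64xf32>, tensor<16x64xf32>
```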
Previously, the vector sizes provided by the user were interpreted as
write-vector sizes. These were used to:
* Infer read-vector sizes using the `inner_tiles` attribute of the unpack op.
* Deduce vector sizes for the transpose and shape cast operations.
* Ultimately determine the vector shape for the write.
However, this logic breaks when one or more tile sizes are dynamic. In such
cases, `vectorizeUnPackOpPrecondition` fails, and vectorization is rejected.
This patch switches the contract: users now directly specify the
"read-vector-sizes", which inherently encode all inner tile sizes - including
dynamic ones. It becomes the user's responsibility to provide valid sizes.
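For example, for an unpack with a dynamic inner tile, as in the hypothetical
snippet below, the write-vector sizes cannot be used to infer the read shape,
so the read-vector sizes (one per source dimension, e.g. `[2, 4, 8, 16]`) are
supplied directly:
```mlir
%unpacked = linalg.unpack %src
    inner_dims_pos = [0, 1] inner_tiles = [%tile, 16] into %dest
    : tensor<2x4x?x16xf32> -> tensor<?x64xf32>
```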
In practice, since `linalg.unpack` is typically constructed, tiled, and
vectorized by the same transformation pipeline, the necessary
"read-vector-sizes" should be recoverable.
This patch introduces a new helper, `getCollapsedVecType`, and updates
`vectorizeAsTensorUnpackOp` to use it. The motivation stems from improving how
`vector.shape_cast` operations are generated when vectorizing `linalg.unpack`.
Previously, the vectorizer relied on
* `tensor::CollapseShapeOp::inferCollapsedType`
to compute the collapsed vector type. This approach is suboptimal
because:
* `inferCollapsedType` lacks awareness of scalable vector flags.
* Linalg vectorization should not depend on Tensor dialect utilities.
Instead of relocating `inferCollapsedType`, we introduce
`getCollapsedVecType` — a lightweight, specialized hook that:
* Assumes no dynamic sizes.
* Handles scalable flags alongside shape dimensions.
This change also reduces temporary variables in
`vectorizeAsTensorUnpackOp` and paves the way for a cleaner update in
#149293.
Previously, when dropping a unit dim from a pad op with mixed dynamic/static
input/output shapes, the resulting shape would take on the type of the
input, resulting in invalid IR.
Also did some minor cleanup to the formatting of the
`drop_unit_dim_corresponding_to_dynamic_dim` test to make it match the
rest of the file.
---------
Signed-off-by: dan <danimal197@gmail.com>
In https://github.com/llvm/llvm-project/pull/149156, I ensured that we
no longer generate spurious `tensor.empty` ops when vectorizing
`linalg.unpack`.
This follow-up removes leftover code that is now redundant but was
missed both in the original PR and in #150602, which was also meant to clean
up leftover code.
Note that this removes the code for computing the "write-vector-sizes";
instead, these are now fully inferred from the preceding ops.
This PR fixes the computation of padded shapes for convolution-style
affine maps (e.g., d0 + d1) in `PadTilingInterface`. Previously, the
code used the direct sum of loop upper bounds, leading to over-padding.
For example, for the following `conv_2d_nhwc_fhwc` op, when padding only the
`c` dimensions to multiples of 16, it also incorrectly pads the convolved
dimensions and generates the wrong input shapes:
```
%padded = tensor.pad %arg0 low[0, 0, 0, 0] high[0, 1, 1, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
tensor.yield %cst : f32
} : tensor<1x16x16x4xf32> to tensor<1x17x17x16xf32>
%padded_0 = tensor.pad %arg1 low[0, 0, 0, 0] high[0, 0, 0, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
tensor.yield %cst : f32
} : tensor<16x3x3x4xf32> to tensor<16x3x3x16xf32>
%0 = linalg.conv_2d_nhwc_fhwc {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>} ins(%padded, %padded_0 : tensor<1x17x17x16xf32>, tensor<16x3x3x16xf32>) outs(%arg2 : tensor<1x14x14x16xf32>) -> tensor<1x14x14x16xf32>
return %0 : tensor<1x14x14x16xf32>
```
The new implementation uses the maximum accessed index as the input to the
affine map and then adds 1 after aggregating all the terms to obtain the
final padded size. This fixes
https://github.com/llvm/llvm-project/issues/148679.
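With the fix, the convolved dimensions keep their original sizes and only the
`c` dimension is padded; the expected IR for the same example looks roughly as
follows:
```
%padded = tensor.pad %arg0 low[0, 0, 0, 0] high[0, 0, 0, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<1x16x16x4xf32> to tensor<1x16x16x16xf32>
%padded_0 = tensor.pad %arg1 low[0, 0, 0, 0] high[0, 0, 0, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<16x3x3x4xf32> to tensor<16x3x3x16xf32>
```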
Consumer fusion was previously disabled because there may be artificial padding. After [refining the pack op semantics](773e158c64),
we can assume that there is no artificial padding. Thus, the check can
be removed, and we can unconditionally enable consumer fusion when it
is a perfect tiling case.
Signed-off-by: hanhanW <hanhan0912@gmail.com>
This reverts commit
0844812b2e
with a shape fix in
1db4c6b275
The revision restricts the `linalg.pack` op to not have artificial
padding semantics. E.g., the example below is valid without the change, and it
becomes invalid with the change.
```mlir
func.func @foo(%src: tensor<9xf32>) -> tensor<100x8xf32> {
%cst = arith.constant 0.000000e+00 : f32
%dest = tensor.empty() : tensor<100x8xf32>
%pack = linalg.pack %src
padding_value(%cst : f32)
inner_dims_pos = [0]
inner_tiles = [8] into %dest
: tensor<9xf32> -> tensor<100x8xf32>
return %pack : tensor<100x8xf32>
}
```
IMO, it is a misuse if we use pack ops with artificial padding sizes
because the intention of the pack op is to relayout the source based on
target intrinsics, etc. The output shape is expected to be
`tensor<2x8xf32>`. If people need extra padding sizes, they can create a
new pad op followed by the pack op.
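For instance, the example above could be rewritten without artificial padding
as an explicit pad followed by a pack, along these lines (a sketch):
```mlir
%cst = arith.constant 0.000000e+00 : f32
%padded = tensor.pad %src low[0] high[791] {
^bb0(%i: index):
  tensor.yield %cst : f32
} : tensor<9xf32> to tensor<800xf32>
%dest = tensor.empty() : tensor<100x8xf32>
%pack = linalg.pack %padded
    inner_dims_pos = [0]
    inner_tiles = [8] into %dest
    : tensor<800xf32> -> tensor<100x8xf32>
```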
This also makes consumer tiling much easier, because consumer fusion
does not support artificial padding sizes. It is very hard to make it
work without ad-hoc patterns, because the tile sizes are defined with
respect to the source, which implies that there is no core_id/thread_id
available to write padding values to the whole tile.
People may wonder why the pad tiling implementation works. The
answer is that it creates an `if-else` branch to handle the case. In my
experience, this is a struggle during transformation, because most of the
time people only need one side of the branch, given that the tile sizes
are usually greater than the padding sizes. However, the implementation is
conservatively correct in terms of semantics. Given that the
`pack` op was introduced to serve relayout needs better, having
the restriction makes sense to me.
Removed tests:
- `no_bubble_up_pack_extending_dimension_through_expand_cannot_reassociate`
  from `data-layout-propagation.mlir`: it duplicates
  `bubble_up_pack_non_expanded_dims_through_expand` after the shape fix.
- `fuse_pack_consumer_with_untiled_extra_padding` from
  `tile-and-fuse-consumer.mlir`: it was created to exercise artificial padding
  in the consumer fusion implementation.
The other changes in lit tests are just fixing the shape.
---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>
In https://github.com/llvm/llvm-project/pull/149156, I ensured that we
no longer generate spurious `tensor.empty` ops when vectorizing
`linalg.unpack`.
This follow-up removes leftover code that is now redundant but was
missed in the original PR.
The revision restricts the `linalg.pack` op to not have artificial
padding semantics. E.g., the example below is valid without the change, and it
becomes invalid with the change.
```mlir
func.func @foo(%src: tensor<9xf32>) -> tensor<100x8xf32> {
%cst = arith.constant 0.000000e+00 : f32
%dest = tensor.empty() : tensor<100x8xf32>
%pack = linalg.pack %src
padding_value(%cst : f32)
inner_dims_pos = [0]
inner_tiles = [8] into %dest
: tensor<9xf32> -> tensor<100x8xf32>
return %pack : tensor<100x8xf32>
}
```
IMO, it is a misuse if we use pack ops with artificial padding sizes
because the intention of the pack op is to relayout the source based on
target intrinsics, etc. The output shape is expected to be
`tensor<2x8xf32>`. If people need extra padding sizes, they can create a
new pad op followed by the pack op.
This also makes consumer tiling much easier, because consumer fusion
does not support artificial padding sizes. It is very hard to make it
work without ad-hoc patterns, because the tile sizes are defined with
respect to the source, which implies that there is no core_id/thread_id
available to write padding values to the whole tile.
People may wonder why the pad tiling implementation works. The
answer is that it creates an `if-else` branch to handle the case. In my
experience, this is a struggle during transformation, because most of the
time people only need one side of the branch, given that the tile sizes
are usually greater than the padding sizes. However, the implementation is
conservatively correct in terms of semantics. Given that the
`pack` op was introduced to serve relayout needs better, having
the restriction makes sense to me.
Removed tests:
- `no_bubble_up_pack_extending_dimension_through_expand_cannot_reassociate`
  from `data-layout-propagation.mlir`: it duplicates
  `bubble_up_pack_non_expanded_dims_through_expand` after the shape fix.
- `fuse_pack_consumer_with_untiled_extra_padding` from
  `tile-and-fuse-consumer.mlir`: it was created to exercise artificial padding
  in the consumer fusion implementation.
The other changes in lit tests are just fixing the shape.
---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>
This PR uses `val.getDefiningOp<OpTy>()` to replace `dyn_cast<OpTy>(val.getDefiningOp())`, `dyn_cast_or_null<OpTy>(val.getDefiningOp())`, and `dyn_cast_if_present<OpTy>(val.getDefiningOp())`.
The revision folds the tensor.pad/extract_slice ops into
linalg.pack/unpack ops only when it is safe to fold. It is not valid to
have artificial padding.
The documentation improvement and verifier update will be done in a
separate PR (i.e., https://github.com/llvm/llvm-project/pull/149624).
The revision is a step towards it.
---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>
Generalizes `dropUnitDims` to operate on any op implementing the
`IndexingMapOpInterface`. Operation specific creation is handled by
passing a builder that will construct the new operation based on the
dropped dimensions.
---------
Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
Co-authored-by: Kunwar Grover <groverkss@gmail.com>
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
In the linalg ElementwiseOpFusion transform, a prerequisite for the
fusion between a producer and a consumer op is that the producer's output
indexing map associated with the result to be fused must be invertible
(e.g. a simple permutation).
Before this patch, only the first output indexing map was being checked;
this bug produced issues when the operand to fuse was not the 1st result
of the producer op. For example, this situation arises when the producer
op has multiple results because it's the result of previous fusions
where the original result had been preserved: in these cases, the pass
ought to check the indexing map of the result being fused, which is not
necessarily the 1st one.
Signed-off-by: Fabrizio Indirli <Fabrizio.Indirli@arm.com>
Extends the linalg vectorizer with a path to lower contraction ops directly
into `vector.contract`.
The direct rewriting preserves high-level op semantics and provides a more
progressive lowering compared to reconstructing the contraction back from a
multi-dimensional reduction.
The added lowering focuses on named linalg ops and leverages their well-defined
semantics to avoid complex precondition verification.
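For instance, under the new path a matmul-like contraction is rewritten roughly
as in the sketch below (shapes and value names are hypothetical):
```mlir
%res = vector.contract {
    indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>,
                     affine_map<(d0, d1, d2) -> (d2, d1)>,
                     affine_map<(d0, d1, d2) -> (d0, d1)>],
    iterator_types = ["parallel", "parallel", "reduction"],
    kind = #vector.kind<add>}
  %lhs, %rhs, %acc
  : vector<4x8xf32>, vector<8x16xf32> into vector<4x16xf32>
```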
The new path is optional and disabled by default to avoid changing the
default vectorizer behavior.
This happens only when a larger tile size is used, i.e., one that is greater
than or equal to the dimension size. In this case, it is a full slice, so it
is fusible.
The IR can be generated during the TileAndFuse process. It is hard to
fix in such a driver, so we enable the naive fusion for this case.
---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>
If a dimension is not tiled, it is always valid to fuse the pack op,
even if it has padding semantics, because it always generates a full
slice along that dimension.
If a dimension is tiled and does not need extra padding, the fusion
is valid.
The revision also formats corresponding tests for consistency.
---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>
This patch adds support for scalable vectorization of linalg.mmt4d. The
key design change is the introduction of a new vectorizer state variable:
* `assumeDynamicDimsMatchVecSizes`
...along with the corresponding Transform dialect attribute:
* `assume_dynamic_dims_match_vec_sizes`.
This flag instructs the vectorizer to assume that dynamic memref/tensor
dimensions match the corresponding vector sizes (fixed or scalable). With this
assumption, masking becomes unnecessary, which simplifies the lowering pipeline
significantly.
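A hypothetical Transform dialect invocation could look roughly as follows (the
attribute name is the one added here; the target handle and the vector sizes
are illustrative, and the exact syntax may differ):
```mlir
transform.structured.vectorize %mmt4d vector_sizes [1, 1, 1, 8, [8], 1]
    {assume_dynamic_dims_match_vec_sizes} : !transform.any_op
```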
While this assumption is not universally valid, it typically holds for
`linalg.mmt4d`. Inputs and outputs are explicitly packed using `linalg.pack`,
and this packing includes padding, ensuring that dimension sizes align with
vector sizes (*).
* Related discussion: https://github.com/llvm/llvm-project/issues/143920
An upcoming patch will include an end-to-end test that leverages scalable
vectorization of linalg.mmt4d to demonstrate the newly enabled functionality.
This would not be feasible without the changes introduced here, as it would
otherwise require additional logic to handle complex - but ultimately redundant
- masks.
(*) This holds provided that the tile sizes used for packing match the vector
sizes used during vectorization. It is the user’s responsibility to enforce
this.
When collapsing linalg dimensions, we check whether the op's memref operands
are guaranteed to be collapsible. However, we currently assume that the
matching indexing map is the identity map.
This commit modifies this behavior and checks whether the memref is
collapsible on the transformed dimensions.
The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow for `ShapedType::isStaticDim` to be used in
ADT functions without having to wrap it in a lambda performing the
negation.
Also add the new functions to C and Python bindings.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This patch is a follow-up to https://github.com/llvm/llvm-project/pull/146088 and changes the padding value in the linalg vectorizer from `0` to `ub.poison` in the `vector.transfer_read`s created for extracting slices or when vectorizing a generic.
Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
Linalg promotion attempts to compute a constant upper bound for the
allocated buffer size. Only when it fails to compute an upper bound does it
fall back to the original subview size, which may be dynamic.
This adds a promotion option to use the original subview size by default,
thus minimizing the allocation size.
Fixes #144268.
The issue is triggered by
ee070d0816
which checks `TensorLikeType` when downstream projects use the pattern
without registering bufferization::BufferizationDialect. The
registration is needed because the interface implementations for builtin
types are located in `BufferizationDialect::initialize()`. However, we do not
need to fix it via registration; the proper fix is to use the linalg
method, i.e., `hasPureTensorSemantics`.
No additional tests are added because the functionality is well tested
in
[transpose-matmul.mlir](https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/transpose-matmul.mlir).
Reproducing the issue would require a different setup, e.g., writing a
new C++ pass, which does not seem worth it.
Signed-off-by: hanhanW <hanhan0912@gmail.com>
This PR adds a mechanism so that downstream consumers can pass in
control functions for the application of these patterns. This change
shouldn't affect any consumers of this method that do not specify a
controlFn. In each of the patterns, the controlFn always receives the
source operand of the consumer as a parameter.
In IREE, we (will) use it to prevent folding patterns that
would inhibit fusion. See IREE issue
[#20896](https://github.com/iree-org/iree/issues/20896) for more
details.
In both `bubbleUpPackOpThroughGenericOp()` and
`pushDownUnPackOpThroughGenericOp()`, we can simplify the lowered IR by
removing the pack of an empty tensor when the init tensor isn't used in the
generic op. Instead of packing an empty tensor, the empty tensor can be
forwarded to the generic output. This allows a cleaner result after data
layout propagation.
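A rough sketch of the difference (names and shapes are hypothetical):
```mlir
// Before: the unused init is packed explicitly.
%empty = tensor.empty() : tensor<64x64xf32>
%packed_empty = linalg.pack %empty
    inner_dims_pos = [0, 1] inner_tiles = [8, 8] into %init
    : tensor<64x64xf32> -> tensor<8x8x8x8xf32>
// After: an empty tensor of the packed type is forwarded as the init instead.
%new_init = tensor.empty() : tensor<8x8x8x8xf32>
```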
Context:
`vector.transfer_read` always requires a padding value. Most of its
builders take no `padding` value and assume the safe value of `0`.
However, this should be a conscious choice by the API user, as it makes
it easy to introduce bugs.
For example, while making this patch I found several occasions where the
padding value was not getting propagated (`vector.transfer_read` was
transformed into another `vector.transfer_read`). These bugs were
always caused by constructors that don't require specifying the
padding.
Additionally, using `ub.poison` as a possible default value is better,
as it indicates the user "doesn't care" about the actual padding value,
forcing users to specify the actual padding semantics they want.
With that in mind, this patch changes the builders of
`vector.transfer_read` to always take a `std::optional<Value> padding`
argument. The argument must always be provided, but for convenience users can
pass `std::nullopt`, which pads the transfer read with `ub.poison`.
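For reference, passing `std::nullopt` produces IR along these lines (a sketch
with hypothetical shapes):
```mlir
%poison = ub.poison : f32
%v = vector.transfer_read %src[%c0, %c0], %poison {in_bounds = [true, true]}
    : tensor<4x8xf32>, vector<4x8xf32>
```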
---------
Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
This patch adds additional checks to the hoisting logic to prevent hoisting of
`vector.transfer_read` / `vector.transfer_write` pairs when the underlying
memref has users that introduce aliases via operations implementing
`ViewLikeOpInterface`.
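For example, in a sketch like the one below (hypothetical IR), `%view` aliases
`%buf` through a `ViewLikeOpInterface` op, so the `vector.transfer_read` /
`vector.transfer_write` pair on `%buf` is no longer hoisted out of the loop:
```mlir
%view = memref.subview %buf[0, 0] [4, 4] [1, 1]
    : memref<8x8xf32> to memref<4x4xf32, strided<[8, 1]>>
scf.for %i = %lb to %ub step %step {
  %v = vector.transfer_read %buf[%c0, %c0], %pad {in_bounds = [true, true]}
      : memref<8x8xf32>, vector<4x4xf32>
  %add = arith.addf %v, %v : vector<4x4xf32>
  vector.transfer_write %add, %buf[%c0, %c0] {in_bounds = [true, true]}
      : vector<4x4xf32>, memref<8x8xf32>
  memref.copy %view, %other
      : memref<4x4xf32, strided<[8, 1]>> to memref<4x4xf32>
}
```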
Note: This may conservatively block some valid hoisting opportunities and could
affect performance. However, as demonstrated by the included tests, the current
logic is too permissive and can lead to incorrect transformations.
If this change prevents hoisting in cases that are provably safe, please share
a minimal repro - I'm happy to explore ways to relax the check.
Special treatment is given to `memref.assume_alignment`, mainly to accommodate
recent updates in:
* https://github.com/llvm/llvm-project/pull/139521
Note that such special casing does not scale and should generally be avoided.
The current hoisting logic lacks robust alias analysis. While better support
would require more work, the broader semantics of `memref.assume_alignment`
remain somewhat unclear. It's possible this op may eventually be replaced with
the "alignment" attribute added in:
* https://github.com/llvm/llvm-project/pull/144344
Given the following example:
```
module {
func.func @main(%arg0: tensor<1x1x1x4x1xf32>, %arg1: tensor<1x1x4xf32>) -> tensor<1x1x1x4x1xf32> {
%pack = linalg.pack %arg1 outer_dims_perm = [1, 2, 0] inner_dims_pos = [2, 0] inner_tiles = [4, 1] into %arg0 : tensor<1x1x4xf32> -> tensor<1x1x1x4x1xf32>
return %pack : tensor<1x1x1x4x1xf32>
}
}
```
We would generate an invalid transpose operation because the calculated
permutation would be `[0, 2, 0]`, which is semantically incorrect, as the
permutation must contain unique integers corresponding to the source
tensor dimensions.
The following change modifies how we calculate the permutation array and
ensures that the dimension indices given in the permutation array are
unique.
The above example then translates to a transpose with a
permutation of `[1, 2, 0]`, following the rule that `inner_dims_pos`
is appended to the permutation array and the preceding indices are
filled with the remaining dimensions.
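In the decomposed IR, this corresponds to a transpose along the lines of the
sketch below (hypothetical; the surrounding pad/insert ops are omitted):
```mlir
// Remaining dimension [1] first, then inner_dims_pos [2, 0] appended:
// permutation = [1, 2, 0].
%init = tensor.empty() : tensor<1x4x1xf32>
%transposed = linalg.transpose ins(%arg1 : tensor<1x1x4xf32>)
    outs(%init : tensor<1x4x1xf32>) permutation = [1, 2, 0]
```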
For consumer fusion cases of this form
```
%0:2 = scf.forall .. shared_outs(%arg0 = ..., %arg1 = ...) {
tensor.parallel_insert_slice ... into %arg0
tensor.parallel_insert_slice ... into %arg1
}
%1 = linalg.generic ... ins(%0#0, %0#1)
```
the current consumer fusion, which handles one slice at a time, cannot fuse
the consumer into the loop, since fusing along one slice would create an
SSA violation on the other use from the `scf.forall`. The solution is to
allow consumer fusion to consider multiple slices at once. This
PR changes the `TilingInterface` methods related to consumer fusion,
i.e.
- `getTiledImplementationFromOperandTile`
- `getIterationDomainFromOperandTile`
to allow fusion while considering multiple operands. It is up to the
`TilingInterface` implementation to return an error if a list of
operand tiles cannot result in a consistent implementation of the
tiled operation.
The Linalg operation implementation of `TilingInterface` has been
modified to account for these changes and to handle the cases where the
operand tiles can result in a consistent tiling implementation.
---------
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
This include introduces a dependency for LinalgTransforms on
LinalgTransformOps, which is unspecified in the module dependencies, and
would produce a cyclic dependency if it were specified.
The include is unused in WinogradConv2D.cpp, so this change removes it.
We only support a fixed set of minimum filtering algorithms for Winograd
Conv2D decomposition. Instead of letting users specify arbitrary integers,
define a fixed set of enumeration values for the parameters of the minimum
filtering algorithm.
Updates the linalg::vectorize function to return a
`FailureOr<VectorizationResult>` containing the values to replace the
original operation, instead of directly replacing the original
operation. This aligns better with the style of transforms used with the
TilingInterface, and gives more control to users over the lowering,
since it allows for additional transformation of the IR before
replacement.
There was already a `VectorizationResult` defined, which was used for
the internal vectorize implementation using `CustomVectorizationHook`s,
so the old struct is renamed to `VectorizationHookResult`.
Note for integration: The replacement of the original operation is now
the responsibility of the caller, so wherever `linalg::vectorize` is
used, the caller must also do
`rewriter.replaceOp(vectorizeResults->replacements)`.
---------
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>