1767 Commits

Jacques Pienaar
4bf33958da
[mlir] Update builders to use new form. (#154132)
Mechanically applied using clang-tidy.
2025-08-18 15:19:34 +00:00
Matthias Springer
21b607adbe
[mlir][SCF] scf.for: Add support for unsigned integer comparison (#153379)
Add a new unit attribute to allow for unsigned integer comparison.

Example:
```mlir
scf.for unsigned %iv_32 = %lb_32 to %ub_32 step %step_32 : i32 {
  // body
}
```

Discussion:
https://discourse.llvm.org/t/scf-should-scf-for-support-unsigned-comparison/84655
2025-08-15 10:59:14 +02:00
Ege Beysel
8de85e753f
[mlir][linalg] Add support for scalable vectorization of linalg.batch_mmt4d (#152984)
This PR builds upon #146531 and enables scalable
vectorization of `batch_mmt4d` as well.

---------

Signed-off-by: Ege Beysel <beyselege@gmail.com>
2025-08-14 11:47:51 +02:00
Renato Golin
d15280894b
[MLIR][Linalg] Remove matmul_transpose variants (#147961)
Removes the `(batch_)matmul_transpose_{a|b}` variants from OpDSL and
replaces them with `matmul affine_maps [...]` whenever appropriate. This is
in line with the
[plan](https://discourse.llvm.org/t/rfc-op-explosion-in-linalg/82863),
and can be done since #104783 merged.

See:
https://discourse.llvm.org/t/deprecate-batch-matmul-transpose-a-b-linalg-operations/87245
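
For reference, a `matmul_transpose_b` can be expressed with explicit indexing
maps on `linalg.matmul` along these lines (a sketch with illustrative shapes
and value names, not taken from the patch itself):
```mlir
// Transpose-b expressed via explicit indexing maps on linalg.matmul.
%0 = linalg.matmul
    indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>,  // A: M x K
                     affine_map<(d0, d1, d2) -> (d1, d2)>,  // B: N x K (transposed)
                     affine_map<(d0, d1, d2) -> (d0, d1)>]  // C: M x N
    ins(%A, %B : tensor<16x8xf32>, tensor<32x8xf32>)
    outs(%C : tensor<16x32xf32>) -> tensor<16x32xf32>
```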

Issues investigated:
* pad transform tests could use `matmul` instead, so they were changed to
use it.
* an ArmSME test using transpose actually needed it, so it was changed to
`matmul` + affine maps.

Arm tests validated by @banach-space (thanks!!).
2025-08-08 22:20:27 +01:00
James Newling
b574bcf036
[mlir][TD] Support padding with poison (#152003)
Signed-off-by: James Newling <james.newling@gmail.com>
2025-08-08 09:09:03 -07:00
Javed Absar
ceda56be7f
[mlir][linalg] Morphism across linalg -- named, category and generic ops. (#148424)
Adds `linalg-morph-ops` pass to convert an op from one representation to another: 
   named-op <--> category_op (elementwise, contraction, ..) <--> generic
e.g.
```mlir
  %exp = linalg.exp ins(%A : tensor<16x8xf32>) outs(%B :  tensor<16x8xf32>) -> tensor<16x8xf32>
```
After `mlir-opt -linalg-morph-ops=named-to-category ..`

```mlir
  %0 = linalg.elementwise kind=#linalg.elementwise_kind<exp> ins(%arg0 : tensor<16x8xf32> ..
```

Note: this is a generalization of the existing paths:
`--linalg-generalize-named-ops` is the path `named-op --> generic-op`;
`--linalg-specialize-generic-ops` is the path `named-op <-- generic-op`.

email: quic_mabsar@quicinc.com
2025-08-07 12:36:47 +01:00
Andrzej Warzyński
3692c73ce4
[mlir][linalg] Enable scalable vectorization of linalg.unpack (#149293)
This patch updates `vectorizeAsTensorUnpackOp` to support scalable
vectorization by requiring user-specified vector sizes for the _read_ operation
(rather than the _write_ operation) in `linalg.unpack`. 

Conceptually, `linalg.unpack` consists of these high-level steps:
  * **Read** from the source tensor using `vector.transfer_read`.
  * **Transpose** the read value according to the permutation in the
    `linalg.unpack` op (via `vector.transpose`).
  * **Re-associate** dimensions of the transposed value, as specified by the op
    (via `vector.shape_cast`)
  * **Write** the result into the destination tensor via
    `vector.transfer_write`.

Previously, the vector sizes provided by the user were interpreted as
write-vector sizes. These were used to:
  * Infer read-vector sizes using the `inner_tiles` attribute of the unpack op.
  * Deduce vector sizes for the transpose and shape cast operations.
  * Ultimately determine the vector shape for the write.

However, this logic breaks when one or more tile sizes are dynamic. In such
cases, `vectorizeUnPackOpPrecondition` fails, and vectorization is rejected.

This patch switches the contract: users now directly specify the
"read-vector-sizes", which inherently encode all inner tile sizes - including
dynamic ones. It becomes the user's responsibility to provide valid sizes.

In practice, since `linalg.unpack` is typically constructed, tiled, and
vectorized by the same transformation pipeline, the necessary
"read-vector-sizes" should be recoverable.
2025-08-06 20:37:50 +01:00
Andrzej Warzyński
77363fbd7c
[mlir][linalg] Add getCollapsedVecType and update vectorization of linalg.unpack (#151503)
This patch introduces a new helper, `getCollapsedVecType`, and updates
`vectorizeAsTensorUnpackOp` to use it. The motivation stems from improving how
`vector.shape_cast` operations are generated when vectorizing `linalg.unpack`.

Previously, the vectorizer relied on
`tensor::CollapseShapeOp::inferCollapsedType` to compute the collapsed
vector type. This approach is suboptimal because:
  * `inferCollapsedType` lacks awareness of scalable vector flags.
  * Linalg vectorization should not depend on Tensor dialect utilities.

Instead of relocating `inferCollapsedType`, we introduce
`getCollapsedVecType` — a lightweight, specialized hook that:
  * Assumes no dynamic sizes.
  * Handles scalable flags alongside shape dimensions.
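
For instance (an illustrative example, not from the patch), collapsing the
leading dimension pair of a scalable vector should keep the scalable flag on
the collapsed dimension:
```mlir
// Collapse [2, [4]] -> [[8]]: the scalable flag carries over.
%0 = vector.shape_cast %v : vector<2x[4]x8xf32> to vector<[8]x8xf32>
```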

This change also reduces temporary variables in
`vectorizeAsTensorUnpackOp` and paves the way for a cleaner update in
 #149293.
2025-08-01 11:26:19 +01:00
Daniel Garvey
1e504bef20
[MLIR] Specify new padOp's output type in DropPadUnitDims (#150706)
Previously, when dropping a unit dim from a pad with mixed dynamic/static
input/output shapes, the resulting pad would take on the type of the
input, resulting in invalid IR.

Also did some minor cleanup to the formatting of the
`drop_unit_dim_corresponding_to_dynamic_dim` test to make it match the
rest of the file.

---------

Signed-off-by: dan <danimal197@gmail.com>
2025-07-31 11:51:38 +01:00
Andrzej Warzyński
96b4425669
[mlir][linalg][nfc] Clean-up leftover code post #149156 (#151334)
In https://github.com/llvm/llvm-project/pull/149156, I ensured that we
no longer generate spurious `tensor.empty` ops when vectorizing
`linalg.unpack`.

This follow-up removes leftover code that is now redundant but was
missed in the original PR and in #150602, which was also meant to clean up
leftover code.

Note: this removes the code that computed "write-vector-sizes"; these are
now fully inferred from previous ops.
2025-07-30 20:34:01 +01:00
Vivian Zhang
dc6d7f0637
[mlir][linalg] Fix padding shape computation in PadTilingInterface for convs (#149576)
This PR fixes the computation of padded shapes for convolution-style
affine maps (e.g., d0 + d1) in `PadTilingInterface`. Previously, the
code used the direct sum of loop upper bounds, leading to over-padding.
For example, for the following `conv_2d_nhwc_fhwc` op, when only padding the c
dimension to a multiple of 16, it also incorrectly pads the convolved
dimensions and generates the wrong input shape:

```
%padded = tensor.pad %arg0 low[0, 0, 0, 0] high[0, 1, 1, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<1x16x16x4xf32> to tensor<1x17x17x16xf32>
%padded_0 = tensor.pad %arg1 low[0, 0, 0, 0] high[0, 0, 0, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<16x3x3x4xf32> to tensor<16x3x3x16xf32>
%0 = linalg.conv_2d_nhwc_fhwc {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>} ins(%padded, %padded_0 : tensor<1x17x17x16xf32>, tensor<16x3x3x16xf32>) outs(%arg2 : tensor<1x14x14x16xf32>) -> tensor<1x14x14x16xf32>
return %0 : tensor<1x14x14x16xf32>
```

The new implementation uses the maximum accessed index as the input to the
affine map and then adds 1 after aggregating all the terms to get the
final padded size. This fixes
https://github.com/llvm/llvm-project/issues/148679.
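
With the fix, only the channel dimension is padded. Reconstructed from the
shapes above, the expected input pad looks like:
```mlir
%padded = tensor.pad %arg0 low[0, 0, 0, 0] high[0, 0, 0, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<1x16x16x4xf32> to tensor<1x16x16x16xf32>
```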
2025-07-29 09:58:30 -07:00
Han-Chung Wang
3f3fac8478
[mlir][linalg] Enable pack consumer fusion for all perfect tiling cases. (#150672)
It was disabled because there may be artificial padding. After [refining the pack op semantics](773e158c64),
we can assume that there is no artificial padding. Thus, the check can
be removed, and we can unconditionally enable the consumer fusion if it
is a perfect tiling case.

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-28 10:23:54 -07:00
Han-Chung Wang
496d31c8a9
Reapply "[mlir][linalg] Restrict linalg.pack to not have artificial padding." (#150675) (#150680)
This reverts commit
0844812b2e
with a shape fix in
1db4c6b275

The revision restricts the `linalg.pack` op to not have artificial
padding semantics. E.g., the IR below is valid without the change, and it
becomes invalid with the change.

```mlir
func.func @foo(%src: tensor<9xf32>) -> tensor<100x8xf32> {
  %cst = arith.constant 0.000000e+00 : f32
  %dest = tensor.empty() : tensor<100x8xf32>
  %pack = linalg.pack %src
    padding_value(%cst : f32)
    inner_dims_pos = [0]
    inner_tiles = [8] into %dest
    : tensor<9xf32> -> tensor<100x8xf32>
  return %pack : tensor<100x8xf32>
}
```

IMO, it is a misuse if we use pack ops with artificial padding sizes
because the intention of the pack op is to relayout the source based on
target intrinsics, etc. The output shape is expected to be
`tensor<2x8xf32>`. If people need extra padding sizes, they can create a
new pad op followed by the pack op.
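
A sketch of that rewrite for the example above, assuming the full
`tensor<100x8xf32>` result really is needed:
```mlir
func.func @foo(%src: tensor<9xf32>) -> tensor<100x8xf32> {
  %cst = arith.constant 0.000000e+00 : f32
  // Explicit pad to 100 * 8 = 800 elements, then a padding-free pack.
  %padded = tensor.pad %src low[0] high[791] {
  ^bb0(%i: index):
    tensor.yield %cst : f32
  } : tensor<9xf32> to tensor<800xf32>
  %dest = tensor.empty() : tensor<100x8xf32>
  %pack = linalg.pack %padded
    inner_dims_pos = [0]
    inner_tiles = [8] into %dest
    : tensor<800xf32> -> tensor<100x8xf32>
  return %pack : tensor<100x8xf32>
}
```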

This also makes consumer tiling much easier because the consumer fusion
does not support artificial padding sizes. It is very hard to make it
work without using ad-hoc patterns because the tiling sizes refer to the
source, which implies that you don't have a core_id/thread_id to write
padding values to the whole tile.

People may wonder why the pad tiling implementation works. The
answer is that it creates an `if-else` branch to handle the case. In my
experience, this is a real struggle in transformations, because most of the
time people only need one side of the branch, given that the tile sizes
are usually greater than the padding sizes. However, the implementation is
conservatively correct in terms of semantics. Given that the
introduction of the `pack` op is to serve the relayout needs better, having
the restriction makes sense to me.

Removed tests:
-
`no_bubble_up_pack_extending_dimension_through_expand_cannot_reassociate`
from `data-layout-propagation.mlir`: it duplicates
`bubble_up_pack_non_expanded_dims_through_expand` after we fix the
shape.
- `fuse_pack_consumer_with_untiled_extra_padding` from
`tile-and-fuse-consumer.mlir`: it was created for artificial padding in
the consumer fusion implementation.

The other changes in lit tests are just fixing the shape.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-28 09:29:15 -07:00
Andrzej Warzyński
f529c0b56f
[mlir][linalg][nfc] Clean-up leftover code post #149156 (#150602)
In https://github.com/llvm/llvm-project/pull/149156, I ensured that we
no longer generate spurious `tensor.empty` ops when vectorizing
`linalg.unpack`.

This follow-up removes leftover code that is now redundant but was
missed in the original PR.
2025-07-28 09:00:19 +01:00
Maksim Levental
fcbcfe44cf
[mlir][NFC] update mlir/Dialect create APIs (32/n) (#150657)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-25 13:50:15 -05:00
Han-Chung Wang
0844812b2e
Revert "[mlir][linalg] Restrict linalg.pack to not have artificial padding." (#150675)
Reverts llvm/llvm-project#150522 because it breaks
`Integration/Dialect/Linalg/CPU/pack-unpack-mmt4d.mlir`.

https://lab.llvm.org/buildbot/#/builders/116/builds/16097
2025-07-25 11:27:41 -07:00
Han-Chung Wang
773e158c64
[mlir][linalg] Restrict linalg.pack to not have artificial padding. (#150522)
The revision restricts the `linalg.pack` op to not have artificial
padding semantics. E.g., the IR below is valid without the change, and it
becomes invalid with the change.

```mlir
func.func @foo(%src: tensor<9xf32>) -> tensor<100x8xf32> {
  %cst = arith.constant 0.000000e+00 : f32
  %dest = tensor.empty() : tensor<100x8xf32>
  %pack = linalg.pack %src
    padding_value(%cst : f32)
    inner_dims_pos = [0]
    inner_tiles = [8] into %dest
    : tensor<9xf32> -> tensor<100x8xf32>
  return %pack : tensor<100x8xf32>
}
```

IMO, it is a misuse if we use pack ops with artificial padding sizes
because the intention of the pack op is to relayout the source based on
target intrinsics, etc. The output shape is expected to be
`tensor<2x8xf32>`. If people need extra padding sizes, they can create a
new pad op followed by the pack op.

This also makes consumer tiling much easier because the consumer fusion
does not support artificial padding sizes. It is very hard to make it
work without using ad-hoc patterns because the tiling sizes refer to the
source, which implies that you don't have a core_id/thread_id to write
padding values to the whole tile.

People may wonder why the pad tiling implementation works. The
answer is that it creates an `if-else` branch to handle the case. In my
experience, this is a real struggle in transformations, because most of the
time people only need one side of the branch, given that the tile sizes
are usually greater than the padding sizes. However, the implementation is
conservatively correct in terms of semantics. Given that the
introduction of the `pack` op is to serve the relayout needs better, having
the restriction makes sense to me.

Removed tests:
-
`no_bubble_up_pack_extending_dimension_through_expand_cannot_reassociate`
from `data-layout-propagation.mlir`: it duplicates
`bubble_up_pack_non_expanded_dims_through_expand` after we fix the
shape.
- `fuse_pack_consumer_with_untiled_extra_padding` from
`tile-and-fuse-consumer.mlir`: it was created for artificial padding in
the consumer fusion implementation.

The other changes in lit tests are just fixing the shape.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-25 11:06:17 -07:00
Maksim Levental
c610b24493
[mlir][NFC] update mlir/Dialect create APIs (27/n) (#150638)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-25 11:48:32 -05:00
Jacques Pienaar
07967d4af8
[mlir] Switch to new LDBG macro (#150616)
Change local variants to use the new central one.
2025-07-25 18:22:46 +02:00
Frank Schlimbach
b2d4963ee9
[NFC][mlir][mesh,shard] Fixing misnomers in mesh dialect, renaming 'mesh' dialect to 'shard' (#150177)
Renaming the 'mesh' dialect to 'shard' (discourse 87053):
  - dialect name mesh -> shard
  - (device) mesh -> (device) grid
  - spmdize -> partition

A lot of diffs, but simple renames only.

@tkarna @yaochengji
2025-07-25 16:53:08 +02:00
Longsheng Mou
f047b735e9
[mlir][NFC] Use getDefiningOp<OpTy>() instead of dyn_cast<OpTy>(getDefiningOp()) (#150428)
This PR uses `val.getDefiningOp<OpTy>()` to replace `dyn_cast<OpTy>(val.getDefiningOp())` , `dyn_cast_or_null<OpTy>(val.getDefiningOp())` and `dyn_cast_if_present<OpTy>(val.getDefiningOp())`.
2025-07-25 10:35:51 +08:00
Han-Chung Wang
1ff6d9daec
[mlir][linalg] Take artificial padding into account for pack/unpack folding. (#150272)
The revision folds the tensor.pad/extract_slice op into
linalg.pack/unpack ops only when it is safe to fold. It is not valid to
have artificial padding.

The documentation improvement and verifier update will be done in a
separate PR (i.e., https://github.com/llvm/llvm-project/pull/149624).
The revision is a step towards it.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-24 13:55:07 -07:00
Maksim Levental
75aa7065dc
[mlir][NFC] update mlir/Dialect create APIs (17/n) (#149924)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-24 15:37:36 -05:00
Ian Wood
3ebe5d661f
[mlir][linalg] Drop unit dims on IndexingMapOpInterface (#150280)
Generalizes `dropUnitDims` to operate on any op implementing the
`IndexingMapOpInterface`. Operation specific creation is handled by
passing a builder that will construct the new operation based on the
dropped dimensions.

---------

Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
Co-authored-by: Kunwar Grover <groverkss@gmail.com>
2025-07-24 16:07:51 +01:00
Kazu Hirata
0925d7572a
[mlir] Remove unused includes (NFC) (#150266)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:53 -07:00
Frank Schlimbach
7ad1d5bd34
[mlir][mesh] removing partial/reduction axes from mesh.sharding (#149805)
[mlir][mesh] Removing partial axes from sharding annotations (discourse 87053)
2025-07-23 08:32:38 +02:00
fabrizio-indirli
3e7433d75a
[mlir][linalg] Fix to Elementwise Fusion when preserving results (#149843)
In the linalg ElementwiseOpFusion transform, a prerequisite for the
fusion between a producer and consumer op is that the producer's output
indexing map associated with the result to be fused must be invertible
(e.g. a simple permutation).
Before this patch, only the first output indexing map was being checked;
this bug produced issues when the operand to fuse was not the first result
of the producer op. For example, this situation arises when the producer
op has multiple results because it's the result of previous fusions
where the original result had been preserved: in these cases, the pass
ought to check the indexing map of the result being fused, which is not
necessarily the first one.

Signed-off-by: Fabrizio Indirli <Fabrizio.Indirli@arm.com>
2025-07-22 10:16:59 +01:00
Adam Siemieniuk
b956f049b1
[mlir][linalg] Vectorize directly to a named contraction (#147296)
Extends the linalg vectorizer with a path to lower contraction ops directly
into `vector.contract`.

The direct rewriting preserves high-level op semantics and provides a more
progressive lowering compared to reconstructing the contraction back from a
multi-dimensional reduction.
The added lowering focuses on named linalg ops and leverages their
well-defined semantics to avoid complex precondition verification.

The new path is optional and disabled by default to avoid changing the
default vectorizer behavior.
2025-07-22 07:42:02 +02:00
Han-Chung Wang
3ea6da59ec
[mlir][linalg] Allow pack consumer fusion if the tile size is greater than dimension size. (#149438)
This happens only when you use a larger tile size, i.e., one greater than
or equal to the dimension size. In this case, it is a full slice, so it
is fusible.

The IR can be generated during the TileAndFuse process. It is hard to
fix in such a driver, so we enable the naive fusion for this case.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-18 10:42:42 -07:00
Han-Chung Wang
7d040d4675
[mlir][linalg] Handle outer_dims_perm in linalg.pack consumer fusion. (#149426)
Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-18 09:42:40 -07:00
Han-Chung Wang
6ff471883f
[mlir][linalg] Improve linalg.pack consumer fusion. (#148993)
If a dimension is not tiled, it is always valid to fuse the pack op,
even if it has padding semantics, because it always generates a full
slice along the dimension.

If a dimension is tiled and it does not need extra padding, the fusion
is valid.

The revision also formats corresponding tests for consistency.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-17 16:06:06 -07:00
Andrzej Warzyński
3b11aaaf94
[mlir][linalg] Add support for scalable vectorization of linalg.mmt4d (#146531)
This patch adds support for scalable vectorization of linalg.mmt4d. The
key design change is the introduction of a new vectorizer state variable:

* `assumeDynamicDimsMatchVecSizes`

...along with the corresponding Transform dialect attribute:

* `assume_dynamic_dims_match_vec_sizes`.

This flag instructs the vectorizer to assume that dynamic memref/tensor
dimensions match the corresponding vector sizes (fixed or scalable). With this
assumption, masking becomes unnecessary, which simplifies the lowering pipeline
significantly.

While this assumption is not universally valid, it typically holds for
`linalg.mmt4d`. Inputs and outputs are explicitly packed using `linalg.pack`,
and this packing includes padding, ensuring that dimension sizes align with
vector sizes (*).

* Related discussion: https://github.com/llvm/llvm-project/issues/143920

An upcoming patch will include an end-to-end test that leverages scalable
vectorization of linalg.mmt4d to demonstrate the newly enabled functionality.
This would not be feasible without the changes introduced here, as it would
otherwise require additional logic to handle complex - but ultimately redundant
- masks.

(*) This holds provided that the tile sizes used for packing match the vector
sizes used during vectorization. It is the user’s responsibility to enforce
this.
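
A hypothetical transform-dialect invocation, assuming the attribute is a
unit attribute on `transform.structured.vectorize` (the exact syntax is
sketched from the names above; the handle and sizes are illustrative):
```mlir
// Inner tile sizes of the packed operands are assumed to match the vector
// sizes below, so no masking is generated.
transform.structured.vectorize %mmt4d vector_sizes [1, 1, 1, 8, [8], 1]
    {assume_dynamic_dims_match_vec_sizes} : !transform.any_op
```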
2025-07-17 19:02:08 +01:00
Andrzej Warzyński
bce951c572
[mlir][linalg] Update vectorization logic for linalg.unpack (#149156)
This PR makes sure that we don't generate unnecessary `tensor.empty`
when vectorizing `linalg.unpack`.

To better visualize the changes implemented here, consider this IR:
```mlir
func.func @example(
  %source: tensor<8x4x16x16xf32>,
  %dest: tensor<64x127xf32>) -> tensor<64x127xf32> {

    %res = linalg.unpack %source
      outer_dims_perm = [1, 0]
      inner_dims_pos = [0, 1]
      inner_tiles = [16, 16]
    into %dest : tensor<8x4x16x16xf32> -> tensor<64x127xf32>

    return %res : tensor<64x127xf32>
 }
```

Below is the output after vectorization, BEFORE and AFTER this PR.

BEFORE (note `tensor.empty` and the fact that `%arg1` is not used):
```mlir
  func.func @example(%arg0: tensor<8x4x16x16xf32>, %arg1: tensor<64x127xf32>) -> tensor<64x127xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %c0 = arith.constant 0 : index
    %0 = vector.transfer_read %arg0[%c0, %c0, %c0, %c0], %cst {in_bounds = [true, true, true, true]} : tensor<8x4x16x16xf32>, vector<8x4x16x16xf32>
    %1 = vector.transpose %0, [1, 2, 0, 3] : vector<8x4x16x16xf32> to vector<4x16x8x16xf32>
    %2 = vector.shape_cast %1 : vector<4x16x8x16xf32> to vector<64x128xf32>
    %3 = tensor.empty() : tensor<64x127xf32>
    %c0_0 = arith.constant 0 : index
    %4 = vector.transfer_write %2, %3[%c0_0, %c0_0] {in_bounds = [true, false]} : vector<64x128xf32>, tensor<64x127xf32>
    return %4 : tensor<64x127xf32>
  }
```

AFTER (note that `%arg1` is correctly used):
```mlir
  func.func @example(%arg0: tensor<8x4x16x16xf32>, %arg1: tensor<64x127xf32>) -> tensor<64x127xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %c0 = arith.constant 0 : index
    %0 = vector.transfer_read %arg0[%c0, %c0, %c0, %c0], %cst {in_bounds = [true, true, true, true]} : tensor<8x4x16x16xf32>, vector<8x4x16x16xf32>
    %1 = vector.transpose %0, [1, 2, 0, 3] : vector<8x4x16x16xf32> to vector<4x16x8x16xf32>
    %2 = vector.shape_cast %1 : vector<4x16x8x16xf32> to vector<64x128xf32>
    %c0_0 = arith.constant 0 : index
    %3 = vector.transfer_write %2, %arg1[%c0_0, %c0_0] {in_bounds = [true, false]} : vector<64x128xf32>, tensor<64x127xf32>
    return %3 : tensor<64x127xf32>
  }
```
2025-07-17 09:14:17 +01:00
zbenzion
6033544173
[mlir][linalg] Fix memref type verification in CollapseLinalgDimensions (#147245)
When collapsing linalg dimensions, we check whether its memref operands are
guaranteed to be collapsible. However, we currently assume that the
matching indexing map is the identity map.

This commit modifies this behavior and checks if the memref is
collapsible on the transformed dimensions.
2025-07-09 01:04:08 -07:00
Jakub Kuderski
6512ca7ddb
[mlir] Add isStatic* size check for ShapedTypes. NFCI. (#147085)
The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow for `ShapedType::isStaticDim` to be used in
ADT functions without having to wrap it in a lambda performing the
negation.

Also add the new functions to C and Python bindings.
2025-07-07 14:57:27 -04:00
Kazu Hirata
be4cd9f4da
[mlir] Remove unused includes (NFC) (#147206)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-06 19:06:07 -07:00
Longsheng Mou
718e647a0c
[mlir] Fix Wparentheses warning (#146893)
warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
  265 |              isa<VectorType>(operandType) &&
      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
  266 |                  "Unexpected non-vector ShapedType");
      |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2025-07-07 09:09:43 +08:00
Fabian Mora
bca79ec0d2
[mlir][linalg] Use ub.poison in linalg vectorizer instead of 0 for some transfer ops (#146544)
This patch is a follow-up to https://github.com/llvm/llvm-project/pull/146088 and changes the padding value in the linalg vectorizer from `0` to `ub.poison` in the `vector.transfer_read`s created for extracting slices or when vectorizing a generic.

Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
2025-07-02 10:10:03 -04:00
zbenzion
b68e8f1de7
[mlir][linalg] Allow promotion to use the original subview size (#144334)
Linalg promotion attempts to compute a constant upper bound for the
allocated buffer size. Only when it fails to compute an upper bound does it
fall back to the original subview size, which may be dynamic.

This adds a promotion option to use the original subview size by default,
thus minimizing the allocation size.
Fixes #144268.
2025-07-02 08:47:51 +02:00
Han-Chung Wang
42578e8586
[mlir][linalg] Use hasPureTensorSemantics in TransposeMatmul methods. (#146438)
The issue is triggered by
ee070d0816
which checks `TensorLikeType` when downstream projects use the pattern
without registering bufferization::BufferizationDialect. The
registration is needed because the interface implementation for builtin
types lives in `BufferizationDialect::initialize()`. However, we do not
need to fix it via registration. The proper fix is to use the linalg
method, i.e., `hasPureTensorSemantics`.

No additional tests are added because the functionality is well tested
in
[transpose-matmul.mlir](https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/transpose-matmul.mlir).
Reproducing the issue requires a different setup, e.g., writing a
new C++ pass, which does not seem worth it.

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-01 14:15:27 -07:00
Ege Beysel
ace5108f37
feat(linalg): add a way to pass controlFn to foldIntoPackUnpackPatterns (#143685)
This PR adds a mechanism so that downstream consumers can pass in
control functions for the application of these patterns. This change
shouldn't affect any consumers of this method that do not specify a
controlFn. The controlFn always gets the source operand of the consumer
in each of the patterns as a parameter.

In IREE, we (will) use it to control preventing folding patterns that
would inhibit fusion. See IREE issue
[#20896](https://github.com/iree-org/iree/issues/20896) for more
details.
2025-07-01 07:22:38 -07:00
Zhuoran Yin
8cfd9b8821
[MLIR] Make generic skip packing init operand when not used in DataLayoutPropagation (#146139)
In both `bubbleUpPackOpThroughGenericOp()` and
`pushDownUnPackOpThroughGenericOp()`, we can simplify the lowered IR by
removing the pack of an empty tensor when the init tensor isn't used in the
generic op. Instead of packing an empty tensor, the empty tensor can be
forwarded to the generic output. This allows a cleaner result after data
layout propagation.
2025-07-01 09:39:30 -04:00
Fabian Mora
878d3594ed
[mlir][vector] Avoid setting padding by default to 0 in vector.transfer_read prefer ub.poison (#146088)
Context:
`vector.transfer_read` always requires a padding value. Most of its
builders take no `padding` value and assume the safe value of `0`.
However, this should be a conscious choice by the API user, as it makes
it easy to introduce bugs.
For example, while making this patch I found several occasions where the
padding value was not getting propagated (`vector.transfer_read` was
transformed into another `vector.transfer_read`). These bugs were
always caused by constructors that don't require specifying
padding.

Additionally, using `ub.poison` as a possible default value is better,
as it indicates the user "doesn't care" about the actual padding value,
forcing users to specify the actual padding semantics they want.

With that in mind, this patch changes the builders of
`vector.transfer_read` to always have a `std::optional<Value> padding`
argument. This argument is never optional, but for convenience users can
pass `std::nullopt`, padding the transfer read with `ub.poison`.
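
Passing `std::nullopt` then materializes poison padding in the IR, roughly
like this illustrative sketch (names and shapes are made up):
```mlir
%c0 = arith.constant 0 : index
// Poison padding: the user "doesn't care" about out-of-bounds values.
%pad = ub.poison : f32
%v = vector.transfer_read %t[%c0, %c0], %pad {in_bounds = [true, true]}
    : tensor<8x8xf32>, vector<8x8xf32>
```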

---------

Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
2025-06-30 15:20:42 -04:00
Andrzej Warzyński
541f33e075
[mlir][linalg] Prevent hoisting of transfer pairs in the presence of aliases (#145235)
This patch adds additional checks to the hoisting logic to prevent hoisting of
`vector.transfer_read` / `vector.transfer_write` pairs when the underlying
memref has users that introduce aliases via operations implementing
`ViewLikeOpInterface`.
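
An illustrative case that is now rejected (a sketch, not a test from the
patch): `memref.subview` implements `ViewLikeOpInterface` and aliases
`%mem`, so the transfer pair stays inside the loop:
```mlir
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c8 = arith.constant 8 : index
%pad = arith.constant 0.0 : f32
// %alias aliases %mem, blocking the hoisting of the pair below.
%alias = memref.subview %mem[0, 0] [4, 4] [1, 1]
    : memref<8x8xf32> to memref<4x4xf32, strided<[8, 1]>>
scf.for %i = %c0 to %c8 step %c1 {
  %v = vector.transfer_read %mem[%c0, %c0], %pad {in_bounds = [true, true]}
      : memref<8x8xf32>, vector<4x4xf32>
  %u = "test.use"(%v) : (vector<4x4xf32>) -> vector<4x4xf32>
  vector.transfer_write %u, %mem[%c0, %c0] {in_bounds = [true, true]}
      : vector<4x4xf32>, memref<8x8xf32>
}
```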

Note: This may conservatively block some valid hoisting opportunities and could
affect performance. However, as demonstrated by the included tests, the current
logic is too permissive and can lead to incorrect transformations.

If this change prevents hoisting in cases that are provably safe, please share
a minimal repro - I'm happy to explore ways to relax the check.

Special treatment is given to `memref.assume_alignment`, mainly to accommodate
recent updates in:

* https://github.com/llvm/llvm-project/pull/139521

Note that such special casing does not scale and should generally be avoided.
The current hoisting logic lacks robust alias analysis. While better support
would require more work, the broader semantics of `memref.assume_alignment`
remain somewhat unclear. It's possible this op may eventually be replaced with
the "alignment" attribute added in:

* https://github.com/llvm/llvm-project/pull/144344
2025-06-27 13:18:15 +01:00
Christopher McGirr
96c1611163
[mlir][linalg] fix OuterUnitDims linalg.pack decomposition pattern (#141613)
Given the following example:
```mlir
module {
  func.func @main(%arg0: tensor<1x1x1x4x1xf32>, %arg1: tensor<1x1x4xf32>) -> tensor<1x1x1x4x1xf32> {
    %pack = linalg.pack %arg1 outer_dims_perm = [1, 2, 0] inner_dims_pos = [2, 0] inner_tiles = [4, 1] into %arg0 : tensor<1x1x4xf32> -> tensor<1x1x1x4x1xf32>
    return %pack : tensor<1x1x1x4x1xf32>
  }
}
```

We would generate an invalid transpose operation because the calculated
permutation would be `[0, 2, 0]`, which is semantically incorrect, as the
permutation must contain unique integers corresponding to the source
tensor dimensions.

The following change modifies how we calculate the permutation array and
ensures that the dimension indices given in the permutation array are
unique.

The above example would then translate to a transpose having a
permutation of `[1, 2, 0]`, following the rule that the `inner_dim_pos`
is appended to the permutation array and the preceding indices are
filled with the remaining dimensions.
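
As a sketch of the fixed decomposition for that example (value names are
hypothetical), the transpose step becomes:
```mlir
// Permutation [1, 2, 0]: 1x1x4 -> 1x4x1, with unique dimension indices.
%init = tensor.empty() : tensor<1x4x1xf32>
%transposed = linalg.transpose ins(%arg1 : tensor<1x1x4xf32>)
    outs(%init : tensor<1x4x1xf32>) permutation = [1, 2, 0]
```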
2025-06-27 09:24:33 +02:00
Kazu Hirata
abc2c3a538
[mlir] Use llvm::is_contained instead of llvm::all_of (NFC) (#145845)
llvm::is_contained is shorter than llvm::all_of plus a lambda.
2025-06-26 08:41:26 -07:00
MaheshRavishankar
c873e5f87d
[mlir][TilingInterface] Handle multi operand consumer fusion. (#145193)
For consumer fusion cases of this form

```
%0:2 = scf.forall .. shared_outs(%arg0 = ..., %arg1 = ...) {

  tensor.parallel_insert_slice ... into %arg0
  tensor.parallel_insert_slice ... into %arg1
}
%1 = linalg.generic ... ins(%0#0, %0#1)
```

the current consumer fusion that handles one slice at a time cannot fuse
the consumer into the loop, since fusing along one slice will create an
SSA violation on the other use from the `scf.forall`. The solution is to
allow consumer fusion to consider multiple slices at once. This
PR changes the `TilingInterface` methods related to consumer fusion,
i.e.

- `getTiledImplementationFromOperandTile`
- `getIterationDomainFromOperandTile`

to allow fusion while considering multiple operands. It is up to the
`TilingInterface` implementation to return an error if a list of tiles
of the operands cannot result in a consistent implementation of the
tiled operation.

The Linalg operation implementation of `TilingInterface` has been
modified to account for these changes and to handle cases where the
operand tiles can result in a consistent tiling implementation.

---------

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2025-06-25 11:54:38 -07:00
Spenser Bauman
532c15a718
[mlir][linalg] Fix module dependency issue due to unused import (#145727)
This include introduces a dependency for LinalgTransforms on
LinalgTransformOps, which is unspecified in the module dependencies, and
would produce a cyclic dependency if it were specified.

The include is unused in WinogradConv2D.cpp, so this change removes it.
2025-06-25 12:54:49 -04:00
Hsiangkai Wang
d16f42d1e2
[mlir][linalg] Constrain the parameters m, r in Winograd ops (#144657)
We only support a fixed set of minimum filtering algorithms for the Winograd
Conv2D decomposition. Instead of letting users specify any integer,
define a fixed set of enumeration values for the parameters of the minimum
filtering algorithm.
2025-06-25 14:02:07 +01:00
Max191
4d21da002a
[mlir] Return vectorized values instead of replacing (#144158)
Updates the linalg::vectorize function to return a
`FailureOr<VectorizationResult>` containing the values to replace the
original operation, instead of directly replacing the original
operation. This aligns better with the style of transforms used with the
TilingInterface, and gives more control to users over the lowering,
since it allows for additional transformation of the IR before
replacement.

There was already a `VectorizationResult` defined, which was used for
the internal vectorize implementation using `CustomVectorizationHook`s,
so the old struct is renamed to `VectorizationHookResult`.

Note for integration: The replacement of the original operation is now
the responsibility of the caller, so wherever `linalg::vectorize` is
used, the caller must also do
`rewriter.replaceOp(vectorizeResults->replacements)`.

---------

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
2025-06-24 12:06:41 -07:00