229 Commits

Author SHA1 Message Date
Jack Frankland
5a221c39b6
[mlir][memref]: Fold ExpandShape into TransferRead (#176786)
Add support for folding `memref.expand_shape` ops into
`vector.transfer_read` ops when the permutation map is a
non-minor-identity.

In the case that the permutation map indexes into expanded dimensions
that would be contiguous within the original source shape then it is
safe to make this transformation.

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
2026-02-02 09:30:02 +00:00
Han-Chung Wang
20b925a28a
[mlir][memref] Add non-atomic RMW option for emulated memref.store. (#178498)
The revision follows
f0e1857c84
to add an option for supporting non-atomic RMW emulation. The 0D case
uses non-atomic option unconditionally because it writes the entire
value.

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2026-01-29 14:15:44 -08:00
Jakub Kuderski
59e44799bd
[mlir] Fix new clang-tidy warning llvm-type-switch-case-types. NFC. (#178487)
Pre-commiting this before landing the new check in
https://github.com/llvm/llvm-project/pull/177892
2026-01-28 19:13:47 +00:00
Zhuoran Yin
f0bf97281f
[MemRef] Propagate strided layout through view-like ops in multiBuffer (#176941)
The memref::multiBuffer transformation replaces an allocation with a
multi-buffered allocation and creates a strided memref.subview at each
loop iteration. When the original allocation is used through view-like
ops, the existing code only handles SubViewOp, leaving other view-like
ops with incorrect types.

This patch extends replaceUsesAndPropagateType to handle ExpandShapeOp,
CollapseShapeOp, and CastOp using TypeSwitch. For each view-like op, we
compute the correct result type (or assert on failure) and create a new
operation, then recursively propagate the updated type through chains.
New FileCheck tests cover expand_shape, collapse_shape, cast, and a
chained expand_shape->cast case.

A single ViewLikeOpInterface hook is not practical here: view-like ops
have distinct type inference and validity rules (e.g., subview uses
offset/size/stride inference, expand/collapse use reassociation, cast
requires compatibility checks). Ops like memref.view or
memref.reinterpret_cast need additional layout/size validation beyond
what multi-buffering currently tracks, so this patch handles the common
safe cases directly.
2026-01-27 10:00:22 -05:00
Krzysztof Drewniak
003b28d031
[mlir] Move affine's FoldMemRefAliasOps into its own pass (#172548)
I'm planning to introduce an interface that'll allow FoldMemRefAliasOps
to not know about dialects like NVVM or GPU. To do this, however, I need
to get the `affine` ops (which need special handling in order to handle
their implicit affine maps) into a separate pass, analogously to how
`amdgpu` ops have these patterns under their dialect and ton under
`memref`.

This commit also changes the expand/collapse_shape index resolvers to
return `void`, since they never actually failed and to make it clearer
that they modify IR.

(Note: An LLM did the initial refactoring and test movement, I've
reviewed the results and edited them some.)
2026-01-02 10:13:42 -08:00
Jack Frankland
e575539541
[milr][memref]: Fold expand_shape + transfer_read (#167679)
Extend the load of a expand shape rewrite pattern to support folding a
`memref.expand_shape` and `vector.transfer_read` when the permutation
map on `vector.transfer_read` is a minor identity.

---------

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
2025-11-24 12:58:34 +00:00
Ivan Butygin
0a34d37365
[mlir][memref] Remove invalid extract_aligned_pointer_as_index folding in ExpandStridedMetadata (#167615)
`RewriteExtractAlignedPointerAsIndexOfViewLikeOp` tries to propagate
`extract_aligned_pointer_as_index` through the view ops.

`ViewLikeOpInterface` by itself doesn't guarantee to preserve the base
pointer and `memref.view` is one such example, so limit pattern to a few
specific ops.
2025-11-12 22:29:56 +03:00
Hanumanth
a664f584f9
[mlir][memref] Fix runtime verification for memref.subview for empty memref subviews (#166581)
This PR applies the same fix from #166569 to `memref.subview`. That PR
fixed the issue for `tensor.extract_slice`, and this one addresses the
identical problem for `memref.subview`.

The runtime verification for `memref.subview` incorrectly rejects valid
empty subviews (size=0) starting at the memref boundary.

**Example that demonstrates the issue:**

```mlir
func.func @subview_with_empty_slice(%memref: memref<10x4x1xf32, strided<[?, ?, ?], offset: ?>>, 
                                     %dim_0: index, 
                                     %dim_1: index, 
                                     %dim_2: index,
                                     %offset: index) {
    // When called with: offset=10, dim_0=0, dim_1=4, dim_2=1
    // Runtime verification fails: "offset 0 is out-of-bounds"
    %subview = memref.subview %memref[%offset, 0, 0] [%dim_0, %dim_1, %dim_2] [1, 1, 1] :
        memref<10x4x1xf32, strided<[?, ?, ?], offset: ?>> to
        memref<?x?x?xf32, strided<[?, ?, ?], offset: ?>>
    return
}
```

When `%offset=10` and `%dim_0=0`, we're creating an empty subview (zero
elements along dimension 0) starting at the boundary. The current
verification enforces `offset < dim_size`, which evaluates to `10 < 10`
and fails. I feel this should be valid since no memory is accessed.

**The fix:**

Same as #166569 - make the offset check conditional on subview size:
- Empty subview (size == 0): allow `0 <= offset <= dim_size`
- Non-empty subview (size > 0): require `0 <= offset < dim_size`

Please see #166569 for motivation and rationale.

---

Co-authored-by: Hanumanth Hanumantharayappa <hhanuman@ah-hhanuman-l.dhcp.mathworks.com>
2025-11-12 08:36:41 +09:00
MaheshRavishankar
fc093f1361
[mlir][Interfaces] Add interface methods to allow reifying single result/single dim of result. (#162924)
Current implementation of `reifyResultShapes` forces all
implementations to return all dimensions of all results. This can be
wasteful when you only require dimensions of one result, or a single
dimension of a result. Further this also creates issues with using
patterns to resolve the `tensor.dim` and `memref.dim` operations since
the extra operations created result in the pattern rewriter entering
an infinite loop (eventually breaking out of the loop due to the
iteration limit on the pattern rewriter). This is demonstrated by some
of the test cases added here that hit this limit when using
`--resolve-shaped-type-result-dims` and
`--resolve-ranked-shaped-type-result-dims`. To resolve this issue the
interface should allow for creating just the operations needed. This
change is the first step in resolving this.

The original implementation was done with the restriction in mind that
it might not always be possible to compute dimension of a single
result or one dimension of a single result in all cases. To account
for such cases, two additional interface methods are added

- `reifyShapeOfResult` (which allows reifying dimensions of
  just one result), has a default implementation that calls
  `reifyResultShapes` and returns the dimensions of a single result.
- `reifyDimOfResult` (which allows reifying a single dimension of a
  single result) has a default implementation that calls
  `reifyDimOfResult` and returns the value for the dimension of the
  result (which in turn for the default case would call
  `reifyDimOfResult`).

While this change sets up the interface, ideally most operations will
implement the `refiyDimOfResult` when possible. For almost all
operations in tree this is true. Subsequent commits will change those
incrementally.

Some of the tests added here that check that the default
implementations for the above method work as expected, also end up
hitting the pattern rewriter limit when using
`--resolve-ranked-shaped-type-result-dims`/
`--resolve-ranked-shaped-type-result-dims`. For testing purposes, a
flag is added to these passes that ignore the error returned by the
pattern application (this flag is left on by default to maintain
current state).

Changes required downstream to integrate this change

1. In operation definitions in .td files, for those operations that
   implement the `ReifyRankedShapedTypeOpInterface`.

```
def <op-name> : Op<..., [...,
    DeclareOpInterfaceMethods[ReifyRankedShapedTypeOpInterface]]>
```

should be changed to

```
def <op-name> : Op<..., [...,
    DeclareOpInterfaceMethods[ReifyRankedShapedTypeOpInterface, [
        "reifyResultShapes"]]]>
```

---------

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2025-11-10 09:01:01 -08:00
Jakub Kuderski
ba0be89cd2
[mlir] Simplify Default cases in type switches. NFC. (#165767)
Use default values instead of lambdas when possible. `std::nullopt` and
`nullptr` can be used now because of
https://github.com/llvm/llvm-project/pull/165724.
2025-10-30 15:10:59 -04:00
Hanumanth
cbe7c49e93
[mlir][memref] Fix runtime verification for memref.subview when size dimension value is 0 (#164897)
Previously, the runtime verification pass would insert assertion
statements with conditions that always evaluate to false for
semantically valid `memref.subview` operations where one of the
dimensions had a size of 0.

The `memref.subview` runtime verification logic was unconditionally
generating checks for the position of the last element (`offset + (size
- 1) * stride`). When `size` is 0, this causes the assertion condition
to always be false, leading to runtime failures even though the
operation is semantically valid.

This patch fixes the issue by making the `lastPos` check conditional.
The offset is always verified, but the endpoint check is only performed
when `size > 0` to avoid generating spurious assert statements.

This issue was discovered through a LiteRT model, where a dynamic shape
calculation resulted in a zero-sized dimension being passed to
`memref.subview`. The following is a simplified IR snippet from the
model. After running the runtime verification pass, an assertion that
always fails is generated because the SSA value `%5` becomes 0.

```mlir
module {
  memref.global "private" constant @__constant_2xi32 : memref<2xi32> = dense<-1> {alignment = 64 : i64}
  memref.global "private" constant @__constant_1xi32 : memref<1xi32> = dense<0> {alignment = 64 : i64}
  func.func @simpleRepro(%arg0: memref<10x4x1xf32, strided<[?, ?, ?], offset: ?>>) -> memref<?x?x?xf32, strided<[?, ?, ?], offset: ?>> {
    %c2 = arith.constant 2 : index
    %c4 = arith.constant 4 : index
    %c1 = arith.constant 1 : index
    %c10 = arith.constant 10 : index
    %c0 = arith.constant 0 : index
    %c-1 = arith.constant -1 : index
    %0 = memref.get_global @__constant_1xi32 : memref<1xi32>
    %1 = memref.get_global @__constant_2xi32 : memref<2xi32>
    %alloca = memref.alloca() {alignment = 64 : i64} : memref<3xi32>
    %subview = memref.subview %alloca[0] [1] [1] : memref<3xi32> to memref<1xi32, strided<[1]>>
    memref.copy %0, %subview : memref<1xi32> to memref<1xi32, strided<[1]>>
    %subview_0 = memref.subview %alloca[1] [2] [1] : memref<3xi32> to memref<2xi32, strided<[1], offset: 1>>
    memref.copy %1, %subview_0 : memref<2xi32> to memref<2xi32, strided<[1], offset: 1>>
    %2 = memref.load %alloca[%c0] : memref<3xi32>
    %3 = index.casts %2 : i32 to index
    %4 = arith.cmpi eq, %3, %c-1 : index
    %5 = arith.select %4, %c10, %3 : index
    %6 = memref.load %alloca[%c1] : memref<3xi32>
    %7 = index.casts %6 : i32 to index
    %8 = arith.cmpi eq, %7, %c-1 : index
    %9 = arith.select %8, %c4, %7 : index
    %10 = memref.load %alloca[%c2] : memref<3xi32>
    %11 = index.casts %10 : i32 to index
    %12 = arith.cmpi eq, %11, %c-1 : index
    %13 = arith.select %12, %c1, %11 : index
    %subview_1 = memref.subview %arg0[0, 0, 0] [%5, %9, %13] [1, 1, 1] : memref<10x4x1xf32, strided<[?, ?, ?], offset: ?>> to memref<?x?x?xf32, strided<[?, ?, ?], offset: ?>>
    return %subview_1 : memref<?x?x?xf32, strided<[?, ?, ?], offset: ?>>
  }
}
```

P.S. This is a similar issue to the one fixed for `tensor.extract_slice`
in https://github.com/llvm/llvm-project/pull/164878

---------

Co-authored-by: Hanumanth Hanumantharayappa <hhanuman@ah-hhanuman-l.dhcp.mathworks.com>
2025-10-27 11:43:45 -07:00
Ming Yan
4eaeeab771
[mlir][memref] Fold extract_strided_metadata(cast(x)) into extract_strided_metadata(x) (#164585) 2025-10-23 08:54:38 +08:00
Shenghang Tsai
7be89bb07b
[MLIR] Fix typo of the word "pattern" in CAPI and docs (#163780)
This includes the rename from `mlirOpRewritePattenCreate` to `mlirOpRewritePatternCreate` in CAPI, and other typo fixes in docs and code comments.
2025-10-17 12:57:59 +08:00
Hanchenng Wu
a6d1a52b8d
[MLIR] Reuse AsmState to enable fast generate-runtime-verification pass; add location-only pass option (#160331)
The pass generate-runtime-verification generates additional runtime op
verification checks.

Currently, the pass is extremely expensive. For example, with a
mobilenet v2 ssd network(converted to mlir), running this pass alone in
debug mode will take 30 minutes. The same observation has been made to
other networks as small as 5 Mb.

The culprit is this line "op->print(stream, flags);" in function
"RuntimeVerifiableOpInterface::generateErrorMessage" in File
mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp.

As we are printing the op with all the names of the operands in the
middle end, we are constructing a new SSANameState for each
op->print(...) call. Thus, we are doing a new SSA analysis for each
error message printed.

Perf profiling shows that 98% percent of the time is spent in the
constructor of SSANameState.

This change refactored the message generator. We use a toplevel
AsmState, and reuse it with all the op-print(stream, asmState). With a
release build, this change reduces the pass exeuction time from ~160
seconds to 0.3 seconds on my machine.

This change also adds verbose options to generate-runtime-verification
pass.
verbose 0: print only source location with error message.
verbose 1: print the full op, including the name of the operands.
2025-10-08 11:48:34 +01:00
Jakub Kuderski
8bab6c4e8c
[mlir] Simplify unreachable type switch cases. NFC. (#162032)
Use `DefaultUnreachable` from
https://github.com/llvm/llvm-project/pull/161970.
2025-10-06 09:23:25 -04:00
Alan Li
4a094095a4
[MLIR] Make 1-D memref flattening a prerequisite for vector narrow type emulation (#157771)
Addresses:  https://github.com/llvm/llvm-project/issues/115653

We already have utilities to flatten memrefs into 1-D. This change makes
memref flattening a prerequisite for vector narrow type emulation,
ensuring that emulation patterns only need to handle 1-D scenarios.
2025-09-16 20:43:20 +00:00
Mehdi Amini
6c8ad83a2c [MLIR] Apply clang-tidy fixes for readability-container-size-empty in EmulateNarrowType.cpp (NFC) 2025-09-06 07:31:18 -07:00
Mehdi Amini
2af45d3d6e [MLIR] Apply clang-tidy fixes for performance-unnecessary-value-param in BufferViewFlowOpInterfaceImpl.cpp (NFC) 2025-08-30 12:23:09 -07:00
Samarth Narang
dbf34e56ca
[mlir][MemRef] Address TODO to use early_inc to simplify elimination of uses (NFC) (#155123) 2025-08-26 06:19:24 -04:00
donald chen
5af7263d42
[mlir] add getViewDest method to viewLikeOpInterface (#154524)
The viewLikeOpInterface abstracts the behavior of an operation view one
buffer as another. However, the current interface only includes a
"getViewSource" method and lacks a "getViewDest" method.

Previously, it was generally assumed that viewLikeOpInterface operations
would have only one return value, which was the view dest. This
assumption was broken by memref.extract_strided_metadata, and more
operations may break these silent conventions in the future. Calling
"viewLikeInterface->getResult(0)" may lead to a core dump at runtime.
Therefore, we need 'getViewDest' method to standardize our behavior.

This patch adds the getViewDest function to viewLikeOpInterface and
modifies the usage points of viewLikeOpInterface to standardize its use.
2025-08-21 20:09:52 +08:00
lonely eagle
2adbf9e92b
[mlir][memref] Support test-compose-subview dynamic size (#146881)
Supports the case where the sizes of the subview op is dynamic.When
there are more for loops in the tile algorithm, multiple subviews are
performed and test-compose-subview does not work when the size operand
of the subview ops is dynamic value.
2025-07-28 16:58:45 +08:00
Maksim Levental
c090ed53fb
[mlir][NFC] update mlir/Dialect create APIs (33/n) (#150659)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-25 16:13:55 -04:00
Maksim Levental
a636b7bfdd
[mlir][NFC] update mlir/Dialect create APIs (18/n) (#149925)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-24 15:38:30 -05:00
Alan Li
1c3e4e994b
Reapply "[AMDGPU] fold memref.subview/expand_shape/collapse_shape into amdgpu.gather_to_lds" (#150334)
This is a reapply of patch #149851. The reapply also fixes a CMake/Bazel
build issue, which was the reason of the revert. (Thanks @rupprecht )

Original patch (#149851) message:
-----
This PR adds a new optimization pass to fold
`memref.subview/expand_shape/collapse_shape` ops into consumer
`amdgpu.gather_to_lds` operations.
* Implements a new pass `AmdgpuFoldMemRefOpsPass` with pattern
`FoldMemRefOpsIntoGatherToLDSOp`
* Adds corresponding folding tests
2025-07-24 09:23:15 -04:00
Kazu Hirata
0925d7572a
[mlir] Remove unused includes (NFC) (#150266)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:53 -07:00
Alan Li
9cb5c00bf7
Revert "[AMDGPU] fold memref.subview/expand_shape/collapse_shape in… (#150256)
…to `amdgpu.gather_to_lds` (#149851)"

This reverts commit dbc63f1e3724b6f2348c431dc1216537d9c042e8.

Having build deps issue.
2025-07-23 12:50:26 -04:00
Alan Li
dbc63f1e37
[AMDGPU] fold memref.subview/expand_shape/collapse_shape into amdgpu.gather_to_lds (#149851)
This PR adds a new optimization pass to fold
`memref.subview/expand_shape/collapse_shape` ops into consumer
`amdgpu.gather_to_lds` operations.

* Implements a new pass `AmdgpuFoldMemRefOpsPass` with pattern
`FoldMemRefOpsIntoGatherToLDSOp`
* Adds corresponding folding tests

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-23 11:22:41 -04:00
Quinn Dawkins
b1ef5a8890
[mlir][MemRef] Add support for emulating narrow floats (#148036)
This enables memref.load/store + vector.load/store support for sub-byte
float types. Since the memref types don't matter for loads/stores, we
still use the same types as integers with equivalent widths, with a few
extra bitcasts needed around certain operations.

There is no direct change needed for vector.load/store support. The
tests added for them are to verify that float types are
supported as well.
2025-07-14 11:18:51 -04:00
Kazu Hirata
d5def016b6
[llvm] Remove unused includes (NFC) (#148342)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-12 11:28:55 -07:00
Jakub Kuderski
6512ca7ddb
[mlir] Add isStatic* size check for ShapedTypes. NFCI. (#147085)
The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow for `ShapedType::isStaticDim` to be used in
ADT functions without having to wrap it in a lambda performing the
negation.

Also add the new functions to C and Python bindings.
2025-07-07 14:57:27 -04:00
Nicolas Vasilache
08cf6ae537
[mlir][memref] Add a new ReifyResultShapes pass (#145927)
This pass reifies the shapes of a subset of
`ReifyRankedShapedTypeOpInterface` ops with `tensor` results.

The pass currently only supports result shape type reification for:
   - tensor::PadOp
   - tensor::ConcatOp

It addresses a representation gap where implicit op semantics are needed
to infer static result types from dynamic
operands. But it does so by using `ReifyRankedShapedTypeOpInterface` as
the source of truth rather than the op itself.
As a consequence, this cannot generalize today.

TODO: in the future, we should consider coupling this information with
op "transfer functions" (e.g.
`IndexingMapOpInterface`) to provide a source of truth that can work
across result shape inference, canonicalization and
op verifiers.

The pass replaces the operations with their reified versions, when more
static information can be derived, and inserts
casts when results shapes are updated.

Example:
```mlir
  #map = affine_map<(d0) -> (-d0 + 256)>
  func.func @func(%arg0: f32, %arg1: index, %arg2: tensor<64x?x64xf32>) -> tensor<1x?x64xf32> {
    %0 = affine.apply #map(%arg1)
    %extracted_slice = tensor.extract_slice %arg2[0, 0, 0] [1, %arg1, 64] [1, 1, 1] : tensor<64x?x64xf32> to tensor<1x?x64xf32>
    %padded = tensor.pad %extracted_slice low[0, 0, 0] high[0, %0, 0] {
    ^bb0(%arg3: index, %arg4: index, %arg5: index):
      tensor.yield %arg0 : f32
    } : tensor<1x?x64xf32> to tensor<1x?x64xf32>
    return %padded : tensor<1x?x64xf32>
  }

  // mlir-opt --reify-result-shapes
  #map = affine_map<()[s0] -> (-s0 + 256)>
  func.func @func(%arg0: f32, %arg1: index, %arg2: tensor<64x?x64xf32>) -> tensor<1x?x64xf32> {
    %0 = affine.apply #map()[%arg1]
    %extracted_slice = tensor.extract_slice %arg2[0, 0, 0] [1, %arg1, 64] [1, 1, 1] : tensor<64x?x64xf32> to tensor<1x?x64xf32>
    %padded = tensor.pad %extracted_slice low[0, 0, 0] high[0, %0, 0] {
    ^bb0(%arg3: index, %arg4: index, %arg5: index):
      tensor.yield %arg0 : f32
    } : tensor<1x?x64xf32> to tensor<1x256x64xf32>
    %cast = tensor.cast %padded : tensor<1x256x64xf32> to tensor<1x?x64xf32>
    return %cast : tensor<1x?x64xf32>
  }
  ```

---------

Co-authored-by: Fabian Mora <fabian.mora-cordero@amd.com>
2025-07-01 15:39:21 +02:00
Uday Bondhugula
80625c16f0
[MLIR][Affine] Fix memref replacement in affine-data-copy-generate (#139016)
Fixes: https://github.com/llvm/llvm-project/issues/130257

Fix affine-data-copy-generate in certain cases that involved users in
multiple blocks. Perform the memref replacement correctly during copy
generation.

Improve/clean up memref affine use replacement API. Instead of
supporting dominance and post dominance filters (which aren't adequate
in most cases) and computing dominance info expensively each time in
RAMUW, provide a user filter callback, i.e., force users to compute
dominance if needed.
2025-06-28 10:27:11 +05:30
Oleksandr "Alex" Zinenko
8a469da8b2
[mlir] remove unnecessary atomic_rmw expansions (#144515)
The expansion of `memref.atomic_rmw` into a `memref.generic_atomic_rmw`
for floating-point min/max operations is no longer necessary as those
are now supported by the LLVM dialect and LLVM IR.

Furthermore, combining this expansion with direct lowering of
`generic_atomic_rmw` could leads to invalid LLVM dialect IR with
`cmpxchg` operating on floating-point values that it does not support.
2025-06-18 13:32:46 +02:00
Alan Li
6fd3c20d25
[MLIR] Add a utility pass to linearize memref (#136797)
To add a transformation that simplifies memory access patterns, this PR
adds a memref linearizer which is based on the GPU/DecomposeMemRefs
pass, with the following changes:
* support vector dialect ops
* instead of decompose memrefs to rank-0 memrefs, flatten higher-ranked
memrefs to rank-1.

Notes:
* After the linearization, a MemRef's offset is kept, so a
`memref<4x8xf32, strided<[8, 1], offset: 100>>` becomes `memref<32xf32,
strided<[1], offset: 100>>`.
* It also works with dynamic shapes and strides and offsets (see test
cases for details).
* The shape of the casted memref is computed as 1d, flattened.
2025-05-22 13:05:37 -04:00
Han-Chung Wang
c39915fa2e
[mlir][NFC] Simplify constant checks with isOneInteger and renamed isZeroInteger. (#139340)
The revision adds isOneInteger helper, and simplifies the existing code
with the two methods. It removes some lambda, which makes code cleaner.

For downstream users, you can update the code with the below script.

```bash
sed -i "s/isZeroIndex/isZeroInteger/g" **/*.h
sed -i "s/isZeroIndex/isZeroInteger/g" **/*.cpp
```

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-05-20 14:53:02 -07:00
Kazu Hirata
6d515ce827
[mlir] Use llvm::all_of (NFC) (#140464) 2025-05-18 18:13:49 -07:00
Shay Kleiman
ffb9bbfd07
[mlir][MemRef] Changed AssumeAlignment into a Pure ViewLikeOp (#139521)
Made AssumeAlignment a ViewLikeOp that returns a new SSA memref equal
to its memref argument and made it have Pure trait. This
gives it a defined memory effect that matches what it does in practice
and makes it behave nicely with optimizations which won't get rid of it
unless its result isn't being used.
2025-05-18 13:50:29 +03:00
Kazu Hirata
25348394bb [mlir] Fix a warning
This patch fixes:

  mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp:106:14:
  error: unused variable 'sourceType' [-Werror,-Wunused-variable]
2025-05-13 13:43:10 -07:00
Krzysztof Drewniak
a891163e50
[mlir][MemRef] Use specialized index ops to fold expand/collapse_shape (#138930)
This PR updates the FoldMemRefAliasOps to use `affine.linearize_index`
and `affine.delinearize_index` to perform the index computations needed
to fold a `memref.expand_shape` or `memref.collapse_shape` into its
consumers, respectively.

This also loosens some limitations of the pass:
1. The existing `output_shape` argument to `memref.expand_shape` is now
used, eliminating the need to re-infer this shape or call `memref.dim`.
2. Because we're using `affine.delinearize_index`, the restriction that
each group in a `memref.collapse_shape` can only have one dynamic
dimension is removed.
2025-05-13 13:28:53 -05:00
Andrzej Warzyński
c45cc3e420
[mlir][vector] Standardize base Naming Across Vector Ops (NFC) (#137859)
[mlir][vector] Standardize base Naming Across Vector Ops (NFC)

This change standardizes the naming convention for the argument
representing the value to read from or write to in Vector ops that
interface with Tensors or MemRefs. Specifically, it ensures that all
such ops use the name `base` (i.e., the base address or location to
which offsets are applied).

Updated operations:

* `vector.transfer_read`,
* `vector.transfer_write`.

For reference, these ops already use `base`:

* `vector.load`, `vector.store`, `vector.scatter`, `vector.gather`,
  `vector.expandload`, `vector.compressstore`, `vector.maskedstore`,
  `vector.maskedload`.

This is a non-functional change (NFC) and does not alter the semantics of these
operations. However, it does require users of the XFer ops to switch from
`op.getSource()` to `op.getBase()`.

To ease the transition, this PR temporarily adds a `getSource()` interface
method for compatibility. This is intended for downstream use only and should
not be relied on upstream. The method will be removed prior to the LLVM 21
release.

Implements #131602
2025-05-12 09:44:50 +01:00
Kazu Hirata
921d162460
[mlir] Remove unused local variables (NFC) (#138642) 2025-05-06 07:55:50 -07:00
Matthias Springer
fd161cf56f
[mlir][memref] Remove runtime verification for memref.reinterpret_cast (#132547)
The runtime verification code used to verify that the result of a
`memref.reinterpret_cast` is in-bounds with respect to the source
memref. This is incorrect: `memref.reinterpret_cast` allows users to
construct almost arbitrary memref descriptors and there is no
correctness expectation.

This op is supposed to be used when the user "knows what they are
doing." Similarly, the static verifier of `memref.reinterpret_cast` does
not verify in-bounds semantics either.
2025-05-06 09:40:28 +02:00
Kazu Hirata
15f7c6ed70
[mlir] Remove unused local variables (NFC) (#138481) 2025-05-05 10:08:00 -07:00
Matthias Springer
120e940356
[mlir][memref] Add runtime verification for memref.atomic_rmw (#130414)
Implement runtime verification for `memref.atomic_rmw` and
`memref.generic_atomic_rmw`. Also add a missing test for `memref.store`.
2025-04-30 13:45:11 +02:00
Arnab Dutta
99cb3f7ac6
[mlir] Add memref normalization support for reinterpret_cast op (#133417)
Rewrites memrefs defined by reinterpet_cast ops to have an identity
layout map and updates all their indexing uses. Also, extend
`replaceAllMemRefUsesWith` utility to work when there are multiple 
occurrences of `oldMemRef` in `op`'s operand list when op is 
non-dereferencing.

Fixes #122090 
Fixes #121091
2025-04-30 14:13:09 +05:30
lorenzo chelini
8502ba1eb4
[MLIR][NFC] Retire let constructor for MemRef (#134788)
let constructor is legacy (do not use in tree!) since the tableGen
backend emits most of the glue logic to build a pass.

Note: The following constructor has been retired:

```cpp
std::unique_ptr<Pass> createExpandReallocPass(bool emitDeallocs = true);
```
    
To update your codebase, replace it with the new options-based API:
    
```cpp
memref::ExpandReallocPassOptions expandAllocPassOptions{
          /*emitDeallocs=*/false};
pm.addPass(memref::createExpandReallocPass(expandAllocPassOptions));
```
2025-04-23 16:50:00 +02:00
Uday Bondhugula
37deb09593
[MLIR][Affine] Fix signatures of normalize memref utilities (#134466)
These methods were passing derived op types by pointers, which deviates
from the style. While on this, fix obsolete comments on those methods.
2025-04-07 17:36:28 +05:30
Matthias Springer
bc3b1b06c6
[mlir][memref] Fix build after #132545 (#133760)
There was a typo in the error message.
2025-03-31 10:38:55 -07:00
Matthias Springer
8b06da1682
[mlir][memref] Improve runtime verification for memref.subview (#132545)
This commit addresses a TODO in the runtime verification of
`memref.subview`. Each dimension is now verified: the offset must be
in-bounds and the slice must not run out-of-bounds.

This commit aligns runtime verification with static op verification
(which was improved in #133086).
2025-03-31 10:24:30 -07:00
Matthias Springer
a810141281
[mlir][memref] Add runtime verification for memref.assume_alignment (#130412)
Implement runtime verification for `memref.assume_alignment`.
2025-03-19 21:23:40 +01:00