479 Commits

Scott Manley
e72335192d
[Arith][MemRef] add AtomicRMWKind::xori to enum (#151701)
Add missing xor AtomicRMWKind enum in arith. Also add support for xor to
memref.atomic_rmw so the change can be tested.

This does NOT add it for all users of the enum (e.g. Affine, Vector)
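
A minimal sketch of the new kind in IR (hypothetical function; assuming the new enum value prints as `xori`, matching the other integer kinds):

```mlir
func.func @atomic_xor(%val: i32, %buf: memref<16xi32>, %i: index) -> i32 {
  // Atomically XOR %val into %buf[%i] and return the previous value.
  %old = memref.atomic_rmw xori %val, %buf[%i] : (i32, memref<16xi32>) -> i32
  return %old : i32
}
```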
2025-08-11 08:46:06 -04:00
lonely eagle
2adbf9e92b
[mlir][memref] Support test-compose-subview dynamic size (#146881)
Supports the case where the sizes of the subview op are dynamic. When there are more for loops in the tiling algorithm, multiple subviews are performed, and test-compose-subview did not work when the size operands of the subview ops are dynamic values.
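
A minimal sketch of the chained-subview IR this now handles (hypothetical function and shapes; the test pass composes the two subviews into one even though the sizes are dynamic):

```mlir
func.func @chained_subviews(%src: memref<128x128xf32>, %o: index, %sz: index)
    -> memref<?x32xf32, strided<[128, 1], offset: ?>> {
  // Outer tile with a dynamic offset and size.
  %0 = memref.subview %src[%o, 0] [%sz, 64] [1, 1]
      : memref<128x128xf32> to memref<?x64xf32, strided<[128, 1], offset: ?>>
  // Inner tile of the outer tile, again with a dynamic size.
  %1 = memref.subview %0[0, 16] [%sz, 32] [1, 1]
      : memref<?x64xf32, strided<[128, 1], offset: ?>>
        to memref<?x32xf32, strided<[128, 1], offset: ?>>
  return %1 : memref<?x32xf32, strided<[128, 1], offset: ?>>
}
```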
2025-07-28 16:58:45 +08:00
Maksim Levental
c090ed53fb
[mlir][NFC] update mlir/Dialect create APIs (33/n) (#150659)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-25 16:13:55 -04:00
Maksim Levental
c610b24493
[mlir][NFC] update mlir/Dialect create APIs (27/n) (#150638)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-25 11:48:32 -05:00
Maksim Levental
a636b7bfdd
[mlir][NFC] update mlir/Dialect create APIs (18/n) (#149925)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-24 15:38:30 -05:00
Alan Li
1c3e4e994b
Reapply "[AMDGPU] fold memref.subview/expand_shape/collapse_shape into amdgpu.gather_to_lds" (#150334)
This is a reapply of patch #149851. The reapply also fixes a CMake/Bazel build issue, which was the reason for the revert. (Thanks @rupprecht)

Original patch (#149851) message:
-----
This PR adds a new optimization pass to fold
`memref.subview/expand_shape/collapse_shape` ops into consumer
`amdgpu.gather_to_lds` operations.
* Implements a new pass `AmdgpuFoldMemRefOpsPass` with pattern
`FoldMemRefOpsIntoGatherToLDSOp`
* Adds corresponding folding tests
2025-07-24 09:23:15 -04:00
Kazu Hirata
0925d7572a
[mlir] Remove unused includes (NFC) (#150266)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:53 -07:00
Alan Li
9cb5c00bf7
Revert "[AMDGPU] fold memref.subview/expand_shape/collapse_shape in… (#150256)
…to `amdgpu.gather_to_lds` (#149851)"

This reverts commit dbc63f1e3724b6f2348c431dc1216537d9c042e8.

This revert is due to a build dependency issue.
2025-07-23 12:50:26 -04:00
Alan Li
dbc63f1e37
[AMDGPU] fold memref.subview/expand_shape/collapse_shape into amdgpu.gather_to_lds (#149851)
This PR adds a new optimization pass to fold
`memref.subview/expand_shape/collapse_shape` ops into consumer
`amdgpu.gather_to_lds` operations.

* Implements a new pass `AmdgpuFoldMemRefOpsPass` with pattern
`FoldMemRefOpsIntoGatherToLDSOp`
* Adds corresponding folding tests

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-23 11:22:41 -04:00
lonely eagle
09bea21d95
[mlir][memref] Simplify memref.copy canonicalization (#149506)
FoldCopyOfCast has both an OpRewritePattern implementation and a folder implementation. This PR removes the OpRewritePattern implementation.
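
A sketch of the pattern the remaining folder covers (hypothetical IR; casts that only erase static information are bypassed):

```mlir
func.func @copy_of_cast(%src: memref<16xf32>, %dst: memref<16xf32>) {
  %a = memref.cast %src : memref<16xf32> to memref<?xf32>
  %b = memref.cast %dst : memref<16xf32> to memref<?xf32>
  // Folds to: memref.copy %src, %dst : memref<16xf32> to memref<16xf32>
  memref.copy %a, %b : memref<?xf32> to memref<?xf32>
  return
}
```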
2025-07-19 07:11:10 +08:00
Quinn Dawkins
b1ef5a8890
[mlir][MemRef] Add support for emulating narrow floats (#148036)
This enables memref.load/store + vector.load/store support for sub-byte
float types. Since the memref types don't matter for loads/stores, we
still use the same types as integers with equivalent widths, with a few
extra bitcasts needed around certain operations.

There is no direct change needed for vector.load/store support. The
tests added for them are to verify that float types are
supported as well.
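
As a sketch, the kind of IR that narrow-type emulation can now handle (hypothetical function; assuming `f4E2M1FN` as the sub-byte float type):

```mlir
func.func @subbyte_float(%m: memref<32xf4E2M1FN>, %i: index, %v: f4E2M1FN) -> f4E2M1FN {
  // Emulation rewrites these accesses onto an equivalent-width integer
  // memref, inserting bitcasts as described above.
  memref.store %v, %m[%i] : memref<32xf4E2M1FN>
  %r = memref.load %m[%i] : memref<32xf4E2M1FN>
  return %r : f4E2M1FN
}
```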
2025-07-14 11:18:51 -04:00
Kazu Hirata
d5def016b6
[llvm] Remove unused includes (NFC) (#148342)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-12 11:28:55 -07:00
Jakub Kuderski
6512ca7ddb
[mlir] Add isStatic* size check for ShapedTypes. NFCI. (#147085)
The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow for `ShapedType::isStaticDim` to be used in
ADT functions without having to wrap it in a lambda performing the
negation.

Also add the new functions to C and Python bindings.
2025-07-07 14:57:27 -04:00
Amir Bishara
c43efcb039
[MLIR][MemRef]-Add basic folding for memref ViewOp (#146237)
Add basic folding for MemRef::ViewOp where the source memref type and the result memref type are similar.
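
A minimal sketch of a case that can fold away (hypothetical; a view whose result type matches its source with a zero byte shift, assuming this is what "similar" covers):

```mlir
func.func @noop_view(%buf: memref<2048xi8>) -> memref<2048xi8> {
  %c0 = arith.constant 0 : index
  // Same element type and shape as the source; the folder forwards %buf.
  %v = memref.view %buf[%c0][] : memref<2048xi8> to memref<2048xi8>
  return %v : memref<2048xi8>
}
```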
2025-07-05 23:33:45 +03:00
Nicolas Vasilache
08cf6ae537
[mlir][memref] Add a new ReifyResultShapes pass (#145927)
This pass reifies the shapes of a subset of
`ReifyRankedShapedTypeOpInterface` ops with `tensor` results.

The pass currently only supports result shape type reification for:
   - tensor::PadOp
   - tensor::ConcatOp

It addresses a representation gap where implicit op semantics are needed
to infer static result types from dynamic
operands. But it does so by using `ReifyRankedShapedTypeOpInterface` as
the source of truth rather than the op itself.
As a consequence, this cannot generalize today.

TODO: in the future, we should consider coupling this information with
op "transfer functions" (e.g.
`IndexingMapOpInterface`) to provide a source of truth that can work
across result shape inference, canonicalization and
op verifiers.

The pass replaces the operations with their reified versions when more static information can be derived, and inserts casts when result shapes are updated.

Example:
```mlir
  #map = affine_map<(d0) -> (-d0 + 256)>
  func.func @func(%arg0: f32, %arg1: index, %arg2: tensor<64x?x64xf32>) -> tensor<1x?x64xf32> {
    %0 = affine.apply #map(%arg1)
    %extracted_slice = tensor.extract_slice %arg2[0, 0, 0] [1, %arg1, 64] [1, 1, 1] : tensor<64x?x64xf32> to tensor<1x?x64xf32>
    %padded = tensor.pad %extracted_slice low[0, 0, 0] high[0, %0, 0] {
    ^bb0(%arg3: index, %arg4: index, %arg5: index):
      tensor.yield %arg0 : f32
    } : tensor<1x?x64xf32> to tensor<1x?x64xf32>
    return %padded : tensor<1x?x64xf32>
  }

  // mlir-opt --reify-result-shapes
  #map = affine_map<()[s0] -> (-s0 + 256)>
  func.func @func(%arg0: f32, %arg1: index, %arg2: tensor<64x?x64xf32>) -> tensor<1x?x64xf32> {
    %0 = affine.apply #map()[%arg1]
    %extracted_slice = tensor.extract_slice %arg2[0, 0, 0] [1, %arg1, 64] [1, 1, 1] : tensor<64x?x64xf32> to tensor<1x?x64xf32>
    %padded = tensor.pad %extracted_slice low[0, 0, 0] high[0, %0, 0] {
    ^bb0(%arg3: index, %arg4: index, %arg5: index):
      tensor.yield %arg0 : f32
    } : tensor<1x?x64xf32> to tensor<1x256x64xf32>
    %cast = tensor.cast %padded : tensor<1x256x64xf32> to tensor<1x?x64xf32>
    return %cast : tensor<1x?x64xf32>
  }
  ```

---------

Co-authored-by: Fabian Mora <fabian.mora-cordero@amd.com>
2025-07-01 15:39:21 +02:00
Uday Bondhugula
80625c16f0
[MLIR][Affine] Fix memref replacement in affine-data-copy-generate (#139016)
Fixes: https://github.com/llvm/llvm-project/issues/130257

Fix affine-data-copy-generate in certain cases that involved users in
multiple blocks. Perform the memref replacement correctly during copy
generation.

Improve/clean up memref affine use replacement API. Instead of
supporting dominance and post dominance filters (which aren't adequate
in most cases) and computing dominance info expensively each time in
RAMUW, provide a user filter callback, i.e., force users to compute
dominance if needed.
2025-06-28 10:27:11 +05:30
long.chen
aed8f1992a
[NFC][mlir][memref] refine debug message about memref::SubViewOp. (#145470) 2025-06-27 18:34:45 +08:00
Jack Frankland
022e1e99f3
[mlir][memref]: Fix Bug in GlobalOp Verifier (#144900)
When comparing the type of the initializer in a `memref::GlobalOp` against its result, only consider the element type and the shape. Other attributes such as memory space should be ignored, since comparing these between tensors and memrefs doesn't make sense, and constructing a memref in a specific memory space from a tensor that has no such attribute should be valid.
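
For example, a global like the following (hypothetical), whose memref result carries a memory space while the dense tensor initializer has none, now verifies:

```mlir
memref.global "private" constant @lut : memref<4xf32, 1> = dense<[0.0, 1.0, 2.0, 3.0]>
```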

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
2025-06-25 15:14:55 +01:00
Vitalii Shutov
9e704a0aa1
[MLIR][MemRef] Add alloca support for erase_dead_alloc_and_stores (#142131)
Previously, `erase_dead_alloc_and_stores` didn't support
`memref.alloca`. This patch introduces support for it.
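
A sketch of IR the transform can now clean up (hypothetical function; an alloca whose only uses are stores):

```mlir
func.func @dead_alloca(%v: f32) {
  %c0 = arith.constant 0 : index
  // Only written to, never read: both the store and the alloca are erased.
  %a = memref.alloca() : memref<4xf32>
  memref.store %v, %a[%c0] : memref<4xf32>
  return
}
```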

---------

Signed-off-by: Vitalii Shutov <vitalii.shutov@arm.com>
2025-06-23 13:44:20 +01:00
Artemiy Bulavin
26f3f24a4f
[MLIR][NFC] Declare RuntimeVerifiableOpInterface for memref ops that have an implementation (#145230)
Previously running `-generate-runtime-verification` on an IR containing
`memref.reinterpret_cast` would crash because its implementation of the
`RuntimeVerifiableOpInterface` was removed in
https://github.com/llvm/llvm-project/pull/132547 but its associated
entry in `declarePromisedInterface` was never removed.

This causes an error when you try to run `-generate-runtime-verification` on an IR containing `memref.reinterpret_cast` that looks like:

```
LLVM ERROR: checking for an interface (`mlir::RuntimeVerifiableOpInterface`) that was promised by dialect 'memref' but never implemented. This is generally an indication that the dialect extension implementing the interface was never registered.
```
as reported in https://github.com/llvm/llvm-project/issues/144028.

In this PR I also added all the ops that do have implementations of this
interface in
`mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp` to the
`declarePromisedInterface` for consistency.

Fixes https://github.com/llvm/llvm-project/issues/144028
2025-06-23 14:02:49 +08:00
Oleksandr "Alex" Zinenko
8a469da8b2
[mlir] remove unnecessary atomic_rmw expansions (#144515)
The expansion of `memref.atomic_rmw` into a `memref.generic_atomic_rmw`
for floating-point min/max operations is no longer necessary as those
are now supported by the LLVM dialect and LLVM IR.

Furthermore, combining this expansion with direct lowering of `generic_atomic_rmw` could lead to invalid LLVM dialect IR with `cmpxchg` operating on floating-point values that it does not support.
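
For reference, a floating-point max atomic that previously went through the `generic_atomic_rmw` expansion and now lowers directly (sketch; assuming `maximumf` as the kind keyword):

```mlir
func.func @fmax_atomic(%v: f32, %m: memref<64xf32>, %i: index) -> f32 {
  // No longer expanded into memref.generic_atomic_rmw + cmpxchg.
  %old = memref.atomic_rmw maximumf %v, %m[%i] : (f32, memref<64xf32>) -> f32
  return %old : f32
}
```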
2025-06-18 13:32:46 +02:00
Han-Chung Wang
58ea53863b
[mlir][memref] Add a folder for chained AssumeAlignmentOp ops. (#142425)
The chained ops can be folded away when they have the same alignment.
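
A sketch of the chained case (hypothetical IR; assuming the result-producing form of the op introduced in #139521):

```mlir
func.func @chained_assume(%m: memref<128xf32>) -> memref<128xf32> {
  %0 = memref.assume_alignment %m, 64 : memref<128xf32>
  // Same alignment as above, so this op folds away and forwards %0.
  %1 = memref.assume_alignment %0, 64 : memref<128xf32>
  return %1 : memref<128xf32>
}
```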

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-06-02 21:09:42 -07:00
Longsheng Mou
26b81c4300
[mlir][memref] Add terminator check to prevent a crash (#141972)
This PR adds terminator check to prevent a crash when invoke
`lastNonTerminatorInRegion`. Fixes #137333.
2025-05-31 13:25:42 +08:00
Alan Li
6fd3c20d25
[MLIR] Add a utility pass to linearize memref (#136797)
To add a transformation that simplifies memory access patterns, this PR
adds a memref linearizer which is based on the GPU/DecomposeMemRefs
pass, with the following changes:
* support vector dialect ops
* instead of decomposing memrefs to rank-0 memrefs, flatten higher-ranked memrefs to rank-1.

Notes:
* After the linearization, a MemRef's offset is kept, so a
`memref<4x8xf32, strided<[8, 1], offset: 100>>` becomes `memref<32xf32,
strided<[1], offset: 100>>`.
* It also works with dynamic shapes and strides and offsets (see test
cases for details).
* The shape of the casted memref is computed as 1d, flattened.
2025-05-22 13:05:37 -04:00
Han-Chung Wang
c39915fa2e
[mlir][NFC] Simplify constant checks with isOneInteger and renamed isZeroInteger. (#139340)
The revision adds an isOneInteger helper and simplifies the existing code with the two methods. It removes some lambdas, which makes the code cleaner.

For downstream users, you can update the code with the below script.

```bash
sed -i "s/isZeroIndex/isZeroInteger/g" **/*.h
sed -i "s/isZeroIndex/isZeroInteger/g" **/*.cpp
```

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-05-20 14:53:02 -07:00
Kazu Hirata
6d515ce827
[mlir] Use llvm::all_of (NFC) (#140464) 2025-05-18 18:13:49 -07:00
Shay Kleiman
ffb9bbfd07
[mlir][MemRef] Changed AssumeAlignment into a Pure ViewLikeOp (#139521)
Made AssumeAlignment a ViewLikeOp that returns a new SSA memref equal to its memref argument, and gave it the Pure trait. This gives it a defined memory effect that matches what it does in practice, and makes it behave nicely with optimizations, which won't get rid of it unless its result is unused.
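
A sketch of the new result-producing form (hypothetical IR; consumers use the returned memref so the assumption stays live):

```mlir
func.func @aligned_load(%m: memref<128xf32>, %i: index) -> f32 {
  // %aligned is a new SSA memref equal to %m, carrying the alignment assumption.
  %aligned = memref.assume_alignment %m, 64 : memref<128xf32>
  %v = memref.load %aligned[%i] : memref<128xf32>
  return %v : f32
}
```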
2025-05-18 13:50:29 +03:00
Kazu Hirata
25348394bb [mlir] Fix a warning
This patch fixes:

  mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp:106:14:
  error: unused variable 'sourceType' [-Werror,-Wunused-variable]
2025-05-13 13:43:10 -07:00
Krzysztof Drewniak
a891163e50
[mlir][MemRef] Use specialized index ops to fold expand/collapse_shape (#138930)
This PR updates the FoldMemRefAliasOps to use `affine.linearize_index`
and `affine.delinearize_index` to perform the index computations needed
to fold a `memref.expand_shape` or `memref.collapse_shape` into its
consumers, respectively.

This also loosens some limitations of the pass:
1. The existing `output_shape` argument to `memref.expand_shape` is now
used, eliminating the need to re-infer this shape or call `memref.dim`.
2. Because we're using `affine.delinearize_index`, the restriction that
each group in a `memref.collapse_shape` can only have one dynamic
dimension is removed.
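
A sketch of an input that now folds even with multiple dynamic dimensions in a reassociation group (hypothetical function; the pass rewrites the load to index %m directly, using `affine.delinearize_index` to split %i):

```mlir
func.func @load_via_collapse(%m: memref<?x?xf32>, %i: index) -> f32 {
  // Both dimensions of the collapsed group are dynamic.
  %flat = memref.collapse_shape %m [[0, 1]] : memref<?x?xf32> into memref<?xf32>
  %v = memref.load %flat[%i] : memref<?xf32>
  return %v : f32
}
```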
2025-05-13 13:28:53 -05:00
Andrzej Warzyński
c45cc3e420
[mlir][vector] Standardize base Naming Across Vector Ops (NFC) (#137859)

This change standardizes the naming convention for the argument
representing the value to read from or write to in Vector ops that
interface with Tensors or MemRefs. Specifically, it ensures that all
such ops use the name `base` (i.e., the base address or location to
which offsets are applied).

Updated operations:

* `vector.transfer_read`,
* `vector.transfer_write`.

For reference, these ops already use `base`:

* `vector.load`, `vector.store`, `vector.scatter`, `vector.gather`,
  `vector.expandload`, `vector.compressstore`, `vector.maskedstore`,
  `vector.maskedload`.

This is a non-functional change (NFC) and does not alter the semantics of these
operations. However, it does require users of the XFer ops to switch from
`op.getSource()` to `op.getBase()`.

To ease the transition, this PR temporarily adds a `getSource()` interface
method for compatibility. This is intended for downstream use only and should
not be relied on upstream. The method will be removed prior to the LLVM 21
release.

Implements #131602
2025-05-12 09:44:50 +01:00
Zhuoran Yin
53e8ff13bd
[MLIR] Fixing the memref linearization size computation for non-packed memref (#138922)
Credit to @krzysz00 who discovered this subtle bug in `MemRefUtils`. The problem is in the `getLinearizedMemRefOffsetAndSize()` utility. In particular, how this subroutine computes the linearized size of a memref is incorrect when given a non-packed memref.

### Background

As context, in a packed memref of `memref<8x8xf32>`, we'd compute the
size by multiplying the size of dimensions together. This is implemented
by composing an affine_map of `affine_map<()[s0, s1] -> (s0 * s1)>` and
then computing the result of size via `%size = affine.apply #map()[%c8,
%c8]`.

However, this is wrong for a non-packed memref such as `memref<8x8xf32, strided<[1024, 1]>>`. Since the previously computed multiplication map only considers the dimension sizes, it would still conclude that the size of the non-packed memref is 64.

### Solution

This PR comes with a fix such that the linearized size computation takes strides into consideration. It computes the maximum of (dim size * dim stride) over all dimensions. We compute the size via an affine_map of `affine_map<()[stride0, size0, size1] -> (stride0 * size0, 1 * size1)>` and then obtain the result via `%size = affine.max #map()[%stride0, %size0, %size1]`. In particular, for the new non-packed memref, the size is derived as max(1024*8, 1*8) = 8192 (rather than the wrong size of 64 computed by the packed-memref equation).
2025-05-08 13:14:32 -04:00
Kazu Hirata
921d162460
[mlir] Remove unused local variables (NFC) (#138642) 2025-05-06 07:55:50 -07:00
Matthias Springer
fd161cf56f
[mlir][memref] Remove runtime verification for memref.reinterpret_cast (#132547)
The runtime verification code used to verify that the result of a
`memref.reinterpret_cast` is in-bounds with respect to the source
memref. This is incorrect: `memref.reinterpret_cast` allows users to
construct almost arbitrary memref descriptors and there is no
correctness expectation.

This op is supposed to be used when the user "knows what they are
doing." Similarly, the static verifier of `memref.reinterpret_cast` does
not verify in-bounds semantics either.
2025-05-06 09:40:28 +02:00
Kazu Hirata
15f7c6ed70
[mlir] Remove unused local variables (NFC) (#138481) 2025-05-05 10:08:00 -07:00
Simon Camphausen
0009a17834
[mlir][EmitC] Add pass that combines all available emitc conversions (#117549) 2025-05-01 07:24:01 -07:00
Matthias Springer
120e940356
[mlir][memref] Add runtime verification for memref.atomic_rmw (#130414)
Implement runtime verification for `memref.atomic_rmw` and
`memref.generic_atomic_rmw`. Also add a missing test for `memref.store`.
2025-04-30 13:45:11 +02:00
Arnab Dutta
99cb3f7ac6
[mlir] Add memref normalization support for reinterpret_cast op (#133417)
Rewrites memrefs defined by reinterpret_cast ops to have an identity layout map and updates all their indexing uses. Also, extend the `replaceAllMemRefUsesWith` utility to work when there are multiple occurrences of `oldMemRef` in `op`'s operand list when `op` is non-dereferencing.

Fixes #122090 
Fixes #121091
2025-04-30 14:13:09 +05:30
lorenzo chelini
8502ba1eb4
[MLIR][NFC] Retire let constructor for MemRef (#134788)
`let constructor` is legacy (do not use in tree!) since the TableGen backend emits most of the glue logic to build a pass.

Note: The following constructor has been retired:

```cpp
std::unique_ptr<Pass> createExpandReallocPass(bool emitDeallocs = true);
```
    
To update your codebase, replace it with the new options-based API:
    
```cpp
memref::ExpandReallocPassOptions expandAllocPassOptions{
          /*emitDeallocs=*/false};
pm.addPass(memref::createExpandReallocPass(expandAllocPassOptions));
```
2025-04-23 16:50:00 +02:00
Matthias Springer
62d32c2c27
[mlir][memref][NFC] Simplify constifyIndexValues (#135940)
Simplify the code by removing function pointers.
2025-04-17 08:58:48 +02:00
Kazu Hirata
eb7f51485e
[mlir] Use llvm::append_range (NFC) (#135722) 2025-04-14 22:22:04 -07:00
ivangarcia44
5083e80c14
Folding extract_strided_metadata input into reinterpret_cast (#134845)
We can always fold the input of an extract_strided_metadata operation into the input of a reinterpret_cast operation, because they point to the same memory. Note that the reinterpret_cast does not use the layout of its input memref, only its base memory pointer, which is the same as the base pointer returned by the extract_strided_metadata operation and the base pointer of the extract_strided_metadata memref input.

Operations like expand_shape, collapse_shape, and subview are lowered to a pair of extract_strided_metadata and reinterpret_cast like this:
%base_buffer, %offset, %sizes:2, %strides:2 =
    memref.extract_strided_metadata %input_memref : memref<ID1x...xIDNxBaseType>
    -> memref<f32>, index, index, index, index, index

%reinterpret_cast = memref.reinterpret_cast %base_buffer to offset: [%o1],
    sizes: [%d1,...,%dN], strides: [%s1,...,%sN]
    : memref<f32> to memref<OD1x...xODNxBaseType>

In many cases the input of the extract_strided_metadata operation can be passed directly into the input of the reinterpret_cast operation, like this (note how %base_buffer above is replaced by %input_memref in the reinterpret_cast and the input type is updated):

%base_buffer, %offset, %sizes:2, %strides:2 =
    memref.extract_strided_metadata %input_memref : memref<ID1x...xIDNxBaseType>
    -> memref<f32>, index, index, index, index, index
%reinterpret_cast = memref.reinterpret_cast %input_memref to offset: [%o1],
    sizes: [%d1,...,%dN], strides: [%s1,...,%sN]
    : memref<ID1x...xIDNxBaseType> to memref<OD1x...xODNxBaseType>

When dealing with static dimensions, the extract_strided_metadata will become dead code and we end up with only a reinterpret_cast:

%reinterpret_cast = memref.reinterpret_cast %input_memref to offset: [%o1],
    sizes: [%d1,...,%dN], strides: [%s1,...,%sN]
    : memref<ID1x...xIDNxBaseType> to memref<OD1x...xODNxBaseType>

Note that reinterpret_cast only reads the base memory pointer from the input memref (%input_memref above), which is equivalent to the %base_buffer returned by the extract_strided_metadata operation. Hence it is always legal to use the extract_strided_metadata input memref directly in the reinterpret_cast. Note that since this is a pointer, this transformation is legal even when the base pointer values are modified between the operation pair.

@matthias-springer 
@joker-eph 
@sahas3
@Hanumanth04
@dixinzhou
@rafaelubalmw

---------

Co-authored-by: Ivan Garcia <igarcia@vdi-ah2ddp-178.dhcp.mathworks.com>
2025-04-09 16:50:16 +02:00
Uday Bondhugula
37deb09593
[MLIR][Affine] Fix signatures of normalize memref utilities (#134466)
These methods were passing derived op types by pointer, which deviates from the usual style. While at it, fix obsolete comments on those methods.
2025-04-07 17:36:28 +05:30
Matthias Springer
bc3b1b06c6
[mlir][memref] Fix build after #132545 (#133760)
There was a typo in the error message.
2025-03-31 10:38:55 -07:00
Matthias Springer
5edf127384
[mlir][memref] Verify out-of-bounds access for memref.subview (#133086)
* Improve the verifier of `memref.subview` to detect out-of-bounds
extractions.
* Improve the documentation of `memref.subview` to make clear that
out-of-bounds extractions are not allowed. Rewrite examples to use the
new `strided<>` notation instead of `affine_map` layout maps. Also
remove all unrelated operations (`memref.alloc`) from the examples.
* Fix various test cases where `memref.subview` ops ran out-of-bounds.
* Update canonicalization patterns to ensure that they do not fold IR
if it would generate IR that no longer verifies.

Related discussion on Discourse:
https://discourse.llvm.org/t/out-of-bounds-semantics-of-memref-subview/85293

This is a re-upload of #131876, which was reverted due to failing GPU
tests. These tests were faulty and fixed in #133051.
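
As a sketch, an extraction the improved verifier now rejects (hypothetical types; the slice [8, 24) runs past the end of a 16-element memref):

```mlir
func.func @oob_subview(%m: memref<16xf32>) -> memref<16xf32, strided<[1], offset: 8>> {
  // Rejected: offset 8 + size 16 exceeds the source dimension of 16.
  %s = memref.subview %m[8] [16] [1]
      : memref<16xf32> to memref<16xf32, strided<[1], offset: 8>>
  return %s : memref<16xf32, strided<[1], offset: 8>>
}
```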
2025-03-31 10:28:55 -07:00
Matthias Springer
8b06da1682
[mlir][memref] Improve runtime verification for memref.subview (#132545)
This commit addresses a TODO in the runtime verification of
`memref.subview`. Each dimension is now verified: the offset must be
in-bounds and the slice must not run out-of-bounds.

This commit aligns runtime verification with static op verification
(which was improved in #133086).
2025-03-31 10:24:30 -07:00
Karlo Basioli
f6823a0ae1
Revert "[mlir][memref] Verify out-of-bounds access for memref.subview" (#132940)
Reverts llvm/llvm-project#131876

GPU integration tests are broken by this PR, e.g.
`mlir/test/Integration/GPU/CUDA/sm90/gemm_f32_f16_f16_128x128x128.mlir`
2025-03-25 14:56:08 +00:00
Matthias Springer
d4304d85f2
[mlir][memref] Verify out-of-bounds access for memref.subview (#131876)
* Improve the verifier of `memref.subview` to detect out-of-bounds
extractions.
* Improve the documentation of `memref.subview` to make clear that
out-of-bounds extractions are not allowed. Rewrite examples to use the
new `strided<>` notation instead of `affine_map` layout maps. Also
remove all unrelated operations (`memref.alloc`) from the examples.
* Fix various test cases where `memref.subview` ops ran out-of-bounds.
* Update canonicalization patterns to ensure that they do not fold IR
if it would generate IR that no longer verifies.

Related discussion on Discourse:
https://discourse.llvm.org/t/out-of-bounds-semantics-of-memref-subview/85293
2025-03-25 11:25:11 +01:00
Matthias Springer
a810141281
[mlir][memref] Add runtime verification for memref.assume_alignment (#130412)
Implement runtime verification for `memref.assume_alignment`.
2025-03-19 21:23:40 +01:00
Matthias Springer
e614e840bc
[mlir][memref] Add runtime verification for memref.dim (#130410)
Add runtime verification for `memref.dim`: check that the index is in
bounds.

Also simplify the pass pipeline for all memref runtime verification
checks.
2025-03-18 09:10:49 +01:00
Matthias Springer
6c867e27a7
[mlir] Use getSingleElement/hasSingleElement in various places (#131460)
This is a code cleanup. Update a few places in MLIR that should use
`hasSingleElement`/`getSingleElement`.

Note: `hasSingleElement` is faster than `.getSize() == 1` when it is
used with linked lists etc.

Depends on #131508.
2025-03-17 07:43:18 +01:00