28 Commits

Author SHA1 Message Date
Maksim Levental
c610b24493
[mlir][NFC] update mlir/Dialect create APIs (27/n) (#150638)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-25 11:48:32 -05:00
Alan Li
1c3e4e994b
Reapply "[AMDGPU] fold memref.subview/expand_shape/collapse_shape into amdgpu.gather_to_lds" (#150334)
This is a reapply of patch #149851. The reapply also fixes a CMake/Bazel
build issue, which was the reason of the revert. (Thanks @rupprecht )

Original patch (#149851) message:
-----
This PR adds a new optimization pass to fold
`memref.subview/expand_shape/collapse_shape` ops into consumer
`amdgpu.gather_to_lds` operations.
* Implements a new pass `AmdgpuFoldMemRefOpsPass` with pattern
`FoldMemRefOpsIntoGatherToLDSOp`
* Adds corresponding folding tests
2025-07-24 09:23:15 -04:00
Alan Li
9cb5c00bf7
Revert "[AMDGPU] fold memref.subview/expand_shape/collapse_shape in… (#150256)
…to `amdgpu.gather_to_lds` (#149851)"

This reverts commit dbc63f1e3724b6f2348c431dc1216537d9c042e8.

Having build deps issue.
2025-07-23 12:50:26 -04:00
Alan Li
dbc63f1e37
[AMDGPU] fold memref.subview/expand_shape/collapse_shape into amdgpu.gather_to_lds (#149851)
This PR adds a new optimization pass to fold
`memref.subview/expand_shape/collapse_shape` ops into consumer
`amdgpu.gather_to_lds` operations.

* Implements a new pass `AmdgpuFoldMemRefOpsPass` with pattern
`FoldMemRefOpsIntoGatherToLDSOp`
* Adds corresponding folding tests

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-23 11:22:41 -04:00
Kazu Hirata
d5def016b6
[llvm] Remove unused includes (NFC) (#148342)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-12 11:28:55 -07:00
Vitalii Shutov
9e704a0aa1
[MLIR][MemRef] Add alloca support for erase_dead_alloc_and_stores (#142131)
Previously, `erase_dead_alloc_and_stores` didn't support
`memref.alloca`. This patch introduces support for it.

---------

Signed-off-by: Vitalii Shutov <vitalii.shutov@arm.com>
2025-06-23 13:44:20 +01:00
Zhuoran Yin
53e8ff13bd
[MLIR] Fixing the memref linearization size computation for non-packed memref (#138922)
Credit to @krzysz00 who discovered this subtle bug in `MemRefUtils`. The
problem is in `getLinearizedMemRefOffsetAndSize()` utility. In
particular, how this subroutine computes the linearized size of a memref
is incorrect when given a non-packed memref.

### Background

As context, in a packed memref of `memref<8x8xf32>`, we'd compute the
size by multiplying the size of dimensions together. This is implemented
by composing an affine_map of `affine_map<()[s0, s1] -> (s0 * s1)>` and
then computing the result of size via `%size = affine.apply #map()[%c8,
%c8]`.

However, this is wrong for a non-packed memref of `memref<8x8xf32,
strided<[1024, 1]>>`. Since the previous computed multiplication map
will only consider the dimension sizes, it'd continue to conclude that
the size of the non-packed memref to be 64.

### Solution

This PR come up with a fix such that the linearized size computation
take strides into consideration. It computes the maximum of (dim size *
dim stride) for each dimension. We'd compute the size via the affine_map
of `affine_map<()[stride0, size0, stride1] -> ((stride0 * size0), 1 *
size1)>` and then computing the size via `%size = affine.max
#map()[%stride0, %size0, %size1]`. In particular for the new non-packed
memref, the size will be derived as max(1024\*8, 1\*8) = 8192 (rather
than the wrong size 64 computed by packed memref equation).
2025-05-08 13:14:32 -04:00
Kazu Hirata
eb7f51485e
[mlir] Use llvm::append_range (NFC) (#135722) 2025-04-14 22:22:04 -07:00
Matthias Springer
6aaa8f25b6
[mlir][IR][NFC] Move free-standing functions to MemRefType (#123465)
Turn free-standing `MemRefType`-related helper functions in
`BuiltinTypes.h` into member functions.
2025-01-21 08:48:09 +01:00
lialan
2c313259c6
[MLIR] VectorEmulateNarrowType to support loading of unaligned vectors (#113411)
Previously, the pass only supported emulation of loading vector sizes
that are multiples of the emulated data type. This patch expands its
support for emulating sizes that are not multiples of byte sizes. In
such cases, the element values are packed back-to-back to preserve
memory space.

To give a concrete example: if an input has type `memref<3x3xi2>`, it is
actually occupying 3 bytes in memory, with the first 18 bits storing the
values and the last 6 bits as padding. The slice of `vector<3xi2>` at
index `[2, 0]` is stored in memory from bit 12 to bit 18. To properly
load the elements from bit 12 to bit 18 from memory, first load byte 2
and byte 3, and convert it to a vector of `i2` type; then extract bits 4
to 10 (element index 2-5) to form a `vector<3xi2>`.

A limitation of this patch is that the linearized index of the unaligned
vector has to be known at compile time. Extra code needs to be emitted
to handle it if the condition does not hold.

The following ops are updated:
* `vector::LoadOp`
* `vector::TransferReadOp`
* `vector::MaskedLoadOp`
2024-10-29 20:04:48 -07:00
Quinn Dawkins
4e2efea5e8
[mlir][vector] Add all view-like ops to transfer flow opt (#110521)
`vector.transfer_*` folding and forwarding currently does not take into
account reshaping view-like memref ops (expand and collapse shape),
leading to potentially invalid store folding or value forwarding. This
patch adds tracking for those (and other) view-like ops. It is still
possible to design operations that alias memrefs without being a view
(e.g. memref in the iter_args of an `scf.for`), so these patterns may
still need revisiting in the future.
2024-10-02 00:20:44 -04:00
Han-Chung Wang
e3c9c82ce8
[mlir][MemRef] Extend memref.subview sub-byte type emulation support. (#94045)
In some cases (see https://github.com/iree-org/iree/issues/16285),
`memref.subview` ops can't be folded into transfer ops and sub-byte type
emulation fails. This issue has been blocking a few things, including
the enablement of vector flattening transformations
(https://github.com/iree-org/iree/pull/16456). This PR extends the
existing sub-byte type emulation support of `memref.subview` to handle
multi-dimensional subviews with dynamic offsets and addresses the issues
for some of the `memref.subview` cases that can't be folded.

Co-authored-by: Diego Caballero <diegocaballero@google.com>
2024-06-03 22:02:15 -07:00
Benjamin Maxwell
90d2f8c630
[mlir][vector] Teach TransferOptimization to look through trivial aliases (#87805)
This allows `TransferOptimization` to eliminate and forward stores that
are to trivial aliases (rather than just to identical memref values).

A trivial aliases is (currently) defined as:

  1. A `memref.cast`
  2. A `memref.subview` with a zero offset and unit strides
  3. A chain of 1 and 2
2024-05-16 10:53:14 +01:00
Prathamesh Tagore
6ed8434edc
[mlir][fold-memref-alias-ops] Add support for folding memref.expand_shape involving dynamic dims (#89093)
`fold-memref-alias-ops` bails out in presence of dynamic shapes in
`memref.expand_shape` op. Handle this case.
2024-05-08 07:24:43 -07:00
Hanhan Wang
c5dee18b63 [mlir][memref] Add support for erasing dead allocations.
Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D159135
2023-09-01 13:30:26 -07:00
Adrian Kuegel
6cde64a949 [mlir] Apply ClangTidy fix (NFC)
Prefer to use empty() instead of checking size() > 0.
2023-08-29 09:33:48 +02:00
Jie Fu
c730c62715 [mlir] Fix -Wctad-maybe-unsupported in MemRefUtils.cpp (NFC)
/Users/jiefu/llvm-project/mlir/lib/Dialect/MemRef/Utils/MemRefUtils.cpp:56:3: error: 'SmallVector' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported]
  SmallVector indicesVec = llvm::to_vector(indices);
  ^
/Users/jiefu/llvm-project/mlir/include/mlir/Support/LLVM.h:69:7: note: add a deduction guide to suppress this warning
class SmallVector;
      ^
1 error generated.
2023-08-18 07:13:48 +08:00
Mahesh Ravishankar
0f8bab8d59 [mlir] Revamp implementation of sub-byte load/store emulation.
When handling sub-byte emulation, the sizes of the converted `memref`s
also need to be updated (this was not done in the current
implementation). This adds the additional complexity of having to
linearize the `memref`s as well. Consider a `memref<3x3xi4>` where the
`i4` elements are packed. This has a overall size of 5 bytes (rounded
up to number of bytes). This can only be represented by a
`memref<5xi8>`. A `memref<3x2xi8>` would imply an implicit padding of
4 bits at the end of each row. So incorporate linearization into the
sub-byte load-store emulation.

This patch also updates some of the utility functions to make better
use of statically available information using `OpFoldResult` and
`makeComposedFoldedAffineApplyOps`.

Reviewed By: hanchung, yzhang93

Differential Revision: https://reviews.llvm.org/D158125
2023-08-17 20:27:53 +00:00
Hanhan Wang
8fc433f055 [mlir][MemRef] Move narrow type emulation common methods to MemRefUtils.
It also unifies the computation of StridedLayoutAttr. If the stride is
static known value, we can just use it.

Differential Revision: https://reviews.llvm.org/D155017
2023-07-13 14:43:21 -07:00
Kazu Hirata
1c983af96a [mlir] Fix a warning
This patch fixes:

  mlir/lib/Dialect/MemRef/Utils/MemRefUtils.cpp:45:2: error: extra ';'
  outside of a function is incompatible with C++98
  [-Werror,-Wc++98-compat-extra-semi]
2023-05-15 10:06:15 -07:00
Oleg Shyshkov
b4d6aada62 [mlir][memref] Extract isStaticShapeAndContiguousRowMajor as a util function.
Differential Revision: https://reviews.llvm.org/D150543
2023-05-15 17:09:04 +02:00
Mehdi Amini
2c9783c6b9 Remove empty MLIRMemRefUtils library (NFC) 2023-02-09 19:15:21 -08:00
Uday Bondhugula
af9f7d319b NFC. Clean up memref utils library
NFC. Clean up memref utils library. This library had a single function
that was completely misplaced. MemRefUtils is expected to be (also per
its comment) a library providing analysis/transforms utilities on memref
dialect ops or memref types. However, in reality it had a helper that
was depended upon by the MemRef dialect, i.e., it was a helper for the
dialect ops library and couldn't contain anything that itself depends on
the MemRef dialect. Move the single method to the memref dialect that
will now allow actual utilities depending on the memref dialect to be
placed in it.

Put findDealloc in the `memref` namespace. This is a pure move.

Differential Revision: https://reviews.llvm.org/D121273
2022-03-09 16:00:39 +05:30
Rahul Joshi
f8d3755f00 [MLIR][memref] Fix findDealloc() to handle > 1 dealloc for the given alloc.
- Change findDealloc() to return Optional<Operation *> and return None if > 1
  dealloc is associated with the given alloc.
- Add findDeallocs() to return all deallocs associated with the given alloc.
- Fix current uses of findDealloc() to bail out if > 1 dealloc is found.

Differential Revision: https://reviews.llvm.org/D106456
2021-07-22 09:34:19 -07:00
Lei Zhang
0deeaaca39 [mlir] Move memref.subview patterns to MemRef/Transforms/
These patterns have been used as a prerequisite step for lowering
to SPIR-V. But they don't involve SPIR-V dialect ops; they are
pure memref/vector op transformations. Given now we have a dedicated
MemRef dialect, moving them to Memref/Transforms/, which is a more
suitable place to host them, to allow used by others.

This commit just moves code around and renames patterns/passes
accordingly. CMakeLists.txt for existing MemRef libraries are
also improved along the way.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D100326
2021-04-12 16:38:22 -04:00
Alexander Belyaev
465b9a4a33 Revert "Revert "[mlir] Introduce CloneOp and adapt test cases in BufferDeallocation.""
This reverts commit 883912abe669ef246ada0adc9cf1c9748b742400.
2021-03-31 09:49:09 +02:00
Alexander Belyaev
883912abe6 Revert "[mlir] Introduce CloneOp and adapt test cases in BufferDeallocation."
This reverts commit 06b03800f3fcbf49f5ddd4145b40f04e4ba4eb42.
Until some kind of support for region args is added.
2021-03-29 12:47:59 +02:00
Julian Gross
06b03800f3 [mlir] Introduce CloneOp and adapt test cases in BufferDeallocation.
Add a new clone operation to the memref dialect. This operation implicitly
copies data from a source buffer to a new buffer. In contrast to the linalg.copy
operation, this operation does not accept a target buffer as an argument.
Instead, this operation performs a conceptual allocation which does not need to
be performed manually.

Furthermore, this operation resolves the dependency from the linalg-dialect
in the BufferDeallocation pass. In addition, we also extended the canonicalization
patterns to fold clone operations. The copy removal pass has been removed.

Differential Revision: https://reviews.llvm.org/D99172
2021-03-29 10:19:10 +02:00