125 Commits

Author SHA1 Message Date
Matthias Springer
2fcdabaf39
[mlir][DialectUtils] Fix div by zero crash (#153380) 2025-08-13 13:38:57 +02:00
Maya Amrami
e138c95155
[mlir] ViewLikeInterface - verify ranks in verifyOffsetSizeAndStrideOp (#147926)
getMixedOffsets() calls getMixedValues() with `static_offsets` and
`offsets`. It is assumed that the number of dynamic offsets in
`static_offsets` equals the number of entries in `offsets`. Otherwise, we
fail an assert when trying to access an array out of its bounds.
The same applies to getMixedSizes() and getMixedStrides().

A verification of this assumption is added to
verifyOffsetSizeAndStrideOp() and a clear assert is added in
getMixedValues().
2025-07-20 14:20:16 +03:00
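
A minimal standalone sketch of the interleaving that getMixedValues() performs (plain C++; the simplified signature and the `kDynamic` stand-in are illustrative assumptions, not the MLIR API), showing why the number of dynamic markers in the static array must match the number of dynamic values:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iostream>
#include <vector>

// Stand-in for ShapedType::kDynamic.
constexpr int64_t kDynamic = INT64_MIN;

// Simplified analogue of getMixedValues(): every kDynamic entry in
// `staticValues` consumes one entry from `dynamicValues`.
std::vector<int64_t> getMixedValues(const std::vector<int64_t> &staticValues,
                                    const std::vector<int64_t> &dynamicValues) {
  // This is the assumption the verifier now checks up front.
  assert(static_cast<size_t>(std::count(staticValues.begin(),
                                        staticValues.end(), kDynamic)) ==
             dynamicValues.size() &&
         "expected one dynamic value per dynamic marker");
  std::vector<int64_t> result;
  size_t nextDynamic = 0;
  for (int64_t s : staticValues)
    result.push_back(s == kDynamic ? dynamicValues[nextDynamic++] : s);
  return result;
}

int main() {
  // OK: one dynamic marker, one dynamic value.
  for (int64_t v : getMixedValues({0, kDynamic, 4}, {7}))
    std::cout << v << " ";
  std::cout << "\n";
  // Mismatched counts would previously walk past the end of `dynamicValues`;
  // now they trip the assert (and the op verifier) instead.
}
```
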
Kazu Hirata
c06d3a7b72
[mlir] Remove unused includes (NFC) (#148769)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-14 22:19:23 -07:00
Jakub Kuderski
6512ca7ddb
[mlir] Add isStatic* size check for ShapedTypes. NFCI. (#147085)
The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow for `ShapedType::isStaticDim` to be used in
ADT functions without having to wrap it in a lambda performing the
negation.

Also add the new functions to C and Python bindings.
2025-07-07 14:57:27 -04:00
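
A hedged sketch in plain C++ (the predicate names mirror the commit's description, not the exact MLIR API) of why a named isStatic predicate composes better with ADT-style algorithms than a negated isDynamic check:

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

// Stand-ins for ShapedType::kDynamic and the dynamic/static predicates.
constexpr int64_t kDynamic = INT64_MIN;
bool isDynamic(int64_t dimSize) { return dimSize == kDynamic; }
bool isStatic(int64_t dimSize) { return !isDynamic(dimSize); }

int main() {
  std::vector<int64_t> shape = {4, kDynamic, 8};

  // Before: a double negation plus a lambda just to flip the predicate.
  bool allStaticOld = std::none_of(shape.begin(), shape.end(),
                                   [](int64_t d) { return isDynamic(d); });

  // After: the named predicate can be passed directly.
  bool allStaticNew = std::all_of(shape.begin(), shape.end(), isStatic);

  std::cout << allStaticOld << " " << allStaticNew << "\n"; // both 0 here
}
```
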
Thomas Preud'homme
7763002357
[MLIR/Utils] Add missing dep on Arith dialect (#146834)
Fix the following compile error when building libMLIRDialectUtils.a
only:

In file included from
mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h:17,
                 from mlir/lib/Dialect/Utils/ReshapeOpsUtils.cpp:9:
mlir/include/mlir/Dialect/Arith/IR/Arith.h:28:10:
fatal error: mlir/Dialect/Arith/IR/ArithOpsDialect.h.inc: No such file
or directory

The ArithDialect dependency is now needed since
0515449f6dcb452ea0b089fb3057d469c3cffa3f to create the arith.muli op.
2025-07-03 11:33:11 +01:00
MaheshRavishankar
7bc956d3d6
[mlir][PartialReductionTilingInterface] Add support for ReductionTilingStrategy::PartialReductionOuterParallel in tileUsingSCF. (#143988)
Following up from https://github.com/llvm/llvm-project/pull/143467,
this PR adds support for
`ReductionTilingStrategy::PartialReductionOuterParallel` to
`tileUsingSCF`. The implementation of
`PartialReductionTilingInterface` for `Linalg` ops has been updated to
support this strategy as well. This brings `tileUsingSCF` on par with
`linalg::tileReductionUsingForall`, which will subsequently be
deprecated.

Summary of changes
- `PartialReductionTilingInterface` changes:
  - The `tileToPartialReduction` method now takes the induction
    variables of the generated tile loops. This was needed to keep the
    generated code similar to `linalg::tileReductionUsingForall`,
    specifically to create a simplified access for slicing the
    intermediate partial results tensor when tiled in `num_threads` mode.
  - The `getPartialResultTilePosition` method needs the induction
    variables of the generated tile loops for the same reason as above,
    and also needs the `tilingStrategy` to be passed in to generate
    correct code.

The tests in `transform-tile-reduction.mlir` testing the
`linalg::tileReductionUsingForall` have been moved over to test
`scf::tileUsingSCF` with
`ReductionTilingStrategy::PartialReductionOuterParallel`
strategy. Some of the tests that were doing further cyclic distribution
of the code produced by tiling have been removed. Those seem like two
separate transformations that were merged into one. Ideally that
distribution should happen when resolving the `scf.forall` rather than
during tiling.

Please review only the top commit. Depends on
https://github.com/llvm/llvm-project/pull/143467

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2025-06-23 12:27:26 -07:00
Momchil Velikov
4af96a9d83
[MLIR] Determine contiguousness of memrefs with dynamic dimensions (#142421)
This patch enhances `MemRefType::areTrailingDimsContiguous` to also
handle memrefs with dynamic dimensions.

The implementation itself is based on a new member function
`MemRefType::getMaxCollapsableTrailingDims` that returns the maximum
number of trailing dimensions that can be collapsed: trivially all
dimensions for memrefs with an identity layout, or otherwise determined by
examining the memref strides, stopping at the first discontiguous or
statically unknown stride.
2025-06-23 09:28:33 +01:00
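
A rough standalone sketch of the stride walk described above (the function name and exact rules here are assumptions, not the actual MemRefType implementation): starting from the innermost dimension, trailing dims stay collapsible as long as each stride is statically known and equals the next inner stride times that dim's size.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

constexpr int64_t kDynamic = INT64_MIN; // stand-in for a dynamic size/stride

// Assumed logic: count how many trailing dims form a contiguous, collapsible
// run, stopping at the first statically unknown or discontiguous stride.
int64_t maxCollapsibleTrailingDims(const std::vector<int64_t> &shape,
                                   const std::vector<int64_t> &strides) {
  int64_t rank = static_cast<int64_t>(shape.size());
  if (rank == 0)
    return 0;
  // The innermost dimension must have a static, unit stride.
  if (strides[rank - 1] == kDynamic || strides[rank - 1] != 1)
    return 0;
  int64_t count = 1;
  // Walk outward: dim i joins the run to its right if its stride is static
  // and equals stride(i+1) * size(i+1).
  for (int64_t i = rank - 2; i >= 0; --i) {
    if (strides[i] == kDynamic || shape[i + 1] == kDynamic ||
        strides[i] != strides[i + 1] * shape[i + 1])
      break;
    ++count;
  }
  return count;
}

int main() {
  // memref<2x?x4xf32, strided<[?, 4, 1]>>: the trailing two dims are contiguous.
  std::cout << maxCollapsibleTrailingDims({2, kDynamic, 4}, {kDynamic, 4, 1})
            << "\n"; // 2
}
```
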
Kazu Hirata
c4ba734993
[mlir] Compare std::optional<T> to values directly (NFC) (#144241)
This patch transforms:

  X && *X == Y

to:

  X == Y

where X is a std::optional&lt;T&gt;, and Y is of type T or similar.
2025-06-14 23:23:42 -07:00
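
A tiny self-contained example of the simplification, relying on the standard `operator==` between `std::optional<T>` and `T` (an empty optional simply compares unequal to any value, so the two forms are equivalent):

```cpp
#include <iostream>
#include <optional>

int main() {
  std::optional<int> x = 42;
  int y = 42;

  // Before: explicit has-value check plus dereference.
  bool before = x && *x == y;
  // After: std::optional<T> compares directly against T.
  bool after = x == y;

  std::cout << before << " " << after << "\n"; // 1 1
}
```
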
Artem Gindinson
f82cf74420
[mlir][tensor] Fix getReassociationForCollapse for tensor/scalar reshapes (#144118)

Commit 6e5a142 changed the behavior of the function when computing
reassociations between tensors (consisting of unit/dynamic dimensions)
and scalars/0d vectors. The IR representation for such reshapes actually
expects an empty reassociation, like so:
```
func.func @example(%arg0 : tensor<?x?x?xf32>) -> tensor<f32> {
  %0 = tensor.collapse_shape %arg0 [] : tensor<?x?x?xf32> into tensor<f32>
  return %0 : tensor<f32>
}
```

Restore the original behavior: the routine should report failure when
compile-time-known non-unit dimensions are part of the attempted
reassociation.

Signed-off-by: Artem Gindinson <gindinson@roofline.ai>
2025-06-13 20:03:24 +02:00
Ian Wood
6e5a1423b7
[mlir] Reapply "Loosen restrictions on folding dynamic reshapes" (#142827)
The original PR https://github.com/llvm/llvm-project/pull/137963 had an
NVIDIA bot failure. This appears to have been a flaky test, because
rerunning the build was successful.

This change depends on commit 6f2ba47, which fixes incorrect usage of
`getReassociationIndicesForCollapse`.

Reverts llvm/llvm-project#142639

Co-authored-by: Artem Gindinson <gindinson@roofline.ai>
2025-06-12 10:28:27 +02:00
Ian Wood
f5a2f00da9
Revert "[mlir][tensor] Loosen restrictions on folding dynamic reshapes" (#142639)
Reverts llvm/llvm-project#137963

---------

Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
2025-06-03 14:10:41 -07:00
Artem Gindinson
cb4a407e5c
[mlir][tensor] Loosen restrictions on folding dynamic reshapes (#137963)
The main idea behind the change is to allow expand-of-collapse folds for
reshapes like `?x?xk` -> `?` (k>1). The rationale here is that the
expand op must have a coherent index/affine expression specified in its
`output_shape` argument (see example below), and if it doesn't, the IR
has already been invalidated at an earlier stage:
```
%c32 = arith.constant 32 : index
%div = arith.divsi %<some_index>, %c32 : index
%collapsed = tensor.collapse_shape %41#1 [[0], [1, 2], [3, 4]]
	         : tensor<9x?x32x?x32xf32> into tensor<9x?x?xf32>
%affine = affine.apply affine_map<()[s0] -> (s0 * 32)> ()[%div]
%expanded = tensor.expand_shape %collapsed [[0], [1, 2], [3]] output_shape [9, %div, 32, %affine]
		: tensor<9x?x?xf32> into tensor<9x?x32x?xf32>
```

On the above assumption, adjust the routine in
`getReassociationIndicesForCollapse()` to allow dynamic reshapes beyond
just `?x..?x1x1x..x1` -> `?`. Dynamic subshapes introduce two kinds of
issues:
1. n>2 consecutive dynamic dimensions in the source shape cannot be
collapsed together into 1<k<n neighboring dynamic dimensions in the
target shape, since there'd be more than one suitable reassociation
(example: `?x?x10x? into ?x?`)
2. When figuring out static subshape reassociations based on products,
there are cases where a static dimension is collapsed with a dynamic
one, and should therefore be skipped when comparing products of source &
target dimensions (e.g. `?x2x3x4 into ?x12`)

To address 1, we should detect such sequences in the target shape before
assigning multiple dynamic dimensions into the same index set. For 2, we
take note that a static target dimension was preceded by a dynamic one
and allow an "offset" subshape of source static dimensions, as long as
there's an exact sequence for the target size later in the source shape.

This PR aims to address all reshapes that can be determined based purely
on shapes (and original reassociation maps, as done in
`ComposeExpandOfCollapseOp::findCollapsingReassociation`). It doesn't
seem possible to fold all qualifying dynamic shape patterns in a
deterministic way without looking into affine expressions
simultaneously. That would be difficult to maintain in a single general
utility, so a path forward would be to provide dialect-specific
implementations for Linalg/Tensor.

Signed-off-by: Artem Gindinson <gindinson@roofline.ai>

---------

Signed-off-by: Artem Gindinson <gindinson@roofline.ai>
Co-authored-by: Ian Wood <ianwood2024@u.northwestern.edu>
2025-06-03 09:09:01 -07:00
Han-Chung Wang
c39915fa2e
[mlir][NFC] Simplify constant checks with isOneInteger and renamed isZeroInteger. (#139340)
The revision adds an isOneInteger helper and simplifies the existing code
with the two methods. It removes some lambdas, which makes the code cleaner.

For downstream users, you can update the code with the below script.

```bash
sed -i "s/isZeroIndex/isZeroInteger/g" **/*.h
sed -i "s/isZeroIndex/isZeroInteger/g" **/*.cpp
```

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-05-20 14:53:02 -07:00
Chao Chen
99720bbb87
[MLIR][Utils] Fix the overflow issue in computeSuffixProductImpl for 32-bit system. (#140567)
In `int64_t r = strides.size() - 2`, the subtraction may wrap around on a
32-bit system when `strides.size()` is 1, because `strides.size()` returns
an unsigned 32-bit value there.
2025-05-19 13:27:37 -05:00
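
A small standalone demonstration of the failure mode described above, using `uint32_t` to stand in for a 32-bit `size()` result (assumption: the original code counted down from `strides.size() - 2`):

```cpp
#include <cstdint>
#include <iostream>

int main() {
  // On a 32-bit target, container size() yields a 32-bit unsigned value.
  uint32_t size32 = 1;      // strides.size() == 1
  int64_t r32 = size32 - 2; // wraps to 4294967295 and stays positive in int64_t

  // On a 64-bit target, the wrapped 64-bit value converts back to -1.
  uint64_t size64 = 1;
  int64_t r64 = size64 - 2;

  std::cout << r32 << "\n"; // 4294967295: a countdown loop would run wild
  std::cout << r64 << "\n"; // -1: the loop correctly does not execute
}
```
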
Iris Shi
78af0f3ab8
[mlir][NFC] Use llvm::sort (#140261) 2025-05-16 23:35:13 +08:00
Han-Chung Wang
7de2e4971f
[mlir][NFC] Use Builder for getReassociationIndicesAttribute method. (#137251)
The method does not need to create any operation, so we can use Builder.
It can be reused by any attribute getter implementation, so it does not
need to declare OpBuilder in the implementation.

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-04-24 15:16:18 -07:00
Kazu Hirata
3041fa6c7a
[mlir] Use *Set::insert_range (NFC) (#132326)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin like succ_begin for now.
2025-03-20 22:24:17 -07:00
Ian Wood
fbbb33f400
[mlir] Fix crash when verifying linalg.transpose (#131733)
Adds checks in `isPermutationVector` for indices that are out of bounds
and removes the assert.

Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
2025-03-18 12:33:27 -07:00
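
A hedged standalone sketch of the kind of check described (assumed logic, not the actual utility): a valid permutation of size n contains each index 0..n-1 exactly once, so out-of-range entries are rejected rather than asserted on.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Assumed behavior: return false (instead of asserting) for malformed input.
bool isPermutationVector(const std::vector<int64_t> &interchange) {
  std::vector<bool> seen(interchange.size(), false);
  for (int64_t index : interchange) {
    // Out-of-bounds indices previously led to an out-of-range access.
    if (index < 0 || static_cast<size_t>(index) >= interchange.size())
      return false;
    if (seen[index]) // duplicates are not a permutation either
      return false;
    seen[index] = true;
  }
  return true;
}

int main() {
  std::cout << isPermutationVector({1, 0, 2}) << "\n"; // 1
  std::cout << isPermutationVector({0, 4}) << "\n";    // 0: 4 is out of bounds
}
```
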
Andrzej Warzyński
517800e37e
[mlir][tensor][linalg] Move Pack/UnPack Ops to Linalg (#123902)
Moves `PackOp` and `UnPackOp` from the Tensor dialect to Linalg. This change
was discussed in the following RFC:
* https://discourse.llvm.org/t/rfc-move-tensor-pack-and-tensor-unpack-into-linalg

This change involves significant churn but only relocates existing code - no new
functionality is added.

**Note for Downstream Users**
Downstream users must update references to `PackOp` and `UnPackOp` as follows:
  * Code: `s/tensor::(Un)PackOp/linalg::(Un)PackOp/g`
  * Tests: `s/tensor.(un)pack/linalg.(un)pack/g`

No other modifications should be required.
2025-02-17 10:44:27 +00:00
MaheshRavishankar
092372da15
[mlir][Tensor] Rework ReifyRankedShapedTypeInterface implementation for tensor.expand_shape op. (#113501)
The op carries the output shape directly, so it can be used as-is.
Also adds a method to get the shape as a `SmallVector<OpFoldResult>`.

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2025-01-27 07:05:34 -08:00
Kazu Hirata
129f1001c3
[Dialect] Migrate away from PointerUnion::{is,get} (NFC) (#120818)
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2024-12-21 08:17:51 -08:00
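
A hedged before/after sketch of the migration (assuming the free casting functions work on llvm::PointerUnion as the commit describes; this is not code from the patch, and the int*/float* union is purely illustrative):

```cpp
#include "llvm/ADT/PointerUnion.h"
#include "llvm/Support/Casting.h"

void example() {
  int i = 42;
  llvm::PointerUnion<int *, float *> pu = &i;

  // Before (soft-deprecated member functions):
  //   if (pu.is<int *>())
  //     int *p = pu.get<int *>();

  // After: use the free casting functions instead.
  if (llvm::isa<int *>(pu)) {
    int *p = llvm::cast<int *>(pu);
    (void)p;
  }
}
```
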
Andrzej Warzyński
e9bafa35d2
[mlir][tensor] Generalize/restrict GeneralizeOuterUnitDimsPackOpPattern (#114315)
This PR *restricts* `GeneralizeOuterUnitDimsPackOpPattern` to follow its
intended purpose (as per the documentation), which is to:

  > require all outer dimensions of tensor.pack to be 1.

There was one in-tree test that violated this assumption (and happened
to work) – see `@simple_KCRS_to_KRSCsr` in
"generalize-tensor-pack.mlir". That test has been updated to satisfy the
new requirements of the pattern.

By enforcing the pattern to follow its intended design (i.e., making it
stricter), the calculation of shapes and sizes for various Ops that the
pattern generates (PadOp, ExtractSliceOp, EmptyOp, TensorOp, and
InsertSliceOp) becomes much simpler and easier to document. This also
helped *generalize* the pattern to support cases like the one below:

```mlir
func.func @simple_pad_and_pack_dynamic_tile_cst(
    %src: tensor<5x1xf32>,
    %dest: tensor<1x1x?x2xf32>,
    %pad: f32) -> tensor<1x1x?x2xf32> {

  %tile_dim_0 = arith.constant 8 : index
  %0 = tensor.pack %src
    padding_value(%pad : f32)
    inner_dims_pos = [0, 1]
    inner_tiles = [%tile_dim_0, 2]
    into %dest : tensor<5x1xf32> -> tensor<1x1x?x2xf32>

  return %0 : tensor<1x1x?x2xf32>
}
```

Note that the inner tile size is specified dynamically, but its value is a compile-time constant.
`getPackOpSourceOrPaddedSource`, which is used to generate PadOp,
detects this and generates a PadOp with static shapes. This is a good
optimization, but it means that all shapes/sizes for Ops generated by
`GeneralizeOuterUnitDimsPackOpPattern` also need to be updated to be
constant/static. By restricting the pattern and simplifying the
size/shape calculation, supporting the case above becomes much easier.

Notable implementation changes:

* PadOp processes the original source (no change in dimensions/rank).
  ExtractSliceOp extracts the tile to pack and may reduce the rank. All
  following ops work on the tile extracted by ExtractSliceOp (possibly
  rank-reduced).
* All shape/size calculations assume that trailing dimensions match
  inner_tiles from tensor.pack. All leading dimensions (i.e., outer
  dimensions) are assumed to be 1.
* Dynamic sizes for ops like ExtractSliceOp are taken from inner_tiles
  rather than computed as, for example, tensor.dim %dest, 2. It’s the
  responsibility of the "producers" of tensor.pack to ensure that
  dimensions in %dest match the specified tile sizes.
2024-11-06 20:42:47 +00:00
Max191
98e838a890
[mlir] Do not bufferize parallel_insert_slice dest to read for full slices (#112761)
In the insert_slice bufferization interface implementation, the
destination tensor is not considered read if the full tensor is
overwritten by the slice. This PR adds the same check for
tensor.parallel_insert_slice.

Adds two new StaticValueUtils:
- `isAllConstantIntValue` checks whether all elements of an array of
`OpFoldResult` are equal to a given `int64_t` value.
- `areConstantIntValues` checks whether an array of `OpFoldResult` is
elementwise equal to a given array of `int64_t` values.

fixes https://github.com/llvm/llvm-project/issues/112435

---------

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
2024-10-18 16:02:03 -04:00
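
A plain-C++ analogue sketching what the two new helpers check (the MLIR versions take ArrayRef<OpFoldResult>; the `MaybeConst` stand-in below is an assumption for illustration):

```cpp
#include <cstdint>
#include <iostream>
#include <optional>
#include <vector>

// Stand-in for an OpFoldResult whose constant integer value may be unknown.
using MaybeConst = std::optional<int64_t>;

// Analogue of isAllConstantIntValue: every element is a known constant equal
// to `value`.
bool isAllConstantIntValue(const std::vector<MaybeConst> &ofrs, int64_t value) {
  for (const MaybeConst &v : ofrs)
    if (!v || *v != value)
      return false;
  return true;
}

// Analogue of areConstantIntValues: elementwise equality against `values`.
bool areConstantIntValues(const std::vector<MaybeConst> &ofrs,
                          const std::vector<int64_t> &values) {
  if (ofrs.size() != values.size())
    return false;
  for (size_t i = 0; i < ofrs.size(); ++i)
    if (!ofrs[i] || *ofrs[i] != values[i])
      return false;
  return true;
}

int main() {
  // e.g. "is this slice a full overwrite?": all offsets 0, all strides 1.
  std::vector<MaybeConst> offsets = {0, 0};
  std::vector<MaybeConst> strides = {1, 1};
  std::cout << isAllConstantIntValue(offsets, 0) << " "
            << isAllConstantIntValue(strides, 1) << " "
            << areConstantIntValues(offsets, {0, 0}) << "\n"; // 1 1 1
}
```
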
Vinayak Dev
2f15d7e43e
[mlir][tensor] Fix off-by-one error in ReshapeOpsUtils (#112774)
This patch fixes an off-by-one error in
`mlir::getReassociationIndicesForCollapse()` that occurs when the last
two dims of the source tensor satisfy the while loop's condition.

This would cause an assertion failure due to out-of-bounds-access, which
is now fixed.
2024-10-18 14:02:30 +05:30
Kazu Hirata
b52885bc23
[mlir] Use std::optional::value_or (NFC) (#109893) 2024-09-26 09:53:43 -07:00
donald chen
9cc11b98a7
[mlir] [linalg] Add pattern to swap transpose with broadcast (#97063)
Add a pattern that implements:

  transpose(broadcast(input)) -> broadcast(transpose(input))
2024-07-23 12:52:25 +08:00
c8ef
3f222f3bc6
[NFC] Fix some typos (#98791) 2024-07-14 13:28:11 +02:00
Ramkumar Ramachandra
0fb216fb2f
mlir/MathExtras: consolidate with llvm/MathExtras (#95087)
This patch is part of a project to move the Presburger library into
LLVM.
2024-06-11 23:00:02 +01:00
Spenser Bauman
a9205c5c9d
[mlir][tensor] Implement constant folder for tensor.pad (#92691)
Extend the folding ability of the RewriteAsConstant patterns to include
tensor.pad operations on constants. The new pattern will constant-fold
tensor.pad operations that operate on tensor constants and have
statically resolvable padding sizes/values.

    %init = arith.constant dense<[[6, 7], [8, 9]]> : tensor<2x2xi32>
    %pad_value = arith.constant 0 : i32

    %0 = tensor.pad %init low[1, 1] high[1, 1] {
      ^bb0(%arg1: index, %arg2: index):
        tensor.yield %pad_value : i32
    } : tensor<2x2xi32> to tensor<4x4xi32>

becomes

    %cst = arith.constant dense<[[0, 0, 0, 0],
                                 [0, 6, 7, 0],
                                 [0, 8, 9, 0],
                                 [0, 0, 0, 0]]> : tensor<4x4xi32>

Co-authored-by: Spenser Bauman <sabauma@fastmail>
2024-06-06 10:22:16 -04:00
Gaurav Shukla
97069a8619
[MLIR] Generalize expand_shape to take shape as explicit input (#90040)
This patch generalizes tensor.expand_shape and memref.expand_shape to
consume the output shape as a list of SSA values. This enables us to
implement generic reshape operations with dynamic shapes using
collapse_shape/expand_shape pairs.

The output_shape input to expand_shape follows the static/dynamic
representation that's also used in `tensor.extract_slice`.

Differential Revision: https://reviews.llvm.org/D140821

---------

Signed-off-by: Gaurav Shukla<gaurav.shukla@amd.com>
Signed-off-by: Gaurav Shukla <gaurav.shukla@amd.com>
Co-authored-by: Ramiro Leal-Cavazos <ramiroleal050@gmail.com>
2024-04-30 09:28:35 -07:00
Mehdi Amini
8c0341df02
Revert "[MLIR] Generalize expand_shape to take shape as explicit input" (#89540)
Reverts llvm/llvm-project#69267

this broke some bots.
2024-04-21 14:33:48 +02:00
Gaurav Shukla
e095d978ba
[MLIR] Generalize expand_shape to take shape as explicit input (#69267)
This patch generalizes tensor.expand_shape and memref.expand_shape to
consume the output shape as a list of SSA values. This enables us to
implement generic reshape operations with dynamic shapes using
collapse_shape/expand_shape pairs.

The output_shape input to expand_shape follows the static/dynamic
representation that's also used in `tensor.extract_slice`.

Differential Revision: https://reviews.llvm.org/D140821

Co-authored-by: Ramiro Leal-Cavazos <ramiroleal050@gmail.com>
2024-04-21 07:37:02 -04:00
Aviad Cohen
ccc02563f4
[mlir][linalg]: Fixed possible memory leak in cloneToCollapsedOp (#87595)
* A direct call to the `clone` function leads to a memory leak. We should use the `RewriterBase` clone function instead.
2024-04-07 08:23:16 +03:00
Thomas Preud'homme
da2c98b558
[MLIR] Remove UtilsDialect dep on ArithUtils (#85919)
This will reduce the number of libraries pulled in through the de facto
dependency of TilingInterface on UtilsDialect for its IteratorType.
2024-03-20 12:18:42 +00:00
Diego Caballero
847048f497
[mlir][Vector] Fix bug in vector xfer op flattening transformation (#81964)
It looks like the affine map generated to compute the indices of the
collapsed dimensions used the wrong dim size. For indices `[idx0][idx1]`
we computed the collapsed index as `idx0*size0 + idx1` instead of
`idx0*size1 + idx1`. This led to correctness issues in convolution tests
when enabling this transformation internally.
2024-02-22 12:37:32 -08:00
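
A tiny standalone illustration of the indexing fix: linearizing a 2-D index `[idx0][idx1]` over a shape `(size0, size1)` must scale the outer index by the inner dimension's size.

```cpp
#include <iostream>

int main() {
  int size0 = 3, size1 = 5; // collapsed shape (size0, size1), row-major layout
  int idx0 = 2, idx1 = 4;   // the last element, [2][4]

  // Buggy form from the commit message: scales by the outer dim's size.
  int wrong = idx0 * size0 + idx1;   // 10, which actually addresses [2][0]
  // Fixed form: the outer index is scaled by the inner dim's size.
  int correct = idx0 * size1 + idx1; // 14, the true linear index of [2][4]

  std::cout << wrong << " " << correct << "\n";
}
```
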
Mehdi Amini
8383bf2307 Apply clang-tidy fixes for llvm-else-after-return in IndexingUtils.cpp (NFC) 2024-02-14 10:11:37 -08:00
Han-Chung Wang
2472c45ba3
[mlir][tensor] Enhance pack/unpack simplification for identity outer_dims_perm cases. (#77409)
They can be simplified to reshape ops if outer_dims_perm is an identity
permutation. The revision adds an `isIdentityPermutation` method to
IndexingUtils.
2024-01-10 08:30:34 -08:00
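
A minimal sketch of what an isIdentityPermutation check amounts to (assumed behavior based on the name, not the exact IndexingUtils signature):

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// An identity permutation maps every position to itself: [0, 1, 2, ...].
bool isIdentityPermutation(const std::vector<int64_t> &permutation) {
  for (size_t i = 0; i < permutation.size(); ++i)
    if (permutation[i] != static_cast<int64_t>(i))
      return false;
  return true;
}

int main() {
  // A pack/unpack whose outer_dims_perm is the identity doesn't reorder
  // anything, so it can be rewritten as a plain reshape.
  std::cout << isIdentityPermutation({0, 1, 2}) << "\n"; // 1
  std::cout << isIdentityPermutation({0, 2, 1}) << "\n"; // 0
}
```
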
Guray Ozen
c65d8c7187
[mlir][memref] extract_strided_metadata for zero-sized memref (#74835) 2023-12-08 15:55:14 +01:00
Rik Huijzer
68f0bc6f2e
[mlir] Fix a zero stride canonicalizer crash (#74200)
This PR fixes https://github.com/llvm/llvm-project/issues/73383 and is
another shot at the refactoring proposed in
https://github.com/llvm/llvm-project/pull/72885.

---------

Co-authored-by: Kai Sasaki <lewuathe@gmail.com>
2023-12-06 07:35:18 +01:00
Matthias Springer
68386a74ba
[mlir][tensor] Fix crash when canonicalizing invalid IR (#72888)
This commit fixes a crash of the canonicalizer when there are slice ops
with offset/size SSA values that have a negative constant value. Such
ops are invalid if they are reachable and their offsets/sizes should not
be folded to static integer values. (But such ops may appear in
a non-reachable block.)

This commit fixes #71150.
2023-11-21 09:20:18 +01:00
long.chen
1609f1c2a5
[mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269)
For details, see the document: https://mlir.llvm.org/deprecation/

Not all changes were made manually; most of them were made with a clang
tool I wrote: https://github.com/lipracer/cpp-refactor.
2023-11-14 13:01:19 +08:00
bjacob
8a80e33150
Add isBatchVecmat utilities for linalg.batch_vecmat (#70284)
`linalg.batch_vecmat` was just added in
https://github.com/llvm/llvm-project/pull/70218, but I forgot then to
add the standard `isBatchVecmat` utilities
2023-10-26 07:47:00 -04:00
NatashaKnk
9f4950983e
[mlir] Add ContractionOpInterface utility functions for vector matrix multiplication (#68945) 2023-10-18 08:55:51 -07:00
Christopher Bate
831041be79 [mlir][vector] Cleanup VectorUnroll and create a generic tile iteration utility
This change refactors some of the utilities used to unroll larger vector
computations into smaller vector computations. In fact, the indexing
computations used here are rather generic and are useful in other dialects or
downstream projects. Therefore, a utility for iterating over all possible tile
offsets for a particular pair of static (shape, tiled shape) is introduced in
IndexingUtils and replaces the existing computations in the vector unrolling
transformations. This builds off of the refactoring of IndexingUtils introduced
in 203fad476b7e.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D150000
2023-09-14 20:34:44 -06:00
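
A standalone sketch of the kind of tile-offset enumeration described (the function name and even-division assumption are illustrative, not the actual IndexingUtils API): for a static shape and tile shape, visit every tile's starting offsets in row-major order.

```cpp
#include <cstdint>
#include <functional>
#include <iostream>
#include <vector>

// Visit the starting offsets of every tile of `tileShape` within `shape`,
// assuming the tile shape divides the shape evenly (a simplifying assumption).
void forEachTileOffset(const std::vector<int64_t> &shape,
                       const std::vector<int64_t> &tileShape,
                       const std::function<void(const std::vector<int64_t> &)> &fn) {
  std::vector<int64_t> offsets(shape.size(), 0);
  while (true) {
    fn(offsets);
    // Advance like an odometer, innermost dimension fastest.
    int64_t dim = static_cast<int64_t>(shape.size()) - 1;
    for (; dim >= 0; --dim) {
      offsets[dim] += tileShape[dim];
      if (offsets[dim] < shape[dim])
        break;
      offsets[dim] = 0;
    }
    if (dim < 0)
      return; // wrapped past the outermost dimension: done
  }
}

int main() {
  // A <4x6> vector unrolled into <2x3> tiles yields offsets:
  // [0,0] [0,3] [2,0] [2,3]
  forEachTileOffset({4, 6}, {2, 3}, [](const std::vector<int64_t> &o) {
    std::cout << "[" << o[0] << "," << o[1] << "] ";
  });
  std::cout << "\n";
}
```
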
Ivan Butygin
793ee2bf08 [mlir][gpu] Add DecomposeMemrefsPass
Some GPU backends (SPIR-V) lower memrefs to bare pointers, so lowering fails for dynamically sized/strided memrefs.
This pass extracts sizes and strides via `memref.extract_strided_metadata` outside the `gpu.launch` body, does the index/offset calculation explicitly, and then reconstructs the memrefs via `memref.reinterpret_cast`.

`memref.reinterpret_cast` is then lowered via https://reviews.llvm.org/D155011

Differential Revision: https://reviews.llvm.org/D155247
2023-08-10 22:28:05 +02:00
Ivan Butygin
b13248f997 Revert "[mlir][gpu] Add DecomposeMemrefsPass"
Broke some bots

This reverts commit 2b5b2bfef102b1021d91f2b9485e2443bdea9df5.
2023-08-10 03:07:28 +02:00
Ivan Butygin
2b5b2bfef1 [mlir][gpu] Add DecomposeMemrefsPass
Some GPU backends (SPIR-V) lower memrefs to bare pointers, so lowering fails for dynamically sized/strided memrefs.
This pass extracts sizes and strides via `memref.extract_strided_metadata` outside the `gpu.launch` body, does the index/offset calculation explicitly, and then reconstructs the memrefs via `memref.reinterpret_cast`.

`memref.reinterpret_cast` is then lowered via https://reviews.llvm.org/D155011

Differential Revision: https://reviews.llvm.org/D155247
2023-08-10 02:28:03 +02:00
Nicolas Vasilache
a3cd2eeb2d [mlir][nvgpu] Add a nvgpu.rewrite_copy_as_tma transform operation.
This revision adds support for directly lowering a linalg.copy on buffers between global and shared memory to a TMA async load plus synchronization operations.
It uses the recently introduced Hopper NVVM and NVGPU abstractions to connect things end to end.

Differential Revision: https://reviews.llvm.org/D157087
2023-08-08 12:07:59 +00:00
Matthias Springer
b2826c0209 [mlir][NFC] Move offsets/sizes/strides helper to dialect utils and interface header
* Move `foldDynamicIndexList` to `DialectUtils` and simplify the function.
* Move `OpWithOffsetSizesAndStridesConstantArgumentFolder` to `ViewLikeInterface` and add documentation.

Differential Revision: https://reviews.llvm.org/D156581
2023-07-31 14:53:14 +02:00
Nicolas Vasilache
90ecfa2a40 [mlir][linalg] NFC - Move some utils in preparation for revamping mapping of scf.forall 2023-07-25 01:19:57 +02:00