558 Commits

Author SHA1 Message Date
Erick Ochoa Lopez
8b9c70dcdb
[mlir] Move vector.{to_elements,from_elements} unrolling to VectorUnroll.cpp (#159118)
This PR moves the patterns that unroll vector.to_elements and
vector.from_elements into the file with other vector unrolling
operations. This PR also adds these unrolling patterns into the
`populateVectorUnrollPatterns`. And renames
`populateVectorToElementsLoweringPatterns`
`populateVectorFromElementsLoweringPatterns` to
`populateVectorToElementsUnrollPatterns`
`populateVectorFromElementsUnrollPatterns`.
2025-09-18 12:03:54 -04:00
Jan Patrick Lehr
50f81531b4
[MLIR] Fix compilation after #157771 (#159257)
The original PR broke pretty much all our bots.
Apologies for the noise in the previous PR.
2025-09-17 09:13:44 +02:00
Diego Caballero
7bdd88c1e3
[mlir][Vector] Add patterns to lower vector.shuffle (#157611)
This PR adds patterns to lower `vector.shuffle` with inputs with
different vector sizes more efficiently. The current LLVM lowering for
these cases degenerates to a sequence of `vector.extract` and
`vector.insert` operations. With this PR, the smaller input is promoted
to larger vector size by introducing an extra `vector.shuffle`.
2025-09-16 18:01:30 -07:00
Nishant Patel
0e5c32bd6d
[MLIR][Vector] Add unrolling pattern for vector StepOp (#157752)
This PR adds unrolling pattern for vector.step op to VectorUnroll
transform.
2025-09-16 15:33:08 -07:00
Alan Li
4a094095a4
[MLIR] Make 1-D memref flattening a prerequisite for vector narrow type emulation (#157771)
Addresses:  https://github.com/llvm/llvm-project/issues/115653

We already have utilities to flatten memrefs into 1-D. This change makes
memref flattening a prerequisite for vector narrow type emulation,
ensuring that emulation patterns only need to handle 1-D scenarios.
2025-09-16 20:43:20 +00:00
Andrzej Warzyński
1287ed1fa2
[mlir][vector] Use source as the source argument name (#158258)
This patch updates the following ops to use `source` (instead of
`vector`) as the name for their source argument:
  * `vector.extract`
  * `vector.scalable.extract`
  * `vector.extract_strided_slice`

This change ensures naming consistency with the "builders" for these Ops
that already use the name `source` rather than `vector`. It also
addresses part of:
  * https://github.com/llvm/llvm-project/issues/131602

Specifically, it ensures that we use `source` and `dest` for read and
write operations, respectively (as opposed to `vector` and `dest`).
2025-09-15 21:18:26 +01:00
Erick Ochoa Lopez
b812e3d61a
[mlir][vector] Add LinearizeVectorToElements (#157740)
Co-authored-by: James Newling <james.newling@gmail.com>
2025-09-11 13:58:42 -04:00
Erick Ochoa Lopez
9d19250610
[mlir][vector] Add vector.to_elements unrolling (#157142)
This PR adds support for unrolling `vector.to_element`'s source operand.

It transforms

```mlir
%0:8 = vector.to_elements %v : vector<2x2x2xf32>
```

to

```mlir
%v0 = vector.extract %v[0] : vector<2x2xf32> from vector<2x2x2xf32>
%v1 = vector.extract %v[1] : vector<2x2xf32> from vector<2x2x2xf32>
%0:4 = vector.to_elements %v0 : vector<2x2xf32>
%1:4 = vector.to_elements %v1 : vector<2x2xf32>
// %0:8 = %0:4 - %1:4
```

This pattern will be applied until there are only 1-D vectors left.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
Co-authored-by: hanhanW <hanhan0912@gmail.com>
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
2025-09-11 13:56:57 -04:00
Artem Kroviakov
7f007b572d
[MLIR][Vector] Add warp distribution for scf.if (#157119)
This PR adds `scf.if` op distribution to the existing `VectorDistribute`
patterns. The logic mostly follows that of `scf.for`: move op outside, wrap each
branch with `gpu.warp_execute_on_lane_0`. A notable difference to `scf.for` is 
that each branch has its own set of escaping values, and `scf.if` itself does not 
have block arguments.
2025-09-10 13:33:26 -07:00
Erick Ochoa Lopez
c5e6d56f01
[mlir][vector] Propagate alignment when emulating masked{load,stores}. (#155648)
Propagate alignment from `vector.maskedload` and `vector.maskedstore` to
`memref.load` and `memref.store` during `VectorEmulateMaskedLoadStore`
pass.
2025-09-05 13:48:40 +00:00
Erick Ochoa Lopez
250d2517b2
[mlir][vector] Propagate alignment in LowerVectorGather. (#155683)
Alignment is properly propagated when patterns
`UnrollGather`, `RemoveStrideFromGatherSource`, or
`Gather1DToConditionalLoads` are applied.
2025-09-05 13:43:18 +00:00
Mehdi Amini
8edb5b4fb3 [MLIR] Add LDBG() tracing to VectorTransferOpTransforms.cpp (NFC)
Had to debug an issue, more debugging helped here.
2025-09-03 15:32:22 -07:00
Mehdi Amini
9bb9206d8e [MLIR] Apply clang-tidy fixes for readability-container-size-empty in VectorTransforms.cpp (NFC) 2025-08-31 02:09:27 -07:00
Artem Kroviakov
f2e6ca805d
[MLIR][Vector] Add warp distribution for vector.step op (#155425)
This PR adds a distribution pattern for
[`vector.step`](https://mlir.llvm.org/docs/Dialects/Vector/#vectorstep-vectorstepop)
op.

The result of the step op is a vector containing a sequence
`[0,1,...,N-1]`. For the warp distribution, we consider a vector with `N
== warp_size` (think SIMD). Distributing it to SIMT, means that each
lane is represented by a thread/lane id scalar.

More complex cases with the support for warp size multiples (e.g.,
`[0,1,...,2*N-1]`) require additional layout information to be handled
properly. Such support may be added later.

The lane id scalar is wrapped into a `vector<1xindex>` to emulate the
sequence distribution result.
Other than that, the distribution is similar to that of
`arith.constant`.
2025-08-28 09:24:01 -07:00
Yang Bai
5fdd3a12e5
[mlir][vector] Follow-up improvements for multi-dimensional vector.from_elements support (#154664)
This PR is a follow-up to #151175 that supported lowering
multi-dimensional `vector.from_elements` op to LLVM by introducing a
unrolling pattern.

## Changes

### Add `vector.shape_cast` based flattening pattern for
`vector.from_elements`

This change introduces a new linearization pattern that uses
`vector.shape_cast` to flatten multi-dimensional `vector.from_elements`
operations. This provides an alternative approach to the unrolling-based
method introduced in #151175.

**Example:**
```mlir
// Before
%v = vector.from_elements %e0, %e1, %e2, %e3 : vector<2x2xf32>

// After
%flat = vector.from_elements %e0, %e1, %e2, %e3 : vector<4xf32>
%result = vector.shape_cast %flat : vector<4xf32> to vector<2x2xf32>
```

---------

Co-authored-by: Yang Bai <yangb@nvidia.com>
Co-authored-by: James Newling <james.newling@gmail.com>
2025-08-27 21:41:06 -07:00
Andrzej Warzyński
613ec4c24c
[mlir][vector] Rename gather/scatter arguments (nfc) (#153640)
Renames `indices` as `offsets` and `index_vec` as `indices`. This is
primarily to make clearer distinction between the arguments.
2025-08-25 17:02:11 +01:00
Adam Siemieniuk
533ddcd989
[mlir][gpu] Warp execute terminator getter (#154729)
Adds a utility getter to `warp_execute_on_lane_0` which simplifies
access to the op's terminator.

Uses are refactored to utilize the new terminator getter.
2025-08-22 18:24:23 +02:00
donald chen
5af7263d42
[mlir] add getViewDest method to viewLikeOpInterface (#154524)
The viewLikeOpInterface abstracts the behavior of an operation view one
buffer as another. However, the current interface only includes a
"getViewSource" method and lacks a "getViewDest" method.

Previously, it was generally assumed that viewLikeOpInterface operations
would have only one return value, which was the view dest. This
assumption was broken by memref.extract_strided_metadata, and more
operations may break these silent conventions in the future. Calling
"viewLikeInterface->getResult(0)" may lead to a core dump at runtime.
Therefore, we need 'getViewDest' method to standardize our behavior.

This patch adds the getViewDest function to viewLikeOpInterface and
modifies the usage points of viewLikeOpInterface to standardize its use.
2025-08-21 20:09:52 +08:00
Md Asghar Ahmad Shahid
c24c23d9ab
[NFC][mlir][vector] Handle potential static cast assertion. (#152957)
In FoldArithToVectorOuterProduct pattern, static cast to vector type
causes assertion when a scalar type was encountered. It seems the author
meant to have a dyn_cast instead.

This NFC patch handles it by using dyn_cast.
2025-08-19 09:27:20 +05:30
Yang Bai
4eb1a07d7d
[mlir][vector] Support multi-dimensional vectors in VectorFromElementsLowering (#151175)
This patch introduces a new unrolling-based approach for lowering
multi-dimensional `vector.from_elements` operations.

**Implementation Details:**
1. **New Transform Pattern**: Added `UnrollFromElements` that unrolls a
N-D(N>=2) from_elements op to a (N-1)-D from_elements op align the
outermost dimension.
2. **Utility Functions**: Added `unrollVectorOp` to reuse the unroll
algo of vector.gather for vector.from_elements.
3. **Integration**: Added the unrolling pattern to the
convert-vector-to-llvm pass as a temporal transformation.
4. Use direct LLVM dialect operations instead of intermediate
vector.insert operations for efficiency in `VectorFromElementsLowering`.

**Example:**
```mlir
// unroll
%v = vector.from_elements  %e0, %e1, %e2, %e3 : vector<2x2xf32>
=>
%poison_2d = ub.poison : vector<2x2xf32>
%vec_1d_0 = vector.from_elements %e0, %e1 : vector<2xf32>
%vec_2d_0 = vector.insert %vec_1d_0, %poison_2d [0] : vector<2xf32> into vector<2x2xf32>
%vec_1d_1 = vector.from_elements %e2, %e3 : vector<2xf32>
%result = vector.insert %vec_1d_1, %vec_2d_0 [1] : vector<2xf32> into vector<2x2xf32>

// convert-vector-to-llvm
%v = vector.from_elements %e0, %e1, %e2, %e3 : vector<2x2xf32>
=>
%poison_2d = ub.poison : vector<2x2xf32>
%poison_2d_cast = builtin.unrealized_conversion_cast %poison_2d : vector<2x2xf32> to !llvm.array<2 x vector<2xf32>>
%poison_1d_0 = llvm.mlir.poison : vector<2xf32>
%c0_0 = llvm.mlir.constant(0 : i64) : i64
%vec_1d_0_0 = llvm.insertelement %e0, %poison_1d_0[%c0_0 : i64] : vector<2xf32>
%c1_0 = llvm.mlir.constant(1 : i64) : i64
%vec_1d_0_1 = llvm.insertelement %e1, %vec_1d_0_0[%c1_0 : i64] : vector<2xf32>
%vec_2d_0 = llvm.insertvalue %vec_1d_0_1, %poison_2d_cast[0] : !llvm.array<2 x vector<2xf32>>
%poison_1d_1 = llvm.mlir.poison : vector<2xf32>
%c0_1 = llvm.mlir.constant(0 : i64) : i64
%vec_1d_1_0 = llvm.insertelement %e2, %poison_1d_1[%c0_1 : i64] : vector<2xf32>
%c1_1 = llvm.mlir.constant(1 : i64) : i64
%vec_1d_1_1 = llvm.insertelement %e3, %vec_1d_1_0[%c1_1 : i64] : vector<2xf32>
%vec_2d_1 = llvm.insertvalue %vec_1d_1_1, %vec_2d_0[1] : !llvm.array<2 x vector<2xf32>>
%result = builtin.unrealized_conversion_cast %vec_2d_1 : !llvm.array<2 x vector<2xf32>> to vector<2x2xf32>
```

---------

Co-authored-by: Nicolas Vasilache <Nico.Vasilache@amd.com>
Co-authored-by: Yang Bai <yangb@nvidia.com>
Co-authored-by: James Newling <james.newling@gmail.com>
Co-authored-by: Diego Caballero <dieg0ca6aller0@gmail.com>
2025-08-18 10:09:12 -07:00
Matthias Springer
21b607adbe
[mlir][SCF] scf.for: Add support for unsigned integer comparison (#153379)
Add a new unit attribute to allow for unsigned integer comparison.

Example:
```mlir
scf.for unsigned %iv_32 = %lb_32 to %ub_32 step %step_32 : i32 {
  // body
}
```

Discussion:
https://discourse.llvm.org/t/scf-should-scf-for-support-unsigned-comparison/84655
2025-08-15 10:59:14 +02:00
James Newling
671eaf84b3
[mlir][vector] Avoid use of vector.splat in transforms (#150279)
This is part of vector.splat deprecation
Reference: https://discourse.llvm.org/t/rfc-mlir-vector-deprecate-then-remove-vector-splat/87143/5
Instead of creating vector::SplatOp, create vector::BroadcastOp
2025-07-31 06:28:01 -07:00
Krzysztof Drewniak
9a46091d4b
[mlir][Vector] Allow elementwise/broadcast swap to handle mixed types (#151274)
This patch extends the operation that rewrites elementwise operations
whose inputs are all broadcast from the same shape to handle
mixed-types, such as when the result and input types don't match, or
when the inputs have multiple types.

PR #150867 failed to check for the possibility of type mismatches when
rewriting splat constants. In order to fix that issue, we add support
for mixed-type operations more generally.
2025-07-30 09:36:37 -07:00
Mehdi Amini
75e5a70577
[MLIR] Migrate some conversion passes and dialects to LDBG() macro (NFC) (#151349) 2025-07-30 17:58:54 +02:00
Krzysztof Drewniak
330a7e1136
[mlir][Vector] Make elementwise-on-broadcast sinking handle splat consts (#150867)
There is a pattern that rewrites
elementwise_op(broadcast(x1 : T to U), broadcast(x2 : T to U), ...) to
broadcast(elementwise_op(x1, x2, ...) : T to U).

This pattern did not, however, account for the case where a broadcast
constant is represented as a SplatElementsAttr, which can safely be
reshaped or scalarized but is not a `vector.broadcast` or `vector.splat`
operation.

This patch fixes this oversight, prenting premature broadcasting.

This did result in the need to update some linalg dialect tests, which
now feature a less-broadcast computation and/or more constant folding.
2025-07-29 09:40:49 -07:00
Maksim Levental
fcbcfe44cf
[mlir][NFC] update mlir/Dialect create APIs (32/n) (#150657)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-25 13:50:15 -05:00
Jacques Pienaar
07967d4af8
[mlir] Switch to new LDBG macro (#150616)
Change local variants to use new central one.
2025-07-25 18:22:46 +02:00
Longsheng Mou
75aa629269
[mlir][vector] Add a check to ensure input vector rank equals target shape rank (#149239)
The crash is caused because, during IR transformation, the
vector-unrolling pass (using ExtractStridedSliceOp) attempts to slice an
input vector of higher rank using a target vector of lower rank, which
is not supported. Fixes #148368.
2025-07-25 10:37:33 +08:00
Kazu Hirata
0925d7572a
[mlir] Remove unused includes (NFC) (#150266)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:53 -07:00
Maksim Levental
f904cdd6c3
[mlir][NFC] update mlir/Dialect create APIs (24/n) (#149931)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-22 08:16:15 -04:00
Andrzej Warzyński
03bd0f36ba
[mlir][vector] Remove MatrixMultiplyOp and FlatTransposeOp from Vector dialect (#144307)
This patch deletes `vector.matrix_multiply` and `vector.flat_transpose`,
which are thin wrappers around the corresponding LLVM intrinsics:
  - `llvm.intr.matrix.multiply`
  - `llvm.intr.matrix.transpose`

These Vector dialect ops did not provide additional semantics or
abstraction beyond the LLVM intrinsics. Their removal simplifies the
lowering pipeline without losing any functionality.

The lowering chains:
- `vector.contract` → `vector.matrix_multiply` →
`llvm.intr.matrix.multiply`
- `vector.transpose` → `vector.flat_transpose` →
`llvm.intr.matrix.transpose`

are now replaced with:
  - `vector.contract` → `llvm.intr.matrix.multiply`
  - `vector.transpose` → `llvm.intr.matrix.transpose`

This was accomplished by directly replacing:
  - `vector::MatrixMultiplyOp` with `LLVM::MatrixMultiplyOp`
  - `vector::FlatTransposeOp` with `LLVM::MatrixTransposeOp`

Note: To avoid a build-time dependency from `Vector` to `LLVM`,
relevant transformations are moved from "Vector/Transforms" to
`Conversion/VectorToLLVM`.
2025-07-21 08:19:30 +01:00
Kazu Hirata
c06d3a7b72
[mlir] Remove unused includes (NFC) (#148769)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-14 22:19:23 -07:00
Charitha Saumya
244ebef1dd
Reapply [mlir][vector] Refactor WarpOpScfForOp to support unused or swapped forOp results. (#148313)
Reapply attempt for : https://github.com/llvm/llvm-project/pull/148291
Fix for the build failure reported in :
https://lab.llvm.org/buildbot/#/builders/116/builds/15477

-----

This crash is caused by mismatch of distributed type returned by
`getDistributedType` and intended distributed type for forOp results.

Solution diff:
20c2cf6766

Example:
```
func.func @warp_scf_for_broadcasted_result(%arg0: index) -> vector<1xf32> {
  %c128 = arith.constant 128 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %2 = gpu.warp_execute_on_lane_0(%arg0)[32] -> (vector<1xf32>) {
    %ini = "some_def"() : () -> (vector<1xf32>)
    %0 = scf.for %arg3 = %c0 to %c128 step %c1 iter_args(%arg4 = %ini) -> (vector<1xf32>) {
      %1 = "some_op"(%arg4) : (vector<1xf32>) -> (vector<1xf32>)
      scf.yield %1 : vector<1xf32>
    }
    gpu.yield %0 : vector<1xf32>
  }
  return %2 : vector<1xf32>
}
``` 
In this case the distributed type for forOp result is `vector<1xf32>`
(result is not distributed and broadcasted to all lanes instead).
However, in this case `getDistributedType` will return NULL type.

Therefore, if the distributed type can be recovered from warpOp, we
should always do that first before using `getDistributedType`
2025-07-14 15:41:56 -07:00
Nishant Patel
834591e062
[MLIR] [Vector] Linearization patterns for vector.load and vector.store (#145115)
This PR add inearizarion pattern for vector.load and vector.store. It is
follow up PR to
https://github.com/llvm/llvm-project/pull/143420#issuecomment-2967406606
2025-07-14 14:24:52 -07:00
Quinn Dawkins
b1ef5a8890
[mlir][MemRef] Add support for emulating narrow floats (#148036)
This enables memref.load/store + vector.load/store support for sub-byte
float types. Since the memref types don't matter for loads/stores, we
still use the same types as integers with equivalent widths, with a few
extra bitcasts needed around certain operations.

There is no direct change needed for vector.load/store support. The
tests added for them are to verify that float types are
supported as well.
2025-07-14 11:18:51 -04:00
Diego Caballero
ace1c838ca
[mlir][Vector] Support scalar vector.extract in VectorLinearize (#147440)
It generates a linearized version of the `vector.extract` for scalar cases.
2025-07-11 16:02:26 -07:00
Charitha Saumya
1d33bbab57
Revert "[mlir][vector] Refactor WarpOpScfForOp to support unused or swapped forOp results." (#148291)
Reverts llvm/llvm-project#147620

Reverting due to build failure:
https://lab.llvm.org/buildbot/#/builders/116/builds/15477
2025-07-11 13:22:54 -07:00
Charitha Saumya
3092b765ba
[mlir][vector] Refactor WarpOpScfForOp to support unused or swapped forOp results. (#147620)
Current implementation generates incorrect code or crashes in the
following valid cases.

1. At least one of the for op results are not yielded by the warpOp.
Example:
```
%0 = gpu.warp_execute_on_lane_0(%arg0)[32] -> (vector<4xf32>) {
    ....
    %3:2 = scf.for %arg3 = %c0 to %c128 step %c1 iter_args(%arg4 = %ini, %arg5 = %ini1) -> (vector<128xf32>, vector<128xf32>) {
      
      %1  = ...
      %acc = ....
      scf.yield %acc, %1 : vector<128xf32>, vector<128xf32>
    }
    gpu.yield %3#0 : vector<128xf32> // %3#1 is not used but can not be removed as dead code (loop carried).
  }
  "some_use"(%0) : (vector<4xf32>) -> ()
  return
```
2. Enclosing warpOp yields the forOp results in different order compared
to the forOp results.
Example:
```
  %0:3 = gpu.warp_execute_on_lane_0(%arg0)[32] -> (vector<4xf32>, vector<4xf32>, vector<8xf32>) {
    ....
    %3:3 = scf.for %arg3 = %c0 to %c128 step %c1 iter_args(%arg4 = %ini1, %arg5 = %ini2, %arg6 = %ini3) -> (vector<256xf32>, vector<128xf32>, vector<128xf32>) {
      .....
      scf.yield %acc1, %acc2, %acc3 : vector<256xf32>, vector<128xf32>, vector<128xf32>
    }
    gpu.yield %3#2, %3#1, %3#0 : vector<128xf32>, vector<128xf32>, vector<256xf32> // swapped order
  }
  "some_use_1"(%0#0) : (vector<4xf32>) -> ()
  "some_use_2"(%0#1) : (vector<4xf32>) -> ()
  "some_use_3"(%0#2) : (vector<8xf32>) -> ()

```
2025-07-11 13:08:33 -07:00
Kunwar Grover
77914c96df
[mlir][Vector] Do not propagate vector.extract on dynamic position (#148245)
Propagating vector.extract when a dynamic position is present can cause
dominance issues and needs better handling. For now, disable propagation
if there is a dynamic position present.
2025-07-11 15:38:48 +01:00
Kazu Hirata
cd65f8bf17 [mlir] Fix a warning
This patch fixes:

  mlir/lib/Dialect/Vector/Transforms/LowerVectorToFromElementsToShuffleTree.cpp:42:20:
  error: unused variable 'kIndScale' [-Werror,-Wunused-const-variable]
2025-07-09 16:45:18 -07:00
Diego Caballero
ddf9b91f9f
[mlir][Vector] Add vector.shuffle tree transformation (#145740)
This PR adds a new transformation that turns sequences of `vector.to_elements` and `vector.from_elements` into a binary tree of `vector.shuffle` operations.

(Related RFC:
https://discourse.llvm.org/t/rfc-adding-vector-to-elements-op-to-the-vector-dialect/86779).

Example:

```
  %0:4 = vector.to_elements %a : vector<4xf32>
  %1:4 = vector.to_elements %b : vector<4xf32>
  %2:4 = vector.to_elements %c : vector<4xf32>
  %3 = vector.from_elements %0#0, %0#1, %0#2, %0#3,
                            %1#0, %1#1, %1#2, %1#3,
                            %2#0, %2#1, %2#2, %2#3 : vector<12xf32>

==>

  %0 = vector.shuffle %a, %b [0, 1, 2, 3, 4, 5, 6, 7] : vector<4xf32>, vector<4xf32>
  %1 = vector.shuffle %c, %c [0, 1, 2, 3, -1, -1, -1, -1] : vector<4xf32>, vector<4xf32>
  %2 = vector.shuffle %0, %1 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] : vector<8xf32>, vector<8xf32>
```

The algorithm leverages the structured extraction/insertion information
of `vector.to_elements` and `vector.from_elements` operations and builds
a set of intervals to determine the vector length that should be used at
each level of the tree to combine the level inputs in pairs.

There are a few improvements that can be implemented in the future, such
as shuffle mask compression to avoid unnecessarily large vector lengths
with poison values, but I decided to keep things "simpler" and spend
more time documenting the different steps of the algorithm so that
people can follow along.
2025-07-09 16:09:53 -07:00
Diego Caballero
889ac879ce
[mlir][Vector] Remove usage of vector.insertelement/extractelement from Vector (#144413)
This PR is part of the last step to remove `vector.extractelement` and `vector.insertelement` ops.
RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops

It removes instances of `vector.extractelement` and `vector.insertelement` from the Vector dialect layer.
2025-07-09 12:09:17 -07:00
Diego Caballero
7451e4c330
[mlir][Vector] Support scalar 'vector.insert' in vector linearization (#146954)
This PR add support for linearizing the insertion of a scalar element by
just linearizing the `vector.insert` op.
2025-07-07 10:22:33 -07:00
Andrzej Warzyński
6ecb6a8a8c
[mlir][vector][nfc] Rename populateVectorTransferCollapseInnerMostContiguousDimsPatterns (#145228)
Renames `populateVectorTransferCollapseInnerMostContiguousDimsPatterns`
as `populateDropInnerMostUnitDimsXferOpPatterns` + updates the
corresponding comments.

This addresses a TODO and makes the difference between these two
`populate*` methods clearer:
 * `populateDropUnitDimWithShapeCastPatterns`,
 * `populateDropInnerMostUnitDimsXferOpPatterns`.
2025-07-02 20:07:35 +01:00
Fabian Mora
878d3594ed
[mlir][vector] Avoid setting padding by default to 0 in vector.transfer_read prefer ub.poison (#146088)
Context:
`vector.transfer_read` always requires a padding value. Most of its
builders take no `padding` value and assume the safe value of `0`.
However, this should be a conscious choice by the API user, as it makes
it easy to introduce bugs.
For example, I found several occasions while making this patch that the
padding value was not getting propagated (`vector.transfer_read` was
transformed into another `vector.transfer_read`). These bugs, were
always caused because of constructors that don't require specifying
padding.

Additionally, using `ub.poison` as a possible default value is better,
as it indicates the user "doesn't care" about the actual padding value,
forcing users to specify the actual padding semantics they want.

With that in mind, this patch changes the builders in
`vector.transfer_read` to always having a `std::optional<Value> padding`
argument. This argument is never optional, but for convenience users can
pass `std::nullopt`, padding the transfer read with `ub.poison`.

---------

Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
2025-06-30 15:20:42 -04:00
Charitha Saumya
c539ec0db5
[mlir][vector] Add support for vector extract/insert_strided_slice in vector distribution. (#145421)
This PR adds initial support for `vector.extract_strided_slice` and
`vector.insert_strided_slice` ops in vector distribution.
2025-06-25 16:41:28 -07:00
Momchil Velikov
c7d9b6ed5d
[MLIR] Fix incorrect slice contiguity inference in vector::isContiguousSlice (#142422)
Previously, slices were sometimes marked as non-contiguous when they
were actually contiguous. This occurred when the vector type had leading
unit dimensions, e.g., `vector<1x1x...x1xd0xd1x...xdn-1xT>`. In such
cases, only the trailing `n` dimensions of the memref need to be
contiguous, not the entire vector rank.

This affects how `FlattenContiguousRowMajorTransfer{Read,Write}Pattern`
flattens `transfer_read` and `transfer_write` ops.

The patterns used to collapse a number of dimensions equal to the vector
rank which missed some opportunities when the leading unit dimensions of
the vector span non-contiguous dimensions of the memref.
Now that the contiguity of the slice is determined correctly, there is a
choice how many dimensions of the
memref to collapse, ranging from 
a) the number of vector dimensions after ignoring the leading unit
dimensions, up to
  b) the maximum number of contiguous memref dimensions

This patch makes a choice to do minimal memref collapsing. The rationale
behind this decision is that
this way the least amount of information is discarded.

(It follows that in some cases where the patterns used to trigger and
collapse some memref dimensions, after this patch the patterns may
collapse less dimensions).
2025-06-23 14:12:56 +01:00
Nishant Patel
9c1ce31f54
[mlir][vector] Add unroll patterns for vector.load and vector.store (#143420)
This PR adds unroll patterns for vector.load and vector.store. This PR is follow up of #137558
2025-06-20 13:50:25 -07:00
Diego Caballero
a00b736a79
[mlir][Vector] Support vector.extract(xfer_read) folding with dynamic indices (#143269)
This PR is part of the last step to remove `vector.extractelement` and `vector.insertelement` ops.
RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops

It adds support for folding `vector.transfer_read(vector.extract) ->
memref.load` with dynamic indices, which is currently supported by
`vector.extractelement`.
2025-06-16 12:05:20 -07:00
Charitha Saumya
0c7ce6883a
Revert "[mlir][vector] Fix for WarpOpScfForOp failure when scf.for has results that are unused." (#144124)
Reverts llvm/llvm-project#141853

Reverting the bug fix because it does not handle all cases correctly.
2025-06-13 11:02:05 -07:00