23873 Commits

Author SHA1 Message Date
Yang Bai
4eb1a07d7d
[mlir][vector] Support multi-dimensional vectors in VectorFromElementsLowering (#151175)
This patch introduces a new unrolling-based approach for lowering
multi-dimensional `vector.from_elements` operations.

**Implementation Details:**
1. **New Transform Pattern**: Added `UnrollFromElements`, which unrolls an
N-D (N >= 2) from_elements op into (N-1)-D from_elements ops along the
outermost dimension.
2. **Utility Functions**: Added `unrollVectorOp` to reuse the unrolling
algorithm of vector.gather for vector.from_elements.
3. **Integration**: Added the unrolling pattern to the
convert-vector-to-llvm pass as a temporary transformation.
4. Use direct LLVM dialect operations instead of intermediate
vector.insert operations for efficiency in `VectorFromElementsLowering`.

**Example:**
```mlir
// unroll
%v = vector.from_elements  %e0, %e1, %e2, %e3 : vector<2x2xf32>
=>
%poison_2d = ub.poison : vector<2x2xf32>
%vec_1d_0 = vector.from_elements %e0, %e1 : vector<2xf32>
%vec_2d_0 = vector.insert %vec_1d_0, %poison_2d [0] : vector<2xf32> into vector<2x2xf32>
%vec_1d_1 = vector.from_elements %e2, %e3 : vector<2xf32>
%result = vector.insert %vec_1d_1, %vec_2d_0 [1] : vector<2xf32> into vector<2x2xf32>

// convert-vector-to-llvm
%v = vector.from_elements %e0, %e1, %e2, %e3 : vector<2x2xf32>
=>
%poison_2d = ub.poison : vector<2x2xf32>
%poison_2d_cast = builtin.unrealized_conversion_cast %poison_2d : vector<2x2xf32> to !llvm.array<2 x vector<2xf32>>
%poison_1d_0 = llvm.mlir.poison : vector<2xf32>
%c0_0 = llvm.mlir.constant(0 : i64) : i64
%vec_1d_0_0 = llvm.insertelement %e0, %poison_1d_0[%c0_0 : i64] : vector<2xf32>
%c1_0 = llvm.mlir.constant(1 : i64) : i64
%vec_1d_0_1 = llvm.insertelement %e1, %vec_1d_0_0[%c1_0 : i64] : vector<2xf32>
%vec_2d_0 = llvm.insertvalue %vec_1d_0_1, %poison_2d_cast[0] : !llvm.array<2 x vector<2xf32>>
%poison_1d_1 = llvm.mlir.poison : vector<2xf32>
%c0_1 = llvm.mlir.constant(0 : i64) : i64
%vec_1d_1_0 = llvm.insertelement %e2, %poison_1d_1[%c0_1 : i64] : vector<2xf32>
%c1_1 = llvm.mlir.constant(1 : i64) : i64
%vec_1d_1_1 = llvm.insertelement %e3, %vec_1d_1_0[%c1_1 : i64] : vector<2xf32>
%vec_2d_1 = llvm.insertvalue %vec_1d_1_1, %vec_2d_0[1] : !llvm.array<2 x vector<2xf32>>
%result = builtin.unrealized_conversion_cast %vec_2d_1 : !llvm.array<2 x vector<2xf32>> to vector<2x2xf32>
```
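
For illustration only, the unrolling step can be modelled outside MLIR. The Python sketch below is not the `UnrollFromElements` pattern; `unroll_from_elements` is a hypothetical helper that assumes a flat, row-major element list and splits it into one (N-1)-D group per outermost index.

```python
from math import prod


def unroll_from_elements(elements, shape):
    """Split a flat row-major element list into one slice per outermost index.

    Hypothetical model: each (index, sub_elements, sub_shape) triple stands for
    one (N-1)-D from_elements op that is inserted at position [index].
    """
    if len(shape) < 2:
        raise ValueError("only N-D shapes with N >= 2 are unrolled")
    if len(elements) != prod(shape):
        raise ValueError("element count must match the vector shape")
    sub_shape = shape[1:]
    chunk = prod(sub_shape)  # number of elements in each (N-1)-D slice
    return [
        (i, elements[i * chunk:(i + 1) * chunk], sub_shape)
        for i in range(shape[0])
    ]


slices = unroll_from_elements(["e0", "e1", "e2", "e3"], (2, 2))
```

For the `vector<2x2xf32>` example this yields two 1-D groups, `%e0, %e1` at index 0 and `%e2, %e3` at index 1, matching the two `vector.insert` ops above.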

---------

Co-authored-by: Nicolas Vasilache <Nico.Vasilache@amd.com>
Co-authored-by: Yang Bai <yangb@nvidia.com>
Co-authored-by: James Newling <james.newling@gmail.com>
Co-authored-by: Diego Caballero <dieg0ca6aller0@gmail.com>
2025-08-18 10:09:12 -07:00
Nishant Patel
4a9d038acd
[MLIR][XeGPU] Distribute load_nd/store_nd/prefetch_nd with offsets from Wg to Sg (#153432)
This PR adds a pattern to distribute the load/store/prefetch nd ops with
offsets from workgroup to subgroup IR. This PR is part of the transition
to move offsets from create_nd to load/store/prefetch nd ops.

Create_nd PR : #152351
2025-08-18 09:45:29 -07:00
Jeremy Kun
c67d27dad0
[mlir][Presburger] NFC: return var index from IntegerRelation::addLocalFloorDiv (#153463)
addLocalFloorDiv currently returns void and requires the caller to know
that the newly added local variable is in a particular index. This
commit returns the index of the newly added variable so that callers
need not tie themselves to this implementation detail.

I found one relevant callsite demonstrating this and updated it. I am
using this API out of tree and wanted to make our out-of-tree code a bit
more resilient to upstream changes.
2025-08-18 08:47:47 -07:00
Jacques Pienaar
4bf33958da
[mlir] Update builders to use new form. (#154132)
Mechanically applied using clang-tidy.
2025-08-18 15:19:34 +00:00
Matthias Springer
f84aaa6eaa
[mlir][Transforms] Dialect conversion: Add flag to dump materialization kind (#119532)
Add a debugging flag to the dialect conversion to dump the
materialization kind. This flag is useful to find out whether a missing
materialization rule is for source or target materializations.

Also add missing test coverage for the `buildMaterializations` flag.
2025-08-18 13:25:18 +00:00
Chaitanya
4a3bf27c69
[OpenMP] Introduce omp.target_allocmem and omp.target_freemem omp dialect ops. (#145464)
This PR introduces two new ops in omp dialect, omp.target_allocmem and
omp.target_freemem.
omp.target_allocmem: Allocates heap memory on device. Will be lowered to
omp_target_alloc call in llvm.
omp.target_freemem: Deallocates heap memory on device. Will be lowered
to omp_target_free call in llvm.


Example:
  %1 = omp.target_allocmem %device : i32, i64
  omp.target_freemem %device, %1 : i32, i64

The work in this PR is copied from/inspired by @ivanradanov's commits from
the coexecute implementation:
[Add fir omp target alloc and free
ops](be860ac8ba)
[Lower omp_target_{alloc,free} to
llvm](6e2d584dc9)
2025-08-18 18:15:11 +05:30
Mehdi Amini
cfe5975eaf
[MLIR] Fix SCF verifier crash (#153974)
An operand of the nested yield op can be null and hasn't been verified
yet when processing the enclosing operation. Using `getResultTypes()`
will dereference this null Value and crash in the verifier.
2025-08-18 12:48:55 +02:00
Andrzej Warzyński
51b5a3e1a6
[MLIR] Add Egress dialects maintainers (#151721)
As per https://discourse.llvm.org/t/mlir-project-maintainers/87189, this
PR adds maintainers for the "egress" dialects.

Compared to the original proposal, two changes are included:
* The "mesh" dialect has been renamed to "shard"
(https://discourse.llvm.org/t/mlir-mesh-cleanup-mesh/).
* The "XeVM" dialect has been added
(https://discourse.llvm.org/t/rfc-proposal-for-new-xevm-dialect/).
2025-08-18 10:34:44 +01:00
Mehdi Amini
16aa283344
[MLIR] Refactor the walkAndApplyPatterns driver to remove the recursion (#154037)
This is in preparation of a follow-up change to stop traversing
unreachable blocks.

This is not NFC because of a subtlety of the early_inc. On a test case
like:

```
  scf.if %cond {
    "test.move_after_parent_op"() ({
      "test.any_attr_of_i32_str"() {attr = 0 : i32} : () -> ()
    }) : () -> ()
  }
```

We recursively traverse the nested regions, and process an op when the
region is done (post-order).
We need to pre-increment the iterator before processing an operation in
case it gets deleted. However, we can do this before or after processing
the nested region. This implementation does the latter.
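
As a rough illustration of a recursion-free post-order walk, here is a Python sketch that uses an explicit stack. It is not the `walkAndApplyPatterns` driver: the dict-based op model and `walk_post_order` are hypothetical, and op deletion (the early-increment subtlety) is not modelled.

```python
def walk_post_order(root):
    """Visit nested ops post-order with an explicit stack instead of recursion.

    Each op is modelled as {"name": str, "children": [ops...]} (an assumption
    made for this sketch only).
    """
    visited = []
    stack = [(root, iter(root["children"]))]
    while stack:
        op, children = stack[-1]
        child = next(children, None)
        if child is not None:
            # Descend into the next nested op before finishing the parent.
            stack.append((child, iter(child["children"])))
        else:
            # All nested ops are done: process the op itself.
            stack.pop()
            visited.append(op["name"])
    return visited


tree = {
    "name": "scf.if",
    "children": [
        {
            "name": "test.move_after_parent_op",
            "children": [{"name": "test.any_attr_of_i32_str", "children": []}],
        }
    ],
}
order = walk_post_order(tree)
```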
2025-08-18 09:07:19 +00:00
Mehdi Amini
87e6fd161a
[MLIR] Erase unreachable blocks before applying patterns in the greedy rewriter (#153957)
Operations like:

    %add = arith.addi %add, %add : i64

are legal in unreachable code. Unfortunately many patterns would be
unsafe to apply on such IR and can lead to crashes or infinite loops. To
avoid this we can remove unreachable blocks before attempting to apply
patterns.
We may also have to do this whenever the CFG is changed by a pattern;
this is left for future work.
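
A minimal sketch of the idea, assuming a CFG given as a plain successor map. The helper names are hypothetical and this is not the greedy rewriter's code: blocks that cannot be reached from the entry block are dropped before any pattern runs.

```python
def reachable_blocks(entry, successors):
    """Return the set of blocks reachable from `entry` (worklist traversal)."""
    seen = {entry}
    worklist = [entry]
    while worklist:
        block = worklist.pop()
        for succ in successors.get(block, ()):
            if succ not in seen:
                seen.add(succ)
                worklist.append(succ)
    return seen


def erase_unreachable(entry, successors):
    """Keep only the reachable blocks of the successor map."""
    live = reachable_blocks(entry, successors)
    return {block: succs for block, succs in successors.items() if block in live}


# "dead" branches to itself and to "exit", but nothing branches to "dead".
cfg = {"entry": ["exit"], "exit": [], "dead": ["dead", "exit"]}
pruned = erase_unreachable("entry", cfg)
```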

Fixes #153732
2025-08-18 10:59:43 +02:00
Matthias Springer
ff68f7115c
[mlir][builtin] Make unrealized_conversion_cast inlineable (#139722)
Until now, `builtin.unrealized_conversion_cast` ops could not be inlined
by the Inliner pass.
2025-08-18 10:23:26 +02:00
Matthias Springer
f7b09ad700
[mlir][LLVM] ArithToLLVM: Add 1:N support for arith.select lowering (#153944)
Add 1:N support for the `arith.select` lowering. Only cases where the
entire true/false value is selected are supported.
2025-08-18 09:42:37 +02:00
Guray Ozen
5d300afa80
[MLIR][NVVM] Add support for multiple return values in inline_ptx (#153774)
This PR adds the ability for `nvvm.inline_ptx` to return multiple
values, matching the expected semantics in PTX while respecting LLVM’s
constraints.

LLVM’s `inline_asm` op does not natively support multiple returns —
instead, it requires packing results into an LLVM `struct` and then
extracting them. This PR implements automatic packing/unpacking so that
multiple return values can be expressed naturally in MLIR without extra
user boilerplate.

**Example**
MLIR:

```
%r1, %r2 = nvvm.inline_ptx  "{
   .reg .pred p;
   setp.ge.s32 p, $2, $3;
   selp.s32 $0, $2, $3, p;
   selp.s32 $1, $2, $3, !p;
}" (%a, %b) : i32, i32 -> i32, i32

%r3 = llvm.add %r1, %r2 : i32
```

Lowered LLVM IR:

```
%1 = llvm.inline_asm has_side_effects asm_dialect = att "{\0A\09 .reg .pred p;\0A\09 setp.ge.s32 p, $2, $3;\0A\09 selp.s32 $0, $2, $3, p;\0A\09 selp.s32 $1, $2, $3, !p;\0A\09}\0A", "=r,=r,r,r" %a, %b : (i32, i32) -> !llvm.struct<(i32, i32)>
%2 = llvm.extractvalue %1[0] : !llvm.struct<(i32, i32)>
%3 = llvm.extractvalue %1[1] : !llvm.struct<(i32, i32)>
%4 = llvm.add %2, %3 : i32
```
2025-08-18 08:37:55 +02:00
Shenghang Tsai
7610b13729
[MLIR] Split ExecutionEngine Initialization out of ctor into an explicit method call (#153524)
Retry landing https://github.com/llvm/llvm-project/pull/153373
## Major changes from previous attempt
- remove the test in CAPI because no existing tests in CAPI deal with
sanitizer exemptions
- update `mlir/docs/Dialects/GPU.md` to reflect the new behavior: load
GPU binary in global ctors, instead of loading them at call site.
- skip the test on Aarch64 since we have an issue with initialization there

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-08-17 23:07:24 +02:00
Veera
e1aa415220
[mlir][InferIntRangeCommon] Fix Division by Zero Crash (#151637)
Fixes #131273

Adds a check to avoid division when max value of denominator is zero.
2025-08-17 10:56:34 -07:00
Erik Davis
a66d8f62e6
[mlir][doc] fixup code block (#153977)
This fixes a small typo in the toy tutorial. A code block was not
correctly terminated, causing it to run into the subsequent block.
2025-08-17 13:01:05 +02:00
Matthias Springer
0d8aa9d9ec
[mlir][SparseTensor] Simplify pipeline (#152908)
This refactoring improves compilation time.
2025-08-16 18:45:26 +02:00
Maksim Levental
6fc1deb8b7
[mlir][python] handle more undefined symbols not covered by nanobind (#153861)
Introduced (but omitted from this CMake) in
https://github.com/llvm/llvm-project/pull/151246.
2025-08-16 09:25:15 -04:00
Matthias Springer
2692ff8213
[mlir][LLVM] Fix build (#153947)
Fix build after #153937.
2025-08-16 13:06:58 +02:00
Matthias Springer
f8f23e838a
[mlir][LLVM] ControlFlowToLLVM: Add 1:N type conversion support (#153937)
Add support for 1:N type conversions to the `ControlFlowToLLVM` lowering
patterns. Not applicable to `cf.switch` and `cf.assert`.

---------

Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2025-08-16 12:51:40 +02:00
Matthias Springer
f0967fca04
[mlir][LLVM] FuncToLLVM: Add 1:N type conversion support (#153823)
Add support for 1:N type conversions to the `FuncToLLVM` lowering
patterns. This commit does not change the lowering of any types (such as
`MemRefType`). It just sets up the infrastructure, such that 1:N type
conversions can be used during `FuncToLLVM`.

Note: When the converted result types of a `func.func` have more than 1
type, then the results are wrapped in an `llvm.struct`. That's because
`llvm.func` does not support multiple result values. This "wrapping" was
already implemented for cases where the original `func.func` has
multiple results. With 1:N conversions, even a single result can now
expand to multiple converted results, triggering the same wrapping
mechanism.

The test cases are exercised with both the old and the new no-rollback
conversion driver.
2025-08-16 09:45:08 +02:00
Chao Chen
9c4e571ae8
[mlir][xegpu] Add definitions of MemDescType and related ops. (#153273)
2025-08-15 18:02:13 -05:00
Aiden Grossman
ca8ee49c1f
[MLIR] Set LLVM_LIT_ARGS in Standalone Example CMake (#152423)
Setting LLVM_LIT_ARGS to include --quiet and then running check-mlir in
a standard checkout will otherwise cause test failures here because
LLVM_LIT_ARGS gets propagated into this project.
2025-08-15 12:40:32 -07:00
asraa
b045729eb4
[mlir][presburger] add functionality to compute local mod in IntegerRelation (#153614)
Similar to `IntegerRelation::addLocalFloorDiv`, this adds a utility
`IntegerRelation::addLocalModulo` that adds and returns a local variable
that is the modulus of an affine function of the variables modulo some
constant modulus. The function returns the absolute index of the new var
in the relation.

This is computed by first finding the floordiv of `exprs // modulus = q`
and then computing the remainder `result = exprs - q * modulus`.
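
The arithmetic can be checked on concrete integers. This Python sketch mirrors only the two steps named above; `floordiv_then_mod` is a hypothetical name and a positive modulus is assumed.

```python
def floordiv_then_mod(value, modulus):
    """Compute q = value // modulus, then result = value - q * modulus."""
    if modulus <= 0:
        raise ValueError("this sketch assumes a positive constant modulus")
    q = value // modulus          # floor division, rounds toward -infinity
    result = value - q * modulus  # remainder, always in [0, modulus)
    return q, result
```

With floor division the remainder stays non-negative for negative inputs, e.g. `-7` with modulus `3` gives `q = -3` and `result = 2`.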

Signed-off-by: Asra Ali <asraa@google.com>
2025-08-15 09:55:13 -07:00
Andrey Timonin
dfa1335db1
[mlir][emitc] Add verification for the emitc.get_field op (#152577)
This MR adds a `verifier` for the `emitc.get_field` op. 
- The `verifier` checks that the `emitc.get_field` operation is nested
  inside an `emitc.class` op.
- Additionally, appropriate tests for erroneous cases were added for
  class-related operations in `invalid_ops.mlir`.
2025-08-15 18:32:12 +02:00
Tim Gymnich
ffaba758fb
[MLIR][ROCDL] Add permlane16.swap and permlane32.swap (#153804)
Add rocdl.permlane16.swap and rocdl.permlane32.swap.
2025-08-15 17:35:31 +02:00
Kazu Hirata
f4bc3151bb
[mlir] Fix warnings
This patch fixes:

  mlir/lib/Target/Wasm/TranslateFromWasm.cpp:82:1: error: unused
  variable 'wasmSectionName<(anonymous
  namespace)::WasmSectionType::DATACOUNT>'
  [-Werror,-Wunused-const-variable]

  mlir/lib/Target/Wasm/TranslateFromWasm.cpp:100:5: error: unused
  variable 'valueTypesEncodings' [-Werror,-Wunused-const-variable]

  mlir/lib/Target/Wasm/TranslateFromWasm.cpp:735:13: error: unused
  function 'buildLiteralType<unsigned int>'
  [-Werror,-Wunused-function]

  mlir/lib/Target/Wasm/TranslateFromWasm.cpp:740:13: error: unused
  function 'buildLiteralType<unsigned long>'
  [-Werror,-Wunused-function]

  mlir/lib/Target/Wasm/TranslateFromWasm.cpp:292:33: error: private
  field 'symbols' is not used [-Werror,-Wunused-private-field]
2025-08-15 07:24:31 -07:00
Guray Ozen
4c389178ee
[MLIR][NVVM] Print readable modifier (NFC) (#153779)
Currently, the modifier is printed as an address, so it is neither
readable nor useful. This PR adds readable printing for it.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-15 15:47:39 +02:00
Guray Ozen
af92cabdef
[MLIR][NVVM] Combine griddepcontrol Ops (#152525)
We have two ops:
1. nvvm.griddepcontrol.wait
2. nvvm.griddepcontrol.launch_dependents

Both belong to the same concept, Grid Dependent Launch (or programmatic
dependent launch in CUDA). This PR unifies both ops into a single
one.
2025-08-15 15:47:12 +02:00
Erick Ochoa Lopez
61caab7789
[mlir][llvm] Add align attribute to llvm.intr.masked.{expandload,compressstore} (#153063)
* Add `requiresArgsAndResultsAttr` to `LLVM_OneResultIntrOp`
* Add `args_attrs` to `llvm.intr.masked.{expandload,compressstore}`

The LLVM intrinsics
[`llvm.intr.masked.expandload`](https://llvm.org/docs/LangRef.html#llvm-masked-expandload-intrinsics)
and
[`llvm.intr.masked.compressstore`](https://llvm.org/docs/LangRef.html#llvm-masked-compressstore-intrinsics)
both allow an optional align parameter attribute to be set which
defaults to one.

Inlining the documentation below for
[`llvm.intr.masked.expandload`'s](https://llvm.org/docs/LangRef.html#id1522)
and
[`llvm.intr.masked.compressstore`'s](https://llvm.org/docs/LangRef.html#id1522)
arguments respectively:

> The `align` parameter attribute can be provided for the first
argument. The pointer alignment defaults to 1.

> The `align` parameter attribute can be provided for the second
argument. The pointer alignment defaults to 1.
2025-08-15 08:34:14 -04:00
Mehdi Amini
69453d7021
[MLIR] Fix memory leak in importWebAssemblyToModule when it fails to import (#153794)
2025-08-15 12:33:25 +00:00
Mehdi Amini
7640645f79
[MLIR][Wasm] Remove statistics as they depend on global ctors (#153795)
Use a debug log instead for now.
2025-08-15 12:29:20 +00:00
Markus Böck
8582025f1f
[mlir][Transforms] Turn 1:N -> 1:1 dispatch fatal error into match failure (#153605)
Prior to this PR, the default behaviour of a conversion pattern which
receives operands of a 1:N type conversion is to abort the compilation. This has
historically been useful when the 1:N type conversion got merged into
the dialect conversion as it allowed us to easily find patterns that
should be capable of handling 1:N type conversions but didn't.

However, this behaviour has the disadvantage of being non-composable:
While the pattern in question cannot handle the 1:N type conversion,
another pattern part of the set might, but doesn't get the chance as
compilation is aborted.

This PR fixes this behaviour by failing to match instead of
aborting, giving other patterns the chance to legalize an op. The
implementation uses a reusable function called `dispatchTo1To1` to allow
derived conversion patterns to also implement the behaviour.
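
The composability argument can be sketched in Python. Everything here is hypothetical (class names, the string "IR", the operand model of one list per original operand) and it is not the `dispatchTo1To1` implementation. It only shows why a match failure lets a later pattern legalize the op.

```python
class OneToOnePattern:
    """Handles only operands that were converted 1:1."""

    def match_and_rewrite(self, op, operands):
        if any(len(values) != 1 for values in operands):
            return None  # match failure instead of aborting compilation
        return f"{op}.lowered({', '.join(values[0] for values in operands)})"


class OneToNPattern:
    """Handles operands that expanded to any number of values."""

    def match_and_rewrite(self, op, operands):
        flat = [value for values in operands for value in values]
        return f"{op}.lowered({', '.join(flat)})"


def legalize(op, operands, patterns):
    """Try each pattern in order and return the first successful rewrite."""
    for pattern in patterns:
        result = pattern.match_and_rewrite(op, operands)
        if result is not None:
            return result
    raise RuntimeError(f"failed to legalize {op}")


patterns = [OneToOnePattern(), OneToNPattern()]
# The second operand was converted 1:N (one value became two).
lowered = legalize("foo", [["a"], ["b0", "b1"]], patterns)
```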
2025-08-15 11:45:25 +02:00
Matthias Springer
21b607adbe
[mlir][SCF] scf.for: Add support for unsigned integer comparison (#153379)
Add a new unit attribute to allow for unsigned integer comparison.

Example:
```mlir
scf.for unsigned %iv_32 = %lb_32 to %ub_32 step %step_32 : i32 {
  // body
}
```
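
To see why the comparison kind matters, the Python sketch below counts iterations of `for iv = lb; iv < ub; iv += step` on raw 32-bit patterns. It is an illustration under simplifying assumptions (positive step, no wrap-around of the induction variable); `to_signed` and `trip_count` are hypothetical helpers.

```python
def to_signed(value, bitwidth=32):
    """Reinterpret the low `bitwidth` bits of `value` as two's complement."""
    value &= (1 << bitwidth) - 1
    if value >> (bitwidth - 1):
        return value - (1 << bitwidth)
    return value


def trip_count(lb, ub, step, unsigned, bitwidth=32):
    """Number of iterations when bounds are compared signed or unsigned."""
    if step <= 0:
        raise ValueError("this sketch assumes a positive step")
    mask = (1 << bitwidth) - 1
    if unsigned:
        lo, hi = lb & mask, ub & mask
    else:
        lo, hi = to_signed(lb, bitwidth), to_signed(ub, bitwidth)
    # Ceiling division of the distance by the step, clamped at zero.
    return max(0, -(-(hi - lo) // step))
```

For an upper bound of `0x80000000`, an unsigned comparison sees 2147483648 while a signed comparison sees the most negative i32, so the signed loop does not execute at all.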

Discussion:
https://discourse.llvm.org/t/scf-should-scf-for-support-unsigned-comparison/84655
2025-08-15 10:59:14 +02:00
Ferdinand Lemaire
6bb8f6f2d0
[MLIR][WASM] Introduce an importer for Wasm binaries (#152131)
First step in introducing the wasm-import target to mlir-translate.
This is the first PR to introduce the pass. With this PR, there is very
little support for the actual WebAssembly language; it's mostly there to
introduce the skeleton of the importer. A follow-up will come with
support for a wider range of operators. It was split to make it easier
to review, since it's a good chunk of work.

---------

Co-authored-by: Luc Forget <dev@alias.lforget.fr>
Co-authored-by: Ferdinand Lemaire <ferdinand.lemaire@woven-planet.global>
Co-authored-by: Jessica Paquette <jessica.paquette@woven-planet.global>
Co-authored-by: Luc Forget <luc.forget@woven.toyota>
2025-08-15 10:54:40 +02:00
Chenguang Wang
3f797a8342
[mlir][spirv] Add missing #include in SPIRVImageInterfaces.h (#153727)
SPIRVImageInterfaces.h.inc uses some types, e.g. mlir::TypedValue,
without #including the necessary headers. This is fine most of the time,
but we did run into a weird case where bazel fails to compile
//mlir:SPIRVImageInterfaces on clang19 for ChromiumOS when parse_headers
(see [1]) is specified.

[1]: https://bazel.build/docs/bazel-and-cpp#toolchain-features
2025-08-14 19:07:54 -07:00
Erich Keane
e5e3e4bdb5
[OpenACC] Add firstprivate recipe helper methods to ACC dialect (#153604)
Like we did for the 'private' clause, this adds an easier to use helper
function to add the 'firstprivate' clause + recipe to the Parallel and
Serial ops.
2025-08-14 13:07:59 -07:00
Jianhui Li
98728d9dc8
[MLIR][XeGPU] Add lowering from transfer_read/transfer_write to load_gather/store_scatter (#152429)
Lowering transfer_read/transfer_write to load_gather/store_scatter in
case the target uArch doesn't support load_nd/store_nd. The high level
steps:
  1. compute Strides;
  2. compute Offsets;
  3. collapseMemrefTo1D;
  4. create load_gather or store_scatter op
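
The first three steps can be illustrated with plain index arithmetic. The Python sketch below assumes a static, contiguous row-major memref and ignores masks and permutation maps; `row_major_strides` and `gather_offsets` are hypothetical names, not functions from the XeGPU lowering.

```python
from itertools import product


def row_major_strides(shape):
    """Strides (in elements) of a contiguous row-major buffer."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def gather_offsets(base_indices, tile_shape, strides):
    """Linear offsets into the 1-D collapsed buffer for every tile element."""
    base = sum(i * s for i, s in zip(base_indices, strides))
    # The tile covers the innermost dimensions of the buffer.
    tile_strides = strides[len(strides) - len(tile_shape):]
    return [
        base + sum(i * s for i, s in zip(index, tile_strides))
        for index in product(*(range(d) for d in tile_shape))
    ]


strides = row_major_strides([4, 8])
offsets = gather_offsets([1, 2], [2, 2], strides)
```

Reading a 2x2 tile at indices `[1, 2]` of a 4x8 buffer touches linear offsets 10, 11, 18 and 19, which is the per-element offset vector a gather-style load would consume.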
2025-08-14 11:27:07 -07:00
Boyana Norris
ada191136b
[mlir][cmake] Fix mlir target export (#153341)
In https://github.com/llvm/llvm-project/pull/152195, target export was
accidentally moved inside a conditional, but it should have been left
outside. This patch undoes that change.
2025-08-14 11:24:44 -06:00
Matthias Springer
e2ae634cc1
[mlir][LLVM][NFC] Simplify copyUnrankedDescriptors (#153597)
Split the function into two: one that copies a single unranked
descriptor and one that copies multiple unranked descriptors. This is in
preparation of adding 1:N support to the Func->LLVM lowering patterns.
2025-08-14 18:25:19 +02:00
Boyana Norris
1945753700
[mlir][linalg] Fix incorrect linalg short form printing (#153219)
Both `linalg.map` and `linalg.reduce` are sometimes printed in short
form incorrectly, resulting in a round-trip output with different
semantics. This patch adds additional `yield` operand checks to ensure
that all criteria for short-form printing are satisfied. Updated/added
comments and renamed the `findPayloadOp` function to `canUseShortForm`,
which more accurately reflects its purpose. A couple of new lit tests
check for the proper use of long form when short-form conditions are not
met.

Fixes #117528
2025-08-14 17:19:16 +01:00
Renato Golin
8cc22ee674
[MLIR][Maintainers] Add maintainer list for core sub-categories (#152136)
Ref: https://discourse.llvm.org/t/mlir-project-maintainers/87189

See also:
 * #151721 
 * #150945

Compared to the original proposal, one change is included:
* The `ub` dialect has @Hardcode84 as maintainer.

Please accept to validate your nomination; let's keep new nominations
for follow-up PRs.
2025-08-14 16:08:15 +01:00
Matthias Springer
0ff92fe2f0
[mlir][LLVM][NFC] Simplify computeSizes function (#153588)
Rename `computeSizes` to `computeSize` and make it compute just a single
size. This is in preparation of adding 1:N support to the Func->LLVM
lowering patterns.
2025-08-14 17:00:03 +02:00
Jaden Angella
bfda0e777d
[mlir][EmitC] Expand the MemRefToEmitC pass - Lowering CopyOp (#151206)
This patch lowers `memref.copy` to `emitc.call_opaque "memcpy"`.
From:
```
func.func @copying(%arg0 : memref<9x4x5x7xf32>, %arg1 : memref<9x4x5x7xf32>) {
  memref.copy %arg0, %arg1 : memref<9x4x5x7xf32> to memref<9x4x5x7xf32>
  return
}
```
To:
```cpp
#include <cstring>
void copying(float v1[9][4][5][7], float v2[9][4][5][7]) {
  size_t v3 = 0;
  float* v4 = &v2[v3][v3][v3][v3];
  float* v5 = &v1[v3][v3][v3][v3];
  size_t v6 = sizeof(float);
  size_t v7 = 1260;
  size_t v8 = v6 * v7;
  memcpy(v5, v4, v8);
  return;
}
```
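
The constant `1260` in the generated code is the element count of `memref<9x4x5x7xf32>`. A small Python check of that arithmetic follows; `copy_size_bytes` is a hypothetical helper, and the 4-byte `float` is an assumption about the target.

```python
from math import prod


def copy_size_bytes(shape, element_bytes):
    """Bytes copied for a statically shaped, contiguous buffer."""
    return prod(shape) * element_bytes


element_count = prod((9, 4, 5, 7))
total_bytes = copy_size_bytes((9, 4, 5, 7), 4)  # assumes sizeof(float) == 4
```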
2025-08-14 05:25:55 -07:00
lonely eagle
6d08a39eeb
[mlir][nvgpu] Add tma last dim bytes check (#153451)
Add a check that the number of bytes in the last dimension of a TMA
tensor is a multiple of 16.
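
A sketch of the constraint in Python, assuming whole-byte element types. `last_dim_bytes_ok` is a hypothetical name, not the nvgpu verifier.

```python
def last_dim_bytes_ok(shape, element_bytes):
    """True if the innermost dimension spans a multiple of 16 bytes."""
    return (shape[-1] * element_bytes) % 16 == 0
```

For example, a last dimension of 64 f16 elements spans 128 bytes and passes, while 4 f16 elements span only 8 bytes and would be rejected.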
2025-08-14 20:14:20 +08:00
Igor Wodiany
87de48d11f
[mlir][spirv] Add spirv validation for module.mlir target test (#153227)
This patch serves as an example of using the new `mlir-translate`
flag. Eventually all tests will be updated to validate SPIR-V modules.
2025-08-14 12:45:55 +01:00
Andrzej Warzyński
8d4f3171fa
[mlir][linalg] Fix UnPackOp::getTiledOuterDims (#152960)
Fixes `getTiledOuterDims` by making sure that the `outer_dims_perm`
attribute from `linalg.unpack` is taken into account.

Fixes #152037
2025-08-14 11:39:50 +01:00
Ege Beysel
8de85e753f
[mlir][linalg] Add support for scalable vectorization of linalg.batch_mmt4d (#152984)
This PR builds upon the previous #146531 and enables scalable
vectorization for `batch_mmt4d` as well.

---------

Signed-off-by: Ege Beysel <beyselege@gmail.com>
2025-08-14 11:47:51 +02:00
Jordan Rupprecht
1d55b70ec3
[MLIR][GPU][XeVM] Add missing #include for standalone header build (#153532)
This header uses GPUModuleOp but does not directly include the header:
`error: no type named 'GPUModuleOp' in namespace 'mlir::gpu'; did you
mean 'ModuleOp'?`

Needed for #148286
2025-08-14 04:13:41 +00:00
Sayan Saha
8432f24831
[mlir][tosa] Don't fold mul with zero lhs/rhs if resulting type is dynamic (#153420)
Canonicalizing the following IR:

```
func.func @mul_zero_dynamic_nofold(%arg0: tensor<?x17xf32>) -> tensor<?x17xf32> {
  %0 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<1x1xf32>}> : () -> tensor<1x1xf32>
  %1 = "tosa.const"() <{values = dense<0> : tensor<1xi8>}> : () -> tensor<1xi8>
  %2 = tosa.mul %arg0, %0, %1 : (tensor<?x17xf32>, tensor<1x1xf32>, tensor<1xi8>) -> tensor<?x17xf32>
  return %2 : tensor<?x17xf32>
}
```

resulted in a crash

```
 #0 0x000056513187e8db backtrace (./build-release/bin/mlir-opt+0x9d698db)
 #1 0x0000565131b17737 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:838:8
 #2 0x0000565131b187f3 PrintStackTraceSignalHandler(void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:918:1
 #3 0x0000565131b18c30 llvm::sys::RunSignalHandlers() /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Signals.cpp:105:18
 #4 0x0000565131b18c30 SignalHandler(int, siginfo_t*, void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:409:3
 #5 0x00007f2e4165b050 (/lib/x86_64-linux-gnu/libc.so.6+0x3c050)
 #6 0x00007f2e416a9eec __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #7 0x00007f2e4165afb2 raise ./signal/../sysdeps/posix/raise.c:27:6
 #8 0x00007f2e41645472 abort ./stdlib/abort.c:81:7
 #9 0x00007f2e41645395 _nl_load_domain ./intl/loadmsgcat.c:1177:9
#10 0x00007f2e41653ec2 (/lib/x86_64-linux-gnu/libc.so.6+0x34ec2)
#11 0x00005651443ec4ba mlir::DenseIntOrFPElementsAttr::getRaw(mlir::ShapedType, llvm::ArrayRef<char>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:1361:3
#12 0x00005651443f1209 mlir::DenseElementsAttr::resizeSplat(mlir::ShapedType) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:0:10
#13 0x000056513f76f2b6 mlir::tosa::MulOp::fold(mlir::tosa::MulOpGenericAdaptor<llvm::ArrayRef<mlir::Attribute>>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp:0:0
```

from the folder for `tosa::mul`, since the zero value was being reshaped
to `?x17` size, which isn't supported. AFAIK, `tosa.const` requires all
dimensions to be static. So in this case, the fix is not to fold the
op.
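
The guard can be sketched in Python as follows. This is a hypothetical model of the decision only, not `MulOp::fold`: a dynamic dimension is represented as `None`, and the folder declines whenever the result shape is not fully static.

```python
DYNAMIC = None  # stand-in for a `?` dimension in this sketch


def fold_mul_by_zero(lhs_is_zero, rhs_is_zero, result_shape):
    """Return a folded zero constant description, or None to decline folding."""
    if not (lhs_is_zero or rhs_is_zero):
        return None
    if any(dim is DYNAMIC for dim in result_shape):
        # A constant cannot be materialized with a dynamic shape: do not fold.
        return None
    return ("zeros", tuple(result_shape))
```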
2025-08-13 19:45:06 -04:00