23532 Commits

Author SHA1 Message Date
Nikita Popov
92c55a315e
[IR] Only allow lifetime.start/end on allocas (#149310)
lifetime.start and lifetime.end are primarily intended for use on
allocas, to enable stack coloring and other liveness optimizations. This
is necessary because all (static) allocas are hoisted into the entry
block, so lifetime markers are the only way to convey the actual
lifetimes.

However, lifetime.start and lifetime.end are currently *allowed* to be
used on non-alloca pointers. We don't actually do this in practice, but
just the mere fact that this is possible breaks the core purpose of the
lifetime markers, which is stack coloring of allocas. Stack coloring can
only work correctly if all lifetime markers for an alloca are
analyzable.

* If a lifetime marker may operate on multiple allocas via a select/phi,
we don't know which lifetime actually starts/ends and handle it
incorrectly (https://github.com/llvm/llvm-project/issues/104776).
* Stack coloring operates on the assumption that all lifetime markers
are visible, and not, for example, hidden behind a function call or
escaped pointer. It's not possible to change this, as part of the
purpose of lifetime markers is that they work even in the presence of
escaped pointers, where simple use analysis is insufficient.

I don't think there is any way to have coherent semantics for lifetime
markers on allocas, while also permitting them on arbitrary pointer
values.

This PR restricts lifetimes to operate on allocas only. As a followup, I
will also drop the size argument, which is superfluous if we always
operate on an alloca. (This change also renders dead various code that
handles lifetime markers on non-allocas. I plan to clean up that kind of
code after dropping the size argument as well.)

In practice, I've only found a few places that currently produce
lifetimes on non-allocas:

* CoroEarly replaces the promise alloca with the result of an intrinsic,
which will later be replaced back with an alloca. I think this is the
only place where there is some legitimate loss of functionality, but I
don't think this is particularly important (I don't think we'd expect
the promise in a coroutine to admit useful lifetime optimization.)
* SafeStack moves unsafe allocas onto a separate frame. We can safely
drop lifetimes here, as SafeStack performs its own stack coloring.
* Similarly for AddressSanitizer, which also moves allocas into separate
memory.
* LSR sometimes replaces the lifetime argument with a GEP chain of the
alloca (where the offsets ultimately cancel out). This is just
unnecessary. (Fixed separately in
https://github.com/llvm/llvm-project/pull/149492.)
* InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast
of an alloca. I don't think this is necessary.
2025-07-21 15:04:50 +02:00
James Newling
6edc1faf3b
[mlir][llvm dialect] Verify element type of nested types (#148975)
Before this PR, this was valid
```
 %0 = llvm.mlir.constant(dense<[1, 2]> : vector<2xi32>) : vector<2xf32>
```

but this was not:

```
%0 = llvm.mlir.constant(1 : i32) : f32
```

because only scalar types were checked for compatibility, not the element types of nested types. An additional check that this PR adds is verification of the float semantics. Before this PR,

```
%cst = llvm.mlir.constant(1.0 : bf16) : f16
```

was considered valid (because bf16 and f16 both have 16 bits), but with this PR it is not considered valid. This PR also moves all tests on the verifier of the llvm constant op into a single file. To summarize the state after this PR:

Invalid:
```mlir
%0 = llvm.mlir.constant(dense<[128, 1024]> : vector<2xi32>) :
vector<2xf32>
%0 = llvm.mlir.constant(dense<[128., 1024.]> : vector<2xbf16>) :
vector<2xf16>
```
Valid:
```mlir
%0 = llvm.mlir.constant(dense<[128., 1024.]> : vector<2xf32>) :
vector<2xi32>
%0 = llvm.mlir.constant(dense<[128, 1024]> : vector<2xi64>) :
vector<2xi8>
```
and identical valid/invalid cases for the scalar cases.
2025-07-21 05:40:02 -07:00
Uday Bondhugula
34526eddb3
[MLIR][Affine] Clean up outer logic of affine loop tiling pass (#149750)
Clean up outer logic of affine loop tiling pass. A wrongly named
temporary method was exposed publicly; fix that. Remove unconditional
emission of remarks.
2025-07-21 14:30:04 +05:30
Luke Hutton
41274582fd
[mlir][tosa] Fix check for isolated regions in tosa.cond_if (#143772)
This commit fixes a check in the validation pass which intended to
validate whether a `tosa.cond_if` operation was conformant to the
specification. The specification requires that all values used in the
then/else regions be explicitly declared within the regions. This
change checks that these regions are 'isolated from above', to ensure
this requirement is met.
2025-07-21 09:42:45 +01:00
Andrzej Warzyński
03bd0f36ba
[mlir][vector] Remove MatrixMultiplyOp and FlatTransposeOp from Vector dialect (#144307)
This patch deletes `vector.matrix_multiply` and `vector.flat_transpose`,
which are thin wrappers around the corresponding LLVM intrinsics:
  - `llvm.intr.matrix.multiply`
  - `llvm.intr.matrix.transpose`

These Vector dialect ops did not provide additional semantics or
abstraction beyond the LLVM intrinsics. Their removal simplifies the
lowering pipeline without losing any functionality.

The lowering chains:
- `vector.contract` → `vector.matrix_multiply` →
`llvm.intr.matrix.multiply`
- `vector.transpose` → `vector.flat_transpose` →
`llvm.intr.matrix.transpose`

are now replaced with:
  - `vector.contract` → `llvm.intr.matrix.multiply`
  - `vector.transpose` → `llvm.intr.matrix.transpose`

This was accomplished by directly replacing:
  - `vector::MatrixMultiplyOp` with `LLVM::MatrixMultiplyOp`
  - `vector::FlatTransposeOp` with `LLVM::MatrixTransposeOp`

Note: To avoid a build-time dependency from `Vector` to `LLVM`,
relevant transformations are moved from "Vector/Transforms" to
`Conversion/VectorToLLVM`.
2025-07-21 08:19:30 +01:00
Longsheng Mou
22ef58ceda
[mlir][linalg] Add missing check for isaCopyOpInterface (#149313)
This PR fixes a missing validation in `isaCopyOpInterface` by checking
that the `linalg.yield` operand is identical to the first block
argument, indicating a direct copy. Fixes #130002.
2025-07-21 09:37:54 +08:00
donald chen
04b17bd470
[mlir][scf] fix getSuccessorRegions func in scf.forall (#147491)
In accordance with the semantics of forall, its body is executed in
parallel by multiple threads. We should not expect to branch back into
the forall body after the region's execution is complete.
2025-07-21 09:27:37 +08:00
Maya Amrami
e138c95155
[mlir] ViewLikeInterface - verify ranks in verifyOffsetSizeAndStrideOp (#147926)
getMixedOffsets() calls getMixedValues() with `static_offsets` and
`offsets`. It is assumed that the number of dynamic offsets in
`static_offsets` equals the rank of `offsets`. Otherwise, we fail on
an assert when trying to access an array out of its bounds.
The same applies to getMixedStrides() and getMixedSizes().

A verification of this assumption is added to
verifyOffsetSizeAndStrideOp() and a clear assert is added in
getMixedValues().
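
The invariant being verified can be sketched in Python. This is an illustrative model only, not the MLIR implementation: `K_DYNAMIC`, `verify_ranks`, and `get_mixed_values` are hypothetical stand-ins, and dynamic values are represented as plain strings.

```python
# Hypothetical sentinel marking a dynamic entry in the static array
# (an assumption standing in for MLIR's ShapedType::kDynamic).
K_DYNAMIC = -(2**63)


def verify_ranks(static_values, dynamic_values):
    """Return True if each dynamic placeholder has exactly one dynamic value."""
    num_dynamic = sum(1 for v in static_values if v == K_DYNAMIC)
    return num_dynamic == len(dynamic_values)


def get_mixed_values(static_values, dynamic_values):
    """Merge static entries with dynamic values, preserving order."""
    if not verify_ranks(static_values, dynamic_values):
        # Fail with a clear message instead of reading past the end of a list.
        raise ValueError("number of dynamic entries does not match operand count")
    dynamic_iter = iter(dynamic_values)
    return [next(dynamic_iter) if v == K_DYNAMIC else v for v in static_values]
```

With this check in place, a mismatch such as two dynamic placeholders but only one dynamic operand is reported up front rather than surfacing as an out-of-bounds access.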
2025-07-20 14:20:16 +03:00
Maksim Levental
6056f942ab
[mlir][NFC] update LLVM create APIs (2/n) (#149667)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-19 18:15:54 -04:00
eaeltsin
6eef978e1e
Include <vector> in TemplatingUtils.h (#149671)
This is needed after 3ee0f97b950a550ef14e3adbdf45f507273f2190
2025-07-19 22:08:41 +02:00
Maksim Levental
906295b8a3
[mlir] update affine+arith create APIs (1/n) (#149656)
This PR updates create APIs for arith and affine - specifically these
are the only in-tree dialects/ops with "custom" builders:

```
AffineDmaStartOp
AffineDmaWaitOp
ConstantIntOp
ConstantFloatOp
ConstantIndexOp
```

See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-19 12:37:39 -05:00
Jordan Rupprecht
e1ac57c1a5
[mlir][test] Add missing REQUIRES: asserts for --debug-only flag (#149634)
Debug flags are not provided in fully optimized builds.

Test added in #149378 / #146228
2025-07-18 22:21:09 -05:00
Colin De Vlieghere
fef4238288
[MLIR][SCF] Add dedicated Python bindings for ForallOp (#149416)
This patch specializes the Python bindings for ForallOp and
InParallelOp, similar to the existing one for ForOp. These bindings
create the regions and blocks properly and expose some additional
helpers.
2025-07-18 19:53:11 -04:00
lonely eagle
09bea21d95
[mlir][memref] Simplify memref.copy canonicalization (#149506)
FoldCopyOfCast has both an OpRewritePattern implementation and a folder
implementation. This PR removes the OpRewritePattern implementation.
2025-07-19 07:11:10 +08:00
Kazu Hirata
cb6370167f
[mlir] Deprecate OpPrintingFlags(std::nullopt_t) (NFC) (#149546)
This patch deprecates OpPrintingFlags(std::nullopt_t) to avoid use of
std::nullopt outside the context of std::optional.
2025-07-18 13:33:05 -07:00
Kazu Hirata
c98b05bd56
[mlir] Deprecate NamedAttrList(std::nullopt_t) (NFC) (#149544)
This patch deprecates NamedAttrList(std::nullopt_t) to avoid use of
std::nullopt outside the context of std::optional.
2025-07-18 13:32:56 -07:00
Hanumanth
b846d8c3e2
[mlir][tosa] Fix tosa-reduce-transposes to handle large constants better (#148755)
This change addresses the performance issue in the **--tosa-reduce-transposes** implementation by working directly with the
raw tensor data, eliminating the need to create costly intermediate attributes that become a bottleneck.
2025-07-18 16:12:57 -04:00
Han-Chung Wang
3ea6da59ec
[mlir][linalg] Allow pack consumer fusion if the tile size is greater than dimension size. (#149438)
This happens only when you use a larger tile size, which is greater than
or equal to the dimension size. In this case, it is a full slice, so it
is fusible.

The IR can be generated during the TileAndFuse process. It is hard to
fix in such a driver, so we enable the naive fusion for this case.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-18 10:42:42 -07:00
Jaden Angella
7fd91bb6e8
[mlir][EmitC]Expand the MemRefToEmitC pass - Adding scalars (#148055)
This aims to expand the MemRefToEmitC pass so that it can accept
global scalars.
From:
```
memref.global "private" constant @__constant_xi32 : memref<i32> = dense<-1>
func.func @globals() {
    memref.get_global @__constant_xi32 : memref<i32>
}
```
To:
```
emitc.global static const @__constant_xi32 : i32 = -1
    emitc.func @globals() {
      %0 = get_global @__constant_xi32 : !emitc.lvalue<i32>
      %1 = apply "&"(%0) : (!emitc.lvalue<i32>) -> !emitc.ptr<i32>
      return
    }
```
2025-07-18 10:15:05 -07:00
Mohammadreza Ameri Mahabadian
10518c76de
[mlir][spirv] Add conversion pass to rewrite splat constant composite… (#148910)
…s to replicated form

This adds a new SPIR-V dialect-level conversion pass
`ConversionToReplicatedConstantCompositePass`. This pass looks for splat
composite `spirv.Constant` or `spirv.SpecConstantComposite` and rewrites
them into `spirv.EXT.ConstantCompositeReplicate` or
`spirv.EXT.SpecConstantCompositeReplicate`, respectively.

---------

Signed-off-by: Mohammadreza Ameri Mahabadian <mohammadreza.amerimahabadian@arm.com>
2025-07-18 12:59:39 -04:00
Han-Chung Wang
7d040d4675
[mlir][linalg] Handle outer_dims_perm in linalg.pack consumer fusion. (#149426)
Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-18 09:42:40 -07:00
Adam Siemieniuk
e73cb43b44
[mlir][xegpu] Remove unused custom pass declaration (#149278)
Removes unused declaration for pass creation.
Only the create function auto-generated from tablegen should be used.
2025-07-18 17:16:46 +02:00
lonely eagle
4c70195634
[mlir][transform] Fix ch2 and additional documentation (#148407)
Fixed erroneous code in the example. In addition, the documentation has been improved by adding links to the code repository.
2025-07-18 20:28:43 +08:00
Longsheng Mou
baa291bfb5
[mlir][mesh] Add null check for dyn_cast to prevent crash (#149266)
This PR adds a null check for the dyn_cast result before use to prevent
a crash, and uses `isa` instead of `dyn_cast` to make the code cleaner. Fixes
#148619.
2025-07-18 09:28:29 +08:00
Han-Chung Wang
6ff471883f
[mlir][linalg] Improve linalg.pack consumer fusion. (#148993)
If a dimension is not tiled, it is always valid to fuse the pack op,
even if it has padding semantics, because it always generates a full
slice along the dimension.

If a dimension is tiled and it does not need extra padding, the fusion
is valid.

The revision also formats corresponding tests for consistency.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-17 16:06:06 -07:00
Charitha Saumya
fc3781853b
[mlir][xegpu] Minor fixes in XeGPU subgroup distribution. (#147846)
This PR addresses the following issues.

1. Add the missing attributes when creating a new GPU funcOp in
`MoveFuncBodyToWarpExecuteOnLane0` pattern.
2. Bug fix in LoadNd distribution to make sure LoadOp is the last op in
warpOp region before it is distributed (needed for preserving the memory
op ordering during distribution).
3. Add utility for removing OpOperand or OpResult layout attributes.
2025-07-17 15:13:20 -07:00
Ivan Butygin
6b29ee9d9a
[mlir][amdgpu] Properly handle mismatching memref ranks in amdgpu.gather_to_lds (#149407)
This op doesn't have any rank or indices restrictions on src/dst
memrefs, but was using `SameVariadicOperandSize`, which was causing
issues. Also fix some other issues while we are at it.
2025-07-18 00:42:25 +03:00
Jian Cai
7e220630d2
[mlir][docs] Rename OpTrait to Trait in ODS doc (#148276)
This makes the doc consistent with the code base.
2025-07-17 14:13:28 -07:00
Jianhui Li
aea2d53961
[MLIR][XeGPU] make offsets optional for create_nd_tdesc (#148335)
2025-07-17 15:33:39 -05:00
Jeremy Kun
a8880265e1
[mlir] Fix CI breakage from https://github.com/llvm/llvm-project/pull/146228 (#149378)
Some platforms print `{anonymous}` instead of the other two forms
accepted by the test regex. This PR just removes the attempt to guess
how the anonymous namespace will be printed.

@Kewen12 is there a way to trigger the particular CIs that failed in
https://github.com/llvm/llvm-project/pull/146228 on this PR?

Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
2025-07-17 21:52:37 +02:00
Andrzej Warzyński
3b11aaaf94
[mlir][linalg] Add support for scalable vectorization of linalg.mmt4d (#146531)
This patch adds support for scalable vectorization of linalg.mmt4d. The
key design change is the introduction of a new vectorizer state variable:

* `assumeDynamicDimsMatchVecSizes`

...along with the corresponding Transform dialect attribute:

* `assume_dynamic_dims_match_vec_sizes`.

This flag instructs the vectorizer to assume that dynamic memref/tensor
dimensions match the corresponding vector sizes (fixed or scalable). With this
assumption, masking becomes unnecessary, which simplifies the lowering pipeline
significantly.

While this assumption is not universally valid, it typically holds for
`linalg.mmt4d`. Inputs and outputs are explicitly packed using `linalg.pack`,
and this packing includes padding, ensuring that dimension sizes align with
vector sizes (*).

* Related discussion: https://github.com/llvm/llvm-project/issues/143920

An upcoming patch will include an end-to-end test that leverages scalable
vectorization of linalg.mmt4d to demonstrate the newly enabled functionality.
This would not be feasible without the changes introduced here, as it would
otherwise require additional logic to handle complex - but ultimately redundant
- masks.

(*) This holds provided that the tile sizes used for packing match the vector
sizes used during vectorization. It is the user’s responsibility to enforce
this.
2025-07-17 19:02:08 +01:00
tyb0807
aa3978573e
[mlir][vector][memref] Add alignment attribute to memory access ops (#144344)
Alignment information is important to allow LLVM backends such as AMDGPU
to select wide memory accesses (e.g., dwordx4 or b128). Since this info
is not always inferable, it's better to inform LLVM backends explicitly
about it. Furthermore, alignment is not necessarily a property of the
element type, but of each individual memory access op (we can have
overaligned and underaligned accesses compared to the natural/preferred
alignment of the element type).

This patch introduces an `alignment` attribute to memref/vector.load/store
ops.

Follow-up PRs will

1. Propagate the attribute to LLVM/SPIR-V.

2. Introduce `alignment` attribute to other vector memory access ops:
    vector.gather + vector.scatter
    vector.transfer_read + vector.transfer_write
    vector.compressstore + vector.expandload
    vector.maskedload + vector.maskedstore

3. Replace `--convert-vector-to-llvm='use-vector-alignment=1'` with a
   simple pass to populate alignment attributes based on the vector
   types.
2025-07-17 13:38:21 -04:00
Robert Konicar
46c059f925
[mlir][LLVMIR] Add IFuncOp to LLVM dialect (#147697)
Add IFunc to LLVM dialect and add support for lifting/exporting LLVMIR
IFunc.
2025-07-17 19:20:31 +02:00
Akshay Khadse
e4a3541ff8
[MLIR][Python] Support eliding large resource strings in PassManager (#149187)
- Introduces a `large_resource_limit` parameter across Python bindings,
enabling the eliding of resource strings exceeding a specified character
limit during IR printing.
- To maintain backward compatibility, when using the `operation.print()` API,
if `large_resource_limit` is None and `large_elements_limit` is set,
the latter will be used to elide the resource string as well. This change
was introduced by https://github.com/llvm/llvm-project/pull/125738.
- For printing using pass manager, the `large_resource_limit` and
`large_elements_limit` are completely independent of each other.
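
The fallback described in the list above can be modelled with a small Python sketch. Only the parameter names `large_resource_limit` and `large_elements_limit` come from the commit message; the function `effective_resource_limit` and the `via_pass_manager` flag are hypothetical and exist purely to illustrate the two behaviours.

```python
def effective_resource_limit(large_elements_limit=None,
                             large_resource_limit=None,
                             via_pass_manager=False):
    """Return the character limit used to elide resource strings, or None."""
    if via_pass_manager:
        # Pass-manager printing: the two limits are fully independent.
        return large_resource_limit
    if large_resource_limit is None:
        # operation.print(): fall back to the elements limit when it is set.
        return large_elements_limit
    return large_resource_limit
```

For instance, under this model `effective_resource_limit(large_elements_limit=10)` yields 10 for `operation.print()`, while the same arguments with `via_pass_manager=True` yield `None`, meaning resource strings are not elided.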
2025-07-17 12:57:04 -04:00
delaram-talaashrafi
0dae924c1f
[openacc][flang] Support two type bindName representation in acc routine (#149147)
Based on the OpenACC specification — which states that if the bind name
is given as an identifier it should be resolved according to the
compiled language, and if given as a string it should be used unmodified
— we introduce two distinct `bindName` representations for `acc routine`
to handle each case appropriately: one as an array of `SymbolRefAttr`
for identifiers and another as an array of `StringAttr` for strings.

To ensure correct correspondence between bind names and devices, this
patch also introduces two separate sets of device attributes. The
routine operation is extended accordingly, along with the necessary
updates to the OpenACC dialect and its lowering.
2025-07-17 09:38:02 -07:00
Jeremy Kun
7caf12da0b
[mlir][core] Add an MLIR "pattern catalog" generator (#146228)
This PR adds a feature that attaches a listener to all RewritePatterns that
logs information about the modified operations.

When the MLIR test suite is run, these debug outputs can
be filtered and combined into an index linking operations to the
patterns that insert, modify, or replace them. This index is intended to
be used to create a website that allows one to look up patterns from an
operation name.

The debug logs emitted can be viewed with --debug-only=generate-pattern-catalog, 
and the lit config is modified to do this when the env var MLIR_GENERATE_PATTERN_CATALOG is set.

Example usage:

```
mkdir build && cd build
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS="mlir" \
  -DLLVM_TARGETS_TO_BUILD="host" \
  -DCMAKE_BUILD_TYPE=DEBUG
ninja -j 24 check-mlir
MLIR_GENERATE_PATTERN_CATALOG=1 bin/llvm-lit -j 24 -v -a tools/mlir/test | grep 'pattern-logging-listener' | sed 's/^# | \[pattern-logging-listener\] //g' | sort | uniq > pattern_catalog.txt
```

Sample pattern catalog output (that fits in a gist):
https://gist.github.com/j2kun/02d1ab8d31c10d71027724984c89905a

---------

Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-07-17 09:09:12 -07:00
Jeremy Kun
7817163663
[mlir] [presburger] Add IntegerRelation::rangeProduct (#148092)
This is intended to match `isl::map`'s `flat_range_product`.
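
As a rough illustration of what a flat range product computes, here is a Python sketch over small finite relations. It assumes the usual meaning of the operation (pair up tuples that share a domain point and concatenate their ranges); it models relations as explicit sets of pairs and is not how `IntegerRelation` represents them. The function name `range_product` is hypothetical.

```python
def range_product(rel_a, rel_b):
    """Flat range product of two finite relations.

    Each relation is a set of (domain_tuple, range_tuple) pairs. The result
    relates a domain point d to the concatenation ra + rb whenever
    (d, ra) is in rel_a and (d, rb) is in rel_b.
    """
    return {
        (da, ra + rb)
        for (da, ra) in rel_a
        for (db, rb) in rel_b
        if da == db
    }


a = {((0,), (1,)), ((1,), (2,))}
b = {((0,), (10, 11)), ((2,), (5, 5))}
# Only the domain point (0,) appears in both relations.
product = range_product(a, b)
```

Here `product` contains the single pair `((0,), (1, 10, 11))`: the domain is shared and the range dimensions are laid out side by side.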

---------

Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
2025-07-17 08:01:58 -07:00
Andrzej Warzyński
bce951c572
[mlir][linalg] Update vectorization logic for linalg.unpack (#149156)
This PR makes sure that we don't generate unnecessary `tensor.empty`
when vectorizing `linalg.unpack`.

To better visualize the changes implemented here, consider this IR:
```mlir
func.func @example(
  %source: tensor<8x4x16x16xf32>,
  %dest: tensor<64x127xf32>) -> tensor<64x127xf32> {

    %res = linalg.unpack %source
      outer_dims_perm = [1, 0]
      inner_dims_pos = [0, 1]
      inner_tiles = [16, 16]
    into %dest : tensor<8x4x16x16xf32> -> tensor<64x127xf32>

    return %res : tensor<64x127xf32>
 }
```

Below is the output after vectorization, BEFORE and AFTER this PR.

BEFORE (note `tensor.empty` and the fact that `%arg1` is not used):
```mlir
  func.func @example(%arg0: tensor<8x4x16x16xf32>, %arg1: tensor<64x127xf32>) -> tensor<64x127xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %c0 = arith.constant 0 : index
    %0 = vector.transfer_read %arg0[%c0, %c0, %c0, %c0], %cst {in_bounds = [true, true, true, true]} : tensor<8x4x16x16xf32>, vector<8x4x16x16xf32>
    %1 = vector.transpose %0, [1, 2, 0, 3] : vector<8x4x16x16xf32> to vector<4x16x8x16xf32>
    %2 = vector.shape_cast %1 : vector<4x16x8x16xf32> to vector<64x128xf32>
    %3 = tensor.empty() : tensor<64x127xf32>
    %c0_0 = arith.constant 0 : index
    %4 = vector.transfer_write %2, %3[%c0_0, %c0_0] {in_bounds = [true, false]} : vector<64x128xf32>, tensor<64x127xf32>
    return %4 : tensor<64x127xf32>
  }
```

AFTER (note that `%arg1` is correctly used):
```mlir
  func.func @example(%arg0: tensor<8x4x16x16xf32>, %arg1: tensor<64x127xf32>) -> tensor<64x127xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %c0 = arith.constant 0 : index
    %0 = vector.transfer_read %arg0[%c0, %c0, %c0, %c0], %cst {in_bounds = [true, true, true, true]} : tensor<8x4x16x16xf32>, vector<8x4x16x16xf32>
    %1 = vector.transpose %0, [1, 2, 0, 3] : vector<8x4x16x16xf32> to vector<4x16x8x16xf32>
    %2 = vector.shape_cast %1 : vector<4x16x8x16xf32> to vector<64x128xf32>
    %c0_0 = arith.constant 0 : index
    %3 = vector.transfer_write %2, %arg1[%c0_0, %c0_0] {in_bounds = [true, false]} : vector<64x128xf32>, tensor<64x127xf32>
    return %3 : tensor<64x127xf32>
  }
```
2025-07-17 09:14:17 +01:00
Martin Erhart
616e4c43dd
[mlir] Add Python bindings to enable default passmanager timing (#149087)
2025-07-16 15:45:15 +01:00
Akash Banerjee
fc114e4d93
[MLIR] Add ComplexTOROCDLLibraryCalls pass (#144926)
2025-07-16 13:59:41 +01:00
Chaitanya Koparkar
4cc9af219f
[mlir][bufferization] Fix a typo in to_tensor op's summary field (#149082)
Fixes #149081
2025-07-16 14:37:12 +02:00
Artemiy Bulavin
38be53aa04
[MLIR] Fix use-after-frees when accessing DistinctAttr storage (#148666)
This PR fixes a use-after-free error that happens when `DistinctAttr`
instances are created within a `PassManager` running with crash recovery
enabled. The root cause is that `DistinctAttr` storage is allocated in a
thread_local allocator, which is destroyed when the crash recovery
thread joins, invalidating the storage.

Moreover, even without crash reproduction, disabling multithreading on
the context will destroy the context's thread pool and, in turn, delete
the thread-local storage. This means a call to
`ctx->disableMultithreading()` breaks the IR.

This PR replaces the thread local allocator with a synchronised
allocator that's shared between threads. This persists the lifetime of
allocated DistinctAttr storage instances to the lifetime of the context.

### Problem Details:

The `DistinctAttributeAllocator` uses a
`ThreadLocalCache<BumpPtrAllocator>` for lock-free allocation of
`DistinctAttr` storage in a multithreaded context. When a `PassManager`
is run with crash recovery (`runWithCrashRecovery`), the
pass pipeline is executed on a temporary thread spawned by
`llvm::CrashRecoveryContext`. Any `DistinctAttr`s created during this
execution have their storage allocated in the thread_local cache of this
temporary thread. When the thread joins, the thread_local storage is
destroyed, freeing the `DistinctAttr`s' memory. If this attribute is
accessed later, e.g. when printing, it results in a use-after-free.

As mentioned previously, this is also seen after creating some
`DistinctAttr`s and then calling `ctx->disableMultithreading()`.

### Solution

`DistinctAttributeAllocator` now uses a synchronised, shared allocator
instead of one wrapped in a `ThreadLocalCache`. The latter is what
stores the allocator in transient thread_local storage.
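
The difference between the two allocation strategies can be illustrated with a loose Python analogy. Python has no use-after-free, so the symptom shown here is only that data stored thread-locally by a temporary thread is unreachable afterwards, whereas data in a lock-protected shared allocator stays valid. `SharedAllocator` and `run_on_temporary_thread` are hypothetical names, not MLIR APIs.

```python
import threading


class SharedAllocator:
    """Lock-protected storage that lives as long as its owner (the 'context')."""

    def __init__(self):
        self._lock = threading.Lock()
        self._storage = []

    def allocate(self, value):
        # The lock makes allocation safe from any thread.
        with self._lock:
            self._storage.append(value)
            return len(self._storage) - 1  # handle to the stored value

    def get(self, handle):
        with self._lock:
            return self._storage[handle]


def run_on_temporary_thread(shared, per_thread):
    """Store a value both thread-locally and in the shared allocator."""
    handles = []

    def worker():
        per_thread.storage = ["distinct_attr"]            # tied to this thread
        handles.append(shared.allocate("distinct_attr"))  # tied to `shared`

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return handles[0]


shared = SharedAllocator()
per_thread = threading.local()
handle = run_on_temporary_thread(shared, per_thread)

# The worker's thread-local data is not reachable from the main thread...
thread_local_visible = hasattr(per_thread, "storage")
# ...but the value held by the shared allocator is still valid.
shared_value = shared.get(handle)
```

The trade-off is the one the PR accepts: allocation now takes a lock, in exchange for storage whose lifetime matches the owner rather than whichever thread happened to allocate it.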

### Testing:

A C++ unit test has been added to validate this fix. (I was previously
reproducing this failure with `mlir-opt` but I can no longer do so and I
am unsure why.)

-----

Note: This is a 2nd attempt at my previous PR
https://github.com/llvm/llvm-project/pull/128566 that was reverted in
https://github.com/llvm/llvm-project/pull/133000. I believe I've
addressed the TSAN and race condition concerns.
2025-07-16 12:11:38 +02:00
Luke Hutton
1c223829b8
[mlir][tosa] Fix transpose_conv2d verifier when output channels are dynamic (#147062)
This commit fixes a transpose_conv2d verifier check which compares the
output channels size to the bias size. The check didn't make sure output
channels were static before performing the comparison. This led to
failures such as:
```
'tosa.transpose_conv2d' op bias channels expected to be equal to output channels (-9223372036854775808) or 1, got 5
```
when the output channels size was dynamic.
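
A minimal Python sketch of the corrected check follows. It assumes the dynamic-size sentinel is the value printed in the error message above; `K_DYNAMIC` and `bias_channels_ok` are hypothetical names, and the real verifier may guard additional cases.

```python
# Sentinel for a dynamic dimension; this is the value shown in the error above.
K_DYNAMIC = -(2**63)


def bias_channels_ok(output_channels, bias_channels):
    """Return True if the bias size is compatible with the output channels."""
    if output_channels == K_DYNAMIC:
        # Nothing can be compared statically, so the check is skipped.
        return True
    return bias_channels == output_channels or bias_channels == 1
```

Without the early return, a dynamic output channel count would be compared against the bias size as if it were a real extent, which is exactly the spurious failure quoted above.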
2025-07-16 09:15:45 +01:00
Luke Hutton
d7f6660c34
[mlir][tosa] Remove profile compliance of cond_if and while_loop (#148212)
The requirement for a boolean condition is already checked for both
operators elsewhere. `cond_if` requires a boolean condition at
construction. `while_loop` cond_graph is checked in the verifier for a
scalar boolean output type.
2025-07-16 09:13:14 +01:00
Tomás Longeri
5d367080a8
[MLIR][Vector] Fix bug in ExtractStrideSlicesOp canonicalization (#147591)
The pattern would produce an invalid slice when some dimensions were
both sliced and broadcast.
2025-07-16 08:52:35 +01:00
Luke Hutton
5480fc6bb8
[mlir][tosa] Interpret boolean values correctly in cast folder (#147078)
Previously the cast folder would sign extend boolean values, causing
"true" to be cast to a value of -1 instead of 1. This change ensures
i1 values are zero extended, since i1 is used as a boolean value in
TOSA. According to the TOSA spec, the result of a boolean cast with
value "true" to another integer type should give a result of 1.
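
The difference between the two extensions is easy to see on a 1-bit value. The following Python sketch is illustrative only; `sign_extend` and `zero_extend` are hypothetical helpers, not the folder's code.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def zero_extend(value, bits):
    """Interpret the low `bits` bits of `value` as an unsigned integer."""
    return value & ((1 << bits) - 1)


true_bit = 1
old_result = sign_extend(true_bit, 1)  # the single bit is the sign bit
new_result = zero_extend(true_bit, 1)  # the single bit is the value
```

For an i1 "true", `old_result` is -1 and `new_result` is 1, which matches the behaviour before and after this change.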

Fixes https://github.com/llvm/llvm-project/issues/57951
2025-07-16 07:33:40 +01:00
Kazu Hirata
606e7f90b1
[mlir] Remove unused includes (NFC) (#148872)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-15 20:47:53 -07:00
James Newling
228c45f13d
Revert [mlir][vector] Use vector.broadcast in place of vector.splat (#148937)
This reverts PR/commit 99875733fc

This PR/commit should only be landed after
https://github.com/llvm/llvm-project/pull/148027, at which point we
don't need to assume that vector.broadcast has been lowered to another
form.
2025-07-15 20:45:01 -07:00
Uday Bondhugula
fa88c188de
[MLIR][Affine] Add default null init for mlir::affine::MemRefAccess (#147922)
Add default null init for `mlir::affine::MemRefAccess`. This is
consistent with various other MLIR structures and had been missing for
`mlir::affine::MemRefAccess`.
2025-07-16 08:54:46 +05:30
Valentin Clement (バレンタイン クレメン)
bec508ad17
[mlir][nvvm] Fix control reaches end of non-void function warning (#148965)
2025-07-15 14:49:26 -07:00