298 Commits

Author SHA1 Message Date
Scott Manley
e72335192d
[Arith][MemRef] add AtomicRMWKind::xori to enum (#151701)
Add missing xor AtomicRMWKind enum in arith. Also add support for xor to
memref.atomic_rmw so the change can be tested.

This does NOT add it for all users of the enum (e.g. Affine, Vector)
2025-08-11 08:46:06 -04:00
Frank Schlimbach
b2d4963ee9
[NFC][mlir][mesh,shard] Fixing misnomers in mesh dialect, renaming 'mesh' dialect to 'shard' (#150177)
Renames the 'mesh' dialect to 'shard' (discourse 87053):
  - dialect name mesh -> shard
  - (device) mesh -> (device) grid
  - spmdize -> partition

A lot of diffs, but simple renames only.

@tkarna @yaochengji
2025-07-25 16:53:08 +02:00
Longsheng Mou
f047b735e9
[mlir][NFC] Use getDefiningOp<OpTy>() instead of dyn_cast<OpTy>(getDefiningOp()) (#150428)
This PR uses `val.getDefiningOp<OpTy>()` to replace `dyn_cast<OpTy>(val.getDefiningOp())`, `dyn_cast_or_null<OpTy>(val.getDefiningOp())` and `dyn_cast_if_present<OpTy>(val.getDefiningOp())`.
2025-07-25 10:35:51 +08:00
Maksim Levental
967626b842
[mlir][NFC] update mlir/Dialect create APIs (14/n) (#149920)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-24 13:03:47 -05:00
Kazu Hirata
0925d7572a
[mlir] Remove unused includes (NFC) (#150266)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:53 -07:00
James Newling
6ed921f967
Reland "[mlir][vector] Use vector.broadcast in place of vector.splat" (#150138)
This reverts commit 228c45f13dc92546661b6825b7b32c3808b0d2eb (PR
#148937). Now that #148027 has landed, I think it is safe to "reland"
the original PR: #148028
2025-07-23 06:00:59 -07:00
Krzysztof Drewniak
eb554128ac
[mlir][Arith] Prevent IR modification for non-matching pattern (#150103)
The F4E2M1 truncation emulation was expanding or truncating operations
to F32 even when the pattern did not apply, causing non-convergent
rewrites when operating on doubles.

Also, fix a pair of whitespace issues that snuck in.
2025-07-22 15:57:31 -07:00
Andrzej Warzyński
03bd0f36ba
[mlir][vector] Remove MatrixMultiplyOp and FlatTransposeOp from Vector dialect (#144307)
This patch deletes `vector.matrix_multiply` and `vector.flat_transpose`,
which are thin wrappers around the corresponding LLVM intrinsics:
  - `llvm.intr.matrix.multiply`
  - `llvm.intr.matrix.transpose`

These Vector dialect ops did not provide additional semantics or
abstraction beyond the LLVM intrinsics. Their removal simplifies the
lowering pipeline without losing any functionality.

The lowering chains:
- `vector.contract` → `vector.matrix_multiply` →
`llvm.intr.matrix.multiply`
- `vector.transpose` → `vector.flat_transpose` →
`llvm.intr.matrix.transpose`

are now replaced with:
  - `vector.contract` → `llvm.intr.matrix.multiply`
  - `vector.transpose` → `llvm.intr.matrix.transpose`

This was accomplished by directly replacing:
  - `vector::MatrixMultiplyOp` with `LLVM::MatrixMultiplyOp`
  - `vector::FlatTransposeOp` with `LLVM::MatrixTransposeOp`

Note: To avoid a build-time dependency from `Vector` to `LLVM`,
relevant transformations are moved from "Vector/Transforms" to
`Conversion/VectorToLLVM`.
2025-07-21 08:19:30 +01:00
Maksim Levental
906295b8a3
[mlir] update affine+arith create APIs (1/n) (#149656)
This PR updates create APIs for arith and affine - specifically these
are the only in-tree dialects/ops with "custom" builders:

```
AffineDmaStartOp
AffineDmaWaitOp
ConstantIntOp
ConstantFloatOp
ConstantIndexOp
```

See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-19 12:37:39 -05:00
James Newling
228c45f13d
Revert [mlir][vector] Use vector.broadcast in place of vector.splat (#148937)
This reverts PR/commit 99875733fc

This PR/commit should only be landed after
https://github.com/llvm/llvm-project/pull/148027, at which point we
don't need to assume that vector.broadcast has been lowered to another
form.
2025-07-15 20:45:01 -07:00
Matthias Springer
cbdc18542c
[mlir][arith] Fix bug in arith.bitcast canonicalizer (#148795)
`bitcast(bitcast(x))` was incorrectly folded to `x`.
2025-07-15 10:14:02 +02:00
James Newling
99875733fc
[mlir][vector] Use vector.broadcast in place of vector.splat (#148028)
Part of deprecation of vector.splat

RFC:
https://discourse.llvm.org/t/rfc-mlir-vector-deprecate-then-remove-vector-splat/87143/4
More complete deprecation:
https://github.com/llvm/llvm-project/pull/147818
2025-07-14 15:12:21 -07:00
Kazu Hirata
cac806bcc5
[mlir] Remove unused includes (NFC) (#148535)
2025-07-13 13:13:01 -07:00
Andrei Golubev
a63f572628
[mlir][bufferization] Return BufferLikeType in BufferizableOpInterface (#144867)
Support custom types (2/N): allow value-owning operations (e.g.
allocation ops) to bufferize custom tensors into custom buffers. This
requires BufferizableOpInterface::getBufferType() to return
BufferLikeType instead of BaseMemRefType.

Affected implementors of the interface are updated accordingly.

Relates to ee070d08163ac09842d9bf0c1315f311df39faf1.
2025-07-02 11:27:35 -07:00
Skrai Pardus
5ed852f7f7
[mlir][arith] Add arith::ConstantIntOp constructor (#144638)
This PR adds a `build()` constructor for `ConstantIntOp` that takes in
an `APInt`.


Creating an `arith` constant value with an `APInt` currently requires a
structure like the following:
```c
b.create<arith::ConstantOp>(IntegerAttr::get(apintValue, 5));
```
In comparison, the `ConstantFloatOp` already has an `APFloat` constructor
which allows for the following:
```c
b.create<arith::ConstantFloatOp>(floatType, apfloatValue);
```
Thus, intuitively, it makes sense that a similar `ConstantIntOp`
constructor is made for `APInts` like so:
```c
b.create<arith::ConstantIntOp>(intType, apintValue);
```

Depends on https://github.com/llvm/llvm-project/pull/144636
2025-07-01 23:50:39 +02:00
Fabian Mora
878d3594ed
[mlir][vector] Avoid setting padding by default to 0 in vector.transfer_read prefer ub.poison (#146088)
Context:
`vector.transfer_read` always requires a padding value. Most of its
builders take no `padding` value and assume the safe value of `0`.
However, this should be a conscious choice by the API user, as it makes
it easy to introduce bugs.
For example, while making this patch I found several cases where the
padding value was not being propagated (`vector.transfer_read` was
transformed into another `vector.transfer_read`). These bugs were
always caused by constructors that don't require specifying
padding.

Additionally, using `ub.poison` as a possible default value is better,
as it indicates the user "doesn't care" about the actual padding value,
forcing users to specify the actual padding semantics they want.

With that in mind, this patch changes the builders in
`vector.transfer_read` to always take a `std::optional<Value> padding`
argument. The argument itself must always be supplied, but for convenience
users can pass `std::nullopt`, which pads the transfer read with `ub.poison`.

---------

Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
2025-06-30 15:20:42 -04:00
Skrai Pardus
a45fda6aeb
switch type and value ordering for arith Constant[XX]Op (#144636)
This change standardizes the order of the parameters for `Constant[XXX]Op`s
to match all other `Op` `build()` constructors.

In all instances of generated code for the MLIR dialects' Ops (that is,
the TableGen using the .td files to create the .h.inc/.cpp.inc files),
the desired result type is always specified before the value.

Examples: 
```
// ArithOps.h.inc
class ConstantOp : public ::mlir::Op<ConstantOp, ::mlir::OpTrait::ZeroRegions, ::mlir::OpTrait::OneResult, ::mlir::OpTrait::OneTypedResult<::mlir::Type>::Impl, ::mlir::OpTrait::ZeroSuccessors, ::mlir::OpTrait::ZeroOperands, ::mlir::OpTrait::OpInvariants, ::mlir::BytecodeOpInterface::Trait, ::mlir::OpTrait::ConstantLike, ::mlir::ConditionallySpeculatable::Trait, ::mlir::OpTrait::AlwaysSpeculatableImplTrait, ::mlir::MemoryEffectOpInterface::Trait, ::mlir::OpAsmOpInterface::Trait, ::mlir::InferIntRangeInterface::Trait, ::mlir::InferTypeOpInterface::Trait> {
public:
....
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type result, ::mlir::TypedAttr value);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypedAttr value);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::TypedAttr value);
  static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
...
```
```
// ArithOps.h.inc
class SubIOp : public ::mlir::Op<SubIOp, ::mlir::OpTrait::ZeroRegions, ::mlir::OpTrait::OneResult, ::mlir::OpTrait::OneTypedResult<::mlir::Type>::Impl, ::mlir::OpTrait::ZeroSuccessors, ::mlir::OpTrait::NOperands<2>::Impl, ::mlir::OpTrait::OpInvariants, ::mlir::BytecodeOpInterface::Trait, ::mlir::ConditionallySpeculatable::Trait, ::mlir::OpTrait::AlwaysSpeculatableImplTrait, ::mlir::MemoryEffectOpInterface::Trait, ::mlir::InferIntRangeInterface::Trait, ::mlir::arith::ArithIntegerOverflowFlagsInterface::Trait, ::mlir::OpTrait::SameOperandsAndResultType, ::mlir::VectorUnrollOpInterface::Trait, ::mlir::OpTrait::Elementwise, ::mlir::OpTrait::Scalarizable, ::mlir::OpTrait::Vectorizable, ::mlir::OpTrait::Tensorizable, ::mlir::InferTypeOpInterface::Trait> {
public:
...
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type result, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlagsAttr overflowFlags);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlagsAttr overflowFlags);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlagsAttr overflowFlags);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type result, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlags overflowFlags = ::mlir::arith::IntegerOverflowFlags::none);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlags overflowFlags = ::mlir::arith::IntegerOverflowFlags::none);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlags overflowFlags = ::mlir::arith::IntegerOverflowFlags::none);
  static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
...
```
In comparison, in the distinct case of `ConstantIntOp` and
`ConstantFloatOp`, the ordering of the result type and the value is
switched.

Thus, this PR corrects the ordering of the aforementioned
`Constant[XXX]Ops` to match with other constructors.
2025-06-23 23:35:50 +02:00
Muzammil
379a609dad
[mlir][arith][transforms] Adds f4E2M1FN support to truncf and extf (#144157)
See work detail: https://github.com/iree-org/iree/issues/20920

Add support for f4E2M1FN in `arith.truncf` and `arith.extf` ops through software emulation.

---------

Signed-off-by: Muzammiluddin Syed <muzasyed@amd.com>
2025-06-20 11:27:35 -05:00
Tobias Gysi
eb694b2846
[mlir][arith] Delete mul ext canonicalizations (#144844)
The Arith dialect includes patterns that canonicalize a sequence of:

- trunci(shrui(mul(sext(x), sext(y)), c)) -> mulsi_extended(x, y)
- trunci(shrui(mul(zext(x), zext(y)), c)) -> mului_extended(x, y)

These patterns return the high word of an extended multiplication, which
assumes that the shift amount is equal to the bit width of the original
operands. This check was missing, leading to incorrect canonicalizations
when the shift amount was less than the bit width.

For example, the following code:
```
  %x = arith.extui %a: i32 to i33
  %y = arith.extui %b: i32 to i33
  %m = arith.muli %x, %y: i33
  %c1 = arith.constant 1: i33
  %sh = arith.shrui %m, %c1 : i33
  %hi = arith.trunci %sh: i33 to i32
```
would incorrectly be canonicalized to:
```
_, %hi = arith.mului_extended %a, %b : i32
```
This commit removes the faulty canonicalizations since they are not
believed to be generally beneficial (c.f., the discussion of the
alternative https://github.com/llvm/llvm-project/pull/144787 which fixes
the canonicalizations).
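
The miscompile can be reproduced with plain integer arithmetic. The following is a minimal sketch, not code from the patch: `mulHigh` and `extMulShiftTrunc` are hypothetical helper names, and the i33 multiplication is modeled by masking a 64-bit product to 33 bits.

```cpp
#include <cstdint>

// High 32 bits of the full unsigned 64-bit product, i.e. the value the
// second result of arith.mului_extended holds for i32 operands.
uint32_t mulHigh(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// Models trunci(shrui(muli(extui(a), extui(b)), shift)) with an i33
// intermediate type: the product wraps at 33 bits before the shift.
uint32_t extMulShiftTrunc(uint32_t a, uint32_t b, unsigned shift) {
  const uint64_t mask33 = (uint64_t{1} << 33) - 1;
  uint64_t product = (static_cast<uint64_t>(a) * b) & mask33;
  return static_cast<uint32_t>(product >> shift);
}
```

With `a = 2`, `b = 3` and a shift of 1, the original sequence yields 3 while the high word of the extended product is 0, so replacing one with the other changes the result.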
2025-06-19 16:32:48 +02:00
Matthias Springer
e33f13ba48
[mlir][arith] Add overflow flags to arith.trunci (#144863)
LLVM has supported overflow flags on `llvm.trunc` for a while. This
commit adds support for these flags to `arith.trunci`.
2025-06-19 13:59:22 +02:00
Andrei Golubev
ee070d0816
[mlir][bufferization] Support custom types (1/N) (#142986)
Following the addition of TensorLike and BufferLike type interfaces (see
00eaff3e9c897c263a879416d0f151d7ca7eeaff), introduce minimal changes
required to bufferize a custom tensor operation into a custom buffer
operation.

To achieve this, new interface methods are added to TensorLike type
interface that abstract away the differences between existing (tensor ->
memref) and custom conversions.

The scope of the changes is intentionally limited (for example,
BufferizableOpInterface is untouched) in order to first understand the
basics and reach consensus design-wise.

---
Notable changes:
* mlir::bufferization::getBufferType() returns BufferLikeType (instead
of BaseMemRefType)
* ToTensorOp / ToBufferOp operate on TensorLikeType / BufferLikeType.
Operation argument "memref" renamed to "buffer"
* ToTensorOp's tensor type inferring builder is dropped (users now need
to provide the tensor type explicitly)
2025-06-18 16:18:12 +02:00
Umang Yadav
7f08503a3b
Introduce arith.scaling_extf and arith.scaling_truncf (#141965)
This PR adds the `arith.scaling_truncf` and `arith.scaling_extf` operations,
which support block quantization following the OCP MXFP specs listed
here
https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf

OCP MXFP Spec comes with reference implementation here
https://github.com/microsoft/microxcaling/tree/main

Interesting piece of reference code is this method `_quantize_mx`
7bc41952de/mx/mx_ops.py (L173).

Both `arith.scaling_truncf` and `arith.scaling_extf` are designed to be
an elementwise operation. Please see description about them in
`ArithOps.td` file for more details.
 
Internally, 

`arith.scaling_truncf` computes
`arith.truncf(arith.divf(input, 2^scale))`. `scale` should have the
necessary broadcast, clamping, normalization and NaN propagation done
before calling into `arith.scaling_truncf`.

`arith.scaling_extf` does the `arith.mulf(2^scale, input)` after taking
care of necessary data type conversions.
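
As a rough scalar model of the two formulas above (a sketch only, not the lowering in the PR): `double` stands in for the wide float type, `float` for the narrow one, and the scale is passed as a plain integer exponent. All three choices are assumptions made for illustration; the real ops take typed scale operands.

```cpp
#include <cmath>

// Scalar model of scaling_truncf: divide by 2^scale, then narrow.
float scalingTruncf(double input, int scale) {
  return static_cast<float>(input / std::ldexp(1.0, scale));
}

// Scalar model of scaling_extf: widen, then multiply by 2^scale.
double scalingExtf(float input, int scale) {
  return std::ldexp(1.0, scale) * static_cast<double>(input);
}
```

When the scaled value is exactly representable in the narrow type, the two helpers are inverses of each other for the same `scale`.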


CC: @krzysz00 @dhernandez0 @bjacob @pashu123 @MaheshRavishankar
@tgymnich

---------

Co-authored-by: Prashant Kumar <pk5561@gmail.com>
Co-authored-by: Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>
2025-06-09 13:13:31 -05:00
Michele Scuttari
63cb6af782
[MLIR] Add bufferization state to getBufferType and resolveConflicts interface methods (#141466)
The PR continues the work started in #141019 by adding the `BufferizationState` class also to the `getBufferType` and `resolveConflicts` interface methods, together with the additional support functions that are used throughout the bufferization infrastructure.
2025-05-28 10:35:23 +02:00
Michele Scuttari
61d5fdf50c
[MLIR] Add bufferization state class to OneShotBufferization pass (#141019)
Follow-up on #138143, which was reverted due to a missing update to a method signature (more specifically, the bufferization interface for `tensor::ConcatOp`) that was not caught before merging. The old PR description is reproduced below.

This PR is a follow-up on https://github.com/llvm/llvm-project/pull/138125, and adds a bufferization state class providing information about the IR. The information currently consists of a cached list of symbol tables, which aims to solve the quadratic scaling of the bufferization task with respect to the number of symbols. The PR breaks API compatibility: the bufferize method of the BufferizableOpInterface has been enriched with a reference to a BufferizationState object.

The bufferization state must be kept in a valid state by the interface implementations. For example, if an operation with the Symbol trait is inserted or replaced, its parent SymbolTable must be updated accordingly (see, for example, the bufferization of arith::ConstantOp, where the symbol table of the module gets the new global symbol inserted). Similarly, the invalidation of a symbol table must be performed if an operation with the SymbolTable trait is removed (this can be performed using the invalidateSymbolTable method, introduced in https://github.com/llvm/llvm-project/pull/138014).
2025-05-23 09:21:35 +02:00
Umang Yadav
ffa5ce04d0
Add arith expansion of f8E8M0 type for extf/trunc ops (#140332)
F8E8M0 floating type is supposed to represent biased exponent bits of
F32 type in OCP Micro scaling floating point formats.


https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf

This PR expands `arith.truncf` and `arith.extf` to support this
behavior.

For `arith.truncf`, the thing to note is that the F8E8M0FNU type has one
NaN representation, which is encoded as `0xFF`. Therefore all kinds of
NaNs and +/-Inf in Float32Type map to NaN in F8E8M0FNU. F8E8M0FNU
doesn't have a sign bit, so this is a lossy and irreversible
downcast.

cc: @krzysz00  @MaheshRavishankar @Muzammiluddin-Syed-ECE
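
The encoding described above can be sketched on scalars as follows. This is an illustrative model, not the expansion emitted by the patch: the helper names are hypothetical, and the mantissa is simply dropped without rounding, which is an assumption.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Keep only the 8 biased exponent bits of an f32. NaN and +/-Inf have an
// all-ones exponent, so they all land on 0xFF, the single NaN encoding.
// The sign bit and the mantissa are discarded (no rounding in this sketch).
uint8_t truncToF8E8M0(float value) {
  uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  return static_cast<uint8_t>((bits >> 23) & 0xFF);
}

// Decode a biased exponent back to f32 as 2^(e - 127); 0xFF decodes to NaN.
float extFromF8E8M0(uint8_t e) {
  if (e == 0xFF)
    return std::numeric_limits<float>::quiet_NaN();
  return std::ldexp(1.0f, static_cast<int>(e) - 127);
}
```

Round-tripping `-4.0f` through these helpers gives `4.0f`, which shows why the downcast is irreversible: the sign is lost.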
2025-05-22 15:36:00 -05:00
Michele Scuttari
72a8893689
Revert "[MLIR] Add bufferization state class to OneShotBufferization pass" (#141012)
Reverts llvm/llvm-project#138143

The PR for the BufferizationState is temporarily reverted due to API incompatibilities that were initially missed during the update and were not caught by PR checks.
2025-05-22 09:25:07 +02:00
Michele Scuttari
67fc1660d9
[MLIR] Add bufferization state class to OneShotBufferization pass (#138143)
This PR is a follow-up on #138125, and adds a bufferization state class providing information about the IR. The information currently consists of a cached list of symbol tables, which aims to solve the quadratic scaling of the bufferization task with respect to the number of symbols. The PR breaks API compatibility: the `bufferize` method of the `BufferizableOpInterface` has been enriched with a reference to a `BufferizationState` object.

The bufferization state must be kept in a valid state by the interface implementations. For example, if an operation with the `Symbol` trait is inserted or replaced, its parent `SymbolTable` must be updated accordingly (see, for example, the bufferization of `arith::ConstantOp`, where the symbol table of the module gets the new global symbol inserted). Similarly, the invalidation of a symbol table must be performed if an operation with the `SymbolTable` trait is removed (this can be performed using the `invalidateSymbolTable` method, introduced in #138014).
2025-05-22 08:53:38 +02:00
Christian Sigg
c56e7f22f0
[mlir][arith] Canonicalize sitofp(truncf) -> sitofp, and uitofp. (#139925)
Add canonicalization patterns that simplify `truncf(sitofp(x))` to
`sitofp(x)` and `truncf(uitofp(x))` to `uitofp(x)`, if truncf has the default rounding mode.

This assumes that the destination type of truncf is representable by the
intermediate type.

Note that the truncf semantics requires that the destination type is
narrower than the source type, so this is true for all types I can
possibly think of, but one could probably construct an artificial
counter example.
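
For intuition, here is a small scalar check of the rewrite, restricted to 32-bit integers, where the intermediate conversion to `double` is exact. It is a sketch under that assumption and the default rounding mode, with hypothetical helper names; it does not cover wider integer types.

```cpp
#include <cstdint>

// truncf(sitofp(x)): go through the wider float type first.
float viaDouble(int32_t x) {
  return static_cast<float>(static_cast<double>(x));
}

// sitofp(x): convert directly to the narrower float type.
float direct(int32_t x) {
  return static_cast<float>(x);
}
```

Both paths round the same exact value once, so they agree even for inputs such as 16777217 (2^24 + 1) that `float` cannot represent exactly.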

Somewhat related: https://github.com/llvm/llvm-project/pull/128096
2025-05-19 15:07:30 +02:00
Max Graey
8aaac80ddd
[NFC] Use more isa and isa_and_nonnull instead dyn_cast for predicates (#137393)
Also fix some typos in comments

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-05-13 22:34:42 +08:00
lorenzo chelini
61536f2781
[mlir] Retire additional let constructor (NFC) (#139390)
Three main changes:

- The pass createRequestCWrappersPass is renamed as
createLLVMRequestCWrappersPass

- createOptimizeForTargetPass is now under the LLVM namespace. It’s
unclear why the NVVM namespace was used initially, as all passes in
LLVMIR/Transforms/Passes.h consistently reside in the LLVM namespace.

- DuplicateFunctionEliminationPass is now in the func namespace.
2025-05-13 11:15:29 +02:00
Simon Camphausen
0009a17834
[mlir][EmitC] Add pass that combines all available emitc conversions (#117549)
2025-05-01 07:24:01 -07:00
Quinn Dawkins
ea3959e841
[MLIR][Arith] Add ValueBoundsOpInterface for FloorDivSI (#137879)
Enables value bounds inference through signed division operations.
2025-04-30 12:40:36 -04:00
Mehdi Amini
4149ec9970
[MLIR] Remove redundant verifier code in arith::ConstantOp
This is already checked by the `AllTypesMatch` traits defined in ODS.
2025-04-27 06:56:14 -07:00
Oleksandr "Alex" Zinenko
0c61b24337
[mlir] add a fluent API to GreedyRewriterConfig (#137122)
This is similar to other configuration objects used across MLIR.

Rename some fields to better reflect that they are no longer booleans.

Reland 04d261101b4f229189463136a794e3e362a793af / #132253.
2025-04-24 09:51:42 +02:00
Ivan Butygin
e87aa0c6ab
[mlir][vector] Sink vector.extract/splat into load/store ops (#134389)
```
vector.load %arg0[%arg1] : memref<?xf32>, vector<4xf32>
vector.extract %0[1] : f32 from vector<4xf32>
```
Gets converted to:
```
%c1 = arith.constant 1 : index
%0 = arith.addi %arg1, %c1 overflow<nsw> : index
%1 = memref.load %arg0[%0] : memref<?xf32>
```

```
%0 = vector.splat %arg2 : vector<1xf32>
vector.store %0, %arg0[%arg1] : memref<?xf32>, vector<1xf32>
```
Gets converted to:
```
memref.store %arg2, %arg0[%arg1] : memref<?xf32>
```
2025-04-22 17:18:54 +03:00
Kazu Hirata
4cb9a3700c
Revert "[mlir] add a fluent API to GreedyRewriterConfig (#132253)"
This reverts commit 63b8f1c9482ed0a964980df4aed89bef922b8078.

Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/172/builds/12083/steps/5/logs/stdio

I've reproduced the error with a release build (-DCMAKE_BUILD_TYPE=Release).
2025-04-18 09:40:28 -07:00
Oleksandr "Alex" Zinenko
63b8f1c948
[mlir] add a fluent API to GreedyRewriterConfig (#132253)
This is similar to other configuration objects used across MLIR.
2025-04-18 15:19:57 +02:00
Prakhar Dixit
35f4cdbf59
[mlir][arith] Add constraints to the MulIOp for preventing type mismatch while folding (#136093)
Fixes #135289
The original version didn't check if the types of lhs, rhs, and the
result matched, which could cause type errors.
This fix adds type checks to make sure the constants attributes have
the same type as the SSA values before applying the simplification.
2025-04-17 11:17:34 +02:00
Matthias Springer
6966b4f4a5
[mlir][arith] Remove func patterns from populateArithWideIntEmulationPatterns (#134316)
This function should populate only patterns that are related to wide
integer operation emulation.
2025-04-04 06:23:17 -07:00
Fehr Mathieu
8b67f36258
[mlir] [arith] Fix ceildivsi lowering in arith-expand (#133774)
This fixes the current lowering of `arith.ceildivsi` in the arith-expand
pass, which was previously incorrect. The new version is based on the
lowering of `arith.floordivsi`, and will not introduce new undefined
behavior or poison during the lowering. It also replaces one division
with a multiplication.

The previous lowering of `ceildivsi(n, m)` was the following:
```
x = (m > 0) ? -1 : 1
(n*m>0) ? ((n+x) / m) + 1 : - (-n / m)
```

This caused two problems:
* In the case where `n` is INT_MIN and `m` is positive, the result would
be poison instead of an actual value
* In the case where `n` is INT_MAX and `m` is `-1`, this would trigger
undefined behavior, while the original code wouldn't. This is because
`n+x` would be equal to `INT_MIN` (`INT_MAX + 1`), so the `(n+x) / m`
division would overflow and trigger UB.
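
One formulation with a single division and a single multiplication is sketched below. It is an assumption about the general shape of such a lowering, not a transcription of the patch, and `ceilDivSI` is a hypothetical name.

```cpp
#include <cstdint>

// Ceiling division for signed 32-bit integers.
// Precondition: m != 0 and not (n == INT32_MIN && m == -1), the same cases
// in which the division itself is undefined.
int32_t ceilDivSI(int32_t n, int32_t m) {
  int32_t q = n / m;                  // truncates toward zero
  bool exact = q * m == n;            // cannot overflow: |q * m| <= |n|
  bool sameSign = (n < 0) == (m < 0); // true result is positive
  // Truncation rounded a positive, inexact quotient down: bump it by one.
  return (!exact && sameSign) ? q + 1 : q;
}
```

Unlike the previous lowering, this never computes `n + x` or `-n`, so the two problematic inputs listed above are handled without overflow.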
2025-04-02 17:26:58 +01:00
Maksim Levental
1d4801f22a
[mlir] fix maybeReplaceWithConstant in IntRangeOptimizations (#133556)
If a dialect is caching/reusing constants when materializing then such
constants might already have `IntegerValueRangeLattice`s associated with
them and the range endpoint bit widths might not match the new
replacement (amongst other possible wackiness).

I observed this with `%true = arith.constant true`, which was
materialized but had an existing `IntegerValueRangeLattice` (i.e.,
`solver.getOrCreateState<dataflow::IntegerValueRangeLattice>` was not
uninitialized) with range endpoint bit widths:

```
umin bit width: 32
umax bit width: 32
smin bit width: 32
smax bit width: 32
```

while the widths of the range end points for something like `%20 =
arith.cmpi slt, %19, %c1_i32` (a replacement candidate) would be

```
umin bit width: 1
umax bit width: 1
smin bit width: 1
smax bit width: 1
```

Thus, we should be clearing the analysis state each time a constant is
reused.
2025-03-28 23:28:11 -04:00
egebeysel
3a3732c252
[mlir][arith] wide integer emulation support for fpto*i ops (#132375)
Adds wide integer emulation support for `arith.fpto*i` operations. As
with the other emulated operations, the upper and lower `N` bits of the `i2N`
integer result are emitted separately.

For the unsigned case we use the following emulation

```c
// example is 64 -> 32 bit emulation, but the implementation is generalized to any 2N -> N case
const double TWO_POW_N = (double)((uint64_t)1 << N); // 2^N, N is the bitwidth of the widest int supported

// f is a floating-point value representing the input of the fptoui op.
uint32_t hi = (uint32_t)(f / TWO_POW_N);         // Truncates the division result
uint32_t lo = (uint32_t)(f - hi * TWO_POW_N);       // Subtracts to get the lower bits.
```

For the signed case, we defer the emulation of the absolute value to
`fptoui` and handle the sign:

```
fptosi(fp) = sign(fp) * fptoui(abs(fp))
```

The edge cases of `NaNs, +-inf` and overflows/underflows are undefined
behaviour and the resulting numbers are the combination of the lower
bitwidth UB values. These operations also propagate poison values.
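
Putting the unsigned and signed schemes together for the 64 -> 32 bit case might look like the sketch below. The `Halves` struct and function names are hypothetical, and the sign is applied as a two's complement negation across the two halves, which is one way to realize `sign(fp) * fptoui(abs(fp))`. Out-of-range inputs and NaN remain undefined, as stated above.

```cpp
#include <cmath>
#include <cstdint>

struct Halves {
  uint32_t lo;
  uint32_t hi;
};

// fptoui emulation: the high half is the truncated quotient by 2^32, the
// low half is what remains after subtracting the high part.
Halves fpToUi64(double f) {
  const double twoPowN = 4294967296.0; // 2^32
  uint32_t hi = static_cast<uint32_t>(f / twoPowN);
  uint32_t lo = static_cast<uint32_t>(f - hi * twoPowN);
  return {lo, hi};
}

// fptosi emulation: convert the magnitude, then negate if f is negative.
Halves fpToSi64(double f) {
  Halves mag = fpToUi64(std::fabs(f));
  if (f >= 0)
    return mag;
  uint32_t lo = ~mag.lo + 1u;                  // two's complement, low half
  uint32_t hi = ~mag.hi + (lo == 0 ? 1u : 0u); // carry into the high half
  return {lo, hi};
}

// Reassemble the two halves for inspection.
uint64_t join(Halves h) {
  return (static_cast<uint64_t>(h.hi) << 32) | h.lo;
}
```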

Signed-off-by: Ege Beysel <beysel@roofline.ai>
2025-03-27 20:58:56 -04:00
egebeysel
a1a5594ad2
[mlir][arith] add wide integer emulation support for subi (#133248)
Adds wide integer emulation support for the `arith.subi` op. `(i2N, i2N)
-> (i2N)` ops are emulated as `(vector<2xiN>, vector<2xiN>) ->
(vector<2xiN>)`, just as the other emulation patterns.

The emulation uses the following scheme:

```
resLow = lhsLow - rhsLow;      // carry = 1 if rhsLow > lhsLow
resHigh = lhsHigh - carry - rhsHigh;
```
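
As a scalar illustration of subtraction with borrow on two halves, for 64-bit values split into 32-bit halves (a sketch with hypothetical names, using `lhsHigh`/`rhsHigh` for the upper halves; the actual pattern operates on `vector<2xiN>` values):

```cpp
#include <cstdint>

struct Wide {
  uint32_t low;
  uint32_t high;
};

// Subtract the low halves, detect the borrow, then subtract the high
// halves together with the borrow. All arithmetic wraps modulo 2^32.
Wide subWide(Wide lhs, Wide rhs) {
  uint32_t resLow = lhs.low - rhs.low;
  uint32_t borrow = rhs.low > lhs.low ? 1u : 0u;
  uint32_t resHigh = lhs.high - borrow - rhs.high;
  return {resLow, resHigh};
}

// Helpers to move between a native 64-bit value and its two halves.
Wide fromU64(uint64_t v) {
  return {static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}

uint64_t toU64(Wide w) {
  return (static_cast<uint64_t>(w.high) << 32) | w.low;
}
```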

Signed-off-by: Ege Beysel <beysel@roofline.ai>
2025-03-27 15:01:04 -05:00
Maksim Levental
bfe85230e2
[mlir][IntegerRangeAnalysis] expose maybeReplaceWithConstant (#133151)
This PR exposes `maybeReplaceWithConstant` in headers for downstream
use.
2025-03-26 18:10:12 -04:00
Matthias Springer
a21cfca320
[mlir][IR] Deprecate match and rewrite functions (#130031)
Deprecate the `match` and `rewrite` functions. They mainly exist for
historic reasons. This PR also updates all remaining uses in the MLIR
codebase.

This is addressing a
[comment](https://github.com/llvm/llvm-project/pull/129861#pullrequestreview-2662696084)
on an earlier PR.

Note for LLVM integration: `SplitMatchAndRewrite` will be deleted soon,
update your patterns to use `matchAndRewrite` instead of separate
`match` / `rewrite`.

---------

Co-authored-by: Jakub Kuderski <jakub@nod-labs.com>
2025-03-07 08:43:01 +01:00
Matthias Springer
a6151f4e23
[mlir][IR] Move match and rewrite functions into separate class (#129861)
The vast majority of rewrite / conversion patterns uses a combined
`matchAndRewrite` instead of separate `match` and `rewrite` functions.

This PR optimizes the code base for the most common case where users
implement a combined `matchAndRewrite`. There are no longer any `match`
and `rewrite` functions in `RewritePattern`, `ConversionPattern` and
their derived classes. Instead, there is a `SplitMatchAndRewriteImpl`
class that implements `matchAndRewrite` in terms of `match` and
`rewrite`.

Details:
* The `RewritePattern` and `ConversionPattern` classes are simpler
(fewer functions). Especially the `ConversionPattern` class, which now
has 5 fewer functions. (There were various `rewrite` overloads to
account for 1:1 / 1:N patterns.)
* There is a new class `SplitMatchAndRewriteImpl` that derives from
`RewritePattern` / `OpRewritePattern` / ..., along with a type alias
`RewritePattern::SplitMatchAndRewrite` for convenience.
* Fewer `llvm_unreachable` are needed throughout the code base. Instead,
we can use pure virtual functions. (In cases where users previously had
to implement `rewrite` or `matchAndRewrite`, etc.)
* This PR may also reduce the number of [`-Woverload-virtual`
warnings](https://discourse.llvm.org/t/matchandrewrite-hiding-virtual-functions/84933)
that are produced by GCC. (To be confirmed...)

Note for LLVM integration: Patterns with separate `match` / `rewrite`
implementations, must derive from `X::SplitMatchAndRewrite` instead of
`X`.
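
The layering described here can be pictured with a toy model. The classes below are stand-ins invented for illustration and are NOT the MLIR classes: an `int` plays the role of the operation, and `ToyPattern` / `ToySplitMatchAndRewrite` only mirror the relationship between a pattern base class and the split adapter.

```cpp
// Toy base class: only the combined entry point exists, as a pure virtual.
struct ToyPattern {
  virtual ~ToyPattern() = default;
  // Returns true and mutates `op` on success; leaves `op` untouched otherwise.
  virtual bool matchAndRewrite(int &op) const = 0;
};

// Toy adapter: implements matchAndRewrite in terms of match and rewrite.
struct ToySplitMatchAndRewrite : ToyPattern {
  virtual bool match(int op) const = 0;
  virtual void rewrite(int &op) const = 0;

  bool matchAndRewrite(int &op) const final {
    if (!match(op))
      return false;
    rewrite(op);
    return true;
  }
};

// A pattern that still wants separate match / rewrite derives from the
// adapter instead of from the base class.
struct HalveEven : ToySplitMatchAndRewrite {
  bool match(int op) const override { return op % 2 == 0; }
  void rewrite(int &op) const override { op /= 2; }
};

bool applyHalveEven(int &op) { return HalveEven().matchAndRewrite(op); }
```

Patterns that implement the combined function directly pay nothing for the split variant, and the base class needs no placeholder `match` / `rewrite` bodies.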

---------

Co-authored-by: River Riddle <riddleriver@gmail.com>
2025-03-06 08:48:51 +01:00
Zahi Moudallal
5d0c5c638a
[MLIR][ARITH] Adds missing foldings for truncf (#128096)
This patch is mainly to deal with folding `truncf`, as follows:
`truncf(extf(a))` -> `a`, if `a` has the same bitwidth as the result
`truncf(extf(a))` -> `truncf(a)`, if `a` has larger bitwidth than the
result
2025-02-21 15:37:19 -08:00
Maksim Levental
ab7664c02c
[mlir][integer-range-analysis] expose helpers in header and fix ConstantIntRange print (#127888)
2025-02-19 21:01:45 -05:00
Frank Schlimbach
0fd50ec9a3
[MLIR][mesh] Mesh fixes (#124724)
A collection of fixes to the mesh dialect
- allow constants in sharding propagation/spmdization
- fixes to tensor replication (e.g. 0d tensors)
- improved canonicalization
- sharding propagation incorrectly generated too many ShardOps
New operation `mesh.GetShardOp` enables exchanging sharding information
(like on function boundaries)
2025-02-12 12:44:48 +01:00
Longsheng Mou
4c3169d24c
[mlir][arith] EmulateWideInt only support vector.print (#124510)
This PR fixes a bug where dynamically legal operations were added for
all vector operations, but only `vector.print` was supported, leading to
a crash. Fixes #73381.
2025-02-05 09:43:42 +08:00