971 Commits

Author SHA1 Message Date
Slava Zakharin
35f89458fa
[mlir] Made DefaultResource the root of memory resource hierarchy. (#187423)
DefaultResource is made the root of the memory resource hierarchy,
so now it overlaps with all resources.

RFC:
https://discourse.llvm.org/t/rfc-mlir-memory-region-hierarchy-for-mlir-side-effects/89811/32
2026-03-30 17:52:45 -07:00
Mehdi Amini
5d293008c2
[MLIR][Transforms] Fix two bugs in loop-invariant-subset-hoisting (#188761)
Fix two issues in `MatchingSubsets::populateSubsetOpsAtIterArg`:

1. The `collectHoistableOps` parameter was declared but never used when
inserting subset ops via `insert(subsetOp)`. As a result, when recursing
into nested loops with `collectHoistableOps=false`, the nested loop's
subset ops were incorrectly added to the hoistable extraction/insertion
pairs of the parent loop. This caused spurious failures in the
`allDisjoint` check, preventing valid hoisting when nested loop ops
overlapped with outer loop ops. Fix by passing the parameter:
`insert(subsetOp, collectHoistableOps)`.

2. In the nested loop handling branch, there was no guard to detect when
a value has multiple nested loop uses (i.e., is used as an init arg in
more than one nested loop). Without the guard, `nextValue` would be
silently overwritten, leading to an incorrect use-def chain traversal.
Add `if (nextValue) return failure()` before setting `nextValue` for the
nested loop case, mirroring the existing guard for insertion ops.

Fixes #147096

Assisted-by: Claude Code
2026-03-27 17:27:08 +01:00
Mehdi Amini
e9669fd6fb
[MLIR][EmitC] Fix crash in SwitchOp::getEntrySuccessorRegions on unsigned integer type (#188546)
SwitchOp::getEntrySuccessorRegions and getRegionInvocationBounds called
IntegerAttr::getInt() to retrieve the constant switch argument, but
getInt() asserts that the attribute type must be a signless integer or
index. For unsigned integer types (e.g. ui32), this assertion fired and
crashed the process.

Fix by selecting the appropriate accessor based on the attribute type:
getInt() for signless/index, getSInt() for signed, and getUInt() (cast
to int64_t) for unsigned integer types. Unknown types fall back to the
conservative "all regions possible" path.

The same fix is applied to getRegionInvocationBounds, which had an
identical call to getInt().

Fixes #187973

Assisted-by: Claude Code
2026-03-27 17:26:45 +01:00
Mehdi Amini
23eec12169
[MLIR] Fix outdated restriction comment in RemoveDeadValuesPass (#189041)
The RemoveDeadValuesPass previously emitted an error and skipped
optimization when the IR contained non-function symbol ops, non-call
symbol user ops, or branch ops. This restriction was later removed, but
the comments in RemoveDeadValues.cpp and Passes.td still described the
pass as operating "iff the IR doesn't have any non-function symbol ops,
non-call symbol user ops and branch ops."

Remove the stale restriction text from both the .cpp file comment and
the Passes.td description. Also add a test that verifies dead function
arguments are correctly removed inside a module that defines a symbol
(has a sym_name attribute), which was the original failure case reported
in issue #98700.

Fixes #98700

Assisted-by: Claude Code
2026-03-27 16:22:47 +00:00
Slava Zakharin
443e4cb2df
Reapply "[MLIR] [Mem2Reg] Fix unused block argument removal logic (#188484)" (#188571) (#188599)
This reverts commit d9402d087ab90610d3ff8a78a50eb66d3be4cffd.

This re-applies commit e5adddc5be63b8bb8c36572f68ac64c8042cb282
along with
62eafb5cd1

Co-authored-by: Yi Zhang <cathyzhyi@google.com>
2026-03-25 14:08:50 -07:00
Mehdi Amini
820eaa427a
[MLIR][DataFlow] Fix two crashes in DeadCodeAnalysis on empty/no-terminator regions (#188548)
Two related assertion failures in DeadCodeAnalysis when processing
OpenACC operations:

1. visitRegionBranchEdges (issue #187972): When a RegionSuccessor refers
to an empty region (no blocks), calling getSuccessor()->front()
dereferences a sentinel ilist iterator, crashing with
"!NodePtr->isKnownSentinel()". Fix: skip successors whose region is
empty.

2. isRegionOrCallableReturn (issue #188408): When iterating over ops in
a nested acc region whose blocks do not have a required terminator,
Block::getTerminator() is called without first checking
mightHaveTerminator(), triggering "Assertion `mightHaveTerminator()'
failed". Fix: guard the getTerminator() call with mightHaveTerminator().

Fixes #187972, #188408

Assisted-by: Claude Code
2026-03-25 20:37:16 +01:00
Théo Degioanni
239ca11a55
[MLIR][Mem2Reg] Add support for region control flow and SCF (#185036)
This PR adds support for region control-flow. Region control-flow and
CFG can be mixed together in the same program. See the [accompanying
RFC](https://discourse.llvm.org/t/rfc-support-region-control-flow-in-mem2reg/90082)
for some design considerations.

Beyond the considerations in the RFC, a few minor changes were
introduced:

- Calling the visitor hook for defined values is now deferred to the end
of promotion.
- The lazy creation of default values has been moved to the places where
it happens to prepare for a future change where it is actually lazy.
Documentation about it not working as intended for now was also added.

All SCF operations are supported, including `forall` and `parallel`,
which is pretty cool I think.

I am sorry in advance for git diff displaying a really bad diff for
Mem2Reg.cpp around where the liveness analysis used to be. Do consider
simply reading this part of the code off the file.

As a disclaimer, I designed all the test cases myself, but I used a
large amount of matrix multiplications to produce the corresponding IR
and FileCheck tests. I have reviewed them carefully and they correspond
to my intent.

---------

Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
2026-03-23 18:08:55 +01:00
Mehdi Amini
dd9dd1d2f3
[mlir][bufferization] Fix crash in promote-buffers-to-stack for nested memrefs (#186426)
The `--promote-buffers-to-stack` pass crashes when allocating a memref
whose element type is itself a memref (e.g., `memref<1xmemref<2xf32>>`).
This happens because `defaultIsSmallAlloc` calls
`DataLayout::getTypeSizeInBits` on the element type, but `MemRefType`
(and other types without `DataLayoutTypeInterface`) trigger a fatal
error when queried this way.

Fix the crash by checking whether the element type has data layout
support before computing its size. Types that are not int/float,
complex, index, or vector and do not implement `DataLayoutTypeInterface`
are silently skipped (i.e., the allocation is not promoted).

Fixes #60092

Assisted-by: Claude Code
2026-03-18 16:44:26 +01:00
Mehdi Amini
1cfc264547
[mlir][bufferization] Fix integer overflow crash in promote-buffers-to-stack (#186276)
`defaultIsSmallAlloc` called `ShapedType::getNumElements()` which
asserts when the static element count overflows `int64_t` (e.g. a
`memref<3090540x3090540x3090540xi32>` whose element count is ~29e18).

Switch to `ShapedType::tryGetNumElements()`, which returns
`std::nullopt` on overflow. An overflowing element count means the
allocation is definitely not small, so we return `false` immediately. A
secondary overflow guard is added for the final size comparison.
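
As an illustration of the overflow handling described above, here is a minimal standalone sketch. It does not use the MLIR API: `checkedNumElements` and `isSmallAlloc` are hypothetical stand-ins, and the byte-size parameters are assumptions made for the example.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical stand-in for an overflow-aware element count: returns
// std::nullopt when the product of the extents does not fit in int64_t.
std::optional<int64_t> checkedNumElements(const std::vector<int64_t> &shape) {
  for (int64_t dim : shape)
    if (dim < 0)
      return std::nullopt; // No static count for a negative (dynamic) extent.
  for (int64_t dim : shape)
    if (dim == 0)
      return 0; // Any empty dimension makes the whole shape empty.
  int64_t count = 1;
  for (int64_t dim : shape) {
    if (count > std::numeric_limits<int64_t>::max() / dim)
      return std::nullopt; // count * dim would overflow int64_t.
    count *= dim;
  }
  return count;
}

// Hypothetical "is small" predicate: an overflowing count is never small.
bool isSmallAlloc(const std::vector<int64_t> &shape, int64_t elementBytes,
                  int64_t maxBytes) {
  std::optional<int64_t> count = checkedNumElements(shape);
  if (!count || elementBytes <= 0 || maxBytes < 0)
    return false;
  // Secondary guard: compare by division so that count * elementBytes is
  // never computed and therefore cannot overflow.
  return *count <= maxBytes / elementBytes;
}
```

With the shape from the commit message, `checkedNumElements({3090540, 3090540, 3090540})` yields `std::nullopt`, so `isSmallAlloc` returns `false` without asserting.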

Fixes #64638

Assisted-by: Claude Code
2026-03-13 11:38:39 +00:00
Mehdi Amini
2d70dbdb35
[mlir][test] Fix UNREACHABLE in TestInlinerInterface for multi-block inlining (#186266)
When the MLIR inliner inlines a callable region that has more than one
block, it calls `handleTerminator(op, Block *newDest)` for the
terminator of every inlined block. `TestInlinerInterface` only
implemented the single-block variant (`handleTerminator(op,
ValueRange)`), so the default `llvm_unreachable` was hit when inlining a
`test.functional_region_op` whose body contained multiple blocks (e.g.
an explicit `cf.br` jump to a successor block whose terminator was
`test.return`).

Fix: add the missing `handleTerminator(op, Block *)` override to
`TestInlinerInterface`. Mirror the pattern used by
`FuncDialectInlinerExtension`: if the terminator is a `TestReturnOp`,
replace it with a `cf.br` to `newDest` carrying the return operands. Any
other terminator (e.g. `cf.br` for intra-region branches) is left
untouched — the existing `ControlFlowInlinerInterface` no-op already
handles those correctly.

Add a regression test in `test/Transforms/inlining.mlir` that inlines a
`call_indirect` into a `test.functional_region_op` with two blocks.

Fixes #185350

Assisted-by: Claude Code
2026-03-12 22:29:52 +00:00
Mehdi Amini
d8f3be726e
[mlir][dialect-conversion] Fix OOB crash in convertFuncOpTypes for funcs with extra block args (#185060)
Some function ops (e.g., gpu.func with workgroup memory arguments) have
more entry block arguments than their FunctionType has inputs. The
workgroup memory arguments are not part of the public function signature
but are present as additional block arguments.

`convertFuncOpTypes` previously created a `SignatureConversion` sized
only for `type.getNumInputs()`, then called `applySignatureConversion`
on the entry block. When the block had more arguments (e.g., workgroup
args), the loop in `applySignatureConversion` would call
`getInputMapping(i)` with out-of-bounds indices, causing an assertion
failure in `SmallVector::operator[]`.

Fix this by:
1. Sizing the `SignatureConversion` for all entry block arguments.
2. Adding identity mappings for extra block args beyond the function
type inputs.
3. Using only the converted function-type-input types when updating the
FunctionType (so extra block arg types are not included in the
signature).
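
The three steps above can be sketched with a simplified standalone model. This is not the `SignatureConversion` API: `ArgMapping` and `buildEntryBlockMapping` are hypothetical names, and each function input is described only by how many converted arguments it expands to.

```cpp
#include <cstddef>
#include <vector>

// For each original entry-block argument, the indices of the new block
// arguments that replace it (hypothetical representation).
using ArgMapping = std::vector<std::vector<std::size_t>>;

// convertedCounts[i] is the number of converted types for function input i.
// numBlockArgs may be larger than convertedCounts.size() when the entry
// block carries extra arguments that are not part of the function type.
ArgMapping buildEntryBlockMapping(const std::vector<std::size_t> &convertedCounts,
                                  std::size_t numBlockArgs) {
  ArgMapping mapping;
  mapping.reserve(numBlockArgs);
  std::size_t next = 0;
  // Function-type inputs: each may expand to zero or more new arguments.
  for (std::size_t count : convertedCounts) {
    std::vector<std::size_t> targets;
    for (std::size_t j = 0; j < count; ++j)
      targets.push_back(next++);
    mapping.push_back(targets);
  }
  // Extra block arguments: identity-mapped, one new argument each.
  for (std::size_t i = convertedCounts.size(); i < numBlockArgs; ++i)
    mapping.push_back({next++});
  return mapping;
}
```

The mapping has one entry per entry-block argument, so a lookup by block argument index stays in bounds. Only the first `convertedCounts.size()` entries would contribute to the updated function type.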

Fixes #184744

Assisted-by: Claude Code
2026-03-11 14:25:03 +01:00
Mehdi Amini
b78ceef43e
[mlir][scf] Fix crash in extractFixedOuterLoops with iter_args loops (#184106)
The stripmineSink helper splices loop body operations into a new inner
scf.for that has no iter_args. When the target loop carries iter_args,
values yielded by the spliced body are moved inside the inner loop, but
the outer loop's yield terminator still references those values,
creating an SSA invariant violation. In debug builds this triggers the
assertion
  use_empty() && "Cannot destroy a value that still has uses\!"
when the outer RewriterBase tries to erase the now-broken operations.

Fix: in extractFixedOuterLoops, skip the strip-mining transformation if
any of the collected perfectly-nested loops have iter_args.

Add a regression test to parametric-tiling.mlir.

Fixes #129044

Assisted-by: Claude Code
2026-03-11 14:21:57 +01:00
Slava Zakharin
48e6adc97e
[RFC][mlir] Resource hierarchy for MLIR Side Effects. (#181229)
This patch allows creating a hierarchy of `SideEffects::Resource`s by adding
a virtual `getParent()` method, so that effects on *disjoint* resources
can be proven non-conflicting. It also adds a virtual `isAddressable()` method
that indicates whether a resource is addressable via a pointer value.
Non-addressable resources cannot be affected through any pointer.
This unblocks CSE, LICM and alias analysis without per-pass
special-casing.
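
A minimal standalone sketch of the parent-link idea follows. It is not the MLIR `SideEffects::Resource` class: the `Resource` struct, `isSelfOrAncestor` and `mayOverlap` are hypothetical, and the overlap rule (two resources may overlap only if one is the other or an ancestor of it) is an assumption made for illustration.

```cpp
#include <string>

// Hypothetical, simplified model of a resource with a parent link.
struct Resource {
  std::string name;
  const Resource *parent = nullptr; // nullptr marks the root of the hierarchy.
};

// Returns true if `ancestor` is `r` itself or lies on r's parent chain.
bool isSelfOrAncestor(const Resource &ancestor, const Resource &r) {
  for (const Resource *cur = &r; cur; cur = cur->parent)
    if (cur == &ancestor)
      return true;
  return false;
}

// Assumed rule: resources on the same root-to-leaf path may overlap;
// resources in different subtrees (e.g. siblings) are treated as disjoint.
bool mayOverlap(const Resource &a, const Resource &b) {
  return isSelfOrAncestor(a, b) || isSelfOrAncestor(b, a);
}
```

Under this rule a root resource overlaps with every resource below it, while two children of the same parent do not overlap with each other.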

RFC:
https://discourse.llvm.org/t/rfc-mlir-memory-region-hierarchy-for-mlir-side-effects/89811
2026-03-09 13:12:49 -07:00
Mehdi Amini
220f91a05b
[MLIR] Fix crash in inliner when return arity mismatches call results (#185037)
The `handleTerminator` implementation in the test dialect's inliner
interface was asserting that the number of `test.return` operands equals
the number of values to replace. This assertion fires when inlining a
callee whose body uses `test.return` with values into a call site that
expects zero results (e.g., a void `llvm.func` calling a function whose
implementation uses `test.return` with operands).

Replace the assertion with a conditional early return so the inliner
gracefully skips replacement instead of crashing.

Fixes #108376

Assisted-by: Claude Code
2026-03-06 16:50:04 +00:00
Jeongseok Son
62a5e53919
[mlir] Improve dialect conversion failure diagnostics (#182729)
This PR improves MLIR dialect conversion failure diagnostics when
legalization fails.

Previously, the diagnostic mostly included the operation name (and in
partial conversion, whether it was explicitly marked illegal). This
change keeps that prefix and appends the printed failing operation. This
provides immediate operand/result/type context directly in the same
error line.

### Example

Before:
```
failed to legalize operation 'test.type_consumer' that was explicitly marked illegal
```

After:
```
failed to legalize operation 'test.type_consumer' that was explicitly marked illegal: "test.type_consumer"(%arg0) : (f32) -> ()
```

### Tests
- Updated `mlir/test/Transforms/test-legalizer.mlir` expectations for
the richer emitted diagnostic.
2026-03-06 05:31:01 +00:00
Erick Ochoa Lopez
f2cdf3f3b0
Revert "[mlir][arith] Add exact to index_cast{,ui} (#183395)" (#184876)
This reverts commit 7ad2c6db54a0e77249f2edb3c589ccf4c930d455.

PR #183395 introduced the `exact` flag to `index_cast` and
`index_castui` and updated some canonicalization patterns.
These canonicalization patterns were found to be unsound. For example:

* `index_cast(index_cast(x)) -> x`
* where one first truncates and then widens x

the rewrite is unsound because the first cast **may** truncate the value
of x, losing information. The `exact` flag was introduced to make this
transformation sound. Its semantics are that when the `exact` flag is
present, it is assumed that the operand to index_cast does not lose
information (i.e., fits perfectly in the destination type).

In PR #183395, the canonicalization rule was rewritten such that it would
only match where the inner index_cast had the `exact` flag set.

* `index_cast(index_cast(x, exact)) -> x`
* where source type and destination type are the same

A post-merge review
[highlighted](https://github.com/llvm/llvm-project/pull/183395#discussion_r2880422529)
that the pattern above also disallows the following correct pattern:

* `index_cast(index_cast(x)) -> x`
* when the first index widens and the second one truncates.
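
The difference between the two cast orders can be checked with plain fixed-width integers. The sketch below assumes 64-bit and 32-bit widths and unsigned (zero-extending) casts purely for illustration. The actual width of `index` is target-specific, and the function names are hypothetical.

```cpp
#include <cstdint>

// Truncate a 64-bit value to 32 bits, then widen it back by zero-extension.
// The upper 32 bits are dropped, so the round trip is not the identity.
uint64_t truncateThenWiden(uint64_t x) {
  uint32_t narrow = static_cast<uint32_t>(x);
  return static_cast<uint64_t>(narrow);
}

// Widen a 32-bit value to 64 bits, then truncate it back.
// No bits are lost, so the round trip is the identity for every input.
uint32_t widenThenTruncate(uint32_t x) {
  uint64_t wide = static_cast<uint64_t>(x);
  return static_cast<uint32_t>(wide);
}
```

For example, `truncateThenWiden(0x100000001)` returns `1`, not the original value, whereas `widenThenTruncate` returns its argument unchanged.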

Unfortunately the semantics of `index` are such that its bitwidth is
target specific. Attempts were made in
https://github.com/llvm/llvm-project/pull/184631 to automatically add
annotations where possible but no agreement was reached on the best way
to do this. Adding to the disagreement are the following points:

* [there are other unsound patterns that assume index is
64](https://github.com/llvm/llvm-project/pull/184631/changes#r2885181291)
* [The semantics of index is
contested](https://discourse.llvm.org/t/index-type-and-assumption-about-bitwidth/88287)

This led to the belief that a reversal and an RFC would be a good
approach to get some consensus from the community before proceeding
further.
2026-03-05 16:04:45 -05:00
Mehdi Amini
ecec7920c6
[mlir][func] Move return-type verification from ReturnOp to FuncOp (#184153)
Move the operand count and type checks for func.return from
ReturnOp::verify() into a new FuncOp::verify(). The verifier iterates
all blocks in the callable region, skipping terminators that are not
func.return (e.g. llvm.return or test.return that may appear during
dialect conversion).

Fix several invalid-IR tests that had func.func return types
inconsistent with the actual func.return operands. Previously these
mismatches were silent because block verification stopped at an earlier
expected error before reaching the func.return; now that
FuncOp::verify() runs before body verification, the return types must be
consistent.
2026-03-03 12:18:27 +01:00
Mehdi Amini
785490e9db
[MLIR] Remove let constructor = from mlir/include/mlir/Transforms/Passes.td (#183950)
This makes the constructor auto-generated.
2026-03-01 13:51:23 +01:00
Erick Ochoa Lopez
7ad2c6db54
[mlir][arith] Add exact to index_cast{,ui} (#183395)
The `exact` flag with the following semantics

> If the `exact` attribute is present, it is assumed that the index type width
> is such that the conversion does not lose information. When this assumption
> is violated, the result is poison.

can be added to index_cast and index_castui operations. This unlocks
the following lowerings:

*   index_cast (signed) exact    -> trunc nsw
*   index_castui (unsigned) exact -> trunc nuw
*   index_castui nneg exact       -> trunc nuw nsw

Changes:

* Adds ArithExactFlagInterface.
* Updates Arith_IntBinaryOpWithExactFlag to use ArithExactFlagInterface
* Update IndexCastOp and IndexCastUIOp to declare
`ArithExactFlagInterface`
* Update canonicalization patterns
* Update roundtrip, lowering, and canonicalization tests.
2026-02-27 17:17:48 -05:00
Mehdi Amini
bcd8819aee
[mlir][transforms] Fix crash in remove-dead-values when function has non-call users (#183655)
`processFuncOp` asserts that all symbol uses of a function are
`CallOpInterface` operations. This is violated when a function is
referenced by a non-call operation such as `spirv.EntryPoint`, which
uses the function symbol for metadata purposes without calling it.

Fix this by replacing the assertion with an early return: if any user of
the function symbol is not a `CallOpInterface`, skip the function
entirely. This is safe because the pass cannot determine the semantics
of arbitrary non-call references, so it should leave such functions
alone.

Fixes #180416
2026-02-27 15:44:08 +00:00
Prathamesh Tagore
5460a202ea
[mlir][remove-dead-values] Replace appropriate operation results with poison (#181013)
Before erasing the operation, replace all result values that have live uses
with ub.poison values. This is important to maintain IR validity. For
example, if we have an op with one of its results used by another op,
erasing the op without replacing its corresponding result would leave us
with a dangling operand in the user op. By replacing the result with a
ub.poison value, we ensure that the user op still has a valid operand, even
though it's a poison value which will be cleaned up later if it can be
cleaned up. This keeps the IR valid for further simplification and
canonicalization while fixing a related crash in the canonicalizer.

Fixes https://github.com/llvm/llvm-project/issues/179944
2026-02-16 20:46:36 +01:00
Scott Manley
370a571597
[RegionUtils] replace uses in nested regions when isolating from above (#180548)
When making a region IsolatedFromAbove, replace uses in any region
within the parent region, not just the immediate parent region.
2026-02-10 07:36:47 -06:00
Jorn Tuyls
f84c3672c3
[mlir] Extend moveValueDefinitions/moveOperationDependencies with cross-region support (#176343)
Extends `moveValueDefinitions` and `moveOperationDependencies` to
support moving operations across basic blocks and out of nested regions.
2026-02-02 11:39:02 +01:00
Lukas Sommer
c1152f0fb2
[mlir] Avoid segfault in 'MoveBlockRewrite' rollback (#178148)
Prior to this change, rollback of the `MoveBlockRewrite` could result in
segfault if the block wasn't contained in a region anymore.

That situation could arise if the previous rollback of another rewrite
orphaned the block by removing it from its region, as demonstrated by
the new test pattern.

Signed-off-by: Lukas Sommer <lukas.sommer@amd.com>
2026-01-27 15:06:57 +01:00
Jorn Tuyls
5faa181112
[mlir] Add side-effect check to moveOperationDependencies (#176361)
This patch adds a side-effect check to `moveOperationDependencies` to
match the behavior of `moveValueDefinitions`. Previously,
`moveOperationDependencies` would move operations with side-effecting
dependencies, which could change program semantics.

**Note** that the existing test changes are needed because unregistered
operations (e.g., "moved_op"()) are treated as side-effecting. These
tests were updated to use pure operations for operations in the moved
slice, while keeping unregistered ops for operations that aren't moved
(e.g., "before"(), "foo"()). This ensures that tests continue to
exercise their intended functionality without being blocked by the new
side-effect check.
2026-01-23 14:10:42 +01:00
Matthias Springer
c4750d0575
[mlir] Consolidate patterns into RegionBranchOpInterface patterns (#174094)
Instead of op-specific cleanup patterns for region branch ops to remove
unused results / block arguments, etc., add a set of patterns that can
handle all `RegionBranchOpInterface` ops. These patterns are enabled
only for selected SCF dialect ops at the moment:
* `scf.execute_region`
* `scf.for`
* `scf.if`
* `scf.index_switch`
* `scf.while`

It is currently not possible to register canonicalization patterns for op
interfaces and some ops have incorrect interface implementations. In
follow-up PRs, the set of ops will be gradually extended within the SCF
dialect (`scf.forall`) and across other dialects
(`gpu.warp_execute_on_lane0`, (maybe) various affine dialect ops, ...),
and maybe eventually to apply to all `RegionBranchOpInterface` ops.

This commit removes many similar canonicalization patterns from the SCF
dialect. The newly added canonicalization patterns allow users to get
the same canonicalizations for free for their own ops. And even a few
additional new canonicalizations
([example](https://github.com/llvm/llvm-project/pull/174094/files#diff-54318cd685386d5519c42be49818e388b09d934edcbe4280548baa3601802977R2241),
[example](https://github.com/llvm/llvm-project/pull/174094/files#diff-54318cd685386d5519c42be49818e388b09d934edcbe4280548baa3601802977R1101),
...).

Implementation outline: This commit adds 3 canonicalization patterns.
* `MakeRegionBranchOpSuccessorInputsDead`: Remove uses of successor
inputs, by swapping them for successor operand values.
* `RemoveDuplicateSuccessorInputUses`: Remove uses of successor inputs
that are duplicates. (Similar to `WhileRemoveDuplicatedResults` in the
SCF dialect.)
* `RemoveDeadRegionBranchOpSuccessorInputs`: Remove dead successor
inputs if all of their "tied" successor inputs are also dead. (Similar
to `WhileUnusedResult` in the SCF dialect.)
2026-01-13 07:22:09 +00:00
Matthias Springer
82c1f9435d
[mlir][Transforms] remove-dead-values: Rely on canonicalizer for region simplification (#173505)
This commit simplifies the `remove-dead-values` pass and fixes a bug in
the handling of `RegionBranchOpInterface` ops. The pass used to produce
invalid IR ("null value found") for the newly added test case.

`remove-dead-values` is a pass for additional IR simplification that
cannot be performed by the canonicalizer pass. Based on a liveness
analysis, it erases dead values / IR. (The liveness analysis is a
dataflow analysis that has more information about the IR than a
canonicalization pattern, which can see only "local" information.)

Region-based ops are difficult. The liveness analysis may determine that
an SSA value is dead. However, that does not mean that the value can
actually be removed. Doing so may violate a region data flow (as
modeled by the `RegionBranchOpInterface`). As an example, consider the
case where a region branch terminator may dispatch to one of two region
successors with the same forwarded values. A successor input (block
argument) can be erased only if it is dead on both successors.

Before this commit, there used to be complex logic to determine when it
is safe to erase an SSA value. That logic was broken. The new
implementation does not remove any block arguments or op results of
region-based ops. Instead, operands of region-based ops and region
branch terminators are replaced with `ub.poison` if all of their
successor values are dead. This simplifies the IR well enough for the
canonicalizer to perform the remaining region simplification (i.e.,
dropping block arguments etc.).
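
The "dead on all successors" condition can be sketched as a small standalone predicate. This is not the pass's implementation: `LivenessTable` and `isDeadOnAllSuccessors` are hypothetical, and liveness is assumed to be given as a precomputed table.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// liveOnSuccessor[s][i] is true if the value forwarded to input `i` of
// successor `s` is live there (hypothetical layout for illustration).
using LivenessTable = std::vector<std::vector<bool>>;

// A forwarded operand is a candidate for replacement by a poison placeholder
// only if the corresponding input is dead on every successor.
bool isDeadOnAllSuccessors(const LivenessTable &liveOnSuccessor,
                           std::size_t input) {
  return std::all_of(
      liveOnSuccessor.begin(), liveOnSuccessor.end(),
      [input](const std::vector<bool> &succ) { return !succ[input]; });
}
```

An input that is dead on one successor but live on another is left alone, which matches the two-successor example in the message.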

RFC:
https://discourse.llvm.org/t/rfc-delegate-simplification-of-region-based-ops-from-remove-dead-values-to-canonicalizer/89194
2026-01-07 14:51:40 +01:00
Ben Vanik
4ac6431755
[mlir] Fix crash in dropRedundantArguments with produced operands. (#172759)
dropRedundantArguments was incorrectly indexing into forwardedOperands
using the block argument index directly. This crashes when the block has
produced operands (generated by the terminator, not forwarded from
predecessors) because forwardedOperands doesn't include them.

The fix checks isOperandProduced() to skip produced arguments and uses
SuccessorOperands::operator[] which handles the offset correctly.
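
The indexing problem can be illustrated with a simplified standalone model. `SuccessorOperandsModel` is hypothetical and stores values as strings. It assumes that produced operands precede forwarded operands in the block argument list.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model: the first `producedCount` block arguments are produced
// by the terminator itself; the remaining ones receive forwarded values.
struct SuccessorOperandsModel {
  std::size_t producedCount;
  std::vector<std::string> forwarded;

  std::size_t size() const { return producedCount + forwarded.size(); }

  bool isOperandProduced(std::size_t argIndex) const {
    return argIndex < producedCount;
  }

  // Maps a block argument index to its forwarded value, applying the offset.
  // Returns std::nullopt for produced arguments. Requires argIndex < size().
  std::optional<std::string> operator[](std::size_t argIndex) const {
    if (isOperandProduced(argIndex))
      return std::nullopt;
    return forwarded[argIndex - producedCount];
  }
};
```

Indexing `forwarded` directly with the block argument index would read the wrong element, or run past the end, as soon as `producedCount` is nonzero.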
2026-01-06 22:53:28 -08:00
lonely eagle
0394ad1bfa
[mlir][dataflow] Add new visitNonControlFlowArgumentst API to SparseBackwardDataFlowAnalysis and apply it in LivenessAnalysis/RemoveDeadValues (#169816)
Add the visitNonControlFlowArgumentst API to SparseBackwardDataFlowAnalysis.
The current SparseBackwardDataFlowAnalysis cannot access all SSA values,
such as a loop's IV; visitNonControlFlowArgumentst can now be used to
visit them. Applying it in LivenessAnalysis/RemoveDeadValues solves the
issue of IV liveness in loops.
https://discourse.llvm.org/t/rfc-add-visitbranchregionargument-interface-to-sparsedataflowanalysis/89061
2026-01-02 17:10:05 +08:00
Matthias Springer
78711b66bd
[mlir][Transforms] Legalize nested operations (#172158)
This commit aligns the implementation of
`ConversionPatternRewriter::legalize` with its documentation:

```
  /// Attempt to legalize the given region. This can be used within
  ...
  LogicalResult legalize(Region *r);
```

This function now legalizes the entire region, including nested ops. The
implementation follows the same logic as the "main" traversal:
pre-order, forward-dominance.
2025-12-16 16:47:47 +01:00
Quinn Dawkins
bb17dfa7d1
[mlir][bufferization] Enable moving dependent values in eliminate-empty-tensors (#169718)
Currently, empty tensor elimination (which constructs a SubsetExtractionOp
to match a SubsetInsertionOp at the end of a DPS chain) fails if any
operands required by the insertion op don't dominate the insertion point
for the extraction op.

This change improves the transformation by attempting to move all pure
producers of required operands to the insertion point of the extraction
op. In the process this improves a number of tests for empty tensor
elimination.
2025-12-05 14:40:08 -05:00
Matthias Springer
e6110cb339
[mlir][Transforms] Fix crash in -remove-dead-values on private functions (#169269)
This commit fixes two crashes in the `-remove-dead-values` pass related
to private functions.

Private functions are considered entirely "dead" by the liveness
analysis, which drives the `-remove-dead-values` pass.

The `-remove-dead-values` pass removes dead block arguments from private
functions. Private functions are entirely dead, so all of their block
arguments are removed. However, the pass did not correctly update all
users of these dropped block arguments.

1. A side-effecting operation must be removed if one of its operands is
dead. Otherwise, the operation would end up with a NULL operand. Note:
The liveness analysis would not have marked an SSA value as "dead" if it
had reachable side-effecting users. (Therefore, it is safe to erase
such side-effecting operations.)
2. A branch operation must be removed if one of its non-forwarded
operands is dead. (E.g., the condition value of a `cf.cond_br`.)
Whenever a terminator is removed, a `ub.unreachable` operation is
inserted. This fixes #158760.
2025-12-03 08:35:05 +01:00
Matthias Springer
504b507896
[mlir][Transforms] Dialect conversion: Add support for replaceUsesWithIf (#169606)
This commit adds support for `replaceUsesWithIf` (and variants such as
`replaceAllUsesExcept`) to the `ConversionPatternRewriter`. This API is
supported only in no-rollback mode. An assertion is triggered in
rollback mode. (This missing assertion has been confusing for users
because it seemed that the API was supported, while it was actually not
working properly.)

This commit brings us a bit closer towards removing
[this](76ec25f729/mlir/lib/Transforms/Utils/DialectConversion.cpp (L1214))
workaround.

Additional changes are needed to support this API in rollback mode. In
particular, no entries should be added to the `ConversionValueMapping`
for conditional replacements. It's unclear at this point if this API can
be supported in rollback mode, so this is deferred to later.

This commit turns `replaceUsesWithIf` into a virtual function, so that
the `ConversionPatternRewriter` can override it. All other API functions
for conditional value replacements call that function.

Note for LLVM integration: If you are seeing failed assertions due to
this change, you are using unsupported API in your dialect conversion.
You have 3 options: (1) Migrate to the no-rollback driver. (2) Rewrite
your patterns without the unsupported API. (3) Last resort: bypass the
rewriter and call `replaceUsesWithIf` etc. directly on the `Value`
object.
2025-11-27 01:54:07 +00:00
lonely eagle
765208b313
[mlir] Make remove-dead-values remove block and successorOperands before delete ops (#166766)
Reland https://github.com/llvm/llvm-project/pull/165725, fixing the failed
test by removing successor operands before deleting operations. Following
the deletion of cond.branch, its successor operands will subsequently be
removed.
2025-11-20 13:55:09 +08:00
Jeremy Furtek
9349a10f93
Fix side effects for LLVM integer operations (udiv, sdiv) incorrectly marked as Pure (#166648)
This MR modifies side effect traits of some integer arithmetic
operations in the LLVM dialect.

Prior to this MR, the LLVM dialect `sdiv` and `udiv` operations were
marked as `Pure` through `tblgen` inheritance of the
`LLVM_ArithmeticOpBase` class. The `Pure` trait allowed incorrect
hoisting of `sdiv`/`udiv` operations by the
`loop-invariant-code-motion` pass.

This MR modifies the `sdiv` and `udiv` LLVM operations to have traits
and code motion behavior identical to their counterparts in the `arith`
dialect, which were established by the commit/review below.


ed39825be4
https://reviews.llvm.org/D137814
2025-11-17 09:46:56 -08:00
Veera
996639d6eb
[MLIR][BufferResultsToOutParamsPass] Add Option to Modify Public Function's Signature (#167248)
Since https://github.com/llvm/llvm-project/pull/162441,
`buffer-results-to-out-params` transforms `private` functions only.

But, as mentioned in
https://github.com/llvm/llvm-project/pull/162441#issuecomment-3404195242,
this is a breaking change for pipelines handling C code. Our pipeline
@EfficientComputer is also affected by this breaking change.

Therefore, this PR adds an opt-in flag to allow `public` functions to be
transformed by `BufferResultsToOutParamsPass`.
2025-11-09 21:08:22 -08:00
lonely eagle
55fb1caf8a
Revert "[mlir] Make remove-dead-values pass remove blocks arguments first" (#166746)
Reverts llvm/llvm-project#165725. See
https://lab.llvm.org/buildbot/#/builders/169/builds/16768.
2025-11-06 11:03:19 +00:00
lonely eagle
a928c61961
[mlir] Make remove-dead-values pass remove blocks arguments first (#165725)
Fix https://github.com/llvm/llvm-project/issues/163051. For ops that
have multiple blocks, first remove the dead block arguments within
their blocks before deleting the ops.
2025-11-06 16:36:14 +08:00
Matthias Springer
a38e094240
[mlir] Dialect Conversion: Add support for post-order legalization order (#166292)
By default, the dialect conversion driver processes operations in
pre-order: the initial worklist is populated pre-order. (New/modified
operations are immediately legalized recursively.)

This commit adds a new API for selective post-order legalization.
Patterns can request an operation / region legalization via
`ConversionPatternRewriter::legalize`. They can call these helper
functions on nested regions before rewriting the operation itself.
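
To make the two orders concrete, here is a small standalone traversal sketch. It is not the conversion driver: `OpNode`, `visitPreOrder` and `visitPostOrder` are hypothetical, and an operation is modeled only by its name and its nested operations.

```cpp
#include <string>
#include <vector>

// Hypothetical model of an operation with nested operations.
struct OpNode {
  std::string name;
  std::vector<OpNode> nested;
};

// Pre-order: the parent is visited before the operations nested inside it.
void visitPreOrder(const OpNode &op, std::vector<std::string> &order) {
  order.push_back(op.name);
  for (const OpNode &child : op.nested)
    visitPreOrder(child, order);
}

// Post-order: nested operations are visited before their parent.
void visitPostOrder(const OpNode &op, std::vector<std::string> &order) {
  for (const OpNode &child : op.nested)
    visitPostOrder(child, order);
  order.push_back(op.name);
}
```

In this model, a pattern that legalizes nested regions before rewriting the enclosing operation corresponds to the post-order visit for that subtree.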

Note: In rollback mode, a failed recursive legalization typically leads
to a conversion failure. Since recursive legalization is performed by
separate pattern applications, there is no way for the original pattern
to recover from such a failure.
2025-11-05 21:04:32 +09:00
Matthias Springer
d4c41b7fa3
[mlir][Transforms] Dialect Conversion: Convert entry block only (#165180)
When converting a function, convert only the entry block signature. The
remaining block signatures should be converted by the respective
branching ops. The `FuncToLLVM` / `ControlFlowToLLVM` patterns already
use that design.

```c++
struct BranchOpLowering : public ConvertOpToLLVMPattern<cf::BranchOp> {

  LogicalResult
  matchAndRewrite(cf::BranchOp op, OneToNOpAdaptor adaptor,
                  ConversionPatternRewriter &rewriter) const override {
    // Convert successor block.
    SmallVector<Value> flattenedAdaptor = flattenValues(adaptor.getOperands());
    FailureOr<Block *> convertedBlock =
        getConvertedBlock(rewriter, getTypeConverter(), op, op.getSuccessor(),
                          TypeRange(ValueRange(flattenedAdaptor)));
    // ...
  }
};
```

This is consistent with the fact that operations from unreachable blocks
are not put on the initial worklist.

With this change, parent ops are no longer recursively legalized when
inserting a block, simplifying the conversion driver a bit.

Note for LLVM integration: If you are seeing failures, make sure to:
- Drop `converter.isLegal(&op.getBody())` when checking the legality of
a function op. Only the entry block signature / function type should be
taken into account.
- If you need to convert all reachable blocks and are using `cf`
branching ops, add `populateCFStructuralTypeConversionsAndLegality`.
- If you need to convert all reachable blocks and are using custom
branching ops, implement and populate custom structural type conversion
patterns, similar to `populateCFStructuralTypeConversionsAndLegality`.
2025-11-03 23:34:52 +00:00
Matthias Springer
ca84e9e826
[mlir][CF] Add structural type conversion patterns (#165629)
Add structural type conversion patterns for CF dialect ops. These
patterns are similar to the SCF structural type conversion patterns.

This commit adds missing functionality and is in preparation of #165180,
which changes the way blocks are converted. (Only entry blocks are
converted.)
2025-10-29 18:12:40 -07:00
lonely eagle
e665f245f5
[mlir] Delete unroll-full option for Affine/SCF unroll pass (#164658)
Make the unroll-factor take -1 as "full" and avoid potential conflict
when passing both an explicit factor and unroll-full=true.
2025-10-24 02:45:39 +08:00
lonely eagle
d6e2143b06
[mlir][dataflow] Fix LivenessAnalysis/RemoveDeadValues handling of loop induction variables (#161117)
Fix https://github.com/llvm/llvm-project/issues/157934. In liveness
analysis, variables that are not analyzed are set as dead variables, but
some variables are definitely live.

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-10-21 18:48:43 +08:00
lonely eagle
71586a6a73
[mlir][bufferize] Make buffer-results-to-out-params support only functions that are neither public nor extern (#162441)
The callers of public or extern functions are unknown, so their function
signatures cannot be changed.
2025-10-08 16:54:02 +08:00
lonely eagle
1087c1079f
[mlir][bufferize] Add hoist-dynamic-allocs-option to buffer-results-to-out-params (#160985)
Add hoist-dynamic-allocs-option to buffer-results-to-out-params. This PR
adds support for obtaining the size of a dynamically shaped memref
through the caller-callee relationship.
2025-10-06 17:05:21 +08:00
Mehdi Amini
7ab7bc7274
[MLIR] Fix LivenessAnalysis/RemoveDeadValues handling of dead function arguments (#160755)
In #153973 I added correct handling of block arguments, but
unfortunately this was gated on operations that also have results. This
wasn't intentional, and it excluded operations like functions from being
correctly processed.
2025-09-26 13:47:46 +00:00
Francisco Geiman Thiesen
3e746bd8fb
Allowing RDV to call getArgOperandsMutable() (#160415)
## Problem

`RemoveDeadValues` can legally drop dead function arguments on private
`func.func` callees. But call-sites to such functions aren't fixed if
the call operation keeps its call arguments in a **segmented operand
group** (i.e., uses `AttrSizedOperandSegments`), unless the call op
implements `getArgOperandsMutable` and the RDV pass actually uses it.

## Fix
When RDV decides to drop callee function args, it should, for each
call-site that implements `CallOpInterface`, **shrink the call's
argument segment** via `getArgOperandsMutable()` using the same dead-arg
indices. This keeps both the flat operand list and the
`operand_segment_sizes` attribute in sync (that's what
`MutableOperandRange` does when bound to the segment).
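
The bookkeeping described above can be sketched with a toy model. This is not the pass's code and does not use `MutableOperandRange`: `ToyCall` and `eraseDeadCallArgs` are hypothetical names, and the operand segments are modelled as a plain vector of sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a call op with segmented operands (not an MLIR class).
struct ToyCall {
  std::vector<std::string> operands;     // flat operand list
  std::vector<std::size_t> segmentSizes; // one size per operand group
};

// Erase the arguments flagged in `deadArgs` (indexed relative to the
// segment `argSegment`) and shrink that segment's size accordingly, so the
// flat list and the segment sizes stay in sync.
void eraseDeadCallArgs(ToyCall &call, std::size_t argSegment,
                       const std::vector<bool> &deadArgs) {
  assert(argSegment < call.segmentSizes.size());
  assert(deadArgs.size() == call.segmentSizes[argSegment]);

  // The segment starts after all preceding segments in the flat list.
  std::size_t start = 0;
  for (std::size_t i = 0; i < argSegment; ++i)
    start += call.segmentSizes[i];

  // Walk backwards so that earlier indices stay valid while erasing.
  for (std::size_t i = deadArgs.size(); i-- > 0;) {
    if (!deadArgs[i])
      continue;
    call.operands.erase(call.operands.begin() + start + i);
    --call.segmentSizes[argSegment];
  }
}
```

With segments of sizes 1, 3, 1 and the middle argument of the second segment marked dead, the flat list loses exactly that operand and the sizes become 1, 2, 1; the neighbouring segments are untouched.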

## Note
This change is a no-op for:
* call ops without segment operands (they still get their flat operands
erased via the generic path)
* call ops whose callee args weren't dropped (public, external,
non-`func.func`, unresolved symbol, etc.)
* `llvm.call`/`llvm.invoke` (RDV doesn't drop `llvm.func` args)

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-09-26 15:30:46 +02:00
Fabian Mora
077a796c0d
[mlir] Implement a memory-space cast bubbling-down transform (#159454)
This commit adds functionality to bubble down memory-space cast
operations, allowing consumer operations to use the original
memory space rather than first casting to a different memory space.

Changes:
- Introduce `MemorySpaceCastOpInterface` to handle memory-space cast
operations
- Create a `MemorySpaceCastConsumerOpInterface` pass that identifies and
bubbles down eligible casts
- Add implementation for memref and vector operations to handle
memory-space cast propagation
- Add `bubbleDownCasts` method to relevant operations to support the
fusion

In particular, in the current implementation only memory-space casts
into the default memory-space can be bubbled-down.

Example:

```mlir
func.func @op_with_cast_sequence(%arg0: memref<4x4xf32, 1>, %arg1: index, %arg2: f32) -> memref<16xf32> {
    %memspacecast = memref.memory_space_cast %arg0 : memref<4x4xf32, 1> to memref<4x4xf32>
    %c0 = arith.constant 0 : index
    %c4 = arith.constant 4 : index
    %expanded = memref.expand_shape %memspacecast [[0], [1, 2]] output_shape [4, 2, 2] : memref<4x4xf32> into memref<4x2x2xf32>
    %collapsed = memref.collapse_shape %expanded [[0, 1, 2]] : memref<4x2x2xf32> into memref<16xf32>
    %loaded = memref.load %collapsed[%c0] : memref<16xf32>
    %added = arith.addf %loaded, %arg2 : f32
    memref.store %added, %collapsed[%c0] : memref<16xf32>
    %atomic_result = memref.atomic_rmw addf %arg2, %collapsed[%c4] : (f32, memref<16xf32>) -> f32
    return %collapsed : memref<16xf32>
}
// mlir-opt --bubble-down-memory-space-casts
func.func @op_with_cast_sequence(%arg0: memref<4x4xf32, 1>, %arg1: index, %arg2: f32) -> memref<16xf32> {
    %c4 = arith.constant 4 : index
    %c0 = arith.constant 0 : index
    %expand_shape = memref.expand_shape %arg0 [[0], [1, 2]] output_shape [4, 2, 2] : memref<4x4xf32, 1> into memref<4x2x2xf32, 1>
    %collapse_shape = memref.collapse_shape %expand_shape [[0, 1, 2]] : memref<4x2x2xf32, 1> into memref<16xf32, 1>
    %memspacecast = memref.memory_space_cast %collapse_shape : memref<16xf32, 1> to memref<16xf32>
    %0 = memref.load %collapse_shape[%c0] : memref<16xf32, 1>
    %1 = arith.addf %0, %arg2 : f32
    memref.store %1, %collapse_shape[%c0] : memref<16xf32, 1>
    %2 = memref.atomic_rmw addf %arg2, %collapse_shape[%c4] : (f32, memref<16xf32, 1>) -> f32
    return %memspacecast : memref<16xf32>
}
```

---------

Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-09-24 09:11:43 -04:00
Ian Wood
2dd3d3852d
[MLIR] getBackwardSlice: don't bail on ops that are IsolatedFromAbove (#158135)
Ops with the `IsIsolatedFromAbove` trait should be captured by the
backward slice.

---------

Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
2025-09-22 10:48:40 -07:00
Matthias Springer
2929a2978c
[mlir][Transforms] Add support for ConversionPatternRewriter::replaceAllUsesWith (#155244)
This commit generalizes `replaceUsesOfBlockArgument` to
`replaceAllUsesWith`. In rollback mode, the same restrictions keep
applying: a value cannot be replaced multiple times and a call to
`replaceAllUsesWith` will replace all current and future uses of the
`from` value.

`replaceAllUsesWith` is now fully supported and its behavior is
consistent with the remaining dialect conversion API. Before this
commit, `replaceAllUsesWith` was immediately reflected in the IR when
running in rollback mode. After this commit, `replaceAllUsesWith`
changes are materialized in a delayed fashion, at the end of the dialect
conversion. This is consistent with the `replaceUsesOfBlockArgument` and
`replaceOp` APIs.
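
A toy model of the delayed semantics, assuming nothing about the actual driver internals: `ToyDelayedRewriter` and its members are hypothetical names, values are plain strings, and only the "record now, materialize at the end" behavior and the replace-at-most-once restriction are modelled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy rewriter that records replacements and applies them at the end.
class ToyDelayedRewriter {
public:
  // Record that `from` should be replaced by `to`. Returns false if `from`
  // was already replaced (a value cannot be replaced multiple times).
  bool replaceAllUsesWith(const std::string &from, const std::string &to) {
    assert(from != to && "replacing a value with itself is meaningless here");
    return replacements.emplace(from, to).second;
  }

  // Add a use of `value`; nothing is rewritten at this point.
  void addUse(const std::string &value) { uses.push_back(value); }

  // Materialize the recorded replacements. Uses added before and after the
  // replacement request are rewritten alike.
  std::vector<std::string> finalize() const {
    std::vector<std::string> result;
    for (const std::string &use : uses) {
      auto it = replacements.find(use);
      result.push_back(it == replacements.end() ? use : it->second);
    }
    return result;
  }

private:
  std::map<std::string, std::string> replacements;
  std::vector<std::string> uses;
};
```

In this sketch, a use of `%x` added after `replaceAllUsesWith("%x", "%y")` still ends up as `%y` once `finalize()` runs, and a second attempt to replace `%x` is rejected.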

`replaceAllUsesExcept` etc. are still not supported and will be
deactivated on the `ConversionPatternRewriter` (when running in rollback
mode) in a follow-up commit.

Note for LLVM integration: Replace `replaceUsesOfBlockArgument` with
`replaceAllUsesWith`. If you are seeing failures, you may have patterns
that use `replaceAllUsesWith` incorrectly (e.g., being called multiple
times on the same value) or bypass the rewriter API entirely. E.g., such
failures were mitigated in Flang by switching to the walk-patterns
driver (#156171).

You can temporarily reactivate the old behavior by calling
`RewriterBase::replaceAllUsesWith`. However, note that that behavior is
faulty in a dialect conversion. E.g., the base
`RewriterBase::replaceAllUsesWith` implementation does not see uses of
the `from` value that have not materialized yet and will, therefore, not
replace them.
2025-09-06 11:17:55 +02:00