When the affine.parallel op was introduced, affine utilities weren't
extended to handle it. Extending these is straightforward and natural
given that addAffineParallelOpDomain has also been added.
Update/complete memref region compute to account for affine.parallel
ops. Handle failure cleanly.
Add and expose utilities missing for affine.parallel to be consistent
with affine.for.
All of these allow various affine passes to work with a combination of
affine.parallel and affine.for ops.
Differential Revision: https://reviews.llvm.org/D145669
The inliner pass performs canonicalization when created programtically, run with `mlir-opt` with default options, or when explicitly specified. However, when running the pipeline resulting from a `-dump-pass-pipeline` on a default inline pass, the canonicalization is not performed as part of the inlining. This is because the default value for the `default-pipeline` option of the inline pass is an empty string, and this is selected during the dumping. When `InlinerPass::initializeOptions` detects the empty string, it sets the `defaultPipeline` to `nullptr`, which was previously set to canonicalize in the `InlinerPass` constructor, thus the canonicalization is not performed.
The added test checks if the inline pass performs canonicalization by default, and that the dumped `default-pipeline` is set to `canonicalize`.
Fixes: https://github.com/llvm/llvm-project/issues/60960
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D145066
Fix upper bound constraint addition in addAffineParallelOpDomain; it was
off by one in the case of constants.
Differential Revision: https://reviews.llvm.org/D144836
Fix obvious bug in `addAffineParallelOpDomain` that would lead to
incorrect domain constraints for any affine.parallel op with
dimensionality greater than one.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D144349
When changing IR in a RewriterPattern, all changes must go through the
rewriter. There are several convenience functions in RewriterBase that
help with high-level modifications, such as replaceAllUsesWith for
Values, but there is currently none to do the same task for Blocks.
Reviewed By: mehdi_amini, ingomueller-net
Differential Revision: https://reviews.llvm.org/D142525
Nested tuples were only supported in some narrow edge cases (and
potentially only because the test ops like `test.make_tuple` aren't
properly verified). This patch adds a couple of test cases with tested
tuple types and makes them work in the test pass by extending the
argument materialization and decomposition functions.
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D143579
There were issues with the CSE equivalence analysis that have been fixed with D142558. This makes it possible to CSE ops with multiple regions.
Differential Revision: https://reviews.llvm.org/D142562
Previously, the CallOpSignatureConversion pattern would assert if
function signature change affected the number of results. Fail the
pattern instead and let the caller propagate failure.
Fixes#60186.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D142624
This change adds `allErased` to the `applyOpPatternsAndFold(ArrayRef<Operation *>, ...)` overload. This overload now supports all functionality that is also supported by `applyOpPatternsAndFold(Operation *, ...)` and can be used as a replacement.
This change has no performance implications when `allErased = nullptr`.
The single-operation overload is removed in a subsequent NFC change.
Differential Revision: https://reviews.llvm.org/D141920
There are now three options:
* `AnyOp` (previously `false`)
* `ExistingAndNewOps` (previously `true`)
* `ExistingOps`: this one is new.
The last option corresponds to what the `applyOpPatternsAndFold(Operation*, ...)` overload is doing. It is now also supported on the `applyOpPatternsAndFold(ArrayRef<Operation *>, ...)` overload.
Differential Revision: https://reviews.llvm.org/D141904
Drop unnecessary bailout in checkMemRefAccessDependence in the presence of
surrounding affine.parallel ops. When the affine.parallel op was added, affine
analysis methods weren't extended to account for it. Fix this and allow memref
dependence check to work in the presence of affine.parallel ops in the mix.
Rename isForInductionVar -> isAffineForInductionVar, getLoopIVs ->
getAffineForIVs to avoid confusion since that's what they were.
Differential Revision: https://reviews.llvm.org/D141254
Canonicalize affine set + operands in addAffineIfOpDomain. This is to
ensure a unique set of operands for FlatAffineValueConstraints and in
general to provide a simplified set of constraints. For the latter
scenario, this just leads to efficiency improvements as opposed to
functionality. While on this, remove outdated/stale stuff from
AffineStructures.h.
Fixes: https://github.com/llvm/llvm-project/issues/59461
Differential Revision: https://reviews.llvm.org/D141250
When `strict = true`, only pre-existing and newly-created ops are rewritten and/or folded. Such ops are stored in `strictModeFilteredOps`.
Newly-created ops were previously added to `strictModeFilteredOps` after calling `addToWorklist` (via `GreedyPatternRewriteDriver::notifyOperationInserted`). Therefore, newly-created ops were never added to the worklist.
Also fix a test case that should have gone into an infinite loop (`test.replace_with_new_op` was replaced with itself, which should have caused the op to be rewritten over and over), but did not due to this bug.
Differential Revision: https://reviews.llvm.org/D141141
This new option is set to `false` by default. It should be set only in Canonicalizer tests to detect faulty canonicalization patterns. I.e., patterns that prevent the canonicalizer from converging. The canonicalizer should always convergence on such small unit tests that we have in `canonicalize.mlir`.
Two faulty canonicalization patterns were detected and fixed with this change.
Differential Revision: https://reviews.llvm.org/D140873
The affine fusion pass can actually work on the top-level of a `Block`
and doesn't require to be called on a `FuncOp`. Remove this restriction
and generalize the pass to work on any `Block`. This allows fusion to be
performed, for example, on multiple blocks of a FuncOp or any
region-holding op like an scf.while, scf.if or even at an inner depth of
an affine.for or affine.if op. This generalization has no effect on
existing functionality. No changes to the fusion logic or its
transformational power were needed.
Update fusion pass to be a generic operation pass (instead of FuncOp
pass) and remove references and assumptions on the parent being a
FuncOp.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D139293
Check for aliases for escaping memrefs check in affine fusion pass. Fix
`isEscapingMemRef` to handle unknown defining ops for the memref.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D139268
This covers more options for CSE. It also ensures that two operations
that have same operands but different regions to begin with, but same
regions after `simplifyRegions`, don't get both added to the list of
`knownValues`.
Fixes#59135
Differential Revision: https://reviews.llvm.org/D139490
Support affine.parallel in the index set analysis. It allows us to do dependence analysis containing affine.parallel in addition to affine.for and affine.if. This change only supports the constant lower/upper bound in affine.parallel. Other complicated affine map bounds will be supported in further commits.
See https://github.com/llvm/llvm-project/issues/57327
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D136056
This is generated by running
```
sed --in-place 's/[[:space:]]\+$//' mlir/**/*.td
sed --in-place 's/[[:space:]]\+$//' mlir/**/*.mlir
```
Reviewed By: rriddle, dcaballe
Differential Revision: https://reviews.llvm.org/D138866
Pack and Unpack return new tensors within which the individual elements
are reshuffled according to the packing specification. This has the
consequence of modifying the canonical order in which a given operator
(i.e., Matmul) accesses the individual elements. After bufferization,
this typically translates to increased access locality and cache
behavior improvement, e.g., eliminating cache line splitting.
Co-authored-by: Mahesh Ravishankar <ravishankarm@google.com>
Co-authored-by: Han-Chung Wang <hanchung@google.com>
RFC: https://discourse.llvm.org/t/rfc-tensor-pack-and-tensor-unpack/66408/1
Reviewed By: nicolasvasilache, rengolin, hanchung
Differential Revision: https://reviews.llvm.org/D138119
Currently CSE does not support CSE of ops with regions. This patch
extends the CSE support to ops with a single region.
Differential Revision: https://reviews.llvm.org/D134306
Depends on D137857
Up until now PDL(L) has not supported dialect conversion because we had no
way of remapping values or integrating with type conversions. This commit
rectifies that by adding a new "pattern configuration" concept to PDL. This
essentially allows for attaching external configurations to patterns, which
can hook into pattern events (for now just the scope of a rewrite, but we
could also pass configs to native rewrites as well). This allows for injecting
the type converter into the conversion pattern rewriter.
Differential Revision: https://reviews.llvm.org/D133142
The KeyTy of attribute/type storage classes provide enough information for
automatically implementing the necessary sub element interface methods. This
removes the need for derived classes to do it themselves, which is both much
nicer and easier to handle certain invariants (e.g. null handling). In cases where
explicitly handling for parameter types is necessary, they can provide an implementation
of `AttrTypeSubElementHandler` to opt-in to support.
This tickles a few things alias wise, which annoyingly messes with tests that hard
code specific affine map numbers.
Differential Revision: https://reviews.llvm.org/D137374
In D134622 the printed form of a pass manager is changed to include the
name of the op that the pass manager is anchored on. This updates the
`-pass-pipeline` argument format to include the anchor op as well, so
that the printed form of a pipeline can be directly passed to
`-pass-pipeline`. In most cases this requires updating
`-pass-pipeline='pipeline'` to
`-pass-pipeline='builtin.module(pipeline)'`.
This also fixes an outdated assert that prevented running a
`PassManager` anchored on `'any'`.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D134900
for (I = Start; I < End; I += 1) always terminates so mark
{scf|affine}.for as RecursivelySpeculatable when step is known to be
1.
Reviewed By: chelini
Differential Revision: https://reviews.llvm.org/D136376
We currently only support one level of aliases, which isn't great
in situations where an attribute/type can have multiple duplicated
components nested within it(e.g. debuginfo metadata). This commit
refactors alias generation to support nested aliases, which requires
changing alias grouping to take into account the depth of child
aliases, to ensure that attributes/types aren't printed before the
aliases they use.
The only real user facing change here was that we no longer print
0 as an alias suffix, which would be unnecessarily expensive to keep
in the new alias generation method (and isn't that valuable of a
behavior to preserve).
Differential Revision: https://reviews.llvm.org/D136541
This change allows analyzing ops from different block, in particular when used in programs that have `cf` branches.
Differential Revision: https://reviews.llvm.org/D135644
These operations have undefined behavior if the index is not less than the rank of the source tensor / memref, so they cannot be freely speculated like they were before this patch. After this patch we speculate them only if we can prove that they don't have UB.
Depends on D135505.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D135748
This patch takes the first step towards a more principled modeling of undefined behavior in MLIR as discussed in the following discourse threads:
1. https://discourse.llvm.org/t/semantics-modeling-undefined-behavior-and-side-effects/4812
2. https://discourse.llvm.org/t/rfc-mark-tensor-dim-and-memref-dim-as-side-effecting/65729
This patch in particular does the following:
1. Introduces a ConditionallySpeculatable OpInterface that dynamically determines whether an Operation can be speculated.
2. Re-defines `NoSideEffect` to allow undefined behavior, making it necessary but not sufficient for speculation. Also renames it to `NoMemoryEffect`.
3. Makes LICM respect the above semantics.
4. Changes all ops tagged with `NoSideEffect` today to additionally implement ConditionallySpeculatable and mark themselves as always speculatable. This combined trait is named `Pure`. This makes this change NFC.
For out of tree dialects:
1. Replace `NoSideEffect` with `Pure` if the operation does not have any memory effects, undefined behavior or infinite loops.
2. Replace `NoSideEffect` with `NoSideEffect` otherwise.
The next steps in this process are (I'm proposing to do these in upcoming patches):
1. Update operations like `tensor.dim`, `memref.dim`, `scf.for`, `affine.for` to implement a correct hook for `ConditionallySpeculatable`. I'm also happy to update ops in other dialects if the respective dialect owners would like to and can give me some pointers.
2. Update other passes that speculate operations to consult `ConditionallySpeculatable` in addition to `NoMemoryEffect`. I could not find any other than LICM on a quick skim, but I could have missed some.
3. Add some documentation / FAQs detailing the differences between side effects, undefined behavior, speculatabilty.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D135505
Fix crash in normalizeMemRefType. Correctly handle scenario and replace
assertion with a failure.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D135424
Many tests still depend on specific names of SSA values (!!).
This commit is a best effort cleanup that will set the stage for adding some pretty SSA result names.
All relevant operations have been switched to primarily use the strided
layout, but still support the affine map layout. Update the relevant
tests to use the strided format instead for compatibility with how ops
now print by default.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D134045
Memref subview operation has been initially designed to work on memrefs with
strided layouts only and has never supported anything else. Port it to use the
recently added StridedLayoutAttr instead of extracting the strided from
implicitly from affine maps.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D133938
Introduce a new attribute to represent the strided memref layout. Strided
layouts are omnipresent in code generation flows and are the only kind of
layouts produced and supported by a half of operation in the memref dialect
(view-related, shape-related). However, they are internally represented as
affine maps that require a somewhat fragile extraction of the strides from the
linear form that also comes with an overhead. Furthermore, textual
representation of strided layouts as affine maps is difficult to read: compare
`affine_map<(d0, d1, d2)[s0, s1] -> (d0*32 + d1*s0 + s1 + d2)>` with
`strides: [32, ?, 1], offset: ?`. While a rudimentary support for parsing a
syntactically sugared version of the strided layout has existed in the codebase
for a long time, it does not go as far as this commit to make the strided
layout a first-class attribute in the IR.
This introduces the attribute and updates the tests that using the pre-existing
sugared form to use the new attribute instead. Most memref created
programmatically, e.g., in passes, still use the affine form with further
extraction of strides and will be updated separately.
Update and clean-up the memref type documentation that has gotten stale and has
been referring to the details of affine map composition that are long gone.
See https://discourse.llvm.org/t/rfc-materialize-strided-memref-layout-as-an-attribute/64211.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D132864
The current approach is convervative in which whenever there is a
non-normalizable operations in a function will the function be labelled
as non-normalizable. It means it requires that all operations must have
MemRefsNormalizable trait.
This patch relaxes the requirement that if the memref map layouts of a
non-normalizable operation are identity, this operation does not block
the normalization of the other operations in the same function.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D125854
This change add a helper function for computing a topological sorting of a list of ops. E.g. this can be useful in transforms where a subset of ops should be cloned without dominance errors.
The analysis reuses the existing implementation in TopologicalSortUtils.cpp.
Differential Revision: https://reviews.llvm.org/D131669
This reland includes changes to the Python bindings.
Switch variadic operand and result segment size attributes to use the
dense i32 array. Dense integer arrays were introduced primarily to
represent index lists. They are a better fit for segment sizes than
dense elements attrs.
Depends on D131801
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D131803
Switch variadic operand and result segment size attributes to use the
dense i32 array. Dense integer arrays were introduced primarily to
represent index lists. They are a better fit for segment sizes than
dense elements attrs.
Depends on D131738
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D131702
When dead-code analysis is run at the scope of a function, call ops to
other functions at the same level were being marked as unreachable,
since the analysis optimistically assumes the call op to have no known
predecessors and that all predecessors are known, but the callee would
never get visited.
This patch fixes the bug by checking if a referenced function is above
the top-level op of the analysis, and is thus considered an external
callable.
Fixes#56830
Reviewed By: zero9178
Differential Revision: https://reviews.llvm.org/D130829
Added a commutativity utility pattern and a function to populate it. The pattern sorts the operands of an op in ascending order of the "key" associated with each operand iff the op is commutative. This sorting is stable.
The function is intended to be used inside passes to simplify the matching of commutative operations. After the application of the above-mentioned pattern, since the commutative operands now have a deterministic order in which they occur in an op, the matching of large DAGs becomes much simpler, i.e., requires much less number of checks to be written by a user in her/his pattern matching function.
The "key" associated with an operand is the list of the "AncestorKeys" associated with the ancestors of this operand, in a breadth-first order.
The operand of any op is produced by a set of ops and block arguments. Each of these ops and block arguments is called an "ancestor" of this operand.
Now, the "AncestorKey" associated with:
1. A block argument is `{type: BLOCK_ARGUMENT, opName: ""}`.
2. A non-constant-like op, for example, `arith.addi`, is `{type: NON_CONSTANT_OP, opName: "arith.addi"}`.
3. A constant-like op, for example, `arith.constant`, is `{type: CONSTANT_OP, opName: "arith.constant"}`.
So, if an operand, say `A`, was produced as follows:
```
`<block argument>` `<block argument>`
\ /
\ /
`arith.subi` `arith.constant`
\ /
`arith.addi`
|
returns `A`
```
Then, the block arguments and operations present in the backward slice of `A`, in the breadth-first order are:
`arith.addi`, `arith.subi`, `arith.constant`, `<block argument>`, and `<block argument>`.
Thus, the "key" associated with operand `A` is:
```
{
{type: NON_CONSTANT_OP, opName: "arith.addi"},
{type: NON_CONSTANT_OP, opName: "arith.subi"},
{type: CONSTANT_OP, opName: "arith.constant"},
{type: BLOCK_ARGUMENT, opName: ""},
{type: BLOCK_ARGUMENT, opName: ""}
}
```
Now, if "keyA" is the key associated with operand `A` and "keyB" is the key associated with operand `B`, then:
"keyA" < "keyB" iff:
1. In the first unequal pair of corresponding AncestorKeys, the AncestorKey in operand `A` is smaller, or,
2. Both the AncestorKeys in every pair are the same and the size of operand `A`'s "key" is smaller.
AncestorKeys of type `BLOCK_ARGUMENT` are considered the smallest, those of type `CONSTANT_OP`, the largest, and `NON_CONSTANT_OP` types come in between. Within the types `NON_CONSTANT_OP` and `CONSTANT_OP`, the smaller ones are the ones with smaller op names (lexicographically).
---
Some examples of such a sorting:
Assume that the sorting is being applied to `foo.commutative`, which is a commutative op.
Example 1:
> %1 = foo.const 0
> %2 = foo.mul <block argument>, <block argument>
> %3 = foo.commutative %1, %2
Here,
1. The key associated with %1 is:
```
{
{CONSTANT_OP, "foo.const"}
}
```
2. The key associated with %2 is:
```
{
{NON_CONSTANT_OP, "foo.mul"},
{BLOCK_ARGUMENT, ""},
{BLOCK_ARGUMENT, ""}
}
```
The key of %2 < the key of %1
Thus, the sorted `foo.commutative` is:
> %3 = foo.commutative %2, %1
Example 2:
> %1 = foo.const 0
> %2 = foo.mul <block argument>, <block argument>
> %3 = foo.mul %2, %1
> %4 = foo.add %2, %1
> %5 = foo.commutative %1, %2, %3, %4
Here,
1. The key associated with %1 is:
```
{
{CONSTANT_OP, "foo.const"}
}
```
2. The key associated with %2 is:
```
{
{NON_CONSTANT_OP, "foo.mul"},
{BLOCK_ARGUMENT, ""}
}
```
3. The key associated with %3 is:
```
{
{NON_CONSTANT_OP, "foo.mul"},
{NON_CONSTANT_OP, "foo.mul"},
{CONSTANT_OP, "foo.const"},
{BLOCK_ARGUMENT, ""},
{BLOCK_ARGUMENT, ""}
}
```
4. The key associated with %4 is:
```
{
{NON_CONSTANT_OP, "foo.add"},
{NON_CONSTANT_OP, "foo.mul"},
{CONSTANT_OP, "foo.const"},
{BLOCK_ARGUMENT, ""},
{BLOCK_ARGUMENT, ""}
}
```
Thus, the sorted `foo.commutative` is:
> %5 = foo.commutative %4, %3, %2, %1
Signed-off-by: Srishti Srivastava <srishti.srivastava@polymagelabs.com>
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D124750
When this was updated in D127139 the update in-place case was no longer
marked as pessimistic. Add back in.
Differential Revision: https://reviews.llvm.org/D130453
Convert arith.cmpi to the canonical form with constants on the right side
to simplify further optimizations and open more opportunities for CSE.
Differential Revision: https://reviews.llvm.org/D129929
In the current state, this is only special cased for Allocation effects, but any effects on results allocated by the operation may be ignored when checking whether the op may be removed, as none of them are possible to be observed if the result is unused.
A use case for this is for IRs for languages which always initialize on allocation. To correctly model such operations, a Write as well as an Allocation effect should be placed on the result. This would prevent the Op from being deleted if unused however. This patch fixes that issue.
Differential Revision: https://reviews.llvm.org/D129854