As part of this extension this change also does some general cleanup
1) Make all the methods take `RewriterBase` as arguments instead of
creating their own builders that tend to crash when used within
pattern rewrites
2) Split `coalesePerfectlyNestedLoops` into two separate methods, one
for `scf.for` and other for `affine.for`. The templatization didnt
seem to be buying much there.
Also general clean up of tests.
This commit changes the API of `ValueBoundsConstraintSet`: the stop
condition is now passed to the constructor instead of `processWorklist`.
That makes it easier to add items to the worklist multiple times and
process them in a consistent manner. The current
`ValueBoundsConstraintSet` is passed as a reference to the stop
function, so that the stop function can be defined before the the
`ValueBoundsConstraintSet` is constructed.
This change is in preparation of adding support for branches.
We have no need to vectorize affine.apply inside the vectorizing loop.
However, we still need to generate it in the original scalar form. We
have to replace all its operands with the generated scalar operands in
the vectorizing loop, e.g., induction variables.
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`
The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.
Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
Generalize affine fusion to work at any inner depth; fusing loops inside
other
affine.for or even inside scf.for or scf.while nests. Apply in post
order on
all affine nests on the pass' top-level operation.
Fix MDG init for blocks inside other affine nests.
Relax unnecessary requirement for unique vars during merge and align of
FlatLinearValueConstraints. There are several cases where
FlatLinearValueConstraints need to have duplicate Values for the
dimensions:
for eg. in dependence relation systems with source and destination
accesses
could have common loop IVs. `mergeAndAlign` can be done even in the
presence
of Values reappearing by simply aligning from left to right in that
order.
While at this, drop outdated comments; improve some debug messages.
MemRefDependenceGraph::init should have been in affine analysis utils
since MemRefDependenceGraph is part of the affine analysis library; its
move was missed. Move it. NFC.
In the process of vectorization of the affine loop, the 0 vector size
causes the crash with building the invalid AffineForOp. We can catch the
case beforehand propagating to the assertion.
See: https://github.com/llvm/llvm-project/issues/64262
This commit implements `LoopLikeOpInterface` on `scf.while`. This
enables LICM (and potentially other transforms) on `scf.while`.
`LoopLikeOpInterface::getLoopBody()` is renamed to `getLoopRegions` and
can now return multiple regions.
Also fix a bug in the default implementation of
`LoopLikeOpInterface::isDefinedOutsideOfLoop()`, which returned "false"
for some values that are defined outside of the loop (in a nested op, in
such a way that the value does not dominate the loop). This interface is
currently only used for LICM and there is no way to trigger this bug, so
no test is added.
Fix affine LICM for side-effecting ops. The code was special-cased for
DMA ops. Generalize it and use isMemoryEffectFree.
Differential Revision: https://reviews.llvm.org/D154783
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.
Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.
Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.
Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
4. Some changes have been deleted for the following reasons:
- Some files had a variable also named cast
- Some files had not included a header file that defines the cast
functions
- Some files are definitions of the classes that have the casting
methods, so the code still refers to the method instead of the
function without adding a prefix or removing the method declaration
at the same time.
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
mlir/lib/**/IR/\
mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
mlir/test/lib/Dialect/Test/TestTypes.cpp\
mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
mlir/test/lib/Dialect/Test/TestAttributes.cpp\
mlir/unittests/TableGen/EnumsGenTest.cpp\
mlir/test/python/lib/PythonTestCAPI.cpp\
mlir/include/mlir/IR/
```
Differential Revision: https://reviews.llvm.org/D150123
There are now two entry points. One for shaped values and one for index-typed values. This addresses a comment in D146524.
Differential Revision: https://reviews.llvm.org/D147987
Use `reifyValueBound` instead, which is more general not hard-coded to a specific list supported ops.
Also add a `closedUB` parameter to the ValueBoundsOpInterface API.
Differential Revision: https://reviews.llvm.org/D146356
Affine Prefetch Op impelements AffineMapAccessInterface but does not implement
AffineReadOpInterface or AffineWriteOpInterface. Prefetch Op was cast to
AffineWriteOpinterface causing the crash.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D146836
If an `scf.for` loop yields an equal index-typed value or a shaped value with the same dimension sizes (in comparison to the corresponding iter_arg), bounds can be computed for the iter_arg and the OpResult of the `scf.for` op.
Differential Revision: https://reviews.llvm.org/D146306
Ops can implement this interface to specify lower/upper bounds for their result values and block arguments. Bounds can be specified for:
* Index-type values
* Dimension sizes of shapes values
The bounds are added to a constraint set. Users can query this constraint set to compute bounds wrt. to a user-specified set of values. Only EQ bounds are supported at the moment.
This revision also contains interface implementations for various tensor dialect ops, which illustrates how to implement this interface.
Differential Revision: https://reviews.llvm.org/D145681
Move out MemRefDependenceGraph analysis structure out of LoopFusion into
the Affine Analysis library. This had been a long pending TODO. Moving
MDG out allows its use in other affine passes as well as allows building
custom affine fusion passes downstream while reusing upstream fusion
utilties. The file LoopFusion.cpp had also become lengthy and this
change makes things more modular. This change is a pure NFC and is a
code movement.
NFC.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D147105
The affine loop utility `tilePerfectlyNestedLoops` was checking for the
validity of tiling as well as performing the tiling. This is
inconsistent with how other similar utilities work. Move out the
analysis/check from the utility so that the latter only performs the
mechanics of IR manipulation.
This is NFC/pure move beyond the change in behavior of
tilePerfectlyNestedLoops.
Differential Revision: https://reviews.llvm.org/D147055
Care is taken to order operands from least hoistable to most hoistable and to process subexpressions in the same
order.
This allows exposing more oppportunities for licm, cse and strength reduction.
Such a step should typically be applied while we still have loops in the IR and just before lowering affine ops to arith.
This is because the affine.apply canonicalization currently tries to maximally compose chains of affine.apply operations
and could undo the effects of these decompositions.
Depends on: D145784
Differential Revision: https://reviews.llvm.org/D145685
Remove unnecessary Block argument on MemRefDependenceGraph::init.
`block` is already a field on MDG.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D142597
`strictMode` is moved to GreedyRewriteConfig to simplify the API and state of rewriter classes. The region-based GreedyPatternRewriteDriver now also supports strict mode.
MultiOpPatternRewriteDriver becomes simpler: fewer method must be overridden.
Differential Revision: https://reviews.llvm.org/D142623
There are now three options:
* `AnyOp` (previously `false`)
* `ExistingAndNewOps` (previously `true`)
* `ExistingOps`: this one is new.
The last option corresponds to what the `applyOpPatternsAndFold(Operation*, ...)` overload is doing. It is now also supported on the `applyOpPatternsAndFold(ArrayRef<Operation *>, ...)` overload.
Differential Revision: https://reviews.llvm.org/D141904
Replace a couple of check instances with llvm::any_of (clang-tidy
warnings). Factor out "canCreatePrivateMemRef" and
"performFusionsIntoDest" into separate methods to reduce the
length/indent of the containing methods. Add doc comments and debug messages.
Mark some of the methods that should have been const const.
NFC.
Reviewed By: vinayaka-polymage
Differential Revision: https://reviews.llvm.org/D142076
This patch made a minor refactor of LoopCoalescing.cpp's walkLoops
templated method and placed it in Affine's LoopUtils.cpp/h.
This method is also renamed as coalescePerfectlyNestedLoops method. This
minor change enables this method to be invoked
by both the original LoopCoalescing pass as well as the newly introduced
loop.coalesce transform op.
The loop.coalesce transform op has the ability to coalesce affine, and
scf loop nests, leveraging existing LoopCoalescing
mechanism. I have created it inside the SCFTransformOps.td instead of
AffineTransformOps.td as it feels to be similar
in spirit as the loop.unroll op that can handle both scf and affine
loops. Please let me know if you feel that this op
should be moved into AffineTransformOps.td instead.
The testcase added illustrates loop.coalesce transform op working for
scf, affine loops (inner, outer) as well as
coalesced loop can be further unrolled (achieving composibility).
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D141202
Similar to how `makeArrayRef` is deprecated in favor of deduction guides, do the
same for `makeMutableArrayRef`.
Once all of the places in-tree are using the deduction guides for
`MutableArrayRef`, we can mark `makeMutableArrayRef` as deprecated.
Differential Revision: https://reviews.llvm.org/D141814
The patch adds operations to `BlockAndValueMapping` and renames it to `IRMapping`. When operations are cloned, old operations are mapped to the cloned operations. This allows mapping from an operation to a cloned operation. Example:
```
Operation *opWithRegion = ...
Operation *opInsideRegion = &opWithRegion->front().front();
IRMapping map
Operation *newOpWithRegion = opWithRegion->clone(map);
Operation *newOpInsideRegion = map.lookupOrNull(opInsideRegion);
```
Migration instructions:
All includes to `mlir/IR/BlockAndValueMapping.h` should be replaced with `mlir/IR/IRMapping.h`. All uses of `BlockAndValueMapping` need to be renamed to `IRMapping`.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D139665
Drop unnecessary bailout in checkMemRefAccessDependence in the presence of
surrounding affine.parallel ops. When the affine.parallel op was added, affine
analysis methods weren't extended to account for it. Fix this and allow memref
dependence check to work in the presence of affine.parallel ops in the mix.
Rename isForInductionVar -> isAffineForInductionVar, getLoopIVs ->
getAffineForIVs to avoid confusion since that's what they were.
Differential Revision: https://reviews.llvm.org/D141254
Fix affine LICM pass for unknown region-holding ops. The logic was
completely ignoring regions of unknown ops leading to generation of
invalid IR on hoisting. Handle affine.parallel op among those with
regions that are supported.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D140738
The `hasDependencePath` method in affine fusion is quite inefficient as
it does a DFS on the complete graph for what is a small part of the
checks before fusion can be performed. Make this efficient by using the
fact that the nodes involved are all at the top-level of the same block.
With this change, for large graphs with about 10,000 nodes, the check
runs in a few seconds instead of not terminating even in a few hours.
This is NFC from a functionality standpoint; it only leads to an
improvement in pass running time on large IR.
Differential Revision: https://reviews.llvm.org/D140522
std::optional::value() has undesired exception checking semantics and is
unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The
call sites block std::optional migration.