This PR adds a new transform op that replaces `memref.alloca`s with
`memref.get_global`s to newly inserted `memref.global`s. This is useful,
for example, for allocations that should reside in the shared memory of
a GPU, which have to be declared as globals.
Introduce a simple conversion of a memref.alloc/dealloc pair into an
alloca in the same scope. Expose it as a transform op and a pattern.
Allocas typically lower to stack allocations as opposed to alloc/dealloc
that lower to significantly more expensive malloc/free calls. In
addition, this can be combined with allocation hoisting from loops to
further improve performance.
This revision adds a `transform.apply_conversion_patterns.func.func_to_llvm` transformation.
It is unclear at this point whether this should be spelled out as a standalone transformation
or whether it should resemble `transform.apply_conversion_patterns.dialect_to_llvm "fun"`.
This is dependent on how we want to handle the type converter creation.
In particular the current implementation exhibits the fact that
`transform.apply_conversion_patterns.memref.memref_to_llvm_type_converter` was not rich enough
and did not match the LowerToLLVMOptions.
Keeping those options in sync across all the passes that lower to LLVM is very error prone.
Instead, we should have a single `to_llvm_type_converter`.
Differential Revision: https://reviews.llvm.org/D157553
These patterns are exposed via a new "apply_conversion_patterns" op.
Also provide a new type converter that converts from memref to LLVM types. Conversion patterns that lower to LLVM are special: they require an `LLVMTypeConverter`; a normal `TypeConverter` is not enough. This revision also adds a new interface method to pattern descriptor ops to verify that the default type converter of the enclosing "apply_conversion_patterns" op is compatible with the set of patterns. At the moment, a simple `StringRef` is used. This can evolve to a richer type in the future if needed.
Differential Revision: https://reviews.llvm.org/D157369
All `apply` functions now have a `TransformRewriter &` parameter. This rewriter should be used to modify the IR. It has a `TrackingListener` attached and updates the internal handle-payload mappings based on rewrites.
Implementations no longer need to create their own `TrackingListener` and `IRRewriter`. Error checking is integrated into `applyTransform`. Tracking listener errors are reported only for ops with the `ReportTrackingListenerFailuresOpTrait` trait attached, allowing for a gradual migration. Furthermore, errors can be silenced with an op attribute.
Additional API will be added to `TransformRewriter` in subsequent revisions. This revision just adds an "empty" `TransformRewriter` class and updates all `apply` implementations.
Differential Revision: https://reviews.llvm.org/D152427
* Remove `transform::PatternRegistry`.
* Add a new op for each currently registered pattern set.
* Change names of vector dialect pattern selector ops, so that they are consistent with the remaining code base.
* Remove redundant `transform.vector.extract_address_computations` op.
Differential Revision: https://reviews.llvm.org/D152249
Update operations in Transform dialect extensions defined in the Affine,
GPU, MemRef and Tensor dialects to use the more generic
`TransformHandleTypeInterface` type constraint instead of hardcoding
`PDL_Operation`. See
https://discourse.llvm.org/t/rfc-type-system-for-the-transform-dialect/65702
for motivation.
Remove the dependency on PDLDialect from these extensions.
Update tests to use `!transform.any_op` instead of `!pdl.operation`.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D150781
Instead of returning an `ArrayRef<Operation *>`, return at iterator that skips ops that were erased/replaced while iterating over the payload ops.
This fixes an issue in conjuction with TrackingListener, where a tracked op was erased during the iteration. Elements may not be removed from an array while iterating over it; this invalidates the iterator.
When ops are erased/removed via `replacePayloadOp`, they are not immediately removed from the mappings data structure. Instead, they are set to `nullptr`. `nullptr`s are not enumerated by `getPayloadOps`. At the end of each transformation, `nullptr`s are removed from the mapping data structure.
Differential Revision: https://reviews.llvm.org/D149847
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.
Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.
Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.
Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
4. Some changes have been deleted for the following reasons:
- Some files had a variable also named cast
- Some files had not included a header file that defines the cast
functions
- Some files are definitions of the classes that have the casting
methods, so the code still refers to the method instead of the
function without adding a prefix or removing the method declaration
at the same time.
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
mlir/lib/**/IR/\
mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
mlir/test/lib/Dialect/Test/TestTypes.cpp\
mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
mlir/test/lib/Dialect/Test/TestAttributes.cpp\
mlir/unittests/TableGen/EnumsGenTest.cpp\
mlir/test/python/lib/PythonTestCAPI.cpp\
mlir/include/mlir/IR/
```
Differential Revision: https://reviews.llvm.org/D150123
Add a helper function that makes dynamic sizes of `memref.alloca` ops independent of a given set of values. This functionality can be used to make dynamic allocations hoistable from loops.
Differential Revision: https://reviews.llvm.org/D149316
This patch adds patterns to rewrite memory accesses such that the resulting
accesses are only using a base pointer.
E.g.,
```mlir
memref.load %base[%off0, ...]
```
Will be rewritten in:
```mlir
%new_base = memref.subview %base[%off0,...][1,...][1,...]
memref.load %new_base[%c0,...]
```
The idea behind these patterns is to offer a way to more gradually lower
address computations.
These patterns are the exact opposite of FoldMemRefAliasOps.
I've implemented the support of only five operations in this patch:
- memref.load
- memref.store
- nvgpu.ldmatrix
- vector.transfer_read
- vector.transfer_write
Going forward we may want to provide an interface for these rewritings (and
the ones in FoldMemRefAliasOps.)
One step at a time!
Differential Revision: https://reviews.llvm.org/D146724
Rewrite and document multi-buffering properly:
1. Use IndexingUtils / StaticValueUtils instead of duplicating functionality
2. Properly plumb RewriterBase through.
3. Add support
4. Better debug messages.
This revision is otherwise almost NFC, if it weren't for the extra DeallocOp
support that would previoulsy make multi-buffering fail.
Depends on: D145036
Differential Revision: https://reviews.llvm.org/D145055
Allow user to apply multi-buffering transformation for cases where proving
that there is no loop carried dependency is not trivial. In this case user needs
to ensure that the data are written and read in the same iteration otherwise the
result is incorrect.
Differential Revision: https://reviews.llvm.org/D144227
`alloc`s that have users outside of loops are guaranteed to fail in
`multibuffer`.
Instead of exposing ourselves to that failure in the transform dialect,
filter out the `alloc`s that fall in this category.
To be able to do this filtering we have to change the `multibuffer`
transform op from `TransformEachOpTrait` to a plain `TransformOp`. This is
because `TransformEachOpTrait` expects that every successful `applyToOne`
returns a non-empty result.
Couple of notes:
- I changed the assembly syntax to make sure we only get `alloc` ops as
input. (And added a test case to make sure we reject invalid inputs.)
- `multibuffer` can still fail pretty easily when you know its limitations.
See the updated `op failed to multibuffer` test case for instance.
Longer term, instead of leaking/coupling the actual implementation (in
this case the checks normally done in `memref::multiBuffer`) with the
transform dialect (the added check in `::apply`), we may want to refactor
how we structure the underlying implementation. E.g., we could imagine a
`canApply` method for all the implementations that we want to hook up in
the transform dialect.
This has some implications on how not to duplicate work between
`canApply` and the actual implementation but I thought I throw that here
to have us think about it :).
Differential Revision: https://reviews.llvm.org/D143747
Multibuffer will fail to apply on allocs that are used outside of loops.
This was properly caught in the current implementation but the way we report
it was broken.
Notes cannot be emitted on their own, they need to be attached to another
main diagnostic.
Long story short, change the severity of the report from Note to Error.
Differential Revision: https://reviews.llvm.org/D143729
Adapt the implementation of TransformEachOpTrait to the existence of
parameter values recently introduced into the transform dialect. In
particular, allow `applyToOne` hooks to return a list containing a mix
of `Operation *` that will be associated with handles and `Attribute`
that will be associated with parameter values by the trait
implementation of the transform interface's `apply` method.
Disentangle the "transposition" of the list of per-payload op partial
results to decrease its overall complexity and detemplatize the code
that doesn't really need templates. This removes the poorly documented
special handling for single-result ops with TransformEachOpTrait that
could have assigned null pointer values to handles.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D140979
std::optional::value() has undesired exception checking semantics and is
unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The
call sites block std::optional migration.
Now we have more convenient functions to construct silenceable errors
while emitting diagnostics, and the constructor is ambiguous as it
doesn't tell whether the logical error is silencebale or definite.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D137257
Add the plumbing necessary to call the memref dialect's multiBuffer
function. This will allow separation between choosing which buffers
to multi-buffer and the actual transform.
Alter the multibuffer function to return the newly created
allocation if multi-buffering succeeds. This is necessary to
communicate with the transform dialect hooks what allocation
multi-buffering created.
Reviewed By: ftynse, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D133985