New buffer allocations can now be returned/yielded from blocks with `allow-return-allocs`. One-Shot Bufferize deallocates all buffers at the end of the block. If this is not possible (because the buffer escapes the block), this is now done by the existing BufferDeallocation pass.
Differential Revision: https://reviews.llvm.org/D121527
This improves the modularity of the bufferization.
From now on, all ops that do not implement BufferizableOpInterface are considered hoisting barriers. Previously, all ops that do not implement the interface were not considered barriers and such ops had to be marked as barriers explicitly. This was unsafe because we could've hoisted across unknown ops where it was not safe to hoist.
As a side effect, this allows for cleaning up AffineBufferizableOpInterfaceImpl. This build unit no longer needed and can be deleted.
Differential Revision: https://reviews.llvm.org/D121519
The Func has a large number of legacy dependencies carried over from the old
Standard dialect, which was pervasive and contained a large number of varied
operations. With the split of the standard dialect and its demise, a lot of lingering
dead dependencies have survived to the Func dialect. This commit removes a
large majority of then, greatly reducing the dependence surface area of the
Func dialect.
The last remaining operations in the standard dialect all revolve around
FuncOp/function related constructs. This patch simply handles the initial
renaming (which by itself is already huge), but there are a large number
of cleanups unlocked/necessary afterwards:
* Removing a bunch of unnecessary dependencies on Func
* Cleaning up the From/ToStandard conversion passes
* Preparing for the move of FuncOp to the Func dialect
See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061
Differential Revision: https://reviews.llvm.org/D120624
In D115022, we introduced an optimization where OpResults of a `linalg.generic` may bufferize in-place with an "in" OpOperand if the corresponding "out" OpOperand is not used in the computation.
This optimization can lead to unexpected behavior if the newly chosen OpOperand is in the same alias set as another OpOperand (that is used in the computation). In that case, the newly chosen OpOperand must bufferize out-of-place. This can be confusing to users, as always choosing the "out" OpOperand (regardless of whether it is used) would be expected when having the notion of "destination-passing style" in mind.
With this change, we go back to always bufferizing in-place with "out" OpOperands by default, but letting users override the behavior with a bufferization option.
Differential Revision: https://reviews.llvm.org/D120182
Add `BufferizableOpInterface::verifyAnalysis`. Ops can implement this method to check for expected invariants and limitations.
The purpose of this change is to introduce a modular way of checking assertions such as `assertScfForAliasingProperties`.
Differential Revision: https://reviews.llvm.org/D120189
They used to be classes with a virtual `run` function. This was inconvenient because post analysis steps are stored in BufferizationOptions. Because of this design choice, BufferizationOptions were not copyable.
Differential Revision: https://reviews.llvm.org/D119258
Also reimplement `std-bufferize` in terms of BufferizableOpInterface-based bufferization. The old `std.select` bufferization pattern is no longer needed and deleted.
Differential Revision: https://reviews.llvm.org/D118559
The bufferization of arith.constant ops is also switched over to BufferizableOpInterface-based bufferization. The old implementation is deleted. Both implementations utilize GlobalCreator, now renamed to just `getGlobalFor`.
GlobalCreator no longer maintains a set of all created allocations to avoid duplicate allocations of the same constant. Instead, `getGlobalFor` scans the module to see if there is already a global allocation with the same constant value.
For compatibility reasons, it is still possible to create a pass that bufferizes only `arith.constant`. This pass (createConstantBufferizePass) could be deleted once all users were switched over to One-Shot bufferization.
Differential Revision: https://reviews.llvm.org/D118483
This is for compatibility with existing bufferization passes. Also clean up memref type generation a bit.
Differential Revision: https://reviews.llvm.org/D118243
No longer go through an external model. Also put BufferizableOpInterface into the same build target as the BufferizationDialect. This allows for some code reuse between BufferizationOps canonicalizers and BufferizableOpInterface implementations.
Differential Revision: https://reviews.llvm.org/D117987
This is in preparation of unifying the existing bufferization with One-Shot bufferization.
A subsequent commit will replace `tensor-bufferize`'s implementation with the BufferizableOpInterface-based implementation and move over missing test cases.
Differential Revision: https://reviews.llvm.org/D117984
Pass a ValueRange instead of an ArrayRef<Value> for better compatibility. Also provide an additional function overload that automatically deallocates the buffer if specified.
Differential Revision: https://reviews.llvm.org/D118025
This commit is the first step towards unifying core bufferization and One-Shot Bufferize.
This commit does not move over the implementations of BufferizableOpInterface yet. This will be done in separate commits. This change does also not move the unit tests yet. The tests will be moved together with op interface implementations and split into separate files.
Differential Revision: https://reviews.llvm.org/D117641
Benchmarks show that memref::CopyOp is curently up to 200x slower than
tiled and vectorized versions of linalg::Copy.
Add a temporary flag to allow comprehensive bufferize to generate a
linalg::GenericOp that implements a copy until this performance bug is
resolved.
Differential Revision: https://reviews.llvm.org/D117696
This separates the analysis (and its helpers/data structures) more clearly from the rest of the bufferization.
Differential Revision: https://reviews.llvm.org/D117477
Also move `createAlloc` and related helper functions out of BufferizationState. The goal is to make BufferizationState as small as possible. (Code cleanup)
Differential Revision: https://reviews.llvm.org/D117476
If not allow-return-memref, raise an error if a new memory allocation is returned/yielded from a block. We do not check for new allocations directly, but for ops that yield/return values that are not equivalent to values that are defined outside of the current of the block.
Note: We still need to check that scf.for yield values and bbArgs are aliasing to ensure that getAliasingOpOperand/getAliasingOpResult is correct.
Differential Revision: https://reviews.llvm.org/D116687
This change makes it possible to use a different buffer deallocation strategy. E.g., `-buffer-deallocation` can be used, which also works for allocations that are not in destination-passing style.
Differential Revision: https://reviews.llvm.org/D117096
This op is an example for how to deal with ops who's OpResult may aliasing with one of multiple OpOperands.
Differential Revision: https://reviews.llvm.org/D116868
This revision fixes SubviewOp, InsertSliceOp, ExtractSliceOp construction during bufferization
where not all offset/size/stride operands were properly specified.
A test that exhibited problematic behaviors related to incorrect memref casts is introduced.
Init tensor optimization is disabled in teh testing func bufferize pass.
Differential Revision: https://reviews.llvm.org/D116899
init_tensor elimination is arguably a pre-optimization that should be separated from comprehensive bufferization.
In any case it is still experimental and easily results in wrong IR with violated SSA def-use orderings.
Isolate the optimization behind a flag, separate the test cases and add a test case that would results in wrong IR.
Differential Revision: https://reviews.llvm.org/D116936
In addition, all functions that call `allocationFn` now return FailureOr<Value>. This resolves a few TODOs in the code base.
Differential Revision: https://reviews.llvm.org/D116452
No need to keep track of equivalent extract_slice / insert_slice tensors during bufferization. Just emit a copy, it will fold away.
Note: The analysis still keeps track of equivalent tensors to make the correct inplace bufferization decisions.
Differential Revision: https://reviews.llvm.org/D116684
Pass unique_ptr<BufferizationOption> to the bufferization. This allows the bufferization to enqueue additional PostAnalysisSteps. When running bufferization a second time, a new BufferizationOptions must be constructed.
Differential Revision: https://reviews.llvm.org/D116101
Instead of modifying the existing scf.if op, create a new op with memref OpOperands/OpResults and delete the old op.
New allocations / other memrefs can now be yielded from the op. This functionality is deactivated by default and guarded against by AssertDestinationPassingStyle.
Differential Revision: https://reviews.llvm.org/D115491
Instead of printing analysis debug information to stderr, annotate the IR. This makes it easier to understand decisions made by the analysis, especially in larger input IR.
Differential Revision: https://reviews.llvm.org/D115575
Remove all function calls related to buffer equivalence from bufferize implementations.
Add a new PostAnalysisStep for scf.for that ensures that yielded values are equivalent to the corresponding BBArgs. (This was previously checked in `bufferize`.) This will be relaxed in a subsequent commit.
Note: This commit changes two test cases. These were broken by design
and should not have passed. With the new scf.for PostAnalysisStep, this
bug was fixed.
Differential Revision: https://reviews.llvm.org/D114927
Allow ops that are not bufferizable in the input IR. (Deactivated by default.)
bufferization::ToMemrefOp and bufferization::ToTensorOp are generated at the bufferization boundaries.
Differential Revision: https://reviews.llvm.org/D114669
This change provides `BufferizableOpInterface` implementations for ops from the Bufferization dialects. These ops are needed at the bufferization boundaries for partial bufferization.
Differential Revision: https://reviews.llvm.org/D114618
There is special logic for InsertSliceOp to check if a memcpy is needed. This change extracts that piece of code and makes it a PostAnalysisStep.
The purpose of this change is to untangle `bufferize` from BufferizationAliasInfo. (Not fully there yet.)
Differential Revision: https://reviews.llvm.org/D114513
Bufferization of function boundaries is extracted from ComprehensiveBufferize into a separate file. This will become its own build target in the future.
Differential Revision: https://reviews.llvm.org/D114226