This fixes an issue in One-Shot Bufferize that could lead to missing buffer copies in the future. This bug can currently not be triggered because of the order in which ops are analyzed (always bottom-to-top). However, if we consider different traversal orders for the analysis in the future, this bug can cause subtle issues that are difficult to debug.
Example:
```
%0 = ...
%1 = tensor.insert ... into %0
%2 = tensor.extract_slice %0
tensor.extract %2[...]
```
In case of a top-to-bottom analysis of the above IR, the `tensor.insert` is analyzed before the `tensor.extract_slice`. In that case, the `tensor.insert` will bufferize in-place because %2 is not yet known to become an alias of %0 (and therefore causing a conflict).
With this change, the `tensor.insert` will bufferize out-of-place, regardless of the traversal order.
Differential Revision: https://reviews.llvm.org/D135049
This fixes a bug where a required buffer copy was not inserted.
Not only written aliases, but also read aliases should be taken into account when computing common enclosing repetitive regions. Furthermore, for writing ops, it does not matter where the destination tensor is defined, but where the op itself is located.
Differential Revision: https://reviews.llvm.org/D135420
This method allows to declare regions as "repetitive" even if the parent op does not implement the RegionBranchOpInterface.
This is needed to support loop-like ops that have parallel semantics but do not branch between regions.
Differential Revision: https://reviews.llvm.org/D133113
bufferization.to_memref ops are not supported in One-Shot Analysis. They often trigger a failed assertion that can be confusing. Instead, scan for to_memref ops before running the analysis and immediately abort with a proper error message.
Differential Revision: https://reviews.llvm.org/D132027
This change changes the bufferization so that it utilizes the new TensorCopyInsertion pass. One-Shot Bufferize no longer calls the One-Shot Analysis. Instead, it relies on the TensorCopyInsertion pass to make the entire IR fully inplacable. The `bufferize` implementations of all ops are simplified; they no longer have to account for out-of-place bufferization decisions. These were already materialized in the IR in the form of `bufferization.alloc_tensor` ops during the TensorCopyInsertion pass.
Differential Revision: https://reviews.llvm.org/D127652
Find writability conflicts (writes to buffers that are not allowed to be written to) by checking SSA use-def chains. This is better than the current writability analysis, which is too conservative and finds false positives.
Differential Revision: https://reviews.llvm.org/D127256
Now that analysis and bufferization are better separated, post-analysis steps are no longer needed. Users can directly interleave analysis and bufferization as needed.
Differential Revision: https://reviews.llvm.org/D126571
This change adds a new op `alloc_tensor` to the bufferization dialect. During bufferization, this op is always lowered to a buffer allocation (unless it is "eliminated" by a pre-processing pass). It is useful to have such an op in tensor land, because it allows users to model tensor SSA use-def chains (which drive bufferization decisions) and because tensor SSA use-def chains can be analyzed by One-Shot Bufferize, while memref values cannot.
This change also replaces all uses of linalg.init_tensor in bufferization-related code with bufferization.alloc_tensor.
linalg.init_tensor and bufferization.alloc_tensor are similar, but the purpose of the former one is just to carry a shape. It does not indicate a memory allocation.
linalg.init_tensor is not suitable for modelling SSA use-def chains for bufferization purposes, because linalg.init_tensor is marked as not having side effects (in contrast to alloc_tensor). As such, it is legal to move linalg.init_tensor ops around/CSE them/etc. This is not desirable for alloc_tensor; it represents an explicit buffer allocation while still in tensor land and such allocations should not suddenly disappear or get moved around when running the canonicalizer/CSE/etc.
BEGIN_PUBLIC
No public commit message needed for presubmit.
END_PUBLIC
Differential Revision: https://reviews.llvm.org/D126003
This commit relaxes the rules around ops that define a value but do not specify the tensor's contents. (The only such op at the moment is init_tensor.)
When such a tensor is written in a loop, it should not cause out-of-place bufferization.
Differential Revision: https://reviews.llvm.org/D124849
This makes the API easier to use. Also allows us to check for incorrect API usage for easier debugging.
Differential Revision: https://reviews.llvm.org/D124265
Writes into tensors that are definied outside of a repetitive region, but with the write happening inside of the repetitive region were previously not considered conflicts. This was incorrect.
E.g.:
```
%0 = ... : tensor<?xf32>
scf.for ... {
"reading_op"(%0) : tensor<?xf32>
%1 = "writing_op"(%0) : tensor<?xf32> -> tensor<?xf32>
...
}
```
In the above example, "writing_op" should be out-of-place.
This commit fixes the bufferization for any op that declares its repetitive semantics via RegionBranchOpInterface.
Such IR is rejected by default, but can be allowed with `allow-return-memref`. In preparation of future refactorings, do not deallocate such buffers.
One-Shot Analysis now gathers information about yielded tensors, so that we know during the actual bufferization whether a newly allocated buffer should be deallocated again. (Otherwise, it will leak. This will be addressed in a subsequent commit that also makes `allow-return-memref` a non-experimental flag.)
As a cleanup, `allow-return-memref` is now part of OneShotBufferizationOptions. (It was previously ignored by AlwaysCopyBufferizationState.) Moreover, AlwaysCopyBufferizationState now asserts that `create-deallocs` is deactivated to prevent surprising behavior.
Differential Revision: https://reviews.llvm.org/D121521
The related functionality is moved over to the bufferization dialect. Test cases are cleaned up a bit.
Differential Revision: https://reviews.llvm.org/D120191
Add `BufferizableOpInterface::verifyAnalysis`. Ops can implement this method to check for expected invariants and limitations.
The purpose of this change is to introduce a modular way of checking assertions such as `assertScfForAliasingProperties`.
Differential Revision: https://reviews.llvm.org/D120189
This makes getAliasingOpResult symmetric to getAliasingOpOperand. The previous implementation was confusing for users and implemented in such a way only because there are currently no bufferizable ops that have multiple aliasing OpResults.
Differential Revision: https://reviews.llvm.org/D119259
They used to be classes with a virtual `run` function. This was inconvenient because post analysis steps are stored in BufferizationOptions. Because of this design choice, BufferizationOptions were not copyable.
Differential Revision: https://reviews.llvm.org/D119258
This commit is the first step towards unifying core bufferization and One-Shot Bufferize.
This commit does not move over the implementations of BufferizableOpInterface yet. This will be done in separate commits. This change does also not move the unit tests yet. The tests will be moved together with op interface implementations and split into separate files.
Differential Revision: https://reviews.llvm.org/D117641