This is still somehow a WIP, we have some issues with this interface
that are not trivial to solve. This patch tries to make the concepts of
RegionBranchPoint and RegionSuccessor more robust and aligned with their
definition:
- A `RegionBranchPoint` is either the parent (`RegionBranchOpInterface`)
op or a `RegionBranchTerminatorOpInterface` operation in a nested
region.
- A `RegionSuccessor` is either one of the nested region or the parent
`RegionBranchOpInterface`
Some new methods with reasonnable default implementation are added to
help resolving the flow of values across the RegionBranchOpInterface.
It is still not trivial in the current state to walk the def-use chain
backward with this interface. For example when you have the 3rd block
argument in the entry block of a for-loop, finding the matching operands
requires to know about the hidden loop iterator block argument and where
the iterargs start. The API is designed around forward-tracking of the
chain unfortunately.
Try to reland #161575 ; I suspect a buildbot incremental build issue.
This is still somehow a WIP, we have some issues with this interface
that are not trivial to solve. This patch tries to make the concepts of
RegionBranchPoint and RegionSuccessor more robust and aligned with their
definition:
- A `RegionBranchPoint` is either the parent (`RegionBranchOpInterface`)
op or a `RegionBranchTerminatorOpInterface` operation in a nested
region.
- A `RegionSuccessor` is either one of the nested region or the parent
`RegionBranchOpInterface`
Some new methods with reasonnable default implementation are added to
help resolving the flow of values across the RegionBranchOpInterface.
It is still not trivial in the current state to walk the def-use chain
backward with this interface. For example when you have the 3rd block
argument in the entry block of a for-loop, finding the matching operands
requires to know about the hidden loop iterator block argument and where
the iterargs start. The API is designed around forward-tracking of the
chain unfortunately.
Add hoist-dynamic-allocs-option to buffer-results-to-out-params. This PR
supported that obtain the size of the dynamic shape memref through the
caller-callee relationship.
Support custom types (3/N): allow custom tensor and buffer types in
function signatures and at call-sites. This is one of the major building
blocks to move in the direction of module-level one-shot-bufferization
support.
To achieve this, `BufferizationOptions::FunctionArgTypeConverterFn`
callback is converted to work with tensor-like and buffer-like types,
instead of the builtin counterparts. The default behavior for builtins
remains unchanged, while custom types by default go through
`TensorLikeType::getBufferType()` which is a general conversion
interface.
The viewLikeOpInterface abstracts the behavior of an operation view one
buffer as another. However, the current interface only includes a
"getViewSource" method and lacks a "getViewDest" method.
Previously, it was generally assumed that viewLikeOpInterface operations
would have only one return value, which was the view dest. This
assumption was broken by memref.extract_strided_metadata, and more
operations may break these silent conventions in the future. Calling
"viewLikeInterface->getResult(0)" may lead to a core dump at runtime.
Therefore, we need 'getViewDest' method to standardize our behavior.
This patch adds the getViewDest function to viewLikeOpInterface and
modifies the usage points of viewLikeOpInterface to standardize its use.
I have seen misuse of the `hasEffect` API in downstream projects: users
sometimes think that `hasEffect == false` indicates that the operation
does not have a certain memory effect. That's not necessarily the case.
When the op does not implement the `MemoryEffectsOpInterface`, it is
unknown whether it has the specified effect. "false" can also mean
"maybe".
This commit clarifies the semantics in the documentation. Also adds
`hasUnknownEffects` and `mightHaveEffect` convenience functions. Also
simplifies a few call sites.
As part of 2646c36a864aa6a62bc1280e9a8cd2bcd2695349,
`OneShotModuleBufferize` no longer descends into nested symbol tables,
recommending users who wish to do this should do so in a pass
pipeline/custom pass. This did not support the use case of ops that
weren't ModuleOps. The patch updates `OneShotModuleBufferize` to work on
any general op.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow for `ShapedType::isStaticDim` to be used in
ADT functions without having to wrap it in a lambda performing the
negation.
Also add the new functions to C and Python bindings.
Support custom types (2/N): allow value-owning operations (e.g.
allocation ops) to bufferize custom tensors into custom buffers. This
requires BufferizableOpInterface::getBufferType() to return
BufferLikeType instead of BaseMemRefType.
Affected implementors of the interface are updated accordingly.
Relates to ee070d08163ac09842d9bf0c1315f311df39faf1.
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to new comers, I would like to move
away from the constructor and eventually remove it.
One of the common uses of std::nullopt is in one of the constructors
for ValueRange. This patch takes care of the migration where we need
ValueRange() to facilitate perfect forwarding. Note that {} would be
ambiguous for perfecting forwarding to work.
Following the addition of TensorLike and BufferLike type interfaces (see
00eaff3e9c897c263a879416d0f151d7ca7eeaff), introduce minimal changes
required to bufferize a custom tensor operation into a custom buffer
operation.
To achieve this, new interface methods are added to TensorLike type
interface that abstract away the differences between existing (tensor ->
memref) and custom conversions.
The scope of the changes is intentionally limited (for example,
BufferizableOpInterface is untouched) in order to first understand the
basics and reach consensus design-wise.
---
Notable changes:
* mlir::bufferization::getBufferType() returns BufferLikeType (instead
of BaseMemRefType)
* ToTensorOp / ToBufferOp operate on TensorLikeType / BufferLikeType.
Operation argument "memref" renamed to "buffer"
* ToTensorOp's tensor type inferring builder is dropped (users now need
to provide the tensor type explicitly)
Generally, bufferization should be able to create a memref from a tensor
without needing to know more than just a mlir::Type. Thus, change
BufferizationOptions::UnknownTypeConverterFn to accept just a type
(mlir::TensorType for now) instead of mlir::Value. Additionally, apply
the same rationale to getMemRefType() helper function.
Both changes are prerequisites to enable custom types support in
one-shot bufferization.
The current algorithm searching for circular function calls scales quadratically due to the linear scan of the functions vector that is performed for each element of the vector itself. The PR replaces such algorithm with an O(V + E) version based on the Khan's algorithm for topological sorting, where V is the number of functions and E is the number of function calls.
The `DeallocationState` class has been modified to keep a reference to an externally owned `SymbolTableCollection` class, to preserve the cached symbol tables across multiple insertions of deallocation instructions.
Address TODO regarding the recomputation of symbol tables. The signature of the `getFuncOpsOrderedByCalls` function is modified to receive the collection of cached symbol tables.
The PR continues the work started in #141019 by adding the `BufferizationState` class also to the `getBufferType` and `resolveConflicts` interface methods, together with the additional support functions that are used throughout the bufferization infrastructure.
Avoid recomputing the symbol tables by using the `BufferizationState` class introduced in #141019.
There is also one similar TODO remaining within the `getBufferType` function, but that requires more reasoning and one more API change.
Follow-up on #138143, which was reverted due to a missing update a method signature (more specifically, the bufferization interface for `tensor::ConcatOp`) that was not catched before merging. The old PR description is reported in the next lines.
This PR is a follow-up on https://github.com/llvm/llvm-project/pull/138125, and adds a bufferization state class providing information about the IR. The information currently consists of a cached list of symbol tables, which aims to solve the quadratic scaling of the bufferization task with respect to the number of symbols. The PR breaks API compatibility: the bufferize method of the BufferizableOpInterface has been enriched with a reference to a BufferizationState object.
The bufferization state must be kept in a valid state by the interface implementations. For example, if an operation with the Symbol trait is inserted or replaced, its parent SymbolTable must be updated accordingly (see, for example, the bufferization of arith::ConstantOp, where the symbol table of the module gets the new global symbol inserted). Similarly, the invalidation of a symbol table must be performed if an operation with the SymbolTable trait is removed (this can be performed using the invalidateSymbolTable method, introduced in https://github.com/llvm/llvm-project/pull/138014).
Reverts llvm/llvm-project#138143
The PR for the BufferizationState is temporarily reverted due to API incompatibilities that have been initially missed during the update and were not catched by PR checks.
This PR is a follow-up on #138125, and adds a bufferization state class providing information about the IR. The information currently consists of a cached list of symbol tables, which aims to solve the quadratic scaling of the bufferization task with respect to the number of symbols. The PR breaks API compatibility: the `bufferize` method of the `BufferizableOpInterface` has been enriched with a reference to a `BufferizationState` object.
The bufferization state must be kept in a valid state by the interface implementations. For example, if an operation with the `Symbol` trait is inserted or replaced, its parent `SymbolTable` must be updated accordingly (see, for example, the bufferization of `arith::ConstantOp`, where the symbol table of the module gets the new global symbol inserted). Similarly, the invalidation of a symbol table must be performed if an operation with the `SymbolTable` trait is removed (this can be performed using the `invalidateSymbolTable` method, introduced in #138014).
As part of the work on transitioning bufferization dialect, ops, and
associated logic to operate on newly added type interfaces (see
00eaff3e9c897c263a879416d0f151d7ca7eeaff), rename the
bufferization.to_memref to highlight the generic nature of the op.
Bufferization process produces buffers while memref is a builtin type
rather than a generic term.
Preserve the current API (to_buffer still produces a memref), however,
as the new type interfaces are not used yet.
During bufferization, the callee of each `func::CallOp` / `CallableOpInterface` operation is retrieved by means of a symbol table that is temporarily built for the lookup purpose. The creation of the symbol table requires a linear scan of the operation body (e.g., a linear scan of the `ModuleOp` body). Considering that functions are typically called at least once, this leads to a scaling behavior that is quadratic with respect to the number of symbols. The problem is described in the following Discourse topic: https://discourse.llvm.org/t/quadratic-scaling-of-bufferization/86122/
This patch aims to partially address this scaling issue by leveraging the `SymbolTableCollection` class, whose instance is added to the `FuncAnalysisState` extension. Later modifications are also expected to address the problem in other methods required by `BufferizableOpInterface` (e.g., `bufferize` and `getBufferType`), which suffer of the same problem but do not provide access to any bufferization state.
The bufferization.tensor_layout is unnecessarily restricted to affine
map attributes when it could reasonably be any implementor of
MemRefLayoutAttrInterface.
`FunctionOpInterface` assumed the fact that the function type (attribute
of the operation) can be cloned with arbirary lists of function
arguments and results to support argument and result list mutation. This
is not always correct, in particular, LLVM dialect functions require
exactly one result making it impossible to erase the result.
Allow function type cloning to fail and propagate this failure through
various APIs that use it. The common assumption is that existing IR has
not been modified.
Fixes#131142.
Reland a8c7ecdcbc3e89b493b495c6831cc93671c3b844 / #136300.
This is similar to other configuration objects used across MLIR.
Rename some fields to better reflect that they are no longer booleans.
Reland 04d261101b4f229189463136a794e3e362a793af / #132253.
`FunctionOpInterface` assumed the fact that the function type (attribute
of the operation) can be cloned with arbirary lists of function
arguments and results to support argument and result list mutation. This
is not always correct, in particular, LLVM dialect functions require
exactly one result making it impossible to erase the result.
Allow function type cloning to fail and propagate this failure through
various APIs that use it. The common assumption is that existing IR has
not been modified.
Fixes#131142.
Current one-shot bufferization infrastructure operates on top of
TensorType and BaseMemRefType. These are non-extensible base classes of
the respective builtins: tensor and memref. Thus, the infrastructure is
bound to work only with builtin tensor/memref types. At the same time,
there are customization points that allow one to provide custom logic to
control the bufferization behavior.
This patch introduces new type interfaces: tensor-like and buffer-like
that aim to supersede TensorType/BaseMemRefType within the bufferization
dialect and allow custom tensors / memrefs to be used. Additionally,
these new type interfaces are attached to the respective builtin types
so that the switch is seamless.
Note that this patch does very minimal initial work, it does NOT
refactor bufferization infrastructure.
See https://discourse.llvm.org/t/rfc-changing-base-types-for-tensors-and-memrefs-from-c-base-classes-to-type-interfaces/85509
Relax the assumption that alloc op always has allocation at
`getResult(0)`, allow to use `optimize-allocation-liveness` pass for
custom ops with >1 results. Ops with multiple allocations are not
handled here yet.
This PR changes the type of the command-line arguments representing
`LayoutMapOption` from `std::string` to the enum with the same name.
This allows for checking the values of programmable usages of the
corresponding options at compile time.
OneShotBufferizePass's opFilter definition in runOnOperation() fails to
allow operations for all dialect when the dialectFilter has an empty
array value (as opposed to no value). This happens when constructing
OneShotBufferizePass from a OneShotBufferizePassOptions parameter with
an empty dialectFilter. This commit only does filtering if filterDialect
option has a value and it is not an empty array.