The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow for `ShapedType::isStaticDim` to be used in
ADT functions without having to wrap it in a lambda performing the
negation.
Also add the new functions to C and Python bindings.
Following the addition of TensorLike and BufferLike type interfaces (see
00eaff3e9c897c263a879416d0f151d7ca7eeaff), introduce minimal changes
required to bufferize a custom tensor operation into a custom buffer
operation.
To achieve this, new interface methods are added to TensorLike type
interface that abstract away the differences between existing (tensor ->
memref) and custom conversions.
The scope of the changes is intentionally limited (for example,
BufferizableOpInterface is untouched) in order to first understand the
basics and reach consensus design-wise.
---
Notable changes:
* mlir::bufferization::getBufferType() returns BufferLikeType (instead
of BaseMemRefType)
* ToTensorOp / ToBufferOp operate on TensorLikeType / BufferLikeType.
Operation argument "memref" renamed to "buffer"
* ToTensorOp's tensor type inferring builder is dropped (users now need
to provide the tensor type explicitly)
This commit adds an additional overload to `replaceOpWithMultiple` that
accepts additional container types. This has been brought up by users of
the new `replaceOpWithMultiple` API.
In particular, one missing container type was
`SmallVector<SmallVector<Value>>`. The "default" `ArrayRef<ValueRange>`
container type can lead to use-after-scope errors in cases such as:
```c++
// Compute the replacement value ranges. Some replacements are single
// values, some are value ranges.
SmallVector<ValueRange> repl;
repl.push_back(someValueRange); // OK
for (...) {
// push_back(Value) triggers an implicit conversion to ValueRange,
// which does not own the range.
repl.push_back(someValue); // triggers use-after-scope later
}
rewriter.replaceOpWithMultiple(op, repl);
```
In this example, users should use `SmallVector<SmallVector<Value>>
repl;`.
This is a code cleanup. Update a few places in MLIR that should use
`hasSingleElement`/`getSingleElement`.
Note: `hasSingleElement` is faster than `.getSize() == 1` when it is
used with linked lists etc.
Depends on #131508.
This commit adds a new `matchAndRewrite` overload to `ConversionPattern`
to support 1:N replacements. This is the first of two main PRs that
merge the 1:1 and 1:N dialect conversion drivers.
The existing `matchAndRewrite` function supports only 1:1 replacements,
as can be seen from the `ArrayRef<Value>` parameter.
```c++
LogicalResult ConversionPattern::matchAndRewrite(
Operation *op, ArrayRef<Value> operands /*adaptor values*/,
ConversionPatternRewriter &rewriter) const;
```
This commit adds a `matchAndRewrite` overload that is called by the
dialect conversion driver. By default, this new overload dispatches to
the original 1:1 `matchAndRewrite` implementation. Existing
`ConversionPattern`s do not need to be changed as long as there are no
1:N type conversions or value replacements.
```c++
LogicalResult ConversionPattern::matchAndRewrite(
Operation *op, ArrayRef<ValueRange> operands /*adaptor values*/,
ConversionPatternRewriter &rewriter) const {
// Note: getOneToOneAdaptorOperands produces a fatal error if at least one
// ValueRange has 0 or more than 1 value.
return matchAndRewrite(op, getOneToOneAdaptorOperands(operands), rewriter);
}
```
The `ConversionValueMapping`, which keeps track of value replacements
and materializations, still does not support 1:N replacements. We still
rely on argument materializations to convert N replacement values back
into a single value. The `ConversionValueMapping` will be generalized to
1:N mappings in the second main PR.
Before handing the adaptor values to a `ConversionPattern`, all argument
materializations are "unpacked". The `ConversionPattern` receives N
replacement values and does not see any argument materializations. This
implementation strategy allows us to use the 1:N infrastructure/API in
`ConversionPattern`s even though some functionality is still missing in
the driver. This strategy was chosen to keep the sizes of the PRs
smaller and to make it easier for downstream users to adapt to API
changes.
This commit also updates the the "decompose call graphs" transformation
and the "sparse tensor codegen" transformation to use the new 1:N
`ConversionPattern` API.
Note for LLVM conversion: If you are using a type converter with 1:N
type conversion rules or if your patterns are performing 1:N
replacements (via `replaceOpWithMultiple` or
`applySignatureConversion`), conversion pattern applications will start
failing (fatal LLVM error) with this error message: `pattern 'name' does
not support 1:N conversion`. The name of the failing pattern is shown in
the error message. These patterns must be updated to the new 1:N
`matchAndRewrite` API.
`getDescriptorFromTensorTuple` and `getMutDescriptorFromTensorTuple`
extract the tensor type from an `unrealized_conversion_cast` op that
serves as a workaround for missing 1:N dialect conversion support.
This commit changes these functions so that they explicitly receive the
tensor type as a function argument. This is in preparation of merging
the 1:1 and 1:N conversion drivers. The conversion patterns in this file
will soon start receiving multiple SSA values (`ValueRange`) from their
adaptors (instead of a single value that is the result of
`unrealized_conversion_cast`). It will no longer be possible to take the
tensor type from the `unrealized_conversion_cast` op. The
`unrealized_conversion_cast` workaround will disappear entirely.
This commit adds a new function
`ConversionPatternRewriter::replaceOpWithMultiple`. This function is
similar to `replaceOp`, but it accepts multiple `ValueRange`
replacements, one per op result.
Note: This function is not an overload of `replaceOp` because of
ambiguous overload resolution that would make the API difficult to use.
This commit aligns "block signature conversions" with "op replacements":
both support 1:N replacements now. Due to incomplete 1:N support in the
dialect conversion driver, an argument materialization is inserted when
an SSA value is replaced with multiple values; same as block signature
conversions already work around the problem. These argument
materializations are going to be removed in a subsequent commit that
adds full 1:N support. The purpose of this PR is to add missing features
gradually in small increments.
This commit also updates two MLIR transformations that have their custom
workarounds around missing 1:N support. These can already start using
`replaceOpWithMultiple`.
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.
Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate pattern now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.
Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
Codegen "vectors" for pos/crd/val use the capacity as memref size, not
the actual used size. Although the sparsifier itself always uses just
the defined pos/crd/val parts, printing these and passing them back to a
runtime environment could benefit from wrapping the basic pos/crd/val
getters into a proper memref view that sets the right size.
This commit adds a new test-only op:
`sparse_tensor.has_runtime_library`. The op returns "1" if the sparse
compiler runs in runtime library mode.
This op is useful for writing test cases that require different IR
depending on whether the sparse compiler runs in runtime library or
codegen mode.
This commit fixes a memory leak in `sparse_pack_d.mlir`. This test case
uses `sparse_tensor.assemble` to create a sparse tensor SSA value from
existing buffers. This runtime library reallocates+copies the existing
buffers; the codegen path does not. Therefore, the test requires
additional deallocations when running in runtime library mode.
Alternatives considered:
- Make the codegen path allocate. "Codegen" is the "default" compilation
mode and it is handling `sparse_tensor.assemble` correctly. The issue is
with the runtime library path, which should not allocate. Therefore, it
is better to put a workaround in the runtime library path than to work
around the issue with a new flag in the codegen path.
- Add a `sparse_tensor.runtime_only` attribute to
`bufferization.dealloc_tensor`. Verifying that the attribute can only be
attached to `bufferization.dealloc_tensor` may introduce an unwanted
dependency of `MLIRSparseTensorDialect` on `MLIRBufferizationDialect`.
Separates actual transformation files from supporting utility files in
the transforms directory. Includes a bazel overlay fix for the build (as
well as a bit of cleanup of that file to be less verbose and more
flexible).
This centralizes all COO methods, and provides a cleaner API. Note that
the "enc" only constructor is a temporary workaround the need for COO
methods inside the "enc" only storage specifier.
The "Dim" prefix is a legacy left-over that no longer makes sense, since
we have a very strict "Dimension" vs. "Level" definition for sparse
tensor types and their storage.
The "dimension" before "level" does not really make sense Note that
renaming the actual type DimLevelType to LevelType is still TBD, since
this is an externally visible change (e.g. visible to Python API).
This change implements the correct *level* sizes set up for the direct
IR codegen fields in the sparse storage scheme. This brings libgen and
codegen together again.
This is step 3 out of 3 to make sparse_tensor.new work for BSR
This change provides access to the individual components of dim sizes
and lvl sizes after each codegenutil call.
This is step 2 out of 3 to make sparse_tensor.new work for BSR
Note that the (dis)assemble operations still make some simplfying
assumptions (e.g. trailing 2-D COO in AoS format) but now at least both
the direct IR and support library path behave exactly the same.
Generalizing the ops is still TBD.
This is a first revision in a small series of changes that removes
duplications between direct encoding methods and sparse tensor type
wrapper methods (in favor of the latter abstraction, since it provides
more safety). The goal is to simply end up with "just" SparseTensorType
This revision introduces a MapRef, which will support a future
generalization beyond permutations (e.g. block sparsity). This revision
also unifies the conversion/codegen paths for the sparse_tensor.new
operation from file (eg. the readers). Note that more unification is
planned as well as general affine dim2lvl and lvl2dim (all marked with
TODOs).