Fixing RemoveDeadValues to properly remove arguments from
BranchOpInterface operations.
This is a follow-up for:
https://github.com/llvm/llvm-project/pull/117405
cc: @joker-eph @codemzs
---------
Co-authored-by: Renat Idrisov <parsifal-47@users.noreply.github.com>
This PR fixes a bug in `RemoveDeadValues` where the
`FunctionOpInterface` does not have the `isDeclaration` method. As a
result, we should use the `isExternal` method instead. Fixes#116347.
This commit adds a new `matchAndRewrite` overload to `ConversionPattern`
to support 1:N replacements. This is the first of two main PRs that
merge the 1:1 and 1:N dialect conversion drivers.
The existing `matchAndRewrite` function supports only 1:1 replacements,
as can be seen from the `ArrayRef<Value>` parameter.
```c++
LogicalResult ConversionPattern::matchAndRewrite(
Operation *op, ArrayRef<Value> operands /*adaptor values*/,
ConversionPatternRewriter &rewriter) const;
```
This commit adds a `matchAndRewrite` overload that is called by the
dialect conversion driver. By default, this new overload dispatches to
the original 1:1 `matchAndRewrite` implementation. Existing
`ConversionPattern`s do not need to be changed as long as there are no
1:N type conversions or value replacements.
```c++
LogicalResult ConversionPattern::matchAndRewrite(
Operation *op, ArrayRef<ValueRange> operands /*adaptor values*/,
ConversionPatternRewriter &rewriter) const {
// Note: getOneToOneAdaptorOperands produces a fatal error if at least one
// ValueRange has 0 or more than 1 value.
return matchAndRewrite(op, getOneToOneAdaptorOperands(operands), rewriter);
}
```
The `ConversionValueMapping`, which keeps track of value replacements
and materializations, still does not support 1:N replacements. We still
rely on argument materializations to convert N replacement values back
into a single value. The `ConversionValueMapping` will be generalized to
1:N mappings in the second main PR.
Before handing the adaptor values to a `ConversionPattern`, all argument
materializations are "unpacked". The `ConversionPattern` receives N
replacement values and does not see any argument materializations. This
implementation strategy allows us to use the 1:N infrastructure/API in
`ConversionPattern`s even though some functionality is still missing in
the driver. This strategy was chosen to keep the sizes of the PRs
smaller and to make it easier for downstream users to adapt to API
changes.
This commit also updates the the "decompose call graphs" transformation
and the "sparse tensor codegen" transformation to use the new 1:N
`ConversionPattern` API.
Note for LLVM conversion: If you are using a type converter with 1:N
type conversion rules or if your patterns are performing 1:N
replacements (via `replaceOpWithMultiple` or
`applySignatureConversion`), conversion pattern applications will start
failing (fatal LLVM error) with this error message: `pattern 'name' does
not support 1:N conversion`. The name of the failing pattern is shown in
the error message. These patterns must be updated to the new 1:N
`matchAndRewrite` API.
When an unresolved materialization (`unrealized_conversion_cast` op) is
rolled back, the mapping should be rolled back as well, regardless of
whether it is a source, target or argument materialization. Otherwise,
we accumulate pointers to erased IR in the `mapping`. This is harmless
in most cases, but can cause issues when a new operation is allocated at
the same memory location and the pointer is "reused".
It is not possible to write a test case for this because I cannot
trigger the pointer reuse programmatically.
This commit adds an extra assertion to `applySignatureConversion` to
prevent incorrect API usage: The same block cannot be converted multiple
times. That would mess with the underlying conversion value mapping.
(Mappings would be overwritten.) This is similar to op replacements: The
same op cannot be replaced multiple times.
To simplify the check, `BlockTypeConversionRewrite::block` now stores
the original block. The new block is stored in an extra field. (It used
to be the other way around.)
This commit is in preparation of adding 1:N support to the conversion
value mapping. Before making any further changes to the mapping
infrastructure, I'd like to make sure that the code base around it (that
uses the mapping) is robust.
This change removes the restriction on `SymbolUserOpInterface` operators
so they can be used with operators that implement `SymbolOpInterface`,
example:
`memref.global` implements `SymbolOpInterface` so it can be used with
`memref.get_global` which implements `SymbolUserOpInterface`
```
// Define a global constant array
memref.global "private" constant @global_array : memref<10xi32> = dense<[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]> : tensor<10xi32>
// Access this global constant within a function
func @use_global() {
%0 = memref.get_global @global_array : memref<10xi32>
}
```
Reference: https://github.com/llvm/llvm-project/pull/116519 and
https://discourse.llvm.org/t/question-on-criteria-for-acceptable-ir-in-removedeadvaluespass/83131
---------
Co-authored-by: Zeeshan Siddiqui <mzs@ntdev.microsoft.com>
This change is related to discussion:
https://discourse.llvm.org/t/question-on-criteria-for-acceptable-ir-in-removedeadvaluespass/83131
I do not know the original reason to disallow the optimization on
modules with global private constant. Please let me know what am I
missing, I will be happy to make it better. Thank you!
CC: @Wheest
---------
Co-authored-by: Renat Idrisov <parsifal-47@users.noreply.github.com>
This is a re-upload of #116934, which was reverted.
The dialect conversion driver has three phases:
- **Create** `IRRewrite` objects as the IR is traversed.
- **Finalize** `IRRewrite` objects. During this phase, source
materializations for mismatching value types are created. (E.g., when
`Value` is replaced with a `Value` of different type, but there is a
user of the original value that was not modified because it is already
legal.)
- **Commit** `IRRewrite` objects. During this phase, all remaining IR
modifications are materialized. In particular, SSA values are actually
being replaced during this phase.
This commit removes the "finalize" phase. This simplifies the code base
a bit and avoids one traversal over the `IRRewrite` stack. Source
materializations are now built during the "commit" phase, right before
an SSA value is being replaced.
This commit also removes the "inverse mapping" of the conversion value
mapping, which was used to predict if an SSA value will be dead at the
end of the conversion. This check is replaced with an approximate check
that does not require an inverse mapping. (A false positive for `v` can
occur if another value `v2` is mapped to `v` and `v2` turns out to be
dead at the end of the conversion. This case is not expected to happen
very often.) This reduces the complexity of the driver a bit and removes
one potential source of bugs. (There have been bugs in the usage of the
inverse mapping in the past.)
`BlockTypeConversionRewrite` no longer stores a pointer to the type
converter. This pointer is now stored in `ReplaceBlockArgRewrite`.
This commit is in preparation of merging the 1:1 and 1:N dialect
conversion driver. It simplifies the upcoming changes around the
conversion value mapping. (API surface of the conversion value mapping
is reduced.)
The dialect conversion driver has three phases:
- **Create** `IRRewrite` objects as the IR is traversed.
- **Finalize** `IRRewrite` objects. During this phase, source
materializations for mismatching value types are created. (E.g., when
`Value` is replaced with a `Value` of different type, but there is a
user of the original value that was not modified because it is already
legal.)
- **Commit** `IRRewrite` objects. During this phase, all remaining IR
modifications are materialized. In particular, SSA values are actually
being replaced during this phase.
This commit removes the "finalize" phase. This simplifies the code base
a bit and avoids one traversal over the `IRRewrite` stack. Source
materializations are now built during the "commit" phase, right before
an SSA value is being replaced.
This commit also removes the "inverse mapping" of the conversion value
mapping, which was used to predict if an SSA value will be dead at the
end of the conversion. This check is replaced with an approximate check
that does not require an inverse mapping. (A false positive for `v` can
occur if another value `v2` is mapped to `v` and `v2` turns out to be
dead at the end of the conversion. This case is not expected to happen
very often.) This reduces the complexity of the driver a bit and removes
one potential source of bugs. (There have been bugs in the usage of the
inverse mapping in the past.)
`BlockTypeConversionRewrite` no longer stores a pointer to the type
converter. This pointer is now stored in `ReplaceBlockArgRewrite`.
This commit is in preparation of merging the 1:1 and 1:N dialect
conversion driver. It simplifies the upcoming changes around the
conversion value mapping. (API surface of the conversion value mapping
is reduced.)
This commit adds a new function
`ConversionPatternRewriter::replaceOpWithMultiple`. This function is
similar to `replaceOp`, but it accepts multiple `ValueRange`
replacements, one per op result.
Note: This function is not an overload of `replaceOp` because of
ambiguous overload resolution that would make the API difficult to use.
This commit aligns "block signature conversions" with "op replacements":
both support 1:N replacements now. Due to incomplete 1:N support in the
dialect conversion driver, an argument materialization is inserted when
an SSA value is replaced with multiple values; same as block signature
conversions already work around the problem. These argument
materializations are going to be removed in a subsequent commit that
adds full 1:N support. The purpose of this PR is to add missing features
gradually in small increments.
This commit also updates two MLIR transformations that have their custom
workarounds around missing 1:N support. These can already start using
`replaceOpWithMultiple`.
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
This is intended as a fast pattern rewrite driver for the cases when a
simple walk gets the job done but we would still want to implement it in
terms of rewrite patterns (that can be used with the greedy pattern
rewrite driver downstream).
The new driver is inspired by the discussion in
https://github.com/llvm/llvm-project/pull/112454 and the LLVM Dev
presentation from @matthias-springer earlier this week.
This limitation comes with some limitations:
* It does not repeat until a fixpoint or revisit ops modified in place
or newly created ops. In general, it only walks forward (in the
post-order).
* `matchAndRewrite` can only erase the matched op or its descendants.
This is verified under expensive checks.
* It does not perform folding / DCE.
We could probably relax some of these in the future without sacrificing
too much performance.
This commit changes the format of the materialization error message.
Previously: `failed to legalize unresolved materialization from ('f64')
to 'f32' that remained live after conversion`
Now: `failed to legalize unresolved materialization from ('f64') to
('f32') that remained live after conversion`
This commit is in preparation of merging the 1:1 and 1:N dialect
conversions. At that point, target materializations may create more than
one SSA value. I am sending this change as a separate PR to keep the
main PR smaller.
This commit adds extra checks/assertions to the
`ConversionPatternRewriterImpl::notifyOpReplaced` to improve its
robustness.
1. Replacing an `unrealized_conversion_cast` op that was created by the
driver is now forbidden and caught early during `replaceOp`. It may work
in some cases, but it is generally dangerous because the conversion
driver keeps track of these ops and performs some extra legalization
steps during the "finalize" phase. (Erasing is them is fine.)
2. `null` replacement values are no longer registered in the
`ConversionValueMapping`. This was an oversight in #106760. There is no
benefit in having `null` values in the `ConversionValueMapping`. (It may
even cause problems.)
This change is in preparation of merging the 1:1 and 1:N dialect
conversion drivers.
The 1:N type converter derived from the 1:1 type converter and extends
it with 1:N target materializations. This commit merges the two type
converters and stores 1:N target materializations in the 1:1 type
converter. This is in preparation of merging the 1:1 and 1:N dialect
conversion infrastructures.
1:1 target materializations (producing a single `Value`) will remain
valid. An additional API is added to the type converter to register 1:N
target materializations (producing a `SmallVector<Value>`). Internally,
all target materializations are stored as 1:N materializations.
The 1:N type converter is removed.
Note for LLVM integration: If you are using the `OneToNTypeConverter`,
simply switch all occurrences to `TypeConverter`.
---------
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
This commit simplifies the result type of materialization functions.
Previously: `std::optional<Value>`
Now: `Value`
The previous implementation allowed 3 possible return values:
- Non-null value: The materialization function produced a valid
materialization.
- `std::nullopt`: The materialization function failed, but another
materialization can be attempted.
- `Value()`: The materialization failed and so should the dialect
conversion. (Previously: Dialect conversion can roll back.)
This commit removes the last variant. It is not particularly useful
because the dialect conversion will fail anyway if all other
materialization functions produced `std::nullopt`.
Furthermore, in contrast to type conversions, at least one
materialization callback is expected to succeed. In case of a failing
type conversion, the current dialect conversion can roll back and try a
different pattern. This also used to be the case for materializations,
but that functionality was removed with #107109: failed materializations
can no longer trigger a rollback. (They can just make the entire dialect
conversion fail without rollback.) With this in mind, it is even less
useful to have an additional error state for materialization functions.
This commit is in preparation of merging the 1:1 and 1:N type
converters. Target materializations will have to return multiple values
instead of a single one. With this commit, we can keep the API simple:
`SmallVector<Value>` instead of `std::optional<SmallVector<Value>>`.
Note for LLVM integration: All 1:1 materializations should return
`Value` instead of `std::optional<Value>`. Instead of `std::nullopt`
return `Value()`.
This commit adds an optional `originalType` parameter to target
materialization functions. Without this parameter, target
materializations are underspecified.
Note: `originalType` is only needed for target materializations.
Source/argument materializations do not have it.
Consider the following example: Let's assume that a conversion pattern
"P1" replaced an SSA value "v1" (type "t1") with "v2" (type "t2"). Then
a different conversion pattern "P2" matches an op that has "v1" as an
operand. Let's furthermore assume that "P2" determines that the
legalized type of "t1" is "t3", which may be different from "t2". In
this example, the target materialization callback will be invoked with:
outputType = "t3", inputs = "v2", originalType = "t1". Note that the
original type "t1" cannot be recovered from just "t3" and "v2"; that's
why the `originalType` parameter is added.
This change is in preparation of merging the 1:1 and 1:N dialect
conversion drivers. As part of that change, argument materializations
will be removed (as they are no longer needed; they were just a
workaround because of missing 1:N support in the dialect conversion).
The new `originalType` parameter is needed when lowering MemRef to LLVM.
During that lowering, MemRef function block arguments are replaced with
the elements that make up a MemRef descriptor. The type converter is set
up in such a way that the legalized type of a MemRef type is an
`!llvm.struct` that represents the MemRef descriptor. When the bare
pointer calling convention is enabled, the function block arguments
consist of just an LLVM pointer. In such a case, a target
materialization will be invoked to construct a MemRef descriptor (output
type = `!llvm.struct<...>`) from just the bare pointer (inputs =
`!llvm.ptr`). The original MemRef type is required to construct the
MemRef descriptor, as static sizes/strides/offset cannot be inferred
from just the bare pointer.
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.
Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate pattern now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.
Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
This commit changes the implementation of analysis conversions, so that
no rollback is needed at the end of the analysis. Instead, the dialect conversion is run on a clone of the IR.
The purpose of this commit is to reduce the number of rollbacks in the
dialect conversion framework. (Long term goal: Remove rollback
functionality entirely.)
Fixes#107870.
We can allow the enclosing Module operation to have a symbol.
The check was likely originally not considering this case and intended
to catch symbols inside the region, not accounting that the walk would
visit the enclosing operation.
This commit simplifies the signature of `OperationConverter::finalize`.
This function always returns "success", so the return value can be
removed.
Note: Previously, this function used to return "failure" if a
materialization failed to legalize. This is now optional and happening
at a later point of time (see `config.buildMaterializations`).
There are some spurious libraries which can be removed.
I'm trying to bundle MLIR/LLVM library dependencies for our own
libraries. We're utilizing cmake function to recursively collect
MLIR/LLVM related dependencies. However, we identified certain library
dependencies as redundant and safe for removal.
Simplify the nesting structure of "if" checks in `remapValues` and
update the code comments.
This is what the comments stated in case there is no type converter:
```
// TODO: What we should do here is just set `desiredType` to `origType`
// and then handle the necessary type conversions after the conversion
// process has finished. Unfortunately a lot of patterns currently rely on
// receiving the new operands even if the types change, so we keep the
// original behavior here for now until all of the patterns relying on
// this get updated.
```
However, without a type converter it is not possible to perform any
materializations. Furthermore, the absence of a type converter indicates
that the pattern does not care about type legality. Therefore, the
current implementation is correct and this TODO can be removed.
Note: Patterns that actually require a remapped type to match the
original operand type can be equipped with a type converter that maps
each type to itself.
This TODO is outdated:
```
// TODO: There currently isn't any mechanism to do 1->N type conversion
// via the PatternRewriter replacement API, so for now we just ignore it.
```
1->N type conversions are already possible as part of block signature
conversions. It is incorrect to just ignore such cases. However, there
is currently no better way to handle 1->N conversions in this function
because of infrastructure limitations. This is now clarified in the
comments.
Remove a redundant `lookupOrDefault` that has no effect.
When no type is passed to `lookupOrDefault`, that function returns the
furthest mapped value (by following the mapping iteratively). If there
is no mapped value with the desired type, then the function also returns
the furthest mapped value.
The value that was passed to the redundant `lookupOrDefault` was
produced by this code:
```
Value newOperand = mapping.lookupOrDefault(operand, desiredType);
```
There are 2 possible cases:
- Case 1: There is no mapping to `desiredType`. Then `newOperand` is the
furthest mapped value.
- Case 2: There is a mapping to `desiredType`. Then the type of
`newOperand` is `desiredType` and the "if" branch that encloses the
redundant `lookupOrDefault` is not executed at all.
Also improve the documentation of
`ConversionValueMapping::lookupOrDefault` and simplify the
implementation a bit.
PR #106760 aligned the handling of dropped block arguments and dropped
op results. The two helper functions that insert source materializations
for uses of replaced block arguments / op results that survived the
conversion are now almost identical (`legalizeConvertedArgumentTypes`
and `legalizeConvertedOpResultTypes`). This PR merges the two functions
and moves the implementation directly into `finalize`.
This PR simplifies the code base and improves the efficiency a bit:
previously, `finalize` iterated over
`ConversionPatternRewriterImpl::rewrites` twice. Now, only one iteration
is needed.
---------
Co-authored-by: Jakub Kuderski <jakub@nod-labs.com>
* Strip calls to raw_string_ostream::flush(), which is essentially a no-op
* Strip unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
The dialect conversion maintains a set of unresolved materializations
(`UnrealizedConversionCastOp`). Turn that set into a `DenseMap` that
maps from ops to `UnresolvedMaterializationRewrite *`. This improves
efficiency a bit, because an iteration over
`ConversionPatternRewriterImpl::rewrites` can be avoided.
Also delete some dead code.
Handle dropped block arguments and dropped op results in the same way:
build a source materialization (that may fold away if unused). This
simplifies the code base a bit and makes it possible to merge
`legalizeConvertedArgumentTypes` and `legalizeConvertedOpResultTypes` in
a future commit. These two functions are almost doing the same thing
now.
As a side effect, this commit also changes the dialect conversion such
that temporary circular cast ops are no longer generated. (There was a
workaround in #107109 that can now be removed again.) Example:
```
%0 = "builtin.unrealized_conversion_cast"(%1) : (!a) -> !b
%1 = "builtin.unrealized_conversion_cast"(%0) : (!b) -> !a
// No further uses of %0, %1.
```
This happened when:
1. An op was erased. (No replacement values provided.)
2. A conversion pattern for another op builds a replacement value for
the erased op's results (first cast op) during `remapValues`, but that
SSA value is not used during the pattern application.
3. During the finalization phase, `legalizeConvertedOpResultTypes`
thinks that the erased op is alive because of the cast op that was built
in Step 2. It builds a cast from that replacement value to the original
type.
4. During the commit phase, all uses of the original op are replaced
with the casted value produced in Step 3. We have generated circular IR.
This problem can be avoided by making sure that source materializations
are generated for all dropped results. This ensures that we always have
some replacement SSA value in the mapping. Previously, we sometimes had
a value mapped and sometimes not. (No more special casing is needed
anymore to distinguish between "value dropped" or "value replaced with
SSA value".)
This commit makes source/target/argument materializations (via the
`TypeConverter` API) optional.
By default (`ConversionConfig::buildMaterializations = true`), the
dialect conversion infrastructure tries to legalize all unresolved
materializations right after the main transformation process has
succeeded. If at least one unresolved materialization fails to resolve,
the dialect conversion fails. (With an error message such as `failed to
legalize unresolved materialization ...`.) Automatic materializations
through the `TypeConverter` API can now be deactivated. In that case,
every unresolved materialization will show up as a
`builtin.unrealized_conversion_cast` op in the output IR.
There used to be a complex and error-prone analysis in the dialect
conversion that predicted the future uses of unresolved
materializations. Based on that logic, some casts (that were deemed to
unnecessary) were folded. This analysis was needed because folding
happened at a point of time when some IR changes (e.g., op replacements)
had not materialized yet.
This commit removes that analysis. Any folding of cast ops now happens
after all other IR changes have been materialized and the uses can
directly be queried from the IR. This simplifies the analysis
significantly. And certain helper data structures such as
`inverseMapping` are no longer needed for the analysis. The folding
itself is done by `reconcileUnrealizedCasts` (which also exists as a
standalone pass).
After casts have been folded, the remaining casts are materialized
through the `TypeConverter`, as usual. This last step can be deactivated
in the `ConversionConfig`.
`ConversionConfig::buildMaterializations = false` can be used to debug
error messages such as `failed to legalize unresolved materialization
...`. (It is also useful in case automatic materializations are not
needed.) The materializations that failed to resolve can then be seen as
`builtin.unrealized_conversion_cast` ops in the resulting IR. (This is
better than running with `-debug`, because `-debug` shows IR where some
IR changes have not been materialized yet.)
Note: This is a reupload of #104668, but with correct handling of cyclic
unrealized_conversion_casts that may be generated by the dialect
conversion.
This fixes#94520 by ensuring that any if any block arguments are being
used outside of the original block that the block is not considered a
candidate for merging.
More details: the root cause of the issue described in #94520 was that
`^bb2` and `^bb5` were being merged despite `%4` (an argument to `^bb2`)
was being used later in `^bb7`. When the block merge occurred, that
unintentionally changed the value of `%4` for all downstream code. This
change prevents that from happening.
This patch deprecates DenseMap::getOrInsertDefault in favor of
DenseMap::operator[], which does the same thing, has been around
longer, and is also a household name as part of std::map and
std::unordered_map.
Note that DenseMap provides several equivalent ways to insert or
default-construct a key-value pair:
- operator[Key]
- try_emplace(Key).first->second
- getOrInsertDefault(Key)
- FindAndConstruct(Key).second