Interpret an option value with multiple values, either in the form of an
`ArrayAttr` (either static or passed through a param) or as the multiple
attrs associated to a param, as a comma-separated list, i.e. as a
ListOption on a pass.
Improve ApplyRegisteredPassOp's support for taking options by taking
them as a dict (vs a list of string-valued key-value pairs).
Values of options are provided as either static attributes or as params
(which pass in attributes at interpreter runtime). In either case, the
keys and value attributes are converted to strings and a single
options-string, in the format used on the commandline, is constructed to
pass to the `addToPipeline`-pass API.
Makes it possible to pass around the options to a pass inside a schedule.
The refactoring also makes it so that the pass manager and pass are only
constructed once per `apply()` of the transform op versus for each target
payload given to the op's `apply()`.
This is similar to other configuration objects used across MLIR.
Rename some fields to better reflect that they are no longer booleans.
Reland 04d261101b4f229189463136a794e3e362a793af / #132253.
Print operations are often used for debugging, immediately before the
compiler aborts. In such cases, it is sometimes possible that the output
isn't fully produced yet. Make sure it is by explicitly flushing the
output.
The greedy rewriter is used in many different flows and it has a lot of
convenience (work list management, debugging actions, tracing, etc). But
it combines two kinds of greedy behavior 1) how ops are matched, 2)
folding wherever it can.
These are independent forms of greedy and leads to inefficiency. E.g.,
cases where one need to create different phases in lowering and is
required to applying patterns in specific order split across different
passes. Using the driver one ends up needlessly retrying folding/having
multiple rounds of folding attempts, where one final run would have
sufficed.
Of course folks can locally avoid this behavior by just building their
own, but this is also a common requested feature that folks keep on
working around locally in suboptimal ways.
For downstream users, there should be no behavioral change. Updating
from the deprecated should just be a find and replace (e.g., `find ./
-type f -exec sed -i
's|applyPatternsAndFoldGreedily|applyPatternsGreedily|g' {} \;` variety)
as the API arguments hasn't changed between the two.
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
This PR addresses a [comment] made by @ftynse about the syntax for
`ForeachOp`. The syntax was modified by @muneebkhan85 in #82792, where
the attribute dictionary was moved to the middle.
This patch moves it back to its original place at the end. And
introduces an optional keyword for `zip_shortest`.
[comment]:
https://github.com/llvm/llvm-project/pull/82792#pullrequestreview-2132814144
This patch enables continuous tiling of a target structured op using
diminishing tile sizes. In cases where the tensor dimensions are not
exactly divisible by the tile size, we are left with leftover tensor
chunks that are irregularly tiled. This approach enables tiling of the
leftover chunk with a smaller tile size and repeats this process
recursively using exponentially diminishing tile sizes. This eventually
generates a chain of loops that apply tiling using diminishing tile
sizes.
Adds `continuous_tile_sizes` op to the transform dialect. This op, when
given a tile size and a dimension, computes a series of diminishing tile
sizes that can be used to tile the target along the given dimension.
Additionally, this op also generates a series of chunk sizes that the
corresponding tile sizes should be applied to along the given dimension.
Adds `multiway` attribute to `transform.structured.split` that enables
multiway splitting of a single target op along the given dimension, as
specified in a list enumerating the chunk sizes.
This patch adds more precise side effects to the current ops with memory
effects, allowing us to determine which OpOperand/OpResult/BlockArgument
the
operation reads or writes, rather than just recording the reading and
writing
of values. This allows for convenient use of precise side effects to
achieve
analysis and optimization.
Related discussions:
https://discourse.llvm.org/t/rfc-add-operandindex-to-sideeffect-instance/79243
Changes transform.foreach's interface to take multiple arguments, e.g.
transform.foreach %ops1, %ops2, %params : ... { ^bb0(%op1, %op2,
%param): BODY } The semantics are that the payloads for these handles
get iterated over as if the payloads have been zipped-up together - BODY
gets executed once for each such tuple. The documentation explains that
this implementation requires that the payloads have the same length.
This change also enables the target argument(s) to be any op/value/param
handle.
The added test cases demonstrate some use cases for this change.
A variable `typeConverterOp` may be nullptr after dynamic cast. There is
a security guard for this, but during logging error message the variable
getting dereferenced.
Found with static analysis.
It may be useful to have access to additional handles or parameters when
performing matches and actions in `foreach_match`, for example, to
parameterize the matcher by rank or restrict it in a non-trivial way.
Enable `foreach_match` to forward additional handles from operands to
matcher symbols and from action symbols to results.
Introduce 3 new optional attributes to the `transform.print` ops:
* `assume_verified`
* `use_local_scope`
* `skip_regions`
The primary motivation is to allow printing on large inputs that
otherwise take forever to print and verify. For the full context, see
this IREE issue: https://github.com/openxla/iree/issues/16901.
Also add some tests and fix the op description.
The original implementation was eagerly reporting silenceable failures
from actions as definite failures. Since silenceable failures are
intended for cases when the IR has not been irreversibly modified, it's
okay to propagate them as silenceable failures of the parent op.
Fixes#86834.
Transform op trait verification calls `getEffects`, and since trait
verification runs before op verification, this call cannot assume the op
to be valid. However, the operand getters now return a `TypedValue` that
unconditionally casts the value to the expected type, leading to an
assertion failure. Use the untyped mechanism instead.
Fixes#84701.
Transform interfaces are implemented, direction or via extensions, in
libraries belonging to multiple other dialects. Those dialects don't
need to depend on the non-interface part of the transform dialect, which
includes the growing number of ops and transitive dependency footprint.
Split out the interfaces into a separate library. This in turn requires
flipping the dependency from the interface on the dialect that has crept
in because both co-existed in one library. The interface shouldn't
depend on the transform dialect either.
As a consequence of splitting, the capability of the interpreter to
automatically walk the payload IR to identify payload ops of a certain
kind based on the type used for the entry point symbol argument is
disabled. This is a good move by itself as it simplifies the interpreter
logic. This functionality can be trivially replaced by a
`transform.structured.match` operation.
Until now, `transform.apply_conversion_patterns` consumed the target
handle and potentially invalidated handles. This commit adds tracking
functionality similar to `transform.apply_patterns`, such that handles
are no longer invalidated, but updated based on op replacements
performed by the dialect conversion.
This new functionality is hidden behind a `preserve_handles` attribute
for now.
Adds `transform.func.cast_and_call` that takes a set of inputs and
outputs and replaces the uses of those outputs with a call to a function
at a specified insertion point.
The idea with this operation is to allow users to author independent IR
outside of a to-be-compiled module, and then match and replace a slice
of the program with a call to the external function.
Additionally adds a mechanism for populating a type converter with a set
of conversion materialization functions that allow insertion of
casts on the inputs/outputs to and from the types of the function
signature.
Similar to `transform.get_result`, except it returns a handle to the
operand indicated by a positional specification, same as is defined for
the linalg match ops.
Additionally updates `get_result` to take the same positional specification.
This makes the use case of wanting to get all of the results of an
operation easier by no longer requiring the user to reconstruct the list
of results one-by-one.
Introduce a new match combinator into the transform dialect. This
operation collects all operations that are yielded by a satisfactory
match into its results. This is a simpler version of `foreach_match`
that can be inserted directly into existing transform scripts.
Add a new transform operation that creates a new parameter containing the number of payload objects (operations, values or attributes) associated with the argument. This is useful in matching and for debugging purposes. This replaces three ad-hoc operations previously provided by the test extension.
This revision provides the ability to use an arbitrary named sequence op
as
the entry point to a transform dialect strategy.
It is also a step towards better transform dialect usage in pass
pipelines
that need to preload a transform library rather thanparse it on the fly.
The interpreter itself is significantly simpler than its testing
counterpart
by avoiding payload/debug root tags and multiple shared modules.
In the process, the NamedSequenceOp::apply function is adapted to allow
it
being an entry point.
NamedSequenceOp is **not** extended to take the PossibleTopLevelTrait at
this
time, because the implementation of the trait is specific to allowing
one
top-level dangling op with a region such as SequenceOp or
AlternativesOp.
In particular, the verifier of PossibleTopLevelTrait does not allow for
an
empty body, which is necessary to declare a NamedSequenceOp that gets
linked
in separately before application.
In the future, we should dispense with the PossibleTopLevelTrait
altogether
and always enter the interpreter with a NamedSequenceOp.
Lastly, relevant TD linking utilities are moved to
TransformInterpreterUtils
and reused from there.
Same as #66369 but for payload values. (#66369 added checks only for
payload operations.)
It was necessary to change the signature of `getPayloadValues` to return
an iterator. This is now similar to payload operations.
Fixes an issue in #66369 where the `LLVM_ENABLE_ABI_BREAKING_CHECKS`
check was inverted.
transform::PrintOp::build(OpBuilder &builder, OperationState &result,
StringRef name) does not set name correctly. Calling
PrintOp::build(builder, result, "whatever name") is going to end up with
a PrintOp with no name.
This patch fixes it by replicating the approach from tablegen created
code. Refer to
build/mlir/include/mlir/Dialect/Transform/IR/TransformOps.cpp.inc
This `Block` member function introduced in
87d77d3cfb5049b3b3714f95b2e48bbc78d8c5f9 may be misleading to users as
the last operation in the block might have not been registered, so there
would be no way to ensure that is a terminator.
---------
Signed-off-by: Victor Perez <victor.perez@codeplay.com>
The previous implementation crashed if run on a `builtin.module` using
an `op_name` filter (because the initial value of `parent` in the while
loop was a `nullptr`). This PR fixes the crash and adds a test.
This change adds a method to modify the ConversionTarget used during
`transform.apply_conversion_patterns` to the
`ConversionPatternDescriptorOpInterface`. This is needed when the TypeConverter
is used to dictate the dynamic legality of operations, as in "structural"
conversion patterns present in, for example, the SCF and func dialects.
As a first use case/test, this change also adds a
`transform.apply_patterns.scf.structural_conversions` operation to the SCF
dialect.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D158672
Fixes a crash when an op, that is mapped to handle that a
`transform.foreach` iterates over, was erased (through the
`TrackingRewriter`). Erasing an op removes it from all mappings and
invalidates iterators. This is already taken care of when an op is
iterating over payload ops in its `apply` method, but not when another
transform op is erasing a tracked payload op.
Introduce a simple conversion of a memref.alloc/dealloc pair into an
alloca in the same scope. Expose it as a transform op and a pattern.
Allocas typically lower to stack allocations as opposed to alloc/dealloc
that lower to significantly more expensive malloc/free calls. In
addition, this can be combined with allocation hoisting from loops to
further improve performance.
The same transform op can now be used to apply registered pass pipelines.
This revision also adds a helper function for querying `PassPipelineInfo` objects and moves the corresponding `lookup` function for `PassInfo` objects to the `PassInfo` class.
Differential Revision: https://reviews.llvm.org/D159211