This revision adds vector transform operations that allow us to better inspect the composition
of various lowerings that were previously very opaque.
This commit is NFC in that it does not change patterns beyond adding `rewriter.notifyFailure` messages
and it does not change the tests beyond breaking them into pieces and using transforms instead of
throwaway opaque test passes.
Reviewed By: ftynse, springerm
Co-authored-by: Alex Zinenko <zinenko@google.com>
Differential Revision: https://reviews.llvm.org/D146755
Vector dialect patterns have grown enormously in the past year to a point where they are now impenetrable.
Start reorganizing them towards finer-grained control.
Differential Revision: https://reviews.llvm.org/D146736
This is for targets that do not support gather-like ops, e.g., SPIR-V.
Gather is expanded into lower-level vector ops with memory accesses
guarded with `scf.if`.
I also considered generating `vector.maskedload`s, but decided against
it to keep the `memref` and `tensor` codepath closer together. There's a
good chance that if a target doesn't support gather it does not support
masked loads either.
Issue: https://github.com/llvm/llvm-project/issues/60905
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D145942
This pattern is not specific to nvgpu; I intend to use in SPIR-V codegen. `VectorTransforms` seems like a more generally useful place.
In addition:
- Fix a bug in the second condition (the dimensions were swapped for RHS).
- Add tests.
- Add support for externally provided filter functions, similar to other vector transforms.
- Prefer to transpose before zero/sign-extending inputs.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D145638
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This helper handles non trivial cases of broadcast + optional transpose creation
that should not leak to the outside world.
Differential Revision: https://reviews.llvm.org/D139003
This revision refactors and cleans up a bunch of infra related to vector, shapes and indexing into more reusable APIs.
Differential Revision: https://reviews.llvm.org/D138501
Ops such as `%1 = vector.extractelement %0[%pos : index] : vector<96xf32>`.
In case of an extract from a 1D vector, the source vector is distributed. The lane into which the requested position falls, extracts the element and shuffles it to all other lanes.
Differential Revision: https://reviews.llvm.org/D137336
This is useful for breaking down extract_strided_slice and potentially
cancel with other extract / insert ops before or after.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D137471
Quantization method is crucial and ubiqutous in accelerating machine
learning workloads. Most of these methods uses f16 and i8 types.
This patch relaxes the type contraints on warp reduce distribution to
allow these types. Furthermore, this patch also changed the interface
and moved the initial reduction of data to a single thread into the
distributedReductionFn, this gives flexibility for developers to control
how they are obtaining the initial lane value, which might differ based
on the input types. (i.e to shuffle 32-width type, we need to reduce f16
to 2xf16 types rather than a single element).
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D137691
When a value used in the forOp is defined outside the region but within
the parent warpOp we need to return and distribute the value to pass it
to new operations created within the loop.
Also simplify the lambda interface.
Differential Revision: https://reviews.llvm.org/D137146
This allows for incrementally updating the old API usages without
needing to update everything at once. These will be left on Both
for a little bit and then flipped to prefixed when all APIs have been
updated.
Differential Revision: https://reviews.llvm.org/D134386
This aligns the SCF dialect file layout with the majority of the dialects.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D128049
Make the reduction distribution pattern more generic and remove layering
problem. The new pattern to distribute reduction is now independent of
GPU and takes a lamdba to decide how the distributed reduction should be
generated.
Differential Revision: https://reviews.llvm.org/D127867
Add a pattern to do ad hoc lowering of vector.reduction to a sequence of
warp shuffles. This allow distributing reduction on a warp for GPU targets.
Also add an execution test for warp reduction.
co-authored with @springerm
Differential Revision: https://reviews.llvm.org/D127176
Add patterns to propagate vector distribution and remove dead
arguments. This handles propagation for several vector operations.
recommit after minor bug fix.
Differential Revision: https://reviews.llvm.org/D127167
Add patterns to propagate vector distribution and remove dead
arguments. This handles propagation for several vector operations.
Differential Revision: https://reviews.llvm.org/D127167
Add pattern to hoist scalar code outside of warp distribute region as
those cannot be distributed and we would want to execute them on all
the lanes.
Add patterns to distribute transfer_write ops. Those operations can be
distributed in different ways and it is control by user.
Differential Revision: https://reviews.llvm.org/D127152
Fixed issue with vector.contract default unroll permutation.
Adds support for vector unroll transformations to unroll in different
orders. For example, the vector.contract can be unrolled into a
smaller set of contractions. There is a choice of how to unroll the
decomposition based on the traversal order of (dim0, dim1, dim2).
The choice of traversal order can now be specified by a callback which
given by the caller of the transform. For now, only the
vector.contract, vector.transfer_read/transfer_write operations
support the callback.
Differential Revision: https://reviews.llvm.org/D127004
Reverts commit 1469ebf8382107e0344173f362b690d19e24029d (original commit)
Reverts commit a392a39f75af586e3d3cd046a8361939277e067f (build fix for above commit)
The commit broke tests in out-of-tree projects, indicating that some logical
error was made in the previous change but not covered by current tests.
Adds supprot for vector unroll transformations to unroll in different
orders. For example, the `vector.contract` can be unrolled into a
smaller set of contractions. There is a choice of how to unroll the
decomposition based on the traversal order of (dim0, dim1, dim2).
The choice of traversal order can now be specified by a callback which
given by the caller of the transform. For now, only the
`vector.contract`, `vector.transfer_read/transfer_write` operations
support the callback.
Differential Revision: https://reviews.llvm.org/D127004
Add lowering for cases where the reduction dimension is fully unrolled.
It is common to unroll the reduction dimension, therefore we would want
to lower the contractions to an elementwise vector op in this case.
Differential Revision: https://reviews.llvm.org/D126120
Add lowering of the vector.warp_execute_on_lane_0 into scf.if plus memory
transfer for the operands and yield values.
This also add an integration test running on GPU warp. The same tests can be
later re-used with different comment lines to tests distribution
transformations.
This is mostly from @springerm contribution.
Differential Revision: https://reviews.llvm.org/D125430
Support unrolling for vector.transpose following the same interface as
other vector unrolling ops.
Differential Revision: https://reviews.llvm.org/D123688
This commit restructures how TypeID is implemented to ideally avoid
the current problems related to shared libraries. This is done by changing
the "implicit" fallback path to use the name of the type, instead of using
a static template variable (which breaks shared libraries). The major downside to this
is that it adds some additional initialization costs for the implicit path. Given the
use of type names for uniqueness in the fallback, we also no longer allow types
defined in anonymous namespaces to have an implicit TypeID. To simplify defining
an ID for these classes, a new `MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID` macro
was added to allow for explicitly defining a TypeID directly on an internal class.
To help identify when types are using the fallback, `-debug-only=typeid` can be
used to log which types are using implicit ids.
This change generally only requires changes to the test passes, which are all defined
in anonymous namespaces, and thus can't use the fallback any longer.
Differential Revision: https://reviews.llvm.org/D122775
ListOption currently uses llvm:🆑:list under the hood, but the usages
of ListOption are generally a tad different from llvm:🆑:list. This
commit codifies this by making ListOption implicitly comma separated,
and removes the explicit flag set for all of the current list options.
The new parsing for comma separation of ListOption also adds in support
for skipping over delimited sub-ranges (i.e. {}, [], (), "", ''). This
more easily supports nested options that use those as part of the
format, and this constraint (balanced delimiters) is already codified
in the syntax of pass pipelines.
See https://discourse.llvm.org/t/list-of-lists-pass-option/5950 for
related discussion
Differential Revision: https://reviews.llvm.org/D122879
This has been on _Both for a couple of weeks. Flip usages in core with
intention to flip flag to _Prefixed in follow up. Needed to add a couple
of helper methods in AffineOps and Linalg to facilitate a pure flag flip
in follow up as some of these classes are used in templates and so
sensitive to Vector dialect changes.
Differential Revision: https://reviews.llvm.org/D122151
This commit moves FuncOp out of the builtin dialect, and into the Func
dialect. This move has been planned in some capacity from the moment
we made FuncOp an operation (years ago). This commit handles the
functional aspects of the move, but various aspects are left untouched
to ease migration: func::FuncOp is re-exported into mlir to reduce
the actual API churn, the assembly format still accepts the unqualified
`func`. These temporary measures will remain for a little while to
simplify migration before being removed.
Differential Revision: https://reviews.llvm.org/D121266
Implement the vectorLoopUnroll interface for MultiDimReduceOp and add a
pattern to do the unrolling following the same interface other vector
unroll patterns.
Differential Revision: https://reviews.llvm.org/D121263
The last remaining operations in the standard dialect all revolve around
FuncOp/function related constructs. This patch simply handles the initial
renaming (which by itself is already huge), but there are a large number
of cleanups unlocked/necessary afterwards:
* Removing a bunch of unnecessary dependencies on Func
* Cleaning up the From/ToStandard conversion passes
* Preparing for the move of FuncOp to the Func dialect
See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061
Differential Revision: https://reviews.llvm.org/D120624
This reduces the dependencies of the MLIRVector target and makes the dialect consistent with other dialects.
Differential Revision: https://reviews.llvm.org/D118533
This patch adds the vector.scan op which computes the
scan for a given n-d vector. It requires specifying the operator,
the identity element and whether the scan is inclusive or
exclusive.
TEST: Added test in ops.mlir
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D117171