This PR adds "value casting", i.e., a mechanism to wrap `ir.Value` in a
proxy class that overloads dunders such as `__add__`, `__sub__`, and
`__mul__` for fun and great profit.
This is thematically similar to
bfb1ba7526
and
9566ee2806.
The example in the test demonstrates the value of the feature (no pun
intended):
```python
@register_value_caster(F16Type.static_typeid)
@register_value_caster(F32Type.static_typeid)
@register_value_caster(F64Type.static_typeid)
@register_value_caster(IntegerType.static_typeid)
class ArithValue(Value):
__add__ = partialmethod(_binary_op, op="add")
__sub__ = partialmethod(_binary_op, op="sub")
__mul__ = partialmethod(_binary_op, op="mul")
a = arith.constant(value=FloatAttr.get(f16_t, 42.42))
b = a + a
# CHECK: ArithValue(%0 = arith.addf %cst, %cst : f16)
print(b)
a = arith.constant(value=FloatAttr.get(f32_t, 42.42))
b = a - a
# CHECK: ArithValue(%1 = arith.subf %cst_0, %cst_0 : f32)
print(b)
a = arith.constant(value=FloatAttr.get(f64_t, 42.42))
b = a * a
# CHECK: ArithValue(%2 = arith.mulf %cst_1, %cst_1 : f64)
print(b)
```
**EDIT**: this now goes through the bindings and thus supports automatic
casting of `OpResult` (including as an element of `OpResultList`),
`BlockArgument` (including as an element of `BlockArgumentList`), as
well as `Value`.
The _mlirRegisterEverything symbol may not be built by some customers. The
code here was intended to support this, but didn't properly initialize the
init_module variable.
This would break JAX with:
NameError: free variable 'init_module' referenced before assignment in enclosing scope
Added missing register_translations in python to replicate the same in
the C-API
Cleaned up the current calls to register passes where the other calls
are already embedded in the mlirRegisterAllPasses.
found here,
https://discourse.llvm.org/t/opencl-example/74187
Linalg currently has these named ops:
* `matmul`
* `matvec`
* `vecmat`
* `batch_matmul`
* `batch_matvec`
But it does not have:
* `batch_vecmat`
This PRs adds that for consistency, and I have a short-term need for it
( https://github.com/openxla/iree/issues/15158 ), so not having this
would cause some contortion on my end.
Currently, `linalg.transpose` and `linalg.broadcast` can't be emitted
through either the C API or the python bindings (which of course go
through the C API). See
https://discourse.llvm.org/t/how-to-build-linalg-transposeop-in-mlir-pybind/73989/10.
The reason is even though they're named ops, there is no opdsl
`@linalg_structured_op` for them and thus while they can be instantiated
they cannot be passed to
[`mlirLinalgFillBuiltinNamedOpRegion`](a7cccb9cbb/mlir/lib/CAPI/Dialect/Linalg.cpp (L18)).
I believe the issue is they both take a `IndexAttrDef` but
`IndexAttrDef` cannot represent dynamic rank. Note, if I'm mistaken and
there is a way to write the `@linalg_structured_op` let me know.
The solution here simply implements the `regionBuilder` interface which
is then picked up by
[`LinalgDialect::addNamedOpBuilders`](7557530f42/mlir/lib/Dialect/Linalg/IR/LinalgDialect.cpp (L116)).
Extension classes are added "by hand" that mirror the API of the
`@linalg_structured_op`s. Note, the extension classes are added to to
`dialects/linalg/__init__.py` instead of
`dialects/linalg/opdsl/ops/core_named_ops.py` in order that they're not
confused for opdsl generators/emitters.
This PR replaces the mixin `OpView` extension mechanism with the
standard inheritance mechanism.
Why? Firstly, mixins are not very pythonic (inheritance is usually used
for this), a little convoluted, and too "tight" (can only be used in the
immediately adjacent `_ext.py`). Secondly, it (mixins) are now blocking
are correct implementation of "value builders" (see
[here](https://github.com/llvm/llvm-project/pull/68764)) where the
problem becomes how to choose the correct base class that the value
builder should call.
This PR looks big/complicated but appearances are deceiving; 4 things
were needed to make this work:
1. Drop `skipDefaultBuilders` in
`OpPythonBindingGen::emitDefaultOpBuilders`
2. Former mixin extension classes are converted to inherit from the
generated `OpView` instead of being "mixins"
a. extension classes that simply were calling into an already generated
`super().__init__` continue to do so
b. (almost all) extension classes that were calling `self.build_generic`
because of a lack of default builder being generated can now also just
call `super().__init__`
3. To handle the [lone single
use-case](https://sourcegraph.com/search?q=context%3Aglobal+select_opview_mixin&patternType=standard&sm=1&groupBy=repo)
of `select_opview_mixin`, namely
[linalg](https://github.com/llvm/llvm-project/blob/main/mlir/python/mlir/dialects/_linalg_ops_ext.py#L38),
only a small change was necessary in `opdsl/lang/emitter.py` (thanks to
the emission/generation of default builders/`__init__`s)
4. since the `extend_opview_class` decorator is removed, we need a way
to register extension classes as the desired `OpView` that `op.opview`
conjures into existence; so we do the standard thing and just enable
replacing the existing registered `OpView` i.e.,
`register_operation(_Dialect, replace=True)`.
Note, the upgrade path for the common case is to change an extension to
inherit from the generated builder and decorate it with
`register_operation(_Dialect, replace=True)`. In the slightly more
complicated case where `super().__init(self.build_generic(...))` is
called in the extension's `__init__`, this needs to be updated to call
`__init__` in `OpView`, i.e., the grandparent (see updated docs).
Note, also `<DIALECT>_ext.py` files/modules will no longer be automatically loaded.
Note, the PR has 3 base commits that look funny but this was done for
the purpose of tracking the line history of moving the
`<DIALECT>_ops_ext.py` class into `<DIALECT>.py` and updating (commit
labeled "fix").
The reason I want this is that I am writing my own Python bindings and
would like to use the insertion point from
`PyThreadContextEntry::getDefaultInsertionPoint()` to call C++ functions
that take an `OpBuilder` (I don't need to expose it in Python but it
also seems appropriate). AFAICT, there is currently no way to translate
a `PyInsertionPoint` into an `OpBuilder` because the operation is
inaccessible.
This PR creates the necessary files to support bindings for operations
in the affine dialect.
This is the first of many PRs which will progressively introduce
affine.load, affine.for, etc operations. I would like to
acknowledge the work by Nelli's author @makslevental :
https://github.com/makslevental/nelli/blob/main/nelli/mlir/affine/affine.py
which jump-starts the work.
This PR adds the additional generation of what I'm calling "value
builders" (a term I'm not married to) that look like this:
```python
def empty(sizes, element_type, *, loc=None, ip=None):
return get_result_or_results(tensor.EmptyOp(sizes=sizes, element_type=element_type, loc=loc, ip=ip))
```
which instantiates a `tensor.EmptyOp` and then immediately grabs the
result (`OpResult`) and then returns that *instead of a handle to the
op*.
What's the point of adding these when `EmptyOp.result` already exists?
My claim/feeling/intuition is that eDSL users are more comfortable with
a value centric programming model (i.e., passing values as operands) as
opposed to an operator instantiation programming model. Thus this change
enables (or at least goes towards) the bindings supporting such a user
and use case. For example,
```python
i32 = IntegerType.get_signless(32)
...
ten1 = tensor.empty((10, 10), i32)
ten2 = tensor.empty((10, 10), i32)
ten3 = arith.addi(ten1, ten2)
```
Note, in order to present a "pythonic" API and enable "pythonic" eDSLs,
the generated identifiers (op names and operand names) are snake case
instead of camel case and thus `llvm::convertToSnakeFromCamelCase`
needed a small fix. Thus this PR is stacked on top of
https://github.com/llvm/llvm-project/pull/68375.
In addition, as a kind of victory lap, this PR adds a "rangefor" that
looks and acts exactly like python's `range` but emits `scf.for`.
TOSA defines the filter channel ordering for 2D convolution operation
`tosa.conv2d` as `[OC, KH, KW, IC]`. The LinAlg dialect supports `[F, H,
W, C]` and `[H, W, C, F]` orderings via the `linalg.conv_2d_nhwc_fhwc`
and `linalg.conv_2d_nhwc_hwcf` operations respectively. Where `F == OC`,
`KH == H`, `KW == W` and `C == IC`.
Currently `tosa.conv2d` is lowered to `linalg.conv2d_nhwc_hwcf` meaning
we need to insert a transposition operation to permute the filter
channels before they can be passed as weights to the linalg op, that is
`[F, H, W, C]` -> `[H, W, C, F]`. An analogous transformation needs to
be applied to the quantized operation that lowers to
`linalg.conv_2d_nhwc_hwcf_q`.
This commit updates the TOSA->LinAlg lowering so that `tosa.conv2d` is
lowered to `linalg.conv2d_nhwc_fhwc` removing the need for the
introduction of a transposition operation and making the mapping 1-1. It
also adds a `linalg.conv_2d_nhwc_fhwc_q` quantized operation to the
LinAlg dialect so the same direct 1-1 mapping can be applied to the
quantized variant.
This commit does not add any new lit tests but repurposes the current
TosaToLinalgNamed tests by removing the checks for transpositions and
updating the targeted LinAlg operations from `linalg.conv2d_nhwc_hwcf`
to linalg.conv2d_nhwc_fhwc`.
Signed-off-by: Jack Frankland <jack.frankland@arm.com>
This patch updates `transform.loop.peel` so that this Op returns two
rather than one handle:
* one for the peeled loop, and
* one for the remainder loop.
Also, following this change this Op will fail if peeling fails. This is
consistent with other similar Ops that also fail if no transformation
takes place.
Relands #67482 with an extra fix for transform_loop_ext.py
Rename and restructure tiling-related transform ops from the structured
extension to be more homogeneous. In particular, all ops now follow a
consistent naming scheme:
- `transform.structured.tile_using_for`;
- `transform.structured.tile_using_forall`;
- `transform.structured.tile_reduction_using_for`;
- `transform.structured.tile_reduction_using_forall`.
This drops the "_op" naming artifact from `tile_to_forall_op` that
shouldn't have been included in the first place, consistently specifies
the name of the control flow op to be produced for loops (instead of
`tile_reduction_using_scf` since `scf.forall` also belongs to `scf`),
and opts for the `using` connector to avoid ambiguity.
The loops produced by tiling are now systematically placed as *trailing*
results of the transform op. While this required changing 3 out of 4 ops
(except for `tile_using_for`), this is the only choice that makes sense
when producing multiple `scf.for` ops that can be associated with a
variadic number of handles. This choice is also most consistent with
*other* transform ops from the structured extension, in particular with
fusion ops, that produce the structured op as the leading result and the
loop as the trailing result.
This PR adds a new transform op that replaces `memref.alloca`s with
`memref.get_global`s to newly inserted `memref.global`s. This is useful,
for example, for allocations that should reside in the shared memory of
a GPU, which have to be declared as globals.
This PR renames the vectorization transform ops as follows:
* `structured.masked_vectorize` => `structured.vectorize`. This reflects
the fact that since [recently](https://reviews.llvm.org/D157774) the op
can also handle the unmasked case.
* `structured.vectorize` =>
`structured.vectorize_children_and_applies_patterns`. This reflects the
fact that the op does not just vectorize the given payload op but all
vectorizable children contained in it, and applies patterns before and
after for preparation and clean-up.
This rename was discussed first
[here](https://reviews.llvm.org/D157774).
The PR also adapts and cleans ups the tablegen description of the
`VectorizeChildrenAndApplyPatternsOp` (formerly `VectorizeOp`).
Don't generate enums from the main VectorOps.td file as that
transitively includes enums from Arith.
---------
Co-authored-by: Nicolas Vasilache <ntv@google.com>
- make sure that the type of induction variable should be determined by the type of the lower bound type.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D159534
The Linalg named ops specification went out of sync with the OpDSL
description, presumably due to direct manual modifications of the yaml
file.
Additionally, the unsigned division has been generating the signed
scalar instruction, which is now fixed.
This revision pipes the fastmath attribute support through the
vector.reduction op. This seemingly simple first step already requires
quite some genuflexions, file and builder reorganization. In the
process, retire the boolean reassoc flag deep in the LLVM dialect
builders and just use the fastmath attribute.
During conversions, templated builders for predicated intrinsics are
partially cleaned up. In the future, to finalize the cleanups, one
should consider adding fastmath to the VPIntrinsic ops.
* Moves several orphaned methods from Operation/OpView -> _OperationBase
so that both hierarchies share them (whether unknown or known to ODS).
* Adds typing information for missing `MLIRError` exception.
* Adds `DiagnosticInfo` typing.
* Adds `DenseResourceElementsAttr` typing that was missing.
This commit removes the deallocation capabilities of
one-shot-bufferization. One-shot-bufferization should never deallocate
any memrefs as this should be entirely handled by the
ownership-based-buffer-deallocation pass going forward. This means the
`allow-return-allocs` pass option will default to true now,
`create-deallocs` defaults to false and they, as well as the escape
attribute indicating whether a memref escapes the current region, will
be removed. A new `allow-return-allocs-from-loops` option is added as a
temporary workaround for some bufferization limitations.
For some reason, the mix-ins of the Python bindings of this dialect used
the PDL type for "any op". However, PDL isn't involved here, so it makes
more sense to use the corresponding type of the transform dialect. This
PR changes that.
`_get_op_result_or_value` was used in mix-ins to unify the handling of
op results and values. However, that function is now called in the
generated constructors, such that doing so in the mix-ins is not
necessary anymore.
This reverts commit 6a91dfedeb956dfa092a6a3f411e8b02f0d5d289.
This caused problems in downstream projects. We are reverting to give
them more time for integration.
This is the first commit in a series with the goal to rework the
BufferDeallocation pass. Currently, this pass heavily relies on copies
to perform correct deallocations, which leads to very slow code and
potentially high memory usage. Additionally, there are unsupported cases
such as returning memrefs which this series of commits aims to add
support for as well.
This first commit removes the deallocation capabilities of
one-shot-bufferization.One-shot-bufferization should never deallocate any
memrefs as this should be entirely handled by the buffer-deallocation pass
going forward. This means the allow-return-allocs pass option will
default to true now, create-deallocs defaults to false and they, as well
as the escape attribute indicating whether a memref escapes the current region,
will be removed.
The documentation should w.r.t. these pass option changes should also be
updated in this commit.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D156662
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.
This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.
This patch adds attribute builders for all buildable attributes from the
builtin dialect that did not previously have any. These builders can be
used to construct attributes of a particular type identified by a string
from a Python argument without knowing the details of how to pass that
Python argument to the attribute constructor. This is used, for example,
in the generated code of the Python bindings of ops.
The list of "all" attributes was produced with:
(
grep -h "ods_ir.AttrBuilder.get" $(find ../build/ -name "*_ops_gen.py") \
| cut -f2 -d"'"
git grep -ho "^def [a-zA-Z0-9_]*" -- include/mlir/IR/CommonAttrConstraints.td \
| cut -f2 -d" "
) | sort -u
Then, I only retained those that had an occurence in
`mlir/include/mlir/IR`. In particular, this drops many dialect-specific
attributes; registering those builders is something that those dialects
should do. Finally, I removed those attrbiutes that had a match in
`mlir/python/mlir/ir.py` already and implemented the remaining ones. The
only ones that still miss a builder now are the following:
* Represent more than one possible attribute type:
- `Any.*Attr` (9x)
- `IntNonNegative`
- `IntPositive`
- `IsNullAttr`
- `ElementsAttr`
* I am not sure what "constant attributes" are:
- `ConstBoolAttrFalse`
- `ConstBoolAttrTrue`
- `ConstUnitAttr`
* `Location` not exposed by Python bindings:
- `LocationArrayAttr`
- `LocationAttr`
* `get` function not implemented in Python bindings:
- `StringElementsAttr`
This patch also fixes a compilation problem with
`I64SmallVectorArrayAttr`.
Reviewed By: makslevental, rkayaith
Differential Revision: https://reviews.llvm.org/D159403
Memref descriptors contain an `offset` field that denotes the start of
the content of the memref relative to the `alignedPtr`. This offset is
not considered when converting a memref descriptor to a np.array in the
Python runtime library, essentially treating all memrefs as if they had
an offset of zero. This patch introduces the necessary pointer arithmetic
to find the actual beginning of the memref contents to the memref->numpy
conversion functions.
There is an ongoing discussion about whether the `offset` field is needed
at all in the memref descriptor.
Until that is decided, the Python runtime and CRunnerUtils should
still correctly implement the offset handling.
Related: https://reviews.llvm.org/D157008
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D158494
The mix-in of the `MultiTileSizesOp` set the default value of its
`divisor` argument. This repeats information from the tablegen
defintion, is not necessary (since the generic code deals with `None`
and default values), and has the risk of running out of sync without
people noticing. This patch removes the setting of the value and forward
`None` to the generic constructor instead.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D159416
This patch simplifies and improves the mix-in of the `TileOp`. In
particular:
* Accept all types of sizes (static, dynamic, scalable) in a single
argument `sizes`.
* Use the existing convenience function to dispatch different types of
sizes instead of repeating the implementation in the mix-in.
* Pass on `None` values as is of optional arguments to the init function
of the super class.
* Reformat with default indentation width (4 spaces vs 2 spaces).
* Add a a test for providing scalable sizes.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D159417
This patch removes some manual conversion of mixed Python/attribute
arguments to `I64ArrayAttr`s, which turned out to be unnecessary.
Interestingly, this change does not depend on the additional attribute
builders added in the (currently pending)
https://reviews.llvm.org/D159403 patch.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D159419
The mix-in did not allow to *not* set many of the arguments, even though
they represent optional attributes. Instead, it set default values,
which have different semantics in some cases. In other cases, setting
the default values is already done by the C++ layer, in which case they
are currently redundant and may be wrong in some potential future change
in the TD or C++ files. With this patch, `None` is preserved until the
generated binding, which handles them as desired.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D158844
This patch makes the `transform.structured.pad` op return also a handle
to the copy op that it inserts. This allows to continue transformation
on that op, such as mapping it to a GPU thread.
The patch was mainly authored by @springerm as part of the WIP patch
https://reviews.llvm.org/D156371, which also has an example usage of
this change.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D159088
Extends the existing mix-in for VectorizeOp with support for the missing unit attributes.
Also fixes the unintuitive implementation where
`structured.VectorizeOp(target=target, vectorize_padding=False)` still resulted in the creation of the UnitAttr `vectorize_padding`.
Reviewed By: ingomueller-net
Differential Revision: https://reviews.llvm.org/D158726
This PR implements python enum bindings for *all* the enums - this includes `I*Attrs` (including positional/bit) and `Dialect/EnumAttr`.
There are a few parts to this:
1. CMake: a small addition to `declare_mlir_dialect_python_bindings` and `declare_mlir_dialect_extension_python_bindings` to generate the enum, a boolean arg `GEN_ENUM_BINDINGS` to make it opt-in (even though it works for basically all of the dialects), and an optional `GEN_ENUM_BINDINGS_TD_FILE` for handling corner cases.
2. EnumPythonBindingGen.cpp: there are two weedy aspects here that took investigation:
1. If an enum attribute is not a `Dialect/EnumAttr` then the `EnumAttrInfo` record is canonical, as far as both the cases of the enum **and the `AttrDefName`**. On the otherhand, if an enum is a `Dialect/EnumAttr` then the `EnumAttr` record has the correct `AttrDefName` ("load bearing", i.e., populates `ods.ir.AttributeBuilder('<NAME>')`) but its `enum` field contains the cases, which is an instance of `EnumAttrInfo`. The solution is to generate an one enum class for both `Dialect/EnumAttr` and "independent" `EnumAttrInfo` but to make that class interopable with two builder registrations that both do the right thing (see next sub-bullet).
2. Because we don't have a good connection to cpp `EnumAttr`, i.e., only the `enum class` getters are exposed (like `DimensionAttr::get(Dimension value)`), we have to resort to parsing e.g., `Attribute.parse(f'#gpu<dim {x}>')`. This means that the set of supported `assemblyFormat`s (for the enum) is fixed at compile of MLIR (currently 2, the only 2 I saw). There might be some things that could be done here but they would require quite a bit more C API work to support generically (e.g., casting ints to enum cases and binding all the getters or going generically through the `symbolize*` methods, like `symbolizeDimension(uint32_t)` or `symbolizeDimension(StringRef)`).
A few small changes:
1. In addition, since this patch registers default builders for attributes where people might've had their own builders already written, I added a `replace` param to `AttributeBuilder.insert` (`False` by default).
2. `makePythonEnumCaseName` can't handle all the different ways in which people write their enum cases, e.g., `llvm.CConv.Intel_OCL_BI`, which gets turned into `INTEL_O_C_L_B_I` (because `llvm::convertToSnakeFromCamelCase` doesn't look for runs of caps). So I dropped it. On the otherhand regularization does need to done because some enums have `None` as a case (and others might have other python keywords).
3. I turned on `llvm` dialect generation here in order to test `nvvm.WGMMAScaleIn`, which is an enum with [[ d7e26b5620/mlir/include/mlir/IR/EnumAttr.td (L22-L25) | no explicit discriminator ]] for the `neg` case.
Note, dialects that didn't get a `GEN_ENUM_BINDINGS` don't have any enums to generate.
Let me know if I should add more tests (the three trivial ones I added exercise both the supported `assemblyFormat`s and `replace=True`).
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D157934
In particular:
* Fix and extend the support for constructing possibly nested ArrayAttrs
from lists of Python ints. This can probably be generalized further
and used in many more places.
* Add arguments for `pad_to_multiple_of` and `copy_back_op`.
* Format with black and reorder (keyword-only) arguments to match
tablegen and (`*_gen.py`) order.
* Extend tests for new features.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D157789