This PR adds the `convert_region_types` API to
`ConversionPatternRewriter` and introduces a new integration test,
`bf.py`, which demonstrates how to combine a Python-defined dialect, the
dialect conversion API, the pass manager, and the execution engine to
build a pure-Python JIT compilation pipeline.
Before this, MLIR error capture in `apply_partial_conversion` and
`apply_full_conversion` wasn’t handled, which meant any `emitError`
would crash the entire program. This PR adds the handling.
This PR fixed
[issues](https://github.com/iree-org/iree/actions/runs/21956878131/job/63423389868#step:7:211)
caused by dropping `LLVMSupport` in PR #180986, dropped the remaining
direct llvm dependencies from mlir-python binding files.
Previously `LLVMSupport` was dropped while some uncleaned `mlir/CAPI/*`
sources were still being pulled into mlir-py, and those files still
directly depended on LLVM headers. The issue was masked via a global
`include_directories(${LLVM_INCLUDE_DIRS})` in `mlir/CMakeLists.txt`,
out-of-tree builds (e.g., IREE) that define Python module targets
outside the mlir/ directory tree would fail with "no such llvm file"
errors.
Provides the infrastructure for implementing and late-binding
OpInterfaces from Python.
* On the mlir-c API declaration side, each `XOpInterface` has a callback
struct, with a callback for each method and a userdata member (provided
as an arg to each method), and a
`mlirXOpInterfaceAttachFallbackModel(ctx, op_name, callbacks)` func.
* This CAPI is implemented by defining a subclass of
`XOpInterface::FallbackModel` that holds the callback struct and has
each method call the corresponding callback (with userdata as an arg).
Given a callback struct, a new `FallbackModel` is created and attached,
i.e. late bound, to the named op. (MLIR's interface infrastructure is
such that the thus registered `FallbackModel` will be returned in case
the op gets cast to the `XOpInterface`.)
* On the Python side, we expose a stand-in `XOpInterface` base class
which has one (class)method: `XOpInterface.attach(cls, op_name, ctx)`.
Python users subclass this class (`class MyInterfaceImpl(XOpInterface):
...`) and implement the interface's methods (with the right names and
signatures). The user calls `attach` on the subclass
(`MyInterfaceImpl.attach("my_dialect.my_op", ctx)`) which prepares the
callbacks struct _with userdata set to the subclass_ (as we use it to
lookup methods). These callbacks (and userdata) are then registered as
an `XOpInterface::FallbackModel` by
`mlirXOpInterfaceAttachFallbackModel(...)`. From then on the Python
methods will be used to respond to calls to the interface methods
(originating in C++).
This PR enables implementing the TransformOpInterface and the
MemoryEffectsOpInterface, both of which are required for making an op
into a transform op.
Everything besides the above linked code is there to facilitate exposing
the interfaces: the right types for the arguments of the methods are
exposed as are functions/methods for manipulating these arguments (e.g.
specifying side effects on `OpOperand`s and `OpResult`s and being able
to access and set the transform handles associated with args and
results).
This PR adds dialect conversion support to the MLIR Python bindings.
Because it introduces a number of new APIs, it’s a fairly large PR. It
mainly includes the following parts:
* Add a set of types and APIs to the C API, including
`MlirConversionTarget`, `MlirConversionPattern`, `MlirTypeConverter`,
`MlirConversionPatternRewriter`, and others.
* Add the corresponding types and APIs to the Python bindings.
* Extend `mlir-tblgen` with codegen for Python adaptor classes, which
generates an adaptor class for each op.
Note that this PR only adds support for 1-to-1 conversions, 1-to-N
type/value conversions are not supported yet.
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This is mainly for two purposes:
1. to keep it consistent with the C++ class name
`mlir::GreedyRewriteConfig`,
2. to make it shorter.
Since this type was only added a few days ago
(654b3e844f21d3f64521e9cb028efdfebbf99bb4), it shouldn’t cause any
obvious compatibility issues.
We already have `GreedyRewriteDriverConfig` on the Python side, but it
hasn’t yet been exposed as a parameter of
`apply_patterns_and_fold_greedily`. This PR does that.
Before:
```python
def apply_patterns_and_fold_greedily(module: ir.Module, set: FrozenRewritePatternSet) -> None
def apply_patterns_and_fold_greedily(op: ir._OperationBase, set: FrozenRewritePatternSet) -> None
```
After:
```python
def apply_patterns_and_fold_greedily(module: ir.Module, set: FrozenRewritePatternSet,
config: GreedyRewriteDriverConfig | None = None) -> None
def apply_patterns_and_fold_greedily(op: ir._OperationBase, set: FrozenRewritePatternSet,
config: GreedyRewriteDriverConfig | None = None) -> None
```
Note this PR is adapted from
https://github.com/llvm/llvm-project/pull/174785 but using
`std::optional` instead of `nb::object`. Note, this required refactoring
`PyGreedyRewriteDriverConfig` to have a `std::shared_ptr` so that it
could support a copy-ctor.
Co-authored-by: PragmaTwice <twice@apache.org>
This PR ports all in-tree dialect extensions to use the
`PyConcreteType`, `PyConcreteAttribute` CRTPs instead of
`mlir_pure_subclass`. After this PR we can soft deprecate
`mlir_pure_subclass`. Also API signatures are updated to use `Py*`
instead of `Mlir*` so that type "inference" and hints are improved.
# What
This PR adds a shared library `MLIRPythonSupport` which contains all of
the CRTP classes ike `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute`, as well as other useful code like `Defaulting*`
and etc enabling their reuse in downstream projects. Downstream projects
can now do
```c++
struct PyTestType : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyTestType> {
...
};
class PyTestAttr : public mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteAttribute<PyTestAttr> {
...
}
NB_MODULE(_mlirPythonTestNanobind, m) {
PyTestType::bind(m);
PyTestAttr::bind(m);
}
```
instead of using the discordant alternative
`mlir_type_subclass`/`mlir_attr_subclass` (same goes for
`PyConcreteValue`/`mlir_value_subclass`).
# Why
This PR is mostly code motion (along with CMake) but before I describe
the changes I want to state the goals/benefits:
1. Currently upstream "core" extensions and "dialect" extensions ([all
of the `Dialect*` extensions
here](d7c734b5a1/mlir/lib/Bindings/Python))
are a two-tier system;
**a**. [core
extensions](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Bindings/Python/IRTypes.cpp#L361)
enjoy first class support as far as type inference[^3], type stub
generation, and ease of implementation, while dialect extensions [have
poorer support](https://reviews.llvm.org/D150927), incorrect type stub
generation much more tedious (boilerplate) implementation;
**b**. Crucially, this two-tiered system is reflected in the fact that
**the two sets of types/attributes are not in the same Python object
hierarchy**. To wit: `isinstance(..., Type)` and `isinstance(...,
Attribute)` are not supported for the dialect extensions[^2];
**c**. Since these types are not exposed in public headers, downstream
users (dialect extensions or not) cannot write functions that overload
on e.g. `PyFloat8*Type` - that's quite a [useful
feature](fdbee98df8/cpp_ext/TorchOps.cpp (L29-L69))!
2. The dialect extensions incur a sizeable performance penalty relative
to the core extensions in that every single trip across the wire (either
`python->cpp` or `cpp->python`) requires work in addition to nanobind's
own casting/construction pipeline;
**a**. When going from `python->cpp`, [we extract the capsule object
from the Python
object](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h#L219C24-L219C46)
and then extract from the capsule the `Mlir*` opaque struct/ptr. This
side isn't so onerous;
**b**. When going from `cpp->python` we call long-hand call Python
`import` APIs and construct the Python object using `_CAPICreate`. Note,
there at least 2 `attr` calls incurred in addition to `_CAPICreate`;
this is already much more [efficiently handled by nanobind
itself](4ba51fcf79/src/nb_internals.h (L381-L382))!
3. This division blocks various features: in some configurations[^1] we
trigger a circular import bug because "dialect" types and attributes
perform an [import of the root `_mlir`
module](bd9651bf78/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h (L585))
when they are created (the types themselves, not even instances of those
types). This blocks type stub generation for dialect extensions (i.e.,
the reason we currently only generate type stubs for `_mlir`).
# How
Prior this was not done/possible because of "ODR" issues but I have
resolved those issues; the basic idea for how we solve this is "move
things we want to share into shared libraries":
1. Move IRCore (stuff like `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute`) into `MLIRPythonSupport`;
- Note, we move the rest of the things in `IRModule.h` (renamed to
`IRCore.h`) because `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute` depend on them. This makes for a bigger PR than
one would hope for but ultimately I think we should give people access
to these classes to use as they see fit (specifically inherit from, but
also liberally use in bindings signatures instead of the opaque `Mlir*`
struct wrappers).
2. Put all of this code into a nested namespace
`MLIR_BINDINGS_PYTHON_DOMAIN` which is determined by a compile time
define (and tied to `MLIR_BINDINGS_PYTHON_NB_DOMAIN`). This is necessary
in order to prevent conflicts on both symbol name **and** typeid
(necessary for nanobind to not double register binded types) between
multiple bindings libraries (e.g., `torch-mlir`, and `jax`). Note
[nanobind doesn't support `module_local` like
pybind11](https://nanobind.readthedocs.io/en/latest/porting.html#removed-features).
It does support `NB_DOMAIN` but that is not sufficient for
disambiguating typeids across projects (to wit: we currently define
`NB_DOMAIN` and it was still necessary to move everything to a nested
namespace);
3. Build the [nanobind library itself as a shared
object](https://github.com/wjakob/nanobind/blob/master/cmake/nanobind-config.cmake#L127)
(and link it to both the extensions and `MLIRPythonSupport`).
4. CMake to make this work, in-tree, out-of-tree, downstream, upstream,
etc.
# Testing
Three tests are added here
1. `PythonTestModuleNanobind` is ported to use
`PyConcreteType<PyTestType>` instead of `mlir_type_subclass` and
`PyConcreteAttribute<PyTestAttr>` instead of `mlir_atrr_subclass`,
verifying this works for non-core extensions in-tree;
2. `StandaloneExtensionNanobind` is ported to use `struct PyCustomType :
mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyCustomType>`
instead of `mlir_type_subclass` verifying this works for non-core
extensions out-of-tree;
3. `StandaloneExtensionNanobind`'s `smoketest` is extended to also load
another bindings package (namely `mlir`) verifying
`MLIR_BINDINGS_PYTHON_DOMAIN` successfully disambiguates symbols and
typeids.
I have also tested this downstream:
https://github.com/llvm/eudsl/pull/287 as well run the following builder
bots:
mlir-nvidia-gcc7:
https://lab.llvm.org/buildbot/#/buildrequests/6654424?redirect_to_build=true
I have also tested against IREE:
https://github.com/iree-org/iree/pull/21916
# Integration
It is highly recommended to set the CMake var
`MLIR_BINDINGS_PYTHON_NB_DOMAIN` (which will also determine
`MLIR_BINDINGS_PYTHON_DOMAIN`) to something unique for each downstream.
This can also be passed explicitly to `add_mlir_python_modules` if your
project builds multiple bindings packages. I added a `WARNING` to this
effect in `AddMLIRPython.cmake`.
[^3]: Python values being typed correctly when exiting from cpp;
[^1]: Specifically when the modules are imported using `importlib`,
which occurs with nanobind's
[stubgen](https://github.com/wjakob/nanobind/blob/master/src/stubgen.py#L965);
[^2]: The workaround we implemented was a class method for the dialect
bindings called `Class.isinstance(...)`;
This patch includes the following changes:
- `RewritePatternSet.add` now accepts op name (e.g. `.add("arith.addi",
fn)`) besides op class (e.g. `.add(arith.AddIOp, fn)`)
- add a concrete signature and a more complete docstring to
`RewritePatternSet.add`.
MLIR currently has three main pattern rewrite drivers (see
[https://mlir.llvm.org/docs/PatternRewriter/#common-pattern-drivers](https://mlir.llvm.org/docs/PatternRewriter/#common-pattern-drivers)):
* Dialect Conversion Driver
* Walk Pattern Rewrite Driver
* Greedy Pattern Rewrite Driver
Right now, we already support the greedy pattern rewrite driver in the C
API and Python bindings. This PR adds support for the walk pattern
rewrite driver. This lightweight driver, unlike the greedy driver, does
not repeatedly apply patterns; instead, it walks the IR once. API-wise,
the main change is adding the `walk_and_apply_patterns` function.
Note that the listener parameter is not supported now.
This is a follow-up PR for #162699.
Currently, in the function where we define rewrite patterns, the `op` we
receive is of type `ir.Operation` rather than a specific `OpView` type
(such as `arith.AddIOp`). This means we can’t conveniently access
certain parts of the operation — for example, we need to use
`op.operands[0]` instead of `op.lhs`. The following example code
illustrates this situation.
```python
def to_muli(op, rewriter):
# op is typed ir.Operation instead of arith.AddIOp
pass
patterns.add(arith.AddIOp, to_muli)
```
In this PR, we convert the operation to its corresponding `OpView`
subclass before invoking the rewrite pattern callback, making it much
easier to write patterns.
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This is a follow-up PR of #162699.
In this PR we clean CAPI and Python bindings of MLIR rewrite part by:
- remove all manually-defined `wrap`/`unwrap` functions;
- remove useless nanobind-defined Python class `RewritePattern`.
This PR adds support for defining custom **`RewritePattern`**
implementations directly in the Python bindings.
Previously, users could define similar patterns using the PDL dialect’s
bindings. However, for more complex patterns, this often required
writing multiple Python callbacks as PDL native constraints or rewrite
functions, which made the overall logic less intuitive—though it could
be more performant than a pure Python implementation (especially for
simple patterns).
With this change, we introduce an additional, straightforward way to
define patterns purely in Python, complementing the existing PDL-based
approach.
### Example
```python
def to_muli(op, rewriter):
with rewriter.ip:
new_op = arith.muli(op.operands[0], op.operands[1], loc=op.location)
rewriter.replace_op(op, new_op.owner)
with Context():
patterns = RewritePatternSet()
patterns.add(arith.AddIOp, to_muli) # a pattern that rewrites arith.addi to arith.muli
frozen = patterns.freeze()
module = ...
apply_patterns_and_fold_greedily(module, frozen)
```
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
In [#160520](https://github.com/llvm/llvm-project/pull/160520), we
discussed the current limitations of PDL rewriting in Python (see [this
comment](https://github.com/llvm/llvm-project/pull/160520#issuecomment-3332326184)).
At the moment, we cannot create new operations in PDL native (python)
rewrite functions because the `PatternRewriter` APIs are not exposed.
This PR introduces bindings to retrieve the insertion point of the
`PatternRewriter`, enabling users to create new operations within Python
rewrite functions. With this capability, more complex rewrites e.g. with
branching and loops that involve op creations become possible.
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This is a follow-up to #159926.
That PR (#159926) exposed native rewrite function registration in PDL
through the C API and Python, enabling use with
`pdl.apply_native_rewrite`.
In this PR, we add support for native constraint functions in PDL via
`pdl.apply_native_constraint`, further completing the PDL API.
In the MLIR Python bindings, we can currently use PDL to define simple
patterns and then execute them with the greedy rewrite driver. However,
when dealing with more complex patterns—such as constant folding for
integer addition—we find that we need `apply_native_rewrite` to actually
perform arithmetic (i.e., compute the sum of two constants). For
example, consider the following PDL pseudocode:
```mlir
pdl.pattern : benefit(1) {
%a0 = pdl.attribute
%a1 = pdl.attribute
%c0 = pdl.operation "arith.constant" {value = %a0}
%c1 = pdl.operation "arith.constant" {value = %a1}
%op = pdl.operation "arith.addi"(%c0, %c1)
%sum = pdl.apply_native_rewrite "addIntegers"(%a0, %a1)
%new_cst = pdl.operation "arith.constant" {value = %sum}
pdl.replace %op with %new_cst
}
```
Here, `addIntegers` cannot be expressed in PDL alone—it requires a
*native rewrite function*. This PR introduces a mechanism to support
exactly that, allowing complex rewrite patterns to be expressed in
Python and enabling many passes to be implemented directly in Python as
well.
As a test case, we defined two new operations (`myint.constant` and
`myint.add`) in Python and implemented a constant-folding rewrite
pattern for them. The core code looks like this:
```python
m = Module.create()
with InsertionPoint(m.body):
@pdl.pattern(benefit=1, sym_name="myint_add_fold")
def pat():
...
op0 = pdl.OperationOp(name="myint.add", args=[v0, v1], types=[t])
@pdl.rewrite()
def rew():
sum = pdl.apply_native_rewrite(
[pdl.AttributeType.get()], "add_fold", [a0, a1]
)
newOp = pdl.OperationOp(
name="myint.constant", attributes={"value": sum}, types=[t]
)
pdl.ReplaceOp(op0, with_op=newOp)
def add_fold(rewriter, results, values):
a0, a1 = values
results.push_back(IntegerAttr.get(i32, a0.value + a1.value))
pdl_module = PDLModule(m)
pdl_module.register_rewrite_function("add_fold", add_fold)
```
The idea is previously discussed in Discord #mlir-python channel with
@makslevental.
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This a reland of https://github.com/llvm/llvm-project/pull/155741 which
was reverted at https://github.com/llvm/llvm-project/pull/157831. This
version is narrower in scope - it only turns on automatic stub
generation for `MLIRPythonExtension.Core._mlir` and **does not do
anything automatically**. Specifically, the only CMake code added to
`AddMLIRPython.cmake` is the `mlir_generate_type_stubs` function which
is then used only in a manual way. The API for
`mlir_generate_type_stubs` is:
```
Arguments:
MODULE_NAME: The fully-qualified name of the extension module (used for importing in python).
DEPENDS_TARGETS: List of targets these type stubs depend on being built; usually corresponding to the
specific extension module (e.g., something like StandalonePythonModules.extension._standaloneDialectsNanobind.dso)
and the core bindings extension module (e.g., something like StandalonePythonModules.extension._mlir.dso).
OUTPUT_DIR: The root output directory to emit the type stubs into.
OUTPUTS: List of expected outputs.
DEPENDS_TARGET_SRC_DEPS: List of cpp sources for extension library (for generating a DEPFILE).
IMPORT_PATHS: List of paths to add to PYTHONPATH for stubgen.
PATTERN_FILE: (Optional) Pattern file (see https://nanobind.readthedocs.io/en/latest/typing.html#pattern-files).
Outputs:
NB_STUBGEN_CUSTOM_TARGET: The target corresponding to generation which other targets can depend on.
```
Downstream users should use `mlir_generate_type_stubs` in coordination
with `declare_mlir_python_sources` to turn on stub generation for their
own downstream dialect extensions and upstream dialect extensions if
they so choose. Standalone example shows an example.
Note, downstream will also need to set
`-DMLIR_PYTHON_PACKAGE_PREFIX=...` correctly for their bindings.
In https://github.com/llvm/llvm-project/pull/94714, we add a python
function `apply_patterns_and_fold_greedily` which accepts an
`MlirModule` as the argument type. However, sometimes we want to apply
patterns with an `MlirOperation` argument, and there is currently no
python API to convert an `MlirOperation` to `MlirModule`.
So here we overload this function `apply_patterns_and_fold_greedily` to
do this (also a corresponding new C API
`mlirApplyPatternsAndFoldGreedilyWithOp`)
This is a companion to #118583, although it can be landed independently
because since #117922 dialects do not have to use the same Python
binding framework as the Python core code.
This PR ports all of the in-tree dialect and pass extensions to
nanobind, with the exception of those that remain for testing pybind11
support.
This PR also:
* removes CollectDiagnosticsToStringScope from NanobindAdaptors.h. This
was overlooked in a previous PR and it is duplicated in Diagnostics.h.
---------
Co-authored-by: Jacques Pienaar <jpienaar@google.com>
Relands #118583, with a fix for Python 3.8 compatibility. It was not
possible to set the buffer protocol accessers via slots in Python 3.8.
Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.
For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.
To a large extent, this is a mechanical change, for instance changing
`pybind11::` to `nanobind::`.
Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this ways and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.
For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.
To a large extent, this is a mechanical change, for instance changing
`pybind11::`
to `nanobind::`.
Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this ways and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
Following a rather direct approach to expose PDL usage from C and then
Python. This doesn't yes plumb through adding support for custom
matchers through this interface, so constrained to basics initially.
This also exposes greedy rewrite driver. Only way currently to define
patterns is via PDL (just to keep small). The creation of the PDL
pattern module could be improved to avoid folks potentially accessing
the module used to construct it post construction. No ergonomic work
done yet.
---------
Signed-off-by: Jacques Pienaar <jpienaar@google.com>