81 Commits

Author SHA1 Message Date
RattataKing
d380b29a7c
[MLIR][Python] Remove partial LLVM APIs in python bindings (5/n) (#180644)
This PR continues work from
https://github.com/llvm/llvm-project/pull/178290
Added local helper functions to avoid dependency on LLVM APIs.

---------

Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
2026-02-10 15:24:56 -05:00
RattataKing
71a8973a3e
[MLIR][Python] Remove partial LLVM APIs in python bindings (4/n) (#180256)
This PR continues work from #178290 
It replaces some LLVM utilities with straightforward `std::`
equivalents.
2026-02-06 17:04:49 -05:00
Ryan Kim
ac88f7bcd4
[mlir][python] Support Arbitrary Precision Integers in MLIR C API and Python Bindings (#177733)
This PR extends the MLIR C API and Python bindings to support
**arbitrary-precision integers (`APInt`)**, overcoming the previous
limitation where `IntegerAttr` values were restricted to 64 bits.

Cryptographic applications often require integer types much larger than
standard machine words (e.g., the 256-bit modulus for the BN254 curve).
Previously, attempting to bind these values resulted in truncation or
errors. This PR exposes the underlying word-based `APInt` structure via
the C API and updates the Python bindings to seamlessly handle Python's
arbitrary-precision integers.
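
As a rough illustration of the word-based layout mentioned above, the sketch below splits a Python integer into 64-bit words and reassembles it. The helper names `int_to_words` and `words_to_int` are hypothetical and are not part of the MLIR C API or Python bindings; the least-significant-word-first order is an assumption made for the example.

```python
def int_to_words(value: int, bit_width: int) -> list:
    """Split a non-negative integer into 64-bit words (least significant first)."""
    if value < 0 or value >= (1 << bit_width):
        raise ValueError("value does not fit in bit_width bits")
    num_words = (bit_width + 63) // 64  # round up to whole words
    mask = (1 << 64) - 1
    return [(value >> (64 * i)) & mask for i in range(num_words)]


def words_to_int(words: list) -> int:
    """Reassemble an integer from 64-bit words (least significant first)."""
    return sum(word << (64 * i) for i, word in enumerate(words))


# An arbitrary value that needs more than 64 bits, stored in a 256-bit type.
value = (1 << 200) + 12345
words = int_to_words(value, 256)
assert len(words) == 4
assert words_to_int(words) == value
```

A 256-bit value therefore occupies four words, which is why a single 64-bit accessor cannot represent it without truncation.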
2026-01-24 23:05:03 -08:00
MaPePeR
351d06a819
[MLIR][Python] Improve Iterator performance. Don't throw in dunderNext methods. (#175377)
In
https://github.com/llvm/llvm-project/pull/174139#issuecomment-3733259370
I wrote a scuffed benchmark that mostly iterates MLIR Container Types in
Python. My changes from that PR made the performance worse, so I closed
it.

However, when experimenting with that I also saw a large(?) performance
gain by changing the `dunderNext` methods of the various Iterators to
use `PyErr_SetNone(PyExc_StopIteration);` instead of `throw
nb::stop_iteration();`.

<details><summary>Benchmark attempt script</summary>

```python
import timeit

from mlir.ir import Context, Location, Module, InsertionPoint, Block, Region, OpView
from mlir.dialects import func, builtin, scf, arith

def generate_module():
    m = Module.create()
    with InsertionPoint(m.body):
        f = func.FuncOp("main", builtin.FunctionType.get([], []))
    with InsertionPoint(f.body.blocks.append()):
        generate_ops(10, 2)
        func.ReturnOp([])
    return m

def generate_ops(count: int, depth: int):
    if depth == 0:
        return
    lower = arith.ConstantOp(builtin.IntegerType.get_signless(64), 0)
    upper = arith.ConstantOp(builtin.IntegerType.get_signless(64), 100)
    step =  arith.ConstantOp(builtin.IntegerType.get_signless(64), 1)
    for i in range(count):
        forop = scf.ForOp(lower, upper, step)
        with InsertionPoint(forop.region.blocks[0]):
            generate_ops(count, depth - 1)
            scf.YieldOp([])

def walk_module(m: Module):
    walk_block(m.body)

def walk_region(region: Region):
    for block in region.blocks:
        walk_block(block)

def walk_block(block: Block):
    for predecessors in block.predecessors:
        pass
    for successors in block.successors:
        pass
    for op in block.operations:
        walk_op(op)

def walk_op(op: OpView):
    for result in op.results:
        pass
    for successors in op.successors:
        pass
    for operands in op.operands:
        pass
    for region in op.regions:
        walk_region(region)

with Context(), Location.unknown():
    m = generate_module()

    #  From timeit.main:
    t = timeit.Timer(lambda: walk_module(m))
    number, _ = t.autorange()
    repeats = 5
    raw_timings = t.repeat(repeats, number)
    timings = [dt / number for dt in raw_timings]
    best = min(timings)
    print(f"{number} loops, best of {repeats}: {best * 1000:.3g} msecs per loop")
```
</details>


The performance of the benchmark went from
```
50 loops, best of 5: 5.97 msecs per loop
```
to
```
50 loops, best of 5: 5.12 msecs per loop
```
in my setup, which is a ~14% improvement. (Though you should validate
that yourself, probably. My test setup is very scuffed)
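
For reference, the ~14% figure follows directly from the two timings quoted above. The snippet only redoes that arithmetic; it does not re-run the benchmark, and the variable names are illustrative.

```python
before_ms = 5.97  # best-of-5 time per loop with `throw nb::stop_iteration()`
after_ms = 5.12   # best-of-5 time per loop with `PyErr_SetNone(PyExc_StopIteration)`

# Relative reduction in time per loop, and the equivalent speedup factor.
reduction = (before_ms - after_ms) / before_ms
speedup = before_ms / after_ms

print(f"time reduction: {reduction:.1%}, speedup: {speedup:.2f}x")
# prints "time reduction: 14.2%, speedup: 1.17x"
```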

The functions were previously set to return a C++ type like `PyRegion`.
Because of the removal of the `throw` they now had to [return a `NULL`
value to
Python](aa8578dc54/Objects/call.c (L49-L61)),
so I changed the return type to
`nanobind::typed<nanobind::object,PyRegion>` so I could return an
`nb::object()` in case an error was set and otherwise `nb::cast` the
`PyRegion` value to `nb::object` instead of returning it directly.

I'm not a huge fan that this changes the external "Usage" of the
functions, because now they won't bubble up exceptions when they are
called from C++. The return type and Python error state have to be
checked instead.
I couldn't find any location that called them in llvm itself, though.
Maybe these functions should not be public, because they are only
supposed to be called from Python anyway?

---------

Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
2026-01-13 16:02:38 +00:00
Maksim Levental
18fc908566
[mlir][Python] move IRTypes and IRAttributes to MLIRPythonSupport (#174118)
This PR continues the work of
https://github.com/llvm/llvm-project/pull/171775 by moving more useful
types/attributes into MLIRPythonSupport.

You can now do 

```c++
struct PyTestIntegerRankedTensorType
    : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<
          PyTestIntegerRankedTensorType,
          mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyRankedTensorType>
struct PyTestTensorValue
    : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteValue<
          PyTestTensorValue>
```
instead of `mlir_type_subclass` and `mlir_value_subclass`;
**specifically manual registration of the "value caster" via indirection
through the Python interpreter is no longer necessary** . You can also
now freely use all such types at the nanobind API level (e.g., overload
based on `FP*`):

```c++
using namespace mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN;
standaloneM.def("print_fp_type", [](PyF16Type &) { nb::print("this is a fp16 type"); });
standaloneM.def("print_fp_type", [](PyF32Type &) { nb::print("this is a fp32 type"); });
standaloneM.def("print_fp_type", [](PyF64Type &) { nb::print("this is a fp64 type"); });
```

Note, here we only port `PythonTestModuleNanobind` but there is a
follow-up PR that ports **all** in-tree dialect extensions
https://github.com/llvm/llvm-project/pull/174156 to use these. After
that one we can soft deprecate `mlir_pure_subclass`.

Note, depends on https://github.com/llvm/llvm-project/pull/171775
2026-01-05 09:34:58 -08:00
Maksim Levental
f0ef5dba6d
[mlir][Python] create MLIRPythonSupport (#171775)
# What

This PR adds a shared library `MLIRPythonSupport` which contains all of
the CRTP classes like `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute`, as well as other useful code like `Defaulting*`,
enabling their reuse in downstream projects. Downstream projects
can now do

```c++
struct PyTestType : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyTestType> {
  ...
};

class PyTestAttr : public mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteAttribute<PyTestAttr> {
  ...
};

NB_MODULE(_mlirPythonTestNanobind, m) {
  PyTestType::bind(m);
  PyTestAttr::bind(m);
}
```

instead of using the discordant alternative
`mlir_type_subclass`/`mlir_attr_subclass` (same goes for
`PyConcreteValue`/`mlir_value_subclass`).

# Why

This PR is mostly code motion (along with CMake) but before I describe
the changes I want to state the goals/benefits:

1. Currently upstream "core" extensions and "dialect" extensions ([all
of the `Dialect*` extensions
here](d7c734b5a1/mlir/lib/Bindings/Python))
are a two-tier system;
**a**. [core
extensions](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Bindings/Python/IRTypes.cpp#L361)
enjoy first class support as far as type inference[^3], type stub
generation, and ease of implementation, while dialect extensions [have
poorer support](https://reviews.llvm.org/D150927), incorrect type stub
generation, and a much more tedious (boilerplate) implementation;
**b**. Crucially, this two-tiered system is reflected in the fact that
**the two sets of types/attributes are not in the same Python object
hierarchy**. To wit: `isinstance(..., Type)` and `isinstance(...,
Attribute)` are not supported for the dialect extensions[^2];
**c**. Since these types are not exposed in public headers, downstream
users (dialect extensions or not) cannot write functions that overload
on e.g. `PyFloat8*Type` - that's quite a [useful
feature](fdbee98df8/cpp_ext/TorchOps.cpp (L29-L69))!
2. The dialect extensions incur a sizeable performance penalty relative
to the core extensions in that every single trip across the wire (either
`python->cpp` or `cpp->python`) requires work in addition to nanobind's
own casting/construction pipeline;
**a**. When going from `python->cpp`, [we extract the capsule object
from the Python
object](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h#L219C24-L219C46)
and then extract from the capsule the `Mlir*` opaque struct/ptr. This
side isn't so onerous;
**b**. When going from `cpp->python` we call the Python `import` APIs
long-hand and construct the Python object using `_CAPICreate`. Note,
there are at least 2 `attr` calls incurred in addition to `_CAPICreate`;
this is already much more [efficiently handled by nanobind
itself](4ba51fcf79/src/nb_internals.h (L381-L382))!
3. This division blocks various features: in some configurations[^1] we
trigger a circular import bug because "dialect" types and attributes
perform an [import of the root `_mlir`
module](bd9651bf78/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h (L585))
when they are created (the types themselves, not even instances of those
types). This blocks type stub generation for dialect extensions (i.e.,
the reason we currently only generate type stubs for `_mlir`).

# How

Previously this was not done/possible because of "ODR" issues but I have
resolved those issues; the basic idea for how we solve this is "move
things we want to share into shared libraries":

1. Move IRCore (stuff like `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute`) into `MLIRPythonSupport`;
- Note, we move the rest of the things in `IRModule.h` (renamed to
`IRCore.h`) because `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute` depend on them. This makes for a bigger PR than
one would hope for but ultimately I think we should give people access
to these classes to use as they see fit (specifically inherit from, but
also liberally use in bindings signatures instead of the opaque `Mlir*`
struct wrappers).
2. Put all of this code into a nested namespace
`MLIR_BINDINGS_PYTHON_DOMAIN` which is determined by a compile time
define (and tied to `MLIR_BINDINGS_PYTHON_NB_DOMAIN`). This is necessary
in order to prevent conflicts on both symbol name **and** typeid
(necessary for nanobind to not double register bound types) between
multiple bindings libraries (e.g., `torch-mlir`, and `jax`). Note
[nanobind doesn't support `module_local` like
pybind11](https://nanobind.readthedocs.io/en/latest/porting.html#removed-features).
It does support `NB_DOMAIN` but that is not sufficient for
disambiguating typeids across projects (to wit: we currently define
`NB_DOMAIN` and it was still necessary to move everything to a nested
namespace);
3. Build the [nanobind library itself as a shared
object](https://github.com/wjakob/nanobind/blob/master/cmake/nanobind-config.cmake#L127)
(and link it to both the extensions and `MLIRPythonSupport`).
4. CMake to make this work, in-tree, out-of-tree, downstream, upstream,
etc.

# Testing

Three tests are added here 

1. `PythonTestModuleNanobind` is ported to use
`PyConcreteType<PyTestType>` instead of `mlir_type_subclass` and
`PyConcreteAttribute<PyTestAttr>` instead of `mlir_attr_subclass`,
verifying this works for non-core extensions in-tree;
2. `StandaloneExtensionNanobind` is ported to use `struct PyCustomType :
mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyCustomType>`
instead of `mlir_type_subclass` verifying this works for non-core
extensions out-of-tree;
3. `StandaloneExtensionNanobind`'s `smoketest` is extended to also load
another bindings package (namely `mlir`) verifying
`MLIR_BINDINGS_PYTHON_DOMAIN` successfully disambiguates symbols and
typeids.

I have also tested this downstream:
https://github.com/llvm/eudsl/pull/287 as well as run the following builder
bots:

mlir-nvidia-gcc7:
https://lab.llvm.org/buildbot/#/buildrequests/6654424?redirect_to_build=true

I have also tested against IREE:
https://github.com/iree-org/iree/pull/21916

# Integration

It is highly recommended to set the CMake var
`MLIR_BINDINGS_PYTHON_NB_DOMAIN` (which will also determine
`MLIR_BINDINGS_PYTHON_DOMAIN`) to something unique for each downstream.
This can also be passed explicitly to `add_mlir_python_modules` if your
project builds multiple bindings packages. I added a `WARNING` to this
effect in `AddMLIRPython.cmake`.

[^3]: Python values being typed correctly when exiting from cpp;
[^1]: Specifically when the modules are imported using `importlib`,
which occurs with nanobind's
[stubgen](https://github.com/wjakob/nanobind/blob/master/src/stubgen.py#L965);
[^2]: The workaround we implemented was a class method for the dialect
bindings called `Class.isinstance(...)`;
2026-01-05 09:08:13 -08:00
Victor Chernyakin
c438773432
[LLVM][ADT] Migrate users of make_scope_exit to CTAD (#174030)
This is a followup to #173131, which introduced the CTAD functionality.
2026-01-02 20:42:56 -08:00
Ingo Müller
907335c00c
[mlir:python] Prevent crash in DenseElementsAttr. (#163564)
This PR fixes a crash in the `bf_getbuffer` implementation of
`PyDenseElementsAttribute` that occurred when an element type was not
supported, such as `bf16`. I believe that supporting `bf16` is not
possible with that protocol but that's out of the scope of this PR.
Previously, the code raised an `std::exception` out of `bf_getbuffer`
that nanobind does not catch (see also pybind/pybind11#3336). The PR
makes the function catch all `std::exception`s and manually raises a
Python exception instead.

Signed-off-by: Ingo Müller <ingomueller@google.com>
2025-10-20 15:13:00 +02:00
Maksim Levental
3834c5428d
[MLIR][Python] add unchecked gettors (#160954)
Some of the current gettors require passing locations (i.e., there be an
active location) because they're using the "checked" APIs. This PR adds
"unchecked" gettors which only require an active context.
2025-09-27 13:54:33 -05:00
Maksim Levental
0d08ffd22c
[MLIR][Python] use nb::typed for return signatures (#160221)
https://github.com/llvm/llvm-project/pull/160183 removed `nb::typed`
annotation to fix bazel but it turned out to be simply a matter of not
using the correct version of nanobind (see
https://github.com/llvm/llvm-project/pull/160183#issuecomment-3321429155).
This PR restores those annotations but (mostly) moves to the return
positions of the actual methods.
2025-09-23 10:54:22 -07:00
Maksim Levental
81cbd970cf
[MLIR][Python] remove nb::typed to fix bazel build (#160183)
https://github.com/llvm/llvm-project/pull/157930 broke bazel build (see
https://github.com/llvm/llvm-project/pull/157930#issuecomment-3318681217)
because bazel is stricter on implicit conversions (some difference in
flags passed to clang). This PR fixes by moving/removing `nb::typed`.

EDIT: and also the overlay...
2025-09-22 12:55:43 -07:00
Maksim Levental
efd96afedf
[MLIR][Python] reland (narrower) type stub generation (#157930)
This is a reland of https://github.com/llvm/llvm-project/pull/155741 which
was reverted at https://github.com/llvm/llvm-project/pull/157831. This
version is narrower in scope - it only turns on automatic stub
generation for `MLIRPythonExtension.Core._mlir` and **does not do
anything automatically**. Specifically, the only CMake code added to
`AddMLIRPython.cmake` is the `mlir_generate_type_stubs` function which
is then used only in a manual way. The API for
`mlir_generate_type_stubs` is:

```
Arguments:
  MODULE_NAME: The fully-qualified name of the extension module (used for importing in python).
  DEPENDS_TARGETS: List of targets these type stubs depend on being built; usually corresponding to the
    specific extension module (e.g., something like StandalonePythonModules.extension._standaloneDialectsNanobind.dso)
    and the core bindings extension module (e.g., something like StandalonePythonModules.extension._mlir.dso).
  OUTPUT_DIR: The root output directory to emit the type stubs into.
  OUTPUTS: List of expected outputs.
  DEPENDS_TARGET_SRC_DEPS: List of cpp sources for extension library (for generating a DEPFILE).
  IMPORT_PATHS: List of paths to add to PYTHONPATH for stubgen.
  PATTERN_FILE: (Optional) Pattern file (see https://nanobind.readthedocs.io/en/latest/typing.html#pattern-files).
Outputs:
  NB_STUBGEN_CUSTOM_TARGET: The target corresponding to generation which other targets can depend on.
```

Downstream users should use `mlir_generate_type_stubs` in coordination
with `declare_mlir_python_sources` to turn on stub generation for their
own downstream dialect extensions and upstream dialect extensions if
they so choose. Standalone example shows an example.

Note, downstream will also need to set
`-DMLIR_PYTHON_PACKAGE_PREFIX=...` correctly for their bindings.
2025-09-20 18:47:32 +00:00
Maksim Levental
67f43c6ee2
[MLIR][Python] add type hints for accessors (#158455)
This PR adds type hints for accessors in the generated builders.
2025-09-18 21:12:35 -05:00
Maksim Levental
c4181e51d1
[MLIR][Python] remove unnecessary arg.none() = nb::none() pattern (#157519)
We have `arg.none() = nb::none()` in a lot of places but this is no
longer necessary (as of
~[2022](62a23bb87b)).
2025-09-08 12:16:35 -07:00
Roman
912ce2631f
[NFC] Fix typos 'seperate' -> 'separate' (#144368)
Correct a few typos: 'seperate' -> 'separate'.
2025-08-30 13:41:25 +00:00
Mehdi Amini
589cb6c612 [MLIR] Apply clang-tidy fixes for performance-unnecessary-value-param in IRAttributes.cpp (NFC) 2025-08-26 06:14:24 -07:00
Mehdi Amini
745415d655 [MLIR] Apply clang-tidy fixes for llvm-else-after-return in IRAttributes.cpp (NFC) 2025-08-26 04:48:57 -07:00
Maksim Levental
21774489f0
[mlir][python] fix PyDenseResourceElementsAttribute finalizer (#150561)
This PR melds https://github.com/llvm/llvm-project/pull/150137 and
https://github.com/llvm/llvm-project/pull/149414 *and* partially reverts
https://github.com/llvm/llvm-project/pull/124832.

The summary is the `PyDenseResourceElementsAttribute` finalizer/deleter
has/had two problems

1. wasn't threadsafe (can be called from a different thread than that
which currently holds the GIL)
2. can be called while the interpreter is "not initialized"

https://github.com/llvm/llvm-project/pull/124832 for some reason decides
to re-initialize the interpreter to avoid case 2 and runs afoul of the
fact that `Py_IsInitialized` can be false during the finalization of the
interpreter itself (e.g., at the end of a script).

I don't know why this decision was made (I missed the PR) but I believe
we should never be calling
[Py_Initialize](https://docs.python.org/3/c-api/init.html#c.Py_Initialize):

> In an application \*\*\*\***embedding Python**\*\*\*\*, this should be
called before using any other Python/C API functions

**but we aren't embedding Python**!

So therefore we will only be in case 2 when the interpreter is being
finalized and in that case we should just leak the buffer.
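
The "leak instead of release during finalization" policy can be sketched at the Python level as below. This is only an analogue: the actual fix lives in C++ against the CPython C API, `ResourceBuffer` and `deleter` are hypothetical names, and `sys.is_finalizing()` stands in for the interpreter-state check described above.

```python
import sys


class ResourceBuffer:
    """Hypothetical stand-in for a buffer kept alive by an attribute."""

    def __init__(self, data: bytes):
        self.view = memoryview(data)
        self.released = False

    def release(self) -> None:
        self.view.release()
        self.released = True


def deleter(buf: ResourceBuffer) -> bool:
    """Release the buffer unless the interpreter is shutting down.

    Returns True if the buffer was released and False if it was
    deliberately leaked because finalization is in progress.
    """
    if sys.is_finalizing():
        # Touching Python objects is unsafe at this point, so leak instead.
        return False
    buf.release()
    return True


buf = ResourceBuffer(b"payload")
assert deleter(buf) is True
assert buf.released
```

Leaking at shutdown is harmless in practice because the process is about to exit and the operating system reclaims the memory.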

Note,
[lldb](548ca9e976/lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp (L81-L93))
does a similar sort of thing for its finalizers.

Co-authored-by: Anton Korobeynikov <anton@korobeynikov.info>
Co-authored-by: Max Manainen <maximmanainen@gmail.com>

2025-07-25 08:05:30 -04:00
Longsheng Mou
5a8e60e724
[mlir] Use llvm::fill instead of std::fill(NFC) (#146889) 2025-07-07 09:12:38 +08:00
Matthias Gehre
5d3ae51612
Reapply "[mlir][python] allow DenseIntElementsAttr for index type (#118947)" (#124804)
This reapplies #118947 and adapts to nanobind.
2025-01-29 09:14:37 +01:00
Fabian Tschopp
28507ac629
[MLIR] Fix thread safety of the deleter in PyDenseResourceElementsAttribute (#124832)
In general, `PyDenseResourceElementsAttribute` can get deleted at any
time and any thread, where unlike the `getFromBuffer` call, the Python
interpreter may not be initialized and the GIL may not be held.

This PR fixes segfaults caused by `PyBuffer_Release` when the GIL is not
being held by the thread calling the deleter.
2025-01-28 18:56:00 -05:00
Matthias Gehre
1b729c3d70 Revert "[mlir][python] allow DenseIntElementsAttr for index type (#118947)"
This reverts commit 9dd762e8b10586e749b0ddf3542e5dccf8392395.
2025-01-28 18:35:50 +01:00
Matthias Gehre
9dd762e8b1
[mlir][python] allow DenseIntElementsAttr for index type (#118947)
Model the `IndexType` as `uint64_t` when converting to a python integer. 

With the python bindings, 
```python
DenseIntElementsAttr(op.attributes["attr"])
```
used to `assert` when `attr` had `index` type like `dense<[1, 2, 3, 4]>
: vector<4xindex>`.

---------

Co-authored-by: Christopher McGirr <christopher.mcgirr@amd.com>
Co-authored-by: Tiago Trevisan Jost <tiago.trevisanjost@amd.com>
2025-01-28 18:31:58 +01:00
Han-Chung Wang
9cbc1f29ca
[mlir][NFC] Avoid using braced initializer lists to call a constructor. (#123714)
In the LLVM style guide, we prefer not using braced initializer lists to
call a constructor. Also, we prefer using an equal before the open curly
brace if we use a braced initializer list when initializing a variable.

See

https://llvm.org/docs/CodingStandards.html#do-not-use-braced-initializer-lists-to-call-a-constructor
for more details.

The style guide does not explain the reason well. There is an article
from abseil, which mentions few benefits. E.g., we can avoid the most
vexing parse, etc. See https://abseil.io/tips/88 for more details.

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-01-21 21:23:32 -08:00
Peter Hawkins
5cd4274772
[mlir python] Port in-tree dialects to nanobind. (#119924)
This is a companion to #118583, although it can be landed independently
because since #117922 dialects do not have to use the same Python
binding framework as the Python core code.

This PR ports all of the in-tree dialect and pass extensions to
nanobind, with the exception of those that remain for testing pybind11
support.

This PR also:
* removes CollectDiagnosticsToStringScope from NanobindAdaptors.h. This
was overlooked in a previous PR and it is duplicated in Diagnostics.h.

---------

Co-authored-by: Jacques Pienaar <jpienaar@google.com>
2024-12-20 20:32:32 -08:00
Peter Hawkins
b56d1ec6cb
[mlir python] Port Python core code to nanobind. (#120473)
Relands #118583, with a fix for Python 3.8 compatibility. It was not
possible to set the buffer protocol accessors via slots in Python 3.8.

Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.

For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.

To a large extent, this is a mechanical change, for instance changing
`pybind11::` to `nanobind::`.

Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this way and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
2024-12-18 18:55:42 -08:00
Jacques Pienaar
6e8b3a3e0c Revert "[mlir python] Port Python core code to nanobind. (#118583)"
This reverts commit 41bd35b58bb482fd466aa4b13aa44a810ad6470f.

Breakage detected, rolling back.
2024-12-18 19:31:32 +00:00
Peter Hawkins
41bd35b58b
[mlir python] Port Python core code to nanobind. (#118583)
Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.

For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.

To a large extent, this is a mechanical change, for instance changing
`pybind11::`
to `nanobind::`.

Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this way and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
2024-12-18 11:16:11 -08:00
Adrian Kuegel
404d0e9966 [mlir] Adjust code flagged by ClangTidyPerformance (NFC).
We can allocate the size of the vector in advance.
2024-11-25 08:17:09 +00:00
Kasper Nielsen
1824e45cd7
[MLIR,Python] Support converting boolean numpy arrays to and from mlir attributes (unrevert) (#115481)
This PR re-introduces the functionality of
https://github.com/llvm/llvm-project/pull/113064, which was reverted in
0a68171b3c
due to memory lifetime issues.

Notice that I was not able to reproduce the ASan results myself, so I
have not been able to verify that this PR really fixes the issue.

---

Currently it is unsupported to:
1. Convert a MlirAttribute with type i1 to a numpy array
2. Convert a boolean numpy array to a MlirAttribute

Currently the entire Python application violently crashes with a quite
poor error message https://github.com/pybind/pybind11/issues/3336

The complication in handling these conversions is that MlirAttribute
represents booleans as a bit-packed i1 type, whereas numpy represents
booleans as a byte array with 8 bits used per boolean.

This PR proposes the following approach:
1. When converting a i1 typed MlirAttribute to a numpy array, we can not
directly use the underlying raw data backing the MlirAttribute as a
buffer to Python, as done for other types. Instead, a copy of the data
is generated using numpy's unpackbits function, and the result is sent
back to Python.
2. When constructing a MlirAttribute from a numpy array, first the
python data is read as a uint8_t to get it converted to the endianness
used internally in mlir. Then the booleans are bitpacked using numpy's
packbits function, and the bitpacked array is saved as the MlirAttribute
representation.
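
To make the representation mismatch concrete, here is a minimal pure-Python sketch of packing one-byte-per-boolean data into a bit-packed buffer and back. It does not use numpy or the MLIR bindings; the helper names are hypothetical, and placing the first element in the least significant bit of each byte is an assumption made for the example.

```python
def pack_bits(flags):
    """Pack a sequence of booleans into bytes, 8 flags per byte.

    Assumption: element i is stored in bit (i % 8) of byte (i // 8),
    i.e. least significant bit first.
    """
    packed = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_bits(packed, count):
    """Expand a bit-packed buffer back into a list of `count` booleans."""
    return [bool((packed[i // 8] >> (i % 8)) & 1) for i in range(count)]


flags = [True, False, True, True, False, False, False, False, True, False]
packed = pack_bits(flags)
assert packed == b"\x0d\x01"            # 10 flags fit in 2 bytes
assert unpack_bits(packed, len(flags)) == flags
```

The element count has to be carried separately, because the packed buffer alone cannot distinguish padding bits from trailing `False` values.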
2024-11-13 01:23:10 -05:00
Dmitri Gribenko
0a68171b3c Revert "[MLIR,Python] Support converting boolean numpy arrays to and from mlir attributes (#113064)"
This reverts commit fb7bf7a5acc65be44fc546f282942b91472553b3. There is
an ASan issue here, see the discussion on
https://github.com/llvm/llvm-project/pull/113064.
2024-11-05 16:08:51 +01:00
Kasper Nielsen
fb7bf7a5ac
[MLIR,Python] Support converting boolean numpy arrays to and from mlir attributes (#113064)
Currently it is unsupported to:
1. Convert a `MlirAttribute` with type `i1` to a numpy array
2. Convert a boolean numpy array to a `MlirAttribute`

Currently the entire Python application violently crashes with a quite
poor error message https://github.com/pybind/pybind11/issues/3336

The complication in handling these conversions is that `MlirAttribute`
represents booleans as a bit-packed `i1` type, whereas numpy represents
booleans as a byte array with 8 bits used per boolean.

This PR proposes the following approach:
1. When converting a `i1` typed `MlirAttribute` to a numpy array, we can
not directly use the underlying raw data backing the `MlirAttribute` as
a buffer to Python, as done for other types. Instead, a copy of the data
is generated using numpy's unpackbits function, and the result is sent
back to Python.
2. When constructing a `MlirAttribute` from a numpy array, first the
python data is read as a `uint8_t` to get it converted to the endianness
used internally in mlir. Then the booleans are bitpacked using numpy's
packbits function, and the bitpacked array is saved as the
`MlirAttribute` representation.

Please note that I am not sure if this approach is the desired solution.
I'd appreciate any feedback.
2024-11-02 06:39:48 +00:00
JOE1994
095b41c6ee [mlir] Reland 5a6e52d6ef96d2bcab6dc50bdb369662ff17d2a0 with update (NFC)
Excluded updates to mlir/lib/AsmParser/Parser.cpp ,
which caused LIT failure "FAIL: MLIR::completion.test" on multiple buildbots.
2024-09-15 22:45:28 -04:00
JOE1994
61ff1cb452 Revert "[mlir] Nits on uses of llvm::raw_string_ostream (NFC)"
This reverts commit 5a6e52d6ef96d2bcab6dc50bdb369662ff17d2a0.

"FAIL: MLIR::completion.test" on multiple buildbots.
2024-09-15 22:09:11 -04:00
JOE1994
5a6e52d6ef [mlir] Nits on uses of llvm::raw_string_ostream (NFC)
* Strip calls to raw_string_ostream::flush(), which is essentially a no-op
* Strip unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
2024-09-15 21:33:42 -04:00
Amy Wang
334873fe2d
[MLIR][Python] Python binding support for IntegerSet attribute (#107640)
Support IntegerSet attribute python binding.
2024-09-11 07:37:35 -04:00
Bimo
c36b424828
[MLIR][Python] add value attr for PyAffineMapAttribute (#97254)
Similar to other attributes in Binding, the `PyAffineMapAttribute`
should include a value attribute to enable users to directly retrieve
the `AffineMap` from the `AffineMapAttr`.
2024-07-01 23:44:40 +08:00
pranavm-nvidia
c912f0e773
[mlir][python] Add bindings for mlirDenseElementsAttrGet (#91389)
This change adds bindings for `mlirDenseElementsAttrGet` which accepts a
list of MLIR attributes and constructs a DenseElementsAttr. This allows
for creating `DenseElementsAttr`s of types not natively supported by
Python (e.g. BF16) without requiring other dependencies (e.g. `numpy` +
`ml-dtypes`).
2024-05-22 05:44:22 -05:00
Mehdi Amini
962bf002fe Apply clang-tidy fixes for performance-unnecessary-value-param in IRAttributes.cpp (NFC) 2023-11-18 15:38:21 -08:00
Stella Laurenzo
f66cd9e955
[mlir] Add Python bindings for DenseResourceElementsAttr. (#66319)
Only construction and type casting are implemented. The method to create
is explicitly named "unsafe" and the documentation calls out what the
caller is responsible for. There really isn't a better way to do this
and retain the power-user feature this represents.
2023-09-14 18:45:29 -07:00
Ingo Müller
9f5335487a [mlir][python] Remove __str__ from bindings of StringAttr.
This reverts a feature introduced in commit
2a5d497494c24425e99655b85e2277dd3f15a400. The goal of that commit was to
allow `StringAttr`s to be used transparently wherever Python `str`s are
expected. But, as the tests in https://reviews.llvm.org/D159182 reveal,
pybind11 doesn't do this conversion based on `__str__` automatically,
unlike for the other types introduced in the commit above. At the same
time, changing `__str__` breaks the symmetry with other attributes of
`print(attr)` printing the assembly of the attribute, so the change
probably has more disadvantages than advantages.

Reviewed By: springerm, rkayaith

Differential Revision: https://reviews.llvm.org/D159255
2023-09-01 07:35:54 +00:00
Ingo Müller
2a5d497494 [mlir][python] Add __{bool,float,int,str}__ to bindings of attributes.
This allows using Python's `bool(.)`, `float(.)`, `int(.)`, and
`str(.)` to convert pybound attributes to the corresponding native
Python types. In particular, pybind11 uses these functions to
automatically cast objects to the corresponding primitive types wherever
they are required by pybound functions, e.g., arguments are converted to
Python's `int` if the C++ signature requires a C++ `int`. With this
patch, pybound attributes can be used wherever the corresponding native
types are expected. New tests showcase this behavior in the
constructors of `Dense*ArrayAttr`.

Note that this changes the output of Python's `str` on `StringAttr` from
`"hello"` to `hello`. Arguably, this is still in line with `str`'s goal
of producing a readable interpretation of the value, even if it is now
not unambiguously a string anymore (`print(ir.Attribute.parse('"42"'))`
now outputs `42`). However, this is consistent with instances of
Python's `str` (`print("42")` outputs `42`), and `repr` still provides
an unambiguous representation if one is required.
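
A minimal sketch of the conversions described above, in plain Python. `FakeIntegerAttr` and `FakeStringAttr` are hypothetical stand-ins, not the real bound classes; they only illustrate how the dunder methods let `int(.)` and `str(.)` yield native values while `repr` stays unambiguous.

```python
class FakeIntegerAttr:
    """Hypothetical stand-in for a bound integer attribute (not the MLIR class)."""

    def __init__(self, value):
        self._value = value

    def __int__(self):
        # Lets int(attr) return the native Python value.
        return self._value


class FakeStringAttr:
    """Hypothetical stand-in for a bound string attribute (not the MLIR class)."""

    def __init__(self, value):
        self._value = value

    def __str__(self):
        # Readable form: the bare string contents, without quotes.
        return self._value

    def __repr__(self):
        # Unambiguous form: keeps the quotes and names the class.
        return f'FakeStringAttr("{self._value}")'


int_value = int(FakeIntegerAttr(42))
str_value = str(FakeStringAttr("42"))
repr_value = repr(FakeStringAttr("42"))
```

Here `str_value` is the bare text `42`, which is indistinguishable from the text of an integer, while `repr_value` still shows that the object is a string attribute.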

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D158974
2023-08-29 14:53:26 +00:00
Ingo Müller
8dcb67225b [mlir][python] Make DenseBoolArrayAttr.get work with list of bools.
This patch makes the getter function of `DenseBoolArrayAttr` work more
intuitively. Until now, it was implemented with a `std::vector<int>`
argument, which works in the typical situation where you call the pybind
function with a list of Python bools (like `[True, False]`). However, it
does *not* work if the elements of the list must first be cast to `bool`
(and such casting is the default behavior for lists of all other types).
The patch thus changes the signature to `std::vector<bool>`, which helps
pybind to make the function behave as expected for bools. The tests now
also contain a case where such a cast is happening. This also makes the
conversion of `DenseBoolArrayAttr` back to Python more intuitive:
instead of converting to `0` and `1`, the elements are now converted to
`False` and `True`.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D158973
2023-08-28 15:15:08 +00:00
Peter Hawkins
71a254543d [MLIR:Python] Make DenseElementsAttr.get() only request a buffer format if no explicit type was provided.
Not every NumPy type (e.g., the `ml_dtypes.bfloat16` NumPy extension
type) has a type in the Python buffer protocol, so exporting such a
buffer with `PyBUF_FORMAT` may fail.

However, we don't care about the self-reported type of a buffer if the
user provides an explicit type. In the case that an explicit type is
provided, don't request the format from the buffer protocol, which
allows arrays whose element types are unknown to the buffer protocol to
be passed.
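
The decision logic can be sketched in plain Python under some assumptions: `resolve_element_type` and `FORMAT_TO_TYPE` are hypothetical names, and element types are represented as strings. Python's `memoryview` always requests the format, so this only mimics the control flow (consult the buffer's self-reported format only when no explicit type is given), not the actual `PyBUF_FORMAT` flag handling in the bindings.

```python
import array

# Hypothetical mapping from buffer format codes to element type names.
FORMAT_TO_TYPE = {"f": "f32", "d": "f64"}


def resolve_element_type(buffer, explicit_type=None):
    """Pick an element type, inspecting the buffer format only if needed."""
    if explicit_type is not None:
        # The caller knows the element type; the self-reported format is ignored.
        return explicit_type
    fmt = memoryview(buffer).format
    if fmt not in FORMAT_TO_TYPE:
        raise ValueError(f"unsupported buffer format: {fmt!r}")
    return FORMAT_TO_TYPE[fmt]


# The format is inferred from the buffer when no type is given.
inferred = resolve_element_type(array.array("f", [1.0, 2.0]))

# Raw bytes holding data of a type the buffer protocol cannot describe:
# an explicit type makes the reported format irrelevant.
explicit = resolve_element_type(b"\x80\x3f", explicit_type="bf16")
```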

Reviewed By: jpienaar, ftynse

Differential Revision: https://reviews.llvm.org/D155209
2023-07-14 16:08:15 -07:00
Rahul Kayaith
974c1596ab [mlir][python] Downcast attributes in more places
Update remaining `PyAttribute`-returning APIs to return `MlirAttribute` instead,
so that they go through the downcasting mechanism.

Reviewed By: makslevental

Differential Revision: https://reviews.llvm.org/D154462
2023-07-10 22:01:34 -04:00
max
4eee9ef976 Add SymbolRefAttr to python bindings
Differential Revision: https://reviews.llvm.org/D154541
2023-07-05 20:51:33 -05:00
max
9566ee2806 [MLIR][python bindings] TypeCasters for Attributes
Differential Revision: https://reviews.llvm.org/D151840
2023-06-07 12:01:00 -05:00
max
bfb1ba7526 [MLIR][python bindings] Add TypeCaster for returning refined types from python APIs
depends on D150839

This diff uses `MlirTypeID` to register `TypeCaster`s (i.e., `[](PyType pyType) -> DerivedTy { return pyType; }`) for all concrete types (i.e., `PyConcrete<...>`) that are then queried for (by `MlirTypeID`) and called in `struct type_caster<MlirType>::cast`. The result is that anywhere an `MlirType mlirType` is returned from a python binding, that `mlirType` is automatically cast to the correct concrete type. For example:

```
      c0 = arith.ConstantOp(f32, 0.0)
      # CHECK: F32Type(f32)
      print(repr(c0.result.type))

      unranked_tensor_type = UnrankedTensorType.get(f32)
      unranked_tensor = tensor.FromElementsOp(unranked_tensor_type, [c0]).result

      # CHECK: UnrankedTensorType
      print(type(unranked_tensor.type).__name__)
      # CHECK: UnrankedTensorType(tensor<*xf32>)
      print(repr(unranked_tensor.type))
```

This functionality immediately extends to typed attributes (i.e., `attr.type`).

The diff also implements similar functionality for `mlir_type_subclass`es but in a slightly different way: for such types (which have no corresponding C++ `class` or `struct`) the user must provide a type caster in Python (similar to how `AttrBuilder` works) or in C++ as a `py::cpp_function`.
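
The registry idea can be sketched in plain Python. All names here (`Type`, `F32Type`, `register_type_caster`, `downcast`, and the string type IDs) are hypothetical and stand in for the binding internals; the sketch only shows casters being registered per type ID and looked up when a generic type is returned.

```python
class Type:
    """Hypothetical generic type wrapper."""

    def __init__(self, type_id, asm):
        self.type_id = type_id
        self.asm = asm

    def __repr__(self):
        return f"{type(self).__name__}({self.asm})"


class F32Type(Type):
    """Hypothetical concrete type with its own type ID."""

    TYPE_ID = "f32-type-id"


# Registry mapping a type ID to a callable that builds the concrete wrapper.
_TYPE_CASTERS = {}


def register_type_caster(type_id, caster):
    _TYPE_CASTERS[type_id] = caster


def downcast(generic):
    """Return the refined wrapper if a caster is registered, else the input."""
    caster = _TYPE_CASTERS.get(generic.type_id)
    return caster(generic) if caster is not None else generic


register_type_caster(F32Type.TYPE_ID, lambda t: F32Type(t.type_id, t.asm))

refined = downcast(Type(F32Type.TYPE_ID, "f32"))
unknown = downcast(Type("unregistered-id", "!my.type"))
```

Falling back to the generic wrapper when no caster is registered is a design choice of this sketch; it keeps unregistered types usable instead of raising.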

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D150927
2023-05-26 11:02:05 -05:00
max
4811270bac [MLIR][python bindings] use pybind C++ APIs for throwing python errors.
Differential Revision: https://reviews.llvm.org/D151167
2023-05-23 11:31:16 -05:00
max
ef1b735dfb [MLIR][python bindings] Add support for DenseElementsAttr of IndexType
Differential Revision: https://reviews.llvm.org/D149690
2023-05-03 18:45:40 -05:00