This PR continues work from
https://github.com/llvm/llvm-project/pull/178290
Added local helper functions to avoid dependency on LLVM APIs.
---------
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
This PR extends the MLIR C API and Python bindings to support
**arbitrary-precision integers (`APInt`)**, overcoming the previous
limitation where `IntegerAttr` values were restricted to 64 bits.
Cryptographic applications often require integer types much larger than
standard machine words (e.g., the 256-bit modulus for the BN254 curve).
Previously, attempting to bind these values resulted in truncation or
errors. This PR exposes the underlying word-based `APInt` structure via
the C API and updates the Python bindings to seamlessly handle Python's
arbitrary-precision integers.
In
https://github.com/llvm/llvm-project/pull/174139#issuecomment-3733259370
I wrote a scuffed benchmark that mostly iterates MLIR Container Types in
Python. My changes from that PR made the performance worse, so I closed
it.
However, when experimetning with that I also saw a large(?) performance
gain by changing the `dunderNext` methods of the various Iterators to
use `PyErr_SetNone(PyExc_StopIteration);` instead of `throw
nb::stop_iteration();`.
<details><summary>Benchmark attempt script</summary>
```python
import timeit
from mlir.ir import Context, Location, Module, InsertionPoint, Block, Region, OpView
from mlir.dialects import func, builtin, scf, arith
def generate_module():
m = Module.create()
with InsertionPoint(m.body):
f = func.FuncOp("main", builtin.FunctionType.get([], []))
with InsertionPoint(f.body.blocks.append()):
generate_ops(10, 2)
func.ReturnOp([])
return m
def generate_ops(count: int, depth: int):
if depth == 0:
return
lower = arith.ConstantOp(builtin.IntegerType.get_signless(64), 0)
upper = arith.ConstantOp(builtin.IntegerType.get_signless(64), 100)
step = arith.ConstantOp(builtin.IntegerType.get_signless(64), 1)
for i in range(count):
forop = scf.ForOp(lower, upper, step)
with InsertionPoint(forop.region.blocks[0]):
generate_ops(count, depth - 1)
scf.YieldOp([])
def walk_module(m: Module):
walk_block(m.body)
def walk_region(region: Region):
for block in region.blocks:
walk_block(block)
def walk_block(block: Block):
for predecessors in block.predecessors:
pass
for successors in block.successors:
pass
for op in block.operations:
walk_op(op)
def walk_op(op: OpView):
for result in op.results:
pass
for successors in op.successors:
pass
for operands in op.operands:
pass
for region in op.regions:
walk_region(region)
with Context(), Location.unknown():
m = generate_module()
# From timeit.main:
t = timeit.Timer(lambda: walk_module(m))
number, _ = t.autorange()
repeats = 5
raw_timings = t.repeat(repeats, number)
timings = [dt / number for dt in raw_timings]
best = min(timings)
print(f"{number} loops, best of {repeats}: {best * 1000:.3g} msecs per loop")
```
</details>
The performance of the benchmark went from
```
50 loops, best of 5: 5.97 msecs per loop
```
to
```
50 loops, best of 5: 5.12 msecs per loop
```
in my setup, which is a ~14% improvement. (Though you should validate
that yourself, probably. My test setup is very scuffed)
The functions were previously set to return a C++ type like `PyRegion`.
Because of the removal of the `throw` they now had to [return a `NULL`
value to
Python](aa8578dc54/Objects/call.c (L49-L61)),
so I changed the return type to
`nanobind::typed<nanobind::object,PyRegion>` so I could return an
`nb::object()` in case an error was set and otherwise `nb::cast` the
`PyRegion` value to `nb::object` instead of returning it directly.
I'm not a huge fan, that this changes the external "Usage" of the
functions, because now they won't bubble up exceptions, when they are
called from C++ The return type and Python Error State have to be
checked instead.
I couldn't find any location that called them in llvm itself, though.
Maybe these functions should not be public, because they are only
supposed to be called from Python anyway?
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This PR continues the work of
https://github.com/llvm/llvm-project/pull/171775 by moving more useful
types/attributes into MLIRPythonSupport.
You can now do
```c++
struct PyTestIntegerRankedTensorType
: mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<
PyTestIntegerRankedTensorType,
mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyRankedTensorType>
struct PyTestTensorValue
: mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteValue<
PyTestTensorValue>
```
instead of `mlir_type_subclass` and `mlir_value_subclass`;
**specifically manual registration of the "value caster" via indirection
through the Python interpreter is no longer necessary** . You can also
now freely use all such types at the nanobind API level (e.g., overload
based on `FP*`):
```c++
using mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN;
standaloneM.def("print_fp_type", [](PyF16Type &) { nb::print("this is a fp16 type"); });
standaloneM.def("print_fp_type", [](PyF32Type &) { nb::print("this is a fp32 type"); });
standaloneM.def("print_fp_type", [](PyF64Type &) { nb::print("this is a fp64 type"); });
```
Note, here we only port `PythonTestModuleNanobind` but there is a
follow-up PR that ports **all** in-tree dialect extensions
https://github.com/llvm/llvm-project/pull/174156 to use these. After
that one we can soft deprecate `mlir_pure_subclass`.
Note, depends on https://github.com/llvm/llvm-project/pull/171775
# What
This PR adds a shared library `MLIRPythonSupport` which contains all of
the CRTP classes ike `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute`, as well as other useful code like `Defaulting*`
and etc enabling their reuse in downstream projects. Downstream projects
can now do
```c++
struct PyTestType : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyTestType> {
...
};
class PyTestAttr : public mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteAttribute<PyTestAttr> {
...
}
NB_MODULE(_mlirPythonTestNanobind, m) {
PyTestType::bind(m);
PyTestAttr::bind(m);
}
```
instead of using the discordant alternative
`mlir_type_subclass`/`mlir_attr_subclass` (same goes for
`PyConcreteValue`/`mlir_value_subclass`).
# Why
This PR is mostly code motion (along with CMake) but before I describe
the changes I want to state the goals/benefits:
1. Currently upstream "core" extensions and "dialect" extensions ([all
of the `Dialect*` extensions
here](d7c734b5a1/mlir/lib/Bindings/Python))
are a two-tier system;
**a**. [core
extensions](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Bindings/Python/IRTypes.cpp#L361)
enjoy first class support as far as type inference[^3], type stub
generation, and ease of implementation, while dialect extensions [have
poorer support](https://reviews.llvm.org/D150927), incorrect type stub
generation much more tedious (boilerplate) implementation;
**b**. Crucially, this two-tiered system is reflected in the fact that
**the two sets of types/attributes are not in the same Python object
hierarchy**. To wit: `isinstance(..., Type)` and `isinstance(...,
Attribute)` are not supported for the dialect extensions[^2];
**c**. Since these types are not exposed in public headers, downstream
users (dialect extensions or not) cannot write functions that overload
on e.g. `PyFloat8*Type` - that's quite a [useful
feature](fdbee98df8/cpp_ext/TorchOps.cpp (L29-L69))!
2. The dialect extensions incur a sizeable performance penalty relative
to the core extensions in that every single trip across the wire (either
`python->cpp` or `cpp->python`) requires work in addition to nanobind's
own casting/construction pipeline;
**a**. When going from `python->cpp`, [we extract the capsule object
from the Python
object](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h#L219C24-L219C46)
and then extract from the capsule the `Mlir*` opaque struct/ptr. This
side isn't so onerous;
**b**. When going from `cpp->python` we call long-hand call Python
`import` APIs and construct the Python object using `_CAPICreate`. Note,
there at least 2 `attr` calls incurred in addition to `_CAPICreate`;
this is already much more [efficiently handled by nanobind
itself](4ba51fcf79/src/nb_internals.h (L381-L382))!
3. This division blocks various features: in some configurations[^1] we
trigger a circular import bug because "dialect" types and attributes
perform an [import of the root `_mlir`
module](bd9651bf78/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h (L585))
when they are created (the types themselves, not even instances of those
types). This blocks type stub generation for dialect extensions (i.e.,
the reason we currently only generate type stubs for `_mlir`).
# How
Prior this was not done/possible because of "ODR" issues but I have
resolved those issues; the basic idea for how we solve this is "move
things we want to share into shared libraries":
1. Move IRCore (stuff like `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute`) into `MLIRPythonSupport`;
- Note, we move the rest of the things in `IRModule.h` (renamed to
`IRCore.h`) because `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute` depend on them. This makes for a bigger PR than
one would hope for but ultimately I think we should give people access
to these classes to use as they see fit (specifically inherit from, but
also liberally use in bindings signatures instead of the opaque `Mlir*`
struct wrappers).
2. Put all of this code into a nested namespace
`MLIR_BINDINGS_PYTHON_DOMAIN` which is determined by a compile time
define (and tied to `MLIR_BINDINGS_PYTHON_NB_DOMAIN`). This is necessary
in order to prevent conflicts on both symbol name **and** typeid
(necessary for nanobind to not double register binded types) between
multiple bindings libraries (e.g., `torch-mlir`, and `jax`). Note
[nanobind doesn't support `module_local` like
pybind11](https://nanobind.readthedocs.io/en/latest/porting.html#removed-features).
It does support `NB_DOMAIN` but that is not sufficient for
disambiguating typeids across projects (to wit: we currently define
`NB_DOMAIN` and it was still necessary to move everything to a nested
namespace);
3. Build the [nanobind library itself as a shared
object](https://github.com/wjakob/nanobind/blob/master/cmake/nanobind-config.cmake#L127)
(and link it to both the extensions and `MLIRPythonSupport`).
4. CMake to make this work, in-tree, out-of-tree, downstream, upstream,
etc.
# Testing
Three tests are added here
1. `PythonTestModuleNanobind` is ported to use
`PyConcreteType<PyTestType>` instead of `mlir_type_subclass` and
`PyConcreteAttribute<PyTestAttr>` instead of `mlir_atrr_subclass`,
verifying this works for non-core extensions in-tree;
2. `StandaloneExtensionNanobind` is ported to use `struct PyCustomType :
mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyCustomType>`
instead of `mlir_type_subclass` verifying this works for non-core
extensions out-of-tree;
3. `StandaloneExtensionNanobind`'s `smoketest` is extended to also load
another bindings package (namely `mlir`) verifying
`MLIR_BINDINGS_PYTHON_DOMAIN` successfully disambiguates symbols and
typeids.
I have also tested this downstream:
https://github.com/llvm/eudsl/pull/287 as well run the following builder
bots:
mlir-nvidia-gcc7:
https://lab.llvm.org/buildbot/#/buildrequests/6654424?redirect_to_build=true
I have also tested against IREE:
https://github.com/iree-org/iree/pull/21916
# Integration
It is highly recommended to set the CMake var
`MLIR_BINDINGS_PYTHON_NB_DOMAIN` (which will also determine
`MLIR_BINDINGS_PYTHON_DOMAIN`) to something unique for each downstream.
This can also be passed explicitly to `add_mlir_python_modules` if your
project builds multiple bindings packages. I added a `WARNING` to this
effect in `AddMLIRPython.cmake`.
[^3]: Python values being typed correctly when exiting from cpp;
[^1]: Specifically when the modules are imported using `importlib`,
which occurs with nanobind's
[stubgen](https://github.com/wjakob/nanobind/blob/master/src/stubgen.py#L965);
[^2]: The workaround we implemented was a class method for the dialect
bindings called `Class.isinstance(...)`;
This PR fixes a crash in the `bf_getbuffer` implementation of
`PyDenseElementsAttribute` that occurred when an element type was not
supported, such as `bf16`. I believe that supportion `bf16` is not
possible with that protocol but that's out of the scope of this PR.
Previsouly, the code raised an `std::exception` out of `bf_getbuffer`
that nanobind does not catch (see also pybind/pybind11#3336). The PR
makes the function catch all `std::exception`s and manually raises a
Python exception instead.
Signed-off-by: Ingo Müller <ingomueller@google.com>
Some of the current gettors require passing locations (i.e., there be an
active location) because they're using the "checked" APIs. This PR adds
"unchecked" gettors which only require an active context.
https://github.com/llvm/llvm-project/pull/157930 broke bazel build (see
https://github.com/llvm/llvm-project/pull/157930#issuecomment-3318681217)
because bazel is stricter on implicit conversions (some difference in
flags passed to clang). This PR fixes by moving/removing `nb::typed`.
EDIT: and also the overlay...
This a reland of https://github.com/llvm/llvm-project/pull/155741 which
was reverted at https://github.com/llvm/llvm-project/pull/157831. This
version is narrower in scope - it only turns on automatic stub
generation for `MLIRPythonExtension.Core._mlir` and **does not do
anything automatically**. Specifically, the only CMake code added to
`AddMLIRPython.cmake` is the `mlir_generate_type_stubs` function which
is then used only in a manual way. The API for
`mlir_generate_type_stubs` is:
```
Arguments:
MODULE_NAME: The fully-qualified name of the extension module (used for importing in python).
DEPENDS_TARGETS: List of targets these type stubs depend on being built; usually corresponding to the
specific extension module (e.g., something like StandalonePythonModules.extension._standaloneDialectsNanobind.dso)
and the core bindings extension module (e.g., something like StandalonePythonModules.extension._mlir.dso).
OUTPUT_DIR: The root output directory to emit the type stubs into.
OUTPUTS: List of expected outputs.
DEPENDS_TARGET_SRC_DEPS: List of cpp sources for extension library (for generating a DEPFILE).
IMPORT_PATHS: List of paths to add to PYTHONPATH for stubgen.
PATTERN_FILE: (Optional) Pattern file (see https://nanobind.readthedocs.io/en/latest/typing.html#pattern-files).
Outputs:
NB_STUBGEN_CUSTOM_TARGET: The target corresponding to generation which other targets can depend on.
```
Downstream users should use `mlir_generate_type_stubs` in coordination
with `declare_mlir_python_sources` to turn on stub generation for their
own downstream dialect extensions and upstream dialect extensions if
they so choose. Standalone example shows an example.
Note, downstream will also need to set
`-DMLIR_PYTHON_PACKAGE_PREFIX=...` correctly for their bindings.
This PR melds https://github.com/llvm/llvm-project/pull/150137 and
https://github.com/llvm/llvm-project/pull/149414 *and* partially reverts
https://github.com/llvm/llvm-project/pull/124832.
The summary is the `PyDenseResourceElementsAttribute` finalizer/deleter
has/had two problems
1. wasn't threadsafe (can be called from a different thread than that
which currently holds the GIL)
2. can be called while the interpreter is "not initialized"
https://github.com/llvm/llvm-project/pull/124832 for some reason decides
to re-initialize the interpreter to avoid case 2 and runs afoul of the
fact that `Py_IsInitialized` can be false during the finalization of the
interpreter itself (e.g., at the end of a script).
I don't know why this decision was made (I missed the PR) but I believe
we should never be calling
[Py_Initialize](https://docs.python.org/3/c-api/init.html#c.Py_Initialize):
> In an application \*\*\*\***embedding Python**\*\*\*\*, this should be
called before using any other Python/C API functions
**but we aren't embedding Python**!
So therefore we will only be in case 2 when the interpreter is being
finalized and in that case we should just leak the buffer.
Note,
[lldb](548ca9e976/lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp (L81-L93))
does a similar sort of thing for its finalizers.
Co-authored-by: Anton Korobeynikov <anton@korobeynikov.info>
Co-authored-by: Max Manainen <maximmanainen@gmail.com>
Co-authored-by: Anton Korobeynikov <anton@korobeynikov.info>
Co-authored-by: Max Manainen <maximmanainen@gmail.com>
In general, `PyDenseResourceElementsAttribute` can get deleted at any
time and any thread, where unlike the `getFromBuffer` call, the Python
interpreter may not be initialized and the GIL may not be held.
This PR fixes segfaults caused by `PyBuffer_Release` when the GIL is not
being held by the thread calling the deleter.
Model the `IndexType` as `uint64_t` when converting to a python integer.
With the python bindings,
```python
DenseIntElementsAttr(op.attributes["attr"])
```
used to `assert` when `attr` had `index` type like `dense<[1, 2, 3, 4]>
: vector<4xindex>`.
---------
Co-authored-by: Christopher McGirr <christopher.mcgirr@amd.com>
Co-authored-by: Tiago Trevisan Jost <tiago.trevisanjost@amd.com>
In the LLVM style guide, we prefer not using braced initializer lists to
call a constructor. Also, we prefer using an equal before the open curly
brace if we use a braced initializer list when initializing a variable.
See
https://llvm.org/docs/CodingStandards.html#do-not-use-braced-initializer-lists-to-call-a-constructor
for more details.
The style guide does not explain the reason well. There is an article
from abseil, which mentions few benefits. E.g., we can avoid the most
vexing parse, etc. See https://abseil.io/tips/88 for more details.
Signed-off-by: hanhanW <hanhan0912@gmail.com>
This is a companion to #118583, although it can be landed independently
because since #117922 dialects do not have to use the same Python
binding framework as the Python core code.
This PR ports all of the in-tree dialect and pass extensions to
nanobind, with the exception of those that remain for testing pybind11
support.
This PR also:
* removes CollectDiagnosticsToStringScope from NanobindAdaptors.h. This
was overlooked in a previous PR and it is duplicated in Diagnostics.h.
---------
Co-authored-by: Jacques Pienaar <jpienaar@google.com>
Relands #118583, with a fix for Python 3.8 compatibility. It was not
possible to set the buffer protocol accessers via slots in Python 3.8.
Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.
For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.
To a large extent, this is a mechanical change, for instance changing
`pybind11::` to `nanobind::`.
Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this ways and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.
For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.
To a large extent, this is a mechanical change, for instance changing
`pybind11::`
to `nanobind::`.
Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this ways and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
This PR re-introduces the functionality of
https://github.com/llvm/llvm-project/pull/113064, which was reverted in
0a68171b3c
due to memory lifetime issues.
Notice that I was not able to re-produce the ASan results myself, so I
have not been able to verify that this PR really fixes the issue.
---
Currently it is unsupported to:
1. Convert a MlirAttribute with type i1 to a numpy array
2. Convert a boolean numpy array to a MlirAttribute
Currently the entire Python application violently crashes with a quite
poor error message https://github.com/pybind/pybind11/issues/3336
The complication handling these conversions, is that MlirAttribute
represent booleans as a bit-packed i1 type, whereas numpy represents
booleans as a byte array with 8 bit used per boolean.
This PR proposes the following approach:
1. When converting a i1 typed MlirAttribute to a numpy array, we can not
directly use the underlying raw data backing the MlirAttribute as a
buffer to Python, as done for other types. Instead, a copy of the data
is generated using numpy's unpackbits function, and the result is send
back to Python.
2. When constructing a MlirAttribute from a numpy array, first the
python data is read as a uint8_t to get it converted to the endianess
used internally in mlir. Then the booleans are bitpacked using numpy's
bitpack function, and the bitpacked array is saved as the MlirAttribute
representation.
Currently it is unsupported to:
1. Convert a `MlirAttribute` with type `i1` to a numpy array
2. Convert a boolean numpy array to a `MlirAttribute`
Currently the entire Python application violently crashes with a quite
poor error message https://github.com/pybind/pybind11/issues/3336
The complication handling these conversions, is that `MlirAttribute`
represent booleans as a bit-packed `i1` type, whereas numpy represents
booleans as a byte array with 8 bit used per boolean.
This PR proposes the following approach:
1. When converting a `i1` typed `MlirAttribute` to a numpy array, we can
not directly use the underlying raw data backing the `MlirAttribute` as
a buffer to Python, as done for other types. Instead, a copy of the data
is generated using numpy's unpackbits function, and the result is send
back to Python.
2. When constructing a `MlirAttribute` from a numpy array, first the
python data is read as a `uint8_t` to get it converted to the endianess
used internally in mlir. Then the booleans are bitpacked using numpy's
bitpack function, and the bitpacked array is saved as the
`MlirAttribute` representation.
Please note that I am not sure if this approach is the desired solution.
I'd appreciate any feedback.
* Strip calls to raw_string_ostream::flush(), which is essentially a no-op
* Strip unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
Similar to other attributes in Binding, the `PyAffineMapAttribute`
should include a value attribute to enable users to directly retrieve
the `AffineMap` from the `AffineMapAttr`.
This change adds bindings for `mlirDenseElementsAttrGet` which accepts a
list of MLIR attributes and constructs a DenseElementsAttr. This allows
for creating `DenseElementsAttr`s of types not natively supported by
Python (e.g. BF16) without requiring other dependencies (e.g. `numpy` +
`ml-dtypes`).
Only construction and type casting are implemented. The method to create
is explicitly named "unsafe" and the documentation calls out what the
caller is responsible for. There really isn't a better way to do this
and retain the power-user feature this represents.
This reverts a feature introduced in commit
2a5d497494c24425e99655b85e2277dd3f15a400. The goal of that commit was to
allow `StringAttr`s to by used transparently wherever Python `str`s are
expected. But, as the tests in https://reviews.llvm.org/D159182 reveal,
pybind11 doesn't do this conversion based on `__str__` automatically,
unlike for the other types introduced in the commit above. At the same
time, changing `__str__` breaks the symmetry with other attributes of
`print(attr)` printing the assembly of the attribute, so the change
probably has more disadvantages than advantages.
Reviewed By: springerm, rkayaith
Differential Revision: https://reviews.llvm.org/D159255
This allows to use Python's `bool(.)`, `float(.)`, `int(.)`, and
`str(.)` to convert pybound attributes to the corresponding native
Python types. In particular, pybind11 uses these functions to
automatically cast objects to the corresponding primitive types wherever
they are required by pybound functions, e.g., arguments are converted to
Python's `int` if the C++ signature requires a C++ `int`. With this
patch, pybound attributes can by used wherever the corresponding native
types are expected. New tests show-case this behavior in the
constructors of `Dense*ArrayAttr`.
Note that this changes the output of Python's `str` on `StringAttr` from
`"hello"` to `hello`. Arguably, this is still in line with `str`s goal
of producing a readable interpretation of the value, even if it is now
not unambiously a string anymore (`print(ir.Attribute.parse('"42"'))`
now outputs `42`). However, this is consistent with instances of
Python's `str` (`print("42")` outputs `42`), and `repr` still provides
an unambigous representation if one is required.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D158974
This patch makes the getter function of `DenseBoolArrayAttr` work more
intuitively. Until now, it was implemented with a `std::vector<int>`
argument, which works in the typical situation where you call the pybind
function with a list of Python bools (like `[True, False]`). However, it
does *not* work if the elements of the list have to be cast to Bool
before (and that is the default behavior for lists of all other types).
The patch thus changes the signature to `std::vector<bool>`, which helps
pybind to make the function behave as expected for bools. The tests now
also contain a case where such a cast is happening. This also makes the
conversion of `DenseBoolArrayAttr` back to Python more intuitive:
instead of converting to `0` and `1`, the elements are now converted to
`False` and `True`.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D158973
Not every NumPy type (e.g., the `ml_dtypes.bfloat16` NumPy extension
type) has a type in the Python buffer protocol, so exporting such a
buffer with `PyBUF_FORMAT` may fail.
However, we don't care about the self-reported type of a buffer if the
user provides an explicit type. In the case that an explicit type is
provided, don't request the format from the buffer protocol, which
allows arrays whose element types are unknown to the buffer protocol to
be passed.
Reviewed By: jpienaar, ftynse
Differential Revision: https://reviews.llvm.org/D155209
Update remaining `PyAttribute`-returning APIs to return `MlirAttribute` instead,
so that they go through the downcasting mechanism.
Reviewed By: makslevental
Differential Revision: https://reviews.llvm.org/D154462
depends on D150839
This diff uses `MlirTypeID` to register `TypeCaster`s (i.e., `[](PyType pyType) -> DerivedTy { return pyType; }`) for all concrete types (i.e., `PyConcrete<...>`) that are then queried for (by `MlirTypeID`) and called in `struct type_caster<MlirType>::cast`. The result is that anywhere an `MlirType mlirType` is returned from a python binding, that `mlirType` is automatically cast to the correct concrete type. For example:
```
c0 = arith.ConstantOp(f32, 0.0)
# CHECK: F32Type(f32)
print(repr(c0.result.type))
unranked_tensor_type = UnrankedTensorType.get(f32)
unranked_tensor = tensor.FromElementsOp(unranked_tensor_type, [c0]).result
# CHECK: UnrankedTensorType
print(type(unranked_tensor.type).__name__)
# CHECK: UnrankedTensorType(tensor<*xf32>)
print(repr(unranked_tensor.type))
```
This functionality immediately extends to typed attributes (i.e., `attr.type`).
The diff also implements similar functionality for `mlir_type_subclass`es but in a slightly different way - for such types (which have no cpp corresponding `class` or `struct`) the user must provide a type caster in python (similar to how `AttrBuilder` works) or in cpp as a `py::cpp_function`.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D150927