Add `MLIR_ENABLE_PYTHON_STABLE_ABI` cmake flag to build bindings against
the Python limited/stable API (abi3 / PEP 384). This allow for
compatibility across different >=3.12 versions with a single .so /
wheel. We also require CMake >=3.26.
The stable ABI restricts usage to a subset of the CPython C API: frame
and code object structs are opaque, so introspection APIs like
`PyCode_Addr2Location`, `PyFrame_GetLasti`, and `PyFrame_GetCode` are
unavailable. The traceback-based auto-location logic is dropped because
we don’t have stable ABI to produce complete locations.
Assisted-by: claude
Remove problematic `return()` statements from
mlir_configure_python_dev_packages macro and convert nanobind detection
to FATAL_ERROR.
Using `return()` in a CMake macro is problematic because it returns from
the calling scope, not the macro itself, causing the parent
CMakeLists.txt to silently skip subsequent configuration steps. This
leads to confusing build failures when Python bindings are enabled but
nanobind is missing.
With this change, users get an immediate error with clear instructions
to install nanobind or set nanobind_DIR, rather than silent
configuration failures.
This a reland of https://github.com/llvm/llvm-project/pull/155741 which
was reverted at https://github.com/llvm/llvm-project/pull/157831. This
version is narrower in scope - it only turns on automatic stub
generation for `MLIRPythonExtension.Core._mlir` and **does not do
anything automatically**. Specifically, the only CMake code added to
`AddMLIRPython.cmake` is the `mlir_generate_type_stubs` function which
is then used only in a manual way. The API for
`mlir_generate_type_stubs` is:
```
Arguments:
MODULE_NAME: The fully-qualified name of the extension module (used for importing in python).
DEPENDS_TARGETS: List of targets these type stubs depend on being built; usually corresponding to the
specific extension module (e.g., something like StandalonePythonModules.extension._standaloneDialectsNanobind.dso)
and the core bindings extension module (e.g., something like StandalonePythonModules.extension._mlir.dso).
OUTPUT_DIR: The root output directory to emit the type stubs into.
OUTPUTS: List of expected outputs.
DEPENDS_TARGET_SRC_DEPS: List of cpp sources for extension library (for generating a DEPFILE).
IMPORT_PATHS: List of paths to add to PYTHONPATH for stubgen.
PATTERN_FILE: (Optional) Pattern file (see https://nanobind.readthedocs.io/en/latest/typing.html#pattern-files).
Outputs:
NB_STUBGEN_CUSTOM_TARGET: The target corresponding to generation which other targets can depend on.
```
Downstream users should use `mlir_generate_type_stubs` in coordination
with `declare_mlir_python_sources` to turn on stub generation for their
own downstream dialect extensions and upstream dialect extensions if
they so choose. Standalone example shows an example.
Note, downstream will also need to set
`-DMLIR_PYTHON_PACKAGE_PREFIX=...` correctly for their bindings.
Fixes#126162
I check locally that it works without warning for:
- neither options are defined
- both defined to the same value
And I checked that it warns if:
- only one is defined
- they defined to different values
This is a companion to #118583, although it can be landed independently
because since #117922 dialects do not have to use the same Python
binding framework as the Python core code.
This PR ports all of the in-tree dialect and pass extensions to
nanobind, with the exception of those that remain for testing pybind11
support.
This PR also:
* removes CollectDiagnosticsToStringScope from NanobindAdaptors.h. This
was overlooked in a previous PR and it is duplicated in Diagnostics.h.
---------
Co-authored-by: Jacques Pienaar <jpienaar@google.com>
Relands #118583, with a fix for Python 3.8 compatibility. It was not
possible to set the buffer protocol accessers via slots in Python 3.8.
Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.
For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.
To a large extent, this is a mechanical change, for instance changing
`pybind11::` to `nanobind::`.
Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this ways and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.
For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.
To a large extent, this is a mechanical change, for instance changing
`pybind11::`
to `nanobind::`.
Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this ways and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
This PR allows out-of-tree dialects to write Python dialect modules
using nanobind instead of pybind11.
It may make sense to migrate in-tree dialects and some of the ODS Python
infrastructure to nanobind, but that is a topic for a future change.
This PR makes the following changes:
* adds nanobind to the CMake and Bazel build systems. We also add
robin_map to the Bazel build, which is a dependency of nanobind.
* adds a PYTHON_BINDING_LIBRARY option to various CMake functions, such
as declare_mlir_python_extension, allowing users to select a Python
binding library.
* creates a fork of mlir/include/mlir/Bindings/Python/PybindAdaptors.h
named NanobindAdaptors.h. This plays the same role, using nanobind
instead of pybind11.
* splits CollectDiagnosticsToStringScope out of PybindAdaptors.h and
into a new header mlir/include/mlir/Bindings/Python/Diagnostics.h, since
it is code that is no way related to pybind11 or for that matter,
Python.
* changed the standalone Python extension example to have both pybind11
and nanobind variants.
* changed mlir/python/mlir/dialects/python_test.py to have both pybind11
and nanobind variants.
Notes:
* A slightly unfortunate thing that I needed to do in the CMake
integration was to use FindPython in addition to FindPython3, since
nanobind's CMake integration expects the Python_ names for variables.
Perhaps there's a better way to do this.
Adds a CMake option MLIR_DISABLE_CONFIGURE_PYTHON_DEV_PACKAGES which
gates doing package discovery and configuration for Python dev packages
by MLIR (this was made opt-out to preserve compatibility with
find_package(MLIR) based uses which do not set the standard options).
The default Python setup that MLIR does has been a problem for
super-projects that include LLVM for a long time because it forces a
very specific package discovery mechanism that is not uniform in all
uses.
When reviewing #117922, I noted that this would effectively be a break
the world event for downstreams, forcing them to adapt their nanobind
dep to the exact way that MLIR does it. Adding the option to just
wholesale skip the built-in configuration heuristics at least gives us a
mechanism to tell downstreams to migrate to, giving them complete
control and not requiring packaging workarounds. This seemed a better
option than (once again) creating a situation where downstreams could
not integrate the dep change without doing tricky infra upgrades, and it
removes the burden from the author of that patch from needing to think
about how this affects super-projects that include MLIR (i.e. they can
just be told to do it themselves as needed vs being in a wedged state
and unable to upgrade).
This PR updates the minimal required version of pybind11 from 2.9.0 to
2.10.0. New new version is almost 2.5 years old, which is half a year
less than the previous version. This change is necessary to support the
changes introduced in #115307, which does not compile with pybind11
v.2.9.
Signed-off-by: Ingo Müller <ingomueller@google.com>
NumPy header files were required for building MLIR, however the NumPy
C-API is never used. In other words, NumPy is not a build time
dependency. `numpy`, the python package, is required at runtime for the
python bindings tests. In particular the file
`mlir/python/mlir/runtime/np_to_memref.py` and all tests which may use
it. This commit removes the build time dependency, but the runtime
dependency remains through the `requirements.txt` file.
Co-authored-by: Erick Ochoa <erick@ceci-nest-pas.me>
CMake older than 3.20.0 is no longer supported.
This removes work-arounds for no longer supported versions.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D152101
2.9.0 was released on December 28, 2021, and some following changes
require at least this version.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D150247
The minimum required version is now 3.19 due to the usage of some
more recent features. Update the version check and error message
accordingly. Also remove some logic that behaved differently before
3.18, since we can assume we are now on version 3.19+.
Reviewed By: stella.stamenova
Differential Revision: https://reviews.llvm.org/D130171
On Windows (at least), cmake ignores Python3_EXECUTABLE unless the
'Interpreter' component is being found. If the user is specifying a
different version than the latest installed (say, 3.8 vs 3.9) with the
Python3_EXECUTABLE, cmake was using a combination of the newest version
and the desired version. Mitigated by adding 'Interpreter' in the first
invocation like the second one.
This adds an option to configure the CMake python search priming
behaviour that was introduced in D118148. In some environments the
priming would cause the "real" search to fail. The default behaviour is
unchanged, i.e. the search will be primed.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D120765
Although cmake should be platform-independent, we've observed
that some aspects of Python detection don't work on all platforms,
even with recent versions of cmake. This appears to be due to bugs
in the python detection logic, especially when the NumPy component
is required and not located within the python installation.
As a workaround, this patch first searches for "Development" before
searching for "Development.Module", which seems to workaround the
issue.
Differential Revision: https://reviews.llvm.org/D118148
The constructor function was being defined without indicating its "__init__"
name, which made it interpret it as a regular fuction rather than a
constructor. When overload resolution failed, Pybind would attempt to print the
arguments actually passed to the function, including "self", which is not
initialized since the constructor couldn't be called. This would result in
"__repr__" being called with "self" referencing an uninitialized MLIR C API
object, which in turn would cause undefined behavior when attempting to print
in C++. Even if the correct name is provided, the mechanism used by
PybindAdaptors.h to bind constructors directly as "__init__" functions taking
"self" is deprecated by Pybind. The new mechanism does not seem to have access
to a fully-constructed "self" object (i.e., the constructor in C++ takes a
`pybind11::detail::value_and_holder` that cannot be forwarded back to Python).
Instead, redefine "__new__" to perform the required checks (there are no
additional initialization needed for attributes and types as they are all
wrappers around a C++ pointer). "__new__" can call its equivalent on a
superclass without needing "self".
Bump pybind11 dependency to 3.8.0, which is the first version that allows one
to redefine "__new__".
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D117646
As discussed on discord, we have never actually been able to build with the project-wide published min version of 3.14.3. The buildbot that tests the Python configuration is currently pinned to 3.19.1, and there are a number of non-version/policy controlled features that Python building relies on that makes it unreliable with older versions. Some of the issues are pretty fundamental and I don't know how to do them on the older version. I think that, as an optional feature, at least advertising the PSA as in this patch is a good middle ground until the next project-wide CMake version bump.
Also moves setup logic to a macro so that everyone can use it.
CHECK_* directives for message() where added in Cmake 3.17, LLVM
requires 3.14 as minimum so they may not be intepreted correctly and
just print "CHECK_*" into the message stream. Replace them with STATUS.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D91959
* Makes `pip install pybind11` do the right thing with no further config.
* Since we now require a version of pybind11 greater than many LTS OS installs (>=2.6), a more convenient way to get a recent version is preferable.
* Also adds the version spec to find_package so it will skip older versions that may be lying around.
* Tested the full matrix of old system install, no system install, pip install and no pip install.
Differential Revision: https://reviews.llvm.org/D91903