The idea is to use TableGen records for both custom type constraints and
attributes:
* `PythonTypeName` is for type constraints, while
* `PythonAttrType` is for attributes.
The key types differ between these two records. `PythonTypeName` is
keyed by C++ type because multiple type constraints map to the same C++
type (e.g. `I32` and `I64` both map to `::mlir::IntegerType`), so a
single entry covers all of them. `PythonAttrType` is keyed by TableGen
def name because different attributes can share the same C++ storage
type but need distinct Python types (e.g. `I32ArrayAttr` and
`StrArrayAttr` are both `::mlir::ArrayAttr`).
We could in theory reimplement `getPythonAttrName` using the same
approach, but I decided to leave it for future PRs.
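The difference in keys can be sketched with two lookup tables. This is only an illustration, not the generator code from this change: the table accessors, the `lookup` helper, and the Python type names on the right-hand side are all assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical table mirroring `PythonTypeName`: keyed by C++ type, so one
// entry serves every constraint (I32, I64, ...) that maps to that C++ type.
inline const std::map<std::string, std::string> &typeNameByCppType() {
  static const std::map<std::string, std::string> table = {
      {"::mlir::IntegerType", "IntegerType"}, // illustrative Python name
  };
  return table;
}

// Hypothetical table mirroring `PythonAttrType`: keyed by TableGen def name,
// because both defs below share the C++ storage type `::mlir::ArrayAttr`.
inline const std::map<std::string, std::string> &attrTypeByDefName() {
  static const std::map<std::string, std::string> table = {
      {"I32ArrayAttr", "Sequence[int]"}, // illustrative Python types
      {"StrArrayAttr", "Sequence[str]"},
  };
  return table;
}

// Returns the mapped name, or an empty string when there is no entry.
inline std::string lookup(const std::map<std::string, std::string> &table,
                          const std::string &key) {
  assert(!key.empty() && "lookup key must not be empty");
  auto it = table.find(key);
  return it == table.end() ? std::string() : it->second;
}
```

Keying the attribute table by C++ type instead would collapse `I32ArrayAttr` and `StrArrayAttr` into a single entry, which is the situation the def-name key avoids.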
Issue #152743 reports that DefaultValuedProp is printed even when the
property value equals the default, unlike DefaultValuedAttr which is not
printed in that case.
The fix for this was already present in the codebase since commit
8955e285e1ac ("[mlir] Add property combinators, initial ODS support"),
which added elision of default-valued properties in the
genPropDictPrinter function in OpFormatGen.cpp.
This commit adds:
- Documentation in Operations.md clarifying that DefaultValuedProp is
also elided from prop-dict output when the value equals the default,
consistent with the existing documentation for DefaultValuedAttr.
- An explicit test in properties.mlir verifying that DefaultValuedProp
with value equal to default is elided from prop-dict output, and that
DefaultValuedProp with a non-default value is still printed.
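The elision rule that the new test checks can be sketched as follows. This is a simplified model and not the code in genPropDictPrinter: the `Prop` struct, the `printPropDict` helper, and the exact output syntax are assumptions for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an op property; real properties are typed.
struct Prop {
  std::string name;
  int value;
  bool hasDefault;  // true for DefaultValuedProp-like entries
  int defaultValue; // only meaningful when hasDefault is true
};

// Builds a prop-dict-like string, skipping entries equal to their default.
inline std::string printPropDict(const std::vector<Prop> &props) {
  std::string out = "<{";
  bool first = true;
  for (const Prop &p : props) {
    assert(!p.name.empty() && "properties must be named");
    if (p.hasDefault && p.value == p.defaultValue)
      continue; // value equals the default: elide it
    if (!first)
      out += ", ";
    out += p.name + " = " + std::to_string(p.value);
    first = false;
  }
  return out + "}>";
}
```

With this model, a property `a` holding its default of `0` disappears from the output, while a property `b` holding `5` against a default of `0` is still printed.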
Fixes #152743
Assisted-by: Claude Code
This commit removes the 'Pure' trait from a number of TOSA operations.
Instead of marking most ops as pure by default, the trait is now opt-in
for operations that are provably side-effect free and speculatable.
Several operations were previously marked as pure unintentionally.
The following operations have had 'Pure' removed (reason in brackets):
- ARGMAX (out-of-range index)
- AVG_POOL2D (accumulator overflow/underflow)
- AVG_POOL2D_ADAPTIVE (same as above)
- CONV2D (accumulator overflow/underflow)
- CONV2D_BLOCK_SCALED (accumulator overflow/underflow)
- CONV3D (accumulator overflow/underflow)
- DEPTHWISE_CONV2D (accumulator overflow/underflow)
- MATMUL (accumulator overflow/underflow)
- MATMUL_T_BLOCK_SCALED (accumulator overflow/underflow)
- TRANSPOSE_CONV2D (accumulator overflow/underflow)
- ADD (overflow)
- SUB (underflow)
- MUL (invalid shift, overflow)
- ARITHMETIC_RIGHT_SHIFT (invalid shift value)
- LOGICAL_LEFT_SHIFT (invalid shift value)
- LOGICAL_RIGHT_SHIFT (invalid shift value)
- INTDIV (division by zero)
- POW (negative exponent restrictions)
- TABLE (invalid slope computation)
- ABS (underflow)
- NEGATE (overflow/underflow)
- REDUCE_PRODUCT (overflow)
- REDUCE_SUM (overflow)
- GATHER (out-of-range indices)
- SCATTER (out-of-range or duplicate indices)
- RESCALE (overflow/underflow)
Many of these operations can exhibit undefined behaviour when a
`REQUIRE` condition in the TOSA specification pseudocode fails. Whether
such failures result in a runtime error is implementation-defined. As a
result, speculating or reordering these operations can change program
behaviour.
For this reason, the `AlwaysSpeculatable` property implied by `Pure` is
not valid for these ops. The `NoMemoryEffect` trait is retained, as
these operations do not have direct memory side effects.
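To illustrate why speculation is unsafe here, the sketch below models an INTDIV-like op whose `REQUIRE` check fails on a zero divisor. Throwing an exception is just one possible implementation-defined outcome, chosen for the example; all function names are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

// Models an op with a REQUIRE condition. Throwing on failure is one possible
// implementation-defined outcome, chosen here purely for illustration.
inline int intdiv(int lhs, int rhs) {
  if (rhs == 0)
    throw std::runtime_error("REQUIRE failed: division by zero");
  return lhs / rhs;
}

// Original program: the division only executes when the guard holds.
inline int guarded(int lhs, int rhs) {
  return rhs != 0 ? intdiv(lhs, rhs) : 0;
}

// After (invalid) speculation: the division is hoisted above the guard.
inline int speculated(int lhs, int rhs) {
  int hoisted = intdiv(lhs, rhs); // now executes unconditionally
  return rhs != 0 ? hoisted : 0;
}

// Returns true if calling f(lhs, rhs) raises the modelled runtime error.
template <typename Fn> bool fails(Fn f, int lhs, int rhs) {
  try {
    f(lhs, rhs);
  } catch (const std::runtime_error &) {
    return true;
  }
  return false;
}
```

Both versions agree whenever the divisor is non-zero, but only the speculated one can fail for a zero divisor, so the transformation changes observable behaviour even though no memory is touched.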
`DenseIntOrFPElementsAttr` was recently generalized to accept any type
that implements the `DenseElementType` interface. The name
`DenseIntOrFPElementsAttr` no longer makes sense. This commit
renames the attribute to `DenseTypedElementsAttr`. An alias is kept for
migration purposes and will be removed after some time.
This patch allows creating a hierarchy of `SideEffects::Resource`s by adding
a virtual `getParent()` method, so that effects on *disjoint* resources
can be proven non-conflicting. It also adds a virtual `isAddressable()` method
that indicates whether a resource can be addressed via a pointer
value. Non-addressable resources cannot be affected via any pointer.
This unblocks CSE, LICM and alias analysis without per-pass
special-casing.
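A minimal sketch of how such a hierarchy could be queried is shown below. The `Resource` class here is a simplified stand-in and not the actual `SideEffects::Resource`; the helpers `isSameOrAncestor`, `areDisjoint`, and `pointerAccessMayAffect` are hypothetical names introduced for the example.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for SideEffects::Resource.
class Resource {
public:
  explicit Resource(const Resource *parent = nullptr, bool addressable = true)
      : parent(parent), addressable(addressable) {}
  virtual ~Resource() = default;

  // Parent in the hierarchy, or nullptr for a root resource.
  virtual const Resource *getParent() const { return parent; }
  // Whether the resource can be reached through a pointer value.
  virtual bool isAddressable() const { return addressable; }

private:
  const Resource *parent;
  bool addressable;
};

// True if `ancestor` is `r` itself or one of its transitive parents.
inline bool isSameOrAncestor(const Resource *ancestor, const Resource *r) {
  assert(ancestor && "ancestor must not be null");
  for (; r; r = r->getParent())
    if (r == ancestor)
      return true;
  return false;
}

// Two resources are disjoint when neither one contains the other.
inline bool areDisjoint(const Resource &a, const Resource &b) {
  return !isSameOrAncestor(&a, &b) && !isSameOrAncestor(&b, &a);
}

// In this model, an access through a pointer can only touch addressable
// resources.
inline bool pointerAccessMayAffect(const Resource &r) {
  return r.isAddressable();
}
```

Under these assumptions, two children of the same root are disjoint from each other but not from the root, and a non-addressable resource is never affected by a pointer-based access.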
RFC:
https://discourse.llvm.org/t/rfc-mlir-memory-region-hierarchy-for-mlir-side-effects/89811
Unifies the two dialects that define x86 operations into a single one.
The AMX dialect is moved into X86 in line with other x86 extensions.
Following the dialect renaming, the X86 dialect is now a suitable home for
a wider range of operations targeting specific hardware features. Moving
AMX definitions to the X86 dialect creates a single, centralized hub for
defining all x86 intrinsic-like operations. The new grouping aims to
eliminate the need for new dialects as new hardware extensions become
available.
The two dialects are simply merged together. X86 dialect refactoring
will be addressed separately.
List of changes:
- operations: 'amx.tile_*' => 'x86.amx.tile_*'
- types: '!amx.tile' => '!x86.amx.tile'
- namespace: 'mlir::amx' => 'mlir::x86::amx'
- test define: 'MLIR_RUN_AMX_TESTS' => 'MLIR_RUN_X86_AMX_TESTS'
- vector lowering: AMX is enabled by default together with X86
The MLIR AMX tests are now nested under the X86 directory. To enable AMX
integration tests, 'MLIR_RUN_X86_TESTS' must also be defined.
Renames 'x86vector' dialect to 'x86'.
This is the first PR in a series of cleanups around dialects targeting x86
platforms.
The new naming scheme is shorter, cleaner, and opens the possibility of
integrating other x86-specific operations that do not strictly fit a pure
vector representation. For example, the generalization will allow for
a future merger of the AMX dialect into the x86 dialect to create a one-stop
collection of x86 operations and boost discoverability.
Update MLIR's default SM to `sm_75`. This matches the behavior of
offline compilation tools in the CUDA Toolkit (`nvcc`, `ptxas`, ...) and
follows suit with 9fc5fd0ad689eed94f65b1d6d10f9c5642935e68.
Additionally, `sm_75` is the oldest GPU variant compatible with the
widest range of recent major CUDA Toolkit versions (11/12/13).
# What
This PR adds a shared library `MLIRPythonSupport` which contains all of
the CRTP classes like `PyConcreteValue`, `PyConcreteType`, and
`PyConcreteAttribute`, as well as other useful code like `Defaulting*`,
enabling their reuse in downstream projects. Downstream projects
can now do
```c++
struct PyTestType : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyTestType> {
...
};
class PyTestAttr : public mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteAttribute<PyTestAttr> {
...
};
NB_MODULE(_mlirPythonTestNanobind, m) {
PyTestType::bind(m);
PyTestAttr::bind(m);
}
```
instead of using the discordant alternative
`mlir_type_subclass`/`mlir_attr_subclass` (same goes for
`PyConcreteValue`/`mlir_value_subclass`).
# Why
This PR is mostly code motion (along with CMake) but before I describe
the changes I want to state the goals/benefits:
1. Currently upstream "core" extensions and "dialect" extensions ([all
of the `Dialect*` extensions
here](d7c734b5a1/mlir/lib/Bindings/Python))
are a two-tier system;
**a**. [core
extensions](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Bindings/Python/IRTypes.cpp#L361)
enjoy first class support as far as type inference[^3], type stub
generation, and ease of implementation, while dialect extensions [have
poorer support](https://reviews.llvm.org/D150927), incorrect type stub
generation, and a much more tedious (boilerplate-heavy) implementation;
**b**. Crucially, this two-tiered system is reflected in the fact that
**the two sets of types/attributes are not in the same Python object
hierarchy**. To wit: `isinstance(..., Type)` and `isinstance(...,
Attribute)` are not supported for the dialect extensions[^2];
**c**. Since these types are not exposed in public headers, downstream
users (dialect extensions or not) cannot write functions that overload
on e.g. `PyFloat8*Type` - that's quite a [useful
feature](fdbee98df8/cpp_ext/TorchOps.cpp (L29-L69))!
2. The dialect extensions incur a sizeable performance penalty relative
to the core extensions in that every single trip across the wire (either
`python->cpp` or `cpp->python`) requires work in addition to nanobind's
own casting/construction pipeline;
**a**. When going from `python->cpp`, [we extract the capsule object
from the Python
object](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h#L219C24-L219C46)
and then extract from the capsule the `Mlir*` opaque struct/ptr. This
side isn't so onerous;
**b**. When going from `cpp->python` we call the Python `import` APIs
long-hand and construct the Python object using `_CAPICreate`. Note,
there are at least 2 `attr` calls incurred in addition to `_CAPICreate`;
this is already much more [efficiently handled by nanobind
itself](4ba51fcf79/src/nb_internals.h (L381-L382))!
3. This division blocks various features: in some configurations[^1] we
trigger a circular import bug because "dialect" types and attributes
perform an [import of the root `_mlir`
module](bd9651bf78/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h (L585))
when they are created (the types themselves, not even instances of those
types). This blocks type stub generation for dialect extensions (i.e.,
the reason we currently only generate type stubs for `_mlir`).
# How
Previously this was not done/possible because of "ODR" issues, but I have
resolved those issues; the basic idea for how we solve this is "move
things we want to share into shared libraries":
1. Move IRCore (stuff like `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute`) into `MLIRPythonSupport`;
- Note, we move the rest of the things in `IRModule.h` (renamed to
`IRCore.h`) because `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute` depend on them. This makes for a bigger PR than
one would hope for but ultimately I think we should give people access
to these classes to use as they see fit (specifically inherit from, but
also liberally use in bindings signatures instead of the opaque `Mlir*`
struct wrappers).
2. Put all of this code into a nested namespace
`MLIR_BINDINGS_PYTHON_DOMAIN` which is determined by a compile time
define (and tied to `MLIR_BINDINGS_PYTHON_NB_DOMAIN`). This is necessary
in order to prevent conflicts on both symbol name **and** typeid
(necessary for nanobind to not double register bound types) between
multiple bindings libraries (e.g., `torch-mlir`, and `jax`). Note
[nanobind doesn't support `module_local` like
pybind11](https://nanobind.readthedocs.io/en/latest/porting.html#removed-features).
It does support `NB_DOMAIN` but that is not sufficient for
disambiguating typeids across projects (to wit: we currently define
`NB_DOMAIN` and it was still necessary to move everything to a nested
namespace);
3. Build the [nanobind library itself as a shared
object](https://github.com/wjakob/nanobind/blob/master/cmake/nanobind-config.cmake#L127)
(and link it to both the extensions and `MLIRPythonSupport`).
4. CMake to make this work, in-tree, out-of-tree, downstream, upstream,
etc.
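The effect of the nested, per-project namespace from step 2 can be sketched in a single translation unit. This is an illustration under simplifying assumptions: in practice each bindings library is compiled separately with its own define, whereas here a hypothetical macro `DEFINE_BINDINGS_IN_DOMAIN` stamps out both "projects" side by side, and `PyConcreteTypeSketch` is a placeholder rather than the real class.

```cpp
#include <cassert>
#include <typeindex>
#include <typeinfo>

// Hypothetical helper macro: defines the same class in a namespace whose name
// is chosen per project, mimicking a compile-time "domain" define.
#define DEFINE_BINDINGS_IN_DOMAIN(DOMAIN)                                      \
  namespace python_sketch {                                                    \
  namespace DOMAIN {                                                           \
  struct PyConcreteTypeSketch {};                                              \
  }                                                                            \
  }

// Two downstream "projects", each with its own domain.
DEFINE_BINDINGS_IN_DOMAIN(project_a)
DEFINE_BINDINGS_IN_DOMAIN(project_b)

// The two classes have different qualified names, hence different symbols
// and different typeids, even though their definitions are identical.
inline bool haveDistinctTypeIds() {
  return std::type_index(
             typeid(python_sketch::project_a::PyConcreteTypeSketch)) !=
         std::type_index(
             typeid(python_sketch::project_b::PyConcreteTypeSketch));
}
```

Because the domain is part of the qualified name, two bindings packages loaded into the same process no longer present the same type to the binding layer twice.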
# Testing
Three tests are added here:
1. `PythonTestModuleNanobind` is ported to use
`PyConcreteType<PyTestType>` instead of `mlir_type_subclass` and
`PyConcreteAttribute<PyTestAttr>` instead of `mlir_attr_subclass`,
verifying this works for non-core extensions in-tree;
2. `StandaloneExtensionNanobind` is ported to use `struct PyCustomType :
mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyCustomType>`
instead of `mlir_type_subclass` verifying this works for non-core
extensions out-of-tree;
3. `StandaloneExtensionNanobind`'s `smoketest` is extended to also load
another bindings package (namely `mlir`) verifying
`MLIR_BINDINGS_PYTHON_DOMAIN` successfully disambiguates symbols and
typeids.
I have also tested this downstream:
https://github.com/llvm/eudsl/pull/287 as well as run the following builder
bots:
mlir-nvidia-gcc7:
https://lab.llvm.org/buildbot/#/buildrequests/6654424?redirect_to_build=true
I have also tested against IREE:
https://github.com/iree-org/iree/pull/21916
# Integration
It is highly recommended to set the CMake var
`MLIR_BINDINGS_PYTHON_NB_DOMAIN` (which will also determine
`MLIR_BINDINGS_PYTHON_DOMAIN`) to something unique for each downstream.
This can also be passed explicitly to `add_mlir_python_modules` if your
project builds multiple bindings packages. I added a `WARNING` to this
effect in `AddMLIRPython.cmake`.
[^3]: Python values being typed correctly when exiting from cpp;
[^1]: Specifically when the modules are imported using `importlib`,
which occurs with nanobind's
[stubgen](https://github.com/wjakob/nanobind/blob/master/src/stubgen.py#L965);
[^2]: The workaround we implemented was a class method for the dialect
bindings called `Class.isinstance(...)`;
This commit moves the code responsible for adding newlines and tracking
indent, so that it can be used not only for operation printers, but also
for attribute and type printers.
It could be useful for nested attributes, where proper formatting with
newlines and indents would benefit the readability of the IR. Currently,
everything is printed on one line, which makes it difficult to read if
the attribute is more verbose and there are multiple levels of nesting.
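As a rough illustration of the kind of helper being shared, the sketch below tracks an indent level and pads each new line accordingly. The class and method names are hypothetical and do not correspond to MLIR's printer API; the attribute syntax in the example is also made up.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper; the names do not correspond to MLIR's printer API.
class IndentingPrinter {
public:
  void increaseIndent() { indent += 2; }
  void decreaseIndent() {
    assert(indent >= 2 && "unbalanced indent");
    indent -= 2;
  }
  // Starts a new line and pads it to the current indent level.
  void printNewline() {
    out += '\n';
    out.append(indent, ' ');
  }
  void print(const std::string &text) { out += text; }
  const std::string &str() const { return out; }

private:
  std::string out;
  unsigned indent = 0;
};

// Prints a nested attribute-like structure across multiple lines.
inline std::string printNestedExample() {
  IndentingPrinter p;
  p.print("#outer<");
  p.increaseIndent();
  p.printNewline();
  p.print("inner = #inner<");
  p.increaseIndent();
  p.printNewline();
  p.print("value = 1");
  p.decreaseIndent();
  p.printNewline();
  p.print(">");
  p.decreaseIndent();
  p.printNewline();
  p.print(">");
  return p.str();
}
```

Each nesting level adds two spaces here, so the inner attribute's body ends up indented by four spaces while the closing delimiters line up with the lines that opened them.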
Co-authored-by: Andruszkiewicz, Jacenty <andruszkiewicz.jacenty@intel.com>
# Problem:
There are two build system bugs on macOS in the case where one intends
to use multiple bindings packages simultaneously (in the same Python
interpreter session):
1. The nanobind modules are built with
[`-Wl,-flat_namespace`](8518d2c405/llvm/cmake/modules/HandleLLVMOptions.cmake (L268)),
thereby leading to ambiguous symbols across multiple dylibs;
2. Intra-library symbol resolution (within the C API aggregate dylib)
fails to resolve symbols correctly unless things are built with
`-DCMAKE_C_VISIBILITY_PRESET=hidden -DCMAKE_CXX_VISIBILITY_PRESET=hidden
-DCMAKE_VISIBILITY_INLINES_HIDDEN=ON`.
# Repro:
On a Mac (with this patch applied):
1. Build without `twolevel_namespace` and without hidden vis properties
and run `LIT_FILTER=test.toy ninja check-mlir` (assuming you have
`-DLLVM_BUILD_EXAMPLES=ON -DLLVM_INCLUDE_EXAMPLES=ON`) and you will see:
```
LLVM ERROR: can't create Attribute 'mlir::StringAttr' because storage
uniquer isn't initialized: the dialect was likely not loaded, or the
attribute wasn't added with addAttributes<...>() in the
Dialect::initialize() method.
```
2. Build with `twolevel_namespace` but not hidden vis and run the same
lit test and you will see:
```
LLVM ERROR: Attempting to attach an interface to an unregistered
operation builtin.unrealized_conversion_cast.
```
# Fix
We only do a partial fix here (adding `twolevel_namespace` to Python
bindings modules) because a full fix requires adding visibility
attributes to all object files. I added docs discussing this.
# Why is this not happening on Linux
Using `DYLD_PRINT_BINDINGS=1` I observe that for the checked-in/updated
test (without the fix) `libMLIRPythonCAPI` resolves many of its symbols
to `libStandalonePythonCAPI`:
```
dyld[98449]: looking for weak-def symbol '__ZN4mlir6TypeID3getINS_13AffineMapAttrEEES0_v':
dyld[98449]: found __ZN4mlir6TypeID3getINS_13AffineMapAttrEEES0_v in map, using impl from /Users/maksimlevental/dev_projects/llvm-project/cmake-build-debug/tools/mlir/test/Examples/standalone/python_packages/standalone/mlir_standalone/_mlir_libs/libStandalonePythonCAPI.dylib
dyld[98449]: <libMLIRPythonCAPI.dylib/bind#22> -> 0x11348fa9c <libStandalonePythonCAPI.dylib/__ZN4mlir6TypeID3getINS_13AffineMapAttrEEES0_v>)
dyld[98449]: looking for weak-def symbol '__ZN4mlir6TypeID3getINS_9ArrayAttrEEES0_v':
dyld[98449]: found __ZN4mlir6TypeID3getINS_9ArrayAttrEEES0_v in map, using impl from /Users/maksimlevental/dev_projects/llvm-project/cmake-build-debug/tools/mlir/test/Examples/standalone/python_packages/standalone/mlir_standalone/_mlir_libs/libStandalonePythonCAPI.dylib
dyld[98449]: <libMLIRPythonCAPI.dylib/bind#23> -> 0x11348f990 <libStandalonePythonCAPI.dylib/__ZN4mlir6TypeID3getINS_9ArrayAttrEEES0_v>)
dyld[98449]: looking for weak-def symbol '__ZN4mlir6TypeID3getINS_14DictionaryAttrEEES0_v':
dyld[98449]: found __ZN4mlir6TypeID3getINS_14DictionaryAttrEEES0_v in map, using impl from /Users/maksimlevental/dev_projects/llvm-project/cmake-build-debug/tools/mlir/test/Examples/standalone/python_packages/standalone/mlir_standalone/_mlir_libs/libStandalonePythonCAPI.dylib
dyld[98449]: <libMLIRPythonCAPI.dylib/bind#24> -> 0x11348eec0 <libStandalonePythonCAPI.dylib/__ZN4mlir6TypeID3getINS_14DictionaryAttrEEES0_v>)
```
Turns out this is "expected" behavior:
> It appears on macOS, when a static library is compiled without
> -fvisibility=hidden, its C++ template instantiations could lead to
> leftover weak symbols that are resolved and bound at runtime

https://joyeecheung.github.io/blog/2025/01/11/executable-loading-and-startup-performance-on-macos/ 🤷
Mention that there is no formal definition of the canonical form. Also
add more examples for users to understand what kind of transformations
the community has agreed upon in the past.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Introduce DefaultValuedEnumAttr, which, similarly to DefaultValuedAttr,
decorates an enum attribute to have a default value from a specific enum
case when not present. The default is constructed as the fully-qualified
enum case symbol.
In comparison to DefaultValuedAttr, this allows using a TableGen
EnumCase variable instead of a raw string.
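To make "fully-qualified enum case symbol" concrete, the sketch below joins a namespace, an enum class name, and a case symbol. It is only an assumption about the shape of the resulting string; the helper `qualifiedEnumCase` and the example names are hypothetical and not taken from the implementation.

```cpp
#include <cassert>
#include <string>

// Builds "<namespace>::<EnumClass>::<Case>" from its parts (hypothetical
// helper; the real default is derived from the TableGen EnumCase record).
inline std::string qualifiedEnumCase(const std::string &cppNamespace,
                                     const std::string &enumClass,
                                     const std::string &caseSymbol) {
  assert(!enumClass.empty() && !caseSymbol.empty());
  return cppNamespace + "::" + enumClass + "::" + caseSymbol;
}
```

For instance, a case `Fast` of an enum `Mode` in namespace `::mydialect` would yield `::mydialect::Mode::Fast`, which is what one would otherwise have to spell out as a raw string.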
Noticed the link was dead when reading the docs. According to the
wayback machine, it was a link to "Revisiting Out-of-SSA Translation for
Correctness, Code Quality, and Efficiency" by Boissinot et al., so
providing a more permanent link.
Currently, Dialect Interfaces can't be defined in ODS. This PR adds
support for dialect interfaces. It follows the same approach as other
interfaces and builds on top of the `Interface` class defined in
`mlir/TableGen/Interfaces.h`.
Given the following input:
```tablegen
#ifndef MY_INTERFACES
#define MY_INTERFACES
include "mlir/IR/Interfaces.td"
def DialectInlinerInterface : DialectInterface<"DialectInlinerInterface"> {
  let description = [{
    Define a base inlining interface class to allow for dialects to opt-in to the inliner.
  }];
  let cppNamespace = "::mlir";
  let methods = [
    InterfaceMethod<
      /*desc=*/ [{
        Returns true if the given region 'src' can be inlined into the region
        'dest' that is attached to an operation registered to the current dialect.
        'valueMapping' contains any remapped values from within the 'src' region.
        This can be used to examine what values will replace entry arguments into
        the 'src' region, for example.
      }],
      /*returnType=*/ "bool",
      /*methodName=*/ "isLegalToInline",
      /*args=*/ (ins "::mlir::Region *":$dest, "::mlir::Region *":$src, "::mlir::IRMapping &":$valueMapping),
      /*methodBody=*/ [{
        return true;
      }]
    >
  ];
}
#endif
```
It will generate the following code:
```cpp
/*===- TableGen'erated file -------------------------------------*- C++ -*-===*\
|* *|
|* Dialect Interface Declarations *|
|* *|
|* Automatically generated file, do not edit! *|
|* *|
\*===----------------------------------------------------------------------===*/
namespace mlir {
/// Define a base inlining interface class to allow for dialects to opt-in to the inliner.
class DialectInlinerInterface : public ::mlir::DialectInterface::Base<DialectInlinerInterface> {
public:
  /// Returns true if the given region 'src' can be inlined into the region
  /// 'dest' that is attached to an operation registered to the current dialect.
  /// 'valueMapping' contains any remapped values from within the 'src' region.
  /// This can be used to examine what values will replace entry arguments into
  /// the 'src' region, for example.
  virtual bool isLegalToInline(::mlir::Region * dest, ::mlir::Region * src, ::mlir::IRMapping & valueMapping) const;
protected:
  DialectInlinerInterface(::mlir::Dialect *dialect) : Base(dialect) {}
};
} // namespace mlir
bool ::mlir::DialectInlinerInterface::isLegalToInline(::mlir::Region * dest, ::mlir::Region * src, ::mlir::IRMapping & valueMapping) const {
  return true;
}
```
Disallow implicit casting, which is surprising, and, IME, usually
indicative of copy-paste errors.
Because the initial value must be a scalar, I don't expect this to
affect any data movement.
This patch updates the NVVM Dialect docs to:
* include information on the type of pointers for the memory spaces.
* include high-level information on mbarrier objects.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
Same as with pass def & decl. This doesn't change anything about
registration, and the big flag is kept (e.g., GEN_PASS_REGISTRATION behaves
like GEN_PASS_DECL, and likewise for the sub ones).
…consistent
Since the bindings now use nanobind, I changed the code examples and
mentions in the documentation prose to mention nanobind concepts and
symbols wherever applicable.
I also made the spelling of "Python" consistent by choosing the
uppercase name everywhere that's not an executable name, part of a URL,
or directory name.
----------------
Note that I left mentions of `PybindAdaptors.h` in because of
https://github.com/llvm/llvm-project/pull/162309.
Are there any thoughts about adding a virtual environment setup guide
using [uv](https://docs.astral.sh/uv/)? It has gotten pretty popular,
and is much faster than a "vanilla" Python pip install. It can also
bootstrap an interpreter not present on the user's machine, for example
a free-threaded Python build, with the `-p` flag to the `uv venv`
virtual environment creation command.
Add builders on the Python side that match the builders on the C++ side, add tests for launching GPU kernels and regions, and correct some small documentation mistakes. This reflects the API decisions already made in the func dialect's Python bindings and makes working with the GPU dialect's bindings more similar to the C++ interface.
Add documentation for the no-rollback conversion driver. Also improve
the documentation of the old rollback driver, in particular regarding which
modifications are performed immediately and which are delayed.
The MLIR Python bindings now support defining new passes, new rewrite
patterns (through either `RewritePatternSet` or `PDLModule`), as well as
new dialects using the IRDL bindings. Adding a dedicated section to
document these features would make it easier for users to discover and
understand the full capabilities of the Python bindings.
Python-defined passes have been merged into the main branch for some
time now. I believe adding a corresponding section in the documentation
will help more users learn about this feature and understand how to use
it.
This PR adds such a section to the docs of Python bindings, summarizing
the feature and providing an example.