116 Commits

Author SHA1 Message Date
Maksim Levental
ad5be31c30
[mlir][Python] fix NV examples after #172892 (#174481) 2026-01-05 21:47:35 +00:00
Maksim Levental
18fc908566
[mlir][Python] move IRTypes and IRAttributes to MLIRPythonSupport (#174118)
This PR continues the work of
https://github.com/llvm/llvm-project/pull/171775 by moving more useful
types/attributes into MLIRPythonSupport.

You can now do 

```c++
struct PyTestIntegerRankedTensorType
    : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<
          PyTestIntegerRankedTensorType,
          mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyRankedTensorType>
struct PyTestTensorValue
    : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteValue<
          PyTestTensorValue>
```
instead of `mlir_type_subclass` and `mlir_value_subclass`;
**specifically, manual registration of the "value caster" via indirection
through the Python interpreter is no longer necessary**. You can also
now freely use all such types at the nanobind API level (e.g., overload
based on `FP*`):

```c++
using namespace mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN;
standaloneM.def("print_fp_type", [](PyF16Type &) { nb::print("this is a fp16 type"); });
standaloneM.def("print_fp_type", [](PyF32Type &) { nb::print("this is a fp32 type"); });
standaloneM.def("print_fp_type", [](PyF64Type &) { nb::print("this is a fp64 type"); });
```

Note, here we only port `PythonTestModuleNanobind` but there is a
follow-up PR that ports **all** in-tree dialect extensions
https://github.com/llvm/llvm-project/pull/174156 to use these. After
that one we can soft deprecate `mlir_pure_subclass`.

Note, depends on https://github.com/llvm/llvm-project/pull/171775
2026-01-05 09:34:58 -08:00
Maksim Levental
f0ef5dba6d
[mlir][Python] create MLIRPythonSupport (#171775)
# What

This PR adds a shared library `MLIRPythonSupport` which contains all of
the CRTP classes like `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute`, as well as other useful code like `Defaulting*`,
enabling their reuse in downstream projects. Downstream projects
can now do

```c++
struct PyTestType : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyTestType> {
  ...
};

class PyTestAttr : public mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteAttribute<PyTestAttr> {
  ...
};

NB_MODULE(_mlirPythonTestNanobind, m) {
  PyTestType::bind(m);
  PyTestAttr::bind(m);
}
```

instead of using the discordant alternative
`mlir_type_subclass`/`mlir_attr_subclass` (same goes for
`PyConcreteValue`/`mlir_value_subclass`).
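
The CRTP shape described above can be illustrated without any MLIR or nanobind headers. The following is only a toy sketch: `ModuleSketch`, `ConcreteTypeSketch`, `TestTypeSketch`, `TestAttrSketch`, and `bindAll` are hypothetical names invented for illustration and are not the `MLIRPythonSupport` API. It shows the general idea that the base template owns the shared `bind` logic while each derived class supplies only its own details.

```c++
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a Python module: it only records the class names bound to it.
// (Hypothetical; the real bindings register nanobind classes instead.)
struct ModuleSketch {
  std::vector<std::string> boundClasses;
};

// CRTP base: the shared bind() logic is written once here, and each derived
// class contributes only its Python-facing name through a static member.
template <typename DerivedTy>
struct ConcreteTypeSketch {
  static void bind(ModuleSketch &m) {
    m.boundClasses.push_back(DerivedTy::pyClassName);
  }
};

struct TestTypeSketch : ConcreteTypeSketch<TestTypeSketch> {
  static constexpr const char *pyClassName = "TestType";
};

struct TestAttrSketch : ConcreteTypeSketch<TestAttrSketch> {
  static constexpr const char *pyClassName = "TestAttr";
};

// Mirrors the shape of a module-init body: one bind() call per class.
inline ModuleSketch bindAll() {
  ModuleSketch m;
  TestTypeSketch::bind(m);
  TestAttrSketch::bind(m);
  assert(m.boundClasses.size() == 2 && "both classes should be registered");
  return m;
}
```

Calling `bindAll()` yields a module sketch that lists both class names in registration order.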

# Why

This PR is mostly code motion (along with CMake) but before I describe
the changes I want to state the goals/benefits:

1. Currently upstream "core" extensions and "dialect" extensions ([all
of the `Dialect*` extensions
here](d7c734b5a1/mlir/lib/Bindings/Python))
are a two-tier system;
**a**. [core
extensions](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Bindings/Python/IRTypes.cpp#L361)
enjoy first class support as far as type inference[^3], type stub
generation, and ease of implementation, while dialect extensions [have
poorer support](https://reviews.llvm.org/D150927), incorrect type stub
generation, and a much more tedious (boilerplate) implementation;
**b**. Crucially, this two-tiered system is reflected in the fact that
**the two sets of types/attributes are not in the same Python object
hierarchy**. To wit: `isinstance(..., Type)` and `isinstance(...,
Attribute)` are not supported for the dialect extensions[^2];
**c**. Since these types are not exposed in public headers, downstream
users (dialect extensions or not) cannot write functions that overload
on e.g. `PyFloat8*Type` - that's quite a [useful
feature](fdbee98df8/cpp_ext/TorchOps.cpp (L29-L69))!
2. The dialect extensions incur a sizeable performance penalty relative
to the core extensions in that every single trip across the wire (either
`python->cpp` or `cpp->python`) requires work in addition to nanobind's
own casting/construction pipeline;
**a**. When going from `python->cpp`, [we extract the capsule object
from the Python
object](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h#L219C24-L219C46)
and then extract from the capsule the `Mlir*` opaque struct/ptr. This
side isn't so onerous;
**b**. When going from `cpp->python` we call the Python `import` APIs
long-hand and construct the Python object using `_CAPICreate`. Note,
there are at least 2 `attr` calls incurred in addition to `_CAPICreate`;
this is already much more [efficiently handled by nanobind
itself](4ba51fcf79/src/nb_internals.h (L381-L382))!
3. This division blocks various features: in some configurations[^1] we
trigger a circular import bug because "dialect" types and attributes
perform an [import of the root `_mlir`
module](bd9651bf78/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h (L585))
when they are created (the types themselves, not even instances of those
types). This blocks type stub generation for dialect extensions (i.e.,
the reason we currently only generate type stubs for `_mlir`).

# How

Previously this was not done/possible because of "ODR" issues but I have
resolved those issues; the basic idea for how we solve this is "move
things we want to share into shared libraries":

1. Move IRCore (stuff like `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute`) into `MLIRPythonSupport`;
- Note, we move the rest of the things in `IRModule.h` (renamed to
`IRCore.h`) because `PyConcreteValue`, `PyConcreteType`,
`PyConcreteAttribute` depend on them. This makes for a bigger PR than
one would hope for but ultimately I think we should give people access
to these classes to use as they see fit (specifically inherit from, but
also liberally use in bindings signatures instead of the opaque `Mlir*`
struct wrappers).
2. Put all of this code into a nested namespace
`MLIR_BINDINGS_PYTHON_DOMAIN` which is determined by a compile time
define (and tied to `MLIR_BINDINGS_PYTHON_NB_DOMAIN`). This is necessary
in order to prevent conflicts on both symbol name **and** typeid
(necessary for nanobind to not double register bound types) between
multiple bindings libraries (e.g., `torch-mlir`, and `jax`). Note
[nanobind doesn't support `module_local` like
pybind11](https://nanobind.readthedocs.io/en/latest/porting.html#removed-features).
It does support `NB_DOMAIN` but that is not sufficient for
disambiguating typeids across projects (to wit: we currently define
`NB_DOMAIN` and it was still necessary to move everything to a nested
namespace);
3. Build the [nanobind library itself as a shared
object](https://github.com/wjakob/nanobind/blob/master/cmake/nanobind-config.cmake#L127)
(and link it to both the extensions and `MLIRPythonSupport`).
4. CMake to make this work, in-tree, out-of-tree, downstream, upstream,
etc.
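
The namespace trick in step 2 can be sketched in isolation. The snippet below is a simplified illustration under stated assumptions: `SKETCH_DOMAIN_A`, `SKETCH_DOMAIN_B`, `sketch`, and `PyType` are made-up names, and both "packages" are simulated in a single translation unit, whereas in practice each bindings library would be compiled separately with its own value for the define. The point is only that nesting the same class name inside a macro-selected namespace produces distinct types, and therefore distinct `typeid`s.

```c++
#include <cassert>
#include <type_traits>
#include <typeinfo>

// Assumption: each bindings package would normally receive one such value from
// its build system (e.g. a -D compile definition). Two are defined here only
// so that both can be compared in one file.
#define SKETCH_DOMAIN_A package_a
#define SKETCH_DOMAIN_B package_b

namespace sketch {
namespace SKETCH_DOMAIN_A {
struct PyType {};
} // namespace SKETCH_DOMAIN_A
namespace SKETCH_DOMAIN_B {
struct PyType {};
} // namespace SKETCH_DOMAIN_B
} // namespace sketch

// Same spelling, different enclosing namespace: the compiler sees two types.
static_assert(!std::is_same<sketch::package_a::PyType,
                            sketch::package_b::PyType>::value,
              "the nested namespace must separate the two types");

// Runtime view of the same fact: the type_info objects compare unequal.
inline bool domainsAreDistinct() {
  const bool distinct =
      typeid(sketch::package_a::PyType) != typeid(sketch::package_b::PyType);
  assert(distinct && "typeids should differ across domains");
  return distinct;
}
```

Because the namespace is part of the mangled symbol name as well, the same mechanism also keeps the symbols of the two packages apart.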

# Testing

Three tests are added here 

1. `PythonTestModuleNanobind` is ported to use
`PyConcreteType<PyTestType>` instead of `mlir_type_subclass` and
`PyConcreteAttribute<PyTestAttr>` instead of `mlir_attr_subclass`,
verifying this works for non-core extensions in-tree;
2. `StandaloneExtensionNanobind` is ported to use `struct PyCustomType :
mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyCustomType>`
instead of `mlir_type_subclass` verifying this works for non-core
extensions out-of-tree;
3. `StandaloneExtensionNanobind`'s `smoketest` is extended to also load
another bindings package (namely `mlir`) verifying
`MLIR_BINDINGS_PYTHON_DOMAIN` successfully disambiguates symbols and
typeids.

I have also tested this downstream:
https://github.com/llvm/eudsl/pull/287 as well as run the following
buildbots:

mlir-nvidia-gcc7:
https://lab.llvm.org/buildbot/#/buildrequests/6654424?redirect_to_build=true

I have also tested against IREE:
https://github.com/iree-org/iree/pull/21916

# Integration

It is highly recommended to set the CMake var
`MLIR_BINDINGS_PYTHON_NB_DOMAIN` (which will also determine
`MLIR_BINDINGS_PYTHON_DOMAIN`) to something unique for each downstream.
This can also be passed explicitly to `add_mlir_python_modules` if your
project builds multiple bindings packages. I added a `WARNING` to this
effect in `AddMLIRPython.cmake`.

[^3]: Python values being typed correctly when exiting from cpp;
[^1]: Specifically when the modules are imported using `importlib`,
which occurs with nanobind's
[stubgen](https://github.com/wjakob/nanobind/blob/master/src/stubgen.py#L965);
[^2]: The workaround we implemented was a class method for the dialect
bindings called `Class.isinstance(...)`;
2026-01-05 09:08:13 -08:00
Maksim Levental
e4af5b102b
[mlir][python] fix symbol resolution on MacOS with multiple packages (#174057)
# Problem:

There are two build system bugs on MacOS in the case where one intends
to use multiple bindings packages simultaneously (same Python
interpreter session):

1. The nanobind modules are built with
[`-Wl,-flat_namespace`](8518d2c405/llvm/cmake/modules/HandleLLVMOptions.cmake (L268))
thereby leading to ambiguous symbols across multiple dylibs;
2. Intra-library symbol resolution (within the C API aggregate dylib)
fails to resolve symbols correctly unless things are built with
`-DCMAKE_C_VISIBILITY_PRESET=hidden -DCMAKE_CXX_VISIBILITY_PRESET=hidden
-DCMAKE_VISIBILITY_INLINES_HIDDEN=ON`.

# Repro:

On a Mac (with this patch applied):

1. Build without `twolevel_namespace` and without hidden vis properties
and run `LIT_FILTER=test.toy ninja check-mlir` (assuming you have
`-DLLVM_BUILD_EXAMPLES=ON -DLLVM_INCLUDE_EXAMPLES=ON`) and you will see:
    ```
LLVM ERROR: can't create Attribute 'mlir::StringAttr' because storage
uniquer isn't initialized: the dialect was likely not loaded, or the
attribute wasn't added with addAttributes<...>() in the
Dialect::initialize() method.
    ```
2. Build with `twolevel_namespace` but not hidden vis and run the same
lit test and you will see:
    ```
LLVM ERROR: Attempting to attach an interface to an unregistered
operation builtin.unrealized_conversion_cast.
    ```

# Fix

We only do a partial fix here (adding `twolevel_namespace` to Python
bindings modules) because a full fix requires adding visibility
attributes to all object files. I added docs discussing this.


# Why is this not happening on Linux

Using `DYLD_PRINT_BINDINGS=1` I observe that for the checked-in/updated
test (without the fix) `libMLIRPythonCAPI` resolves many of its symbols
to `libStandalonePythonCAPI`:

```
dyld[98449]: looking for weak-def symbol '__ZN4mlir6TypeID3getINS_13AffineMapAttrEEES0_v':
dyld[98449]:   found __ZN4mlir6TypeID3getINS_13AffineMapAttrEEES0_v in map, using impl from /Users/maksimlevental/dev_projects/llvm-project/cmake-build-debug/tools/mlir/test/Examples/standalone/python_packages/standalone/mlir_standalone/_mlir_libs/libStandalonePythonCAPI.dylib
dyld[98449]: <libMLIRPythonCAPI.dylib/bind#22> -> 0x11348fa9c <libStandalonePythonCAPI.dylib/__ZN4mlir6TypeID3getINS_13AffineMapAttrEEES0_v>)
dyld[98449]: looking for weak-def symbol '__ZN4mlir6TypeID3getINS_9ArrayAttrEEES0_v':
dyld[98449]:   found __ZN4mlir6TypeID3getINS_9ArrayAttrEEES0_v in map, using impl from /Users/maksimlevental/dev_projects/llvm-project/cmake-build-debug/tools/mlir/test/Examples/standalone/python_packages/standalone/mlir_standalone/_mlir_libs/libStandalonePythonCAPI.dylib
dyld[98449]: <libMLIRPythonCAPI.dylib/bind#23> -> 0x11348f990 <libStandalonePythonCAPI.dylib/__ZN4mlir6TypeID3getINS_9ArrayAttrEEES0_v>)
dyld[98449]: looking for weak-def symbol '__ZN4mlir6TypeID3getINS_14DictionaryAttrEEES0_v':
dyld[98449]:   found __ZN4mlir6TypeID3getINS_14DictionaryAttrEEES0_v in map, using impl from /Users/maksimlevental/dev_projects/llvm-project/cmake-build-debug/tools/mlir/test/Examples/standalone/python_packages/standalone/mlir_standalone/_mlir_libs/libStandalonePythonCAPI.dylib
dyld[98449]: <libMLIRPythonCAPI.dylib/bind#24> -> 0x11348eec0 <libStandalonePythonCAPI.dylib/__ZN4mlir6TypeID3getINS_14DictionaryAttrEEES0_v>)
```

Turns out this is "expected" behavior:

> It appears on macOS, when a static library is compiled without
-fvisibility=hidden, its C++ template instantiations could lead to
leftover weak symbols that are resolved and bound at runtime


https://joyeecheung.github.io/blog/2025/01/11/executable-loading-and-startup-performance-on-macos/

🤷
2026-01-02 18:53:57 +00:00
peledins-zimperium
64496be8e0
[mlir] Fix typo s/opreations/operations (#163544) 2025-12-26 12:03:25 +00:00
Vijay Kandiah
db4e6e6911
[NVGPU] Disable nvdsl lit tests if python bindings not enabled (#170898)
A recent change https://github.com/llvm/llvm-project/pull/167321 enabled
nvdsl examples to be run by default. These examples require MLIR python
bindings to be enabled, and this PR makes sure they're skipped if
`config.enable_bindings_python` is not enabled.
2025-12-05 12:36:30 -06:00
Giacomo Castiglioni
9f422915c7
[NVGPU] Fix nvdsl examples - take 2 (#167321)
This PR re-lands https://github.com/llvm/llvm-project/pull/156830

This PR aims at fixing the nvdsl examples which got a bit out of sync
because they were not being tested in the CI.

The fixed bugs were related to the following PRs:
- move to nanobind #118583
- split gpu module initialization #135478
- gpu dialect python API change #163883
2025-12-04 19:43:17 +05:30
Mehdi Amini
037fd30562
Revert "[NVGPU] Fix nvdsl examples" (#166943)
Reverts llvm/llvm-project#156830

This broke the bots.
2025-11-07 15:36:44 +01:00
Giacomo Castiglioni
299df7ed25
[NVGPU] Fix nvdsl examples (#156830)
This PR aims at fixing the nvdsl examples which got a bit out of sync
because they were not being tested in the CI.

The fixed bugs were related to the following PRs:
- move to nanobind #118583
- split gpu module initialization #135478
2025-11-07 16:23:08 +05:30
Jakub Kuderski
3bca1e41e4
[mlir][Examples] Do not run test.wheel.toy by default (#163009)
This test takes ~16s to execute on my machine, which is an order of
magnitude longer than any other mlir test. Put the `test.wheel.toy` test
behind a `requires` check for expensive checks.

LLVM already has some tests enabled conditionally under expensive
checks.
2025-10-11 18:03:39 -04:00
Maksim Levental
1ff3e2e280
[MLIR][Standalone] gate wheel build behind MLIR_ENABLE_BINDINGS_PYTHON=ON (#161427)
If MLIR_ENABLE_BINDINGS_PYTHON=OFF then
[StandalonePythonModules](https://github.com/llvm/llvm-project/blob/main/mlir/examples/standalone/pyproject.toml#L38)
isn't a valid target.
2025-09-30 20:06:36 +00:00
Maksim Levental
59e74a0749
Reland "[MLIR][Python] add Python wheel build demo/test" (#160481) (#160488)
Reland standalone wheel build. The fix is to gate the test behind
`BUILD_SHARED_LIBS=OFF` (because bundling all libs in the wheel requires
valid rpaths which is not the case under `BUILD_SHARED_LIBS=ON`).
2025-09-24 04:07:31 -07:00
Maksim Levental
0aba5bf6ef
Revert "[MLIR][Python] add Python wheel build demo/test" (#160481)
Reverts llvm/llvm-project#160388 because it broke
[mlir-nvidia](https://lab.llvm.org/buildbot/#/builders/138) builder.
2025-09-24 10:00:34 +00:00
Maksim Levental
1359f3a83f
[MLIR][Python] add Python wheel build demo/test (#160388)
This PR demos and tests building Python wheels using
[scikit-build-core](https://scikit-build-core.readthedocs.io/en/latest/).
The test is added to standalone and thus demos "out-of-tree" use cases
but the same `pyproject.toml` will work for in-tree builds. Note, one
can easily pair this with
[cibuildwheel](3264909755/docs/guide/build.md (L221-L226))
to build for all Python versions, OSs, architectures, etc.
2025-09-24 02:34:58 -07:00
Maksim Levental
efd96afedf
[MLIR][Python] reland (narrower) type stub generation (#157930)
This is a reland of https://github.com/llvm/llvm-project/pull/155741 which
was reverted at https://github.com/llvm/llvm-project/pull/157831. This
version is narrower in scope - it only turns on automatic stub
generation for `MLIRPythonExtension.Core._mlir` and **does not do
anything automatically**. Specifically, the only CMake code added to
`AddMLIRPython.cmake` is the `mlir_generate_type_stubs` function which
is then used only in a manual way. The API for
`mlir_generate_type_stubs` is:

```
Arguments:
  MODULE_NAME: The fully-qualified name of the extension module (used for importing in python).
  DEPENDS_TARGETS: List of targets these type stubs depend on being built; usually corresponding to the
    specific extension module (e.g., something like StandalonePythonModules.extension._standaloneDialectsNanobind.dso)
    and the core bindings extension module (e.g., something like StandalonePythonModules.extension._mlir.dso).
  OUTPUT_DIR: The root output directory to emit the type stubs into.
  OUTPUTS: List of expected outputs.
  DEPENDS_TARGET_SRC_DEPS: List of cpp sources for extension library (for generating a DEPFILE).
  IMPORT_PATHS: List of paths to add to PYTHONPATH for stubgen.
  PATTERN_FILE: (Optional) Pattern file (see https://nanobind.readthedocs.io/en/latest/typing.html#pattern-files).
Outputs:
  NB_STUBGEN_CUSTOM_TARGET: The target corresponding to generation which other targets can depend on.
```

Downstream users should use `mlir_generate_type_stubs` in coordination
with `declare_mlir_python_sources` to turn on stub generation for their
own downstream dialect extensions and upstream dialect extensions if
they so choose. The Standalone example demonstrates this.

Note, downstream will also need to set
`-DMLIR_PYTHON_PACKAGE_PREFIX=...` correctly for their bindings.
2025-09-20 18:47:32 +00:00
Maksim Levental
1a6b2b64b6
[MLIR] enable Standalone example test for Windows (#158183)
This PR turns on all Standalone tests for Windows except for the plugins (which aren't enabled by default).
2025-09-12 11:34:44 -07:00
Maksim Levental
0a16d1a754
[MLIR][Python] fix standalone example/test (#156197)
Fix some things in `standalone` in order to unblock
https://github.com/llvm/llvm-project/pull/155741.
2025-08-30 17:52:04 -07:00
lonely eagle
1a4f0d6115
[mlir][doc] Fix transform dialect tutorial ch3 (#150456)
Fixed some bugs in the documentation. Add CallOpInterfaceHandle to the
arguments of ChangeCallTargetOp; after doing so, the section described in
the documentation works correctly. Otherwise, the following code reports
an error.
```
// Cast to our new type.
 %casted = transform.cast %call : !transform.any_op to !transform.my.call_op_interface
// Using our new operation.
 transform.my.change_call_target %casted, "microkernel" : !transform.my.call_op_interface
```
2025-07-26 09:21:35 +08:00
Renato Golin
6daf2b956d
[MLIR][Linalg] Remove elemwise_unary and elemwise_binary (#147082)
RFC:
https://discourse.llvm.org/t/rfc-deprecate-linalg-elemwise-unary-and-elemwise-binary/87144

Remove the two operations and fix the tests by:
* Cleaning simple operation tests of the old ops
* Replacing `linalg.elemwise_{u|bi}nary` with `linalg.{exp|add}` in
transform tests
* Changing some of the tests to use `linalg.elementwise` instead, to
broaden test coverage
* Surgically removing the `elemwise_*` part in the Python tests
* Updating MLIR transform examples (text and tests) to use
`linalg.elementwise` instead

Nothing else changed.
2025-07-07 12:33:55 +01:00
Uday Bondhugula
eab6f2d7a9
[MLIR][Affine] Fix fusion in the presence of cyclic deps in source nests (#128397)
Fixes: https://github.com/llvm/llvm-project/issues/61820

Fix affine fusion in the presence of cyclic deps in the source nest. In
such cases, the nest being fused can't be executed multiple times. Add a
utility to check for dependence cycles and use it in fusion. This fixes
both sibling as well as producer consumer fusion where nests with cyclic
dependences (typically reductions) were being in some cases incorrectly
fused in.

The test case also exercises/required a fix to the check for the
redundant computation being within the specified threshold.
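
As a generic illustration of what a dependence-cycle check involves, here is a minimal depth-first-search sketch. It is not the utility added in this PR: the adjacency-list representation (`DepGraph`) and the names `visitNode` and `hasDependenceCycle` are assumptions made purely for this example, with node indices standing in for operations in a loop nest.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical representation: edges[i] lists the nodes that depend on node i.
using DepGraph = std::vector<std::vector<std::size_t>>;

enum class Mark { Unvisited, InProgress, Done };

// Returns true if a cycle is reachable from `node`.
static bool visitNode(const DepGraph &g, std::size_t node,
                      std::vector<Mark> &marks) {
  if (marks[node] == Mark::InProgress)
    return true; // reached a node still on the DFS stack: back edge, so a cycle
  if (marks[node] == Mark::Done)
    return false; // already fully explored, no cycle through here
  marks[node] = Mark::InProgress;
  for (std::size_t next : g[node]) {
    assert(next < g.size() && "edge target out of range");
    if (visitNode(g, next, marks))
      return true;
  }
  marks[node] = Mark::Done;
  return false;
}

// Returns true if the dependence graph contains any cycle.
bool hasDependenceCycle(const DepGraph &g) {
  std::vector<Mark> marks(g.size(), Mark::Unvisited);
  for (std::size_t i = 0; i < g.size(); ++i)
    if (visitNode(g, i, marks))
      return true;
  return false;
}
```

A reduction-like pattern, where two operations each depend on the other, is reported as cyclic, while a simple producer-consumer chain is not.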
2025-02-25 11:24:31 +05:30
Uday Bondhugula
3aef599d07
[MLIR][Affine] NFC. Drop redundant fusion- suffix from fusion pass options (#128405)
NFC. Drop redundant fusion- suffix from fusion pass options. The pass
already has 'fusion' in its name. Shorten the option names avoiding
repetition.
2025-02-24 08:01:37 +05:30
Krzysztof Drewniak
f4e3b8783c
[mlir][LLVM] Switch undef for poison for uninitialized values (#125629)
LLVM itself is generally moving away from using `undef` and towards
using `poison`, to the point of having a lint that catches new uses of
`undef` in tests.

In order to not trip the lint on new patterns and to conform to the
evolution of LLVM
- Rename various ::undef() methods on StructBuilder subclasses to
::poison()
- Audit the uses of UndefOp in the MLIR libraries and replace almost all
of them with PoisonOp

The remaining uses of `undef` are initializing `uninitialized` memrefs,
explicit conversions to undef from SPIR-V, and a few cases in
AMDGPUToROCDL where usage like

    %v = insertelement <M x iN> undef, iN %v, i32 0
    %arg = bitcast <M x iN> %v to i(M * N)

is used to handle "i32" arguments that are really packed vectors of
smaller types that won't always be fully initialized.
2025-02-06 12:49:30 -06:00
Andrzej Warzyński
91c11574e8
Revert "[MLIR] Make OneShotModuleBufferize use OpInterface (#110322)" (#113124)
This reverts commit 2026501cf107fcb3cbd51026ba25fda3af823941.

Failing bot:
  * https://lab.llvm.org/staging/#/builders/125/builds/389
2024-10-22 13:28:44 +01:00
Tzung-Han Juang
2026501cf1
[MLIR] Make OneShotModuleBufferize use OpInterface (#110322)
**Description:** 
This PR replaces a part of `FuncOp` and `CallOp` with
`FunctionOpInterface` and `CallOpInterface` in `OneShotModuleBufferize`.
Also fix the error from an integration test in a previous PR
attempt. (https://github.com/llvm/llvm-project/pull/107295)

The below fixes skip `CallOpInterface` so that the assertions are not
triggered.


8d78000762/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp (L254-L259)


8d78000762/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp (L311-L315)

**Related Discord Discussion:**
[Link](https://discord.com/channels/636084430946959380/642426447167881246/1280556809911799900)

---------

Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>
2024-10-01 15:58:52 +02:00
Matthias Springer
ae7b454f98
Revert "[MLIR] Make OneShotModuleBufferize use OpInterface" (#109919)
Reverts llvm/llvm-project#107295

This commit breaks an integration test:
```
build/bin/mlir-opt mlir/test/Integration/Dialect/Complex/CPU/correctness.mlir  -one-shot-bufferize="bufferize-function-boundaries"
```
2024-09-25 09:17:49 +02:00
Tzung-Han Juang
f586b1e3f4
[MLIR] Make OneShotModuleBufferize use OpInterface (#107295)
**Description:** 

`OneShotModuleBufferize` deals with the bufferization of `FuncOp`,
`CallOp` and `ReturnOp` but they are hard-coded. Any custom
function-like operations will not be handled. The PR replaces a part of
`FuncOp` and `CallOp` with `FunctionOpInterface` and `CallOpInterface`
in `OneShotModuleBufferize` so that custom function ops and call ops can
be bufferized.

**Related Discord Discussion:**
[Link](https://discord.com/channels/636084430946959380/642426447167881246/1280556809911799900)

---------

Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>
2024-09-25 07:27:21 +02:00
Jeremy Kun
7f1968625a
Add a tutorial on mlir-opt (#96105)
This tutorial gives an introduction to the `mlir-opt` tool, focusing on
how to run basic passes with and without options, run pass pipelines
from the CLI, and point out particularly useful flags.

---------

Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2024-08-01 16:49:01 -07:00
Guray Ozen
f8ff909471
[mlir][gpu] Add py binding for AsyncTokenType (#96466)
The PR adds py binding for `AsyncTokenType`
2024-06-24 11:39:22 +02:00
Guy David
4bce270157
[mlir][llvm] Implement ConstantLike for ZeroOp, UndefOp, PoisonOp (#93690)
These act as constants and should be propagated whenever possible. It is
safe to do so for mlir.undef and mlir.poison because they remain "dirty"
throughout their lifetime and can be duplicated, merged, etc. per the
LangRef.

Signed-off-by: Guy David <guy.david@nextsilicon.com>
2024-05-30 08:21:08 +02:00
Guray Ozen
51752ed0dd [mlir][nvgpu] verify the module 2024-05-28 21:17:31 +02:00
Guray Ozen
4d3308202e
[mlir][nvgpu] NVGPU Tutorials (#87065)
I have a tutorial at EuroLLVM 2024 ([Zero to Hero: Programming Nvidia
Hopper Tensor Core with MLIR's NVGPU
Dialect](https://llvm.swoogo.com/2024eurollvm/session/2086997/zero-to-hero-programming-nvidia-hopper-tensor-core-with-mlir's-nvgpu-dialect)).
For that, I implemented tutorial codes in Python. The focus is the nvgpu
dialect and how to use its advanced features. I thought it might be
useful to upstream this.

The tutorial codes are as follows:
- **Ch0.py:** Hello World
- **Ch1.py:** 2D Saxpy
- **Ch2.py:** 2D Saxpy using TMA
- **Ch3.py:** GEMM 128x128x64 using Tensor Core and TMA 
- **Ch4.py:** Multistage performant GEMM using Tensor Core and TMA
- **Ch5.py:** Warp Specialized GEMM using Tensor Core and TMA

I might implement one more chapter:

- **Ch6.py:** Warp Specialized Persistent ping-pong GEMM

This PR also introduces the nvdsl class, making IR building in the
tutorial easier.
2024-04-24 12:00:12 +02:00
Oleksandr "Alex" Zinenko
619ee20b39
[mlir] add an example of using transform dialect standalone (#82623)
Transform dialect interpreter is designed to be usable outside of the
pass pipeline, as the main program transformation driver, e.g., for
languages with explicit schedules. Provide an example of such usage with
a couple of tests.
2024-02-28 09:48:15 +01:00
Oleksandr "Alex" Zinenko
b33b91a217
[mlir] update transform dialect tutorials (#81199)
Use the "main" transform-interpreter pass instead of the test pass.
This, along with the previously introduced debug extension, now allow
tutorials to no longer depend on test passes and extensions.
2024-02-09 17:35:14 +01:00
Oleksandr "Alex" Zinenko
2798b72ae7
[mlir] introduce debug transform dialect extension (#77595)
Introduce a new extension for simple print-debugging of the transform
dialect scripts. The initial version of this extension consists of two
ops that are printing the payload objects associated with transform
dialect values. Similar ops were already available in the test extension
and several downstream projects, and were extensively used for testing.
2024-01-12 13:24:02 +01:00
Oleksandr "Alex" Zinenko
4cb2ef4fe3
[mlir] add a chapter on matchers to the transform dialect tutorial (#76725)
These operations have been available for a while, but were not described
in the tutorial. Add a new chapter on using and defining match
operations.
2024-01-09 13:19:41 +01:00
Andrzej Warzyński
ca5d34ec71
[mlir][TD] Fix the order of return handles (#76929)
Replace (in tests and docs):

    %forall, %tiled = transform.structured.tile_using_forall

with (updated order of return handles):

    %tiled, %forall = transform.structured.tile_using_forall

Similar change is applied to (in the TD tutorial):

    transform.structured.fuse_into_containing_op

This update makes sure that the tests/documentation are consistent with
the Op specifications. Follow-up for #67320 which updated the order of
the return handles for `tile_using_forall`.
2024-01-04 12:54:16 +00:00
Oleksandr "Alex" Zinenko
aab795a8dc
[mlir] run buffer deallocation in transform tutorial (#67978)
Buffer deallocation pipeline previously was incorrect when applied to
functions. It has since been fixed. Make sure it is exercised in the
tutorial to avoid leaking allocations.
2023-10-02 16:08:11 +02:00
Oleksandr "Alex" Zinenko
96ff0255f2
[mlir] cleanup of structured.tile* transform ops (#67320)
Rename and restructure tiling-related transform ops from the structured
extension to be more homogeneous. In particular, all ops now follow a
consistent naming scheme:

 - `transform.structured.tile_using_for`;
 - `transform.structured.tile_using_forall`;
 - `transform.structured.tile_reduction_using_for`;
 - `transform.structured.tile_reduction_using_forall`.

This drops the "_op" naming artifact from `tile_to_forall_op` that
shouldn't have been included in the first place, consistently specifies
the name of the control flow op to be produced for loops (instead of
`tile_reduction_using_scf` since `scf.forall` also belongs to `scf`),
and opts for the `using` connector to avoid ambiguity.

The loops produced by tiling are now systematically placed as *trailing*
results of the transform op. While this required changing 3 out of 4 ops
(except for `tile_using_for`), this is the only choice that makes sense
when producing multiple `scf.for` ops that can be associated with a
variadic number of handles. This choice is also most consistent with
*other* transform ops from the structured extension, in particular with
fusion ops, that produce the structured op as the leading result and the
loop as the trailing result.
2023-09-26 09:14:29 +02:00
Oleksandr "Alex" Zinenko
6841eff107
[mlir] add transform tutorial chapter for Halide conv mapping (#66386)
This chapter demonstrates how one can replicate Halide DSL
transformations using transform dialect operations transforming payload
expressed using Linalg. This was a part of the live tutorial presented
at EuroLLVM 2023.
2023-09-25 09:47:48 +02:00
Andrey Portnoy
444bb1f1bb [mlir][Toy] Remove unnecessary transpose from chapter 1 example
The call to 'multiply_transpose' in the initialization of the variable 'f' was
intended to have a shape mismatch. However the variable 'a' has shape <2, 3> and
the variable 'c' has shape <3, 2>, so the arguments 'transpose(a)' and 'c' have
in fact compatible shapes (<3, 2> both), the opposite of what is wanted here.
This commit removes the transpose so that arguments 'a' and 'c' have
incompatible shapes <2, 3> and <3, 2>, respectively.
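
The shape arithmetic in this message can be checked with a few lines of code. This is a standalone sketch, not Toy compiler code: `Shape`, `transposeShape`, and `shapesMatch` are hypothetical helpers, and the only facts taken from the message are that `a` has shape <2, 3> and `c` has shape <3, 2>.

```c++
#include <array>
#include <cassert>

// A rank-2 shape as {rows, cols}. Hypothetical helper type for this sketch.
using Shape = std::array<int, 2>;

// Transposing a rank-2 shape swaps its two dimensions.
Shape transposeShape(const Shape &s) {
  assert(s[0] > 0 && s[1] > 0 && "dimensions must be positive");
  Shape t = {{s[1], s[0]}};
  return t;
}

// Two shapes are compatible here only if they are identical.
bool shapesMatch(const Shape &lhs, const Shape &rhs) { return lhs == rhs; }
```

With `a = {2, 3}` and `c = {3, 2}`, `shapesMatch(transposeShape(a), c)` holds, which is the accidental compatibility the commit removes, while `shapesMatch(a, c)` does not, which is the intended mismatch.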

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D151897
2023-06-06 11:23:35 -07:00
Alex Zinenko
68ae0d7803 [mlir] add initial chapters of the transform dialect tutorial
The transform dialect has been around for a while and is sufficiently
stable at this point. Add the first three chapters of the tutorial
describing its usage and extension.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D151491
2023-05-30 15:26:58 +00:00
Tobias Hieta
f9008e6366
[NFC][Py Reformat] Reformat python files in mlir subdir
This is an ongoing series of commits that are reformatting our
Python code.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Differential Revision: https://reviews.llvm.org/D150782
2023-05-26 08:05:40 +02:00
max
66fc381af3 [MLIR] Patch StandalonePlugin CMake for MacOS
Differential Revision: https://reviews.llvm.org/D148058
2023-04-17 19:59:08 -05:00
Thomas Preud'homme
bfedf169f4 [MLIR] Fix tensor shapes in Toy chapter 1
In Toy tutorial chapter 1, multiply_transpose() is called with b<2, 3>
and c<3, 2> when both parameters should have the same shape. This commit
fixes this by instead using c and d as parameters and fixes a comment typo
where c and d are mentioned to have shape <2, 2> when they actually have
shape <3, 2>.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D142622
2023-01-27 10:09:25 +00:00
Mehdi Amini
ddc496d184 Exclude running MLIR tests for Toy example Ch6 and Ch7 when JIT is unavailable 2023-01-25 06:32:44 -08:00
Paul Robinson
977c6f7867 [mlir] Convert tests to check 'target=...'
Part of the project to eliminate special handling for triples in lit
expressions.
2022-12-15 14:49:54 -08:00
rkayaith
04df971d65 [mlir][standalone] Specify python path when configuring
Specifying the python path here ensures that the python binary used matches the
one used by the main MLIR tests. This is useful when cmake's automatic detection
has to be overridden.

Reviewed By: stellaraccident, bondhugula

Differential Revision: https://reviews.llvm.org/D134251
2022-09-20 15:43:39 -04:00
Stella Laurenzo
768a251587 [mlir] Tunnel LLVM_USE_LINKER through to the standalone example build.
When building in debug mode, the link time of the standalone sample is excessive, taking upwards of a minute if using BFD. This at least allows lld to be used if the main invocation was configured that way. On my machine, this gets a standalone test that requires a relink to run in ~13s for Debug mode. This is still a lot, but better than it was. I think we may want to do something about this test: it adds a lot of latency to a normal compile/test cycle and requires a bunch of arg fiddling to exclude.

I think we may end up wanting a `check-mlir-heavy` target that can be used just prior to submit, and then make `check-mlir` just run unit/lite tests. More just thoughts for the future (none of that is done here).

Reviewed By: bondhugula, mehdi_amini

Differential Revision: https://reviews.llvm.org/D126585
2022-06-05 12:31:41 -07:00
Valentin Clement
02da964350
[mlir][CSE] Remove duplicated operations with MemRead side-effect
This patch enhances the CSE pass to deal with simple cases of duplicated
operations with MemoryEffects.

It allows the CSE pass to safely remove duplicate operations with the
MemoryEffects::Read effect that have no other side-effecting operations in
between. Other MemoryEffects::Read operations are allowed.
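
The rule stated above can be sketched on a toy, linear list of operations. This is a deliberately simplified model and not the MLIR CSE implementation: `ToyOp`, `Effect`, and `findRedundantReads` are hypothetical, the string `key` stands in for "same operation name, operands, and attributes", and any write is treated conservatively as invalidating every earlier read.

```c++
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class Effect { None, Read, Write };

// Hypothetical toy operation: `key` identifies structurally equal operations.
struct ToyOp {
  std::string key;
  Effect effect;
};

// Returns the indices of read operations that duplicate an earlier read with
// no write in between. Intervening reads do not block elimination.
std::vector<std::size_t> findRedundantReads(const std::vector<ToyOp> &ops) {
  std::map<std::string, std::size_t> availableReads;
  std::vector<std::size_t> redundant;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    const ToyOp &op = ops[i];
    if (op.effect == Effect::Write) {
      // Conservative assumption: a write may clobber anything read so far.
      availableReads.clear();
      continue;
    }
    if (op.effect != Effect::Read)
      continue; // side-effect-free ops are outside the scope of this sketch
    if (availableReads.count(op.key))
      redundant.push_back(i);
    else
      availableReads[op.key] = i;
  }
  assert(redundant.size() <= ops.size());
  return redundant;
}
```

For the sequence read `a`, read `b`, read `a`, write, read `a`, only the third operation is reported: the read of `b` in between is harmless, while the final read comes after a write and must be kept.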

The use case is pretty simple so far so we can build on top of it to add
more features.

This patch is also meant to avoid a dedicated CSE pass in FIR and was
brought together after discussion on https://reviews.llvm.org/D112711.
It does not currently cover the full range of use cases described in
https://reviews.llvm.org/D112711 but the idea is to gradually enhance
the MLIR CSE pass to handle common use cases that can be used by
other dialects.

This patch takes advantage of the new CSE capabilities in Fir.

Reviewed By: mehdi_amini, rriddle, schweitz

Differential Revision: https://reviews.llvm.org/D122801
2022-04-07 10:08:55 +02:00
River Riddle
ee2c6cd906 [mlir][toy] Define a FuncOp operation in toy and drop the dependence on FuncOp
FuncOp is being moved out of the builtin dialect, and defining a custom
toy operation showcases various aspects of defining function-like operation
(e.g. inlining, passes, etc.).

Differential Revision: https://reviews.llvm.org/D121264
2022-03-15 14:55:51 -07:00