76 Commits

Author SHA1 Message Date
Jacques Pienaar
07967d4af8
[mlir] Switch to new LDBG macro (#150616)
Change local variants to use new central one.
2025-07-25 18:22:46 +02:00
Kazu Hirata
0925d7572a
[mlir] Remove unused includes (NFC) (#150266)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:53 -07:00
Maksim Levental
dce6679cf5
[mlir][NFC] update mlir/Dialect create APIs (16/n) (#149922)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-21 19:57:30 -04:00
Kazu Hirata
54bd936ec9
[mlir] Remove unused includes (NFC) (#147455)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-07 23:40:44 -07:00
Nicolas Vasilache
c30b5b1549
[mlir][GPU][transform] Add gpu_to_rocdl conversion pattern (#146962)
Co-authored-by: Son Tuan Vu <vuson@google.com>
2025-07-07 18:34:09 +02:00
Nicolas Vasilache
2b28d10022
[mlir][SCF][GPU] Add DeviceMaskingAttrInterface (#146943)
This revision adds DeviceMaskingAttrInterface and extends
DeviceMappingArrayAttr to accept a union of DeviceMappingAttrInterface
and DeviceMaskingAttrInterface.

Support is added to GPUTransformOps to take advantage of this
information and lower to block/warpgroup/warp/thread specialization when
mapped to linear ids.

The revision also connects to scf::ForallOp and uses the new attribute
to implement warp specialization.
The implementation is in the form of a GPUMappingMaskAttr, which can be
additionally passed to the scf.forall.mapping attribute to specify a
mask on compute resources that should be active.

In the first implementation the masking is a bitfield that specifies for
each processing unit whether it is active or not.
In the future, we may want to implement this as a symbol to refer to
dynamically defined values.
Extending op semantics with an operand is deemed too intrusive at this
time.

---------

Co-authored-by: Oleksandr "Alex" Zinenko <git@ozinenko.com>
2025-07-07 18:06:41 +02:00
Nicolas Vasilache
ea62de5b1d
[mlir] NFC - refactor id builder and avoid leaking impl details (#146922) 2025-07-07 15:42:48 +02:00
Nicolas Vasilache
0a62836969
[mlir][gpu][transforms] Add support for mapping to lanes (#146912)
This revision adds a new attribute for mapping `scf.forall` to linear
lane ids.

Example:
```
    // %arg2 and %arg3 map to lanes [0, 6) and are turned into epxressions
    // involving threadIdx.x/y by the map_nested_forall_to_threads
    // transformation. This results in a if (linear_thread_id < 6) conditional.
    scf.forall (%arg2, %arg3) in (2, 3) {
       ...
    } {mapping = [#gpu.lane<linear_dim_0>, #gpu.lane<linear_dim_1>]}
 ```

---------

Co-authored-by: Oleksandr "Alex" Zinenko <git@ozinenko.com>
2025-07-07 15:14:52 +02:00
Ivan Butygin
91f3cdbd4f
[mlir][gpu] Pattern to promote gpu.shuffle to specialized AMDGPU ops (#137109)
Only swizzle promotion for now, may add DPP ops support later.
2025-05-13 13:26:46 +03:00
Jakub Kuderski
198c5dac37
[mlir][transform] Clean up prints. NFC. (#136401)
Use `llvm::interleaved` from #135517 to simplify printing.
2025-04-19 12:11:06 -04:00
Jakub Kuderski
4be84a142e
[mlir][gpu] Clean up prints in GPU dialect. NFC. (#136250)
Clean up printing code by switching to `llvm::interleaved` from
https://github.com/llvm/llvm-project/pull/135517. Also make some minor
readability & performance fixes.
2025-04-18 11:10:17 -04:00
Kazu Hirata
3041fa6c7a
[mlir] Use *Set::insert_range (NFC) (#132326)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin like succ_begin for now.
2025-03-20 22:24:17 -07:00
Matthias Springer
4defac91db
[mlir][GPUToNVVM] Add benefit to populate functions (#128484)
Certain GPU->NVVM patterns compete with Arith->LLVM patterns. (The ones
that lower to libdevice.) Add an optional `benefit` parameter to all
`populate` functions so that users can give preference to GPU->NVVM
patterns.
2025-02-24 17:27:55 +01:00
Guray Ozen
baf27862dd
[MLIR][NVGPU] Move max threads/blocks size to dialect (NFC) (#124454)
This PR moves maximum number of threads in a block and block in a grid
to nvgpu dialect to avoid replicated code.

The limits are defined here:

https://docs.nvidia.com/cuda/cuda-c-programming-guide/#features-and-technical-specifications-technical-specifications-per-compute-capability
2025-02-05 12:38:37 +01:00
Kazu Hirata
129f1001c3
[Dialect] Migrate away from PointerUnion::{is,get} (NFC) (#120818)
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2024-12-21 08:17:51 -08:00
Nikhil Kalra
84cc1865ef
[mlir] Support DialectRegistry extension comparison (#101119)
`PassManager::run` loads the dependent dialects for each pass into the
current context prior to invoking the individual passes. If the
dependent dialect is already loaded into the context, this should be a
no-op. However, if there are extensions registered in the
`DialectRegistry`, the dependent dialects are unconditionally registered
into the context.

This poses a problem for dynamic pass pipelines, however, because they
will likely be executing while the context is in an immutable state
(because of the parent pass pipeline being run).

To solve this, we'll update the extension registration API on
`DialectRegistry` to require a type ID for each extension that is
registered. Then, instead of unconditionally registered dialects into a
context if extensions are present, we'll check against the extension
type IDs already present in the context's internal `DialectRegistry`.
The context will only be marked as dirty if there are net-new extension
types present in the `DialectRegistry` populated by
`PassManager::getDependentDialects`.

Note: this PR removes the `addExtension` overload that utilizes
`std::function` as the parameter. This is because `std::function` is
copyable and potentially allocates memory for the contained function so
we can't use the function pointer as the unique type ID for the
extension.

Downstream changes required:
- Existing `DialectExtension` subclasses will need a type ID to be
registered for each subclass. More details on how to register a type ID
can be found here:
8b68e06731/mlir/include/mlir/Support/TypeID.h (L30)
- Existing uses of the `std::function` overload of `addExtension` will
need to be refactored into dedicated `DialectExtension` classes with
associated type IDs. The attached `std::function` can either be inlined
into or called directly from `DialectExtension::apply`.

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2024-08-06 01:32:36 +02:00
Kazu Hirata
5262865aac
[mlir] Construct SmallVector with ArrayRef (NFC) (#101896) 2024-08-04 11:43:05 -07:00
Christian Sigg
a5757c5b65
Switch member calls to isa/dyn_cast/cast/... to free function calls. (#89356)
This change cleans up call sites. Next step is to mark the member
functions deprecated.

See https://mlir.llvm.org/deprecation and
https://discourse.llvm.org/t/preferred-casting-style-going-forward.
2024-04-19 15:58:27 +02:00
Jakub Kuderski
971b852546
[mlir][NFC] Simplify type checks with isa predicates (#87183)
For more context on isa predicates, see:
https://github.com/llvm/llvm-project/pull/83753.
2024-04-01 11:40:09 -04:00
Oleksandr "Alex" Zinenko
5a9bdd85ee
[mlir] split transform interfaces into a separate library (#85221)
Transform interfaces are implemented, direction or via extensions, in
libraries belonging to multiple other dialects. Those dialects don't
need to depend on the non-interface part of the transform dialect, which
includes the growing number of ops and transitive dependency footprint.

Split out the interfaces into a separate library. This in turn requires
flipping the dependency from the interface on the dialect that has crept
in because both co-existed in one library. The interface shouldn't
depend on the transform dialect either.

As a consequence of splitting, the capability of the interpreter to
automatically walk the payload IR to identify payload ops of a certain
kind based on the type used for the entry point symbol argument is
disabled. This is a good move by itself as it simplifies the interpreter
logic. This functionality can be trivially replaced by a
`transform.structured.match` operation.
2024-03-20 22:15:17 +01:00
Justin Lebar
fab2bb8bfd
Add llvm::min/max_element and use it in llvm/ and mlir/ directories. (#84678)
For some reason this was missing from STLExtras.
2024-03-10 20:00:13 -07:00
Mehdi Amini
9a2a6a728a Apply clang-tidy fixes for readability-identifier-naming in Utils.cpp (NFC) 2024-01-17 08:51:41 -08:00
Mehdi Amini
74cf9bcf71 Apply clang-tidy fixes for performance-unnecessary-value-param in Utils.cpp (NFC) 2024-01-17 08:51:41 -08:00
Mehdi Amini
d8ed736c0e Apply clang-tidy fixes for bugprone-macro-parentheses in Utils.cpp (NFC) 2024-01-15 20:59:13 -08:00
long.chen
1609f1c2a5
[mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269)
detail see the docment: https://mlir.llvm.org/deprecation/

Not all changes are made manually, most of them are made through a clang
tool I wrote https://github.com/lipracer/cpp-refactor.
2023-11-14 13:01:19 +08:00
spaceotter
00c3c73189
[mlir][gpu] Separate the barrier elimination code from transform ops (#71762)
Allows the barrier elimination code to be run from C++ as well. The code
from transforms dialect is copied as-is, the pass and populate functions
have beed added at the end.

Co-authored-by: Eric Eaton <eric@nod-labs.com>
2023-11-10 17:59:09 -08:00
Lukas Sommer
06918a969c
[MLIR][NFC] Mark barrier elimination helper static (#65303)
Make local helper functions static to avoid symbol name collision.
2023-09-05 09:59:22 +02:00
Nicolas Vasilache
92f088d335 [mlir][gpu][transform] Provide better error messages and avoid crashing in MapForallToBlocks.
This revision addresses issues surfaced in https://reviews.llvm.org/D159093
2023-09-04 14:11:38 +00:00
Nicolas Vasilache
888717e853 [mlir][transform] Enable gpu-to-nvvm via conversion patterns driven by TD
This revision untangles a few more conversion pieces and allows rewriting
the relatively intricate (and somewhat inconsistent) LowerGpuOpsToNVVMOpsPass
in a declarative fashion that provides a much better understanding and control.

Differential Revision: https://reviews.llvm.org/D157617
2023-08-10 15:30:48 +00:00
Nicolas Vasilache
44e6318cea [mlir][transforms] Revamp the implementation of mapping loops to GPUs
This revision significantly simplifies the specification and implementation of mapping loops to GPU ids.

Each type of mapping (block, warpgroup, warp, thread) now comes with 2 mapping modes:
  1. a 3-D "grid-like" mode, subject to alignment considerations on threadIdx.x, on which predication
     may occur on a per-dimension 3-D sub-rectangle basis.
  2. a n-D linearized mode, on which predication may only occur on a linear basis.

In the process, better size and alignment requirement inference are introduced along with improved runtime verification messages.

The `warp_dims` attribute was deemed confusing and is removed from the transform in favor of better size inference.

Differential Revision: https://reviews.llvm.org/D155941
2023-07-26 00:09:08 +02:00
Quinn Dawkins
24567dd4ba [mlir][GPU] Fix missing CMake dependency
Fixes buildbot failure from ff8775f

Differential Revision: https://reviews.llvm.org/D156252
2023-07-25 14:28:16 -04:00
Quinn Dawkins
ff8775f3ff [mlir][GPU] Add op for unrolling contractions to a native size
Adds `apply_patterns.gpu.unroll_vectors_subgroup_mma` which allows
specifying a native MMA shape of `m`, `n`, and `k` to unroll to,
greedily unrolling the inner most dimension of contractions and other
vector operations based on expected usage.

Differential Revision: https://reviews.llvm.org/D156079
2023-07-25 13:11:32 -04:00
Nicolas Vasilache
90ecfa2a40 [mlir][linalg] NFC - Move some utils in preparation for revamping mapping of scf.forall 2023-07-25 01:19:57 +02:00
Alex Zinenko
9ab34689b0 [mlir] add a simple gpu barrier elimination mechanism
GPU code generation, and specifically the shared memory copy insertion
may introduce spurious barriers guarding read-after-read dependencies or
read-after-write on non-aliasing data, which degrades performance due to
unnecessary synchronization. Add a pattern and transform op that removes
such barriers by analyzing memory effects that the barrier actually
guards that are not also guarded by other barriers. The code is adapted
from the Polygeist incubator project.

Co-authored-by: William Moses <gh@wsmoses.com>
Co-authored-by: Ivan Radanov Ivanov <ivanov.i.aa@m.titech.ac.jp>

Reviewed By: nicolasvasilache, wsmoses

Differential Revision: https://reviews.llvm.org/D154720
2023-07-07 18:51:49 +00:00
Matthias Springer
efc290ce9c [mlir][affine] More efficient makeComposedFolded... helpers
The old code used to materialize constants as ops, immediately folded them into the resulting affine map and then deleted the constant ops again. Instead, directly fold the attributes into the affine map. Furthermore, all helpers accept `OpFoldResult` instead of `Value` now. This makes the code at call sites more efficient, because it is no longer necessary to materialize a `Value`, just to be able to use these helper functions.

Note: The API has changed (accepts OpFoldResult instead of Value), otherwise this change is NFC.

Differential Revision: https://reviews.llvm.org/D153324
2023-06-22 10:47:38 +02:00
Matthias Springer
c63d2b2c71 [mlir][transform] Add TransformRewriter
All `apply` functions now have a `TransformRewriter &` parameter. This rewriter should be used to modify the IR. It has a `TrackingListener` attached and updates the internal handle-payload mappings based on rewrites.

Implementations no longer need to create their own `TrackingListener` and `IRRewriter`. Error checking is integrated into `applyTransform`. Tracking listener errors are reported only for ops with the `ReportTrackingListenerFailuresOpTrait` trait attached, allowing for a gradual migration. Furthermore, errors can be silenced with an op attribute.

Additional API will be added to `TransformRewriter` in subsequent revisions. This revision just adds an "empty" `TransformRewriter` class and updates all `apply` implementations.

Differential Revision: https://reviews.llvm.org/D152427
2023-06-20 10:49:59 +02:00
Alex Zinenko
2f3ac28cb2 [mlir] don't hardcode PDL_Operation in Transform dialect extensions
Update operations in Transform dialect extensions defined in the Affine,
GPU, MemRef and Tensor dialects to use the more generic
`TransformHandleTypeInterface` type constraint instead of hardcoding
`PDL_Operation`. See
https://discourse.llvm.org/t/rfc-type-system-for-the-transform-dialect/65702
for motivation.

Remove the dependency on PDLDialect from these extensions.

Update tests to use `!transform.any_op` instead of `!pdl.operation`.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D150781
2023-05-17 15:10:12 +00:00
Matthias Springer
61223c49dd [mlir][GPU] Rename MLIRGPUOps CMake target to MLIRGPUDialect
This is for consistency with other dialects.

Differential Revision: https://reviews.llvm.org/D150659
2023-05-16 14:25:08 +02:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files are definitions of the classes that have the casting
     methods, so the code still refers to the method instead of the
     function without adding a prefix or removing the method declaration
     at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Rahul Kayaith
6089d612a5 [mlir] Prevent implicit downcasting to interfaces
Currently conversions to interfaces may happen implicitly (e.g.
`Attribute -> TypedAttr`), failing a runtime assert if the interface
isn't actually implemented. This change marks the `Interface(ValueT)`
constructor as explicit so that a cast is required.

Where it was straightforward to I adjusted code to not require casts,
otherwise I just made them explicit.

Depends on D148491, D148492

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D148493
2023-04-20 16:31:54 -04:00
Matthias Springer
4c48f016ef [mlir][Affine][NFC] Wrap dialect in "affine" namespace
This cleanup aligns the affine dialect with all the other dialects.

Differential Revision: https://reviews.llvm.org/D148687
2023-04-20 11:19:21 +09:00
Nicolas Vasilache
015cd84d3c Revert "[mlir][Linalg][Transform] Avoid FunctionalStyleTransformOpTrait where unnecesseary to improve usability"
This reverts commit 31aa8ea252c0b6acdcb362c1d0f01cc4b810d6d0.

This is currently not in a good state as we have some footguns due to missing listeners.
2023-03-20 07:07:27 -07:00
Nicolas Vasilache
ba7f3e1d1e [mlir][Transform] Fix support for mapping to GPU warps and to linear ids
c59465e1203dd78d06e15f7ddf62141807dbd5a7 introduced mapping to warps and
linear GPU ids.

In the implementation, the delinearization basis is reversed from [x, y, z]
to [z, y x] order to properly compute the strides and allow delinearization.

Prior to this commit, we forgot to reverse it back to [x, y, z] order
before materializing the indices.

Fix this oversight.
2023-03-20 05:23:17 -07:00
Nicolas Vasilache
31aa8ea252 [mlir][Linalg][Transform] Avoid FunctionalStyleTransformOpTrait where unnecesseary to improve usability
Differential Revision: https://reviews.llvm.org/D146305
2023-03-20 03:17:44 -07:00
Nicolas Vasilache
c59465e120 [mlir][Transform] Add support for mapping to GPU warps and to linear ids
This revisions refactors the implementation of mapping to threads to additionally allow warps and linear ids to be specified.

`warp_dims` is currently specified along with `block_dims` as a transform attribute.

Linear ids on th other hand use the flattened block_dims to predicate on the first (linearized) k threads.
An additional GPULinearIdMappingAttr is added to the GPU dialect to allow specifying loops mapped to this new scheme.

Various implementation and transform op semantics cleanups are also applied.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D146130
2023-03-20 01:05:32 -07:00
Kazu Hirata
8bdf387858 Use *{Map,Set}::contains (NFC)
Differential Revision: https://reviews.llvm.org/D146104
2023-03-15 08:46:32 -07:00
Nicolas Vasilache
768615bba0 [mlir][Transform] NFC - Refactor forall mapping to threads and blocks into one thing
Differential Revision: https://reviews.llvm.org/D146095
2023-03-15 05:09:39 -07:00
Nicolas Vasilache
aafb52d7c9 [mlir][GPUTransforms] NFC - Refactor GPUTransforms.cpp in preparation for improvements.
Depends on: D145977

Differential Revision: https://reviews.llvm.org/D145980
2023-03-14 05:00:01 -07:00
Nicolas Vasilache
1cff4cbda3 [mlir][Transform] NFC - Various API cleanups and use RewriterBase in lieu of PatternRewriter
Depends on: D145685

Differential Revision: https://reviews.llvm.org/D145977
2023-03-14 04:23:12 -07:00
Kazu Hirata
d298b02dba [mlir] Use llvm::is_contained (NFC) 2023-03-13 00:43:27 -07:00