71 Commits

Author SHA1 Message Date
Sang Ik Lee
150145486e
[MLIR][GPU] Generalize gpu.printf op lowering to LLVM call pattern. (#164297)
Existing pattern for lowering gpu.printf op to LLVM call uses fixed
function name and calling convention.
Those two should be exposed as pass option to allow supporting Intel
Compute Runtime for GPU.

Also adds gpu.printf op pattern to GPU to LLVMSPV pass.
It may appear out of place, but integration test is added to XeVM
integration test as that is the current best folder for testing with
Intel Compute Runtime.
Test should be moved in the future if a better test folder is added.
2025-10-23 08:32:53 -07:00
Krzysztof Drewniak
d2e3389abb
[mlir][GPU] Generalize gpu.printf to not need gpu.module (#161266)
In order to make the gpu.printf => [various LLVM calls] passes less
order-dependent and to allow downstreams that don't use gpu.module to
use gpu.printf, allow the flowerings for such prints to target the
nearest `SymbolTable` instead.
2025-09-29 18:21:24 -07:00
Mehdi Amini
062c0fcf4b [MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in GPUOpsLowering.cpp (NFC) 2025-09-28 04:29:19 -07:00
Maksim Levental
eaa67a3cf0
[mlir][NFC] update Conversion create APIs (5/n) (#149887)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-22 10:40:45 -04:00
Adam Straw
3172c61895
[mlir][gpu] Fix bug with gpu.printf global location (#142872)
Bug description: Global variables and functions created during
gpu.printf conversion to NVVM may contain debug info metadata from
function containing the gpu.printf which cannot be used out of that
function.
2025-06-05 00:21:44 -06:00
Oleksandr "Alex" Zinenko
7318074168
[mlir] allow function type cloning to fail (#137130)
`FunctionOpInterface` assumed the fact that the function type (attribute
of the operation) can be cloned with arbirary lists of function
arguments and results to support argument and result list mutation. This
is not always correct, in particular, LLVM dialect functions require
exactly one result making it impossible to erase the result.

Allow function type cloning to fail and propagate this failure through
various APIs that use it. The common assumption is that existing IR has
not been modified.

Fixes #131142.

Reland a8c7ecdcbc3e89b493b495c6831cc93671c3b844 / #136300.
2025-04-30 09:26:42 +02:00
Kazu Hirata
7da385d797 Revert "[mlir] allow function type cloning to fail (#136300)"
This reverts commit 20a104a7d6423784dab04371a5ca728cc27a15a9.

Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/157/builds/25688

I've reproduced the build failure.
2025-04-18 09:52:28 -07:00
Oleksandr "Alex" Zinenko
20a104a7d6
[mlir] allow function type cloning to fail (#136300)
`FunctionOpInterface` assumed the fact that the function type (attribute
of the operation) can be cloned with arbirary lists of function
arguments and results to support argument and result list mutation. This
is not always correct, in particular, LLVM dialect functions require
exactly one result making it impossible to erase the result.

Allow function type cloning to fail and propagate this failure through
various APIs that use it. The common assumption is that existing IR has
not been modified.

Fixes #131142.
2025-04-18 16:05:54 +02:00
Matthias Springer
6c2f8476e7
[mlir][Transforms] Dialect Conversion: Add 1:N support to remapInput (#131454)
This commit adds 1:N support to `SignatureConversion::remapInputs`. This
API allows users to replace a block argument with multiple replacement
values. (And the block argument is dropped.) The API already supported
"bbarg --> multiple bbargs" mappings, but "bbarg --> multiple SSA
values" was missing.

---------

Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
2025-03-15 18:33:06 +01:00
Benoit Jacob
3c46debe6b
[MLIR] Fix 0-dimensional case of conversion of vector ops to GPU (#128075)
This is a follow-up to #127844. That PR got vectors of arbitrary rank
working, but I hadn't thought about the rank-0 case.

Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
2025-02-20 19:43:33 -05:00
Benoit Jacob
4a411eb4ee
[MLIR] Fix rewrite of ops with vector operands to LLVM on GPU (#127844)
There was a discrepancy between the type-converter and rewrite-pattern
parts of conversion to LLVM used in various GPU targets, at least ROCDL
and NVVM:
- The TypeConverter part was handling vectors of arbitrary rank,
converting them to nests of `!llvm.array< ... >` with a vector at the
inner-most dimension:
8337d01e30/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp (L629-L655)
- The rewrite pattern part was not handling `llvm.array`:
8337d01e30/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp (L594-L596)

That led to conversion failures when lowering `math` dialect ops on
rank-2 vectors, as in the testcase being added in this PR.

This PR fixes this by reusing a shared utility already used in other
conversions to LLVM:

8337d01e30/mlir/lib/Conversion/LLVMCommon/VectorPattern.cpp (L80-L104)

---------

Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
2025-02-19 14:52:02 -05:00
Krzysztof Drewniak
f4e3b8783c
[mlir][LLVM] Switch undef for poison for uninitialized values (#125629)
LLVM itself is generally moving away from using `undef` and towards
using `poison`, to the point of having a lint that caches new uses of
`undef` in tests.

In order to not trip the lint on new patterns and to conform to the
evolution of LLVM
- Rename valious ::undef() methods on StructBuilder subclasses to
::poison()
- Audit the uses of UndefOp in the MLIR libraries and replace almost all
of them with PoisonOp

The remaining uses of `undef` are initializing `uninitialized` memrefs,
explicit conversions to undef from SPIR-V, and a few cases in
AMDGPUToROCDL where usage like

    %v = insertelement <M x iN> undef, iN %v, i32 0
    %arg = bitcast <M x iN> %v to i(M * N)

is used to handle "i32" arguments that are are really packed vectors of
smaller types that won't always be fully initialized.
2025-02-06 12:49:30 -06:00
Matthias Springer
599c739905
[mlir][GPU] Add NVVM-specific cf.assert lowering (#120431)
This commit add an NVIDIA-specific lowering of `cf.assert` to to
`__assertfail`.

Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and
`getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can
be reused.
2025-01-06 12:00:11 +01:00
Matthias Springer
2da417e7f6
[mlir][GPU] gpu.printf: Do not emit duplicate format strings (#110504)
Even if the same format string is used multiple times, emit just one
`LLVM:GlobalOp`.
2024-10-01 09:12:08 +02:00
Matthias Springer
49df12c01e
[mlir][NFC] Minor cleanup around ModuleOp usage (#110498)
Use `moduleOp.getBody()` instead of `moduleOp.getBodyRegion().front()`.
2024-09-30 21:20:48 +02:00
Victor Perez
d45de8003a
[MLIR][GPU-LLVM] Convert gpu.func to llvm.func (#101664)
Add support in `-convert-gpu-to-llvm-spv` to convert `gpu.func` to
`llvm.func` operations.

- `spir_kernel`/`spir_func` calling conventions used for
kernels/functions.
- `workgroup` attributions encoded as additional `llvm.ptr<3>`
arguments.
- No attribute used to annotate kernels
- `reqd_work_group_size` attribute using to encode
`gpu.known_block_size`.
- `llvm.mlir.workgroup_attrib_size` used to encode workgroup attribution
sizes. This will be attached to the pointer argument workgroup
attributions lower to.

**Note**: A notable missing feature that will be addressed in a
follow-up PR is a `-use-bare-ptr-memref-call-conv` option to replace
MemRef arguments with bare pointers to the MemRef element types instead
of the current MemRef descriptor approach.

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-08-09 16:09:11 +02:00
Matthias Springer
9e8ccf6b64
[mlir][Conversion] FuncToLLVM: Simplify bare-pointer handling (#96393)
Before this commit, there used to be a workaround in the
`func.func`/`gpu.func` op lowering when the bare-pointer calling
convention is enabled. This workaround "patched up" the argument
materializations for memref arguments. This can be done directly in the
argument materialization functions (as the TODOs in the code base
indicate).

This commit effectively reverts back to the old implementation
(a664c14001fa2359604527084c91d0864aa131a4) and adds additional checks to
make sure that bare pointers are used only for function entry block
arguments.
2024-06-24 08:38:26 +02:00
Matthias Springer
3f33d2f3ca
[mlir][GPUToNVVM] Fix memref function args/results (#96392)
The `gpu.func` op lowering accounts for memref arguments/results (both
"normal" and bare-pointer supported), but the `gpu.return` op lowering
did not. The lowering produced invalid IR that did not verify.

This commit uses the same lowering strategy as for `func.return` in the
`gpu.return` lowering. (The C++ implementation is copied. We may want to
share some code between `func` and `gpu` lowerings in the future.)
2024-06-23 09:51:12 +02:00
Krzysztof Drewniak
43fd4c49bd
[mlir][GPU] Improve handling of GPU bounds (#95166)
This change reworks how range information for GPU dispatch IDs (block
IDs, thread IDs, and so on) is handled.

1. `known_block_size` and `known_grid_size` become inherent attributes
of GPU functions. This makes them less clunky to work with. As a
consequence, the `gpu.func` lowering patterns now only look at the
inherent attributes when setting target-specific attributes on the
`llvm.func` that they lower to.
2. At the same time, `gpu.known_block_size` and `gpu.known_grid_size`
are made official dialect-level discardable attributes which can be
placed on arbitrary functions. This allows for progressive lowerings
(without this, a lowering for `gpu.thread_id` couldn't know about the
bounds if it had already been moved from a `gpu.func` to an `llvm.func`)
and allows for range information to be provided even when
`gpu.*_{id,dim}` are being used outside of a `gpu.func` context.
3. All of these index operations have gained an optional `upper_bound`
attribute, allowing for an alternate mode of operation where the bounds
are specified locally and not inherited from the operation's context.
These also allow handling of cases where the precise launch sizes aren't
known, but can be bounded more precisely than the maximum of what any
platform's API allows. (I'd like to thank @benvanik for pointing out
that this could be useful.)

When inferring bounds (either for range inference or for setting `range`
during lowering) these sources of information are consulted in order of
specificity (`upper_bound` > inherent attribute > discardable attribute,
except that dimension sizes check for `known_*_bounds` to see if they
can be constant-folded before checking their `upper_bound`).

This patch also updates the documentation about the bounds and inference
behavior to clarify what these attributes do when set and the
consequences of setting them up incorrectly.

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2024-06-17 23:47:38 -05:00
Christian Sigg
a5757c5b65
Switch member calls to isa/dyn_cast/cast/... to free function calls. (#89356)
This change cleans up call sites. Next step is to mark the member
functions deprecated.

See https://mlir.llvm.org/deprecation and
https://discourse.llvm.org/t/preferred-casting-style-going-forward.
2024-04-19 15:58:27 +02:00
Jakub Kuderski
971b852546
[mlir][NFC] Simplify type checks with isa predicates (#87183)
For more context on isa predicates, see:
https://github.com/llvm/llvm-project/pull/83753.
2024-04-01 11:40:09 -04:00
Andrei Golubev
89cd345667
[mlir][LLVM] Use int32_t to indirectly construct GEPArg (#79562)
GEPArg can only be constructed from int32_t and mlir::Value. Explicitly
cast other types (e.g. unsigned, size_t) to int32_t to avoid narrowing
conversion warnings on MSVC. Some recent examples of such are:

```
mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398:
Element '1': conversion from 'size_t' to 'T' requires a narrowing
conversion
    with
    [
        T=mlir::LLVM::GEPArg
    ]

mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398:
Element '1': conversion from 'unsigned int' to 'T' requires a narrowing
conversion
    with
    [
        T=mlir::LLVM::GEPArg
    ]
```

Co-authored-by: Nikita Kudriavtsev <nikita.kudriavtsev@intel.com>
2024-01-26 14:27:51 +01:00
Guray Ozen
2aec7083ad
[mlir][gpu] Use DenseI32Array for NVVM's maxntid and reqntid (NFC) (#77466) 2024-01-09 16:44:25 +01:00
Guray Ozen
763109e346
[mlir][gpu] Use known_block_size to set maxntid for NVVM target (#77301)
Setting thread block size with `maxntid` on the kernel has great
performance benefits. In this way, downstream PTX compiler can do better
register allocation.

MLIR's `gpu.launch` and `gpu.launch_func` already has an attribute
(`known_block_size`) that keeps the thread block size when it is known.
This PR simply uses this attribute to set `maxntid`.
2024-01-08 14:49:19 +01:00
Krzysztof Drewniak
ddd6acd7a8
[mlir][GPU] Expand LLVM function attribute copies (#76755)
Expand the copying of attributes on GPU kernel arguments during LLVM
lowering.

Support copying attributes from values that are already LLVM pointers.

Support copying attributes, like `noundef`, that aren't specific to (the
pointer parts of) arguments.
2024-01-03 14:28:15 -06:00
Guray Ozen
ea84897ba3
[mlir][gpu] Introduce gpu.dynamic_shared_memory Op (#71546)
While the `gpu.launch` Op allows setting the size via the
`dynamic_shared_memory_size` argument, accessing the dynamic shared
memory is very convoluted. This PR implements the proposed Op,
`gpu.dynamic_shared_memory` that aims to simplify the utilization of
dynamic shared memory.

RFC:
https://discourse.llvm.org/t/rfc-simplifying-dynamic-shared-memory-access-in-gpu/

**Proposal from RFC**
This PR `gpu.dynamic.shared.memory` Op to use dynamic shared memory
feature efficiently. It is is a powerful feature that enables the
allocation of shared memory at runtime with the kernel launch on the
host. Afterwards, the memory can be accessed directly from the device. I
believe similar story exists for AMDGPU.

**Current way Using Dynamic Shared Memory with MLIR**

Let me illustrate the challenges of using dynamic shared memory in MLIR
with an example below. The process involves several steps:
- memref.global 0-sized array LLVM's NVPTX backend expects
- dynamic_shared_memory_size Set the size of dynamic shared memory
- memref.get_global Access the global symbol
- reinterpret_cast and subview Many OPs for pointer arithmetic

```
// Step 1. Create 0-sized global symbol. Manually set the alignment
memref.global "private" @dynamicShmem  : memref<0xf16, 3> { alignment = 16 }
func.func @main() {
  // Step 2. Allocate shared memory
  gpu.launch blocks(...) threads(...)
    dynamic_shared_memory_size %c10000 {
    // Step 3. Access the global object
    %shmem = memref.get_global @dynamicShmem : memref<0xf16, 3>
    // Step 4. A sequence of `memref.reinterpret_cast` and `memref.subview` operations.
    %4 = memref.reinterpret_cast %shmem to offset: [0], sizes: [14, 64, 128],  strides: [8192,128,1] : memref<0xf16, 3> to memref<14x64x128xf16,3>
    %5 = memref.subview %4[7, 0, 0][7, 64, 128][1,1,1] : memref<14x64x128xf16,3> to memref<7x64x128xf16, strided<[8192, 128, 1], offset: 57344>, 3>
    %6 = memref.subview %5[2, 0, 0][1, 64, 128][1,1,1] : memref<7x64x128xf16, strided<[8192, 128, 1], offset: 57344>, 3> to memref<64x128xf16, strided<[128, 1], offset: 73728>, 3>
    %7 = memref.subview %6[0, 0][64, 64][1,1]  : memref<64x128xf16, strided<[128, 1], offset: 73728>, 3> to memref<64x64xf16, strided<[128, 1], offset: 73728>, 3>
    %8 = memref.subview %6[32, 0][64, 64][1,1] : memref<64x128xf16, strided<[128, 1], offset: 73728>, 3> to memref<64x64xf16, strided<[128, 1], offset: 77824>, 3>
    // Step.5 Use
    "test.use.shared.memory"(%7) : (memref<64x64xf16, strided<[128, 1], offset: 73728>, 3>) -> (index)
    "test.use.shared.memory"(%8) : (memref<64x64xf16, strided<[128, 1], offset: 77824>, 3>) -> (index)
    gpu.terminator
  }
```

Let’s write the program above with that:

```
func.func @main() {
    gpu.launch blocks(...) threads(...) dynamic_shared_memory_size %c10000 {
    	%i = arith.constant 18 : index
        // Step 1: Obtain shared memory directly
        %shmem = gpu.dynamic_shared_memory : memref<?xi8, 3>
        %c147456 = arith.constant 147456 : index
        %c155648 = arith.constant 155648 : index
        %7 = memref.view %shmem[%c147456][] : memref<?xi8, 3> to memref<64x64xf16, 3>
        %8 = memref.view %shmem[%c155648][] : memref<?xi8, 3> to memref<64x64xf16, 3>

        // Step 2: Utilize the shared memory
        "test.use.shared.memory"(%7) : (memref<64x64xf16, 3>) -> (index)
        "test.use.shared.memory"(%8) : (memref<64x64xf16, 3>) -> (index)
    }
}
```

This PR resolves #72513
2023-11-16 14:42:17 +01:00
Christian Ulmann
97a238e863
[MLIR][LLVM] Remove typed pointer conversion utils (#71169)
This commit removes the no longer required type pointer helpers from the
LLVM dialect conversion utils. Typed pointers have been deprecated for a
while now and it's planned to soon remove them from the LLVM dialect.

Related PSA:
https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502
2023-11-03 13:02:35 +01:00
Christian Ulmann
484668c759
Reland "[MLIR][LLVM] Change addressof builders to use opaque pointers" (#69292)
This relands fbde19a664e5fd7196080fb4ff0aeaa31dce8508, which was broken due to incorrect GEP element type creation.

This commit changes the builders of the `llvm.mlir.addressof` operations
to no longer produce typed pointers.

As a consequence, a GPU to NVVM pattern had to be updated, that still
relied on typed pointers.
2023-10-17 11:33:45 +02:00
Christian Ulmann
9397e5f581 Revert "[MLIR][LLVM] Change addressof builders to use opaque pointers (#69215)"
This reverts commit fbde19a664e5fd7196080fb4ff0aeaa31dce8508 due to
breaking integration tests.
2023-10-17 06:31:48 +00:00
Christian Ulmann
fbde19a664
[MLIR][LLVM] Change addressof builders to use opaque pointers (#69215)
This commit changes the builders of the `llvm.mlir.addressof` operations
to no longer produce typed pointers.

As a consequence, a GPU to NVVM pattern and the toy example LLVM lowerings had to be updated, as they still relied on typed pointers.
2023-10-17 07:55:00 +02:00
stefankoncarevic
fbf67bfaf0 [mlir][GPU] Handle LLVM pointer attributes on memref arguments.
Handle pointer attributes (noalias, nonnull, readonly, writeonly,
dereferencable, dereferencable_or_null). "noalias" attribute is
ignore for non-bare pointer.

Reviewed By: krzysz00

Differential Revision: https://reviews.llvm.org/D157082
2023-09-11 15:10:55 +00:00
Matthias Springer
ce254598b7 [mlir][Conversion] Store const type converter in ConversionPattern
ConversionPatterns do not (and should not) modify the type converter that they are using.

* Make `ConversionPattern::typeConverter` const.
* Make member functions of the `LLVMTypeConverter` const.
* Conversion patterns take a const type converter.
* Various helper functions (that are called from patterns) now also take a const type converter.

Differential Revision: https://reviews.llvm.org/D157601
2023-08-14 09:03:11 +02:00
Nicolas Vasilache
888717e853 [mlir][transform] Enable gpu-to-nvvm via conversion patterns driven by TD
This revision untangles a few more conversion pieces and allows rewriting
the relatively intricate (and somewhat inconsistent) LowerGpuOpsToNVVMOpsPass
in a declarative fashion that provides a much better understanding and control.

Differential Revision: https://reviews.llvm.org/D157617
2023-08-10 15:30:48 +00:00
Nicolas Vasilache
67754a9dc4 [mlir][gpu] NFC - Fail gracefully when type conversion fails instead of crashing 2023-07-28 21:03:52 +00:00
Christopher Bate
14858cf05d [mlir][Conversion/GPUCommon] Fix bug in conversion of math ops
The common GPU operation transformation that lowers `math` operations
to function calls in the `gpu-to-nvvm` and `gpu-to-rocdl` passes handles
`vector` types by applying the function to each scalar and returning a
new vector. However, there was a typo that results in incorrectly
accumulating the result vector, and the rewrite returns an `llvm.mlir.undef`
result instead of the correct vector. A patch is added and tests are
strengthened.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D154269
2023-07-03 13:26:51 -06:00
Tobias Gysi
b126ee65fc [mlir][llvm] Add comdat attribute to functions
This revision adds comdat support to functions. Additionally,
it ensures only comdats that have uses are imported/exported and
only non-empty global comdat operations are created.

Reviewed By: Dinistro

Differential Revision: https://reviews.llvm.org/D153739
2023-06-27 07:26:59 +00:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files are definitions of the classes that have the casting
     methods, so the code still refers to the method instead of the
     function without adding a prefix or removing the method declaration
     at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Laszlo Kindrat
17faae95d7 [ADT] Introduce map_to_vector helper
The following pattern is common in the llvm codebase, as well as in downstream projects:
```
llvm::to_vector(llvm::map_range(container, lambda))
```
This patch introduces a shortcut for this called `map_to_vector`.

This template depends on both `llvm/ADT/SmallVector.h` and `llvm/ADT/STLExtras.h`, and since these are both relatively large and do not depend on each other, the `map_to_vector` helper is placed in a new header under `llvm/ADT/SmallVectorExtras.h`. Only a handful of use cases have been updated to use the new helper.

Differential Revision: https://reviews.llvm.org/D145390
2023-05-04 06:25:25 -05:00
Krzysztof Drewniak
94058c41d4 [mlir][GPU] Allow specifying alignment of memory attributions
Add support for argument attributes on workgroup and private
attributions for GPU functions. These arguments are outside the range
of getNumArguments() and get printed separately, so the default
mechanism for function argument attributes can't be used on them.

Having done this, check for the `llvm.align` attribute on workgroup or
private attributions in a `gpu.func` and pass it through to the
relevant allocation op (creating a global or alloca). This allows
people creating kernels that use multiple workgroup buffers to set an
alignment.

(This could, in the future, be a GPU dialect `alignment` attribute,
but I've taken the simpler route of using the LLVM version instead for
simplicity and because I don't know how this might impact backends
like Vulkan)

Reviewed By: nirvedhmeshram

Differential Revision: https://reviews.llvm.org/D148965
2023-05-03 21:51:15 +00:00
Mahesh Ravishankar
162f757206 [mlir][LLVM] Add an attribute to control use of bare-pointer calling convention.
Currently the use of bare pointer calling convention is controlled
globally through use of an option in the `LLVMTypeConverter`. To allow
more fine-grained control use an attribute on a function to drive the
calling convention to use.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D147494
2023-04-06 16:19:56 +00:00
Markus Böck
0e5aeae6f5 [mlir][GPUToLLVM] Add support for emitting opaque pointers
Part of https://discourse.llvm.org/t/rfc-switching-the-llvm-dialect-and-dialect-lowerings-to-opaque-pointers/68179

This patch adds the new pass option `use-opaque-pointers` to the GPU to LLVM lowerings (including ROCD and NVVM) and adapts the code to support using opaque pointers in addition to typed pointers.
The required changes mostly boil down to avoiding `getElementType` and specifying base types in GEP and Alloca.

In the future opaque pointers will be the only supported model, hence tests have been ported to using opaque pointers by default. Additional regression tests for typed-pointers have been added to avoid breaking existing clients.

Note: This does not yet port the `GpuToVulkan` passes.

Differential Revision: https://reviews.llvm.org/D144448
2023-02-21 20:46:33 +01:00
Krzysztof Drewniak
499abb243c Add generic type attribute mapping infrastructure, use it in GpuToX
Remapping memory spaces is a function often needed in type
conversions, most often when going to LLVM or to/from SPIR-V (a future
commit), and it is possible that such remappings may become more
common in the future as dialects take advantage of the more generic
memory space infrastructure.

Currently, memory space remappings are handled by running a
special-purpose conversion pass before the main conversion that
changes the address space attributes. In this commit, this approach is
replaced by adding a notion of type attribute conversions
TypeConverter, which is then used to convert memory space attributes.

Then, we use this infrastructure throughout the *ToLLVM conversions.
This has the advantage of loosing the requirements on the inputs to
those passes from "all address spaces must be integers" to "all
memory spaces must be convertible to integer spaces", a looser
requirement that reduces the coupling between portions of MLIR.

ON top of that, this change leads to the removal of most of the calls
to getMemorySpaceAsInt(), bringing us closer to removing it.

(A rework of the SPIR-V conversions to use this new system will be in
a folowup commit.)

As a note, one long-term motivation for this change is that I would
eventually like to add an allocaMemorySpace key to MLIR data layouts
and then call getMemRefAddressSpace(allocaMemorySpace) in the
relevant *ToLLVM in order to ensure all alloca()s, whether incoming or
produces during the LLVM lowering, have the correct address space for
a given target.

I expect that the type attribute conversion system may be useful in
other contexts.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D142159
2023-02-09 18:00:46 +00:00
Christopher Bate
6ca1a09f03 [mlir][gpu] Migrate hard-coded address space integers to an enum attribute (gpu::AddressSpaceAttr)
This is a purely mechanical change that introduces an enum attribute in the GPU
dialect to represent the various memref memory spaces as opposed to the
hard-coded integer attributes that are currently used.

The following steps were taken to make the transition across the codebase:

1. Introduce a pass "gpu-lower-memory-space-attributes":

The pass updates all memref types that have a memory space attribute that is a
`gpu::AddressSpaceAttr`. These attributes are changed to `IntegerAttr`'s using a
mapping that is given by the caller. This pass is based on the
"map-memref-spirv-storage-class" pass and the common functions can probably
be refactored into a set of utilities under the MemRef dialect.

2. Update the verifiers of GPU/NVGPU dialect operations.

If a verifier currently checks the address space of an operand using
e.g.`getWorkspaceAddressSpace`, then it can continue to do so. However, the
checks are changed to only fail if the memory space is either missing or a wrong
value of type `gpu::AddressSpaceAttr`. Otherwise, it just assumes the address
space is correct because it was specifically lowered to something other than a
`gpu::AddressSpaceAttr`.

3. Update existing gpu-to-llvm conversion infrastructure.

In the existing gpu-to-X passes, we add a full conversion equivalent to
`gpu-lower-memory-space-attributes` just before doing the conversion to the
LLVMDialect. This is done because currently both the gpu-to-llvm passes
(rocdl,nvvm) run gpu-to-gpu rewrites within the pass, which introduce
`AddressSpaceAttr` memory space annotations. Therefore, I inserted the
memory space conversion between the gpu-to-gpu rewrites and the LLVM
conversion.

For more context see the below discourse discussion:
https://discourse.llvm.org/t/gpu-workgroup-shared-memory-address-space-is-hard-coded/

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D140644
2023-01-13 11:00:10 -07:00
Thomas Raoux
7efdc117b1 [mlir][nvvm] Add lowering of gpu.printf to nvvm
When converting to nvvm lowering gpu.printf to vprintf allows us to
support printing when running on cuda.

Differential Revision: https://reviews.llvm.org/D141049
2023-01-06 17:29:30 +00:00
Jeff Niu
53406427cd [mlir] FunctionOpInterface: turn required attributes into interface methods (Reland)
Reland D139447, D139471 With flang actually working

- FunctionOpInterface: make get/setFunctionType interface methods

This patch removes the concept of a `function_type`-named type attribute
as a requirement for implementors of FunctionOpInterface. Instead, this
type should be provided through two interface methods, `getFunctionType`
and `setFunctionTypeAttr` (*Attr because functions may use different
concrete function types), which should be automatically implemented by
ODS for ops that define a `$function_type` attribute.

This also allows FunctionOpInterface to materialize function types if
they don't carry them in an attribute, for example.

Importantly, all the function "helper" still accept an attribute name to
use in parsing and printing functions, for example.

- FunctionOpInterface: arg and result attrs dispatch to interface

This patch removes the `arg_attrs` and `res_attrs` named attributes as a
requirement for FunctionOpInterface and replaces them with interface
methods for the getters, setters, and removers of the relevent
attributes. This allows operations to use their own storage for the
argument and result attributes.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D139736
2022-12-10 15:17:09 -08:00
David Spickett
cf98e8273c Revert "[mlir] FunctionOpInterface: make get/setFunctionType interface methods"
and "[mlir] Fix examples build"

This reverts commit fbc253fe81da4e1d6bfa2519e01e03f21d8c40a8 and
96cf183bccd7d1c3083f169a89a6af1f263b3aae.

Which I missed in the first revert in f3379feabe38fd3711b13ffcf6de4aab03b7ccdc.
2022-12-09 15:36:48 +00:00
Jeff Niu
fbc253fe81 [mlir] FunctionOpInterface: make get/setFunctionType interface methods
This patch removes the concept of a `function_type`-named type attribute
as a requirement for implementors of FunctionOpInterface. Instead, this
type should be provided through two interface methods, `getFunctionType`
and `setFunctionTypeAttr` (*Attr because functions may use different
concrete function types), which should be automatically implemented by
ODS for ops that define a `$function_type` attribute.

This also allows FunctionOpInterface to materialize function types if
they don't carry them in an attribute, for example.

Importantly, all the function "helper" still accept an attribute name to
use in parsing and printing functions, for example.

Reviewed By: rriddle, lattner

Differential Revision: https://reviews.llvm.org/D139447
2022-12-08 11:32:27 -08:00
Uday Bondhugula
5ceaeed659 Improve type conversion error propagation/failure during LLVM lowering
Improve type conversion error propagation/failure during LLVM lowering.

BEFORE

```
llvm-mlir/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp:304: SmallVector<mlir::Type, 5> mlir::LLVMTypeConverter::getMemRefDescriptorFields(mlir::MemRefType, bool): Assertion `isStrided(type) && "Non-strided layout maps must have been normalized away"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
...
```

AFTER
```
<unknown>:0: error: integer overflow during size computation
<unknown>:0: error: Conversion to strided form failed either due to non-strided layout maps (which should have been normalized away) or other reasons
<unknown>:0: error: failed to legalize operation 'gpu.func' that was explicitly marked illegal
<unknown>:0: note: see current operation:
"gpu.func"() ( {
...
```

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D139072
2022-12-03 16:30:44 +05:30
Christian Sigg
b251b608b5 [mlir][gpu] Unroll ops on vectors which map to intrinsic calls
Unroll ops that map to intrinsics when lowering to LLVM, because intrinsics don't support vector operands/results.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D136345
2022-10-28 10:33:38 +02:00
River Riddle
10c04f4641 [mlir:GPU][NFC] Update GPU API to use prefixed accessors
This doesn't flip the switch for prefix generation yet, that'll be
done in a followup.
2022-09-30 15:27:10 -07:00