21 Commits

Author SHA1 Message Date
Jianhui Li
83bff14dfd
[MLIR][GPUToLLVMSPV] Relax the width check in gpu.shuffle lowering (#183445)
This PR modifies the gpu.shuffle lowering so that the width check is only
performed when the subgroup size attribute is available.
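
A minimal sketch of the relaxed check, assuming the check compares the shuffle width against the sub-group size attribute of the enclosing function. The helper name `isShuffleWidthAccepted` and its signature are hypothetical and are not taken from the MLIR sources.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not the actual MLIR pattern code.
// `reqdSubgroupSize` models the optional sub-group size attribute of the
// enclosing function; std::nullopt means the attribute is absent.
bool isShuffleWidthAccepted(uint32_t width,
                            std::optional<uint32_t> reqdSubgroupSize) {
  assert(width > 0 && "shuffle width is expected to be positive");
  // Without the attribute there is nothing to compare against, so the
  // width check is skipped.
  if (!reqdSubgroupSize)
    return true;
  // With the attribute, the width has to match the sub-group size.
  return width == *reqdSubgroupSize;
}
```

Under these assumptions, a width of 16 is accepted when no attribute is present or when the attribute is 16, and a width of 8 is rejected when the attribute is 16.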
2026-03-06 06:25:23 -08:00
Jakub Kuderski
59e44799bd
[mlir] Fix new clang-tidy warning llvm-type-switch-case-types. NFC. (#178487)
Pre-committing this before landing the new check in
https://github.com/llvm/llvm-project/pull/177892
2026-01-28 19:13:47 +00:00
Krzysztof Drewniak
df739ba008
[mlir][gpu] Add address space modifier to gpu.barrier (#177425)
This is a takeover of PR #110527

This commit adds an optional list of memory fences to gpu.barrier,
allowing users to specify which memory scopes they wish to fence
explicitly, while leaving the default semantics (which are equivalent to
calling for a global and local fence by analogy to CUDA's __syncthreads)
unchanged. The new expanded semantics are implemented for SPIR-V and for
the AMDGPU backend.

See also

https://discourse.llvm.org/t/rfc-add-memory-scope-to-gpu-barrier/81021/2?u=fmarno,
where the default behavior of a gpu.barrier was hashed out (though note
that the examples based on VMCNT are outdated for AMDGPU in that memory
fences can now be annotated with the correct set of address spaces).

This commit also deprecates amdgpu.lds_barrier for use cases that don't
involve targeting gfx908.

Assisted-by: Cursor/Claude code (tests and extending amdgpu.lds_barrier
pattern while copying it over)

---------

Co-authored-by: Finlay Marno <finlay.marno@codeplay.com>
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
Co-authored-by: Alan Li <alan.li@me.com>
2026-01-26 12:08:47 -08:00
Adam Paszke
9a93769853
[MLIR] Propagate known cluster sizes from gpu.launch to gpu.func (#174404)
This lets us properly annotate ranges for gpu.cluster_block_id and
gpu.cluster_dim_blocks. It also allows us to fill in the
nvvm.cluster_dim attribute for use in the NVVM backend.
2026-01-06 03:49:02 -08:00
Tomek Kuczyński
77455615a4
[MLIR][GPUToLLVMSPV] Use global & local memory scope for GPUBarrierConversion (#169026)
The MLIR [GPU dialect
docs](https://mlir.llvm.org/docs/Dialects/GPU/#gpubarrier-gpubarrierop)
specify that gpu::BarrierOp should make *all memory accesses* visible to
all work items in the workgroup.
The current implementation uses only CLK_LOCAL_MEM_FENCE, which per the
[OpenCL
specification](https://registry.khronos.org/OpenCL/sdk/3.0/docs/man/html/barrier.html)
guarantees visibility of only *local memory accesses*.

This PR changes the barrier conversion to use CLK_LOCAL_MEM_FENCE |
CLK_GLOBAL_MEM_FENCE,
ensuring both local and global memory operations are properly
synchronized per the MLIR spec.

This issue was discovered while investigating numerical instabilities on
Intel Battlemage,
where race conditions occurred due to incomplete memory synchronization.
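
The flag change can be sketched as follows. The numeric values are an assumption based on the values commonly found in OpenCL C headers (`0x01` for the local fence, `0x02` for the global fence), and the constant and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values, mirroring common OpenCL C header definitions.
constexpr uint32_t kClkLocalMemFence = 0x01;
constexpr uint32_t kClkGlobalMemFence = 0x02;

// Flags before this change: only local memory accesses are fenced.
constexpr uint32_t barrierFlagsBefore() { return kClkLocalMemFence; }

// Flags after this change: both local and global accesses are fenced.
constexpr uint32_t barrierFlagsAfter() {
  return kClkLocalMemFence | kClkGlobalMemFence;
}
```

With the assumed values, the combined flag is `3`, for example `assert(barrierFlagsAfter() == 3);`.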
2025-12-17 10:28:57 -05:00
darkbuck
2f6f045ea8
[mlir][LLVM] Resync memory effect attribute with LLVM IR (#168568)
- Add missing locations, namely 'ErrnoMem', 'TargetMem0', and
'TargetMem1'.
2025-11-19 11:56:04 -05:00
Jakub Kuderski
ba0be89cd2
[mlir] Simplify Default cases in type switches. NFC. (#165767)
Use default values instead of lambdas when possible. `std::nullopt` and
`nullptr` can be used now because of
https://github.com/llvm/llvm-project/pull/165724.
2025-10-30 15:10:59 -04:00
Sang Ik Lee
150145486e
[MLIR][GPU] Generalize gpu.printf op lowering to LLVM call pattern. (#164297)
The existing pattern for lowering the gpu.printf op to an LLVM call uses a
fixed function name and calling convention.
These two should be exposed as pass options to allow supporting the Intel
Compute Runtime for GPU.

Also adds a gpu.printf op pattern to the GPU to LLVMSPV pass.
It may appear out of place, but the integration test is added to the XeVM
integration tests, as that is currently the best folder for testing with
the Intel Compute Runtime.
The test should be moved in the future if a better test folder is added.
2025-10-23 08:32:53 -07:00
Maksim Levental
eaa67a3cf0
[mlir][NFC] update Conversion create APIs (5/n) (#149887)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-22 10:40:45 -04:00
Pietro Ghiglio
cdd652eb28
[MLIR][GPU] Support bf16 and i1 gpu::shuffles to LLVMSPIRV conversion (#119675)
This PR adds support for the `bf16` and `i1` data types when converting
`gpu::shuffle` to the `LLVMSPV` dialect, by inserting `bitcast` to/from
`i16` (for `bf16`) and extending/truncating to `i8` (for `i1`).
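
The wrapping described above can be sketched in plain C++. This is an illustration and not the conversion pattern itself: `BF16` is a hypothetical stand-in that only stores the raw 16 bits, and zero-extension for `i1` is an assumption, since the message only says "extending".

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a bf16 value: just the raw 16-bit pattern.
struct BF16 {
  uint16_t bits;
};

// bf16 -> i16: reinterpret the bits without changing them.
uint16_t bitcastToI16(BF16 value) {
  uint16_t out;
  std::memcpy(&out, &value, sizeof(out));
  return out;
}

// i16 -> bf16: the inverse reinterpretation applied after the shuffle.
BF16 bitcastFromI16(uint16_t value) {
  BF16 out;
  std::memcpy(&out, &value, sizeof(out));
  return out;
}

// i1 -> i8: zero-extension (an assumption, see the lead-in).
uint8_t extendI1ToI8(bool value) { return value ? 1 : 0; }

// i8 -> i1: truncation keeps only the lowest bit.
bool truncateI8ToI1(uint8_t value) { return (value & 1u) != 0; }
```

Both conversions round-trip, so a shuffle performed on the `i16` or `i8` value returns the original `bf16` or `i1` payload once converted back.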
2025-01-09 13:16:18 +01:00
Jefferson Le Quellec
81825687b4
[MLIR][GPUToLLVMSPV] Update ConvertGpuOpsToLLVMSPVOps's option (#118818)
## Description

This PR updates the `ConvertGpuOpsToLLVMSPVOps`'s option by replacing
the `index-bitwidth` with a boolean option `use-64bit-index` (similar to
the `ConvertGPUToSPIRV` option).

The reason for this modification is that `ConvertGpuOpsToLLVMSPVOps` is
described as follows:
> Generate LLVM operations to be ingested by a SPIR-V backend for gpu operations

In the context of SPIR-V specifications only two physical addressing
models are allowed: `Physical32` and `Physical64`.

This change guarantees output sanity by preventing invalid or
unsupported index bitwidths from being specified.
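
As a sketch of why a boolean is sufficient: with only `Physical32` and `Physical64` available, the index width has two valid values. The function name below is hypothetical, and mapping `false` to 32 bits is an assumption.

```cpp
#include <cassert>

// Hypothetical helper: derive the index bitwidth from the boolean option.
// Only the two widths that match a SPIR-V physical addressing model can
// be produced, so an invalid width cannot be requested.
constexpr unsigned indexBitwidth(bool use64bitIndex) {
  return use64bitIndex ? 64u : 32u;
}
```

For example, `indexBitwidth(true)` yields 64 and `indexBitwidth(false)` yields 32; there is no input that produces any other width.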
2024-12-12 13:35:07 +01:00
Victor Perez
a807bbea6f
[MLIR][GPUToLLVMSPV] Use llvm.func attributes to convert gpu.shuffle (#116967)
Use `llvm.func`'s `intel_reqd_sub_group_size` attribute instead of
SPIR-V environment attributes in the `gpu.shuffle` conversion pattern.
This metadata is needed to check that the semantics of the operation are
supported, i.e., it has a constant width and its value is equal to the
sub-group size.

As the pass also converts `gpu.func` to `llvm.func`, adding a
discardable attribute named `intel_reqd_sub_group_size` to
the latter is enough for this pattern to work.

We no longer have a notion of "default" sub-group size, so this
attribute needs to be set in the parent function for `gpu.shuffle`
operations to be converted.

Drop dependency on the SPIR-V dialect as we no longer require creating
attributes from this dialect to lower `gpu.shuffle` instances.

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-11-27 15:04:38 +01:00
Petr Kurapov
f8b7a65395
[MLIR][GPU-LLVM] Add in-pass signature update for opencl kernels (#105664)
Default to Global address space for memrefs that do not have an explicit address space set in the IR.

---------

Co-authored-by: Victor Perez <victor.perez@intel.com>
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
Co-authored-by: Victor Perez <victor.perez@codeplay.com>
2024-10-10 14:04:52 +02:00
Matthias Springer
206fad0e21
[mlir][NFC] Mark type converter in populate... functions as const (#111250)
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.

Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate patterns now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.

Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
2024-10-05 21:32:40 +02:00
Finlay
af7aa223d2
[MLIR][GPU] Lower subgroup query ops in gpu-to-llvm-spv (#108839)
These ops are:
* gpu.subgroup_id
* gpu.lane_id
* gpu.num_subgroups
* gpu.subgroup_size

---------

Signed-off-by: Finlay Marno <finlay.marno@codeplay.com>
2024-09-26 14:52:12 +01:00
Finlay
552d26e275
[mlir][gpu] Add extra value types for gpu::ShuffleOp (#104605)
Expand the accepted types for gpu.shuffle to any integer, float or 1d vector of integers or floats.
Also updated the gpu-to-llvm-spv pass to support those types.
2024-08-20 19:50:25 +01:00
Victor Perez
75cb9edf09
[MLIR][GPU-LLVM] Add GPU to LLVM-SPV address space mapping (#102621)
Implement mapping:

- `global`: 1
- `workgroup`: 3
- `private`: 0

Add `addressSpaceToStorageClass`, mapping GPU address spaces to SPIR-V
storage classes to be able to use SPIR-V's
`storageClassToAddressSpace`, mapping SPIR-V storage classes to LLVM
address spaces according to our mapping above *by definition*.
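
The numeric mapping listed above can be written as a small lookup. The enum and function names are hypothetical; only the three numbers come from this message.

```cpp
#include <cassert>

// Hypothetical enum modeling the three GPU address spaces named above.
enum class GpuAddressSpace { Global, Workgroup, Private };

// Maps a GPU address space to the LLVM address space number listed in
// this commit message.
unsigned toLLVMAddressSpace(GpuAddressSpace space) {
  switch (space) {
  case GpuAddressSpace::Global:
    return 1;
  case GpuAddressSpace::Workgroup:
    return 3;
  case GpuAddressSpace::Private:
    return 0;
  }
  assert(false && "unknown GPU address space");
  return 0;
}
```

The sketch skips the intermediate SPIR-V storage class step and goes straight from GPU address space to LLVM address space number.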

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-08-16 11:18:35 +02:00
Victor Perez
d45de8003a
[MLIR][GPU-LLVM] Convert gpu.func to llvm.func (#101664)
Add support in `-convert-gpu-to-llvm-spv` to convert `gpu.func` to
`llvm.func` operations.

- `spir_kernel`/`spir_func` calling conventions used for
kernels/functions.
- `workgroup` attributions encoded as additional `llvm.ptr<3>`
arguments.
- No attribute used to annotate kernels
- `reqd_work_group_size` attribute used to encode
`gpu.known_block_size`.
- `llvm.mlir.workgroup_attrib_size` used to encode workgroup attribution
sizes. This is attached to the pointer argument that each workgroup
attribution lowers to.

**Note**: A notable missing feature that will be addressed in a
follow-up PR is a `-use-bare-ptr-memref-call-conv` option to replace
MemRef arguments with bare pointers to the MemRef element types instead
of the current MemRef descriptor approach.

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-08-09 16:09:11 +02:00
Finlay
5a53add85a
[mlir] Add optimization attrs for gpu-to-llvmspv function declarations and calls (#99301)
Adds the attributes nounwind and willreturn to all function
declarations. Adds `memory(none)` equivalent to the id/dimension
function declarations. The function declaration attributes are copied to
the function calls.
`nounwind` is legal because there are no exceptions in SPIR-V. I also do
not see any reason why any of these functions would not return when used
correctly.
I'm confident that the get id/dim functions will have no externally
observable memory effects, but think the convergent functions will have
effects.
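
A rough sketch of the resulting attribute sets, combining this message with the `convergent` attribute from #97807. The struct, enum, and function names are hypothetical and do not mirror the actual implementation.

```cpp
#include <cassert>

// Hypothetical model of the attributes attached to a builtin declaration.
struct DeclAttrs {
  bool noUnwind;
  bool willReturn;
  bool memoryNone;
  bool convergent;
};

// Hypothetical classification of the declared builtins.
enum class BuiltinKind { IdOrDimQuery, Barrier, Shuffle };

DeclAttrs attrsFor(BuiltinKind kind) {
  // nounwind and willreturn are added to every declaration.
  DeclAttrs attrs{true, true, false, false};
  switch (kind) {
  case BuiltinKind::IdOrDimQuery:
    // Id/dimension queries have no observable memory effects.
    attrs.memoryNone = true;
    break;
  case BuiltinKind::Barrier:
  case BuiltinKind::Shuffle:
    // Convergent functions keep their memory effects.
    attrs.convergent = true;
    break;
  }
  return attrs;
}
```

Per the message, the same attributes would then be copied from each declaration to its call sites.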
2024-07-24 18:30:03 +02:00
Finlay
3670e7f86c
[MLIR] Add the convergent attribute to the barrier and shuffle ops (#97807)
When lowering from the gpu dialect to the llvm dialect for spirv, the
barrier op and shuffle ops need a convergent attribute for correctness.
2024-07-09 12:49:42 +02:00
Victor Perez
98d5d3448d
[MLIR][GPU-LLVM] Define -convert-gpu-to-llvm-spv pass (#90972)
Define a pass for GPU to LLVM conversion whose output is to be ingested by SPIR-V backend tools.

Supported operations:

- `gpu.block_id`
- `gpu.global_id`
- `gpu.block_dim`
- `gpu.thread_id`
- `gpu.grid_dim`
- `gpu.barrier`
- `gpu.shuffle`

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-05-31 17:47:53 +02:00