llvm-project

Author	SHA1	Message	Date
Jianhui Li	83bff14dfd	[MLIR][GPUToLLVMSPV] Relax the width check in gpu.shuffle lowering (#183445) This PR modifies the gpu.shuffle lowering so that the width check is performed only when the subgroup size attribute is available.	2026-03-06 06:25:23 -08:00
Jakub Kuderski	59e44799bd	[mlir] Fix new clang-tidy warning llvm-type-switch-case-types. NFC. (#178487) Pre-committing this before landing the new check in https://github.com/llvm/llvm-project/pull/177892	2026-01-28 19:13:47 +00:00
Krzysztof Drewniak	df739ba008	[mlir][gpu] Add address space modifier to gpu.barrier (#177425) This is a takeover of PR #110527. This commit adds an optional list of memory fences to gpu.barrier, allowing users to specify which memory scopes they wish to fence explicitly, while leaving the default semantics (which are equivalent to calling for a global and local fence, by analogy to CUDA's __syncthreads) unchanged. The new expanded semantics are implemented for SPIR-V and for the AMDGPU backend. See also https://discourse.llvm.org/t/rfc-add-memory-scope-to-gpu-barrier/81021/2?u=fmarno, where the default behavior of a gpu.barrier was hashed out (though note that the examples based on VMCNT are outdated for AMDGPU in that memory fences can now be annotated with the correct set of address spaces). This commit also deprecates amdgpu.lds_barrier for use cases that don't involve targeting a gfx908. Assisted-by: Cursor/Claude code (tests and extending amdgpu.lds_barrier pattern while copying it over) --------- Co-authored-by: Finlay Marno <finlay.marno@codeplay.com> Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com> Co-authored-by: Alan Li <alan.li@me.com>	2026-01-26 12:08:47 -08:00
Adam Paszke	9a93769853	[MLIR] Propagate known cluster sizes from gpu.launch to gpu.func (#174404) This lets us properly annotate ranges for gpu.cluster_block_id and gpu.cluster_dim_blocks. It also allows us to fill in the nvvm.cluster_dim attribute for use in the NVVM backend.	2026-01-06 03:49:02 -08:00
Tomek Kuczyński	77455615a4	[MLIR][GPUToLLVMSPV] Use global & local memory scope for GPUBarrierConversion (#169026) The MLIR [GPU dialect docs](https://mlir.llvm.org/docs/Dialects/GPU/#gpubarrier-gpubarrierop) specify that gpu::BarrierOp should make all memory accesses visible to all work items in the workgroup. The current implementation uses only CLK_LOCAL_MEM_FENCE, which per the [OpenCL specification](https://registry.khronos.org/OpenCL/sdk/3.0/docs/man/html/barrier.html) guarantees visibility of local memory accesses only. This PR changes the barrier conversion to use CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE, ensuring both local and global memory operations are properly synchronized per the MLIR spec. This issue was discovered while investigating numerical instabilities on Intel Battlemage, where race conditions occurred due to incomplete memory synchronization.	2025-12-17 10:28:57 -05:00
darkbuck	2f6f045ea8	[mlir][LLVM] Resync memory effect attribute with LLVM IR (#168568) - Add missing locations, namely 'ErrnoMem', 'TargetMem0', and 'TargetMem1'.	2025-11-19 11:56:04 -05:00
Jakub Kuderski	ba0be89cd2	[mlir] Simplify Default cases in type switches. NFC. (#165767) Use default values instead of lambdas when possible. `std::nullopt` and `nullptr` can be used now because of https://github.com/llvm/llvm-project/pull/165724.	2025-10-30 15:10:59 -04:00
Sang Ik Lee	150145486e	[MLIR][GPU] Generalize gpu.printf op lowering to LLVM call pattern. (#164297) The existing pattern for lowering the gpu.printf op to an LLVM call uses a fixed function name and calling convention. Those two should be exposed as pass options to allow supporting Intel Compute Runtime for GPU. Also adds the gpu.printf op pattern to the GPU to LLVMSPV pass. It may appear out of place, but the integration test is added to the XeVM integration tests, as that is currently the best folder for testing with Intel Compute Runtime. The test should be moved in the future if a better test folder is added.	2025-10-23 08:32:53 -07:00
Maksim Levental	eaa67a3cf0	[mlir][NFC] update `Conversion` create APIs (5/n) (#149887) See https://github.com/llvm/llvm-project/pull/147168 for more info.	2025-07-22 10:40:45 -04:00
Pietro Ghiglio	cdd652eb28	[MLIR][GPU] Support bf16 and i1 gpu::shuffles to LLVMSPIRV conversion (#119675) This PR adds support for the `bf16` and `i1` data types when converting `gpu::shuffle` to the `LLVMSPV` dialect, by inserting `bitcast` to/from `i16` (for `bf16`) and extending/truncating to `i8` (for `i1`).	2025-01-09 13:16:18 +01:00
Jefferson Le Quellec	81825687b4	[MLIR][GPUToLLVMSPV] Update ConvertGpuOpsToLLVMSPVOps's option (#118818) This PR updates the `ConvertGpuOpsToLLVMSPVOps` pass options by replacing `index-bitwidth` with a boolean option, `use-64bit-index` (similar to the `ConvertGPUToSPIRV` option). The reason for this modification is that `ConvertGpuOpsToLLVMSPVOps` is described as: "Generate LLVM operations to be ingested by a SPIR-V backend for gpu operations". In the context of the SPIR-V specification, only two physical addressing models are allowed: `Physical32` and `Physical64`. This change guarantees output sanity by preventing invalid or unsupported index bitwidths from being specified.	2024-12-12 13:35:07 +01:00
Victor Perez	a807bbea6f	[MLIR][GPUToLLVMSPV] Use `llvm.func` attributes to convert `gpu.shuffle` (#116967) Use `llvm.func`'s `intel_reqd_sub_group_size` attribute instead of SPIR-V environment attributes in the `gpu.shuffle` conversion pattern. This metadata is needed to check that the semantics of the operation are supported, i.e., it has a constant width and its value is equal to the sub-group size. As the pass also converts `gpu.func` to `llvm.func`, adding a discardable attribute named `intel_reqd_sub_group_size` to the latter is enough for this pattern to work. We no longer have a notion of "default" sub-group size, so this attribute needs to be set in the parent function for `gpu.shuffle` operations to be converted. Drop dependency on the SPIR-V dialect as we no longer require creating attributes from this dialect to lower `gpu.shuffle` instances. --------- Signed-off-by: Victor Perez <victor.perez@codeplay.com>	2024-11-27 15:04:38 +01:00
Petr Kurapov	f8b7a65395	[MLIR][GPU-LLVM] Add in-pass signature update for opencl kernels (#105664) Default to Global address space for memrefs that do not have an explicit address space set in the IR. --------- Co-authored-by: Victor Perez <victor.perez@intel.com> Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com> Co-authored-by: Victor Perez <victor.perez@codeplay.com>	2024-10-10 14:04:52 +02:00
Matthias Springer	206fad0e21	[mlir][NFC] Mark type converter in `populate...` functions as `const` (#111250) This commit marks the type converter in `populate...` functions as `const`. This is useful for debugging. Patterns already take a `const` type converter. However, some `populate...` functions do not only add new patterns, but also add additional type conversion rules. That makes it difficult to find the place where a type conversion was added in the code base. With this change, all `populate...` functions that only populate patterns now have a `const` type converter. Programmers can then conclude from the function signature that these functions do not register any new type conversion rules. Also includes some minor cleanups around the 1:N dialect conversion infrastructure, which did not always pass the type converter as a `const` object internally.	2024-10-05 21:32:40 +02:00
Finlay	af7aa223d2	[MLIR][GPU] Lower subgroup query ops in gpu-to-llvm-spv (#108839) These ops are: * gpu.subgroup_id * gpu.lane_id * gpu.num_subgroups * gpu.subgroup_size --------- Signed-off-by: Finlay Marno <finlay.marno@codeplay.com>	2024-09-26 14:52:12 +01:00
Finlay	552d26e275	[mlir][gpu] Add extra value types for gpu::ShuffleOp (#104605) Expand the accepted types for gpu.shuffle to any integer, float or 1d vector of integers or floats. Also updated the gpu-to-llvm-spv pass to support those types.	2024-08-20 19:50:25 +01:00
Victor Perez	75cb9edf09	[MLIR][GPU-LLVM] Add GPU to LLVM-SPV address space mapping (#102621) Implement mapping: - `global`: 1 - `workgroup`: 3 - `private`: 0 Add `addressSpaceToStorageClass`, mapping GPU address spaces to SPIR-V storage classes to be able to use SPIR-V's `storageClassToAddressSpace`, mapping SPIR-V storage classes to LLVM address spaces according to our mapping above by definition. --------- Signed-off-by: Victor Perez <victor.perez@codeplay.com>	2024-08-16 11:18:35 +02:00
Victor Perez	d45de8003a	[MLIR][GPU-LLVM] Convert `gpu.func` to `llvm.func` (#101664) Add support in `-convert-gpu-to-llvm-spv` to convert `gpu.func` to `llvm.func` operations. - `spir_kernel`/`spir_func` calling conventions used for kernels/functions. - `workgroup` attributions encoded as additional `llvm.ptr<3>` arguments. - No attribute used to annotate kernels. - `reqd_work_group_size` attribute used to encode `gpu.known_block_size`. - `llvm.mlir.workgroup_attrib_size` used to encode workgroup attribution sizes. This will be attached to the pointer argument that workgroup attributions lower to. Note: A notable missing feature that will be addressed in a follow-up PR is a `-use-bare-ptr-memref-call-conv` option to replace MemRef arguments with bare pointers to the MemRef element types instead of the current MemRef descriptor approach. --------- Signed-off-by: Victor Perez <victor.perez@codeplay.com>	2024-08-09 16:09:11 +02:00
Finlay	5a53add85a	[mlir] Add optimization attrs for gpu-to-llvmspv function declarations and calls (#99301) Adds the attributes `nounwind` and `willreturn` to all function declarations. Adds the `memory(none)` equivalent to the id/dimension function declarations. The function declaration attributes are copied to the function calls. `nounwind` is legal because there are no exceptions in SPIR-V. I also do not see any reason why any of these functions would not return when used correctly. I'm confident that the get id/dim functions will have no externally observable memory effects, but think the convergent functions will have effects.	2024-07-24 18:30:03 +02:00
Finlay	3670e7f86c	[MLIR] Add the convergent attribute to the barrier and shuffle ops (#97807) When lowering from the gpu dialect to the llvm dialect for spirv, the barrier op and shuffle ops need a convergent attribute for correctness.	2024-07-09 12:49:42 +02:00
Victor Perez	98d5d3448d	[MLIR][GPU-LLVM] Define `-convert-gpu-to-llvm-spv` pass (#90972) Define pass for GPU to LLVM conversion for SPIR-V backend tool ingest. Supported operations: - `gpu.block_id` - `gpu.global_id` - `gpu.block_dim` - `gpu.thread_id` - `gpu.grid_dim` - `gpu.barrier` - `gpu.shuffle` --------- Signed-off-by: Victor Perez <victor.perez@codeplay.com>	2024-05-31 17:47:53 +02:00

21 Commits