5 Commits

Author SHA1 Message Date
Kareem Ergawy
ff55c9bc63
[llvm][amdgpu] Handle indirect refs to LDS GVs during LDS lowering (#124089)
Fixes #123800

Extends LDS lowering by allowing it to discover transitive
indirect/escpaing references to LDS GVs.

For example, given the following input:
```llvm
@lds_item_to_indirectly_load = internal addrspace(3) global ptr undef, align 8

%store_type = type { i32, ptr }
@place_to_store_indirect_caller = internal addrspace(3) global %store_type undef, align 8

define amdgpu_kernel void @offloading_kernel() {
  store ptr @indirectly_load_lds, ptr addrspace(3) getelementptr inbounds nuw (i8, ptr addrspace(3) @place_to_store_indirect_caller, i32 0), align 8
  call void @call_unknown()
  ret void
}

define void @call_unknown() {
  %1 = alloca ptr, align 8
  %2 = call i32 %1()
  ret void
}

define void @indirectly_load_lds() {
  call void @directly_load_lds()
  ret void
}

define void @directly_load_lds() {
  %2 = load ptr, ptr addrspace(3) @lds_item_to_indirectly_load, align 8
  ret void
}

```

With the above input, prior to this patch, LDS lowering failed to lower
the reference to `@lds_item_to_indirectly_load` because:
1. it is indirectly called by a function whose address is taken in the
kernel.
2. we did not check if the kernel indirectly makes any calls to unknown
functions (we only checked the direct calls).

Co-authored-by: Jon Chesterfield <jonathan.chesterfield@amd.com>
2025-01-23 14:53:11 +01:00
Mirko Brkušanin
3def49cb64
[AMDGPU] Remove s_wakeup_barrier instruction (#122277) 2025-01-10 11:30:22 +01:00
Kazu Hirata
be187369a0
[AMDGPU] Remove unused includes (NFC) (#116154)
Identified with misc-include-cleaner.
2024-11-13 21:10:03 -08:00
Gang Chen
8c752900dd
[AMDGPU] modify named barrier builtins and intrinsics (#114550)
Use a local pointer type to represent the named barrier in builtin and
intrinsic. This makes the definitions more user friendly
bacause they do not need to worry about the hardware ID assignment. Also
this approach is more like the other popular GPU programming language.
Named barriers should be represented as global variables of addrspace(3)
in LLVM-IR. Compiler assigns the special LDS offsets for those variables
during AMDGPULowerModuleLDS pass. Those addresses are converted to hw
barrier ID during instruction selection. The rest of the
instruction-selection changes are primarily due to the
intrinsic-definition changes.
2024-11-06 10:37:22 -08:00
Jay Foad
55d744eea3
[AMDGPU] Move AMDGPUMemoryUtils out of Utils. NFC. (#104930)
It is only used by CodeGen so does not need to be shared with the
assembler/disassembler.
2024-08-20 16:15:46 +01:00