10 Commits

Author SHA1 Message Date
Jessica Del
32f9983c06 [AMDGPU] - Add address space for strided buffers (#74471)
This is an experimental address space for strided buffers. These buffers
can have structs as elements and a stride > 1.
These pointers allow indexed access in units of stride, i.e., they
point at `buffer[index * stride]`.
Thus, we can use the `idxen` modifier for buffer loads.

We assign address space 9 to 192-bit buffer pointers which contain a
128-bit descriptor, a 32-bit offset and a 32-bit index. Essentially,
they are fat buffer pointers with an additional 32-bit index.
2023-12-15 15:49:25 +01:00
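The 192-bit pointer described in the commit above can be pictured with a small sketch. This is only an illustration: the struct name, the field names, the field order, and the way the offset is combined with the index are assumptions, not the layout LLVM or the hardware actually defines.

```cpp
#include <cstdint>

// Hypothetical picture of a 192-bit strided buffer pointer.
// Field names and ordering are assumptions made for this example.
struct StridedBufferPtr {
    std::uint32_t descriptor[4]; // 128-bit buffer descriptor
    std::uint32_t offset;        // 32-bit offset
    std::uint32_t index;         // 32-bit index, counted in units of stride
};

// 128 + 32 + 32 bits.
static_assert(sizeof(StridedBufferPtr) * 8 == 192, "expected a 192-bit pointer");

// Byte position selected inside the buffer. The commit message states that
// the pointer addresses buffer[index * stride]; adding the offset on top of
// that product is an assumption of this sketch.
constexpr std::uint64_t byteAddress(std::uint32_t index,
                                    std::uint32_t stride,
                                    std::uint32_t offset) {
    return static_cast<std::uint64_t>(index) * stride + offset;
}
```

With a 16-byte stride, `byteAddress(3, 16, 4)` selects byte 52, i.e. 4 bytes into the fourth element.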
Matt Arsenault
ee795fd1cf AMDGPU: Handle rounding intrinsic exponents in isKnownIntegral
https://reviews.llvm.org/D158999
2023-09-01 08:22:16 -04:00
Matt Arsenault
def228553c AMDGPU: Use pown instead of pow if known integral
https://reviews.llvm.org/D158998
2023-09-01 08:22:16 -04:00
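The idea behind replacing pow with pown can be sketched as follows. The helper names `isIntegralValue` and `pownSketch` are hypothetical, and the integrality check here runs on a runtime value, whereas the commit above reasons about the exponent at compile time. Treat this as a rough illustration rather than the pass's logic.

```cpp
#include <cmath>

// Hypothetical runtime stand-in for "the exponent is known to be integral".
bool isIntegralValue(double y) {
    return std::isfinite(y) && std::trunc(y) == y;
}

// Hypothetical integer-exponent power using repeated squaring.
// Widening to long long first keeps the negation safe for INT_MIN.
double pownSketch(double x, int n) {
    long long e = n;
    unsigned long long m = static_cast<unsigned long long>(e < 0 ? -e : e);
    double result = 1.0;
    double base = x;
    while (m != 0) {
        if (m & 1ULL) {
            result *= base; // this bit of the exponent is set
        }
        base *= base;       // square for the next bit
        m >>= 1;
    }
    return n < 0 ? 1.0 / result : result;
}
```

A caller would only take the `pownSketch` path when `isIntegralValue(y)` holds and `y` fits in an `int`; unlike a general pow, the integer-exponent form is well defined for negative `x`.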
Matt Arsenault
deefda7074 AMDGPU: Use exp2 and log2 intrinsics directly for f16/f32
These codegen correctly but f64 doesn't. This prevents losing fast
math flags on the way to the underlying intrinsic.

https://reviews.llvm.org/D158997
2023-09-01 08:22:16 -04:00
Matt Arsenault
dac8f974b5 AMDGPU: Handle sitofp and uitofp exponents in fast pow expansion
https://reviews.llvm.org/D158996
2023-09-01 08:22:16 -04:00
Matt Arsenault
699685b718 AMDGPU: Enable assumptions in AMDGPULibCalls
https://reviews.llvm.org/D159006
2023-09-01 08:22:16 -04:00
Matt Arsenault
a45b787c91 AMDGPU: Turn pow libcalls into powr
powr is just pow with the assumption that x >= 0; otherwise the result is nan. This
fires at least 6 times in luxmark.

https://reviews.llvm.org/D158908
2023-09-01 08:22:16 -04:00
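The relationship between pow and powr stated in the commit above can be sketched in a few lines. `powrSketch` is a hypothetical name, and the sketch follows only the one-line description from the commit message; the edge cases that the OpenCL definition of powr treats specially are deliberately ignored here.

```cpp
#include <cmath>
#include <limits>

// Hypothetical model of powr: pow under the assumption x >= 0.
// A negative base yields NaN instead of a real result.
double powrSketch(double x, double y) {
    if (x < 0.0) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return std::pow(x, y);
}
```

For a non-negative base the two functions agree, which is the case the commit message relies on when it turns a pow call into powr.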
Matt Arsenault
f5d8a9b1bb AMDGPU: Simplify handling of constant vectors in libcalls
Also fixes not handling the partially undef case.

https://reviews.llvm.org/D158905
2023-09-01 08:22:16 -04:00
Matt Arsenault
afb24cbb69 AMDGPU: Don't require all flags to expand fast powr
The expansion previously required all fast math flags, which made it
practically useless: it wouldn't fire even with all the standard OpenCL
fast math flags. It only needs afn, nnan, and ninf.

https://reviews.llvm.org/D158904
2023-09-01 08:22:16 -04:00
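The change in the flag requirement can be illustrated with a plain bitmask. The flag names mirror LLVM's fast math flags, but the enum, its bit values, and both predicate names are hypothetical and do not correspond to LLVM's `FastMathFlags` API.

```cpp
// Hypothetical bitmask mirroring the names of LLVM's fast math flags.
enum FastFlag : unsigned {
    NNAN     = 1u << 0,
    NINF     = 1u << 1,
    NSZ      = 1u << 2,
    ARCP     = 1u << 3,
    CONTRACT = 1u << 4,
    AFN      = 1u << 5,
    REASSOC  = 1u << 6,
};

constexpr unsigned kAllFlags =
    NNAN | NINF | NSZ | ARCP | CONTRACT | AFN | REASSOC;

// Old behavior as described in the commit message: every flag must be set.
constexpr bool requiresAllFlags(unsigned flags) {
    return (flags & kAllFlags) == kAllFlags;
}

// New behavior as described: afn, nnan, and ninf are sufficient.
constexpr bool canExpandFastPowr(unsigned flags) {
    constexpr unsigned required = AFN | NNAN | NINF;
    return (flags & required) == required;
}
```

A call carrying only `AFN | NNAN | NINF` passes `canExpandFastPowr` but fails `requiresAllFlags`, which is the situation the commit message calls out.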
Matt Arsenault
aa539b128f AMDGPU: Add baseline tests for libcall recognition of pow/powr/pown
2023-08-30 10:10:03 -04:00