llvm-project

Author	SHA1	Message	Date
Jessica Del	32f9983c06	[AMDGPU] - Add address space for strided buffers (#74471). This is an experimental address space for strided buffers. These buffers can have structs as elements and a stride > 1. These pointers allow indexed access in units of stride, i.e., they point at `buffer[index * stride]`. Thus, we can use the `idxen` modifier for buffer loads. We assign address space 9 to 192-bit buffer pointers, which contain a 128-bit descriptor, a 32-bit offset, and a 32-bit index. Essentially, they are fat buffer pointers with an additional 32-bit index.	2023-12-15 15:49:25 +01:00
Matt Arsenault	ee795fd1cf	AMDGPU: Handle rounding intrinsic exponents in isKnownIntegral https://reviews.llvm.org/D158999	2023-09-01 08:22:16 -04:00
Matt Arsenault	def228553c	AMDGPU: Use pown instead of pow if known integral https://reviews.llvm.org/D158998	2023-09-01 08:22:16 -04:00
Matt Arsenault	deefda7074	AMDGPU: Use exp2 and log2 intrinsics directly for f16/f32. These codegen correctly, but f64 doesn't. This prevents losing fast math flags on the way to the underlying intrinsic. https://reviews.llvm.org/D158997	2023-09-01 08:22:16 -04:00
Matt Arsenault	dac8f974b5	AMDGPU: Handle sitofp and uitofp exponents in fast pow expansion https://reviews.llvm.org/D158996	2023-09-01 08:22:16 -04:00
Matt Arsenault	699685b718	AMDGPU: Enable assumptions in AMDGPULibCalls https://reviews.llvm.org/D159006	2023-09-01 08:22:16 -04:00
Matt Arsenault	a45b787c91	AMDGPU: Turn pow libcalls into powr. powr is just pow with the assumption that x >= 0, otherwise nan. This fires at least 6 times in luxmark. https://reviews.llvm.org/D158908	2023-09-01 08:22:16 -04:00
Matt Arsenault	f5d8a9b1bb	AMDGPU: Simplify handling of constant vectors in libcalls. Also fixes not handling the partially undef case. https://reviews.llvm.org/D158905	2023-09-01 08:22:16 -04:00
Matt Arsenault	afb24cbb69	AMDGPU: Don't require all flags to expand fast powr. This was requiring all fast math flags, which is practically useless. This wouldn't fire using all the standard OpenCL fast math flags. This only needs afn, nnan, and ninf. https://reviews.llvm.org/D158904	2023-09-01 08:22:16 -04:00
Matt Arsenault	aa539b128f	AMDGPU: Add baseline tests for libcall recognition of pow/powr/pown	2023-08-30 10:10:03 -04:00
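
The pow/powr/pown commits above rely on how the three functions relate. The following is a rough host-side sketch of that relationship, not the AMDGPULibCalls code itself: `powr_model`, `pown_model`, and `is_integral_exponent` are hypothetical names, and the runtime integrality check only stands in for the compile-time analysis the log calls isKnownIntegral. Edge cases such as zero or infinite bases are not modeled.

```cpp
#include <cmath>
#include <limits>

// Hypothetical model of powr: pow under the assumption x >= 0, otherwise NaN.
double powr_model(double x, double y) {
    if (x < 0.0) return std::numeric_limits<double>::quiet_NaN();
    return std::pow(x, y);
}

// Hypothetical model of pown: integer exponent, computed by repeated squaring.
double pown_model(double x, int n) {
    long long e = n;               // widen before negating so INT_MIN is safe
    const bool negative = e < 0;
    if (negative) e = -e;
    double result = 1.0;
    double base = x;
    while (e > 0) {
        if (e & 1) result *= base; // multiply in the current bit's power
        base *= base;
        e >>= 1;
    }
    return negative ? 1.0 / result : result;
}

// Runtime stand-in (assumption) for "the exponent is known to be integral".
bool is_integral_exponent(double y) {
    return std::isfinite(y) && std::trunc(y) == y && std::fabs(y) <= 1.0e9;
}
```

When the exponent passes a check like `is_integral_exponent`, `pow(x, y)` and `pown_model(x, static_cast<int>(y))` agree even for negative `x`, which is the case where `powr_model` would return NaN.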

10 Commits
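
For the strided buffer commit (32f9983c06), the 192-bit pointer layout can be pictured with a small host-side model. This is only a sketch under stated assumptions: `StridedBufferPtr` and `byte_position` are hypothetical names, the field order is assumed, the descriptor is treated as opaque, and the stride is passed in explicitly rather than decoded from the descriptor. Adding the offset to `index * stride` is likewise an assumption about how the two 32-bit fields combine.

```cpp
#include <cstdint>

// Hypothetical model of the 192-bit strided buffer pointer:
// 128-bit descriptor + 32-bit offset + 32-bit index (field order assumed).
struct StridedBufferPtr {
    std::uint32_t descriptor[4]; // 128-bit buffer descriptor, opaque here
    std::uint32_t offset;        // 32-bit byte offset
    std::uint32_t index;         // 32-bit index, counted in units of stride
};

static_assert(sizeof(StridedBufferPtr) * 8 == 192,
              "descriptor + offset + index should total 192 bits");

// Byte position relative to the buffer base: index * stride + offset.
// stride_bytes is supplied by the caller (assumption, see lead-in).
std::uint64_t byte_position(const StridedBufferPtr& p,
                            std::uint32_t stride_bytes) {
    return std::uint64_t{p.index} * stride_bytes + p.offset;
}
```

With a 16-byte struct element, index 3 and offset 4 would address byte 52 of the buffer, i.e. the second 32-bit field of the fourth element.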