llvm-project

Author	SHA1	Message	Date
Krzysztof Drewniak	6b9da28b2b	[AMDGPU][LowerBufferFatPointers] Handle addrspacecast null to p7 (#140775) Some application code operating on generic pointers (that then get initialized to buffer fat pointers) may perform tests against nullptr. After address space inference, this results in comparisons against `addrspacecast (ptr null to ptr addrspace(7))`, which were crashing. However, while general casts to ptr addrspace(7) from generic pointers aren't supported, it is possible to cast null pointers to the all-zeros buffer resource and 0 offset, which this patch adds. It also adds a TODO for casting _out_ of buffer resources, which isn't implemented here but could be.	2025-05-20 16:13:01 -07:00
Alexander Richardson	ee13638362	[AMDGPU] Remove explicit datalayout from tests where not needed Since e39f6c1844fab59c638d8059a6cf139adb42279a opt will infer the correct datalayout when given a triple. Avoid explicitly specifying it in tests that depend on the AMDGPU target being present to avoid the string becoming out of sync with the TargetInfo value. Only tests with REQUIRES: amdgpu-registered-target or a local lit.cfg were updated to ensure that tests for non-target-specific passes that happen to use the AMDGPU layout still pass when building with a limited set of targets. Reviewed By: shiltian, arsenm Pull Request: https://github.com/llvm/llvm-project/pull/137921	2025-04-30 10:58:17 -07:00
Krzysztof Drewniak	f23bb530cf	[AMDGPULowerBufferFatPointers] Use InstSimplifyFolder during rewrites (#134137) This PR updates AMDGPULowerBufferFatPointers to use the InstSimplifyFolder when creating IR during buffer fat pointer lowering. This shouldn't cause any large functional changes and might improve the quality of the generated code.	2025-04-03 10:12:18 -05:00
Krzysztof Drewniak	f7d03707d1	[AMDGPU] Generalize amdgcn.make.buffer.rsrc to fat pointers (#126828) Attempting to pass a `ptr addrspace(7)` to functions that take `ptr` arguments produces undesirable `addrspacecast(addrspacecast(p8 x to p7) to p0) => addrspacecast(p8 x to p0)` folds. This results in illegal GEP operations on buffer resources, which can't be GEP'd. (However, note that, while unimplemented, addrspacecast from ptr addrspace(7) to ptr is legal; it's just an effective address computation.) To resolve this problem, and thus prevent illegal `getelementptr T, ptr addrspace(8) %x, ...` instructions from being produced, this commit extends amdgcn.make.buffer.rsrc to also be variadic in its result type, auto-upgrading old manglings. The logic for handling a make.buffer.rsrc in instruction selection remains untouched and expects the output type to be a ptr addrspace(8), as does the Clang lowering for its builtin (the pointer-to-pointer version might want a different name in Clang). LowerBufferFatPointers has been updated to lower amdgcn.make.buffer.rsrc.p7.p* to amdgcn.make.buffer.rsrc.p8.p*. This'll also make exposing buffer fat pointers in Clang easier, since you don't have to cast between a `__amdgcn_rsrc_t` and a pointer.	2025-02-18 14:15:28 -06:00
Krzysztof Drewniak	934c97dd16	[LowerBufferFatPointers] Fix support for GEP T, p7, <N x T> idxs (#126126) The lowering for GEP didn't properly support the case where the pointer argument was being implicitly broadcast by a vector of indices. Fix that. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-02-11 18:22:50 -06:00
Krzysztof Drewniak	3b0f506c87	[AMDGPU] Support `nuw` and `nusw` in buffer fat pointer lowering (#115039) This commit uses the `nuw` flag on `getelementptr` to set the `nuw` flag on buffer offset additions, and also moves from `inbounds` to the looser `nusw` for the existing case.	2024-11-06 11:42:47 -06:00
Paul Walker	38fffa630e	[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)	2024-11-06 11:53:33 +00:00
Nikita Popov	0774000e32	[AMDGPULowerBufferFatPointers] Fix offset-only ptrtoint (#95543) For ptrtoint that truncates to the offset only, the expansion generated a shift by the bit width, which is poison. Instead, we should return the offset directly. (The same problem exists for the constant expression case, but I plan to address that separately, and more comprehensively.)	2024-06-14 16:38:57 +02:00
Nikita Popov	cb3a6bded7	[AMDGPULowerBufferFatPointers] Restore zero offset special case OffAccum will never be nullptr now, instead check for a zero constant.	2024-06-12 10:30:23 +02:00
Nikita Popov	6fc63ab77d	[AMDGPULowerBufferFatPointers] Simplify and fix GEP offset emission (#95115) Use emitGEPOffset() to emit the GEP offset, which already has all the necessary logic. This also fixes the nuw flag incorrectly being set on the offset calculation, while only nsw is implied by inbounds.	2024-06-12 09:51:18 +02:00
Krzysztof Drewniak	6540f1635a	[AMDGPU] Add IR-level pass to rewrite away address space 7 (#77952) This commit adds the -lower-buffer-fat-pointers pass, which is applicable to all AMDGCN compilations. The purpose of this pass is to remove the type `ptr addrspace(7)` from incoming IR. This must be done at the LLVM IR level because `ptr addrspace(7)`, as a 160-bit primitive type, cannot be correctly handled by SelectionDAG. The detailed operation of the pass is described in comments, but, in summary, the removal proceeds by: 1. Rewriting loads and stores of ptr addrspace(7) to loads and stores of i160 (including vectors and aggregates). This is needed because the in-register representation of these pointers will stop matching their in-memory representation in step 2, and so ptrtoint/inttoptr operations are used to preserve the expected memory layout. 2. Mutating the IR to replace all occurrences of `ptr addrspace(7)` with the type `{ptr addrspace(8), ptr addrspace(6) }`, which makes the two parts of a buffer fat pointer (the 128-bit address space 8 resource and the 32-bit address space 6 offset) visible in the IR. This also impacts the argument and return types of functions. 3. Splitting the resource and offset parts. All instructions that produce or consume buffer fat pointers (like GEP or load) are rewritten to produce or consume the resource and offset parts separately. For example, GEP updates the offset part of the result and a load uses the resource and offset parts to populate the relevant llvm.amdgcn.raw.ptr.buffer.load intrinsic call. At the end of this process, the original mutated instructions are replaced by their new split counterparts, ensuring no invalidly-typed IR escapes this pass. (For operations like call, where the struct form is needed, insertelement operations are inserted). 
Compared to LGC's PatchBufferOp (`32cda89776/lgc/patch/PatchBufferOp.cpp`), this pass: - Also handles vectors of ptr addrspace(7)s - Also handles function boundaries - Includes the same uniform buffer optimization for loops and conditionals - Does not handle memcpy() and friends (this is future work) - Does not break up large loads and stores into smaller parts. This should be handled by extending the legalization of .buffer.{load,store} to handle larger types by producing multiple instructions (the same way ordinary LOAD and STORE are legalized). That work is planned for a followup commit. - Does not have special logic for handling divergent buffer descriptors. The logic in LGC is, as far as I can tell, incorrect in general, and, per discussions with @nhaehnle, isn't widely used. Therefore, divergent descriptors are handled with waterfall loops later in legalization. As a final matter, this commit updates atomic expansion to treat buffer operations analogously to global ones. (One question for reviewers: is the new pass in the right place? Should it be later in the pipeline?) Differential Revision: https://reviews.llvm.org/D158463	2024-03-06 09:49:58 -06:00

11 Commits
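
Several of the commits above rely on the same picture of a buffer fat pointer: a 128-bit resource plus a 32-bit offset. The following is a toy Python model of that picture, not code from the pass. The names `FatPointer`, `null_fat_pointer`, `gep`, and `ptrtoint` are hypothetical, and placing the resource in the high bits of the 160-bit integer is an assumption inferred from the "shift by the bit width" wording in 0774000e32. Poison semantics for `nuw`/`nusw` are not modelled; offsets simply wrap at 32 bits.

```python
from dataclasses import dataclass

RSRC_BITS = 128    # width of the address space 8 buffer resource
OFFSET_BITS = 32   # width of the offset part
OFFSET_MASK = (1 << OFFSET_BITS) - 1


@dataclass(frozen=True)
class FatPointer:
    """Hypothetical stand-in for a split `ptr addrspace(7)` value."""
    rsrc: int    # 128-bit resource, held as a Python int
    offset: int  # 32-bit offset, held as a Python int


def null_fat_pointer() -> FatPointer:
    # Casting null yields the all-zeros resource and a zero offset (6b9da28b2b).
    return FatPointer(rsrc=0, offset=0)


def gep(p: FatPointer, byte_offset: int) -> FatPointer:
    # A GEP only updates the offset part; the resource is carried through.
    # Assumption: plain 32-bit wrap-around, with no nuw/nusw poison tracking.
    return FatPointer(p.rsrc, (p.offset + byte_offset) & OFFSET_MASK)


def ptrtoint(p: FatPointer, width: int) -> int:
    mask = (1 << width) - 1
    if width <= OFFSET_BITS:
        # Offset-only case (0774000e32): use the offset directly, never
        # shift the resource by the full result width.
        return p.offset & mask
    # Assumed layout: resource in the high bits, offset in the low 32 bits.
    return ((p.rsrc << OFFSET_BITS) | p.offset) & mask
```

For instance, `ptrtoint(gep(null_fat_pointer(), 16), 32)` evaluates to `16` in this model, since the null resource contributes nothing and the GEP touched only the offset.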