llvm-project

Author	SHA1	Message	Date
Krzysztof Drewniak	e2d2affc70	[AMDGPU][LowerBufferFatPointers] Fix crash with `select false` (#166471 ) If the input to LowerBufferFatPointers is such that the resource- and offset-specific `select` instructions generated for a `select` on `ptr addrspace(7)` fold away, the pass would crash when trying to replace an instruction with itself. This commit resolves the issue. Fixes https://github.com/iree-org/iree/issues/22551	2025-11-05 19:21:52 +00:00
Krzysztof Drewniak	01a7c880d2	[AMDGPU][LowerBufferFatPointers] Erase dead ptr(7) intrinsics (#160798 ) Fix a crash that would arise when intrinsics like llvm.masked.load.T.p7 were left in the module when AMDGPULowerBufferFatPointers was applied and so a captures(none) annotation would be applied to a non-pointer value, triggering a verifier failure. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>	2025-09-29 10:46:45 -05:00
Krzysztof Drewniak	96ce9f9d64	[AMDGPU] Prevent re-visits in LowerBufferFatPointers (#159168 ) Fixes https://github.com/iree-org/iree/issues/22001 The visitor in SplitPtrStructs would re-visit instructions if an instruction earlier in program order caused a recursive visit() call via getPtrParts(). This would cause instructions to be processed multiple times. As a consequence of this, PHI nodes could be added to the Conditionals array multiple times, which would lead to a conditional that was already simplified being processed multiple times. After the code moved to InstSimplifyFolder, this re-processing, combined with more aggressive simplifications, would lead to an attempt to replace an instruction with itself, causing an assertion failure and crash. This commit resolves the issue and adds the reduced form of the crashing input as a test.	2025-09-16 18:02:18 -07:00
Ivan Kosarev	faca8c9ed4	[AMDGPU][NFC] Only include CodeGenPassBuilder.h where needed. (#154769 ) Saves around 125-210 MB of compilation memory usage per source for roughly one third of our backend sources, ~60 MB on average.	2025-08-22 10:05:06 +01:00
Krzysztof Drewniak	7f27482a32	[AMDGPU][LowerBufferFatPointers] Fix lack of rewrite when loading/storing null (#154128 ) Fixes #154056. The fat buffer lowering pass was erroneously detecting that it did not need to run on functions that only load/store to the null constant (or other such constants). We thought this would be covered by specializing constants out to instructions, but that doesn't account for trivial constants like null. Therefore, we check the operands of instructions for buffer fat pointers in order to find such constants and ensure the pass runs. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2025-08-18 12:32:54 -05:00
Orlando Cazalet-Hyams	54f92c7806	[RemoveDIs][AMDGPU] Replace defunct getAssignmentMarkers call (#153212 ) Not quite NFC as it looks like the original intrinsic-handling code never got updated to use records. This was never caught because that code wasn't tested. I've adjusted an existing test so the behaviour is now covered.	2025-08-12 17:20:38 +01:00
Alexander Richardson	87ad9122e5	[AMDGPULowerBufferFatPointers] Handle ptrtoaddr by extending the offset Reviewed By: krzysz00 Pull Request: https://github.com/llvm/llvm-project/pull/139413	2025-08-09 16:28:12 -07:00
Jeremy Morse	c9ceb9b75f	[DebugInfo] Remove intrinsic-flavours of findDbgUsers (#149816 ) This is one of the final remaining debug-intrinsic specific codepaths out there, and pieces of cross-LLVM infrastructure to do with debug intrinsics.	2025-07-21 17:49:25 +01:00
Jeremy Morse	040bffc633	[DebugInfo][AMDGPU] Convert a debug-intrinsic method to debug records (#149505 ) It appears this wasn't handled in the initial migration a year ago, seemingly because it didn't lead to any test failures. Find and interpret debug records in the same way the original code handled intrinsics. Note that we drop a call to copyMetadata: debug records can't carry additional metadata like instructions, nothing relies on this in AMDGPU AFAIUI.	2025-07-21 10:07:14 +01:00
Matt Arsenault	0fa0c3c233	AMDGPU: Use reportFatalUsageError in AMDGPULowerBufferFatPointers (#145132 )	2025-06-21 14:24:30 +09:00
Jeremy Morse	97ac6483aa	[DebugInfo][RemoveDIs] Delete debug-info-format flag (#143746 ) This flag was used to let us incrementally introduce debug records into LLVM, however everything is now using records. It serves no purpose now, so delete it.	2025-06-12 11:51:58 +01:00
Matt Arsenault	6b81483e28	AMDGPU: Start using LLVMContext errors in buffer fat pointer lowering (#142014 ) Avoid using report_fatal_error. Many more uses that should be converted in the pass remain.	2025-05-30 07:52:45 +02:00
Devon Loehr	63de20c0de	Reland "Add macro to suppress -Wunnecessary-virtual-specifier" (#141091 ) This fixes #139614 on non-clang compilers by moving `__has_warning` completely inside the `#if defined(__clang__)` block. This prevents a parse failure from compilers which don't recognize `__has_warning`. Original description: Followup to #138741. This adds the requested macro to silence `-Wunnecessary-virtual-specifier` when declaring virtual anchor functions in `final` classes, per [LLVM policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers). It also cleans up any remaining instances of the warning, allowing us to stop disabling it when we build LLVM.	2025-05-28 12:15:22 +02:00
Kazu Hirata	1e8e662174	[AMDGPU] Remove unused includes (NFC) (#141376 ) These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.	2025-05-24 14:48:46 -07:00
Philip Reames	e4e7a7e64e	Revert "Add macro to suppress -Wunnecessary-virtual-specifier (#139614 )" This reverts commit 0954c9d487e7cb30673df9f0ac125f71320d2936. It breaks the build when built with gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04).	2025-05-21 11:31:26 -07:00
Devon Loehr	0954c9d487	Add macro to suppress -Wunnecessary-virtual-specifier (#139614 ) Followup to #138741. This adds the requested macro to silence `-Wunnecessary-virtual-specifier` when declaring virtual anchor functions in `final` classes, per [LLVM policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers). It also cleans up any remaining instances of the warning, allowing us to stop disabling it when we build LLVM.	2025-05-21 10:54:36 -07:00
Krzysztof Drewniak	6b9da28b2b	[AMDGPU][LowerBufferFatPointers] Handle addrspacecast null to p7 (#140775 ) Some application code operating on generic pointers (that then get initialized to buffer fat pointers) may perform tests against nullptr. After address space inference, this results in comparisons against `addrspacecast (ptr null to ptr addrspace(7))`, which were crashing. However, while general casts to ptr addrspace(7) from generic pointers aren't supported, it is possible to cast null pointers to the all-zeroes buffer resource and 0 offset, which this patch adds. It also adds a TODO for casting _out_ of buffer resources, which isn't implemented here but could be.	2025-05-20 16:13:01 -07:00
Krzysztof Drewniak	4bdd116b80	[AMDGPU] Add a new amdgcn.load.to.lds intrinsic (#137425 ) This PR adds an amdgcn_load_to_lds intrinsic that abstracts over loads to LDS from global (address space 1) pointers and buffer fat pointers (address space 7), since they use the same API and "gather from a pointer to LDS" is something of an abstract operation. This commit adds the intrinsic and its lowerings for addrspaces 1 and 7, and updates the MLIR wrappers to use it (loosening up the restrictions on loads to LDS along the way to match the ground truth from target features). It also plumbs the intrinsic through to clang.	2025-05-19 07:15:04 -07:00
Jonathan Thackray	6e49f73825	Reland [llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#137701 ) This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.` and `llvm.minimum.` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details. Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.	2025-04-30 22:06:37 +01:00
Krzysztof Drewniak	94dc0a0e7b	[NFC][AMDGPU] Drop recursive types in LowerBufferFatPointers (#137735 ) Now that IRMover and the rest of LLVM don't allow recursive types, drop support for them from the clone of the IRMover code used when lowering buffer fat pointer operations.	2025-04-29 07:23:40 -07:00
Jonathan Thackray	7ee0097b48	Revert "[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions" (#137657 ) Reverts llvm/llvm-project#136759 due to bad interaction with c792b25e4	2025-04-28 16:53:36 +01:00
Jonathan Thackray	ba420d8122	[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#136759 ) This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.` and `llvm.minimum.` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details. Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.	2025-04-28 15:31:44 +01:00
Jay Foad	886f1199f0	[AMDGPU] Use variadic isa<>. NFC. (#137016 )	2025-04-24 08:19:09 +01:00
Kazu Hirata	e7c07a0210	[AMDGPU] Construct SmallVector with iterator ranges (NFC) (#136415 )	2025-04-19 09:09:41 -07:00
Krzysztof Drewniak	4a7b34d03c	Revert "[AMDGPU] Add buffer.fat.ptr.load.lds intrinsic wrapping raw rsrc version (#133015 )" (#134871 ) This reverts commit d1a05721172272f7aab685b56d99e86814a15bff. There was further discussion on the PR about whether the intrinsics should exist in this form.	2025-04-08 11:00:41 -05:00
Rahul Joshi	a3754ade63	[NFC][LLVM][AMDGPU] Cleanup pass initialization for AMDGPU (#134410 ) - Remove calls to pass initialization from pass constructors. - https://github.com/llvm/llvm-project/issues/111767	2025-04-07 17:27:50 -07:00
Krzysztof Drewniak	d1a0572117	[AMDGPU] Add buffer.fat.ptr.load.lds intrinsic wrapping raw rsrc version (#133015 ) Add a buffer_fat_ptr_load_lds intrinsic, by analogy with global_load_lds, which enables using `ptr addrspace(7)` to set the rsrc and offset arguments to raw_ptr_buffer_load_lds.	2025-04-07 15:42:22 -05:00
Krzysztof Drewniak	f23bb530cf	[AMDGPULowerBufferFatPointers] Use InstSimplifyFolder during rewrites (#134137 ) This PR updates AMDGPULowerBufferFatPointers to use the InstSimplifyFolder when creating IR during buffer fat pointer lowering. This shouldn't cause any large functional changes and might improve the quality of the generated code.	2025-04-03 10:12:18 -05:00
Pedro Lobo	73e23f899f	[AMDGPU] Change placeholder from `undef` to `poison` (#130858 ) Replace `undef` debug info with `poison`.	2025-03-12 12:53:27 +00:00
Krzysztof Drewniak	f8cc509b69	Reapply "[AMDGPU] Handle memcpy()-like ops in LowerBufferFatPointers (#126621 )" (#129078 ) This reverts commit 1559a65efaf327f9c72e14d4bb1834f076e7fc20. Fixed test (I suspect broken by unrelated change in the merge)	2025-02-27 11:26:13 -06:00
Kazu Hirata	1559a65efa	Revert "[AMDGPU] Handle memcpy()-like ops in LowerBufferFatPointers (#126621 )" This reverts commit 469757efafebdd5772d993fca4dc0dfa7cbda17c. Multiple buildbot failures have been reported: https://github.com/llvm/llvm-project/pull/126621	2025-02-26 14:35:07 -08:00
Krzysztof Drewniak	469757efaf	[AMDGPU] Handle memcpy()-like ops in LowerBufferFatPointers (#126621 ) Since LowerBufferFatPointers runs before PreISelIntrinsicLowering, which normally handles unsupported memcpy()s, and since you can't have a `noalias {ptr addrspace(8), i32}` because it crashes later passes, manually expand memcpy()s involving buffer fat pointers to loops. Additionally, though they're unlikely to be used, this commit adds support for memset(). This commit doesn't implement writing direct-to-LDS loads as the intrinsics, but leaves the option in the future.	2025-02-26 16:03:32 -06:00
Krzysztof Drewniak	f7d03707d1	[AMDGPU] Generalize amdgcn.make.buffer.rsrc to fat pointers (#126828 ) Attempting to pass a `ptr addrspace(7)` to functions that take `ptr` arguments produces undesirable `addrspacecast(addrspacecast(p8 x to p7) to p0) => addrspacecast(p8 x to p0)` folds. This results in illegal GEP operations on buffer resources, which can't be GEP'd. (However, note that, while unimplemented, addrspacecast from ptr addrspace(7) to ptr is legal - it's just an effective address computation.) To resolve this problem, and thus prevent illegal `getelementptr T, ptr addrspace(8) %x, ...` s from being produced, this commit extends amdgcn.make.buffer.rsrc to also be variadic in its result type, auto-upgrading old manglings. The logic for handling a make.buffer.rsrc in instruction selection remains untouched and expects the output type to be a ptr addrspace(8), as does the Clang lowering for its builtin (the pointer-to-pointer version might want a different name in clang). LowerBufferFatPointers has been updated to lower amdgcn.make.buffer.rsrc.p7.p* to amdgcn.make.buffer.rsrc.p8.p* . This'll also make exposing buffer fat pointers in Clang easier, since you don't have to cast between a `__amdgcn_rsrc_t` and a pointer.	2025-02-18 14:15:28 -06:00
Krzysztof Drewniak	934c97dd16	[LowerBufferFatPointers] Fix support for GEP T, p7, <N x T> idxs (#126126 ) The lowering for GEP didn't properly support the case where the pointer argument was being implicitly broadcast by a vector of indices. Fix that. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-02-11 18:22:50 -06:00
Krzysztof Drewniak	697c1883f1	Reapply "[AMDGPU] Handle natively unsupported types in addrspace(7) lowering" (#123660 ) (#123657) This reverts commit 64749fb01538fba2b56d9850497d5f3a626cabc2. Adds a constructor to VecSlice to address the failure	2025-01-20 16:12:17 -06:00
Krzysztof Drewniak	64749fb015	Revert "[AMDGPU] Handle natively unsupported types in addrspace(7) lowering" (#123657 ) Reverts llvm/llvm-project#110572 Seem to have broken a buildbot, not sure why https://lab.llvm.org/buildbot/#/builders/108/builds/8346	2025-01-20 13:14:04 -05:00
Krzysztof Drewniak	3805355ef6	[AMDGPU] Handle natively unsupported types in addrspace(7) lowering (#110572 ) The current lowering for ptr addrspace(7) assumed that the instruction selector can handle arbitrary LLVM types, which is not the case. Code generation can't deal with - Values that aren't 8, 16, 32, 64, 96, or 128 bits long - Aggregates (this commit only handles arrays of scalars, more may come) - Vectors of more than one byte - 3-word values that aren't a vector of 3 32-bit values (for example, a <6 x half>) This commit adds a buffer contents type legalizer that adds the needed bitcasts, zero-extensions, and splits into subcomponents needed to convert a load or store operation into one that can be successfully lowered through code generation. In the long run, some of the involved bitcasts (though potentially not the buffer operation splitting) ought to be handled by the instruction legalizer, but SelectionDAG makes this difficult. It also takes advantage of the new `nuw` flag on `getelementptr` when lowering GEPs to offset additions. We don't currently plumb through `nsw` on GEPs since that should likely be a separate change and would require declaring what we mean by "the address" in the context of the GEP guarantees.	2025-01-20 11:33:35 -06:00
Nikita Popov	4f614a8f7c	[AMDGPULowerBufferFatPointers] Use typeIncompatible() (#122902 ) Use typeIncompatible() to drop attributes incompatible with the new argument/return type, instead of keeping a custom list.	2025-01-14 16:55:49 +01:00
Acim Maravic	cc3aab580b	[AMDGPU] Handle nontemporal and amdgpu.last.use metadata in amdgpu-lower-buffer-fat-pointers (#120139 )	2025-01-14 11:22:20 +01:00
Krzysztof Drewniak	3b0f506c87	[AMDGPU] Support `nuw` and `nusw` in buffer fat pointer lowering (#115039 ) This commit uses the `nuw` flag on `getelementptr` to set the `nuw` flag on buffer offset additions, and also moves from `inbounds` to the looser `nusw` for the existing case.	2024-11-06 11:42:47 -06:00
Kazu Hirata	e1fdaaafc5	[AMDGPU] Work around a warning This patch works around: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:1101:13: error: enumeration values 'USubCond' and 'USubSat' not handled in switch [-Werror,-Wswitch] I've notified the author in #105568.	2024-09-06 09:35:13 -07:00
Jessica Del	ec7f8e1113	[AMDGPU] Add intrinsic for raw atomic buffer loads (#97707 ) Upstream the intrinsics `llvm.amdgcn.raw.atomic.buffer.load` and `llvm.amdgcn.raw.atomic.ptr.buffer.load`. These additional intrinsics mark atomic buffer loads as atomic to LLVM by removing the `IntrReadMem` attribute. Otherwise, it could hoist these intrinsics out of loops in cases where LLVM marks them as invariant. That can cause issues such as infinite loops. Continuation of https://reviews.llvm.org/D138786 with the additional use in the fat buffer lowering, more test cases and the additional ptr versions of these intrinsics. --------- Co-authored-by: rtayl <> Co-authored-by: Jay Foad <jay.foad@amd.com> Co-authored-by: Mariusz Sikora <mariusz.sikora@amd.com>	2024-07-22 18:04:49 +02:00
Jay Foad	6bba44e8dc	[AMDGPU] Use member initializers. NFC.	2024-07-16 15:29:10 +01:00
Nikita Popov	2d209d964a	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902 ) This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.	2024-06-27 16:38:15 +02:00
Nikita Popov	5ef768d22b	[AMDGPULowerBufferFatPointers] Expand const exprs using fat pointers (#95558 ) Expand all constant expressions that use fat pointers upfront, so that the rewriting logic only has to deal with instructions and not the constant expression variants as well. My primary motivation is to remove the creation of illegal constant expressions (mul and shl) from this pass, but this also cuts down quite a bit on the amount of duplicate logic.	2024-06-17 09:28:09 +02:00
Nikita Popov	0774000e32	[AMDGPULowerBufferFatPointers] Fix offset-only ptrtoint (#95543 ) For ptrtoint that truncates to the offset only, the expansion generated a shift by the bit width, which is poison. Instead, we should return the offset directly. (The same problem exists for the constant expression case, but I plan to address that separately, and more comprehensively.)	2024-06-14 16:38:57 +02:00
Nikita Popov	1ceede3318	[AMDGPULowerBufferFatPointers] Don't try to preserve flags for constant expressions We expect all of these ConstantExpr ctors to fold away, don't try to preserve flags, especially as the flags are not correct.	2024-06-14 12:26:29 +02:00
Nikita Popov	cb3a6bded7	[AMDGPULowerBufferFatPointers] Restore zero offset special case OffAccum will never be nullptr now, instead check for a zero constant.	2024-06-12 10:30:23 +02:00
Nikita Popov	6fc63ab77d	[AMDGPULowerBufferFatPointers] Simplify and fix GEP offset emission (#95115 ) Use emitGEPOffset() to emit the GEP offset, which already has all the necessary logic. This also fixes the nuw flag incorrectly being set on the offset calculation, while only nsw is implied by inbounds.	2024-06-12 09:51:18 +02:00
Nikita Popov	8cdecd4d3a	[IR] Add getelementptr nusw and nuw flags (#90824 ) This implements the `nusw` and `nuw` flags for `getelementptr` as proposed at https://discourse.llvm.org/t/rfc-add-nusw-and-nuw-flags-for-getelementptr/78672. The three possible flags are encapsulated in the new `GEPNoWrapFlags` class. Currently this class has a ctor from bool, interpreted as the InBounds flag. This ctor should be removed in the future, as code gets migrated to handle all flags. There are a few places annotated with `TODO(gep_nowrap)`, where I've had to touch code but opted to not infer or precisely preserve the new flags, so as to keep this as NFC as possible and make sure any changes of that kind get test coverage when they are made.	2024-05-27 16:05:17 +02:00

56 Commits