61 Commits

Author SHA1 Message Date
Leon Clark
bd9668df0f
Reapply "[AMDGPU] Propagate alias information in AMDGPULowerKernelArguments." (#174977)
Emit `!noalias` and `!alias.scope` metadata for `noalias` kernel
arguments.

Fixes sanitizer issues in #161375.

---------

Co-authored-by: Leon Clark <leoclark@amd.com>
2026-01-20 16:39:08 +00:00
Nikita Popov
1d61ced4bb Revert "[AMDGPU] Propagate alias information in AMDGPULowerKernelArguments. (#161375)"
This reverts commit 9f4f13a793b53adef37dfb63c4e30dccfa98517b.

Broke sanitizer buildbots, and causes test hangs in release builds.
2025-12-22 11:45:05 +01:00
Leon Clark
9f4f13a793
[AMDGPU] Propagate alias information in AMDGPULowerKernelArguments. (#161375)
Emit `!noalias` and `alias.scope` metadata for `noalias` kernel
arguments.

---------

Co-authored-by: Leon Clark <leoclark@amd.com>
2025-12-21 00:29:11 +00:00
Jay Foad
72c69aefba
[AMDGPU] Make use of getFunction and getMF. NFC. (#167872) 2025-11-14 11:00:57 +00:00
Austin Kerbow
2c9a46cce3
[AMDGPU] Move kernarg preload logic to separate pass (#130434)
Moves kernarg preload logic to its own module pass. Cloned function
declarations are removed when preloading hidden arguments. The inreg
attribute is now added in this pass instead of AMDGPUAttributor. The
rest of the logic is copied from AMDGPULowerKernelArguments which now
only check whether an arguments is marked inreg to avoid replacing
direct uses of preloaded arguments. This change requires test updates to
remove inreg from lit tests with kernels that don't actually want
preloading.
2025-05-11 21:18:11 -07:00
Rahul Joshi
74b7abf154
[IRBuilder] Add new overload for CreateIntrinsic (#131942)
Add a new `CreateIntrinsic` overload with no `Types`, useful for
creating calls to non-overloaded intrinsics that don't need additional
mangling.
2025-03-31 08:10:34 -07:00
Scott Linder
eaa460ca49
[AMDGPU] Remove dead function metadata after amdgpu-lower-kernel-arguments (#126147)
The verifier ensures function !dbg metadata is unique across the module,
so ensure the old nameless function we leave behind doesn't violate
this invariant.

Removing the function via e.g. eraseFromParent seems like a better
option, but doesn't seem to be legal from a FunctionPass.
2025-02-17 13:27:23 -05:00
Austin Kerbow
b1d42465fc
[AMDGPU] Fix hidden kernarg preload count inconsistency (#116759)
It is possible that the number of hidden arguments that are selected to
be preloaded in AMDGPULowerKernel arguments and isel can differ. This
isn't an issue with explicit arguments since isel can lower the argument
correctly either way, but with hidden arguments we may have alignment
issues if we try to load these hidden arguments that were added to the
kernel signature.

The reason for the mismatch is that isel reserves an extra synthetic
user SGPR for module LDS.

Instead of teaching lowerFormalArguments how to handle these properly it
makes more sense and is less expensive to fix the mismatch and assert if
we ever run into this issue again. We should never be trying to lower
these in the normal way.

In a future change we probably want to revise how we track "synthetic"
user SGPRs and unify the handling in GCNUserSGPRUsageInfo. Sometimes
synthetic SGPRSs are considered user SGPRs and sometimes they are not.
Until then this patch resolves the inconsistency, fixes the bug, and is
otherwise a NFC.
2024-12-08 10:10:08 -08:00
Krzysztof Drewniak
87c21bf064
[AMDGPU] Preserve noundef and range during kernel argument loads (#118395)
This commit ensures than noundef (which is frequently a prerequisite for
other annotations) and range() annotations on kernel arguments are
copied onto their corresponding load from the kernel argument structure.
2024-12-04 11:04:03 -06:00
Kazu Hirata
be187369a0
[AMDGPU] Remove unused includes (NFC) (#116154)
Identified with misc-include-cleaner.
2024-11-13 21:10:03 -08:00
Rahul Joshi
6924fc0326
[LLVM] Add Intrinsic::getDeclarationIfExists (#112428)
Add `Intrinsic::getDeclarationIfExists` to lookup an existing
declaration of an intrinsic in a `Module`.
2024-10-16 07:21:10 -07:00
Austin Kerbow
c4d89203f3
[AMDGPU] Support preloading hidden kernel arguments (#98861)
Adds hidden kernel arguments to the function signature and marks them
inreg if they should be preloaded into user SGPRs. The normal kernarg
preloading logic then takes over with some additional checks for the
correct implicitarg_ptr alignment.

Special care is needed so that metadata for the hidden arguments is not
added twice when generating the code object.
2024-10-06 17:44:33 -07:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Stephen Tozer
d75f9dd1d2 Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Krzysztof Drewniak
e31bfc040a
[AMDGPU] Strengthen preload intrinsics to noundef and nonnull (#92801)
The various preloaded registers (workitem IDs, workgroup IDs, and
various implicit pointers) always have a finite, invariant, well-defined
value throughout a well-defined program.

In cases where the compiler infers or the user declares that some
implicit input will not be used (ex. via amdgcn-no-workitem-id-y), the
behavior of the entire program is undefined, since that misdeclaration
can cause arbitrary other preloaded-register intrinsics to access the
wrong register. This case is not expected to arise in practice, but
could occur when the no implicit argument attributes were not cleared
correctly in the presence of external functions, indrect calls, or other
means of executing un-analyzable code. Failure to detect that case would
be a bug in the attributor.

This commit updates the documentation to reflect this long-standing
reality.

Then, on the basis that all implicit arguments are defined in all
correct programs, the intrinsics that return those values are
annototated with `noundef``. Some implicit pointer arguments gain a
`nonnull`, but the kernel argument segment pointer or implicit argument
pointers don't necessarily have this property.

This will prevent spurious calls to `freeze` in front-end optimizations
that destroy user-provided ranges on built-in IDs.

(While I'm here, this commit adds a test for `noundef` on kernel
arguments which is currently unimplemented)
2024-06-03 16:37:08 -05:00
Austin Kerbow
4bcbeaed63
[AMDGPU] Enable kernel arg preloading with gfx90a (#81180)
Add a trap instruction to the beginning of the kernel prologue to handle
cases where preloading is attempted on HW loaded with incompatible
firmware.
2024-02-12 22:33:29 -08:00
Jeremy Morse
52a8bed426
[DebugInfo][RemoveDIs] Adjust AMDGPU passes to work with DPValues (#78736)
This patch tweaks two AMDGPU passes to use iterators rather than
instruction pointers for expressing an insertion point. This is needed
to accurately support DPValues, the non-instruction storage object for
debug-info.

Two tests were sensitive to this change (variable assignments were being
put in the wrong place), and I've added extra run-lines with the "try
new debug-info..." flag. These get tested on our public buildbot to
ensure they continue to work accurately.
2024-01-22 14:25:08 +00:00
Jay Foad
4a77414660
[AMDGPU] CodeGen for GFX12 8/16-bit SMEM loads (#77633) 2024-01-17 10:28:03 +00:00
Austin Kerbow
7b70af297a [AMDGPU] Add IR lowering changes for preloaded kernargs
Preloaded kernel arguments should not be lowered in the IR pass
AMDGPULowerKernelArguments. Therefore it's necessary to calculate the
total number of user SGPRs that are available for preloading and how
many SGPRs would be required to preload each argument to determine
whether we should skip lowering i.e. the argument will be preloaded
instead.

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D156853
2023-09-25 08:54:07 -07:00
Matt Arsenault
58e87c961e AMDGPU: Port AMDGPULowerKernelArguments to new pass manager
https://reviews.llvm.org/D157498
2023-08-09 18:34:30 -04:00
Matt Arsenault
71ba28eaac Revert "AMDGPU: Use generic helper for skipping over allocas"
This reverts commit aa7e09ebd38c5f23f6d7d6d8394a2aea04715ba9.
2023-06-22 18:15:19 -04:00
Matt Arsenault
aa7e09ebd3 AMDGPU: Use generic helper for skipping over allocas 2023-06-22 18:02:49 -04:00
Matt Arsenault
3d0350b762 AMDGPU: Add MF independent version of getImplicitParameterOffset 2023-06-07 08:26:31 -04:00
Matt Arsenault
70d9c62f54 AMDGPU: Don't need pointer bitcast in AMDGPULowerKernelArguments 2023-04-29 15:11:31 -04:00
Matt Arsenault
3ae5f74f54 AMDGPU: Don't try to create pointer bitcasts in kernarg lowering 2023-04-29 10:04:45 -04:00
Guillaume Chatelet
b55f83d013 [NFC] Remove Function::getParamAlignment
Differential Revision: https://reviews.llvm.org/D141696
2023-01-13 16:20:58 +00:00
Kazu Hirata
20cde15415 [Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:06 -08:00
Guillaume Chatelet
d154d0ac06 [NFC] Simplify code 2022-06-20 15:15:52 +00:00
Sebastian Neubauer
6527b2a4d5 [AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235
2022-02-18 15:05:21 +01:00
Arthur Eubanks
3f4d00bc3b [NFC] More get/removeAttribute() cleanup 2021-08-17 21:05:41 -07:00
Nikita Popov
9914200393 [CodeGen] Add missing includes (NFC)
These currently rely on the IRBuilder.h include in TargetLowering.h.
Make them explicit.
2021-06-06 15:48:27 +02:00
dfukalov
560d7e0411 [NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036
2021-01-20 22:22:45 +03:00
dfukalov
6a87e9b08b [NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Juneyoung Lee
420d046d6b clang-format, address warnings 2020-12-30 23:05:07 +09:00
Juneyoung Lee
9b29610228 Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.

Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)

The order is swapped, but in terms of correctness it is still fine.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93923
2020-12-30 22:36:08 +09:00
Matt Arsenault
1168119c2f AMDGPU: Start interpreting byref on kernel arguments
These are treated identically to value aggregates placed in the kernel
argument list. A %struct.foo or %struct.foo addrspace(4)*
byref(sizeof(%struct.foo)) align(alignof(%struct.foo)) argument should
produce the same offsets and argument metadata.

This handles all 3 kernel ABI implementations, and the two HSA
metadata emission paths.
2020-07-21 18:11:22 -04:00
Guillaume Chatelet
d3085c2501 [Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82956
2020-07-01 14:31:56 +00:00
Christopher Tetreault
aad9365482 [SVE] Eliminate calls to default-false VectorType::get() from AMDGPU
Reviewers: efriedma, david-arm, fpetrogalli, arsenm

Reviewed By: david-arm

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, tschuett, hiraditya, rkruppe, psnobl, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80328
2020-05-29 17:54:17 -07:00
Christopher Tetreault
3254a001fc [SVE] Remove usages of VectorType::getNumElements() from AMDGPU
Reviewers: efriedma, arsenm, david-arm, fpetrogalli

Reviewed By: efriedma

Subscribers: dmgreen, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, tschuett, hiraditya, rkruppe, psnobl, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79807
2020-05-13 15:57:55 -07:00
Matt Arsenault
074c371a48 AMDGPU: Insert kernarg code after allocas
This produces more normal looking IR by keeping all the allocas
clustered at the start of the block.
2020-05-06 10:19:56 -04:00
Christopher Tetreault
e634f482ea Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: arsenm, efriedma, sdesmalen

Reviewed By: arsenm

Subscribers: wdng, arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77268
2020-04-09 13:11:37 -07:00
Eli Friedman
1ee6ec2bf3 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
Guillaume Chatelet
279fa8e006 [Alignement][NFC] Deprecate untyped CreateAlignedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73260
2020-01-23 13:34:32 +01:00
Guillaume Chatelet
b65fa48305 [Alignment] Migrate Attribute::getWith(Stack)Alignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jdoerfert

Reviewed By: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68792

llvm-svn: 374884
2019-10-15 12:56:24 +00:00
Matt Arsenault
e4c2e9b016 AMDGPU: Consolidate some getGeneration checks
This is incomplete, and ideally these would all be removed, but it's
better to localize them to the subtarget first with comments about
what they're for.

llvm-svn: 363902
2019-06-19 23:54:58 +00:00
James Y Knight
7716075a17 [opaque pointer types] Pass value type to GetElementPtr creation.
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57173

llvm-svn: 352913
2019-02-01 20:44:47 +00:00
James Y Knight
14359ef1b6 [opaque pointer types] Pass value type to LoadInst creation.
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

llvm-svn: 352911
2019-02-01 20:44:24 +00:00
Matt Arsenault
cdd191d9db AMDGPU: Add DS append/consume intrinsics
Since these pass the pointer in m0 unlike other DS instructions, these
need to worry about whether the address is uniform or not. This
assumes the address is dynamically uniform, and just uses
readfirstlane to get a copy into an SGPR.

I don't know if these have the same 16-bit add for the addressing mode
offset problem on SI or not, but I've just assumed they do.

Also includes some misc. changes to avoid test differences between the
LDS and GDS versions.

llvm-svn: 352422
2019-01-28 20:14:49 +00:00
Chandler Carruth
2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00