126 Commits

Author SHA1 Message Date
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
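
As a rough illustration of the IR change, a call such as `call void @llvm.lifetime.start.p0(i64 4, ptr %a)` becomes `call void @llvm.lifetime.start.p0(ptr %a)`. The sketch below is a plain text rewrite written for illustration only; it is not LLVM's auto-upgrade logic, and the helper name `drop_lifetime_size` and the regular expression are assumptions.

```python
import re

# Hypothetical helper (not part of LLVM): matches the start of a call to
# llvm.lifetime.start/end together with its leading "i64 <size>," operand.
_LIFETIME_CALL = re.compile(
    r"(@llvm\.lifetime\.(?:start|end)\.p\d+\()\s*i64\s+-?\d+\s*,\s*"
)


def drop_lifetime_size(ir_line):
    """Remove the explicit size operand from a textual lifetime call."""
    # Keep the callee and opening parenthesis, drop the size operand.
    return _LIFETIME_CALL.sub(r"\1", ir_line)


if __name__ == "__main__":
    print(drop_lifetime_size("call void @llvm.lifetime.start.p0(i64 4, ptr %a)"))
```

Lines that do not call a lifetime intrinsic are returned unchanged.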
2025-08-08 11:09:34 +02:00
Nikita Popov
2c6eec219d [Tests] Avoid lifetime intrinsics on non-allocas (NFC)
Don't rely on auto-upgrade; instead, either remove unnecessary
casts or remove tests that no longer apply.
2025-07-23 15:05:43 +02:00
Shilei Tian
4d48673562 Reapply "Reapply "[AMDGPU] Make getAssumedAddrSpace return AS1 for pointer kernel arguments (#137488)""
This reverts commit 37ea3b32cdcb6c0dcecbcc4bf844f5190c7378dd.
2025-05-30 22:11:22 -04:00
Shilei Tian
37ea3b32cd Revert "Reapply "[AMDGPU] Make getAssumedAddrSpace return AS1 for pointer kernel arguments (#137488)""
This reverts commit 4efc13f8ff1eaf4f9fb1fcea8d4552b3eca052ca.
2025-05-30 22:06:16 -04:00
Shilei Tian
4efc13f8ff Reapply "[AMDGPU] Make getAssumedAddrSpace return AS1 for pointer kernel arguments (#137488)"
This reverts commit 3c6211c183885afb5d89259a53c4f4f46a6bf399.
2025-05-30 21:56:24 -04:00
Shilei Tian
3c6211c183 Revert "[AMDGPU] Make getAssumedAddrSpace return AS1 for pointer kernel arguments (#137488)"
This reverts commit 9bf6b2a8cb0467b62173659306e43a0346f063a2.
2025-05-30 21:15:25 -04:00
Shilei Tian
9bf6b2a8cb
[AMDGPU] Make getAssumedAddrSpace return AS1 for pointer kernel arguments (#137488) 2025-05-30 17:30:42 -04:00
Shilei Tian
84a69a0f8f
[AMDGPU] Move InferAddressSpacesPass to middle end optimization pipeline (#138604)
The pass will run twice in the non-LTO pipeline at `O1` or higher. In the LTO post-link pipeline, it will run once at `O2` or higher, since inlining and SROA don't run at `O1`.
2025-05-29 17:20:56 -04:00
Shilei Tian
0f1277d0b3
[NFC][AMDGPU] Move flat_atomic.ll to llvm/test/CodeGen/AMDGPU/ (#141126) 2025-05-23 08:15:26 -04:00
QiYue
758fea0e99
[InferAddressSpaces] Handle llvm.lifetime (#141045)
Co-authored-by: Zhenhao Yang <zhenhao.yang@nio.com>
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-05-22 18:06:01 +02:00
Krzysztof Drewniak
13c467b2cd
[AMDGPU] Add make.buffer.rsrc to InferAddressSpaces (#140770)
make.buffer.rsrc can be subjected to address space inference. There's
not _currently_ a reason to have this, but we might as well handle this
in case it comes up.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-05-20 16:13:37 -07:00
Alexander Richardson
07e2ba445d
[AMDGPU] Set AS8 address width to 48 bits
Of the 128 bits of the buffer descriptor, only 48 are address bits, so
following the discussion on https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54,
the logical conclusion is to set the index width to 48 bits instead of
the current value of 128.

Most of the test changes are mechanical datalayout updates, but there
is one actual change: the ptrmask test now uses .i48 instead of .i128
and I had to update SelectionDAGBuilder to correctly extend the mask.

Reviewed By: krzysz00

Pull Request: https://github.com/llvm/llvm-project/pull/139419
2025-05-19 17:26:05 -07:00
Krzysztof Drewniak
4bdd116b80
[AMDGPU] Add a new amdgcn.load.to.lds intrinsic (#137425)
This PR adds an amdgcn_load_to_lds intrinsic that abstracts over loads to
LDS from global (address space 1) pointers and buffer fat pointers
(address space 7), since they use the same API and "gather from a
pointer to LDS" is something of an abstract operation.

This commit adds the intrinsic and its lowerings for addrspaces 1 and 7,
and updates the MLIR wrappers to use it (loosening up the restrictions
on loads to LDS along the way to match the ground truth from target
features).

It also plumbs the intrinsic through to clang.
2025-05-19 07:15:04 -07:00
Alexander Richardson
ee13638362
[AMDGPU] Remove explicit datalayout from tests where not needed
Since e39f6c1844fab59c638d8059a6cf139adb42279a opt will infer the
correct datalayout when given a triple. Avoid explicitly specifying it
in tests that depend on the AMDGPU target being present to avoid the
string becoming out of sync with the TargetInfo value.
Only tests with REQUIRES: amdgpu-registered-target or a local lit.cfg
were updated to ensure that tests for non-target-specific passes that
happen to use the AMDGPU layout still pass when building with a limited
set of targets.

Reviewed By: shiltian, arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/137921
2025-04-30 10:58:17 -07:00
Shilei Tian
3570908519
[NFC][AMDGPU] Auto generate check lines for some codegen tests (#137534)
Preparation for #137488.
2025-04-28 09:25:05 -04:00
Matt Arsenault
15ba2ce7ac
InferAddressSpaces: Replace undef with poison in tests (#130083) 2025-03-06 23:20:46 +07:00
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
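
To illustrate the textual upgrade mentioned in the notes above, the following sketch rewrites `nocapture` to `captures(none)` in IR text. It is a simplified stand-in, not the IR reader's implementation; the function name `upgrade_nocapture` and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical pattern: the bare attribute keyword, but not a dotted name
# such as the MLIR attribute llvm.nocapture.
_NOCAPTURE = re.compile(r"(?<![\w.])nocapture\b")


def upgrade_nocapture(ir_text):
    """Replace the old attribute spelling with the new one in IR text."""
    return _NOCAPTURE.sub("captures(none)", ir_text)


if __name__ == "__main__":
    print(upgrade_nocapture("declare void @f(ptr nocapture %p)"))
```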
2025-01-29 16:56:47 +01:00
Matt Arsenault
e28e93550a
AMDGPU: Make vector_shuffle legal for v2i32 with v_pk_mov_b32 (#123684)
For VALU shuffles, this saves an instruction in some cases.
2025-01-23 20:58:02 +07:00
Nikita Popov
eeac0ffaf4 Revert "[MachineLICM] Use RegisterClassInfo::getRegPressureSetLimit (#119826)"
This reverts commit b4e17d4a314ed87ff6b40b4b05397d4b25b6636a.

This causes a large compile-time regression.
2025-01-10 09:05:06 +01:00
Pengcheng Wang
b4e17d4a31
[MachineLICM] Use RegisterClassInfo::getRegPressureSetLimit (#119826)
`RegisterClassInfo::getRegPressureSetLimit` is a wrapper around
`TargetRegisterInfo::getRegPressureSetLimit` with some logic to
adjust the limit by removing reserved registers.

It seems that we shouldn't use
`TargetRegisterInfo::getRegPressureSetLimit`
directly, just as the comment "This limit must be adjusted
dynamically for reserved registers" says.

Separate from https://github.com/llvm/llvm-project/pull/118787
2025-01-09 21:05:52 +08:00
Shilei Tian
6548b6354d Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.
2024-11-08 20:21:16 -05:00
Shilei Tian
ca33649abe Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both
hip and openmp buildbots.
2024-11-08 16:36:35 -05:00
Shilei Tian
e215a1e27d
[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403) 2024-11-08 13:05:35 -05:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
Matt Arsenault
1d0370872f
AMDGPU: Expand flat atomics that may access private memory (#109407)
If the runtime flat address resolves to a scratch address,
64-bit atomics do not work correctly. Insert a runtime address
space check (which is quite likely to be uniform) and select between
the non-atomic and real atomic cases.

Consider noalias.addrspace metadata and avoid this expansion when
possible (we also need to consider it to avoid infinitely expanding
after adding the predication code).
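
The control flow of this expansion can be modeled roughly as follows. This is a toy Python model, not the actual lowering: `ToyMemory`, `is_private`, `expanded_atomic_add`, and the `known_not_private` flag (standing in for noalias.addrspace metadata) are all hypothetical names introduced for illustration.

```python
import threading

# Toy address spaces; a "flat" address is modeled as (space, offset).
PRIVATE = "private"
GLOBAL = "global"


class ToyMemory:
    """Toy memory model; the lock stands in for hardware atomicity."""

    def __init__(self):
        self.cells = {}
        self.lock = threading.Lock()


def is_private(flat_addr):
    """Hypothetical runtime address-space check on a flat address."""
    return flat_addr[0] == PRIVATE


def expanded_atomic_add(mem, flat_addr, value, known_not_private=False):
    """Return the old value, like atomicrmw add.

    known_not_private models metadata that rules out the private address
    space, in which case the runtime check is skipped entirely.
    """
    if not known_not_private and is_private(flat_addr):
        # Private (scratch) path: plain load, add, store.
        old = mem.cells.get(flat_addr, 0)
        mem.cells[flat_addr] = old + value
        return old
    # Every other address space: the real atomic operation.
    with mem.lock:
        old = mem.cells.get(flat_addr, 0)
        mem.cells[flat_addr] = old + value
        return old
```

Skipping the check when the metadata is present also reflects why the expansion does not recurse: in the real transform the atomic emitted on the non-private path carries that metadata.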
2024-10-31 08:08:48 -07:00
Matt Arsenault
c198f775cd
AMDGPU: Remove flat/global fmin/fmax intrinsics (#105642)
These have been replaced with atomicrmw.
2024-10-09 09:27:28 +04:00
Matt Arsenault
a87640c97e
AMDGPU: Fix assertion on load of vector of pointers (#110436)
Fix InferAddressSpaces asserting on a load of a vector of flat
pointers.

Fixes #110433
2024-09-30 10:16:38 +04:00
Matt Arsenault
ee08d9cba5
AMDGPU: Remove global/flat atomic fadd intrinsics (#97051)
These have been replaced with atomicrmw.
2024-08-22 23:27:33 +04:00
Matt Arsenault
1db674b83d InferAddressSpaces: Convert test to generated checks
Also use named values
2024-08-16 15:05:41 +04:00
Matt Arsenault
2ccbf92f87 InferAddressSpaces: Restore non-instruction user check
Fixes regression after 79658d65c3c7a075382b74d81e74714e2ea9bd2d.
We were missing test coverage for the nested constant expression
case.
2024-08-15 15:55:09 +04:00
Matt Arsenault
7a51dde4e6
InferAddressSpaces: Improve handling of instructions with multiple pointer uses (#101922)
The use list iteration worked correctly for the load and store case. The atomic
instructions happen to have the pointer value as the last visited operand, but we
rejected the instruction as simple after the first encountered use.

Ignore the use list for the recognized load/store/atomic instructions, and just
try to directly replace the known pointer use.
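
A minimal sketch of the "replace the known pointer use directly" idea, under the assumption that each recognized instruction kind has a fixed pointer-operand position. The table and the function `rewrite_pointer_operand` are hypothetical; the indices are assumed to mirror the operand order of the corresponding LLVM instructions.

```python
# Assumed pointer-operand position for each recognized instruction kind.
POINTER_OPERAND_INDEX = {
    "load": 0,
    "store": 1,  # operand 0 is the stored value
    "atomicrmw": 0,
    "cmpxchg": 0,
}


def rewrite_pointer_operand(opcode, operands, new_ptr):
    """Replace only the pointer operand; return None if not recognized."""
    idx = POINTER_OPERAND_INDEX.get(opcode)
    if idx is None:
        return None  # not a simple memory instruction in this model
    rewritten = list(operands)
    rewritten[idx] = new_ptr
    return rewritten


if __name__ == "__main__":
    # Only the address of the cmpxchg changes; the compare and new values stay.
    print(rewrite_pointer_operand("cmpxchg", ["%p", "%cmp", "%new"], "%p.global"))
```

Because the replacement is positional, it does not matter in which order the uses of the old pointer happen to be visited.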
2024-08-08 13:19:35 +04:00
Matt Arsenault
2ef553c05f
InferAddressSpaces: Handle llvm.is.constant (#102010) 2024-08-06 00:20:01 +04:00
Matt Arsenault
47fc4c37bb
InferAddressSpaces: Handle masked load and store intrinsics (#102007) 2024-08-06 00:17:07 +04:00
Matt Arsenault
f01a6f5ecb
InferAddressSpaces: Handle prefetch intrinsic (#101982) 2024-08-06 00:14:02 +04:00
Matt Arsenault
3c483b887e
InferAddressSpaces: Fix mishandling stores of pointers to themselves (#101877) 2024-08-04 16:36:00 +04:00
Matt Arsenault
b1bcb7ca46 Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.

Drop the -O3 checks from default-attributes.hip. I don't know why they
are different on some bots but reverting this is far too disruptive.
2024-07-15 11:51:44 +04:00
dyung
adaff46d08
Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.

The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614

These bots have been broken for a day, so reverting to get everything
back to green.
2024-07-14 18:48:54 -07:00
Matt Arsenault
78bc1b64a6
AMDGPU: Move attributor into optimization pipeline (#83131)
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.
2024-07-14 08:36:33 +04:00
Shan Huang
a355c2d074
[DebugInfo][InferAddressSpaces] Fix the missing debug location update for the new addrspacecast (#97038)
Fixes #97006.
2024-07-03 09:39:17 +08:00
Matt Arsenault
eda9ff899f
AMDGPU: Flat instructions do not have signed offsets gfx7-gfx11 (#95852)
Fixes some atomicrmw fadd and intrinsic cases
2024-06-18 13:20:34 +02:00
Nikita Popov
deab451e7a
[IR] Remove support for icmp and fcmp constant expressions (#93038)
Remove support for the icmp and fcmp constant expressions.

This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179

As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
2024-06-04 08:31:03 +02:00
Nikita Popov
d10b76552f
[ConstantFold] Remove notional over-indexing fold (#93697)
The data-layout independent constant folding currently has some rather
gnarly code for canonicalizing GEP indices to reduce "notional
overindexing", and then infers inbounds based on that canonicalization.

Now that we canonicalize to i8 GEPs, this canonicalization is
essentially useless, as we'll discard it as soon as the GEP hits the
data-layout aware constant folder anyway. As such, I'd like to remove
this code entirely.

This shouldn't have any impact on optimization capabilities.
2024-05-30 08:36:44 +02:00
Nikita Popov
a49b5cad99 [InferAddressSpaces] Generate test checks (NFC) 2024-05-29 15:26:59 +02:00
Matt Arsenault
9f9856d623 AMDGPU: Update name for amdgpu.no.remote.memory metadata 2024-05-03 11:50:59 +02:00
Florian Hahn
c8e5ad4e12
Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709)"
This reverts commit 7dbba39e583a3fd64e7e6b947251c035e483f054.

Revert as there are reports this triggers during ThinLTO in some
configurations.
2024-04-22 10:50:49 +01:00
Matt Arsenault
f433c3b380
AMDGPU: Add tests for atomicrmw handling of new metadata (#89248)
Add baseline tests which should comprehensively test the new atomic
metadata. Test codegen / expansion, and preservation in a few
transforms.

New metadata defined in #85052
2024-04-20 00:43:36 +02:00
Julian Nagele
7dbba39e58
Reapply "[TBAA] Add verifier for tbaa.struct metadata (#86709)"
This reverts commit b9cd48f96acdd07c627ccafbf4386a1f3dcd6c51.

-------------------------------------------------------------
Original commit message:

Adds logic to the IR verifier that checks whether !tbaa.struct nodes are
well-formed. That is, it checks that the operands of !tbaa.struct nodes
are in groups of three, that each group of three operands consists of
two integers and a valid tbaa node, and that the regions described by
the offset and size operands are non-overlapping.

PR: https://github.com/llvm/llvm-project/pull/86709
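
The well-formedness rules described above can be sketched as follows. This is an illustrative Python model, not the verifier code: `TBAANode` is a hypothetical placeholder for a valid tbaa node, and `check_tbaa_struct` is a name invented for this example.

```python
from typing import List, Sequence


class TBAANode:
    """Hypothetical placeholder for a valid !tbaa node."""


def check_tbaa_struct(operands: Sequence[object]) -> List[str]:
    """Return a list of problems; an empty list means well-formed."""
    if len(operands) % 3 != 0:
        return ["operands must come in groups of three"]

    errors = []
    regions = []
    for i in range(0, len(operands), 3):
        offset, size, tag = operands[i:i + 3]
        if not (isinstance(offset, int) and isinstance(size, int)):
            errors.append("group %d: offset and size must be integers" % (i // 3))
            continue
        if not isinstance(tag, TBAANode):
            errors.append("group %d: third operand must be a tbaa node" % (i // 3))
        regions.append((offset, offset + size))

    # Sort by start offset; an overlap exists when a region starts before
    # the previous one has ended.
    regions.sort()
    for (_, prev_end), (start, _) in zip(regions, regions[1:]):
        if start < prev_end:
            errors.append("regions overlap")
            break
    return errors
```

For example, `check_tbaa_struct([0, 4, TBAANode(), 4, 4, TBAANode()])` reports no problems in this model, while changing the second offset to 2 reports an overlap.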
2024-04-15 11:25:06 +01:00
Florian Hahn
b9cd48f96a
Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709)"
This reverts commit df75183d70e029352a49c93f275db703c81a65c1.

Revert for now as this appears to cause failures on some buildbots,
e.g.:
https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio
2024-03-27 21:22:15 +00:00
Julian Nagele
df75183d70
[TBAA] Add verifier for tbaa.struct metadata (#86709)
Adds logic to the IR verifier that checks whether !tbaa.struct nodes are
well-formed. That is, it checks that the operands of !tbaa.struct nodes
are in groups of three, that each group of three operands consists of
two integers and a valid tbaa node, and that the regions described by
the offset and size operands are non-overlapping.

PR: https://github.com/llvm/llvm-project/pull/86709
2024-03-27 10:30:27 +01:00
Pierre van Houtryve
c831d83bb1
[InferAddrSpaces] Correctly replace identical operands of insts (#82610)
This is important for PHI nodes because if a PHI node has multiple edges
coming from the same block, we can have the same incoming value multiple
times in the list of incoming values. All of those need to be consistent
(exact same Value*); otherwise the verifier complains.

Fixes SWDEV-445797
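
The PHI constraint can be illustrated with a small model. This is a sketch under stated assumptions, not the pass itself: `Value`, `replace_all_uses_in_operands`, and `phi_is_consistent` are hypothetical names, and object identity in Python stands in for "exact same Value*".

```python
class Value:
    """Hypothetical stand-in for an IR value; identity matters, not name."""

    def __init__(self, name):
        self.name = name


def replace_all_uses_in_operands(operands, old, new):
    """Replace every operand that is the same object as `old`."""
    return [new if op is old else op for op in operands]


def phi_is_consistent(incoming):
    """incoming: list of (value, predecessor_block) pairs.

    Entries that share a predecessor block must carry the identical value.
    """
    seen = {}
    for value, block in incoming:
        if block in seen and seen[block] is not value:
            return False
        seen[block] = value
    return True
```

Replacing only the first occurrence of a repeated incoming value would leave two different values for the same predecessor block, which this model reports as inconsistent; replacing all identical operands keeps the PHI valid.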
2024-02-22 13:59:04 +01:00