773 Commits

Author SHA1 Message Date
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Nikita Popov
2c6eec219d
[Tests] Avoid lifetime intrinsics on non-allocas (NFC)
Don't rely on auto-upgrade, instead either remove unnecessary
casts or remove no longer applicable tests.
2025-07-23 15:05:43 +02:00
Nikita Popov
92c55a315e
[IR] Only allow lifetime.start/end on allocas (#149310)
lifetime.start and lifetime.end are primarily intended for use on
allocas, to enable stack coloring and other liveness optimizations. This
is necessary because all (static) allocas are hoisted into the entry
block, so lifetime markers are the only way to convey the actual
lifetimes.

However, lifetime.start and lifetime.end are currently *allowed* to be
used on non-alloca pointers. We don't actually do this in practice, but
just the mere fact that this is possible breaks the core purpose of the
lifetime markers, which is stack coloring of allocas. Stack coloring can
only work correctly if all lifetime markers for an alloca are
analyzable.

* If a lifetime marker may operate on multiple allocas via a select/phi,
we don't know which lifetime actually starts/ends and handle it
incorrectly (https://github.com/llvm/llvm-project/issues/104776).
* Stack coloring operates on the assumption that all lifetime markers
are visible, and not, for example, hidden behind a function call or
escaped pointer. It's not possible to change this, as part of the
purpose of lifetime markers is that they work even in the presence of
escaped pointers, where simple use analysis is insufficient.

I don't think there is any way to have coherent semantics for lifetime
markers on allocas, while also permitting them on arbitrary pointer
values.

This PR restricts lifetimes to operate on allocas only. As a followup, I
will also drop the size argument, which is superfluous if we always
operate on an alloca. (This change also renders various code handling
lifetime markers on non-alloca dead. I plan to clean up that kind of
code after dropping the size argument as well.)

In practice, I've only found a few places that currently produce
lifetimes on non-allocas:

* CoroEarly replaces the promise alloca with the result of an intrinsic,
which will later be replaced back with an alloca. I think this is the
only place where there is some legitimate loss of functionality, but I
don't think this is particularly important (I don't think we'd expect
the promise in a coroutine to admit useful lifetime optimization.)
* SafeStack moves unsafe allocas onto a separate frame. We can safely
drop lifetimes here, as SafeStack performs its own stack coloring.
* Similarly for AddressSanitizer, which also moves allocas into separate
memory.
* LSR sometimes replaces the lifetime argument with a GEP chain of the
alloca (where the offsets ultimately cancel out). This is just
unnecessary. (Fixed separately in
https://github.com/llvm/llvm-project/pull/149492.)
* InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast
of an alloca. I don't think this is necessary.
2025-07-21 15:04:50 +02:00
Andreas Jonson
0a067dc107
[Attributor] Swap range metadata to attribute for calls. (#108835)
2025-07-05 16:47:03 +02:00
zGoldthorpe
f393211454
[Reland][IPO] Added attributor for identifying invariant loads (#146584)
Patched and tested the `AAInvariantLoadPointer` attributor from #141800,
which identifies pointers whose loads are eligible to be marked as
`!invariant.load`.

The bug in the attributor was due to `AAMemoryBehavior` always
identifying pointers obtained from `alloca`s as having no writes. I'm
not entirely sure why `AAMemoryBehavior` behaves this way, but it seems
to be because it identifies the scope of an `alloca` to be limited to
only that instruction (and, certainly, no memory writes occur within the
`alloca` instruction). This patch just adds a check to disallow all loads
from `alloca` pointers from being marked `!invariant.load` (since any
well-defined program will have to write to stack pointers at some
point).
2025-07-01 17:46:19 -04:00
Wenju He
9d570d568b
[ValueTracking] Return true for AddrSpaceCast in canCreateUndefOrPoison (#144686)
In our downstream GPU target, the following IR is valid before instcombine
although the second addrspacecast causes UB.
  define i1 @test(ptr addrspace(1) noundef %v) {
    %0 = addrspacecast ptr addrspace(1) %v to ptr addrspace(4)
    %1 = call i32 @llvm.xxxx.isaddr.shared(ptr addrspace(4) %0)
    %2 = icmp eq i32 %1, 0
    %3 = addrspacecast ptr addrspace(4) %0 to ptr addrspace(3)
    %4 = select i1 %2, ptr addrspace(3) null, ptr addrspace(3) %3
    %5 = icmp eq ptr addrspace(3) %4, null
    ret i1 %5
  }
We have a custom optimization that replaces the invalid addrspacecast with
poison, and the IR is still valid since `select` stops poison propagation.

However, the instcombine pass optimizes `select` to `or`:
    %0 = addrspacecast ptr addrspace(1) %v to ptr addrspace(4)
    %1 = call i32 @llvm.xxxx.isaddr.shared(ptr addrspace(4) %0)
    %2 = icmp eq i32 %1, 0
    %3 = addrspacecast ptr addrspace(1) %v to ptr addrspace(3)
    %4 = icmp eq ptr addrspace(3) %3, null
    %5 = or i1 %2, %4
    ret i1 %5
The transform is invalid for our target.
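
The difference between the two forms can be modelled with a small sketch. This is an illustration only, not LLVM code: `POISON`, `select`, `icmp_eq`, `or_i1`, `before`, and `after` are hypothetical names, and the model assumes only that `select` returns the chosen operand while `icmp` and `or` propagate poison.

```python
class _Poison:
    """Sentinel standing in for an LLVM poison value (illustrative only)."""

    def __repr__(self):
        return "poison"


POISON = _Poison()
NULL = 0


def select(cond, if_true, if_false):
    # Only the chosen operand matters, so poison in the other arm is blocked.
    if cond is POISON:
        return POISON
    return if_true if cond else if_false


def icmp_eq(a, b):
    # A comparison with a poison operand is poison.
    if a is POISON or b is POISON:
        return POISON
    return a == b


def or_i1(a, b):
    # A plain `or` propagates poison from either operand.
    if a is POISON or b is POISON:
        return POISON
    return a or b


def before(is_not_shared, cast_to_shared):
    # %4 = select %2, null, %3 ; %5 = icmp eq %4, null
    return icmp_eq(select(is_not_shared, NULL, cast_to_shared), NULL)


def after(is_not_shared, cast_to_shared):
    # %4 = icmp eq %3, null ; %5 = or %2, %4
    return or_i1(is_not_shared, icmp_eq(cast_to_shared, NULL))
```

With a poison cast and a true condition, `before` yields `True` while `after` yields poison, which is the mismatch described above.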

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-24 08:43:47 +08:00
zGoldthorpe
00ae89a1cb
Revert "[IPO] Added attributor for identifying invariant loads" (#144808)
Reverts llvm/llvm-project#141800

The implementation critically misunderstands the `AAMemoryBehavior`
attributor, which it relies on heavily.

@shiltian, since I do not have commit permissions.
2025-06-18 18:35:01 -04:00
zGoldthorpe
25dcd231bf
[IPO] Added attributor for identifying invariant loads (#141800)
The attributor conservatively marks pointers whose loads are eligible to
be marked as `!invariant.load`.
It does so by identifying:
1. Pointers marked `noalias` and `readonly`
2. Pointers whose underlying objects are all eligible for invariant
loads.

The attributor then manifests this attribute at non-atomic non-volatile
load instructions.
2025-06-16 11:16:47 -05:00
Craig Topper
b8a4a3b99c
[ValueTracking] Support scalable vector splats of ConstantInt/ConstantFP in isGuaranteedNotToBeUndefOrPoison. (#142894)
Scalable vectors use insertelt+shufflevector ConstantExpr to
represent a splat.
2025-06-05 22:08:03 -07:00
Craig Topper
d5d6f60632
[ValueTracking] Support scalable vectors for ExtractElement in computeKnownFPClass. (#143051)
We can support scalable vectors by setting the demanded mask to APInt(1,
1) to demand the whole vector.
2025-06-05 20:48:07 -07:00
Shilei Tian
d2992423e3
[Attributor] Don't replace addrspacecast (ptr null to ptr addrspace(x)) with ptr addrspace(x) null (#126779)
`ConstantPointerNull` represents a pointer with value 0, but it doesn’t
necessarily mean a `nullptr`. `ptr addrspace(x) null` is not the same as
`addrspacecast (ptr null to ptr addrspace(x))` if the `nullptr` in AS X
is not zero. Therefore, we can't simply replace it.
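
A minimal sketch of why the two constants differ, assuming a hypothetical target whose address space 3 uses the all-ones bit pattern as its null pointer. The constant `AS3_NULL_BITS` and both function names are made up for illustration, and the handling of non-null pointers is a placeholder.

```python
# Assumption: on this made-up target, the AS3 null pointer is all ones.
AS3_NULL_BITS = 0xFFFFFFFF


def addrspacecast_generic_to_as3(bits: int) -> int:
    """Cast a generic pointer to AS3, mapping generic null to the AS3 null."""
    if bits == 0:
        return AS3_NULL_BITS
    # Placeholder mapping for non-null pointers: keep the low 32 bits.
    return bits & 0xFFFFFFFF


def constant_pointer_null_as3() -> int:
    """`ptr addrspace(3) null`: a pointer whose value is 0."""
    return 0
```

Under that assumption, `addrspacecast_generic_to_as3(0)` and `constant_pointer_null_as3()` return different bit patterns, so folding one into the other would change the pointer value.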

Fixes #115083.
2025-05-20 18:08:42 -04:00
Matt Arsenault
609a8331a0
ValueTracking: Handle minimumnum and maximumnum in computeKnownFPClass (#138737)
For now use the same treatment as minnum/maxnum, but these should
diverge. alive2 seems happy with this, except for some preexisting bugs
with weird denormal modes.
2025-05-07 08:02:24 +02:00
Matt Arsenault
03f3f15690
ValueTracking: Add baseline tests for minimumnum/maximumnum (#138736)
Mostly copied from existing min/max tests, with a few additions.
2025-05-07 07:59:25 +02:00
Matt Arsenault
37b135cc8f
Attributor: Don't rely on use_empty for constants (#137218)
This allows inferring noalias on a null argument parameter. This
avoids a non-NFC diff in a future change.
2025-04-24 21:41:55 +02:00
Nikita Popov
d69ee885cc
[CaptureTracking] Remove dereferenceable_or_null special case (#135613)
Remove the special case where comparing a dereferenceable_or_null
pointer with null results in captures(none) instead of
captures(address_is_null).

This special case is not entirely correct. Let's say we have an
allocated object of size 2 at address 1 and have a pointer `%p` pointing
either to address 1 or 2. Then passing `gep p, -1` to a
`dereferenceable_or_null(1)` function is well-defined, and allows us to
distinguish between the two possible pointers, capturing information
about the address.

Now that we ignore address captures in alias analysis, I think we're
ready to drop this special case. Additionally, if there are regressions
in other places, the fact that this is inferred as address_is_null
should allow us to easily address them if necessary.
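
The counterexample can be replayed with a toy model. This is a sketch with hypothetical helper names: addresses are plain integers and the only allocated object is the size-2 object at address 1 from the description.

```python
OBJECT_START = 1  # the allocated object occupies addresses 1 and 2
OBJECT_SIZE = 2


def is_valid_deref_or_null_arg(addr: int, nbytes: int) -> bool:
    """A dereferenceable_or_null(nbytes) argument is null or points to at
    least nbytes bytes inside the allocated object."""
    if addr == 0:
        return True
    return OBJECT_START <= addr and addr + nbytes <= OBJECT_START + OBJECT_SIZE


def callee_null_check(arg: int) -> bool:
    """The callee does nothing but compare its argument with null."""
    return arg == 0


def observe(p: int) -> bool:
    arg = p - 1  # models `gep p, -1`
    assert is_valid_deref_or_null_arg(arg, 1)  # the call is well-defined
    return callee_null_check(arg)
```

Here `observe(1)` and `observe(2)` give different answers, so the null comparison reveals which address `%p` held.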
2025-04-17 12:44:57 +02:00
Matt Arsenault
34e8f00066
Attributor: Propagate align to cmpxchg instructions (#134838)
Fixes #134480
2025-04-08 22:15:50 +07:00
Matt Arsenault
66f0343609
Attributor: Propagate align to atomicrmw instructions (#134837)
Partially fixes #134480
2025-04-08 22:12:20 +07:00
Matt Arsenault
2cf4254466
Attributor: Add baseline tests for propagating align to atomics (#134836)
2025-04-08 22:08:11 +07:00
Matt Arsenault
783201b184
Attributor: Don't follow uses of ConstantData (#134573)
These should not really have uselists, and it's not worth the compile
time of looking at all uses of trivial constants. The main observable
change of this is it no longer adds align attributes on constant null
uses, but those are not useful. Some of these cases should potentially
be more aggressive and not look at any Constant users.
2025-04-07 23:59:53 +07:00
Jeremy Morse
792a6f8119
[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298)
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.

Nowadays, non-intrinsic format is the default and has been on for more
than a year, so there's no need for this flag to exist.

(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
2025-03-14 15:50:49 +00:00
Pierre van Houtryve
5470dffda2
[Attributor] Do not optimize away externally_initialized loads. (#128170)
Fixes SWDEV-515029
2025-03-03 14:58:47 +01:00
Nikita Popov
abd97d9685
[CaptureTracking] Take non-willreturn calls into account
We can leak one bit of information about the address by either
diverging or not.

Part of https://github.com/llvm/llvm-project/issues/129090.
2025-02-28 11:15:28 +01:00
Yeaseen
951ba3e9fa
[llvm] Remove undef from some llvm/test/Transforms tests (#125460)
This PR replaces some instances of `undef` with function argument value
or poison or concrete values in several tests under
`llvm/test/Transforms/` directory.
2025-02-03 08:18:23 +00:00
Nikita Popov
8a43d0e873
[Attributor] Check correct IRPosition in AANoCapture::isImpliedByIR()
This case is intended to check the callee argument, not the call-site.

Fixes an issue introduced in #123181.
2025-01-29 17:34:10 +01:00
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
Paul Walker
56c091ea71
[LLVM][IR] Use splat syntax when printing ConstantExpr based splats. (#116856)
This brings the printing of scalable vector constant splats in line with
their fixed length counterparts.
2024-11-21 11:21:12 +00:00
Lee Wei
58ca7078ce
[llvm] Remove br i1 undef from some regression tests [NFC] (#115688)
This PR aims to remove undefined behavior from tests.
2024-11-12 08:41:27 +00:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)
2024-11-06 11:53:33 +00:00
David Green
0f919444ad
[ValueTracking] Handle recursive phis in knownFPClass (#114008)
As a follow-on to #113686, this breaks the recursion between phi nodes
that have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be
calculated from the classes of p1 and p2.
2024-11-01 13:38:29 +00:00
David Green
9735c05186
[ValueTracking] Compute KnownFP state from recursive select/phi. (#113686)
Given a recursive phi with select:
 %p = phi [ 0, entry ], [ %sel, loop]
 %sel = select %c, %other, %p

The fp state can be calculated using the knowledge that the select/phi
pair can only be the initial state (0 here) or from %other. This adds a
short-cut into computeKnownFPClass for PHI to detect that the select is
recursive back to the phi, and if so use the state from the other
operand.
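
The short-cut can be checked against a naive fixed-point iteration in a small model. This is a sketch, not the `computeKnownFPClass` code: floating-point classes are modelled as plain Python sets of strings and both function names are hypothetical.

```python
def fixed_point_phi_classes(init: set, other: set) -> set:
    """Iterate the phi/select pair until the possible classes stop growing."""
    phi = set(init)
    while True:
        sel = set(other) | phi     # %sel = select %c, %other, %p
        new_phi = set(init) | sel  # %p = phi [ init, entry ], [ %sel, loop ]
        if new_phi == phi:
            return phi
        phi = new_phi


def shortcut_phi_classes(init: set, other: set) -> set:
    """Short-cut: the phi is either the initial value or comes from %other."""
    return set(init) | set(other)
```

For example, with `init = {"zero"}` and `other = {"normal", "subnormal"}` both functions return the same set, and NaN is never included.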

This helps to address a regression from #83200.
2024-10-31 07:50:44 +00:00
David Green
f358422268
[Attributor] Add nofpclass test for phi+select recurrences. NFC
2024-10-30 08:10:35 +00:00
Yingwei Zheng
8d8bb4032b
[Verifier] Verify attribute denormal-fp-math[-f32] (#112310)
Some typos are also fixed. Address
https://github.com/llvm/llvm-project/pull/112067#pullrequestreview-2363722447.
2024-10-15 17:32:16 +08:00
Johannes Doerfert
335e137267
[Attributor][FIX] Track returned pointer offsets (#110534)
If the pointer returned by a function is not "the base pointer" but has
an offset, we need to track the offset such that users can apply it to
their offset chain when they create accesses.
This was reported by @ye-luo and reduced test cases are included. The
OffsetInfo was moved and the container was replaced with a set to avoid
excessive growth. Otherwise, the patch just replaces the "returns
pointer" flag with the "returned offsets" and applies those offsets at
the call site.
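
As a rough illustration of applying returned offsets at the call site, the sketch below combines every offset the callee may return with every offset the caller then applies. The function name and the set-of-integers representation are assumptions for illustration, not the actual `OffsetInfo` API.

```python
def offsets_at_call_site(returned_offsets: set, user_offsets: set) -> set:
    """Combine the callee's returned offsets with the caller's own offsets.

    Each result is a possible offset from the base pointer at which the
    caller's access may land.
    """
    return {r + u for r in returned_offsets for u in user_offsets}
```

If the callee returns `base + 8` and the caller accesses the result at offsets 0 and 4, the accesses are at offsets 8 and 12 from the base rather than 0 and 4.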

---------

Co-authored-by: Johannes Doerfert <jdoerfert@llnl.gov>
2024-10-01 12:41:15 -05:00
Shilei Tian
0b7a18bd4a
[Attributor] Use more appropriate approach to check flat address space (#108713)
2024-09-27 18:26:55 -04:00
Yonghong Song
becc02ce93
Revert "[Transforms][IPO] Add func suffix in ArgumentPromotion and DeadArgume… (#105742)"
This reverts commit 959448fbd6bc6f74fb3f9655b1387d0e8a272ab8.
Reverting because multiple test failures e.g.
  https://lab.llvm.org/buildbot/#/builders/187/builds/1290
  https://lab.llvm.org/buildbot/#/builders/153/builds/9389
and maybe a few others.
2024-09-19 03:54:13 -07:00
yonghong-song
959448fbd6
[Transforms][IPO] Add func suffix in ArgumentPromotion and DeadArgume… (#105742)
…ntElimination

ArgumentPromotion and DeadArgumentElimination passes could change
function signatures but the function name remains the same as before the
transformation. This makes tracing with bpf programs hard, since users
tend to rely on the function signature in the source. See discussion [1]
for details.

This patch adds a suffix to functions whose signatures are changed. The
suffix lets users know that the function signature has changed and that
they need to inspect the IR or binary to find the modified signature
before tracing those functions.

The suffix for ArgumentPromotion is ".argprom" and the suffixes for
DeadArgumentElimination are ".argelim" and ".retelim". The suffix also
gives user hints about what kind of transformation has been done.

With this patch, I built a recent linux kernel with full LTO enabled. I
got 4 functions with only argpromotion like
```
  set_track_update.argelim.argprom
  pmd_trans_huge_lock.argprom
  ...
```
I got 1058 functions with only deadargelim like
```
  process_bit0.argelim
  pci_io_ecs_init.argelim
  ...
```
I got 3 functions with both argpromotion and deadargelim
```
  set_track_update.argelim.argprom
  zero_pud_populate.argelim.argprom
  zero_pmd_populate.argelim.argprom
```
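
A tracing script could recover the applied transformations from a symbol name along these lines. This is a hypothetical helper, not part of the patch: it assumes only the three suffixes named above and ignores any other dot-separated components.

```python
# Suffixes named in the commit message, mapped to the pass that adds them.
KNOWN_SUFFIXES = {
    "argprom": "ArgumentPromotion",
    "argelim": "DeadArgumentElimination",
    "retelim": "DeadArgumentElimination",
}


def passes_from_symbol(symbol: str) -> list:
    """Return the passes implied by a symbol's suffixes, in suffix order."""
    parts = symbol.split(".")[1:]  # drop the original function name
    return [KNOWN_SUFFIXES[p] for p in parts if p in KNOWN_SUFFIXES]
```

For `set_track_update.argelim.argprom` this reports DeadArgumentElimination followed by ArgumentPromotion.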

  [1] https://github.com/llvm/llvm-project/issues/104678
2024-09-19 10:21:58 +02:00
Johannes Doerfert
56a033462e
[Attributor] Keep track of reached returns in AAPointerInfo (#107479)
Instead of visiting call sites in Attribute::checkForAllUses, we now
keep track of returns in AAPointerInfo and use the call site return
information as required. This way, the user of
AAPointerInfo(CallSite)Argument can determine if the call return should
be visited. We do not collect them as "may accesses" in the
AAPointerInfo(CallSite)Argument itself in case a return user is found.
2024-09-10 08:13:21 -07:00
Johannes Doerfert
84bf0da34d
[Attributor][FIX] Ensure to always translate call site arguments (#107323)
When we propagate call site arguments we always need to translate them.
This is important because we ended up picking the function argument for a
recursive call, not the call site argument. `@recBad` and `@recGood` in
`returned.ll` show the problem, as they used to be transformed the same
way. The restructuring cleans the code up and helps derive more
"returned" arguments and better information in the presence of recursive
calls. The "dropped" attributes are simply dropped because we do not
query them anymore, not because we cannot derive them.
2024-09-05 13:37:21 -07:00
Johannes Doerfert
e6dece9f69
[Attributor][FIX] Mark "may" accesses through call sites as such (#107439)
Before, we kept the call site access kind (may/must) when we translated
the access. However, the pointer we access it through (by passing it to
the callee) might not be the underlying object. We have similar logic
when we add store and load accesses.
2024-09-05 13:33:58 -07:00
Johannes Doerfert
3726f9c575
[Attributor][NFC] Pre-commits for #107439 (#107457)
2024-09-05 13:10:37 -07:00
Alex MacLean
369d8148e0
[ValueTracking] use KnownBits to compute fpclass from bitcast (#97762)
When we encounter a bitcast from an integer type we can use the
information from `KnownBits` to glean some information about the
fpclass:
- If the sign bit is known, we can transfer this information over. 
- If the float is IEEE format and enough of the bits are known, we may
  be able to prove or rule out some fpclasses such as NaN, Zero, or Inf.
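
The two bullets can be sketched for a 32-bit IEEE float as follows. This is a simplified model rather than the ValueTracking implementation: known bits are passed as two integer masks, and the function name and result keys are hypothetical.

```python
# Bit layout of an IEEE-754 binary32 value.
SIGN_MASK = 0x80000000
EXP_MASK = 0x7F800000
MANT_MASK = 0x007FFFFF


def fp_facts_from_known_bits(known_zero: int, known_one: int) -> dict:
    """Derive conservative float facts from bits proven to be 0 or 1."""
    if known_zero & known_one:
        raise ValueError("a bit cannot be known zero and known one")

    # A known sign bit transfers directly.
    if known_one & SIGN_MASK:
        sign_bit = 1
    elif known_zero & SIGN_MASK:
        sign_bit = 0
    else:
        sign_bit = None

    # NaN and Inf both need an all-ones exponent.
    exp_may_be_all_ones = (known_zero & EXP_MASK) == 0
    # Inf needs a zero mantissa, NaN a non-zero one.
    mant_may_be_zero = (known_one & MANT_MASK) == 0
    mant_may_be_nonzero = (known_zero & MANT_MASK) != MANT_MASK

    return {
        "sign_bit": sign_bit,
        "may_be_nan": exp_may_be_all_ones and mant_may_be_nonzero,
        "may_be_inf": exp_may_be_all_ones and mant_may_be_zero,
        # Zero needs every exponent and mantissa bit to be 0.
        "may_be_zero": (known_one & (EXP_MASK | MANT_MASK)) == 0,
    }
```

A single known-zero exponent bit is enough to rule out both NaN and Inf, while a single known-one mantissa bit rules out Inf and Zero.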
2024-08-30 07:34:49 -07:00
Johannes Doerfert
8266d47cd1
[Attributor] Improve AAUnderlyingObjects (#104835)
- Allocas and GlobalValues cannot be simplified, so we should not try.
- If we never used any assumed state, the AAUnderlyingObjects doesn't
require an additional update.
- If we have seen an object (or its underlying object) before, we do
not need to inspect it anymore.

The original logic for "SeenObjects" was flawed and caused us to add
intermediate values to the underlying object list if a PHI or select
instruction referenced the same underlying object twice. The test
changes are all instances of this situation and we now correctly derive
`memory(none)` for the functions that only access stack memory.

---------

Co-authored-by: Shilei Tian <i@tianshilei.me>
2024-08-20 12:05:20 -07:00
Nikita Popov
472c79ca52
[IR] Check that arguments of naked function are not used (#104757)
Verify that the arguments of a naked function are not used. They can
only be referenced via registers/stack in inline asm, not as IR values.
Doing so will result in assertion failures in the backend.

There's probably more that we should verify, though I'm not completely
sure what the constraints are (would it be correct to require that naked
functions are exactly an inline asm call + unreachable, or is more
allowed?)

Fixes https://github.com/llvm/llvm-project/issues/104718.
2024-08-20 09:29:05 +02:00
Shilei Tian
907c7eb311
[Attributor] Enable AAAddressSpace in OpenMPOpt (#104363)
This reverts commit e592c2dcf5b7d2da6c2564f5d9990aa34079bad4.

We can finally reland the PR since the issue that caused the PR to be
reverted has been resolved in
https://github.com/llvm/llvm-project/pull/104051.
2024-08-16 13:33:48 -04:00
Johannes Doerfert
7156bcf286
[Attributor][FIX] Ensure we do not use stale references (#104495)
When copying map entries, we might run into resizing and invalidate the
RHS of the assignment. We dealt with this before and now use the proper
helper to avoid the problem in another place.

Fixes: https://github.com/llvm/llvm-project/issues/104397
2024-08-15 18:45:36 -04:00
Matt Arsenault
f9060f1b7e
AMDGPU: Fix using wrong alloca address space in test (#102108)
2024-08-07 00:19:22 +04:00
Shilei Tian
9373a43218
[Attributor] Indicate optimistic fixed point if an instruction already has non-zero address space (#101589)
2024-08-01 22:55:09 -04:00
Vidush Singhal
c7633ddb28
[Attributor]: Ensure cycle info is not null when handling PHI in AAPointerInfo (#97321)
Ensure cycle info object is not null for simple PHI case

for the test:
`llvm/test/Transforms/Attributor/phi_bug_pointer_info.ll`

Debug info Before the change: 

```
Accesses by bin after update:
[8-12] : 1
     - 9 -   store i32 %0, ptr %field2, align 4
       - c:   %0 = load i32, ptr %val, align 4
[32-36] : 1
     - 9 -   store i32 %1, ptr %field8, align 4
       - c:   %1 = load i32, ptr %val2, align 4
[2147483647-4294967294] : 1
     - 6 -   %ret = load i32, ptr %x, align 4
       - c: <unknown>
```

Debug info After the change: 

```
Accesses by bin after update:
[8-12] : 2
     - 9 -   store i32 %0, ptr %field2, align 4
       - c:   %0 = load i32, ptr %val, align 4
     - 6 -   %ret = load i32, ptr %x, align 4
       - c: <unknown>
[32-36] : 2
     - 9 -   store i32 %1, ptr %field8, align 4
       - c:   %1 = load i32, ptr %val2, align 4
     - 6 -   %ret = load i32, ptr %x, align 4
       - c: <unknown>
```

Co-authored-by: Vidush Singhal <singhal2@ruby964.llnl.gov>
2024-07-01 17:20:34 -07:00
Fangrui Song
89e8e63f47
[Attributor] Stabilize llvm.assume output
Don't rely on the iteration order of DenseSet<StringRef>, which is not
guaranteed to be deterministic.
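
The general remedy for order-dependent output is to sort before emitting. The sketch below shows the idea with a Python set; the function name and the comma-separated rendering are assumptions for illustration, and the commit does not state that this is how the fix was made.

```python
def render_assumptions(assumptions: set) -> str:
    """Emit set members in sorted order so the output never depends on
    the container's internal iteration order."""
    return ",".join(sorted(assumptions))
```

The same set always renders to the same string, regardless of insertion order or hashing.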
2024-06-19 15:36:46 -07:00
Ethan Luis McDonough
b629d4b912
[Attributor] Prevent infinite loop in AAGlobalValueInfoFloating (#94941)
Global variables that reference themselves alongside a function that is
called indirectly can cause an infinite loop in
`AAGlobalValueInfoFloating`. The recursive reference is continually
pushed back into the workload, causing the attributor to hang
indefinitely.
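
One common way to keep a worklist from cycling on self-references is a visited set. The sketch below uses a hypothetical `references` mapping from a global to the values it refers to; it shows the general pattern and is not necessarily how this patch addresses the problem.

```python
from collections import deque


def reachable_values(start: str, references: dict) -> set:
    """Collect everything reachable from `start`, visiting each value once."""
    seen = {start}
    worklist = deque([start])
    while worklist:
        current = worklist.popleft()
        for ref in references.get(current, ()):
            # A self-reference or cycle is skipped instead of re-queued.
            if ref not in seen:
                seen.add(ref)
                worklist.append(ref)
    return seen
```

With `references = {"g": ["g", "f"], "f": []}` the traversal terminates even though `g` refers to itself.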
2024-06-18 09:36:42 -07:00