491 Commits

Author SHA1 Message Date
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Nikita Popov
c1b387e23d
[MemoryLocation] Compute lifetime size from alloca size (#151982)
Split out from #150248:

Since #150944 the size passed to lifetime.start/end is considered
meaningless. The lifetime always applies to the whole alloca.

This adjusts MemoryLocation to determine the MemoryLocation size from
the alloca size, instead of using the argument.
2025-08-05 10:47:07 +02:00
Nikita Popov
92c55a315e
[IR] Only allow lifetime.start/end on allocas (#149310)
lifetime.start and lifetime.end are primarily intended for use on
allocas, to enable stack coloring and other liveness optimizations. This
is necessary because all (static) allocas are hoisted into the entry
block, so lifetime markers are the only way to convey the actual
lifetimes.

However, lifetime.start and lifetime.end are currently *allowed* to be
used on non-alloca pointers. We don't actually do this in practice, but
just the mere fact that this is possible breaks the core purpose of the
lifetime markers, which is stack coloring of allocas. Stack coloring can
only work correctly if all lifetime markers for an alloca are
analyzable.

* If a lifetime marker may operate on multiple allocas via a select/phi,
we don't know which lifetime actually starts/ends and handle it
incorrectly (https://github.com/llvm/llvm-project/issues/104776).
* Stack coloring operates on the assumption that all lifetime markers
are visible, and not, for example, hidden behind a function call or
escaped pointer. It's not possible to change this, as part of the
purpose of lifetime markers is that they work even in the presence of
escaped pointers, where simple use analysis is insufficient.

I don't think there is any way to have coherent semantics for lifetime
markers on allocas, while also permitting them on arbitrary pointer
values.

This PR restricts lifetimes to operate on allocas only. As a followup, I
will also drop the size argument, which is superfluous if we always
operate on an alloca. (This change also renders various code handling
lifetime markers on non-alloca dead. I plan to clean up that kind of
code after dropping the size argument as well.)

In practice, I've only found a few places that currently produce
lifetimes on non-allocas:

* CoroEarly replaces the promise alloca with the result of an intrinsic,
which will later be replaced back with an alloca. I think this is the
only place where there is some legitimate loss of functionality, but I
don't think this is particularly important (I don't think we'd expect
the promise in a coroutine to admit useful lifetime optimization.)
* SafeStack moves unsafe allocas onto a separate frame. We can safely
drop lifetimes here, as SafeStack performs its own stack coloring.
* Similarly, AddressSanitizer also moves allocas into separate
memory.
* LSR sometimes replaces the lifetime argument with a GEP chain of the
alloca (where the offsets ultimately cancel out). This is just
unnecessary. (Fixed separately in
https://github.com/llvm/llvm-project/pull/149492.)
* InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast
of an alloca. I don't think this is necessary.
2025-07-21 15:04:50 +02:00
clubby789
74c396afb2
[DSE] Remove uninitialized from allockind when creating dummy zeroed variant function (#149336)
cc https://github.com/llvm/llvm-project/pull/138299

rustc sets `allockind("uninitialized")` - if we copy the attributes
as-is when creating a dummy function, Verify complains about
`allockind("uninitialized,zeroed")` conflicting, so we need to clear the
flag.

Co-authored-by: Jamie Hill-Daniel <jamie@osec.io>
2025-07-18 09:30:23 +02:00
Antonio Frighetto
f1cc0b607b [IR] Introduce dead_on_return attribute
Add `dead_on_return` attribute, which is meant to be taken advantage of
by the frontend, and states that the memory pointed to by the argument
is dead upon function return. As with `byval`, it is supposed to be
used for passing aggregates by value. The difference lies in the ABI:
`byval` implies that the pointer is explicitly passed as argument to
the callee (during codegen the copy is emitted as per byval contract),
whereas a `dead_on_return`-marked argument implies that the copy
already exists in the IR, is located at a specific stack offset within
the caller, and this memory will not be read further by the caller upon
callee return – or otherwise poison, if read before being written.

RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
2025-07-02 09:29:36 +02:00
Nikita Popov
bb70023cbf
[MemoryLocation][DSE] Allow other read effects in MemoryLocation::getForDest() (#144343)
MemoryLocation::getForDest() returns a (potentially) written location,
while still allowing other reads. Currently, this is limited to
argmemonly functions. However, we can ignore other (non-argmem) read
effects here for the same reason we can ignore argument reads.
    
Fixes https://github.com/llvm/llvm-project/issues/144300.

Proof: https://alive2.llvm.org/ce/z/LKq_dc
2025-06-17 09:49:18 +02:00
clubby789
c7c79d2590
[IR][DSE] Support non-malloc functions in malloc+memset->calloc fold (#138299)
Add an `alloc-variant-zeroed` function attribute which can be used to
inform folding allocation+memset. This addresses
https://github.com/rust-lang/rust/issues/104847, where LLVM does not
know how to perform this transformation for non-C languages.
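
For illustration, the source-level pattern behind this fold can be sketched in C. This is only a sketch of the before and after shapes, not the pass's implementation; the function names `zeroed_via_malloc`, `zeroed_via_calloc` and `same_contents` are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Pattern the fold looks for: allocate, then zero the whole allocation. */
int *zeroed_via_malloc(size_t n) {
    int *p = malloc(n * sizeof *p);
    if (p)
        memset(p, 0, n * sizeof *p);
    return p;
}

/* Shape after the fold for C: a single call to the zeroing allocator. */
int *zeroed_via_calloc(size_t n) {
    return calloc(n, sizeof(int));
}

/* Returns 1 if both allocations succeed and hold identical bytes. */
int same_contents(size_t n) {
    int *a = zeroed_via_malloc(n);
    int *b = zeroed_via_calloc(n);
    int ok = a && b && memcmp(a, b, n * sizeof *a) == 0;
    free(a);
    free(b);
    return ok;
}
```

For C, `calloc` is the known zeroed variant of `malloc`. The attribute lets a frontend name the zeroed variant of its own allocator so the same allocation+memset fold can apply there.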

Co-authored-by: Jamie <jamie@osec.io>
2025-06-04 09:35:20 +02:00
Philip Reames
15c2f79153 [DSE/GVN] Continue to improve memset.pattern testing [nfc]
This batch reveals two missed optimizations, only one of which is a
regression compared to the memset_patternN libcall family.
2025-05-05 09:25:23 -07:00
Philip Reames
76b9973a78 [DSE] Strengthen test coverage for memset.pattern 2025-05-05 07:32:31 -07:00
Nikita Popov
249d9492a2
[DSE] Only consider provenance captures (#138286)
As a memory analysis, DSE only cares about provenance captures. Address
captures can be ignored as they cannot be used to read or modify memory.
2025-05-05 09:22:15 +02:00
Michael Berg
b88eef95e7
[DSE] Add predicated vector length store support for masked store elimination (#134175)
In isMaskedStoreOverwrite we process two stores that fully overwrite one
another. Here we add support for predicated vector length stores so that
DSE will eliminate this variant of masked stores.

This is the follow-up installment mentioned in:
https://reviews.llvm.org/D132700
2025-04-09 18:12:15 -07:00
Jeremy Morse
792a6f8119
[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298)
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.

Nowadays, non-intrinsic format is the default and has been on for more
than a year, so there's no need for this flag to exist.

(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
2025-03-14 15:50:49 +00:00
Nikita Popov
de895751d2
[CaptureTracking][AA] Only consider provenance captures (#130777)
For the purposes of alias analysis, we should only consider provenance
captures, not address captures. To support this, change (or add)
CaptureTracking APIs to accept a Mask and StopFn argument. The Mask
determines which components we are interested in (for AA that would be
Provenance).

The StopFn determines when we can abort the walk early. Currently, we
want to do this as soon as any of the components in the Mask is
captured. The purpose of making this a separate predicate is that in the
future we will also want to distinguish between capturing full
provenance and read-only provenance. In that case, we can only stop
early once full provenance is captured. The earliest escape analysis
does not get a StopFn, because it must always inspect all captures.
2025-03-13 09:54:36 +01:00
Björn Pettersson
74016728e3
[DSE] Update dereferenceable attributes when adjusting memintrinsic ptr (#125073)
Consider IR like this:
  call void @llvm.memset.p0.i64(ptr dereferenceable(28) %p, i8 0, i64 28, i1 false)
  store i32 1, ptr %p

In the past it has been optimized like this:
  %p2 = getelementptr inbounds i8, ptr %p, i64 4
  call void @llvm.memset.p0.i64(ptr dereferenceable(28) %p2, i8 0, i64 24, i1 false)
  store i32 1, ptr %p

As the input IR doesn't guarantee that it is OK to deref 28 bytes
starting at the adjusted pointer %p2 the transformation has been a bit
flawed.

With this patch we make sure to drop any
dereferenceable/dereferenceable_or_null attributes when doing such
transforms. An alternative would have been to adjust the amount of
dereferenceable bytes, but since a memset with a constant length already
implies dereferenceability by itself it is simpler to just drop the
attributes.

The new filtering of attributes is done using a helper that only keeps
attributes that we explicitly handle. For the adjusted mem intrinsic
pointers that currently involves "NonNull", "NoUndef" and "Alignment"
(when the alignment is known to be fulfilled also after offsetting the
pointer).
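
The shortening itself can be pictured at the C level. The following is a rough sketch with hypothetical function names (`original`, `shortened`, `same_result`), using `memcpy` of an `int32_t` to stand in for the `store i32`. It shows that both sequences leave the same 28 bytes, while the shortened memset only touches 24 bytes starting at offset 4, which is why a 28-byte dereferenceable claim cannot simply be carried over to the adjusted pointer.

```c
#include <stdint.h>
#include <string.h>

/* Original sequence: zero 28 bytes, then overwrite the first 4. */
void original(unsigned char *p) {
    int32_t one = 1;
    memset(p, 0, 28);
    memcpy(p, &one, sizeof one);   /* stands in for: store i32 1, ptr %p */
}

/* Shortened sequence: the memset skips the 4 bytes the store overwrites. */
void shortened(unsigned char *p) {
    int32_t one = 1;
    memset(p + 4, 0, 24);          /* only bytes 4..27 are accessed via p + 4 */
    memcpy(p, &one, sizeof one);
}

/* Returns 1 if both sequences produce identical buffers. */
int same_result(void) {
    unsigned char a[28], b[28];
    memset(a, 0xAA, sizeof a);     /* distinct garbage so stale bytes would show */
    memset(b, 0xBB, sizeof b);
    original(a);
    shortened(b);
    return memcmp(a, b, sizeof a) == 0;
}
```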

Fixes #115976
2025-02-18 17:51:14 +01:00
Nikita Popov
8600d89e55 [DSE] Add test for interaction with return-only captures (NFC)
Regression test for the miscompile reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577
2025-02-13 15:19:05 +01:00
Nikita Popov
2d31a12dbe
[DSE] Don't use initializes on byval argument (#126259)
There are two ways we can fix this problem, depending on how the
semantics of byval and initializes should interact:

* Don't infer initializes on byval arguments. initializes on byval
refers to the original caller memory (or having both attributes is made
a verifier error).
* Infer initializes on byval, but don't use it in DSE. initializes on
byval refers to the callee copy. This matches the semantics of readonly
on byval. This is slightly more powerful, for example, we could do a
backend optimization where byval + initializes will allocate the full
size of byval on the stack but not copy over the parts covered by
initializes.

I went with the second variant here, skipping byval + initializes in DSE
(FunctionAttrs already doesn't propagate initializes past byval). I'm
open to going in the other direction though.

Fixes https://github.com/llvm/llvm-project/issues/126181.
2025-02-10 10:34:03 +01:00
Nikita Popov
a325622be5
[DSE] Allow attribute differences in redundant store elimination (#125190)
When comparing the instructions, enable attribute intersection to allow
differences in attributes.

Note that we don't actually have to intersect the attributes on the
earlier instruction, because we're not RAUWing, so there's no chance
that we make any values more poisonous.
2025-01-31 16:10:48 +01:00
Nikita Popov
a8a5998e90 [DSE] Add tests for redundant store elimination with different attrs (NFC) 2025-01-31 10:37:37 +01:00
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
Alex Bradbury
d1314d0152
[MemoryLocation] Teach MemoryLocation about llvm.experimental.memset.pattern (#120421)
Relates to (but isn't dependent on) #120420.

This allows alias analysis of the intrinsic of the same quality as for
the libcall, which we want in order to move LoopIdiomRecognize over to
selecting the intrinsic.
2025-01-15 13:50:23 +00:00
Alex Bradbury
b92e97bdd5 [test] Pre-commit llvm.experimental.memset.pattern tests prior to MemoryLocation improvements
Reviewed as part of <https://github.com/llvm/llvm-project/pull/120421>.
2025-01-15 12:52:31 +00:00
Haopeng Liu
fed817a8b2
[DSE] Consider the aliasing through global variable while checking clobber (#120044)
While updating the read clobber check for the "initializes" attr, we
checked the aliasing among arguments, but didn't consider the aliasing
through global variable. It causes problems in this example:

```
int g_var = 123;

void update(int* ptr) {
  *ptr = g_var;
}

void foo() {
  g_var = 0;
  update(&g_var);
}
```
We mistakenly removed `g_var = 0;` as a dead store.

Fix the issue by requiring the CallBase only access argmem or
inaccessiblemem.
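
A compilable variant of the example makes the problem observable. This is a sketch under the assumption that the callee is the function that reads the global; the `observed` variable is hypothetical and only records what the callee read.

```c
int g_var = 123;
int observed = -1;     /* hypothetical: records the value update() read */

void update(int *ptr) {
    observed = g_var;  /* read goes through the global, not through ptr */
    *ptr = observed;
}

int foo(void) {
    g_var = 0;         /* not dead: update() reads g_var before writing *ptr */
    update(&g_var);
    return observed;
}
```

With the store kept, `foo()` returns 0. If `g_var = 0;` were removed as dead, `update()` would read 123 instead, which is the miscompile described above.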
2025-01-14 10:04:41 -08:00
Ramkumar Ramachandra
4e8eabd93e
DSE: pre-commit tests for scalable vectors (#110669)
As AliasAnalysis now has support for scalable sizes, add tests to
DeadStoreElimination covering the scalable vectors case, in preparation
to extend it.
2024-11-28 16:16:16 +00:00
Lee Wei
1ca64c5fb7
[llvm] Remove br i1 undef from some regression tests [NFC] (#115691)
This PR aims to remove undefined behavior from tests under the directory
`llvm/transforms/CodegenPrepare, ConstantHoisting, Coroutines` etc.
2024-11-11 12:56:31 +00:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
Haopeng Liu
a31ce36f56
Apply initializes attribute to DSE (#113630)
retry #107282

Fixed with `MadeChange |= Changed;` and confirmed it works.

```
cmake -DLLVM_CCACHE_BUILD=ON -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON -DLLVM_ENABLE_WERROR=OFF -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS=-U_GLIBCXX_DEBUG '-DLLVM_LIT_ARGS=-v -vv -j96' '-DLLVM_ENABLE_PROJECTS=llvm;lld' -DLLVM_ENABLE_ASSERTIONS=ON -GNinja ../llvm

ninja check-llvm
```
2024-10-24 18:43:20 -07:00
Arthur Eubanks
3cec720449
Revert "[DSE] Apply initializes attribute to DSE" (#113589)
Reverts llvm/llvm-project#107282

Seems to be causing invalid analysis caching as mentioned in
https://github.com/llvm/llvm-project/pull/107282#issuecomment-2435083978.
2024-10-24 08:51:31 -07:00
Haopeng Liu
089237c0d0
[DSE] Apply initializes attribute to DSE (#107282)
Apply the initializes attribute to DSE and guard with a flag,
"enable-dse-initializes-attr-improvement".

The attribute support has been landed in:
https://github.com/llvm/llvm-project/pull/84803
The attribute inference will be landed after this PR:
https://github.com/llvm/llvm-project/pull/97373
2024-10-23 22:18:59 -07:00
Alex Voicu
4852374135
[llvm][opt][Transforms] Replacement calloc should match replaced malloc (#110524)
Currently DSE unconditionally emits `calloc` as returning a pointer to
AS0. However, this is incorrect for targets that have a non-zero default
AS, as it'd not match the `malloc` signature. This patch addresses that
by piping through the AS for the pointer returned by `malloc` into the
`calloc` insertion call.
2024-10-01 02:05:28 +01:00
Antonio Frighetto
10d720b5b4 [DeadStoreElimination] Add test for recent worklist revision (NFC)
As d5c89cc proved not to be NFC, prior to this change, duplicate
`MemoryAccess` entries were being added to the worklist in
`isWriteAtEndOfFunction`, prematurely reaching the exploration
limit. When `MemorySSAScanLimit` cutoff is set to 4, the store
was previously not eliminated. Introduce a regression test for
additional validation. The test is a simplified variant of function
`ntlmssp_create_session_key`, coming from @dtcxzyw/llvm-opt-benchmark,
bench/wireshark/original/packet-ntlmssp.c.ll.
2024-07-22 09:05:18 +02:00
Stephen Tozer
094572701d
[RemoveDIs] Print IR with debug records by default (#91724)
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records. 

If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for a guide on how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.

For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
2024-06-14 15:07:27 +01:00
Ralender
d46e37348e
[DebugCounter] Add support for non-continous ranges. (#89470) 2024-05-28 12:40:39 +02:00
eaeltsin
243ffbdf8b
[DSE] Check write location in IsRedundantStore (#93400)
Fix https://github.com/llvm/llvm-project/issues/93298.
2024-05-27 09:26:44 +02:00
Florian Hahn
eb8f379567
[DSE] Remove malloc from EarliestEscapeInfo before removing. (#84157)
Not removing the malloc from earliest escape info leaves stale entries
in the cache.

Fixes https://github.com/llvm/llvm-project/issues/84051.

PR: https://github.com/llvm/llvm-project/pull/84157
2024-03-06 20:08:00 +00:00
Florian Hahn
10f5e983a9
[DSE] Delay deleting non-memory-defs until end of DSE. (#83411)
DSE uses BatchAA, which caches queries using pairs of MemoryLocations.
At the moment, DSE may remove instructions that are used as pointers in
cached MemoryLocations. If a new instruction used by a new MemoryLoation
and this instruction gets allocated at the same address as a previosuly
cached and then removed instruction, we may access an incorrect entry in
the cache.

To avoid this, delay removing all instructions except MemoryDefs until
the end of DSE. This should avoid removing any values used in BatchAA's
cache.

Test case by @vporpo from
https://github.com/llvm/llvm-project/pull/83181.
(Test not precommitted because the results are non-deterministic - memset
only sometimes gets removed)

PR: https://github.com/llvm/llvm-project/pull/83411
2024-03-02 12:34:36 +00:00
Vasileios Porpodas
b1d2e8510b Revert "[DSE] Test precommit for a bug caused by a read-clobber being skipped. (#83084)"
This reverts commit 91791c60bd7d1783d84e2e6ed87e5f957fbaee56.
2024-02-26 17:12:23 -08:00
vporpo
91791c60bd
[DSE] Test precommit for a bug caused by a read-clobber being skipped. (#83084) 2024-02-26 16:53:47 -08:00
Shreyansh Chouhan
65b5647e16
[DeadStoreElimination] Optimize tautological assignments (#75744)
If a store is dominated by a condition that ensures that the value being
stored in a memory location is already present at that memory location,
consider the store a noop.
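
At the source level, the pattern looks roughly like the sketch below. The function names are hypothetical; `store_removed` only models what remains once the guarded store is treated as a no-op.

```c
/* The store is dominated by a check that *p already holds v. */
void store_if_equal(int *p, int v) {
    if (*p == v)
        *p = v;        /* no-op: *p already equals v on this path */
}

/* Model of the function after the tautological store is dropped. */
void store_removed(int *p, int v) {
    (void)p;
    (void)v;
}

/* Returns 1 if both versions leave memory in the same state. */
int same_effect(int initial, int v) {
    int a = initial, b = initial;
    store_if_equal(&a, v);
    store_removed(&b, v);
    return a == b;
}
```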

Fixes #63419
2024-02-14 11:25:11 +01:00
Nikita Popov
bf5d96c96c
[IR] Add dead_on_unwind attribute (#74289)
Add the `dead_on_unwind` attribute, which states that the caller will
not read from this argument if the call unwinds. This allows eliding
stores that could otherwise be visible on the unwind path, for example:

```
declare void @may_unwind()

define void @src(ptr noalias dead_on_unwind %out) {
    store i32 0, ptr %out
    call void @may_unwind()
    store i32 1, ptr %out
    ret void
}

define void @tgt(ptr noalias dead_on_unwind %out) {
    call void @may_unwind()
    store i32 1, ptr %out
    ret void
}
```

The optimization is not valid without `dead_on_unwind`, because the `i32
0` value might be read if `@may_unwind` unwinds.

This attribute is primarily intended to be used on sret arguments. In
fact, I previously wanted to change the semantics of sret to include
this "no read after unwind" property (see D116998), but based on the
feedback there it is better to keep these attributes orthogonal (sret is
an ABI attribute, dead_on_unwind is an optimization attribute). This is
a reboot of that change with a separate attribute.
2023-12-14 09:58:14 +01:00
Jeremy Morse
d2d9dc8eb4
[DebugInfo][RemoveDIs] Make debugify pass convert to/from RemoveDIs mode (#73251)
Debugify is extremely useful as a testing and debugging tool, and a good
number of LLVM-IR transform tests use it. We need it to support "new"
non-instruction debug-info to get test coverage, but it's not important
enough to completely convert right now (and it'd be a large
undertaking). Thus: convert to/from dbg.value/DPValue mode on entry and
exit of the pass, which gives us the functionality without any further
work. The cost is compile-time, but again this is only happening during
tests.

Tested by: the large set of debugify tests enabled here. Note the
InstCombine test (cast-mul-select.ll) that hasn't been fully enabled:
this is because there's a debug-info sinking piece of code there that
hasn't been instrumented.
2023-11-29 13:19:50 +00:00
Florian Hahn
fd95f398c7
Revert "[CaptureTracking] Ignore ephemeral values when determining po… (#71066)
Unfortunately the commit (D123162) introduced a mis-compile
(https://github.com/llvm/llvm-project/issues/70547), which wasn't fixed
by the alternative fix (c0de28b92e98acbeb73)

I think as long as the call considered as ephemeral is not removed, we
need to be conservative. To address the correctness issue quickly, I
think we should revert the patch (as this patch does, it doesn't revert
cleanly)

This reverts commit 17fdaccccfad9b143e4aadbcdda7f645de127153.

Fixes https://github.com/llvm/llvm-project/issues/70547
2023-11-02 20:23:38 +00:00
Nikita Popov
deb5bd1289 [DSE] Add test for #70547 (NFC) 2023-10-31 12:34:11 +01:00
Arthur Eubanks
56f7c7e52f
[test] Remove test added in #67479 (#67578)
With 7aab12e1c, the test is no longer relevant, but the patch is still
good to have.
2023-09-27 10:59:23 -07:00
Arthur Eubanks
339fc5e6b0 [test] Mark test added in #67479 as XFAIL
This was merged after a different change caused the test to fail in the meantime.
2023-09-27 08:43:28 -07:00
Arthur Eubanks
cf7eac9650
[ObjectSizeOffsetVisitor] Bail after visiting 100 instructions (#67479)
We're running into stack overflows for huge functions with lots of phis.
Even without the stack overflows, this is recursing >7000 in some
auto-generated code.

This fixes the stack overflow and brings down the compile time to
something reasonable.
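
The general idea of a visit budget can be sketched as follows. This is not the ObjectSizeOffsetVisitor code; `struct node`, `visit`, `walk_chain` and `VISIT_LIMIT` are hypothetical stand-ins chosen for illustration, with the limit of 100 taken from the commit title.

```c
#include <stddef.h>

#define VISIT_LIMIT 100   /* mirrors the "100 instructions" in the title */

/* Hypothetical stand-in for an instruction with up to two operands. */
struct node { struct node *ops[2]; };

/* Returns 1 if the walk finished, 0 if it gave up after VISIT_LIMIT nodes. */
static int visit(const struct node *n, unsigned *visited) {
    if (!n)
        return 1;
    if (++*visited > VISIT_LIMIT)
        return 0;                       /* bail instead of recursing further */
    for (int i = 0; i < 2; i++)
        if (!visit(n->ops[i], visited))
            return 0;
    return 1;
}

/* Builds a linear chain of `length` nodes and walks it from the head. */
int walk_chain(size_t length) {
    static struct node nodes[1000];
    unsigned visited = 0;
    if (length == 0 || length > 1000)
        return 0;
    for (size_t i = 0; i < length; i++) {
        nodes[i].ops[0] = (i + 1 < length) ? &nodes[i + 1] : NULL;
        nodes[i].ops[1] = NULL;
    }
    return visit(&nodes[0], &visited);
}
```

Because the counter is shared across the whole walk, the recursion depth is bounded by the same budget, which is what addresses the stack overflow.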
2023-09-27 14:54:41 +02:00
Nikita Popov
89c564704e [DSE] Handle unexpected memory attribute on malloc (PR64827)
Make sure we don't crash if we encounter a malloc with memory(none).

Related to https://github.com/llvm/llvm-project/issues/64827.
2023-08-28 15:06:53 +02:00
Nikita Popov
cc488b80ad [DSE][LICM] Regenerate test checks (NFC)
Avoid spurious variable name changes in future patch.
2023-08-09 14:49:15 +02:00
Nikita Popov
edb2fc6dab [llvm] Remove explicit -opaque-pointers flag from tests (NFC)
Opaque pointers mode is enabled by default, no need to explicitly
enable it.
2023-07-12 14:35:55 +02:00
ManuelJBrito
8b56da5e9f [IR] Change shufflevector undef mask to poison
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.

Differential Revision: https://reviews.llvm.org/D149210
2023-04-27 14:41:10 +01:00
Florian Hahn
64233ae3eb
[DSE] Add test with llvm.memcpy & memcpy_chk.
This adds test coverage to avoid crashes with further changes.
2023-02-08 13:20:21 +00:00