584 Commits

Author SHA1 Message Date
Aiden Grossman
ec059d81aa
[DSE] Handle variable offsets with sized dead_on_return (#180364)
With a sized dead_on_return, we need to not eliminate stores if they
are to a pointer with a variable offset from the underlying object
marked dead_on_return. This manifested as an assertion failure as
BaseValue/V ended up not being equal. It's possible we could do a range
analysis to try and prove the variable offset stays within bounds, but
this case seems to come up relatively rarely (only reproducible with a
UBSan build of LLVM) and is probably not worth the compile time.

Fixes #180361.
2026-02-07 11:43:56 -08:00
Aiden Grossman
6277359e5a [DSE] Mark BaseValue Variable [[maybe_unused]]
It is only used within an assertion but we cannot inline the call as it
is needed to get the offset from the base pointer for the function to
work.
2026-01-23 21:39:20 +00:00
Aiden Grossman
fbc970bb0e Revert "[LLVM] Update assert to removed unused variable warning. (#177632)"
This reverts commit fdb05bbf62b8d4fc8dc7ce1f1cfa570f3265a8ae.

This was causing failures in release-mode builds because the
GetPointerBaseWithConstantOffset call would never be run which leads to
ValueOffset being uninitialized and thus the behavior of the function is
unpredictable.
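
The pitfall described in this revert can be sketched in isolation. The following is a minimal illustration, not the DSE code: `getBaseWithOffset` and `offsetFromBase` are hypothetical names, and the helper only mimics the shape of a call that returns a base and writes an offset through an out-parameter. Keeping the call outside `assert()` means the offset is still written when assertions are compiled out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reports the base of P and writes the byte offset
// from that base through an out-parameter.
static const char *getBaseWithOffset(const char *P, const char *Base,
                                     int64_t &Offset) {
  Offset = P - Base;
  return Base;
}

// The call sits outside assert(), so Offset is written even when NDEBUG
// compiles the assertion away. [[maybe_unused]] silences the warning that
// Found is otherwise only read inside the assertion.
int64_t offsetFromBase(const char *P, const char *Base) {
  int64_t Offset = 0;
  [[maybe_unused]] const char *Found = getBaseWithOffset(P, Base, Offset);
  assert(Found == Base && "unexpected base pointer");
  return Offset;
}
```

Had the call been moved inside the `assert()`, a build with NDEBUG defined would skip it entirely and return whatever `Offset` was initialized to.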
2026-01-23 21:39:20 +00:00
cmtice
fdb05bbf62
[LLVM] Update assert to removed unused variable warning. (#177632)
Remove the variable definition and move the function call directly into
the assert statement. Otherwise builds with -Werror that don't use
asserts would fail.
2026-01-23 19:03:09 +00:00
Aiden Grossman
0145a643cc
[DSE] Make DSE eliminate stores to objects with a sized dead_on_return
dead_on_return is made optionally sized in #171712. This patch adds
handling in DSE so that we can actually eliminate stores to pointer
parameters marked with a sized dead_on_return attribute. We do not
eliminate stores where the store may overlap with bytes that are not
known to be dead after return.
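
As a rough sketch of the bounds reasoning described here (not the actual DSE implementation), a store is only a candidate for removal when it provably lies inside the bytes marked dead. The function name and parameters below are assumptions made for illustration; an absent offset models a pointer whose offset from the argument is not a known constant.

```cpp
#include <cstdint>
#include <optional>

// Returns true only when the store [Offset, Offset + StoreSize) provably lies
// inside the first DeadBytes bytes of the argument. A missing offset models
// a pointer with a variable offset, which is treated conservatively.
bool storeIsWithinDeadBytes(std::optional<int64_t> Offset, uint64_t StoreSize,
                            uint64_t DeadBytes) {
  if (!Offset || *Offset < 0)
    return false;
  uint64_t Start = static_cast<uint64_t>(*Offset);
  if (Start > DeadBytes)
    return false;
  // Compare against the remaining room to avoid overflow in Start + StoreSize.
  return StoreSize <= DeadBytes - Start;
}
```

With 8 dead bytes, a 4-byte store at offset 4 qualifies, while one at offset 6 does not because it may overlap bytes that are still live after return.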

Reviewers: nikic, antoniofrighetto, alinas, aeubanks

Pull Request: https://github.com/llvm/llvm-project/pull/173694
2026-01-23 07:40:44 -08:00
Aiden Grossman
e2d7cd685d
[IR] Make dead_on_return attribute optionally sized
This patch makes the dead_on_return parameter attribute optionally require a number
of bytes to be passed in to specify the number of bytes known to be dead
upon function return/unwind. This is aimed at enabling annotating the
this pointer in C++ destructors with dead_on_return in clang. We need
this to handle cases like the following:

```
struct X {
  int n;
  ~X() {
    this[n].n = 0;
  }
};
void f() {
  X xs[] = {42, -1};
}
```

Where we are only certain that sizeof(X) bytes are dead upon return of ~X.
Otherwise DSE would be able to eliminate the store in ~X which would not
be correct.

This patch only does the wiring within IR. Future patches will make
clang emit correct sizing information and update DSE to only delete
stores to objects marked dead_on_return that are provably in bounds of
the number of bytes specified to be dead_on_return.

Reviewers: nikic, alinas, antoniofrighetto

Pull Request: https://github.com/llvm/llvm-project/pull/171712
2026-01-21 08:22:05 -08:00
Oxygen
9671aae8d5
[DSE][Verifier] Respect the calling convention of the function specified by "alloc-variant-zeroed" (#175911)
Require that the calling convention between the zeroed and non-zeroed
variants is the same, and set it appropriately in the DSE transform.
2026-01-16 15:45:40 +00:00
Nikita Popov
573ca36753
[IR] Replace alignment argument with attribute on masked intrinsics (#163802)
The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter`
intrinsics currently accept a separate alignment immarg. Replace this
with an `align` attribute on the pointer / vector of pointers argument.

This is the standard representation for alignment information on
intrinsics, and is already used by all other memory intrinsics. This
means the signatures now match llvm.expandload, llvm.vp.load, etc.
(Things like llvm.memcpy used to have a separate alignment argument as
well, but were already migrated a long time ago.)

It's worth noting that the masked.gather and masked.scatter intrinsics
previously accepted a zero alignment to indicate the ABI type alignment
of the element type. This special case is gone now: If the align
attribute is omitted, the implied alignment is 1, as usual. If ABI
alignment is desired, it needs to be explicitly emitted (which the
IRBuilder API already requires anyway).
2025-10-20 08:50:09 +00:00
Rahul Joshi
1394e39b1b
[NFC][LLVM] Namespace cleanup in DeadStoreElimination (#163303) 2025-10-15 08:21:20 -07:00
Philip Reames
e6b4a21849
[IR] Add utilities for manipulating length of MemIntrinsic [nfc] (#153856)
Goal is simply to reduce direct usage of getLength and setLength so that
if we end up moving memset.pattern (whose length is in elements) there
are fewer places to audit.
2025-08-20 13:50:11 -07:00
Orlando Cazalet-Hyams
d13341db26
[RemoveDIs][NFC] Remove getAssignmentMarkers (#153214)
getAssignmentMarkers was for debug intrinsics. getDVRAssignmentMarkers
is used for DbgRecords.
2025-08-13 10:56:19 +01:00
Farzon Lotfi
544562ebc2
[DirectX] Remove lifetime intrinsics and run Dead Store Elimination (#152636)
fixes #151764

This fix has two parts. First, we track all lifetime intrinsics, and if
they are users of an alloca of a target extension type like dx.RawBuffer,
we eliminate those intrinsics when we visit the alloca.

Step one allows us to use the Dead Store Elimination pass, which
removes the alloca and simplifies the use of the target extension back
to using just the global. That keeps things in a form the
DXILBitcodeWriter is expecting.

Obviously to pull this off we needed to bring back the legacy pass
manager plumbing for the DSE pass and hook it up into the DirectX
backend.

The net impact of this change is that DML shader pass rate went from
89.72% (4268 successful compilations) to 90.98% (4328 successful
compilations).
2025-08-12 12:42:08 -04:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Nikita Popov
74001beded [DSE] Use MemoryLocation API to get lifetime.end size (NFC) 2025-07-29 15:46:49 +02:00
clubby789
74c396afb2
[DSE] Remove uninitialized from allockind when creating dummy zeroed variant function (#149336)
cc https://github.com/llvm/llvm-project/pull/138299

rustc sets `allockind("uninitialized")`. If we copy the attributes
as-is when creating a dummy function, the verifier complains about
`allockind("uninitialized,zeroed")` conflicting, so we need to clear the
flag.
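
The flag handling can be pictured as plain bit manipulation. This is a minimal sketch under assumed names: the `AllocKind` enum and `zeroedVariantKind` are hypothetical and only model the two mutually exclusive keywords mentioned above.

```cpp
#include <cstdint>

// Hypothetical bit flags modelled on the allockind keywords named above.
enum AllocKind : uint32_t {
  Alloc = 1u << 0,
  Uninitialized = 1u << 1,
  Zeroed = 1u << 2,
};

// Derive the flags for a zeroed variant: clear Uninitialized, then set Zeroed
// so the two mutually exclusive flags never appear together.
uint32_t zeroedVariantKind(uint32_t Kind) {
  Kind &= ~static_cast<uint32_t>(Uninitialized);
  Kind |= static_cast<uint32_t>(Zeroed);
  return Kind;
}
```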

Co-authored-by: Jamie Hill-Daniel <jamie@osec.io>
2025-07-18 09:30:23 +02:00
Antonio Frighetto
f1cc0b607b [IR] Introduce dead_on_return attribute
Add `dead_on_return` attribute, which is meant to be taken advantage of
by the frontend, and states that the memory pointed to by the argument
is dead upon function return. As with `byval`, it is supposed to be
used for passing aggregates by value. The difference lies in the ABI:
`byval` implies that the pointer is explicitly passed as argument to
the callee (during codegen the copy is emitted as per byval contract),
whereas a `dead_on_return`-marked argument implies that the copy
already exists in the IR, is located at a specific stack offset within
the caller, and this memory will not be read further by the caller upon
callee return – or otherwise poison, if read before being written.

RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
2025-07-02 09:29:36 +02:00
Nikita Popov
bc7fafbeea
[AA] Take read-only provenance captures into account (#143097)
Update the AA CaptureAnalysis providers to return CaptureComponents, so
we can distinguish between full provenance and read-only provenance
captures.

Use this to restrict "other" memory effects on call from ModRef to Ref.

Ideally we would also apply the same reasoning for escape sources, but
the current API cannot actually convey the necessary information (we can
only say NoAlias or MayAlias, not MayAlias but only via Ref).
2025-06-12 14:13:15 +02:00
clubby789
8ed3cb0e64
[DSE] Fix uninitialized variable (#142768)
Introduced by accident in #138299
(https://lab.llvm.org/buildbot/#/builders/164/builds/10604)
2025-06-04 15:00:28 +02:00
clubby789
c7c79d2590
[IR][DSE] Support non-malloc functions in malloc+memset->calloc fold (#138299)
Add a `alloc-variant-zeroed` function attribute which can be used to
inform folding allocation+memset. This addresses
https://github.com/rust-lang/rust/issues/104847, where LLVM does not
know how to perform this transformation for non-C languages.

Co-authored-by: Jamie <jamie@osec.io>
2025-06-04 09:35:20 +02:00
Philip Reames
650dca5d89
[IR] Remove the AtomicMem*Inst helper classes (#138710)
Migrate their usage to the `AnyMem*Inst` family, and add a isAtomic()
query on the base class for that hierarchy. This matches the idioms we
use for e.g. isAtomic on load, store, etc.. instructions, the existing
isVolatile idioms on mem* routines, and allows us to more easily share
code between atomic and non-atomic variants.

As with #138568, the goal here is to simplify the class hierarchy and
make it easier to reason about. I'm moving from easiest to hardest, and
will stop at some point when I hit "good enough". Longer term, I'd sorta
like to merge or reverse the naming on the plain Mem*Inst and the
AnyMem*Inst, but that's a much larger and more risky change. Not sure
I'm going to actually do that.
2025-05-06 14:24:40 -07:00
Nikita Popov
249d9492a2
[DSE] Only consider provenance captures (#138286)
As a memory analysis, DSE only cares about provenance captures. Address
captures can be ignored as they cannot be used to read or modify memory.
2025-05-05 09:22:15 +02:00
NewSigma
af497d9a65
[DSE] Simplify if condition (NFC) (#137777)
Note that the key-value pair has already been initialized, so assignment
is not necessary.
2025-04-30 09:01:19 +02:00
Nikita Popov
91e1922d45 [DSE] Skip non-pointer args in initializes handling (NFCI)
Avoid performing AA queries on non-pointers.
2025-04-23 15:21:52 +02:00
Michael Berg
b88eef95e7
[DSE] Add predicated vector length store support for masked store elimination (#134175)
In isMaskedStoreOverwrite we process two stores that fully overwrite one
another, here we add support for predicated vector length stores so that
DSE will eliminate this variant of masked stores.

This is the follow up installment mentioned in:
https://reviews.llvm.org/D132700
2025-04-09 18:12:15 -07:00
Kazu Hirata
73dc2afd2c
[Transforms] Use *Set::insert_range (NFC) (#132652)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(Elem);

down to:

  Set.insert_range(Range);

In some cases, we can further fold that into the set declaration.
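
For readers without the LLVM set types at hand, the same collapse can be sketched with the standard library. This is only an analogue: `std::set`'s iterator-pair `insert` stands in for `insert_range`, and both function names are made up for the illustration.

```cpp
#include <set>
#include <vector>

// Element-by-element insertion, as in the loop form quoted above.
std::set<int> insertWithLoop(const std::vector<int> &Range) {
  std::set<int> Set;
  for (int Elem : Range)
    Set.insert(Elem);
  return Set;
}

// Single-call form; std::set's iterator-pair insert stands in for
// insert_range here.
std::set<int> insertWholeRange(const std::vector<int> &Range) {
  std::set<int> Set;
  Set.insert(Range.begin(), Range.end());
  return Set;
}
```

Both forms produce the same set; the single call simply states the intent more directly.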
2025-03-23 19:42:53 -07:00
Nikita Popov
9cbdcfcafd [CaptureTracking] Remove StoreCaptures parameter (NFC)
The implementation doesn't use it, and is unlikely to use it in
the future.

The places that do set StoreCaptures=false, do so incorrectly and
would be broken if the parameter actually did anything.
2025-02-24 12:00:57 +01:00
Björn Pettersson
c833746c6c
[DSE] Make iter order deterministic in removePartiallyOverlappedStores. NFC (#127678)
In removePartiallyOverlappedStores we iterate over
InstOverlapIntervalsTy which is a DenseMap. Change that map into using
MapVector to ensure that we apply the transforms in a deterministic
order. I've only seen that the order matters if starting to use names
for the instructions created when doing the transforms. But such things
are a bit annoying when debugging etc.
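
The idea behind an insertion-ordered map can be sketched as follows. This is a simplified stand-in and not LLVM's MapVector: `OrderedMap` and `keysInOrder` are hypothetical names, and the key and value types are fixed to keep the example short.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Minimal insertion-ordered map: a vector holds the entries in insertion
// order and a hash map only stores each key's index into that vector.
class OrderedMap {
  std::vector<std::pair<std::string, int>> Entries;
  std::unordered_map<std::string, std::size_t> Index;

public:
  int &operator[](const std::string &Key) {
    // emplace leaves an existing mapping untouched and reports whether a
    // new index was recorded.
    auto Result = Index.emplace(Key, Entries.size());
    if (Result.second)
      Entries.emplace_back(Key, 0);
    return Entries[Result.first->second].second;
  }

  // Iteration walks the vector, so the order never depends on hashing.
  std::vector<std::string> keysInOrder() const {
    std::vector<std::string> Keys;
    for (const auto &Entry : Entries)
      Keys.push_back(Entry.first);
    return Keys;
  }
};
```

Because iteration follows the vector, any transform applied per entry runs in the order the entries were first inserted.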
2025-02-19 21:24:49 +01:00
Björn Pettersson
74016728e3
[DSE] Update dereferenceable attributes when adjusting memintrinsic ptr (#125073)
Consider IR like this:
  call void @llvm.memset.p0.i64(ptr dereferenceable(28) %p, i8 0, i64 28, i1 false)
  store i32 1, ptr %p

In the past it has been optimized like this:
  %p2 = getelementptr inbounds i8, ptr %p, i64 4
  call void @llvm.memset.p0.i64(ptr dereferenceable(28) %p2, i8 0, i64 24, i1 false)
  store i32 1, ptr %p

As the input IR doesn't guarantee that it is OK to deref 28 bytes
starting at the adjusted pointer %p2 the transformation has been a bit
flawed.

With this patch we make sure to drop any
dereferenceable/dereferenceable_or_null attributes when doing such
transforms. An alternative would have been to adjust the amount of
dereferenceable bytes, but since a memset with a constant length already
implies dereferenceability by itself it is simpler to just drop the
attributes.

The new filtering of attributes is done using a helper that only keeps
attributes that we explicitly handle. For the adjusted mem intrinsic
pointers, that currently involves "NonNull", "NoUndef" and "Alignment"
(when the alignment is known to be fulfilled also after offsetting the
pointer).
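
The filtering can be modelled on a small struct. This is a sketch under assumptions, not the helper added by the patch: `PtrAttrs` and `attrsAfterOffset` are hypothetical, and the alignment rule shown (keep it only when the offset is a multiple of the alignment) is one plausible reading of "known to be fulfilled".

```cpp
#include <cstdint>

// Hypothetical model of the pointer-argument attributes discussed above.
struct PtrAttrs {
  bool NonNull = false;
  bool NoUndef = false;
  uint64_t Align = 1;           // 1 means no stronger alignment is claimed
  uint64_t Dereferenceable = 0; // 0 means the attribute is absent
};

// Keep only attributes that are explicitly handled after the pointer moves
// forward by Offset bytes. Dereferenceable is dropped rather than adjusted.
// Assumes Old.Align is at least 1.
PtrAttrs attrsAfterOffset(const PtrAttrs &Old, uint64_t Offset) {
  PtrAttrs New;
  New.NonNull = Old.NonNull;
  New.NoUndef = Old.NoUndef;
  if (Offset % Old.Align == 0)
    New.Align = Old.Align;
  return New;
}
```

Starting from nothing and copying over only the handled attributes means an unknown attribute is dropped by default instead of being carried to a pointer it no longer describes.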

Fixes #115976
2025-02-18 17:51:14 +01:00
Nikita Popov
2d31a12dbe
[DSE] Don't use initializes on byval argument (#126259)
There are two ways we can fix this problem, depending on how the
semantics of byval and initializes should interact:

* Don't infer initializes on byval arguments. initializes on byval
refers to the original caller memory (or having both attributes is made
a verifier error).
* Infer initializes on byval, but don't use it in DSE. initializes on
byval refers to the callee copy. This matches the semantics of readonly
on byval. This is slightly more powerful, for example, we could do a
backend optimization where byval + initializes will allocate the full
size of byval on the stack but not copy over the parts covered by
initializes.

I went with the second variant here, skipping byval + initializes in DSE
(FunctionAttrs already doesn't propagate initializes past byval). I'm
open to going in the other direction though.

Fixes https://github.com/llvm/llvm-project/issues/126181.
2025-02-10 10:34:03 +01:00
Nikita Popov
a325622be5
[DSE] Allow attribute differences in redundant store elimination (#125190)
When comparing the instructions, enable attribute intersection to allow
differences in attributes.

Note that we don't actually have to intersect the attributes on the
earlier instruction, because we're not RAUWing, so there's no chance
that we make any values more poisonous.
2025-01-31 16:10:48 +01:00
Jeremy Morse
8e70273509
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety, we'd like to
have all calls to moveBefore use iterators.

This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
2025-01-24 10:53:11 +00:00
Haopeng Liu
13dae34819
[DSE] Enable the initializes improvement in DSE (#124058)
(Retry) enable the initializes improvement in DSE.

Initially enabled in https://github.com/llvm/llvm-project/pull/119116.

Fix the aliasing issue through global variables in
https://github.com/llvm/llvm-project/pull/120044.

The compile-time comparison of this enabling (no meaningful diff):
https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
2025-01-23 15:51:04 -08:00
Haopeng Liu
fed817a8b2
[DSE] Consider the aliasing through global variable while checking clobber (#120044)
While updating the read clobber check for the "initializes" attr, we
checked the aliasing among arguments, but didn't consider the aliasing
through global variables. This causes problems in this example:

```
int g_var = 123;

void update(int* ptr) {
  *ptr = g_var;
}

void foo() {
  g_var = 0;
  update(&g_var);
}
```
We mistakenly removed `g_var = 0;` as a dead store.

Fix the issue by requiring that the CallBase only accesses argmem or
inaccessiblemem.
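
The shape of that restriction can be sketched with a bitmask. The enum and function below are hypothetical and do not reproduce LLVM's memory-effects API; they only show the idea of rejecting any call that may touch memory other than its arguments or inaccessible memory.

```cpp
#include <cstdint>

// Hypothetical location kinds a call may access.
enum MemLoc : uint32_t {
  ArgMem = 1u << 0,
  InaccessibleMem = 1u << 1,
  OtherMem = 1u << 2, // includes globals such as g_var
};

// True when the call touches nothing beyond argument memory and
// inaccessible memory, so it cannot read a global behind the caller's back.
bool onlyAccessesArgOrInaccessibleMem(uint32_t Accessed) {
  const uint32_t Allowed =
      static_cast<uint32_t>(ArgMem) | static_cast<uint32_t>(InaccessibleMem);
  return (Accessed & ~Allowed) == 0;
}
```

A callee that reads `g_var` directly would carry the `OtherMem` bit in this model, so the earlier store to `g_var` would be kept.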
2025-01-14 10:04:41 -08:00
Owen Anderson
bc8fa9c443
Revert "SimplifyLibCalls: Use default globals address space when building new global strings. (#118729)" (#119616)
This reverts commit cfa582e8aaa791b52110791f5e6504121aaf62bf.
2024-12-21 09:33:39 +13:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Haopeng Liu
0663a73104
Revert "[DSE] Enable initializes improvement" (#119590)
Reverts llvm/llvm-project#119116
2024-12-11 09:01:06 -08:00
Haopeng Liu
ebe741fad0
[DSE] Enable initializes improvement (#119116)
Tested with an internal search backend loadtest.

With `-ftrivial-auto-var-init`, this work has a 0.2%-0.3% total QPS
improvement.

Note that the metric is total QPS rather than CPU time, so even a 1%
improvement means a lot.

- Add the "initializes" attr:
https://github.com/llvm/llvm-project/pull/97373
- Apply the attr to DSE:
https://github.com/llvm/llvm-project/pull/107282
2024-12-10 09:58:13 -08:00
Owen Anderson
cfa582e8aa
SimplifyLibCalls: Use default globals address space when building new global strings. (#118729)
Writing a test for this transitively exposed a number of places in
BuildLibCalls where we were failing to propagate address spaces properly,
which are additionally fixed.
2024-12-06 10:51:14 +13:00
Nikita Popov
1e32a7d42c
[AA] Rename CaptureInfo -> CaptureAnalysis (NFC) (#116842)
I'd like to use the name CaptureInfo to represent the new attribute
proposed at
https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420,
but it's already taken by AA, and I can't think of great alternatives
(CaptureEffects would be something of a stretch).

As such, I'd like to rename CaptureInfo -> CaptureAnalysis in AA, which
also seems like the more accurate terminology.
2024-11-20 09:42:28 +01:00
Kazu Hirata
94f9cbbe49
[Scalar] Remove unused includes (NFC) (#114645)
Identified with misc-include-cleaner.
2024-11-02 08:32:26 -07:00
Haopeng Liu
a31ce36f56
Apply initializes attribute to DSE (#113630)
retry #107282

Fixed with `MadeChange |= Changed;` and confirmed it works.
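
The accumulation idiom named in the fix can be shown on a toy transform. Everything here is illustrative: `clampNegative` and `clampAll` are hypothetical, and the sketch says nothing about where in DSE the flag was missing.

```cpp
#include <vector>

// Hypothetical per-item transform: zeroes a negative value and reports
// whether it modified anything.
static bool clampNegative(int &V) {
  if (V >= 0)
    return false;
  V = 0;
  return true;
}

bool clampAll(std::vector<int> &Values) {
  bool MadeChange = false;
  for (int &V : Values) {
    bool Changed = clampNegative(V);
    // Accumulate with |= so an earlier true is not lost when a later
    // step reports no change.
    MadeChange |= Changed;
  }
  return MadeChange;
}
```

If a pass modifies its input but reports no change, callers may keep relying on cached results that are no longer valid, which is consistent with the caching problem cited in the revert.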

```
cmake -DLLVM_CCACHE_BUILD=ON -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON -DLLVM_ENABLE_WERROR=OFF -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS=-U_GLIBCXX_DEBUG '-DLLVM_LIT_ARGS=-v -vv -j96' '-DLLVM_ENABLE_PROJECTS=llvm;lld' -DLLVM_ENABLE_ASSERTIONS=ON -GNinja ../llvm

ninja check-llvm
```
2024-10-24 18:43:20 -07:00
Arthur Eubanks
3cec720449
Revert "[DSE] Apply initializes attribute to DSE" (#113589)
Reverts llvm/llvm-project#107282

Seems to be causing invalid analysis caching as mentioned in
https://github.com/llvm/llvm-project/pull/107282#issuecomment-2435083978.
2024-10-24 08:51:31 -07:00
Haopeng Liu
089237c0d0
[DSE] Apply initializes attribute to DSE (#107282)
Apply the initializes attribute to DSE and guard with a flag,
"enable-dse-initializes-attr-improvement".

The attribute support has landed in:
https://github.com/llvm/llvm-project/pull/84803
The attribute inference will land after this PR:
https://github.com/llvm/llvm-project/pull/97373
2024-10-23 22:18:59 -07:00
Kazu Hirata
48e4d67537
[DSE] Simplify code with MapVector::operator[] (NFC) (#111621) 2024-10-09 06:44:20 -07:00
Alex Voicu
4852374135
[llvm][opt][Transforms] Replacement calloc should match replaced malloc (#110524)
Currently DSE unconditionally emits `calloc` as returning a pointer to
AS0. However, this is incorrect for targets that have a non-zero default
AS, as it'd not match the `malloc` signature. This patch addresses that
by piping through the AS for the pointer returned by `malloc` into the
`calloc` insertion call.
2024-10-01 02:05:28 +01:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
Haopeng Liu
6421dcc0a9
[NFC] [DSE] Refactor DSE (#100956)
Refactor DSE with MemoryDefWrapper and MemoryLocationWrapper.

Normally, one MemoryDef accesses one MemoryLocation. With "initializes"
attribute, one MemoryDef (like call instruction) could initialize
multiple MemoryLocations.

Refactor DSE as a preparation to apply "initializes" attribute in DSE in
a follow-up PR
(58dd8a4403).
2024-08-29 11:28:49 -07:00
Yingwei Zheng
62e9f40949
[PatternMatch] Use m_SpecificCmp matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
  stockfish/movegen.ll 2541620819 2538599412 -0.12%
  minetest/profiler.cpp.ll 431724935 431246500 -0.11%
  abc/luckySwap.c.ll 581173720 580581935 -0.10%
  abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
  abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
  openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
  openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
  spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
  spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
  nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
2024-07-29 10:04:06 +08:00
Antonio Frighetto
d5c89cc811 [DeadStoreElimination] Refactor out pushMemUses, drop dead check (NFC) 2024-07-12 08:31:49 +02:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
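
The convenience this adds can be sketched with stand-in types. The structs below are hypothetical mock-ups that only share names with the IR classes; they show a forwarding accessor replacing the two-step call chain at each use.

```cpp
// Hypothetical stand-ins for the IR classes named above.
struct DataLayout {
  unsigned PointerSizeInBits;
};

struct Module {
  DataLayout DL;
  const DataLayout &getDataLayout() const { return DL; }
};

struct Function {
  const Module *Parent;
  const Module *getParent() const { return Parent; }
  // Convenience accessor that forwards to the parent module.
  const DataLayout &getDataLayout() const { return Parent->getDataLayout(); }
};
```

Call sites can then write `F.getDataLayout()` instead of `F.getParent()->getDataLayout()`, and both refer to the same object.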
2024-06-28 08:36:49 +02:00