1255 Commits

Author SHA1 Message Date
Arne Stenkrona
ea2f5395b1
[SimplifyCFG] Avoid threading for loop headers (#151142)
Updates SimplifyCFG to avoid jump threading through loop headers if
-keep-loops is requested. Canonical loop form requires a loop header
that dominates all blocks in the loop. If we thread through a header, we
risk breaking its domination of the loop. This change avoids this issue
by conservatively avoiding threading through headers entirely.

Fixes: https://github.com/llvm/llvm-project/issues/151144
2025-08-18 09:46:55 +00:00
Andreas Jonson
5ae8a9b8ce
[SimplifyCfg] Handle trunc nuw i1 condition in Equality comparison. (#153051)
proof: https://alive2.llvm.org/ce/z/WVt4-F
2025-08-17 09:53:40 +02:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Andreas Jonson
c6fd3d32c3
[SimplifyCfg] Add nneg to zext for switch to table conversion (#147180) 2025-08-04 16:18:05 +02:00
LU-JOHN
a757f23404
[SimplifyCFG] Extend jump-threading to allow live local defs (#135079)
Extend jump-threading to allow local defs that are live outside of the
threaded block. Allow threading to destinations where the local defs are
not live.

---------

Signed-off-by: John Lu <John.Lu@amd.com>
2025-07-31 09:44:14 -04:00
Nikita Popov
2c6eec219d [Tests] Avoid lifetime intrinsics on non-allocas (NFC)
Don't rely on auto-upgrade, instead either remove unnecessary
casts or remove no longer applicable tests.
2025-07-23 15:05:43 +02:00
Prabhu Rajasekaran
921c6dbeca
[llvm] Introduce callee_type metadata
Introduce `callee_type` metadata which will be attached to the indirect
call instructions.

The `callee_type` metadata will be used to generate `.callgraph` section
described in this RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewers: morehouse, petrhosek, nikic, ilovepi

Reviewed By: nikic, ilovepi

Pull Request: https://github.com/llvm/llvm-project/pull/87573
2025-07-18 14:40:54 -07:00
Antonio Frighetto
c435cd1730 [SimplifyCFG] Cache unique predecessors in simplifyDuplicateSwitchArms
Avoid repeatedly querying `getUniquePredecessor` for already-visited
switch successors so as not to incur quadratic runtime.

Fixes: https://github.com/llvm/llvm-project/issues/147239.
2025-07-18 08:33:42 +02:00
David Green
0967957d7a
[CostModel] Handle all cost kinds in getCmpSelInstrCost (#148233)
Currently we always produce a cost of 1 for all CostKinds that are not
RecipThroughput, which can underestimate the cost if the type has a
higher legalization cost (like larger vectors). This relaxes it to cover
all cost kinds.
2025-07-15 18:08:52 +01:00
Gábor Spaits
338fd8b12c
[SimplifyCFG] Transform switch to select when common bits uniquely identify one case (#145233)
Fixes #141753.

This patch introduces a new check that tries to decide whether the
conjunction of all the case values uniquely identifies the values
accepted by the switch.
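
As a rough illustration of the idea (a sketch under stated assumptions, not the patch's actual code): if the bitwise AND of all case values is a non-zero mask, and the number of cases equals 2^k where k is the number of bits outside that mask, then the cases are exactly the values with every mask bit set, so membership reduces to `(x & mask) == mask`. The name `commonBitsIdentifyCases` and the explicit `bits` parameter are hypothetical; case values are assumed to be distinct and representable in `bits` bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not an LLVM API.
// Returns true (and sets `mask`) when "x is one of `cases`" is equivalent to
// "(x & mask) == mask" for all `bits`-wide values x.
// Assumes the case values are distinct and fit in `bits` bits.
bool commonBitsIdentifyCases(const std::vector<uint32_t> &cases, unsigned bits,
                             uint32_t &mask) {
  assert(bits >= 1 && bits <= 32);
  if (cases.empty())
    return false;

  // Conjunction of all case values: the bits every case has set.
  const uint32_t all = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
  mask = all;
  for (uint32_t c : cases)
    mask &= c;
  if (mask == 0)
    return false; // no common set bit to test against

  // Count the bits that are free to vary.
  unsigned freeBits = 0;
  for (unsigned i = 0; i < bits; ++i)
    if (((mask >> i) & 1u) == 0)
      ++freeBits;

  // Exactly 2^freeBits values have every mask bit set. Each case is one of
  // them, so an equal count means the two sets coincide.
  return static_cast<uint64_t>(cases.size()) == (uint64_t(1) << freeBits);
}
```

For 3-bit values, the cases {5, 7} share the mask 0b101 with one free bit and two cases, so the check succeeds; the cases {5, 6} share only 0b100, which would need four cases, so it fails.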
2025-07-02 18:16:12 +02:00
Andreas Jonson
33c265ddf7
[SimplifyCFG] Use indexType from data layout in switch to table conversion (#146207)
Generate the GEP with the index type that InstCombine will cast it to but use the knowledge that the index is unsigned.
2025-06-28 21:00:34 +02:00
Mircea Trofin
62f8281e08
[IR][PGO] Verify invalid MD_prof metadata on instructions (#145576)
This PR places the validation of `MD_prof` instruction metadata in the Verifier.
2025-06-25 13:10:43 -07:00
Antonio Frighetto
1247fddf36 [SimplifyCFG] Relax cttz cost check in simplifySwitchOfPowersOfTwo
We should be able to allow `simplifySwitchOfPowersOfTwo` transform
to take place, as, on recent X86 targets, the weighted latency-size
appears to be 2. This favours computing trailing zeroes and indexing
into a smaller value table, over generating a jump table with an
indirect branch, which overall should be more efficient.
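
A source-level sketch of the trade-off described above, assuming the switch condition is known to be one of the listed powers of two (the function names and table contents are made up for illustration): counting trailing zeroes maps the sparse case values 1, 2, 4, 8 onto the dense indices 0..3, which can then index a small value table instead of going through a jump table.

```cpp
#include <cassert>
#include <cstdint>

// Original form: a switch over power-of-two case values.
int viaSwitch(uint32_t x) {
  switch (x) {
  case 1: return 10;
  case 2: return 20;
  case 4: return 30;
  case 8: return 40;
  default: return -1; // assumed unreachable for the inputs of interest
  }
}

// Portable count-trailing-zeros. Precondition: x != 0.
unsigned countTrailingZeros(uint32_t x) {
  assert(x != 0);
  unsigned n = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}

// Transformed form: cttz turns 1, 2, 4, 8 into 0, 1, 2, 3, which index a
// small table directly. Only valid when x is one of the four case values.
int viaCttzTable(uint32_t x) {
  static const int table[] = {10, 20, 30, 40};
  return table[countTrailingZeros(x)];
}
```

The two functions agree on 1, 2, 4 and 8; the table form has no defined meaning for other inputs, which is why the sketch assumes they cannot occur.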
2025-06-24 09:06:18 +02:00
Yingwei Zheng
7e1fa09ce2
[SimplifyCFG] Bail out on vector GEPs in passingValueIsAlwaysUndefined (#142526)
Closes https://github.com/llvm/llvm-project/issues/142522.
2025-06-04 12:37:30 +08:00
Vitaly Buka
3cb967a2cd
[NFCI][PromoteMem2Reg] Don't handle the first successor out of order (#142464)
Just for consistency, to avoid confusing conditions.

`reverse` helps to avoid test updates, as nothing
changes for successor counts <= 2.

For #142461
2025-06-03 10:26:55 -07:00
Yingwei Zheng
1e08febf0a
[SimplifyCFG] Switch to use paramHasNonNullAttr (#125383) 2025-06-02 12:20:13 +08:00
Nikita Popov
eee958285b
[SimplifyCFG] Only consider provenance capture in store speculation (#138548)
The capture check here is to protect against concurrent accesses from
other threads. This requires the provenance to escape.
2025-05-22 17:01:37 +02:00
Ellis Hoag
78f0af5d89
[SimplifyCFG][swifterror] Don't sink calls with swifterror params (#139015)
We've encountered an LLVM verification failure when building Swift with
the SimplifyCFG pass enabled. I found that
https://reviews.llvm.org/D158083 fixed this pass by preventing sinking
loads or stores of swifterror values, but it did not implement the same
protection for call or invokes.
In `Verifier.cpp`
[here](c685355811/llvm/lib/IR/Verifier.cpp (L4360-L4364))
and
[here](c685355811/llvm/lib/IR/Verifier.cpp (L3661-L3662))
we can see that swifterror values must also be used directly by call
instructions.
2025-05-12 14:37:26 -07:00
Nikita Popov
a7bff2a1c6 [SimplifyCFG] Add test for addr-only capture in store speculation (NFC) 2025-05-05 17:30:01 +02:00
Stephen Tozer
d6bb786705
[DebugInfo] Propagate source loc from invoke to replacement branch (#137206)
An existing transformation replaces invoke instructions with a call to
the invoked function and a branch to the destination; when this happens,
we propagate the invoke's source location to the call but not to the
branch. This patch updates this behaviour to propagate to the branch as
well.

Found using https://github.com/llvm/llvm-project/pull/107279.
2025-04-24 18:59:29 +01:00
Snehasish Kumar
2007dcfeb8
Reapply [Metadata] Preserve MD_prof when merging instructions when one is missing. (#135418)
Preserve branch weight metadata when merging instructions if one of the
instructions is missing metadata. This is similar in behaviour to what
we do today for other types of metadata such as mmra, memprof and
callsite metadata.

Also add a legality check when merging prof metadata based on
instruction type. Without this check GVN PRE optimizations result in
prof metadata on phi nodes which break the module verifier.

Build failure caught by
https://lab.llvm.org/buildbot/#/builders/113/builds/6621
```
!9185 = !{!"branch_weights", i32 3912, i32 802}
Wrong number of operands
!9185 = !{!"branch_weights", i32 3912, i32 802}
fatal error: error in backend: Broken module found, compilation aborted!
```

Reverts #134200 with additional changes.
2025-04-17 08:22:19 -07:00
Andreas Jonson
ed43207306
[SimplifyCFG] Handle trunc condition in foldBranchToCommonDest. (#135490)
proof: https://alive2.llvm.org/ce/z/v32Aof
2025-04-13 13:16:15 +02:00
Andreas Jonson
4dd80b73b0 [SimplifyCFG] test for trunc condition (NFC) 2025-04-12 12:25:40 +02:00
Snehasish Kumar
7f2abe8fd1
Revert "[Metadata] Preserve MD_prof when merging instructions when one is missing." (#134200)
Reverts llvm/llvm-project#132433

I suspect this change caused a failure in the bolt build bot.
https://lab.llvm.org/buildbot/#/builders/113/builds/6621

```
!9185 = !{!"branch_weights", i32 3912, i32 802}
Wrong number of operands
!9185 = !{!"branch_weights", i32 3912, i32 802}
fatal error: error in backend: Broken module found, compilation aborted!
```
2025-04-02 22:11:17 -07:00
Snehasish Kumar
c18994c7cd
[Metadata] Preserve MD_prof when merging instructions when one is missing. (#132433)
Preserve branch weight metadata when merging instructions if one of the
instructions is missing metadata. This is similar in behaviour to what
we do today for other types of metadata such as mmra, memprof and
callsite metadata.
2025-04-02 11:13:45 -06:00
Snehasish Kumar
dde0be9d97
[Metadata] Handle memprof, callsite merging when one is missing. (#132106)
For memprof and callsite metadata we want to pick one deterministically
and keep that even if one of them may be missing.
2025-04-02 11:10:02 -06:00
Phoebe Wang
369be311a7
[X86,SimplifyCFG] Support conditional faulting load or store only (#132032)
This fixes a bug when a target only supports conditional faulting
loads; see test case hoist_store_without_cstore.

Split `-simplifycfg-hoist-loads-stores-with-cond-faulting` into
`-simplifycfg-hoist-loads-with-cond-faulting` and
`-simplifycfg-hoist-stores-with-cond-faulting` to control conditional
faulting load and store respectively.
2025-03-21 21:19:46 +08:00
Jeremy Morse
792a6f8119
[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298)
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.

Nowadays, non-intrinsic format is the default and has been on for more
than a year, so there's no need for this flag to exist.

(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
2025-03-14 15:50:49 +00:00
Gábor Spaits
a0b175cb34
[SimplifyCFG] Treat extract oneuse(op.with.overflow),1 pattern as a single instruction (#128021)
Closes #115683.

Overflow arithmetic instruction plus extract value are usually generated
when a division is being replaced, but the zero check may still be
there. In that case hoist these two instructions out of this basic
block, and let later optimizations take care of the unnecessary zero
checks.
2025-03-14 14:18:57 +01:00
Stephen Tozer
af68927a83
Do not treat llvm.fake.use as a debug instruction (#128684)
The llvm.fake.use intrinsic is used to prevent certain values from being
optimized out for the benefit of debug info; it is not, however, a debug
or pseudo instruction itself and necessarily must not be treated as one,
since its purpose is to act like a normal instruction. In the original
commit that added them, the IR intrinsic however was treated as one in
`getPrevNonDebugInstruction` (but _not_ in `getNextNonDebugInstruction`,
or in the MIR equivalents). This patch correctly treats it as a
non-debug instruction.
2025-02-25 14:49:59 +00:00
Nikita Popov
d8b2e432d6
[IR] Remove mul constant expression (#127046)
Remove support for the mul constant expression, which has previously
already been marked as undesirable. This removes the APIs to create mul
expressions and updates tests to stop using mul expressions.

Part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
2025-02-14 09:28:57 +01:00
Florian Hahn
65640c1d4c
[AssumeBundles] Dereferenceable used in bundle only applies at assume. (#126117)
Update LangRef and code using `Dereferenceable` in assume bundles to
only use the information if it is safe at the point of use.

`Dereferenceable` in an assume bundle is only guaranteed at the point of
the assumption, but may not be guaranteed at later points, because the
pointer may have been freed.

Update code using `Dereferenceable` to only use it if the pointer cannot
be freed. This can further be refined to check if the pointer could be
freed between assume and use.

This follows up on https://github.com/llvm/llvm-project/pull/123196.

With that change, it should be safe to expose dereferenceable
assumptions more widely as in
https://github.com/llvm/llvm-project/pull/121789

PR: https://github.com/llvm/llvm-project/pull/126117
2025-02-13 20:41:23 +01:00
goldsteinn
a56ba1fab0
[ValueTracking] Handle recursive select/PHI in ComputeKnownBits (#114689)
Finish porting #114008 to `KnownBits` (Follow up to #113707).
2025-01-22 11:51:18 -06:00
Teresa Johnson
3a423a10ff
[MemProf][PGO] Prevent dropping of profile metadata during optimization (#121359)
This patch fixes a couple of places where memprof-related metadata
(!memprof and !callsite) were being dropped, and one place where PGO
metadata (!prof) was being dropped.

All were due to instances of combineMetadata() being invoked. That
function drops all metadata not in the list provided by the client, and
also drops any not in its switch statement.

Memprof metadata needed a case in the combineMetadata switch statement.
For now we simply keep the metadata of the instruction being kept, which
doesn't retain all the profile information when two calls with
memprof metadata are being combined, but at least retains some.

For the memprof metadata being dropped during call CSE, add memprof and
callsite metadata to the list of known ids in combineMetadataForCSE.

Neither memprof nor regular prof metadata were in the list of known ids
for the callsite in MemCpyOptimizer, which was added to combine AA
metadata after optimization of byval arguments fed by memcpy
instructions, and similar types of optimizations of memcpy uses.

There is one other callsite of combineMetadata, but it is only invoked
on load instructions, which do not carry these types of metadata.
2025-01-02 12:11:59 -08:00
DaPorkchop_
cea738bc9a
[SimplifyCFG] Replace unreachable switch lookup table holes with poison (#94990)
As discussed in #94468, this causes switch lookup table entries which
are unreachable to be poison instead of filling them with a value from
one of the reachable cases.

---------

Co-authored-by: DianQK <dianqk@dianqk.net>
2024-12-26 07:47:26 +08:00
Dominik Steenken
fa9cef50b1
Only guard loop metadata that has non-debug info in it (#118825)
This PR is motivated by a mismatch we discovered between compilation
results with vs. without `-g3`. We noticed this when compiling SPEC2017
testcases. The specific instance we saw is fixed in this PR by modifying
a guard (see below), but it is likely similar instances exist elsewhere
in the codebase.

The specific case fixed in this PR manifests itself in the `SimplifyCFG`
pass doing different things depending on whether DebugInfo is generated
or not. At the end of this comment, there is reduced example code that
shows the behavior in question.

The differing behavior has two root causes:
1. Commit https://github.com/llvm/llvm-project/commit/c07e19b adds loop
metadata including debug locations to loops that otherwise would not
have loop metadata
2. Commit https://github.com/llvm/llvm-project/commit/ac28efa6c100 adds
a guard to a simplification action in `SimplifyCFG` that prevents it
from simplifying away loop metadata

So, the change in 2. does not consider that when compiling with debug
symbols, loops that otherwise would not have metadata that needs
preserving, now have debug locations in their loop metadata. Thus, with
`-g3`, `SimplifyCFG` behaves differently than without it.

The larger issue is that while debug info is not supposed to influence
the final compilation result, commits like 1. blur the line between what
is and is not debug info, and not all optimization passes account for
this.

This PR does not address that and rather just modifies this particular
guard in order to restore equivalent behavior between debug and
non-debug builds in this one instance.

---

Here is a reduced version of a file from `f526.blender_r` that showcases
the behavior in question:
```C
struct LinkNode;
typedef struct LinkNode {
 struct LinkNode *next;
 void *link;
} LinkNode;

void do_projectpaint_thread_ph_v_state() {
  int *ps = do_projectpaint_thread_ph_v_state;
  LinkNode *node;
  while (do_projectpaint_thread_ph_v_state)
    for (node = ps; node; node = node->next)
      ;
}
```
Compiling this with and without DebugInfo, and then disassembling the
results, leads to different outcomes (tested on SystemZ and X86). The
reason for this is that the `SimplifyCFG` pass does different things in
either case.
2024-12-20 15:15:51 +01:00
Florian Hahn
c4a78b6fe3
[SimplifyCFG] Always allow hoisting if all instructions match. (#97158)
Generalize hoistCommonCodeFromSuccessors's `EqTermsOnly` to
`AllInstsEqOnly` and always allow hoisting if all instructions match.

In that case, all instructions can be hoisted and the
original branch will be replaced and selects for PHIs are added. This
allows preserving metadata in more cases, using the existing hoisting
logic, whereas previously FoldTwoEntryPHINode would drop the metadata.


https://llvm-compile-time-tracker.com/compare.php?from=716360367fbdabac2c374c19b8746f4de49a5599&to=986b2c47df516b31d998c055400e4f62aa76edc6&stat=instructions:u

PR: https://github.com/llvm/llvm-project/pull/97158
2024-12-13 21:26:27 +00:00
Antonio Frighetto
d26df32255 [SimplifyCFG] Consider preds to switch in simplifyDuplicateSwitchArms
Allow a duplicate basic block with multiple predecessors to the
jump table to be simplified, by considering that the same basic
block may appear in more switch cases.
2024-12-13 09:07:24 +01:00
Antonio Frighetto
e32c428bec [SimplifyCFG] Precommit tests for PR118955 (NFC) 2024-12-13 09:07:24 +01:00
Nikita Popov
462cb3cd6c
[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144)
If the gep is nusw (usually via inbounds) and the offset is
non-negative, we can infer nuw.

Proof: https://alive2.llvm.org/ce/z/ihztLy
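
The arithmetic behind this inference can be checked exhaustively on a toy model. The sketch below assumes an 8-bit address space and a single byte offset (real GEPs also scale indices, which is ignored here), and the function names are hypothetical: "nusw" is modelled as unsigned base plus signed offset staying in range, "nuw" as unsigned base plus unsigned offset not wrapping.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, not LLVM code: 8-bit addresses and one byte offset.

// nusw: unsigned base + signed offset stays within [0, 255].
bool addIsNusw(uint8_t base, int8_t offset) {
  int sum = int(base) + int(offset);
  return sum >= 0 && sum <= 255;
}

// nuw: unsigned base + unsigned offset does not wrap past 255.
bool addIsNuw(uint8_t base, uint8_t offset) {
  return int(base) + int(offset) <= 255;
}

// Exhaustively checks that nusw with a non-negative offset implies nuw.
void checkNuswNonNegImpliesNuw() {
  for (int b = 0; b <= 255; ++b)
    for (int o = 0; o <= 127; ++o) // non-negative signed offsets only
      if (addIsNusw(uint8_t(b), int8_t(o)))
        assert(addIsNuw(uint8_t(b), uint8_t(o)));
}
```

The non-negativity condition matters: with base 200 and offset -100 the signed addition stays in range, but the same offset read as unsigned (156) wraps.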
2024-12-05 14:36:40 +01:00
Lee Wei
9bf6365237
[llvm] Remove br i1 undef from some regression tests [NFC] (#118419)
This PR removes tests with `br i1 undef` under
`llvm/test/Transforms/ObjCARC, Reassociate, SCCP, SLPVectorizer...`.
After this PR, I'll continue to fix tests under `llvm/test/CodeGen`,
which has more UB tests than `llvm/test/Transforms`.
2024-12-03 20:54:36 +00:00
AdityaK
39601a6e54
Bail out jump threading on indirect branches only (#117778)
Remove check for PHI in pred as pointed out in #103688 
Reduced the testcase to remove redundant phi in pred

Fixes: #102351
2024-11-26 14:57:28 -08:00
Matt Arsenault
4028bb10c3
Local: Handle noalias_addrspace in combineMetadata (#103938)
This should act like range.

Previously ConstantRangeList assumed a 64-bit range. Now query from the
actual entries. This also means that the empty range has no bitwidth, so
move asserts to avoid checking the bitwidth of empty ranges.
2024-11-26 09:13:34 -05:00
Phoebe Wang
2568e52a73
[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part II) (#108812)
This is a follow-up to #96878 to support hoisting loads/stores from BBs
that have the same predecessor, if the load/store are the only
instructions and the branch is unpredictable, e.g.:

```
void test (int a, int *c, int *d) {
  if (a)
   *c = a;
  else
   *d = a;
}
```
2024-11-25 15:19:28 +08:00
Stephen Tozer
2188a56a75
[DebugInfo][SimplifyCFG] Fully propagate merged invoke DILocations (#114235)
Currently when we merge invokes as part of SimplifyCFG we apply a merge
of the invoke DILocations to the merged invoke. We also insert an
unconditional branch to the merged invoke at the positions previously
occupied by the original invokes; as this branch is part of the
substitution for the invoke it has replaced, we should propagate the
original invoke DebugLoc to it.
2024-11-15 17:20:55 +00:00
Michael Maitland
6b9952759f
[SimplifyCFG] Simplify switch instruction that has duplicate arms (#114262)
I noticed that the two C functions emitted different IR:

```
int switch_duplicate_arms(int switch_val, int v, int w) {
  switch (switch_val) {
  default:
    break;
  case 0:
    w = v;
    break;
  case 1:
    w = v;
    break;
  }
  return w;
}

int if_duplicate_arms(int switch_val, int v, int w) {
  if (switch_val == 0)
    w = v;
  else if (switch_val == 1)
    w = v;
  return w;
}
```

We generate IR that looks like this:

```
define i32 @switch_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) {
  switch i32 %1, label %7 [
    i32 0, label %5
    i32 1, label %6
  ]

5:
  br label %7

6:
  br label %7

7:
  %8 = phi i32 [ %3, %4 ], [ %2, %6 ], [ %2, %5 ]
  ret i32 %8
}

define i32 @if_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) {
  %5 = icmp ult i32 %1, 2
  %6 = select i1 %5, i32 %2, i32 %3
  ret i32 %6
}
```

For `switch_duplicate_arms`, taking case 0 and taking case 1 are
equivalent, since %5 and %6 branch to the same location and the incoming
values for %8 from those blocks are the same. We could remove one of the
duplicate switch targets and update the switch to use the single
remaining target.

On RISC-V, prior to this patch, we generate the following code:
```
switch_duplicate_arms:
        li      a4, 1
        beq     a1, a4, .LBB0_2
        mv      a0, a3
        bnez    a1, .LBB0_3
.LBB0_2:
        mv      a0, a2
.LBB0_3:
        ret

if_duplicate_arms:
        li      a4, 2
        mv      a0, a2
        bltu    a1, a4, .LBB1_2
        mv      a0, a3
.LBB1_2:
        ret
```

After this patch, the O3 code is optimized to the icmp + select pair,
which gives us the same code gen as `if_duplicate_arms`, as desired. This
results in one fewer branch instruction in the final assembly.

This may help with both code size and further switch simplification. I
found that this patch causes no significant impact to spec2006/int/ref
and spec2017/intrate/ref.
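
At the source level, the icmp + select form corresponds roughly to the sketch below. The function names are illustrative only, and the unsigned cast mirrors the `icmp ult` in the IR above.

```cpp
#include <cassert>

// Switch with two arms that do the same thing.
int switchDuplicateArms(int switchVal, int v, int w) {
  switch (switchVal) {
  default:
    break;
  case 0:
    w = v;
    break;
  case 1:
    w = v;
    break;
  }
  return w;
}

// Equivalent compare + select: an unsigned "less than 2" test covers
// exactly the values 0 and 1.
int selectForm(int switchVal, int v, int w) {
  return static_cast<unsigned>(switchVal) < 2u ? v : w;
}

// Checks that both forms agree on a range around the interesting values.
void checkEquivalence() {
  for (int s = -4; s <= 4; ++s)
    assert(switchDuplicateArms(s, 7, 9) == selectForm(s, 7, 9));
}
```

Negative values of `switchVal` become large unsigned values after the cast, so they fall through to `w` in both forms.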

---------

Co-authored-by: Min Hsu <min@myhsu.dev>
2024-11-15 15:38:34 +01:00
Florian Hahn
40c75426a9
[SimplifyCFG] Add test for updating llvm.access.group when hoisting.
Add extra test coverage for preserving llvm.access.group metadata when
hoisting.
2024-11-12 13:14:30 +00:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
elhewaty
9efb07f261
[IR] Add samesign flag to icmp instruction (#111419)
Inspired by
https://discourse.llvm.org/t/rfc-signedness-independent-icmps/81423
2024-10-15 17:11:25 +08:00
Noah Goldstein
82ac399733 [SimplifyCFG] Allow merging invoke's with different attrs
Same logic as other callsites, if the attributes are intersectable, we
merge.

Closes #111713
2024-10-10 01:07:59 -05:00