1403 Commits

Author SHA1 Message Date
Antonio Frighetto
d26df32255 [SimplifyCFG] Consider preds to switch in simplifyDuplicateSwitchArms
Allow a duplicate basic block with multiple predecessors to the
jump table to be simplified, by considering that the same basic
block may appear in more switch cases.
2024-12-13 09:07:24 +01:00
David Green
18abc7e0c5
[PatternMatch] Introduce m_c_Select (#114328)
This matches m_Select(m_Value(), L, R) or m_Select(m_Value(), R, L).
2024-11-25 13:47:23 +00:00
Phoebe Wang
2568e52a73
[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part II) (#108812)
This is a follow up of #96878 to support hoisting load/store from BBs
have the same predecessor, if load/store are the only instructions and
the branch is unpredictable, e.g.:

```
void test (int a, int *c, int *d) {
  if (a)
   *c = a;
  else
   *d = a;
}
```
2024-11-25 15:19:28 +08:00
Stephen Tozer
2188a56a75
[DebugInfo][SimplifyCFG] Fully propagate merged invoke DILocations (#114235)
Currently when we merge invokes as part of SimplifyCFG we apply a merge
of the invoke DILocations to the merged invoke. We also insert an
unconditional branch to the merged invoke at the positions previously
occupied by the original invokes; as this branch is part of the
substitution for the invoke it has replaced, we should propagate the
original invoke DebugLoc to it.
2024-11-15 17:20:55 +00:00
Michael Maitland
6b9952759f
[SimplifyCFG] Simplify switch instruction that has duplicate arms (#114262)
I noticed that the two C functions emitted different IR:

```
int switch_duplicate_arms(int switch_val, int v, int w) {
  switch (switch_val) {
  default:
    break;
  case 0:
    w = v;
    break;
  case 1:
    w = v;
    break;
  }
  return w;
}

int if_duplicate_arms(int switch_val, int v, int w) {
  if (switch_val == 0)
    w = v;
  else if (switch_val == 1)
    w = v;
  return v0;
}
```

We generate IR that looks like this:

```
define i32 @switch_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) {
  switch i32 %1, label %7 [
    i32 0, label %5
    i32 1, label %6
  ]

5:
  br label %7

6:
  br label %7

7:
  %8 = phi i32 [ %3, %4 ], [ %2, %6 ], [ %2, %5 ]
  ret i32 %8
}

define i32 @if_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) {
  %5 = icmp ult i32 %1, 2
  %6 = select i1 %5, i32 %2, i32 %3
  ret i32 %6
}
```

For `switch_duplicate_arms`, taking case 0 and 1 are the same since %5
and %6
branch to the same location and the incoming values for %8 are the same
from
those blocks. We could remove one on the duplicate switch targets and
update
the switch with the single target.

On RISC-V, prior to this patch, we generate the following code:
```
switch_duplicate_arms:
        li      a4, 1
        beq     a1, a4, .LBB0_2
        mv      a0, a3
        bnez    a1, .LBB0_3
.LBB0_2:
        mv      a0, a2
.LBB0_3:
        ret

if_duplicate_arms:
        li      a4, 2
        mv      a0, a2
        bltu    a1, a4, .LBB1_2
        mv      a0, a3
.LBB1_2:
        ret
```

After this patch, the O3 code is optimized to the icmp + select pair,
which
gives us the same code gen as `if_duplicate_arms`, as desired. This
results
is one less branch instruction in the final assembly.

This may help with both code size and further switch simplification. I
found
that this patch causes no significant impact to spec2006/int/ref and
spec2017/intrate/ref.

---------

Co-authored-by: Min Hsu <min@myhsu.dev>
2024-11-15 15:38:34 +01:00
Nikita Popov
255a99c29f
[APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (#80309)
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.

The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.

Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
2024-10-17 08:48:08 +02:00
Noah Goldstein
82ac399733 [SimplifyCFG] Allow merging invoke's with different attrs
Same logic as other callsites, if the attributes are intersectable, we
merge.

Closes #111713
2024-10-10 01:07:59 -05:00
Noah Goldstein
4d4beeb43c [SimplifyCFG] Supporting hoisting/sinking callbases with differing attrs
Some (many) attributes can safely be dropped to enable sinking. For
example removing `nonnull` on a return/param can't affect correctness.

Closes #109472
2024-10-01 18:27:08 -05:00
Simone Campanoni
5d19d55ce1
[SimplifyCFG] Better aligned a comment. (#109307) 2024-09-30 09:39:35 -07:00
Nikita Popov
f445e39ab2
[SimplifyCFG] Use isWritableObject() API (#110127)
SimplifyCFG store speculation currently has some homegrown code to check
for a writable object, handling the alloca special case only.

Switch it to use the generic isWritableObject() API, which means that we
also support byval arguments, allocator return values, and writable
arguments.

I've adjusted isWritableObject() to also check for the noalias attribute
when handling writable. Otherwise, I don't think that we can generalize
from at-entry writability. This was not relevant for previous uses of
the function, because they'd already require noalias for other reasons
anyway.
2024-09-30 10:03:46 +02:00
Nikita Popov
6f194a6dea [SimplifyCFG] Avoid truncation in linear map overflow check
This is supposed to test multiplication of the linear multiplifier
with the largest value it can be multiplied with. However, if
we truncate TableSize-1 here, it might not actually be the largest
value. I think in practice this still works out, because in cases
where we'd truncate the value here we'd also fail the NonMonotonic
check. But to match the intent of the code, we should treat the
truncating case as overflowing.
2024-09-23 15:13:32 +02:00
Nikita Popov
8a6248b739
[SimplifyCFG] Don't separate a load/store from its gep during sinking (#102318)
If we can sink the a load/store, but not the gep producing its pointer
operand, don't sink the load/store either. This may prevent the gep from
being folded into an addressing mode, and may also negatively affect
further analysis.

Fixes https://github.com/llvm/llvm-project/issues/96838.
2024-09-23 09:32:24 +02:00
Phoebe Wang
c9e5c42ad1
[X86,SimplifyCFG][NFC] Refactor code for #108812 (#109398) 2024-09-21 20:18:29 +08:00
Nikita Popov
30cdf1e959
[SimplifyCFG] Pass context instruction to isSafeToSpeculativelyExecute() (#109132)
Pass speculation target and assumption cache to
isSafeToSpeculativelyExecute() calls.

This allows speculating based on dereferenceable/align assumptions, but
the primary motivation here is to avoid regressions from planned changes
to fix https://github.com/llvm/llvm-project/issues/108854.
2024-09-19 10:19:15 +02:00
Noah Goldstein
37932643ab [SimplifyCFG] Deduce paths unreachable if they cause div/rem UB
Same we way mark a path unreachable if it may cause a nullptr
dereference, div/rem by zero or signed div/rem of INT_MIN by -1 cause
immediate UB.

Closes #109008
2024-09-18 12:59:52 -05:00
Noah Goldstein
419c53477e [SimplifyCFG] Mark div/rem as not-cheap to sink if we are replacing const denominator
Close #109007
2024-09-17 12:04:34 -05:00
Andreas Jonson
a0d00c94c2
[SimplifyCFG] Swap range metadata to attribute for calls. (#108984)
Among the last usages of range metadata for call before being able to
deprecate and only have the range attribute for calls.
2024-09-17 18:25:53 +02:00
Phoebe Wang
af5a45b34b
[X86,SimplifyCFG] Use passthru to reduce select (#108754) 2024-09-16 20:20:36 +08:00
Kazu Hirata
e99eb89d5d
[SimplifyCFG] Use range-based for loops (NFC) (#107180) 2024-09-04 01:29:13 -07:00
Shengchen Kan
87c86aa6b9
[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part I) (#96878)
This is simplifycfg part of
https://github.com/llvm/llvm-project/pull/95515

In this PR, we support hoisting load/store with conditional faulting in
`SimplifyCFGOpt::speculativelyExecuteBB` to eliminate conditional
branches.
This is for cases like
```
void test (int a, int *b) {
  if (a)
   *b = a;
}
``` 

In the following patches, we will support the hoist in
`SimplifyCFGOpt::hoistCommonCodeFromSuccessors`.
That is for cases like
```
void test (int a, int *c, int *d) {
  if (a)
   *c = a;
  else 
   *d = a;
}
```
2024-08-29 10:42:44 +08:00
Nikita Popov
84497c6f4f
[SimplifyCFG] Remove limitation on sinking of load/store of alloca (#104788)
This is a followup to https://github.com/llvm/llvm-project/pull/104579
to remove the limitation on sinking loads/stores of allocas entirely,
even if this would introduce a phi node.

Nowadays, SROA supports speculating load/store over select/phi.
Additionally, SimplifyCFG with sinking only runs at the end of the
function simplification pipeline, after SROA. I checked that the two
tests modified here still successfully SROA after the SimplifyCFG
transform.

We should, however, keep the limitation on lifetime intrinsics. SROA
does not have speculation support for these, and I've also found that
the way these are handled in the backend is very problematic
(https://github.com/llvm/llvm-project/issues/104776), so I think we
should leave them alone.
2024-08-26 10:14:43 +02:00
Nikita Popov
4d85285ff6
[SimplifyCFG] Fold switch over ucmp/scmp to icmp and br (#105636)
If we switch over ucmp/scmp and have two switch cases going to the same
destination, we can convert into icmp+br.

Fixes https://github.com/llvm/llvm-project/issues/105632.
2024-08-22 16:57:09 +02:00
Nikita Popov
b3fa45b642
[SimplifyCFG] Add support for hoisting commutative instructions (#104805)
This extends SimplifyCFG hoisting to also hoist instructions with
commuted operands, for example a+b on one side and b+a on the other
side.

This should address the issue mentioned in:
https://github.com/llvm/llvm-project/pull/91185#issuecomment-2097447927
2024-08-20 12:48:06 +02:00
Nikita Popov
83879f4f53
[SimplifyCFG] Don't block sinking for allocas if no phi created (#104579)
SimplifyCFG sinking currently does not sink loads/stores of allocas,
because historically SROA was unable to handle the resulting IR. Since
then, SROA both learned to speculate loads/stores over selects and phis,
*and* SimplifyCFG sinking has been deferred to the end of the function
simplification pipeline, which means that SROA happens before it.

As such, I believe that this workaround should no longer be necessary.
Given how sensitive SimplifyCFG sinking seems to be, this patch takes a
very conservative step towards removing this, by allowing sinking if we
don't actually need to form a phi over the pointer argument.

This fixes https://github.com/llvm/llvm-project/issues/104567, where
sinking a store to an escaped alloca allows converting a switch into
arithmetic.
2024-08-19 09:55:30 +02:00
Yingwei Zheng
f364b2ee22
[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
2024-08-13 22:38:50 +08:00
Shengchen Kan
bb790b8bf2 [NFC] Extract the probability check for the hoisted BB into a local function
So that we can early bail out to avoid nested if clauses.

This is to extract the NFC change in #96878 into a separate PR.
2024-08-03 00:17:28 +08:00
Shengchen Kan
60054dcd81 [NFC] Lowercase the first letter of functions defined in SimplifyCFG.cpp 2024-08-02 23:54:40 +08:00
Jan Patrick Lehr
a347bdb2b8
Revert "[SimplifyCFG] Skip threading if the target may have divergent branches" (#100994)
Reverts llvm/llvm-project#100185

See comments on PR (PR not accepted, outstanding review comments, breaks HIP-clang buildbot)
2024-07-29 11:34:26 +02:00
darkbuck
ba45453c0a
[SimplifyCFG] Skip threading if the target may have divergent branches
- This patch skips the threading on known values if the target has
  divergent branch.
- So far, threading on known values is skipped when the basic block has
  covergent calls. However, even without convergent calls, if that
  condition is divergent, threading duplicates the execution of that
  block threaded and hence results in lower performance. E.g.,
  ```
  BB1:
    if (cond) BB3, BB2

  BB2:
    // work2
    br BB3

  BB3:
    // work3
    if (cond) BB5, BB4

  BB4:
    // work4
    br BB5

  BB5:
  ```

  after threading,

  ```
  BB1:
    if (cond) BB3', BB2'

  BB2':
    // work3
    br BB5

  BB3':
    // work2
    // work3
    // work4
    br BB5

  BB5:

  ```

  After threading, work3 is executed twice if 'cond' is a divergent one.

Reviewers: yxsamliu, nikic

Pull Request: https://github.com/llvm/llvm-project/pull/100185
2024-07-26 12:15:49 -04:00
Tianqing Wang
3d494bfc7f
[SimplifyCFG] Increase budget for FoldTwoEntryPHINode() if the branch is unpredictable. (#98495)
The `!unpredictable` metadata has been present for a long time, but
it's usage in optimizations is still limited. This patch teaches
`FoldTwoEntryPHINode()` to be more aggressive with an unpredictable
branch to reduce mispredictions.

A TTI interface `getBranchMispredictPenalty()` is added to distinguish
between different hardwares to ensure we don't go too far for simpler
cores. For simplicity, only a naive x86 implementation is included for
the time being.
2024-07-23 07:47:21 +08:00
DianQK
8afb6432c2
[SimplifyCFG] Select the first instruction that we can handle in passingValueIsAlwaysUndefined (#98802)
Fixes #98799.
2024-07-16 19:06:32 +08:00
Kazu Hirata
92f4001906
[Transforms] Use range-based for loops (NFC) (#97576) 2024-07-03 12:53:19 -07:00
Yingwei Zheng
4997af98a0
[SimplifyCFG] Simplify nested branches (#97067)
This patch folds the following pattern (I don't know what to call this):
```
bb0:
  br i1 %cond1, label %bb1, label %bb2
bb1:
  br i1 %cond2, label %bb3, label %bb4
bb2:
  br i1 %cond2, label %bb4, label %bb3
bb3:
  ...
bb4:
  ...
```
into
```
bb0:
  %cond = xor i1 %cond1, %cond2
  br i1 %cond, label %bb4, label %bb3
bb3:
  ...
bb4:
  ...
```

Alive2: https://alive2.llvm.org/ce/z/5iOJEL
Closes https://github.com/llvm/llvm-project/issues/97022.
Closes https://github.com/llvm/llvm-project/issues/83417.

I found this pattern in some verilator-generated code, which is widely
used in RTL simulation. This fold will reduces branches and improves the
performance of CPU frontend. To my surprise, this pattern is also common
in C/C++ code base.
Affected libraries/applications:
cmake/cvc5/freetype/git/gromacs/jq/linux/openblas/openmpi/openssl/php/postgres/ruby/sqlite/wireshark/z3/...
2024-07-01 03:35:39 +08:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
DianQK
5a052ef76a
Reapply "[SimplifyCFG] Forward indirect switch condition value if it can help fold the PHI (#95932)"
This reverts commit c7adfb5e715334c9de176f434088bd9f89aa9eb3.
2024-06-27 07:23:59 +08:00
DianQK
c7adfb5e71
Revert "[SimplifyCFG] Forward indirect switch condition value if it can help fold the PHI (#95932)"
This reverts commit 0c56fd0a29ffb0425ca2ee2a4ff8f380880fdbfa.
This is breaking https://lab.llvm.org/buildbot/#/builders/72/builds/483.
2024-06-27 06:49:40 +08:00
DianQK
0c56fd0a29
[SimplifyCFG] Forward indirect switch condition value if it can help fold the PHI (#95932)
Fixes #95919.
2024-06-27 05:49:33 +08:00
Stephen Tozer
d75f9dd1d2 Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Nikita Popov
ede27d8d39
[SimplifyCFG] Add support for sinking instructions with multiple uses (#95521)
Sinking currently only supports instructions that have zero or one uses.
Extend this to handle instructions with any number of uses, as long as
all uses are consistent (i.e. the "same" for all sinking candidates).

After #94462 this is basically just a matter of looping over all uses
instead of checking the first one only.
2024-06-17 08:48:30 +02:00
Kazu Hirata
7c6d0d26b1
[llvm] Use llvm::unique (NFC) (#95628) 2024-06-14 22:49:36 -07:00
Nikita Popov
dfde0773fd
[SimplifyCFG] More accurate use legality check for sinking (#94462)
When sinking instructions, we have to make sure that the uses of that
instruction are consistent: If used in a phi node in the sink target,
then the phi operands have to match the sink candidates. This allows the
phi to be removed when the instruction is sunk. This case is already
handled accurately.

However, what the current code doesn't handle are uses in the same
block. These are just unconditionally accepted, even though this needs
the same consistency check for the phi node that sinking the using
instruction would introduce.

Instead, the code has another check when actually performing the
sinking, which repeats the phi check (just at a later time, where all
the later instructions have already been sunk and any new phis
introduced).

This is problematic, because it messes up the profitability heuristic.
The code will think that certain instructions will get sunk, but they
actually won't. This may result in more phi nodes being created than is
considered profitable. See the changed test for a case where we no
longer do this after this patch.

The new approach makes sure that the uses are consistent during the
initial legality check. This is based on PhiOperands, which we already
collect.

The primary motivation for this is to generalize sinking to support more
than one use, and doing that generalization is hard with the current
split checking approach.
2024-06-14 10:38:50 +02:00
Paul Kirth
294f3ce5dd
Reapply "[llvm][IR] Extend BranchWeightMetadata to track provenance o… (#95281)
…f weights" #95136

Reverts #95060, and relands #86609, with the unintended code generation
changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g. from
profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
2024-06-12 12:52:28 -07:00
Paul Kirth
607afa0b63
Revert "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" (#95060)
Reverts llvm/llvm-project#86609

This change causes compile-time regressions for stage2 builds
(https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u).
It also introduced unintended changes to `.text` which should be
addressed before relanding.
2024-06-11 08:06:06 +02:00
Paul Kirth
c5978f1eb5
[llvm][IR] Extend BranchWeightMetadata to track provenance of weights (#86609)
This patch implements the changes to LLVM IR discussed in

https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof metadata nodes for
branch weights, which can be used to distinguish weights added from
`llvm.expect*` intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
2024-06-10 11:27:21 -07:00
DaPorkchop_
540f68c44f
[SimplifyCFG] Don't use a mask for lookup tables generated from switches with an unreachable default case (#94468)
When transforming a switch with holes into a lookup table, we currently
use a mask to check if the current index is handled by the switch or if
it is a hole. If it is a hole, we skip loading from the lookup table.
Normally, if the switch's default case is unreachable this has no
impact, as the mask test gets optimized away by subsequent passes.
However, if the switch is large enough that the number of lookup table
entries exceeds the target's register width, we won't be able to fit all
the cases into a mask and the switch won't get transformed into a lookup
table. If we know that the switch's default case is unreachable, we know
that the mask is unnecessary and can skip constructing it entirely,
which allows us to transform the switch into a lookup table.

[Example](https://godbolt.org/z/7x7qfx8M1)

In the future, it might be interesting to consider allowing lookup table
masks to be more than one register large (e.g. using a constant array of
bit flags, similar to `std::bitset`).
2024-06-08 21:32:34 +08:00
Qiongsi Wu
7dc2f66022
Reland "[SimplifyCFG] switch: Do Not Transform the Default Case if the Condition is Too Wide (#77831)"
https://github.com/llvm/llvm-project/pull/76669 taught SimplifyCFG to
handle switches when `default` has only one case. When the `switch`'s
condition is wider than 64 bit, the current implementation can calculate
the wrong default value. This PR skips cases where the condition is too
wide.

(cherry picked from commit 39bb790b906f4921a5d9fc09e856abe53ae7a320)
2024-05-25 10:38:44 +08:00
DianQK
64ed699b3d
Reland "[SimplifyCFG] When only one case value is missing, replace default with that case (#76669)"
When the default branch is the last case, we can transform that branch
into a concrete branch with an unreachable default branch.

```llvm
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i64 @src(i64 %0) {
  %2 = urem i64 %0, 4
  switch i64 %2, label %5 [
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}

define i64 @tgt(i64 %0) {
  %2 = urem i64 %0, 4
  switch i64 %2, label %unreachable [
    i64 0, label %5
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

unreachable:                              ; preds = %1
  unreachable

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}
```

Alive2: https://alive2.llvm.org/ce/z/Y-PGXv

After transform to a lookup table, I believe `tgt` is better code.

The final instructions are as follows:

```asm
src:                                    # @src
        and     edi, 3
        lea     rax, [rdi - 1]
        cmp     rax, 2
        ja      .LBB0_1
        mov     rax, qword ptr [8*rdi + .Lswitch.table.src-8]
        ret
.LBB0_1:
        xor     eax, eax
        ret
tgt:                                    # @tgt
        and     edi, 3
        mov     rax, qword ptr [8*rdi + .Lswitch.table.tgt]
        ret
.Lswitch.table.src:
        .quad   1                               # 0x1
        .quad   1                               # 0x1
        .quad   2                               # 0x2

.Lswitch.table.tgt:
        .quad   0                               # 0x0
        .quad   1                               # 0x1
        .quad   1                               # 0x1
        .quad   2                               # 0x2
```

Godbolt: https://llvm.godbolt.org/z/borME8znd

Closes #73446.

(cherry picked from commit 7d81e072712f4e6a150561b5538ccebda289aa13)
2024-05-25 10:38:19 +08:00
Eli Friedman
f893dccbba
Replace uses of ConstantExpr::getCompare. (#91558)
Use ICmpInst::compare() where possible, ConstantFoldCompareInstOperands
in other places. This only changes places where the either the fold is
guaranteed to succeed, or the code doesn't use the resulting compare if
we fail to fold.
2024-05-09 16:50:01 -07:00
Harald van Dijk
a6171900a4
[RemoveDIs] Change remapDbgVariableRecord to remapDbgRecord (#91456)
We need to remap any DbgRecord, not just DbgVariableRecords.

This is the followup to #91447.

Co-authored-by: PietroGhg <pietro.ghiglio@codeplay.com>
2024-05-08 17:02:25 +01:00