803 Commits

Author SHA1 Message Date
Sergei Barannikov
becd418626
[CGP] Despeculate ctlz/cttz with "illegal" integer types (#137197)
The code below the removed check looks generic enough to support
arbitrary integer widths. This change helps 32-bit targets avoid
expensive expansion/libcalls in the case of zero input.

Pull Request: https://github.com/llvm/llvm-project/pull/137197
2025-04-29 22:33:40 +03:00
Sergei Barannikov
5080a0251f
[CodeGenPrepare] Unfold slow ctpop when used in power-of-two test (#102731)
DAG combiner already does this transformation, but in some cases it does
not have a chance because either CodeGenPrepare or SelectionDAGBuilder
move icmp to a different basic block.

https://alive2.llvm.org/ce/z/ARzh99

Fixes #94829

Pull Request: https://github.com/llvm/llvm-project/pull/102731
2025-04-23 08:54:10 +03:00
Matt Arsenault
430b0c4434
CodeGenPrepare: Check use_empty instead of getNumUses == 0 (#136334) 2025-04-18 21:13:21 +02:00
Kazu Hirata
58774f1b1f
[CodeGen] Construct SmallVector with iterator ranges (NFC) (#136258) 2025-04-18 10:26:48 -07:00
Ryan Buchner
fa2a6d68c6
[CodeGenPrepare][RISCV] Combine (X ^ Y) and (X == Y) where appropriate (#130922)
Fixes #130510.

In RISCV, modify the folding of (X ^ Y == 0) -> (X == Y) to account for
cases where the (X ^ Y) will be re-used.

If a constant is being used for the XOR before a branch, ensure that it
is small enough to fit within a 12-bit immediate field. Otherwise, the
equality check is more efficient than the check against 0, see the
following:
```
# %bb.0:
        lui     a1, 5
        addiw   a1, a1, 1365
        xor     a0, a0, a1
        beqz    a0, .LBB0_2
# %bb.1: 
        ret
.LBB0_2: 
```

```
# %bb.0:
        lui     a1, 5
        addiw   a1, a1, 1365
        beq    a0, a1, .LBB0_2
# %bb.1: 
        xor     a0, a0, a1
        ret
.LBB0_2: 
```

Similarly, if the XOR is between 1 and a size one integer, we should
still fold away the XOR since that comparison can be optimized as a
comparison against 0.
```
# %bb.0:
        slt a0, a0, a1
        xor  a0, a0, 1
        beqz    a0, .LBB0_2
# %bb.1: 
        ret
.LBB0_2: 
```

```
# %bb.0:
        slt a0, a0, a1
        bnez    a0, .LBB0_2
# %bb.1: 
        xor  a0, a0, 1
        ret
.LBB0_2: 
```

One question about my code is that I used a hard-coded value for the
width of a RISCV ALU immediate. Do you know of a way that I can gather
this from the `context`, I was unable to devise one.
2025-04-02 09:56:09 -07:00
Kazu Hirata
673f4705a8
[llvm] Use *Set::insert_range (NFC) (#133353)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(E.first);

down to:

  Set.insert_range(llvm::make_first_range(Range));

In some cases, we can further fold that into the set declaration.
2025-03-27 20:44:20 -07:00
Kazu Hirata
1019457891
[CodeGen] Use *Set::insert_range (NFC) (#132651)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(E);

down to:

  Set.insert_range(Range);
2025-03-23 21:20:44 -07:00
Kazu Hirata
41b76119ec
[llvm] Use range constructors for *Set (NFC) (#132636) 2025-03-23 15:50:34 -07:00
Kazu Hirata
1b189cab5e
[llvm] Use *Set::insert_range (NFC) (#132509)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch uses insert_range in
conjunction with llvm::{predecessors,successors} and
MachineBasicBlock::{predecessors,successors}.
2025-03-22 08:07:33 -07:00
Kazu Hirata
599005686a
[llvm] Use *Set::insert_range (NFC) (#132325)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin like succ_begin for now.
2025-03-20 22:24:06 -07:00
Kazu Hirata
b2b267ea7a
[CodeGen] Avoid repeated hash lookups (NFC) (#130543) 2025-03-09 23:15:32 -07:00
Yingwei Zheng
3c6aa04cf4
[CodeGenPrepare] Replace deleted ext instr with the promoted value. (#71058)
This PR replaces the deleted ext with the promoted value in `AddrMode`.
Fixes #70938.
2025-01-30 08:58:23 +08:00
Jeremy Morse
81d18ad864
[NFC][DebugInfo] Make some block-start-position methods return iterators (#124287)
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're
replacing methods that use Instruction*'s as positions with iterators. A
number of these (such as getFirstNonPHIOrDbg) are sufficiently
infrequently used that we can just replace the pointer-returning version
with an iterator-returning version, hopefully without much/any
disruption.

Thus this patch has getFirstNonPHIOrDbg and
getFirstNonPHIOrDbgOrLifetime return an iterator, and updates all
call-sites. There are no concerns about the iterators returned being
converted to Instruction*'s and losing the debug-info bit: because the
methods skip debug intrinsics, the iterator head bit is always false
anyway.
2025-01-27 16:27:54 +00:00
Jeremy Morse
e14962a39c
[NFC][DebugInfo] Use iterators for instruction insertion in more places (#124291)
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're
replacing methods that use Instruction*'s as positions with iterators.
This patch changes some more complex call-sites, those crossing file
boundaries and where I've had to perform some minor rewrites.
2025-01-27 15:25:17 +00:00
Jeremy Morse
8e70273509
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.

This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
2025-01-24 10:53:11 +00:00
Pedro Lobo
c23f2417dc
[CodeGenPrepare] Replace undef use with poison [NFC] (#123111)
When generating a constant vector, if `UseSplat` is false, the indices
different from the index of the extract can be filled with `poison`
instead of `undef`.
2025-01-16 08:17:55 +00:00
Kazu Hirata
ebb58567e7
[CodeGen] Avoid repeated hash lookups (NFC) (#123016) 2025-01-15 07:57:04 -08:00
Ryan Mansfield
67efbd0bf1
[LLVM] Fix various cl::desc typos and whitespace issues (NFC) (#121955) 2025-01-08 11:07:23 +01:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since we no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Yingwei Zheng
6568ceb9fa
[CodeGenPrepare] Drop nsw flags in optimizeLoadExt (#118180)
Alive2: https://alive2.llvm.org/ce/z/pMcD7q
Closes https://github.com/llvm/llvm-project/issues/118172.
2024-12-01 11:25:31 +08:00
Kazu Hirata
735ab61ac8
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
2024-11-12 23:15:06 -08:00
goldsteinn
1e072ae289
[CGP] [CodeGenPrepare] Folding urem with loop invariant value plus offset (#104724)
This extends the existing fold:

```
for(i = Start; i < End; ++i)
   Rem = (i nuw+- IncrLoopInvariant) u% RemAmtLoopInvariant;
```
 ->
```
Rem = (Start nuw+- IncrLoopInvariant) % RemAmtLoopInvariant;
for(i = Start; i < End; ++i, ++rem)
   Rem = rem == RemAmtLoopInvariant ? 0 : Rem;
```

To work with a non-zero `IncrLoopInvariant`.

This is a common usage in cases such as:

```
for(i = 0; i < N; ++i)
    if ((i + 1) % X) == 0)
        do_something_occasionally_but_not_first_iter();
```

Alive2 w/ i4/unrolled 6x (needs to be ran locally due to timeout):
https://alive2.llvm.org/ce/z/6tgyN3

Exhaust proof over all uint8_t combinations in C++:
https://godbolt.org/z/WYa561388
2024-10-31 09:14:33 -05:00
Ellis Hoag
6ab26eab4f
Check hasOptSize() in shouldOptimizeForSize() (#112626) 2024-10-28 09:45:03 -07:00
Nuno Lopes
509af087cc replace 2 placeholder uses of undef with poison [NFC] 2024-10-24 09:01:25 +01:00
Yingwei Zheng
637e81f8ad
Reland [CodeGenPrepare] Convert ctpop(X) ==/!= 1 into ctpop(X) u</u> 2/1 (#111284) (#111998)
Relands #111284. Test failure with stage2 build has been fixed by
https://github.com/llvm/llvm-project/pull/111946.


Some targets have better codegen for `ctpop(X) u< 2` than `ctpop(X) ==
1`. After https://github.com/llvm/llvm-project/pull/100899, we set the
range of ctpop's return value to indicate the argument/result is
non-zero.

This patch converts `ctpop(X) ==/!= 1` into `ctpop(X) u</u> 2/1` in CGP
to fix https://github.com/llvm/llvm-project/issues/95255.
2024-10-15 08:17:50 +08:00
Yingwei Zheng
ec3e0a5900
Revert "[CodeGenPrepare] Convert ctpop(X) ==/!= 1 into ctpop(X) u</u> 2/1" (#111932)
Reverts llvm/llvm-project#111284 to fix clang stage2 builds.
Investigating...

Failed buildbots:
https://lab.llvm.org/buildbot/#/builders/76/builds/3576
https://lab.llvm.org/buildbot/#/builders/168/builds/4308
https://lab.llvm.org/buildbot/#/builders/127/builds/1087
2024-10-11 11:08:07 +08:00
Yingwei Zheng
e3894f58e1
[CodeGenPrepare] Convert ctpop(X) ==/!= 1 into ctpop(X) u</u> 2/1 (#111284)
Some targets have better codegen for `ctpop(X) u< 2` than `ctpop(X) ==
1`. After https://github.com/llvm/llvm-project/pull/100899, we set the
range of ctpop's return value to indicate the argument/result is
non-zero.

This patch converts `ctpop(X) ==/!= 1` into `ctpop(X) u</u> 2/1` in CGP
to fix https://github.com/llvm/llvm-project/issues/95255.
2024-10-11 09:08:38 +08:00
Jeffrey Byrnes
853c43d04a
[TTI] NFC: Port TLI.shouldSinkOperands to TTI (#110564)
Porting to TTI provides direct access to the instruction cost model,
which can enable instruction cost based sinking without introducing code
duplication.
2024-10-09 14:30:09 -07:00
Antonio Frighetto
e4e0dfb0c2 [CGP] Undo constant propagation of pointers across calls
It may be profitable to revert SCCP propagation of C++ static values,
if such constants are pointers, in order to avoid redundant pointer
computation, since the method returning the constant is non-removable.
2024-09-02 09:33:23 +02:00
Stephen Tozer
3d08ade7bd
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for this pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:

https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850

This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-29 17:53:32 +01:00
Freddy Ye
3a5c578966
[MachineLoopInfo] Fix getLoopID to handle multi latches. (#106195)
This patch also fixed `CodegenPrepare` to preserve loop metadata when
merging blocks.

This fixes issue #102632
2024-08-29 08:44:22 +08:00
Noah Goldstein
e4c67ba67e Recommit "[CodeGenPrepare] Folding urem with loop invariant value"
Was missing remainder on `Start` value.

Also changed logic as as nikic suggested (getting loop from `PN`
instead of `Rem`). The prior impl increased the complexity of the code
and made debugging it more difficult.

Closes #104877
2024-08-20 09:17:49 -07:00
Noah Goldstein
731ae694a3 Revert "[CodeGenPrepare] Folding urem with loop invariant value"
This reverts commit c64ce8bf283120fd145a57d0e61f9697f719139d.

Seems to be causing stage2 failures on buildbots. Reverting while I
investigate.
2024-08-18 20:36:35 -07:00
Noah Goldstein
c64ce8bf28 [CodeGenPrepare] Folding urem with loop invariant value
```
for(i = Start; i < End; ++i)
   Rem = (i nuw+ IncrLoopInvariant) u% RemAmtLoopInvariant;
```
 ->
```
Rem = (Start nuw+ IncrLoopInvariant) % RemAmtLoopInvariant;
for(i = Start; i < End; ++i, ++rem)
   Rem = rem == RemAmtLoopInvariant ? 0 : Rem;
```

In its current state, only if `IncrLoopInvariant` and `Start` both
being zero.

Alive2 seemed unable to prove this (see:
https://alive2.llvm.org/ce/z/ATGDp3 which is clearly wrong but still
checks out...) so wrote an exhaustive test here:
https://godbolt.org/z/WYa561388

Closes #96625
2024-08-18 15:58:24 -07:00
Daniil Fukalov
0da2ba811a
[NFC] Cleanup in ADT and Analysis headers. (#104484)
Remove unused directly includes and forward declarations in ADT and
Analysis headers.
2024-08-17 13:11:18 +02:00
Kazu Hirata
dfa13c010f
[CodeGen] Use a range-based for loop (NFC) (#104408) 2024-08-15 17:58:31 -07:00
Nikita Popov
3b27fce960 [CGP] Use getAllOnesValue()
Split out from https://github.com/llvm/llvm-project/pull/80309.
2024-08-12 16:37:42 +02:00
Momchil Velikov
a497e987e5 Reapply "[AArch64] Lower extending sitofp using tbl (#92528)"
This re-commits d1a4f0c9fb559eb4c2fb56112e56343bcd333edc after
a issue was fixed in f92bfca9fc217cad9026598ef6755e711c0be070
("[AArch64] All bits of an exact right shift are demanded (#97448)").
2024-07-08 11:55:29 +01:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Momchil Velikov
d058b51604 Revert "[AArch64] Lower extending sitofp using tbl (#92528)"
This reverts commit d1a4f0c9fb559eb4c2fb56112e56343bcd333edc.

There are reports about test failures with Eigen and JAX.
2024-06-26 20:00:22 +01:00
Stephen Tozer
d75f9dd1d2 Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Noah Goldstein
ef16f7ac1b [CodeGenPrepare] Add missing static decl on matchIncrement(); NFC 2024-06-21 16:41:03 +08:00
Fangrui Song
8ea31db272 [CodeGenPrepare] Use MapVector to stabilize iteration order
DenseMap iteration order is not guaranteed to be deterministic.

Without the change,
llvm/test/Transforms/CodeGenPrepare/X86/statepoint-relocate.ll would
fail when `combineHashValue` changes (#95970).

Fixes: dba7329ebb0dbe1fabb3faaedfd31da3b8bd611d
2024-06-18 17:19:51 -07:00
Momchil Velikov
d1a4f0c9fb
[AArch64] Lower extending sitofp using tbl (#92528)
In a similar manner as in https://reviews.llvm.org/D133494
use `TBL` to place bytes in the *upper* part of `i32` elements
and then convert to float using fixed-point `scvtf`, i.e.

    scvtf Vd.4s, Vn.4s, #24
2024-06-17 17:57:07 +01:00
Kazu Hirata
7c6d0d26b1
[llvm] Use llvm::unique (NFC) (#95628) 2024-06-14 22:49:36 -07:00
Paul Kirth
294f3ce5dd
Reapply "[llvm][IR] Extend BranchWeightMetadata to track provenance o… (#95281)
…f weights" #95136

Reverts #95060, and relands #86609, with the unintended code generation
changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g. from
profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
2024-06-12 12:52:28 -07:00
Paul Kirth
607afa0b63
Revert "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" (#95060)
Reverts llvm/llvm-project#86609

This change causes compile-time regressions for stage2 builds
(https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u).
It also introduced unintended changes to `.text` which should be
addressed before relanding.
2024-06-11 08:06:06 +02:00
Paul Kirth
c5978f1eb5
[llvm][IR] Extend BranchWeightMetadata to track provenance of weights (#86609)
This patch implements the changes to LLVM IR discussed in

https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof metadata nodes for
branch weights, which can be used to distinguish weights added from
`llvm.expect*` intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
2024-06-10 11:27:21 -07:00