848 Commits

Author SHA1 Message Date
Wang Yaduo
ee9aeb4e05
[NFC] Fix typos 'bicast' -> 'bitcast' (#180890)
Fix typos bicast -> bitcast. I found this while resolving a codegen
prepare optimization.
2026-02-15 18:47:34 +08:00
Mingjie Xu
28a0cfa946
Reland "[BasicBlockUtils] Fix dominator tree update for entry block in splitBlockBefore() (#178895)" (#179392)
https://github.com/llvm/llvm-project/pull/178895 caused a clang
crash (see https://lab.llvm.org/buildbot/#/builders/210/builds/8229),
reverted in 6d52d2683c2ceb9ab75810730c3ced2509c32bc5.

The crash is assertion `DT && "DT should be available to update
LoopInfo!"' failed.

ad8d5349d4/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp (L1106)

```
 #7 0x00007f5a380254e3 __assert_perror_fail (/usr/lib/libc.so.6+0x254e3)
 #8 0x0000563df5d8fde1 UpdateAnalysisInformation(llvm::BasicBlock*, llvm::BasicBlock*, llvm::ArrayRef<llvm::BasicBlock*>, llvm::DomTreeUpdater*, llvm::DominatorTree*, llvm::LoopInfo*, llvm::MemorySSAUpdater*, bool, bool&) BasicBlockUtils.cpp:0:0
 #9 0x0000563df5d8f3bb llvm::splitBlockBefore(llvm::BasicBlock*, llvm::ilist_iterator_w_bits<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void, true, llvm::BasicBlock>, false, false>, llvm::DomTreeUpdater*, llvm::LoopInfo*, llvm::MemorySSAUpdater*, llvm::Twine const&)
#10 0x0000563df5d8cb08 llvm::SplitEdge(llvm::BasicBlock*, llvm::BasicBlock*, llvm::DominatorTree*, llvm::LoopInfo*, llvm::MemorySSAUpdater*, llvm::Twine const&)
#11 0x0000563df4ff5b59 (anonymous namespace)::CodeGenPrepare::splitLargeGEPOffsets()::$_1::operator()(long, llvm::Value*, llvm::GetElementPtrInst*) const CodeGenPrepare.cpp:0:0
#12 0x0000563df4fc0ec8 (anonymous namespace)::CodeGenPrepare::_run(llvm::Function&) CodeGenPrepare.cpp:0:0
#13 0x0000563df4fbb36c (anonymous namespace)::CodeGenPrepareLegacyPass::runOnFunction(llvm::Function&) CodeGenPrepare.cpp:0:0
```

I think this happens when `splitLargeGEPOffsets()` gets the DominatorTree
with `DT.get()` after `DT.reset()` has already set it to nullptr
in
ad8d5349d4/llvm/lib/CodeGen/CodeGenPrepare.cpp (L660).
To fix this assertion failure, use `getDT()` for
`splitLargeGEPOffsets()` to build the DominatorTree if it is set to
nullptr by `DT.reset()`.

I don't have a RISC-V environment, so no reproducer. Checked that the
crash is fixed by rerunning buildbot with this patch
https://lab.llvm.org/buildbot/#/builders/210/builds/8248
2026-02-03 22:38:41 +08:00
Nikita Popov
fd1e37b653
[IR] Remove Before argument from splitBlock APIs (NFC) (#179195)
We never need to use this conditionally (and it doesn't really make
sense, as the behavior is substantially different). Force the use of
separate APIs instead of a boolean argument.
2026-02-02 10:50:58 +00:00
Jameson Nash
d10b2b566a
[NFCI] replace getValueType with new getGlobalSize query (#177186)
Returns uint64_t to simplify callers. The goal is to eventually replace
getValueType with this query, which should return the known minimum
reference-able size, as provided (instead of a Type) during create.
Additionally the common isSized query would be replaced with an
isExactKnownSize query to test if that size is an exact definition.
2026-01-22 13:55:53 -05:00
nataliakokoromyti
fa0071baab
[CodeGenPrepare] Fix infinite loop with same-type bitcasts (#176694)
OptimizeNoopCopyExpression was sinking same-type bitcasts (e.g. bitcast
i32 to i32) which would then be reintroduced by optimizePhiType, causing
an infinite loop.

Fix by adding a check (PhiTy == ConvertTy) in optimizePhiType to skip
the conversion when types are already identical.

Fixes #176688.
2026-01-22 09:29:08 +01:00
Rahul Joshi
26f962465e
[LLVM][CodeGen] Remove pass initialization calls from pass constructors (#173061)
- Remove pass initialization calls from pass constructors.
- For some passes, add the initialization to `initializeCodeGen` or
`initializeGlobalISel`.
- Remove redundant initializations from llc and X86 target for some
passes.
2026-01-21 08:44:51 -08:00
Jameson Nash
ba2bd3fbba
Use AllocaInst::getAllocationSize instead of manual size calculations (#176486)
Replace patterns that manually compute allocation sizes by multiplying
getTypeAllocSize(getAllocatedType()) by the array size with calls to the
getAllocationSize(DL) API, which handles this correctly and concisely,
returning nullopt for VLAs.

This fixes several places that were not accounting for array allocations
when computing sizes, simplifies code that was doing this manually, and
adds some explicit isFixed checks where implicit conversion was being used.
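
The size computation described above can be modeled without LLVM. This is a minimal sketch under stated assumptions: `AllocaModel` and `allocationSize` are hypothetical stand-ins, not the real `AllocaInst::getAllocationSize(DL)` API, and the overflow check is an addition of this sketch rather than something the commit message states.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for an alloca: an element size in bytes plus an
// element count that is only present when it is a compile-time constant.
struct AllocaModel {
  std::uint64_t ElemAllocSize;             // plays the role of getTypeAllocSize(...)
  std::optional<std::uint64_t> ArrayCount; // nullopt models a VLA
};

// Returns the total size in bytes, or nullopt when the count is unknown
// (VLA) or the product would not fit in 64 bits.
std::optional<std::uint64_t> allocationSize(const AllocaModel &A) {
  if (!A.ArrayCount)
    return std::nullopt;
  std::uint64_t N = *A.ArrayCount;
  if (N != 0 && A.ElemAllocSize > UINT64_MAX / N)
    return std::nullopt;
  return A.ElemAllocSize * N;
}
```

A caller that ignores the array count would report 4 bytes for a ten-element `i32` array; the model above reports 40 and refuses to answer for a VLA.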

This PR is because now that we have opaque pointers, I hate that some
AllocaInst still has type information being consumed by some passes
instead of just using the size, since passes rarely handle that type
information well or correctly. I hope this will grow into a sequence of
commits to slowly eliminate uses of getAllocatedType from AllocaInst.
And similarly later to remove type information from GlobalValue too (it
can be replaced with just dereferenceable bytes, similar to arguments).

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 09:55:52 -05:00
David Green
a4975a8089
[CGP][AArch64] Do not sink instructions that might read/write memory. (#176182)
The test case's call instruction was being sunk past the point where the
memory it accessed was valid. Add a check that CGP does not try to sink
instructions that might be invalid to move.

Fixes #176095
2026-01-18 14:18:25 +08:00
Antonio Frighetto
0456bcdd8d
[CGP] Refactor tail call eligibility checks in dupRetToEnableTailCallOpts (NFC)
Tail call eligibility and profitability checks have been combined
into a single helper to reduce code duplication.
2026-01-16 12:36:43 +01:00
Ramkumar Ramachandra
d69335bac9
[LLVM] Clean up code using [not_]equal_to (NFC) (#175824)
Use llvm::[not_]equal_to landed in d2a521750 ([ADT] Introduce
bind_{front,back}, [not_]equal_to, #175056) across LLVM for cleaner
code.
2026-01-13 21:19:39 +00:00
Nikita Popov
7f6afc499f
[CGP] Use getSigned() for scale during address sinking
The scale is a signed quantity.

This avoids an assertion failure with github.com/llvm/llvm-project/pull/171456.
2026-01-06 17:23:53 +01:00
Nikita Popov
8fd85ba9e6
[LLVM] Temporarily allow implicit truncation in some places
Split out from https://github.com/llvm/llvm-project/pull/171456.

This explicitly allows implicit truncation in a number of places,
prior to switching the default. This limits the scope of the
initial change.
2026-01-05 09:52:57 +01:00
Teja Alaghari
4e89e710d9
[CodeGenPrepare][NPM] Remove incorrect LoopAnalysis preservation in CodeGenPrepare (#172418)
CodeGenPrepare modifies and restructures loops & control flow. So, it
shouldn't preserve LoopAnalysis.

The test `llvm/test/CodeGen/AMDGPU/cf-loop-on-constant.ll` shows
CodeGenPrepare modifying loop structure, hence we cannot preserve
LoopAnalysis.
2025-12-19 11:08:31 +05:30
Hassnaa Hamdi
3d5d32c605
[CGP]: Optimize mul.overflow. (#148343)
- Detect cases where LHS & RHS values will not cause overflow
(when the high halves are zero).
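
Why zero high halves rule out overflow can be shown with a small sketch for the 64-bit unsigned case. The function name `umulOverflow64` and the division-based general path are assumptions made for illustration; the commit message does not describe the code CGP actually emits.

```cpp
#include <cstdint>
#include <utility>

// Returns {low 64 bits of A * B, overflow flag}.
std::pair<std::uint64_t, bool> umulOverflow64(std::uint64_t A, std::uint64_t B) {
  // Fast path: both high 32-bit halves are zero, so each operand is below
  // 2^32 and the product is below 2^64. Overflow is impossible.
  if ((A >> 32) == 0 && (B >> 32) == 0)
    return {A * B, false};
  // General path: the product overflows exactly when A > UINT64_MAX / B.
  bool Overflow = B != 0 && A > UINT64_MAX / B;
  return {A * B, Overflow};
}
```

The largest input pair the fast path accepts is `0xFFFFFFFF * 0xFFFFFFFF`, whose product `0xFFFFFFFE00000001` still fits in 64 bits.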
2025-11-18 13:15:47 +00:00
Simon Pilgrim
5b20453062
[CodeGenPrepare] sinkCmpExpression - don't sink larger than legal integer comparisons (#166778)
A generic alternative to #166564 - make the assumption that expanding
integer comparisons will be expensive if they are larger than the largest
legal type so avoid sinking if they are also used in the current BB + any phis.

Fixes #166534
2025-11-10 14:39:43 +00:00
Kazu Hirata
b82bde695e
[Analysis, CodeGen] Use "= default" (NFC) (#166024)
Identified with modernize-use-equals-default.
2025-11-01 23:20:11 -07:00
Yingwei Zheng
59e601a3d5
[CodeGenPrepare] Don't simplify incomplete expression tree in AddrModeCombine (#164628)
Since new select/phi instructions may construct loops, the expression
tree to be simplified may still be incomplete (i.e., it may contain
select with dummy values or phi without incoming values). This patch
removes the call to simplifyInstruction for now, as it doesn't break
existing tests.

Original PR: https://reviews.llvm.org/D36073
Fix the crash reported in
https://github.com/llvm/llvm-project/pull/163453#issuecomment-3429922732.
2025-10-25 16:47:32 +08:00
Kazu Hirata
f2306b6304
[llvm] Replace LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]] (NFC) (#163507)
This patch replaces LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]].  Note
that this patch adjusts the placement of [[maybe_unused]] to comply
with the C++17 language.
2025-10-15 06:54:14 -07:00
Vladimir Radosavljevic
be7f85168d
[CGP] Fix missing sign extension for base offset in optimizeMemoryInst (#161377)
If we have integers larger than 64-bit we need to explicitly sign extend
them, otherwise we will get wrong zero extended values.
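
The difference between the two extensions can be sketched on a 128-bit value split into two 64-bit halves. `Wide128`, `zeroExtend` and `signExtend` are hypothetical names for illustration only; the real fix operates on LLVM's own integer types.

```cpp
#include <cstdint>

// Hypothetical 128-bit two's-complement value stored as two 64-bit halves.
struct Wide128 {
  std::uint64_t Lo;
  std::uint64_t Hi;
};

// Zero extension: the upper half is always 0, which is wrong for a
// negative offset.
Wide128 zeroExtend(std::int64_t Offset) {
  return {static_cast<std::uint64_t>(Offset), 0};
}

// Sign extension: the upper half replicates the sign bit of the offset.
Wide128 signExtend(std::int64_t Offset) {
  std::uint64_t Hi = Offset < 0 ? ~std::uint64_t{0} : 0;
  return {static_cast<std::uint64_t>(Offset), Hi};
}
```

For an offset of -8 both functions agree on the low half, but only `signExtend` produces an all-ones high half, so only it still represents -8 at 128 bits.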
2025-10-10 10:52:52 +00:00
AZero13
09bdbfd9d1
[CodeGenPrepare] Bail out of usubo creation if sub's parent is not the same as the comparison (#160358)
We match uadd's behavior here.

Codegen comparison: https://godbolt.org/z/x8j4EhGno
2025-09-25 22:55:01 +09:00
Jeffrey Byrnes
d8a4c61fe4
[CodeGenPrepare] Consider target memory intrinsics as memory use (#159638)
When deciding to sink address instructions into their uses, we check if
it is profitable to do so. The profitability check is based on the types
of uses of this address instruction -- if there are users which are not
memory instructions, then do not fold.

However, this profitability check wasn't considering target intrinsics,
which may be loads / stores.

This adds some logic to handle target memory intrinsics.
2025-09-19 14:18:21 -07:00
Mingming Liu
8b3c91c4fb
Re-apply "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159161)
This is a reland of https://github.com/llvm/llvm-project/pull/158460

Test failures are gone once I undo the changes in codegenprepare.
2025-09-16 20:33:29 +00:00
Mingming Liu
9277bcd1ab
Revert "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159159)
Reverts llvm/llvm-project#158460 due to buildbot failures
2025-09-16 12:51:54 -07:00
Mingming Liu
027bccc469
[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed. (#158460)
Before this change, `setSectionPrefix` overwrites existing section
prefix with new one unconditionally.

After this change, `setSectionPrefix` checks for equivalences, updates
conditionally and returns whether an update happens.
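
The before/after semantics can be sketched with a small model. `GlobalModel` is a hypothetical stand-in for a global object, and treating an empty string as "no prefix" is an assumption of this sketch, not something the commit message states.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a global object with an optional section prefix.
class GlobalModel {
  std::optional<std::string> Prefix;

public:
  // Updates the prefix only when the new value differs from the current
  // one, and returns true exactly when an update happened.
  bool setSectionPrefix(const std::string &NewPrefix) {
    std::optional<std::string> New;
    if (!NewPrefix.empty())
      New = NewPrefix;
    if (Prefix == New)
      return false;
    Prefix = std::move(New);
    return true;
  }

  const std::optional<std::string> &getSectionPrefix() const { return Prefix; }
};
```

A caller can then fold the return value into its own "changed" flag instead of assuming every call modified the global.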

Update the existing callers to make use of the return value. [PR
155337](https://github.com/llvm/llvm-project/pull/155337/files#diff-cc0c67ac89807f4453f0cfea9164944a4650cd6873a468a0f907e7158818eae9)
is a motivating use case where the 'update' semantics are needed.
2025-09-16 12:01:21 -07:00
Nikita Popov
3f757a39f2
[CodeGen] Remove ExpandInlineAsm hook (#156617)
This hook replaces inline asm with LLVM intrinsics. It was intended to
match inline assembly implementations of bswap in libc headers and
replace them with more optimizable implementations.

At this point, it has outlived its usefulness (see
https://github.com/llvm/llvm-project/issues/156571#issuecomment-3247638412),
as libc implementations no longer use inline assembly for this purpose.

Additionally, it breaks the "black box" property of inline assembly,
which some languages like Rust would like to guarantee.

Fixes https://github.com/llvm/llvm-project/issues/156571.
2025-09-04 09:28:11 +02:00
Kazu Hirata
07eb7b7692
[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>.  Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:

  template <typename PointeeType, unsigned N>
  class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {};

We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
2025-08-18 07:01:29 -07:00
Paul Walker
94d374ab6c
[LLVM][CGP] Allow finer control for sinking compares. (#151366)
Compare sinking is selectable based on the result of
hasMultipleConditionRegisters. This function is too coarse grained by
not taking into account the differences between scalar and vector
compares. This PR extends the interface to take an EVT to allow finer
control.
    
The new interface is used by AArch64 to disable sinking of scalable
vector compares, but with isProfitableToSinkOperands updated to maintain
the cases that are specifically tested.
2025-08-05 11:43:41 +01:00
Phoebe Wang
ebf96f9316
[X86][APX] Do optimizeMemoryInst for v1X masked load/store (#151331)
Fix redundant LEA: https://godbolt.org/z/34xEYE818
2025-07-31 11:52:39 +08:00
Yingwei Zheng
2d0ca09305
[CodeGenPrepare] Make sure that AddOffset is also a loop invariant (#150625)
Closes https://github.com/llvm/llvm-project/issues/150611.
2025-07-26 00:23:56 +08:00
Jeremy Morse
c9ceb9b75f
[DebugInfo] Remove intrinsic-flavours of findDbgUsers (#149816)
This is one of the final remaining debug-intrinsic specific codepaths
out there, and pieces of cross-LLVM infrastructure to do with debug
intrinsics.
2025-07-21 17:49:25 +01:00
Jeremy Morse
c9d8b68676
[DebugInfo] Suppress lots of users of DbgValueInst (#149476)
This is another prune of dead code -- we never generate debug intrinsics
nowadays, therefore there's no need for these codepaths to run.

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2025-07-18 11:31:52 +01:00
Jeremy Morse
5b8c15c6e7
[DebugInfo] Remove getPrevNonDebugInstruction (#148859)
With the advent of intrinsic-less debug-info, we no longer need to
scatter calls to getPrevNonDebugInstruction around the codebase. Remove
most of them -- there are one or two that have the "SkipPseudoOp" flag
turned on, however they don't seem to be in positions where skipping
anything would be reasonable.
2025-07-16 11:41:32 +01:00
Jeremy Morse
57a5f9c47e
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this
skipping. Hooray!
2025-07-15 15:34:10 +01:00
Evgenii Kudriashov
5ffdd9480d
[CodeGenPrepare] Filter out unrecreatable addresses from memory optimization (#143566)
Follow up on #139303
2025-06-28 23:30:03 +02:00
Jeremy Morse
9eb0020555
[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389)
Seeing how we can't generate any debug intrinsics any more: delete a
variety of codepaths where they're handled. For the most part these are
plain deletions, in others I've tweaked comments to remain coherent, or
added a type to (what was) type-generic-lambdas.

This isn't all the DbgInfoIntrinsic call sites but it's most of the
simple scenarios.

Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-17 15:55:14 +01:00
Jeremy Morse
97ac6483aa
[DebugInfo][RemoveDIs] Delete debug-info-format flag (#143746)
This flag was used to let us incrementally introduce debug records
into LLVM, however everything is now using records. It serves no
purpose now, so delete it.
2025-06-12 11:51:58 +01:00
Florian Hahn
dde30a4731
[CGP] Bail out if (Base|Scaled)Reg does not dominate insert point. (#142949)
(Base|Scaled)Reg may not dominate the chosen insert point, if there are
multiple uses of the address. Bail out if that's the case, otherwise we
will generate invalid IR.

In some cases, we could probably adjust the insert point or hoist the
(Base|Scaled)Reg.

Fixes https://github.com/llvm/llvm-project/issues/142830.

PR: https://github.com/llvm/llvm-project/pull/142949
2025-06-06 12:38:30 +01:00
mikael-nilsson-arm
09967917e7
[CodeGenPrepare] Fix signed overflow (#141487)
The signed addition could overflow which is undefined behavior, now the
code checks for it.
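
One common way to check a signed addition before performing it is sketched below. `checkedAdd` is a hypothetical helper for illustration; the commit message does not say which form of check the patch uses.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Returns A + B, or nullopt if the mathematical sum does not fit in int64_t.
// The comparisons are arranged so that no intermediate step can overflow.
std::optional<std::int64_t> checkedAdd(std::int64_t A, std::int64_t B) {
  constexpr std::int64_t Max = std::numeric_limits<std::int64_t>::max();
  constexpr std::int64_t Min = std::numeric_limits<std::int64_t>::min();
  if (B > 0 && A > Max - B)
    return std::nullopt; // would exceed the maximum
  if (B < 0 && A < Min - B)
    return std::nullopt; // would fall below the minimum
  return A + B;
}
```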
2025-06-03 09:27:25 +02:00
Tim Gymnich
571a24c314
Reland [llvm] add GenericFloatingPointPredicateUtils #140254 (#141065)
#140254 was previously missing 2 files in the bazel build config.
2025-05-22 17:17:02 +02:00
Kewen12
c47a5fbb22
Revert "[llvm] add GenericFloatingPointPredicateUtils (#140254)" (#140968)
This reverts commit d00d74bb2564103ae3cb5ac6b6ffecf7e1cc2238. 

The PR breaks our buildbots and blocks downstream merge.
2025-05-21 19:31:14 -04:00
Tim Gymnich
d00d74bb25
[llvm] add GenericFloatingPointPredicateUtils (#140254)
add `GenericFloatingPointPredicateUtils` in order to generalize
effects of floating point comparisons on `KnownFPClass` for both IR and
MIR.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-05-21 23:45:31 +02:00
weiguozhi
59c6d70ed8
[CodeGenPrepare] Make sure instruction get from SunkAddrs is before MemoryInst (#139303)
Function optimizeBlock may optimize a block multiple times. In the first
iteration of the loop, MemoryInst1 may generate a sunk instruction and
store it into SunkAddrs. In the second iteration of the loop, MemoryInst2
may use the same address and can then reuse the sunk instruction stored in
SunkAddrs, but MemoryInst2 may come before MemoryInst1 and the
corresponding sunk instruction. To avoid a use-before-def error, we need
to find an appropriate insert position for the sunk instruction.

Fixes #138208.
2025-05-15 09:27:25 -07:00
Matt Arsenault
9383fb23e1
Reapply "IR: Remove uselist for constantdata (#137313)" (#138961)
Reapply "IR: Remove uselist for constantdata (#137313)"

This reverts commit 5936c02c8b9c6d1476f7830517781ce8b6e26e75.

Fix checking uselists of constants in assume bundle queries
2025-05-08 08:00:09 +02:00
Kirill Stoimenov
5936c02c8b
Revert "IR: Remove uselist for constantdata (#137313)"
Possibly breaks the build: https://lab.llvm.org/buildbot/#/builders/24/builds/8119

This reverts commit 87f312aad6ede636cd2de5d18f3058bf2caf5651.
2025-05-07 00:07:55 +00:00
Matt Arsenault
87f312aad6
IR: Remove uselist for constantdata (#137313)
This is a resurrected version of the patch attached to this RFC:

https://discourse.llvm.org/t/rfc-constantdata-should-not-have-use-lists/42606

In this adaptation, there are a few differences. In the original patch, the Use's
use list was replaced with an unsigned* to the reference count in the value. This
version leaves them as null and leaves the ref counting only in Value.

Remove use-lists from instances of ConstantData (which are shared
across modules and have no operands).

To continue supporting most of the use-list API, store a ref-count in
place of the use-list; this is for API like Value::use_empty and
Value::hasNUses.  Operations that actually need the use-list -- like
Value::use_begin -- will assert.

This change has three benefits:

 1. The compiler output cannot in any way depend on the use-list order
    of instances of ConstantData.

 2. There's no use-list traffic when adding and removing simple
    constants from operand lists (although there is ref-count traffic;
    YMMV).

 3. It's cheaper to serialize use-lists (since we're no longer
    serializing the use-list order of things like i32 0).

The downside is that you can't look at all the users of ConstantData,
but traversals of users of i32 0 are already ill-advised.

Possible follow-ups:
  - Track if an instance of a ConstantVector/ConstantArray/etc. is known
    to have all ConstantData arguments, and drop the use-lists to
    ref-counts in those cases.  Callers need to check Value::hasUseList
    before iterating through the use-list.
  - Remove even the ref-counts.  I'm not sure they have any benefit
    besides minimizing the scope of this commit, and maintaining the
    counts is not free.

Fixes #58629

Co-authored-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>
2025-05-06 17:20:37 +02:00
Sergei Barannikov
becd418626
[CGP] Despeculate ctlz/cttz with "illegal" integer types (#137197)
The code below the removed check looks generic enough to support
arbitrary integer widths. This change helps 32-bit targets avoid
expensive expansion/libcalls in the case of zero input.
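
The shape of a despeculated count-trailing-zeros can be sketched in plain C++ for a 64-bit input. Both function names are hypothetical, and the loop merely stands in for whatever expansion or libcall a target would otherwise use.

```cpp
#include <cassert>
#include <cstdint>

// Counts trailing zeros of a nonzero value. This stands in for the
// expensive part that the zero check lets us skip.
unsigned countTrailingZerosNonZero(std::uint64_t X) {
  assert(X != 0 && "caller must handle zero");
  unsigned N = 0;
  while ((X & 1) == 0) {
    X >>= 1;
    ++N;
  }
  return N;
}

// Despeculated form: test for zero first and return the bit width
// directly, so the expensive path runs only for nonzero input.
unsigned cttz64(std::uint64_t X) {
  if (X == 0)
    return 64;
  return countTrailingZerosNonZero(X);
}
```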

Pull Request: https://github.com/llvm/llvm-project/pull/137197
2025-04-29 22:33:40 +03:00
Sergei Barannikov
5080a0251f
[CodeGenPrepare] Unfold slow ctpop when used in power-of-two test (#102731)
DAG combiner already does this transformation, but in some cases it does
not have a chance because either CodeGenPrepare or SelectionDAGBuilder
move icmp to a different basic block.
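
The equivalence behind unfolding a power-of-two test can be sketched as follows. The bit trick shown is a well-known identity chosen for illustration; the source does not state the exact instruction sequence the patch produces, and both function names are hypothetical.

```cpp
#include <bitset>
#include <cstdint>

// Power-of-two test written with a population count.
bool isPow2ViaPopcount(std::uint32_t X) {
  return std::bitset<32>(X).count() == 1;
}

// The same test without a population count: a power of two has exactly
// one set bit, so clearing the lowest set bit must leave zero.
bool isPow2Unfolded(std::uint32_t X) {
  return X != 0 && (X & (X - 1)) == 0;
}
```

On a target where ctpop is slow, the second form needs only a subtract, an and, and compares.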

https://alive2.llvm.org/ce/z/ARzh99

Fixes #94829

Pull Request: https://github.com/llvm/llvm-project/pull/102731
2025-04-23 08:54:10 +03:00
Matt Arsenault
430b0c4434
CodeGenPrepare: Check use_empty instead of getNumUses == 0 (#136334)
2025-04-18 21:13:21 +02:00
Kazu Hirata
58774f1b1f
[CodeGen] Construct SmallVector with iterator ranges (NFC) (#136258)
2025-04-18 10:26:48 -07:00
Ryan Buchner
fa2a6d68c6
[CodeGenPrepare][RISCV] Combine (X ^ Y) and (X == Y) where appropriate (#130922)
Fixes #130510.

In RISCV, modify the folding of (X ^ Y == 0) -> (X == Y) to account for
cases where the (X ^ Y) will be re-used.

If a constant is being used for the XOR before a branch, ensure that it
is small enough to fit within a 12-bit immediate field. Otherwise, the
equality check is more efficient than the check against 0, see the
following:
```
# %bb.0:
        lui     a1, 5
        addiw   a1, a1, 1365
        xor     a0, a0, a1
        beqz    a0, .LBB0_2
# %bb.1: 
        ret
.LBB0_2: 
```

```
# %bb.0:
        lui     a1, 5
        addiw   a1, a1, 1365
        beq    a0, a1, .LBB0_2
# %bb.1: 
        xor     a0, a0, a1
        ret
.LBB0_2: 
```
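
The 12-bit immediate condition mentioned above can be sketched in plain C++. This assumes the usual signed 12-bit range of RISC-V I-type immediates; `fitsInSigned12` and `MaterializedConst` are hypothetical names, not code from the patch.

```cpp
#include <cstdint>

// Assumption: a signed 12-bit immediate covers the range [-2048, 2047].
bool fitsInSigned12(std::int64_t V) { return V >= -2048 && V <= 2047; }

// The constant built by `lui a1, 5` followed by `addiw a1, a1, 1365`
// in the listings above: (5 << 12) + 1365.
constexpr std::int64_t MaterializedConst = (5 << 12) + 1365;
```

Because this constant does not fit in 12 bits, it has to be materialized in a register either way, which is why the direct `beq` against that register is the cheaper choice in the listings.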

Similarly, if the XOR is between 1 and a size one integer, we should
still fold away the XOR since that comparison can be optimized as a
comparison against 0.
```
# %bb.0:
        slt a0, a0, a1
        xor  a0, a0, 1
        beqz    a0, .LBB0_2
# %bb.1: 
        ret
.LBB0_2: 
```

```
# %bb.0:
        slt a0, a0, a1
        bnez    a0, .LBB0_2
# %bb.1: 
        xor  a0, a0, 1
        ret
.LBB0_2: 
```

One question about my code is that I used a hard-coded value for the
width of a RISCV ALU immediate. Do you know of a way that I can gather
this from the `context`? I was unable to devise one.
2025-04-02 09:56:09 -07:00