1154 Commits

Author SHA1 Message Date
Vladimir Radosavljevic
57d1fbf62c
[InstCombine] Limit (icmp eq/ne (and (add A, Addend), Msk), C) fold to one use of and (#172858)
If the and has multiple uses, the fold can increase the instruction
count.
2026-02-07 03:09:27 +08:00
Andreas Jonson
faa4b97b10
[InstCombine] fold icmp ne (and X, 1), 0 --> trunc X to i1 (#178977)
Remove the vector check so that this fold is always performed.

proof: https://alive2.llvm.org/ce/z/oabD6J
closes #172888
2026-02-03 19:14:27 +01:00
Nathan Corbyn
efe75626cd
[InstCombine] Add combines for unsigned comparison of absolute value to constant (#176148)
This patch implements the following two peephole optimisations:
1. ``` abs(X) u> K --> K >= 0 ? `X + K u> 2 * K` : `false` ```;
2. If `abs(INT_MIN)` is `poison`, ```abs(X) u< K --> K >= 1 ? `X + (K -
1) u<= 2 * (K - 1)` : K != 0```.

See the following Alive2 proofs:
[1](https://alive2.llvm.org/ce/z/J2SRSv) and
[2](https://alive2.llvm.org/ce/z/tfxTrU).
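
As an informal cross-check of the two folds (the Alive2 proofs remain the authoritative reference), the sketch below brute-forces both rewrites over all 8-bit values. The helper names `abs8`, `fold_ugt`, `fold_ult`, and `check_abs_folds` are hypothetical and not part of LLVM; `X` and `K` are modelled as `int8_t` with wrapping 8-bit arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// 8-bit model of abs where abs(INT8_MIN) wraps to 0x80 (the non-poison case).
uint8_t abs8(int8_t x) {
  return static_cast<uint8_t>(x < 0 ? -static_cast<int>(x) : x);
}

// Fold 1: abs(X) u> K  -->  K >= 0 ? (X + K u> 2 * K) : false
bool fold_ugt(int8_t x, int8_t k) {
  if (k < 0) return false;
  return static_cast<uint8_t>(x + k) > static_cast<uint8_t>(2 * k);
}

// Fold 2: abs(X) u< K  -->  K >= 1 ? (X + (K - 1) u<= 2 * (K - 1)) : K != 0
// Only valid when abs(INT8_MIN) is poison, so callers must not pass INT8_MIN.
bool fold_ult(int8_t x, int8_t k) {
  assert(x != INT8_MIN && "abs(INT_MIN) is poison for this fold");
  if (k < 1) return k != 0;
  return static_cast<uint8_t>(x + (k - 1)) <= static_cast<uint8_t>(2 * (k - 1));
}

// Exhaustively compare each fold against the original unsigned comparison.
bool check_abs_folds() {
  for (int xi = -128; xi <= 127; ++xi) {
    for (int ki = -128; ki <= 127; ++ki) {
      const auto x = static_cast<int8_t>(xi);
      const auto k = static_cast<int8_t>(ki);
      const auto uk = static_cast<uint8_t>(k);
      if ((abs8(x) > uk) != fold_ugt(x, k)) return false;
      if (xi != -128 && (abs8(x) < uk) != fold_ult(x, k)) return false;
    }
  }
  return true;
}
```

Calling `check_abs_folds()` compares each folded form with the original comparison for every `(X, K)` pair and skips `X = INT8_MIN` for the second fold only.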
2026-01-29 02:49:55 +08:00
Alan Zhao
f2921e536b
[InstCombine][profcheck] More fixes for missing branch data in InstCombineCompares.cpp (#178084)
Again, these fixes are trivial as we're creating new select instructions
with predicates from existing select instructions.

In this case, we create one select instruction from two existing select
instructions, but since both existing select instructions have the same
predicate, their profile data should be the same, so we can reuse the
profile data from either instruction. Therefore, we arbitrarily reuse
the profile data from the first select instruction.

Tracking issue: #147390
2026-01-27 11:20:15 -08:00
Alan Zhao
88257505b0
[InstCombine][profcheck] Fix missing branch data in InstCombineCompares.cpp (#178070)
These are trivial fixes where we create a new select instruction with
the same conditional as an existing select.

Tracking issue: #147390
2026-01-26 23:25:21 +00:00
Manasij Mukherjee
4bc2e4b4c1
[InstCombine] Add new pattern to foldICmpAddConstant (#175876)
icmp ult (add nuw X, (lshr A, ShAmtC)), C --> icmp ult A, C
when C <= (1 << ShAmtC).

This pattern was found in ffmpeg, according to the report.

https://alive2.llvm.org/ce/z/rpY8LY

Fixes https://github.com/llvm/llvm-project/issues/167178
2026-01-17 15:45:13 +00:00
Justin Lebar
bbcab0bf57
[InstCombine] Fix i1 ssub.sat compare folding (#173742)
For every type other than i1, ssub.sat x, y = 0 implies x == y.  But
ssub.sat.i1 0, -1 = 0 (because the result of 1 saturates to 0).
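
The i1 exception can be reproduced with a small model of saturating subtraction. This is only a sketch: `ssub_sat` and `zero_implies_equal` are hypothetical helpers written for illustration, not LLVM functions, and the model computes in a wider type and clamps.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Signed saturating subtraction on `bits`-wide integers, computed in a wider
// type and clamped to [-(2^(bits-1)), 2^(bits-1) - 1].
int64_t ssub_sat(unsigned bits, int64_t x, int64_t y) {
  assert(bits >= 1 && bits <= 32 && "sketch only models narrow integers");
  const int64_t lo = -(int64_t{1} << (bits - 1));
  const int64_t hi = (int64_t{1} << (bits - 1)) - 1;
  return std::max(lo, std::min(hi, x - y));
}

// Returns true if ssub.sat(x, y) == 0 implies x == y for every pair of
// `bits`-wide signed values.
bool zero_implies_equal(unsigned bits) {
  assert(bits >= 1 && bits <= 8 && "kept small so the search is exhaustive");
  const int64_t lo = -(int64_t{1} << (bits - 1));
  const int64_t hi = (int64_t{1} << (bits - 1)) - 1;
  for (int64_t x = lo; x <= hi; ++x)
    for (int64_t y = lo; y <= hi; ++y)
      if (ssub_sat(bits, x, y) == 0 && x != y) return false;
  return true;
}
```

For `bits == 1` the representable range is [-1, 0], so `ssub_sat(1, 0, -1)` clamps the exact result 1 down to 0 and `zero_implies_equal(1)` is false. For wider types the positive bound is at least 1, so only an exact difference of 0 yields 0.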

The changes to instcombine are not strictly necessary.  Instcombine
canonicalizes the ssub.sat.i1 before we arrive at these pattern-matches.
The real fix is in ValueTracking.

Nonetheless we agreed in review it makes sense to add these checks to
instcombine, even though they're currently unreachable:
https://github.com/llvm/llvm-project/pull/173742#issuecomment-3696631396

This was found by a fuzzer I'm working on!
2026-01-12 11:03:00 -08:00
Justin Lebar
5243501cca
[InstCombine] Guard foldICmpSRemConstant against zero divisors (#173702)
instcombine can create srem X, 0 or icmp ult X, 0 mid-pass when
operands fold to zero, which trips assertions in foldICmpSRemConstant.
Bail out on zero divisors / zero ULT constants instead of asserting,
and add a regression test from the minimized reproducer.

This was found by a fuzzer I'm working on.  The high-level design is to
randomly generate LLVM IR, run a pass on it, and then run the original
and new IR through the interpreter.  They should produce the same
results.  Right now I'm only fuzzing instcombine.
2026-01-09 10:32:22 +01:00
Wenju He
993054d96f
[InstCombine] Fold redundant FP clamp selects; relax min-max-pattern bailout in visitFCmp (#173452)
visitFCmp() previously bailed out when a following select matched a
clamp pattern. This blocks simplifications when the clamp is provably
redundant.

This PR allows simplification for clamp selects of flavor SPF_FMAXNUM/
SPF_FMINNUM when one arm is a constant and the other is a sitofp/uitofp
of an integer value, and the constant equals the exact min/max of that
integer domain:
* SPF_FMAXNUM (pattern max(X,C)): redundant if C is the minimum integer
mapped exactly to FP (e.g. X = sitofp i8, C = -128.0f).
* SPF_FMINNUM (pattern min(X,C)): redundant if C is the maximum integer
mapped exactly to FP (e.g. X = uitofp i8, C = 255.0f).
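
The two examples can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and `std::fmax`/`std::fmin` stand in for the SPF_FMAXNUM/SPF_FMINNUM select patterns.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// SPF_FMAXNUM case: max(sitofp i8 X, -128.0f) is X for every i8 value.
bool fmax_clamp_is_redundant() {
  // The bound must be the exact FP image of the integer minimum.
  assert(static_cast<float>(INT8_MIN) == -128.0f);
  for (int v = INT8_MIN; v <= INT8_MAX; ++v) {
    const float x = static_cast<float>(v);  // models sitofp i8 -> float
    if (std::fmax(x, -128.0f) != x) return false;
  }
  return true;
}

// SPF_FMINNUM case: min(uitofp i8 X, 255.0f) is X for every i8 value.
bool fmin_clamp_is_redundant() {
  assert(static_cast<float>(UINT8_MAX) == 255.0f);
  for (int v = 0; v <= UINT8_MAX; ++v) {
    const float x = static_cast<float>(v);  // models uitofp i8 -> float
    if (std::fmin(x, 255.0f) != x) return false;
  }
  return true;
}

// A tighter bound is not redundant: max(X, -100.0f) changes X = -128.
bool fmax_with_inner_bound_changes_value() {
  return std::fmax(-128.0f, -100.0f) != -128.0f;
}
```

The last helper shows why the constant has to equal the exact minimum or maximum of the integer domain: any bound inside the domain alters at least one input.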

This fixes a regression in #173454

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2026-01-05 11:34:53 +08:00
Hongyu Chen
0952ccc712
[InstCombine] Bail out on type mismatch in foldICmpBinOpWithConstantViaTruthTable (#173179)
Fixes https://github.com/llvm/llvm-project/issues/173177
The previous implementation doesn't consider cases like `<2 x i1>
icmp(binop(sel <2 x i1>, sel i1))`.
2025-12-21 16:40:38 +08:00
Nikita Popov
bc19a0ad49
[InstCombine] Use getSigned() for negative numbers
2025-12-11 17:30:37 +01:00
Tirthankar Mazumder
2ce17ba347
[InstCombine][CmpInstAnalysis] Use consistent spelling and function names. NFC. (#171645)
Both `decomposeBitTestICmp` and `decomposeBitTest` have a parameter
called `lookThroughTrunc`. This was spelled in full (i.e. `lookThroughTrunc`)
in the header. However, in the implementation, it's written as `lookThruTrunc`.

I opted to convert all instances of `lookThruTrunc` into
`lookThroughTrunc` to reduce surprise while reading the code and for
conformity.

---

The other change in this PR is the renaming of the wrapper around
`decomposeBitTest()`. Even though it was a wrapper around
`CmpInstAnalysis.h`'s `decomposeBitTest`, the function was called
`decomposeBitTestICmp`. This is quite confusing because such a function
_also_ exists in `CmpInstAnalysis.h`, but it is _not_ the one actually
being used in `InstCombineAndOrXor.cpp`.
2025-12-11 07:40:04 +00:00
Tirthankar Mazumder
d94958b2f2
[InstCombine] Fold icmp samesign u{gt/lt} (X +nsw C2), C -> icmp s{gt/lt} X, (C - C2) (#169960)
Fixes #166973

Partially addresses #134028

Alive2 proof: https://alive2.llvm.org/ce/z/BqHQNN
2025-12-08 13:05:37 +01:00
actink
583fba3524
[InstCombine] fold icmp of select with invertible shl (#147182)
Proof: https://alive2.llvm.org/ce/z/a5fzlJ
Closes https://github.com/llvm/llvm-project/issues/146642

---------

Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2025-11-28 08:54:47 +08:00
Pedro Lobo
e8af134bb7
[InstCombine] Generalize trunc-shift-icmp fold from (1 << Y) to (Pow2 << Y) (#169163)
Extends the `icmp(trunc(shl))` fold to handle any power of 2 constant as
the shift base, not just 1. This generalizes the following patterns by
adjusting the comparison offsets by `log2(Pow2)`.

```llvm
(trunc (1 << Y) to iN) == 0    --> Y u>= N
(trunc (1 << Y) to iN) != 0    --> Y u<  N
(trunc (1 << Y) to iN) == 2**C --> Y ==  C
(trunc (1 << Y) to iN) != 2**C --> Y !=  C

; to

(trunc (Pow2 << Y) to iN) == 0    --> Y u>= N - log2(Pow2)
(trunc (Pow2 << Y) to iN) != 0    --> Y u<  N - log2(Pow2)
(trunc (Pow2 << Y) to iN) == 2**C --> Y ==  C - log2(Pow2)
(trunc (Pow2 << Y) to iN) != 2**C --> Y !=  C - log2(Pow2)
```

Proof: https://alive2.llvm.org/ce/z/2zwTkp
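
A brute-force sanity check of the generalized `==` rows, assuming a 32-bit shift truncated to i8 (N = 8). `check_trunc_shl_fold` is a hypothetical helper, not LLVM code, and the choice of widths is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Checks the generalized fold for a 32-bit shift truncated to i8 (N = 8),
// with Pow2 = 1 << log2_pow2. Shift amounts >= 32 are poison and skipped.
bool check_trunc_shl_fold(unsigned log2_pow2) {
  constexpr unsigned N = 8;
  assert(log2_pow2 < N && "sketch assumes Pow2 fits in the truncated type");
  const uint32_t pow2 = uint32_t{1} << log2_pow2;
  for (uint32_t y = 0; y < 32; ++y) {
    const uint32_t t = static_cast<uint8_t>(pow2 << y);  // trunc to i8
    // (trunc (Pow2 << Y) to iN) == 0  -->  Y u>= N - log2(Pow2)
    if ((t == 0) != (y >= N - log2_pow2)) return false;
    for (uint32_t c = 0; c < N; ++c) {
      // (trunc (Pow2 << Y) to iN) == 2**C  -->  Y == C - log2(Pow2)
      // When C < log2(Pow2) the subtraction wraps and matches no valid Y.
      if ((t == (uint32_t{1} << c)) != (y == c - log2_pow2)) return false;
    }
  }
  return true;
}
```

The `!=` rows are the negations of the `==` rows, so the sketch does not check them separately.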
2025-11-22 15:44:06 +00:00
Peter Collingbourne
b3c54914ef
InstCombine: Stop transforming EQ/NE of SHR to 0 to ULT/UGT if >1 use
This is a small code size optimization that lets us avoid both shifting
and comparing to a constant if we need the shifted value anyway. On most
architectures the zero comparison is cheaper than a constant comparison
(or free if the shift sets flags).

Although this change appears to remove the optimization entirely, we
continue to do this transform if there is one use because of the code
below the removed code that transforms the shift into an and, followed
by the PR10267 case in InstCombinerImpl::foldICmpAndConstConst that
transforms the and into a ult/ugt. Added a test case to verify this
explicitly.

Per [1] reduces clang .text size by 0.09% and dynamic instruction count
by 0.01%.

[1] https://llvm-compile-time-tracker.com/compare.php?from=1f38d49ebe96417e368a567efa4d650b8a9ac30f&to=0873787a12b8f2eab019d8211ace4bccc1807343&stat=size-text

Reviewers: nikic, dtcxzyw

Reviewed By: dtcxzyw

Pull Request: https://github.com/llvm/llvm-project/pull/168007
2025-11-17 19:39:20 -08:00
kper
fcb5293ad0
[InstCombine]: Canonicalize to a mask when trunc nuw (#163628)
The canonicalization is also triggered when the `trunc` is `nuw`.

Proof: https://alive2.llvm.org/ce/z/eWvWe3
Fixes: https://github.com/llvm/llvm-project/issues/162451
2025-10-18 00:13:29 +08:00
Brandon
11faf88d8f
[InstCombine] Fold icmp with clamp into unsigned bound check (#161303)
Fix #157315 

alive2: https://alive2.llvm.org/ce/z/TEnuFV

The equality comparison of `min(max(X, Lo), Hi)` and `X` is actually a
range check on `X`. This PR folds this into an unsigned bound check `(X
- Lo) u< (Hi - Lo + 1)`.
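
The equivalence can be sketched on 8-bit values, assuming a signed clamp with `Lo <= Hi`. The helper names are hypothetical. The sketch skips the full-range clamp, where `Hi - Lo + 1` wraps to 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Original form: min(max(X, Lo), Hi) == X. Arguments are i8-ranged ints.
bool clamp_eq(int x, int lo, int hi) {
  assert(lo <= hi && "clamp bounds must be ordered");
  return std::min(std::max(x, lo), hi) == x;
}

// Folded form: (X - Lo) u< (Hi - Lo + 1), evaluated with 8-bit wraparound.
bool bound_check(int x, int lo, int hi) {
  assert(lo <= hi && "clamp bounds must be ordered");
  const auto lhs = static_cast<uint8_t>(x - lo);
  const auto rhs = static_cast<uint8_t>(hi - lo + 1);
  return lhs < rhs;
}

// Exhaustively compare both forms over every ordered (Lo, Hi) pair.
bool check_clamp_fold() {
  for (int lo = -128; lo <= 127; ++lo)
    for (int hi = lo; hi <= 127; ++hi) {
      if (lo == -128 && hi == 127) continue;  // Hi - Lo + 1 wraps to 0 here
      for (int x = -128; x <= 127; ++x)
        if (clamp_eq(x, lo, hi) != bound_check(x, lo, hi)) return false;
    }
  return true;
}
```

Values below `Lo` wrap to large unsigned numbers after the subtraction, which is why a single unsigned comparison covers both bounds.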

---------

Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2025-10-02 21:51:39 +02:00
Yingwei Zheng
0b7129afcc
[InstCombine] Fix FMF propagation in foldFCmpFSubIntoFCmp (#161539)
Proof: https://alive2.llvm.org/ce/z/orSP-S
Closes https://github.com/llvm/llvm-project/issues/161525.
2025-10-03 01:44:03 +08:00
Ramkumar Ramachandra
7fb3a91418
[PatternMatch] Introduce match functor (NFC) (#159386)
A common idiom is the usage of the PatternMatch match function within a
functional algorithm like all_of. Introduce a match functor to shorten
this idiom.

Co-authored-by: Luke Lau <luke@igalia.com>
2025-09-17 21:04:33 +01:00
Nikita Popov
1cbb35e044
[InstCombine] Support GEP chains in foldCmpLoadFromIndexedGlobal() (#157447)
Currently this fold only supports a single GEP. However, in ptradd
representation, it may be split across multiple GEPs. In particular, PR
#151333 will split off constant offset GEPs.

To support this, add a new helper decomposeLinearExpression(), which
decomposes a pointer into a linear expression of the form BasePtr +
Index * Scale + Offset.

I plan to also extend this helper to look through mul/shl on the index
and use it in more places that currently use collectOffset() to extract
a single index * scale. This will make sure such optimizations are not
affected by the ptradd migration.
2025-09-09 16:50:45 +02:00
Hongyu Chen
75b0c89e62
[InstCombine][VectorCombine][NFC] Unify uses of lossless inverse cast (#156597)
This patch addresses
https://github.com/llvm/llvm-project/pull/155216#discussion_r2297724663.
This patch adds a helper function to put the inverse cast on constants,
with cast flags optionally preserved.
Follow-up patches will add trunc/ext handling on VectorCombine and flags
preservation on InstCombine.
2025-09-08 13:30:06 +00:00
Nikita Popov
305cf0e912
[InstCombine] Make foldCmpLoadFromIndexedGlobal() GEP-type independent (#157089)
foldCmpLoadFromIndexedGlobal() currently checks that the global type,
the GEP type and the load type match in certain ways. Replace this with
generic logic based on offsets.

This is a reboot of https://github.com/llvm/llvm-project/pull/67093.
This PR is less ambitious by requiring that the constant offset is
smaller than the stride, which avoids the additional complexity of that
PR.
2025-09-08 12:54:24 +02:00
Seraphimt
9f620b8f62
[InstCombine] Slightly optimize visitFcmp (NFC) (#156097)
While studying the floating-point code, I found a slightly more efficient
sequence of operations.
2025-08-31 17:48:56 +02:00
Yingwei Zheng
49144f7e49
[InstCombine] Improve range computation in foldICmpAddConstant (#155096)
Address comment
https://github.com/llvm/llvm-project/pull/110511#discussion_r1788946221.
2025-08-24 14:32:21 +08:00
zGoldthorpe
a8d25683ee
[PatternMatch] Allow m_ConstantInt to match integer splats (#153692)
When matching integers, `m_ConstantInt` is a convenient alternative to
`m_APInt` for matching unsigned 64-bit integers, allowing one to
simplify

```cpp
const APInt *IntC;
if (match(V, m_APInt(IntC))) {
  if (IntC->ule(UINT64_MAX)) {
    uint64_t Int = IntC->getZExtValue();
    // ...
  }
}
```
to
```cpp
uint64_t Int;
if (match(V, m_ConstantInt(Int))) {
  // ...
}
```

However, this simplification is only true if `V` is a scalar type.
Specifically, `m_APInt` also matches integer splats, but `m_ConstantInt`
does not.

This patch ensures that the matching behaviour of `m_ConstantInt`
parallels that of `m_APInt`, and also incorporates it in some obvious
places.
2025-08-15 10:43:54 -06:00
Pavel Skripkin
30144226a4
[llvm] [InstCombine] fold "icmp eq (X + (V - 1)) & -V, X" to "icmp eq (and X, V - 1), 0" (#152851)
This fold optimizes 

```llvm
define i1 @src(i32 %num, i32 %val) {
  %mask = add i32 %val, -1
  %neg = sub nsw i32 0, %val

  %num.biased = add i32 %num, %mask
  %_2.sroa.0.0 = and i32 %num.biased, %neg
  %_0 = icmp eq i32 %_2.sroa.0.0, %num
  ret i1 %_0
}
```
to
```llvm
define i1 @tgt(i32 %num, i32 %val) {
  %mask = add i32 %val, -1
  %tmp = and i32 %num, %mask
  %ret = icmp eq i32 %tmp, 0
  ret i1 %ret
}
```

For power-of-two `val`.

Observed in real-world code such as the following:

```rust
pub fn is_aligned(num: usize) -> bool {
    num.next_multiple_of(1 << 12) == num
}
```
which verifies that num is aligned to 4096.

Alive2 proof https://alive2.llvm.org/ce/z/QisECm
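
As an informal complement to the proof, the sketch below compares the two patterns exhaustively on 8-bit values for every power-of-two `val`. `src_pattern`, `tgt_pattern`, and `check_alignment_fold` are hypothetical names that mirror `@src` and `@tgt` above.

```cpp
#include <cassert>
#include <cstdint>

// Source pattern: ((num + (val - 1)) & -val) == num, with 8-bit wraparound.
bool src_pattern(uint8_t num, uint8_t val) {
  assert(val != 0 && (val & (val - 1)) == 0 && "val must be a power of two");
  const auto mask = static_cast<uint8_t>(val - 1);
  const auto neg = static_cast<uint8_t>(-val);
  const auto biased = static_cast<uint8_t>(num + mask);
  return static_cast<uint8_t>(biased & neg) == num;
}

// Target pattern: (num & (val - 1)) == 0
bool tgt_pattern(uint8_t num, uint8_t val) {
  assert(val != 0 && (val & (val - 1)) == 0 && "val must be a power of two");
  return (num & static_cast<uint8_t>(val - 1)) == 0;
}

// Compare both patterns for every i8 num and every power-of-two val.
bool check_alignment_fold() {
  for (unsigned shift = 0; shift < 8; ++shift)
    for (unsigned num = 0; num < 256; ++num) {
      const auto v = static_cast<uint8_t>(1u << shift);
      const auto n = static_cast<uint8_t>(num);
      if (src_pattern(n, v) != tgt_pattern(n, v)) return false;
    }
  return true;
}
```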
2025-08-14 10:23:03 +03:00
Paul Walker
fb4a8f67b9
[LLVM][InstCombine] foldICmpEquality: Compare APInt values rather than addresses. (#151726)
2025-08-04 13:54:44 +01:00
David Green
d9971be83e
[InstCombine] Make foldCmpLoadFromIndexedGlobal more resilient to non-array geps. (#150639)
My understanding is that gep [n x i8] and gep i8 can be treated
equivalently - the array type conveys no extra information and could be
removed. This goes through foldCmpLoadFromIndexedGlobal and tries to
make it work for non-array gep types, so long as the index type still
matches the array being loaded.
2025-08-03 10:19:42 +01:00
Nikita Popov
2672719a09
[InstCombine] Don't handle non-canonical index type in icmp of load fold (#151346)
We should just bail out and wait for it to be canonicalized. The current
implementation could emit a trunc without actually performing the
transform.
2025-07-30 17:52:08 +02:00
Nikita Popov
f0f3194e19
[InstCombine] Fold icmp of gep chains (#146714)
This extends https://github.com/llvm/llvm-project/pull/144065 to the
general case of an icmp between two GEP chains that have a common base.
2025-07-23 17:08:34 +02:00
Nikita Popov
1e24b53534
[InstCombine] Add limit for expansion of gep chains (#147065)
When converting gep subtraction / comparison to offset subtraction /
comparison, avoid expanding very long multi-use gep chains.
2025-07-23 09:47:53 +02:00
kissholic
baf2953097
Optimize fptrunc(x)>=C1 --> x>=C2 (#99475)
Fix https://github.com/llvm/llvm-project/issues/85265#issue-2186848949
2025-07-19 17:52:06 +09:00
Ross Kirsling
b1a93cfc32
[InstCombine] foldOpIntoPhi should apply to icmp with non-constant operand (#147676)
Alive2: https://alive2.llvm.org/ce/z/4MeCzA
Fixes #146263.
2025-07-16 10:03:25 +09:00
Yingwei Zheng
c9d9c3e349
[InstCombine] Fold icmp pred X + K, Y -> icmp pred2 X, Y if both X and Y are divisible by K (#147130)
This patch generalizes `icmp ule X +nuw 1, Y -> icmp ult X, Y`-like
optimizations to handle the case that the added RHS constant is a common
power-of-2 divisor of both X and Y. We can further generalize this
pattern to handle non-power-of-2 divisors as well.
Alive2: https://alive2.llvm.org/ce/z/QgpeM_

Compile-time improvement (Stage2-O3 -0.09%):
https://llvm-compile-time-tracker.com/compare.php?from=0ba59587fa98849ed5107fee4134e810e84b69a3&to=f80e5fe0bb2e63c05401bde7cd42899ea270909b&stat=instructions:u

The original case is from the comparison of expanded GEP offsets:
https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2530/files#r2183005292
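
A minimal sketch of one instance of this generalization, the `ule` to `ult` rewrite, checked on 8-bit values. The helper names are hypothetical and other predicates are not covered here.

```cpp
#include <cassert>
#include <cstdint>

// Checks: icmp ule (X +nuw K), Y  ==  icmp ult X, Y
// whenever K is a power of two that divides both X and Y (i8 values).
bool check_divisible_offset_fold(unsigned k) {
  assert(k != 0 && k < 256 && (k & (k - 1)) == 0 && "K must be a power of two");
  for (unsigned x = 0; x < 256; x += k) {
    if (x + k > 255) continue;  // +nuw: skip additions that would wrap in i8
    for (unsigned y = 0; y < 256; y += k) {
      if ((x + k <= y) != (x < y)) return false;
    }
  }
  return true;
}

// Without the divisibility requirement the rewrite is wrong for K > 1.
bool counterexample_without_divisibility() {
  const unsigned x = 0, k = 4, y = 2;  // Y is not a multiple of K
  return (x + k <= y) != (x < y);
}
```

Because `X` and `Y` are both multiples of `K`, `X < Y` already implies `Y - X >= K`, which is what makes the offset removable.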
2025-07-05 23:42:53 +08:00
Nikita Popov
83272a4849
[InstCombine] Fold icmp of gep chain with base (#144065)
Fold icmp between a chain of geps and its base pointer. Previously only
a single gep was supported.
    
This will be extended to handle the case of two gep chains with a common
base in a followup.

This helps to avoid regressions after #137297.
2025-07-02 09:23:36 +02:00
Nikita Popov
bedd7ddb7f
[InstCombine] Fix use after free
Load the nowrap flags before calling EmitGEPOffset(), as this may
free the instruction.
2025-07-01 15:18:49 +02:00
Nikita Popov
b8b7494551
[InstCombine] Rewrite multi-use GEPs when simplifying comparison (#146100)
We already do this when both sides are a GEP, but not if only one is.
This ensures that the offset arithmetic is not duplicated.
2025-07-01 14:26:47 +02:00
Iris Shi
32f911f3e8
[InstCombine] Fold ceil(X / (2 ^ C)) == 0 -> X == 0 (#143683)
Co-authored-by: Yingwei Zheng <dtcxzyw2333@gmail.com>
2025-06-23 10:51:17 +08:00
Acthinks Yang
f2734aa25e
[InstCombine] fold icmp with add/sub instructions having the same operands (#143241)
Closes #143211.
2025-06-16 17:05:30 +02:00
Ramkumar Ramachandra
b40e4ceaa6
[ValueTracking] Make Depth last default arg (NFC) (#142384)
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing an easily-invalidable KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
2025-06-03 17:12:24 +01:00
AZero13
eaf911bb98
[InstCombine] Fix comment typo that incorrectly described fold (NFC) (#141105)
icmp ne X, (sext (icmp ne X, 0)) --> X != 0 && X != -1, not X != 0 && X
== -1, which would go to X == -1 anyway.
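
The corrected comment can be confirmed by enumeration on i8. This is a sketch: `original`, `corrected_comment`, and `old_comment` are hypothetical names for the instruction and the two versions of the comment.

```cpp
#include <cassert>
#include <cstdint>

// Models: icmp ne X, (sext (icmp ne X, 0)) on i8.
bool original(int8_t x) {
  const int8_t sext = (x != 0) ? -1 : 0;  // sext i1 -> i8 yields 0 or -1
  return x != sext;
}

// The corrected comment: X != 0 && X != -1.
bool corrected_comment(int8_t x) { return x != 0 && x != -1; }

// The old comment: X != 0 && X == -1.
bool old_comment(int8_t x) { return x != 0 && x == -1; }

// Compare the instruction with the corrected comment for every i8 value.
bool check_comment_fix() {
  for (int v = -128; v <= 127; ++v) {
    const auto x = static_cast<int8_t>(v);
    if (original(x) != corrected_comment(x)) return false;
  }
  // The old comment's formula disagrees with the instruction at X = -1.
  assert(old_comment(-1) != original(-1));
  return true;
}
```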
2025-05-22 20:28:45 +02:00
Antonio Frighetto
adfd59fdb8
[InstCombine] Introduce foldICmpBinOpWithConstantViaTruthTable folding
Match icmps of binops where both operands are selects with constant arms,
i.e., `icmp pred (select A ? C1 : C2) binop (select B ? C3 : C4), C5`.
Fold such patterns by creating a truth table of the possible four
constant variants, and materialize back the optimal logic from it via
`createLogicFromTable` helper. This also generalizes an existing fold,
which has therefore been dropped.
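
The truth-table step can be sketched as follows. This is not the LLVM implementation: `build_truth_table` is a hypothetical helper, the bit layout (bit `A << 1 | B`) is a convention chosen for this example, and turning the table back into logic, which the patch delegates to `createLogicFromTable`, is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Builds a 4-bit truth table for
//   icmp pred (select A ? C1 : C2) binop (select B ? C3 : C4), C5
// Bit (A << 1 | B) holds the comparison result for that (A, B) pair.
uint8_t build_truth_table(int64_t c1, int64_t c2, int64_t c3, int64_t c4,
                          int64_t c5,
                          const std::function<int64_t(int64_t, int64_t)> &binop,
                          const std::function<bool(int64_t, int64_t)> &pred) {
  assert(binop && pred && "both callbacks are required");
  uint8_t table = 0;
  for (unsigned a = 0; a < 2; ++a)
    for (unsigned b = 0; b < 2; ++b) {
      const int64_t lhs = a ? c1 : c2;  // value of the first select
      const int64_t rhs = b ? c3 : c4;  // value of the second select
      if (pred(binop(lhs, rhs), c5))
        table |= static_cast<uint8_t>(1u << (a << 1 | b));
    }
  return table;
}
```

For example, `(A ? 1 : 0) + (B ? 1 : 0) == 2` yields a table with only the `A = 1, B = 1` bit set, which corresponds to `A & B`. Similarly, `((A ? 1 : 0) | (B ? 1 : 0)) != 0` sets every bit except `A = 0, B = 0`, which corresponds to `A | B`.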

Proofs: https://alive2.llvm.org/ce/z/NS7Vzu.

Fixes: https://github.com/llvm/llvm-project/issues/138212.
2025-05-13 09:04:25 +02:00
Simon Pilgrim
26da8870ed
Fix MSVC "not all control paths return a value" warning. NFC.
2025-04-30 12:32:22 +01:00
Yingwei Zheng
d20796dab7
[InstCombine] Offset both sides of an equality icmp (#134086)
Proof: https://alive2.llvm.org/ce/z/zQ2UW4
Closes https://github.com/llvm/llvm-project/issues/134024
2025-04-30 00:19:23 +08:00
Matt Arsenault
48585caf72
InstCombine: Avoid counting uses of constants (#136566)
Logically it does not matter; getFreelyInvertedImpl doesn't
depend on the value for the m_ImmConstant case.

This use count logic should probably sink into getFreelyInvertedImpl,
every use of this appears to just be a hasOneUse or hasNUse count,
so this could change to just be a use count threshold.
2025-04-23 10:51:55 +02:00
Yingwei Zheng
65ed35393c
[IR] Add helper CmpPredicate::dropSameSign (#134071)
Address review comment
https://github.com/llvm/llvm-project/pull/133711#discussion_r2024519641
2025-04-02 22:25:01 +08:00
Veera
4cdcf3b193
[InstCombine] Fold (trunc nuw A to i1) == (trunc nuw B to i1) to A == B (#133368)
Fixes #133344

Proof: https://alive2.llvm.org/ce/z/X3Uh23 

InstCombine couldn't optimize `i1` because `canonicalizeICmpBool()` was
transforming the comparison into bitwise operations before
`foldICmpTruncWithTruncOrExt()` was called.

This PR solves the ordering issue by placing
`foldICmpTruncWithTruncOrExt()` before `canonicalizeICmpBool()`.

I believe this will not cause any regressions since all tests are
passing.
2025-03-28 08:32:45 -04:00
Nikita Popov
e56a6a2683
Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880) (#128020)
Relative to the previous attempt this includes two fixes:
 * Adjust callCapturesBefore() to not skip captures(ret: address,
    provenance) arguments, as these will not count as a capture
    at the call-site.
 * When visiting uses during stack slot optimization, don't skip
    the ModRef check for passthru captures. Calls can both modref
    and be passthru for captures.

------

This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return value
    if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce the necessary API changes and basic
inference support; various possible improvements are marked with
TODOs.
2025-02-27 09:38:29 +01:00
Nico Weber
e2ba1b6ffd
Revert "Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)"
This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729.
Seems to break LTO builds of clang on Windows, see comments on
https://github.com/llvm/llvm-project/pull/125880
2025-02-19 11:32:57 -05:00