1135 Commits

Author SHA1 Message Date
Ramkumar Ramachandra
7fb3a91418
[PatternMatch] Introduce match functor (NFC) (#159386)
A common idiom is the usage of the PatternMatch match function within a
functional algorithm like all_of. Introduce a match functor to shorten
this idiom.

Co-authored-by: Luke Lau <luke@igalia.com>
2025-09-17 21:04:33 +01:00
Nikita Popov
1cbb35e044
[InstCombine] Support GEP chains in foldCmpLoadFromIndexedGlobal() (#157447)
Currently this fold only supports a single GEP. However, in ptradd
representation, it may be split across multiple GEPs. In particular, PR
#151333 will split off constant offset GEPs.

To support this, add a new helper decomposeLinearExpression(), which
decomposes a pointer into a linear expression of the form BasePtr +
Index * Scale + Offset.

I plan to also extend this helper to look through mul/shl on the index
and use it in more places that currently use collectOffset() to extract
a single index * scale. This will make sure such optimizations are not
affected by the ptradd migration.
2025-09-09 16:50:45 +02:00
Hongyu Chen
75b0c89e62
[InstCombine][VectorCombine][NFC] Unify uses of lossless inverse cast (#156597)
This patch addresses
https://github.com/llvm/llvm-project/pull/155216#discussion_r2297724663.
This patch adds a helper function to put the inverse cast on constants,
with cast flags preserved(optional).
Follow-up patches will add trunc/ext handling on VectorCombine and flags
preservation on InstCombine.
2025-09-08 13:30:06 +00:00
Nikita Popov
305cf0e912
[InstCombine] Make foldCmpLoadFromIndexedGlobal() GEP-type independent (#157089)
foldCmpLoadFromIndexedGlobal() currently checks that the global type,
the GEP type and the load type match in certain ways. Replace this with
generic logic based on offsets.

This is a reboot of https://github.com/llvm/llvm-project/pull/67093.
This PR is less ambitious by requiring that the constant offset is
smaller than the stride, which avoids the additional complexity of that
PR.
2025-09-08 12:54:24 +02:00
Seraphimt
9f620b8f62
[InstCombine] Slightly optimize visitFcmp (NFC) (#156097)
Studying the code related to float found a slightly optimal sequence of
actions.
2025-08-31 17:48:56 +02:00
Yingwei Zheng
49144f7e49
[InstCombine] Improve range computation in foldICmpAddConstant (#155096)
Address comment
https://github.com/llvm/llvm-project/pull/110511#discussion_r1788946221.
2025-08-24 14:32:21 +08:00
zGoldthorpe
a8d25683ee
[PatternMatch] Allow m_ConstantInt to match integer splats (#153692)
When matching integers, `m_ConstantInt` is a convenient alternative to
`m_APInt` for matching unsigned 64-bit integers, allowing one to
simplify

```cpp
const APInt *IntC;
if (match(V, m_APInt(IntC))) {
  if (IntC->ule(UINT64_MAX)) {
    uint64_t Int = IntC->getZExtValue();
    // ...
  }
}
```
to
```cpp
uint64_t Int;
if (match(V, m_ConstantInt(Int))) {
  // ...
}
```

However, this simplification is only true if `V` is a scalar type.
Specifically, `m_APInt` also matches integer splats, but `m_ConstantInt`
does not.

This patch ensures that the matching behaviour of `m_ConstantInt`
parallels that of `m_APInt`, and also incorporates it in some obvious
places.
2025-08-15 10:43:54 -06:00
Pavel Skripkin
30144226a4
[llvm] [InstCombine] fold "icmp eq (X + (V - 1)) & -V, X" to "icmp eq (and X, V - 1), 0" (#152851)
This fold optimizes 

```llvm
define i1 @src(i32 %num, i32 %val) {
  %mask = add i32 %val, -1
  %neg = sub nsw i32 0, %val

  %num.biased = add i32 %num, %mask
  %_2.sroa.0.0 = and i32 %num.biased, %neg
  %_0 = icmp eq i32 %_2.sroa.0.0, %num
  ret i1 %_0
}
```
to
```llvm
define i1 @tgt(i32 %num, i32 %val) {
  %mask = add i32 %val, -1
  %tmp = and i32 %num, %mask
  %ret = icmp eq i32 %tmp, 0
  ret i1 %ret
}
```

For power-of-two `val`.

Observed in real life for following code

```rust
pub fn is_aligned(num: usize) -> bool {
    num.next_multiple_of(1 << 12) == num
}
```
which verifies that num is aligned to 4096.

Alive2 proof https://alive2.llvm.org/ce/z/QisECm
2025-08-14 10:23:03 +03:00
Paul Walker
fb4a8f67b9
[LLVM][InstCombine] foldICmpEquality: Compare APInt values rather than addresses. (#151726) 2025-08-04 13:54:44 +01:00
David Green
d9971be83e
[InstCombine] Make foldCmpLoadFromIndexedGlobal more resilient to non-array geps. (#150639)
My understanding is that gep [n x i8] and gep i8 can be treated
equivalently - the array type conveys no extra information and could be
removed. This goes through foldCmpLoadFromIndexedGlobal and tries to
make it work for non-array gep types, so long as the index type still
matches the array being loaded.
2025-08-03 10:19:42 +01:00
Nikita Popov
2672719a09
[InstCombine] Don't handle non-canonical index type in icmp of load fold (#151346)
We should just bail out and wait for it to be canonicalized. The current
implementation could emit a trunc without actually performing the
transform.
2025-07-30 17:52:08 +02:00
Nikita Popov
f0f3194e19
[InstCombine] Fold icmp of gep chains (#146714)
This extends https://github.com/llvm/llvm-project/pull/144065 to the
general case of an icmp between two GEP chains that have a common base.
2025-07-23 17:08:34 +02:00
Nikita Popov
1e24b53534
[InstCombine] Add limit for expansion of gep chains (#147065)
When converting gep subtraction / comparison to offset subtraction /
comparison, avoid expanding very long multi-use gep chains.
2025-07-23 09:47:53 +02:00
kissholic
baf2953097
Optimize fptrunc(x)>=C1 --> x>=C2 (#99475)
Fix https://github.com/llvm/llvm-project/issues/85265#issue-2186848949
2025-07-19 17:52:06 +09:00
Ross Kirsling
b1a93cfc32
[InstCombine] foldOpIntoPhi should apply to icmp with non-constant operand (#147676)
Alive2: https://alive2.llvm.org/ce/z/4MeCzA
Fixes #146263.
2025-07-16 10:03:25 +09:00
Yingwei Zheng
c9d9c3e349
[InstCombine] Fold icmp pred X + K, Y -> icmp pred2 X, Y if both X and Y is divisible by K (#147130)
This patch generalizes `icmp ule X +nuw 1, Y -> icmp ult X, Y`-like
optimizations to handle the case that the added RHS constant is a common
power-of-2 divisor of both X and Y. We can further generalize this
pattern to handle non-power-of-2 divisors as well.
Alive2: https://alive2.llvm.org/ce/z/QgpeM_

Compile-time improvement (Stage2-O3 -0.09%):
https://llvm-compile-time-tracker.com/compare.php?from=0ba59587fa98849ed5107fee4134e810e84b69a3&to=f80e5fe0bb2e63c05401bde7cd42899ea270909b&stat=instructions:u

The original case is from the comparison of expanded GEP offsets:
https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2530/files#r2183005292
2025-07-05 23:42:53 +08:00
Nikita Popov
83272a4849
[InstCombine] Fold icmp of gep chain with base (#144065)
Fold icmp between a chain of geps and its base pointer. Previously only
a single gep was supported.
    
This will be extended to handle the case of two gep chains with a common
base in a followup.

This helps to avoid regressions after #137297.
2025-07-02 09:23:36 +02:00
Nikita Popov
bedd7ddb7f [InstCombine] Fix use after free
Load the nowrap flags before calling EmitGEPOffset(), as this may
free the instruction.
2025-07-01 15:18:49 +02:00
Nikita Popov
b8b7494551
[InstCombine] Rewrite multi-use GEPs when simplifying comparison (#146100)
We already do this when both sides are a GEP, but not if only one is.
This ensures that the offset arithmetic is not duplicated.
2025-07-01 14:26:47 +02:00
Iris Shi
32f911f3e8
[InstCombine] Fold ceil(X / (2 ^ C)) == 0 -> X == 0 (#143683)
Co-authored-by: Yingwei Zheng <dtcxzyw2333@gmail.com>
2025-06-23 10:51:17 +08:00
Acthinks Yang
f2734aa25e
[InstCombine] fold icmp with add/sub instructions having the same operands (#143241)
Closes #143211.
2025-06-16 17:05:30 +02:00
Ramkumar Ramachandra
b40e4ceaa6
[ValueTracking] Make Depth last default arg (NFC) (#142384)
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing a easily-invalidable KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
2025-06-03 17:12:24 +01:00
AZero13
eaf911bb98
[InstCombine] Fix comment typo that incorrectly described fold (NFC) (#141105)
icmp ne X, (sext (icmp ne X, 0)) --> X != 0 && X != -1, not X != 0 && X
== -1, which would go to X == -1 anyway.
2025-05-22 20:28:45 +02:00
Antonio Frighetto
adfd59fdb8 [InstCombine] Introduce foldICmpBinOpWithConstantViaTruthTable folding
Match icmps of binops where both operands are select with constant arms,
i.e., `icmp pred (select A ? C1 : C2) binop (select B ? C3 : C4), C5`.
Fold such patterns by creating a truth table of the possible four
constant variants, and materialize back the optimal logic from it via
`createLogicFromTable` helper. This also generalizes an existing fold,
which has therefore been dropped.

Proofs: https://alive2.llvm.org/ce/z/NS7Vzu.

Fixes: https://github.com/llvm/llvm-project/issues/138212.
2025-05-13 09:04:25 +02:00
Simon Pilgrim
26da8870ed Fix MSVC "not all control paths return a value" warning. NFC. 2025-04-30 12:32:22 +01:00
Yingwei Zheng
d20796dab7
[InstCombine] Offset both sides of an equality icmp (#134086)
Proof: https://alive2.llvm.org/ce/z/zQ2UW4
Closes https://github.com/llvm/llvm-project/issues/134024
2025-04-30 00:19:23 +08:00
Matt Arsenault
48585caf72
InstCombine: Avoid counting uses of constants (#136566)
Logically it does not matter; getFreelyInvertedImpl doesn't
depend on the value for the m_ImmConstant case.

This use count logic should probably sink into getFreelyInvertedImpl,
every use of this appears to just be a hasOneUse or hasNUse count,
so this could change to just be a use count threshold.
2025-04-23 10:51:55 +02:00
Yingwei Zheng
65ed35393c
[IR] Add helper CmpPredicate::dropSameSign (#134071)
Address review comment
https://github.com/llvm/llvm-project/pull/133711#discussion_r2024519641
2025-04-02 22:25:01 +08:00
Veera
4cdcf3b193
[InstCombine] Fold (trunc nuw A to i1) == (trunc nuw B to i1) to A == B (#133368)
Fixes #133344

Proof: https://alive2.llvm.org/ce/z/X3Uh23 

InstCombine couldn't optimize `i1` because `canonicalizeICmpBool()` was
transforming the comparison into bitwise operations before
`foldICmpTruncWithTruncOrExt()` was called.

This PR solves the ordering issue by placing
`foldICmpTruncWithTruncOrExt()` before `canonicalizeICmpBool()`.

I believe this will not cause any regressions since all tests are
passing.
2025-03-28 08:32:45 -04:00
Nikita Popov
e56a6a2683
Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880) (#128020)
Relative to the previous attempt this includes two fixes:
 * Adjust callCapturesBefore() to not skip captures(ret: address,
    provenance) arguments, as these will not count as a capture
    at the call-site.
 * When visiting uses during stack slot optimization, don't skip
    the ModRef check for passthru captures. Calls can both modref
    and be passthru for captures.

------

This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of: Stop: stop traversal.
ContinueIgnoringReturn: continue traversal but don't follow the
instruction return value. Continue: continue traversal and follow the
instruction return value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
2025-02-27 09:38:29 +01:00
Nico Weber
e2ba1b6ffd Revert "Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)"
This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729.
Seems to break LTO builds of clang on Windows, see comments on
https://github.com/llvm/llvm-project/pull/125880
2025-02-19 11:32:57 -05:00
Yingwei Zheng
29f3a35206
[InstCombine] Do not keep samesign when speculatively executing icmps (#127007)
Closes https://github.com/llvm/llvm-project/issues/126974.
2025-02-16 20:18:29 +08:00
Nikita Popov
7e3735d1a1 Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)
Relative to the previous attempt, this adjusts isEscapeSource()
to not treat calls with captures(ret: address, provenance) or similar
arguments as escape sources. This addresses the miscompile reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577

The implementation uses a helper function on CallBase to make this
check a bit more efficient (e.g. by skipping the byval checks) as
checking attributes on all arguments if fairly expensive.

------

This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of: Stop: stop traversal.
ContinueIgnoringReturn: continue traversal but don't follow the
instruction return value. Continue: continue traversal and follow the
instruction return value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
2025-02-14 12:38:04 +01:00
Nikita Popov
1e64ea9914 Revert "[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)"
This reverts commit ee655ca27aad466bcc54f6eba03f7e564940ad5a.

A miscompilation has been reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577
2025-02-13 14:56:12 +01:00
Nikita Popov
ee655ca27a
[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)
This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of: Stop: stop traversal.
ContinueIgnoringReturn: continue traversal but don't follow the
instruction return value. Continue: continue traversal and follow the
instruction return value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
2025-02-13 09:36:35 +01:00
hanbeom
553f8e71dd
[InstCombine] simplify icmp pred x, ~x (#73990)
simplify compare between specific variable `X` and  `NOT(X)`

Proof: https://alive2.llvm.org/ce/z/KTCpjP

Fixed https://github.com/llvm/llvm-project/issues/57532.
2025-02-06 23:43:26 +08:00
Yingwei Zheng
e4980c68c1
[InstCombine] Handle isSubnormalOrZero idiom (#125501)
Alive2: https://alive2.llvm.org/ce/z/9nYE3J
2025-02-04 00:49:51 +08:00
Yingwei Zheng
9725595f3a
[InstCombine] Check nowrap flags when folding comparison of GEPs with the same base pointer (#121892)
Alive2: https://alive2.llvm.org/ce/z/P5XbMx
Closes https://github.com/llvm/llvm-project/issues/121890

TODO: It is still safe to perform this transform without nowrap flags if
the corresponding scale factor is 1 byte:
https://alive2.llvm.org/ce/z/J-JCJd
2025-02-01 20:41:15 +08:00
Yingwei Zheng
626c23112f
[ValueTracking] Use SimplifyQuery in isKnownNonEqual (#124942)
It is needed by https://github.com/llvm/llvm-project/pull/117442.
2025-02-01 15:13:11 +08:00
David Majnemer
503e4b2d54 [InstCombine] Perform some cleanups, add some tests
No functional change is intended.
2025-01-31 20:30:14 +00:00
Jacob Young
bb59eb8ed5
[InstCombine] fold unsigned predicates on srem result (#122520)
This allows optimization of more signed floor implementations when the
divisor is a known power of two to an arithmetic shift.

Proof for the implemented optimizations:
https://alive2.llvm.org/ce/z/j6C-Nz

Proof for the test cases:
https://alive2.llvm.org/ce/z/M_PBjw

---------

Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com>
2025-01-18 14:05:31 -08:00
Yingwei Zheng
03e7862962
[ValueTracking] Move getFlippedStrictnessPredicateAndConstant into ValueTracking. NFC. (#122064)
Needed by https://github.com/llvm/llvm-project/pull/121958.
2025-01-08 20:02:49 +08:00
Nikita Popov
63d4e0fb66 [InstCombine] Compute result directly on APInts
If the bitwidth is 2 and we add two 1s, the result may overflow.
This is fine in terms of correctness, but triggers the APInt ctor
assertion. Fix this by performing the calculation directly on APInts.

Fixes the issue reported in:
https://github.com/llvm/llvm-project/pull/114539#issuecomment-2574845003
2025-01-07 12:13:19 +01:00
Yingwei Zheng
fac4646997
[InstCombine] Check no wrap flags before folding icmp of GEPs with same indices (#121628)
Alive2: https://alive2.llvm.org/ce/z/Dr3Sbe
Closes https://github.com/llvm/llvm-project/issues/121581.
2025-01-04 17:23:57 +08:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since we no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Yingwei Zheng
f4f6566e44
[InstCombine] Fix type mismatch in foldICmpBinOpEqualityWithConstant (#119068)
Closes https://github.com/llvm/llvm-project/issues/119063.
2024-12-08 13:21:34 +08:00
Nikita Popov
ae73bc8e94 Reapply [InstCombine] Support gep nuw in icmp folds (#118472)
The profile runtime test failure this caused has been addressed in:
https://github.com/llvm/llvm-project/pull/118782

-----

Unsigned icmp of gep nuw folds to unsigned icmp of offsets. Unsigned
icmp of gep nusw nuw folds to unsigned samesign icmp of offsets.

Proofs: https://alive2.llvm.org/ce/z/VEwQY8
2024-12-06 14:41:10 +01:00
Yingwei Zheng
59720dc703
[InstCombine] Fold icmp spred (X *nsw Z), (Y *nsw Z) -> icmp pred Z, 0 if scmp(X, Y) is known (#118726)
```
icmp spred (X *nsw Z), (Y *nsw Z) -> icmp swap(spred) Z, 0 if X s< Y
icmp spred (X *nsw Z), (Y *nsw Z) -> icmp spred       Z, 0 if X s> Y
```
Alive2: https://alive2.llvm.org/ce/z/F2D0GE
2024-12-05 19:59:31 +08:00
Vitaly Buka
fc201d6133
Revert "[InstCombine] Support gep nuw in icmp folds" (#118698)
Reverts llvm/llvm-project#118472

Breaks profile tests on i386
https://lab.llvm.org/buildbot/#/builders/66/builds/7009
2024-12-04 15:07:27 -08:00
Ramkumar Ramachandra
51a895aded
IR: introduce struct with CmpInst::Predicate and samesign (#116867)
Introduce llvm::CmpPredicate, an abstraction over a floating-point
predicate, and a pack of an integer predicate with samesign information,
in order to ease extending large portions of the codebase that take a
CmpInst::Predicate to respect the samesign flag.

We have chosen to demonstrate the utility of this new abstraction by
migrating parts of ValueTracking, InstructionSimplify, and InstCombine
from CmpInst::Predicate to llvm::CmpPredicate. There should be no
functional changes, as we don't perform any extra optimizations with
samesign in this patch, or use CmpPredicate::getMatching.

The design approach taken by this patch allows for unaudited callers of
APIs that take a llvm::CmpPredicate to silently drop the samesign
information; it does not pose a correctness issue, and allows us to
migrate the codebase piece-wise.
2024-12-03 13:31:04 +00:00