1094 Commits

Author SHA1 Message Date
Yingwei Zheng
03e7862962
[ValueTracking] Move getFlippedStrictnessPredicateAndConstant into ValueTracking. NFC. (#122064)
Needed by https://github.com/llvm/llvm-project/pull/121958.
2025-01-08 20:02:49 +08:00
Nikita Popov
63d4e0fb66 [InstCombine] Compute result directly on APInts
If the bitwidth is 2 and we add two 1s, the result may overflow.
This is fine in terms of correctness, but triggers the APInt ctor
assertion. Fix this by performing the calculation directly on APInts.

Fixes the issue reported in:
https://github.com/llvm/llvm-project/pull/114539#issuecomment-2574845003
2025-01-07 12:13:19 +01:00
Yingwei Zheng
fac4646997
[InstCombine] Check no wrap flags before folding icmp of GEPs with same indices (#121628)
Alive2: https://alive2.llvm.org/ce/z/Dr3Sbe
Closes https://github.com/llvm/llvm-project/issues/121581.
2025-01-04 17:23:57 +08:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since we no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Yingwei Zheng
f4f6566e44
[InstCombine] Fix type mismatch in foldICmpBinOpEqualityWithConstant (#119068)
Closes https://github.com/llvm/llvm-project/issues/119063.
2024-12-08 13:21:34 +08:00
Nikita Popov
ae73bc8e94 Reapply [InstCombine] Support gep nuw in icmp folds (#118472)
The profile runtime test failure this caused has been addressed in:
https://github.com/llvm/llvm-project/pull/118782

-----

Unsigned icmp of gep nuw folds to unsigned icmp of offsets. Unsigned
icmp of gep nusw nuw folds to unsigned samesign icmp of offsets.

Proofs: https://alive2.llvm.org/ce/z/VEwQY8
2024-12-06 14:41:10 +01:00
Yingwei Zheng
59720dc703
[InstCombine] Fold icmp spred (X *nsw Z), (Y *nsw Z) -> icmp pred Z, 0 if scmp(X, Y) is known (#118726)
```
icmp spred (X *nsw Z), (Y *nsw Z) -> icmp swap(spred) Z, 0 if X s< Y
icmp spred (X *nsw Z), (Y *nsw Z) -> icmp spred       Z, 0 if X s> Y
```
Alive2: https://alive2.llvm.org/ce/z/F2D0GE
2024-12-05 19:59:31 +08:00
Vitaly Buka
fc201d6133
Revert "[InstCombine] Support gep nuw in icmp folds" (#118698)
Reverts llvm/llvm-project#118472

Breaks profile tests on i386
https://lab.llvm.org/buildbot/#/builders/66/builds/7009
2024-12-04 15:07:27 -08:00
Ramkumar Ramachandra
51a895aded
IR: introduce struct with CmpInst::Predicate and samesign (#116867)
Introduce llvm::CmpPredicate, an abstraction over a floating-point
predicate, and a pack of an integer predicate with samesign information,
in order to ease extending large portions of the codebase that take a
CmpInst::Predicate to respect the samesign flag.

We have chosen to demonstrate the utility of this new abstraction by
migrating parts of ValueTracking, InstructionSimplify, and InstCombine
from CmpInst::Predicate to llvm::CmpPredicate. There should be no
functional changes, as we don't perform any extra optimizations with
samesign in this patch, or use CmpPredicate::getMatching.

The design approach taken by this patch allows for unaudited callers of
APIs that take a llvm::CmpPredicate to silently drop the samesign
information; it does not pose a correctness issue, and allows us to
migrate the codebase piece-wise.
2024-12-03 13:31:04 +00:00
Nikita Popov
f33536468b
[InstCombine] Support gep nuw in icmp folds (#118472)
Unsigned icmp of gep nuw folds to unsigned icmp of offsets. Unsigned
icmp of gep nusw nuw folds to unsigned samesign icmp of offsets.

Proofs: https://alive2.llvm.org/ce/z/VEwQY8
2024-12-03 14:28:56 +01:00
Nikita Popov
bdc6faf775 [InstCombine] Support nusw in icmp of two geps with same base
Proof: https://alive2.llvm.org/ce/z/BYNQ7s
2024-12-03 11:51:14 +01:00
Nikita Popov
9c5a84b394 [InstCombine] Support nusw in icmp of gep with base
Proof: https://alive2.llvm.org/ce/z/omnQXt
2024-12-03 11:51:14 +01:00
Yingwei Zheng
c1ad064dd3
[InstCombine] Fold icmp spred (and X, highmask), C1 into icmp spred X, C2 (#118197)
Alive2: https://alive2.llvm.org/ce/z/Ffg64g
Closes https://github.com/llvm/llvm-project/issues/104772.
2024-12-03 16:19:12 +08:00
David Green
18abc7e0c5
[PatternMatch] Introduce m_c_Select (#114328)
This matches m_Select(m_Value(), L, R) or m_Select(m_Value(), R, L).
2024-11-25 13:47:23 +00:00
Jie Fu
aa746495af [InstCombine] Remove unused variable in InstCombineCompares.cpp (NFC)
/llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp:3190:14:
 error: unused variable 'CmpBW' [-Werror,-Wunused-variable]
    unsigned CmpBW = Ty->getScalarSizeInBits();
             ^
1 error generated.
2024-11-21 21:04:44 +08:00
Yingwei Zheng
2e60048641
[InstCombine] Fold zext(X) + C2 pred C -> X + C3 pred C4 (#110511)
Motivating case from
9852d85ec9/drivers/gpu/drm/drm_edid.c (L5238-L5240):
```
define i1 @src(i8 noundef %v13) {
entry:
  %conv1 = zext i8 %v13 to i32
  %add = add nsw i32 %conv1, -4
  %cmp = icmp ult i32 %add, 3
  %cmp4 = icmp slt i8 %v13, 4
  %cond = select i1 %cmp4, i1 true, i1 %cmp
  ret i1 %cond
}

define i1 @tgt(i8 noundef %v13) {
entry:
  %cmp4 = icmp slt i8 %v13, 7
  ret i1 %cmp4
}
```
2024-11-21 20:47:24 +08:00
Nikita Popov
78f7ca0980
[InstCombine] Use KnownBits predicate helpers (#115874)
Inside foldICmpUsingKnownBits(), instead of rolling our own logic based
on min/max values, make use of ICmpInst::compare() working on KnownBits.
This gives better results for the equality predicates. In practice, the
improvement is only for pointers, because isKnownNonEqual() handles the
non-pointer case.

I've adjusted some tests to prevent the new fold from triggering, to
retain their original intent of testing constant expressions.
2024-11-14 10:13:50 +01:00
Lee Wei
6ad1dd3bdc
[InstCombine] Fold (sext(a) & c1) == c2 to (a & c3) == trunc(c2) (#112646)
Fixes https://github.com/llvm/llvm-project/issues/85830.

Updated Alive proof: https://alive2.llvm.org/ce/z/KnvoP5
2024-11-11 12:51:54 +01:00
Kazu Hirata
e5bf14e9ac
[InstCombine] Remove unused includes (NFC) (#114709)
Identified with misc-include-cleaner.
2024-11-03 08:06:29 -08:00
David Majnemer
902acde341 [InstCombine] Optimize away certain additions using modular arithmetic
We can turn:
```
  %add = add i8 %arg, C1
  %and = and i8 %add, C2
  %cmp = icmp eq i1 %and, C3
```

into:
```
  %and = and i8 %arg, C2
  %cmp = icmp eq i1 %and, (C3 - C1) & C2
```

This is only worth doing if the sequence is the sole user of the addition
operation.
2024-10-28 22:51:35 +00:00
Noah Goldstein
294726d738 Reapply "[InstCombine] Folding (icmp eq/ne (and X, -P2), INT_MIN)" (#111236)
The underlying issue with msan was fixed by #113200
2024-10-23 09:12:08 -05:00
Kazu Hirata
8819267747
[InstCombine] Simplify code with SmallMapVector::operator[] (NFC) (#113022) 2024-10-19 14:38:40 -07:00
Jay Foad
85c17e4092
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
  Fn = Intrinsic::getOrInsertDeclaration(...);
  CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.
2024-10-17 16:20:43 +01:00
Yingwei Zheng
095d49da76
[InstCombine] Set samesign when converting signed predicates into unsigned (#112642)
Alive2: https://alive2.llvm.org/ce/z/6cqdt-
2024-10-17 20:43:48 +08:00
Yingwei Zheng
0936195311
[InstCombine] Drop samesign in InstCombine (#112480)
Closes https://github.com/llvm/llvm-project/issues/112476.
2024-10-16 19:13:52 +08:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Kazu Hirata
2d8cd32ae5
[InstCombine] Avoid repeated hash lookups (NFC) (#111618) 2024-10-08 20:37:33 -07:00
Vitaly Buka
574266ce33
Revert "[InstCombine] Folding (icmp eq/ne (and X, -P2), INT_MIN)" (#111236)
Reverts #110880 because of exposed issue is Msan instrumentation
#111212.

This reverts commit a64643688526114b50c25b3eda8a57855bd2be87.
2024-10-04 23:20:40 -07:00
Nikita Popov
67d247a441
[InstCombine] Decompose more icmps into masks (#110836)
Extend decomposeBitTestICmp() to handle cases where the resulting
comparison is of the form `icmp (X & Mask) pred C` with non-zero
`C`. Add a flag to allow code to opt-in to this behavior and use it in
the "log op of icmp" fold infrastructure.

This addresses regressions from #97289.

Proofs: https://alive2.llvm.org/ce/z/hUhdbU
2024-10-04 10:17:23 +02:00
Noah Goldstein
a646436885 [InstCombine] Folding (icmp eq/ne (and X, -P2), INT_MIN)
Folds to `(icmp slt/sge X, (INT_MIN + P2))`

Proofs: https://alive2.llvm.org/ce/z/vpNFY5

Closes #110880
2024-10-03 13:05:08 -05:00
Nikita Popov
7de492f90d [InstCombine] Preserve nuw flag in indexed compare fold
If all the involved GEPs have the nuw flag, also preserve it on
the resulting adds and GEPs.
2024-10-02 16:03:47 +02:00
Yingwei Zheng
2a2c35a9a6
[InstCombine] Fold icmp spred (mul nsw X, Z), (mul nsw Y, Z) into icmp spred X, Y (#110630)
```
icmp spred (mul nsw X, Z), (mul nsw Y, Z) -> icmp spred X, Y iff Z > 0
icmp spred (mul nsw X, Z), (mul nsw Y, Z) -> icmp spred Y, X iff Z < 0
```
Alive2: https://alive2.llvm.org/ce/z/9fXFfn
2024-10-01 22:16:05 +08:00
Yingwei Zheng
1efd1227b2
[InstCombine] Fold icmp eq/ne (X *nw Z), (Y *nw Z) -> icmp eq/ne Z, 0 when X != Y (#110413)
Alive2: https://alive2.llvm.org/ce/z/9oDP6K
I found this pattern in
04e75858d7/casadi/core/repmat.cpp (L70-L78).
2024-09-30 10:21:20 +08:00
Nikita Popov
b8d1bae648
[CmpInstAnalysis] Return decomposed bit test as struct (NFC) (#109819)
decomposeBitTestICmp() currently returns the result via two out
parameters plus an in-place modification of Pred. This changes it to
return an optional struct instead.

The motivation here is twofold. First, I'd like to extend this code to
handle cases where the comparison is against a value other than zero,
which would mean yet another out parameter. Second, while doing that I
was badly bitten by the in-place modification, so I'd like to get rid of
it.
2024-09-25 10:14:15 +02:00
Marina Taylor
5cd0900ef6
[InstCombine] Compare icmp inttoptr, inttoptr values directly (#107012)
InstCombine already has some rules for `icmp ptrtoint, ptrtoint` to drop
the casts and compare the source values. This change adds the same for
the reverse case with `inttoptr`.
2024-09-24 09:39:07 +02:00
Yingwei Zheng
872932b7a9
[InstCombine] Generalize icmp (shl nuw C2, Y), C -> icmp Y, C3 (#104696)
The motivation of this patch is to fold more generalized patterns like
`icmp ult (shl nuw 16, X), 64 -> icmp ult X, 2`.

Alive2: https://alive2.llvm.org/ce/z/gyqjQH
2024-09-18 19:10:41 +08:00
c8ef
86f0399c1f
[InstCombine] Fold expression using basic properties of floor and ceiling function (#107107)
alive2: ~~https://alive2.llvm.org/ce/z/Ag3Ki7~~
https://alive2.llvm.org/ce/z/ywP5t2
related: #76438

This patch adds the following foldings: `floor(x) <= x --> true` and `x
<= ceil(x) --> true`. We leverage the properties of these math functions
and ensure there is no floating point input of `nan`.

---------

Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2024-09-15 14:25:00 +04:00
Nikita Popov
de2b6cb6ab
[InstCombine] Fold icmp over select of cmp more aggressively (#105536)
When folding an icmp into a select, treat an icmp of a constant with a
one-use ucmp/scmp intrinsic as a simplification. These comparisons will
reduce down to an icmp.

This addresses a regression seen in Rust and also in llvm-opt-benchmark.
2024-08-22 09:47:35 +02:00
Volodymyr Vasylkun
7e23a23d5e
[InstCombine] Fold an unsigned icmp of ucmp/scmp with a constant to an icmp of the original arguments (#104471)
Proofs: https://alive2.llvm.org/ce/z/9mv8HU
2024-08-16 13:38:13 +01:00
Volodymyr Vasylkun
8320b97ab9
[InstCombine] Fold an unsigned comparison of add nsw X, C with a constant into a signed comparison (#103480)
Given an unsigned integer comparison of `add nsw X, C1` with some
constant `C2` we can fold it into a signed comparison of `X` and `C2 -
C1` under the following conditions:
  * There's a `nsw` flag on the addition
  * `C2` is non-negative
  * `X + C1` is non-negative
  * `C2 - C1` is non-negative
2024-08-14 15:31:19 +01:00
Nikita Popov
adb4cfe0b6 [InstCombine] Use getAllOnesValue()
Split off from https://github.com/llvm/llvm-project/pull/80309.
2024-08-13 15:04:23 +02:00
Noah Goldstein
b4ac7f4fc9 [InstCombine] Fold (icmp eq/ne (or (select cond, 0/NZ, 0/NZ), X), 0)
Four cases:
`(icmp eq (or (select cond, 0, NonZero), Other))`
 -> `(and cond, (icmp eq Other, 0))`
`(icmp ne (or (select cond, NonZero, 0), Other))`
 -> `(or cond, (icmp ne Other, 0))`
`(icmp ne (or (select cond, 0, NonZero), Other))`
 -> `(or (not cond), (icmp ne Other, 0))`
`(icmp eq (or (select cond, NonZero, 0), Other))`
 -> `(and (not cond), (icmp eq Other, 0))`

These cases came up in tests on: #88088

Proofs: https://alive2.llvm.org/ce/z/ojGo_J

Closes #88183
2024-08-05 23:48:49 +08:00
mskamp
533190acdb
[InstCombine] Canonicalize Bit Testing by Shifting to Bit 0 (#101838)
Implement a new transformation that fold the bit-testing expression
(icmp ne (and (lshr V B) 1) 0) to (icmp ne (and V (shl 1 B)) 0) for
constant V. This rule already existed for non-constant V and constants
other than 1; this restriction to non-constant V has been added in
commit c3b2111d975a39d19f0c5d635e2961a4449c5a71 to fix an infinite loop.
Avoid the infinite loop by allowing constant V only if the shift
instruction is an lshr and the constant is 1. Also fold the negated
variant of the LHS.
    
This transformation necessitates an adaption of existing tests in
`icmp-and-shift.ll` and `load-cmp.ll`. One test in `icmp-and-shift.ll`,
which previously was a negative test, now gets folded. Rename it to
indicate that it is a positive test.
    
Alive proof: https://alive2.llvm.org/ce/z/vcJJTx
    
Relates to issue #86813.
2024-08-04 09:32:40 +02:00
Yingwei Zheng
8bd9ade628
[InstCombine] Fold fcmp pred sqrt(X), 0.0 -> fcmp pred2 X, 0.0 (#101626)
Proof (Please run alive-tv with larger smt-to):
https://alive2.llvm.org/ce/z/-aqixk
FMF propagation: https://alive2.llvm.org/ce/z/zyKK_p

```
sqrt(X) < 0.0 --> false
sqrt(X) u>= 0.0 --> true
sqrt(X) u< 0.0 --> X u< 0.0
sqrt(X) u<= 0.0 --> X u<= 0.0
sqrt(X) > 0.0 --> X > 0.0
sqrt(X) >= 0.0 --> X >= 0.0
sqrt(X) == 0.0 --> X == 0.0
sqrt(X) u!= 0.0 --> X u!= 0.0
sqrt(X) <= 0.0 --> X == 0.0
sqrt(X) u> 0.0 --> X u!= 0.0
sqrt(X) u== 0.0 --> X u<= 0.0
sqrt(X) != 0.0 --> X > 0.0
!isnan(sqrt(X)) --> X >= 0.0
isnan(sqrt(X)) --> X u< 0.0
```

In most cases, `sqrt` cannot be eliminated since it has multiple uses.
But this patch will break data dependencies and allow optimizer to sink
expensive `sqrt` calls into successor blocks.
2024-08-03 13:35:22 +08:00
Yingwei Zheng
62e9f40949
[PatternMatch] Use m_SpecificCmp matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
  stockfish/movegen.ll 2541620819 2538599412 -0.12%
  minetest/profiler.cpp.ll 431724935 431246500 -0.11%
  abc/luckySwap.c.ll 581173720 580581935 -0.10%
  abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
  abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
  openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
  openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
  spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
  spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
  nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
2024-07-29 10:04:06 +08:00
Nikita Popov
11484cb817 [InstCombine] Pass SimplifyQuery to SimplifyDemandedBits()
This will enable calling SimplifyDemandedBits() with a SimplifyQuery
that has CondContext set in the future.

Additionally this also marginally strengthens the analysis by
retaining the original context instruction for one-use chains.
2024-07-01 12:41:21 +02:00
SahilPatidar
a7bf4124bf
[InstCombine] Fold fcmp pred (x - y), 0 into fcmp pred x, y (#85506)
Resolve #85245
Alive2: https://alive2.llvm.org/ce/z/F4rDwK

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: Jay Foad <jay.foad@gmail.com>
2024-06-29 15:37:35 +08:00
Stephen Tozer
d75f9dd1d2 Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Poseydon42
905e4ec747
[InstCombine] Implement folds of icmp of UCMP/SCMP call and a constant (#96118)
This patch handles various cases where an operation of the kind `icmp
(ucmp/scmp x, y), constant` folds to `icmp x, y`. Another patch with
cases where this operation folds to a constant (i.e. dumb cases like
`icmp eq (cmp x, y), 4` should be published in a couple of days.

I wasn't sure what negative tests should be added here, if any are
necessary at all. I'd love to hear your suggestions.

Proofs (ucmp): https://alive2.llvm.org/ce/z/qQ7ihz
Proofs (scmp): https://alive2.llvm.org/ce/z/cipKEn

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2024-06-22 12:23:20 +08:00