996 Commits

Author SHA1 Message Date
Nikita Popov
de8f782355 Revert "Simplify (a % b) lt/ge (b-1) into (a % b) eq/ne (b-1) (#72504)"
This reverts commit 01f4d40aad58c5c34a8ae30edbf4e0ebbf235838.

Causes test failures.
2024-01-16 11:39:42 +01:00
elhewaty
01f4d40aad
Simplify (a % b) lt/ge (b-1) into (a % b) eq/ne (b-1) (#72504)
Alive2: https://alive2.llvm.org/ce/z/i7zYtE
Fixes: https://github.com/llvm/llvm-project/issues/71280
2024-01-16 10:15:15 +01:00
Noah Goldstein
60e8915d22 [InstCombine] Add folds for (add/sub/disjoint_or/icmp C, (ctpop (not x)))
`(ctpop (not x))` <-> `(sub nuw nsw BitWidth(x), (ctpop x))`. The
`sub` expression can sometimes be constant folded depending on the use
case of `(ctpop (not x))`.

This patch adds fold for the following cases:

`(add/sub/disjoint_or C, (ctpop (not x))`
    -> `(add/sub/disjoint_or C', (ctpop x))`
`(cmp pred C, (ctpop (not x))`
    -> `(cmp swapped_pred C', (ctpop x))`

Where `C'` depends on how we constant fold `C` with `BitWidth(x)` for
the given opcode.

Proofs: https://alive2.llvm.org/ce/z/qUgfF3

Closes #77859
2024-01-15 12:05:38 -08:00
Vitaly Buka
253d2f931e
Revert "[InstCombine] Fold icmp pred (inttoptr X), (inttoptr Y) -> icmp pred X, Y" (#78023)
Reverts llvm/llvm-project#77832

To fix https://lab.llvm.org/buildbot/#/builders/236/builds/8673

Also truncation to shorter type looks incorrect.

Issue for tracking #78024 .
2024-01-13 11:15:30 -08:00
Yingwei Zheng
2aae304cbc
[InstCombine] Fold icmp pred (inttoptr X), (inttoptr Y) -> icmp pred X, Y (#77832)
NOTE: Alive2 proofs are unavailable because `inttoptr` is unsupported.
2024-01-12 23:03:07 +08:00
Yingwei Zheng
2eb7a82af3
[InstCombine] Relax the one-use constraints for icmp pred (binop X, Z), (binop Y, Z) (#76384)
This patch relaxes the one-use constraints for `icmp pred (binop X, Z),
(binop Y, Z)`. It will enable more optimizations with pointer
arithmetic.
One example in `boost::match_results::set_size`:

```
declare void @use(i64)
define i1 @src(ptr %a1, ptr %a2, ptr %add.ptr.i66, i64 %sub.ptr.rhs.cast.i) {
  %sub.ptr.lhs.cast.i = ptrtoint ptr %a1 to i64
  %sub.ptr.rhs.cast.i = ptrtoint ptr %a2 to i64
  %sub.ptr.sub.i = sub i64 %sub.ptr.lhs.cast.i, %sub.ptr.rhs.cast.i
  %sub.ptr.div.i = sdiv exact i64 %sub.ptr.sub.i, 24
  call void @use(i64 %sub.ptr.div.i)
  %sub.ptr.lhs.cast.i.i = ptrtoint ptr %add.ptr.i66 to i64
  %sub.ptr.sub.i.i = sub i64 %sub.ptr.lhs.cast.i.i, %sub.ptr.rhs.cast.i
  %sub.ptr.div.i.i = sdiv exact i64 %sub.ptr.sub.i.i, 24
  %cmp.i.not.i.i = icmp eq i64 %sub.ptr.div.i.i, %sub.ptr.div.i
  ret i1 %cmp.i.not.i.i
}
define i1 @tgt(ptr %a1, ptr %a2, ptr %add.ptr.i66, i64 %sub.ptr.rhs.cast.i) {
  %sub.ptr.lhs.cast.i = ptrtoint ptr %a1 to i64
  %sub.ptr.rhs.cast.i = ptrtoint ptr %a2 to i64
  %sub.ptr.sub.i = sub i64 %sub.ptr.lhs.cast.i, %sub.ptr.rhs.cast.i
  %sub.ptr.div.i = sdiv exact i64 %sub.ptr.sub.i, 24
  call void @use(i64 %sub.ptr.div.i)
  %cmp.i.not.i.i = icmp eq i64 %sub.ptr.sub.i.i, %sub.ptr.sub.i
  ret i1 %cmp.i.not.i.i
}
```
2024-01-07 20:16:12 +08:00
Z572
86ef039220
[InstCombine] Simplify compare abs(X) and X. (#76385)
fix https://github.com/llvm/llvm-project/issues/72653
proof: https://alive2.llvm.org/ce/z/LZzZaj
2024-01-05 17:08:49 +08:00
Yingwei Zheng
6681650025
[InstCombine] Revert the signed icmp -> unsigned icmp canonicalization when folding icmp Pred min|max(X, Y), Z (#76685)
This patch tries to flip the signedness of predicates when folding an
unsigned icmp with a signed min/max. It will enable more optimizations
as we canonicalizes a signed icmp into an unsigned icmp when both
operands are known to have the same sign.
Fixes #76672.

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=949ec83eaf6fa6dbffb94c2ea9c0a4d5efdbd239&to=2deca1aea8a4e13609bab72c522a97d424f0fc2d&stat=instructions:u


|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|-0.00%|+0.01%|+0.05%|-0.12%|-0.01%|-0.03%|-0.00%|

NOTE: We can flip the signedness of predicate if both operands are
negative. But I don't see the benefit of handling these cases.
2024-01-05 14:39:16 +08:00
Nikita Popov
9d5b0965c4 [InstCombine] Add helper for commutative icmp folds (NFCI)
Add a common place for icmp folds that should be tried with both
operand orders, so we don't have to repeat this pattern for
individual folds.
2024-01-02 16:16:32 +01:00
Mikhail Gudim
7a581c34f1
Reland "[InstCombine] Extend foldICmpBinOp to add-like or" (#76531)
The original PR had a typo which was causing a bug.
2023-12-30 01:55:07 -05:00
XChy
dafd17895f [InstCombine][NFC] Format code in foldCmpLoadFromIndexedGlobal 2023-12-29 17:42:38 +08:00
Yingwei Zheng
aacff347af
[InstCombine] Simplify icmp pred (sdiv exact X, C), (sdiv exact Y, C) into icmp pred X, Y when C is positive (#76409)
Alive2: https://alive2.llvm.org/ce/z/u49dQ9
It will improve the codegen of `std::_Vector_base<T>::~_Vector_base()` when `sizeof(T)` is not a power of 2.

NOTE: We can also fold `icmp signed-pred (sdiv exact X, C), (sdiv exact Y, C)` into `icmp signed-pred (sdiv exact Y, C), (sdiv exact X, C)` when C is negative. But I don't think it enables more optimizations for real-world applications.
2023-12-27 06:06:16 +08:00
Mikhail Gudim
411cba215a
Revert "[InstCombine] Extend foldICmpBinOp to add-like or. (#71… (#76167)
…396)"

This reverts commit 8773c9be3d9868288f1f46957945d50ff58e4e91.
2023-12-21 11:41:09 -05:00
Mikhail Gudim
8773c9be3d
[InstCombine] Extend foldICmpBinOp to add-like or. (#71396)
InstCombine canonicalizes `add` to `or` when possible, but this makes
some optimizations applicable to `add` to be missed because they don't
realize that the `or` is equivalent to `add`.

In this patch we generalize `foldICmpBinOp` to handle such cases.
2023-12-20 17:28:57 -05:00
Yingwei Zheng
b7f50e13d8
[InstCombine] Improve foldICmpWithDominatingICmp with DomConditionCache (#75370)
This patch uses affected values from DomConditionCache(introduced by #73662), instead of a cheap/incomplete check `getSinglePredecessor`.
2023-12-14 21:02:10 +08:00
Kazu Hirata
a16429365c [Transforms] Remove unnecessary includes (NFC) 2023-12-09 18:23:06 -08:00
Nikita Popov
4a2a6397f1 [InstCombine] Relax one-use check for icmp of gep fold
Instead of checking whether the GEP as a whole is constant, only
check whether it has constant incides. This matches what we do in
other places in this code.

This has little practical impact, because it is mostly already
handled through other cases anyway. We see a difference for
non-inbounds equality comparisons.
2023-12-08 15:45:58 +01:00
Nikita Popov
cf47af493b
[InstCombine] Generalize folds for inversion of icmp operands (#74317)
We have a bunch of folds that basically perform X pred Y to ~Y pred ~X
for various special cases where this saves an instruction.

Generalize these folds to use isFreeToInvert(). We have to make sure
that we consume an instruction in either of the inversions, otherwise
we're just going to swap the icmp back and forth.

Fixes https://github.com/llvm/llvm-project/issues/74302.
2023-12-08 11:25:41 +01:00
Nikita Popov
d77067d08a
[ValueTracking] Add dominating condition support in computeKnownBits() (#73662)
This adds support for using dominating conditions in computeKnownBits()
when called from InstCombine. The implementation uses a
DomConditionCache, which stores which branches may provide information
that is relevant for a given value.

DomConditionCache is similar to AssumptionCache, but does not try to do
any kind of automatic tracking. Relevant branches have to be explicitly
registered and invalidated values explicitly removed. The necessary
tracking is done inside InstCombine.

The reason why this doesn't just do exactly the same thing as
AssumptionCache is that a lot more transforms touch branches and branch
conditions than assumptions. AssumptionCache is an immutable analysis
and mostly gets away with this because only a handful of places have to
register additional assumptions (mostly as a result of cloning). This is
very much not the case for branches.

This change regresses compile-time by about ~0.2%. It also improves
stage2-O0-g builds by about ~0.2%, which indicates that this change results
in additional optimizations inside clang itself.

Fixes https://github.com/llvm/llvm-project/issues/74242.
2023-12-06 14:17:18 +01:00
Nikita Popov
d6e8f3b9a2 [ValueTracking] Convert isKnownPositive() to use SimplifyQuery (NFC) 2023-11-29 11:08:39 +01:00
Nikita Popov
d01237c45b
[InstCombine] Make indexed compare fold GEP source type independent (#71663)
The indexed compare fold converts comparisons of GEPs with same
(indirect) base into comparisons of offset. Currently, it only supports
GEPs with the same source element type.

This change makes the transform operate on offsets instead, which
removes the type dependence. To keep closer to the scope of the original
implementation, this keeps the limitation that we should only have at
most one variable index per GEP.

This addresses the main regression from
https://github.com/llvm/llvm-project/pull/68882.

TBH I have some doubts that this is really a useful transform (at least
for the case where there are extra pointer users, so we have to
rematerialize pointers at some point). I can only assume it exists for a
reason...
2023-11-28 09:16:04 +01:00
Noah Goldstein
b7c0f79926 [InstCombine] Replace isFreeToInvert + CreateNot with getFreelyInverted
This is nearly an NFC, the only change is potentially to order that
values are created/names.

Otherwise it is a slight speed boost/simplification to avoid having to
go through the `getFreelyInverted` recursive logic twice to simplify
the extra `not` op.
2023-11-20 17:59:27 -06:00
Noah Goldstein
99387e33dc [InstCombine] Add transforms for (icmp uPred (trunc x),(truncOrZext(y)))->(icmp uPred x,y)
Three transforms (all commutative):
https://alive2.llvm.org/ce/z/Bc-nh4

Closes #71309
2023-11-19 12:15:04 -06:00
elhewaty
daddf402d9
[InstCombine] Fold xored one-complemented operand comparisons (#69882)
- [InstCombine] Add test coverage for comparisons of operands including
one-complemented oparands(NFC).
- [InstCombine] Fold xored one-complemented operand comparisons.
Alive2: https://alive2.llvm.org/ce/z/PZMJeB
Fixes #69803.
2023-11-14 21:54:03 +08:00
Léonard Oest O'Leary
ff36411b23
[InstCombine] Use zext's nneg flag for icmp folding (#70845)
This PR fixes https://github.com/llvm/llvm-project/issues/55013 : the
max intrinsics is not generated for this simple loop case :
https://godbolt.org/z/hxz1xhMPh. This is caused by a ICMP not being
folded into a select, thus not generating the max intrinsics.

For the story :

Since LLVM 14, SCCP pass got smarter by folding sext into zext for
positive ranges : https://reviews.llvm.org/D81756. After this change,
InstCombine was sometimes unable to fold ICMP correctly as both of the
arguments pointed to mismatched zext/sext. To fix this, @rotateright
implemented this fix : https://reviews.llvm.org/D124419 that tries to
resolve the mismatch by knowing if the argument of a zext is positive
(in which case, it is like a sext) by using ValueTracking, however
ValueTracking is not smart enough to infer that the value is positive in
some cases. Recently, @nikic implemented #67982 which keeps the
information that a zext is non-negative. This PR simply uses this
information to do the folding accordingly.

TLDR : This PR uses the recent nneg tag on zext to fold the icmp
accordingly in instcombine.

This PR also contains test cases for sext/zext folding with InstCombine
as well as a x86 regression tests for the max/min case.
2023-11-13 00:53:53 +08:00
Nikita Popov
567c02a80e [InstCombine] Remove inttoptr/ptrtoint handling from indexed compare fold
Looking through inttoptr / ptrtoint intermixed with GEPs is very
questionable from a provenance perspective. We also don't seem to
have any test coverage that shows this is useful (apart from one
test I added to guard against a crash).
2023-11-08 11:13:57 +01:00
Nikita Popov
abc27bd31f [InstCombine] Avoid some FP cast constant expressions (NFCI)
Instead of doing fptoxi and xitofp casts to check for round-trip,
directly check the IsExact flag on the convertToInteger() API.
2023-11-06 14:42:42 +01:00
Nikita Popov
03110ddeb2 [IR] Remove ZExtOperator (NFC)
Now that zext constant expressions are no longer supported,
ZExtInst should be used instead.
2023-11-03 14:52:59 +01:00
Nikita Popov
930bc6c7b5 [InstCombine] Avoid use of ConstantExpr::getSExtOrTrunc()
InstCombine will canonicalize the index type, no need to handle
the non-canonical case.
2023-11-01 16:10:57 +01:00
Nikita Popov
5c8a71d82b [InstCombine] Remove unnecessary icmp of all-zero gep folds (NFC)
All-zero GEPs will be removed anyway, no need to special-case them
here.
2023-10-30 10:01:21 +01:00
Amara Emerson
2228b35f93 Revert "Revert "[InstCombine] Add oneuse checks to shr + cmp constant folds.""
This reverts commit d37b283cdd37feca5ea71456cf350005add268e7.

There was a simple logic bug in the else path. Tests codegen is different with
the fix.
2023-10-28 03:12:15 -07:00
Noah Goldstein
0289dad538 [InstCombine] Add folds for (icmp eq/ne (and (add/sub/xor A, P2), P2), 0/P2)
- `(icmp eq/ne (and (add/sub/xor X, P2), P2), P2)`
    -> `(icmp eq/ne (and X, P2), 0)`
- `(icmp eq/ne (and (add/sub/xor X, P2), P2), 0)`
    -> `(icmp eq/ne (and X, P2), P2)`

Folds like this come up with reasonable regularity in odd/even loops.

Proofs: https://alive2.llvm.org/ce/z/45pq2x

Closes #67836
2023-10-27 17:36:30 -05:00
Amara Emerson
d37b283cdd Revert "[InstCombine] Add oneuse checks to shr + cmp constant folds."
This reverts commit a66051c68a43af39f9fd962f71d58ae0efcf860d.

This seems to have caused issue #70509 so reverting until I have time
to investigate.
2023-10-27 14:27:58 -07:00
Amara Emerson
a66051c68a [InstCombine] Add oneuse checks to shr + cmp constant folds.
This change has virtually no code size regressions on the llvm test suite (+ SPECs)
while having these improvements (measured with -Os on Darwin arm64):

External/S.../CFP2006/450.soplex/450.soplex    214024.00      213920.00     -0.0%
External/S...7speed/641.leela_s/641.leela_s     93412.00       93348.00     -0.1%
External/S...17rate/541.leela_r/541.leela_r     93412.00       93348.00     -0.1%
MultiSourc.../Applications/JM/lencod/lencod    426044.00      425748.00     -0.1%
MultiSourc...rks/mediabench/gsm/toast/toast     20436.00       20416.00     -0.1%
MultiSourc...ench/telecomm-gsm/telecomm-gsm     20436.00       20416.00     -0.1%
MultiSourc...Prolangs-C/assembler/assembler     16172.00       16156.00     -0.1%
MultiSourc...nch/mpeg2/mpeg2dec/mpeg2decode     35332.00       35256.00     -0.2%
SingleSour...Adobe-C++/stepanov_abstraction      6904.00        6888.00     -0.2%
External/SPEC/CINT2000/254.gap/254.gap         366060.00      365132.00     -0.3%
MultiSourc...-ProxyApps-C++/PENNANT/PENNANT     79688.00       79484.00     -0.3%
External/S...NT2006/464.h264ref/464.h264ref    352044.00      351132.00     -0.3%
SingleSour...arks/Adobe-C++/functionobjects     15524.00       15480.00     -0.3%
SingleSour...arks/Adobe-C++/stepanov_vector     10728.00       10696.00     -0.3%
SingleSour...ks/Misc-C++/stepanov_container     16900.00       16848.00     -0.3%
MultiSource/Applications/oggenc/oggenc         124184.00      123780.00     -0.3%
SingleSour...tout-C++/Shootout-C++-wordfreq      7060.00        7036.00     -0.3%
MultiSourc...ity-rijndael/security-rijndael      8976.00        8936.00     -0.4%
MultiSource/Benchmarks/McCat/18-imp/imp          9816.00        9772.00     -0.4%
SingleSour...chmarks/Misc-C++/stepanov_v1p2      1772.00        1764.00     -0.5%
MultiSourc...iabench/g721/g721encode/encode      5492.00        5464.00     -0.5%
MultiSourc...rks/McCat/03-testtrie/testtrie      1364.00        1344.00     -1.5%
SingleSour.../execute/GCC-C-execute-pr42833       400.00         364.00     -9.0%

Doing so also prevents a regression described in https://reviews.llvm.org/D143624

Differential Revision: https://reviews.llvm.org/D149918
2023-10-26 11:36:10 -07:00
Nikita Popov
ea99df2e84 [InstCombine] Rename some variables (NFC)
Split NFC rename out from #69882.
2023-10-25 17:08:40 +02:00
Nikita Popov
c912f88c21 [InstCombine] Remove false commutativity from processUMulZExtIdiom() (NFCI)
This fold requires a fold against a constant, which will always be
on the RHS. If the swapped fold actually did trigger, it would
result in a miscompile, because it did not work with the swapped
predicate when swapping operands.
2023-10-25 11:31:31 +02:00
Nikita Popov
82aeedc852 [InstCombine] Remove unnecessary eq/ne handling from processUMulZExtIdiom() (NFCI)
The eq/ne pattern being matched will get canonicalized to the
ugt/ult form.
2023-10-25 11:18:44 +02:00
Nikita Popov
4baeed803f [InstCombine] Remove unnecessary handling of non-canonical predicates (NFCI)
ule/uge with a constant will be converted to ult/ugt, so there is
no need to handle these variants.
2023-10-25 10:42:25 +02:00
Nikita Popov
3a39346a06 [InstCombine] Remove unnecessary typed pointer handling (NFC) 2023-10-25 10:34:39 +02:00
Nikita Popov
7df92fbe74 [InstCombine] Remove redundant icmp gep fold (NFCI)
Gep with zero indices will be folded away independently. It will
only be retained for splat geps, for which the transform is not
applicable anyway.
2023-10-25 10:31:02 +02:00
Nikita Popov
b901acad54 [InstCombine] Remove unnecessary typed pointer fold (NFCI)
Pointer bitcasts will be optimized away, no need to fold them for
icmps in particular.
2023-10-24 16:43:22 +02:00
XChy
b22917e6e2
[InstCombine] Fold Ext(i1) Pred shr(A, BW - 1) => i1 Pred A s< 0 (#68244)
Resolves #67916 .
This patch folds `Ext(icmp (A, xxx)) Pred shr(A, BW - 1)` into `i1 Pred
A s< 0`.
[Alive2](https://alive2.llvm.org/ce/z/k53Xwa).
2023-10-13 22:02:57 +08:00
Antonio Frighetto
1c12dcc910 [InstCombine] Extend sext/zext boolean additions to vectors
Reported-by: shao-hua-li

Fixes: https://github.com/llvm/llvm-project/issues/68745.
2023-10-12 14:38:54 +00:00
Yingwei Zheng
b3b3336e82
[InstCombine] Simplify the pattern a ne/eq (zext/sext (a ne/eq c)) (#65852)
This patch folds the pattern `a ne/eq (zext/sext (a ne/eq c))` into a boolean constant or a compare.
Clang vs GCC: https://godbolt.org/z/4ro817WE8
Proof for `zext`: https://alive2.llvm.org/ce/z/6z9NRF
Proof for `sext`: https://alive2.llvm.org/ce/z/tv5wuE
Fixes #65073.
2023-10-06 20:57:58 +08:00
elhewaty
5d8fb473d3
[InstCombine] Fold comparison of adding two z/sext booleans (#67895)
- Add test coverage for sext/zext boolean additions
- [InstCombine] Fold comparison of adding two z/sext booleans

Fixes https://github.com/llvm/llvm-project/issues/64859.
2023-10-06 17:29:13 +08:00
Yingwei Zheng
c4e2fcff78
[InstCombine] Don't simplify icmp eq/ne OneUse(A ^ Cst1), Cst2 in foldICmpEquality
This special case will be handled in foldICmpXorConstant later.
See also commit e9cb50a1e351e90be1a8e4ac1a4564cfc44a984b.
2023-09-29 21:21:00 +08:00
Yingwei Zheng
e9cb50a1e3
[InstCombine] Fix infinite loop in #67273
Closes #67783.
2023-09-29 20:26:39 +08:00
Nikita Popov
6ce7461eea [InstCombine] Avoid uses of ConstantExpr::getCast()
Add a generalized getLosslessTrunc() helper to simplify this.
2023-09-29 11:32:41 +02:00
Nikita Popov
b4afade175 [InstCombine] Avoid use of ConstantExpr::getZExtOrBitcast() (NFC)
Use the constant folding API instead. In the second case using
IR builder should also work, but the way the instructions are
created an inserted there is very unusual, so I've left it alone.
2023-09-29 09:44:43 +02:00
Yingwei Zheng
e158add121
[InstCombine] Canonicalize icmp eq/ne (A ^ C), B to icmp eq/ne (A ^ B), C (#67273)
This patch canonicalizes `icmp eq/ne (A ^ Cst), B` to `icmp eq/ne (A ^ B), Cst` since the latter form exposes more optimizations.
Proof: https://alive2.llvm.org/ce/z/9DbhGc
Fixes #65968.
2023-09-26 19:33:31 +08:00