301 Commits

Author SHA1 Message Date
Noah Goldstein
b3ee127e7d [InstCombine] integrate N{U,S}WAddLike into existing folds
Just did a quick replacement of `N{U,S}WAdd` with the `Like` variant
that also matches `or disjoint`

Closes #86082
2024-03-21 13:03:38 -05:00
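
The `Like` matchers in the commit above rest on the fact that an `or` whose operands share no set bits behaves like an add that cannot wrap. A minimal brute-force sketch of that fact, assuming an 8-bit model of the integer type; `to_signed` and `add_like_counterexamples` are hypothetical helper names, not LLVM APIs:

```python
def to_signed(v, bits=8):
    """Reinterpret an unsigned `bits`-wide value as two's-complement signed."""
    return v - (1 << bits) if v >> (bits - 1) else v


def add_like_counterexamples(bits=8):
    """Pairs with disjoint set bits where `a | b` is not a no-wrap `a + b`."""
    limit = 1 << bits
    smin, smax = -(limit >> 1), (limit >> 1) - 1
    bad = []
    for a in range(limit):
        for b in range(limit):
            if a & b:  # `or disjoint` promises no common set bits
                continue
            same_value = (a | b) == a + b
            nuw = a + b < limit
            nsw = smin <= to_signed(a, bits) + to_signed(b, bits) <= smax
            if not (same_value and nuw and nsw):
                bad.append((a, b))
    return bad
```

An empty result means every disjoint pair satisfies both the unsigned and the signed no-wrap condition in this model.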
Noah Goldstein
79ce933114 [InstCombine] Extend (lshr/shl (shl/lshr -1, x), x) -> (lshr/shl -1, x) for multi-use
We previously did this iff the inner `(shl/lshr -1, x)` was
one-use. No instructions are added even if the inner `(shl/lshr -1,
x)` is multi-use and this canonicalization both makes the resulting
instruction easier to analyze and shrinks its dependency chain.

Closes #81576
2024-02-13 12:53:16 -06:00
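
The identity behind the `(lshr/shl (shl/lshr -1, x), x)` canonicalization above can be checked by enumeration. A minimal sketch, assuming unsigned fixed-width arithmetic and only in-range shift amounts; the function name is hypothetical:

```python
def all_ones_double_shift_holds(bits=8):
    """Check both directions of the fold for every in-range shift amount x."""
    ones = (1 << bits) - 1  # the constant -1 as an unsigned value
    for x in range(bits):
        shl_ones = (ones << x) & ones
        lshr_ones = ones >> x
        # lshr (shl -1, x), x == lshr -1, x
        if shl_ones >> x != lshr_ones:
            return False
        # shl (lshr -1, x), x == shl -1, x
        if (lshr_ones << x) & ones != shl_ones:
            return False
    return True
```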
Craig Topper
55c6d91034
[InstCombine] Preserve nuw/nsw/exact flags when transforming (C shift (A add nuw C1)) --> ((C shift C1) shift A). (#79490)
If we weren't shifting out any non-zero bits or changing the sign before the transform, we
shouldn't be after.

Alive2: https://alive2.llvm.org/ce/z/mB-rWz
2024-01-26 11:33:53 -08:00
AtariDreams
96adf69ba9
[InstCombine] Remove one-use check if other logic operand is constant (#77973)
By using `match(W, m_ImmConstant())`, we no longer need to worry about
the one-use restriction.
2024-01-23 12:10:59 +01:00
Yingwei Zheng
741975df92
[InstCombine][InstSimplify] Pass SimplifyQuery to computeKnownBits directly. NFC. (#74246)
This patch passes `SimplifyQuery` to `computeKnownBits` directly in
`InstSimplify` and `InstCombine`.
As the `DomConditionCache` in #73662 is only used in `InstCombine`, it
is inconvenient to introduce a new argument `DC` to `computeKnownBits`.
2023-12-04 02:26:39 +08:00
Yingwei Zheng
a2cf44b72c
[InstCombine] Propagate NUW flags for shl (lshr X, C1), C2 -> shl X, C2-C1 (#72525)
Alive2: https://alive2.llvm.org/ce/z/KNXNQA
2023-11-20 23:29:44 +08:00
Yingwei Zheng
6da4ecdf92
[InstCombine] Infer shift flags with unknown shamt (#72535)
Alive2: https://alive2.llvm.org/ce/z/82Wr3q

Related patch:
2dd52b4527
2023-11-18 15:15:14 +08:00
Yingwei Zheng
26ce3e4239
[InstCombine] Preserve NSW flags for lshr (mul nuw X, C1), C2 -> mul nuw nsw X, (C1 >> C2) (#72625)
Alive2: https://alive2.llvm.org/ce/z/TU_V9M

This missed optimization was discovered with the help of
https://github.com/AliveToolkit/alive2/pull/962.
2023-11-17 21:50:21 +08:00
Yingwei Zheng
e8fe15ccf1
[InstCombine] Add exact flags for ext idiom shr (shl X, Y), Y (#72483)
This patch adds exact flags for sext/zext idiom `shr (shl X, Y), Y`.
Alive2: https://alive2.llvm.org/ce/z/xYFpfB

We can generalize it to handle pattern `shr (shl X, Y), Z` with `Y u>=
Z` (e.g., non-splat vectors). But I don't think it's worth the effort.

This missed optimization was discovered with the help of
https://github.com/AliveToolkit/alive2/pull/962.
2023-11-16 17:30:01 +08:00
Yingwei Zheng
4fdc289d4a
[InstCombine] Infer nsw flag for (X <<nuw C1) >>u C --> X << (C1 - C) (#72407)
Alive2: https://alive2.llvm.org/ce/z/nnHAPy
This missed optimization was discovered with the help of
https://github.com/AliveToolkit/alive2/pull/962.
2023-11-16 02:34:59 +08:00
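
Why `nsw` can be inferred for `(X <<nuw C1) >>u C --> X << (C1 - C)` can be seen by enumeration. A minimal sketch under stated assumptions: an 8-bit model, `C1 > C`, and `C >= 1` (the last condition is an assumption of this sketch, not something the commit message states); the function name is hypothetical:

```python
def nsw_inference_counterexamples(bits=8):
    """Inputs where the folded shl differs from the original or exceeds the signed max."""
    limit = 1 << bits
    signed_max = (1 << (bits - 1)) - 1
    bad = []
    for c1 in range(1, bits):
        for c in range(1, c1):  # assumed: 1 <= C < C1
            for x in range(limit):
                if (x << c1) >= limit:  # `shl nuw` would be poison, nothing to preserve
                    continue
                folded = x << (c1 - c)
                if folded != (x << c1) >> c or folded > signed_max:
                    bad.append((x, c1, c))
    return bad
```

With at least one bit shifted back out, the folded value always fits below the sign bit, which is what the empty result reflects.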
Nikita Popov
002da67d01 [InstCombine] Require ImmConstant in shift of shift fold
This fixes an infinite loop reported at:
82f68a992b (commitcomment-132406739)
2023-11-13 14:56:06 +01:00
Nikita Popov
707bb42163 [InstCombine] Require immediate constant in canEvaluateShifted()
Otherwise we risk infinite loops when shift constant expressions
are no longer supported.
2023-11-10 16:12:49 +01:00
Nikita Popov
8391f405cb [InstCombine] Avoid uses of ConstantExpr::getLShr()
Use the constant folding API instead.
2023-11-10 15:50:42 +01:00
Noah Goldstein
2dd52b4527 [InstCombine] Improve logic for adding flags to shift instructions.
Instead of relying on constant operands, use known bits to do the
computation.

Proofs: https://alive2.llvm.org/ce/z/M-aBnw

Differential Revision: https://reviews.llvm.org/D157532
2023-10-12 16:05:19 -05:00
Nikita Popov
b4afade175 [InstCombine] Avoid use of ConstantExpr::getZExtOrBitcast() (NFC)
Use the constant folding API instead. In the second case using
IR builder should also work, but the way the instructions are
created and inserted there is very unusual, so I've left it alone.
2023-09-29 09:44:43 +02:00
Nikita Popov
7eda63b814 [InstCombine] Avoid use of ConstantExpr::getZExt() (NFC)
Check the result of constant folding here, as I'm not confident
that no constant expressions can make it in here.
2023-09-28 17:13:49 +02:00
Jeremy Morse
d529943a27 [NFC][RemoveDIs] Prefer iterators over inst-pointers in InstCombine
As per my proposal for how to eliminate debug intrinsics [0], for various
places in InstCombine prefer to insert using an instruction iterator rather
than an instruction pointer. This is so that we can eventually pass more
information in the iterator class. These call-sites where I've changed the
spelling are those that are necessary to build a stage2 clang to produce an
identical binary in the coming no-debug-intrinsics mode.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152543
2023-09-11 15:04:51 +01:00
Nikita Popov
357a002c7c [InstCombine] Remove old add in foldLShrOverflowBit()
Explicitly remove the old add instruction, so we don't need a
separate InstCombine iteration to DCE it.
2023-06-01 15:30:24 +02:00
Noah Goldstein
5a3d9e0617 [InstCombine] Transform (shift X,Or(Y,BitWidth-1)) -> (shift X,BitWidth-1)
shl : https://alive2.llvm.org/ce/z/_B7Qca
lshr: https://alive2.llvm.org/ce/z/6eXz_W
ashr: https://alive2.llvm.org/ce/z/oGEx-q

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D145326
2023-03-06 20:30:06 -06:00
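
The `(shift X, Or(Y, BitWidth-1))` fold above works because the `or` can only produce one shift amount that is still in range; every other value makes the shift poison. A minimal sketch of that observation, with a hypothetical function name:

```python
def in_range_amounts(bits=8):
    """Every value of `y | (bits - 1)` that is still a legal shift amount."""
    top = bits - 1
    return {y | top for y in range(1 << bits) if (y | top) < bits}
```

For example, `in_range_amounts(8)` contains only `7`, so replacing the shift amount with `BitWidth-1` cannot change any defined result.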
Kazu Hirata
f8f3db2756 Use APInt::count{l,r}_{zero,one} (NFC)
2023-02-19 22:04:47 -08:00
Sanjay Patel
8e8467d9d8 [InstCombine] canonicalize "extract lowest set bit" away from cttz intrinsic
1 << (cttz X) --> -X & X
https://alive2.llvm.org/ce/z/qv3E9e

This creates an extra use of the input value, so that's generally
not preferred, but there are advantages to this direction:
1. 'negate' and 'and' allow for better analysis than 'cttz'.
2. This is more likely to induce follow-on transforms (in the
   example from issue #60801, we'll get the decrement pattern).
3. The more basic ALU ops are more likely to result in better
   codegen across a variety of targets.

This won't solve the motivating bugs (see issue #60799) because
we do not recognize the redundant icmp+sel, and the x86 backend
may not have the pattern-matching to produce the optimal BMI
instructions.

Differential Revision: https://reviews.llvm.org/D144329
2023-02-19 17:29:40 -05:00
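
The `1 << (cttz X) --> -X & X` rewrite above can be checked exhaustively for small widths. A minimal sketch, assuming an 8-bit model; `cttz` here is a hand-written stand-in for the intrinsic and both names are hypothetical. `X == 0` is skipped because the original shift amount would be out of range there:

```python
def cttz(x, bits=8):
    """Count trailing zeros; returns `bits` for x == 0."""
    if x == 0:
        return bits
    return (x & -x).bit_length() - 1


def lowest_set_bit_mismatches(bits=8):
    """Nonzero inputs where `1 << cttz(x)` differs from `-x & x`."""
    mask = (1 << bits) - 1
    return [
        x
        for x in range(1, 1 << bits)
        if (1 << cttz(x, bits)) != (-x & x) & mask
    ]
```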
Noah Goldstein
c17ccced4b Recommit "Reorder (shl (add/sub (shl x, C0), y), C1) -> (add/sub (shl x, C0 + C1), (shl y, C1))" 2nd Try
First time caused build failure:
    https://lab.llvm.org/buildbot/#/builders/183/builds/10447
but after investigating it seems to be unrelated. The same
test/build passed later with the original commit here:
    https://lab.llvm.org/buildbot/#/builders/183/builds/10448

This just extends the existing pattern for AND/XOR/OR to add/sub
and exposes a bit more parallelism in the instruction sequence.

Alive2:
Add  - https://alive2.llvm.org/ce/z/dSmPkV
Sub1 - https://alive2.llvm.org/ce/z/6rpi5V
Sub2 - https://alive2.llvm.org/ce/z/UfYeUd

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D141875
2023-01-27 20:35:40 -06:00
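
The reordering `(shl (add/sub (shl x, C0), y), C1) -> (add/sub (shl x, C0 + C1), (shl y, C1))` is distributivity in modular arithmetic. A minimal brute-force sketch, assuming a 6-bit model to keep the enumeration small and `C0 + C1` below the bit width; the function name is hypothetical:

```python
def reorder_counterexamples(bits=6):
    """Inputs where the reordered add/sub forms differ from the originals."""
    mask = (1 << bits) - 1
    bad = []
    for c0 in range(bits):
        for c1 in range(bits - c0):  # keep c0 + c1 a legal shift amount
            for x in range(1 << bits):
                inner = (x << c0) & mask         # shl x, C0
                wide = (x << (c0 + c1)) & mask   # shl x, C0 + C1
                for y in range(1 << bits):
                    ys = (y << c1) & mask        # shl y, C1
                    checks = (
                        (((inner + y) << c1) & mask, (wide + ys) & mask),  # add
                        (((inner - y) << c1) & mask, (wide - ys) & mask),  # sub, shl on the left
                        (((y - inner) << c1) & mask, (ys - wide) & mask),  # sub, shl on the right
                    )
                    if any(before != after for before, after in checks):
                        bad.append((x, y, c0, c1))
    return bad
```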
Noah Goldstein
423bcf89f1 Revert "Reorder (shl (add/sub (shl x, C0), y), C1) -> (add/sub (shl x, C0 + C1), (shl y, C1))"
This reverts commit edd80befeeb92000800ded2a6f3dcdfd672d95ea.

Caused test failures in Clangd: https://lab.llvm.org/buildbot/#/builders/183/builds/10447
reverting while investigating.
2023-01-27 18:44:45 -06:00
Noah Goldstein
edd80befee Reorder (shl (add/sub (shl x, C0), y), C1) -> (add/sub (shl x, C0 + C1), (shl y, C1))
This just extends the existing pattern for AND/XOR/OR to add/sub
and exposes a bit more parallelism in the instruction sequence.

Alive2:
Add  - https://alive2.llvm.org/ce/z/dSmPkV
Sub1 - https://alive2.llvm.org/ce/z/6rpi5V
Sub2 - https://alive2.llvm.org/ce/z/UfYeUd

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D141875
2023-01-27 17:45:36 -06:00
Sanjay Patel
c2ab7e2abd [InstCombine] simplify code for matching shift-logic-shift pattern; NFC
We can match and capture in one statement. Also, make the
code more closely resemble the description comment by using
the constant name of an operand value.
2023-01-18 08:13:37 -05:00
Pierre van Houtryve
b3fdb7b0cb [InstCombine] Combine lshr of add -> (a + b < a)
Tries to perform
  (lshr (add (zext X), (zext Y)), K)
  ->  (icmp ult (add X, Y), X)
  where
    - The add's operands are zexts from a K-bits integer to a bigger type.
    - The add is only used by the shr, or by iK (or narrower) truncates.
    - The lshr type has more than 2 bits (other types are boolean math).
    - K > 1

This seems to be a pattern that just comes from OpenCL front-ends, so adding DAG/GISel combines doesn't seem to be worth the complexity.

Original patch D107552 by @abinavpp - adapted to use (a + b < a) instead of uaddo following discussion on the review.
See this issue https://github.com/RadeonOpenCompute/ROCm/issues/488

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D138814
2023-01-10 03:37:23 -05:00
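
The `lshr of add -> (a + b < a)` fold above extracts the carry bit of a K-bit addition. A minimal sketch of why the comparison computes the same bit, assuming the wide type has at least K+1 bits so the widened add cannot wrap; the function name is hypothetical:

```python
def carry_fold_mismatches(k=8):
    """K-bit pairs where the shifted wide sum differs from the unsigned compare."""
    narrow_mask = (1 << k) - 1
    bad = []
    for x in range(1 << k):
        for y in range(1 << k):
            carry = (x + y) >> k                     # lshr (add (zext X), (zext Y)), K
            icmp = int(((x + y) & narrow_mask) < x)  # icmp ult (add X, Y), X
            if carry != icmp:
                bad.append((x, y))
    return bad
```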
Sanjay Patel
21d3871b7c [InstCombine] fold not-shift of signbit to icmp+zext, part 2
Follow-up to:
6c39a3aae1dc

That converted a pattern with ashr directly to icmp+zext, and
this updates the pattern that we used to convert to.

This canonicalizes to icmp for better analysis in the minimum case
and shortens patterns where the source type is not the same as dest type:
https://alive2.llvm.org/ce/z/tpXJ64
https://alive2.llvm.org/ce/z/dQ405O

This requires an adjustment to an icmp transform to avoid infinite looping.
2023-01-08 12:04:09 -05:00
Sanjay Patel
a0c8017286 [InstCombine] do not add "nuw" to 1<<X if the "1" has undefined elements
This was noted as a potential miscompile in the post-commit feedback
for the patch that added this fold:
d4493dd1ed58ac3f1eab0
2022-12-26 13:16:03 -05:00
Sanjay Patel
d4493dd1ed [InstCombine] add nuw to any (1<<x)
https://alive2.llvm.org/ce/z/9EjDKE

This was mentioned as a missing fold in D139598.

It can unlock follow-on folds in some cases.
This verifies one of the changed tests:
https://alive2.llvm.org/ce/z/B_btDM
2022-12-15 12:03:47 -05:00
Sanjay Patel
71df24dd39 [InstCombine] fold add-carry of bools to logic
((zext BoolX) + (zext BoolY)) >> 1 --> zext (BoolX && BoolY)
https://alive2.llvm.org/ce/z/LvZFKj

This was noted as a missing fold in D138814.
2022-12-06 13:42:42 -05:00
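
The add-carry-of-bools fold above has only four input cases, so it can be checked directly. A minimal sketch with a hypothetical function name:

```python
def bool_carry_mismatches():
    """Bool pairs where ((zext X) + (zext Y)) >> 1 differs from zext (X && Y)."""
    return [
        (x, y)
        for x in (0, 1)
        for y in (0, 1)
        if (x + y) >> 1 != int(bool(x) and bool(y))
    ]
```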
Sanjay Patel
3e6767ed5f [InstCombine] propagate 'exact' when converting ashr to lshr
The shift amount is not changing, so if we guaranteed
shifting out zeros before, those bits are still zeros.

https://alive2.llvm.org/ce/z/sokQca
2022-10-07 13:17:19 -04:00
Sanjay Patel
1d1d1e6f22 [InstCombine] fold full-shift of sdiv to icmp+extend
This is a disguised sign-bit test with offset:
(X / +DivC) >> (Width - 1) --> ext (X <= -DivC)
(X / -DivC) >> (Width - 1) --> ext (X >= +DivC)

https://alive2.llvm.org/ce/z/cO8JO4

We don't match/test poison in the sdiv constant because
that would be immediate undefined behavior.
2022-09-18 13:13:14 -04:00
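
The sign-bit test hidden in `(X / DivC) >> (Width - 1)` can be checked by enumeration. A minimal sketch under stated assumptions: an 8-bit model, and the divisor is neither zero nor the minimum signed value (both exclusions are assumptions of this sketch); the overflowing pair `X = INT_MIN, DivC = -1` is skipped as undefined behavior. `sdiv` and `sdiv_signbit_mismatches` are hypothetical names:

```python
def sdiv(x, c):
    """Signed division that truncates toward zero."""
    q = abs(x) // abs(c)
    return -q if (x < 0) != (c < 0) else q


def sdiv_signbit_mismatches(bits=8):
    """Inputs where the quotient's sign bit differs from the folded compare."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    bad = []
    for c in range(lo + 1, hi + 1):  # assumed: divisor is not the minimum signed value
        if c == 0:
            continue
        for x in range(lo, hi + 1):
            if x == lo and c == -1:  # signed overflow, undefined behavior
                continue
            is_negative = sdiv(x, c) < 0  # the bit a shift by Width-1 extracts
            folded = (x <= -c) if c > 0 else (x >= -c)
            if is_negative != folded:
                bad.append((x, c))
    return bad
```

Whether the result is sign- or zero-extended depends on whether the shift is arithmetic or logical; the sketch only checks the extracted bit.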
Craig Topper
d76c8f5127 [InstCombine] Add mul with negated power of 2 constant to canEvaluateShifted.
If we are right shifting a multiply by a negated power of 2 where
the power of 2 is the same as the shift amount, we can replace with
a negate followed by an And.

New tests have not been committed yet but the patch shows the diffs.
Let me know if you want any changes or additional tests.

Differential Revision: https://reviews.llvm.org/D130103
2022-07-20 11:00:22 -07:00
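
The "negate followed by an And" replacement described above can be sanity-checked numerically. A minimal sketch, assuming a logical right shift, an 8-bit model, and that the And mask is all-ones shifted right by the same amount (the commit message does not spell out the mask, so that is this sketch's reading); the function name is hypothetical:

```python
def neg_pow2_mul_mismatches(bits=8):
    """Inputs where lshr (mul X, -(1 << C)), C differs from and (neg X), (-1 >>u C)."""
    mask = (1 << bits) - 1
    bad = []
    for c in range(bits):
        mul_c = -(1 << c) & mask  # negated power of two as an unsigned constant
        for x in range(1 << bits):
            original = ((x * mul_c) & mask) >> c
            folded = (-x & mask) & (mask >> c)
            if original != folded:
                bad.append((x, c))
    return bad
```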
Nikita Popov
df698a5762 [InstCombine] Avoid some calls to ConstantExpr::get() (NFCI)
Replace some calls to ConstantExpr::get() with IRBuilder APIs
(which will also constant fold if possible).
2022-06-29 16:26:02 +02:00
Simon Moll
b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Sanjay Patel
a0c3c60728 [InstCombine] fold shift-right-by-constant with shift-right-of-constant operand
(C2 >> X) >> C1 --> (C2 >> C1) >> X

The shift-left form of this transform has existed since:
16f18ed7b555bce5163

...but it applies to matching shift right opcodes too:
https://alive2.llvm.org/ce/z/c5eQms
2022-05-30 15:30:01 -04:00
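
Swapping the two shift amounts in `(C2 >> X) >> C1 --> (C2 >> C1) >> X` is safe as long as both shifts use the same opcode. A minimal brute-force sketch, assuming an 8-bit model and in-range shift amounts; the function name and the `shifts` table are hypothetical:

```python
def swap_shift_mismatches(bits=8):
    """Cases where swapping the two shift amounts changes the result."""
    mask = (1 << bits) - 1

    def to_signed(v):
        return v - (1 << bits) if v >> (bits - 1) else v

    shifts = {
        "shl": lambda v, s: (v << s) & mask,
        "lshr": lambda v, s: v >> s,
        "ashr": lambda v, s: (to_signed(v) >> s) & mask,
    }
    bad = []
    for name, shift in shifts.items():
        for c2 in range(1 << bits):
            for x in range(bits):
                for c1 in range(bits):
                    if shift(shift(c2, x), c1) != shift(shift(c2, c1), x):
                        bad.append((name, c2, x, c1))
    return bad
```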
Sanjay Patel
c5d942a4fb [InstCombine] remove unnecessary one-use check from (C2 << X) << C1 fold
The restriction goes back to:
16f18ed7b555bce51
...but the fold only replaces a shift with a shift, so that's not necessary.
Generalizing to other opcodes is planned as a follow-up.
2022-05-30 15:17:54 -04:00
Sanjay Patel
07d549bce9 Revert "[InstCombine] invert canonicalization for cast of signbit test"
This reverts commit 3794cc0e996481e10307b67c8436aa44e0d65d22.
This change is suspected of causing bots to hang at stage 2
compiles, so reverting to confirm and investigate.
2022-05-16 17:47:02 -04:00
Sanjay Patel
3794cc0e99 [InstCombine] invert canonicalization for cast of signbit test
The existing transform was wrong in 3 ways:
1. It created an extra instruction when the source and dest types don't match.
2. It did not account for an extra use of the icmp, so could create 2 extra insts.
3. It favored bit hacks over icmp (icmp generally has better analysis).

This fixes #54692 (modeled by the PhaseOrdering tests).

This is a minimal step to fix the bug, but we should likely invert
the sibling transform for the "is negative" pattern too.

The backend should be able to invert this back to a shift if that
leads to better codegen.
2022-05-16 12:55:52 -04:00
Chenbing Zheng
4c8c101b49 [InstCombine] try to narrow more shifted bswap-of-zext
Try to narrow more bswap patterns when the shift amount is less than the
number of bits added by the zext:
(bswap (zext X)) >> C --> (zext (bswap X)) << C'

https://alive2.llvm.org/ce/z/i7ddjn

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D124598
2022-05-06 10:45:10 +08:00
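
The narrowing `(bswap (zext X)) >> C --> (zext (bswap X)) << C'` can be checked for one concrete pair of widths. A minimal sketch, assuming a 16-bit source zero-extended to 32 bits and `C' = (width difference) - C`; the commit message does not define `C'`, so that formula is this sketch's assumption, and both function names are hypothetical:

```python
def bswap(v, bits):
    """Reverse the byte order of an unsigned `bits`-wide value."""
    return int.from_bytes(v.to_bytes(bits // 8, "little"), "big")


def bswap_narrow_mismatches(src_bits=16, dst_bits=32):
    """Inputs where the narrowed form differs from the wide bswap plus lshr."""
    diff = dst_bits - src_bits
    dst_mask = (1 << dst_bits) - 1
    bad = []
    for x in range(1 << src_bits):
        wide = bswap(x, dst_bits)    # bswap (zext X)
        narrow = bswap(x, src_bits)  # bswap X, zero-extended implicitly
        for c in range(diff):        # shift amount below the width difference
            if wide >> c != (narrow << (diff - c)) & dst_mask:
                bad.append((x, c))
    return bad
```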
Nicolas Abram Lujan
f8a574bf4d [InstCombine] C0 >> (X - C1) --> (C0 << C1) >> X
With the right pre-conditions, we can fold the offset
into the shifted constant:
https://alive2.llvm.org/ce/z/drMRBU
https://alive2.llvm.org/ce/z/cUQv-_

Fixes #55016

Differential Revision: https://reviews.llvm.org/D124369
2022-04-27 14:18:30 -04:00
Sanjay Patel
664ae7bbcc [InstCombine] C0 <<{nsw, nuw} (X - C1) --> (C0 >> C1) << X (2nd try)
The first attempt at this missed a check to make sure the offset
constant was in range and caused many bot failures.

That was missed in the Alive2 proof because an overshift creates
poison rather than the assert from APInt. Here's an alternate
attempt at a proof using count-trailing-zeros:
https://alive2.llvm.org/ce/z/pnXQYR

Original commit message:

This is similar to an existing pre-shift-of-constant fold:
8a9c70fc01e6
...but in this case, we need no-wrap on the shl and a negative
offset:
https://alive2.llvm.org/ce/z/_RVz99
2022-04-21 16:18:46 -04:00
chenglin.bi
25aba1abb5 Revert "[InstCombine] Add one use limitation for (X * C2) << C1 --> X * (C2 << C1)"
This reverts commit b543d28df7b067dcda833c717a59faa28c1151a1.
2022-04-22 00:56:20 +08:00
chenglin.bi
b543d28df7 [InstCombine] Add one use limitation for (X * C2) << C1 --> X * (C2 << C1)
As a follow-up to D123453, add a one-use limitation for
(X * C2) << C1 --> X * (C2 << C1)
to make it consistent with
lshr (mul nuw x, MulC), ShAmtC -> mul nuw x, (MulC >> ShAmtC)

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D124183
2022-04-22 00:32:36 +08:00
Sanjay Patel
8960ba7491 Revert "[InstCombine] C0 <<{nsw, nuw} (X - C1) --> (C0 >> C1) << X"
This reverts commit 5819f4a422865fc9a8ea4dc772769e14010ff6a7.
This caused bots to fail with a crash/assert during the fold,
so some constraint was missed.
2022-04-21 12:15:27 -04:00
Sanjay Patel
5819f4a422 [InstCombine] C0 <<{nsw, nuw} (X - C1) --> (C0 >> C1) << X
This is similar to an existing pre-shift-of-constant fold:
8a9c70fc01e6
...but in this case, we need no-wrap on the shl and a negative
offset:
https://alive2.llvm.org/ce/z/_RVz99

Fixes #54890
2022-04-21 11:38:27 -04:00
chenglin.bi
1fae4b492d [InstCombine] Fold mul nuw+lshr to a single multiplication when the latter is a factor
If c is divisible by (1 << ShAmtC), we can fold this pattern:
lshr (mul nuw x, c), ShAmtC -> mul nuw x, (c >> ShAmtC)

https://alive2.llvm.org/ce/z/ox4wAt

Fixes https://github.com/llvm/llvm-project/issues/54824

Reviewed By: spatel, lebedev.ri, craig.topper

Differential Revision: https://reviews.llvm.org/D123453
2022-04-21 00:13:36 +08:00
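
The divisibility condition in the `mul nuw` + `lshr` fold above can be exercised by enumeration. A minimal sketch, assuming an 8-bit model in which `mul nuw` inputs that would wrap are skipped as poison; the function name is hypothetical:

```python
def mul_lshr_mismatches(bits=8):
    """Inputs where lshr (mul nuw x, c), s differs from mul x, (c >> s)."""
    limit = 1 << bits
    bad = []
    for s in range(bits):
        for c in range(0, limit, 1 << s):  # only c divisible by (1 << s)
            for x in range(limit):
                if x * c >= limit:  # `mul nuw` would be poison
                    continue
                if (x * c) >> s != x * (c >> s):
                    bad.append((x, c, s))
    return bad
```

Dropping the divisibility requirement breaks the identity: with `x = 3`, `c = 3`, `s = 1`, the original gives `9 >> 1 = 4` while the folded form gives `3 * 1 = 3`.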
Sanjay Patel
bf09a925f2 [InstCombine] remove likely redundant ValueTracking-based folds for shifts
This is not expected to have a functional difference as discussed in the
post-commit comments for 8a9c70fc01e6. All of the motivating tests for
the older fold still optimize as expected because other code can infer
the 'nuw'.
2022-04-20 11:28:31 -04:00
Sanjay Patel
8a9c70fc01 [InstCombine] C0 shift (X add nuw C) --> (C0 shift C) shift X
With 'nuw' we can convert the increment of the shift amount
into a pre-shift (constant fold) of the shifted constant:
https://alive2.llvm.org/ce/z/FkTyR2

Fixes issue #41976
2022-04-19 15:21:34 -04:00
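
The pre-shift fold `C0 shift (X add nuw C) --> (C0 shift C) shift X` can be checked for all three shift opcodes. A minimal sketch, assuming an 8-bit model and that `C` is itself a legal shift amount (an assumption of this sketch); the function name and the `shifts` table are hypothetical:

```python
def preshift_counterexamples(bits=8):
    """Cases where folding the offset into the shifted constant changes the result."""
    mask = (1 << bits) - 1

    def to_signed(v):
        return v - (1 << bits) if v >> (bits - 1) else v

    shifts = {
        "shl": lambda v, s: (v << s) & mask,
        "lshr": lambda v, s: v >> s,
        "ashr": lambda v, s: (to_signed(v) >> s) & mask,
    }
    bad = []
    for name, shift in shifts.items():
        for c0 in range(1 << bits):
            for c in range(bits):
                for x in range(bits - c):  # x + c stays a legal shift amount
                    if shift(c0, x + c) != shift(shift(c0, c), x):
                        bad.append((name, c0, c, x))
    return bad
```

The `nuw` matters because a wrapping add could turn a large `X` into a small, legal shift amount: in 8 bits, `X = 255` and `C = 1` wrap to 0, so the original shift is defined while a shift by `X` alone is not.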
Sanjay Patel
1206a18d41 [InstCombine] guard against splat-mul corner case
The test is already simplified, and I'm not sure how
to write a test to exercise the new clause. But it
protects the 2-bit pattern from miscompiling as noted
in D123453.

https://alive2.llvm.org/ce/z/QPyVfv
(If we managed to fall into the mul transform, it
would wrongly create a zero on this pattern.)
2022-04-11 15:50:13 -04:00