802 Commits

Author SHA1 Message Date
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Yingwei Zheng
6a65e98fa7
[InstCombine] Drop range attributes in foldIsPowerOf2 (#111946)
Fixes https://github.com/llvm/llvm-project/issues/111934.
2024-10-11 18:19:21 +08:00
Nikita Popov
67d247a441
[InstCombine] Decompose more icmps into masks (#110836)
Extend decomposeBitTestICmp() to handle cases where the resulting
comparison is of the form `icmp (X & Mask) pred C` with non-zero
`C`. Add a flag to allow code to opt-in to this behavior and use it in
the "log op of icmp" fold infrastructure.

This addresses regressions from #97289.

Proofs: https://alive2.llvm.org/ce/z/hUhdbU
2024-10-04 10:17:23 +02:00
Nikita Popov
b8d1bae648
[CmpInstAnalysis] Return decomposed bit test as struct (NFC) (#109819)
decomposeBitTestICmp() currently returns the result via two out
parameters plus an in-place modification of Pred. This changes it to
return an optional struct instead.

The motivation here is twofold. First, I'd like to extend this code to
handle cases where the comparison is against a value other than zero,
which would mean yet another out parameter. Second, while doing that I
was badly bitten by the in-place modification, so I'd like to get rid of
it.
2024-09-25 10:14:15 +02:00
Nikita Popov
7a181980b9 [InstCombine] Fix nits in new xor fold
Followup to https://github.com/llvm/llvm-project/pull/105992,
use the simplifyXorInst helper and use getWithInstruction
consistently.
2024-09-23 11:59:37 +02:00
Amr Hesham
9614f69b4b
[InstCombine] Fold Xor with or disjoint (#105992)
Implement a missing optimization fold `(X | Y) ^ M to (X ^ M) ^ Y` and
`(X | Y) ^ M to (Y ^ M) ^ X`
2024-09-22 12:32:17 +02:00
Noah Goldstein
a6edcea211 [InstCombine] Simplify (add/sub (sub/add) (sub/add)) irrelivant of use-count
Added folds:
    - `(add (sub X, Y), (sub Z, X))` -> `(sub Z, Y)`
    - `(sub (add X, Y), (add X, Z))` -> `(sub Y, Z)`

The fold typically is handled in the `Reassosiate` pass, but it fails
if the inner `sub`/`add` are multi-use. Less importantly, Reassosiate
doesn't propagate flags correctly.

This patch adds the fold explicitly the InstCombine

Proofs: https://alive2.llvm.org/ce/z/p6JyRP

Closes #105866
2024-08-27 11:43:17 -07:00
Nikita Popov
28fe6ddd9b
[InstCombine] Remove AllOnes fallbacks in getMaskedTypeForICmpPair() (#104941)
getMaskedTypeForICmpPair() tries to model non-and operands as x & -1.
However, this can end up confusing the matching logic, by picking the -1
operand as the "common" operand, resulting in a successful, but useless,
match. This is what causes commutation failures for some of the
optimizations driven by this function.

Fix this by treating a match against -1 as a non-match.
2024-08-26 09:55:52 +02:00
Nikita Popov
32679e10a9 [InstCombine] Handle logical op for and/or of icmp 0/-1
This aligns the transform with what foldLogOpOfMaskedICmp() does.
2024-08-22 16:17:39 +02:00
Bjorn Pettersson
145aff6d92 Clean up pointer casts etc after opaque pointers transition. NFC (#102631) 2024-08-12 13:28:53 +02:00
Yingwei Zheng
dba3dfa2bd
[InstCombine] Fold isnan idioms (#101510)
Folds `(icmp ne ((bitcast X) & FractionBits), 0) & (icmp eq ((bitcast X)
& ExpBits), ExpBits) -> fcmp uno X, 0.0` and
`(icmp eq ((bitcast X) & FractionBits), 0) | (icmp ne ((bitcast X) &
ExpBits), ExpBits) -> fcmp ord X, 0.0`

Alive2: https://alive2.llvm.org/ce/z/mrLn_x
2024-08-02 21:21:31 +08:00
Yingwei Zheng
62e9f40949
[PatternMatch] Use m_SpecificCmp matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
  stockfish/movegen.ll 2541620819 2538599412 -0.12%
  minetest/profiler.cpp.ll 431724935 431246500 -0.11%
  abc/luckySwap.c.ll 581173720 580581935 -0.10%
  abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
  abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
  openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
  openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
  spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
  spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
  nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
2024-07-29 10:04:06 +08:00
Noah Goldstein
52727626af [InstCombine] Add transforms for (or/and (icmp eq/ne X,0),(icmp eq/ne X,Pow2OrZero))
`(or (icmp eq X, 0), (icmp eq X, Pow2OrZero))`
    --> `(icmp eq (and X, Pow2OrZero), X)`

`(and (icmp ne X, 0), (icmp ne X, Pow2OrZero))`
    --> `(icmp ne (and X, Pow2OrZero), X)`

Proofs: https://alive2.llvm.org/ce/z/nPo2BN

Closes #94648
2024-07-13 02:34:38 +08:00
Allen
2b3376f353
[InstCombine] Guard noundef for transformation from xor to or disjoint (#96905)
Fix https://github.com/llvm/llvm-project/issues/96857
2024-07-03 18:35:34 +08:00
AtariDreams
515e048e36
[InstCombine] Simplify commutative matchers (NFC) (#96665) 2024-06-26 09:37:36 +02:00
Stephen Tozer
d75f9dd1d2 Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Zain Jaffal
f7fc72e281
[InstCombine] fold (a == c && b != c) || (a != c && b == c)) to (a == c) == (b != c) (#94915)
resolves https://github.com/llvm/llvm-project/issues/92966

alive proof
https://alive2.llvm.org/ce/z/bLAQBS
2024-06-23 10:39:07 +01:00
AtariDreams
fbac697782
[Transforms] Replace incorrect uses of m_Deferred with m_Specific (#95719)
The values have been bound already, so use m_Specific.
2024-06-17 05:50:39 +08:00
Eli Friedman
f893dccbba
Replace uses of ConstantExpr::getCompare. (#91558)
Use ICmpInst::compare() where possible, ConstantFoldCompareInstOperands
in other places. This only changes places where the either the fold is
guaranteed to succeed, or the code doesn't use the resulting compare if
we fail to fold.
2024-05-09 16:50:01 -07:00
Nikita Popov
534701d5f9 [InstCombine] Handle commuted variants in or of xor pattern
This pattern only handled commutation in the "or", while all
involved operations are commutative. Make sure we handle all
sixteen patterns.
2024-05-09 15:16:25 +09:00
Troy Butler
468fecfc39
Fix mismatches between function parameter definitions and declarations (#89512)
Addresses issue #88716.

Some function parameter names in the affected header files did not match
the parameter names in the definitions, or were listed in a different
order.

---------

Signed-off-by: Troy-Butler <squintik@outlook.com>
2024-04-26 13:00:31 +02:00
Yingwei Zheng
945eeb2d92
[InstCombine] Simplify (X / C0) * C1 + (X % C0) * C2 to (X / C0) * (C1 - C2 * C0) + X * C2 (#76285)
Since `DivRemPairPass` runs after `ReassociatePass` in the optimization
pipeline, I decided to do this simplification in `InstCombine`.

Alive2: https://alive2.llvm.org/ce/z/Jgsiqf
Fixes #76128.
2024-04-24 17:01:49 +08:00
Nikita Popov
eb7ad8853c
[InstCombine] Remove some uses with replaceUndefsWith() (#89190)
Now that we don't accept undef splat in PatternMatch, we can remove some
uses of replaceUndefsWith(). I believe in all these cases only poison
splats are possible now, in which case no replacement is necessary.
2024-04-19 09:01:56 +09:00
Nikita Popov
1baa385065
[IR][PatternMatch] Only accept poison in getSplatValue() (#89159)
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.

This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.

As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.

There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
2024-04-18 15:44:12 +09:00
Nikita Popov
d9a5aa8e2d
[PatternMatch] Do not accept undef elements in m_AllOnes() and friends (#88217)
Change all the cstval_pred_ty based PatternMatch helpers (things like
m_AllOnes and m_Zero) to only allow poison elements inside vector
splats, not undef elements.

Historically, we used to represent non-demanded elements in vectors
using undef. Nowadays, we use poison instead. As such, I believe that
support for undef in vector splats is no longer useful.

At the same time, while poison splat elements are pretty much always
safe to ignore, this is not generally the case for undef elements. We
have existing miscompiles in our tests due to this (see the
masked-merge-*.ll tests changed here) and it's easy to miss such cases
in the future, now that we write tests using poison instead of undef
elements.

I think overall, keeping support for undef elements no longer makes
sense, and we should drop it. Once this is done consistently, I think we
may also consider allowing poison in m_APInt by default, as doing that
change is much less risky than doing the same with undef.

This change involves a substantial amount of test changes. For most
tests, I've just replaced undef with poison, as I don't think there is
value in retaining both. For some tests (where the distinction between
undef and poison is important), I've duplicated tests.
2024-04-17 18:22:05 +09:00
Harald van Dijk
60de56c743
[ValueTracking] Restore isKnownNonZero parameter order. (#88873)
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
2024-04-16 15:21:09 +01:00
Yingwei Zheng
e0a628715a
[ValueTracking] Convert isKnownNonZero to use SimplifyQuery (#85863)
This patch converts `isKnownNonZero` to use SimplifyQuery. Then we can
use the context information from `DomCondCache`.

Fixes https://github.com/llvm/llvm-project/issues/85823.
Alive2: https://alive2.llvm.org/ce/z/QUvHVj
2024-04-12 23:47:20 +08:00
Yingwei Zheng
caa2258250
[LLVM] Remove nuw neg (#86295)
This patch removes APIs that creating NUW neg. It is a trivial case
because `sub nuw 0, X` always gets simplified into zero.
I believe there is no optimization opportunities in the real-world
applications that we can take advantage of the nuw flag.

Motivated by
https://github.com/llvm/llvm-project/pull/84792#discussion_r1524891134.

Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=da7b7478b7cbb32c09d760f6b8d0e67901e0d533&stat=instructions:u
2024-03-26 20:56:16 +08:00
Yingwei Zheng
9eb399b854
[InstCombine] Support zext nneg in foldLogicCastConstant (#82355)
This patch extends [D36234](https://reviews.llvm.org/D36234) to handle
`zext nneg` instructions.
I found this while adding support for cast instructions in
`getFreelyInvertedImpl`.
2024-02-20 23:09:00 +08:00
SahilPatidar
4b483ecd55
[InstCombine] Fix failure to fold (and %x, (sext i1 %m)) -> (select %m, %x, 0) with multiple uses of %m (#81409)
Resolves #81288.
2024-02-19 11:07:16 +01:00
Eikansh Gupta
db870cfc9e [InstCombine] Extract helper from matchFunnelShift (NFC)
The matchFunnelShift function was doing pattern matching and creating
the fshl/fshr instruction if needed. Moved the pattern matching code to
function convertOrOfShiftsToFunnelShift. It can be reused for other
optimizations.
2024-02-16 16:46:41 +01:00
Yingwei Zheng
470c5b8011
[InstSimplify][InstCombine] Remove unnecessary m_c_* matchers. (#81712)
This patch removes unnecessary `m_c_*` matchers since we always
canonicalize `commutive_op Cst, X` into `commutive_op X, Cst`.

Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u
2024-02-14 16:40:36 +08:00
Nikita Popov
074f7c2235 [InstCombine] Remove redundant fold (NFCI)
This has been subsumed by simplifyAndOrWithOpReplaced().
2024-02-12 09:54:32 +01:00
Nikita Popov
35d6ae8110
[InstCombine] Handle multi-use in simplifyAndOrWithOpReplaced() (#81006)
Slightly generalize simplifyAndOrWithOpReplaced() by allowing it to
perform simplifications (without creating new instructions) in multi-use
cases. This way we can remove existing patterns without worrying about
multi-use edge cases.

I've opted to change the general way the implementation works to be more
similar to the standard simplifyWithOpReplaced(). We perform the operand
replacement generically, and then try to simplify the result or create a
new instruction if we're allowed to do so.
2024-02-08 09:44:51 +01:00
Yingwei Zheng
65bf93dd7b
[InstCombine] Clean up bitwise folds without one-use check (#80587)
This patch removes some bitwise folds that fail to check the one-use
constraint on the operands.
See also the comments
https://github.com/llvm/llvm-project/pull/77231#issuecomment-1904090035.
2024-02-08 01:15:05 +08:00
Yingwei Zheng
f37d81f8a3
[PatternMatch] Add a matching helper m_ElementWiseBitCast. NFC. (#80764)
This patch introduces a matching helper `m_ElementWiseBitCast`, which is
used for matching element-wise int <-> fp casts.
The motivation of this patch is to avoid duplicating checks in
https://github.com/llvm/llvm-project/pull/80740 and
https://github.com/llvm/llvm-project/pull/80414.
2024-02-07 21:02:13 +08:00
Yingwei Zheng
4858e9c9fe
[InstCombine] Canonicalize the fcmp range check idiom into fabs + fcmp (#76367)
This patch canonicalizes the fcmp range check idiom into `fabs + fcmp`
since the canonicalized form is better than the original form for the
backends.
Godbolt: https://godbolt.org/z/x3eqPb1fz
```
and (fcmp olt/ole/ult/ule x, C), (fcmp ogt/oge/ugt/uge x, -C) --> fabs(x) olt/ole/ult/ule C
or  (fcmp ogt/oge/ugt/uge x, C), (fcmp olt/ole/ult/ule x, -C) --> fabs(x) ogt/oge/ugt/uge C
```
Alive2: https://alive2.llvm.org/ce/z/MRtoYq
2024-02-07 04:33:26 +08:00
elhewaty
2614672cc1
[InstCombine] Fold ((cst << x) & 1) --> x == 0 when cst is odd (#79772)
Fold ((cst << x) & 1) to zext(x == 0) when cst is odd.

Fixes: https://github.com/llvm/llvm-project/issues/73384
Alive2: https://alive2.llvm.org/ce/z/5RbaK6
2024-02-05 16:27:53 +01:00
Yingwei Zheng
f2816ff60c
[InstCombine] Simplify and/or by replacing operands with constants (#77231)
This patch tries to simplify `X | Y` by replacing occurrences of `Y` in
`X` with 0. Similarly, it tries to simplify `X & Y` by replacing
occurrences of `Y` in `X` with -1.

Alive2: https://alive2.llvm.org/ce/z/cNjDTR
Note: As the current implementation is too conservative in the one-use
checks, I cannot remove other existing hard-coded simplifications if
they involves more than two instructions (e.g, `A & ~(A ^ B) --> A &
B`).

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=a085402ef54379758e6c996dbaedfcb92ad222b5&to=9d655c6685865ffce0ad336fed81228f3071bd03&stat=instructions%3Au

|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|+0.01%|-0.00%|+0.00%|-0.02%|+0.01%|+0.02%|-0.01%|

Fixes #76554.
2024-01-31 14:30:55 +08:00
Yingwei Zheng
9acc404230
[InstCombine] Recognize more rotation patterns (#78107)
InstCombine already handles the pattern `(shl ShVal, (X & (Width - 1)))
| (lshr ShVal, ((-X) & (Width - 1)))`. Under certain circumstances, `X &
(Width - 1)` will be simplified to `X`. Therefore, this patch adds
support for the pattern `(shl ShVal, X) | (lshr ShVal, ((-X) & (Width -
1)))`.

Alive2: https://alive2.llvm.org/ce/z/P7JQ2V
2024-01-18 20:29:53 +08:00
Noah Goldstein
60e8915d22 [InstCombine] Add folds for (add/sub/disjoint_or/icmp C, (ctpop (not x)))
`(ctpop (not x))` <-> `(sub nuw nsw BitWidth(x), (ctpop x))`. The
`sub` expression can sometimes be constant folded depending on the use
case of `(ctpop (not x))`.

This patch adds fold for the following cases:

`(add/sub/disjoint_or C, (ctpop (not x))`
    -> `(add/sub/disjoint_or C', (ctpop x))`
`(cmp pred C, (ctpop (not x))`
    -> `(cmp swapped_pred C', (ctpop x))`

Where `C'` depends on how we constant fold `C` with `BitWidth(x)` for
the given opcode.

Proofs: https://alive2.llvm.org/ce/z/qUgfF3

Closes #77859
2024-01-15 12:05:38 -08:00
Yingwei Zheng
29f98d6c25
[InstCombine] Fold bitwise logic with intrinsics (#77460)
This patch does the following folds:
```
bitwise(fshl (A, B, ShAmt), fshl(C, D, ShAmt)) -> fshl(bitwise(A, C), bitwise(B, D), ShAmt)
bitwise(fshr (A, B, ShAmt), fshr(C, D, ShAmt)) -> fshr(bitwise(A, C), bitwise(B, D), ShAmt)
bitwise(bswap(A), bswap(B)) -> bswap(bitwise(A, B))
bitwise(bswap(A), C) -> bswap(bitwise(A, bswap(C)))
bitwise(bitreverse(A), bitreverse(B)) -> bitreverse(bitwise(A, B))
bitwise(bitreverse(A), C) -> bitreverse(bitwise(A, bitreverse(C)))
```
Alive2: https://alive2.llvm.org/ce/z/iZN_TL
2024-01-10 19:33:18 +08:00
Yingwei Zheng
90802e652d
[InstCombine] Handle commuted cases of the fold ((B|C)&A)|B -> B|(A&C) (#76565)
Alive2: https://alive2.llvm.org/ce/z/Qdsqk6

The commit f1eda23514
didn't handle other cases that commute operands.
2023-12-29 23:58:58 +08:00
Yingwei Zheng
7a1a476116
[InstCombine] Fold (X & C1) | C2 into X & (C1 | C2) iff (X & C2) == C2 (#76470)
Alive2: https://alive2.llvm.org/ce/z/VKJYaS
2023-12-28 20:47:40 +08:00
Yingwei Zheng
0d454d6e59
[InstCombine] Fold xor of icmps using range information (#76334)
This patch folds xor of icmps into a single comparison using range-based reasoning as `foldAndOrOfICmpsUsingRanges` does.
Fixes #70928.
2023-12-25 07:14:31 +08:00
Yingwei Zheng
c59ea32f82
[InstCombine] Canonicalize icmp pred (X +/- C1), C2 into icmp pred X, C2 -/+ C1 with nowrap flag implied by with.overflow intrinsic (#75511)
This patch tries to canonicalize the pattern `Overflow | icmp pred Res,
C2` into `Overflow | icmp pred X, C2 +/- C1`, where `Overflow` and `Res`
are return values of `xxx.with.overflow X, C1`.
Alive2: https://alive2.llvm.org/ce/z/PhR_3S

Fixes #75360.
2023-12-16 17:58:57 +08:00
Yingwei Zheng
9cf3e31172
[InstCombine] Explicitly fold ~(~X >>u Y) into X >>s Y (#75473)
Fixes #75369.

This patch explicitly folds `~(~X >>u Y)` into `X >>s Y` to fix assertion failure in #75369.
2023-12-14 23:06:38 +08:00
Nikita Popov
6e8b17d821 [InstCombine] Support or disjoint in displaced shift fold
When I originally added this fold, it did not actually fix my
motivation case, where the add was represented as an or. Now that
we have the disjoint flag this can finally be cleanly supported.
2023-12-07 15:00:40 +01:00
Craig Topper
56248caa3b
[InstCombine] Explicitly set disjoint flag when converting xor to or. (#74229) 2023-12-06 09:41:59 -08:00