526 Commits

Author SHA1 Message Date
Sanjay Patel
727e642e97 [InstCombine] generalize fold for mask-with-signbit-splat
(iN X s>> (N-1)) & Y --> (X < 0) ? Y : 0

https://alive2.llvm.org/ce/z/qeYhdz

I was looking at a missing abs() transform and found my way to this
generalization of an existing fold that was added with D67799.
As discussed in that review, we want to make sure codegen handles
this difference well, and for all of the targets/types that I
spot-checked, it looks good.

I am leaving the existing fold in place in this commit because
it covers a potentially missing icmp fold, but I plan to remove
that as a follow-up commit as suggested during review.

Differential Revision: https://reviews.llvm.org/D111410
2021-10-15 16:25:48 -04:00
Sanjay Patel
905d170803 [InstCombine] allow matching vector splat constants in foldLogOpOfMaskedICmps()
This is NFC-intended for scalar code. There are still unnecessary
m_ConstantInt restrictions in surrounding code, so this is not a
complete fix.

This prevents regressions seen with a planned follow-on to D111410.
2021-10-13 10:15:26 -04:00
Sanjay Patel
bc72baa047 [InstCombine] add folds for logical nand/nor
This is noted as a regression in:
https://llvm.org/PR52077
2021-10-05 18:31:20 -04:00
Sanjay Patel
668beb8ae8 [InstCombine] refactor folds of 'not' instructions; NFC
This removes repeated calls to m_Not, so hopefully a little
more efficient.

Also, we may need to enhance some of these blocks to allow
logical and/or (select of bools).
2021-10-05 16:36:57 -04:00
Jay Foad
a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Sanjay Patel
41ff7612b3 [InstCombine] allow splat vectors for narrowing masked fold
Mostly cosmetic diffs, but the use of m_APInt matches splat constants.
2021-09-17 11:24:16 -04:00
Chris Lattner
735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Nikita Popov
fafe5a6f44 [InstCombine] Perform "eq of parts" fold with logical ops
The pattern matched here is too complex for the general logical
and/or to bitwise and/or conversion to trigger. However, the
fold is poison-safe, so match it with a select root as well:

https://alive2.llvm.org/ce/z/vNzzSg
https://alive2.llvm.org/ce/z/Beyumt
2021-08-22 16:55:53 +02:00
Sanjay Patel
eee0ded337 [InstCombine] add min/max intrinsics as freely invertible candidates
In the optimized test, we are able to peak through the
min/max that has 2 min/max operands and invert them all:
https://alive2.llvm.org/ce/z/7gYMN5
2021-08-19 08:41:38 -04:00
Sanjay Patel
e10c3beca5 [InstCombine] add one-use check for min/max fold with not operands; NFC
This makes the intrinsic logic match the cmp+select idiom folds
just below. It's not clearly a win either way unless we think
that a 'not' op costs more than min/max.

The cmp+select folds on these patterns are more extensive than
the intrinsics currently and may have some complicated interactions,
so I'm trying to make those line up and bring the optimizations
for intrinsics up to parity.
2021-08-19 08:41:38 -04:00
Simon Pilgrim
afc6b09dee [InstCombine] getMaskedTypeForICmpPair - remove dead code. NFCI.
Ok should be true at this point, so the early-out is dead - replace with an assert.
2021-07-30 19:23:05 +01:00
Simon Pilgrim
401d6685c0 [InstCombine] InstCombinerImpl::visitOr - enable bitreverse matching
Currently we only match bswap intrinsics from or(shl(),lshr()) style patterns when we could often match bitreverse intrinsics almost as cheaply.

Differential Revision: https://reviews.llvm.org/D90170
2021-05-15 13:39:09 +01:00
Nikita Popov
a8f7dee1df [InstCombine] Support one-hot merge for logical and/or
If a logical and/or is used, we need to be careful not to propagate
a potential poison value from the RHS by inserting a freeze
instruction. Otherwise it works the same way as bitwise and/or.

This is intended to address the regression reported at
https://reviews.llvm.org/D101191#2751002.

Differential Revision: https://reviews.llvm.org/D102279
2021-05-12 21:01:18 +02:00
Roman Lebedev
554b1bced3
[InstCombine] ~(C + X) --> ~C - X (PR50308)
We can not rely on (C+X)-->(X+C) already happening,
because we might not have visited that `add` yet.
The added testcase would get stuck in an endless combine loop.
2021-05-12 16:10:55 +03:00
Nikita Popov
1556540372 [InstCombine] Clean up one-hot merge optimization (NFC)
Remove the requirement that the instruction is a BinaryOperator,
make the predicate check more compact and use slightly more
meaningful naming for the and operands.
2021-05-11 23:22:11 +02:00
Nikita Popov
463ea28e96 [InstCombine] Fold comparison of integers by parts
Let's say you represent (i32, i32) as an i64 from which the parts
are extracted with lshr/trunc. Then, if you compare two tuples by
parts you get something like A[0] == B[0] && A[1] == B[1], just
that the part extraction happens by lshr/trunc and not a narrow
load or similar.

The fold implemented here reduces such equality comparisons by
converting them into a comparison on a larger part of the integer
(which might be the whole integer). It handles both the "and of eq"
and the conjugated "or of ne" case.

I'm being conservative with one-use for now, though this could be
relaxed if profitable (the base pattern converts 11 instructions
into 5 instructions, but there's quite a few variations on how it
can play out).

Differential Revision: https://reviews.llvm.org/D101232
2021-05-10 22:22:39 +02:00
Juneyoung Lee
24ce194cfe [InstCombine] generalize select + select/and/or folding using implied conditions
This patch optimizes the remaining possible cases in D101191 by generalizing isImpliedCondition()-based
foldings.

Assume that there is `op a, (select b, _, _)` where op is one of `and i1`, `or i1` or their select forms.

We can do the following optimization based on the result of `isImpliedCondition(a, b)`:

If a = true implies…
- b = true:
    - select a, (select b, A, B), false => select a, A, false : https://alive2.llvm.org/ce/z/WCnZYh
    - and a, (select b, A, B) => select a, A, false : https://alive2.llvm.org/ce/z/uZhcMG
- b = false:
    - select a, (select b, A, B), false => select a, B, false : https://alive2.llvm.org/ce/z/c2hJpV
    - and a, (select b, A, B) => select a, B, false : https://alive2.llvm.org/ce/z/5ggwMM

If a = false implies…
- b = true:
    - select a, true, (select b, A, B) => select a, true, A : https://alive2.llvm.org/ce/z/tidKvH
    - or a, (select b, A, B) =>  select a, true, A : https://alive2.llvm.org/ce/z/cC-uyb
- b = false:
    - select a, true, (select b, A, B) => select a, true, B : https://alive2.llvm.org/ce/z/ZXpJq9
    - or a, (select b, A, B) => select a, true, B : https://alive2.llvm.org/ce/z/hnDrJj

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101720
2021-05-04 09:42:06 +09:00
Roman Lebedev
91248e2db9
[InstCombine] Improve "get low bit mask upto and including bit X" pattern
https://alive2.llvm.org/ce/z/3u-48R
2021-04-11 18:08:08 +03:00
Sanjay Patel
c0bbd0cc35 [InstCombine] fold not ops around min/max intrinsics
This is another step towards parity with the existing
cmp+select folds (see D98152).
2021-04-07 17:31:36 -04:00
Sanjay Patel
0333ed8e0c [InstCombine] move abs transform to helper function; NFC
The swap of the operands can affect later transforms that
are expecting a constant as operand 1. I don't think we
can trigger a bug with the current code, but I hit that
problem while drafting a new transform for min/max intrinsics.
2021-04-07 08:35:07 -04:00
Philip Reames
4bf8985f4f Replace calls to IntrinsicInst::Create with CallInst::Create [nfc]
There is no IntrinsicInst::Create.  These are binding to the method in the super type.  Be explicitly about which method is being called.
2021-04-06 13:23:58 -07:00
Sanjay Patel
412fc74140 [InstCombine] fold not+or+neg
~((-X) | Y) --> (X - 1) & (~Y)

We generally prefer 'add' over 'sub', this reduces the
dependency chain, and this looks better for codegen on
x86, ARM, and AArch64 targets.

https://llvm.org/PR45755

https://alive2.llvm.org/ce/z/cxZDSp
2021-04-02 13:16:36 -04:00
Philip Reames
ebc61f9d3c [instcombine] Collapse trivial or recurrences
If we have a recurrence of the form <Start, Or, Step> we know that the value taken by the recurrence stabilizes on the first iteration (provided step is loop invariant). We can exploit that fact to remove the loop carried dependence in the recurrence.

Differential Revision: https://reviews.llvm.org/D97578 (or part)
2021-03-08 09:21:38 -08:00
Philip Reames
239a618180 [instcombine] Collapse trivial and recurrences
If we have a recurrence of the form <Start, And, Step> we know that the value taken by the recurrence stabilizes on the first iteration (provided step is loop invariant). We can exploit that fact to remove the loop carried dependence in the recurrence.

Differential Revision: https://reviews.llvm.org/D97578 (and part)
2021-03-08 09:21:38 -08:00
Simon Pilgrim
609d0c9772 [InstCombine] matchBSwapOrBitReverse - remove pattern matching early-out. NFCI.
recognizeBSwapOrBitReverseIdiom + collectBitParts have pattern matching to bail out early if a bswap/bitreverse pattern isn't possible - we should be able to rely on this instead without any notable change in compile time.

This is part of a cleanup towards letting matchBSwapOrBitReverse /recognizeBSwapOrBitReverseIdiom use 'root' instructions that aren't ORs (FSHL/FSHRs in particular which can be prematurely created).

Differential Revision: https://reviews.llvm.org/D97056
2021-02-20 13:15:34 +00:00
Roman Lebedev
d1a6f92fd5
[InstCombine] Fold (~x) | y --> ~(x & (~y)) iff it is free to do so
Iff we know we can get rid of the inversions in the new pattern,
we can thus get rid of the inversion in the old pattern,
this decreasing instruction count.

Note that we could position this transformation as just hoisting
of the `not` (still, iff y is freely negatible), but the test changes
show a number of regressions, so let's not do that.
2021-01-22 17:23:54 +03:00
Roman Lebedev
79b0d21ce9
[InstCombine] Fold (~x) & y --> ~(x | (~y)) iff it is free to do so
Iff we know we can get rid of the inversions in the new pattern,
we can thus get rid of the inversion in the old pattern,
this decreasing instruction count.
2021-01-22 17:23:54 +03:00
Juneyoung Lee
2d89ebd5d1 Address unused variable warning 2021-01-19 09:30:16 +09:00
Juneyoung Lee
0441df94ad [InstCombine,InstSimplify] Optimize select followed by and/or/xor
This patch adds `A & (A && B)` -> `A && B`  (similarly for or + logical or)

Also, this patch adds `~(select C, (icmp pred X, Y), const)` -> `select C, (icmp pred' X, Y), ~const`.

Alive2 proof:
merge_and: https://alive2.llvm.org/ce/z/teMR97
merge_or: https://alive2.llvm.org/ce/z/b4yZUp
xor_and: https://alive2.llvm.org/ce/z/_-TXHi
xor_or: https://alive2.llvm.org/ce/z/2uYx_a

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94861
2021-01-19 09:14:17 +09:00
Dávid Bolvanský
0529946b5b [instCombine] Add (A ^ B) | ~(A | B) -> ~(A & B)
define i32 @src(i32 %x, i32 %y) {
%0:
  %xor = xor i32 %y, %x
  %or = or i32 %y, %x
  %neg = xor i32 %or, 4294967295
  %or1 = or i32 %xor, %neg
  ret i32 %or1
}
=>
define i32 @tgt(i32 %x, i32 %y) {
%0:
  %and = and i32 %x, %y
  %neg = xor i32 %and, 4294967295
  ret i32 %neg
}
Transformation seems to be correct!

https://alive2.llvm.org/ce/z/Cvca4a
2021-01-12 19:29:17 +01:00
Roman Lebedev
374ef57f13
[InstCombine] 'hoist xor-by-constant from xor-by-value': completely give up on constant exprs
As Mikael Holmén is noting in the post-commit review for the first fix
https://reviews.llvm.org/rGd4ccef38d0bb#967466
not hoisting constantexprs is not enough,
because if the xor originally was a constantexpr (i.e. X is a constantexpr).
`SimplifyAssociativeOrCommutative()` in `visitXor()` will immediately
undo this transform, thus again causing an infinite combine loop.

This transform has resulted in a surprising number of constantexpr failures.
2020-12-29 16:28:18 +03:00
Roman Lebedev
d4ccef38d0
[InstCombine] 'hoist xor-by-constant from xor-by-value': ignore constantexprs
As it is being reported (in post-commit review) in
https://reviews.llvm.org/D93857
this fold (as i expected, but failed to come up with test coverage
despite trying) has issues with constant expressions.
Since we only care about true constants, which constantexprs are not,
don't perform such hoisting for constant expressions.
2020-12-28 20:15:20 +03:00
Roman Lebedev
d9ebaeeb46
[InstCombine] Hoist xor-by-constant from xor-by-value
This is one of the deficiencies that can be observed in
https://godbolt.org/z/YPczsG after D91038 patch set.

This exposed two missing folds, one was fixed by the previous commit,
another one is `(A ^ B) | ~(A ^ B) --> -1` / `(A ^ B) & ~(A ^ B) --> 0`.

`-early-cse` will catch it: https://godbolt.org/z/4n1T1v,
but isn't meaningful to fix it in InstCombine,
because we'd need to essentially do our own CSE,
and we can't even rely on `Instruction::isIdenticalTo()`,
because there are no guarantees that the order of operands matches.
So let's just accept it as a loss.
2020-12-24 21:20:50 +03:00
Roman Lebedev
5b78303433
[InstCombine] Fold a & ~(a ^ b) to x & y
```
----------------------------------------
define i32 @and_xor_not_common_op(i32 %a, i32 %b) {
%0:
  %b2 = xor i32 %b, 4294967295
  %t2 = xor i32 %a, %b2
  %t4 = and i32 %t2, %a
  ret i32 %t4
}
=>
define i32 @and_xor_not_common_op(i32 %a, i32 %b) {
%0:
  %t4 = and i32 %a, %b
  ret i32 %t4
}
Transformation seems to be correct!
```
2020-12-24 21:20:49 +03:00
Roman Lebedev
a91e96702a
[InstCombine] Fold and(shl(zext(x), width(SIGNMASK) - width(%x)), SIGNMASK) to and(sext(%x), SIGNMASK)
One less instruction and reducing use count of zext.
As alive2 confirms, we're fine with all the weird combinations of
undef elts in constants, but unless the shift amount was undef
for a lane, we must sanitize undef mask to zero, since sign bits
are no longer zeros.

https://rise4fun.com/Alive/d7r
```
----------------------------------------
Optimization: zz
Precondition: ((C1 == (width(%r) - width(%x))) && isSignBit(C2))
  %o0 = zext %x
  %o1 = shl %o0, C1
  %r = and %o1, C2
=>
  %n0 = sext %x
  %r = and %n0, C2

Done: 2016
Optimization is correct!
```
2020-11-20 00:31:27 +03:00
Sanjay Patel
4a66a1d17a [InstCombine] allow vectors for masked-add -> xor fold
https://rise4fun.com/Alive/I4Ge

  Name: add with pow2 mask
  Pre: isPowerOf2(C2) && (C1 & C2) != 0 && (C1 & (C2-1)) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %n = and i8 %x, C2
  %r = xor i8 %n, C2
2020-11-17 13:36:08 -05:00
Simon Pilgrim
f7ebdec987 [InstCombine] visitAnd - remove unnecessary Value *X, *Y shadow variables. NFCI.
Fixes a number of Wshadow warnings.
2020-11-17 17:59:21 +00:00
Simon Pilgrim
abf29d9862 [InstCombine] visitAnd - use m_SpecificInt instead of m_APInt + comparison. NFCI.
m_SpecificInt has the same 'no undef element' behaviour as m_APInt so no change there, and anyway we have test coverage for undef elements in the fold.

Noticed while fixing a Wshadow warning about shadow Value *X, *Y variables.
2020-11-17 17:37:10 +00:00
Sanjay Patel
f791ad7e1e [InstCombine] remove scalar constraint for mask-of-add fold
https://rise4fun.com/Alive/V6fP

  Name: add with low mask
  Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %r = and i8 %x, C2
2020-11-17 12:13:45 -05:00
Sanjay Patel
433696911a [InstCombine] relax constraints on mask-of-add
There are 2 changes:
1. Remove the unnecessary one-use check.
2. Remove the unnecessary power-of-2 check.

https://rise4fun.com/Alive/V6fP

  Name: add with low mask
  Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %r = and i8 %x, C2
2020-11-17 12:13:44 -05:00
Sanjay Patel
6ddc237766 [InstCombine] reduce code for flip of masked bit; NFC
There are 1-2 potential follow-up NFC commits to reduce
this further on the way to generalizing this for vectors.

The operand replacing path should be dead code because demanded
bits handles that more generally (D91415).
2020-11-15 15:43:34 -05:00
Simon Pilgrim
6b2eb31e1e [InstCombine] Add support for zext(and(neg(amt),width-1)) rotate shift amount patterns
Alive2: https://alive2.llvm.org/ce/z/bCvvHd
2020-10-26 11:22:41 +00:00
Simon Pilgrim
3052e474ec [InstCombine] matchBSwapOrBitReversem - recognise or(fshl(),fshl()) bswap patterns.
I'm not certain InstCombinerImpl::matchBSwapOrBitReverse needs to filter the or(op0(),op1()) ops - there are just too many cases that recognizeBSwapOrBitReverseIdiom/collectBitParts handle now (and quickly).
2020-10-25 10:17:45 +00:00
Simon Pilgrim
1cab3bf004 [InstCombine] matchBSwapOrBitReverse - expose bswap/bitreverse matching flags.
matchBSwapOrBitReverse was hardcoded to just match bswaps - we're going to need to expose the ability to match bitreverse as well, so make this part of the function call.
2020-10-23 12:35:28 +01:00
Simon Pilgrim
19a13bf538 [InstCombine] Rename InstCombinerImpl::matchBSwap to matchBSwapOrBitReverse. NFCI.
This matches bswap and bitreverse intrinsics, so we should make that clear in the function name.
2020-10-23 12:35:27 +01:00
Simon Pilgrim
7b4a828452 [InstCombine] foldOrOfICmps - use m_Specific instead of explicit comparisons. NFCI. 2020-10-21 11:53:45 +01:00
Martin Storsjö
4de215ff18 Revert "[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support"
Also revert "[InstCombine] foldOrOfICmps - use m_Specific instead of
explicit comparisons. NFCI." to make the primarily intended revert
work.

This reverts commits ce13549761b6a22263e051dda09ef5122435008b and
e372a5f86f6488bb0c2593a665d51fdd3a97c6e4.

This commit caused failed asserts e.g. like this:

$ cat repro.cpp
bool a(char b) {
  return b >= '0' && b <= '9' || (b | 32) >= 'a' && (b | 32) <= 'z';
$ clang++ -target x86_64-linux-gnu -c -O2 repro.cpp
clang++: ../include/llvm/ADT/APInt.h:1151: bool llvm::APInt::operator==(const
llvm::APInt&) const: Assertion `BitWidth == RHS.BitWidth && "Comparison
requires equal bit widths"' failed.
2020-10-21 09:47:18 +03:00
Simon Pilgrim
ce13549761 [InstCombine] foldOrOfICmps - use m_Specific instead of explicit comparisons. NFCI. 2020-10-20 16:26:41 +01:00
Simon Pilgrim
e372a5f86f [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support
Reapplied rGa704d8238c86 with a check for integer/integervector types to prevent matching with pointer types
2020-10-20 14:14:26 +01:00
Simon Pilgrim
adb52e5f9e [InstCombine] foldOrOfICmps - only fold (icmp_eq B, 0) | (icmp_ult/gt A, B) for integer types
Fixes a number of stage2 buildbots that were failing when I generalized the m_ConstantInt() logic - that didn't match for pointer types but m_Zero() does......
2020-10-19 17:05:38 +01:00