522 Commits

Author SHA1 Message Date
XChy
313a33b9df
[InstCombine] Reduce nested logical operator if poison is implied (#86823)
Fixes #76623
Alive2 proof: https://alive2.llvm.org/ce/z/gX6znJ (I'm not sure how to
write a proof for such transform, maybe there are mistakes)

In most cases, `icmp(a, C1) && (other_cond && icmp(a, C2))` will be
reduced to `icmp(a, C1) & (other_cond && icmp(a, C2))`, since latter
icmp always implies the poison of the former. After reduction, it's
easier to simplify the icmp chain.
Similarly, this patch does the same thing for `(A && B) && C --> A && (B
& C)`. Maybe we could constraint such reduction only on icmps if there
is regression in benchmarks.
2024-04-10 14:19:44 +08:00
hanbeom
4ef22fce82
[InstCombine] Simplify select if it combinated and/or/xor (#73362)
`and/or/xor` operations can each be changed to sum of logical
operations including operators other than themselves.

 `x&y -> (x|y) ^ (x^y)`
 `x|y -> (x&y) | (x^y)`
 `x^y -> (x|y) ^ (x&y)`

if left of condition of `SelectInst` is `and/or/xor` logical
operation and right is equal to `0, -1`, or a `constant`, and
if `TrueVal` consist of `and/or/xor` logical operation then we
can optimize this case.

This patch implements this combination.

Proof: https://alive2.llvm.org/ce/z/WW8iRR

Fixes https://github.com/llvm/llvm-project/issues/71792.
2024-04-03 14:29:10 +08:00
Michele Scandale
09eb9f1136
[InstCombine] Fix for folding select into floating point binary operators. (#83200)
Folding a `select` into a floating point binary operators can only be
done if the result is preserved for both case. In particular, if the
other operand of the `select` can be a NaN, then the transformation
won't preserve the result value.
2024-03-19 09:47:07 -07:00
Artem Tyurin
141145232f
[IRBuilder] Fold binary intrinsics (#80743)
Fixes https://github.com/llvm/llvm-project/issues/61240.
2024-03-15 09:58:25 +01:00
Nikita Popov
9f45c5e1a6
[InstCombine] Fix infinite loop in select equivalence fold (#84036)
When replacing with a non-constant, it's possible that the result of the
simplification is actually more complicated than the original, and may
result in an infinite combine loop.

Mitigate the issue by requiring that either the replacement or
simplification result is constant, which should ensure that it's
simpler. While this check is crude, it does not appear to cause
optimization regressions in real-world code in practice.

Fixes https://github.com/llvm/llvm-project/issues/83127.
2024-03-06 09:33:51 +01:00
Yingwei Zheng
0c47363385
[InstCombine] Simplify nested selects with implied condition (#83739)
This patch does the following simplification:
```
sel1 = select cond1, X, Y 
sel2 = select cond2, sel1, Z
-->
sel2 = select cond2, X, Z if cond2 implies cond1
sel2 = select cond2, Y, Z if cond2 implies !cond1
```
Alive2: https://alive2.llvm.org/ce/z/9A_arU

It cannot be done in CVP/SCCP since we should guarantee that `cond2` is
not an undef.
2024-03-05 14:11:37 +08:00
Yingwei Zheng
dc866ae49e
[ValueTracking] Move the isSignBitCheck helper into ValueTracking. NFC. (#81704)
This patch moves the `isSignBitCheck` helper into ValueTracking to reuse
the logic in ValueTracking/InstSimplify.

Addresses the comment
https://github.com/llvm/llvm-project/pull/80740#discussion_r1488440050.
2024-02-14 15:33:08 +08:00
Yingwei Zheng
f37d81f8a3
[PatternMatch] Add a matching helper m_ElementWiseBitCast. NFC. (#80764)
This patch introduces a matching helper `m_ElementWiseBitCast`, which is
used for matching element-wise int <-> fp casts.
The motivation of this patch is to avoid duplicating checks in
https://github.com/llvm/llvm-project/pull/80740 and
https://github.com/llvm/llvm-project/pull/80414.
2024-02-07 21:02:13 +08:00
Yingwei Zheng
930996e9e4
[ValueTracking][NFC] Pass SimplifyQuery to computeKnownFPClass family (#80657)
This patch refactors the interface of the `computeKnownFPClass` family
to pass `SimplifyQuery` directly.
The motivation of this patch is to compute known fpclass with
`DomConditionCache`, which was introduced by
https://github.com/llvm/llvm-project/pull/73662. With
`DomConditionCache`, we can do more optimization with context-sensitive
information.

Example (extracted from
[fmt/format.h](e17bc67547/include/fmt/format.h (L3555-L3566))):
```
define float @test(float %x, i1 %cond) {
  %i32 = bitcast float %x to i32
  %cmp = icmp slt i32 %i32, 0
  br i1 %cmp, label %if.then1, label %if.else

if.then1:
  %fneg = fneg float %x
  br label %if.end

if.else:
  br i1 %cond, label %if.then2, label %if.end

if.then2:
  br label %if.end

if.end:
  %value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ]
  %ret = call float @llvm.fabs.f32(float %value)
  ret float %ret
}
```
We can prove the signbit of `%value` is always zero. Then the fabs can
be eliminated.
2024-02-06 02:30:12 +08:00
Yingwei Zheng
f292f90bc2
[InstCombine] Fold select with signbit idiom into fabs (#76342)
This patch folds:
```
((bitcast X to int) <s 0 ? -X : X) -> fabs(X)
((bitcast X to int) >s -1 ? X : -X) -> fabs(X)
((bitcast X to int) <s 0 ? X : -X) -> -fabs(X)
((bitcast X to int) >s -1 ? -X : X) -> -fabs(X)
```
Alive2: https://alive2.llvm.org/ce/z/rGepow
2024-01-31 15:42:09 +08:00
hanbeom
66eedd1dd3
[InstCombine] Fix worklist management in select fold (#77738)
`InstCombine` uses `Worklist` to manage change history. `setOperand`,
which was previously used to change the `Select` Instruction, does not,
so it is `run` twice, which causes an `LLVM ERROR`.

This problem is resolved by changing `setOperand` to `replaceOperand` as
the change history will be registered in the Worklist.

Fixes #77553.
2024-01-11 09:34:30 +01:00
Jie Fu
bf312263bf [InstCombine] Remove unused variables in InstCombineSelect.cpp (NFC)
llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp:3810:14: error: unused variable 'LHS' [-Werror,-Wunused-variable]
 3810 |       Value *LHS, *RHS;
      |              ^~~
llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp:3810:20: error: unused variable 'RHS' [-Werror,-Wunused-variable]
 3810 |       Value *LHS, *RHS;
      |
2023-12-31 18:40:26 +08:00
Yingwei Zheng
b23f59a646
[InstCombine] Fold select (A &/| B), T, F if select B, T, F is foldable (#76621)
This patch does the following folds:
```
(select A && B, T, F) -> (select A, (select B, T, F), F)
(select A || B, T, F) -> (select A, T, (select B, T, F))
```
if `(select B, T, F)` can be folded into a value or a canonicalized SPF.
Alive2: https://alive2.llvm.org/ce/z/4Bdrbu

The original motivation of this patch is to simplify the following
pattern:
```
%.sroa.speculated.i = tail call i64 @llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1)
%add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i
%cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i
%cmp9.i = icmp ugt i64 %add.i, 1152921504606846975
%or.cond.i = or i1 %cmp7.i, %cmp9.i
%cond.i = select i1 %or.cond.i, i64 1152921504606846975, i64 %add.i
->
%.sroa.speculated.i = tail call i64 @llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1)
%add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i
%cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i
%max = call i64 @llvm.umax.i64(i64 %add.i, 1152921504606846975)
%cond.i = select i1 %cmp7.i, i64 1152921504606846975, i64 %max
```
The later form has a better codegen for some backends. It is also more
analysis-friendly than the original one.
Godbolt: https://godbolt.org/z/eK6eb5jf1
Alive2: https://alive2.llvm.org/ce/z/VHlxL2

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=7c71d3996a72b9b024622f23bf556539b961c88c&to=638ce8666fadaca1ab2639a3c2bc52a4a8508f40&stat=instructions:u

|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|+0.02%|-0.00%|+0.02%|-0.03%|-0.00%|-0.05%|-0.00%|

It is an alternative to #76203 and #76363 because we can simplify
`select (icmp eq/ne a, b), a, b` into `b` or `a`.
Fixes #75784.
Fixes #76043.

Thank @XChy for providing additional tests.
Co-authored-by: XChy <xxs_chy@outlook.com>
2023-12-31 18:28:48 +08:00
Yingwei Zheng
568db84247
[InstCombine] Refactor canonicalizeSPF to support decomposed select. NFC.
See also https://github.com/llvm/llvm-project/pull/76621
2023-12-31 16:30:24 +08:00
Yingwei Zheng
ff76627aeb
[InstCombine] Fix type mismatch between cond and value in foldSelectToCopysign (#76343)
This patch fixes the miscompilation when we try to bitcast a floating point vector into an integer scalar.
2023-12-26 00:04:06 +08:00
Nikita Popov
465ecf872e [InstCombine] Rename UndefElts -> PoisonElts (NFC)
In line with updated shufflevector semantics, this represents the
poison elements rather than undef elements now. This commit is a
pure rename, without any logic changes.
2023-12-18 12:36:19 +01:00
Noah Goldstein
b7c0f79926 [InstCombine] Replace isFreeToInvert + CreateNot with getFreelyInverted
This is nearly an NFC, the only change is potentially to order that
values are created/names.

Otherwise it is a slight speed boost/simplification to avoid having to
go through the `getFreelyInverted` recursive logic twice to simplify
the extra `not` op.
2023-11-20 17:59:27 -06:00
Noah Goldstein
9ef829097b [InstCombine] Fix buggy transform in foldNestedSelects; PR 71330
The bug is that `IsAndVariant` is used to assume which arm in the
select the output `SelInner` should be placed but match the inner
select condition with `m_c_LogicalOp`. With fully simplified ops, this
works fine, but its possible if the select condition is not
simplified, for it match both `LogicalAnd` and `LogicalOr` i.e `select
true, true, false`.

In PR71330 for example, the issue occurs in the following IR:
```
define i32 @bad() {
  %..i.i = select i1 false, i32 0, i32 3
  %brmerge = select i1 true, i1 true, i1 false
  %not.cmp.i.i.not = xor i1 true, true
  %.mux = zext i1 %not.cmp.i.i.not to i32
  %retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i
  ret i32 %retval.0.i.i
}
```

When simplifying:
```
%retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i
```

We end up matching `%brmerge` as `LogicalAnd` for `IsAndVariant`, but
the inner select (`%..i.i`) condition which is `false` with
`LogicalOr`.

Closes #71489
2023-11-09 16:36:49 -06:00
Nikita Popov
1a7061c1ad [InstCombine] Remove redundant logical select fold (NFCI)
This has been subsumed by simplifyWithOpReplaced().
2023-10-24 16:28:23 +02:00
Nikita Popov
34c33bbb8b [InstCombine] Remove redundant fold in foldSelectExtConst() (NFCI)
This has been subsumed by the more general simplifyWithOpReplaced()
fold.
2023-10-24 16:24:27 +02:00
Nikita Popov
d3cf00bb4d [InstCombine] Remove some redundant select folds (NFCI)
simplifyWithOpReplaced() has become more powerful in the
meantime, subsuming these folds.
2023-10-24 16:17:47 +02:00
Nikita Popov
d4300154b6 Revert "[ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)"
This reverts commit b5743d4798b250506965e07ebab806a3c2d767cc.

This causes some minor compile-time impact. Revert for now, better
to do the change more gradually.
2023-10-16 14:04:09 +02:00
Nikita Popov
b5743d4798 [ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)
Remove the old overloads that accept KnownBits by reference, in
favor of those that return it by value.
2023-10-16 13:00:31 +02:00
Nikita Popov
9ace23c9a2 [InstCombine] Avoid use of ConstantExpr::getSExt() (NFC)
Use the constant folding API instead.
2023-10-02 11:30:15 +02:00
Nikita Popov
6ce7461eea [InstCombine] Avoid uses of ConstantExpr::getCast()
Add a generalized getLosslessTrunc() helper to simplify this.
2023-09-29 11:32:41 +02:00
Nikita Popov
c41b4b6397 [InstCombine] Make flag drop during select equiv fold more generic
Instead of unsetting flags on the instruction, attempting the
fold, and the resetting the flags if it failed, add support to
simplifyWithOpReplaced() to ignore poison-generating flags/metadata
and collect all instructions where they may need to be dropped.

This allows us to perform the fold a) with poison-generating
metadata, which was previously not handled and b) poison-generating
flags/metadata that are not on the root instruction.

Proof for the ctpop case: https://alive2.llvm.org/ce/z/3H3HFs

Fixes https://github.com/llvm/llvm-project/issues/62450.
2023-09-19 14:54:25 +02:00
Yingwei Zheng
1679b20cd0
[InstCombine] Fix transforms of two select patterns (#65845)
This patch fixes transforms of `select (~a | c), a, b` and `select (c &
b), a, b` as discussed in [D158983](https://reviews.llvm.org/D158983).
Alive2: https://alive2.llvm.org/ce/z/ft6TDw
2023-09-18 01:28:37 +08:00
Antonio Frighetto
ce5b88bf10 [InstCombine] Handle constant arms in select of srem fold
Extend folding for `2^n` euclidean division remainder operations
on signed integers by handling the specific instance in which one
`select` arm has already been replaced by 1.

Reported-By: HypheX

Fixes: https://github.com/llvm/llvm-project/issues/66417.
2023-09-16 12:22:46 +02:00
Jeremy Morse
e54277fa10 [NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder
This patch adds a two-argument SetInsertPoint method to IRBuilder that
takes a block/iterator instead of an instruction, and updates many call
sites to use it. The motivating reason for doing this is given here [0],
we'd like to pass around more information about the position of debug-info
in the iterator object. That necessitates passing iterators around most of
the time.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152468
2023-09-11 20:01:19 +01:00
Jeremy Morse
d529943a27 [NFC][RemoveDIs] Prefer iterators over inst-pointers in InstCombine
As per my proposal for how to eliminate debug intrinsics [0], for various
places in InstCombine prefer to insert using an instruction iterator rather
than an instruction pointer. This is so that we can eventually pass more
information in the iterator class. These call-sites where I've changed the
spelling are those that necessary to build a stage2clang to produce an
identical binary in the coming no-debug-intrinsics mode.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152543
2023-09-11 15:04:51 +01:00
Noah Goldstein
54ec8bcaf8 Recommit "[InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops" (3rd Try)
Fixed bug that assumed binop was commutative.
Was re-reviewed by nikic and chapuni

Differential Revision: https://reviews.llvm.org/D148414
2023-09-01 17:15:51 -05:00
Matt Arsenault
5ae881ff0a InstCombine: Fold out scale-if-denormal pattern
Fold select (fcmp oeq x, 0), (fmul x, y), x => x

This cleans up a pattern left behind by denormal range checks under
denormals are zero.

The pattern starts out as something like:
  x = x < smallest_normal ? x * K : x;

The comparison folds to an == 0 when the denormal mode treats input
denormals as zero. This makes library denormal checks free after
linked into DAZ enabled code.

alive2 is mostly happy with this, but there are some issues. First,
there are many reported failures in some of the negative tests that
happen to trigger some preexisting canonicalize introducing
combine. Second, alive2 is incorrectly asserting that denormals must
be flushed with the DAZ modes. It's allowed to drop a canonicalize.

https://reviews.llvm.org/D157030
2023-09-01 07:47:12 -04:00
Yingwei Zheng
074f23e3e1
[InstCombine] Fold two select patterns into or-and
This patch is the follow-up improvement of D122152.
Fixes https://github.com/llvm/llvm-project/issues/64558.

`select (a | c), a, b -> select a, true, (select ~c, b, false)` where `c` is free to invert
`select (c & ~b), a, b -> select b, true, (select c, a, false)`
Alive2: https://alive2.llvm.org/ce/z/KwxtMA

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D158983
2023-08-30 00:57:08 +08:00
Noah Goldstein
2acf00bd0a Revert "Recommit "[InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops" (2nd Try)"
Still appears to be buggy:
https://lab.llvm.org/buildbot/#/builders/124/builds/8260

This reverts commit 397a9cc4d875712a648271ecbac05ac6382c5708.
2023-08-25 02:22:23 -05:00
Noah Goldstein
397a9cc4d8 Recommit "[InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops" (2nd Try)
Was missing a nullptr check before derefencing. Fixed + test case
included in the patch.

Re-Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148414
2023-08-24 19:43:10 -05:00
David Spickett
2121e35ac2 Revert "[InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops"
This reverts commit d3402bc4460acefbc3d5278743601fa090784614.

This has caused a second stage build failure on one of our Armv7 32 bit builders:
https://lab.llvm.org/buildbot/#/builders/182/builds/7193
2023-08-17 10:13:47 +00:00
Noah Goldstein
d3402bc446 [InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops
This just expands on the existing logic that worked for `Or` and
applies it to any binop where `0` is the identity value on the RHS
i.e: `add`, `or`, `xor`, `shl`, etc...

Proofs For Some: https://alive2.llvm.org/ce/z/XZo6JD

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148414
2023-08-16 22:43:05 -05:00
Noah Goldstein
00f0381461 [InstCombine] Refactor foldSelectICmpAndOr to use decomposeBitTestICmp instead of bespoke logic
This is essentially NFC as the cases `decomposeBitTestICmp` covers
that weren't already covered explicitly, will be canonicalized into
the cases explicitly covered. As well the unsigned cases don't apply
as the `Mask` is not a power of 2.

That being said, using a well established helper is less bug prone and
if some canonicalization changes, will prevent regressions here.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148744
2023-08-16 22:43:05 -05:00
Noah Goldstein
82292d1ae5 [InstCombine] Remove requirement on trunc in slt/sgt case in foldSelectICmpAndOr
AFAICT, the trunc is not needed for correctness/performance and just
blocks what should be handlable cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148413
2023-08-16 22:43:05 -05:00
Noah Goldstein
2c606dc16f [InstCombine] Cleanup code in foldSelectICmpAndOr; NFC
There was just alot of boolean logic to propegate conditions that seem
clearer in conditions.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148412
2023-08-16 22:43:05 -05:00
Antonio Frighetto
211692137a [InstCombine] Fold select of srem and conditional add
Simplify a pattern that may show up when computing
the remainder of euclidean division. Particularly,
when the divisor is a power of two and never negative,
the signed remainder can be folded with a bitwise and.

Fixes 64305.

Proofs: https://alive2.llvm.org/ce/z/9_KG6c

Differential Revision: https://reviews.llvm.org/D156811
2023-08-08 00:02:16 +00:00
Paweł Bylica
966318005a
[InstCombine] Preserve metadata when combining select+binop
Fixes https://github.com/llvm/llvm-project/issues/63910

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D155461
2023-07-19 15:20:49 +02:00
Nikita Popov
cd1dcd2c95 [InstCombine] Handle const select arm in foldSelectCtlzToCttz()
The select arm that takes the ctlz result can also instead be a
constant with the bit width (as this is what the ctlz evaluates to
for a==0).

This avoids a regression when strengthening the
simplifyWithOpReplaced() fold.

Proof: https://alive2.llvm.org/ce/z/DMRL5A
2023-07-14 12:00:39 +02:00
Matt Arsenault
0f4eb557e8 ValueTracking: Replace CannotBeNegativeZero
This is now just a wrapper around computeKnownFPClass.
2023-07-12 13:14:05 -04:00
Peixin Qiao
ab73bd3897 [InstCombine] Enhance select icmp and folding
This folds (a << k) ? 2^k * a : 0 to 2^k * a.

https://alive2.llvm.org/ce/z/_dDRjo

Fix #62155.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148420
2023-07-12 22:39:45 +08:00
Nikita Popov
336d7281ad [InstCombine] Preserve inbounds when folding select of GEP
The select base, (gep base, offset) to gep base, select (0, offset)
fold used to drop inbounds, because the gep base, 0 this introduces
might not be inbounds. After the semantics change in D154051, such
a GEP is always considered inbounds, in which allows us to preserve
the flag here.

As the PhaseOrdering test demonstrates, this can result in major
optimization improvements in some cases.

Differential Revision: https://reviews.llvm.org/D154055
2023-07-07 09:56:33 +02:00
Matt Arsenault
17eaa55e9f InstCombine: Fold select of ldexp to ldexp of select
The select-of-different-exp pattern appears in the device
libraries. I haven't seen the select-of-values case.
2023-06-22 14:22:01 -04:00
Nikita Popov
8378f1f4cd [InstCombine] Remove adjustMinMax() fold (PR62088)
This fold is buggy if the constant adjustment overflows.
Additionally, since we now canonicalize to min/max intrinsics,
the constants picked here don't actually matter, as long as SPF
still recognizes the pattern.

Fixes https://github.com/llvm/llvm-project/issues/62088.
2023-05-30 16:06:38 +02:00
Florian Hahn
cd2fc73b49
Revert "[ValueTracking][InstCombine] Add a new API to allow to ignore poison generating flags or metadatas when implying poison"
This reverts commit 754f3ae65518331b7175d7a9b4a124523ebe6eac.

Unfortunately the change can cause regressions due to dropping flags
from instructions (like nuw,nsw,inbounds), prevent further optimizations
depending on those flags.

A simple example is the IR below, where `inbounds` is dropped with the
patch and the phase-ordering test added in 7c91d82ab912fae8b.

    define i1 @test(ptr %base, i64 noundef %len, ptr %p2) {
    bb:
      %gep = getelementptr inbounds i32, ptr %base, i64 %len
      %c.1 = icmp uge ptr %p2, %base
      %c.2 = icmp ult ptr %p2, %gep
      %select = select i1 %c.1, i1 %c.2, i1 false
      ret i1 %select
    }

For more discussion, see D149404.
2023-05-29 15:44:37 +01:00
Nikita Popov
2938f9b46f [InstCombine] Fix worklist management in select value equiv fold (NFCI)
Requeue the modified instruction.

This should be NFC apart from worklist order effects.
2023-05-23 16:37:56 +02:00