546 Commits

Author SHA1 Message Date
Nikita Popov
3969d2c3b5 [InstCombine] Disable select known bits fold for vectors
This is not safe if the simplification ends up looking through
lane-crossing operations. For now, we don't have a good way to
limit this in computeKnownBits(), so just disable vector handling
entirely.
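
To illustrate why a lane-crossing operation breaks this kind of reasoning, here is a minimal sketch in plain Python (not LLVM code). Vectors are modeled as lists, and `reverse_lanes` is a hypothetical stand-in for any lane-crossing operation such as a shuffle. The condition `x == 0` is only known per lane, so it says nothing about the lane that the reversed vector reads from.

```python
def select_lanes(cond, true_val, false_val):
    """Per-lane select: lane i picks true_val[i] when cond[i] is set."""
    return [t if c else f for c, t, f in zip(cond, true_val, false_val)]


def reverse_lanes(vec):
    """A lane-crossing operation: lane i reads lane n-1-i."""
    return vec[::-1]


def original(x, y):
    """select (x == 0), reverse(x), y, evaluated lane by lane."""
    cond = [lane == 0 for lane in x]
    return select_lanes(cond, reverse_lanes(x), y)


def unsound_fold(x, y):
    """Wrongly assumes 'x == 0' in lane i also holds for the lane that
    reverse_lanes() reads, so the true arm is replaced by zero."""
    cond = [lane == 0 for lane in x]
    return select_lanes(cond, [0] * len(x), y)


x = [0, 7]
y = [1, 1]
print(original(x, y), unsound_fold(x, y))
```

For `x = [0, 7]` the two functions disagree in lane 0, which is the kind of miscompile that disabling vector handling avoids.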

Fixes https://github.com/llvm/llvm-project/issues/97475.
2024-07-03 09:56:48 +02:00
Alex MacLean
8361d9065e
[InstCombine] disable select folding resulting in extra instructions (#97184)
Disable conversion of a `(select (icmp))` into `(xor (lshr (and)))` when
the result would need more instructions than the original. The longer
sequence can also interfere with other, more profitable folds for
`select`. For example, before this change the following fold would
occur:
```llvm
  %1 = icmp slt i32 %X, 0
  %2 = select i1 %1, i64 0, i64 8
```
to 
```llvm
  %1 = lshr i32 %X, 28
  %2 = and i32 %1, 8
  %3 = xor i32 %2, 8
  %4 = zext nneg i32 %3 to i64
```
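
As a sanity check that the two forms compute the same value, the following sketch models both in plain Python with explicit 32-bit masking. The helper names are made up for illustration; the `zext` is omitted because it does not change a non-negative value.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Reinterpret the low 32 bits of x as a signed i32."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def select_form(x):
    """%1 = icmp slt i32 %X, 0 ; %2 = select i1 %1, i64 0, i64 8"""
    return 0 if to_signed32(x) < 0 else 8


def bit_trick_form(x):
    """lshr by 28, and with 8, xor with 8 (the longer sequence)."""
    shifted = (x & MASK32) >> 28  # moves the sign bit to bit 3
    masked = shifted & 8
    return masked ^ 8


SAMPLES = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 0x12345678, 0x87654321]
print(all(select_form(x) == bit_trick_form(x) for x in SAMPLES))
```

Both forms agree on these samples, so the change is about instruction count and follow-up folds, not correctness.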
2024-07-01 08:10:56 -07:00
Nikita Popov
77eb056830
[InstCombine] Simplify select using KnownBits of condition (#95923)
Simplify the arms of a select based on the KnownBits implied by its condition.
For now this only handles the case where the select arm folds to a constant,
but this can be generalized to handle other patterns by using
SimplifyDemandedBits instead (in that case we would also have to limit to
non-undef conditions).

This is implemented by adding a new member to SimplifyQuery that can be used
to inject an additional condition. The affected values are pre-computed and
we don't call computeKnownBits() if the select arms don't contain affected
values. This reduces the cost in some pathological cases.
2024-07-01 09:26:01 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Stephen Tozer
d75f9dd1d2 Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Noah Goldstein
b37a4b9991 [InstCombine] Improve coverage of foldSelectValueEquivalence for non-constants
If f(Y) simplifies to Y, replace with Y. This requires Y to be
non-undef.

Closes #94719
2024-06-23 11:15:47 +08:00
Nikita Popov
9e6625d6a2 [InstCombine] Preserve all gep flags in another select of gep fold
2024-06-19 12:18:01 +02:00
Nikita Popov
4c8ce5d301 [InstCombine] Preserve all flags in select of gep fold
Preserve the flag intersection.
2024-06-19 12:01:48 +02:00
Nikita Popov
9a86d0a6b5 [InstCombine] Prefer source over result element type (NFC)
For single-index GEPs the source and result element types are the
same, but using the source type is semantically more correct.
2024-06-17 11:03:00 +02:00
Zain Jaffal
22ff7c5dc9
[ValueTracking][NFC] move isKnownInversion to ValueTracking (#95321)
I am using `isKnownInversion` in the following PR:
https://github.com/llvm/llvm-project/pull/94915

It is useful to have the method in a shared location so that it can be reused. I am not sure whether `ValueTracking` is the correct place, but most of the methods following the `isKnownX` pattern appear to live there.
2024-06-13 07:14:08 +01:00
Nikita Popov
ec16f44d08 [InstCombine] Use named values in comment (NFC)
Also use opaque pointers.
2024-06-12 15:08:48 +02:00
Noah Goldstein
7e7c29ba08 [InstCombine] Improve coverage of foldSelectValueEquivalence for constants
We don't need the `noundef` check if the new simplification is a
constant.

This cleans up regressions from folding multiuse:
    `(icmp eq/ne (sub/xor x, y), 0)` -> `(icmp eq/ne x, y)`.

Closes #88298
2024-06-06 20:02:57 -05:00
Nikita Popov
9bea770b63 [InstCombine] Only require not-undef in select equiv fold
As the comment already indicates, only replacement with undef
is problematic, as it introduces an additional use of undef.
Use the correct ValueTracking helper.
2024-06-06 09:38:08 +02:00
Yingwei Zheng
0a39c88e81
[InstCombine] Fold select Cond, not X, X into Cond ^ X (#93591)
See the following example:
```
define i1 @src(i64 %x, i1 %y) {
  %1526 = icmp ne i64 %x, 0
  %1527 = icmp eq i64 %x, 0
  %sel = select i1 %y, i1 %1526, i1 %1527
  ret i1 %sel
}

define i1 @tgt(i64 %x, i1 %y) {
  %1527 = icmp eq i64 %x, 0
  %sel = xor i1 %y, %1527
  ret i1 %sel
}
```
I find that this pattern is common in C/C++/Rust code bases.
This patch folds `select Cond, Y, X` into `Cond ^ X` iff:
1. X has the same type as Cond
2. X is poison -> Y is poison
3. X == !Y

Alive2: https://alive2.llvm.org/ce/z/hSmkHS
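
The boolean core of this fold can be checked with a truth table. The sketch below is plain Python and deliberately ignores poison (condition 2 above), so it only demonstrates the `X == !Y` part; `select` is a hypothetical helper mirroring the IR instruction.

```python
from itertools import product


def select(cond, true_val, false_val):
    """Model of the IR select on plain Python values."""
    return true_val if cond else false_val


def check_select_not_fold():
    """select Cond, !X, X  ==  Cond ^ X for every boolean input."""
    return all(
        select(cond, not x, x) == (cond ^ x)
        for cond, x in product([False, True], repeat=2)
    )


print(check_select_not_fold())
```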
2024-06-04 23:50:17 +08:00
Nikita Popov
3cd67eeca2 [InstCombine] Drop range attr in select of ctz fold
The range may no longer be valid after the select has been
optimized away.

This fixes the kernel miscompiles reported at
https://github.com/ClangBuiltLinux/linux/issues/2031.
2024-06-04 15:48:08 +02:00
Yingwei Zheng
b5f4210e9f
[InstCombine] Drop nuw flag when CtlzOp is a sub nuw (#91776)
See the following case:
```
define i32 @src1(i32 %x) {
  %dec = sub nuw i32 -2, %x
  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
  %sub = sub nsw i32 32, %ctlz
  %shl = shl i32 1, %sub
  %ugt = icmp ult i32 %x, -2
  %sel = select i1 %ugt, i32 %shl, i32 1
  ret i32 %sel
}

define i32 @tgt1(i32 %x) {
  %dec = sub nuw i32 -2, %x
  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
  %sub = sub nsw i32 32, %ctlz
  %and = and i32 %sub, 31
  %shl = shl nuw i32 1, %and
  ret i32 %shl
}
```
`nuw` in `%dec` should be dropped after the select instruction is
eliminated.
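
A small model of the example, written as a hedged sketch in plain Python with 32-bit masking, shows where the flag goes wrong. Poison is modeled as `None`, and all helper names are assumptions made for illustration. For `%x = -1` the subtraction `-2 - %x` wraps, so `sub nuw` would be poison; `@src1` hides that behind the select, while `@tgt1` uses `%dec` unconditionally.

```python
MASK32 = 0xFFFFFFFF
MINUS_TWO = 0xFFFFFFFE  # the i32 constant -2, viewed as unsigned


def ctlz32(value):
    """Leading zero count of a 32-bit value; ctlz32(0) == 32."""
    return 32 - value.bit_length()


def sub_nuw_wraps(lhs, rhs):
    """True when the unsigned subtraction lhs - rhs wraps (nuw violated)."""
    return rhs > lhs


def src(x):
    """Model of @src1; returns None where the result is poison."""
    if not x < MINUS_TWO:
        return 1  # the select picks the constant arm
    dec = MINUS_TWO - x  # cannot wrap here because x < -2 (unsigned)
    amount = 32 - ctlz32(dec)
    if amount >= 32:
        return None  # shl by the bit width is poison
    return (1 << amount) & MASK32


def tgt_without_nuw(x):
    """Model of @tgt1 with the nuw flag on %dec dropped."""
    dec = (MINUS_TWO - x) & MASK32
    amount = (32 - ctlz32(dec)) & 31
    return (1 << amount) & MASK32


SAMPLES = [0, 1, 5, 0x7FFFFFFE, 0x7FFFFFFF, 0xFFFFFFFD, 0xFFFFFFFE, 0xFFFFFFFF]
print(all(src(x) is None or src(x) == tgt_without_nuw(x) for x in SAMPLES))
print(sub_nuw_wraps(MINUS_TWO, 0xFFFFFFFF))
```

With the flag dropped, the target matches the source wherever the source is not poison. With the flag kept, `%x = -1` would turn a well-defined result of 1 into poison.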

Alive2: https://alive2.llvm.org/ce/z/7S9529

Fixes https://github.com/llvm/llvm-project/issues/91691.
2024-05-13 14:27:59 +08:00
Eli Friedman
f893dccbba
Replace uses of ConstantExpr::getCompare. (#91558)
Use ICmpInst::compare() where possible, ConstantFoldCompareInstOperands
in other places. This only changes places where either the fold is
guaranteed to succeed, or the code doesn't use the resulting compare if
we fail to fold.
2024-05-09 16:50:01 -07:00
Maciej Gabka
bfc0317153
Move several vector intrinsics out of experimental namespace (#88748)
This patch moves the following intrinsics out of the experimental
namespace:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

All of these intrinsics have existed in LLVM for more than a year and
are widely used, so they should no longer be considered experimental.
2024-04-29 10:16:45 +01:00
Nikita Popov
7339f7ba30
[InstCombine] Fix poison propagation in select of bitwise fold (#89701)
We're replacing the select with the false value here, but it may be more
poisonous if m_Not contains poison elements. Fix this by introducing a
m_NotForbidPoison matcher and using it here.

Fixes https://github.com/llvm/llvm-project/issues/89500.
2024-04-24 10:57:17 +09:00
Yingwei Zheng
6309440c21
[InstCombine] Fix unexpected overwriting in foldSelectWithSRem (#89539)
Fixes #89516
2024-04-21 22:41:32 +08:00
Nikita Popov
1baa385065
[IR][PatternMatch] Only accept poison in getSplatValue() (#89159)
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.

This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.

As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.

There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
2024-04-18 15:44:12 +09:00
Andreas Jonson
ff3523f67b
[IR] Drop poison-generating return attributes when necessary (#89138)
Rename has/dropPoisonGeneratingFlagsOrMetadata to
has/dropPoisonGeneratingAnnotations and make it also handle
nonnull, align and range return attributes on calls, similar
to the existing handling for !nonnull, !align and !range metadata.
2024-04-18 15:27:36 +09:00
Nikita Popov
525d00e5ed [InstCombine] Fix poison propagation in round up alignment fold
We can't directly use the high bits value if it is more poisonous
due to poison elements in the masks.

This fixes the issue reported in
https://github.com/llvm/llvm-project/pull/88217#issuecomment-2061034941.
2024-04-18 10:58:15 +09:00
XChy
313a33b9df
[InstCombine] Reduce nested logical operator if poison is implied (#86823)
Fixes #76623
Alive2 proof: https://alive2.llvm.org/ce/z/gX6znJ (I'm not sure how to
write a proof for such transform, maybe there are mistakes)

In most cases, `icmp(a, C1) && (other_cond && icmp(a, C2))` will be
reduced to `icmp(a, C1) & (other_cond && icmp(a, C2))`, since the latter
icmp being poison always implies that the former is poison. After the
reduction, it's easier to simplify the icmp chain.
Similarly, this patch does the same thing for
`(A && B) && C --> A && (B & C)`. We could constrain this reduction to
icmps only if benchmarks show regressions.
2024-04-10 14:19:44 +08:00
hanbeom
4ef22fce82
[InstCombine] Simplify select if it combinated and/or/xor (#73362)
Each of the `and/or/xor` operations can be rewritten as a combination
of the other logical operations:

 `x&y -> (x|y) ^ (x^y)`
 `x|y -> (x&y) | (x^y)`
 `x^y -> (x|y) ^ (x&y)`

If the left-hand side of the `SelectInst` condition is an `and/or/xor`
operation, the right-hand side is `0`, `-1`, or another constant, and
`TrueVal` consists of `and/or/xor` operations, then the select can be
simplified.

This patch implements this combination.
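
The three identities listed above can be confirmed exhaustively for small bit widths. This is a quick sketch in plain Python over all pairs of 8-bit values; the function names are made up for illustration.

```python
from itertools import product


def identities_hold(x, y):
    """Check the three and/or/xor rewrites for one pair of values."""
    return (
        (x & y) == ((x | y) ^ (x ^ y))
        and (x | y) == ((x & y) | (x ^ y))
        and (x ^ y) == ((x | y) ^ (x & y))
    )


def check_all_bytes():
    """Exhaustively check every pair of 8-bit values."""
    return all(identities_hold(x, y) for x, y in product(range(256), repeat=2))


print(check_all_bytes())
```

Because the identities are bitwise, a check over 8-bit values carries over to wider integer types.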

Proof: https://alive2.llvm.org/ce/z/WW8iRR

Fixes https://github.com/llvm/llvm-project/issues/71792.
2024-04-03 14:29:10 +08:00
Michele Scandale
09eb9f1136
[InstCombine] Fix for folding select into floating point binary operators. (#83200)
Folding a `select` into a floating point binary operator can only be
done if the result is preserved in both cases. In particular, if the
other operand of the `select` can be a NaN, then the transformation
won't preserve the result value.
2024-03-19 09:47:07 -07:00
Artem Tyurin
141145232f
[IRBuilder] Fold binary intrinsics (#80743)
Fixes https://github.com/llvm/llvm-project/issues/61240.
2024-03-15 09:58:25 +01:00
Nikita Popov
9f45c5e1a6
[InstCombine] Fix infinite loop in select equivalence fold (#84036)
When replacing with a non-constant, it's possible that the result of the
simplification is actually more complicated than the original, and may
result in an infinite combine loop.

Mitigate the issue by requiring that either the replacement or
simplification result is constant, which should ensure that it's
simpler. While this check is crude, it does not appear to cause
optimization regressions in real-world code in practice.

Fixes https://github.com/llvm/llvm-project/issues/83127.
2024-03-06 09:33:51 +01:00
Yingwei Zheng
0c47363385
[InstCombine] Simplify nested selects with implied condition (#83739)
This patch does the following simplification:
```
sel1 = select cond1, X, Y 
sel2 = select cond2, sel1, Z
-->
sel2 = select cond2, X, Z if cond2 implies cond1
sel2 = select cond2, Y, Z if cond2 implies !cond1
```
Alive2: https://alive2.llvm.org/ce/z/9A_arU

It cannot be done in CVP/SCCP since we should guarantee that `cond2` is
not an undef.
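
The simplification can be sanity-checked with a truth table over the two conditions. The sketch below is plain Python and does not model undef or poison; the implication is tested pointwise for each assignment, and `select` is a hypothetical helper mirroring the IR instruction.

```python
from itertools import product


def select(cond, true_val, false_val):
    """Model of the IR select on plain Python values."""
    return true_val if cond else false_val


def nested(cond1, cond2, x, y, z):
    """sel1 = select cond1, X, Y ; sel2 = select cond2, sel1, Z"""
    sel1 = select(cond1, x, y)
    return select(cond2, sel1, z)


def check_implied_folds():
    ok = True
    for cond1, cond2 in product([False, True], repeat=2):
        before = nested(cond1, cond2, "X", "Y", "Z")
        if (not cond2) or cond1:  # cond2 implies cond1 for this assignment
            ok = ok and before == select(cond2, "X", "Z")
        if (not cond2) or (not cond1):  # cond2 implies !cond1
            ok = ok and before == select(cond2, "Y", "Z")
    return ok


print(check_implied_folds())
```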
2024-03-05 14:11:37 +08:00
Yingwei Zheng
dc866ae49e
[ValueTracking] Move the isSignBitCheck helper into ValueTracking. NFC. (#81704)
This patch moves the `isSignBitCheck` helper into ValueTracking to reuse
the logic in ValueTracking/InstSimplify.

Addresses the comment
https://github.com/llvm/llvm-project/pull/80740#discussion_r1488440050.
2024-02-14 15:33:08 +08:00
Yingwei Zheng
f37d81f8a3
[PatternMatch] Add a matching helper m_ElementWiseBitCast. NFC. (#80764)
This patch introduces a matching helper `m_ElementWiseBitCast`, which is
used for matching element-wise int <-> fp casts.
The motivation of this patch is to avoid duplicating checks in
https://github.com/llvm/llvm-project/pull/80740 and
https://github.com/llvm/llvm-project/pull/80414.
2024-02-07 21:02:13 +08:00
Yingwei Zheng
930996e9e4
[ValueTracking][NFC] Pass SimplifyQuery to computeKnownFPClass family (#80657)
This patch refactors the interface of the `computeKnownFPClass` family
to pass `SimplifyQuery` directly.
The motivation of this patch is to compute known fpclass with
`DomConditionCache`, which was introduced by
https://github.com/llvm/llvm-project/pull/73662. With
`DomConditionCache`, we can do more optimization with context-sensitive
information.

Example (extracted from
[fmt/format.h](e17bc67547/include/fmt/format.h (L3555-L3566))):
```
define float @test(float %x, i1 %cond) {
  %i32 = bitcast float %x to i32
  %cmp = icmp slt i32 %i32, 0
  br i1 %cmp, label %if.then1, label %if.else

if.then1:
  %fneg = fneg float %x
  br label %if.end

if.else:
  br i1 %cond, label %if.then2, label %if.end

if.then2:
  br label %if.end

if.end:
  %value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ]
  %ret = call float @llvm.fabs.f32(float %value)
  ret float %ret
}
```
We can prove the signbit of `%value` is always zero. Then the fabs can
be eliminated.
2024-02-06 02:30:12 +08:00
Yingwei Zheng
f292f90bc2
[InstCombine] Fold select with signbit idiom into fabs (#76342)
This patch folds:
```
((bitcast X to int) <s 0 ? -X : X) -> fabs(X)
((bitcast X to int) >s -1 ? X : -X) -> fabs(X)
((bitcast X to int) <s 0 ? X : -X) -> -fabs(X)
((bitcast X to int) >s -1 ? -X : X) -> -fabs(X)
```
Alive2: https://alive2.llvm.org/ce/z/rGepow
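
The idiom can be tried out on doubles with a short Python sketch. It uses `struct` to stand in for the bitcast and compares raw bit patterns so that `0.0` and `-0.0` are told apart. NaN inputs are left out of the samples, and the helper names are assumptions made for illustration.

```python
import math
import struct


def bitcast_to_int64(x):
    """Reinterpret the bits of a double as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]


def bits(x):
    """Raw bit pattern of a double, so that 0.0 and -0.0 compare unequal."""
    return struct.pack("<d", x)


def select_idiom_fabs(x):
    """((bitcast X to int) <s 0 ? -X : X)"""
    return -x if bitcast_to_int64(x) < 0 else x


def select_idiom_neg_fabs(x):
    """((bitcast X to int) <s 0 ? X : -X)"""
    return x if bitcast_to_int64(x) < 0 else -x


SAMPLES = [0.0, -0.0, 1.5, -1.5, math.inf, -math.inf]
print(all(bits(select_idiom_fabs(x)) == bits(math.fabs(x)) for x in SAMPLES))
print(all(bits(select_idiom_neg_fabs(x)) == bits(-math.fabs(x)) for x in SAMPLES))
```

The sign-bit test matters for `-0.0`: a floating point comparison `X < 0.0` would be false there, whereas the integer comparison on the bitcast value sees the sign bit.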
2024-01-31 15:42:09 +08:00
hanbeom
66eedd1dd3
[InstCombine] Fix worklist management in select fold (#77738)
`InstCombine` uses a `Worklist` to track changed instructions.
`setOperand`, which was previously used to modify the `select`
instruction, does not register the change with the worklist, so `run`
executes twice, which causes an `LLVM ERROR`.

This problem is resolved by replacing `setOperand` with `replaceOperand`,
which registers the change in the worklist.

Fixes #77553.
2024-01-11 09:34:30 +01:00
Jie Fu
bf312263bf [InstCombine] Remove unused variables in InstCombineSelect.cpp (NFC)
llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp:3810:14: error: unused variable 'LHS' [-Werror,-Wunused-variable]
 3810 |       Value *LHS, *RHS;
      |              ^~~
llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp:3810:20: error: unused variable 'RHS' [-Werror,-Wunused-variable]
 3810 |       Value *LHS, *RHS;
      |
2023-12-31 18:40:26 +08:00
Yingwei Zheng
b23f59a646
[InstCombine] Fold select (A &/| B), T, F if select B, T, F is foldable (#76621)
This patch does the following folds:
```
(select A && B, T, F) -> (select A, (select B, T, F), F)
(select A || B, T, F) -> (select A, T, (select B, T, F))
```
if `(select B, T, F)` can be folded into a value or a canonicalized SPF.
Alive2: https://alive2.llvm.org/ce/z/4Bdrbu
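
Ignoring poison, both decompositions can be checked with a truth table. This is a minimal Python sketch in which the arms are opaque string tokens and `select` is a hypothetical helper mirroring the IR instruction.

```python
from itertools import product


def select(cond, true_val, false_val):
    """Model of the IR select on plain Python values."""
    return true_val if cond else false_val


def check_decomposition():
    for a, b in product([False, True], repeat=2):
        inner = select(b, "T", "F")
        # (select A && B, T, F) -> (select A, (select B, T, F), F)
        if select(a and b, "T", "F") != select(a, inner, "F"):
            return False
        # (select A || B, T, F) -> (select A, T, (select B, T, F))
        if select(a or b, "T", "F") != select(a, "T", inner):
            return False
    return True


print(check_decomposition())
```

The rewrite is only worthwhile when the inner `select B, T, F` folds further, which is the profitability condition stated above.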

The original motivation of this patch is to simplify the following
pattern:
```
%.sroa.speculated.i = tail call i64 @llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1)
%add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i
%cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i
%cmp9.i = icmp ugt i64 %add.i, 1152921504606846975
%or.cond.i = or i1 %cmp7.i, %cmp9.i
%cond.i = select i1 %or.cond.i, i64 1152921504606846975, i64 %add.i
->
%.sroa.speculated.i = tail call i64 @llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1)
%add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i
%cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i
%min = call i64 @llvm.umin.i64(i64 %add.i, i64 1152921504606846975)
%cond.i = select i1 %cmp7.i, i64 1152921504606846975, i64 %min
```
The latter form has better codegen for some backends. It is also more
analysis-friendly than the original one.
Godbolt: https://godbolt.org/z/eK6eb5jf1
Alive2: https://alive2.llvm.org/ce/z/VHlxL2

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=7c71d3996a72b9b024622f23bf556539b961c88c&to=638ce8666fadaca1ab2639a3c2bc52a4a8508f40&stat=instructions:u

|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|+0.02%|-0.00%|+0.02%|-0.03%|-0.00%|-0.05%|-0.00%|

It is an alternative to #76203 and #76363 because we can simplify
`select (icmp eq/ne a, b), a, b` into `b` or `a`.
Fixes #75784.
Fixes #76043.

Thank @XChy for providing additional tests.
Co-authored-by: XChy <xxs_chy@outlook.com>
2023-12-31 18:28:48 +08:00
Yingwei Zheng
568db84247
[InstCombine] Refactor canonicalizeSPF to support decomposed select. NFC.
See also https://github.com/llvm/llvm-project/pull/76621
2023-12-31 16:30:24 +08:00
Yingwei Zheng
ff76627aeb
[InstCombine] Fix type mismatch between cond and value in foldSelectToCopysign (#76343)
This patch fixes the miscompilation when we try to bitcast a floating point vector into an integer scalar.
2023-12-26 00:04:06 +08:00
Nikita Popov
465ecf872e [InstCombine] Rename UndefElts -> PoisonElts (NFC)
In line with updated shufflevector semantics, this represents the
poison elements rather than undef elements now. This commit is a
pure rename, without any logic changes.
2023-12-18 12:36:19 +01:00
Noah Goldstein
b7c0f79926 [InstCombine] Replace isFreeToInvert + CreateNot with getFreelyInverted
This is nearly NFC; the only potential change is the order in which
values are created and their names.

Otherwise it is a slight speedup and simplification, as it avoids
going through the recursive `getFreelyInverted` logic twice to simplify
the extra `not` op.
2023-11-20 17:59:27 -06:00
Noah Goldstein
9ef829097b [InstCombine] Fix buggy transform in foldNestedSelects; PR 71330
The bug is that `IsAndVariant` is used to decide which arm of the
select the output `SelInner` should be placed in, while the inner
select condition is matched with `m_c_LogicalOp`. With fully simplified
ops this works fine, but if the select condition is not simplified, it
is possible for it to match both `LogicalAnd` and `LogicalOr`, e.g.
`select true, true, false`.

In PR71330 for example, the issue occurs in the following IR:
```
define i32 @bad() {
  %..i.i = select i1 false, i32 0, i32 3
  %brmerge = select i1 true, i1 true, i1 false
  %not.cmp.i.i.not = xor i1 true, true
  %.mux = zext i1 %not.cmp.i.i.not to i32
  %retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i
  ret i32 %retval.0.i.i
}
```

When simplifying:
```
%retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i
```

We end up matching `%brmerge` as `LogicalAnd` for `IsAndVariant`, but
match the condition of the inner select (`%..i.i`), which is `false`,
as `LogicalOr`.

Closes #71489
2023-11-09 16:36:49 -06:00
Nikita Popov
1a7061c1ad [InstCombine] Remove redundant logical select fold (NFCI)
This has been subsumed by simplifyWithOpReplaced().
2023-10-24 16:28:23 +02:00
Nikita Popov
34c33bbb8b [InstCombine] Remove redundant fold in foldSelectExtConst() (NFCI)
This has been subsumed by the more general simplifyWithOpReplaced()
fold.
2023-10-24 16:24:27 +02:00
Nikita Popov
d3cf00bb4d [InstCombine] Remove some redundant select folds (NFCI)
simplifyWithOpReplaced() has become more powerful in the
meantime, subsuming these folds.
2023-10-24 16:17:47 +02:00
Nikita Popov
d4300154b6 Revert "[ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)"
This reverts commit b5743d4798b250506965e07ebab806a3c2d767cc.

This causes some minor compile-time impact. Revert for now, better
to do the change more gradually.
2023-10-16 14:04:09 +02:00
Nikita Popov
b5743d4798 [ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)
Remove the old overloads that accept KnownBits by reference, in
favor of those that return it by value.
2023-10-16 13:00:31 +02:00
Nikita Popov
9ace23c9a2 [InstCombine] Avoid use of ConstantExpr::getSExt() (NFC)
Use the constant folding API instead.
2023-10-02 11:30:15 +02:00
Nikita Popov
6ce7461eea [InstCombine] Avoid uses of ConstantExpr::getCast()
Add a generalized getLosslessTrunc() helper to simplify this.
2023-09-29 11:32:41 +02:00
Nikita Popov
c41b4b6397 [InstCombine] Make flag drop during select equiv fold more generic
Instead of unsetting flags on the instruction, attempting the
fold, and then resetting the flags if it failed, add support to
simplifyWithOpReplaced() to ignore poison-generating flags/metadata
and collect all instructions where they may need to be dropped.

This allows us to perform the fold a) with poison-generating
metadata, which was previously not handled and b) with poison-generating
flags/metadata that are not on the root instruction.

Proof for the ctpop case: https://alive2.llvm.org/ce/z/3H3HFs

Fixes https://github.com/llvm/llvm-project/issues/62450.
2023-09-19 14:54:25 +02:00