507 Commits

Author SHA1 Message Date
Nikita Popov
465ecf872e [InstCombine] Rename UndefElts -> PoisonElts (NFC)
In line with updated shufflevector semantics, this represents the
poison elements rather than undef elements now. This commit is a
pure rename, without any logic changes.
2023-12-18 12:36:19 +01:00
Noah Goldstein
b7c0f79926 [InstCombine] Replace isFreeToInvert + CreateNot with getFreelyInverted
This is nearly NFC; the only potential change is the order in which
values are created/named.

Otherwise it is a slight speed boost/simplification to avoid having to
go through the `getFreelyInverted` recursive logic twice to simplify
the extra `not` op.
2023-11-20 17:59:27 -06:00
Noah Goldstein
9ef829097b [InstCombine] Fix buggy transform in foldNestedSelects; PR 71330
The bug is that `IsAndVariant` is used to decide which arm of the
select the output `SelInner` should be placed in, but the inner
select condition is matched with `m_c_LogicalOp`. With fully simplified
ops this works fine, but if the select condition is not simplified, it
is possible for it to match both `LogicalAnd` and `LogicalOr`, e.g.
`select true, true, false`.

In PR71330 for example, the issue occurs in the following IR:
```
define i32 @bad() {
  %..i.i = select i1 false, i32 0, i32 3
  %brmerge = select i1 true, i1 true, i1 false
  %not.cmp.i.i.not = xor i1 true, true
  %.mux = zext i1 %not.cmp.i.i.not to i32
  %retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i
  ret i32 %retval.0.i.i
}
```

When simplifying:
```
%retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i
```

We end up matching `%brmerge` as `LogicalAnd` for `IsAndVariant`, but
match the inner select (`%..i.i`) condition, which is `false`, as
`LogicalOr`.

Closes #71489
2023-11-09 16:36:49 -06:00
Nikita Popov
1a7061c1ad [InstCombine] Remove redundant logical select fold (NFCI)
This has been subsumed by simplifyWithOpReplaced().
2023-10-24 16:28:23 +02:00
Nikita Popov
34c33bbb8b [InstCombine] Remove redundant fold in foldSelectExtConst() (NFCI)
This has been subsumed by the more general simplifyWithOpReplaced()
fold.
2023-10-24 16:24:27 +02:00
Nikita Popov
d3cf00bb4d [InstCombine] Remove some redundant select folds (NFCI)
simplifyWithOpReplaced() has become more powerful in the
meantime, subsuming these folds.
2023-10-24 16:17:47 +02:00
Nikita Popov
d4300154b6 Revert "[ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)"
This reverts commit b5743d4798b250506965e07ebab806a3c2d767cc.

This causes some minor compile-time impact. Revert for now, better
to do the change more gradually.
2023-10-16 14:04:09 +02:00
Nikita Popov
b5743d4798 [ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)
Remove the old overloads that accept KnownBits by reference, in
favor of those that return it by value.
2023-10-16 13:00:31 +02:00
Nikita Popov
9ace23c9a2 [InstCombine] Avoid use of ConstantExpr::getSExt() (NFC)
Use the constant folding API instead.
2023-10-02 11:30:15 +02:00
Nikita Popov
6ce7461eea [InstCombine] Avoid uses of ConstantExpr::getCast()
Add a generalized getLosslessTrunc() helper to simplify this.
2023-09-29 11:32:41 +02:00
Nikita Popov
c41b4b6397 [InstCombine] Make flag drop during select equiv fold more generic
Instead of unsetting flags on the instruction, attempting the
fold, and then resetting the flags if it failed, add support to
simplifyWithOpReplaced() to ignore poison-generating flags/metadata
and collect all instructions where they may need to be dropped.

This allows us to perform the fold a) with poison-generating
metadata, which was previously not handled, and b) with poison-generating
flags/metadata that are not on the root instruction.

Proof for the ctpop case: https://alive2.llvm.org/ce/z/3H3HFs

Fixes https://github.com/llvm/llvm-project/issues/62450.
2023-09-19 14:54:25 +02:00
Yingwei Zheng
1679b20cd0 [InstCombine] Fix transforms of two select patterns (#65845)
This patch fixes transforms of `select (~a | c), a, b` and `select (c &
b), a, b` as discussed in [D158983](https://reviews.llvm.org/D158983).
Alive2: https://alive2.llvm.org/ce/z/ft6TDw
2023-09-18 01:28:37 +08:00
Antonio Frighetto
ce5b88bf10 [InstCombine] Handle constant arms in select of srem fold
Extend folding for `2^n` euclidean division remainder operations
on signed integers by handling the specific instance in which one
`select` arm has already been replaced by 1.

Reported-By: HypheX

Fixes: https://github.com/llvm/llvm-project/issues/66417.
2023-09-16 12:22:46 +02:00
Jeremy Morse
e54277fa10 [NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder
This patch adds a two-argument SetInsertPoint method to IRBuilder that
takes a block/iterator instead of an instruction, and updates many call
sites to use it. The motivating reason for doing this is given here [0],
we'd like to pass around more information about the position of debug-info
in the iterator object. That necessitates passing iterators around most of
the time.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152468
2023-09-11 20:01:19 +01:00
Jeremy Morse
d529943a27 [NFC][RemoveDIs] Prefer iterators over inst-pointers in InstCombine
As per my proposal for how to eliminate debug intrinsics [0], for various
places in InstCombine prefer to insert using an instruction iterator rather
than an instruction pointer. This is so that we can eventually pass more
information in the iterator class. The call-sites where I've changed the
spelling are those that are necessary to build a stage2 clang that produces
an identical binary in the coming no-debug-intrinsics mode.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152543
2023-09-11 15:04:51 +01:00
Noah Goldstein
54ec8bcaf8 Recommit "[InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops" (3rd Try)
Fixed bug that assumed binop was commutative.
Was re-reviewed by nikic and chapuni

Differential Revision: https://reviews.llvm.org/D148414
2023-09-01 17:15:51 -05:00
Matt Arsenault
5ae881ff0a InstCombine: Fold out scale-if-denormal pattern
Fold select (fcmp oeq x, 0), (fmul x, y), x => x

This cleans up a pattern left behind by denormal range checks under
denormals-are-zero (DAZ).

The pattern starts out as something like:
  x = x < smallest_normal ? x * K : x;

The comparison folds to an == 0 when the denormal mode treats input
denormals as zero. This makes library denormal checks free after
being linked into DAZ-enabled code.

alive2 is mostly happy with this, but there are some issues. First,
there are many reported failures in some of the negative tests that
happen to trigger a preexisting canonicalize-introducing
combine. Second, alive2 is incorrectly asserting that denormals must
be flushed with the DAZ modes. It's allowed to drop a canonicalize.

https://reviews.llvm.org/D157030
2023-09-01 07:47:12 -04:00
Yingwei Zheng
074f23e3e1 [InstCombine] Fold two select patterns into or-and
This patch is the follow-up improvement of D122152.
Fixes https://github.com/llvm/llvm-project/issues/64558.

`select (a | c), a, b -> select a, true, (select ~c, b, false)` where `c` is free to invert
`select (c & ~b), a, b -> select b, true, (select c, a, false)`
Alive2: https://alive2.llvm.org/ce/z/KwxtMA

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D158983
2023-08-30 00:57:08 +08:00
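The two rewrites quoted in the commit above can be sanity-checked with a plain truth table. The sketch below is only that: it models `select` as a C++ ternary over two-valued booleans and does not model poison or undef, which is what the Alive2 proof covers. The helper names `sel` and `orAndSelectFoldsHold` are hypothetical and do not come from LLVM.

```cpp
// select Cond, TrueVal, FalseVal on plain booleans (no poison modelled).
bool sel(bool Cond, bool TrueVal, bool FalseVal) {
  return Cond ? TrueVal : FalseVal;
}

// Returns true if both rewrites agree with the original select for all
// eight combinations of a, b and c.
bool orAndSelectFoldsHold() {
  for (unsigned Bits = 0; Bits < 8; ++Bits) {
    const bool A = (Bits & 1) != 0;
    const bool B = (Bits & 2) != 0;
    const bool C = (Bits & 4) != 0;
    // select (a | c), a, b -> select a, true, (select ~c, b, false)
    if (sel(A || C, A, B) != sel(A, true, sel(!C, B, false)))
      return false;
    // select (c & ~b), a, b -> select b, true, (select c, a, false)
    if (sel(C && !B, A, B) != sel(B, true, sel(C, A, false)))
      return false;
  }
  return true;
}
```

A passing truth table says nothing about whether `c` is free to invert or about poison propagation; those conditions still have to be established separately.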
Noah Goldstein
2acf00bd0a Revert "Recommit "[InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops" (2nd Try)"
Still appears to be buggy:
https://lab.llvm.org/buildbot/#/builders/124/builds/8260

This reverts commit 397a9cc4d875712a648271ecbac05ac6382c5708.
2023-08-25 02:22:23 -05:00
Noah Goldstein
397a9cc4d8 Recommit "[InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops" (2nd Try)
Was missing a nullptr check before dereferencing. Fixed + test case
included in the patch.

Re-Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148414
2023-08-24 19:43:10 -05:00
David Spickett
2121e35ac2 Revert "[InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops"
This reverts commit d3402bc4460acefbc3d5278743601fa090784614.

This has caused a second stage build failure on one of our Armv7 32 bit builders:
https://lab.llvm.org/buildbot/#/builders/182/builds/7193
2023-08-17 10:13:47 +00:00
Noah Goldstein
d3402bc446 [InstCombine] Expand foldSelectICmpAndOr -> foldSelectICmpAndBinOp to work for more binops
This just expands on the existing logic that worked for `Or` and
applies it to any binop where `0` is the identity value on the RHS
i.e: `add`, `or`, `xor`, `shl`, etc...

Proofs For Some: https://alive2.llvm.org/ce/z/XZo6JD

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148414
2023-08-16 22:43:05 -05:00
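The idea behind the commit above, that a select on a bit test can become a binop when `0` is the identity on the RHS, can be illustrated on 8-bit values. This is a simplified sketch under assumptions that are not stated in the commit message: the tested mask and the binop constant are both `4` for `add`/`or`/`xor`, the `shl` case uses a shift amount of `1` obtained by shifting the tested bit down, and `selectBitTestMatchesBinOp` is a hypothetical name. The real fold handles more general constants.

```cpp
#include <cstdint>

// Exhaustively checks, for 8-bit X and Y, that
//   select ((X & Mask) == 0), Y, (Y op C)
// equals Y op (bit of X moved to the position of C).
bool selectBitTestMatchesBinOp() {
  const unsigned Mask = 4; // single-bit mask; also used as C for add/or/xor
  auto Trunc = [](unsigned V) { return static_cast<std::uint8_t>(V); };
  for (unsigned X = 0; X < 256; ++X) {
    for (unsigned Y = 0; Y < 256; ++Y) {
      const bool BitClear = (X & Mask) == 0;
      const unsigned Bit = X & Mask; // either 0 (the identity) or Mask
      if (Trunc(BitClear ? Y : Y + Mask) != Trunc(Y + Bit))
        return false;
      if (Trunc(BitClear ? Y : (Y | Mask)) != Trunc(Y | Bit))
        return false;
      if (Trunc(BitClear ? Y : (Y ^ Mask)) != Trunc(Y ^ Bit))
        return false;
      // shl by 1: move the tested bit down to bit 0 to get 0 or 1.
      if (Trunc(BitClear ? Y : (Y << 1)) != Trunc(Y << (Bit >> 2)))
        return false;
    }
  }
  return true;
}
```

The check works because the masked value is `0` exactly when the select would pick the untouched `Y`, and applying the binop with `0` on the RHS leaves `Y` unchanged.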
Noah Goldstein
00f0381461 [InstCombine] Refactor foldSelectICmpAndOr to use decomposeBitTestICmp instead of bespoke logic
This is essentially NFC, as the cases `decomposeBitTestICmp` covers
that weren't already covered explicitly will be canonicalized into
the cases that are explicitly covered. The unsigned cases also don't
apply, as the `Mask` is not a power of 2.

That being said, using a well established helper is less bug prone and
if some canonicalization changes, will prevent regressions here.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148744
2023-08-16 22:43:05 -05:00
Noah Goldstein
82292d1ae5 [InstCombine] Remove requirement on trunc in slt/sgt case in foldSelectICmpAndOr
AFAICT, the trunc is not needed for correctness/performance and just
blocks what should be handlable cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148413
2023-08-16 22:43:05 -05:00
Noah Goldstein
2c606dc16f [InstCombine] Cleanup code in foldSelectICmpAndOr; NFC
There was a lot of boolean logic to propagate conditions that seems
clearer as explicit conditions.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148412
2023-08-16 22:43:05 -05:00
Antonio Frighetto
211692137a [InstCombine] Fold select of srem and conditional add
Simplify a pattern that may show up when computing
the remainder of euclidean division. Particularly,
when the divisor is a power of two and never negative,
the signed remainder can be folded with a bitwise and.

Fixes 64305.

Proofs: https://alive2.llvm.org/ce/z/9_KG6c

Differential Revision: https://reviews.llvm.org/D156811
2023-08-08 00:02:16 +00:00
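The arithmetic behind the srem commit above can be checked exhaustively on small integers. The sketch assumes the pattern is "take the truncating remainder by `2^n`, then add `2^n` if the remainder is negative", which is one common way to compute a Euclidean remainder; the exact IR shape matched by the fold is not spelled out in the commit message. The name `sremSelectMatchesMask` is hypothetical.

```cpp
// Checks over the 8-bit signed range that
//   r = X srem 2^n;  (r < 0) ? r + 2^n : r
// equals X & (2^n - 1) for every positive power-of-two divisor that fits.
bool sremSelectMatchesMask() {
  for (int N = 0; N < 7; ++N) {
    const int Pow2 = 1 << N;
    for (int X = -128; X <= 127; ++X) {
      const int Rem = X % Pow2; // C++ % truncates toward zero, like srem
      const int Selected = (Rem < 0) ? Rem + Pow2 : Rem;
      if (Selected != (X & (Pow2 - 1)))
        return false;
    }
  }
  return true;
}
```

For example, with `X = -3` and a divisor of `4`, the remainder is `-3`, the conditional add gives `1`, and `-3 & 3` is also `1` in two's complement.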
Paweł Bylica
966318005a [InstCombine] Preserve metadata when combining select+binop
Fixes https://github.com/llvm/llvm-project/issues/63910

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D155461
2023-07-19 15:20:49 +02:00
Nikita Popov
cd1dcd2c95 [InstCombine] Handle const select arm in foldSelectCtlzToCttz()
The select arm that takes the ctlz result can also instead be a
constant with the bit width (as this is what the ctlz evaluates to
for a==0).

This avoids a regression when strengthening the
simplifyWithOpReplaced() fold.

Proof: https://alive2.llvm.org/ce/z/DMRL5A
2023-07-14 12:00:39 +02:00
Matt Arsenault
0f4eb557e8 ValueTracking: Replace CannotBeNegativeZero
This is now just a wrapper around computeKnownFPClass.
2023-07-12 13:14:05 -04:00
Peixin Qiao
ab73bd3897 [InstCombine] Enhance select icmp and folding
This folds (a << k) ? 2^k * a : 0 to 2^k * a.

https://alive2.llvm.org/ce/z/_dDRjo

Fix #62155.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148420
2023-07-12 22:39:45 +08:00
Nikita Popov
336d7281ad [InstCombine] Preserve inbounds when folding select of GEP
The `select base, (gep base, offset)` to `gep base, select (0, offset)`
fold used to drop inbounds, because the `gep base, 0` this introduces
might not be inbounds. After the semantics change in D154051, such
a GEP is always considered inbounds, which allows us to preserve
the flag here.

As the PhaseOrdering test demonstrates, this can result in major
optimization improvements in some cases.

Differential Revision: https://reviews.llvm.org/D154055
2023-07-07 09:56:33 +02:00
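The address computation in the select-of-GEP fold above has a direct analogue in pointer arithmetic. The sketch below shows only that the two forms compute the same address; it does not model the `inbounds` flag, which is the actual subject of the commit. All function names are hypothetical.

```cpp
#include <cstddef>

// select Cond, Base, (gep Base, Offset)
const int *selectOfGep(bool Cond, const int *Base, std::ptrdiff_t Offset) {
  return Cond ? Base : Base + Offset;
}

// gep Base, (select Cond, 0, Offset)
const int *gepOfSelect(bool Cond, const int *Base, std::ptrdiff_t Offset) {
  return Base + (Cond ? 0 : Offset);
}

// Compares both forms for every in-range offset of a small array.
bool gepSelectFormsAgree() {
  const int Data[4] = {10, 20, 30, 40};
  for (std::ptrdiff_t Offset = 0; Offset < 4; ++Offset)
    for (int Cond = 0; Cond < 2; ++Cond)
      if (selectOfGep(Cond != 0, Data, Offset) !=
          gepOfSelect(Cond != 0, Data, Offset))
        return false;
  return true;
}
```

The second form introduces an add of zero on the true path, which corresponds to the `gep base, 0` that the commit message discusses.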
Matt Arsenault
17eaa55e9f InstCombine: Fold select of ldexp to ldexp of select
The select-of-different-exp pattern appears in the device
libraries. I haven't seen the select-of-values case.
2023-06-22 14:22:01 -04:00
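For the select-of-different-exp case mentioned in the ldexp commit above, the rewrite can be sketched with `std::ldexp` from `<cmath>`. This only illustrates the scalar arithmetic; the function names are hypothetical, and the sketch does not cover the select-of-values case or any fast-math considerations.

```cpp
#include <cmath>

// select Cond, ldexp(X, ExpA), ldexp(X, ExpB)
double selectOfLdexp(bool Cond, double X, int ExpA, int ExpB) {
  return Cond ? std::ldexp(X, ExpA) : std::ldexp(X, ExpB);
}

// ldexp(X, select Cond, ExpA, ExpB)
double ldexpOfSelect(bool Cond, double X, int ExpA, int ExpB) {
  return std::ldexp(X, Cond ? ExpA : ExpB);
}
```

With `X = 1.5`, `ExpA = 3` and `ExpB = -2`, both forms return `12.0` when the condition is true and `0.375` when it is false, and the second form needs only one `ldexp` call.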
Nikita Popov
8378f1f4cd [InstCombine] Remove adjustMinMax() fold (PR62088)
This fold is buggy if the constant adjustment overflows.
Additionally, since we now canonicalize to min/max intrinsics,
the constants picked here don't actually matter, as long as SPF
still recognizes the pattern.

Fixes https://github.com/llvm/llvm-project/issues/62088.
2023-05-30 16:06:38 +02:00
Florian Hahn
cd2fc73b49 Revert "[ValueTracking][InstCombine] Add a new API to allow to ignore poison generating flags or metadatas when implying poison"
This reverts commit 754f3ae65518331b7175d7a9b4a124523ebe6eac.

Unfortunately the change can cause regressions due to dropping flags
from instructions (like nuw, nsw, inbounds), preventing further
optimizations that depend on those flags.

A simple example is the IR below, where `inbounds` is dropped with the
patch and the phase-ordering test added in 7c91d82ab912fae8b.

    define i1 @test(ptr %base, i64 noundef %len, ptr %p2) {
    bb:
      %gep = getelementptr inbounds i32, ptr %base, i64 %len
      %c.1 = icmp uge ptr %p2, %base
      %c.2 = icmp ult ptr %p2, %gep
      %select = select i1 %c.1, i1 %c.2, i1 false
      ret i1 %select
    }

For more discussion, see D149404.
2023-05-29 15:44:37 +01:00
Nikita Popov
2938f9b46f [InstCombine] Fix worklist management in select value equiv fold (NFCI)
Requeue the modified instruction.

This should be NFC apart from worklist order effects.
2023-05-23 16:37:56 +02:00
luxufan
754f3ae655 [ValueTracking][InstCombine] Add a new API to allow to ignore poison generating flags or metadatas when implying poison
This patch adds a new API `impliesPoisonIgnoreFlagsOrMetadatas`, which is
the same as `impliesPoison` but ignores poison-generating flags and
metadata in the process of implying poison, and records the instructions
on which they were ignored.

In InstCombineSelect, replace `impliesPoison` with
`impliesPoisonIgnoreFlagsOrMetadatas` to allow more patterns like
`select i1 %a, i1 %b, i1 false` to be optimized to and/or instructions
by dropping the poison-generating flags or metadata.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D149404
2023-05-19 14:50:32 +08:00
Nuno Lopes
8a1373d308 Revert "[InstCombine] Generate better code for std::bit_floor from libstdc++"
This reverts commit d775fc390d3c78cc81872e276c4b1314f19af577.

The patch is wrong wrt undef and the author didn't fix it after 2 weeks.
2023-04-30 09:56:34 +01:00
ManuelJBrito
d22edb9794 [IR][NFC] Change UndefMaskElem to PoisonMaskElem
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.

Differential Revision: https://reviews.llvm.org/D149256
2023-04-27 18:01:54 +01:00
Kazu Hirata
d775fc390d [InstCombine] Generate better code for std::bit_floor from libstdc++
Without this patch, std::bit_floor<uint32_t> in libstdc++ is compiled
as:

  %eq0 = icmp eq i32 %x, 0
  %lshr = lshr i32 %x, 1
  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %lshr, i1 false)
  %sub = sub i32 32, %ctlz
  %shl = shl i32 1, %sub
  %sel = select i1 %eq0, i32 0, i32 %shl

With this patch:

  %eq0 = icmp eq i32 %x, 0
  %ctlz = call i32 @llvm.ctlz.i32(i32 %x, i1 false)
  %lshr = lshr i32 -2147483648, %ctlz
  %sel = select i1 %eq0, i32 0, i32 %lshr

This patch recognizes the specific pattern emitted for std::bit_floor
in libstdc++.

https://alive2.llvm.org/ce/z/piMdFX

This patch fixes:

https://github.com/llvm/llvm-project/issues/61183

Differential Revision: https://reviews.llvm.org/D145890
2023-04-15 11:32:33 -07:00
Kazu Hirata
231fa27435 [InstCombine] Generate better code for std::bit_ceil
Without this patch, std::bit_ceil<uint32_t> is compiled as:

  %dec = add i32 %x, -1
  %lz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
  %sub = sub i32 32, %lz
  %res = shl i32 1, %sub
  %ugt = icmp ugt i32 %x, 1
  %sel = select i1 %ugt, i32 %res, i32 1

With this patch, we generate:

  %dec = add i32 %x, -1
  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
  %sub = sub nsw i32 0, %ctlz
  %and = and i32 %sub, 31
  %sel = shl nuw i32 1, %and
  ret i32 %sel

https://alive2.llvm.org/ce/z/pwezvF

This patch recognizes the specific pattern from std::bit_ceil in
libc++ and libstdc++ and drops the conditional move.  In addition to
the LLVM IR generated for std::bit_ceil(X), this patch recognizes
variants like:

  std::bit_ceil(X - 1)
  std::bit_ceil(X + 1)
  std::bit_ceil(X + 2)
  std::bit_ceil(-X)
  std::bit_ceil(~X)

This patch fixes:

https://github.com/llvm/llvm-project/issues/60802

Differential Revision: https://reviews.llvm.org/D145299
2023-03-23 19:26:43 -07:00
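The before/after IR in the `std::bit_ceil` commit above translates into two small functions. This is a sketch for 32-bit values only: `ctlz32` is a hypothetical portable stand-in for `@llvm.ctlz.i32` with the zero-is-poison flag set to false, and the comparison is restricted to inputs where the original shift amount stays below 32, because a shift by 32 is undefined in C++ (and poison in the IR).

```cpp
#include <cstdint>

// Counts leading zero bits; returns 32 for an input of zero.
unsigned ctlz32(std::uint32_t V) {
  unsigned Count = 0;
  for (std::uint32_t Bit = 0x80000000u; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++Count;
  return Count;
}

// Before: select (X u> 1), (1 << (32 - ctlz(X - 1))), 1
// Only call with X <= 0x80000000 so the shift amount stays below 32.
std::uint32_t bitCeilSelect(std::uint32_t X) {
  const unsigned Lz = ctlz32(X - 1);
  return X > 1 ? (std::uint32_t{1} << (32 - Lz)) : std::uint32_t{1};
}

// After: 1 << ((0 - ctlz(X - 1)) & 31), with no select.
std::uint32_t bitCeilMasked(std::uint32_t X) {
  const unsigned Lz = ctlz32(X - 1);
  return std::uint32_t{1} << ((0u - Lz) & 31u);
}

// Compares both forms on a range of small inputs.
bool bitCeilFormsAgree() {
  for (std::uint32_t X = 0; X <= 65536; ++X)
    if (bitCeilSelect(X) != bitCeilMasked(X))
      return false;
  return true;
}
```

The mask with 31 is what makes the select redundant: for `X` equal to 0 or 1 the negated count is a multiple of 32 (0 or 32), so the masked shift amount is 0 and the result is 1, which is what the select arm produced.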
Nikita Popov
fdda602c04 Revert "[InstCombine] Return instruction from replaceUse()"
This reverts commit 27c4e233104ba765cd986b3f8b0dcd3a6c3a9f89.

I think I made a mistake with the use in RemoveConditionFromAssume(),
because the instruction being changed is not the current one, but
the next assume. Revert the change for now.
2023-03-14 17:46:33 +01:00
Nikita Popov
27c4e23310 [InstCombine] Return instruction from replaceUse()
Same as with other replacement methods, it's generally necessary
to report a change on the instruction itself, e.g. by returning
it from the visit method (or possibly explicitly adding it to the
worklist).

Return Instruction * from replaceUse() to encourage the usual
"return replaceXYZ" pattern.
2023-03-14 16:53:03 +01:00
Nikita Popov
271b5cf562 [InstCombine] Fix infinite combine loop (PR61361)
In the degenerate case where the select is fed by an unsimplified
icmp with two constant operands, don't try to replace one constant
with another. Wait for the icmp to be simplified first instead.

Fixes https://github.com/llvm/llvm-project/issues/61361.
2023-03-14 16:43:00 +01:00
Sanjay Patel
74a58499b7 [InstCombine] fold signed absolute diff patterns
This overlaps partially with the codegen patch D144789. This needs no-wrap
for correctness, and I'm not sure if there's an unsigned equivalent:
https://alive2.llvm.org/ce/z/ErmQ-9
https://alive2.llvm.org/ce/z/mr-c_A

This is obviously an improvement in IR, and it looks like a codegen win
for all targets and data types that I sampled.

The 'nabs' case is left as a potential follow-up (and seems less likely
to occur in real code).

Differential Revision: https://reviews.llvm.org/D145073
2023-03-06 13:49:48 -05:00
Paul Walker
15915fa10a [InstCombine] Implement "A & (~A | B) --> A & B" like transforms for boolean based selects.
Alive2 links for "A & (~A | B) --> A & B":
https://alive2.llvm.org/ce/z/oKiodu (scalar)
https://alive2.llvm.org/ce/z/8yn8GL (vector)

Alive2 links for "A | (~A & B) --> A | B"
https://alive2.llvm.org/ce/z/v5GEKu (scalar)
https://alive2.llvm.org/ce/z/wvtJsj (vector)

NOTE: The commutative variants of these transforms, for example:
  "(~A | B) & A --> A & B"
are already handled by simplifying the underlying selects to
normal logical operations due to that combination having simpler
poison semantics.

Differential Revision: https://reviews.llvm.org/D145157
2023-03-06 13:53:41 +00:00
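The two absorption identities named in the commit above can be written with boolean selects and checked over all inputs. This sketch models the logical and/or forms as C++ ternaries on two-valued booleans only; it says nothing about poison, which is the reason the select forms are treated separately from normal logical operations. The helper names are hypothetical.

```cpp
// select A, B, false (logical and)
bool logicalAnd(bool A, bool B) { return A ? B : false; }

// select A, true, B (logical or)
bool logicalOr(bool A, bool B) { return A ? true : B; }

// Checks A & (~A | B) == A & B and A | (~A & B) == A | B for all inputs.
bool selectAbsorptionHolds() {
  for (int AI = 0; AI < 2; ++AI) {
    for (int BI = 0; BI < 2; ++BI) {
      const bool A = AI != 0;
      const bool B = BI != 0;
      if (logicalAnd(A, logicalOr(!A, B)) != logicalAnd(A, B))
        return false;
      if (logicalOr(A, logicalAnd(!A, B)) != logicalOr(A, B))
        return false;
    }
  }
  return true;
}
```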
Sanjay Patel
452279efe2 [InstCombine] prevent miscompiles from select-of-div/rem transform
This avoids the danger shown in issue #60906.
There were no regression tests for these patterns, so these potential
failures have been around for a long time.

We freeze the condition and preserve the optimization because
getting rid of a div/rem is always a win.

Here are a couple of examples that can be corrected by freezing the
condition:
https://alive2.llvm.org/ce/z/sXHTTC

Differential Revision: https://reviews.llvm.org/D144671
2023-03-01 08:54:23 -05:00
Sanjay Patel
2ea0e530d3 [InstCombine] simplify test for div/rem; NFC
This is too conservative as noted in the TODO comment.
2023-02-28 14:21:13 -05:00
Sander de Smalen
68b56e3a74 [InstCombine] NFC: Add implied condition to block in foldSelectInstWithICmp
Added the condition 'TrueVal->getType()->isIntOrIntVectorTy' to a block of code
in foldSelectInstWithICmp which is only valid if the TrueVal is integer type.

This change was split off from D136861.
2023-02-23 16:11:00 +00:00
Sanjay Patel
f48f178717 [InstCombine] canonicalize cmp+select as smin/smax
(V == SMIN) ? SMIN+1 : V --> smax(V, SMIN+1)
(V == SMAX) ? SMAX-1 : V --> smin(V, SMAX-1)

https://alive2.llvm.org/ce/z/d5bqjy

Follow-up for the unsigned variants added with:
86b4d8645fc1b866

issue #60374
2023-02-12 07:54:43 -05:00
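The signed canonicalizations in the commit above can be verified exhaustively for an 8-bit type. This is a brute-force sketch, with `signedSelectMatchesMinMax` as a hypothetical name and `std::max`/`std::min` standing in for the `smax`/`smin` intrinsics; the values are held in `int` so that `SMIN+1` and `SMAX-1` cannot overflow.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Checks both signed cmp+select patterns against min/max over all
// 8-bit signed values.
bool signedSelectMatchesMinMax() {
  const int SMin = std::numeric_limits<std::int8_t>::min();
  const int SMax = std::numeric_limits<std::int8_t>::max();
  for (int V = SMin; V <= SMax; ++V) {
    // (V == SMIN) ? SMIN+1 : V --> smax(V, SMIN+1)
    if (((V == SMin) ? SMin + 1 : V) != std::max(V, SMin + 1))
      return false;
    // (V == SMAX) ? SMAX-1 : V --> smin(V, SMAX-1)
    if (((V == SMax) ? SMax - 1 : V) != std::min(V, SMax - 1))
      return false;
  }
  return true;
}
```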
Sanjay Patel
86b4d8645f [InstCombine] canonicalize cmp+select as umin/umax
(V == 0) ? 1 : V --> umax(V, 1)
(V == UMAX) ? UMAX-1 : V --> umin(V, UMAX-1)

https://alive2.llvm.org/ce/z/pfDBAf

This is one pair of the variants discussed in issue #60374.

Enhancements for the other end of the constant range and
signed variants are potential follow-ups, but that may
require more work because we canonicalize at least one
min/max like that to icmp+zext.
2023-02-08 17:25:58 -05:00
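The unsigned pair in the commit above works because the compared constant is the only value that the min/max would clamp. A brute-force sketch over the 8-bit unsigned range, assuming `std::max`/`std::min` as stand-ins for `umax`/`umin` and using the hypothetical name `unsignedSelectMatchesMinMax`:

```cpp
#include <algorithm>

// Checks both unsigned cmp+select patterns against min/max over all
// 8-bit unsigned values.
bool unsignedSelectMatchesMinMax() {
  const unsigned UMax = 0xFF; // UMAX for an 8-bit value
  for (unsigned V = 0; V <= UMax; ++V) {
    // (V == 0) ? 1 : V --> umax(V, 1)
    const unsigned SelLow = (V == 0) ? 1u : V;
    if (SelLow != std::max(V, 1u))
      return false;
    // (V == UMAX) ? UMAX-1 : V --> umin(V, UMAX-1)
    const unsigned SelHigh = (V == UMax) ? UMax - 1 : V;
    if (SelHigh != std::min(V, UMax - 1))
      return false;
  }
  return true;
}
```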