Fold select (fcmp oeq x, 0), (fmul x, y), x => x
This cleans up a pattern left behind by denormal range checks under
denormals are zero.
The pattern starts out as something like:
x = x < smallest_normal ? x * K : x;
The comparison folds to an == 0 when the denormal mode treats input
denormals as zero. This makes library denormal checks free after
linked into DAZ enabled code.
alive2 is mostly happy with this, but there are some issues. First,
there are many reported failures in some of the negative tests that
happen to trigger some preexisting canonicalize introducing
combine. Second, alive2 is incorrectly asserting that denormals must
be flushed with the DAZ modes. It's allowed to drop a canonicalize.
https://reviews.llvm.org/D157030
Was missing a nullptr check before derefencing. Fixed + test case
included in the patch.
Re-Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D148414
This just expands on the existing logic that worked for `Or` and
applies it to any binop where `0` is the identity value on the RHS
i.e: `add`, `or`, `xor`, `shl`, etc...
Proofs For Some: https://alive2.llvm.org/ce/z/XZo6JD
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D148414
This is essentially NFC as the cases `decomposeBitTestICmp` covers
that weren't already covered explicitly, will be canonicalized into
the cases explicitly covered. As well the unsigned cases don't apply
as the `Mask` is not a power of 2.
That being said, using a well established helper is less bug prone and
if some canonicalization changes, will prevent regressions here.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D148744
AFAICT, the trunc is not needed for correctness/performance and just
blocks what should be handlable cases.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D148413
There was just alot of boolean logic to propegate conditions that seem
clearer in conditions.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D148412
Simplify a pattern that may show up when computing
the remainder of euclidean division. Particularly,
when the divisor is a power of two and never negative,
the signed remainder can be folded with a bitwise and.
Fixes 64305.
Proofs: https://alive2.llvm.org/ce/z/9_KG6c
Differential Revision: https://reviews.llvm.org/D156811
The select arm that takes the ctlz result can also instead be a
constant with the bit width (as this is what the ctlz evaluates to
for a==0).
This avoids a regression when strengthening the
simplifyWithOpReplaced() fold.
Proof: https://alive2.llvm.org/ce/z/DMRL5A
The select base, (gep base, offset) to gep base, select (0, offset)
fold used to drop inbounds, because the gep base, 0 this introduces
might not be inbounds. After the semantics change in D154051, such
a GEP is always considered inbounds, in which allows us to preserve
the flag here.
As the PhaseOrdering test demonstrates, this can result in major
optimization improvements in some cases.
Differential Revision: https://reviews.llvm.org/D154055
This fold is buggy if the constant adjustment overflows.
Additionally, since we now canonicalize to min/max intrinsics,
the constants picked here don't actually matter, as long as SPF
still recognizes the pattern.
Fixes https://github.com/llvm/llvm-project/issues/62088.
This reverts commit 754f3ae65518331b7175d7a9b4a124523ebe6eac.
Unfortunately the change can cause regressions due to dropping flags
from instructions (like nuw,nsw,inbounds), prevent further optimizations
depending on those flags.
A simple example is the IR below, where `inbounds` is dropped with the
patch and the phase-ordering test added in 7c91d82ab912fae8b.
define i1 @test(ptr %base, i64 noundef %len, ptr %p2) {
bb:
%gep = getelementptr inbounds i32, ptr %base, i64 %len
%c.1 = icmp uge ptr %p2, %base
%c.2 = icmp ult ptr %p2, %gep
%select = select i1 %c.1, i1 %c.2, i1 false
ret i1 %select
}
For more discussion, see D149404.
This patch add a new API `impliesPoisonIgnoreFlagsOrMetadatas` which is
the same as `impliesPoison` but ignoring poison generating flags or
metadatas in the process of implying poison and recording these ignored
instructions.
In InstCombineSelect, replacing `impliesPoison` with
`impliesPoisonIgnoreFlagsOrMetadatas` to allow more patterns like
`select i1 %a, i1 %b, i1 false` to be optimized to and/or instructions
by droping the poison generating flags or metadatas.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D149404
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.
Differential Revision: https://reviews.llvm.org/D149256
This reverts commit 27c4e233104ba765cd986b3f8b0dcd3a6c3a9f89.
I think I made a mistake with the use in RemoveConditionFromAssume(),
because the instruction being changed is not the current one, but
the next assume. Revert the change for now.
Same as with other replacement methods, it's generally necessary
to report a change on the instruction itself, e.g. by returning
it from the visit method (or possibly explicitly adding it to the
worklist).
Return Instruction * from replaceUse() to encourage the usual
"return replaceXYZ" pattern.
In the degenerate case where the select is fed by an unsimplified
icmp with two constant operands, don't try to replace one constant
with another. Wait for the icmp to be simplified first instead.
Fixes https://github.com/llvm/llvm-project/issues/61361.
This overlaps partially with the codegen patch D144789. This needs no-wrap
for correctness, and I'm not sure if there's an unsigned equivalent:
https://alive2.llvm.org/ce/z/ErmQ-9https://alive2.llvm.org/ce/z/mr-c_A
This is obviously an improvement in IR, and it looks like a codegen win
for all targets and data types that I sampled.
The 'nabs' case is left as a potential follow-up (and seems less likely
to occur in real code).
Differential Revision: https://reviews.llvm.org/D145073
This avoids the danger shown in issue #60906.
There were no regression tests for these patterns, so these potential
failures have been around for a long time.
We freeze the condition and preserve the optimization because
getting rid of a div/rem is always a win.
Here are a couple of examples that can be corrected by freezing the
condition:
https://alive2.llvm.org/ce/z/sXHTTC
Differential Revision: https://reviews.llvm.org/D144671
Added the condition 'TrueVal->getType()->isIntOrIntVectorTy' to a block of code
in foldSelectInstWithICmp which is only valid if the TrueVal is integer type.
This change was split off from D136861.
(V == 0) ? 1 : V --> umax(V, 1)
(V == UMAX) ? UMAX-1 : V --> umin(V, UMAX-1)
https://alive2.llvm.org/ce/z/pfDBAf
This is one pair of the variants discussed in issue #60374.
Enhancements for the other end of the constant range and
signed variants are potential follow-ups, but that may
require more work because we canonicalize at least one
min/max like that to icmp+zext.
As discussed in issue #59279, we want fneg/fabs to conform to the
IEEE-754 spec for signbit operations - quoting from section 5.5.1
of IEEE-754-2008:
"negate(x) copies a floating-point operand x to a destination in
the same format, reversing the sign bit"
"abs(x) copies a floating-point operand x to a destination in the
same format, setting the sign bit to 0 (positive)"
"The operations treat floating-point numbers and NaNs alike."
So we gate this transform with "nnan" in addition to "nsz":
(X > 0.0) ? X : -X --> fabs(X)
Without that restriction, we could have for example:
(+NaN > 0.0) ? +NaN : -NaN --> -NaN
(because an ordered compare with NaN is always false)
That would be different than fabs(+NaN) --> +NaN.
More fabs/fneg patterns demonstrated here:
https://godbolt.org/z/h8ecc659d
(without any FMF, these are correct independently of this patch -
no fabs should be created)
The code change is a one-liner, but we have lots of tests diffs
because there are many variations of the basic pattern.
Differential Revision: https://reviews.llvm.org/D139785
This mirrors a similar shufflevector transformation so the same
effect is obtained for scalable vectors. The transformation is
only performed when it can be proven the number of resulting
reversals is not increased. By bubbling the reversals from operand
to result this should typically be the case and ideally leads to
back-back shuffles that can be elimitated entirely.
Differential Revision: https://reviews.llvm.org/D139339
In the X == C ? f(X) : Y -> X == C ? f(C) : Y fold, perform the
replacement in f(X) recursively. For now, this just goes two
instructions up rather than one instruction up.
One of these two changes is exposing (or causing) some more miscompiles.
A reproducer is in progress, so reverting until resolved.
This reverts commit 9ddff66d0c9c3e18d56e6b20aa26a2a8cdfb6d2b.
Matches what we do for binary operations, but a special care needs
is needed to preserve operand order, as the logical operations
are not strictly commutative!