foldFDivPowDivisor can rewrite A / powi(x, y) as A * powi(x, -y).
However, for a small constant y, for example y = 2, InstCombine
transforms powi(x, 2) into fmul x, x, so the rewrite is not optimal for
A / powi(x, 2).
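As a rough numeric illustration (plain Python, not LLVM code), the three forms can be compared side by side. The `powi` helper and the function names below are assumptions made for this sketch; they only model the intrinsic by repeated multiplication.

```python
def powi(x: float, n: int) -> float:
    """Hypothetical stand-in for llvm.powi: x raised to an integer power."""
    result = 1.0
    for _ in range(abs(n)):
        result *= x
    # A negative exponent is modelled as the reciprocal of the positive power.
    return 1.0 / result if n < 0 else result


def div_by_powi(a: float, x: float, n: int) -> float:
    """Original form: A / powi(x, n)."""
    return a / powi(x, n)


def mul_by_negated_powi(a: float, x: float, n: int) -> float:
    """Form produced by the generic rewrite: A * powi(x, -n)."""
    return a * powi(x, -n)


def div_by_square(a: float, x: float) -> float:
    """Form preferred for n == 2: A / (x * x), one fmul and one fdiv."""
    return a / (x * x)
```

For exactly representable inputs such as `a = 8.0`, `x = 2.0` all three forms agree; the point of the patch is that the last one needs no powi call at all.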
Fixes https://github.com/llvm/llvm-project/issues/77171
Since PR86428, foldPowiReassoc is called for both FMul and FDiv.
Because the FDiv optimization is placed after the FMul ones, the code is
currently correct even without an explicit FDiv check for powi(X, Y) / X.
However, we may add more matching scenarios later, so checking the opcode
explicitly makes the code easier to understand.
Now that we don't accept undef splat in PatternMatch, we can remove some
uses of replaceUndefsWith(). I believe in all these cases only poison
splats are possible now, in which case no replacement is necessary.
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.
This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.
As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.
There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
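The difference between accepting poison and accepting undef in a splat can be modelled with a small sketch. This is plain Python, not LLVM's implementation: `POISON`, `UNDEF`, and `get_splat_value` are hypothetical names that only mirror the `AllowPoison` behaviour described above.

```python
POISON = object()  # hypothetical sentinel standing in for a poison element
UNDEF = object()   # hypothetical sentinel standing in for an undef element


def get_splat_value(elements, allow_poison=False):
    """Return the value shared by all elements, or None if there is none.

    With allow_poison=True, poison elements are skipped. Undef elements are
    never skipped, so mixing undef with a concrete value is not a splat.
    Simplification of this sketch: if every element was skipped, None is
    returned.
    """
    splat = None
    found = False
    for element in elements:
        if allow_poison and element is POISON:
            continue  # non-demanded lane, ignore it
        if not found:
            splat, found = element, True
        elif element != splat:
            return None  # two different lanes: not a splat
    return splat if found else None
```

In this model `[7, POISON, 7]` is a splat of 7 only when `allow_poison=True`, while `[7, UNDEF, 7]` is rejected in both modes.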
This change updates a few of the transformations in foldFMulReassoc to
respect absent fast-math flags in cases where fmul and fdiv, fadd, or fsub
instructions were being folded but the code was only checking for
fast-math flags on the fmul instruction and was transferring flags to
the folded instruction that were not present on the other original
instructions.
This fixes https://github.com/llvm/llvm-project/issues/82857
Remove the fold working on abs in SPF representation now that we
canonicalize SPF to intrinsics.
This is not strictly NFC because the SPF fold might fire for
non-canonical IR due to multi-use, but given the lack of test coverage,
I assume this is not important.
Try to transform powi(X, Y) * X into powi(X, Y+1) with Ofast.
For example, when Y is 3, the resulting powi(X, 4) is replaced by
X2 = X * X; X2 * X2 in a later step.
Similar to D109954, which requires reassoc.
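The arithmetic behind the fold can be sketched numerically. The code below is plain Python rather than LLVM code, and `powi_by_squaring` and the other names are hypothetical helpers introduced only for this illustration.

```python
def powi_by_squaring(x: float, n: int) -> float:
    """Compute x**n for n >= 0 by repeated squaring (a model of powi)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative exponents")
    result = 1.0
    base = x
    while n:
        if n & 1:
            result *= base  # this bit of the exponent is set
        base *= base        # square the base for the next bit
        n >>= 1
    return result


def powi_times_x(x: float, y: int) -> float:
    """Original form: powi(X, Y) * X."""
    return powi_by_squaring(x, y) * x


def powi_plus_one(x: float, y: int) -> float:
    """Folded form: powi(X, Y + 1)."""
    return powi_by_squaring(x, y + 1)


def fourth_power(x: float) -> float:
    """Expansion of powi(X, 4): X2 = X * X; X2 * X2."""
    x2 = x * x
    return x2 * x2
```

For `x = 3.0` and `y = 3` all three functions return 81.0. For inputs that are not exactly representable, the different multiplication order may round differently, which is why the fold is gated on reassoc.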
Fixes https://github.com/llvm/llvm-project/issues/69862.
This patch fixes the issues introduced in
bb5c3899d1.
I moved the check that the instruction is a div ahead of the check for
the fast-math flags, which resolves the crash in
```
float a, b;
double sqrt();
void c() { b = a / sqrt(a); }
```
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
The full fold is one of the following:
1) `(fp_binop ({s|u}itofp x), ({s|u}itofp y))`
-> `({s|u}itofp (int_binop x, y))`
2) `(fp_binop ({s|u}itofp x), FpC)`
-> `({s|u}itofp (int_binop x, (fpto{s|u}i FpC)))`
And support the following binops:
`fmul` -> `mul`
`fadd` -> `add`
`fsub` -> `sub`
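A small numeric model shows when the two sides of the fold agree. This is plain Python, not the InstCombine code; `wrap_signed`, `fp_form`, `int_form`, and `fold_is_exact` are hypothetical helpers. The sketch assumes the integer operation wraps at a fixed bit width, which suggests (as an assumption, not a statement from this patch) that the fold must rule out integer overflow.

```python
import operator


def wrap_signed(value: int, bits: int) -> int:
    """Wrap an unbounded Python int to a two's-complement integer of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def fp_form(op, x: int, y: int) -> float:
    """(fp_binop (sitofp x), (sitofp y))"""
    return op(float(x), float(y))


def int_form(op, x: int, y: int, bits: int) -> float:
    """(sitofp (int_binop x, y)), with the integer result wrapped to `bits` bits."""
    return float(wrap_signed(op(x, y), bits))


def fold_is_exact(op, x: int, y: int, bits: int) -> bool:
    """True when both forms produce the same floating-point value."""
    return fp_form(op, x, y) == int_form(op, x, y, bits)
```

With 8-bit integers, `fold_is_exact(operator.mul, 3, 4, 8)` holds, while `fold_is_exact(operator.add, 100, 100, 8)` does not, because `100 + 100` wraps to -56 in i8. The constant form works the same way: `int_form(operator.mul, 3, int(2.0), 8)` models `(sitofp (mul x, (fptosi 2.0)))`.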
Proofs: https://alive2.llvm.org/ce/z/zuacA8
The proofs time out, so they must be reproduced locally.
Closes #82555
If there are two undef operands, the select would get folded away
entirely. One undef operand can occur if the other two operands
do not satisfy the poison implication check. However, I don't think
that handling this edge case is worthwhile in this fold. If we
wanted to handle this, it would be more natural to do so in the
simplifyValueKnownNonZero() fold (as this is actually the property
we would be exploiting -- this doesn't really have any relation
to taking the log2).
This patch cleans up the duplicate code for folding commutative binops
over `select/phi/minmax`.
Related commits:
+ select support:
88cc35b27e
+ phi support:
8674a023bc
+ minmax support:
624973806c
Try to transform powi(X, Y) / X into powi(X, Y-1) with Ofast.
For example, when Y is 3, the resulting powi(X, 2) is replaced by X * X
in a later step.
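The fold is not value-preserving for every input, which is one way to see why it needs fast-math permission. The sketch below is plain Python, not LLVM code; `powi`, `original`, and `folded` are hypothetical names, and `powi` is modelled as naive repeated multiplication.

```python
import math


def powi(x: float, n: int) -> float:
    """Hypothetical model of powi for n >= 0: repeated multiplication."""
    result = 1.0
    for _ in range(n):
        result *= x
    return result


def original(x: float, y: int) -> float:
    """Original form: powi(X, Y) / X."""
    return powi(x, y) / x


def folded(x: float, y: int) -> float:
    """Folded form: powi(X, Y - 1)."""
    return powi(x, y - 1)
```

For `x = 3.0`, `y = 3` both forms give 9.0. For `x = 1e154`, `y = 3` the cube overflows to infinity in the original form, while the folded form `x * x` stays finite.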
Fixes https://github.com/llvm/llvm-project/pull/67216
Reviewed By: dtcxzyw, nikic, jcranmer-intel
When pushing a sub nsw 0, %x negation into an expression, try to
preserve the nsw flag for the cases where this is possible. Do this
by passing the flag through recursive Negator::negate() calls.
Proofs: https://alive2.llvm.org/ce/z/oRPNcY
Differential Revision: https://reviews.llvm.org/D158510
This is just filling in a missing case from D144225.
We treat `(shl Y, X)` and `(shl Z, X)` as `(mul Y, 1 << X)` and `(mul
Z, 1 << X)` respectively, then reuse the same transformations that
already exist.
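The equivalence this relies on can be checked with a short model of fixed-width integers. This is plain Python with hypothetical helper names (`shl`, `mul`, `shl_as_mul`), and it assumes wrapping arithmetic with no `nsw`/`nuw` flags and an in-range shift amount.

```python
def shl(value: int, amount: int, bits: int = 32) -> int:
    """Model of `shl value, amount` on a `bits`-wide integer (in-range shifts only)."""
    if not 0 <= amount < bits:
        raise ValueError("out-of-range shift amounts are not modelled here")
    return (value << amount) & ((1 << bits) - 1)


def mul(lhs: int, rhs: int, bits: int = 32) -> int:
    """Model of a wrapping `mul lhs, rhs` on a `bits`-wide integer."""
    return (lhs * rhs) & ((1 << bits) - 1)


def shl_as_mul(value: int, amount: int, bits: int = 32) -> int:
    """Rewrite `shl value, amount` as `mul value, (1 << amount)`."""
    return mul(value, shl(1, amount, bits), bits)
```

For every in-range shift amount the two forms agree, including when high bits are shifted out, e.g. `shl(0xFFFFFFFF, 4) == shl_as_mul(0xFFFFFFFF, 4)`.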
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D147108
Forked from D142901 to deduce more `nsw`/`nuw` flags for the output
`shl`.
We can handle the following cases + some `nsw`/`nuw` flags:
The rationale for doing this all in `InstCombine` rather than handling
the constant `shl` cases in `InstSimplify` is we often create a new
instruction because we are able to deduce more `nsw`/`nuw` flags than
the original instruction had.
Differential Revision: https://reviews.llvm.org/D144225
Using the more robust log2 search allows us to fold more cases (same
logic as exists for idiv/irem).
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D146347
Unfortunately, alive2 cannot prove correctness because it times out, even for
the half type.
However, the transform should be correct. If neither a nor b is NaN, maximum and
minimum simply return the two values a and b, and since a + b == b + a the sum is
unchanged.
If a or b is NaN, then maximum and minimum both return NaN, and NaN + NaN is NaN.
a + b is also NaN.
In terms of preserving fast-math flags, we cannot preserve ninf because
minimum(NaN, Infinity) == maximum(NaN, Infinity) == NaN,
minimum(NaN, Infinity) +ninf maximum(NaN, Infinity) == NaN +ninf NaN == NaN.
However, the transformation would change
minimum(NaN, Infinity) + maximum(NaN, Infinity) to NaN +ninf Infinity == poison.
But if the fadd is also marked nnan, we can preserve ninf, because
NaN +ninf/nnan NaN == poison as well.
The same optimization is added for
maximum(a,b) * minimum(a,b) => a * b.
Everything said above for fadd also holds for fmul.
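Since alive2 times out, a few spot checks can at least illustrate the argument. The sketch below is plain Python on doubles, not LLVM code; `maximum` and `minimum` are hypothetical helpers written to propagate NaN in the way the text above assumes (unlike Python's built-in `max`/`min`).

```python
import math


def maximum(a: float, b: float) -> float:
    """NaN-propagating maximum; treats +0.0 as greater than -0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b:  # also covers the signed-zero case
        return a if math.copysign(1.0, a) > 0 else b
    return a if a > b else b


def minimum(a: float, b: float) -> float:
    """NaN-propagating minimum; treats -0.0 as less than +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b:  # also covers the signed-zero case
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b


def fadd_minmax(a: float, b: float) -> float:
    """Original form: maximum(a, b) + minimum(a, b)."""
    return maximum(a, b) + minimum(a, b)


def fmul_minmax(a: float, b: float) -> float:
    """Original form: maximum(a, b) * minimum(a, b)."""
    return maximum(a, b) * minimum(a, b)
```

For non-NaN inputs, `fadd_minmax(a, b) == a + b` and `fmul_minmax(a, b) == a * b`. For `a = NaN`, `b = Infinity`, both the original and the folded expressions are NaN when no fast-math flags are involved, which matches the reasoning above.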
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D147299
Revert commit due to failure on buildbot:
error: 'match_combine_or' may not intend to support class template argument deduction
This reverts commit b86a06ef284f2637bef89bf5bb20157a8b195568.