Try to transform the powi(X, Y) * X into powi(X, Y+1) with Ofast
For this case, when the Y is 3, then powi(X, 4) is replaced by
X2 = X * X; X2 * X2 in the further step.
Similar to D109954, who requires reassoc.
Fixes https://github.com/llvm/llvm-project/issues/69862.
This patch fixes the issues introduced in
bb5c3899d1.
I moved the check for the instruction to be div before I check for the
fast math flags which resolves the crash in
```
float a, b;
double sqrt();
void c() { b = a / sqrt(a); }
```
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
The full fold is one of the following:
1) `(fp_binop ({s|u}itofp x), ({s|u}itofp y))`
-> `({s|u}itofp (int_binop x, y))`
2) `(fp_binop ({s|u}itofp x), FpC)`
-> `({s|u}itofp (int_binop x, (fpto{s|u}i FpC)))`
And support the following binops:
`fmul` -> `mul`
`fadd` -> `add`
`fsub` -> `sub`
Proofs: https://alive2.llvm.org/ce/z/zuacA8
The proofs timeout, so they must be reproduced locally.
Closes#82555
If there are two undef operands, the select would get folded away
entirely. One undef operand can occur if the other two operands
do not satisfy the poison implication check. However, I don't think
that handling this edge case is worthwhile in this fold. If we
wanted to handle this, it would be more natural to do so in the
simplifyValueKnownNonZero() fold (as this is actually the property
we would be exploiting -- this doesn't really have any relation
to taking the log2).
This patch cleans up the duplicate code for folding commutative binops
over `select/phi/minmax`.
Related commits:
+ select support:
88cc35b27e
+ phi support:
8674a023bc
+ minmax support:
624973806c
Try to transform the powi(X, Y) / X into powi(X, Y-1) with Ofast.
For this case, when the Y is 3, then powi(X, 2) is replaced by X * X in
the further step.
Fixes https://github.com/llvm/llvm-project/pull/67216
Reviewed By: dtcxzyw, nikic, jcranmer-intel
When pushing a sub nsw 0, %x negation into an expression, try to
preserve the nsw flag for the cases where this is possible. Do this
by passing the flag through recursive Negator::negate() calls.
Proofs: https://alive2.llvm.org/ce/z/oRPNcY
Differential Revision: https://reviews.llvm.org/D158510
This is just filling in a missing case from D144225.
We treat `(shl Y, X)` and `(shl Z, X)` as `(mul Z, 1 << X)` and `(mul
Y, 1 << X)` then reuse the same transformations that already exist.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D147108
Forked from D142901 to deduce more `nsw`/`nuw` flag for the output
`shl`.
We can handle the following cases + some `nsw`/`nuw` flags:
The rationale for doing this all in `InstCombine` rather than handling
the constant `shl` cases in `InstSimplify` is we often create a new
instruction because we are able to deduce more `nsw`/`nuw` flags than
the original instruction had.
Differential Revision: https://reviews.llvm.org/D144225
Using the more robust log2 search allows us to fold more cases (same
logic as exists for idiv/irem).
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D146347
Unfortunately alive2 cannot prove the correctness due to fails by timeout even for
float type half.
However it should be correct. If a and b are not NaN, maximum and minimum will just
return different values (a and b) and take into account a + b == b + a this is the same.
If a or b is NaN, than maximum and minimum are equal to NaN and NaN + NaN is NaN.
a + b is also a NaN.
In terms of preserving fast flags, we cannot preserve ninf due to
minimum(NaN, Infinity) == maximum(NaN, Infinity) == NaN,
minimum(NaN, Infinity) +ninf maximum(NaN, Infinity) == NaN +ninf NaN = NaN
However transformation will change
minimum(NaN, Infinity) + maximum(NaN, Infinity) to NaN +ninf Infinity == poison.
But if fadd is marked as nnan, we can preserve because NaN +ninf/nnan NaN = poison as well.
The same optimization for
maximum(a,b) * minimum(a,b) => a * b
is added.
All said above for fadd is correct for fmul.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D147299
Revert commit due to failure on buildbot:
error: 'match_combine_or' may not intend to support class template argument deduction
This reverts commit b86a06ef284f2637bef89bf5bb20157a8b195568.
The pair of div folds was just added with:
4966d8ebe1bbe5bd6a4d28
But as noted in the post-commit review, we don't actually need
the no-remainder requirement for an unsigned division (still
need the no-unsigned-wrap though):
https://alive2.llvm.org/ce/z/qHjK3Q