Try to transform powi(X, Y) / X into powi(X, Y-1) under Ofast.
In this case, when Y is 3, the resulting powi(X, 2) is replaced by X * X in
a later step.
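
A minimal sketch of the arithmetic behind this fold. The `powi` helper
below is a hypothetical repeated-multiplication stand-in, not the LLVM
intrinsic, and the function names are only for illustration. The two forms
agree up to rounding, which is presumably why the fold is limited to
fast-math:

```python
import math


def powi(x: float, n: int) -> float:
    """Hypothetical stand-in for an integer power: repeated multiplication."""
    result = 1.0
    for _ in range(abs(n)):
        result *= x
    return result if n >= 0 else 1.0 / result


def div_of_powi(x: float, y: int) -> float:
    """The original form: powi(X, Y) / X."""
    return powi(x, y) / x


def folded(x: float, y: int) -> float:
    """The folded form: powi(X, Y-1)."""
    return powi(x, y - 1)


def agree(x: float, y: int) -> bool:
    """Compare both forms with a small relative tolerance."""
    return math.isclose(div_of_powi(x, y), folded(x, y), rel_tol=1e-12)
```

With Y == 3 the folded form is powi(X, 2), which this sketch computes as
X * X.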
Fixes https://github.com/llvm/llvm-project/pull/67216
Reviewed By: dtcxzyw, nikic, jcranmer-intel
When pushing a sub nsw 0, %x negation into an expression, try to
preserve the nsw flag for the cases where this is possible. Do this
by passing the flag through recursive Negator::negate() calls.
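
As an illustration only, here is a brute-force check of one case where
the flag can be kept: negating `sub nsw A, B` into `sub nsw B, A`. This
case is chosen for the sketch and is an assumption; the message above does
not list which cases the patch handles. All names are hypothetical:

```python
def fits(value: int, bits: int) -> bool:
    """True if value is representable as a signed integer of this width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def negated_sub_keeps_nsw(bits: int = 6) -> bool:
    """Check 0 -nsw (A -nsw B) against B -nsw A for every A and B."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    for a in range(lo, hi):
        for b in range(lo, hi):
            inner = a - b
            if not fits(inner, bits):
                continue  # the inner sub nsw is already poison
            if not fits(-inner, bits):
                continue  # the outer sub nsw 0, %x is already poison
            # The replacement must not overflow and must give the same value.
            if not fits(b - a, bits) or b - a != -inner:
                return False
    return True
```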
Proofs: https://alive2.llvm.org/ce/z/oRPNcY
Differential Revision: https://reviews.llvm.org/D158510
This is just filling in a missing case from D144225.
We treat `(shl Y, X)` and `(shl Z, X)` as `(mul Y, 1 << X)` and `(mul
Z, 1 << X)`, then reuse the same transformations that already exist.
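
The equivalence this relies on can be checked exhaustively for a small
bit width. A minimal sketch with hypothetical helper names, modeling
wrapping arithmetic with a mask:

```python
def shl(value: int, amount: int, bits: int) -> int:
    """Wrapping left shift on an unsigned integer of the given width."""
    return (value << amount) & ((1 << bits) - 1)


def mul(lhs: int, rhs: int, bits: int) -> int:
    """Wrapping multiply on unsigned integers of the given width."""
    return (lhs * rhs) & ((1 << bits) - 1)


def shl_matches_mul(bits: int = 8) -> bool:
    """(shl Y, X) == (mul Y, 1 << X) for every Y and every in-range X."""
    return all(
        shl(y, x, bits) == mul(y, 1 << x, bits)
        for y in range(1 << bits)
        for x in range(bits)
    )
```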
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D147108
Forked from D142901 to deduce more `nsw`/`nuw` flags for the output
`shl`.
We can handle the following cases + some `nsw`/`nuw` flags:
The rationale for doing this all in `InstCombine` rather than handling
the constant `shl` cases in `InstSimplify` is that we often create a new
instruction because we are able to deduce more `nsw`/`nuw` flags than
the original instruction had.
Differential Revision: https://reviews.llvm.org/D144225
Using the more robust log2 search allows us to fold more cases (same
logic as exists for idiv/irem).
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D146347
Unfortunately, alive2 cannot prove correctness because it times out even for
the half type.
However, the fold should be correct. If neither a nor b is NaN, maximum and
minimum return a and b in some order, and since a + b == b + a the result is
the same.
If a or b is NaN, then maximum and minimum both return NaN, and NaN + NaN is NaN.
a + b is also NaN.
In terms of preserving fast-math flags, we cannot preserve ninf because
minimum(NaN, Infinity) == maximum(NaN, Infinity) == NaN, so
minimum(NaN, Infinity) +ninf maximum(NaN, Infinity) == NaN +ninf NaN == NaN
However, the transformation would change
minimum(NaN, Infinity) + maximum(NaN, Infinity) to NaN +ninf Infinity == poison.
But if the fadd is also marked nnan, we can preserve ninf because
NaN +ninf/nnan NaN == poison as well.
The same optimization is added for
maximum(a,b) * minimum(a,b) => a * b
Everything said above for fadd also holds for fmul.
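
Since alive2 times out here, a small sampled check can at least exercise
the argument. This is a sketch, not a proof: `maximum` and `minimum` below
are hand-written NaN-propagating models of the intrinsics, the sample
values are arbitrary, and all NaNs are treated as equal:

```python
import math
import operator
import struct


def maximum(a: float, b: float) -> float:
    """NaN-propagating maximum that orders -0.0 below +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b:  # equal values, including the +0.0 / -0.0 pair
        return a if math.copysign(1.0, a) > 0 else b
    return a if a > b else b


def minimum(a: float, b: float) -> float:
    """NaN-propagating minimum that orders -0.0 below +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b:
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b


def same(x: float, y: float) -> bool:
    """Bitwise equality, except that any NaN matches any NaN."""
    if math.isnan(x) and math.isnan(y):
        return True
    return struct.pack("<d", x) == struct.pack("<d", y)


SAMPLES = [0.0, -0.0, 1.5, -2.25, 1e308, math.inf, -math.inf, math.nan]


def fold_holds(op) -> bool:
    """op(maximum(a, b), minimum(a, b)) matches op(a, b) on all sample pairs."""
    return all(
        same(op(maximum(a, b), minimum(a, b)), op(a, b))
        for a in SAMPLES
        for b in SAMPLES
    )
```

Calling `fold_holds(operator.add)` and `fold_holds(operator.mul)` covers
the fadd and fmul cases without any fast-math flags.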
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D147299
Revert commit due to failure on buildbot:
error: 'match_combine_or' may not intend to support class template argument deduction
This reverts commit b86a06ef284f2637bef89bf5bb20157a8b195568.
The pair of div folds was just added with:
4966d8ebe1bbe5bd6a4d28
But as noted in the post-commit review, we don't actually need
the no-remainder requirement for an unsigned division (still
need the no-unsigned-wrap though):
https://alive2.llvm.org/ce/z/qHjK3Q
This is a corrected version of:
bc886e9b587b
I made a copy-paste error that created an "add" instead of the
intended "sub" on that attempt. The regression tests showed the
bug, but I overlooked that.
As I said in a comment on issue #58717, the bug reports resulting
from the botched patch confirm that the pattern does occur in
many real-world applications, so hopefully eliminating the multiply
results in better code.
I added one more regression test in this version of the patch,
and here's an Alive2 proof to show that exact example:
https://alive2.llvm.org/ce/z/dge7VC
Original commit message:
This is a sibling to:
6064e92b0a84
...but we canonicalize the shl+add to shl+xor,
so the pattern is different than I expected:
https://alive2.llvm.org/ce/z/8CX16e
I have not found any patterns that are safe
to propagate no-wrap, so that is not included
here.
Differential Revision: https://reviews.llvm.org/D137157
This is a sibling to:
6064e92b0a84
...but we canonicalize the shl+add to shl+xor,
so the pattern is different than I expected:
https://alive2.llvm.org/ce/z/8CX16e
I have not found any patterns that are safe
to propagate no-wrap, so that is not included
here.
X * ((1 << Z) + 1) --> (X << Z) + X
https://alive2.llvm.org/ce/z/P-7WK9
It's possible that we could do better with propagating
no-wrap, but this carries over the existing logic and
appears to be correct.
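
A brute-force sketch of the identity for a small bit width, ignoring
no-wrap flags entirely. The helper names are hypothetical and wrapping is
modeled with a mask:

```python
def mul_form(x: int, z: int, bits: int) -> int:
    """X * ((1 << Z) + 1) with wrapping arithmetic."""
    mask = (1 << bits) - 1
    return (x * (((1 << z) + 1) & mask)) & mask


def shift_add_form(x: int, z: int, bits: int) -> int:
    """(X << Z) + X with wrapping arithmetic."""
    mask = (1 << bits) - 1
    return (((x << z) & mask) + x) & mask


def fold_holds(bits: int = 8) -> bool:
    """Both forms agree for every X and every in-range shift amount Z."""
    return all(
        mul_form(x, z, bits) == shift_add_form(x, z, bits)
        for x in range(1 << bits)
        for z in range(bits)
    )
```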
The naming differences on the existing folds are a result
of using getName() to set the final value via Builder.
That makes it easier to transfer no-wrap and avoids the
gymnastics required by the raw create instruction APIs.
If the divisor is a power-of-2 or negative-power-of-2 and the dividend
is known to have at least as many trailing zeros as the divisor, the
division is exact:
https://alive2.llvm.org/ce/z/UGBksM (general proof)
https://alive2.llvm.org/ce/z/D4yPS- (examples based on regression tests)
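
The claim can also be checked by brute force on a small signed type. A
minimal sketch with hypothetical helper names, where "exact" is modeled as
"the remainder is zero" and zero is treated as having `bits` trailing
zeros:

```python
def trailing_zeros(value: int, bits: int) -> int:
    """Number of trailing zero bits; zero counts as having `bits` of them."""
    if value == 0:
        return bits
    return (value & -value).bit_length() - 1


def is_pow2_or_neg_pow2(value: int) -> bool:
    """True for +/- 2**k."""
    magnitude = abs(value)
    return magnitude != 0 and magnitude & (magnitude - 1) == 0


def division_is_exact(bits: int = 8) -> bool:
    """Every qualifying dividend/divisor pair divides with no remainder."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    for divisor in range(lo, hi):
        if not is_pow2_or_neg_pow2(divisor):
            continue
        for dividend in range(lo, hi):
            enough_zeros = trailing_zeros(dividend, bits) >= trailing_zeros(divisor, bits)
            if enough_zeros and dividend % divisor != 0:
                return False
    return True
```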
This isn't the most direct optimization (we could create ashr in these
examples instead of relying on existing folds for exact divides), but
it's possible that there's a more general constraint than just a pow2
divisor, so this might be extended in the future.
This should solve issue #58348.
Differential Revision: https://reviews.llvm.org/D135970
This should be functionally equivalent - both calls are thin
wrappers around computeKnownBits(). We'll probably want to use
known-bits directly in follow-up patches because that could
determine "exact" for example (see issue #58348).
(X << Z) / (Y << Z) --> X / Y
https://alive2.llvm.org/ce/z/CLKzqT
This requires a surprising "nuw" constraint because we have
to guard against immediate UB via signed-div overflow with
-1 divisor.
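
A brute-force sketch of the signed case on a small type. The exact flag
set is an assumption here (nsw on both shifts plus nuw on the divisor
shift); the message above only says that a nuw constraint is needed. Inputs
where the original sdiv is already UB are skipped, and all names are
hypothetical:

```python
def to_signed(value: int, bits: int) -> int:
    """Wrap an integer into the signed range of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sdiv(a: int, b: int) -> int:
    """Signed division that truncates toward zero."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def shl_flags(value: int, amount: int, bits: int):
    """Return (wrapped signed result, nsw holds, nuw holds) for a shl."""
    mask = (1 << bits) - 1
    exact = value << amount
    result = to_signed(exact, bits)
    nsw = result == exact
    nuw = ((value & mask) << amount) <= mask
    return result, nsw, nuw


def sdiv_fold_holds(bits: int = 6) -> bool:
    """(X << Z) / (Y << Z) == X / Y under the assumed flag constraints."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    for z in range(bits):
        for x in range(lo, hi):
            xs, x_nsw, _ = shl_flags(x, z, bits)
            if not x_nsw:
                continue
            for y in range(lo, hi):
                ys, y_nsw, y_nuw = shl_flags(y, z, bits)
                if not (y_nsw and y_nuw):
                    continue
                if ys == 0 or (xs == lo and ys == -1):
                    continue  # the original sdiv is already UB
                if y == 0 or (x == lo and y == -1):
                    return False  # the fold would introduce UB
                if sdiv(xs, ys) != sdiv(x, y):
                    return False
    return True
```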
This extends 008a89037a49ca0d9 and is another transform
derived from issue #58137.