459 Commits

Author SHA1 Message Date
Yingwei Zheng
a77dedcacb
[InstSimplify][InstCombine][ConstantFold] Move vector div/rem by zero fold to InstCombine (#114280)
Previously we fold `div/rem X, C` into `poison` if any element of the
constant divisor `C` is zero or undef. However, it is incorrect when
threading udiv over a vector select:
https://alive2.llvm.org/ce/z/3Ninx5
```
define <2 x i32> @vec_select_udiv_poison(<2 x i1> %x) {
  %sel = select <2 x i1> %x, <2 x i32> <i32 -1, i32 -1>, <2 x i32> <i32 0, i32 1>
  %div = udiv <2 x i32> <i32 42, i32 -7>, %sel
  ret <2 x i32> %div
}
```
In this case, `threadBinOpOverSelect` folds `udiv <i32 42, i32 -7>, <i32
-1, i32 -1>` and `udiv <i32 42, i32 -7>, <i32 0, i32 1>` into
`zeroinitializer` and `poison`, respectively. One solution is to
introduce a new flag indicating that we are threading over a vector
select, but that requires modifying both `InstSimplify` and
`ConstantFold`.

However, this optimization doesn't provide benefits to real-world
programs:

https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/IR/ConstantFold.cpp.html#L908

https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/Analysis/InstructionSimplify.cpp.html#L1107

This patch moves the fold into InstCombine to avoid breaking numerous
existing tests.

Fixes #114191 and #113866 (only poison-safety issue).
2024-11-01 22:56:22 +08:00
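The miscompile described in a77dedcacb can be reproduced with a small lane-by-lane model. This is only an illustrative sketch, not code from the patch: `POISON`, `old_fold` and the other helper names are hypothetical, and a zero divisor lane is modeled as poison purely for simplicity.

```python
POISON = "poison"  # hypothetical sentinel standing in for a poison lane

def udiv_lane(a, b, bits=32):
    """Unsigned division on one lane; a zero divisor is modeled as poison."""
    mask = (1 << bits) - 1
    a, b = a & mask, b & mask
    return POISON if b == 0 else a // b

def udiv_vec(lhs, rhs):
    return [udiv_lane(a, b) for a, b in zip(lhs, rhs)]

def old_fold(lhs, rhs):
    """Old rule: the whole result is poison if any divisor lane is zero."""
    if any(b == 0 for b in rhs):
        return [POISON] * len(lhs)
    return udiv_vec(lhs, rhs)

def select(cond, t, f):
    return [tv if c else fv for c, tv, fv in zip(cond, t, f)]

lhs = [42, -7]
true_arm, false_arm = [-1, -1], [0, 1]
cond = [True, False]  # lane 0 takes the true arm, lane 1 the false arm

# Reference: evaluate the select first, then divide lane by lane.
expected = udiv_vec(lhs, select(cond, true_arm, false_arm))
# Threaded: fold each arm with the old rule, then select between the results.
threaded = select(cond, old_fold(lhs, true_arm), old_fold(lhs, false_arm))
```

With this choice of `cond`, lane 1 never divides by zero, yet the threaded form marks it as poison because the old rule poisoned the whole false arm.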
David Majnemer
5d4a0d54b5 [InstCombine] Teach takeLog2 about right shifts, truncation and bitwise-and
We left some easy opportunities for further simplifications.

log2(trunc(x)) is simply trunc(log2(x)). This is safe if we know that
trunc is NUW because it means that the truncation didn't drop any bits.
It is also safe if the caller is OK with zero as a possible answer.

log2(x >>u y) is simply `log2(x) - y`.

log2(x & y) is a funny one. It comes up when doing something like:
```
unsigned int f(unsigned int x, unsigned int y) {
  unsigned char a = 1u << x;
  return y / a;
}
```

LLVM would canonicalize this to:
```
  %shl = shl nuw i32 1, %x
  %conv1 = and i32 %shl, 255
  %div = udiv i32 %y, %conv1
```

In cases like these, we can ignore the mask entirely.
This is equivalent to `y >> x`.
2024-10-28 05:13:04 +00:00
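The two integer identities mentioned in 5d4a0d54b5 can be spot-checked with plain integer arithmetic. This is a minimal sketch under stated assumptions: `f` mirrors the C function from the message, `folded` and `log2_pow2` are hypothetical helper names, and `x` is kept below 8 so that the masked value is nonzero.

```python
def f(x, y):
    """Mirror of the C example: a = (unsigned char)(1u << x); return y / a."""
    a = (1 << x) & 0xFF   # the 'and i32 %shl, 255' mask
    return y // a

def folded(x, y):
    """The simplified form once the mask is ignored."""
    return y >> x

def log2_pow2(v):
    """log2 of a power of two."""
    return v.bit_length() - 1

def shift_identity_holds(bits=32):
    """Check log2(x >>u y) == log2(x) - y for powers of two x."""
    for k in range(bits):
        for y in range(k + 1):   # keep the shifted value nonzero
            if log2_pow2((1 << k) >> y) != k - y:
                return False
    return True
```

For `x >= 8` the masked divisor becomes zero, so the original program has no defined result there and the comparison does not apply.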
AtariDreams
60e90a1929
[InstCombine] Check for undef first before freeze (#96769)
All of these insert freeze due to multi-use, which is only
relevant for undef values, not poison.
2024-07-13 18:39:41 +02:00
Noah Goldstein
afa3d58ee2 [InstCombine] Fold (mul (div exact X, C0), C1) -> (div exact X, C0/C1)
We can do this if `C0 % C1 == 0` and if we avoid UB in the signed
case.

Proofs: https://alive2.llvm.org/ce/z/HHWHDg

Closes #96915
2024-06-28 16:52:03 +08:00
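For the unsigned case, the fold in afa3d58ee2 can be brute-forced on a small bit width. The sketch below is an assumption-laden illustration rather than the patch's logic: `check_unsigned` is a hypothetical name, `exact` is modeled by only testing multiples of `C0`, and the signed-case UB conditions are not modeled.

```python
def check_unsigned(bits=6):
    """Exhaustively compare (X /exact C0) * C1 with X / (C0 / C1)."""
    limit = 1 << bits
    for c0 in range(1, limit):
        for c1 in range(1, limit):
            if c0 % c1 != 0:
                continue
            for x in range(0, limit, c0):   # exact: x is a multiple of c0
                before = ((x // c0) * c1) % limit   # mul wraps like iN
                after = x // (c0 // c1)
                if before != after:
                    return False
    return True
```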
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Alex MacLean
a4ca22506c
[InstCombine] (uitofp bool X) * Y --> X ? Y : 0 (#96216)
Fold `mul (uitofp i1 X), Y` to `select i1 X, Y, 0.0` when the `mul` is
`nnan` and `nsz`

Proof: https://alive2.llvm.org/ce/z/_stiPm
2024-06-22 09:50:21 -07:00
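A quick way to see why a4ca22506c asks for `nnan` and `nsz` is to compare both forms on ordinary IEEE doubles. This is a small sketch with hypothetical helper names, using Python floats as a stand-in for the IR values.

```python
import math

def mul_form(x: bool, y: float) -> float:
    """(uitofp i1 X) * Y."""
    return float(x) * y

def select_form(x: bool, y: float) -> float:
    """select X, Y, 0.0."""
    return y if x else 0.0

# Without nnan: 0.0 * inf is NaN, but the select yields 0.0.
nan_case = (mul_form(False, math.inf), select_form(False, math.inf))
# Without nsz: 0.0 * -1.0 is -0.0, but the select yields +0.0.
sign_case = (mul_form(False, -1.0), select_form(False, -1.0))
```

When `X` is true both forms return `Y` unchanged, so the flags only matter for the false arm.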
Nikita Popov
76e889d3b0 [InstCombine] Avoid use of ConstantExpr::getShl()
Use the constant folding API instead (we later call isNotMinSignedValue
on it, so we do need the Constant* return type here). Use ImmConstant
to guarantee that constant folding succeeds.
2024-06-18 16:29:40 +02:00
Nikita Popov
3c553fc9e0
[InstCombine] Infer nuw on mul nsw with non-negative operands (#90170)
If a mul nsw has non-negative operands, it's also nuw.

Proof: https://alive2.llvm.org/ce/z/2Dz9Uu

Fixes https://github.com/llvm/llvm-project/issues/90020.
2024-04-29 09:53:09 +09:00
zhongyunde 00443407
56ca5ecf41 [InstCombine] Optimize powi(X, Y)/ (X * Z) with Ofast
foldFDivPowDivisor can turn A / powi(x, y) into A * powi(x, -y).
However, for a small constant y, for example y=2, InstCombine will
transform powi(x, 2) into fmul x, x, so the result is not optimal for
A / powi(x, 2).

Fix https://github.com/llvm/llvm-project/issues/77171
2024-04-21 12:09:20 +08:00
zhongyunde 00443407
cb7cb83010 [InstCombine] Add check to avoid dependent optimization order, NFC
Since PR86428, foldPowiReassoc is called by both FMul and FDiv.
Because the FDiv optimization is placed after the FMul one, the code is
currently correct without an FDiv check for powi(X, Y) / X.
However, we may add more matching scenarios later, so checking the opcode
explicitly is easier to understand.
2024-04-21 12:07:56 +08:00
Nikita Popov
eb7ad8853c
[InstCombine] Remove some uses with replaceUndefsWith() (#89190)
Now that we don't accept undef splat in PatternMatch, we can remove some
uses of replaceUndefsWith(). I believe in all these cases only poison
splats are possible now, in which case no replacement is necessary.
2024-04-19 09:01:56 +09:00
Nikita Popov
1baa385065
[IR][PatternMatch] Only accept poison in getSplatValue() (#89159)
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.

This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.

As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.

There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
2024-04-18 15:44:12 +09:00
Andy Kaylor
be50a259f1
Update foldFMulReassoc to respect absent fast-math flags (#88589)
This change updates a few of the transformations in foldFMulReassoc to
respect absent fast-math flags in cases where fmul and fdiv, fadd, or fsub
instructions were being folded but the code was only checking for
fast-math flags on the fmul instruction and was transferring flags to
the folded instruction that were not present on the other original 
instructions.

This fixes https://github.com/llvm/llvm-project/issues/82857
2024-04-16 16:22:31 -07:00
Nikita Popov
c50f7e9a42
[InstCombine] Remove mul of SPF abs fold (#88675)
Remove the fold working on abs in SPF representation now that we
canonicalize SPF to intrinsics.

This is not strictly NFC because the SPF fold might fire for
non-canonical IR due to multi-use, but given the lack of test coverage,
I assume this is not important.
2024-04-16 09:17:52 +09:00
AtariDreams
5d6b00929b
[NFC] Replace m_Sub(m_Zero(), X) with m_Neg(X) (#88461) 2024-04-12 18:24:03 +09:00
zhongyunde 00443407
bd9bb31bce [InstCombine] add restrict reassoc for the powi(X,Y) / X
Add the reassoc restriction for powi(X,Y) / X, according to the discussion on PR69998.
2024-03-27 16:47:03 +08:00
zhongyunde 00443407
2938f1cff9 [InstCombine] Refactor powi(X,Y) / X to call foldPowiReassoc, NFC 2024-03-27 16:47:03 +08:00
Yingwei Zheng
caa2258250
[LLVM] Remove nuw neg (#86295)
This patch removes the APIs that create NUW neg. It is a trivial case
because `sub nuw 0, X` always gets simplified into zero.
I believe there are no optimization opportunities in real-world
applications where we can take advantage of the nuw flag.

Motivated by
https://github.com/llvm/llvm-project/pull/84792#discussion_r1524891134.

Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=da7b7478b7cbb32c09d760f6b8d0e67901e0d533&stat=instructions:u
2024-03-26 20:56:16 +08:00
Noah Goldstein
b3ee127e7d [InstCombine] integrate N{U,S}WAddLike into existing folds
Just did a quick replacement of `N{U,S}WAdd` with the `Like` variant
that also matches `or disjoint`

Closes #86082
2024-03-21 13:03:38 -05:00
Yingwei Zheng
2bfa7d0e16
[InstCombine] Fold fmul X, -0.0 into copysign(0.0, -X) (#85772)
`fneg + copysign` is better than fmul for analysis/codegen.
godbolt: https://godbolt.org/z/eEs6dGd1G
Alive2: https://alive2.llvm.org/ce/z/K3M5BA
2024-03-21 21:48:10 +08:00
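The equivalence in 2bfa7d0e16 can be checked on finite inputs, including the sign of the resulting zero. This is a hedged sketch: the helper names are hypothetical, and the check is deliberately limited to finite, non-NaN values because the message above does not state which fast-math flags the fold requires.

```python
import math

def fmul_form(x: float) -> float:
    """fmul X, -0.0."""
    return x * -0.0

def copysign_form(x: float) -> float:
    """copysign(0.0, -X)."""
    return math.copysign(0.0, -x)

def same_value_and_sign(a: float, b: float) -> bool:
    """Equality that also distinguishes +0.0 from -0.0."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)

FINITE_SAMPLES = [1.5, -1.5, 0.0, -0.0, 1e300, -1e-300]
```

For an infinite `X` the multiply produces NaN while the copysign form produces a zero, so the two forms only agree when such inputs are excluded.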
SahilPatidar
e61e26091c
[InstCombine] Fold mul (sext bool X), Y into select X, -Y, 0 (#84792)
Alive2: https://alive2.llvm.org/ce/z/n_ns-W

Resolve #84608
2024-03-15 16:08:46 +08:00
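The fold in e61e26091c holds under wrapping arithmetic, which can be confirmed exhaustively at 8 bits. A minimal sketch with hypothetical helper names:

```python
def wrap(v, bits=8):
    """Reduce to an unsigned iN bit pattern."""
    return v & ((1 << bits) - 1)

def mul_form(x: bool, y: int, bits=8) -> int:
    """mul (sext i1 X), Y: sext of true is -1, of false is 0."""
    sext = -1 if x else 0
    return wrap(sext * y, bits)

def select_form(x: bool, y: int, bits=8) -> int:
    """select X, -Y, 0."""
    return wrap(-y, bits) if x else 0
```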
zhongyunde 00443407
2d6988a45e [InstCombine] Add restrict reassoc for the operands of fmul
According to the discussion, in addition to the fmul itself, all of its
operands should also have the reassoc flag.
Add a new API, m_AllowReassoc, to check the reassoc flag.
2024-03-14 22:05:21 +08:00
zhongyunde 00443407
1752b9e4c7 [InstCombine] create a helper function foldPowiReassoc, NFC 2024-03-14 22:05:20 +08:00
zhongyunde 00443407
098520244f [InstCombine] optimize powi(X,Y) * X with Ofast
Try to transform powi(X, Y) * X into powi(X, Y+1) with Ofast.

In this case, when Y is 3, powi(X, 4) is then replaced by
X2 = X * X; X2 * X2 in a later step.
This is similar to D109954, which requires reassoc.

Fixes https://github.com/llvm/llvm-project/issues/69862.
2024-03-14 22:05:20 +08:00
Zain Jaffal
f5811494b0
check if operand is div in fold FDivSqrtDivisor (#81970)
This patch fixes the issues introduced in
bb5c3899d1.

I moved the check that the instruction is a div before the check for the
fast-math flags, which resolves the crash in

```
float a, b;
double sqrt();
void c() { b = a / sqrt(a); }
```

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-03-09 17:15:14 +00:00
Noah Goldstein
946ea4e3ca [InstCombine] Add folds for (fp_binop ({s|u}itofp x), ({s|u}itofp y))
The full fold is one of the following:
1) `(fp_binop ({s|u}itofp x), ({s|u}itofp y))`
    -> `({s|u}itofp (int_binop x, y))`
2) `(fp_binop ({s|u}itofp x), FpC)`
    -> `({s|u}itofp (int_binop x, (fpto{s|u}i FpC)))`

And support the following binops:
    `fmul` -> `mul`
    `fadd` -> `add`
    `fsub` -> `sub`

Proofs: https://alive2.llvm.org/ce/z/zuacA8

The proofs timeout, so they must be reproduced locally.

Closes #82555
2024-03-06 13:28:04 -06:00
Nikita Popov
f7f947e620 [InstCombine] Remove some uninteresting FIXMEs (NFC)
If there are two undef operands, the select would get folded away
entirely. One undef operand can occur if the other two operands
do not satisfy the poison implication check. However, I don't think
that handling this edge case is worthwhile in this fold. If we
wanted to handle this, it would be more natural to do so in the
simplifyValueKnownNonZero() fold (as this is actually the property
we would be exploiting -- this doesn't really have any relation
to taking the log2).
2024-02-12 10:36:29 +01:00
Martin Storsjö
f022aaf4e7 Revert "[InstCombine] Optimise x / sqrt(y / z) with fast-math pattern. (#76737)"
This reverts commit bb5c3899d1936ebdf7ebf5ca4347ee2e057bee7f.

That commit caused failed asserts like this:

$ cat repro.c
float a, b;
double sqrt();
void c() { b = a / sqrt(a); }
$ clang -target x86_64-linux-gnu -c -O2 -ffast-math repro.c
clang: ../lib/IR/Instruction.cpp:522: bool llvm::Instruction::hasAllowReassoc() const: Assertion `isa<FPMathOperator>(this) && "getting fast-math flag on invalid op"' failed.
2024-02-10 11:54:31 +02:00
Zain Jaffal
bb5c3899d1
[InstCombine] Optimise x / sqrt(y / z) with fast-math pattern. (#76737)
Replace the pattern with
x * sqrt(z/y)

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-02-09 17:24:41 +00:00
AtariDreams
966f78bdf8
[InstCombine] Resolve TODO: nnan nsz X / -0.0 -> copysign(inf, X) (#79766) 2024-02-07 11:48:37 +05:30
Congcong Cai
64e94438a4
[InstCombine] combine mul(abs(x),abs(y)) to abs(mul(x,y)) (#78395)
Fixes: https://github.com/llvm/llvm-project/issues/78076
Alive2 Proof: https://alive2.llvm.org/ce/z/XEDy0f
2024-01-18 20:12:00 +08:00
Yingwei Zheng
0ce193708c
[InstCombine] Refactor folding of commutative binops over select/phi/minmax (#76692)
This patch cleans up the duplicate code for folding commutative binops
over `select/phi/minmax`.

Related commits:
+ select support:
88cc35b27e
+ phi support:
8674a023bc
+ minmax support:
624973806c
2024-01-04 15:11:28 +08:00
Z572
e6d2bb0ed8
[InstCombine] Simplify (-x * y * -x) into (x * y * x) (#72953)
fix https://github.com/llvm/llvm-project/issues/72259
proof: https://alive2.llvm.org/ce/z/HsrmTC
2023-12-21 19:13:09 +08:00
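Because negation and multiplication both wrap modulo 2^n, the identity in e6d2bb0ed8 can be verified exhaustively at a small width. A minimal sketch (function names are hypothetical):

```python
def wrap(v, bits=8):
    """Reduce to an unsigned iN bit pattern."""
    return v & ((1 << bits) - 1)

def before(x, y, bits=8):
    """(-x * y) * -x with every step wrapping like iN."""
    neg_x = wrap(-x, bits)
    return wrap(wrap(neg_x * y, bits) * neg_x, bits)

def after(x, y, bits=8):
    """(x * y) * x with every step wrapping like iN."""
    return wrap(wrap(x * y, bits) * x, bits)
```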
Z572
1c494198c3
[InstCombine] simplify (X * C0) / (X * C1) into C0 / C1. (#73204)
fix #72114
proof: https://alive2.llvm.org/ce/z/xqprFm
2023-12-13 17:17:06 +08:00
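The fold in 1c494198c3 depends on the multiplications not wrapping. The sketch below checks the unsigned variant by brute force and shows one wrapped counterexample; treating "no unsigned wrap" as the precondition is an assumption made here for illustration, and the helper names are hypothetical.

```python
def holds_for_unsigned(bits=5):
    """Check (X * C0) / (X * C1) == C0 / C1 when neither mul wraps."""
    limit = 1 << bits
    for x in range(1, limit):
        for c0 in range(limit):
            for c1 in range(1, limit):
                if x * c0 >= limit or x * c1 >= limit:
                    continue   # skip inputs where a mul would wrap
                if (x * c0) // (x * c1) != c0 // c1:
                    return False
    return True

def wrapped_example(bits=8):
    """X=2, C0=128, C1=1: the first mul wraps to 0 at 8 bits."""
    limit = 1 << bits
    x, c0, c1 = 2, 128, 1
    before = ((x * c0) % limit) // ((x * c1) % limit)
    after = c0 // c1
    return before, after
```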
Nikita Popov
5295b12cd0 [PatternMatch] Add m_AddLike matcher (NFC)
This matches either a plain "add" or an "or disjoint" that can
be converted into an add. The AddLike terminology is adopted from
the SDAG layer.
2023-12-07 14:45:12 +01:00
Nikita Popov
410bf5e142 [InstCombine] Use disjoint flag in mul of or fold
Slightly more powerful if the information used to infer disjoint
was lost.
2023-12-05 15:24:50 +01:00
Nikita Popov
4b3ea337ad [ValueTracking] Convert isKnownNonNegative() to use SimplifyQuery (NFC) 2023-11-29 10:52:52 +01:00
Yingwei Zheng
865c1fda6f
[InstCombine] Preserve NSW flags for neg instructions (#72548)
Alive2: https://alive2.llvm.org/ce/z/F9HG3M

This missed optimization was discovered with the help of
https://github.com/AliveToolkit/alive2/pull/962.
2023-11-17 14:02:47 +08:00
Z572
c350a1eaae
[InstCombine] Simplify sdiv -X, X into X == INT_MIN ? 1 : -1 (#71768)
Alive2: https://alive2.llvm.org/ce/z/dHddwH
Fixes #69574
2023-11-15 23:30:08 +08:00
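The `INT_MIN` special case in c350a1eaae comes from `-X` wrapping back to `INT_MIN`. A minimal 8-bit model (helper names are hypothetical; `X == 0` is excluded because the division would be undefined):

```python
def to_signed(v, bits=8):
    """Interpret the low `bits` bits of v as a two's-complement value."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v

def sdiv(a, b, bits=8):
    """Signed division truncating toward zero (b must be nonzero)."""
    q = abs(a) // abs(b)
    return to_signed(-q if (a < 0) != (b < 0) else q, bits)

def before(x, bits=8):
    """sdiv (0 - X), X with a wrapping negation."""
    return sdiv(to_signed(-x, bits), x, bits)

def after(x, bits=8):
    """X == INT_MIN ? 1 : -1."""
    return 1 if x == -(1 << (bits - 1)) else -1
```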
Allen
a0cd6265bc
[InstCombine] Split the FMul with reassoc into a helper function, NFC (#71493)
The reassoc check is really hard to find because the handling branch is
too large, so split it into a helper function.
2023-11-07 15:30:56 +08:00
zhongyunde 00443407
7e3d1103e6 [InstCombine] optimize powi(X,Y)/X with Ofast (#67236)
Try to transform powi(X, Y) / X into powi(X, Y-1) with Ofast.
In this case, when Y is 3, powi(X, 2) is then replaced by X * X in
a later step.

Fixes https://github.com/llvm/llvm-project/pull/67216
Reviewed By: dtcxzyw, nikic, jcranmer-intel
2023-10-21 08:56:14 +08:00
XChy
f2898def69
[InstCombine] Don't mix X << Y / Z << Y with X << Y / X << Z (#69302)
Fixes #69291.
This patch improves the logic that handles the different patterns to
avoid mixing them up.
2023-10-17 18:47:49 +08:00
XChy
0823cb7911
[InstCombine] Fold (X << Y) / (X << Z) -> 1 << Y >> Z (#68863)
Resolve #68857.
Alive2 proofs:
[Whole proofs](https://alive2.llvm.org/ce/z/A5b85F)
2023-10-15 22:51:45 +08:00
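For the unsigned case, the fold in 0823cb7911 can be brute-forced at a small width. This is a sketch under an assumption: both shifts are required not to wrap (modeling `nuw`), which is a precondition chosen here for illustration, and `holds_for_udiv` is a hypothetical name.

```python
def holds_for_udiv(bits=6):
    """Check (X << Y) / (X << Z) == (1 << Y) >> Z when neither shl wraps."""
    limit = 1 << bits
    for x in range(1, limit):
        for y in range(bits):
            for z in range(bits):
                if (x << y) >= limit or (x << z) >= limit:
                    continue   # skip inputs where a shl would wrap
                if (x << y) // (x << z) != (1 << y) >> z:
                    return False
    return True
```

When `Z > Y` both sides are zero, which is why a single right shift is enough to cover that case.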
Nikita Popov
80fa5a6377 [ValueTracking] Use SimplifyQuery in haveNoCommonBitsSet() (NFC)
Pass SimplifyQuery instead of unpacked list of arguments.
2023-10-10 11:39:59 +02:00
Nikita Popov
6cd5eb1f54 [InstCombine] Avoid some uses of ConstantExpr::getZExt() (NFC)
Add helpers getLosslessUnsignedTrunc/getLosslessSignedTrunc for
this common pattern.
2023-09-28 17:02:33 +02:00
Nikita Popov
1fc73cacb2 [InstCombine] Propagate nsw flag when negating
When pushing a sub nsw 0, %x negation into an expression, try to
preserve the nsw flag for the cases where this is possible. Do this
by passing the flag through recursive Negator::negate() calls.

Proofs: https://alive2.llvm.org/ce/z/oRPNcY

Differential Revision: https://reviews.llvm.org/D158510
2023-09-14 09:09:45 +02:00
Matt Arsenault
0ec9ccb39e InstCombine: Fix typo in comment 2023-08-24 07:42:37 -04:00
Antonio Frighetto
f12a5561b2 [InstCombine] Fold binop of select and cast of select condition
Simplify binary operations, whose operands involve a `select`
instruction and a cast of the `select` condition. Specifically,
the binop is canonicalized into a `select` with folded arguments
as follows:

(Binop (zext C), (select C, T, F))
  -> (select C, (binop 1, T), (binop 0, F))

(Binop (sext C), (select C, T, F))
  -> (select C, (binop -1, T), (binop 0, F))

Proofs: https://alive2.llvm.org/ce/z/c_JwwM

Differential Revision: https://reviews.llvm.org/D153963
2023-07-20 19:42:58 +00:00
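The canonicalization in f12a5561b2 can be illustrated with a direct evaluation of both sides. This is a minimal sketch with hypothetical helper names; Python's unbounded integers stand in for the IR values.

```python
import operator

def original(binop, c, t, f, signed):
    """binop (zext/sext C), (select C, T, F)."""
    ext = (-1 if signed else 1) if c else 0
    return binop(ext, t if c else f)

def folded(binop, c, t, f, signed):
    """select C, (binop 1 or -1, T), (binop 0, F)."""
    one = -1 if signed else 1
    return binop(one, t) if c else binop(0, f)

BINOPS = [operator.add, operator.sub, operator.mul,
          operator.and_, operator.or_, operator.xor]
```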
Noah Goldstein
453d983d56 [InstCombine] Add transforms for (rem (shl Y, X), (shl Z, X))
This is just filling in a missing case from D144225.

We treat `(shl Y, X)` and `(shl Z, X)` as `(mul Y, 1 << X)` and `(mul
Z, 1 << X)` then reuse the same transformations that already exist.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D147108
2023-07-06 14:46:34 -05:00
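Viewing both shifts as multiplications by `1 << X`, the arithmetic identity that such a rewrite presumably relies on is `(Y << X) % (Z << X) == (Y % Z) << X`. The sketch below checks it for the unsigned case; requiring both shifts not to wrap is an assumption made here, and `urem_of_shl_holds` is a hypothetical name rather than anything from 453d983d56.

```python
def urem_of_shl_holds(bits=6):
    """Check (Y << X) urem (Z << X) == (Y urem Z) << X when no shl wraps."""
    limit = 1 << bits
    for x in range(bits):
        for y in range(limit):
            for z in range(1, limit):
                if (y << x) >= limit or (z << x) >= limit:
                    continue   # skip inputs where a shl would wrap
                if (y << x) % (z << x) != (y % z) << x:
                    return False
    return True
```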
Noah Goldstein
2cb6b06c89 [InstCombine] Add constant combines for (urem/srem (shl X, Y), (shl X, Z))
Forked from D142901 to deduce more `nsw`/`nuw` flag for the output
`shl`.

We can handle the following cases + some `nsw`/`nuw` flags:

The rationale for doing this all in `InstCombine` rather than handling
the constant `shl` cases in `InstSimplify` is we often create a new
instruction because we are able to deduce more `nsw`/`nuw` flags than
the original instruction had.

Differential Revision: https://reviews.llvm.org/D144225
2023-07-06 14:46:34 -05:00