6305 Commits

Author SHA1 Message Date
Marina Taylor
5cd0900ef6
[InstCombine] Compare icmp inttoptr, inttoptr values directly (#107012)
InstCombine already has some rules for `icmp ptrtoint, ptrtoint` to drop
the casts and compare the source values. This change adds the same for
the reverse case with `inttoptr`.
2024-09-24 09:39:07 +02:00
Volodymyr Vasylkun
d4798498c4
[InstCombine] Fold (x == y) ? 0 : (x > y ? 1 : -1) into ucmp/scmp(x,y) (#107314)
This also handles commuted cases of the same fold, with either the
condition or the true/false values of the inner select being swapped.
2024-09-23 15:41:22 +01:00
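The select fold from #107314 can be sanity-checked by brute force on a narrow integer type. This is only a sketch under assumptions: `three_way` is a hypothetical Python stand-in for the `ucmp`/`scmp` intrinsics, the 4-bit width is arbitrary, and none of this is LLVM's implementation.

```python
from itertools import product

BITS = 4  # small width so the exhaustive check stays fast


def to_signed(v: int) -> int:
    """Reinterpret a BITS-wide bit pattern as a two's-complement value."""
    return v - (1 << BITS) if v >= (1 << (BITS - 1)) else v


def three_way(a: int, b: int) -> int:
    """Stand-in for ucmp/scmp on already-interpreted integers: -1, 0 or 1."""
    return (a > b) - (a < b)


def select_pattern(a: int, b: int) -> int:
    """The source pattern: (x == y) ? 0 : (x > y ? 1 : -1)."""
    return 0 if a == b else (1 if a > b else -1)


def fold_holds() -> bool:
    """Compare both forms for every pair of BITS-wide inputs."""
    for x, y in product(range(1 << BITS), repeat=2):
        sx, sy = to_signed(x), to_signed(y)
        if select_pattern(x, y) != three_way(x, y):      # unsigned view: ucmp
            return False
        if select_pattern(sx, sy) != three_way(sx, sy):  # signed view: scmp
            return False
    return True
```

Calling `fold_holds()` walks all 256 input pairs in both the unsigned and the signed interpretation.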
Volodymyr Vasylkun
b189b89bde
[InstCombine] Relax the conditions of fold of ucmp/scmp into phi by allowing the phi node to use the result of ucmp/scmp more than once (#109593)
This extends the optimisation implemented in #107769 by relaxing the
conditions to make it happen. Now, the value produced by `ucmp`/`scmp`
doesn't need to be one-use, but only one-user, meaning it can be present
in a single phi node more than once.
2024-09-23 15:39:11 +01:00
Nikita Popov
7a181980b9 [InstCombine] Fix nits in new xor fold
Followup to https://github.com/llvm/llvm-project/pull/105992,
use the simplifyXorInst helper and use getWithInstruction
consistently.
2024-09-23 11:59:37 +02:00
Amr Hesham
9614f69b4b
[InstCombine] Fold Xor with or disjoint (#105992)
Implement the missing folds `(X | Y) ^ M` to `(X ^ M) ^ Y` and
`(X | Y) ^ M` to `(Y ^ M) ^ X`, where the `or` is disjoint.
2024-09-22 12:32:17 +02:00
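The reason the xor fold in #105992 needs the `disjoint` flag can be seen with an exhaustive check. The sketch below assumes a 4-bit type and models `or disjoint` by skipping inputs that share set bits (those would be poison in IR); the function name is made up for illustration.

```python
from itertools import product

BITS = 4
MASK = (1 << BITS) - 1


def xor_of_disjoint_or_holds() -> bool:
    """Check (X | Y) ^ M == (X ^ M) ^ Y == (Y ^ M) ^ X for disjoint X, Y."""
    for x, y, m in product(range(MASK + 1), repeat=3):
        if x & y:
            # `or disjoint` promises no common set bits, so skip these inputs.
            continue
        lhs = (x | y) ^ m
        if lhs != (x ^ m) ^ y or lhs != (y ^ m) ^ x:
            return False
    return True
```

With disjoint operands `X | Y` equals `X ^ Y`, so the rewrite is just reassociation of xor. Without the flag it breaks: for `X = Y = 1, M = 0` the left side is 1 and the right side is 0.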
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
Nikita Popov
f1ff3a279f [InstCombine] Rename TTI member for clarity (NFC)
There is already a comment on the member and documentation in the
InstCombine contributor guide, but also rename it to add
an additional speed bump.
2024-09-19 12:31:11 +02:00
Nikita Popov
dc6876fc98
[ValueTracking] Use isSafeToSpeculativelyExecuteWithVariableReplaced() in more places (#109149)
This replaces some uses of isSafeToSpeculativelyExecute() with
isSafeToSpeculativelyExecuteWithVariableReplaced(), in cases where we
are guarding against operand changes rather than plain speculation.

I believe that this is NFC with the current implementation of the
function (as it only does something different for loads), but this
makes us more defensive against future generalizations.
2024-09-19 09:38:20 +02:00
Yingwei Zheng
872932b7a9
[InstCombine] Generalize icmp (shl nuw C2, Y), C -> icmp Y, C3 (#104696)
The motivation of this patch is to fold more generalized patterns like
`icmp ult (shl nuw 16, X), 64 -> icmp ult X, 2`.

Alive2: https://alive2.llvm.org/ce/z/gyqjQH
2024-09-18 19:10:41 +08:00
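The motivating example from #104696 can be checked on `i8` with a few lines of Python. This sketch models `shl nuw` by returning `None` where the IR result would be poison; the helper names are hypothetical and only the specific constants from the commit message are tested.

```python
BITS = 8


def shl_nuw(c: int, y: int):
    """Model `shl nuw c, y` on BITS-wide values; None stands for poison."""
    if y >= BITS:
        return None            # oversized shift amount
    r = c << y
    if r >> BITS:
        return None            # bits shifted out: nuw violated
    return r


def example_fold_holds() -> bool:
    """icmp ult (shl nuw 16, X), 64  versus  icmp ult X, 2."""
    for x in range(1 << BITS):
        shifted = shl_nuw(16, x)
        if shifted is None:
            continue           # the fold only has to hold for defined inputs
        if (shifted < 64) != (x < 2):
            return False
    return True
```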
Chengjun
94a98cf5dc
[InstCombine] Remove dead phi web (#108876)
The current visitPHINode function in InstCombine can remove dead phi
cycles (all phis have one use, which is another phi). However, it cannot
handle the case where the phis form a web (all phis have one or more
uses, and all the uses are phis). This change extends the algorithm so
that it can also remove a dead phi web.
2024-09-18 10:04:49 +02:00
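The idea behind the dead phi web check in #108876 can be illustrated with a small worklist search. This is a sketch, not the code in visitPHINode: the `users` map, the `phis` set and the function name are all assumptions made for the example, and any size limits the real implementation may apply are not modelled.

```python
def is_dead_phi_web(start, users, phis):
    """Return True if every transitive user reachable from `start` is a phi.

    users: dict mapping a value name to the names of its users.
    phis:  set of value names that are phi nodes.
    """
    worklist = [start]
    visited = {start}
    while worklist:
        phi = worklist.pop()
        for user in users.get(phi, []):
            if user not in phis:
                return False       # a real (non-phi) use keeps the web alive
            if user not in visited:
                visited.add(user)
                worklist.append(user)
    return True
```

For example, with `phis = {"a", "b", "c"}` the web `{"a": ["b", "c"], "b": ["a"], "c": ["b"]}` is dead even though `a` has two uses, while adding a `ret` user to `c` makes it live.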
Alex MacLean
790f2eb16a
[InstCombine] Avoid simplifying bitcast of undef to a zeroinitializer vector (#108872)
In some cases, if an undef value is the product of another instcombine
simplification, a bitcast of undef is simplified to a zeroinitializer
vector instead of undef.
2024-09-17 15:31:28 -07:00
c8ef
86f0399c1f
[InstCombine] Fold expression using basic properties of floor and ceiling function (#107107)
alive2: ~~https://alive2.llvm.org/ce/z/Ag3Ki7~~
https://alive2.llvm.org/ce/z/ywP5t2
related: #76438

This patch adds the following foldings: `floor(x) <= x --> true` and `x
<= ceil(x) --> true`. We leverage the properties of these math functions
and ensure there is no floating point input of `nan`.

---------

Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2024-09-15 14:25:00 +04:00
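The `nan` requirement in #107107 is easy to see numerically. The sketch below uses hypothetical helpers `ffloor` and `fceil` that return floats (Python's `math.floor` returns an `int` and rejects `nan` and infinities), so it only approximates the IR-level `floor`/`ceil` intrinsics.

```python
import math


def ffloor(x: float) -> float:
    """Float-returning floor that passes nan and infinities through."""
    if math.isnan(x) or math.isinf(x):
        return x
    return float(math.floor(x))


def fceil(x: float) -> float:
    """Float-returning ceil that passes nan and infinities through."""
    if math.isnan(x) or math.isinf(x):
        return x
    return float(math.ceil(x))


def floor_le_x(x: float) -> bool:
    return ffloor(x) <= x


def x_le_ceil(x: float) -> bool:
    return x <= fceil(x)
```

Both predicates are true for ordinary and infinite inputs, but any ordered comparison with `nan` is false, which is why the fold to `true` needs a no-`nan` guarantee.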
Volodymyr Vasylkun
21e3a212c5
[InstCombine] Replace an integer comparison of a phi node with multiple ucmp/scmp operands and a constant with phi of individual comparisons of original intrinsic's arguments (#107769)
When we have a `phi` instruction with more than one of its incoming
values being a call to `ucmp` or `scmp`, which is then compared with an
integer constant, we can move the comparison through the `phi` into the
incoming basic blocks because we know that a comparison of `ucmp`/`scmp`
with a constant will be simplified by the next iteration of InstCombine.

There's a high chance that other similar patterns can be identified, in
which case they can be easily handled by the same code by moving the
check for "simplifiable" instructions into a lambda.
2024-09-13 19:50:27 +01:00
Nikita Popov
1c298c9274 [InstCombine] Preserve nuw flags when merging geps
These transforms all perform a variant of (gep (gep p, x), y)
to (gep p, (x + y)). We can preserve both inbounds and nuw
during such transforms (https://alive2.llvm.org/ce/z/Stu4cN), but
not nusw, which would require proving that the new add is nsw.

For the constant offset case, I've conservatively retained the
logic that checks for negative intermediate offsets, though I'm
not sure it's still reachable nowadays.
2024-09-13 11:15:22 +02:00
Nikita Popov
cd39242032 [InstCombine] Remove no longer needed constant offset case (NFCI)
Now that we canonicalize constant geps to i8 type, this special
handling should no longer be needed.
2024-09-13 10:15:54 +02:00
Nikita Popov
940f89255e [InstCombine] Do not modify GEP in place
This was modifying the GEP in place, with code to adjust the
inbounds flag. This was correct at the time, but now fails to
account for other GEP flags like nuw, leading to miscompilations.

Remove the special case, and always create a new GEP instruction.
Logic for preserving nuw in the cases where it is valid will be
added in a followup patch.
2024-09-13 10:04:39 +02:00
David Green
c0e308ba3d
[InstCombine] Pass DomTree and DomTreeCache to LibCallSimplifier (#108446)
This allows any combines to pick up Known states from dominating
conditions.
2024-09-13 08:36:48 +01:00
Yingwei Zheng
52fac608bd
[InstCombine] Fold [l|a]shr iN (X-1)&~X, N-1 -> [z|s]ext(X==0) (#107259)
Alive2: https://alive2.llvm.org/ce/z/kwvTFn
Closes #107228.

`ashr iN (X-1)&~X, N-1` also exists. See
https://github.com/dtcxzyw/llvm-opt-benchmark/issues/1274.
2024-09-06 21:37:50 +08:00
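The fold in #107259 relies on `(X-1) & ~X` being a mask of the trailing zero bits of `X`, whose top bit is set only when `X` is zero. A brute-force sketch on `i8` (helper names are made up for illustration):

```python
BITS = 8
MASK = (1 << BITS) - 1


def trailing_zero_mask(x: int) -> int:
    """(X - 1) & ~X, truncated to BITS bits."""
    return ((x - 1) & ~x) & MASK


def lshr_form(x: int) -> int:
    """lshr iN ((X-1) & ~X), N-1"""
    return trailing_zero_mask(x) >> (BITS - 1)


def ashr_form(x: int) -> int:
    """ashr iN ((X-1) & ~X), N-1, returned as a BITS-wide bit pattern."""
    v = trailing_zero_mask(x)
    signed = v - (1 << BITS) if v >> (BITS - 1) else v
    return (signed >> (BITS - 1)) & MASK


def fold_holds() -> bool:
    for x in range(1 << BITS):
        zext_eq0 = 1 if x == 0 else 0        # zext(X == 0)
        sext_eq0 = MASK if x == 0 else 0     # sext(X == 0), all ones when true
        if lshr_form(x) != zext_eq0 or ashr_form(x) != sext_eq0:
            return False
    return True
```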
Nikita Popov
3bc38fb27a
[InstCombine] Generalize and consolidate phi translation check (#106051)
The foldOpIntoPhi() transform requires all operands to be
phi-translatable. This can be the case either because they are phi nodes
in the same block, or because the operand dominates the block.

Currently, most callers of foldOpIntoPhi() satisfy this pre-condition by
requiring a constant operand, which trivially dominates everything. Only
selects had handling for variable operands.

Move this logic into foldOpIntoPhi(), so things are handled correctly if
other callers are generalized. Also make the implementation a bit more
general by querying the dominator tree.
2024-09-04 16:22:43 +02:00
Nikita Popov
34b10e165d [InstCombine] Remove optional LoopInfo dependency
https://github.com/llvm/llvm-project/pull/106075 has removed the
last dependency on LoopInfo in InstCombine, so don't fetch the
analysis anymore and remove the use-loop-info pass option.
2024-09-02 10:25:45 +02:00
Nikita Popov
f044564db1
[InstCombine] Make backedge check in op of phi transform more precise (#106075)
The op of phi transform wants to prevent moving an operation across a
backedge, as this may lead to an infinite combine loop.

Currently, this is done using isPotentiallyReachable(). The problem with
that is that all blocks inside a loop are reachable from each other.
This means that the op of phi transform is effectively completely
disabled for code inside loops, even when it's not actually operating on
a loop phi (just a phi that happens to be in a loop).

Fix this by explicitly computing the backedges inside the function
instead. Do this via RPOT, which is a bit more efficient than using
FindFunctionBackedges() (which does it without any pre-computed
analyses).

For irreducible cycles, there may be multiple possible choices of
backedge, and this just picks one of them. This is still sufficient to
prevent combine loops.

This also removes the last use of LoopInfo in InstCombine -- I'll drop
the analysis in a followup.
2024-09-02 09:09:21 +02:00
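The difference between reachability and explicit backedges described in #106075 can be sketched on a toy CFG. The code below is an illustration under assumptions (a CFG given as a successor map, a recursive DFS, invented function names); it is not the code InstCombine uses, only the general technique of classifying an edge as a backedge when its target does not come later in reverse post-order.

```python
def reverse_post_order(succs, entry):
    """Blocks reachable from `entry` in reverse post-order."""
    visited, post = set(), []

    def dfs(block):
        visited.add(block)
        for s in succs.get(block, []):
            if s not in visited:
                dfs(s)
        post.append(block)

    dfs(entry)
    return post[::-1]


def find_backedges(succs, entry):
    """Edges (src, dst) whose dst is at or before src in reverse post-order."""
    seen, backedges = set(), set()
    for block in reverse_post_order(succs, entry):
        seen.add(block)  # added first so a self-loop counts as a backedge
        for s in succs.get(block, []):
            if s in seen:
                backedges.add((block, s))
    return backedges


# A loop whose body contains a diamond: header -> then/else -> merge -> header.
EXAMPLE_CFG = {
    "entry": ["header"],
    "header": ["then", "else", "exit"],
    "then": ["merge"],
    "else": ["merge"],
    "merge": ["header"],
    "exit": [],
}
```

In `EXAMPLE_CFG` every block inside the loop can reach every other one, yet only `merge -> header` is reported as a backedge, so a phi in `merge` is not treated as a loop phi.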
Yingwei Zheng
380fa875ab
[InstCombine] Replace all dominated uses of condition with constants (#105510)
This patch replaces all dominated uses of condition with true/false to
improve context-sensitive optimizations. It eliminates a bunch of
branches in llvm-opt-benchmark.

As a side effect, it may introduce new phi nodes in some corner cases.
See the following case:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br i1 %cmp, label %if.then, label %if.else
if.then:
  br label %bb2
if.else:
  br label %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [%cmp, %if.then], [%cmp, %if.else]
  ret i1 %res
}
```
It will be simplified into:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br i1 %cmp, label %if.then, label %if.else
if.then:
  br label %bb2
if.else:
  br label %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [true, %if.then], [false, %if.else]
  ret i1 %res
}
```

I am planning to fix this in late pipeline/CGP since this problem exists
before the patch.
2024-09-01 09:49:23 +08:00
Maciej Gabka
95d2d1cba0
Move stepvector intrinsic out of experimental namespace (#98043)
This patch moves the stepvector intrinsic out of the experimental
namespace.

This intrinsic has existed in LLVM for several years now, and is widely used.
2024-08-28 12:48:20 +01:00
Noah Goldstein
a6edcea211 [InstCombine] Simplify (add/sub (sub/add) (sub/add)) irrespective of use-count
Added folds:
    - `(add (sub X, Y), (sub Z, X))` -> `(sub Z, Y)`
    - `(sub (add X, Y), (add X, Z))` -> `(sub Y, Z)`

The fold is typically handled in the `Reassociate` pass, but it fails
if the inner `sub`/`add` are multi-use. Less importantly, Reassociate
doesn't propagate flags correctly.

This patch adds the fold explicitly to InstCombine.

Proofs: https://alive2.llvm.org/ce/z/p6JyRP

Closes #105866
2024-08-27 11:43:17 -07:00
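The two add/sub folds from a6edcea211 hold under wrapping arithmetic, which a brute-force check makes concrete. This is a sketch on an assumed 4-bit type with made-up helper names; it says nothing about which `nsw`/`nuw` flags can be kept.

```python
from itertools import product

BITS = 4
MASK = (1 << BITS) - 1


def add(a: int, b: int) -> int:
    """Wrapping BITS-wide addition."""
    return (a + b) & MASK


def sub(a: int, b: int) -> int:
    """Wrapping BITS-wide subtraction."""
    return (a - b) & MASK


def folds_hold() -> bool:
    for x, y, z in product(range(MASK + 1), repeat=3):
        # (add (sub X, Y), (sub Z, X)) -> (sub Z, Y)
        if add(sub(x, y), sub(z, x)) != sub(z, y):
            return False
        # (sub (add X, Y), (add X, Z)) -> (sub Y, Z)
        if sub(add(x, y), add(x, z)) != sub(y, z):
            return False
    return True
```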
Nikita Popov
b74248dae8 [InstCombine] Pass RPOT to InstCombiner (NFC)
To make use of it in a followup change.
2024-08-26 15:17:38 +02:00
Nikita Popov
28fe6ddd9b
[InstCombine] Remove AllOnes fallbacks in getMaskedTypeForICmpPair() (#104941)
getMaskedTypeForICmpPair() tries to model non-and operands as x & -1.
However, this can end up confusing the matching logic, by picking the -1
operand as the "common" operand, resulting in a successful, but useless,
match. This is what causes commutation failures for some of the
optimizations driven by this function.

Fix this by treating a match against -1 as a non-match.
2024-08-26 09:55:52 +02:00
c8ef
43c6fb29a6
[InstCombine] Update the select operand when the cond is trunc and has the nuw or nsw property. (#105914)
This patch updates the select operand when the cond has the nuw or nsw
property. Considering the semantics of the nuw and nsw flag, if there is
no poison value in this expression, this code assumes that X can only be
0, 1 or -1.

close: #96765
alive2: https://alive2.llvm.org/ce/z/3n3n2Q
2024-08-24 19:56:59 +08:00
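The claim in #105914 that `X` can only be 0, 1 or -1 follows from the poison rules of `trunc nuw`/`trunc nsw`. The sketch below enumerates `i8` inputs for a truncation to `i1`; the predicates are a simplified model of those rules (zero- or sign-extending the result must give back the input) and the names are invented for the example.

```python
BITS = 8


def to_signed(v: int) -> int:
    """Reinterpret a BITS-wide bit pattern as a two's-complement value."""
    return v - (1 << BITS) if v >= (1 << (BITS - 1)) else v


def trunc_nuw_is_defined(x: int) -> bool:
    """trunc nuw to i1: zero-extending the low bit must reproduce x."""
    return (x & 1) == x


def trunc_nsw_is_defined(x: int) -> bool:
    """trunc nsw to i1: sign-extending the low bit (0 or -1) must reproduce x."""
    return -(x & 1) == to_signed(x)


def defined_inputs(pred):
    """Signed values of all BITS-wide inputs for which `pred` holds."""
    return sorted(to_signed(x) for x in range(1 << BITS) if pred(x))
```

`defined_inputs(trunc_nuw_is_defined)` gives `[0, 1]` and `defined_inputs(trunc_nsw_is_defined)` gives `[-1, 0]`; every other input makes the truncation poison.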
Volodymyr Vasylkun
da6f423251
[InstCombine] Fold (x < y) ? -1 : zext(x > y) and (x > y) ? 1 : sext(x < y) to ucmp/scmp(x, y) (#105272)
This patch expands the existing functionality to include these two
additional folds, which are nearly identical to the ones already
implemented.

Proofs: https://alive2.llvm.org/ce/z/Xy7s4j
2024-08-23 22:31:03 +01:00
Nikita Popov
32679e10a9 [InstCombine] Handle logical op for and/or of icmp 0/-1
This aligns the transform with what foldLogOpOfMaskedICmp() does.
2024-08-22 16:17:39 +02:00
Volodymyr Vasylkun
d163935585
[InstCombine] Fold scmp(x -nsw y, 0) to scmp(x, y) (#105583)
Proof: https://alive2.llvm.org/ce/z/v6VtXz
2024-08-22 14:18:48 +01:00
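Why the `nsw` flag matters for the `scmp(x -nsw y, 0)` fold in #105583 can be shown on a 4-bit signed type. In this sketch `scmp` is a plain Python stand-in for the intrinsic, inputs whose difference overflows are skipped because `sub nsw` would be poison there, and `wrap` models the subtraction without the flag.

```python
BITS = 4
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


def wrap(v: int) -> int:
    """Wrap an integer into the signed BITS-wide range."""
    return (v - LO) % (1 << BITS) + LO


def scmp(a: int, b: int) -> int:
    """Stand-in for the scmp intrinsic: -1, 0 or 1."""
    return (a > b) - (a < b)


def fold_holds() -> bool:
    for x in range(LO, HI + 1):
        for y in range(LO, HI + 1):
            diff = x - y
            if not LO <= diff <= HI:
                continue  # signed overflow: `sub nsw` is poison, nothing to check
            if scmp(diff, 0) != scmp(x, y):
                return False
    return True


def wrapping_counterexample():
    """Without nsw: x = -8, y = 1 wraps to 7 and flips the sign of the result."""
    x, y = LO, 1
    return scmp(wrap(x - y), 0), scmp(x, y)
```

`wrapping_counterexample()` returns `(1, -1)`, so the same rewrite on a plain `sub` would be a miscompile.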
Nikita Popov
de2b6cb6ab
[InstCombine] Fold icmp over select of cmp more aggressively (#105536)
When folding an icmp into a select, treat an icmp of a constant with a
one-use ucmp/scmp intrinsic as a simplification. These comparisons will
reduce down to an icmp.

This addresses a regression seen in Rust and also in llvm-opt-benchmark.
2024-08-22 09:47:35 +02:00
Volodymyr Vasylkun
be7d08cd59
[InstCombine] Fold sext(A < B) + zext(A > B) into ucmp/scmp(A, B) (#103833)
This change also covers the fold of `zext(A > B) - zext(A < B)` since it
is already being canonicalized into the aforementioned pattern.

Proof: https://alive2.llvm.org/ce/z/AgnfMn
2024-08-21 23:15:24 +01:00
Marius Kamp
170a21e7f0
[InstCombine] Extend Fold of Zero-extended Bit Test (#102100)
Previously, (zext (icmp ne (and X, (1 << ShAmt)), 0)) has only been
folded if the bit width of X and the result were equal. Use a trunc or
zext instruction to also support other bit widths.
    
This is a follow-up to commit 533190acdb9d2ed774f96a998b5c03be3df4f857,
which introduced a regression: (zext (icmp ne (and (lshr X ShAmt) 1) 0))
is not folded any longer to (zext/trunc (and (lshr X ShAmt) 1)) since
the commit introduced the fold of (icmp ne (and (lshr X ShAmt) 1) 0) to
(icmp ne (and X (1 << ShAmt)) 0). The change introduced by this commit
restores this fold.
    
Alive proof: https://alive2.llvm.org/ce/z/MFkNXs
    
Relates to issue #86813 and pull request #101838.
2024-08-21 20:09:02 +08:00
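The value-level equivalence behind the zero-extended bit test in #102100 is small enough to check exhaustively. This sketch assumes an 8-bit `X` and uses invented helper names; because both forms only produce 0 or 1, a later `zext` or `trunc` to another width does not change the value.

```python
X_BITS = 8


def icmp_form(x: int, sh: int) -> int:
    """zext (icmp ne (and X, (1 << ShAmt)), 0)"""
    return int((x & (1 << sh)) != 0)


def shift_form(x: int, sh: int) -> int:
    """and (lshr X, ShAmt), 1"""
    return (x >> sh) & 1


def fold_holds() -> bool:
    return all(
        icmp_form(x, sh) == shift_form(x, sh)
        for x in range(1 << X_BITS)
        for sh in range(X_BITS)
    )
```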
Nikita Popov
a105877646
[InstCombine] Remove some of the complexity-based canonicalization (#91185)
The idea behind this canonicalization is that it allows us to handle fewer
patterns, because we know that some will be canonicalized away. This is
indeed very useful to e.g. know that constants are always on the right.

However, this is only useful if the canonicalization is actually
reliable. This is the case for constants, but not for arguments: Moving
these to the right makes it look like the "more complex" expression is
guaranteed to be on the left, but this is not actually the case in
practice. It fails as soon as you replace the argument with another
instruction.

The end result is that it looks like things correctly work in tests,
while they actually don't. We use the "thwart complexity-based
canonicalization" trick to handle this in tests, but it's often a
challenge for new contributors to get this right, and based on the
regressions this PR originally exposed, we clearly don't get this right
in many cases.

For this reason, I think that it's better to remove this complexity
canonicalization. It will make it much easier to write tests for
commuted cases and make sure that they are handled.
2024-08-21 12:02:54 +02:00
Nikita Popov
2511cdb078 [InstCombine] Adjust fixpoint error message (NFC)
Add a hint to use the no-verify-fixpoint option.
2024-08-20 14:30:09 +02:00
Volodymyr Vasylkun
abf69a167b
[InstCombine] Fold (x < y) ? -1 : zext(x != y) into u/scmp(x,y) (#101049)
This patch adds the aforementioned fold to InstCombine. This pattern is
produced after naive implementations of 3-way comparison in high-level
languages are transformed into LLVM IR and then optimized.

Proofs: https://alive2.llvm.org/ce/z/w4QLq_
2024-08-19 13:02:29 +01:00
Yingwei Zheng
48ae614701
[InstCombine] Avoid infinite loop when negating phi nodes (#104581)
Closes https://github.com/llvm/llvm-project/issues/96012

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2024-08-17 16:48:29 +08:00
Nikita Popov
dd9a99f2b6 [InstCombine] Preserve nsw in A + -B fold
This was already done for -B + A, but not for A + -B.

Proof: https://alive2.llvm.org/ce/z/F3V2yZ
2024-08-16 16:33:12 +02:00
Nikita Popov
5d28678277 [InstCombine] Fix incorrect zero ext in select of lshr/ashr fold
The -1 constant should be sign extended, not zero extended.

Split out from https://github.com/llvm/llvm-project/pull/80309.
2024-08-16 15:02:16 +02:00
Volodymyr Vasylkun
7e23a23d5e
[InstCombine] Fold an unsigned icmp of ucmp/scmp with a constant to an icmp of the original arguments (#104471)
Proofs: https://alive2.llvm.org/ce/z/9mv8HU
2024-08-16 13:38:13 +01:00
Volodymyr Vasylkun
d68d2172f9
[InstCombine] Fold ucmp/scmp(x, y) >> N to zext/sext(x < y) when N is one less than the width of the result of ucmp/scmp (#104009)
Proof: https://alive2.llvm.org/ce/z/4diUqN

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2024-08-15 18:08:23 +01:00
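The shift fold in #104009 works because the sign bit of a three-way comparison result is set exactly when `x < y`. A sketch with assumed widths (4-bit operands, 8-bit result) and hypothetical helper names:

```python
from itertools import product

OP_BITS = 4                      # width of the compared operands
RES_BITS = 8                     # width of the ucmp/scmp result
RES_MASK = (1 << RES_BITS) - 1


def cmp_result(a: int, b: int) -> int:
    """Three-way comparison result as a RES_BITS-wide bit pattern."""
    return ((a > b) - (a < b)) & RES_MASK


def lshr(v: int, n: int) -> int:
    return v >> n


def ashr(v: int, n: int) -> int:
    signed = v - (1 << RES_BITS) if v >> (RES_BITS - 1) else v
    return (signed >> n) & RES_MASK


def fold_holds() -> bool:
    n = RES_BITS - 1             # shift amount: one less than the result width
    for x, y in product(range(1 << OP_BITS), repeat=2):
        r = cmp_result(x, y)
        zext_lt = int(x < y)                 # zext(x < y)
        sext_lt = RES_MASK if x < y else 0   # sext(x < y)
        if lshr(r, n) != zext_lt or ashr(r, n) != sext_lt:
            return False
    return True
```

The shift only inspects the result, so the same argument applies whether the operands were compared as signed or unsigned.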
Snehasish Kumar
95daf1aedf
Allow optimization of __size_returning_new variants. (#102258)
https://github.com/llvm/llvm-project/pull/101564 added support to TLI to
detect variants of operator new which provide feedback on the actual
size of memory allocated (http://wg21.link/P0901R5). This patch extends
SimplifyLibCalls to handle hot cold hinting of these variants.
2024-08-15 08:06:41 -07:00
Volodymyr Vasylkun
8320b97ab9
[InstCombine] Fold an unsigned comparison of add nsw X, C with a constant into a signed comparison (#103480)
Given an unsigned integer comparison of `add nsw X, C1` with some
constant `C2` we can fold it into a signed comparison of `X` and `C2 -
C1` under the following conditions:
  * There's a `nsw` flag on the addition
  * `C2` is non-negative
  * `X + C1` is non-negative
  * `C2 - C1` is non-negative
2024-08-14 15:31:19 +01:00
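The four conditions listed for #103480 can be checked by enumeration on a narrow type. The sketch below takes the conditions exactly as stated in the commit message, uses an assumed 5-bit width, tests only the `ult` predicate, and does not model how LLVM proves that `X + C1` is non-negative.

```python
from itertools import product

BITS = 5
MOD = 1 << BITS
LO, HI = -(MOD >> 1), (MOD >> 1) - 1


def wrap(v: int) -> int:
    """Wrap an integer into the signed BITS-wide range."""
    return (v - LO) % MOD + LO


def as_unsigned(v: int) -> int:
    """Bit pattern of a signed BITS-wide value, read as unsigned."""
    return v % MOD


def fold_holds() -> bool:
    for x, c1, c2 in product(range(LO, HI + 1), repeat=3):
        total = x + c1
        if not LO <= total <= HI:
            continue                       # `add nsw` would be poison
        new_rhs = wrap(c2 - c1)
        if c2 < 0 or total < 0 or new_rhs < 0:
            continue                       # one of the listed conditions fails
        unsigned_cmp = as_unsigned(total) < as_unsigned(c2)   # icmp ult (add nsw X, C1), C2
        signed_cmp = x < new_rhs                              # icmp slt X, C2 - C1
        if unsigned_cmp != signed_cmp:
            return False
    return True
```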
Alexis Engelke
5b40a05d8f
[InstCombine] Don't look at ConstantData users
When looking at a PHI operand for combining, only look at instructions and
arguments. The loop later iterates over Arg's users, which is not
useful if Arg is a constant -- its users are not meaningful and might
be in different functions, which causes problems for the dominates()
query.

Pull Request: https://github.com/llvm/llvm-project/pull/103302
2024-08-13 20:44:45 +02:00
Yingwei Zheng
f364b2ee22
[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
2024-08-13 22:38:50 +08:00
Nikita Popov
adb4cfe0b6 [InstCombine] Use getAllOnesValue()
Split off from https://github.com/llvm/llvm-project/pull/80309.
2024-08-13 15:04:23 +02:00
Nikita Popov
4d97ad59f9 [InstCombine] Use APInt::getSplat()
Split off from https://github.com/llvm/llvm-project/pull/80309.
2024-08-13 15:04:23 +02:00
Bjorn Pettersson
145aff6d92 Clean up pointer casts etc after opaque pointers transition. NFC (#102631)
2024-08-12 13:28:53 +02:00
Simon Pilgrim
11ba72e651
[KnownBits] Add KnownBits::add and KnownBits::sub helper wrappers. (#99468)
2024-08-12 10:21:28 +01:00
Nikita Popov
cc14ecc281
[InstCombine] Don't change fn signature for calls to declarations (#102596)
transformConstExprCastCall() implements a number of highly dubious
transforms attempting to make a call function type line up with the
function type of the called function. Historically, the main value this
had was to avoid function type mismatches due to pointer type
differences, which is no longer relevant with opaque pointers.

This patch is a step towards reducing the scope of the transform, by
applying it only to definitions, not declarations. For declarations, the
declared signature might not match the actual function signature, e.g.
`void @fn()` is sometimes used as a placeholder for functions with
unknown signature. The implementation already bailed out in some cases
for declarations, but I think it would be safer to disable the transform
entirely.

For the test cases, I've updated some of them to use definitions
instead, so that the test coverage is preserved.
2024-08-12 10:12:00 +02:00