166 Commits

Author SHA1 Message Date
Nikita Popov
8a09adc22a
[InstCombine] Split GEPs with multiple variable indices (#137297)
Split GEPs that have more than one variable index into two. This is in
preparation for the ptradd migration, which will not support multi-index
GEPs.

This also enables the split off part to be CSEd and LICMed.
2025-07-30 12:54:06 +02:00
Luke Lau
9a5b0f302b
Reapply "[InstCombine] Match scalable splats in m_ImmConstant (#132522)" (#134262)
This reapplies #132522.

Previously casts of scalable m_ImmConstant splats weren't being folded
by ConstantFoldCastOperand, triggering the "Constant-fold of ImmConstant
should not fail" assertion.

There are no changes to the code in this PR, instead we just needed
#133207 to land first.

A test has been added for the assertion in
llvm/test/Transforms/InstSimplify/vec-icmp-of-cast.ll
@icmp_ult_sext_scalable_splat_is_true.

<hr/>

#118806 fixed an infinite loop in FoldShiftByConstant that could occur
when the shift amount was a ConstantExpr.

However this meant that FoldShiftByConstant no longer kicked in for
scalable vectors because scalable splats are represented by
ConstantExprs.

This fixes it by allowing scalable splats of non-ConstantExprs in
m_ImmConstant, which also fixes a few other test cases where scalable
splats were being missed.

But I'm also hoping that UseConstantIntForScalableSplat will eventually
remove the need for this.

I noticed this when trying to reverse a combine on RISC-V in #132245,
and saw that the resulting vector and scalar forms were different.
2025-04-03 18:03:16 +01:00
Luke Lau
b61e3874fa Revert "[InstCombine] Match scalable splats in m_ImmConstant (#132522)"
This reverts commit df9e5ae5b40c4d245d904a2565e46f5b7ab9c7c8.

This is triggering an assertion failure on llvm-test-suite with
-enable-vplan-native-path:
https://lab.llvm.org/buildbot/#/builders/198/builds/3365
2025-04-03 15:16:56 +01:00
Luke Lau
df9e5ae5b4
[InstCombine] Match scalable splats in m_ImmConstant (#132522)
#118806 fixed an infinite loop in FoldShiftByConstant that could occur
when the shift amount was a ConstantExpr.

However this meant that FoldShiftByConstant no longer kicked in for
scalable vectors because scalable splats are represented by
ConstantExprs.

This fixes it by allowing scalable splats of non-ConstantExprs in
m_ImmConstant, which also fixes a few other test cases where scalable
splats were being missed.

But I'm also hoping that UseConstantIntForScalableSplat will eventually
remove the need for this.

I noticed this when trying to reverse a combine on RISC-V in #132245,
and saw that the resulting vector and scalar forms were different.

---------

Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2025-04-02 21:21:52 +01:00
Nikita Popov
462cb3cd6c
[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144)
If the gep is nusw (usually via inbounds) and the offset is
non-negative, we can infer nuw.

Proof: https://alive2.llvm.org/ce/z/ihztLy
2024-12-05 14:36:40 +01:00
Paul Walker
56c091ea71
[LLVM][IR] Use splat syntax when printing ConstantExpr based splats. (#116856)
This brings the printing of scalable vector constant splats inline with
their fixed length counterparts.
2024-11-21 11:21:12 +00:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
Yingwei Zheng
095d49da76
[InstCombine] Set samesign when converting signed predicates into unsigned (#112642)
Alive2: https://alive2.llvm.org/ce/z/6cqdt-
2024-10-17 20:43:48 +08:00
Yingwei Zheng
62cd07fb67
[InstCombine] Canonicalize sub mask, X -> ~X when high bits are ignored (#110635)
Alive2: https://alive2.llvm.org/ce/z/NJgBPL

The motivating case of this patch is to emit `andn` on RISC-V with zbb
for expressions like `(sub 63, X) & 63`.
2024-10-02 12:48:06 +08:00
Nikita Popov
a105877646
[InstCombine] Remove some of the complexity-based canonicalization (#91185)
The idea behind this canonicalization is that it allows us to handle less
patterns, because we know that some will be canonicalized away. This is
indeed very useful to e.g. know that constants are always on the right.

However, this is only useful if the canonicalization is actually
reliable. This is the case for constants, but not for arguments: Moving
these to the right makes it look like the "more complex" expression is
guaranteed to be on the left, but this is not actually the case in
practice. It fails as soon as you replace the argument with another
instruction.

The end result is that it looks like things correctly work in tests,
while they actually don't. We use the "thwart complexity-based
canonicalization" trick to handle this in tests, but it's often a
challenge for new contributors to get this right, and based on the
regressions this PR originally exposed, we clearly don't get this right
in many cases.

For this reason, I think that it's better to remove this complexity
canonicalization. It will make it much easier to write tests for
commuted cases and make sure that they are handled.
2024-08-21 12:02:54 +02:00
Yingwei Zheng
0c03b4ce10
[InstCombine] Infer sub nuw from dominating conditions (#100164)
Alive2: https://alive2.llvm.org/ce/z/g3xxnM
2024-07-24 23:39:30 +08:00
Nikita Popov
2a1169088c [InstCombine] Preserve all gep flags when emitting offset 2024-06-19 12:36:36 +02:00
Nikita Popov
cbe1760f02
[InstCombine] Allow multi-use OptimizePointerDifference() with two GEPs (#90017)
Currently, the OptimizePointerDifference fold does not trigger when
working on the sub of two geps where one of the geps has multiple uses,
to avoid duplicating the offset arithmetic too much.

However, there are cases where performing it would still be
clearly profitable, e.g. test_sub_ptradd_multiuse.

This patch drops the one-use restriction using the same strategy we use
in GEP comparison folds: If there are multiple uses, we rewrite the GEP
to use the expanded offset arithmetic instead (effectively
canonicalizing it into ptradd representation).

Fixes https://github.com/llvm/llvm-project/issues/88231.
2024-04-26 10:53:03 +09:00
Nikita Popov
7a77b763af [InstCombine] Add additional multi-use test for sub of gep (NFC)
A case where still performing the fold is clearly profitable.
2024-04-25 14:24:55 +09:00
Nikita Popov
d9a5aa8e2d
[PatternMatch] Do not accept undef elements in m_AllOnes() and friends (#88217)
Change all the cstval_pred_ty based PatternMatch helpers (things like
m_AllOnes and m_Zero) to only allow poison elements inside vector
splats, not undef elements.

Historically, we used to represent non-demanded elements in vectors
using undef. Nowadays, we use poison instead. As such, I believe that
support for undef in vector splats is no longer useful.

At the same time, while poison splat elements are pretty much always
safe to ignore, this is not generally the case for undef elements. We
have existing miscompiles in our tests due to this (see the
masked-merge-*.ll tests changed here) and it's easy to miss such cases
in the future, now that we write tests using poison instead of undef
elements.

I think overall, keeping support for undef elements no longer makes
sense, and we should drop it. Once this is done consistently, I think we
may also consider allowing poison in m_APInt by default, as doing that
change is much less risky than doing the same with undef.

This change involves a substantial amount of test changes. For most
tests, I've just replaced undef with poison, as I don't think there is
value in retaining both. For some tests (where the distinction between
undef and poison is important), I've duplicated tests.
2024-04-17 18:22:05 +09:00
Noah Goldstein
17162b61c2 [KnownBits] Make nuw and nsw support in computeForAddSub optimal
Just some improvements that should hopefully strengthen analysis.

Closes #83580
2024-03-05 12:59:58 -06:00
Paul Walker
fd07b8f809 [LLVM][tests/Transforms/InstCombine] Convert instances of ConstantExpr based splats to use splat().
This is mostly NFC but some output does change due to consistently
inserting into poison rather than undef and using i64 as the index
type for inserts.
2024-02-27 13:37:23 +00:00
Nikita Popov
90ba33099c
[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)
This patch canonicalizes getelementptr instructions with constant
indices to use the `i8` source element type. This makes it easier for
optimizations to recognize that two GEPs are identical, because they
don't need to see past many different ways to express the same offset.

This is a first step towards
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699.
This is limited to constant GEPs only for now, as they have a clear
canonical form, while we're not yet sure how exactly to deal with
variable indices.

The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives
two representative examples of the kind of optimization improvement we
expect from this change. In the first test SimplifyCFG can now realize
that all switch branches are actually the same. In the second test it
can convert it into simple arithmetic. These are representative of
common optimization failures we see in Rust.

Fixes https://github.com/llvm/llvm-project/issues/69841.
2024-01-24 15:25:29 +01:00
Noah Goldstein
dbf6f30926 [InstCombine] Add folds for (X + Y) - (W + Z)
If `Y` and `Z` are constant then we can simplify to `(X - W) + (Y -
Z)`. If `Y == Z` we can fold to `X - W`.

Note these transform exist outside of InstCombine. The purpose of this
commit is primarily to make it so that folds can generate these
simplifiable patterns without having to worry about creating an inf
loop.
2023-11-20 17:59:26 -06:00
Noah Goldstein
c4d8d91679 [InstCombine] Add tests for folding (X + Y) - (W + Z); NFC 2023-11-20 17:59:26 -06:00
Yingwei Zheng
6da4ecdf92
[InstCombine] Infer shift flags with unknown shamt (#72535)
Alive2: https://alive2.llvm.org/ce/z/82Wr3q

Related patch:
2dd52b4527
2023-11-18 15:15:14 +08:00
Sanjay Patel
1ce06176ec [InstSimplify] reduce "mul nsw i1" to "false"
https://alive2.llvm.org/ce/z/XYGvYg

Follow-up for:
68c197f07eeae71
2023-01-18 13:51:40 -05:00
Sanjay Patel
1720ec6da0 [InstCombine] restrict no-wrap propagation for i1/i2 to avoid miscompiles
This transform was added with 68c197f07eeae71b9b7,
and the post-commit review noted the potential
for miscompiles at narrow bitwidths.

I'm not sure how to expose the i1 nuw bug because we
already simplify that, but other cases show that
there are missing transforms to add in follow-up
patches.
2023-01-18 10:32:12 -05:00
Sanjay Patel
eb42b67e8c [InstCombine][InstSimplify] add tests for i1/i2 mul with no-wrap; NFC
A bug was introduced with 68c197f07eeae71 as noted in the
post-commit review comments, and there are potentially
missed smaller transforms/simplifications because no-wrap
multiply with only 1 or 2 bits eliminates some potential
results.
2023-01-18 10:17:06 -05:00
Sanjay Patel
f81151edbf [InstCombine] add tests for diff-of-squares; NFC
Ideally, these negative tests would have been added
before 68c197f07eea to make sure the specific
operand matching is working as expected, but I did
not remember to include them.
2023-01-18 08:13:37 -05:00
Sanjay Patel
68c197f07e [InstCombine] factor difference-of-squares to reduce multiplication
(X * X) - (Y * Y) --> (X + Y) * (X - Y)
https://alive2.llvm.org/ce/z/BAuRCf

The no-wrap propagation could be relaxed in some cases,
but there does not seem to be an obvious rule for that.
2023-01-17 14:58:40 -05:00
Sanjay Patel
bf44905183 [InstCombine] add tests for difference-of-squares; NFC 2023-01-17 14:58:40 -05:00
Paul Walker
eae26b6640 [IRBuilder] Use canonical i64 type for insertelement index used by vector splats.
Instcombine prefers this canonical form (see getPreferredVectorIndex),
as does IRBuilder when passing the index as an integer so we may as
well use the prefered form from creation.

NOTE: All test changes are mechanical with nothing else expected
beyond a change of index type from i32 to i64.

Differential Revision: https://reviews.llvm.org/D140983
2023-01-11 14:08:06 +00:00
chenglin.bi
6703290361 [InstCombine] fold sub + and pattern with specific const value
`C1 - ((C3 - X) & C2) --> (X & C2) + (C1 - (C2 & C3))`
when:
    (C3 - ((C2 & C3) - 1)) is pow2 &&
    ((C2 + C3) & ((C2 & C3) - 1)) == ((C2 & C3) - 1) &&
    C2 is negative pow2 || (C3 - X) is nuw

https://alive2.llvm.org/ce/z/HXQJV-

Fix: #58523

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D136582
2022-11-05 12:58:45 +08:00
chenglin.bi
a3a9fffea1 [InstCombine] Precommit test for D136582; NFC 2022-11-01 04:46:19 +08:00
Sanjay Patel
d2d23795ca [InstCombine] improve demanded bits for Sub operand 0
This is copying the code that was added for 'add' with D130075.
(That patch removed a fallthrough in the cases, but we can
probably still share at least some code again as a follow-up
cleanup, but I didn't want to risk it here.)

The reasoning is similar to the carry propagation for 'add':
if we don't demand low bits of the subtraction and the
subtrahend (aka RHS or operand 1) is known zero in those low
bits, then there can't be any borrowing required from the
higher bits of operand 0, so the low bits don't matter.

Also, the no-wrap flags can be propagated (and I think that
should be true for add too).

Here's an attempt to prove that in Alive2:
https://alive2.llvm.org/ce/z/xqh7Pa
(can add nsw or nuw to src and tgt, and it should still pass)

Differential Revision: https://reviews.llvm.org/D136788
2022-10-27 09:41:57 -04:00
Sanjay Patel
7d3a37a4b4 [InstCombine] add tests for demanded bits of sub; NFC 2022-10-26 17:23:33 -04:00
Sanjay Patel
1bd856fbe5 [InstCombine] add tests for demanded bits of sub; NFC 2022-10-26 14:04:46 -04:00
Bjorn Pettersson
4ab40eca08 [test][InstCombine] Update some test cases to use opaque pointers
These tests cases were converted using the script at
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34

Differential Revision: https://reviews.llvm.org/D135094
2022-10-03 22:17:59 +02:00
Sanjay Patel
64d309131a [InstCombine] try multi-use demanded bits fold for 'sub'
This is similar to D133788 / 73919a87e9a6, but for sub
the transform is valid only for low zeros in operand 1.

https://alive2.llvm.org/ce/z/EmRsXC
2022-09-21 14:13:05 -04:00
Sanjay Patel
eecaa86314 [InstCombine] add tests for multi-use sub demanded bits: NFC 2022-09-21 14:13:05 -04:00
Sanjay Patel
0f32a5dea0 [InstCombine] don't canonicalize shl+sub to mul+add
This stops Negator from transforming:
`C1 - shl X, C2 --> mul X, (1<<C2) + C1`
...in the general case. There does not seem to be any analysis
benefit to using mul in IR, and there's definitely downside in
codegen (particularly when the multiply has to be expanded).

If `C1` is 0, then there's a stronger argument that the single
mul is a better canonicalization than negate-of-shl, but we may
want to remove that too.

This was noted as a potential conflict for D133667.

Differential Revision: https://reviews.llvm.org/D134310
2022-09-21 08:39:07 -04:00
Eric Gullufsen
eb1e2b3997 [InstCombine] Canonicalize "and, add", "or, add", "xor, add"
Canonicalize
```
((x + C1) & C2) --> ((x & C2) + C1)
((x + C1) ^ C2) --> ((x ^ C2) + C1)
((x + C1) | C2) --> ((x | C2) + C1)
```
for suitable constants `C1` and `C2`.

Alive2 proofs: [[ https://alive2.llvm.org/ce/z/BqMDVZ | add, or --> or, add ]]
[[ https://alive2.llvm.org/ce/z/BhAeCl | add, xor --> xor, add ]]
[[ https://alive2.llvm.org/ce/z/jYRHEt | add, and --> and, add ]]

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D131142
2022-08-26 17:23:29 -04:00
Philip Reames
c58791c286 Revert "[InstCombine] Canonicalize "and, add", "or, add", "xor, add""
This reverts commit d2f110c693c88d1bb7caee4f72ebb14766f85239.  test/Transforms/InstCombine/freeze.ll fails on ninja check-llvm on x86_64.
2022-08-26 11:18:31 -07:00
Eric Gullufsen
d2f110c693 [InstCombine] Canonicalize "and, add", "or, add", "xor, add"
Canonicalize
```
((x + C1) & C2) --> ((x & C2) + C1)
((x + C1) ^ C2) --> ((x ^ C2) + C1)
((x + C1) | C2) --> ((x | C2) + C1)
```
for suitable constants `C1` and `C2`.

Alive2 proofs: [[ https://alive2.llvm.org/ce/z/BqMDVZ | add, or --> or, add ]]
[[ https://alive2.llvm.org/ce/z/BhAeCl | add, xor --> xor, add ]]
[[ https://alive2.llvm.org/ce/z/jYRHEt | add, and --> and, add ]]

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D131142
2022-08-26 14:07:43 -04:00
Sanjay Patel
0cfc651032 [InstCombine] ease use constraint in tryFactorization()
The stronger one-use checks prevented transforms like this:
(x * y) + x --> x * (y + 1)
(x * y) - x --> x * (y - 1)

https://alive2.llvm.org/ce/z/eMhvQa

This is one of the IR transforms suggested in issue #57255.

This should be better in IR because it removes a use of a
variable operand (we already fold the case with a constant
multiply operand).
The backend should be able to re-distribute the multiply if
that's better for the target.

Differential Revision: https://reviews.llvm.org/D132412
2022-08-24 12:10:54 -04:00
Sanjay Patel
4376afe727 [InstCombine] add tests for mul+sub common factor; NFC 2022-08-24 11:31:18 -04:00
Jay Foad
f82c55fa08 [InstCombine] Change order of canonicalization of ADD and AND
Canonicalize ((x + C1) & C2) --> ((x & C2) + C1) for suitable constants
C1 and C2, instead of the other way round. This should allow more
constant ADDs to be matched as part of addressing modes for loads and
stores.

Differential Revision: https://reviews.llvm.org/D130080
2022-08-22 20:03:53 +01:00
Sanjay Patel
4022551a15 [ValueTracking] recognize sub X, (X -nuw Y) as not overflowing
This extends a similar pattern from D125500 and D127754.
If we know that operand 1 (RHS) of a subtract is itself a
non-overflowing subtract from operand 0 (LHS), then the
final/outer subtract is also non-overflowing:
https://alive2.llvm.org/ce/z/Bqan8v

InstCombine uses this analysis to trigger a narrowing
optimization, so that is what the first changed test shows.

The last test models a motivating case from issue #48013.
In that example, we determine 'nuw' on the first sub from
the urem, then we determine that the 2nd sub can be narrowed,
and that leads to eliminating both subtracts.

here are still several missing subtract narrowing optimizations
demonstrated in the tests above the diffs shown here - those
should be handled in InstCombine with another set of patches.
2022-06-19 15:12:19 -04:00
Sanjay Patel
bfb915ec8b [InstCombine] add tests for 'sub nuw' with zext; NFC 2022-06-19 15:12:19 -04:00
Sanjay Patel
8605b4d8c5 [ValueTracking] recognize sub X, (X -nsw Y) as not overflowing
This extends a similar pattern from D125500.
If we know that operand 1 (RHS) of a subtract is itself a
non-overflowing subtract from operand 0 (LHS), then the
final/outer subtract is also non-overflowing:
https://alive2.llvm.org/ce/z/Bqan8v

InstCombine uses this analysis to trigger a narrowing
optimization, so that is what the first changed test shows.

The last test models the motivating case from issue #48013.
In that example, we determine 'nsw' on the first sub from
the srem, then we determine that the 2nd sub can be narrowed,
and that leads to eliminating both subtracts.

This works for unsigned sub too, but I left that out to keep
the patch minimal. If this looks ok, I will follow up with
that change. There are also several missing subtract narrowing
optimizations demonstrated in the tests above the diffs shown
here - those should be handled in InstCombine with another set
of patches.

Differential Revision: https://reviews.llvm.org/D127754
2022-06-14 14:51:49 -04:00
Sanjay Patel
304cda0b16 [InstCombine] add tests for sub with extended operands; NFC 2022-06-14 11:24:06 -04:00
Sanjay Patel
ee6754c277 [ValueTracking] recognize sub X, (X % Y) as not overflowing
I fixed some poison-safety violations on related patterns in InstCombine
and noticed that we missed adding nsw/nuw on them, so this adds clauses
to the underlying analysis for that.

We need the undef input restriction to make this safe according to Alive2:
https://alive2.llvm.org/ce/z/48g9K8

Differential Revision: https://reviews.llvm.org/D125500
2022-05-13 09:59:41 -04:00
Sanjay Patel
0fefb56da7 [InstCombine] add tests for sub with rem operand; NFC 2022-05-13 09:59:40 -04:00
Nikita Popov
a266af7211 [InstCombine] Canonicalize SPF to min/max intrinsics
Now that integer min/max intrinsics have good support in both
InstCombine and other passes, start canonicalizing SPF min/max
to intrinsic min/max.

Once this sticks, we can stop matching SPF min/max in various
places, and can remove hacks we have for preventing infinite loops
and breaking of SPF canonicalization.

Differential Revision: https://reviews.llvm.org/D98152
2022-02-24 09:01:20 +01:00