496 Commits

Author SHA1 Message Date
Nikita Popov
503ef0a8e7 [InstCombine] Remove addrspacecast bitcast extraction fold (NFC)
This is not relevant for opaque pointers, and as such no longer
necessary.
2023-04-06 09:53:32 +02:00
Nikita Popov
032e5d403e [InstCombine] Remove convertBitCastToGEP() fold (NFC)
This only applies to typed pointers, so the fold is no longer
necessary.
2023-04-05 16:20:14 +02:00
Jie Fu
d1dd995196 [InstCombine] Remove unneeded internal function 'decomposeSimpleLinearExpr' in InstCombineCasts.cpp (NFC)
/data/llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:32:15: error: function 'decomposeSimpleLinearExpr' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
static Value *decomposeSimpleLinearExpr(Value *Val, unsigned &Scale,
              ^
1 error generated.
2023-04-05 22:18:39 +08:00
Nikita Popov
3cbdcd6ebf [InstCombine] Remove PromoteCastOfAllocation() fold (NFC)
This fold does not apply to opaque pointers, and as such is no
longer needed.
2023-04-05 15:55:43 +02:00
Nikita Popov
aff1863859 [IR] Remove ConstantExpr::getUMin() (NFC)
This is part of select constant expression removal. As there is
only a single place where this is used, just expand it to explicit
constant folding calls.

(Normally we'd just use the IRBuilder here, but this isn't possible
due to mergeUndefsWith use).
2023-03-06 13:16:27 +01:00
Nikita Popov
ee2f9d6dfb Reapply [InstCombine] Remove early constant fold
The reported compile-time regression has been addressed in
47f9109dff80a1abbe2705ee71dc0882b1d62274.

Additionally, this contains a change to immediately fold zext
with constant operand, even if it's used in a trunc. I'm not sure
if this is relevant for anything, but I noticed it as a behavioral
discrepancy when investigating this issue.

-----

InstCombine currently performs a constant folding attempt as part
of the main InstCombine loop, before visiting the instruction.
However, each visit method will also attempt to simplify the
instruction, which will in turn constant fold it. (Additionally,
we also constant fold instructions before the main InstCombine loop
and use a constant folding IR builder, so this is doubly redundant.)

There is one place where InstCombine visit methods currently don't
call into simplification, and that's casts. To be conservative,
I've added an explicit constant folding call there (though it has
no impact on tests).

This makes for a mild compile-time improvement and in particular
mitigates the compile-time regression from enabling load
simplification in be88b5814d9efce131dbc0c8e288907e2e6c89be.

Differential Revision: https://reviews.llvm.org/D144369
2023-02-27 12:23:06 +01:00
Vitaly Buka
779679284e Revert "[InstCombine] Remove early constant fold"
This increases compile time with ubsan on ARM from 3 to 14 minutes for a single file.
I uploaded a reproducer to D144369.

Also we have random timeouts on internal x86_64 builds.
Both bisected to this one.

This reverts commit 45a0b812fa13ec255cae91f974540a4d805a8d79.
2023-02-24 10:21:32 -08:00
Nikita Popov
8347ca7dc8 [PatternMatch] Don't require DataLayout for m_VScale()
The m_VScale() matcher is unusual in that it requires a DataLayout.
It is currently used to determine the size of the GEP type. However,
I believe it is sufficient to check for the canonical
<vscale x 1 x i8> form here -- I don't think there's a need to
recognize exotic variations like <vscale x 1 x i4> as a vscale
constant representation as well.

Differential Revision: https://reviews.llvm.org/D144566
2023-02-23 15:30:29 +01:00
Nikita Popov
45a0b812fa [InstCombine] Remove early constant fold
InstCombine currently performs a constant folding attempt as part
of the main InstCombine loop, before visiting the instruction.
However, each visit method will also attempt to simplify the
instruction, which will in turn constant fold it. (Additionally,
we also constant fold instructions before the main InstCombine loop
and use a constant folding IR builder, so this is doubly redundant.)

There is one place where InstCombine visit methods currently don't
call into simplification, and that's casts. To be conservative,
I've added an explicit constant folding call there (though it has
no impact on tests).

This makes for a mild compile-time improvement and in particular
mitigates the compile-time regression from enabling load
simplification in be88b5814d9efce131dbc0c8e288907e2e6c89be.

Differential Revision: https://reviews.llvm.org/D144369
2023-02-20 16:48:39 +01:00
Kazu Hirata
f8f3db2756 Use APInt::count{l,r}_{zero,one} (NFC) 2023-02-19 22:04:47 -08:00
Sander de Smalen
da4a5a46b3 [InstCombine] Promote expression tree with @llvm.vscale when zero-extending result.
The LoopVectorizer emits the (scaled) element count as i32, which for
scalable VFs results in calls to @llvm.vscale.i32(). This value is scaled
and further zero-extended to i64.

The zero-extend can be folded away by executing the whole expression in i64
type using @llvm.vscale.i64(). Any logical `and` that would be needed to mask
the result can be further folded away by KnownBits analysis when
vscale_range is set.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D143016
2023-02-02 11:18:16 +00:00
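The widening described in the commit above can be modelled with plain integers. This is only an illustrative sketch, not the InstCombine implementation: the helper names are hypothetical, and it assumes non-negative `vscale` and `factor` values. It shows that zero-extending a wrapping i32 product equals the i64 product masked to 32 bits, and that the mask does nothing once the known upper bound on `vscale` rules out overflow.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def zext_of_i32_mul(vscale: int, factor: int) -> int:
    """zext (mul i32 vscale, factor) to i64: the product wraps at 32 bits."""
    return (vscale * factor) & MASK32


def i64_mul_with_mask(vscale: int, factor: int) -> int:
    """and (mul i64 vscale, factor), 0xFFFFFFFF: compute in i64, then mask."""
    return ((vscale * factor) & MASK64) & MASK32


def mask_is_redundant(max_vscale: int, factor: int) -> bool:
    """With a known upper bound on vscale, the mask is a no-op if the
    largest possible product still fits in 32 bits."""
    return max_vscale * factor <= MASK32
```

For example, with an assumed upper bound of 16 on `vscale` and a factor of 4, `mask_is_redundant(16, 4)` holds, so the `and` could be dropped.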
Samuel Parker
038f7debfd [DAGCombine] fp_to_sint isSaturatingMinMax
Recommitting after fixing scalable vector crash.

Check for single smax pattern against zero when converting from a
small enough float.

Differential Revision: https://reviews.llvm.org/D142481
2023-01-30 12:25:25 +00:00
Samuel Parker
e60b91df13 Revert "[DAGCombine] fp_to_sint isSaturatingMinMax"
This reverts commit 85395af27241ab9c8d5763b8afcaa07f1bab26d5.

This is causing trouble with scalable vectors.
2023-01-27 15:42:12 +00:00
Samuel Parker
85395af272 [DAGCombine] fp_to_sint isSaturatingMinMax
Check for single smax pattern against zero when converting from a
small enough float.

Differential Revision: https://reviews.llvm.org/D142481
2023-01-26 12:37:43 +00:00
Sanjay Patel
e44a305690 [InstCombine] invert canonicalization of sext (x > -1) --> not (ashr x)
https://alive2.llvm.org/ce/z/2iC4oB

This is similar to changes made for zext + lshr:
21d3871b7c90
6c39a3aae1dc

The existing fold did not account for extra uses, so we
see some instruction count reductions in the test diffs.

This is intended to improve analysis (icmp likely has more
transforms than any other opcode), make other transforms
more symmetric with zext/lshr, and it can be inverted
in codegen if profitable.

As with the earlier changes, there is potential to uncover
infinite combine loops, but I have not found any yet.
2023-01-24 16:44:15 -05:00
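The equivalence behind the commit above can be checked exhaustively for a small width. The sketch below is a plain-integer model, not LLVM code; the function names are hypothetical and the 8-bit default width is an arbitrary choice.

```python
def sext_of_icmp_sgt_minus_one(x: int) -> int:
    """sext (icmp sgt x, -1): all-ones (-1) when x is non-negative, else 0."""
    return -1 if x > -1 else 0


def not_of_ashr(x: int, bits: int = 8) -> int:
    """not (ashr x, bits-1): Python's >> on ints is an arithmetic shift."""
    return ~(x >> (bits - 1))


def forms_agree(bits: int = 8) -> bool:
    """Compare both forms over every signed value of the given width."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    return all(
        sext_of_icmp_sgt_minus_one(x) == not_of_ashr(x, bits)
        for x in range(lo, hi)
    )
```

Both forms compute the same value, so the choice between them is purely one of canonical form; the commit prefers the icmp form for analysis.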
Sanjay Patel
b977f8df49 [InstCombine] reduce code duplication; NFC 2023-01-24 14:18:40 -05:00
Sanjay Patel
c09c90b90b [InstCombine] rename variables for readability; NFC
There's no reason to use "CI" (cast instruction) when
we know that the value is a more specific (exact) type
of instruction (although we might want to common-ize some
of this code to eliminate duplication or logic diffs).

It's also visually difficult to distinguish between "CI",
"ICI", and "IC" acronyms (and those could change meaning
depending on context).

This was partially changed in earlier commits, so this
makes this pair of functions consistent.
2023-01-24 14:18:40 -05:00
Samuel Parker
b1b7fb6f20 [InstCombine] trunc (fptoui|fptosi)
Attempt to fold the trunc into the fp-to-int conversion.

Differential Revision: https://reviews.llvm.org/D142093
2023-01-24 09:16:25 +00:00
Guillaume Chatelet
48f5d77eee [NFC] Use TypeSize::getKnownMinValue() instead of TypeSize::getKnownMinSize()
This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.
2023-01-11 16:36:39 +00:00
Sanjay Patel
2aa471bd92 [InstCombine] remove zext-of-icmp fold that may conflict with other folds
This bit-hack transform would cause the new test to infinite loop
after 21d3871b7c90f85b3ae.

The deleted transform has existed for a very long time,
but the profitable parts appear to be handled by other
folds now. This fold could replace 2 instructions with
4 instructions, so it was always in danger of going
overboard.

No tests regress by removing the whole thing.
2023-01-10 10:23:21 -05:00
Sanjay Patel
f400daae90 [InstCombine] limit zext-of-icmp folds to bit-hacks
In the changed tests, we avoid creating extra instructions,
and there are no obvious regressions in IR tests at least.

Codegen should be able to create the shift+mask form if that
is profitable.

This is a more general fix for issue #59897 than 0eedc9e56712.
2023-01-09 16:29:24 -05:00
Sanjay Patel
a4f3b23671 [InstCombine] simplify code and fix formatting; NFC 2023-01-09 16:27:44 -05:00
Sanjay Patel
21d3871b7c [InstCombine] fold not-shift of signbit to icmp+zext, part 2
Follow-up to:
6c39a3aae1dc

That converted a pattern with ashr directly to icmp+zext, and
this updates the pattern that we used to convert to.

This canonicalizes to icmp for better analysis in the minimum case
and shortens patterns where the source type is not the same as dest type:
https://alive2.llvm.org/ce/z/tpXJ64
https://alive2.llvm.org/ce/z/dQ405O

This requires an adjustment to an icmp transform to avoid infinite looping.
2023-01-08 12:04:09 -05:00
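The commit message above does not spell out the shift pattern. Assuming it is `lshr (not x), bitwidth-1`, the sketch below checks that this matches `zext (icmp sgt x, -1)` for every 8-bit pattern. The helper names are hypothetical and values are treated as unsigned bit patterns.

```python
def lshr_of_not(x_bits: int, bits: int = 8) -> int:
    """lshr (not x), bits-1: flip all bits, then move the sign bit to bit 0."""
    mask = (1 << bits) - 1
    return ((x_bits ^ mask) & mask) >> (bits - 1)


def zext_of_icmp_sgt_minus_one(x_bits: int, bits: int = 8) -> int:
    """zext (icmp sgt x, -1): 1 when the sign bit of x is clear, else 0."""
    sign_set = (x_bits >> (bits - 1)) & 1
    return 0 if sign_set else 1


def shift_and_icmp_agree(bits: int = 8) -> bool:
    """Exhaustively compare both forms over all bit patterns of the width."""
    return all(
        lshr_of_not(x, bits) == zext_of_icmp_sgt_minus_one(x, bits)
        for x in range(1 << bits)
    )
```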
Krzysztof Parzyszek
26424c96c0 Attributes: convert Optional to std::optional 2022-12-02 08:15:45 -06:00
Matthias Gehre
5a1d92fa3e [InstCombine] Update debug intrinsics when rewriting allocas 2022-11-25 08:20:54 +01:00
OCHyams
fcd5098a03 [Assignment Tracking][14/*] Account for assignment tracking in instcombine
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Most of the updates here are just to ensure DIAssignID attachments are
maintained and propagated correctly.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133307
2022-11-18 09:25:33 +00:00
Sanjay Patel
1c6ebe29d3 [InstCombine] reduce multi-use casts+masks
As noted in the code comment, we could generalize this:
https://alive2.llvm.org/ce/z/N5m-eZ

It saves an instruction even without a constant operand,
but the 'and' is wider. We can do that as another step
if it doesn't harm anything.

I noticed that this missing pattern with a constant operand
inhibited other transforms in a recent bug report, so this
is enough to solve that case.
2022-11-06 09:07:17 -05:00
Nikita Popov
8df376db72 [InstCombine] Remove buggy zext of icmp eq with pow2 fold (PR57899)
For the case where the constant is a power of two rather than zero,
the fold is incorrect, because it fails to check that the bit set
in the LHS matches the bit in the RHS.

Rather than fixing this, remove the power of two handling entirely,
as a different fold will already canonicalize such comparisons to
use a zero constant.

Fixes https://github.com/llvm/llvm-project/issues/57899.
2022-09-22 16:37:10 +02:00
Sanjay Patel
8a19842c0e [InstCombine] delete redundant folds; NFC
InstSimplify does this via isKnownNonEqual(), so it's already
using knownbits on these patterns and trying other folds.
2022-08-30 14:21:29 -04:00
Chenbing Zheng
adf4519c0e [InstCombine] recognize bitreverse disguised as shufflevector
This patch completes the TODO left in D66965 and handles the
related pattern for bitreverse.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132431
2022-08-25 10:41:47 +08:00
Chenbing Zheng
14fae4d136 [InstCombine] Add undef elements support for shrinkFPConstantVector
Reviewed By: RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D132343
2022-08-25 10:38:48 +08:00
Jakub Kuderski
6fa87ec10f [ADT] Deprecate is_splat and replace all uses with all_equal
See the discussion thread for more details:
https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D132335
2022-08-23 11:36:27 -04:00
zhongyunde
b2b4c8721d [InstCombine] Make use of low zero bits to determine exact int->fp cast
According to the comment at https://reviews.llvm.org/D127854#inline-1226805,
we could also make use of these low zero bits: https://alive2.llvm.org/ce/z/GYxTRu

Reviewed By: spatel, nikic, xbolva00

Differential Revision: https://reviews.llvm.org/D128895
2022-07-05 09:15:12 +08:00
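The idea in the commit above can be illustrated on concrete values: an integer converts to floating point exactly when the span from its highest to its lowest set bit fits in the mantissa, so trailing zero bits do not count against it. This is a sketch under assumptions, not the InstCombine logic, which reasons about known bits rather than concrete values. The helper names are hypothetical, and the 53-bit default is the mantissa width of Python's `float` (IEEE double).

```python
def significant_bits(x: int) -> int:
    """Bits from the highest to the lowest set bit of |x|, inclusive."""
    x = abs(x)
    if x == 0:
        return 0
    trailing_zeros = (x & -x).bit_length() - 1
    return x.bit_length() - trailing_zeros


def converts_exactly(x: int, mantissa_bits: int = 53) -> bool:
    """True if x fits in the mantissa once low zero bits are discounted."""
    return significant_bits(x) <= mantissa_bits
```

For instance, `3 << 60` is 62 bits wide but has only 2 significant bits, so it round-trips through `float` unchanged, while `2**53 + 1` needs 54 bits and does not.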
zhongyunde
404479b4b0 [InstCombine] Use known bits to determine exact int->fp cast
Reviewed By: spatel, nikic

Differential Revision: https://reviews.llvm.org/D127854
2022-06-30 09:45:11 +08:00
Kazu Hirata
7a47ee51a1 [llvm] Don't use Optional::getValue (NFC) 2022-06-20 22:45:45 -07:00
Wael Yehia
0952cf5bbb [InstCombine] decomposeSimpleLinearExpr should bail out on negative operands.
InstCombine tries to rewrite

  %prod = mul nsw i64 %X,   Scale
  %acc = add nsw i64 %prod,   Offset
  %0 = alloca i8, i64 %acc, align 4
  %1 = bitcast i8* %0 to i32*
  Use ( %1 )

into

  %prod = mul nsw i64 %X,   Scale/4
  %acc = add nsw i64 %prod,   Offset/4
  %0 = alloca i32, i64 %acc, align 4
  Use (%0)

But it assumes Scale is unsigned, and performs an unsigned division.
So we should bail out if Scale cannot be safely interpreted as an unsigned value.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126546
2022-06-08 00:57:25 +00:00
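The hazard described in the commit above can be reproduced with plain integers. This is an illustrative model of a 64-bit unsigned division, not the actual decomposeSimpleLinearExpr code; the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def to_signed64(bits: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement signed value."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


def udiv_scale(scale: int, divisor: int) -> int:
    """Divide the 64-bit pattern of `scale` as an unsigned number, then
    read the quotient back as a signed value."""
    return to_signed64((scale & MASK64) // divisor)
```

With a positive scale the result is as expected: `udiv_scale(8, 4)` is 2. With a negative scale, `udiv_scale(-8, 4)` yields `0x3FFFFFFFFFFFFFFE` rather than -2, which is why the fold has to bail out.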
Chenbing Zheng
ef256ed58e [InstCombine] bitcast (extractelement <1 x elt>, dest) -> bitcast(<1 x elt>, dest)
Only handle the case where the dest type is a vector, to avoid the inverse transform in visitBitCast.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D125951
2022-05-30 10:16:32 +08:00
Chenbing Zheng
41aab93afc [InstCombine] bitcast(logic(bitcast(X), bitcast(Y))) -> bitcast'(logic(bitcast'(X), Y))
This patch lifts the foldBitCastBitwiseLogic limitation that the
destination must have an integer element type, and eliminates one
bitcast by doing the logic op in the type of the input that has an
integer element type.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D126184
2022-05-26 10:23:44 +08:00
Chenbing Zheng
269e3f7369 [InstCombine] [NFC] Move transforms for truncated shifts into narrowBinOp
Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D126056
2022-05-25 10:21:39 +08:00
Chenbing Zheng
cf348f6a2c [InstCombine] [NFC] Use a pattern matcher for ExtractElementInst
Reviewed By: RKSimon, rampitec

Differential Revision: https://reviews.llvm.org/D125857
2022-05-20 10:31:40 +08:00
Sanjay Patel
f31d39c42c [InstCombine] remove cast-of-signbit to shift transform
The transform was wrong in 3 ways:

1. It created an extra instruction when the source and dest types don't match.
2. It did not account for an extra use of the icmp, so could create 2 extra insts.
3. It favored bit hacks over icmp (icmp generally has better analysis).

This fixes #54692 (modeled by the PhaseOrdering tests).

This is a minimal step to fix the bug, but we should likely invert
this and the sibling transform for the "is negative" pattern too.

The backend should be able to invert this back to a shift if that
leads to better codegen.

This is a reduced try of 3794cc0e9964 - that was reverted because
it could cause infinite loops by conflicting with the related
transforms in this block that create shifts.
2022-05-17 11:10:28 -04:00
Nikita Popov
a694546f7c [KnownBits] Add operator==
Checking whether two KnownBits are the same is somewhat common,
mainly in test code.

I don't think there is a lot of room for confusion with "determine
what the KnownBits for an icmp eq would be", as that has a
different result type (this is what the eq() method implements,
which returns Optional<bool>).

Differential Revision: https://reviews.llvm.org/D125692
2022-05-17 09:38:13 +02:00
Sanjay Patel
07d549bce9 Revert "[InstCombine] invert canonicalization for cast of signbit test"
This reverts commit 3794cc0e996481e10307b67c8436aa44e0d65d22.
This change is suspected of causing bots to hang at stage 2
compiles, so reverting to confirm and investigate.
2022-05-16 17:47:02 -04:00
Sanjay Patel
3794cc0e99 [InstCombine] invert canonicalization for cast of signbit test
The existing transform was wrong in 3 ways:
1. It created an extra instruction when the source and dest types don't match.
2. It did not account for an extra use of the icmp, so could create 2 extra insts.
3. It favored bit hacks over icmp (icmp generally has better analysis).

This fixes #54692 (modeled by the PhaseOrdering tests).

This is a minimal step to fix the bug, but we should likely invert
the sibling transform for the "is negative" pattern too.

The backend should be able to invert this back to a shift if that
leads to better codegen.
2022-05-16 12:55:52 -04:00
Sanjay Patel
8650f05c97 [InstCombine] fix miscompile when casting int->FP->int
As shown in https://github.com/llvm/llvm-project/issues/55150 -
the existing fold may be wrong when converting to a signed value.
This is a quick fix to avoid the miscompile.

I added tests/comments for all of the signed/unsigned combinations
at either side of the boundary width, and tried to confirm with Alive2:
https://alive2.llvm.org/ce/z/3p9DSu

There are already some TODO items in the test file that suggest
possible refinements, so the regression with ui->FP->si is probably ok.
It seems unlikely that we'd see these kinds of edge cases with
non-byte-width integer types in real code. The potential miscompile
went undetected for several years.

This and 747c6a0c734e fix #55150.

Differential Revision: https://reviews.llvm.org/D124692
2022-05-07 08:46:25 -04:00
Chenbing Zheng
8eaa1ef0d8 [InstCombine] add casts from splat-a-bit pattern if necessary
Splatting a bit of constant-index across a value:
sext (ashr (trunc iN X to iM), M-1) to iN --> ashr (shl X, N-M), N-1
If the dest type is different, use a cast (adjust use check).

https://alive2.llvm.org/ce/z/acAan3

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D124590
2022-05-07 15:34:57 +08:00
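The splat-a-bit identity in the commit above can be checked exhaustively for small widths. The sketch below models it with N = 8 and M = 4, both arbitrary choices; the function names are hypothetical, and results are returned as unsigned N-bit patterns.

```python
def to_signed(value: int, bits: int) -> int:
    """Wrap to `bits` bits and reinterpret as a two's-complement value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def splat_via_trunc(x: int, n: int = 8, m: int = 4) -> int:
    """sext (ashr (trunc iN X to iM), M-1) to iN."""
    narrow = to_signed(x, m)         # trunc iN X to iM
    shifted = narrow >> (m - 1)      # ashr by M-1 (arithmetic shift)
    return shifted & ((1 << n) - 1)  # sext to iN, as an N-bit pattern


def splat_via_shifts(x: int, n: int = 8, m: int = 4) -> int:
    """ashr (shl X, N-M), N-1."""
    shifted_left = to_signed(x << (n - m), n)          # shl wraps to N bits
    return (shifted_left >> (n - 1)) & ((1 << n) - 1)  # ashr by N-1
```

Both forms copy bit M-1 of X into every bit of the result: `0b1000` maps to `0xFF` and `0b0111` maps to `0`.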
Sanjay Patel
6631907ad2 [InstCombine] use isKnownNonNegative to reduce code duplication; NFC
We may be able to make the ValueTracking wrapper smarter
in the future (for example, analyze a simple recurrence),
so this will automatically benefit if that happens.
2022-04-25 17:13:29 -04:00
Craig Topper
e3f6c2d288 [InstCombine] Don't look through bitcast from vector in collectInsertionElements.
We're making a recursive call here and everything in the function
assumes we're looking at scalars. This would be violated if we
looked through a bitcast from vectors.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D124015
2022-04-20 09:15:32 -07:00
serge-sans-paille
59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Benjamin Kramer
85243124cf Tweak some uses of std::iota to skip initializing the underlying storage. NFCI. 2022-02-04 17:00:50 +01:00