361 Commits

Author SHA1 Message Date
Alexey Bataev
ebcb5d59fc Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix
buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.
2023-09-29 15:03:46 -07:00
Alexey Bataev
9f5960e004 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() does
not always match the sizes of the permuted vector(s). This allows better
cost estimation in SLP and fixes uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-09-29 13:16:03 -07:00
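
The point about Mask.size() can be illustrated with a small model. This is only a sketch in Python, not the LLVM implementation: the helper names `is_single_source_mask` and `is_identity_mask` and the `POISON = -1` encoding are assumptions chosen for illustration. It shows how the same mask is classified differently depending on the assumed number of source elements.

```python
POISON = -1  # assumed stand-in for a poison/undefined mask element


def is_single_source_mask(mask, num_src_elts):
    """True if the mask reads lanes from at most one of the two inputs."""
    uses_first = any(0 <= m < num_src_elts for m in mask)
    uses_second = any(m >= num_src_elts for m in mask)
    return not (uses_first and uses_second)


def is_identity_mask(mask, num_src_elts):
    """True if the shuffle returns one input unchanged (poison lanes allowed)."""
    if len(mask) != num_src_elts:
        return False  # a length-changing shuffle cannot be an identity
    if not is_single_source_mask(mask, num_src_elts):
        return False
    return all(m in (POISON, i, i + num_src_elts) for i, m in enumerate(mask))


# Mask <0, 1> is an identity only if the sources really have 2 elements.
print(is_identity_mask([0, 1], 2))       # -> True
print(is_identity_mask([0, 1], 4))       # -> False (extracts a subvector)

# Mask <0, 2> mixes both inputs for 2-element sources, but reads a single
# input when the sources have 4 elements.
print(is_single_source_mask([0, 2], 2))  # -> False
print(is_single_source_mask([0, 2], 4))  # -> True
```

Guessing the source width from the mask length gives the first answer in each pair, which is wrong whenever the shuffle changes the vector length.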
Alexey Bataev
3204f88a8b Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the
crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.
2023-09-28 11:57:32 -07:00
Alexey Bataev
c88c281cf1 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() does
not always match the sizes of the permuted vector(s). This allows better
cost estimation in SLP and fixes uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-09-28 11:03:21 -07:00
Jeremy Morse
e54277fa10 [NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder
This patch adds a two-argument SetInsertPoint method to IRBuilder that
takes a block/iterator instead of an instruction, and updates many call
sites to use it. The motivating reason for doing this is given here [0],
we'd like to pass around more information about the position of debug-info
in the iterator object. That necessitates passing iterators around most of
the time.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152468
2023-09-11 20:01:19 +01:00
Jeremy Morse
d529943a27 [NFC][RemoveDIs] Prefer iterators over inst-pointers in InstCombine
As per my proposal for how to eliminate debug intrinsics [0], for various
places in InstCombine prefer to insert using an instruction iterator rather
than an instruction pointer. This is so that we can eventually pass more
information in the iterator class. These call-sites where I've changed the
spelling are those that are necessary to build a stage2clang to produce an
identical binary in the coming no-debug-intrinsics mode.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152543
2023-09-11 15:04:51 +01:00
Nuno Lopes
eb1617a582 [InstCombineVectorOps] Use poison instead of undef as placeholder [NFC]
It's used to create a vector where only 1 element is used
While at it, change OOB extractelement to yield poison per LangRef
2023-07-29 15:28:13 +01:00
Nuno Lopes
9324e1be07 [InstCombineVectorOps] Use poison instead of undef as placeholder [NFC]
Undef was being used to populate unused vector lanes.
While at it, switch extractelement to use poison as the OOB value (per LangRef)
2023-07-20 08:14:55 +01:00
Nikita Popov
b7e38ff223 [InstCombine] Add old extract to worklist for DCE
To make sure it is removed in the same InstCombine iteration.
2023-07-05 17:00:40 +02:00
Nikita Popov
ab94c1bad3 [InstCombine] Add created extracts to worklist
Use InstCombine's insertion helper for the created extracts, so
they become part of the worklist and will be revisited.
2023-06-23 16:11:47 +02:00
Nikita Popov
57a8ea8553 [InstCombine] Avoid infinite loop in insert/extract combine
Fix the infinite loop reported on https://reviews.llvm.org/D151807#4420467.

collectShuffleElements() will widen vectors and replace extracts
via replaceExtractElements(), to allow the next call of
collectShuffleElements() to fold. However, it's possible for another
fold to run first, and break the expected sequence again. To ensure
this does not happen, directly rerun the collectShuffleElements()
fold if we have adjusted extracts.
2023-06-14 14:58:49 +02:00
Nikita Popov
7c878f4504 [InstCombine] Directly iterate over users (NFC)
After 3a223f1eafe331508d171b519df8a4984791ab48, it's no longer
necessary to put the users into a vector. We can directly iterate
them instead.
2023-05-24 10:39:32 +02:00
Nikita Popov
3a223f1eaf [InstCombine] Fix crash due to early extractvalue removal
Fixes the issue reported at 4b8320868c (commitcomment-114671248).

The extractvalue instructions may still be used by the calling code
in some cases. Rather than trying to figure out which extracts are
safe to remove and which aren't, add them to the worklist so they
will get DCEd by the main loop.
2023-05-24 09:55:52 +02:00
Nikita Popov
4b8320868c [InstCombine] Remove dead extractelements (NFCI)
Directly remove these dead extractelement instructions, rather than
leaving them for the next InstCombine iteration to clean up.

Should be mostly NFC, apart from worklist order differences.
2023-05-23 15:40:48 +02:00
Nikita Popov
86a5a75049 [InstCombine] Use canonical index type in more places
Also fixes an issue with 80cd3e4e2059f8deda735c32bf7dfa5b9d75c4e0
pointed out by Roman Divacky: For one of the occurrences I only
changed the name of the type variable, without actually changing
the type...
2023-05-17 18:06:08 +02:00
Nikita Popov
605f0a46dc [InstCombine] Use IRBuilder in evaluateInDifferentElementOrder()
This ensures that the new instructions get reprocessed in the same
iteration.

This should be largely NFC, apart from worklist order effects and
naming changes, as seen in the test diff.
2023-05-17 15:07:36 +02:00
Nikita Popov
3b8f442289 [InstCombine] Fix worklist management for multi-use demanded element fold
Add the old instruction to the worklist, so it can be DCEd in the
same iteration.
2023-05-17 14:54:27 +02:00
Nikita Popov
80cd3e4e20 [InstCombine] Use canonical type in insertelement (NFC)
We can directly create these with the correct type.
2023-05-17 14:37:32 +02:00
ManuelJBrito
e335e8a432 [InstCombine] Update instcombine for vectorOps to use new shufflevector semantics
This patch updates the transformations in InstCombineVectorOps to use the new
shufflevector semantics that say that undefined values in the mask yield poison.

To prevent miscompilations we have to match with m_Poison instead of m_Undef.
Otherwise, we might introduce poison where there was previously undef.

Differential Revision: https://reviews.llvm.org/D150039
2023-05-17 07:56:45 +01:00
Nikita Popov
60766678c7 [InstCombine] Use canonical index type (NFC)
Directly use the canonical index type, rather than canonicalizing
it afterwards.
2023-05-05 12:19:18 +02:00
ManuelJBrito
d22edb9794 [IR][NFC] Change UndefMaskElem to PoisonMaskElem
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.

Differential Revision: https://reviews.llvm.org/D149256
2023-04-27 18:01:54 +01:00
Kazu Hirata
c8f9555c4d [Transforms] Use *{Set,Map}::contains (NFC)
2023-03-14 00:24:30 -07:00
Sanjay Patel
40d772c642 [InstCombine] add one-use check to prevent creating an instruction in shuffle-of-binop
This fold was added with https://reviews.llvm.org/D135876 ,
but we missed the one-use check.

This might be the root cause for issue #60632.
2023-02-22 19:20:32 -05:00
Nikita Popov
c9fad20f6a [InstCombine] Call simplifyInsertValueInst()
InstCombine is supposed to be a superset of InstSimplify, but we
were not attempting simplification of insertvalue instructions.
As the test change illustrates, we failed to remove some aggregate
construction patterns because of that.
2023-02-16 09:51:40 +01:00
Sanjay Patel
a8f13dbdeb [InstCombine] fold shuffle of fabs
shuffle (fabs X), Mask --> fabs (shuffle X, Mask)
shuffle (fabs X), (fabs Y), Mask --> fabs (shuf X, Y, Mask)

https://alive2.llvm.org/ce/z/JH2nkf

This generalizes the existing fneg transforms to also work with fabs.

A likely follow-up would generalize this further to move any unary
intrinsic op.
2023-02-03 14:23:17 -05:00
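
The fabs fold above relies on fabs acting on each lane independently, so it commutes with any lane permutation. A minimal Python model of that equivalence follows; the helpers `shuffle` and `vfabs` are illustrative names, and the sketch assumes a mask with no poison elements.

```python
def shuffle(x, y, mask):
    """Two-input shuffle model: index m < len(x) reads x, otherwise y."""
    n = len(x)
    return [x[m] if m < n else y[m - n] for m in mask]


def vfabs(vec):
    """Lane-wise absolute value."""
    return [abs(e) for e in vec]


x = [-1.5, 2.0, -0.0, 4.25]
y = [8.0, -16.0, 32.0, -64.0]
mask = [0, 5, 2, 7]

before = shuffle(vfabs(x), vfabs(y), mask)  # shuffle (fabs X), (fabs Y), Mask
after = vfabs(shuffle(x, y, mask))          # fabs (shuf X, Y, Mask)
print(before == after)                      # -> True
```

The rewritten form needs one fabs instead of two, which is the benefit in the two-operand case.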
Guillaume Chatelet
8fd5558b29 [NFC] Use TypeSize::getFixedValue() instead of TypeSize::getFixedSize()
This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.
2023-01-11 16:49:38 +00:00
Sanjay Patel
d5f8878a6e [InstCombine] canonicalize insertelement order based on index
This puts lower insert indexes before higher. This is independent
of endian, so it requires an adjustment to a fold added with
4446f71ce392, but it makes that fold more robust.
That's also where this patch was suggested - D139668.

This matches what we already do in DAGCombiner, but there is one
more constraint because there's an existing canonicalization for
insert-of-scalar-constant. I'm not sure if that is still needed,
so it may be adjusted/removed as a follow-up.
2022-12-18 07:08:48 -05:00
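
Reordering a chain of inserts is safe because inserts at distinct constant indexes commute. The Python sketch below models only that property; `canonicalize` is a hypothetical helper, and it ignores the extra constraint for insert-of-scalar-constant that the commit message mentions.

```python
def insertelement(vec, value, index):
    """Return a copy of vec with lane `index` replaced by `value`."""
    out = list(vec)
    out[index] = value
    return out


def apply_inserts(base, inserts):
    """Apply a chain of (value, index) inserts in the given order."""
    vec = base
    for value, index in inserts:
        vec = insertelement(vec, value, index)
    return vec


def canonicalize(inserts):
    """Put lower insert indexes first. Assumes all indexes are distinct."""
    indexes = [index for _, index in inserts]
    assert len(set(indexes)) == len(indexes), "inserts must not overlap"
    return sorted(inserts, key=lambda pair: pair[1])


chain = [("c", 2), ("a", 0), ("b", 1)]
base = [None] * 4
print(canonicalize(chain))
print(apply_inserts(base, chain) == apply_inserts(base, canonicalize(chain)))
```

The ordering depends only on the lane index, which is why it is independent of endianness.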
Sanjay Patel
8efee510be [InstCombine] limit pair-of-insertelement folds to avoid miscompile
This transform was added with 4446f71ce392. However, as noted in
the post-commit feedback, the transform is not safe with an
arbitrary base vector because we may leak poison from a narrow
element into an adjacent element when bitcasting.

I made the least invasive code change in case we do figure out
a way to make this safe.
2022-12-15 08:27:43 -05:00
Fangrui Song
21cd58baa1 [Transforms/InstCombine] llvm::Optional => std::optional
2022-12-13 08:26:08 +00:00
Sanjay Patel
4446f71ce3 [InstCombine] try to fold a pair of insertelements into one insertelement
This replaces patches that tried to convert related patterns to shuffles
(D138872, D138873, D138874 - reverted/abandoned) but caused codegen
problems and were questionable as a canonicalization because an
insertelement is a simpler op than a shuffle.

This detects a larger pattern -- insert-of-insert -- and replaces with
another insert, so this hopefully does not cause any problems.

As noted by TODO items in the code and tests, this could go a lot further.
But this is enough to reduce the motivating test from issue #17113.

Example proofs:
https://alive2.llvm.org/ce/z/NnUv3a

I drafted a version of this for AggressiveInstCombine, but it seems that
would uncover yet another phase ordering gap. If we do generalize this to
handle the full range of potential patterns, that may be worth looking at
again.

Differential Revision: https://reviews.llvm.org/D139668
2022-12-12 10:39:58 -05:00
Sanjay Patel
05dbdb0088 Revert "[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1 (2nd try)"
This reverts commit e71b81cab09bf33e3b08ed600418b72cc4117461.

As discussed in the planned follow-on to this patch (D138874),
this and the subsequent patches in this set can cause trouble for
the backend, and there's probably no quick fix. We may even
want to canonicalize in the opposite direction (towards insertelt).
2022-12-08 14:16:46 -05:00
Sanjay Patel
99254f9251 Revert "[InstCombine] improve efficiency of bool logic; NFC"
This reverts commit b7c7fe3d0779b6e332fe6db64e87561deba2e56a.

As discussed in the planned follow-on to this patch (D138874),
this and the previous patch in this set can cause trouble for
the backend, and there's probably no quick fix. We may even
want to canonicalize in the opposite direction (towards insertelt).
2022-12-08 14:16:46 -05:00
Sanjay Patel
286ae63e16 Revert "[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 2"
This reverts commit dd8d0d21ce6d0665ef5d426372096aaed85b479a.
As discussed in the planned follow-on to this patch (D138874),
this and the previous patch in this set can cause trouble for
the backend, and there's probably no quick fix. We may even
want to canonicalize in the opposite direction (towards insertelt).
2022-12-08 09:58:17 -05:00
Kazu Hirata
343de6856e [Transforms] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:37 -08:00
Sanjay Patel
dd8d0d21ce [InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 2
This enhances the base fold from part 1 to allow mapping a
right-shift to an insert index.

Example of translating a middle chunk of the scalar to vector
for either endian:
https://alive2.llvm.org/ce/z/fRXCOZ

This only allows creating an identity shuffle (with optional
shortening/lengthening) because that is considered the safe
baseline for any target (can be inverted if needed). If we
tried this fold with target-specific costs/legality, then we
could do the transform more generally.

Differential Revision: https://reviews.llvm.org/D138873
2022-12-01 14:47:37 -05:00
Sanjay Patel
b7c7fe3d07 [InstCombine] improve efficiency of bool logic; NFC
As noted in issue #59266, the logic reduction could be
beyond the capabilities of an optimizing compiler, and
the code with ternary op is easier to read either way.
2022-12-01 14:47:37 -05:00
Sanjay Patel
e71b81cab0 [InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1 (2nd try)
The first attempt was reverted because a clang test changed
unexpectedly - the file is already marked with a FIXME, so
I just updated it this time to pass.

Original commit message:
This is the main patch for converting a truncated scalar that is
inserted into a vector to bitcast+shuffle. We could go either way
on patterns like this, but this direction will allow collapsing a
pair of these sequences on the motivating example from issue

The patch is split into 3 parts to make it easier to see the
progression of tests diffs. We allow inserting/shuffling into a
different size vector for flexibility, so there are several test
variations. The length-changing is handled by shortening/padding
the shuffle mask with undef elements.

In part 1, handle the basic pattern:
inselt undef, (trunc T), IndexC --> shuffle (bitcast T), IdentityMask

Proof for the endian-dependency behaving as expected:
https://alive2.llvm.org/ce/z/BsA7yC

The TODO items for handling shifts and insert into an arbitrary base
vector value are implemented as follow-ups.

Differential Revision: https://reviews.llvm.org/D138872
2022-11-30 14:52:20 -05:00
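
The endian dependency of the part 1 pattern can be sketched in Python. This is a model under assumptions, not the patch's code: `bitcast_to_lanes` and `trunc` are illustrative helpers, and the example uses an i32 scalar viewed as <4 x i8>. The truncated value lands in lane 0 of the bitcast on a little-endian target and in the last lane on a big-endian target, so the insert index that matches an identity shuffle differs between the two.

```python
def bitcast_to_lanes(value, total_bits, lane_bits, byteorder):
    """Reinterpret an integer as a vector of narrower integer lanes."""
    raw = value.to_bytes(total_bits // 8, byteorder)
    step = lane_bits // 8
    return [int.from_bytes(raw[i:i + step], byteorder)
            for i in range(0, len(raw), step)]


def trunc(value, bits):
    """Keep only the low `bits` bits of value."""
    return value & ((1 << bits) - 1)


T = 0x11223344
low = trunc(T, 8)  # 0x44

le_lanes = bitcast_to_lanes(T, 32, 8, "little")
be_lanes = bitcast_to_lanes(T, 32, 8, "big")

print(le_lanes.index(low))  # -> 0
print(be_lanes.index(low))  # -> 3
```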
Sanjay Patel
5eacdcff06 Revert "[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1"
This reverts commit a4c466766db77cd1fb42d7f98f32bb87a3d38829.
This broke clang tests that are wrongly dependent on the optimizer.
2022-11-30 14:10:50 -05:00
Sanjay Patel
a4c466766d [InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1
This is the main patch for converting a truncated scalar that is
inserted into a vector to bitcast+shuffle. We could go either way
on patterns like this, but this direction will allow collapsing a
pair of these sequences on the motivating example from issue

The patch is split into 3 parts to make it easier to see the
progression of tests diffs. We allow inserting/shuffling into a
different size vector for flexibility, so there are several test
variations. The length-changing is handled by shortening/padding
the shuffle mask with undef elements.

In part 1, handle the basic pattern:
inselt undef, (trunc T), IndexC --> shuffle (bitcast T), IdentityMask

Proof for the endian-dependency behaving as expected:
https://alive2.llvm.org/ce/z/BsA7yC

The TODO items for handling shifts and insert into an arbitrary base
vector value are implemented as follow-ups.

Differential Revision: https://reviews.llvm.org/D138872
2022-11-30 13:22:04 -05:00
Sanjay Patel
535c5d56a7 [InstCombine] ease restriction for extractelt (bitcast X) fold
We were checking for a desirable integer type even when there
is no shift in the transform. This is unnecessary since we
are truncating directly to the destination type.

This removes an extractelt in more cases and seems to make the
canonicalization more uniform overall. There's still a potential
difference between patterns that need a shift vs. trunc-only.

I'm not sure if that is worth keeping at this point, but it can
be adjusted in another step (assuming this change does not cause
trouble).

In the most basic case where I noticed this, we missed a fold
that would have completely removed vector ops from a pattern
like:
https://alive2.llvm.org/ce/z/y4Qdte
2022-11-24 13:27:19 -05:00
Sanjay Patel
bf7f87e62c [InstCombine] reduce code duplication in foldBitcastExtElt(); NFC
2022-11-24 10:16:37 -05:00
Thomas Symalla
470aea5ed4 [InstCombine] Fold extractelt with select of constants
An extractelt with a constant index which extracts an element from the
two vector operands of a select can be directly folded into a select.

extractelt (select %x, %vec1, %vec2), %const ->
select %x, %vec1[%const], %vec2[%const]

Note: the implementation currently only works for constant vector operands.

Reviewed By: foad, spatel

Differential Revision: https://reviews.llvm.org/D137934
2022-11-22 14:07:06 +01:00
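
The extractelt-of-select fold can be checked with a small Python model. It assumes a scalar (non-vector) condition and constant vector operands, matching the note in the commit message; `select`, `extractelt` and `fold_matches` are names invented for this sketch.

```python
def select(cond, a, b):
    """Scalar-condition select: pick a when cond is true, else b."""
    return a if cond else b


def extractelt(vec, index):
    return vec[index]


def fold_matches(cond, vec1, vec2, index):
    """Compare the original pattern with the folded one."""
    original = extractelt(select(cond, vec1, vec2), index)
    folded = select(cond, extractelt(vec1, index), extractelt(vec2, index))
    return original == folded


vec1 = [10, 20, 30, 40]
vec2 = [-1, -2, -3, -4]
print(all(fold_matches(c, vec1, vec2, i)
          for c in (True, False) for i in range(4)))  # -> True
```

With constant vectors, both extracts in the folded form become scalar constants, leaving a plain scalar select.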
Matt Devereau
a8c24d57b8 [InstCombine] Remove redundant splats in InstCombineVectorOps
Splatting the first vector element of the result of a BinOp, where any of the
BinOp's operands are the result of a first vector element splat can be simplified to
splatting the first vector element of the result of the BinOp

Differential Revision: https://reviews.llvm.org/D135876
2022-11-07 15:39:05 +00:00
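
A short Python model shows why the inner splat is redundant: only lane 0 of the BinOp result survives the outer splat, and lane 0 of a splat equals lane 0 of its input. The helpers `splat_first` and `binop` are illustrative, and addition stands in for an arbitrary lane-wise binary operator.

```python
def splat_first(vec):
    """Broadcast lane 0 to every lane."""
    return [vec[0]] * len(vec)


def binop(a, b):
    """Lane-wise add, standing in for any lane-wise binary operator."""
    return [p + q for p, q in zip(a, b)]


x = [3, 9, 27, 81]
y = [1, 2, 3, 4]

before = splat_first(binop(splat_first(x), y))
after = splat_first(binop(x, y))
print(before == after)  # -> True
```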
Peter Waller
e1790c8c29 Revert "[InstCombine] Remove redundant splats in InstCombineVectorOps"
This reverts commit 957eed0b1af2cb88edafe1ff2643a38165c67a40.
2022-11-03 07:56:03 +00:00
Matt Devereau
957eed0b1a [InstCombine] Remove redundant splats in InstCombineVectorOps
Splatting the first vector element of the result of a BinOp, where any of the
BinOp's operands are the result of a first vector element splat can be simplified to
splatting the first vector element of the result of the BinOp

Differential Revision: https://reviews.llvm.org/D135876
2022-11-02 11:57:05 +00:00
Nabeel Omer
e1fd6d49a3 [InstCombine] Fix assert condition in foldSelectShuffleOfSelectShuffle
Bug introduced in e239198cdbbf.

The assert() is making an assumption that the resulting shuffle mask
will always select elements from both vectors, this is untrue in the
case of two shuffles being folded if the former shuffle has a mask with
undef elements in it. In such a case folding the shuffles might result
in a mask which only selects from one of the vectors because the other
elements (in the mask) are undef.

Differential Revision: https://reviews.llvm.org/D136256
2022-10-20 12:10:54 +00:00
Daniel Sanders
021e6e05d3 [instsimplify] Move (extelt (inselt Vec, Value, Index), Index) -> Value from InstCombine
As requested in https://reviews.llvm.org/D135625#3858141

Differential Revision: https://reviews.llvm.org/D136099
2022-10-17 15:22:06 -07:00
Daniel Sanders
4a95a64e4a [instcombine] (extelt (inselt Vec, Value, Index), Index) -> Value
When Index is variable but still trivially known to be equal we can use Value
from before the insertion, possibly eliminating the vector.

Reverts a functional change from:
Author: Philip Reames <listmail@philipreames.com>
Date:   Wed Dec 8 12:21:10 2021 -0800

    [instcombine] A couple style tweaks to visitExtractElementInst [nfc]

Thanks to Michele Scandale for identifying the bug

Differential Revision: https://reviews.llvm.org/D135625
2022-10-10 15:41:53 -07:00
Sanjay Patel
e239198cdb [InstCombine] fold select shuffles with shared operand together
We don't combine generic shuffles together in IR, but select
shuffles are a special-case because a select shuffle of a
select shuffle is just another select shuffle; codegen is
expected to efficiently lower those (select shuffles are also
the canonical form of a vector select with constant condition).
2022-09-28 11:56:27 -04:00
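
The claim that a select shuffle of a select shuffle is another select shuffle can be modelled in Python. This sketch assumes the shared operand X is the first input of the inner shuffle and the second input of the outer one; the actual fold may cover other operand positions. `select_shuffle` and `fold_masks` are hypothetical helpers.

```python
def select_shuffle(a, b, mask):
    """Shuffle where lane i reads lane i of a (mask i) or of b (mask i + n)."""
    n = len(a)
    assert all(m in (i, i + n) for i, m in enumerate(mask)), "not a select mask"
    return [a[m] if m < n else b[m - n] for m in mask]


def fold_masks(inner_mask, outer_mask):
    """Mask for shuffle(X, Y) equal to shuffle(shuffle(X, Y, inner), X, outer)."""
    n = len(inner_mask)
    # Outer lane i either keeps the inner result or reads X directly.
    return [inner_mask[i] if outer_mask[i] == i else i for i in range(n)]


x = ["x0", "x1", "x2", "x3"]
y = ["y0", "y1", "y2", "y3"]
inner = [0, 5, 2, 7]
outer = [0, 1, 6, 7]

two_step = select_shuffle(select_shuffle(x, y, inner), x, outer)
one_step = select_shuffle(x, y, fold_masks(inner, outer))
print(two_step == one_step)  # -> True
```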
jacquesguan
df525c7705 [InstCombine] fold fake floating point vector extract to shift+trunc.
This patch supports the FP part of D111082.

Differential Revision: https://reviews.llvm.org/D125750
2022-08-30 10:12:16 +08:00
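
As a rough illustration of the shift+trunc idea, the Python sketch below extracts a float lane from a 64-bit integer in two ways. It assumes a little-endian layout and an i64 scalar viewed as <2 x float>; the function names are invented for this example and the patch's exact conditions are not modelled.

```python
import struct


def float_bits(f):
    """Bit pattern of a 32-bit float as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", f))[0]


def bits_to_float(bits):
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def extract_via_bitcast(x, index):
    """extractelt (bitcast i64 x to <2 x float>), index (little-endian)."""
    lanes = struct.unpack("<2f", struct.pack("<Q", x))
    return lanes[index]


def extract_via_shift_trunc(x, index):
    """bitcast (trunc (lshr x, 32 * index)) to float."""
    shifted = x >> (32 * index)
    truncated = shifted & 0xFFFFFFFF
    return bits_to_float(truncated)


x = float_bits(1.5) | (float_bits(-2.0) << 32)
print(extract_via_bitcast(x, 1) == extract_via_shift_trunc(x, 1))  # -> True
```

The second form never materializes a vector, which is why the vector in the original pattern is "fake".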