Otherwise we may replace undef with poison.
Note that a lot of tests regressing here already have variants
that use poison instead of undef (often in a separate
inseltpoison file), which is why I'm not adjusting them to the
new pattern.
The specialisation will not be valid when ConstantInt gains native
support for vector types.
This is largely a mechanical change but with extra attention paid to constant
folding, InstCombineVectorOps.cpp, LoopFlatten.cpp and Verifier.cpp to
remove the need to call `getIntegerType()`.
Co-authored-by: Nikita Popov <github@npopov.com>
In line with updated shufflevector semantics, this represents the
poison elements rather than undef elements now. This commit is a
pure rename, without any logic changes.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
As per my proposal for how to eliminate debug intrinsics [0], for various
places in InstCombine prefer to insert using an instruction iterator rather
than an instruction pointer. This is so that we can eventually pass more
information in the iterator class. These call-sites where I've changed the
spelling are those that necessary to build a stage2clang to produce an
identical binary in the coming no-debug-intrinsics mode.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152543
Fix the infinite loop reported on https://reviews.llvm.org/D151807#4420467.
collectShuffleElements() will widen vectors and replace extracts
via replaceExtractElements(), to allow the next call of
collectShuffleElements() to fold. However, it's possible for another
fold to run first, and break the expected sequence again. To ensure
this does not happen, directly rerun the collectShuffleElements()
fold if we have adjusted extracts.
Fixes the issue reported at 4b8320868c (commitcomment-114671248).
The extractvalue instructions may still be used by the calling code
in some cases. Rather than trying to figure out which extracts are
safe to remove and which aren't, add them to the worklist so they
will get DCEd by the main loop.
Directly remove these dead extractelement instructions, rather than
leaving them for the next InstCombine iteration to clean up.
Should be mostly NFC, apart from worklist order differences.
Also fixes an issue with 80cd3e4e2059f8deda735c32bf7dfa5b9d75c4e0
pointed out by Roman Divacky: For one of the occurrences I only
changed the name of the type variable, without actually changing
the type...
This ensures that the new instructions get reprocessed in the same
iteration.
This should be largely NFC, apart from worklist order effects and
naming changes, as seen in the test diff.
This patch updates the transformations in InstCombineVectorOps to use the new
hufflevector semantics that say that undefined values in the mask yield poison.
To prevent miscompilations we have to match with m_Poison instead of m_Undef.
Otherwise, we might introduce poison where there was previously undef.
Differential Revision: https://reviews.llvm.org/D150039
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.
Differential Revision: https://reviews.llvm.org/D149256
InstCombine is supposed to be a superset of InstSimplify, but we
were not attempting simplification of insertvalue instructions.
As the test change illustrates, we failed to remove some aggregate
construction patterns because of that.
shuffle (fabs X), Mask --> fabs (shuffle X, Mask)
shuffle (fabs X), (fabs Y), Mask --> fabs (shuf X, Y, Mask)
https://alive2.llvm.org/ce/z/JH2nkf
This generalizes the existing fneg transforms to also work with fabs.
A likely follow-up would generalize this further to move any unary
intrinsic op.
This puts lower insert indexes before higher. This is independent
of endian, so it requires an adjustment to a fold added with
4446f71ce392, but it makes that fold more robust.
That's also where this patch was suggested - D139668.
This matches what we already do in DAGCombiner, but there is one
more constraint because there's an existing canonicalization for
insert-of-scalar-constant. I'm not sure if that is still needed,
so it may be adjusted/removed as a follow-up.
This transform was added with 4446f71ce392. However, as noted in
the post-commit feedback, the transform is not safe with an
arbitrary base vector because we may leak poison from a narrow
element into an adjacent element when bitcasting.
I made the least invasive code change in case we do figure out
a way to make this safe.
This replaces patches that tried to convert related patterns to shuffles
(D138872, D138873, D138874 - reverted/abandoned) but caused codegen
problems and were questionable as a canonicalization because an
insertelement is a simpler op than a shuffle.
This detects a larger pattern -- insert-of-insert -- and replaces with
another insert, so this hopefully does not cause any problems.
As noted by TODO items in the code and tests, this could go a lot further.
But this is enough to reduce the motivating test from issue #17113.
Example proofs:
https://alive2.llvm.org/ce/z/NnUv3a
I drafted a version of this for AggressiveInstCombine, but it seems that
would uncover yet another phase ordering gap. If we do generalize this to
handle the full range of potential patterns, that may be worth looking at
again.
Differential Revision: https://reviews.llvm.org/D139668
This reverts commit e71b81cab09bf33e3b08ed600418b72cc4117461.
As discussed in the planned follow-on to this patch (D138874),
this and the subsequent patches in this set can cause trouble for
the backend, and there's probably no quick fix. We may even
want to canonicalize in the opposite direction (towards insertelt).
This reverts commit b7c7fe3d0779b6e332fe6db64e87561deba2e56a.
As discussed in the planned follow-on to this patch (D138874),
this and the previous patch in this set can cause trouble for
the backend, and there's probably no quick fix. We may even
want to canonicalize in the opposite direction (towards insertelt).
This reverts commit dd8d0d21ce6d0665ef5d426372096aaed85b479a.
As discussed in the planned follow-on to this patch (D138874),
this and the previous patch in this set can cause trouble for
the backend, and there's probably no quick fix. We may even
want to canonicalize in the opposite direction (towards insertelt).
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This enhances the base fold from part 1 to allow mapping a
right-shift to an insert index.
Example of translating a middle chunk of the scalar to vector
for either endian:
https://alive2.llvm.org/ce/z/fRXCOZ
This only allows creating an identity shuffle (with optional
shortening/lengthening) because that is considered the safe
baseline for any target (can be inverted if needed). If we
tried this fold with target-specific costs/legality, then we
could do the transform more generally.
Differential Revision: https://reviews.llvm.org/D138873
As noted in issue #59266, the logic reduction could be
beyond the capabilities of an optimizing compiler, and
the code with ternary op is easier to read either way.
The first attempt was reverted because a clang test changed
unexpectedly - the file is already marked with a FIXME, so
I just updated it this time to pass.
Original commit message:
This is the main patch for converting a truncated scalar that is
inserted into a vector to bitcast+shuffle. We could go either way
on patterns like this, but this direction will allow collapsing a
pair of these sequences on the motivating example from issue
The patch is split into 3 parts to make it easier to see the
progression of tests diffs. We allow inserting/shuffling into a
different size vector for flexibility, so there are several test
variations. The length-changing is handled by shortening/padding
the shuffle mask with undef elements.
In part 1, handle the basic pattern:
inselt undef, (trunc T), IndexC --> shuffle (bitcast T), IdentityMask
Proof for the endian-dependency behaving as expected:
https://alive2.llvm.org/ce/z/BsA7yC
The TODO items for handling shifts and insert into an arbitrary base
vector value are implemented as follow-ups.
Differential Revision: https://reviews.llvm.org/D138872