We cannot determine ScalarTy from VL because some ScalarTy is determined
from VL[0]->getType(), while others are determined from
getValueType(VL[0]).
Fix "Mask and VecTy are incompatible".
If the last instruction in the SplitVectorize node is vectorized and
scheduled as part of some bundles, the SplitVectorize node might be
placed in the wrong order, leading to a compiler crash. Need to check if
the vectorized node has vector value and place the SplitVectorize node after the vector instruction to prevent a compile crash.
Fixes issue reported in https://github.com/llvm/llvm-project/pull/133091#issuecomment-2782826805
This is a mostly straightforward replacement of the previous
`std::pair<int, std::set<std::pair<...>>>` data structure used in
`SLPVectorizerPass::vectorizeStores()` with slightly more readable
alternatives.
I had done that change in my local tree to help me better understand the
code. It’s not very invasive, so I thought I’d create a PR for it.
Need to update the mapping between gathered values and their matching
entries, if the list of the entries is updated and only some of them are
selected for final shuffling.
Fixes#134085
If the scalar instructions is marked for the vectorization in the tree,
it cannot be vectorized as part of the another node in the same tree, in
general. It may prevent some potentially profitable vectorization
opportunities, since some nodes end up being buildvector/gather nodes,
which add to the total cost.
Patch allows revectorization of the previously vectorized scalars.
Reviewers: hiraditya, RKSimon
Reviewed By: RKSimon, hiraditya
Pull Request: https://github.com/llvm/llvm-project/pull/133091
getSameOpcode in some cases may consider 2 compares as having same
opcode, even though previously they were considered as alternate. It may
happen, because getSameOpcode looses info about previous instructions
and their states. Need to use isAlternateInstruction function instead
for the correct analysis.
Reviewers: RKSimon, hiraditya
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/133769
Need to check the value type, not the return type, of the instructions,
when doing the analysis for the whole register use to prevent a compiler
crash.
Fixes#133751
We can use *Set::insert_range to collapse:
for (auto Elem : Range)
Set.insert(E.first);
down to:
Set.insert_range(llvm::make_first_range(Range));
In some cases, we can further fold that into the set declaration.
We use the term "interchangeable instructions" to refer to different
operators that have the same meaning (e.g., `add x, 0` is equivalent to
`mul x, 1`).
Non-constant values are not supported, as they may incur high costs with
little benefit.
---------
Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
Need to fix a check for non-schedulable instructions in
getLastInstructionInBundle function, because this check may not work
correctly during the codegen. Instead, need to check that actually these
instructions were never scheduled, since the scheduling analysis always
performed before the codegen and is stable.
Fixes#132841
This reverts commit 71a0cfd93263552ddc0bfd2ea7b0abe9a578f87e.
This commit triggers failed asserts when compiling ffmpeg. The
issue is reproducible with a small standalone reproducer like this:
void make_filters_from_proto(int *filter[][2], int bands) {
int c, q, n;
for (;; q++) {
n = 0;
for (; n < 7; n++) {
int theta = (q * (n - 6) + (n >> 1) - 3) % bands;
if (theta)
c = theta;
filter[q][n][0] = c;
}
}
}
$ clang -target x86_64-linux-gnu -c repro.c -O3
clang: ../lib/Transforms/Vectorize/SLPVectorizer.cpp:989: llvm::SmallVector<llvm
::Value*> {anonymous}::BinOpSameOpcodeHelper::InterchangeableInfo::getOperand(ll
vm::Instruction*) const: Assertion `FromCIValue.isZero() && "Cannot convert the
instruction."' failed.
The same issue also reproduces for a large number of other target
triples, aarch64-linux-gnu and others.
We use the term "interchangeable instructions" to refer to different
operators that have the same meaning (e.g., `add x, 0` is equivalent to
`mul x, 1`).
Non-constant values are not supported, as they may incur high costs with
little benefit.
---------
Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
Need to drop all previous estimations/vectorizations, when found
a perfect diamond match. This improves cost estimation and improves code
emission.
Also, need to adjust getScalarizationOverhead cost for non-poison input
vector. Currently, it does not allow to estimate it correctly, so
instead use conservative element-by-element insertelement cost for each
unique scalar.
Reviewers: RKSimon, hiraditya
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/132466
That's the initial patch, intended to support revectorization of the
previously vectorized scalars. If the scalar is marked for the
vectorization, it becomes a part of the schedule bundle, used to check
dependencies and then schedule tree entry scalars into a single batch of
instructions. Unfortunately, currently this info is part of the
ScheduleData struct and it does not allow making scalars part of many
bundles. The patch separates schedule bundles from the ScheduleData,
introduces explicit class ScheduleBundle for bundles, allowing later to
extend it to support revectorization of the previously vectorized
scalars.
Reviewers: hiraditya, RKSimon
Reviewed By: RKSimon, hiraditya
Pull Request: https://github.com/llvm/llvm-project/pull/131625
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
If the insertion point (last instruction) of the user nodes is the same,
need to check the whole def-use chain in the tree to find proper
dominance to prevent a compiler crash.
Fixes#131818
If the subvector operands has reuses mask, need to reorder the mask, not
the scalars, to prevent compiler crash due to mask/scalars size
mismatch.
Fixes#131360
Need to check if the user node is same as other node and check operand
order to prevent a compiler crash when trying to find matching gather
node with user nodes, having the same last instruction.
Fixes#131195
If the phi node contains multiple same incoming blocks/values, need to
update the corresponding vectorized value, if it is not going to be
vectorized, if the incoming value was vectorized already.
Fixes#131355
Need to set the insert point for non-schedulable instructions in
SplitVectorize node after the last instruction, not before, to avoid
a crash in case of buildvector subvector node.