Need to check whether the operand node of the split vectorize node has
reuses, and whether an order can be built for this node so that it is
reordered correctly.
Fixes #135912
Duplicates are handled in BoUpSLP::processBuildVector (see
TryPackScalars), so support for duplicates in getGatherCost is no
longer needed.
Reviewers: hiraditya, RKSimon
Reviewed By: hiraditya, RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/135834
If the potential split node is a perfect/shuffled match of another split
node, skip creating another split node with the same scalars; it should
be a buildvector instead.
Fixes #135800
We use the term "interchangeable instructions" to refer to different
operators that have the same meaning (e.g., `add x, 0` is equivalent to
`mul x, 1`).
Non-constant values are not supported, as they may incur high costs with
little benefit.
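
As a rough illustration of the idea, the sketch below models a constant
right-hand operand and asks whether an operation can be rewritten as a
multiply. The `Op`, `ConstBinOp`, `asMulConstant`, and `eval` names are
hypothetical and are not part of LLVM. The `shl` case is an extra example
beyond the one given above, relying on `shl x, C` being equal to
`mul x, 2^C` in wrapping arithmetic.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy opcodes; a hypothetical model, not LLVM's instruction opcodes.
enum class Op { Add, Mul, Shl };

// A binary operation `Opcode x, C` whose right-hand operand is a constant.
struct ConstBinOp {
  Op Opcode;
  uint64_t C;
};

// Returns M such that the operation computes the same value as `mul x, M`,
// or std::nullopt if this toy model knows no such rewrite.
std::optional<uint64_t> asMulConstant(const ConstBinOp &B) {
  switch (B.Opcode) {
  case Op::Mul:
    return B.C; // already a multiply
  case Op::Add:
    if (B.C == 0)
      return uint64_t{1}; // add x, 0  ==  mul x, 1
    return std::nullopt;
  case Op::Shl:
    if (B.C < 64)
      return uint64_t{1} << B.C; // shl x, C  ==  mul x, 2^C
    return std::nullopt;
  }
  return std::nullopt;
}

// Evaluates the operation on a concrete x using wrapping 64-bit arithmetic.
uint64_t eval(const ConstBinOp &B, uint64_t X) {
  switch (B.Opcode) {
  case Op::Add:
    return X + B.C;
  case Op::Mul:
    return X * B.C;
  case Op::Shl:
    return B.C < 64 ? X << B.C : 0;
  }
  return 0;
}
```

Only constant operands are modeled here, in line with the restriction
stated above.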
---------
Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
We should align REVEC with the SLP algorithm as closely as possible, for
example by applying REVEC-specific handling when calling IRBuilder's
Create methods, performing cost analysis via TTI, and expanding shuffle
masks using transformScalarShuffleIndicesToVector.
reference commit: 3b18d47ecbaba4e519ebf0d1bc134a404a56a9da
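
A minimal sketch of what expanding a shuffle mask means under REVEC,
assuming each "scalar" is itself a fixed vector of `VecElems` elements.
This is a standalone approximation over `std::vector`, not the LLVM
helper itself. `expandScalarMask` and `PoisonElem` are hypothetical
names, with `-1` mirroring the value LLVM uses for poison mask elements.

```cpp
#include <cassert>
#include <vector>

constexpr int PoisonElem = -1; // stand-in for a poison (undefined) mask lane

// Expands a mask that indexes whole "scalars" into a mask that indexes the
// individual elements, where every scalar is a vector of VecElems elements.
std::vector<int> expandScalarMask(const std::vector<int> &Mask,
                                  unsigned VecElems) {
  std::vector<int> Expanded;
  Expanded.reserve(Mask.size() * VecElems);
  for (int Idx : Mask)
    for (unsigned J = 0; J < VecElems; ++J)
      // A poison lane stays poison for all of its expanded elements.
      Expanded.push_back(Idx == PoisonElem
                             ? PoisonElem
                             : Idx * static_cast<int>(VecElems) +
                                   static_cast<int>(J));
  return Expanded;
}
```

For example, with two elements per scalar, the mask `{1, -1, 0}` expands
to `{2, 3, -1, -1, 0, 1}`.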
If the buildvector node contains both constants and non-constants, need
to consider shuffling of the constant vector and insertion of the unique
elements into the vector. Also, if there is an input vector, need to
consider the cost of shuffling the source vector with the constant
vector, followed by insertion and shuffling of the non-constant elements.
Reviewers: hiraditya, RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/135245
When REVEC is enabled, ScalarTy may be a FixedVectorType. Compare its
element type to decide if casting is needed. Also apply mask
transformation accordingly.
Moved check from buildTree_rec function to a separate
isLegalToVectorizeScalars function.
Reviewers: RKSimon, hiraditya
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/134132
Patch removes the restriction for the revectorization of the previously
vectorized scalars in split nodes, and moves the cost profitability
check to avoid regressions.
Reviewers: hiraditya, RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/134286
Need to check that the mask of the potentially matching splat node is
not less defined than the requested mask, to avoid poison propagation
and incorrect code.
Fixes #135113
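
One way to picture the "not less defined" condition is sketched below. It
is a simplified model over plain integer masks and not the check used in
the patch. `coversMask` and `PoisonElem` are hypothetical names, and
requiring identical indices in the defined lanes is an assumption of this
sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int PoisonElem = -1; // stand-in for a poison (undefined) mask lane

// Returns true if Candidate defines every lane that Requested defines, with
// the same source index. Lanes that are poison in Requested are ignored,
// since the requester does not read them.
bool coversMask(const std::vector<int> &Candidate,
                const std::vector<int> &Requested) {
  if (Candidate.size() != Requested.size())
    return false;
  for (std::size_t I = 0; I < Requested.size(); ++I) {
    if (Requested[I] == PoisonElem)
      continue; // lane unused by the request
    if (Candidate[I] != Requested[I])
      return false; // candidate is poison or selects a different lane
  }
  return true;
}
```

A fully defined splat mask `{0, 0, 0, 0}` covers the request
`{0, -1, 0, -1}`, while `{0, -1, 0, 0}` does not cover `{0, 0, 0, 0}`:
reusing it would leak poison into lane 1.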
We cannot determine ScalarTy from VL because in some cases ScalarTy is
determined from VL[0]->getType(), while in others it is determined from
getValueType(VL[0]).
Fix "Mask and VecTy are incompatible".
If the last instruction in the SplitVectorize node is vectorized and
scheduled as part of some bundle, the SplitVectorize node might be
placed in the wrong order, leading to a compiler crash. Need to check
if the vectorized node has a vector value and place the SplitVectorize
node after the vector instruction to prevent the crash.
Fixes issue reported in https://github.com/llvm/llvm-project/pull/133091#issuecomment-2782826805
This is a mostly straightforward replacement of the previous
`std::pair<int, std::set<std::pair<...>>>` data structure used in
`SLPVectorizerPass::vectorizeStores()` with slightly more readable
alternatives.
I had done that change in my local tree to help me better understand the
code. It’s not very invasive, so I thought I’d create a PR for it.
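
To illustrate the kind of readability gain meant here, the sketch below
wraps a base index and a distance-to-index mapping in a small named type
instead of nested pairs. The element types are elided in the description
above, so everything in this sketch is an assumption made for
illustration, including the `StoreGroup` name, its members, and the
choice of `std::map`.

```cpp
#include <cassert>
#include <map>
#include <optional>

// Hypothetical named type replacing a nested pair/set combination.
// Member names and element types are assumptions, not the patch's code.
struct StoreGroup {
  unsigned BaseIdx;                  // store that distances are relative to
  std::map<int, unsigned> DistToIdx; // pointer distance -> store index

  explicit StoreGroup(unsigned Base) : BaseIdx(Base) {
    DistToIdx.emplace(0, Base); // the base is at distance zero from itself
  }

  // Records a store at the given distance from the base. Returns false if
  // another store already occupies that distance.
  bool insert(int Dist, unsigned Idx) {
    return DistToIdx.emplace(Dist, Idx).second;
  }

  // Looks up the store recorded at the given distance, if any.
  std::optional<unsigned> findAt(int Dist) const {
    auto It = DistToIdx.find(Dist);
    if (It == DistToIdx.end())
      return std::nullopt;
    return It->second;
  }
};
```

Call sites then read as `Group.insert(Dist, Idx)` and `Group.BaseIdx`
rather than chains of `.first` and `.second`.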
Need to update the mapping between gathered values and their matching
entries, if the list of the entries is updated and only some of them are
selected for final shuffling.
Fixes #134085
In general, if a scalar instruction is marked for vectorization in the
tree, it cannot be vectorized as part of another node in the same tree.
This may prevent some potentially profitable vectorization
opportunities, since some nodes end up being buildvector/gather nodes,
which add to the total cost.
Patch allows revectorization of the previously vectorized scalars.
Reviewers: hiraditya, RKSimon
Reviewed By: RKSimon, hiraditya
Pull Request: https://github.com/llvm/llvm-project/pull/133091
In some cases getSameOpcode may consider two compares as having the
same opcode, even though previously they were considered alternate.
This happens because getSameOpcode loses information about previous
instructions and their states. Need to use the isAlternateInstruction
function instead for the correct analysis.
Reviewers: RKSimon, hiraditya
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/133769
Need to check the value type of the instructions, not their return
type, when doing the analysis for the whole register use, to prevent a
compiler crash.
Fixes #133751
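
The distinction matters because some instructions return a type that says
nothing about the data they operate on: a store returns void and an
integer compare returns i1. The toy model below is a sketch of that idea
only. `ToyInst`, `valueBits`, and `fillsWholeRegisters` are hypothetical
names, and the register-fit rule is a simplification assumed for the
example.

```cpp
#include <cassert>

enum class Kind { Store, Compare, Other };

// Toy instruction: widths are in bits; ResultBits == 0 models a void result.
struct ToyInst {
  Kind K;
  unsigned ResultBits;  // width of the value the instruction returns
  unsigned OperandBits; // width of the stored or compared data
};

// Width of the data being vectorized: for stores and compares this is the
// operand width, otherwise the result width.
unsigned valueBits(const ToyInst &I) {
  switch (I.K) {
  case Kind::Store:
  case Kind::Compare:
    return I.OperandBits;
  case Kind::Other:
    return I.ResultBits;
  }
  return I.ResultBits;
}

// Whether NumLanes values of this instruction fill whole registers of
// RegBits bits each.
bool fillsWholeRegisters(const ToyInst &I, unsigned NumLanes,
                         unsigned RegBits) {
  unsigned Bits = valueBits(I) * NumLanes;
  return Bits != 0 && Bits % RegBits == 0;
}
```

With the return type, four 32-bit stores would appear to occupy zero
bits. With the value type, they fill exactly one 128-bit register.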
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(Elem.first);

down to:

  Set.insert_range(llvm::make_first_range(Range));

In some cases, we can further fold that into the set declaration.