2202 Commits

Author SHA1 Message Date
Alexey Bataev
913dcf1aa3 [SLP]Fix type promotion for smax reduction with unsigned reduced operands
Need to add an extra bit for sign info for unsigned reduced values to
generate correct code.
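
To see why the extra bit matters, here is a minimal standalone sketch. It is not the SLP vectorizer's code; `signExtendLowBits` and `smaxAtWidth` are hypothetical helpers that model a signed max computed on values narrowed to a given bit width.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): reinterpret the low `Bits` bits of V
// as a two's-complement signed integer.
int32_t signExtendLowBits(uint32_t V, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 32 && "unsupported width");
  const uint32_t Mask = Bits == 32 ? ~uint32_t{0} : (uint32_t{1} << Bits) - 1;
  const uint32_t SignBit = uint32_t{1} << (Bits - 1);
  const uint32_t Low = V & Mask;
  // (Low ^ SignBit) - SignBit sign-extends using wrapping unsigned arithmetic.
  return static_cast<int32_t>((Low ^ SignBit) - SignBit);
}

// Hypothetical helper: signed max of A and B after both are narrowed to
// `Bits` bits.
int32_t smaxAtWidth(uint32_t A, uint32_t B, unsigned Bits) {
  return std::max(signExtendLowBits(A, Bits), signExtendLowBits(B, Bits));
}
```

With these definitions, `smaxAtWidth(200, 100, 8)` yields 100, because 200 is read as -56 at 8 bits, while `smaxAtWidth(200, 100, 9)` yields the expected 200.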
2025-04-16 10:14:29 -07:00
Alexey Bataev
76b7ae7e45 [SLP][NFC]Remove std::placeholders:: qualifiers, NFC 2025-04-16 09:42:17 -07:00
Kazu Hirata
0045b82a42
[Vectorize] Construct SmallVector with an iterator range (NFC) (#135936) 2025-04-16 08:39:55 -07:00
Alexey Bataev
af28c9c65a [SLP]Do not reorder split node operand with reuses, if not possible
Need to check if the operand node of the split vectorize node has reuses
and check if it is possible to build the order for this node to reorder
it correctly.

Fixes #135912
2025-04-16 06:23:44 -07:00
Alexey Bataev
41c97afea0
[SLP][NFC]Remove handling of duplicates from getGatherCost
Duplicates are handled in BoUpSLP::processBuildVector (see TryPackScalars), support for duplicates in getGatherCost is not needed anymore.

Reviewers: hiraditya, RKSimon

Reviewed By: hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/135834
2025-04-16 06:53:10 -04:00
Alexey Bataev
ddb1267430 [SLP]Insert vector instruction after landingpad
If the node must be emitted in the landingpad block, need to insert the
instructions after the landingpad instruction to avoid a crash.

Fixes #135781
2025-04-15 13:57:53 -07:00
Alexey Bataev
85eb44e304 [SLP]Fix number of operands for the split node
For the split node, the number of operands should be requested via the
getNumOperands() function, even if the main op is a CallInst.
2025-04-15 13:33:36 -07:00
Alexey Bataev
2271f0bebd [SLP]Check for perfect/shuffled match for the split node
If the potential split node is a perfect/shuffled match of another split
node, need to skip creation of another split node with the same
scalars; it should be a buildvector.

Fixes #135800
2025-04-15 13:17:46 -07:00
Han-Kuan Chen
d41e517748
[SLP] Make getSameOpcode support interchangeable instructions. (#135797)
We use the term "interchangeable instructions" to refer to different
operators that have the same meaning (e.g., `add x, 0` is equivalent to
`mul x, 1`).
Non-constant values are not supported, as they may incur high costs with
little benefit.
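
As a rough illustration of the idea, the sketch below rewrites a binary operation with a constant right-hand operand as an equivalent multiplication. The `Op`, `BinOp`, `asMul` and `eval` names are hypothetical and only model the concept; they are not the data structures used by getSameOpcode.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of a binary operation with a constant right operand.
enum class Op { Add, Mul, Shl };

struct BinOp {
  Op Opcode;
  uint32_t RHS;
};

// Try to express B as an equivalent `mul x, C`; std::nullopt if not possible.
std::optional<BinOp> asMul(BinOp B) {
  switch (B.Opcode) {
  case Op::Mul:
    return B;
  case Op::Add: // add x, 0  ==  mul x, 1
    if (B.RHS == 0)
      return BinOp{Op::Mul, 1};
    return std::nullopt;
  case Op::Shl: // shl x, C  ==  mul x, 2^C (for in-range shift amounts)
    if (B.RHS < 32)
      return BinOp{Op::Mul, uint32_t{1} << B.RHS};
    return std::nullopt;
  }
  return std::nullopt;
}

// Evaluate B on X with wrapping 32-bit arithmetic.
uint32_t eval(BinOp B, uint32_t X) {
  switch (B.Opcode) {
  case Op::Add:
    return X + B.RHS;
  case Op::Mul:
    return X * B.RHS;
  case Op::Shl:
    assert(B.RHS < 32 && "shift amount out of range");
    return X << B.RHS;
  }
  return 0;
}
```

Under this model, a lane computing `shl x, 3` and a lane computing `mul x, 8` can be treated as the same opcode, while `add x, 7` has no multiplicative form and stays distinct.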

---------

Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
2025-04-16 00:08:59 +08:00
Han-Kuan Chen
bcfc9f4529
[SLP][REVEC] VectorValuesAndScales should be supported by REVEC. (#135762)
We should align REVEC with the SLP algorithm as closely as possible. For
example, by applying REVEC-specific handling when calling IRBuilder's
Create methods, performing cost analysis via TTI, and expanding shuffle
masks using transformScalarShuffleIndicesToVector.

reference commit: 3b18d47ecbaba4e519ebf0d1bc134a404a56a9da
2025-04-15 23:03:55 +08:00
Alexey Bataev
57025b42c4 [SLP]Mark smin reduction as signed compare
A signed min reduction must be marked as a signed compare; this fixes
the analysis for cases where the incoming arguments are unsigned.

Fixes #133943
2025-04-15 07:24:17 -07:00
Han-Kuan Chen
e1382b3b45 Revert "[SLP] Make getSameOpcode support interchangeable instructions. (#133888)"
This reverts commit 123993fd974629ca0a094918db4c21ad1c2624d0.
2025-04-15 06:02:42 -07:00
Han-Kuan Chen
123993fd97
[SLP] Make getSameOpcode support interchangeable instructions. (#133888)
We use the term "interchangeable instructions" to refer to different
operators that have the same meaning (e.g., `add x, 0` is equivalent to
`mul x, 1`).
Non-constant values are not supported, as they may incur high costs with
little benefit.

---------

Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
2025-04-14 19:23:18 +08:00
Alexey Bataev
38e64b1a84 [SLP]Fix minbitwidth analysis for gather nodes with SIToFP users
If the buildvector node has a cast-to-float user, it cannot be considered
safe for truncation; need to use the original bitwidth here.

Fixes #135410
2025-04-11 11:40:41 -07:00
Alexey Bataev
a2d129b792 [SLP]Fix a crash when trying to reduce in revec after minbitwidth analysis
Need to use the original scalar type, when building the reduction, and
use the scalar type, when performing casting, to avoid compiler crash.
2025-04-11 10:58:39 -07:00
Alexey Bataev
bd0b2bdacc [SLP][NFC]Use VF instead of VL.size and modernize some transformations, NFC. 2025-04-11 10:29:30 -07:00
Alexey Bataev
33af951f3f
[SLP]Synchronize cost of gather/buildvector nodes with codegen
If the buildvector node contains constants and non-constants, need to
consider shuffling of the constant vec and insertion of unique elements
into the vector. Also, if there is an input vector, need to consider the
cost of shuffling source vector and constant vector and then insertion
and shuffling of the non-constant elements.

Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/135245
2025-04-11 09:42:34 -04:00
Han-Kuan Chen
d77dc87511
[SLP][REVEC] Fix type comparison and mask transformation for REVEC. (#135310)
When REVEC is enabled, ScalarTy may be a FixedVectorType. Compare its
element type to decide if casting is needed. Also apply mask
transformation accordingly.
2025-04-11 17:28:34 +08:00
Alexey Bataev
61d04f1aac
[SLP][NFC]Extract preliminary checks from buildTree_rec, NFC
Moved check from buildTree_rec function to a separate
isLegalToVectorizeScalars function.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/134132
2025-04-10 16:05:01 -04:00
Alexey Bataev
aaaa2a325b
[SLP]Support vectorization of previously vectorized scalars in split nodes
Patch removes the restriction for the revectorization of the previously
vectorized scalars in split nodes, and moves the cost profitability
check to avoid regressions.

Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/134286
2025-04-10 12:06:38 -04:00
Alexey Bataev
4ea57b3481 [SLP]Fix detection of matching splat vector
Need to check that the mask of the potentially matching splat node is
not less defined than the requested mask, to avoid poison propagation
and incorrect code.
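
A minimal sketch of such a "not less defined" check, assuming masks are plain integer vectors in which -1 marks a poison lane. `PoisonLane` and `coversRequestedLanes` are hypothetical names used for illustration, not the function changed by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption for this sketch: -1 marks a poison (undefined) lane in a mask.
constexpr int PoisonLane = -1;

// Hypothetical helper: true if Candidate defines every lane that Requested
// defines, i.e. Candidate is not less defined than Requested.
bool coversRequestedLanes(const std::vector<int> &Candidate,
                          const std::vector<int> &Requested) {
  assert(Candidate.size() == Requested.size() && "masks must match in length");
  for (std::size_t I = 0; I < Requested.size(); ++I)
    if (Requested[I] != PoisonLane && Candidate[I] == PoisonLane)
      return false; // reusing Candidate would leak poison into a needed lane
  return true;
}
```

Reusing a candidate whose mask is `{0, -1, 0, 0}` for a request of `{0, 0, 0, 0}` is rejected here, since lane 1 would become poison; the reverse direction is accepted.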

Fixes #135113
2025-04-10 08:30:43 -07:00
Han-Kuan Chen
a693f23ef2
[SLP][REVEC] Fix CompressVectorize does not expand mask when REVEC is enabled. (#135174) 2025-04-10 23:07:45 +08:00
Han-Kuan Chen
d02a704ec9
[SLP][REVEC] Make getExtractWithExtendCost support FixedVectorType as Dst. (#134822) 2025-04-10 18:54:45 +08:00
Alexey Bataev
9dc6551fa8
[SLP][NFC]Extract a check for a SplitVectorize node, NFC
Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/134896
2025-04-09 14:05:02 -04:00
Alexey Bataev
076318bd78 [SLP]Use proper order when calculating costs for geps/extracts to correctly identify profitability
Need to properly reorder the scalars when evaluating the costs for the
external uses/geps, to prevent differences in the calculation of the
profitability costs used to choose between gather/compressed loads.

Fixes https://github.com/llvm/llvm-project/pull/132099#issuecomment-2789627454
2025-04-09 07:43:23 -07:00
Alexey Bataev
edcbd4a211
[SLP][NFC]Extract a check for strided loads into separate function, NFC
Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/134876
2025-04-08 13:02:31 -04:00
Alexey Bataev
02a708b93b
[SLP][NFC]Extract TryToFindDuplicates lambda into a separate function, NFC
Reviewers: RKSimon, hiraditya

Reviewed By: hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/134873
2025-04-08 13:01:54 -04:00
Han-Kuan Chen
2347aa1fcc
[SLP][REVEC] Fix the mismatch between the result of getAltInstrMask and the VecTy argument of TargetTransformInfo::isLegalAltInstr. (#134795)
We cannot determine ScalarTy from VL because some ScalarTy is determined
from VL[0]->getType(), while others are determined from
getValueType(VL[0]).

Fix "Mask and VecTy are incompatible".
2025-04-08 22:29:11 +08:00
Han-Kuan Chen
97c4cb4d13
[SLP][REVEC] getNumElements should not be used as VF when REVEC is enabled. (#134763) 2025-04-08 22:29:03 +08:00
Han-Kuan Chen
d7354e337a
[SLP][REVEC] Fix ShuffleVector does not consider alternate instruction. (#134599) 2025-04-08 08:04:43 +08:00
Alexey Bataev
f413772b31 [SLP]Fix last instruction selection for vectorized last instruction in SplitVectorize nodes
If the last instruction in the SplitVectorize node is vectorized and
scheduled as part of some bundles, the SplitVectorize node might be
placed in the wrong order, leading to a compiler crash. Need to check if
the vectorized node has a vector value and place the SplitVectorize node
after the vector instruction to prevent a compiler crash.

Fixes issue reported in https://github.com/llvm/llvm-project/pull/133091#issuecomment-2782826805
2025-04-07 09:27:08 -07:00
Han-Kuan Chen
5748ddbab4
[SLP] NFC. Add a comment to introduce the alternate instruction. (#134572) 2025-04-07 18:03:26 +08:00
Matt Arsenault
65c7ea713e
SLPVectorizer: Avoid looking at uselists of constants (#134578) 2025-04-07 16:52:11 +07:00
Alexey Bataev
19aec00735 [SLP]Initial support for (masked)loads + compress and (masked)interleaved
Added initial support for (masked)loads + compress and
(masked)interleaved loads.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/132099
2025-04-04 11:14:49 -07:00
Alexey Bataev
90cf2e31ab Revert "[SLP]Initial support for (masked)loads + compress and (masked)interleaved"
This reverts commit daab7d08078bb7cd37c66b78a56f4773e6b12fba to fix
a crash reported in https://github.com/llvm/llvm-project/issues/134411.
2025-04-04 10:09:39 -07:00
Gaëtan Bossu
aca270877f
[SLP] Use named structs in vectorizeStores() (NFC) (#132781)
This is a mostly straightforward replacement of the previous
`std::pair<int, std::set<std::pair<...>>>` data structure used in
`SLPVectorizerPass::vectorizeStores()` with slightly more readable
alternatives.

I had done that change in my local tree to help me better understand the
code. It’s not very invasive, so I thought I’d create a PR for it.
2025-04-04 16:27:25 +01:00
Alexey Bataev
daab7d0807 [SLP]Initial support for (masked)loads + compress and (masked)interleaved
Added initial support for (masked)loads + compress and
(masked)interleaved loads.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/132099
2025-04-03 13:17:40 -07:00
Alexey Bataev
7c4013d591 Revert "[SLP]Initial support for (masked)loads + compress and (masked)interleaved"
This reverts commit 0bec0f5c059af5f920fe22ecda469b666b5971b0 to fix
a crash reported in https://lab.llvm.org/buildbot/#/builders/143/builds/6668.
2025-04-03 12:58:49 -07:00
Alexey Bataev
0bec0f5c05
[SLP]Initial support for (masked)loads + compress and (masked)interleaved
Added initial support for (masked)loads + compress and
(masked)interleaved loads.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/132099
2025-04-03 13:21:22 -04:00
Alexey Bataev
843ef77dc2 [SLP]Update mapping between values and their matching entries upon selection
Need to update the mapping between gathered values and their matching
entries, if the list of the entries is updated and only some of them are
selected for final shuffling.

Fixes #134085
2025-04-02 11:59:32 -07:00
Alexey Bataev
48a4b14cb6 [SLP]Fix whole vector registers calculations for compares
Need to check that the calculated number of the elements is not larger
than the original number of scalars to prevent a compiler crash.

Fixes #134013
2025-04-02 07:26:40 -07:00
Han-Kuan Chen
5bbcc765cc
[SLP][REVEC] getNumElements should not be used as VF when REVEC is enabled. (#134031) 2025-04-02 19:04:07 +08:00
Alexey Bataev
0e3049c562
[SLP]Support revectorization of the previously vectorized scalars
If a scalar instruction is marked for vectorization in the tree, it
cannot, in general, be vectorized as part of another node in the same
tree. This may prevent some potentially profitable vectorization
opportunities, since some nodes end up being buildvector/gather nodes,
which add to the total cost.
The patch allows revectorization of the previously vectorized scalars.

Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon, hiraditya

Pull Request: https://github.com/llvm/llvm-project/pull/133091
2025-04-01 14:30:06 -04:00
Alexey Bataev
cf6a452cc7
[SLP]Fix same/alternate analysis in split node analysis for compares
getSameOpcode in some cases may consider 2 compares as having same
opcode, even though previously they were considered as alternate. It may
happen because getSameOpcode loses info about previous instructions
and their states. Need to use isAlternateInstruction function instead
for the correct analysis.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/133769
2025-03-31 19:33:40 -04:00
Alexey Bataev
bfd8cc0a3e [SLP]Fix a check for the whole register use
Need to check the value type, not the return type, of the instructions,
when doing the analysis for the whole register use to prevent a compiler
crash.

Fixes #133751
2025-03-31 10:52:12 -07:00
Han-Kuan Chen
65734de9b9
[SLP] NFC. Remove the redundant MainOp and AltOp find process. (#133642) 2025-03-31 10:26:45 +08:00
Alexey Bataev
1bfc61064a [SLP]Fix spill cost analysis for split vectorized nodes
If the entry is SplitVectorize, it can be skipped in favor of its
operands; the operands allow spill costs to be detected correctly.

Fixes #133288
2025-03-28 12:45:53 -07:00
Kazu Hirata
673f4705a8
[llvm] Use *Set::insert_range (NFC) (#133353)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(Elem.first);

down to:

  Set.insert_range(llvm::make_first_range(Range));

In some cases, we can further fold that into the set declaration.
2025-03-27 20:44:20 -07:00
Kazu Hirata
cde58bfc16
[Transforms] Use range constructors of *Set (NFC) (#133203) 2025-03-27 07:51:58 -07:00
Martin Storsjö
a2e5932e8b Revert "[SLP] Make getSameOpcode support interchangeable instructions. (#132887)"
This reverts commit 6e66cfeeaec6f09a4454400e45d690457ecdd3de.

This change causes crashes on compiling some inputs, see
https://github.com/llvm/llvm-project/pull/127450#issuecomment-2752833710
and
https://github.com/llvm/llvm-project/pull/127450#issuecomment-2753375326
for details.
2025-03-26 10:24:25 +02:00