2181 Commits

Author SHA1 Message Date
Han-Kuan Chen
a693f23ef2
[SLP][REVEC] Fix CompressVectorize does not expand mask when REVEC is enabled. (#135174) 2025-04-10 23:07:45 +08:00
Han-Kuan Chen
d02a704ec9
[SLP][REVEC] Make getExtractWithExtendCost support FixedVectorType as Dst. (#134822) 2025-04-10 18:54:45 +08:00
Alexey Bataev
9dc6551fa8
[SLP][NFC]Extract a check for a SplitVectorize node, NFC
Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/134896
2025-04-09 14:05:02 -04:00
Alexey Bataev
076318bd78 [SLP]Use proper order when calculating costs for geps/extracts to correctly identify profitability
The scalars must be reordered properly when evaluating the costs for the
external uses/geps, to prevent differences in the calculation of the
profitability costs used to choose between gather/compressed loads.

Fixes https://github.com/llvm/llvm-project/pull/132099#issuecomment-2789627454
2025-04-09 07:43:23 -07:00
Alexey Bataev
edcbd4a211
[SLP][NFC]Extract a check for strided loads into separate function, NFC
Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/134876
2025-04-08 13:02:31 -04:00
Alexey Bataev
02a708b93b
[SLP][NFC]Extract TryToFindDuplicates lambda into a separate function, NFC
Reviewers: RKSimon, hiraditya

Reviewed By: hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/134873
2025-04-08 13:01:54 -04:00
Han-Kuan Chen
2347aa1fcc
[SLP][REVEC] Fix the mismatch between the result of getAltInstrMask and the VecTy argument of TargetTransformInfo::isLegalAltInstr. (#134795)
We cannot determine ScalarTy from VL because some ScalarTy is determined
from VL[0]->getType(), while others are determined from
getValueType(VL[0]).

Fix "Mask and VecTy are incompatible".
2025-04-08 22:29:11 +08:00
Han-Kuan Chen
97c4cb4d13
[SLP][REVEC] getNumElements should not be used as VF when REVEC is enabled. (#134763) 2025-04-08 22:29:03 +08:00
Han-Kuan Chen
d7354e337a
[SLP][REVEC] Fix ShuffleVector does not consider alternate instruction. (#134599) 2025-04-08 08:04:43 +08:00
Alexey Bataev
f413772b31 [SLP]Fix last instruction selection for vectorized last instruction in SplitVectorize nodes
If the last instruction in the SplitVectorize node is vectorized and
scheduled as part of some bundles, the SplitVectorize node might be
placed in the wrong order, leading to a compiler crash. Need to check if
the vectorized node has a vector value and place the SplitVectorize node
after the vector instruction to prevent the crash.

Fixes issue reported in https://github.com/llvm/llvm-project/pull/133091#issuecomment-2782826805
2025-04-07 09:27:08 -07:00
Han-Kuan Chen
5748ddbab4
[SLP] NFC. Add a comment to introduce the alternate instruction. (#134572) 2025-04-07 18:03:26 +08:00
Matt Arsenault
65c7ea713e
SLPVectorizer: Avoid looking at uselists of constants (#134578) 2025-04-07 16:52:11 +07:00
Alexey Bataev
19aec00735 [SLP]Initial support for (masked)loads + compress and (masked)interleaved
Added initial support for (masked)loads + compress and
(masked)interleaved loads.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/132099
2025-04-04 11:14:49 -07:00
Alexey Bataev
90cf2e31ab Revert "[SLP]Initial support for (masked)loads + compress and (masked)interleaved"
This reverts commit daab7d08078bb7cd37c66b78a56f4773e6b12fba to fix
a crash reported in https://github.com/llvm/llvm-project/issues/134411.
2025-04-04 10:09:39 -07:00
Gaëtan Bossu
aca270877f
[SLP] Use named structs in vectorizeStores() (NFC) (#132781)
This is a mostly straightforward replacement of the previous
`std::pair<int, std::set<std::pair<...>>>` data structure used in
`SLPVectorizerPass::vectorizeStores()` with slightly more readable
alternatives.

I had done that change in my local tree to help me better understand the
code. It’s not very invasive, so I thought I’d create a PR for it.
2025-04-04 16:27:25 +01:00
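To illustrate the kind of readability change the vectorizeStores() commit above describes, here is a minimal sketch. The element types are elided in the commit message, so every type and name below (`PairGroup`, `IndexedOffset`, `OffsetGroup`, `toNamed`, `makeExample`) is a hypothetical stand-in and not the struct layout used in the actual patch.

```cpp
#include <cassert>
#include <set>
#include <tuple>
#include <utility>

// Pair-based form: readers must remember what .first and .second mean.
// The inner types are assumptions made for this sketch only.
using PairGroup = std::pair<int, std::set<std::pair<int, int>>>;

// Named form: the same data, with self-describing (hypothetical) fields.
struct IndexedOffset {
  int Index;
  int Offset;
  bool operator<(const IndexedOffset &RHS) const {
    return std::tie(Index, Offset) < std::tie(RHS.Index, RHS.Offset);
  }
};

struct OffsetGroup {
  int BaseIndex;
  std::set<IndexedOffset> Members;
};

// Converts the pair-based form to the named form without changing contents.
OffsetGroup toNamed(const PairGroup &G) {
  OffsetGroup Result{G.first, {}};
  for (const auto &P : G.second)
    Result.Members.insert(IndexedOffset{P.first, P.second});
  assert(Result.Members.size() == G.second.size() && "no entries lost");
  return Result;
}

// Builds a small pair-based group for demonstration.
PairGroup makeExample() {
  PairGroup G;
  G.first = 7;
  G.second.insert(std::make_pair(1, 2));
  G.second.insert(std::make_pair(3, 4));
  return G;
}
```

With the named form, a call site reads `Group.BaseIndex` and `Entry.Offset` instead of `Group.first` and `Entry.second`, which is the readability gain the commit message refers to.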
Alexey Bataev
daab7d0807 [SLP]Initial support for (masked)loads + compress and (masked)interleaved
Added initial support for (masked)loads + compress and
(masked)interleaved loads.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/132099
2025-04-03 13:17:40 -07:00
Alexey Bataev
7c4013d591 Revert "[SLP]Initial support for (masked)loads + compress and (masked)interleaved"
This reverts commit 0bec0f5c059af5f920fe22ecda469b666b5971b0 to fix
a crash reported in https://lab.llvm.org/buildbot/#/builders/143/builds/6668.
2025-04-03 12:58:49 -07:00
Alexey Bataev
0bec0f5c05
[SLP]Initial support for (masked)loads + compress and (masked)interleaved
Added initial support for (masked)loads + compress and
(masked)interleaved loads.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/132099
2025-04-03 13:21:22 -04:00
Alexey Bataev
843ef77dc2 [SLP]Update mapping between values and their matching entries upon selection
Need to update the mapping between gathered values and their matching
entries, if the list of the entries is updated and only some of them are
selected for final shuffling.

Fixes #134085
2025-04-02 11:59:32 -07:00
Alexey Bataev
48a4b14cb6 [SLP]Fix whole vector registers calculations for compares
Need to check that the calculated number of the elements is not larger
than the original number of scalars to prevent a compiler crash.

Fixes #134013
2025-04-02 07:26:40 -07:00
Han-Kuan Chen
5bbcc765cc
[SLP][REVEC] getNumElements should not be used as VF when REVEC is enabled. (#134031) 2025-04-02 19:04:07 +08:00
Alexey Bataev
0e3049c562
[SLP]Support revectorization of the previously vectorized scalars
If a scalar instruction is marked for vectorization in the tree, it
generally cannot be vectorized as part of another node in the same tree.
This may prevent some potentially profitable vectorization
opportunities, since some nodes end up being buildvector/gather nodes,
which add to the total cost.
The patch allows revectorization of the previously vectorized scalars.

Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon, hiraditya

Pull Request: https://github.com/llvm/llvm-project/pull/133091
2025-04-01 14:30:06 -04:00
Alexey Bataev
cf6a452cc7
[SLP]Fix same/alternate analysis in split node analysis for compares
getSameOpcode may in some cases consider 2 compares as having the same
opcode, even though previously they were considered alternate. This can
happen because getSameOpcode loses info about previous instructions
and their states. Need to use the isAlternateInstruction function instead
for the correct analysis.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/133769
2025-03-31 19:33:40 -04:00
Alexey Bataev
bfd8cc0a3e [SLP]Fix a check for the whole register use
Need to check the value type, not the return type, of the instructions,
when doing the analysis for the whole register use to prevent a compiler
crash.

Fixes #133751
2025-03-31 10:52:12 -07:00
Han-Kuan Chen
65734de9b9
[SLP] NFC. Remove the redundant MainOp and AltOp find process. (#133642) 2025-03-31 10:26:45 +08:00
Alexey Bataev
1bfc61064a [SLP]Fix spill cost analysis for split vectorized nodes
If the entry is SplitVectorize, it can be skipped in favor of its
operands; the operands allow spill costs to be detected correctly.

Fixes #133288
2025-03-28 12:45:53 -07:00
Kazu Hirata
673f4705a8
[llvm] Use *Set::insert_range (NFC) (#133353)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(Elem.first);

down to:

  Set.insert_range(llvm::make_first_range(Range));

In some cases, we can further fold that into the set declaration.
2025-03-27 20:44:20 -07:00
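The loop form quoted in the `*Set::insert_range` commit above can be sketched with standard containers only. This is an illustration of the "insert the first element of each pair" pattern, not the LLVM API itself: `insertFirsts` and `collectKeys` are hypothetical names, and `std::set`/`std::map` stand in for the LLVM set types.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical helper (not an LLVM API): inserts the first element of every
// pair-like item in Range into Set, as the explicit loop in the message does.
template <typename SetT, typename RangeT>
void insertFirsts(SetT &Set, const RangeT &Range) {
  for (const auto &Elem : Range)
    Set.insert(Elem.first);
}

// Builds a small map and collects its keys into a set.
std::set<int> collectKeys() {
  const std::map<int, std::string> Range = {{1, "a"}, {2, "b"}, {3, "c"}};
  std::set<int> Keys;
  insertFirsts(Keys, Range);
  assert(Keys.size() == Range.size() && "map keys are unique");
  return Keys;
}
```

In the commit, the same effect is obtained in a single call by passing `llvm::make_first_range(Range)` to `insert_range`, so no helper like the one above is needed.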
Kazu Hirata
cde58bfc16
[Transforms] Use range constructors of *Set (NFC) (#133203) 2025-03-27 07:51:58 -07:00
Martin Storsjö
a2e5932e8b Revert "[SLP] Make getSameOpcode support interchangeable instructions. (#132887)"
This reverts commit 6e66cfeeaec6f09a4454400e45d690457ecdd3de.

This change causes crashes on compiling some inputs, see
https://github.com/llvm/llvm-project/pull/127450#issuecomment-2752833710
and
https://github.com/llvm/llvm-project/pull/127450#issuecomment-2753375326
for details.
2025-03-26 10:24:25 +02:00
Han-Kuan Chen
6e66cfeeae
[SLP] Make getSameOpcode support interchangeable instructions. (#132887)
We use the term "interchangeable instructions" to refer to different
operators that have the same meaning (e.g., `add x, 0` is equivalent to
`mul x, 1`).
Non-constant values are not supported, as they may incur high costs with
little benefit.

---------

Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
2025-03-25 19:46:15 +08:00
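The notion of "interchangeable instructions" in the commit above can be illustrated for the identity case the message quotes (`add x, 0` versus `mul x, 1`). This is only a sketch under stated assumptions: the `Opcode` enum, `isIdentityOp`, and `areInterchangeable` are hypothetical, and the helper in SLPVectorizer.cpp is not reproduced here and may cover other cases.

```cpp
#include <cstdint>

// Hypothetical opcode set for illustration; not LLVM's instruction opcodes.
enum class Opcode { Add, Sub, Mul, Shl, Or, Xor, And };

// Returns true if `Op x, C` leaves x unchanged for every x, i.e. the
// instruction is an identity operation with the constant operand C.
constexpr bool isIdentityOp(Opcode Op, std::int64_t C) {
  switch (Op) {
  case Opcode::Add:
  case Opcode::Sub:
  case Opcode::Shl:
  case Opcode::Or:
  case Opcode::Xor:
    return C == 0;
  case Opcode::Mul:
    return C == 1;
  case Opcode::And:
    return C == -1; // all bits set
  }
  return false;
}

// Two identity operations compute the same value, so either opcode could
// stand in for the other when looking for a common opcode.
constexpr bool areInterchangeable(Opcode OpA, std::int64_t CA, Opcode OpB,
                                  std::int64_t CB) {
  return isIdentityOp(OpA, CA) && isIdentityOp(OpB, CB);
}
```

Restricting the check to constant operands mirrors the message's note that non-constant values are not supported.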
Alexey Bataev
8122bb9dbe [SLP]Fix a check for non-schedulable instructions
Need to fix the check for non-schedulable instructions in the
getLastInstructionInBundle function, because this check may not work
correctly during codegen. Instead, need to check that these
instructions were actually never scheduled, since the scheduling analysis
is always performed before codegen and is stable.

Fixes #132841
2025-03-25 04:35:33 -07:00
Han-Kuan Chen
2682a9433b
[SLP][REVEC] Add ExtractSubvector cost for ExternalUses. (#132761)
For llvm/test/Transforms/SLPVectorizer/revec-shufflevector.ll,
ScalarCost and ExtraCost are 1, so the original scalar will be kept.
2025-03-25 18:58:54 +08:00
Martin Storsjö
b33bec9b21 Revert "[SLP] Make getSameOpcode support interchangeable instructions. (#127450)"
This reverts commit 71a0cfd93263552ddc0bfd2ea7b0abe9a578f87e.

This commit triggers failed asserts when compiling ffmpeg. The
issue is reproducible with a small standalone reproducer like this:

    void make_filters_from_proto(int *filter[][2], int bands) {
      int c, q, n;
      for (;; q++) {
        n = 0;
        for (; n < 7; n++) {
          int theta = (q * (n - 6) + (n >> 1) - 3) % bands;
          if (theta)
            c = theta;
          filter[q][n][0] = c;
        }
      }
    }

$ clang -target x86_64-linux-gnu -c repro.c -O3
clang: ../lib/Transforms/Vectorize/SLPVectorizer.cpp:989: llvm::SmallVector<llvm
::Value*> {anonymous}::BinOpSameOpcodeHelper::InterchangeableInfo::getOperand(ll
vm::Instruction*) const: Assertion `FromCIValue.isZero() && "Cannot convert the
instruction."' failed.

The same issue also reproduces for a large number of other target
triples, aarch64-linux-gnu and others.
2025-03-25 10:22:44 +02:00
Martin Storsjö
dd059338a2 Revert "[Vectorize] Fix a warning"
This reverts commit 4c68061254c896214b7ad5ab807ac4ba11517812.

Reverting as part of a revert of a preceding commit.
2025-03-25 10:21:05 +02:00
Kazu Hirata
4c68061254 [Vectorize] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:855:52: error:
  unused variable 'SupportedOp' [-Werror,-Wunused-const-variable]
2025-03-24 17:38:47 -07:00
Han-Kuan Chen
71a0cfd932
[SLP] Make getSameOpcode support interchangeable instructions. (#127450)
We use the term "interchangeable instructions" to refer to different
operators that have the same meaning (e.g., `add x, 0` is equivalent to
`mul x, 1`).
Non-constant values are not supported, as they may incur high costs with
little benefit.

---------

Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
2025-03-25 08:24:46 +08:00
Alexey Bataev
ad9909dd73
[SLP]Fix perfect diamond match with extractelements in scalars
Need to drop all previous estimations/vectorizations when a perfect
diamond match is found. This improves cost estimation and code
emission.
Also, need to adjust the getScalarizationOverhead cost for a non-poison
input vector. Currently, it cannot be estimated correctly, so a
conservative element-by-element insertelement cost is used instead for
each unique scalar.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/132466
2025-03-24 09:29:18 -04:00
Alexey Bataev
3b0ec61156
[SLP][NFC] Redesign schedule bundle, separate from schedule data, NFC
This is the initial patch, intended to support revectorization of the
previously vectorized scalars. If the scalar is marked for
vectorization, it becomes a part of the schedule bundle, used to check
dependencies and then schedule tree entry scalars into a single batch of
instructions. Unfortunately, this info is currently part of the
ScheduleData struct, which does not allow making scalars part of many
bundles. The patch separates schedule bundles from ScheduleData and
introduces an explicit ScheduleBundle class for bundles, so that it can
later be extended to support revectorization of the previously vectorized
scalars.

Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon, hiraditya

Pull Request: https://github.com/llvm/llvm-project/pull/131625
2025-03-21 13:36:57 -04:00
Han-Kuan Chen
73558dc329
[SLP][REVEC] Fix getStoreMinimumVF only accept scalar types. (#132181)
Fix "Element type of a VectorType must be an integer, floating point,
or pointer type.".
2025-03-20 21:04:30 +08:00
Han-Kuan Chen
a5d4b50f93
[SLP] NFC. Change the inner loop and outer loop of appendOperandsOfVL. (#132152) 2025-03-20 20:32:20 +08:00
Han-Kuan Chen
c3e16337a4
[SLP][REVEC] Ignore UserTreeIndex if it is empty. (#131993)
Previously, the all_of check did not consider the case where the
TreeEntry is empty (i.e., when it is the first entry).
2025-03-20 11:31:49 +08:00
Kazu Hirata
0dcc201ac4
[Transforms] Use *Set::insert_range (NFC) (#132056)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin like succ_begin for now.
2025-03-19 15:35:01 -07:00
Longsheng Mou
f3f7f08eca
[SLP] Fix Wsign-compare warning (NFC) (#131948)
llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4805:57:
warning: comparison of integer expressions of different signedness:
‘int’ and ‘std::size_t’ {aka ‘long unsigned int’} [-Wsign-compare]
    [](const auto &P) { return P.value() % 2 != P.index() % 2; }))
                               ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
2025-03-19 17:01:42 +08:00
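The warning in the commit above arises because an `int` expression is compared with a `std::size_t` expression. One common way to silence it is to cast the index side so both operands are signed; the sketch below shows that approach with a hypothetical `hasParityMismatch` function over a plain `std::vector<int>`. The log does not show how the commit itself resolves the warning, so this is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if any mask value has a different parity than its position.
// Casting the index parity to int keeps both sides of the comparison signed,
// which avoids -Wsign-compare.
bool hasParityMismatch(const std::vector<int> &Mask) {
  for (std::size_t Idx = 0; Idx < Mask.size(); ++Idx) {
    assert(Mask[Idx] >= 0 && "sketch assumes non-negative mask values");
    if (Mask[Idx] % 2 != static_cast<int>(Idx % 2))
      return true;
  }
  return false;
}
```

The cast is safe here because `Idx % 2` is always 0 or 1, so no value is lost when it is narrowed to `int`.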
Alexey Bataev
45090b3059 [SLP]Check the whole def-use chain in the tree to find proper dominance, if the last instruction is the same
If the insertion point (last instruction) of the user nodes is the same,
need to check the whole def-use chain in the tree to find proper
dominance to prevent a compiler crash.

Fixes #131818
2025-03-18 10:01:13 -07:00
Jeffrey Byrnes
4336e5edbc
[SLP] Sort PHIs by ExtractElements when relevant (#131229)
Considering the PHIs in order of element extracted can lead to better shuffles.
2025-03-17 14:19:46 -07:00
Alexey Bataev
ead9d6a56d [SLP]Check VectorizableTree is not empty before accessing elements
Need to check VectorizableTree is not empty before accessing elements.

Fixes #131635
2025-03-17 11:04:38 -07:00
Alexey Bataev
fbf0276b6a [SLP] Reorder reuses mask, if it is not empty, for subvector operands
If the subvector operands have a reuses mask, need to reorder the mask, not
the scalars, to prevent a compiler crash due to a mask/scalars size
mismatch.

Fixes #131360
2025-03-14 14:11:09 -07:00
Alexey Bataev
605a9f590d [SLP]Check if user node is same as other node and check operand order
Need to check if the user node is the same as the other node and check
operand order to prevent a compiler crash when trying to find a matching
gather node with user nodes that have the same last instruction.

Fixes #131195
2025-03-14 13:46:07 -07:00
Alexey Bataev
9c86198caf [SLP] Update vector value for incoming phi node, being vectorized already
If the phi node contains multiple same incoming blocks/values, need to
update the corresponding vectorized value, if it is not going to be
vectorized, if the incoming value was vectorized already.

Fixes #131355
2025-03-14 12:53:56 -07:00
Alexey Bataev
bbd1bb4057 [SLP]Set insert point for split node with non-schedulable instructions after the last instruction
Need to set the insert point for non-schedulable instructions in
SplitVectorize node after the last instruction, not before, to avoid
a crash in case of buildvector subvector node.
2025-03-14 07:04:55 -07:00