2313 Commits

Author SHA1 Message Date
Kazu Hirata
07eb7b7692
[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>.  Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:

  template <typename PointeeType, unsigned N>
  class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {};

We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
2025-08-18 07:01:29 -07:00
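The "redirection" that 07eb7b7692 refers to is an ordinary partial specialization. A minimal sketch of the mechanism follows. The classes in namespace `sketch` are hypothetical stand-ins built on `std::set`, not the real llvm/ADT implementations, and `RedirectsToPtrSet` is a helper invented here for illustration.

```cpp
#include <set>
#include <type_traits>

// Hypothetical stand-ins, NOT the real llvm/ADT classes: they only mirror the
// shape of the partial specialization quoted in the commit message.
namespace sketch {

template <typename PtrType, unsigned N> class SmallPtrSet {
  std::set<PtrType> Storage; // placeholder storage for the sketch

public:
  bool insert(PtrType P) { return Storage.insert(P).second; }
  bool contains(PtrType P) const { return Storage.count(P) != 0; }
};

// Primary template: the general-purpose set for non-pointer element types.
template <typename T, unsigned N> class SmallSet {
  std::set<T> Storage;

public:
  bool insert(const T &V) { return Storage.insert(V).second; }
  bool contains(const T &V) const { return Storage.count(V) != 0; }
};

// The "redirection": for pointer element types, SmallSet is a SmallPtrSet.
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType *, N> : public SmallPtrSet<PointeeType *, N> {};

// Illustrative helper: true when SmallSet<T, N> derives from SmallPtrSet<T, N>.
template <typename T, unsigned N>
struct RedirectsToPtrSet
    : std::is_base_of<SmallPtrSet<T, N>, SmallSet<T, N>> {};

} // namespace sketch
```

In this sketch `sketch::RedirectsToPtrSet<int *, 4>::value` is true and `sketch::RedirectsToPtrSet<int, 4>::value` is false, which is why spelling the type as SmallPtrSet directly for pointer elements changes readability but not behavior.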
Alexey Bataev
b157599156 [SLP]Do not include copyable data to the same user twice
If the copyable schedule data is created and the user is used several
times in the user node, there is no need to count the same data for the
same user several times; it needs to be included only once.

Fixes #153754
2025-08-15 12:36:45 -07:00
Alexey Bataev
09f5b9ab0a Revert "[SLP]Do not include copyable data to the same user twice"
This reverts commit 758c6852c3ffe6b5e259cafadd811e60d8c276fb to fix
buildbot  https://lab.llvm.org/buildbot/#/builders/195/builds/13298
2025-08-15 12:08:31 -07:00
Alexey Bataev
758c6852c3 [SLP]Do not include copyable data to the same user twice
If the copyable schedule data is created and the user is used several
times in the user node, there is no need to count the same data for the
same user several times; it needs to be included only once.

Fixes #153754
2025-08-15 11:47:35 -07:00
Alexey Bataev
13b54f7dc1 [SLP] Recalculate dependencies for potential control dependencies if cleared
If the control dependencies are cleared after cancellation of the
copyables, they need to be recalculated unconditionally.

Fixes #153754 #153676
2025-08-15 07:52:10 -07:00
Alexey Bataev
bf2f241458 [SLP]Support LShr as base for copyable elements
Added support for LShr instructions as base for copyable elements. Also,
added simple analysis for best base instruction selection, if multiple
candidates are available.

Fixed scheduling after cancellation

Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/153393
2025-08-14 19:12:27 -07:00
Alex Bradbury
db5f7dc374 Revert "[SLP]Support LShr as base for copyable elements"
This reverts commit ca4ebf95172d24f8c47655709b2c9eb85bda5cb2.

Causes compile-time crashes for some inputs with RVV zvl512b/zvl1024b
configurations. See here for a minimal reproducer:
https://github.com/llvm/llvm-project/pull/153393#issuecomment-3189898813
2025-08-14 22:18:24 +01:00
Alexey Bataev
ca4ebf9517
[SLP]Support LShr as base for copyable elements
Added support for LShr instructions as base for copyable elements. Also,
added simple analysis for best base instruction selection, if multiple
candidates are available.

Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/153393
2025-08-14 12:35:28 -04:00
Alexey Bataev
d57ab276b6 [SLP] Recalculate cleared deps for potential control schedule data nodes
Need to recalculate the dependencies for all potential control data
schedule nodes to prevent a compiler crash.

Fixes #153571
2025-08-14 09:00:42 -07:00
Kazu Hirata
1f04b15c56
[Vectorize] Remove a redundant call to std::unique_ptr<T>::get (NFC) (#153359) 2025-08-13 10:37:31 -07:00
Alexey Bataev
dd5ba694bd [SLP]Recalculate deps for potential control-dependent schedule data
After clearing the dependencies in copyable data, need to recalculate
dependencies for the original ScheduleData, if it can be marked as
control dependent.

Fixes #153289
2025-08-13 08:18:26 -07:00
Sam Tebbs
0bfa1718af
[LV] Create in-loop sub reductions (#147026)
This PR allows the loop vectorizer to handle in-loop sub reductions by
forming a normal in-loop add reduction with a negated input.

Stacked PRs:
1. -> https://github.com/llvm/llvm-project/pull/147026
2. https://github.com/llvm/llvm-project/pull/147255
3. https://github.com/llvm/llvm-project/pull/147302
4. https://github.com/llvm/llvm-project/pull/147513
2025-08-12 10:22:41 +01:00
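The rewrite described in 0bfa1718af rests on the identity acc - x == acc + (-x). The sketch below models it on plain arrays, assuming wrapping unsigned arithmetic. The function names and the `VF` parameter are illustrative and do not correspond to loop vectorizer APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference: a subtraction reduction, Acc -= X[i].
inline uint64_t subReductionScalar(uint64_t Start,
                                   const std::vector<uint64_t> &X) {
  uint64_t Acc = Start;
  for (uint64_t V : X)
    Acc -= V;
  return Acc;
}

// Hypothetical model of the in-loop form: each "vector iteration" negates its
// VF inputs, reduces them with an ordinary add reduction, and adds the
// partial sum to the scalar accumulator.
inline uint64_t subReductionInLoop(uint64_t Start,
                                   const std::vector<uint64_t> &X, size_t VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  uint64_t Acc = Start;
  size_t I = 0;
  for (; I + VF <= X.size(); I += VF) {
    uint64_t Partial = 0; // add reduction over one vector of negated inputs
    for (size_t L = 0; L < VF; ++L)
      Partial += uint64_t{0} - X[I + L];
    Acc += Partial;
  }
  for (; I < X.size(); ++I) // scalar remainder
    Acc -= X[I];
  return Acc;
}
```

Both functions return the same value for any input because unsigned arithmetic wraps, so negating first and adding afterwards cannot change the result.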
Alexey Bataev
2d7b55a028
[SLP]Initial support for copyable elements
Adds initial support for copyable elements, both schedulable and
non-schedulable.
Adds support only for add for now; other opcodes will be added in the
future. Some cases are still not handled, e.g. stores are not included,
because they currently do not check for copyable elements.

Reviewers: hiraditya, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/147366
2025-08-11 09:41:19 -04:00
Alexey Bataev
67af2f6c5c [SLP]Initial FMAD support (#149102)
Added initial check for potential fmad conversion in reductions and
operands vectorization.

Added the check for instruction to fix #152683

Skipped the code for reduction to avoid regressions.
2025-08-11 05:53:55 -07:00
David Green
cfe190979e Revert "[SLP]Initial FMAD support (#149102)"
This reverts commit 0fffb9f9ed81f4c2084b8fe040c88b60bb6c372a due to major
performance regressions.
2025-08-10 15:16:01 +01:00
Alexey Bataev
0fffb9f9ed [SLP]Initial FMAD support (#149102)
Added initial check for potential fmad conversion in reductions and
operands vectorization.

Added the check for instruction to fix #152683
2025-08-08 10:30:23 -07:00
Alexey Bataev
0419b459be Revert "[SLP]Initial FMAD support (#149102)"
This reverts commit 0bcf45ea3458ba79eb4257afcfd6af954292c9ce to fix the
regressions reported in https://github.com/llvm/llvm-project/issues/152683
2025-08-08 09:17:59 -07:00
Alexey Bataev
0bcf45ea34
[SLP]Initial FMAD support (#149102)
Added initial check for potential fmad conversion in reductions and
operands vectorization.
2025-08-07 09:51:43 -04:00
Alexey Bataev
4784ce9ebc [SLP][NFC]Check an external user before trying to address it in debug dump, NFC 2025-08-06 08:58:16 -07:00
Alexey Bataev
e27831ff9b [SLP] Fix a check for main/alternate interchanged instruction
If the instruction is checked for matching the main instruction, need to
check if the opcode of the main instruction is compatible with the
operands of the instruction. If they are not, need to check the
alternate instruction and its operands for compatibility and return
alternate instruction as a match.

Fixes #151699

Fixed check for non-supported binary operations.
2025-08-04 11:20:54 -07:00
Michael Halkenhäuser
70af09e3a1
Revert "[SLP] Fix a check for main/alternate interchanged instruction" (#151997)
This reverts commit 3ee8d047109ea4bb479095f4b153c2120a8d726c.

Revert reason: FAILED build for openmp-offload-amdgpu-runtime-2 
https://lab.llvm.org/buildbot/#/builders/10/builds/10827
2025-08-04 12:57:20 -04:00
Alexey Bataev
3ee8d04710 [SLP] Fix a check for main/alternate interchanged instruction
If the instruction is checked for matching the main instruction, need to
check if the opcode of the main instruction is compatible with the
operands of the instruction. If they are not, need to check the
alternate instruction and its operands for compatibility and return
alternate instruction as a match.

Fixes #151699
2025-08-04 08:31:35 -07:00
Alexey Bataev
7cd1ce3aa0 [SLP]Check vector-like instruction for dominance in copyables
Need to check if the vector-like instruction is dominated by the main
operation in the copyables to prevent a broken def-use chain.

Fixes #151456
2025-08-04 06:14:19 -07:00
Kazu Hirata
3549134836
[Vectorize] Remove an unnecessary cast (NFC) (#151850)
getNumElements() already returns unsigned.
2025-08-03 08:44:50 -07:00
Alexey Bataev
ef98e248c7 [SLP]Initial support for copyable elements (non-schedulable only)
Adds initial support for copyable elements. This patch only models adds
and models copyable elements as add <element>, 0, i.e. uses identity
constants for missing lanes.
Only support for elements that do not require scheduling is added, to
reduce the size of the patch.

Fixed compile time regressions, reported crashes, updated release notes

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/140279
2025-07-25 10:55:07 -07:00
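The "add <element>, 0" modeling from ef98e248c7 can be illustrated on a fixed 4-lane bundle. This is a sketch under stated assumptions: `Lane`, `Operands`, `buildAddOperands`, `vectorAdd`, and `example` are hypothetical names made up for the illustration, not SLPVectorizer types or functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical lane description: either a real `add LHS, RHS` or a plain
// value (held in LHS) that has no add instruction at all.
struct Lane {
  bool IsAdd;
  int32_t LHS;
  int32_t RHS; // ignored when IsAdd is false
};

struct Operands {
  std::array<int32_t, 4> LHS;
  std::array<int32_t, 4> RHS;
};

// Build operand vectors for a 4-wide add. A lane that is not an add is
// treated as copyable: it is modeled as `add <element>, 0`, so its RHS slot
// receives the identity constant 0.
inline Operands buildAddOperands(const std::array<Lane, 4> &Lanes) {
  Operands Ops{};
  for (size_t I = 0; I < Lanes.size(); ++I) {
    Ops.LHS[I] = Lanes[I].LHS;
    Ops.RHS[I] = Lanes[I].IsAdd ? Lanes[I].RHS : 0;
  }
  return Ops;
}

// Lane-wise add standing in for a single vector add instruction.
inline std::array<int32_t, 4> vectorAdd(const Operands &Ops) {
  std::array<int32_t, 4> Result{};
  for (size_t I = 0; I < Result.size(); ++I)
    Result[I] = Ops.LHS[I] + Ops.RHS[I];
  return Result;
}

// Usage: lanes 1 and 3 are copyable and pass through unchanged.
inline void example() {
  std::array<Lane, 4> Lanes = {
      {{true, 1, 2}, {false, 7, 99}, {true, 3, 4}, {false, -5, 0}}};
  std::array<int32_t, 4> R = vectorAdd(buildAddOperands(Lanes));
  assert(R[0] == 3 && R[1] == 7 && R[2] == 7 && R[3] == -5);
}
```

Because 0 is the identity for add, the copyable lanes keep their original values while the whole bundle still maps onto one uniform vector operation.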
Martin Storsjö
936ee35dcc Revert "[SLP]Initial support for copyable elements (non-schedulable only)"
This reverts commit 898bba311f180ed54de33dc09e7071c279a4942a.

This change caused hangs and crashes, see
https://github.com/llvm/llvm-project/pull/140279#issuecomment-3115051063.
2025-07-25 01:22:20 +03:00
Martin Storsjö
bd170b78bb Revert "[SLP] Check if the user node has state before trying getting main instruction/opcode"
This reverts commit c9cea24fe68e24750b2d479144f839e1c2ec9d2b.

This is being reverted as it is intermixed with another commit
(898bba311f180ed54de33dc09e7071c279a4942a) that needs to be reverted.
2025-07-25 01:22:19 +03:00
Alexey Bataev
c9cea24fe6 [SLP] Check if the user node has state before trying getting main instruction/opcode
Need to check if the parent node has state to prevent compiler crashes.
Fixes #150479
2025-07-24 12:00:43 -07:00
Alexey Bataev
898bba311f [SLP]Initial support for copyable elements (non-schedulable only)
Adds initial support for copyable elements. This patch only models adds
and models copyable elements as add <element>, 0, i.e. uses identity
constants for missing lanes.
Only support for elements that do not require scheduling is added, to
reduce the size of the patch.

Fixed compile time regressions, updated release notes

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/140279
2025-07-23 13:38:34 -07:00
Alexey Bataev
a415d68e48 Revert "[SLP]Initial support for copyable elements (non-schedulable only)"
This reverts commit e202dba288edd47f1b370cc43aa8cd36a924e7c1 to try to
resolve compile time issues, reported in https://llvm-compile-time-tracker.com/compare.php?from=36089e5d983fe9ae00f497c2d500f37227f82db1&to=e202dba288edd47f1b370cc43aa8cd36a924e7c1&stat=instructions%3Au&details=on
2025-07-22 07:39:32 -07:00
Alexey Bataev
e202dba288
[SLP]Initial support for copyable elements (non-schedulable only)
Adds initial support for copyable elements. This patch only models adds
and models copyable elements as add <element>, 0, i.e. uses identity
constants for missing lanes.
Only support for elements that do not require scheduling is added, to
reduce the size of the patch.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/140279
2025-07-21 14:07:28 -04:00
Florian Hahn
004c67ea25
[LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239)
Update LV to vectorize maxnum/minnum reductions without fast-math flags,
by adding an extra check in the loop if any inputs to maxnum/minnum are
NaN, due to maxnum/minnum behavior w.r.t. signaling NaNs. Signed zeros
are already handled consistently by maxnum/minnum.

If any input is NaN,
 * exit the vector loop,
 * compute the reduction result up to the vector iteration that contained
   NaN inputs and
 * resume in the scalar loop


New recurrence kinds are added for reductions using maxnum/minnum
without fast-math flags.

PR: https://github.com/llvm/llvm-project/pull/148239
2025-07-18 21:58:19 +01:00
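The three steps listed in 004c67ea25 can be modeled on a plain array. This is only a rough sketch: `maxnumReduce` and its `VF` parameter are hypothetical, `std::fmax` stands in for maxnum, and the choice to hand the whole NaN-containing chunk to the scalar loop is an assumption about where the hand-over happens.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of the strategy: process VF-wide chunks; when a
// chunk contains a NaN, leave the "vector loop", keep the result of the
// earlier chunks, and let the scalar loop handle everything from there on.
inline float maxnumReduce(float Start, const std::vector<float> &X,
                          size_t VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  float Acc = Start;
  size_t I = 0;
  for (; I + VF <= X.size(); I += VF) {
    bool AnyNaN = false;
    for (size_t L = 0; L < VF; ++L)
      AnyNaN = AnyNaN || std::isnan(X[I + L]);
    if (AnyNaN)
      break; // exit the vector loop; Acc covers the chunks before this one
    for (size_t L = 0; L < VF; ++L)
      Acc = std::fmax(Acc, X[I + L]);
  }
  for (; I < X.size(); ++I) // scalar loop resumes at the first unprocessed element
    Acc = std::fmax(Acc, X[I]);
  return Acc;
}
```

The scalar tail doubles as the ordinary remainder loop, so inputs without any NaN take the chunked path for every full chunk and only the leftover elements run scalar.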
Alexey Bataev
60ae9c9c63
[SLP]Do not consider non-profitable loads slices
If all slices are small and end up with strided or even vectorization
states, better to not consider these candidates for the vectorization
and try to vectorize the whole bunch as gathered loads.

Reviewers: hiraditya, RKSimon, HanKuanChen

Reviewed By: RKSimon, HanKuanChen

Pull Request: https://github.com/llvm/llvm-project/pull/149209
2025-07-17 08:00:02 -04:00
Piotr Fusik
ade2f1023d
[SLP][NFCI] Don't trim indexes, reuse a variable (#149074) 2025-07-16 14:09:27 +02:00
Piotr Fusik
7674566c96
[SLP][NFC] Simplify count_if to count (#149072) 2025-07-16 14:09:09 +02:00
Piotr Fusik
949103b45c
[SLP][NFC] Use range-based for in matchAssociativeReduction (#149029) 2025-07-16 14:08:41 +02:00
Jeremy Morse
57a5f9c47e
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this
skipping. Hooray!
2025-07-15 15:34:10 +01:00
Gaëtan Bossu
adb6efeac9
[SLP] Fix cost estimation of external uses with wrong VF (#148185)
It assumed that the VF remains constant throughout the tree. That's not
always true. This meant that we could query the extraction cost for a
lane that is out of bounds.

While experimenting with re-vectorisation for AArch64, we ran into this
issue. We cannot add a proper AArch64 test as more changes would need to
be brought in.

This commit is only fixing the computation of VF and adding an assert.
Some tests were failing after adding the assert:
 - foo() in llvm/test/Transforms/SLPVectorizer/X86/horizontal.ll
 - test() in
   llvm/test/Transforms/SLPVectorizer/X86/reduction-with-removed-extracts.ll
 - test_with_extract() in
   llvm/test/Transforms/SLPVectorizer/RISCV/segmented-loads.ll
2025-07-15 11:39:09 +01:00
Alexey Bataev
a999a1b88c
[SLP]Remove emission of vector_insert/vector_extract intrinsics
Replaced by the regular shuffles.

Fixes #145512

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/148007
2025-07-11 15:26:45 -04:00
Gaëtan Bossu
d386b3b0b5
[SLP] Harmonise findLaneForValue() return type (NFC) (#148232)
The lane is computed as an unsigned, so let's return it as unsigned.
2025-07-11 14:05:22 +01:00
Alexey Bataev
dd60663b9b [SLP] Emit reduction instead of 2 extracts + scalar op, when vectorizing operands (#147583)
Added emission of the 2-element reduction instead of 2 extracts + scalar
op, when trying to vectorize operands of the instruction, if it is more
profitable.
2025-07-10 12:50:52 -07:00
Alex Bradbury
18627e995c Revert "[SLP] Emit reduction instead of 2 extracts + scalar op, when vectorizing operands (#147583)"
This reverts commit ac4a38e9bd573a173432b89cbef7cce7a48e7907.

This breaks the RVV builders
(MicroBenchmarks/ImageProcessing/Blur/blur.test and
MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test from llvm-test-suite)
and reportedly SPEC Accel2023
<https://github.com/llvm/llvm-project/pull/147583#issuecomment-3057183138>.
2025-07-10 14:55:22 +01:00
Alexey Bataev
ac4a38e9bd
[SLP] Emit reduction instead of 2 extracts + scalar op, when vectorizing operands (#147583)
Added emission of the 2-element reduction instead of 2 extracts + scalar
op, when trying to vectorize operands of the instruction, if it is more
profitable.
2025-07-09 19:52:09 -04:00
Ramkumar Ramachandra
62f8377e40
[LV] Extend FindFirstIV to unsigned case (#146386)
Extend FindFirstIV vectorization to the unsigned case by introducing and
handling FindFirstIVUMin.

Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-07-09 15:56:40 +01:00
Alexey Bataev
9e132f5068 [SLP][NFC]Move function SLPVectorizerPass::tryToVectorize around, NFC 2025-07-09 05:34:36 -07:00
Gaëtan Bossu
50facad7fc
[SLP][REVEC] Fix insertelement legality checks (#146921)
The current code assumes that all the values in VL are valid
instructions, while it is possible to get poison.
2025-07-09 10:28:50 +01:00
Rahul Joshi
b38de6c18e
[NFCI][LLVM] Adopt ArrayRef::consume_front() in a few places (#146793) 2025-07-04 10:42:14 -07:00
Austin
a550fef906
[llvm] Use llvm::fill instead of std::fill(NFC) (#146911)
Use llvm::fill instead of std::fill
2025-07-04 14:10:28 +08:00
Florian Hahn
20fbbd7675
[LV] Add support for cmp reductions with decreasing IVs. (#140451)
Similar to FindLastIV, add FindFirstIVSMin to support select (icmp(), x, y)
reductions where one of x or y is a decreasing induction, producing a SMin
reduction. It uses signed max as sentinel value.

PR: https://github.com/llvm/llvm-project/pull/140451
2025-06-29 11:17:03 +01:00
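The reduction shape named in 20fbbd7675 can be sketched as follows, assuming a loop that counts down and a simple greater-than compare. `findScalar`, `findViaSMin`, `example`, and the `Threshold` parameter are hypothetical names chosen for illustration, and restoring the start value from the sentinel at the end is an assumption about how the final result is formed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Scalar reference: Res = select(A[I] > Threshold, I, Res) with a decreasing
// induction variable I, so the last update wins and holds the smallest I.
inline int64_t findScalar(const std::vector<int> &A, int Threshold,
                          int64_t Start) {
  int64_t Res = Start;
  for (int64_t I = static_cast<int64_t>(A.size()) - 1; I >= 0; --I)
    Res = A[static_cast<size_t>(I)] > Threshold ? I : Res;
  return Res;
}

// Hypothetical model of the SMin form: lanes that do not match contribute the
// signed-max sentinel, the reduction takes the signed minimum, and a result
// still equal to the sentinel means "no match", so Start is returned.
inline int64_t findViaSMin(const std::vector<int> &A, int Threshold,
                           int64_t Start) {
  const int64_t Sentinel = std::numeric_limits<int64_t>::max();
  int64_t Min = Sentinel;
  for (int64_t I = static_cast<int64_t>(A.size()) - 1; I >= 0; --I) {
    int64_t Candidate = A[static_cast<size_t>(I)] > Threshold ? I : Sentinel;
    Min = std::min(Min, Candidate);
  }
  return Min == Sentinel ? Start : Min;
}

// Usage: both forms agree with and without a matching element.
inline void example() {
  std::vector<int> A = {3, 9, 1, 8, 2};
  assert(findScalar(A, 5, -1) == 1 && findViaSMin(A, 5, -1) == 1);
  assert(findScalar(A, 100, -1) == -1 && findViaSMin(A, 100, -1) == -1);
}
```

Since a minimum is order-independent, the per-lane candidates can be combined in any order, which is what makes the select chain expressible as a reduction.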
Ramkumar Ramachandra
bb8c42e859
[LV] Extend FindLastIV to unsigned case (#141752)
Split the FindLastIV RecurKind into SMax and UMax variants, depending on
the reduction op produced.
2025-06-23 15:27:49 +01:00