1788 Commits

Author SHA1 Message Date
Alexey Bataev
c5c1bd164f [SLP]Improve minbitwidth analysis for trun'ed gather nodes.
If the gather node is trunc'ed, better to trunc scalars and then gather
them rather than gather and then trunc. Trunc for scalars is free in
most cases.
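
The equivalence this commit relies on can be sketched outside LLVM. The following is a minimal illustration only, assuming 32-bit scalars narrowed to 8 bits and a four-lane "vector" modelled as `std::array`; the helper names `gather`, `gatherThenTrunc`, and `truncThenGather` are hypothetical and are not part of the SLP vectorizer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: "gather" four scalars into a fixed-size vector.
template <typename T>
std::array<T, 4> gather(T A, T B, T C, T D) {
  return {A, B, C, D};
}

// Gather the wide scalars first, then truncate every lane of the vector.
std::array<std::uint8_t, 4> gatherThenTrunc(std::uint32_t A, std::uint32_t B,
                                            std::uint32_t C, std::uint32_t D) {
  std::array<std::uint32_t, 4> Wide = gather(A, B, C, D);
  std::array<std::uint8_t, 4> Narrow{};
  for (std::size_t I = 0; I < Wide.size(); ++I)
    Narrow[I] = static_cast<std::uint8_t>(Wide[I]); // per-lane truncation
  return Narrow;
}

// Truncate each scalar first, then gather the already-narrow values.
std::array<std::uint8_t, 4> truncThenGather(std::uint32_t A, std::uint32_t B,
                                            std::uint32_t C, std::uint32_t D) {
  return gather(static_cast<std::uint8_t>(A), static_cast<std::uint8_t>(B),
                static_cast<std::uint8_t>(C), static_cast<std::uint8_t>(D));
}
```

Both functions return the same lanes for any input, so the choice between them is purely a cost question; the commit argues the scalar truncations are usually free.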

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/99072
2024-07-17 07:41:00 -07:00
Alexey Bataev
05b067b5f9 Revert "[SLP]Improve minbitwidth analysis for trun'ed gather nodes."
This reverts commit d3d2f9a4208eedbd2f372c34725ab61c3f4d3aed to fix
buildbot https://lab.llvm.org/buildbot/#/builders/92/builds/1880.
2024-07-17 07:31:27 -07:00
Alexey Bataev
d3d2f9a420 [SLP]Improve minbitwidth analysis for trun'ed gather nodes.
If the gather node is trunc'ed, better to trunc scalars and then gather
them rather than gather and then trunc. Trunc for scalars is free in
most cases.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/99072
2024-07-17 07:29:02 -07:00
Alexey Bataev
b05ccaf451 Revert "[SLP]Improve minbitwidth analysis for trun'ed gather nodes."
This reverts commit 6425f2d66740b84fc3027b649cd4baf660c384e8 to fix the
buildbot issues reported in https://lab.llvm.org/buildbot/#/builders/95/builds/1404.
2024-07-17 05:51:54 -07:00
Han-Kuan Chen
1813ffd6b2
[SLP][REVEC] Make SLP support revectorization (-slp-revec) and add simple test. (#98269)
This PR will make SLP support revectorization. Add an option -slp-revec
to control the functionality.

reference:

https://discourse.llvm.org/t/rfc-make-slp-vectorizer-revectorize-vector-instructions/79436
2024-07-17 20:14:12 +08:00
Alexey Bataev
6425f2d667
[SLP]Improve minbitwidth analysis for trun'ed gather nodes.
If the gather node is trunc'ed, better to trunc scalars and then gather
them rather than gather and then trunc. Trunc for scalars is free in
most cases.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/99072
2024-07-17 07:17:25 -04:00
Alexey Bataev
15915c06d5
[SLP]Do not vectorize small (<=2) buildvector/buildvalue sequences with MaxVF==true.
If MaxVFOnly for buildvector/buildvalue vectorization is set to true and the
total number of elements to vectorize is <= 2, it is better to try to
vectorize reductions first, which may produce a larger tree (reductions
require at least 4 elements to vectorize). The smaller
buildvector/buildvalue sequences are vectorized later, with MaxVFOnly
set to false.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/98957
2024-07-16 12:45:58 -04:00
Alexey Bataev
8ff233f4f1 [SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats.
The patch enables detection of minnum/maxnum patterns for floating-point
instructions represented as select/cmp. It also enables better cost
estimation for integer min/max patterns since the compiler starts
to estimate the scalars separately.

Reviewers: nikic, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/98570
2024-07-16 09:42:08 -07:00
Alexey Bataev
c3540d0b6b Revert "[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats."
This reverts commit c7aac38c29f564bc48f7cfb71d3b3b8b482c873b to fix
crashes revealed by the buildbot in https://lab.llvm.org/buildbot/#/builders/168/builds/1104.
2024-07-16 05:59:59 -07:00
Alexey Bataev
c7aac38c29
[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats.
The patch enables detection of minnum/maxnum patterns for floating-point
instructions represented as select/cmp. It also enables better cost
estimation for integer min/max patterns since the compiler starts
to estimate the scalars separately.

Reviewers: nikic, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/98570
2024-07-16 08:14:27 -04:00
Alexey Bataev
beccecaacd [SLP]Fix PR98838: do not replace condition of select-based logical op by poison.
If the reduction operation is a select-based logical op, the condition
should not be replaced by poison; it is better to replace it with a
non-poisoning constant to prevent poison propagation in the vector code.

Fixes https://github.com/llvm/llvm-project/issues/98838
2024-07-15 07:27:54 -07:00
Alexey Bataev
9e261c5bee [SLP]Do not salvage debug info from instructions, marked for deletion already.
If the instruction was already processed for deletion, there is no need
to process it a second time; doing so may cause a compiler crash.
2024-07-12 08:08:50 -07:00
Alexey Bataev
01a9888694 [SLP][NFC]Add isGather() function and use it instead of direct comparison, NFC. 2024-07-11 11:56:32 -07:00
Alexey Bataev
3742c2a83c [SLP]Use stored signedness after minbitwidth analysis.
Use the stored signedness info for the root node instead of
recalculating it after vectorization; recalculating it may lead to a
compiler crash.
2024-07-10 03:58:00 -07:00
Han-Kuan Chen
ac299ed2c7
[SLP] Provide a universal interface for FixedVectorType::get. NFC. (#96845)
SLP vectorizes scalar types into vector types. In the future, we will try
to make SLP vectorize vector types into vector types. We add getWidenedType
as a helper function. For example, SLP will turn the following code

%v0 = load i32, ptr %in0, align 4
%v1 = load i32, ptr %in1, align 4
%v2 = load i32, ptr %in2, align 4
%v3 = load i32, ptr %in3, align 4

into a load <4 x i32>. The ScalarTy is i32 and VF is 4. In the future,
SLP will make the following code

%v0 = load <4 x i32>, ptr %in0, align 4
%v1 = load <4 x i32>, ptr %in1, align 4
%v2 = load <4 x i32>, ptr %in2, align 4
%v3 = load <4 x i32>, ptr %in3, align 4

into a load <16 x i32>. The ScalarTy is <4 x i32> and VF is 4.
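
The widening rule in the two examples above can be sketched with a toy type model. This is an illustration only: `ToyType` and `widenType` are hypothetical names and do not reproduce the signature of the real `getWidenedType` helper. The assumption is simply that the resulting lane count is the lane count of `ScalarTy` multiplied by `VF`.

```cpp
#include <cassert>

// Toy stand-in for an LLVM type: an element bit width plus a lane count.
// NumElts == 1 models a scalar such as i32; NumElts == 4 models <4 x i32>.
struct ToyType {
  unsigned EltBits;
  unsigned NumElts;
};

// Hypothetical model of the widening rule: keep the element width and
// multiply the lane count of ScalarTy by the vectorization factor VF.
ToyType widenType(ToyType ScalarTy, unsigned VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  return ToyType{ScalarTy.EltBits, ScalarTy.NumElts * VF};
}
```

With this model, `widenType(ToyType{32, 1}, 4)` describes `<4 x i32>` and `widenType(ToyType{32, 4}, 4)` describes `<16 x i32>`, matching the two cases in the commit message.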

reference:
https://discourse.llvm.org/t/rfc-make-slp-vectorizer-revectorize-vector-instructions/79436
2024-07-10 11:50:35 +08:00
Alexey Bataev
af21bc1917 [SLP]Fix a crash on attempt to revectorize vectorized phi.
If the PHI node is vectorized during vectorization of its operands, no
need to try to vectorize its operands once again.
2024-07-09 14:11:08 -07:00
Alexey Bataev
822a818786 [SLP][NFC]Add comments for the code, NFC. 2024-07-09 10:06:34 -07:00
Alexey Bataev
a988821123
[SLP]Keep the original order in the reductions.
The patch tries to keep the original order of the instructions in the
reductions. Previously, the first two instructions were swapped, giving
reverse order.
This is the first step toward supporting ordered reductions.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/98025
2024-07-09 12:26:42 -04:00
Alexey Bataev
2cba218ca5 [SLP]Fix PR98133: Inserting PHI after debug-records!
The phi-node-to-be-deleted still should be inserted as the first
instruction in the block to avoid random compiler crashes.

Fixes https://github.com/llvm/llvm-project/issues/98133
2024-07-09 05:44:45 -07:00
Alexey Bataev
f5ee07a1b5
[SLP]Improve instruction reordering mode detection.
The "instruction" reordering mode should be selected only if there are
compatible instructions in other operands, which can be reordered.
Otherwise, better to select splat reordering mode.

Metric: size..text

Program                                                                  size..text
                                                                         results     results0    diff
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 12383340.00 12383324.00 -0.0%

Some 4x operations get replaced by 8x.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/97485
2024-07-08 16:01:55 -04:00
Alexey Bataev
385118644c [SLP]Remove operands upon marking instruction for deletion.
If the instruction is marked for deletion, it is better to drop all its
operands and mark them for deletion too (if allowed). This enables more
vectorizable patterns and generates fewer useless extractelement
instructions.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/97409
2024-07-08 07:56:48 -07:00
Alexey Bataev
4c47b41771
[SLP]Allow matching and shuffling of extractelement vector operands with different VF.
Allows better codegen with the free resizing of small VF vector operands
and then regular shuffling of the operands of the same size and
simplifies the code.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/97414
2024-07-08 09:27:08 -04:00
tcwzxx
c2fe75f99c
Make the logic for checking scatter vectorized nodes of GEP clearer (#97826)
There is no functional change.

Authored-by: zhizhixu <zhizhixu@tencent.com>
2024-07-08 06:08:04 -04:00
Kazu Hirata
75bc20ff89
[llvm] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#97914) 2024-07-07 08:23:41 +09:00
Jon Roelofs
d3a76b03d8
[llvm][SLPVectorizer] Fix a bad cast assertion (#97621)
Fixes: rdar://128092379
2024-07-03 16:25:32 -07:00
Alexey Bataev
873c3f7e78 Revert "[SLP]Remove operands upon marking instruction for deletion."
This reverts commit bbd52dd44ceee80e3b6ba6a9b2bd8ee9a9713833 to fix
a crash revealed in https://lab.llvm.org/buildbot/#/builders/4/builds/505
2024-07-03 13:05:17 -07:00
Alexey Bataev
bbd52dd44c
[SLP]Remove operands upon marking instruction for deletion.
If the instruction is marked for deletion, it is better to drop all its
operands and mark them for deletion too (if allowed). This enables more
vectorizable patterns and generates fewer useless extractelement
instructions.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/97409
2024-07-03 15:11:18 -04:00
Alexey Bataev
4eecf3c650
[SLP]Reorder buildvector/reduction vectorization and fuse the loops.
Currently, the SLP vectorizer first tries to find reduction nodes and
then vectorizes buildvector sequences. Instead, it should try to
vectorize wide buildvector sequences first, then reductions, and only
then smaller buildvector sequences.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/96943
2024-07-03 14:36:30 -04:00
Gabriel Baraldi
380beaec86
Fix potential crash in SLPVectorizer caused by missing check (#95937)
I'm not super familiar with this code, but it seems that we were just
missing a check.

The original code that triggered this did not have uselistorders, but
llvm-reduce created them, and it reproduces the same issue in a much
more compact way.

Fixes https://github.com/llvm/llvm-project/issues/95016
2024-07-02 08:15:51 -04:00
Youngsuk Kim
2051736f7b [llvm][Transforms] Avoid 'raw_string_ostream::str' (NFC)
Since `raw_string_ostream` doesn't own the string buffer, it is
desirable (in terms of memory safety) for users to directly reference
the string buffer rather than use `raw_string_ostream::str()`.

Work towards TODO comment to remove `raw_string_ostream::str()`.
2024-06-30 09:03:29 -05:00
Alexey Bataev
d70963a762 [SLP]Fix the cost of the adjusted extracts in per-register analysis.
Previous patch did not pass the list of the extract indices by
reference, so the compiler just ignored them. Pass indices by reference
and fix the per-register analysis.
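
The class of bug described here is easy to reproduce in isolation. The sketch below is a generic illustration, not the SLP vectorizer code: the function names are hypothetical, and a `std::vector<int>` stands in for the list of extract indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Buggy shape: Indices is taken by value, so the function fills a private
// copy and the caller's vector is left unchanged.
void collectIndicesByValue(std::vector<int> Indices, int Count) {
  for (int I = 0; I < Count; ++I)
    Indices.push_back(I);
}

// Fixed shape: Indices is taken by reference, so the caller sees the result.
void collectIndicesByRef(std::vector<int> &Indices, int Count) {
  for (int I = 0; I < Count; ++I)
    Indices.push_back(I);
}

// Helpers that report how many indices the caller ends up with.
std::size_t sizeAfterByValue(int Count) {
  std::vector<int> Indices;
  collectIndicesByValue(Indices, Count);
  return Indices.size(); // the collected indices were silently dropped
}

std::size_t sizeAfterByRef(int Count) {
  std::vector<int> Indices;
  collectIndicesByRef(Indices, Count);
  return Indices.size();
}
```

The by-value version compiles without warnings, which is why this kind of mistake tends to surface only as wrong results in later analysis.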

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/96808
2024-06-28 14:33:08 -07:00
Alexey Bataev
a9c12e481b Revert "[SLP]Fix the cost of the adjusted extracts in per-register analysis."
This reverts commit 784152056ea40a800a8fd9f4157a428dfb7a6de8 to fix
buildbot issues reported in
https://lab.llvm.org/buildbot/#/builders/4/builds/315 and https://lab.llvm.org/buildbot/#/builders/35/builds/481
2024-06-28 13:41:51 -07:00
Alexey Bataev
784152056e
[SLP]Fix the cost of the adjusted extracts in per-register analysis.
Previous patch did not pass the list of the extract indices by
reference, so the compiler just ignored them. Pass indices by reference
and fix the per-register analysis.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/96808
2024-06-28 15:49:47 -04:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Alexey Bataev
6f582b7ed3 [SLP][NFC]Remove extra check for VU. 2024-06-26 05:39:37 -07:00
Alexey Bataev
0280f97b36 [SLP]Fix PR95925: extract vectorized index of the potential buildvector sequence.
If the vectorized scalar is not the insert value in the buildvector
sequence but the index, it should be always extracted.
2024-06-25 14:07:51 -07:00
Alexey Bataev
228c2e1473 [SLP]Fix incorrect promotion of nodes before shuffling.
If the base node is signed, but some values are unsigned, still the
whole node should be considered signed. Also, an extra bitwidth analysis
should be performed, when estimating the minimal bitwidth.
2024-06-25 13:39:28 -07:00
Han-Kuan Chen
de7c1396f2
[SLP] NFC. Refactor and add getAltInstrMask help function. (#94709)
Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
2024-06-26 00:42:38 +08:00
Nikita Popov
8263bec533
[SLP] Use poison instead of undef in reorderScalars() (#96619)
-1 mask elements are specified to return poison rather than undef
nowadays, so update the reorderScalars() implementation to match.
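
As a rough illustration of the idea, the sketch below models a reordering in which unwritten lanes default to poison. It is a simplified toy, not the actual reorderScalars() implementation: `Lane`, `kPoisonMaskElem`, and `reorderLanes` are hypothetical names, `std::nullopt` stands in for a poison value, and the direction in which the mask is applied (source lane `I` moves to position `Mask[I]`) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy model: std::nullopt stands in for a poison lane.
using Lane = std::optional<int>;

// Local stand-in for the -1 mask element mentioned in the commit message.
constexpr int kPoisonMaskElem = -1;

// Moves Scalars[I] to position Mask[I]; lanes that no mask element writes
// stay poison (std::nullopt).
std::vector<Lane> reorderLanes(const std::vector<Lane> &Scalars,
                               const std::vector<int> &Mask) {
  assert(Mask.size() == Scalars.size() && "mask and scalars must match");
  std::vector<Lane> Result(Scalars.size(), std::nullopt);
  for (std::size_t I = 0; I < Scalars.size(); ++I) {
    if (Mask[I] == kPoisonMaskElem)
      continue; // this source lane is dropped
    assert(Mask[I] >= 0 &&
           static_cast<std::size_t>(Mask[I]) < Result.size() &&
           "mask element out of range");
    Result[static_cast<std::size_t>(Mask[I])] = Scalars[I];
  }
  return Result;
}
```

For example, `reorderLanes({10, 20, 30, 40}, {2, -1, 0, 1})` places 30, 40, and 10 in lanes 0 to 2 and leaves lane 3 as poison.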
2024-06-25 14:23:40 +02:00
Stephen Tozer
d75f9dd1d2 Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Simon Pilgrim
f9fc6f6d75 [SLP] Remove dead initialization noticed by static analyser. NFC. 2024-06-21 17:42:01 +01:00
Han-Kuan Chen
be339fd99d
[SLP] NFC. Reduce redundant assignment. (#96149) 2024-06-20 20:09:28 +08:00
Tyler Lanphear
d337c504ef
[SLP][NFCI] Address issues seen in downstream Coverity scan. (#93757)
- Prevent null dereference: if the Mask given to
  `ShuffleInstructionBuilder::adjustExtracts()` is empty or all-poison,
  then `VecBase` will be `nullptr` and the call to
  `castToScalarTyElem(VecBase)` will dereference it. Add an assert
  to guard against this.

- Prevent use of uninitialized scalar: in the unlikely event that
  `CandidateVFs` is empty, `AnyProfitableGraph` will be uninitialized
  in the `if` condition following the loop. (This seems like a
  false positive, but I submitted this change anyway, as initializing
  bools costs nothing and is generally good practice.)
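
Both defensive patterns from the list above can be sketched in a few lines. This is a generic illustration under stated assumptions, not the patched LLVM code: `firstUsedElem`, `useFirstElem`, and `anyProfitable` are hypothetical functions, `-1` models a poison mask element, and the profitability test is a made-up threshold comparison.

```cpp
#include <cassert>
#include <vector>

// Returns a pointer to the first non-poison (-1) mask element, or nullptr
// when the mask is empty or all-poison.
const int *firstUsedElem(const std::vector<int> &Mask) {
  for (const int &Idx : Mask)
    if (Idx != -1)
      return &Idx;
  return nullptr;
}

// Guard the dereference with an assert, as in the first bullet.
int useFirstElem(const std::vector<int> &Mask) {
  const int *Base = firstUsedElem(Mask);
  assert(Base && "mask must contain at least one non-poison element");
  return *Base;
}

// Initialize the flag before the loop, as in the second bullet, so an empty
// candidate list still yields a well-defined result.
bool anyProfitable(const std::vector<int> &CandidateVFs, int MinProfitableVF) {
  bool AnyProfitableGraph = false;
  for (int VF : CandidateVFs)
    if (VF >= MinProfitableVF)
      AnyProfitableGraph = true;
  return AnyProfitableGraph;
}
```

The assert documents the precondition without changing release-build behaviour, while the initializer removes the undefined read entirely.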
2024-05-31 18:34:23 -07:00
Alexey Bataev
70a54bca6f
[SLP]Improve/fix extracts calculations for non-power-of-2 elements.
One of the previous patches introduced initial support for non-power-of-2
number of elements but some parts of the SLP vectorizer still were not
adjusted to handle the costs correctly. Patch fixes it by improving
analysis of the non-power-of-2 number of elements and fixes in the cost
of the extractelements instructions.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/93213
2024-05-24 09:33:36 -04:00
Alexey Bataev
554c47c8e9 [SLP]Fix undef poison vector values shuffles with poisonous vectors.
When looking for a vector value while shuffling extractelements, if one
of the vector values is an undef value, a real mask value must be
generated for that vector, together with either an undef vector or the
incoming second vector, if it is non-poisonous.
2024-05-22 10:41:57 -07:00
Han-Kuan Chen
5b205956e1
[SLP] NFC. Reduce newTreeEntry usage. (#92994) 2024-05-22 23:26:33 +08:00
Alexey Bataev
30d484fa99 [SLP]Fix a crash when trying to convert masked gather nodes to strided.
Need to check whether the loads node is a masked gather. Only vectorized
loads can be converted to strided.
2024-05-22 08:08:56 -07:00
Han-Kuan Chen
9f449c3427
[SLP] NFC. Use TreeEntry::getOperand if setOperandsInOrder is called (#92727)
already.
2024-05-20 18:46:30 +08:00
Jay Foad
1650f1b3d7
Fix typo "indicies" (#92232) 2024-05-15 13:10:16 +01:00