1492 Commits

Author SHA1 Message Date
Valery Dmitriev
c80b503496
[SLP] Improve gather tree nodes matching when users are PHIs. (#69392) 2023-10-18 09:05:11 -07:00
Valery Dmitriev
3caccb22ab
[NFC][SLP] Test case exposing gather nodes matching deficiency affecting cost. (#69382) 2023-10-17 14:58:10 -07:00
Alexey Bataev
66775f8ccd [SLP]Fix PR69196: Instruction does not dominate all uses
During emission of the postponed gathers, need to insert them before
user instruction to avoid use before definition crash.
2023-10-17 10:43:59 -07:00
Alexey Bataev
119b0f3895 Revert "[SLP]Fix PR69196: Instruction does not dominate all uses"
This reverts commit 8e2b2c4181506efc5b9321c203dd107bbd63392b to fix
a crash reported in https://lab.llvm.org/buildbot/#/builders/230/builds/19993.
2023-10-16 13:29:17 -07:00
Alexey Bataev
8e2b2c4181 [SLP]Fix PR69196: Instruction does not dominate all uses
During emission of the postponed gathers, need to insert them before
user instruction to avoid use before definition crash.
2023-10-16 12:57:18 -07:00
Alexey Bataev
e22818d5c9 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-10-05 06:17:07 -07:00
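
The point made in the commit message above, that Mask.size() does not always match the size of the permuted vector(s), can be illustrated with a small sketch. This is a simplified Python model written for this log, not LLVM's implementation: the name `is_identity_mask`, the `POISON` constant and the exact checks are assumptions chosen for illustration. It only shows why the same mask can be classified differently once the source element count is passed explicitly.

```python
from typing import Sequence

POISON = -1  # assumption: -1 marks an undefined lane in the mask


def is_identity_mask(mask: Sequence[int], num_src_elts: int) -> bool:
    """Return True if `mask` copies a single source vector lane-for-lane.

    Simplified model: indices in [0, num_src_elts) select from the first
    source, indices in [num_src_elts, 2 * num_src_elts) from the second.
    """
    if len(mask) != num_src_elts:
        # The result is narrower or wider than the source, so the shuffle
        # cannot be an identity even if the lanes happen to be 0, 1, 2, ...
        return False
    uses_first = uses_second = False
    for lane, elt in enumerate(mask):
        if elt == POISON:
            continue  # undefined lanes match anything
        if not 0 <= elt < 2 * num_src_elts:
            raise ValueError(f"mask element {elt} is out of range")
        if elt < num_src_elts:
            uses_first = True
        else:
            uses_second = True
        if elt % num_src_elts != lane:
            return False  # lane is moved, not copied in place
    # An identity must read from one source only.
    return not (uses_first and uses_second)


# The same mask is an identity for a 4-element source, but a subvector
# extract (lower half) for an 8-element source.
print(is_identity_mask([0, 1, 2, 3], 4))  # True
print(is_identity_mask([0, 1, 2, 3], 8))  # False
```

Inferring the source width from the mask length alone would treat both calls as the first case, which is the kind of misclassification an explicit element count avoids.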
Alexey Bataev
2c49311dea [SLP][NFC]Add insertsubvector test with small source vector, NFC. 2023-10-05 06:03:58 -07:00
Arthur Eubanks
07389535a7 Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit b186f1f68be11630355afb0c08b80374a6d31782.

Causes crashes, see https://reviews.llvm.org/D158449.
2023-10-04 14:37:16 -07:00
Alex Richardson
e86d6a43f0 Regenerate test checks for tests affected by D141060 2023-10-04 10:51:35 -07:00
Alexey Bataev
b186f1f68b [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-10-04 07:53:30 -07:00
Alexey Bataev
ff48e83f18 [SLP][NFC]Add a test for reused extracts corner case, NFC. 2023-10-04 06:28:49 -07:00
Alexey Bataev
1129dec778 Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit 6f43d28f3452b3ef598bc12b761cfc2dbd0f34c9 to fix
a crash reported in https://reviews.llvm.org/D158449.
2023-10-03 13:02:16 -07:00
Alexey Bataev
6f43d28f34 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-10-03 10:26:11 -07:00
Alexey Bataev
263a00fa91 [COST][AARCH64]Fix crash in cost calculation for shuffles.
Need to take the mask size as number of elements, not the number of
elements of the original fixed vector. Otherwise, the compiler may
crash.
2023-10-02 07:49:03 -07:00
Alexey Bataev
ebcb5d59fc Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix
buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.
2023-09-29 15:03:46 -07:00
Alexey Bataev
9f5960e004 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-09-29 13:16:03 -07:00
Alexey Bataev
019aee8327 [SLP]Improve costs in computeExtractCost() to avoid crash after D158449.
Need to consider the length of the original vector for extractelements,
not the length, matched number of the scalars. It fixes 2 issues: 1)
improves cost estimation; 2) Fixes crashes after D158449.
2023-09-29 07:48:02 -07:00
Hans Wennborg
06f3b0ed43 Revert "[SLP]Improve costs in computeExtractCost() to avoid crash after D158449."
This caused asserts:

  Assertion failed: NumElts > 1 && "Expected at least 2-element fixed length vector(s).",
  file C:\b\s\w\ir\cache\builder\src\third_party\llvm\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp, line 7096

see comment on 59a67ea35d

> Need to consider the length of the original vector for extractelements,
> not the length, matched number of the scalars. It fixes 2 issues: 1)
> improves cost estimation; 2) Fixes crashes after D158449.

This reverts commit 59a67ea35d608480257fc64ec3e5106ef50de740.
2023-09-29 10:42:19 +02:00
Alexey Bataev
59a67ea35d [SLP]Improve costs in computeExtractCost() to avoid crash after D158449.
Need to consider the length of the original vector for extractelements,
not the length, matched number of the scalars. It fixes 2 issues: 1)
improves cost estimation; 2) Fixes crashes after D158449.
2023-09-28 09:36:08 -07:00
Alexey Bataev
9eeb0293e2 [SLP]Cleanup MultiNodeScalars when tree deleted.
Need to clear MultiNodeScalars map to avoid compiler crash when tree is
deleted.
2023-09-27 07:48:53 -07:00
Alexey Bataev
ea7f43ec14 [SLP]Do not gather node, if the instruction, that does not require
scheduling, is previously vectorized.

If the main node was vectorized already, but does not require
scheduling, we still can try to vectorize it in this new node instead of
gathering.
2023-09-26 11:57:35 -07:00
alexfh
5d86176f48
Revert "[SLP]Do not gather node, if the instruction, that does not require" (#67386)
This reverts commit 77053421228edd12a3ba73d4eebd970fcdd3b2c0, which
introduces a clang crash (test case: https://gcc.godbolt.org/z/zn5n4KWPY).
2023-09-26 02:45:11 +02:00
Alexey Bataev
7ff83ed6cd [SLP]Do not try to reorder possible strided nodes.
Reordering of possible strided nodes in bottom-to-top order requires
top-to-bottom reordering of the operands of such nodes, which is not
supported. Need to disable reordering of strided operands to avoid
compiler crashes.
2023-09-22 07:55:43 -07:00
David Spickett
8f548610a6 Revert "[SLP]Use source vector type as the original vector type instead of"
This reverts commit 9a99944df068b29b905cd8ba9a2132cc6382b6fb.

Due to test suite failures on all our SVE buildbots e.g.:
https://lab.llvm.org/buildbot/#/builders/184/builds/7375

clang: ../llvm/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:3565:
InstructionCost llvm::AArch64TTIImpl::getShuffleCost(TTI::ShuffleKind,
VectorType *, ArrayRef<int>, TTI::TargetCostKind, int, VectorType *,
ArrayRef<const Value *>): Assertion `Mask.size() == TpNumElts && "Expected Mask and Tp size to match!"' failed.
2023-09-22 07:52:16 +00:00
Alexey Bataev
9a99944df0 [SLP]Use source vector type as the original vector type instead of
artificial for better cost estimation.

Need to use original source vector type, not the one artificially
constructed, based on the number of vectorized scalars. It affects the
cost significantly.
2023-09-21 11:34:02 -07:00
Alexey Bataev
3dc28e6c6a [SLp]Fix a crash because of wrong deps between vectorized nodes.
Need to change the order of the nodes vectorization to avoid too early
insertion of the first node.
2023-09-21 10:19:11 -07:00
Alexey Bataev
7705342122 [SLP]Do not gather node, if the instruction, that does not require
scheduling, is previously vectorized.

If the main node was vectorized already, but does not require
scheduling, we still can try to vectorize it in this new node instead of
gathering.
2023-09-20 12:52:37 -07:00
Alexey Bataev
03feab7499 [SLP][NFC]Add a test with the reused main op instruction, NFC. 2023-09-20 11:28:59 -07:00
Alexey Bataev
ebed4692f8 [SLP]Fix a crash when trying to find operand with re-vectorized main
instruction.

Need to check if the operand scalars are vectorized in a different
vector node, if the main instruction is already vectorized in another
vector node.
2023-09-20 09:54:15 -07:00
Alexey Bataev
7db87a66b0 [SLP]Fix PR66795: Check correct deps for vectorized inst with multiple
vectorized node uses.

If the instruction is vectorized in many different vector nodes, it may
break the dependency analysis for gathered nodes with matched scalars.
Need to properly check the dependency between such gather nodes to avoid
cycle dependency.
2023-09-19 12:11:33 -07:00
Alexey Bataev
434aa2fe56 [SLP]Improve canreuseExtracts for reordering analysis.
Improve the analysis in canReuseExtracts for the reordering to better
reorder extracts for ExtractSubvector pattern.
2023-09-15 12:09:45 -07:00
Alexey Bataev
b9ad72ba05 [SLP]Fix PR66176: SLP incorrectly reorders select operands.
On the very first iteration for the reductions, when trying to build
reduction for boolean logic operations, no need to compare LHS/RHS with
the Reduction(VectorizedTree), need to compare with actual parameters of
the reduction operations.
2023-09-15 03:57:36 -07:00
Alexey Bataev
d2ab97b00c [SLP][NFC]Add a test with incorrect reduction of poisoned logical bool. 2023-09-14 17:11:44 -07:00
Alexey Bataev
c15c1e5dd5 [SLP]Do not account non-instructions for external use.
If the non-instruction gets vectorized, no need to account its extract
cost, it won't be removed and replaced by extractelement instruction.
2023-09-14 12:40:33 -07:00
Alexey Bataev
1034405486 [SLP][NFC]Add a test for non-instruction with external use. 2023-09-14 12:34:14 -07:00
Thomas
0a7a926007
[NVPTX] Make i16x2 a native type and add supported vec instructions (#65799)
recommit https://github.com/llvm/llvm-project/pull/65432 with minor bug
fix for bitcasts
2023-09-08 13:44:58 -07:00
Alexey Bataev
5bab59de44 [SLP]Try to vectorize scalars, being vectorized already, but does not need to be scheduled.
If the scalar does not need to be scheduled and it was vectorized
already in one of the vector nodes, we still can try to vectorize it in
another node. There is just no need to account for its cost in the total
scalar cost, as it will be handled in the main vectorized node.

Differential Revision: https://reviews.llvm.org/D159205
2023-09-08 13:34:12 -07:00
Dmitri Gribenko
b3a14cac4f Revert "[NVPTX] Make i16x2 a native type and add supported vec instructions (#65432)"
This reverts commit db5d845c73ee2d64f1a5bab3fc72edece9e3a7ba.

As per PR discussion "Looks like we've missed lowering of bitcasts
between v2f16 and v2i16 and it breaks XLA."
2023-09-08 19:28:15 +02:00
Alexey Bataev
30edf1c449
[SLP]Do not early exit if the number of unique elements is non-power-of-2. (#65476)
We can still try to vectorize the bundle of instructions, even if the
number of unique instructions in it is non-power-of-2. In this case we
need to adjust the cost (calculate the cost only for the unique scalar
instructions) and the cost of the extracts. Also, when scheduling the
bundle, we need to schedule only the unique scalars to avoid a compiler
crash caused by multiple dependencies. This can be applied safely only if
all the scalars' users are also vectorized and do not require memory
accesses (this is a temporary requirement and can be relaxed later).

---------

Co-authored-by: Alexey Bataev <a.bataev@outlook.com>
2023-09-08 10:00:46 -04:00
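
The commit message above describes costing and scheduling only the unique scalars of a bundle that contains repeats. A minimal Python sketch of that bookkeeping follows. It is not the SLP vectorizer's code: `dedupe_bundle`, `reuse_mask`, `is_power_of_2`, `unique_scalar_cost` and the per-scalar cost table are hypothetical names and values introduced here for illustration only.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def dedupe_bundle(scalars: Sequence[Hashable]) -> Tuple[List[Hashable], List[int]]:
    """Split a bundle into its unique scalars and a mask that rebuilds it.

    reuse_mask[i] is the position in `unique` of the i-th original scalar,
    so the full bundle can be recreated by shuffling the unique values.
    """
    unique: List[Hashable] = []
    index_of: Dict[Hashable, int] = {}
    reuse_mask: List[int] = []
    for scalar in scalars:
        if scalar not in index_of:
            index_of[scalar] = len(unique)
            unique.append(scalar)
        reuse_mask.append(index_of[scalar])
    return unique, reuse_mask


def is_power_of_2(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def unique_scalar_cost(scalars: Sequence[Hashable], cost_of: Dict[Hashable, int]) -> int:
    """Sum the (hypothetical) scalar cost once per unique scalar."""
    unique, _ = dedupe_bundle(scalars)
    return sum(cost_of[s] for s in unique)


bundle = ["a", "b", "a", "c"]  # four lanes, but only three unique scalars
unique, reuse_mask = dedupe_bundle(bundle)
print(unique, reuse_mask)          # ['a', 'b', 'c'] [0, 1, 0, 2]
print(is_power_of_2(len(unique)))  # False
print(unique_scalar_cost(bundle, {"a": 1, "b": 1, "c": 1}))  # 3
```

Here the bundle has a power-of-2 width, yet only three distinct scalars remain after removing the repeat, which is the situation the commit stops rejecting early.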
Ramkumar Ramachandra
a06be8a2e4
SLP/RISCV: add negative test for lrint (#55208) (#65611)
The issue #55208 describes a current deficiency of the SLPVectorizer,
namely that it doesn't vectorize code written with lrint, while similar
code written with rint is vectorized. Add a test corresponding to this
issue for the RISC-V target.
2023-09-08 10:58:14 +01:00
Ramkumar Ramachandra
7f499579a8
SLP/RISCV: add test for vectorized ctpop, like in X86 (#65330)
Recently, 7f26c27 turned on SLP by default for RISC-V, and although
there are quite a few tests for SLP under the X86/ target, it is unclear
whether the same constructs would be vectorized on RISC-V. This patch
takes a step in the direction of remedying this, by noticing that ctpop
is often vectorized on RISC-V, and adding four tests for different
integer widths.
2023-09-07 17:02:13 +01:00
Thomas
db5d845c73
[NVPTX] Make i16x2 a native type and add supported vec instructions (#65432)
On sm_90 some instructions now support i16x2 which allows hardware to
execute more efficiently add, min and max instructions.

In order to support that we need to make i16x2 a native type in the
backend. This does the necessary changes to make i16x2 a native type and
adds support for the instructions natively supporting i16x2.

This caused a negative test in nvptx slp to start passing. Changed the
test to a positive one as the IR is correctly vectorized.
2023-09-06 21:59:13 -07:00
Alexey Bataev
25fd5e63f8 [SLP][NFC]Update tests checks, NFC. 2023-09-06 13:57:49 -07:00
Matt Arsenault
5c0da5839d InstCombine: Recognize fabs as bitcasted integer
In the past we sort of pretended float might be implementable
as a non-IEEE type but that never realistically would work. Exotic
FP types would need to be added to the IR. Turning these
into FP operations enables FP tracking optimizations.

https://reviews.llvm.org/D151937
2023-08-31 19:03:48 -04:00
Matt Arsenault
50a9b3d8a5 InstCombine: Recognize fneg when performed as bitcasted integer
This is a resurrection of D18874. This was previously wrong with
fneg conflated with fsub, but we now have a proper fneg instruction.
Additionally, I think it is now clearer that IR float=IEEE float,
and a different bit layout would require adding a different IR type.

https://reviews.llvm.org/D151934
2023-08-31 18:59:34 -04:00
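
The two InstCombine commits above rely on the fact that, for IEEE floats, fabs and fneg are pure sign-bit operations on the underlying integer. The following Python sketch checks that equivalence for 32-bit floats. It illustrates the bit patterns only and is not the InstCombine transform itself; the helper names are made up for this example.

```python
import struct

SIGN_BIT = 0x8000_0000  # sign bit of an IEEE-754 binary32 value


def float_to_bits(x: float) -> int:
    """Reinterpret a binary32 float as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret an unsigned 32-bit integer as a binary32 float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fabs_via_int(x: float) -> float:
    # Clearing the sign bit (bitcast, and, bitcast) yields the absolute value.
    return bits_to_float(float_to_bits(x) & (SIGN_BIT - 1))


def fneg_via_int(x: float) -> float:
    # Flipping the sign bit (bitcast, xor, bitcast) negates the value.
    return bits_to_float(float_to_bits(x) ^ SIGN_BIT)


print(fabs_via_int(-1.5))  # 1.5
print(fneg_via_int(2.0))   # -2.0
print(fneg_via_int(0.0))   # -0.0
```

Because the integer form and the floating-point form agree bit for bit, including for signed zeros, rewriting the integer pattern as an FP operation preserves the result.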
Philip Reames
aada8f2e54 [slp] Tweak debug costing output to include VL
This makes it much easier to understand which vector length is being considered when the same set of nodes are evaluated at multiple vector lengths.
2023-08-30 09:13:19 -07:00
Philip Reames
514b38cd7e [RISCV] Remove mask size restriction on single source and dual src shuffle costing (try 2)
Some callers pass in an empty mask to represent "unknown". We should use the generic costs for these cases. We can add VL=1 costing separately if desired.

Reapplying after revert.  A new test had been added, and I'd missed updating it when rebasing before.  This is a great happy accident as I hadn't figured out how to get SLP to exercise this case, I'd merely noticed it via inspection.
2023-08-23 14:43:02 -07:00
wangpc
9a82bda9de [RISCV] Fix assertion of getShuffleCost
This assertion is introduced by D157425.

We should calculate the cost iff `Mask` is not empty.

Fixes 64901

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D158590
2023-08-23 20:10:50 +08:00
Alexey Bataev
b51195dece [SLP]Fix PR63854: Add proper sorting of pointers for masked stores.
If the masked gathers can be reordered, it may produce strided access
pattern and the reordering does not affect common reordering, better to
try to reorder masked gathers for better performance.

Differential Revision: https://reviews.llvm.org/D157009
2023-08-22 06:14:01 -07:00
Nikita Popov
69bd66b3ce [Tests] Remove some and/or constant expressions in tests (NFC)
In preparation for their removal in D158081.
2023-08-21 12:05:32 +02:00