1562 Commits

Author SHA1 Message Date
Douglas Yung
fb981e6b4b Revert "[SLP][TTI][X86]Add addsub pattern cost estimation. (#76461)"
This reverts commit bc8c4bbd7973ab9527a78a20000aecde9bed652d.

Change is failing to build on several bots:
- https://lab.llvm.org/buildbot/#/builders/127/builds/60184
- https://lab.llvm.org/buildbot/#/builders/123/builds/23709
- https://lab.llvm.org/buildbot/#/builders/216/builds/32302
2023-12-27 23:52:04 -08:00
Alexey Bataev
bc8c4bbd79
[SLP][TTI][X86]Add addsub pattern cost estimation. (#76461)
SLP/TTI do not know how to estimate the cost of the addsub pattern
supported by X86. Support for pattern detection was added previously
(see TTI::isLegalAltInstr), but the cost was still not estimated
properly.
2023-12-27 15:57:21 -05:00
Kazu Hirata
03dc806b12 [Transforms] Use {DenseMap,SmallPtrSet}::contains (NFC) 2023-12-22 14:51:22 -08:00
Alexey Bataev
a13148a880 [SLP]Fix PR75995: drop wrapping flags for resized wrapped binops.
If we decide to resize the instruction, the wrapping flags must be
dropped from the resulting vector instructions to avoid incorrect
optimizations/assumptions later.
Fixes PR75995.
2023-12-20 06:51:39 -08:00
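The reasoning behind a13148a880 can be illustrated with plain integer arithmetic. The sketch below is not LLVM code: `addFitsSigned` is a hypothetical helper that checks whether a sum stays inside the signed range of a given bit width, which is the condition an `nsw` flag asserts. An add that is safe at 32 bits can wrap once the operation is narrowed, so a flag copied from the wide instruction would claim something that is no longer true.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): returns true if a + b is
// representable as a signed integer of `bits` bits, i.e. if an add
// carrying a "no signed wrap" flag at that width would be valid.
bool addFitsSigned(std::int32_t a, std::int32_t b, unsigned bits) {
  assert(bits >= 1 && bits <= 32 && "sketch only handles widths up to 32");
  // Compute in 64 bits so the sum itself can never overflow here.
  const std::int64_t lo = -(std::int64_t{1} << (bits - 1));
  const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
  const std::int64_t sum = std::int64_t{a} + std::int64_t{b};
  return sum >= lo && sum <= hi;
}
```

For example, `addFitsSigned(100, 100, 32)` holds while `addFitsSigned(100, 100, 8)` does not, even though both operands fit in 8 bits.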
Arthur Eubanks
71a9292298 Revert "[SLP]Improve findReusedOrderedScalars processing, NFCI."
This reverts commit 44dc1e0baae7c4b8a02ba06dcf396d3d452aa873.

Causes non-determinism, see #75987.
2023-12-19 16:14:04 -08:00
Alexey Bataev
00edad17c2 [SLP][NFC]Check for equal opcode preliminary to meet weak strict order
requirement, NFC.

This change does not affect functionality; it only fixes assertions
triggered in some C++ standard library implementations.
2023-12-18 14:12:33 -08:00
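The commit text for 00edad17c2 does not show the comparator itself, so the following is only a generic sketch of the requirement it refers to. Sorting routines such as `std::sort` expect a strict weak ordering, and some standard library implementations assert on it in debug modes. One way to satisfy the irreflexivity part is to test for equal keys first and return `false`. The `Op` struct and `opLess` function are hypothetical names used for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for an instruction: only the opcode takes part
// in the ordering, the id just tells elements apart.
struct Op {
  unsigned opcode;
  unsigned id;
};

// Comparator that checks for equal opcodes up front. Equivalent elements
// must compare as "not less" in both directions, otherwise the ordering
// is not a strict weak ordering.
bool opLess(const Op &a, const Op &b) {
  if (a.opcode == b.opcode)
    return false;
  return a.opcode < b.opcode;
}

// Sorts by opcode while keeping the relative order of equal opcodes.
void sortByOpcode(std::vector<Op> &ops) {
  std::stable_sort(ops.begin(), ops.end(), opLess);
}
```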
Alexey Bataev
a7e10e6603 Revert "[SLP][NFC]Check for equal opcode preliminary to meet weak strict order"
This reverts commit 58a2c4e2f24ffce3966c3988d1a4ca7b04c52244 to fix the
issue detected by https://lab.llvm.org/buildbot/#/builders/233/builds/5424.
2023-12-18 12:35:52 -08:00
Alexey Bataev
58a2c4e2f2 [SLP][NFC]Check for equal opcode preliminary to meet weak strict order
requirement, NFC.

This change does not affect functionality; it only fixes assertions
triggered in some C++ standard library implementations.
2023-12-18 06:42:03 -08:00
Reid Kleckner
3e16152ebc [SLP] Fix OOB GEP index access for a no-op GEP
Issue is covered by existing test
llvm/test/Transforms/SLPVectorizer/RISCV/phi-const.ll

See issue #75632 for ideas for how we could catch these more easily in
the future.
2023-12-15 17:33:06 +00:00
Maurice Heumann
f42b930af9
[SLP] Pessimistically handle unknown vector entries in SLP vectorizer (#75438)
SLP Vectorizer can discard vector entries at unknown positions. This
example shows the behaviour:

https://godbolt.org/z/or43EM594

The following instruction inserts an element at an unknown position:

```
%2 = insertelement <3 x i64> poison, i64 %value, i64 %position
```

The position depends on an argument that is unknown at compile time.

After running SLP, one can see that no instruction referencing
`%position` remains.

This happens as SLP parallelizes the two adds in the example. It then
needs to merge the original vector with the new vector.

Within `isUndefVector`, the SLP vectorizer constructs a bitmap
indicating which elements of the original vector are poison values. It
does this by walking the insertElement instructions.

If it encounters an insert with a non-constant position, the insert is
ignored. As a result, poison values are used for all entries that have
no insert with a constant position.

However, as the position is unknown, the element could be anywhere.
Therefore, I think it is only safe to assume none of the entries are
poison values and to simply take them all over when constructing the
shuffleVector instruction.

This fixes #75437
2023-12-14 09:48:23 -05:00
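The pessimistic handling described in f42b930af9 can be sketched outside LLVM as follows. This is an assumption-laden model, not the actual `isUndefVector` code: `Insert` is a hypothetical stand-in for an `insertelement` whose lane is `std::nullopt` when the position is not a compile-time constant, and `poisonLanes` is a hypothetical function that builds the per-lane bitmap.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical model of one insertelement in the chain. The lane is
// empty when the insert position is not known at compile time.
struct Insert {
  std::optional<unsigned> lane;
};

// Returns one flag per lane: true means the lane may be treated as
// poison, false means it may hold a real value and must be preserved.
std::vector<bool> poisonLanes(unsigned numLanes,
                              const std::vector<Insert> &chain) {
  std::vector<bool> poison(numLanes, true);
  for (const Insert &ins : chain) {
    if (!ins.lane) {
      // Unknown position: the element could be anywhere, so assume
      // that none of the lanes is poison.
      return std::vector<bool>(numLanes, false);
    }
    assert(*ins.lane < numLanes && "lane index out of range");
    poison[*ins.lane] = false;
  }
  return poison;
}
```

With a single constant insert into lane 0 of a 3-lane vector, lanes 1 and 2 stay marked as poison. Adding one insert at an unknown position clears the flag for every lane.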
Alexey Bataev
44dc1e0baa [SLP]Improve findReusedOrderedScalars processing, NFCI.
Tries to simplify structural complexity of the findReusedOrderedScalars function.
2023-12-08 14:27:55 -08:00
Alexey Bataev
fb35bb48c6 [SLP][NFC]Build value-to-gather-nodes map during nodes building, NFC. 2023-12-07 13:41:19 -08:00
Alexey Bataev
58785ebd24 [SLP][NFC]Check for ephemeral values beforehand, NFC. 2023-12-07 13:25:15 -08:00
Alexey Bataev
0e1a9e3084 [SLP]Fix PR74607: Fix dependency between buildvector nodes with user
nodes, having same last instruction.

If the user nodes have the same last instruction, which is used as the
insert point for the buildvector nodes, finding the proper dependency is
crucial. Before, it depended on the indices of the buildvectors
themselves, but it looks like it should depend on the indices of the
user nodes, because they identify the vectorization order and thus
properly align the buildvector nodes in terms of the def-use chain.
2023-12-06 10:15:01 -08:00
Paschalis Mpeis
7b83f69db4
[NFC] Replace CallInst with FunctionType in VFABI, VFShape API (#74569)
Minor simplification applied to VFShape::getScalarShape,
VFShape::get, and VFABI::tryDemangleForVFABI methods.

Also, remove unnecessary `static_cast` in `SLPVectorizer.cpp`
2023-12-06 17:14:58 +00:00
Alexey Bataev
279b1ea65f [SLP]Improve gathering of the scalars used in the graph.
Currently we emit gathers for scalars being vectorized in the tree as
a pair of extractelement/insertelement instructions. Instead we can try
to find all required vectors and emit shuffle vector instructions
directly, improving the code and reducing compile time.

Part of non-power-of-2 vectorization.

Differential Revision: https://reviews.llvm.org/D110978
2023-12-01 11:23:57 -08:00
Alexey Bataev
ba52310657
[SLP][NFC] Unify code for cost estimation/codegen for buildvector, NFC. (#73182)
This just moves towards reusing the same function for both cost
estimation and codegen for buildvector.
2023-11-30 10:04:57 -05:00
Alexey Bataev
1f88e62db4 [SLP]Fix/improve minbitwidth mapping to use TreeEntry as a key.
Currently, the MinBWs map uses Value* as a key and stores a mapping for
each value to be demoted. This makes it hard to get the actual MinBWs
value for the buildvector scalars (constants), since the same constant
might be used in different nodes with different MinBWs values/decisions.
It also consumes extra memory for the vectorized values/instructions
from the same nodes.
It is better to map the actual nodes. This fixes the bitwidth data
fetching for buildvector scalars and improves memory consumption and
analysis time for other instructions.
2023-11-30 06:33:31 -08:00
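The keying change in 1f88e62db4 can be sketched with standard containers. Everything here is hypothetical and simplified: `TreeEntry` holds plain integers in place of `Value*` scalars, `std::unordered_map` stands in for whatever map type SLP uses, and `MinBWMap` is an illustrative wrapper. The point is only that a node-keyed map lets the same constant carry different widths in different nodes.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified tree node: ints stand in for scalar values.
struct TreeEntry {
  std::vector<int> scalars;
};

// {minimum bit width, is signed} decided for a whole node.
using MinBWInfo = std::pair<unsigned, bool>;

// Node-keyed map: one record per node instead of one per scalar.
struct MinBWMap {
  std::unordered_map<const TreeEntry *, MinBWInfo> byNode;

  void set(const TreeEntry &entry, unsigned bits, bool isSigned) {
    byNode[&entry] = {bits, isSigned};
  }

  // Returns 0 when no demotion was recorded for the node.
  unsigned widthFor(const TreeEntry &entry) const {
    auto it = byNode.find(&entry);
    return it == byNode.end() ? 0 : it->second.first;
  }
};
```

Two nodes that both contain the constant `7` can then be recorded with widths 8 and 16 respectively, which a map keyed by the scalar alone could not express.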
Alexey Bataev
447da954c7 [SLP][NFC]Use DenseSet instead of SetVector, NFC.
For CSEBlocks we can safely use DenseSet; the order does not need to be
preserved for this container.
2023-11-28 11:27:49 -08:00
Alexey Bataev
badec9b7bf [SLP][NFC]Fix loops variables names, NFC. 2023-11-28 10:30:19 -08:00
Alexey Bataev
c72884225b [SLP][NFC]Fix naming of variables/functions, NFC. 2023-11-28 09:15:38 -08:00
Alexey Bataev
b6eb740cae [SLP][NFC]Improve/fix auto declarations, NFC. 2023-11-28 07:39:21 -08:00
Alexey Bataev
45139ab6ca [SLP][NFC]Improve aliasing support in SLP, NFC.
No need to store an optional boolean in the map; it is enough to store
a boolean directly. Also, we can do a preliminary check of the
instructions and, if they are not simple, mark them as aliased without
storing this result in the map.
2023-11-28 07:24:44 -08:00
Alexey Bataev
e9fdb965f9 [SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using
Expr container, NFC.

Saves memory and may improve compile time.
2023-11-24 08:05:19 -08:00
Alexander Kornienko
af7a145352 Revert "[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using"
This reverts commit 52df67ba76a03ad33132d1d4f4202d5a2313a3cd, which causes
spurious clang crashes. See
52df67ba76 (commitcomment-133381701)
2023-11-24 01:18:46 +01:00
Alexey Bataev
53f912480f [SLP][NFC]Remove extra unused vars, add TODO, NFC. 2023-11-22 12:26:54 -08:00
Alexey Bataev
12bcd6339d [SLP]Improve detection of gathered loads, if no other deps are detected.
If the gather node consists of ordered loads only partially (not the
whole node consists of loads), the other gathered scalars are not loads,
and no dependency on other nodes is found, we can still improve the cost
of the gather by taking into account the fact that these loads can still
be vectorized.
2023-11-22 11:35:51 -08:00
Alexey Bataev
369c0eb55b [SLP][NFC]Use SmallVector instead of std::vector and remove unused
includes, NFC.
2023-11-22 08:11:27 -08:00
Alexey Bataev
f609d4ba1d [SLP]Fix PR72833: do not crash if only operand is casted but the use
instruction.

Need to check whether only the operand is cast, not the user instruction
itself, if the types of the operands do not match the actual type.
2023-11-20 08:35:35 -08:00
Alexey Bataev
40e46b6eff [SLP]Do not emit int bitcast after minbitwidth analysis.
No need to emit a bitcast op for integer operands if, after minbitwidth
analysis, the type is detected to be the same.
2023-11-20 06:25:17 -08:00
Alexey Bataev
52df67ba76 [SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using
Expr container, NFC.

Saves memory and may improve compile time.
2023-11-17 13:45:28 -08:00
Arthur Eubanks
6a126e279d Revert "[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using"
This reverts commit cfd0f41f4effb5d31654dcb28c1a577c152ee23b.

Causes crashes, see cfd0f41f4e.
2023-11-17 13:23:38 -08:00
Valery Dmitriev
94e86751e5
[NFC][SLP] Remove unnecessary DL argument (#72674) 2023-11-17 10:08:25 -08:00
Alexey Bataev
cfd0f41f4e [SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using
Expr container, NFC.

Saves memory and may improve compile time.
2023-11-17 08:05:02 -08:00
Alexey Bataev
72b97630bc [SLP][NFC]Fix comparison of integers of different signs warning, NFC. 2023-11-16 17:18:28 -08:00
Alexey Bataev
cb678708e6 [SLP][NFC]Add TreeEntry-based add member functions and use them, where
possible, NFC.
2023-11-16 16:30:52 -08:00
Alexey Bataev
484a27e412 [SLP][NFC]Make needToDelay constant, NFC. 2023-11-16 16:11:43 -08:00
Alexey Bataev
009002a8cb [SLP][NFC]Unify matching for perfect diamond match between cost and codegen
models, NFC.
2023-11-16 08:11:52 -08:00
Alexey Bataev
206799fcf5 [SLP]Fix PR72524: "Out-of-bounds shuffle mask element" failed.
Need to check if we ran into subvector extract pattern before checking
for identity vector to avoid compiler crash.
2023-11-16 07:39:32 -08:00
Alexey Bataev
95703642e3 [SLP]Fix PR72202: wrong mask emission for the first found vector
operand.

Need to copy the submask not to the very first part of the common
extractelements vector mask, but to the proper one to avoid wrong code
emission.
2023-11-16 07:01:05 -08:00
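The fix in 95703642e3 concerns where a submask is copied inside a larger common mask. The following is a hedged sketch under the assumption that the common mask is divided into equally sized parts. `placeSubMask` is a hypothetical helper, not a function from SLPVectorizer.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies `sub` into the slice of `mask` that
// belongs to part number `part`, rather than always into the first
// slice. Each part is assumed to be sub.size() elements wide.
void placeSubMask(std::vector<int> &mask, const std::vector<int> &sub,
                  std::size_t part) {
  const std::size_t offset = part * sub.size();
  assert(offset + sub.size() <= mask.size() && "part is out of range");
  std::copy(sub.begin(), sub.end(),
            mask.begin() + static_cast<std::ptrdiff_t>(offset));
}
```

Starting from an 8-element mask filled with `-1`, placing `{3, 2, 1, 0}` at part 1 leaves the first four elements untouched and writes the submask into elements 4 through 7.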
Alexey Bataev
8ea8dd9a01 [SLP] Fix crash on trying to reshuffle a scalar that was vectorized.
If the buildvector node contains an extractelement whose vector operand
depends on a vector node, need to check if the node is ready and use the
vectorized value instead of the original vector operation.
2023-11-15 11:01:45 -08:00
Alexey Bataev
d202b00826 [SLP][NFC] Make tryToGather[SingleRegister]ExtractElements routines BoUpSLP methods. 2023-11-15 09:47:24 -08:00
Alexey Bataev
b6f51787f6 [SLP]Fix signedness analysis for scalars in graph.
Cannot use the sign info for the roots for all scalars in the graph,
need to perform the analysis for each particular scalar (tree node).
2023-11-15 07:10:59 -08:00
Alexey Bataev
5adfad254e [SLP]Emit actual bitwidth for analyzed MinBitwidth nodes, NFCI.
SLP includes analysis for the minimum bitwidth, so the actual integer
operations can be emitted at that width. This reduces register pressure
and improves performance. Currently, the analysis is used only in the
cost model, and the subsequent transformation relies on
InstructionCombiner. It is better to do it directly in SLP, which
reduces compile time and fixes cost model issues.
2023-11-14 11:12:52 -08:00
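As a rough illustration of what a minimum bitwidth means for a node, the sketch below computes the narrowest signed width that can hold a set of concrete values. This is an assumption made for the example only: the real analysis reasons about IR values rather than concrete numbers, and `minSignedBits` and `nodeMinBits` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Smallest two's complement width (in bits) that can represent v.
unsigned minSignedBits(std::int32_t v) {
  const std::int64_t wide = v;
  unsigned bits = 1;
  // Widen until v lies in [-2^(bits-1), 2^(bits-1) - 1]. For a 32-bit
  // input the loop stops at 32 bits at the latest.
  while (wide < -(std::int64_t{1} << (bits - 1)) ||
         wide > (std::int64_t{1} << (bits - 1)) - 1)
    ++bits;
  return bits;
}

// A node needs the widest of the widths required by its scalars.
unsigned nodeMinBits(const std::vector<std::int32_t> &scalars) {
  assert(!scalars.empty() && "a node has at least one scalar");
  unsigned bits = 1;
  for (std::int32_t v : scalars)
    bits = std::max(bits, minSignedBits(v));
  return bits;
}
```

A node with the scalars `{1, -3, 100}` needs 8 bits in this model, so its operations could be emitted on a narrower element type than the original 32-bit one.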
Alexey Bataev
f2f3050476 Revert "[SLP]Emit actual bitwidth for analyzed MinBitwidth nodes, NFCI."
This reverts commit f6ae50f710d02d8553d28192a1f048b2a9e1fc4d to fix
a crash revealed in the internal testing.
2023-11-14 09:45:54 -08:00
Alexey Bataev
f6ae50f710 [SLP]Emit actual bitwidth for analyzed MinBitwidth nodes, NFCI.
SLP includes analysis for the minimum bitwidth, so the actual integer
operations can be emitted at that width. This reduces register pressure
and improves performance. Currently, the analysis is used only in the
cost model, and the subsequent transformation relies on
InstructionCombiner. It is better to do it directly in SLP, which
reduces compile time and fixes cost model issues.
2023-11-14 07:57:37 -08:00
Alexey Bataev
d4cec1ce73 [SLP][NFCI]Improve compile time by using SmallBitVector and filtering
trees with phis/buildvectors only.
2023-11-14 06:27:17 -08:00
Alexey Bataev
ac254fc055 [SLP]Improve tryToGatherExtractElements by using per-register analysis.
Currently the tryToGatherExtractElements function analyzes the whole
vector, regardless of the number of actual registers used in this
vector. This may prevent some optimizations, because per-register
analysis may allow simplifying the final code by reusing more already
emitted vectors and better shuffles.

Differential Revision: https://reviews.llvm.org/D148855
2023-11-06 07:29:27 -08:00
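The benefit of per-register analysis described in ac254fc055 can be sketched with a toy shuffle mask. The model is an assumption for illustration: mask elements index into a concatenation of source vectors that are each `srcWidth` lanes wide, negative elements are undefined lanes, and `sourcesPerPart` is a hypothetical function. A mask that references many sources as a whole may reference only one or two within each register-sized part.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// For each of `numParts` equally sized slices of `mask`, collects the
// set of source vectors that the slice reads from.
std::vector<std::set<int>> sourcesPerPart(const std::vector<int> &mask,
                                          std::size_t numParts,
                                          int srcWidth) {
  assert(numParts > 0 && srcWidth > 0 && mask.size() % numParts == 0);
  const std::size_t partSize = mask.size() / numParts;
  std::vector<std::set<int>> result(numParts);
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (mask[i] < 0)
      continue; // undefined lane, reads from no source
    result[i / partSize].insert(mask[i] / srcWidth);
  }
  return result;
}
```

For the mask `{0, 1, 4, 5, 8, 9, 12, 13}` with 4-lane sources, the whole vector reads from four sources, while each of the two parts reads from only two.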
Hans Wennborg
046c57e705 Revert "[SLP]Improve tryToGatherExtractElements by using per-register analysis."
This causes asserts:

  llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:10082:
  Value *llvm::slpvectorizer::BoUpSLP::ShuffleInstructionBuilder::adjustExtracts(
    const TreeEntry *, MutableArrayRef<int>, unsigned int, bool &):
  Assertion `Part == 0 && "Expected firs part."' failed.

See comment on the code review.

> Currently the tryToGatherExtractElements function analyzes the whole
> vector, regardless of the number of actual registers used in this
> vector. This may prevent some optimizations, because per-register
> analysis may allow simplifying the final code by reusing more already
> emitted vectors and better shuffles.
>
> Differential Revision: https://reviews.llvm.org/D148855

This reverts commit 9dfdbd788707edc8c39eb2bff16004aba1f3586b.
2023-11-06 13:56:42 +01:00
Alexey Bataev
9dfdbd7887 [SLP]Improve tryToGatherExtractElements by using per-register analysis.
Currently the tryToGatherExtractElements function analyzes the whole
vector, regardless of the number of actual registers used in this
vector. This may prevent some optimizations, because per-register
analysis may allow simplifying the final code by reusing more already
emitted vectors and better shuffles.

Differential Revision: https://reviews.llvm.org/D148855
2023-11-03 10:43:58 -07:00