4149 Commits

Author SHA1 Message Date
Kazu Hirata
03dc806b12 [Transforms] Use {DenseMap,SmallPtrSet}::contains (NFC) 2023-12-22 14:51:22 -08:00
Alexey Bataev
a13148a880 [SLP]Fix PR75995: drop wrapping flags for resized wrapped binops.
If the instruction is resized, the wrapping flags must be dropped from
the resulting vector instructions to avoid incorrect
optimizations/assumptions later.
Fixes PR75995.
2023-12-20 06:51:39 -08:00
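A minimal standalone sketch (plain C++, not LLVM code) of why wrapping flags cannot simply be carried over when an operation is narrowed, as in a13148a880: an addition that does not overflow at 32 bits can overflow at 8 bits, so a no-signed-wrap claim made for the wide operation may be false for the resized one. The helper name `signedAddOverflows` is hypothetical and is used here only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if a + b overflows a signed integer of the given bit width.
// Both operands are assumed to already fit in that width.
bool signedAddOverflows(int64_t a, int64_t b, unsigned bits) {
  assert(bits >= 2 && bits <= 32 && "sketch only handles small widths");
  const int64_t max = (int64_t{1} << (bits - 1)) - 1;
  const int64_t min = -(int64_t{1} << (bits - 1));
  const int64_t sum = a + b; // cannot overflow int64_t for <= 32-bit inputs
  return sum > max || sum < min;
}
```

For example, `signedAddOverflows(100, 100, 32)` is false while `signedAddOverflows(100, 100, 8)` is true, so a flag that was valid for the 32-bit add would be wrong on the 8-bit one.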
Arthur Eubanks
71a9292298 Revert "[SLP]Improve findReusedOrderedScalars processing, NFCI."
This reverts commit 44dc1e0baae7c4b8a02ba06dcf396d3d452aa873.

Causes non-determinism, see #75987.
2023-12-19 16:14:04 -08:00
Alexey Bataev
00edad17c2 [SLP][NFC]Check for equal opcode preliminary to meet weak strict order
requirement, NFC.

This change does not affect functionality; it only fixes assertion
failures in some C++ standard library implementations.
2023-12-18 14:12:33 -08:00
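The commit message for 00edad17c2 does not show the comparator itself, so the following is only a generic sketch of the underlying requirement: `std::sort` needs a strict weak ordering, and some standard library implementations assert this in debug builds. Checking the primary key for equality first, and using the tie-breaker only when the keys are equal, is one common way to satisfy it. The `Node` type and all function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical element type; the real SLP comparator works on LLVM values.
struct Node {
  int opcode;
  int index;
};

// Strict weak ordering: compare opcodes first and fall through to the
// tie-breaker only when they are equal, so comp(x, x) is always false.
bool lessByOpcodeThenIndex(const Node &a, const Node &b) {
  if (a.opcode != b.opcode)
    return a.opcode < b.opcode;
  return a.index < b.index;
}

// Irreflexivity is one of the properties a strict weak ordering must have.
bool isIrreflexiveOn(const std::vector<Node> &nodes) {
  return std::none_of(nodes.begin(), nodes.end(), [](const Node &n) {
    return lessByOpcodeThenIndex(n, n);
  });
}

// Sorts a copy of the input using the comparator above.
std::vector<Node> sortedNodes(std::vector<Node> nodes) {
  assert(isIrreflexiveOn(nodes) && "comparator must be irreflexive");
  std::sort(nodes.begin(), nodes.end(), lessByOpcodeThenIndex);
  return nodes;
}
```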
Alexey Bataev
a7e10e6603 Revert "[SLP][NFC]Check for equal opcode preliminary to meet weak strict order"
This reverts commit 58a2c4e2f24ffce3966c3988d1a4ca7b04c52244 to fix the
issue detected by https://lab.llvm.org/buildbot/#/builders/233/builds/5424.
2023-12-18 12:35:52 -08:00
Alexey Bataev
58a2c4e2f2 [SLP][NFC]Check for equal opcode preliminary to meet weak strict order
requirement, NFC.

This change does not affect functionality; it only fixes assertion
failures in some C++ standard library implementations.
2023-12-18 06:42:03 -08:00
Florian Hahn
cb56ba6350
[VPlan] Unswitch cond in replaceUsesWithIf in optimizeInductions (NFC)
As suggested post-commit for a00227197, unswitch the condition in
replaceUsesWithIf to simplify the check.
2023-12-15 20:26:36 +00:00
Florian Hahn
9277ef12c0
[VPlan] Remove stale comment from optimizeInductions (NFC).
As suggested post-commit for a00227197, remove the stale comment,
SetVector is no longer used here.
2023-12-15 17:35:13 +00:00
Reid Kleckner
3e16152ebc [SLP] Fix OOB GEP index access for a no-op GEP
Issue is covered by existing test
llvm/test/Transforms/SLPVectorizer/RISCV/phi-const.ll

See issue #75632 for ideas for how we could catch these more easily in
the future.
2023-12-15 17:33:06 +00:00
Florian Hahn
b1bfe221e6
[VPlan] Remove unneeded getNumUsers calls in replaceAllUsesWith (NFC).
As suggested post-commit for a00227197, replace unnecessary getNumUsers
calls by boolean variable to indicate if users changed. Note that this
also requires an early exit to detect the case where a value is replaced
by itself.
2023-12-15 13:43:15 +00:00
Shih-Po Hung
3d422a9859
[VPlan] Implement mayHaveSideEffects/mayWriteToMemory for VPInterleav… (#71360)
…eRecipe

This helps VPlanTransforms::removeDeadRecipes to work on
VPInterleaveRecipe
2023-12-15 00:23:14 +08:00
Maurice Heumann
f42b930af9
[SLP] Pessimistically handle unknown vector entries in SLP vectorizer (#75438)
SLP Vectorizer can discard vector entries at unknown positions. This
example shows the behaviour:

https://godbolt.org/z/or43EM594

The following instruction inserts an element at an unknown position:

```
%2 = insertelement <3 x i64> poison, i64 %value, i64 %position
```

The position depends on an argument that is unknown at compile time.

After running SLP, no instruction referencing `%position` remains.

This happens as SLP parallelizes the two adds in the example. It then
needs to merge the original vector with the new vector.

Within `isUndefVector`, the SLP vectorizer constructs a bitmap
indicating which elements of the original vector are poison values. It
does this by walking the insertElement instructions.

If it encounters an insert with a non-constant position, that insert is
ignored. As a result, poison values are used for all entries that have
no insert with a constant position.

However, as the position is unknown, the element could be anywhere.
Therefore, I think it is only safe to assume none of the entries are
poison values and to simply take them all over when constructing the
shuffleVector instruction.

This fixes #75437
2023-12-14 09:48:23 -05:00
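The pessimistic rule described in f42b930af9 can be sketched in standalone C++ as follows. This is not the `isUndefVector` implementation; it only models the idea that a lane may be treated as poison unless a constant-index insert writes it, and that a single insert at an unknown position forces every lane to be treated as defined. The `Insert` struct and the `poisonLanes` function are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// One insertelement into an initially poison vector. `index` is empty when
// the position is not a compile-time constant.
struct Insert {
  std::optional<std::size_t> index;
};

// Returns a mask where mask[i] == true means lane i may be treated as poison.
std::vector<bool> poisonLanes(std::size_t numLanes,
                              const std::vector<Insert> &inserts) {
  std::vector<bool> poison(numLanes, true);
  for (const Insert &ins : inserts) {
    if (!ins.index) {
      // Unknown position: the value could be in any lane, so assume that
      // no lane is poison.
      return std::vector<bool>(numLanes, false);
    }
    assert(*ins.index < numLanes && "constant index out of range");
    poison[*ins.index] = false;
  }
  return poison;
}
```

With three lanes, a single insert at constant index 1 leaves lanes 0 and 2 marked as poison, while a single insert at an unknown index leaves none marked.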
Florian Hahn
173032902c
Revert "[VPlan] Mark Select VPInstructions as not having sideeffects."
This reverts commit 19918ac34dc5d304ec6ad413ceae1d4394abe28f.

Fixes #75298. There is still a case where we miss the correct users
outside the main vector loop for reductions, and that is tail-folded
loops with reductions where the final value is stored after the loop.

This should be handled explicitly in #70253
2023-12-13 21:05:24 +00:00
Florian Hahn
19918ac34d
[VPlan] Mark Select VPInstructions as not having sideeffects.
Select VPInstructions don't have sideeffects, mark them accordingly.
2023-12-11 12:26:32 +00:00
Shao-Ce SUN
d860710905
[NFC][VPlan] Simplify VPValue::removeUser (#74708)
Replaced explicit loops with find + erase.
2023-12-11 10:55:27 +08:00
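As a generic illustration of the find + erase idiom mentioned in d860710905 (not the actual `VPValue::removeUser` code), the sketch below removes a single occurrence of an element from a vector without a hand-written loop. It assumes that only the first occurrence should be removed; `removeFirst` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <vector>

// Removes the first occurrence of `user` from `users`, if present.
// Returns true if an element was removed.
template <typename T>
bool removeFirst(std::vector<T> &users, const T &user) {
  auto it = std::find(users.begin(), users.end(), user);
  if (it == users.end())
    return false;
  users.erase(it);
  return true;
}
```

Checking the iterator against `end()` before erasing matters: calling `erase(end())` on a `std::vector` is undefined behavior.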
Kazu Hirata
8b1181133d [Transforms] Remove unused forward declarations (NFC) 2023-12-10 10:07:12 -08:00
Kazu Hirata
a16429365c [Transforms] Remove unnecessary includes (NFC) 2023-12-09 18:23:06 -08:00
Alexey Bataev
44dc1e0baa [SLP]Improve findReusedOrderedScalars processing, NFCI.
Tries to simplify structural complexity of the findReusedOrderedScalars function.
2023-12-08 14:27:55 -08:00
Florian Hahn
a5891fa4d2
[VPlan] Initial modeling of VF * UF as VPValue. (#74761)
This patch starts initial modeling of VF * UF in VPlan.
Initially, introduce a dedicated VFxUF VPValue, which is then
populated during VPlan::prepareToExecute. Initially, the VF * UF
applies only to the main vector loop region. Once we extend the
scope of VPlan in the future, we may want to associate different VFxUFs
with different vector loop regions (e.g. the epilogue vector loop)

This allows explicitly parameterizing recipes that rely on the
VF * UF, like the canonical induction increment. At the moment, this
mainly helps to avoid generating some duplicated calls to vscale with
scalable vectors. It should also allow using EVL as induction increments
explicitly in D99750. Referring to VF * UF is also needed in other
places that we plan to migrate to VPlan, like the minimum trip count
check during skeleton creation.

The first version creates the value for VF * UF directly in
prepareToExecute to limit the scope of the patch. A follow-on patch will
model VF * UF computation explicitly in VPlan using recipes.

Moved from Phabricator (https://reviews.llvm.org/D157322)
2023-12-08 18:30:30 +00:00
Florian Hahn
5ea6a3fc6d
[VPlan] Compute scalable VF in preheader for induction increment. (#74762)
UF * VF is loop invariant and can be computed directly in the preheader.
This prepares the code for #74761 and reduces the test changes.
2023-12-08 12:18:31 +00:00
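The point made in 5ea6a3fc6d, that VF * UF is loop invariant and can be computed once before the loop, can be modelled with a scalar simulation. The sketch below is not VPlan code: `queryVScale` is a hypothetical stand-in for a runtime `vscale` query (hard-coded to 2), and `sumInChunks` plays the role of a vector loop with a scalar remainder.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often the (hypothetical) runtime vscale query is made.
static int vscaleCalls = 0;

// Stand-in for a runtime vscale query; the value 2 is an assumption.
std::size_t queryVScale() {
  ++vscaleCalls;
  return 2;
}

// Sums `data`, processing VF * UF elements per main-loop iteration.
long sumInChunks(const std::vector<int> &data, std::size_t minVF,
                 std::size_t UF) {
  assert(minVF > 0 && UF > 0 && "step must be non-zero");
  // "Preheader": VF * UF is loop invariant, so compute it exactly once.
  const std::size_t VFxUF = queryVScale() * minVF * UF;
  long sum = 0;
  std::size_t i = 0;
  // Main loop: the induction variable advances by VF * UF each iteration.
  for (; i + VFxUF <= data.size(); i += VFxUF)
    for (std::size_t j = 0; j < VFxUF; ++j)
      sum += data[i + j];
  // Scalar remainder for the elements that do not fill a whole step.
  for (; i < data.size(); ++i)
    sum += data[i];
  return sum;
}
```

Because the step is held in one value, the increment does not need to re-query `vscale` on every iteration.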
Florian Hahn
633fe60149
[VPlan] Print flags for VPWidenCastRecipe.
Update VPWidenCastRecipe to also print flags. Simplify nneg printing
test and replace hard-coded value number references with patterns.
2023-12-08 10:48:54 +00:00
Graham Hunter
d0d5ef8133
[LV] Add support for linear arguments for vector function variants (#73941)
If we have vectorized variants of a function which take linear
parameters, we should be able to vectorize assuming the strides match.
2023-12-08 10:24:05 +00:00
Alexey Bataev
fb35bb48c6 [SLP][NFC]Build value-to-gather-nodes map during nodes building, NFC. 2023-12-07 13:41:19 -08:00
Alexey Bataev
58785ebd24 [SLP][NFC]Check for ephemeral values beforehand, NFC. 2023-12-07 13:25:15 -08:00
Alexey Bataev
0e1a9e3084 [SLP]Fix PR74607: Fix dependency between buildvector nodes with user
nodes, having same last instruction.

If the user nodes have the same last instruction, which is used as the
insertion point for the buildvector nodes, finding the proper dependency
is crucial. Before, it depended on the indices of the buildvectors
themselves, but it should depend on the indices of the user nodes,
because they identify the vectorization order and thus properly align
the buildvector nodes in terms of the def-use chain.
2023-12-06 10:15:01 -08:00
Paschalis Mpeis
7b83f69db4
[NFC] Replace CallInst with FunctionType in VFABI, VFShape API (#74569)
Minor simplification applied to VFShape::getScalarShape,
VFShape::get, and VFABI::tryDemangleForVFABI methods.

Also, remove unnecessary `static_cast` in `SLPVectorizer.cpp`
2023-12-06 17:14:58 +00:00
Florian Hahn
bbd1941a38
[VPlan] Add disjoint flag to VPRecipeWithIRFlags. (#74364)
A new disjoint flag was added for OR instructions in #72583. 

Update VPRecipeWithIRFlags to also support the new flag. This
allows printing and preserving the disjoint flag in vectorized code.
2023-12-05 15:21:59 +00:00
Alexey Bataev
056367bb19
[LV]Support dropping of nneg flag for zext widencast recipes. (#74112)
The compiler crashes when the assertion that checks that an instruction
cannot produce poison triggers for a zext nneg instruction. Changed the
base class of the widencast recipe to handle dropping the nneg flag and
avoid the compiler crash.
2023-12-05 09:17:23 -05:00
Florian Hahn
cd4348349a
[VPlan] Sink cases where no truncate is needed in truncateMinimalBWs.
MinBWs contains entries that specify the minimum required bitwidth. In
some cases, the old and new bitwidths can be equal (see test case) and
in those cases no truncations are needed, so skip those cases.

Fixes #74307.
2023-12-04 15:35:54 +00:00
Florian Hahn
99aa5311ee
[VPlan] Add missing output of live-ins to VPlan dot printing.
Split off live-in printing to VPlan::printLiveIns and use it to print
Live-ins when printing in the DOT format.
2023-12-04 13:41:28 +00:00
Florian Hahn
c890582912
[VPlan] Account for live-in entries in MinBW used by replicate recipes.
In some cases MinBWs may contain entries for live-ins that are not used
by VPWidenRecipe or VPWidenSelectRecipes. In those cases, the live-ins
won't get processed, so make sure we include them in the count when used
as operands in VPWidenCast and VPWidenSelectRecipe.

Fixes https://github.com/llvm/llvm-project/issues/74231
2023-12-03 11:15:29 +00:00
Kazu Hirata
0008b9c0ac [Vectorize] Fix an unused variable warning
This patch fixes:

  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:912:16: error:
  unused variable 'OldResSizeInBits' [-Werror,-Wunused-variable]
2023-12-02 11:20:57 -08:00
Florian Hahn
70535f5e60
[VPlan] Replace IR based truncateToMinimalBitwidths with VPlan version.
This patch replaces the IR based truncateToMinimalBitwidths with a VPlan
version. This has 3 benefits:
1) the VPlan-based version is simpler; we don't need to implement
   special codegen for each supported instruction type like the IR based
   one.
2) Removes a dependency on the cost-model after VPlan execution and
3) Removes a use of getVPValue that uses underlying values after VPlan
   execution (See removed FIXME).

Depends on D149081.

Depends on D149079.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D149903
2023-12-02 16:12:38 +00:00
Florian Hahn
cbf7b52a65
[VPlan] Properly update reduction live-out after placing select.
After inserting a select for the final value, update the VPlan def-use
chains. At the moment, the incorrect live-out doesn't cause a
mis-compile, as computing the final reduction value is not yet modeled
in VPlan.
2023-12-02 15:22:09 +00:00
Alexey Bataev
279b1ea65f [SLP]Improve gathering of the scalars used in the graph.
Currently we emit gathers for scalars being vectorized in the tree as
a pair of extractelement/insertelement instructions. Instead we can try
to find all required vectors and emit shuffle vector instructions
directly, improving the code and reducing compile time.

Part of non-power-of-2 vectorization.

Differential Revision: https://reviews.llvm.org/D110978
2023-12-01 11:23:57 -08:00
Alexey Bataev
ba52310657
[SLP][NFC] Unify code for cost estimation/codegen for buildvector, NFC. (#73182)
This just moves towards reusing the same function for both cost
estimation and codegen for buildvector.
2023-11-30 10:04:57 -05:00
Alexey Bataev
1f88e62db4 [SLP]Fix/improve minbitwidth mapping to use TreeEntry as a key.
Currently, the MinBWs map uses Value* as a key and stores a mapping for
each value to be demoted. This makes it hard to get the actual MinBWs
value for the buildvector scalars (constants), since the same constant
might be used in different nodes with different MinBWs values/decisions.
It also consumes extra memory for the vectorized values/instructions
from the same nodes.
It is better to map the actual nodes. This fixes the bitwidth data
fetching for buildvector scalars and improves memory consumption and
analysis time for other instructions.
2023-11-30 06:33:31 -08:00
Jeremy Morse
2425e2940e
[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149)
Part of the "RemoveDIs" project to remove debug intrinsics requires
passing block-positions around in iterators rather than as instruction
pointers, allowing some debug-info to reside in BasicBlock::iterator.
This means getInsertionPointAfterDef has to return an iterator, and as
it can return no-instruction that means returning an optional iterator.

This patch changes the signature for getInsertionPtAfterDef and then
patches up the various places that use it to handle the different type.
This would overall be an NFC patch, however in
InstCombinerImpl::freezeOtherUses I've started skipping any debug
intrinsics at the returned insert-position. This should not have any
_meaningful_ effect on the compiler output: at worst it means variable
assignments that are skipped will now cover the freeze instruction and
anything inserted before it, which should be inconsequential.

Sadly: this makes the function signature ugly. This is probably the
ugliest piece of fallout for the "RemoveDIs" work, but it serves the
overall purpose of improving compile times and not allowing `-g` to
affect compiler output, so should be worthwhile in the end.
2023-11-30 12:19:57 +00:00
Youngsuk Kim
859338a695 [llvm] Replace uses of Type::getPointerTo (NFC)
Work towards removing method Type::getPointerTo.
Opaque ptr cleanup effort.
2023-11-29 10:22:31 -06:00
Alexey Bataev
447da954c7 [SLP][NFC]Use DenseSet instead of SetVector, NFC.
For CSEBlocks we can safely use DenseSet; the order does not need to be
preserved for this container.
2023-11-28 11:27:49 -08:00
Alexey Bataev
badec9b7bf [SLP][NFC]Fix loops variables names, NFC. 2023-11-28 10:30:19 -08:00
Alexey Bataev
c72884225b [SLP][NFC]Fix naming of variables/functions, NFC. 2023-11-28 09:15:38 -08:00
Alexey Bataev
b6eb740cae [SLP][NFC]Improve/fix auto declarations, NFC. 2023-11-28 07:39:21 -08:00
Alexey Bataev
45139ab6ca [SLP][NFC]Improve aliasing support in SLP, NFC.
There is no need to store an optional boolean in the map; storing a
boolean directly is enough. Also, we can do a preliminary check of the
instructions and, if they are not simple, mark them as aliased without
storing this result in the map.
2023-11-28 07:24:44 -08:00
Graham Hunter
104b7c624e
[LV] Add support for uniform parameters on vectorized function variants (#72891)
Parameters marked as uniform take a scalar value, assuming the value is
invariant in the scalar loop.
2023-11-28 15:01:32 +00:00
Alexey Bataev
e9fdb965f9 [SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using
Expr container, NFC.

Saves memory and may improve compile time.
2023-11-24 08:05:19 -08:00
Alexander Kornienko
af7a145352 Revert "[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using"
This reverts commit 52df67ba76a03ad33132d1d4f4202d5a2313a3cd, which causes
spurious clang crashes. See
52df67ba76 (commitcomment-133381701)
2023-11-24 01:18:46 +01:00
Florian Hahn
906f598263
[VPlan] Remove dead IsEpilogueVec argument from prepareToExecute (NFC). 2023-11-23 16:59:50 +00:00
Alexey Bataev
53f912480f [SLP][NFC]Remove extra unused vars, add TODO, NFC. 2023-11-22 12:26:54 -08:00
Alexey Bataev
12bcd6339d [SLP]Improve detection of gathered loads, if no other deps are detected.
If the gather node includes ordered loads only partially (the node does
not consist entirely of loads), the other gathered scalars are not
loads, and no dependency from other nodes is found, we can still improve
the cost of the gather by taking into account the fact that these loads
can still be vectorized.
2023-11-22 11:35:51 -08:00