2177 Commits

Author SHA1 Message Date
Florian Hahn
e454d31037
[VPlan] Factor out precomputing costs from LVP::cost (NFC).
Move the logic for pre-computing costs of certain instructions to a
separate helper function, allowing re-use in a follow-up patch.
2024-08-22 20:40:38 +01:00
Florian Hahn
1fa6c99a09
[VPlan] Move EVL memory recipes to VPlanRecipes.cpp (NFC)
Move VPWiden[Load|Store]EVLRecipe::execute to VPlanRecipes.cpp in line
with other ::execute implementations that don't depend on anything
defined in LoopVectorization.cpp
2024-08-22 18:30:49 +01:00
Paul Walker
4f075086e7
[LLVM][VPlan] Keep all VPBlend masks until VPlan transformation. (#104015)
It's not possible to pick the best mask to remove when optimising
VPBlend at construction and so this patch refactors the code to move the
decision (and thus transformation) to VPlanTransforms.

NOTE: This patch does not change the decision of which mask to pick.
That will be done in a following PR to keep this patch as NFC from an
output point of view.
2024-08-21 12:51:40 +01:00
Florian Hahn
4e04286d61
[VPlan] Only use selectVectorizationFactor for cross-check (NFCI). (#103033)
Use getBestVF to select VF up-front and only use
selectVectorizationFactor to get the legacy VF to check the
vectorization decision matches the VPlan-based cost model.

PR: https://github.com/llvm/llvm-project/pull/103033
2024-08-21 13:09:01 +02:00
Florian Hahn
99741ac285
[VPlan] Introduce explicit ExtractFromEnd recipes for live-outs. (#100658)
Introduce explicit ExtractFromEnd recipes to extract the final values
for live-outs instead of implicitly extracting in VPLiveOut::fixPhi.

This is a follow-up to the recent changes of modeling extracts for
recurrences and consolidates live-out extract creation for fixed-order
recurrences at a single place: addLiveOutsForFirstOrderRecurrences.

It is also in preparation of replacing VPLiveOut with VPIRInstructions
wrapping the original scalar phis.

PR: https://github.com/llvm/llvm-project/pull/100658
2024-08-21 10:06:44 +02:00
Florian Hahn
7452014c95
[LV] Simplify !UserVF.isZero() -> UserVF (NFC).
Address post-commit comment for b8dccb7d56c to simplify code.
2024-08-20 09:40:35 +01:00
Florian Hahn
f2fcd9cb97
[VPlan] Rename getBestPlanFor -> getPlanFor (NFC).
As suggested in https://github.com/llvm/llvm-project/pull/103033, more
accurately rename to getPlanFor, as it simply returns the VPlan for
VF, relying on the fact that there is a single VPlan for each VF at the
moment.
2024-08-19 13:05:19 +01:00
Florian Hahn
b8dccb7d56
[VPlan] Emit note when UserVF > MaxUserVF (NFCI).
As suggested in https://github.com/llvm/llvm-project/pull/103033, add a
remark when the UserVF is ignored due to it being larger than MaxUserVF.

Only changes behavior of diagnostic/debug output.
2024-08-19 12:40:20 +01:00
Florian Hahn
740f055451
[VPlan] Rename getBestVF -> computeBestVF (NFC).
As suggested in https://github.com/llvm/llvm-project/pull/103033, more
accurately rename to computeBestVF, as it now does not simply return the
best VF, but directly computes it.
2024-08-19 10:44:50 +01:00
Florian Hahn
cd60d10a10
[VPlan] Move some LoopVectorizationPlanner helpers to VPlan.cpp (NFC).
Members not requiring access to LoopVectorizationLegality or
LoopVectorizationCostModel can safely be moved out of the very large
LoopVectorization.cpp and are more accurately placed in VPlan.cpp
2024-08-19 09:58:46 +01:00
Florian Hahn
e9e3a183d6
[LV] Don't cost branches and conditions to empty blocks.
Update the legacy cost model to skip branches with successor blocks
that are empty or only contain dead instructions, together with their
conditions. Such branches and conditions won't result in any
generated code and will be cleaned up by VPlan transforms.

This fixes a difference between the legacy and VPlan-based cost model.

When running LV in its usual pipeline position, such dead blocks should
already have been cleaned up, but they might be generated manually or by
fuzzers.

Fixes https://github.com/llvm/llvm-project/issues/100591.
2024-08-18 12:51:17 +01:00
Florian Hahn
1aa8a6f691
[VPlan] Compute cost for most opcodes in VPWidenRecipe (NFCI). (#98764)
Implement VPWidenRecipe::computeCost for most cases (except
UDiv, SDiv, URem, SRem, which require additional logic).

Note that this specializes `::computeCost` instead of `::cost`, as
`VPRecipeBase::cost` is responsible for skipping cost-computations
for pre-computed recipes for now.

The most recent version of the VPlan-based cost model introduction 
has been committed on Jul 10 (b841e2eca3b5c8b) and we should
probably give it at least a week in case additional mismatches surface.

PR: https://github.com/llvm/llvm-project/pull/98764
2024-08-16 21:20:23 +02:00
Florian Hahn
42555cdba4
[VPlan] Run VPlan optimizations on plans in native path.
Update buildVPlans (used in native path) to also run general VPlan
optimizations in another small step to align both codepaths.
2024-08-15 13:05:51 +01:00
Florian Hahn
12763a0652
[VPlan] Move VPWidenStoreRecipe::execute to VPlanRecipes.cpp (NFC).
Move VPWidenStoreRecipe::execute to VPlanRecipes.cpp in line with
other ::execute implementations that don't depend on anything
defined in LoopVectorization.cpp
2024-08-15 08:04:22 +01:00
Shao-Ce SUN
b006007e4a
[NFC][VP] Reduce parameters in LoopVectorizePass::runImpl (#103551)
It seems that the parameters can be passed through the class members.
2024-08-14 20:53:28 +08:00
Yingwei Zheng
f364b2ee22
[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
2024-08-13 22:38:50 +08:00
Florian Hahn
2ab910c08c
[LV] Check pointer user are in loop when checking for uniform pointers.
Widening decisions are not set for users outside the loop. Avoid
crashing by only calling isVectorizedMemAccessUse for users in the loop.

Fixes https://github.com/llvm/llvm-project/issues/102934.
2024-08-13 09:23:44 +01:00
Florian Hahn
c7a44ec031
[VPlan] Check successors in VPlan to check if scalar epi required (NFC)
Now that the branches to the scalar epilogue are modeled in VPlan
directly, check the VPlan to see if a scalar epilogue is required.

Preparation for https://github.com/llvm/llvm-project/pull/100658.
2024-08-12 15:33:52 +01:00
Florian Hahn
cd08fadd03
[LV] Include chains feeding inductions in cost precomputation.
Include chain of ops feeding inductions in cost precomputation for
inductions, not just the induction increment. In VPlan, those
instructions will be cleaned up, as both phi and increment are generated
by VPWidenIntOrFpInductionRecipe independently.

Fixes https://github.com/llvm/llvm-project/issues/101337.
2024-08-12 14:45:43 +01:00
Florian Hahn
db0603cb7b
[LV] Only OR unique edges when creating block-in masks.
This removes redundant ORs of matching masks.

Follow-up to f0df4fbd0c7b to reduce the number of redundant ORs for
masks.
2024-08-12 10:17:40 +01:00
Florian Hahn
60680f7181
[LV] Handle SwitchInst in ::isPredicatedInst.
After f0df4fbd0c7b, isPredicatedInst needs to handle SwitchInst as well.
Handle it the same as BranchInst.

This fixes a crash in the newly added test and improves the results for
one of the existing tests in predicate-switch.ll

Should fix https://lab.llvm.org/buildbot/#/builders/113/builds/2099.
2024-08-11 20:56:58 +01:00
Florian Hahn
f0df4fbd0c
[LV] Support generating masks for switch terminators. (#99808)
Update createEdgeMask to create masks where the terminator in Src is a
switch. We need to handle 2 separate cases:

1. Dst is not the default destination. Dst is reached if any of the
cases with destination == Dst are taken. Join the conditions for each
case where destination == Dst using a logical OR.
2. Dst is the default destination. Dst is reached if none of the cases
with destination != Dst are taken. Join the conditions for each case
where the destination is != Dst using a logical OR and negate it.

Edge masks are created for every destination of cases and/or 
default when requesting a mask where the source is a switch.
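
The two cases above can be sketched on a toy, per-lane model. This is an illustrative sketch only, not the LoopVectorize implementation: `SwitchModel` and `edgeMask` are hypothetical names, block ids are plain integers, and the mask of the source block is assumed to be all-true and therefore ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical scalar model of a switch terminator: each case maps a value
// to a destination block id; any unmatched value goes to DefaultDst.
struct SwitchModel {
  std::vector<std::pair<int, int>> Cases; // (case value, destination id)
  int DefaultDst;
};

// Per-lane mask for the edge Src -> Dst, where Cond holds the switch
// condition of each lane.
std::vector<bool> edgeMask(const SwitchModel &SW, const std::vector<int> &Cond,
                           int Dst) {
  const bool IsDefault = (Dst == SW.DefaultDst);
  std::vector<bool> Mask(Cond.size(), false);
  for (std::size_t Lane = 0; Lane < Cond.size(); ++Lane) {
    bool AnyTaken = false;
    for (const auto &Case : SW.Cases) {
      // Case 1 ORs the cases that target Dst;
      // case 2 ORs the cases that do not target Dst.
      const bool Relevant =
          IsDefault ? (Case.second != Dst) : (Case.second == Dst);
      if (Relevant && Cond[Lane] == Case.first)
        AnyTaken = true;
    }
    // Case 2 negates the joined condition: no other case was taken.
    Mask[Lane] = IsDefault ? !AnyTaken : AnyTaken;
  }
  return Mask;
}
```

With cases `0 -> 1`, `1 -> 1`, `2 -> 2` and default destination `3`, lanes holding `{0, 1, 2, 7}` get the mask `{1, 1, 0, 0}` for destination 1 and `{0, 0, 0, 1}` for the default destination.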

Fixes https://github.com/llvm/llvm-project/issues/48188.

PR: https://github.com/llvm/llvm-project/pull/99808
2024-08-11 20:38:36 +02:00
Florian Hahn
7024cecf03
[LV] Collect profitable VFs in ::getBestVF. (NFCI)
Move collecting profitable VFs to ::getBestVF, in preparation for
retiring selectVectorizationFactor.
2024-08-11 14:45:54 +01:00
Florian Hahn
35d3625a4d
[VPlan] Move VPWidenLoadRecipe::execute to VPlanRecipes.cpp (NFC).
Move VPWidenLoadRecipe::execute to VPlanRecipes.cpp in line with
other ::execute implementations that don't depend on anything
defined in LoopVectorization.cpp
2024-08-11 12:01:18 +01:00
Jeremy Morse
fd7d7882e7
[DebugInfo][RemoveDIs] Use iterators to insert everywhere (#102003)
These are the final few places in LLVM where we use instruction pointers
to identify the position that we're inserting something. We're trying to
get away from that with a view to deprecating those methods, thus use
iterators in all these places. I believe they're all debug-info safe.

The sketchiest part is the ExtractValueInst copy constructor, where we
cast nullptr to a BasicBlock pointer, so that we take the non-default
insert-into-no-block path for instruction insertion, instead of the
default nullptr-instruction path for UnaryInstruction. Such a hack is
necessary until we get rid of the instruction constructor entirely.
2024-08-08 14:25:06 +01:00
Florian Hahn
241349fff2
[VPlan] Move VPWidenPointerInductionR::execute to VPlanRecipes. (NFC)
Move VPWidenPointerInductionRecipe::execute to VPlanRecipes.cpp in line
with other ::execute implementations that don't depend on anything
defined in LoopVectorization.cpp
2024-08-05 20:42:10 +01:00
Alexey Bataev
badf34a063
[LV]Process alloca in isPredicatedInst for tail-folded analysis.
Patch fixes the compiler crash when it tries to check if an alloca in
the loop is a predicated instruction.

Reviewers: fhahn

Reviewed By: fhahn

Pull Request: https://github.com/llvm/llvm-project/pull/101743
2024-08-05 14:04:54 -04:00
Florian Hahn
fdb9f96fa2
[LV] Consider earlier stores to invariant reduction address as dead.
For invariant stores to an address of a reduction, only the latest store
will be generated outside the loop. Consider earlier stores as dead.

This fixes a difference between the legacy and VPlan-based cost model.

Fixes https://github.com/llvm/llvm-project/issues/96294.
2024-08-04 20:54:26 +01:00
Kazu Hirata
b7146aed5b
[Transforms] Construct SmallVector with ArrayRef (NFC) (#101851)
2024-08-03 15:33:08 -07:00
Florian Hahn
66ce4f771e
[VPlan] Port invalid cost remarks to VPlan. (#99322)
This patch moves the logic to create remarks for instructions with
invalid costs to work on recipes and decoupling it from
selectVectorizationFactor. This is needed to replace the remaining uses
of selectVectorizationFactor with getBestPlan using the VPlan-based cost
model.

The current implementation iterates over all VPlans and their recipes
again, to find recipes with invalid costs, which is more work but will
only be done when remarks for LV are enabled. Once the remaining uses of
selectVectorizationFactor are retired, we can collect VPlans with
invalid costs as part of getBestPlan if we want to optimize the remarks
case a bit, at the cost of adding additional complexity.

PR: https://github.com/llvm/llvm-project/pull/99322
2024-07-27 12:52:12 +01:00
Florian Hahn
5a9b9ef660
[VPlan] Remove now redundant VF assertion.
The assertion was added in preparation for
https://github.com/llvm/llvm-project/pull/9882. Remove assertion now
the PR has landed.
2024-07-26 21:25:33 +01:00
Florian Hahn
67a55e01e3
[VPlan] Replace getBestPlan by getBestVF use also for epilogue vec. (#98821)
Replace getBestPlan by getBestVF which simply finds the best
VF out of the VFs for the available VPlans.

Then use getBestPlan to retrieve the corresponding VPlan.

This allows using getBestVF & getBestPlan for epilogue vectorization
as well. As the same plan may be used to vectorize both the main
and epilogue loop, restricting the VF of the best plan would cause
issues.

PR: https://github.com/llvm/llvm-project/pull/98821
2024-07-26 14:06:46 +01:00
Florian Hahn
8b02f31aea
[VPlan] Consistently use VF.Width to getting plan for main loop VF (NFC)
Cleanup to make things consistent in preparation for
https://github.com/llvm/llvm-project/pull/98821.
2024-07-26 11:16:45 +01:00
Florian Hahn
a3092152ac
[VPlan] Don't create live-outs for induction increments.
Follow up to fc9cd3272b5 to also skip creating live-outs for IV
increments, as those are also generated independent of VPlan for now.
2024-07-25 21:34:55 +01:00
Florian Hahn
72532c9219
[LV] Don't predicate divs with invariant divisor when folding tail (#98904)
When folding the tail, at least one of the lanes must execute
unconditionally. If the divisor is loop-invariant no predication is
needed, as predication would not prevent the divide-by-0 on the executed
lane.
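
The reasoning above can be illustrated with a toy model of a tail-folded loop. This is a sketch under stated assumptions, not the vectorizer's code: `divScalar` and `divTailFolded` are hypothetical names, the vector width is fixed at 4, and inactive lanes are assumed to read a placeholder numerator of 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: divides every element by the loop-invariant divisor D.
std::vector<int> divScalar(const std::vector<int> &A, int D) {
  std::vector<int> Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Out[I] = A[I] / D;
  return Out;
}

// Tail-folded model: the trip count is rounded up to a multiple of VF and
// lanes past the end are masked off instead of running a scalar remainder.
std::vector<int> divTailFolded(const std::vector<int> &A, int D) {
  constexpr std::size_t VF = 4; // assumed fixed vector width
  std::vector<int> Out(A.size());
  for (std::size_t Base = 0; Base < A.size(); Base += VF) {
    // The first lane of every vector iteration is always active.
    assert(Base < A.size());
    for (std::size_t Lane = 0; Lane < VF; ++Lane) {
      const bool Active = Base + Lane < A.size();
      // Masked load: inactive lanes use a placeholder numerator (assumption).
      const int Num = Active ? A[Base + Lane] : 0;
      // Unpredicated divide: every lane uses the same D, so a zero divisor
      // would already fault on the active first lane.
      const int Quot = Num / D;
      if (Active) // masked store
        Out[Base + Lane] = Quot;
    }
  }
  return Out;
}
```

For a non-zero divisor both functions produce the same result, for example `{2, 4, 6, 8, 10, 12}` for the input `{10, 20, 30, 40, 50, 60}` and `D = 5`.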

Depends on https://github.com/llvm/llvm-project/pull/98892.

PR: https://github.com/llvm/llvm-project/pull/98904
2024-07-25 12:21:09 +01:00
Florian Hahn
b72689a5cb
[LV] Ignore live-out users in cost model if scalar epilogue is required.
Follow-up to ba8126b6fef79.

If a scalar epilogue is required, users outside the loop won't use
live-outs from the vector loop but from the scalar epilogue. Ignore them if
that is the case.

This fixes another case where the VPlan-based cost-model more accurately
computes cost.

Fixes https://github.com/llvm/llvm-project/issues/100464.
2024-07-25 11:16:18 +01:00
Florian Hahn
07688d1341
Revert "[LV] Add option to still enable the legacy cost model. (#99536)"
This reverts commit 9ba524427321b931bad156860755adf420aeec6a.

Remove the recently added temporary option vectorize-use-legacy-cost-model
as discussed on the PR adding it, now that we branched for 19.x.
2024-07-24 14:32:36 +01:00
Florian Hahn
ba8126b6fe
[LV] Mark dead instructions in loop as free.
Update collectValuesToIgnore to also ignore dead instructions in the
loop. Such instructions will be removed by VPlan-based DCE and won't be
considered by the VPlan-based cost model.

This closes a gap between the legacy and VPlan-based cost model. In
practice with the default pipelines, there shouldn't be any dead
instructions in loops reaching LoopVectorize, but it is easy to generate
such cases by hand or automatically via fuzzers.

Fixes https://github.com/llvm/llvm-project/issues/99701.
2024-07-24 09:31:32 +01:00
Florian Hahn
d89f3e8df3
[VPlan] Remove dead HeaderVPBB argument from addUsersInExitBlock (NFC).
2024-07-23 11:36:43 +01:00
Florian Hahn
a23efcc703
[VPlan] Move VPInterleaveRecipe::execute to VPlanRecipes.cpp (NFC).
Move ::execute and ::print to VPlanRecipes.cpp in line with other recipe
definitions.
2024-07-20 22:23:02 +01:00
Florian Hahn
1f00c42446
[VPlan] Assert masked interleave accesses are allowed if needed (NFC)
Add assertion at interleave group construction.
2024-07-20 21:42:38 +01:00
Craig Topper
be7f1827ff
[LV] Use llvm::all_of in LoopVectorizationCostModel::getMaximizedVFForTarget. NFC (#99585)
2024-07-19 17:13:20 -07:00
Florian Hahn
9ba5244273
[LV] Add option to still enable the legacy cost model. (#99536)
This patch adds a new temporary option to still use the legacy cost
model after https://github.com/llvm/llvm-project/pull/92555. It defaults
to false and the only intended use is to adjust the default to true in
the soon-to-be-cut release branch.

PR: https://github.com/llvm/llvm-project/pull/99536
2024-07-19 18:48:15 +01:00
Florian Hahn
008df3cf85
[LV] Check isPredInst instead of isScalarWithPred in uniform analysis. (#98892)
Any instruction marked as uniform will result in a uniform
VPReplicateRecipe. If it requires predication, it will be placed in a
replicate region, even if isScalarWithPredication returns false.

Check isPredicatedInst instead of isScalarWithPredication to avoid
generating uniform VPReplicateRecipes placed inside a replicate region.
This fixes an assertion when using scalable VFs.

Fixes https://github.com/llvm/llvm-project/issues/80416. 
Fixes https://github.com/llvm/llvm-project/issues/94328.
Fixes https://github.com/llvm/llvm-project/issues/99625.

PR: https://github.com/llvm/llvm-project/pull/98892
2024-07-19 12:02:25 +01:00
Craig Topper
52d947b5c1
[LV] Remove unnecessary variable from InnerLoopVectorizer::createBitOrPointerCast. NFC
DstVTy is already a VectorType, we don't need to cast it again. This
used to be a cast to FixedVectorType that was changed to support
scalable vectors.
2024-07-18 12:54:40 -07:00
Florian Hahn
371777695f
[LV] Assert uniform recipes don't get predicated for when vectorizing.
Add assertion ensuring invariant on construction, split off as suggested
from https://github.com/llvm/llvm-project/pull/98892.
2024-07-18 17:43:51 +01:00
Alexey Bataev
1a80153ba9
[LV][NFC]Simplify the structure and improve message of safe distance analysis for scalable vectorization. (#99487)
2024-07-18 10:11:39 -04:00
Florian Hahn
2bb65660ae
[LV] Allow re-processing of operands of instrs feeding interleave group
Follow up to d216615518 to update dead interleave group pointer detection
to allow re-processing of operands of instructions determined to only feed
interleave groups.

This is needed because instructions feeding interleave group pointers
can become dead in any order, as per the newly added test case.
2024-07-17 21:37:28 +01:00
Florian Hahn
75b3ddf23b
[VPlan] Use State.VF in vectorizeInterleaveGroup (NFCI).
Update vectorizeInterleaveGroup to use State.VF in preparation to moving
the code directly to the recipe.
2024-07-17 14:30:19 +01:00
Alexey Bataev
8156be684d
[LV][NFC]Introduce isScalableVectorizationAllowed() to refactor getMaxLegalScalableVF().
Adds isScalableVectorizationAllowed() and the corresponding data member
to query if the scalable vectorization is supported rather than
performing the analysis each time the scalable vector factor is
requested.

Part of https://github.com/llvm/llvm-project/pull/91403

Reviewers: ayalz, fhahn

Reviewed By: fhahn, ayalz

Pull Request: https://github.com/llvm/llvm-project/pull/98916
2024-07-17 07:16:13 -04:00