Before this patch, a VPlan contained 2 mappings for Values -> VPValue:
1) Value2VPValue and 2) VPExternalDefs.
This duplication is unnecessary and there are already cases where
external defs are added to Value2VPValue. This patch replaces all uses
of VPExternalDefs with Value2VPValue.
It clarifies the naming of getOrAddVPValue (to getOrAddExternalVPValue)
and addVPValue (to addExternalVPValue).
At the moment, this is NFC, but will enable additional simplifications
in D147783.
Depends on D147891.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147892
Update the isCanonical() implementations to check the VPValue step
operand instead of the step in the induction descriptor.
At the moment this is NFC, but it enables further optimizations if the
step is replaced by a constant in D147783.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147891
Building on D142885 and D142589, retire the SinkAfter map from the
recurrence handling code. It is replaced by checking whether it is
possible to sink all users of a recurrence directly in VPlan. This
results in simpler code overall and allows to handle additional cases
(see the improvements in @test_crash).
Depends on D142885.
Depends on D142589.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142886
The function doesn't use anything from VPRecipeBuilder, so move the
definition to where it is actually used and turn it into a simple static
function.
It also makes the VPRecipeBuilder argument for createAndOptimizeReplicateRegions
unnecessary.
There is no need to store information about invariance in the recipe.
Replace the fields with checks of the operands using
isDefinedOutsideVectorRegions.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D144489
There is no need to store information about invariance in the recipe.
Replace the fields with checks of the operands using
isDefinedOutsideVectorRegions.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D144487
This patch adds the predicate as additional operand to VPReplicateRecipe
during initial construction. The predicated recipes are later moved into
replicate regions. This simplifies constructions and some VPlan
transformations, like fixed-order recurrence handling.
It also improves codegen in some cases (e.g. for in-loop reductions),
because the recipes remain in the same block.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D143865
At the moment, properlyDominates(A, A) can return true via
LocalComesBefore. Add an early exit to ensure it returns false if
A == B.
Note: no test has been added because the existing test suite covers this
case already with libc++ with assertions enabled.
Fixes https://github.com/llvm/llvm-project/issues/60850.
The current implementation may return true for A < B and B < A, which
may cause issues if the sort implementation assures this property of the
comperator. This should fix a crash with MSVC.
adjustFixedOrderRecurrences may insert instructions after immediately
after the PHI nodes in the block. This invalidates the phis() iterator.
To avoid crashing/accessing invalid recipes, first collect all
first-order recurrence phi recipes.
This should fix a crash reported by @dmgreen after D142589 landed.
This patch updates LV to sink recipes directly using the VPlan use
chains. The initial patch only moves sinking to be purely VPlan-based.
Follow-up patches will move legality checks to VPlan as well.
At the moment, there's a single test failure remaining.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142589
Update sinkScalarOperands to consider all operands of recipes in
replicate blocks as sink candidates This enables additional sinking
opportunities and is another step towards retiring LLVM IR-based
sinkScalarOperands.
This enables iterative sinking of operands for successive calls of
sinkScalarOperands.
Depends on D139788.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D139790
Similar to vp_depth_first_shallow (D140512) add vp_depth_first_deep to
make existing code clearer and more compact.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142055
This reverts commit aa2414729ebbcb2d8f162e9002a3a6aa768b1f9d.
Previously-valid IR from a tensorflow test case (as shown on the
Diffusion revision for aa2414729ebbcb2d8f162e9002a3a6aa768b1f9d) started
hanging in the loop-vectorize pass. Reverting to keep everyone working.
Adjust mergeReplicateRegions to be in line with
mergeBlocksIntoPredecessors added in 36d70a6aea6b by collecting only the
valid candidates first.
Also rename to mergeReplicateRegionsIntoSuccessors and add missing
doc-comment.
This addresses post-commit suggestions by @Ayal.
This reduces the size of VPlan.h and avoids future growth of the file
when the graph traits are extended in future patches.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D140500
Even if the the sink candidate is already in the target block, its
operands can be candidates for sinking. Queue them up as well. Also
moves the queuing logic to a helper.
Merging regions can enable new sinking opportunities (e.g. if users of a
scalar value are moved from different VPBBs into the same VPBB). Sinking
in turn can also enable new merging opportunities (e.g. if a recipe
between to merge-able regions is moved.
To enable more sinking opportunities, repeat sinking & merging if
regions could be merged.
Also fix mergeReplicateRegions to return the correct Changed status.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D139788
Add and run VPlan transform to fold blocks with a single predecessor
into the predecessor. This remove redundant blocks and addresses a TODO
to replace special handling for the vector latch VPBB.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D139927
Just comparing constant trip counts causes LV to miss cases where the
vector loop body only executes once.
The motivation for this is to remove the need for unrolling to remove
vector loop back-edges, if the body only executes once in more cases.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D133017
This sets the stage for D133017 by moving out the code that performs
VPlan based simplifications to a separate transform that takes the
chosen VF & UF as arguments.
The main advantage is that this transform runs before any changes to
the CFG are being made. This allows using SCEV without worrying about
making queries while the IR is in an incomplete state.
Note that this patch switches the reasoning to use SCEV, but still only
simplifies loops with constant trip counts. Using SCEV here is needed to
access the backedge taken count, because the trip count IR value has not
been created yet.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D135017
In scalar plans, replicate recipes will only generate a single value per
UF, independent of whether they are uniform or not. So don't consider
uniformity for plans with scalar VFs only.
This allows us to handle a few additional cases in VPlan sinking instead
of non-VPlan sinkScalarOperands.
Depends on D133762.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D134218
This patch extends VP-based sinking to also sink VPScalarStepsRecipe.
This takes us a step closer towards retiring the IR based sinking.
The main change is extending VPScalarIVStepsRecipe::execute to support
executing in a replicate-region.
Depends on D133758.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D133760
This reverts commit bf15f1e489aa2f1ac13268c9081a992a8963eb5b.
The updated version fixes a crash by checking the induction kind instead
of the opcode; for integer inductions, the step is always added, but the
opcode might not be set.
This patch splits off the logic to transform the canonical IV to a
a value for an induction with a different start and step. This
transformation only needs to be done once (independent of VF/UF) and
enables sinking of VPScalarIVStepsRecipe as follow-up.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D133758
Replace custom code to check if only the first lane is used by generic
helper `onlyFirstLaneUsed`. This enables VPlan-based sinking in a few
additional cases and was suggested in D133760.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D136368
This patch moves the cost-based decision whether to use an intrinsic or
library call to the point where the recipe is created. This untangles
code-gen from the cost model and also avoids doing some extra work as
the information is already computed at construction.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D132585
In some cases, there may be widened users of inductions even though the
plan includes the scalar VF. In those cases, make sure we still replace
the VPWidenIntOrFpInductionRecipe with scalar steps, as otherwise we may
try to execute a VPWidenIntOrFpInductionRecipe with a scalar VF.
Alternatively the patch could also split the range if needed.
This fixes a crash exposed by D123720.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D128755
All information is already available in VPlan. Note that there are some
test changes, because we now can correctly look through instructions
like truncates to analyze the actual users.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D123541
This reverts commit 266ea446ab747671eb6c736569c3c9c5f3c53d11.
The reasons for the revert have been addressed by cleaning up condition
handling in VPlan and properly marking VPBranchOnMaskRecipe as using
scalars.
The test case for the revert from D123720 has been added in 3d663308a5d.
This patch removes CondBit and Predicate from VPBasicBlock. To do so,
the patch introduces a new branch-on-cond VPInstruction opcode to model
a branch on a condition explicitly.
This addresses a long-standing TODO/FIXME that blocks shouldn't be users
of VPValues. Those extra users can cause issues for VPValue-based
analyses that don't expect blocks. Addressing this fixme should allow us
to re-introduce 266ea446ab7476.
The generic branch opcode can also be used in follow-up patches.
Depends on D123005.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D126618
This patch updates the VPlan native path to use VPRegionBlocks for all
loops in a loop nest. Up to now, only the outermost loop used a region.
This is a step towards unifying both paths and keep things consistent
between them. It also prepares various code-gen parts for modeling the
pre-header in the inner loop vectorizer (D121624).
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D123005
This patch introduces a new VPLiveOut subclass of VPUser to model
exit values explicitly. The initial version handles exit values that
are neither part of induction or reduction chains nor first order
recurrence phis.
Fixes#51366, #54867, #55167, #55459
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D123537
Those helpers model properties of a user and they should also be
available to non-recipe users. This will be used in D123537 for a new
exit value user.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D124936
This reverts commit 8b48223447311af8b3022697dd58858e1ce6975f.
This triggers an assertion on a test case mentioned in D123720.
Revert while I investigate.
This reverts commit f4e1eaa3755a13f85696be3b74b387122b74a558.
The patch was originally reverted because it uncovered an issue that has
now been fixed in 0ef8ca6d88aa7e4abc.
This reverts commit 43842b887e0a7b918bb2d6c9f672025b2c621f8a while I
investigate a buildbot failure.
It also reverts the follow-up commit
2883de05145fc5b4afb99b91f69ebb835af36af5.
Remove one of the last remaining uses of ::needsVectorIV, preparing for
its removal. Now that usesScalars is available and based on the
information explicit in VPlan, there is no need to use the pre-computed
needsVectorIV.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D123720
This patch moves SCEV expansion of steps used by
VPWidenIntOrFpInductionRecipes to the pre-header using
VPExpandSCEVRecipe. This ensures that those steps are expanded while the
CFG is in a valid state. Previously, SCEV expansion may happen during
vector body code-generation, during which the CFG may be invalid,
causing issues with SCEV expansion.
Depends on D122095.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D122096