Update planContainsAdditionalSimplifications to also check phis not in
the loop header. This ensures we don't miss cases where VPBlendRecipes
(which correspond to such phis) have been simplified.
Fixes https://github.com/llvm/llvm-project/issues/107473.
Similar to VFxUF, also add a VF VPValue to VPlan and use it to get the
runtime VF in VPWidenIntOrFpInductionRecipe. Code for VF is only
generated if there are users of VF, to avoid unnecessary test changes.
PR: https://github.com/llvm/llvm-project/pull/95305
There are some cases where only the first operand is marked for
truncation. In that case, the compare won't be truncated which would
incorrectly trigger the assertion.
It also shows that the check before 3fe6a064f15c treated some compares
as truncated even though they cannot be truncated.
The current check for truncated compares in getInstructionCost misses
cases where either the first or both operands are constants.
Check directly if the compare is marked for truncation. In that case,
the minimum bitwidth is that of the operands.
The patch also adds asserts to ensure that.
This fixes a divergence between legacy and VPlan-based cost model, where
the legacy cost model incorrectly estimated the cost of compares with
truncated operands.
Fixes https://github.com/llvm/llvm-project/issues/107171.
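The check described above can be sketched roughly as follows. This is a standalone model, not the code in getInstructionCost: MinBWMap, compareCostBitWidth and the string keys are hypothetical stand-ins for the MinBWs lookup, and the assert mirrors the one the patch mentions under the assumption that a marked first operand shares the compare's width.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in: maps an instruction name to its minimal bit width.
// Instructions that are not marked for truncation are simply absent.
using MinBWMap = std::map<std::string, unsigned>;

// Returns the bit width at which a compare should be costed.
unsigned compareCostBitWidth(const MinBWMap &MinBWs, const std::string &Cmp,
                             const std::string &Op0, unsigned OrigBits) {
  auto CmpIt = MinBWs.find(Cmp);
  // Only the compare's own entry matters; a marked operand is not enough.
  if (CmpIt == MinBWs.end())
    return OrigBits;
  // Assumption: if the first operand is marked as well, the widths agree.
  auto Op0It = MinBWs.find(Op0);
  assert((Op0It == MinBWs.end() || Op0It->second == CmpIt->second) &&
         "compare and first operand must share the minimal bit width");
  (void)Op0It; // silence unused warnings when asserts are disabled
  return CmpIt->second;
}
```

Keying the decision on the compare itself means constant operands, which never appear in the map, no longer hide a truncated compare.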
Similarly to dd94537b4, setVectorizedCallDecision also did not consider
ForcedScalars. This led to VPlans not reflecting the decision by the
legacy cost model (cost computation would use scalar cost, VPlan would
have VPWidenCallRecipe).
To fix this, check if the call has been forced to scalar in
setVectorizedCallDecision.
Note that this requires moving setVectorizedCallDecision after
collectLoopUniforms (which sets ForcedScalars). collectLoopUniforms does
not depend on call decisions and can safely be moved.
Fixes https://github.com/llvm/llvm-project/issues/107051.
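A minimal sketch of the ordering constraint and the new check, under simplifying assumptions: CostModelSketch is a hypothetical type, calls are identified by name, and ForcedScalars is a single set rather than per-VF state. The two member functions only borrow the names used above and are not the real implementations.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class CallDecision { Scalarize, VectorCall };

// Hypothetical, heavily simplified stand-in for the cost model state.
struct CostModelSketch {
  std::set<std::string> ForcedScalars;
  bool UniformsCollected = false;

  // Stand-in for the step that populates ForcedScalars.
  void collectLoopUniforms(const std::set<std::string> &Forced) {
    ForcedScalars = Forced;
    UniformsCollected = true;
  }

  // A forced-scalar call is always scalarized, even if a vector call would
  // be cheaper; otherwise the cheaper option wins.
  CallDecision setVectorizedCallDecision(const std::string &Call,
                                         unsigned ScalarCost,
                                         unsigned VectorCost) const {
    assert(UniformsCollected && "ForcedScalars must be computed first");
    if (ForcedScalars.count(Call))
      return CallDecision::Scalarize;
    return VectorCost < ScalarCost ? CallDecision::VectorCall
                                   : CallDecision::Scalarize;
  }
};
```

The assert makes the required ordering explicit: the call decision is only meaningful once the forced-scalar set has been filled in.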
collectInstsToScalarize may decide to scalarize a call. If so, we have
to update the widening decision for the call, otherwise the call won't
be scalarized as expected during VPlan construction.
This issue was uncovered by f82543d509.
This moves the logic to create simplified operands using SCEV to MUL
recipe creation. This is needed to match the behavior of the legacy cost
model. TODOs are to extend to other opcodes and move to a transform.
Note that this also restricts the number of SCEV simplifications we
apply to more precisely match the cases handled by the legacy cost
model.
Fixes https://github.com/llvm/llvm-project/issues/107015.
Branches exiting the loop will remain regardless, so don't consider them
in collectValuesToIgnore.
This fixes another divergence between legacy and VPlan-based cost model.
Fixes https://github.com/llvm/llvm-project/issues/106780.
An optimizable cast can also be removed by VPlan simplifications. Remove
the restriction from planContainsAdditionalSimplifications, as this
causes it to miss relevant simplifications, triggering false positives
for the cost decision verification.
Also adds debug output for printing additional cost-precomputations.
Fixes https://github.com/llvm/llvm-project/issues/106641.
This ensures we skip any instructions identified to be ignored by the
legacy cost model as well. Fixes a divergence between legacy and
VPlan-based cost model.
Fixes https://github.com/llvm/llvm-project/issues/106417.
Improve operand analysis using SCEV for cost purposes. This fixes a
divergence between legacy and VPlan-based cost-modeling after
533e6bbd0d34.
Fixes https://github.com/llvm/llvm-project/issues/106248.
Live-ins that are used as exit values don't need to be extracted, they
can be passed through directly. This fixes a crash when trying to
extract from a live-in.
Fixes https://github.com/llvm/llvm-project/issues/106257.
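The pass-through rule can be illustrated with a small standalone model. ValueSketch and exitValue are hypothetical names; a live-in is modeled as a single scalar and a widened value as a vector of lanes, with the exit value taken from the last lane.

```cpp
#include <cassert>
#include <vector>

// Hypothetical value model: a live-in carries one scalar defined outside the
// loop, any other value carries one entry per vector lane.
struct ValueSketch {
  bool IsLiveIn;
  int Scalar;             // meaningful only if IsLiveIn
  std::vector<int> Lanes; // meaningful only if !IsLiveIn
};

// Returns the value that should feed the exit phi.
int exitValue(const ValueSketch &V) {
  // Live-ins are loop-invariant scalars, so there is nothing to extract.
  if (V.IsLiveIn)
    return V.Scalar;
  assert(!V.Lanes.empty() && "cannot extract from a value without lanes");
  return V.Lanes.back(); // extract the final lane
}
```

Without the early return, a live-in would reach the extract path with no lanes to extract from, which corresponds to the crash described above.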
This is a step towards further breaking up the rather large
tryToBuildVPlanWithVPRecipes. It moves the logic to create interleave
groups to VPlanTransforms.cpp, where similar replacements for other
recipes are defined as well (e.g. EVL-based ones).
Don't consider the cost of branches marked to be skipped in VPlan cost
pre-computation. Those aren't included in the legacy cost, so they
should not be included in the VPlan cost.
This patch fixes:
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7245:1: error:
unused function 'planContainsAdditionalSimplifications'
[-Werror,-Wunused-function]
There are cases where VPlans contain some simplifications that are very
hard to accurately account for up-front in the legacy cost model. Those
cases are caused by un-simplified inputs, which trigger the assert
ensuring both the legacy and VPlan-based cost model agree on the VF.
To avoid false positives due to missed simplifications in general, only
trigger the assert if the chosen VPlan doesn't contain any additional
simplifications.
Fixes https://github.com/llvm/llvm-project/issues/104714.
Fixes https://github.com/llvm/llvm-project/issues/105713.
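The relaxed verification can be sketched as below. PlanSketch, isUnexpectedMismatch and verifyVFDecision are hypothetical names, and the result of planContainsAdditionalSimplifications is modeled as a precomputed flag rather than an actual scan of the plan.

```cpp
#include <cassert>

// Hypothetical summary of the chosen plan.
struct PlanSketch {
  unsigned BestVF;                   // VF picked by the VPlan-based cost model
  bool HasAdditionalSimplifications; // stand-in for the plan scan result
};

// A disagreement only counts if the plan has no simplifications that the
// legacy cost model could not account for up-front.
bool isUnexpectedMismatch(unsigned LegacyVF, const PlanSketch &Plan) {
  if (LegacyVF == Plan.BestVF)
    return false;
  return !Plan.HasAdditionalSimplifications;
}

// Debug-only cross-check between the two cost models.
void verifyVFDecision(unsigned LegacyVF, const PlanSketch &Plan) {
  assert(!isUnexpectedMismatch(LegacyVF, Plan) &&
         "VPlan-based and legacy cost model disagreed on the VF");
  (void)LegacyVF; // silence unused warnings when asserts are disabled
  (void)Plan;
}
```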
Move VPWiden[Load|Store]EVLRecipe::execute to VPlanRecipes.cpp in line
with other ::execute implementations that don't depend on anything
defined in LoopVectorize.cpp.
It's not possible to pick the best mask to remove when optimising
VPBlend at construction and so this patch refactors the code to move the
decision (and thus transformation) to VPlanTransforms.
NOTE: This patch does not change the decision of which mask to pick.
That will be done in a following PR to keep this patch as NFC from an
output point of view.
Use getBestVF to select VF up-front and only use
selectVectorizationFactor to get the legacy VF to check the
vectorization decision matches the VPlan-based cost model.
PR: https://github.com/llvm/llvm-project/pull/103033
Introduce explicit ExtractFromEnd recipes to extract the final values
for live-outs instead of implicitly extracting in VPLiveOut::fixPhi.
This is a follow-up to the recent changes of modeling extracts for
recurrences and consolidates live-out extract creation for fixed-order
recurrences at a single place: addLiveOutsForFirstOrderRecurrences.
It is also in preparation of replacing VPLiveOut with VPIRInstructions
wrapping the original scalar phis.
PR: https://github.com/llvm/llvm-project/pull/100658
As suggested in https://github.com/llvm/llvm-project/pull/103033, more
accurately rename to getPlanFor, as it simply returns the VPlan for
VF, relying on the fact that there is a single VPlan for each VF at the
moment.
As suggested in https://github.com/llvm/llvm-project/pull/103033, add a
remark when the UserVF is ignored due to it being larger than MaxUserVF.
Only changes behavior of diagnostic/debug output.
Members not requiring access to LoopVectorizationLegality or
LoopVectorizationCostModel can safely be moved out of the very large
LoopVectorize.cpp and are more accurately placed in VPlan.cpp.
Update the legacy cost model to skip branches with successor blocks
that are empty or only contain dead instructions, together with their
conditions. Such branches and conditions won't result in any
generated code and will be cleaned up by VPlan transforms.
This fixes a difference between the legacy and VPlan-based cost model.
When running LV in its usual pipeline position, such dead blocks should
already have been cleaned up, but they might be generated manually or by
fuzzers.
Fixes https://github.com/llvm/llvm-project/issues/100591.
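One possible reading of the rule above, as a standalone sketch. BlockSketch, generatesNoCode and canSkipBranchCost are hypothetical names, and the sketch assumes that a branch is skipped only when every one of its successors is empty or holds only dead instructions; the actual condition in the legacy cost model may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical block model: one flag per non-terminator instruction, true if
// that instruction is dead.
struct BlockSketch {
  std::vector<bool> InstIsDead;
};

// True for an empty block and for a block that only holds dead instructions.
bool generatesNoCode(const BlockSketch &BB) {
  return std::all_of(BB.InstIsDead.begin(), BB.InstIsDead.end(),
                     [](bool Dead) { return Dead; });
}

// Assumption: the branch (and its condition) is skipped only if every
// successor generates no code.
bool canSkipBranchCost(const std::vector<BlockSketch> &Successors) {
  assert(!Successors.empty() && "a branch has at least one successor");
  return std::all_of(Successors.begin(), Successors.end(), generatesNoCode);
}
```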
Implement VPWidenRecipe::computeCost for most cases (except
UDiv, SDiv, URem, and SRem, which require additional logic).
Note that this specializes `::computeCost` instead of `::cost`, as
`VPRecipeBase::cost` is responsible for skipping cost-computations
for pre-computed recipes for now.
The most recent version of the VPlan-based cost model introduction
has been committed on Jul 10 (b841e2eca3b5c8b) and we should
probably give it at least a week in case additional mismatches surface.
PR: https://github.com/llvm/llvm-project/pull/98764
Move VPWidenStoreRecipe::execute to VPlanRecipes.cpp in line with
other ::execute implementations that don't depend on anything
defined in LoopVectorize.cpp.
Now that the branches to the scalar epilogue are modeled in VPlan
directly, check the VPlan to see if a scalar epilogue is required.
Preparation for https://github.com/llvm/llvm-project/pull/100658.
Include chain of ops feeding inductions in cost precomputation for
inductions, not just the induction increment. In VPlan, those
instructions will be cleaned up, as both phi and increment are generated
by VPWidenIntOrFpInductionRecipe independently.
Fixes https://github.com/llvm/llvm-project/issues/101337.
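Collecting the chain of ops can be sketched as a worklist walk over operands. The model below is hypothetical (InstSketch, collectInductionChain, integer indices instead of instruction pointers), and it assumes that an operand joins the chain only if it lives inside the loop and has exactly one use, so that it becomes dead once the induction is replaced.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction model; operands are indices into the same table.
struct InstSketch {
  std::vector<int> Operands;
  bool InLoop;
  unsigned NumUses;
};

// Returns the induction phi, its increment, and every in-loop, single-use
// instruction that transitively feeds them.
std::vector<int> collectInductionChain(const std::vector<InstSketch> &Insts,
                                       int Phi, int Inc) {
  assert(Phi >= 0 && Inc >= 0 && "phi and increment must be valid indices");
  std::vector<int> Chain = {Phi, Inc};
  // Chain grows while we iterate, so index into it instead of using iterators.
  for (std::size_t I = 0; I != Chain.size(); ++I) {
    for (int Op : Insts[Chain[I]].Operands) {
      const InstSketch &OpI = Insts[Op];
      bool Seen = std::find(Chain.begin(), Chain.end(), Op) != Chain.end();
      if (Seen || !OpI.InLoop || OpI.NumUses != 1)
        continue;
      Chain.push_back(Op);
    }
  }
  return Chain;
}
```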
After f0df4fbd0c7b, isPredicatedInst needs to handle SwitchInst as well.
Handle it the same as BranchInst.
This fixes a crash in the newly added test and improves the results for
one of the existing tests in predicate-switch.ll
Should fix https://lab.llvm.org/buildbot/#/builders/113/builds/2099.
Update createEdgeMask to create masks where the terminator in Src is a
switch. We need to handle 2 separate cases:
1. Dst is not the default destination. Dst is reached if any of the
cases with destination == Dst are taken. Join the conditions for each
case where destination == Dst using a logical OR.
2. Dst is the default destination. Dst is reached if none of the cases
with destination != Dst are taken. Join the conditions for each case
where the destination is != Dst using a logical OR and negate it.
Edge masks are created for every destination of cases and/or
default when requesting a mask where the source is a switch.
Fixes https://github.com/llvm/llvm-project/issues/48188.
PR: https://github.com/llvm/llvm-project/pull/99808
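The two cases above can be sketched for a single scalar condition as follows. This is a standalone model rather than the createEdgeMask implementation: SwitchSketch, isSuccessor and edgeTaken are hypothetical names, and blocks are plain integer ids.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a switch terminator.
struct SwitchSketch {
  std::vector<std::pair<std::int64_t, int>> Cases; // (case value, destination)
  int DefaultDest;
};

// Returns true if Dst is the default destination or the target of some case.
bool isSuccessor(const SwitchSketch &SW, int Dst) {
  if (Dst == SW.DefaultDest)
    return true;
  for (const auto &C : SW.Cases)
    if (C.second == Dst)
      return true;
  return false;
}

// Returns true if the edge from the switch to Dst is taken for Cond.
bool edgeTaken(const SwitchSketch &SW, std::int64_t Cond, int Dst) {
  assert(isSuccessor(SW, Dst) && "Dst must be a successor of the switch");
  const bool DstIsDefault = (Dst == SW.DefaultDest);
  bool AnyCaseHit = false;
  for (const auto &C : SW.Cases) {
    // Case 1 ORs the cases that branch to Dst; case 2 ORs the cases that
    // branch anywhere else.
    const bool Relevant = DstIsDefault ? (C.second != Dst) : (C.second == Dst);
    if (Relevant)
      AnyCaseHit = AnyCaseHit || (Cond == C.first);
  }
  // Case 2 negates the joined condition.
  return DstIsDefault ? !AnyCaseHit : AnyCaseHit;
}
```

In the vectorizer the same logic is applied per lane, with the equality tests and the OR becoming vector operations that produce the mask.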
Move VPWidenLoadRecipe::execute to VPlanRecipes.cpp in line with
other ::execute implementations that don't depend on anything
defined in LoopVectorize.cpp.
These are the final few places in LLVM where we use instruction pointers
to identify the position that we're inserting something. We're trying to
get away from that with a view to deprecating those methods, thus use
iterators in all these places. I believe they're all debug-info safe.
The sketchiest part is the ExtractValueInst copy constructor, where we
cast nullptr to a BasicBlock pointer, so that we take the non-default
insert-into-no-block path for instruction insertion, instead of the
default nullptr-instruction path for UnaryInstruction. Such a hack is
necessary until we get rid of the instruction constructor entirely.
Move VPWidenPointerInductionRecipe::execute to VPlanRecipes.cpp in line
with other ::execute implementations that don't depend on anything
defined in LoopVectorize.cpp.