144 Commits

Author SHA1 Message Date
Andrei Elovikov
f96c1ccc1e
[VPlan] Add -vplan-print-after= option (#178700)
UpdateTestChecks support is updated in subsequent
https://github.com/llvm/llvm-project/pull/178736.
2026-02-10 16:07:25 +00:00
Florian Hahn
7509cad693
[VPlan] Support masked VPInsts, use for predication (NFC) (#142285)
Add support for mask operands to most VPInstructions, using
getNumOperandsForOpcode.

This allows VPlan predication to predicate VPInstructions directly. The
mask will then be dropped or handled when creating wide recipes.

Depends on https://github.com/llvm/llvm-project/pull/142284.
Depends on https://github.com/llvm/llvm-project/pull/168784.

PR: https://github.com/llvm/llvm-project/pull/142285
2026-02-08 18:23:36 +00:00
Florian Hahn
05a2b146fb
[LV] Optimize FindLast recurrences to FindIV (NFCI). (#177870)
This patch restructures Find(First|Last)IV handling. Instead of
differentiating between FindLast, FindFirstIV and FindLastIV up front,
this patch simplifies the logic in IVDescriptor to just identify the
FindLast pattern up-front.

It then adds a new VPlan transformation to optimize FindLast reductions
to FindIV reductions if there is a suitable sentinel value.
Find(Last|First)IV recurrence kinds to a single FindIV kind.

This is simpler and more accurate, given selecting the first/last
induction of the final IV reduction is directly controlled by the
corresponding recurrence kind of the ComputeReductionResult.

The new structure also allows further optimizations, like vectorizing
FindLastIV with another boolean reduction that tracks if the condition
in the loop was ever true, if there is no suitable sentinel value.

PR: https://github.com/llvm/llvm-project/pull/177870
2026-02-05 13:57:20 +00:00
Luke Lau
bb14eabaca
[VPlan] Split out EVL exit cond transform from canonicalizeEVLLoops. NFC (#178181)
This is split out from #177114.

In order to make canonicalizeEVLLoops a generic "convert to variable
stepping" transform, move the code that changes the exit condition to a
separate transform since not all variable stepping loops will want to
transform the exit condition. Run it before canonicalizeEVLLoops before
VPEVLBasedIVPHIRecipe is expanded.

Also relax the assertion for VPInstruction::ExplicitVectorLength to just
bail instead, since eventually VPEVLBasedIVPHIRecipe will be used by
other loops that aren't EVL tail folded.
2026-02-02 04:45:43 +00:00
Florian Hahn
90b3712d8a
Reapply "[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851)"
This reverts commit d1e477b00b49c63ff4dd513eeb14a5b18bc055d7.

Recommit with a extra checks making sure extends are VPWidenCastRecipes,
rejecting VPReplicateRecipes.

Original message:
As a first step, move the existing partial reduction detection logic to
VPlan, trying to preserve the existing code structure & behavior as
closely as possible.

With this, partial reductions are detected and created together in a
single step.

This allows forming partial reductions and bundling them up if
profitable together in a follow-up.

PR: https://github.com/llvm/llvm-project/pull/167851
2026-02-01 16:27:27 +00:00
Martin Storsjö
d1e477b00b Revert "[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851)"
This reverts commit f4e8cc1a2229dca76d21c8d37439c4c194b06b86.

This change wasn't NFC; it causes failed asserts when building
ffmpeg for i686 windows, see
https://github.com/llvm/llvm-project/pull/167851 for details.
2026-02-01 14:35:02 +02:00
Florian Hahn
f4e8cc1a22
[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851)
As a first step, move the existing partial reduction detection logic to
VPlan, trying to preserve the existing code structure & behavior as
closely as possible.

With this, partial reductions are detected and created together in a
single step.

This allows forming partial reductions and bundling them up if
profitable together in a follow-up.

PR: https://github.com/llvm/llvm-project/pull/167851
2026-01-31 19:44:46 +00:00
Andrei Elovikov
d8621d665d
Reapply "[VPlan] Add hidden -vplan-print-after-all option" (#178547)
Re-commit of https://github.com/llvm/llvm-project/pull/175839 after
fixing build without `LLVM_ENABLE_DUMP`.

This consists of the following changes:

* Merge several overloads of `VPlanTransforms::runPass` into a single
function to avoid code duplication.

* Add helper macro `RUN_VPLAN_PASS` to capture the transformation name
  and pass it to the helper above for printing.

* Add new `-vplan-print-after-all` option (somewhat similar to existing
  `-vplan-verify-each`).

* Add two empty passes `printAfterInitialConstruction`/`printFinalVPlan`
so that initial/final VPlans would be supported in `-vplan-print-after-all`

This follows the original future plans in
https://github.com/llvm/llvm-project/pull/123640.
2026-01-30 19:55:09 +00:00
Florian Hahn
eabcdb572b
Revert "[VPlan] Add hidden -vplan-print-after-all option (#175839)" (#178544)
This reverts commit 97e1df149de213b760aae4060ee9e25dc9908125.

It looks like the commit caused some build bot failures. Revert back to green
so the failures can be investigated.

https://lab.llvm.org/buildbot/#/builders/159/builds/39803
https://lab.llvm.org/buildbot/#/builders/2/builds/43204
2026-01-28 23:49:24 +00:00
Andrei Elovikov
97e1df149d
[VPlan] Add hidden -vplan-print-after-all option (#175839)
This consists of the following changes: 
        
* Merge several overloads of `VPlanTransforms::runPass` into a single
function
  to avoid code duplication.
  
* Add helper macro `RUN_VPLAN_PASS` to capture the transformation name
  and pass it to the helper above for printing.

* Add new `-vplan-print-after-all` option (somewhat similar to existing
  `-vplan-verify-each`).
  
* Add two empty passes `printAfterInitialConstruction`/`printFinalVPlan`
so that initial/final
   VPlans would be supported in `-vplan-print-after-all`

This follows the original future plans in
https://github.com/llvm/llvm-project/pull/123640.
2026-01-28 22:25:54 +00:00
Graham Hunter
2abd6d6d7a
[LV] Vectorize conditional scalar assignments (#158088)
Based on Michael Maitland's previous work:
https://github.com/llvm/llvm-project/pull/121222

This PR uses the existing recurrences code instead of introducing a
new pass just for CSA autovec. I've also made recipes that are more
generic.
2026-01-14 14:59:18 +00:00
Luke Lau
0ae23ca9e6
[VPlan] Split out optimizeEVLMasks. NFC (#174925)
Addresses part of #153144 and splits off part of #166164

There are two parts to the EVL transform:

1) Convert the loop so the number of elements processed each iteration
is EVL, not VF. The IV and header mask are replaced with EVL-based
variants.
2) Optimize users of the EVL based header mask to VP intrinsic based
recipes.

(1) changes the semantics of the vector loop region, whereas (2) needs
to preserve them. This splits (2) out so we don't mix the two up, and
allows us to move (1) earlier in the pipeline in a future PR.
2026-01-14 07:01:14 +00:00
Florian Hahn
d27d75ee94
[VPlan] Use createHeaderPHIRecipes in native path (NFCI).
Simplify tryToBuildVPlan by using createHeaderPHIRecipes in the native
path as well.
2026-01-13 20:12:21 +00:00
Florian Hahn
524b1788c4
[VPlan] Add BranchOnTwoConds, use for early exit plans. (#172750)
This PR introduces a new BranchOnTwoConds VPInstruction, that takes 2
boolean operands and must be placed in a block with 3 successors.

If condition I is true, branches to successor I, otherwise falls through
to check the next condition. If both conditions are false, branch to the
third successor.

This new branch recipe is used for early-exit loops, to simplify the
representation in VPlan initially, by avoid the need for splitting the
middle block early on, in a way that preserves the single-exit block
property of regions. All exits still go through the latch block, but
they can go to more than 2 successors.

This idea was part of one of the original proposals for how to model
early exits in VPlan, but at that point in time, there was no good way
to handle this during code-gen, and we went with the early split-middle
block approach initially.

Now that we dissolve regions before ::execute, the new recipe can be
lowered nicely after regions have been removed, to a set of VPBBs and
BranchOnCond recipes. The initial lowering preserves the original
structure with the split middle blocks. Follow-ups will improve the
lowering to avoid this splitting, providing performance gains.

PR: https://github.com/llvm/llvm-project/pull/172750
2025-12-29 19:39:38 +00:00
Florian Hahn
c2a8739cd1
[VPlan] Split off VPReductionRecipe creation for in-loop reductions (NFC) (#168784)
This patch splits off VPReductionRecipe creation for in-loop reductions
to a separate transform from adjustInLoopReductions, which has been
renamed.

The new transform has been updated to work directly on VPInstructions,
and gets applied after header phis have been processed, once on VPlan0.

Builds on top of https://github.com/llvm/llvm-project/pull/168291 and
https://github.com/llvm/llvm-project/pull/166099 which should be
reviewed first.

PR: https://github.com/llvm/llvm-project/pull/168784
2025-12-25 14:02:58 +00:00
Florian Hahn
c43ccefc9f
[VPlan] Use PSE to construct SCEVs in getSCEVExprForVPValue (NFCI).
getSCEVExprForVPValue is used to create SCEVs for expressions from the
original loop, which may be predicated. Use PSE to construct predicated
SCEVs if possible. This matches the legacy LV code behavior.

Currently should be NFC, but will enable migrating more SCEV/cost-based
computations to VPlan.

The patch requires exposing a new getPredicatedSCEV helper to
PredicatedScalarEvolution which just takes a SCEV, to avoid needing to
go through IR values, which isn't an option for getSCEVExprForVPValue.
2025-12-21 22:39:49 +00:00
Florian Hahn
83eea87a36
[VPlan] Create header phis once, after constructing VPlan0 (NFC). (#168291)
Together with https://github.com/llvm/llvm-project/pull/168289 &
https://github.com/llvm/llvm-project/pull/166099 we can construct header
phis once up front, after creating VPlan0, as the
induction/reduction/first-order-recurrence classification applies across
all VFs.

Depends on https://github.com/llvm/llvm-project/pull/168289 &
https://github.com/llvm/llvm-project/pull/166099 

PR: https://github.com/llvm/llvm-project/pull/168291
2025-12-15 22:12:10 +00:00
Florian Hahn
4b6ad11876
[VPlan] Sink predicated stores with complementary masks. (#168771)
Extend the logic to hoist predicated loads
(https://github.com/llvm/llvm-project/pull/168373) to sink predicated
stores with complementary masks in a similar fashion.

The patch refactors some of the existing logic for legality checks to be
shared between hosting and sinking, and adds a new sinking transform on
top.

With respect to the legality checks, for sinking stores the code also
checks if there are any aliasing stores that may alias, not only loads.

PR: https://github.com/llvm/llvm-project/pull/168771
2025-12-02 11:43:37 +00:00
Florian Hahn
99addbf73d
[LV] Vectorize selecting last IV of min/max element. (#141431)
Add support for vectorizing loops that select the index of the minimum
or maximum element. The patch implements vectorizing those patterns by
combining Min/Max and FindFirstIV reductions.

It extends matching Min/Max reductions to allow in-loop users that are
FindLastIV reductions. It records a flag indicating that the Min/Max
reduction is used by another reduction. The extra user is then check as
part of the new `handleMultiUseReductions` VPlan transformation.

It processes any reduction that has other reduction users. The reduction
using the min/max reduction currently must be a FindLastIV reduction,
which needs adjusting to compute the correct result:
 1. We need to find the last IV for which the condition based on the
     min/max reduction is true,
 2. Compare the partial min/max reduction result to its final value and,
 3. Select the lanes of the partial FindLastIV reductions which
     correspond to the lanes matching the min/max reduction result.

Depends on https://github.com/llvm/llvm-project/pull/140451

PR: https://github.com/llvm/llvm-project/pull/141431
2025-11-28 22:26:19 +00:00
Florian Hahn
4cc8cc81e3
[VPlan] Hoist predicated loads with complementary masks. (#168373)
This patch adds a new VPlan transformation to hoist predicated loads, if
we can prove they execute unconditionally, i.e. there are 2 predicated
loads to the same address with complementary masks. Then we are
guaranteed to execute one of them on each iteration, allowing us to
remove the mask.

The transform groups masked replicating loads by their address SCEV,
then checks if there are 2 loads with complementary mask. If that is the
case, we check if there are any writes that may alias the load address
in the blocks between the first and last load with the same address.
The transforms operates after linearizing the CFG, but before
introducing replicate regions, which means this is just checking a chain
of consecutive blocks.

Currently this only uses noalias metadata to check for no-alias (using
the helpers added in https://github.com/llvm/llvm-project/pull/166247).

Then we create an unpredicated VPReplicateRecipe at the position of the
first load, then replace all users of the grouped loads with it.

Small Alive2 proof for hoisting with complementary masks:
https://alive2.llvm.org/ce/z/kUx742

PR: https://github.com/llvm/llvm-project/pull/168373
2025-11-26 13:55:14 +00:00
Florian Hahn
080ca902c6
[VPlan] Create resume phis in scalar preheader early. (NFC) (#166099)
Create phi recipes for scalar resume value up front in addInitialSkeleton during initial construction. This will allow moving the remaining code dealing with resume values to VPlan transforms/construction.

PR: https://github.com/llvm/llvm-project/pull/166099
2025-11-22 20:45:41 +00:00
Florian Hahn
7c34848ae1
[VPlan] Hoist loads with invariant addresses using noalias metadata. (#166247)
This patch implements a transform to hoists single-scalar replicated
loads with invariant addresses out of the vector loop to the preheader
when scoped noalias metadata proves they cannot alias with any stores in
the loop.

This enables hosting of loads we can prove do not alias any stores in
the loop due to memory runtime checks added during vectorization.

PR: https://github.com/llvm/llvm-project/pull/166247
2025-11-18 09:35:48 +00:00
Florian Hahn
3cba379e3d
[VPlan] Populate and use VPIRMetadata from VPInstructions (NFC) (#167253)
Update VPlan to populate VPIRMetadata during VPInstruction construction
and use it when creating widened recipes, instead of constructing
VPIRMetadata from the underlying IR instruction each time.

This centralizes VPIRMetadata in VPInstructions and ensures metadata is
consistently available throughout VPlan transformations.

PR: https://github.com/llvm/llvm-project/pull/167253
2025-11-17 21:28:49 +00:00
Fabrice de Gans
23f6a8aeaf
Add missing LLVM_ABI annotations (#167718)
This patch updates various LLVM headers to properly add the `LLVM_ABI`
and `LLVM_ABI_FOR_TEST` annotations to build LLVM as a DLL on Windows.

This effort is tracked in #109483.
2025-11-13 09:49:39 -08:00
Luke Lau
02c68b3ef7
[VPlan] Plumb scalable register size through narrowInterleaveGroups (#167505)
On RISC-V narrowInterleaveGroups doesn't kick in because the wrong
VectorRegWidth is passed to isConsecutiveInterleaveGroup.

narrowInterleaveGroups is always passed the RGK_FixedWidthVector
register size, but on RISC-V the RGK_ScalableVector size is twice as
large because we want to use LMUL 2. This causes the `GroupSize ==
VectorRegWidth` check to fail.

This fixes it by using the scalable register size whenever the VF is
scalable and plumbing it through as a potentially scalable TypeSize.

Note that this only makes a difference when tail folding is disabled, as
narrowInterleaveGroups can't handle EVL based IVs yet.
2025-11-12 11:14:53 +00:00
Kerry McLaughlin
de3de3f143
[LV] Consider interleaving when -enable-wide-lane-mask=true (#163387)
Currently the only way to enable the use of wide active lane masks is to pass
-enable-wide-lane-mask and force both interleaving & tail-folding with additional
flags. This patch changes selectInterleaveCount to consider interleaving if wide
lane masks were requested, although the feature remains off by default.
2025-11-11 11:46:59 +00:00
Florian Hahn
bfc322dd72
Revert "[VPlan] Run narrowInterleaveGroups during general VPlan optimizations. (#149706)"
This reverts commit 8d29d09309654541fb2861524276ada6a3ebf84c.

There have been reports of mis-compiles
in https://github.com/llvm/llvm-project/pull/149706.

Revert while I investigate.
2025-10-22 21:27:11 +01:00
Florian Hahn
8d29d09309
[VPlan] Run narrowInterleaveGroups during general VPlan optimizations. (#149706)
Move narrowInterleaveGroups to to general VPlan optimization stage.

To do so, narrowInterleaveGroups now has to find a suitable VF where all
interleave groups are consecutive and saturate the full vector width.

If such a VF is found, the original VPlan is split into 2:
 a) a new clone which contains all VFs of Plan, except VFToOptimize, and
 b) the original Plan with VFToOptimize as single VF.

The original Plan is then optimized. If a new copy for the other VFs has
been created, it is returned and the caller has to add it to the list of
candidate plans.

Together with https://github.com/llvm/llvm-project/pull/149702, this
allows to take the narrowed interleave groups into account when
computing costs to choose the best VF and interleave count.

One example where we currently miss interleaving/unrolling when
narrowing interleave groups is https://godbolt.org/z/Yz77zbacz

PR: https://github.com/llvm/llvm-project/pull/149706
2025-10-21 11:37:42 +01:00
Florian Hahn
b9ce7656e9
[VPlan] Add VPInstruction to unpack vector values to scalars. (#155670)
Add a new Unpack VPInstruction (name to be improved) to explicitly
extract scalars values from vectors.

Test changes are movements of the extracts: they are no generated
together and also directly after the producer.

Depends on https://github.com/llvm/llvm-project/pull/155102 (included in
PR)

PR: https://github.com/llvm/llvm-project/pull/155670
2025-10-19 18:49:05 +00:00
Ramkumar Ramachandra
93073af121
[LV] Move 3 functions into VPlanTransforms (NFC) (#158644)
Two of them are actually transforms, and the third is a dependent
static.
2025-10-06 11:13:01 +01:00
Ramkumar Ramachandra
ff4aec5d3c
[VPlan] Deref VPlanPtr when passing to transform (NFC) (#161369)
For uniformity with other transforms.
2025-10-03 09:47:35 +01:00
Florian Hahn
2016af5652
[VPlan] Create epilogue minimum iteration check in VPlan. (#157545)
Move creation of the minimum iteration check for the epilogue vector
loop to VPlan. This is a first step towards breaking up and moving
skeleton creation for epilogue vectorization to VPlan.

It moves most logic out of EpilogueVectorizerEpilogueLoop: the minimum
iteration check is created directly in VPlan, connecting the check
blocks from the main vector loop is done as post-processing. Next steps
are to move connecting and updating the branches from the check blocks
to VPlan, as well as updating the incoming values for phis.

Test changes are improvements due to folding of live-ins.

PR: https://github.com/llvm/llvm-project/pull/157545
2025-09-25 07:13:38 +00:00
Florian Hahn
b8eaceb39b
[VPlan] Explicitly replicate VPInstructions by VF. (#155102)
Extend replicateByVF added in #142433 (aa240293190) to also explicitly
unroll replicating VPInstructions.

Now the only remaining case where we replicate for all lanes is
VPReplicateRecipes in replicate regions.

PR: https://github.com/llvm/llvm-project/pull/155102
2025-09-12 17:06:26 +01:00
Ramkumar Ramachandra
d8fd511480
[VPlan] Introduce CSE pass (#151872)
Introduce a simple common-subexpression-elimination pass at the
VPlan-level, running late during the execution of the VPlan. The
long-term vision is to get rid of the legacy non-VPlan-based cse routine
in LV, but this patch doesn't yet fully subsume it.
2025-09-02 12:23:29 +01:00
Ramkumar Ramachandra
4cf770275f
[VPlan] Introduce replaceSymbolicStrides (NFC) (#155842)
Introduce VPlanTransforms::replaceSymbolicStrides factoring some code
from LoopVectorize.
2025-09-01 09:03:46 +00:00
Florian Hahn
5faed1ad84
[VPlan] Add VPlan-based addMinIterCheck, replace ILV for non-epilogue. (#153643)
This patch adds a new VPlan-based addMinimumIterationCheck, which
replaced the ILV version for the non-epilogue case.

The VPlan-based version constructs a SCEV expression to compute the
minimum iterations, use that to check if the check is known true or
false. Otherwise it creates a VPExpandSCEV recipe and emits a
compare-and-branch.

When using epilogue vectorization, we still need to create the minimum
trip-count-check during the legacy skeleton creation. The patch moves
the definitions out of ILV.

PR: https://github.com/llvm/llvm-project/pull/153643
2025-08-26 15:52:31 +01:00
Luke Lau
f3520c538d
[VPlan] Replace EVL branch condition with (branch-on-count AVLNext, 0) (#152167)
This changes the branch condition to use the AVL's backedge value
instead of the EVL-based IV.

This allows us to emit bnez on RISC-V and removes a use of the trip
count, which should reduce register pressure.

To match phis with VPlanPatternMatch I've had to relax the assert that
the number of operands must exactly match the pattern for the Phi
opcode, and I've copied over m_ZExtOrSelf from the LLVM IR
PatternMatch.h.

Fixes #151459
2025-08-26 11:19:19 +00:00
Florian Hahn
c950a72974
[VPlan] Support scalar VF for ExtractLane and FirstActiveLane.
Extend ExtractLane and FirstActiveLane to support scalable VFs. This
allows correct handling when interleaving with VF = 1.

Alive2 proofs:
 - Fixed codegen with this patch: https://alive2.llvm.org/ce/z/8Y5_Vc
   (verifies as correct)
 - Original codegen: https://alive2.llvm.org/ce/z/twdg3X (doesn't
   verify)

Fixes https://github.com/llvm/llvm-project/issues/154967.
2025-08-25 21:45:21 +01:00
Florian Hahn
954097dd61
[VPlan] Use SCEV to check subtract in getOptimizableIVOf.
Simplify checks for IV subtractions in getOptimizableIVOf by using SCEV.
This slightly generalizes the patterns we can handle.
2025-08-23 22:00:01 +01:00
Florian Hahn
30c26dcc47
[VPlan] Create extracts for live-outs early (NFC).
Create extracts for live-outs during skeleton construction.
2025-08-23 13:28:15 +01:00
Florian Hahn
300d2c6d20
[VPlan] Move SCEV expansion to VPlan transform. (NFCI).
Move the logic to expand SCEVs directly to a late VPlan transform that
expands SCEVs in the entry block. This turns VPExpandSCEVRecipe into an
abstract recipe without execute, which clarifies how the recipe is
handled, i.e. it is not executed like regular recipes.

It also helps to simplify construction, as now scalar evolution isn't
required to be passed to the recipe.
2025-08-21 22:03:26 +01:00
Florian Hahn
7e9989390d
[VPlan] Materialize Build(Struct)Vectors for VPReplicateRecipes. (NFCI) (#151487)
Materialze Build(Struct)Vectors explicitly for VPRecplicateRecipes, to
serve their users requiring a vector, instead of doing so when unrolling
by VF.

Now we only need to implicitly build vectors in VPTransformState::get
for VPInstructions. Once they are also unrolled by VF we can remove the
code-path alltogether.

PR: https://github.com/llvm/llvm-project/pull/151487
2025-08-18 20:49:42 +01:00
Ramkumar Ramachandra
d107c29fef
[VPlan] Strip unused CanonicalIVTy arg (NFC) (#153418) 2025-08-13 15:53:56 +01:00
Florian Hahn
424258947e
[VPlan] Materialize VF and VFxUF using VPInstructions. (#152879)
Materialize VF and VFxUF computation using VPInstruction
instead of directly creating IR.

This is one of the last few steps needed to model the full vector
skeleton in VPlan.

This is mostly NFC, although in some cases we remove some unused
computations.

PR: https://github.com/llvm/llvm-project/pull/152879
2025-08-12 14:13:13 +01:00
Luke Lau
aea82a780a
[VPlan] Remove some getCanonicalIV() uses. NFC (#152969)
A lot of time getCanonicalIV() is used to get the canonical IV type,
e.g. to instantiate a VPTypeAnalysis or to get the LLVMContext.

However VPTypeAnalysis has a constructor that takes the VPlan directly
and there's a method on VPlan to get the LLVMContext directly, so use
those instead where possible.

This lets us remove a constructor on VPTypeAnalysis.

Also remove an unused LLVMContext argument in UnrollState whilst we're
here.
2025-08-11 18:12:05 +08:00
Florian Hahn
06fd0f9d65
[VPlan] Move initial skeleton construction earlier (NFC). (#150848)
Split up the not clearly named prepareForVectorization transform into
buildVPlan0, which adds the vector preheader, middle and scalar
preheader blocks, as well as the canonical induction recipes and sets
the trip count. The new transform is run directly after building the
plain CFG VPlan initially.

The remaining code handling early exits and adding the branch in the
middle block is renamed to handleEarlyExitsAndAddMiddleCheck and still
runs at the original position.

With the code movement, we only have to add the skeleton once to the
initial VPlan, and cloning will take care of the rest. It will also
enable moving other construction steps to work directly on VPlan0, like
adding resume phis.

PR: https://github.com/llvm/llvm-project/pull/150848
2025-08-09 20:54:42 +01:00
Florian Hahn
82d633e9ff
[VPlan] Materialize vector trip count using VPInstructions. (#151925)
Materialize the vector trip count computation using VPInstruction
instead of directly creating IR. This is one of the last few steps
needed to model the full vector skeleton in VPlan. It also simplifies
vector-trip count computations for scalable vectors, as we can re-use
the UF x VF computation.

PR: https://github.com/llvm/llvm-project/pull/151925
2025-08-08 11:44:32 +01:00
Luke Lau
df8da2ff83
[VPlan] Support VPWidenPointerInductionRecipes with EVL tail folding (#152110)
Now that VPWidenPointerInductionRecipes are modelled in VPlan in
#148274, we can support them in EVL tail folding.

We need to replace their VFxUF operand with EVL as the increment is not
guaranteed to always be VF on the penultimate iteration, and UF is
always 1 with EVL tail folding.

We also need to move the creation of the backedge value to the latch so
that EVL dominates it.

With this we will no longer fail to convert a VPlan to EVL tail folding,
so adjust tryAddExplicitVectorLength to account for this. This brings us
to 99.4% of all vector loops vectorized on SPEC CPU 2017 with tail
folding vs no tail folding.

The test in only-compute-cost-for-vplan-vfs.ll previously relied on
widened pointer inductions with EVL tail folding to end up in a scenario
with no vector VPlans, so this also replaces it with an unvectorizable
fixed-order recurrence test from
first-order-recurrence-multiply-recurrences.ll that also gets discarded.
2025-08-07 10:54:24 +08:00
Florian Hahn
777c320e6c
[VPlan] Address comments missed in #142309.
Address additional comments from
https://github.com/llvm/llvm-project/pull/142309.
2025-08-06 11:52:08 +01:00
Florian Hahn
559d1dff89
[VPlan] Materialize BackedgeTakenCount using VPInstructions.
Explicitly compute the backedge-taken count using VPInstruction. This is
needed to model the full skeleton in VPlan.

NFC modulo some instruction re-ordering.
2025-08-03 12:21:28 +01:00