2859 Commits

Author SHA1 Message Date
Damian Heaton
762ba885f9
[LV] Add support for llvm.vector.partial.reduce.fadd (#163975)
Allows the Loop Vectorizer to generate `llvm.vector.partial.reduce.fadd`
intrinsics when sequences which match its requirements are found.
2026-01-28 15:05:34 +00:00
Florian Hahn
b794baf8e7
[TTI] Add VectorInstrContext for context-aware insert/extract costs. (#175982)
This commit introduces the VectorInstrContext (VIC) infrastructure to
improve cost estimates for insert/extracts based on the context
instruction in which the insert/extract is used.

This is similar to CastContextHint, and allows providing context on how
the insert/extract is going to be used before creating IR. This is
useful in the LoopVectorizer, where costs need to be estimated before
creating IR.

The new hint currently only replaces an existing check in AArch64,
but new uses will be introduced in follow-ups, including
https://github.com/llvm/llvm-project/pull/177201.

PR: https://github.com/llvm/llvm-project/pull/175982
2026-01-27 16:30:29 +00:00
Florian Hahn
1251751c16
[VPlan] Consistently check ComputeReductionResult in prepareForEpi (NFCI)
Always use the information from ComputeReductionResult to identify
recurrence kinds when connecting main and epilogue plans. Connecting the
live-outs involves the reduction result computations, so it is natural
and more accurate to check the reduction result for the correct
structure.

Suggested cleanup from https://github.com/llvm/llvm-project/pull/170223
2026-01-26 20:51:20 +00:00
Florian Hahn
1650782144
[VPlan] Share and re-use logic to find FindIVResult (NFC).
Move logic to look for FindIVResult pattern out of LoopVectorize to
allow for re-use in current code and follow-up patches.
2026-01-24 20:55:41 +00:00
Florian Hahn
a871b707b7
Reapply "[VPlan] Move VDef subclass ID to VPRecipeBase (NFC). (#174282)"
Move SubclassID to VPRecipeBase, and store VPRecipeBase directly in
VPRecipeValue, instead of VPDef. This allows for some additional
simplifications and VPDef now just holds various helpers to deal with
removing and adding VPValues.

This reverts commit 16395da0ff577750571b99fe28281ce6fb6a3ae8.

PR: https://github.com/llvm/llvm-project/pull/174282
2026-01-24 13:22:48 +00:00
Florian Hahn
16395da0ff
Revert "[VPlan] Fold VPDef into VPRecipeBase (NFC). (#174282)"
This reverts commit f3ae334f4b7a8cf4fe0eb6ee7b2f2ef0879f522d.

Committed with out-of-date message, revert to reland with updated
message.
2026-01-24 13:16:45 +00:00
Florian Hahn
f3ae334f4b
[VPlan] Fold VPDef into VPRecipeBase (NFC). (#174282)
A separate VPDef is not needed any longer, fold it into VPRecipeBase to
simplify code and class hierarchy.

Depends on https://github.com/llvm/llvm-project/pull/172758.

PR: https://github.com/llvm/llvm-project/pull/174282
2026-01-24 13:16:12 +00:00
Mel Chen
149c76538e
[LV] Separate runtime check cost from total overhead in profitability check (#176754)
In the isOutsideLoopWorkProfitable function, there are two places where
only the runtime check cost (RtC) should be used, but where the costs of
middle blocks and early-exit blocks were incorrectly included as well:
1. VectorizeMemoryCheckThreshold comparison for interleaving-only
2. Minimum trip count that bounds runtime check overhead, i.e. MinTC2
calculation
This results in an overly conservative minimum profitable trip count.

This patch separates the runtime check cost from the total overhead
cost, and uses only RtC for VectorizeMemoryCheckThreshold comparison and
the MinTC2 calculation.
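
A rough sketch of the separation described above, not the patch itself.
The two bounds are assumed to have the form
MinTC1 = VF * Overhead / (ScalarC * VF - VecC) and
MinTC2 = RtC * X / ScalarC; these formulas, the struct
OutsideLoopCosts, the helper minProfitableTripCount and the fraction
parameter X are all illustrative assumptions, not names taken from the
commit.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Round-up integer division.
constexpr uint64_t divideCeil(uint64_t N, uint64_t D) { return (N + D - 1) / D; }

// Hypothetical container: costs paid once, outside the vector loop body.
struct OutsideLoopCosts {
  uint64_t RtC;    // runtime check cost only
  uint64_t OtherC; // middle block + early-exit block costs
};

// Assumed model:
//  * MinTC1 is the break-even trip count and uses the total overhead.
//  * MinTC2 bounds the runtime checks to a fraction 1/X of the scalar
//    loop cost and therefore uses RtC alone.
uint64_t minProfitableTripCount(OutsideLoopCosts C, uint64_t ScalarC,
                                uint64_t VecC, uint64_t VF, uint64_t X) {
  // The vector loop never wins if it is not cheaper per scalar iteration.
  if (ScalarC == 0 || ScalarC * VF <= VecC)
    return std::numeric_limits<uint64_t>::max();
  uint64_t TotalOverhead = C.RtC + C.OtherC;
  uint64_t MinTC1 = divideCeil(TotalOverhead * VF, ScalarC * VF - VecC);
  uint64_t MinTC2 = divideCeil(C.RtC * X, ScalarC); // RtC only
  return std::max(MinTC1, MinTC2);
}
```

With RtC = 8, OtherC = 4, ScalarC = 4, VecC = 6, VF = 4 and X = 10 this
gives a minimum trip count of 20. Feeding the total overhead (12) into
the second bound instead would give 30, which is the overly conservative
behaviour the message describes.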
2026-01-23 07:29:56 +00:00
Florian Hahn
8a954feb3e
[LV] Replace legacy FindLast check with VPlan-based one (NFCI).
Checking directly in VPlan is more accurate, as the reductions could
have been transformed. This does not happen yet, so currently NFC.
2026-01-22 23:23:02 +00:00
Florian Hahn
7ea1fa591a
[LV] Skip FindLast reductions in collectInLoopReductions.
FindLast in-loop reductions are not supported, similarly to FindLastIV
reductions. Skip them in collectInLoopReductions, to avoid a crash for
loops with FindLast reductions and in-loop reductions preferred.
2026-01-22 21:49:52 +00:00
Florian Hahn
14a209f852
[VPlan] Replace ComputeFindIVRes with ComputeRdxRes + cmp + sel (NFC) (#176672)
Replace ComputeFindIVResult with ComputeReductionResult + explicit
compare + select, to model finding the first/last induction more
explicitly and simply. This boils down to a min/max reduction plus a
compare and select of the sentinel value.

PR: https://github.com/llvm/llvm-project/pull/176672
2026-01-22 19:28:47 +00:00
Florian Hahn
d2c40c358a
[LV] Check if VPlan contains FindLast reduction directly (NFC).
Directly check the VPlan to see if there are any FindLast reductions.
Currently this is NFC, but checking in the VPlan is more future proof,
e.g. if reductions are simplified, removed or transformed. Then checking
in legacy LoopVectorizationLegality is inaccurate.
2026-01-20 21:33:47 +00:00
Florian Hahn
d3f2f1366d
[LV] Consider UserIC when limiting VF. (#174573)
If a UserIC is provided, the vector loop will process VF * UserIC
scalar iterations per vector iteration. Pass UserIC through to
computeFeasibleMaxVF and use it to limit the max VF
to factors where VF * UserIC <= MaxTripCount. This avoids creating dead
vector loops with user provided interleave counts.
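
A minimal sketch of the limiting rule, assuming fixed power-of-two VFs
and plain integers. The helper name limitVFByUserIC is hypothetical; the
real computeFeasibleMaxVF works on ElementCount and applies further
limits that are not modelled here.

```cpp
#include <cstdint>

// Returns the largest power-of-two VF <= MaxVF with
// VF * UserIC <= MaxTripCount, or 1 if no wider VF fits.
uint64_t limitVFByUserIC(uint64_t MaxVF, uint64_t UserIC,
                         uint64_t MaxTripCount) {
  if (UserIC == 0)
    UserIC = 1; // assumption: 0 means "no user-provided interleave count"
  uint64_t VF = 1;
  // Keep doubling while the wider loop still fits in the trip count.
  while (VF * 2 <= MaxVF && VF * 2 * UserIC <= MaxTripCount)
    VF *= 2;
  return VF;
}
```

For example, with MaxVF = 16, UserIC = 4 and MaxTripCount = 20 the sketch
returns 4: VF = 8 would need 32 iterations per vector iteration, so that
vector loop could never execute.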

PR: https://github.com/llvm/llvm-project/pull/174573
2026-01-20 14:19:11 +00:00
Ramkumar Ramachandra
302565b39e
[VPlan] Move VPDerivedIVRecipe::execute to VPlanRecipes (NFC) (#176577)
2026-01-19 13:06:37 +00:00
Florian Hahn
5e5d6389f6
[LV] Allow loops with multiple early exits in legality checks. (#176403)
This patch removes the single uncountable exit constraint, allowing
loops with multiple early exits, if the exits form a dominance chain and
all other constraints hold for all uncountable early exits.

While legality now accepts such loops, vectorization is not yet
supported. VPlan support will be added in a follow up:
https://github.com/llvm/llvm-project/pull/174864

PR: https://github.com/llvm/llvm-project/pull/176403
2026-01-19 12:32:04 +00:00
Florian Hahn
ae1bd068db
[VPlan] Replace PhiR operand of ComputeAnyOfResult with VPIRFlags. (#175657)
Replace the Phi recipe operand of ComputeAnyOfResult with VPIRFlags,
building on top of https://github.com/llvm/llvm-project/pull/174026.

PR: https://github.com/llvm/llvm-project/pull/175657
2026-01-18 20:29:38 +00:00
Florian Hahn
497a6d6722
Recommit "[VPlan] Only use isAddressSCEVForCost in legacy getAddressAccSCEV"
This reverts commit ed004cf42bf57ca79b57bc3076ef83a8477426ea.

The original commit exposed an independent cost issue, triggering an
assertion. That issue has been fixed in 3457e7efc3.

Reland the patch now that the assertion has been fixed.
2026-01-18 19:55:46 +00:00
Florian Hahn
459990dcf7
[VPlan] Replace PhiR operand of ComputeFindIVResult with VPIRFlags. #174026 (#175461)
Replace the Phi recipe operand of ComputeFindIVResult with VPIRFlags,
building on top of https://github.com/llvm/llvm-project/pull/174026.

PR: https://github.com/llvm/llvm-project/pull/175461
2026-01-17 16:23:33 +00:00
Florian Hahn
d528686f43
[VPlan] Add VPConstantInt for VPIRValues wrapping ConstantInts (NFC) (#175458)
Follow-up to https://github.com/llvm/llvm-project/pull/174282: Introduce
a new VPConstantInt overlay for VPIRValue, to make it easier to check
and access constant int IR values.

PR: https://github.com/llvm/llvm-project/pull/175458
2026-01-16 11:27:07 +00:00
Graham Hunter
2abd6d6d7a
[LV] Vectorize conditional scalar assignments (#158088)
Based on Michael Maitland's previous work:
https://github.com/llvm/llvm-project/pull/121222

This PR uses the existing recurrences code instead of introducing a
new pass just for CSA autovec. I've also made recipes that are more
generic.
2026-01-14 14:59:18 +00:00
Florian Hahn
d5c11b9a24
[VPlan] Replace PhiR operand of ComputeRdxResult with VPIRFlags. (#174026)
Remove the artificial PhiR operand of ComputeReductionResult, which was
only used to look up recurrence kind, in-loop and ordered properties.

Instead, encode them as VPIRFlags as suggested by @ayalz in
https://github.com/llvm/llvm-project/pull/170223.

This addresses a TODO to make codegen for ComputeReductionResult
independent of looking up information from other recipes.

This is NFC w.r.t. codegen, the printing has been improved to include
the reduction type, and whether it is in-loop/ordered.

PR: https://github.com/llvm/llvm-project/pull/174026
2026-01-14 07:45:44 +00:00
David Sherwood
48ce7bb038
[LV] Fix bug in setVectorizedCallDecision (#175742)
There is a bug in this logic:

```
   InstructionCost Cost = ScalarCost;
   InstWidening Decision = CM_Scalarize;

   if (VectorCost <= Cost) {
     Cost = VectorCost;
     Decision = CM_VectorCall;
   }

   if (IntrinsicCost <= Cost) {
     Cost = IntrinsicCost;
     Decision = CM_IntrinsicCall;
   }
```

because it assumes that the comparisons behave sensibly in the face of
invalid costs. Unfortunately, PR #174835 exposes an issue when
attempting to vectorise the new test
uadd_with_overflow_i32 for AArch64 targets. Specifically, there are
situations where all costs are invalid (e.g. VF=vscale x 1), but some
costs are more invalid than others. For example, when querying the
intrinsic cost via the TTI hook we get an invalid cost with a non-zero
value, whereas the vector cost is invalid with a zero value. That leads
to us erroneously choosing CM_VectorCall as the call widening decision,
despite the lack of a vector math variant. Inevitably this causes
crashes because we create a VPCallWidenRecipe without a variant
function.

Fix this by only performing comparisons if the costs are valid. It now
leads to us choosing CM_Scalarize more often, but it's a coin toss
anyway between CM_Scalarize and CM_IntrinsicCall when both strategies
are invalid. Potentially we could also create a new strategy called
CM_Invalid, and avoid the creation of VPlans entirely.
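
The failure mode and the fix can be reproduced with a small stand-in
type. The Cost struct below is a hypothetical mock, not
llvm::InstructionCost; it assumes an ordering where invalid costs sort
after valid ones and ties are broken by value. The function names
pickUnchecked and pickChecked are likewise illustrative.

```cpp
#include <cassert>
#include <tuple>

// Mock cost: a validity flag plus a value (assumed ordering, see above).
struct Cost {
  bool Valid;
  int Value;
  bool isValid() const { return Valid; }
  friend bool operator<=(const Cost &A, const Cost &B) {
    return std::make_tuple(!A.Valid, A.Value) <=
           std::make_tuple(!B.Valid, B.Value);
  }
};

enum Decision { CM_Scalarize, CM_VectorCall, CM_IntrinsicCall };

// Original logic: compares costs even when they are invalid.
Decision pickUnchecked(Cost ScalarCost, Cost VectorCost, Cost IntrinsicCost) {
  Cost C = ScalarCost;
  Decision D = CM_Scalarize;
  if (VectorCost <= C) { C = VectorCost; D = CM_VectorCall; }
  if (IntrinsicCost <= C) { C = IntrinsicCost; D = CM_IntrinsicCall; }
  return D;
}

// Fixed logic: a candidate only competes if its cost is valid.
Decision pickChecked(Cost ScalarCost, Cost VectorCost, Cost IntrinsicCost) {
  Cost C = ScalarCost;
  Decision D = CM_Scalarize;
  if (VectorCost.isValid() && VectorCost <= C) {
    C = VectorCost;
    D = CM_VectorCall;
  }
  if (IntrinsicCost.isValid() && IntrinsicCost <= C) {
    C = IntrinsicCost;
    D = CM_IntrinsicCall;
  }
  return D;
}
```

With all three costs invalid, a vector cost of value 0 and an intrinsic
cost of value 4, pickUnchecked returns CM_VectorCall while pickChecked
stays at CM_Scalarize.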
2026-01-14 07:28:38 +00:00
Luke Lau
0ae23ca9e6
[VPlan] Split out optimizeEVLMasks. NFC (#174925)
Addresses part of #153144 and splits off part of #166164

There are two parts to the EVL transform:

1) Convert the loop so the number of elements processed each iteration
is EVL, not VF. The IV and header mask are replaced with EVL-based
variants.
2) Optimize users of the EVL based header mask to VP intrinsic based
recipes.

(1) changes the semantics of the vector loop region, whereas (2) needs
to preserve them. This splits (2) out so we don't mix the two up, and
allows us to move (1) earlier in the pipeline in a future PR.
2026-01-14 07:01:14 +00:00
Florian Hahn
d27d75ee94
[VPlan] Use createHeaderPHIRecipes in native path (NFCI).
Simplify tryToBuildVPlan by using createHeaderPHIRecipes in the native
path as well.
2026-01-13 20:12:21 +00:00
Florian Hahn
d620ea7657
[LV] Handle live-ins in findRecipe.
Skip live-ins in findRecipe to prevent a crash for cases with degenerate
reductions (where the backedge value is a live-in). Such reductions
should be removed, but this requires further changes.

Fixes https://github.com/llvm/llvm-project/issues/175229.
2026-01-11 11:19:30 +00:00
Hans Wennborg
ed004cf42b
Revert "[VPlan] Only use isAddressSCEVForCost in legacy getAddressAccSCEV (NFCI)"
This caused assertion failures:

  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7265:
  VectorizationFactor llvm::LoopVectorizationPlanner::computeBestVF():
  Assertion `(BestFactor.Width == LegacyVF.Width || BestPlan.hasEarlyExit() || !Legal->getLAI()->getSymbolicStrides().empty() || UsesEVLGatherScatter || planContainsAdditionalSimplifications( getPlanFor(BestFactor.Width), CostCtx, OrigLoop, BestFactor.Width) || planContainsAdditionalSimplifications( getPlanFor(LegacyVF.Width), CostCtx, OrigLoop, LegacyVF.Width))
  && " VPlan cost model and legacy cost model disagreed"' failed.

see comment on https://github.com/llvm/llvm-project/pull/171204

This reverts commit 01d34eb38fa0587cb95eedd3bada8257abc122f8.
2026-01-09 15:38:32 +01:00
Florian Hahn
4998280c3f
[LV] Find reduction result VPInstruction from backedge value (NFC).
Split off from https://github.com/llvm/llvm-project/pull/174026. Make
the lookup of the reduction phi recipe/compute-reduction-result
VPInstruction independent of the latter having the reduction phi as
operand.
2026-01-07 21:12:07 +00:00
Florian Hahn
31b93d6e38
[VPlan] Add specialized VPValue subclasses for different types (NFC) (#172758)
This patch adds VPValue sub-classes for the different cases we currently
have:
 * VPIRValue: A live-in VPValue that wraps an underlying IR value
 * VPSymbolicValue: A symbolic VPValue not tied to an underlying value,
   e.g. the vector trip count or VF VPValues
 * VPRecipeValue: A VPValue defined by a VPDef/VPRecipeBase.

This has multiple benefits:
 * clearer constructors for each kind of VPValue
 * limited scope: for example allows moving VPDef member to VPRecipeValue,
   reducing size of other VPValues.
 * stricter type checking for member variables (e.g. using VPLiveIn in
   the Value -> live-in map in VPlan, or using VPSymbolicValue for symbolic
   member VPValues)

There probably are additional opportunities for cleanups as follow-ups.

PR: https://github.com/llvm/llvm-project/pull/172758
2026-01-07 20:29:05 +00:00
Shih-Po Hung
39d6f10e33
[LV] Conservatively predicate SDiv/SRem (#170818)
Conservatively predicate sdiv/srem:
- RHS may carry poison in masked‑off lanes.
- RHS could be −1 while LHS has masked‑off lanes (risking INT_MIN/−1
overflow).

We’ll relax this once we can prove non‑wrap/non‑poison conditions.
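
To illustrate why masked-off lanes matter, here is a small sketch of one
common way to keep them harmless: substituting a safe divisor of 1 in
inactive lanes. This is an assumption for illustration only and not
necessarily the lowering this patch selects; maskedSDiv is a
hypothetical helper.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Lane-wise signed division where inactive lanes divide by 1 instead of
// their real divisor. An inactive lane holding INT32_MIN / -1 (overflow)
// or x / 0 therefore never executes the faulting division.
template <std::size_t N>
std::array<int32_t, N> maskedSDiv(const std::array<int32_t, N> &LHS,
                                  const std::array<int32_t, N> &RHS,
                                  const std::array<bool, N> &Mask) {
  std::array<int32_t, N> Res{};
  for (std::size_t I = 0; I < N; ++I) {
    int32_t SafeRHS = Mask[I] ? RHS[I] : 1; // select(mask, rhs, 1)
    Res[I] = LHS[I] / SafeRHS;              // inactive lanes are don't-care
  }
  return Res;
}
```

Active lanes still divide by their real divisor, so they behave exactly
as the scalar loop would for those iterations.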

Fixes #170775.
2026-01-07 04:25:38 +00:00
Florian Hahn
01d34eb38f
[VPlan] Only use isAddressSCEVForCost in legacy getAddressAccSCEV (NFCI)
Follow-up to https://github.com/llvm/llvm-project/pull/171204 and
1f331e453f to only rely on isAddressSCEVForCost in the legacy
getAddressAccessSCEV, completely aligning the decisions of the VPlan and
legacy cost models.
2026-01-06 19:18:13 +00:00
Florian Hahn
16830b2164
[VPlan] Remove VPWidenSelectRecipe, use VPWidenRecipe instead (NFCI). (#174234)
All extra state has been removed from VPWidenSelectRecipe at this point.
There's no benefit of having a separate recipe and Select can easily be
handled by the existing VPWidenRecipe.

PR: https://github.com/llvm/llvm-project/pull/174234
2026-01-05 22:33:37 +00:00
Florian Hahn
990883a690
[VPlan] Handle Alloca in VPReplicateRecipe::computeCost. (NFCI)
Handle Alloca in the VPlan-based cost model. This also updates the cost
in the legacy cost model to clarify that we always compute the scalar
cost.
2026-01-03 17:40:51 +00:00
Florian Hahn
2d60f87111
[VPlan] Only use legacy cost for instructions only used by exit conds. (#174029)
Currently we need to precompute costs for exit conditions, to match the
legacy cost, as they will get replaced by a compare against the
canonical IV (or others, like active-lane-mask or EVL based) and the
original compare will get removed.

This is not true for instructions with users other than the exit
condition. Those will remain, and we can just use the VPlan-based cost
model to get more accurate results.

This improves results in some cases, like
@test_value_in_exit_compare_chain_used_outside because the IV increment
user outside the loop is replaced by computing the final value outside
the loop.

It also fixes a crash introduced by f196b1d66ff (#146525).

PR: https://github.com/llvm/llvm-project/pull/174029
2025-12-31 13:34:54 +00:00
Florian Hahn
524b1788c4
[VPlan] Add BranchOnTwoConds, use for early exit plans. (#172750)
This PR introduces a new BranchOnTwoConds VPInstruction, that takes 2
boolean operands and must be placed in a block with 3 successors.

If condition I is true, branches to successor I, otherwise falls through
to check the next condition. If both conditions are false, branch to the
third successor.
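
The branching rule can be written as a tiny function. This is only a
model of the semantics described above; branchOnTwoConds is a
hypothetical name and returns the index of the successor that is taken.

```cpp
// Models BranchOnTwoConds in a block with three successors:
// the first true condition selects its successor, otherwise fall
// through to the third one.
unsigned branchOnTwoConds(bool Cond0, bool Cond1) {
  if (Cond0)
    return 0; // first condition wins, even if Cond1 is also true
  if (Cond1)
    return 1;
  return 2; // both conditions false
}
```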

This new branch recipe is used for early-exit loops, to simplify the
representation in VPlan initially, by avoiding the need for splitting the
middle block early on, in a way that preserves the single-exit block
property of regions. All exits still go through the latch block, but
they can go to more than 2 successors.

This idea was part of one of the original proposals for how to model
early exits in VPlan, but at that point in time, there was no good way
to handle this during code-gen, and we went with the early split-middle
block approach initially.

Now that we dissolve regions before ::execute, the new recipe can be
lowered nicely after regions have been removed, to a set of VPBBs and
BranchOnCond recipes. The initial lowering preserves the original
structure with the split middle blocks. Follow-ups will improve the
lowering to avoid this splitting, providing performance gains.

PR: https://github.com/llvm/llvm-project/pull/172750
2025-12-29 19:39:38 +00:00
Florian Hahn
d777b1a230
[VPlan] Skip phi recipes in tryToBuildVPlan (NFC).
No phi recipes are being transformed in the main loop any longer, so
skip phi recipes.

This also clarifies which recipes need skipping explicitly:
those that have already been transformed.

Follow-up to post-commit comment in
https://github.com/llvm/llvm-project/pull/168291.
2025-12-27 17:02:48 +00:00
Florian Hahn
c2a8739cd1
[VPlan] Split off VPReductionRecipe creation for in-loop reductions (NFC) (#168784)
This patch splits off VPReductionRecipe creation for in-loop reductions
to a separate transform from adjustInLoopReductions, which has been
renamed.

The new transform has been updated to work directly on VPInstructions,
and gets applied after header phis have been processed, once on VPlan0.

Builds on top of https://github.com/llvm/llvm-project/pull/168291 and
https://github.com/llvm/llvm-project/pull/166099 which should be
reviewed first.

PR: https://github.com/llvm/llvm-project/pull/168784
2025-12-25 14:02:58 +00:00
Florian Hahn
c43ccefc9f
[VPlan] Use PSE to construct SCEVs in getSCEVExprForVPValue (NFCI).
getSCEVExprForVPValue is used to create SCEVs for expressions from the
original loop, which may be predicated. Use PSE to construct predicated
SCEVs if possible. This matches the legacy LV code behavior.

Currently should be NFC, but will enable migrating more SCEV/cost-based
computations to VPlan.

The patch requires exposing a new getPredicatedSCEV helper to
PredicatedScalarEvolution which just takes a SCEV, to avoid needing to
go through IR values, which isn't an option for getSCEVExprForVPValue.
2025-12-21 22:39:49 +00:00
Florian Hahn
1f78f6a2d6
[LV] Check Addr in getAddressAccessSCEV in terms of SCEV expressions. (#171204)
getAddressAccessSCEV previously had some restrictive checks that limited
pointer SCEV expressions passed to TTI to GEPs with operands that must
either be invariant or marked as inductions.

As a consequence, the check rejected things like `GEP %base, (%iv + 1)`,
while the SCEV for the GEP should be as easily analyzable as for `GEP
%base, %v`, with the only difference being the start of the AddRec
adjusted by 1.

This patch changes the code to use a SCEV-based check, limiting the
address SCEV to be loop invariant, an affine AddRec (i.e. induction),
or an add expression of such operands or a sign-extended AddRec.
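
A toy version of such a structural check, using a mock expression tree
rather than LLVM's SCEV classes. The Kind enum, Expr struct and both
helper functions are hypothetical, and the sketch encodes one reading of
the rule above: an add is accepted if every operand is invariant, an
affine AddRec, or a sign-extended affine AddRec.

```cpp
#include <memory>
#include <utility>
#include <vector>

enum class Kind { Invariant, AffineAddRec, NonAffineAddRec, SignExtend, Add, Unknown };

struct Expr {
  Kind K;
  std::vector<std::shared_ptr<Expr>> Ops;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr make(Kind K, std::vector<ExprPtr> Ops = {}) {
  return std::make_shared<Expr>(Expr{K, std::move(Ops)});
}

// Invariant, affine AddRec, or sign-extension of an affine AddRec.
bool isSimpleOperand(const Expr &E) {
  switch (E.K) {
  case Kind::Invariant:
  case Kind::AffineAddRec:
    return true;
  case Kind::SignExtend:
    return E.Ops.size() == 1 && E.Ops[0]->K == Kind::AffineAddRec;
  default:
    return false;
  }
}

// A simple operand, or an add whose operands are all simple.
bool isAddressExprForCost(const Expr &E) {
  if (isSimpleOperand(E))
    return true;
  if (E.K != Kind::Add)
    return false;
  for (const ExprPtr &Op : E.Ops)
    if (!isSimpleOperand(*Op))
      return false;
  return true;
}
```

Under this model an address like `%base + {1,+,1}` (invariant plus
affine AddRec) is accepted, while a non-affine AddRec or an opaque
expression is rejected.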

This catches all existing cases getAddressAccessSCEV caught, plus
additional ones like the cases mentioned above.

This means we pass address SCEVs in more cases, giving the backends a
better chance to make informed decisions. It also unifies the decision
when to use an address SCEV between the legacy and VPlan-based cost
model.

The gather-cost.ll tests illustrate the impact.
Previously they were considered not profitable to vectorize
because we failed to determine that
 %gep.src_data = getelementptr inbounds [1536 x float], ptr @src_data,
                                                        i64 0, i64 %mul
has a relatively small constant stride.

There may be some rough edges in the cost models, where not passing
pointer SCEVs hid some incorrect modeling, but those issues should be
fixed in the target cost models if they surface.


PR: https://github.com/llvm/llvm-project/pull/171204
2025-12-19 22:05:27 +00:00
Mel Chen
f196b1d66f
[VPlan] Extract reverse operation for reverse accesses (#146525)
This patch introduces VPInstruction::Reverse and extracts the reverse
operations of loaded/stored values from reverse memory accesses. This
extraction facilitates future support for permutation elimination within
VPlan.
2025-12-18 14:57:48 +00:00
Florian Hahn
bab0dc4d48
Reapply "[LV] Mark checks as never succeeding for high cost cutoff."
Reapply 8a115b6934a90441 with an update to tests handling remarks.

The patch now directly emits a clear remark when we bail out
due to the memory check threshold.

Original message:
When GeneratedRTChecks::create bails out due to exceeding the cost
threshold, no runtime checks are generated and we must not proceed
assuming checks have been generated.

Mark the checks as never succeeding, to make sure we don't try to
vectorize assuming the runtime checks hold. This fixes a case where we
previously incorrectly vectorized assuming runtime checks had been
generated when forcing vectorization via metadata.

Fixes the mis-compile mentioned in
https://github.com/llvm/llvm-project/pull/166247#issuecomment-3631471588
2025-12-17 20:21:49 +00:00
Ramkumar Ramachandra
1c6e5b2d04
[LV] Improve code using VPlan::get{ConstantInt,True} (NFC) (#172471)
2025-12-16 13:03:43 +00:00
Luke Lau
67d0e21a62
Reapply "[VPlan] Remove legacy costing inside VPBlendRecipe::computeCost (#171846)" (#172261)
This reapplies #171846 with a test case and fix for a legacy cost-model
mismatch assertion.

In the previous version of the patch, we only considered the plan to
contain simplifications when it had a VPBlendRecipe and VF.isScalar()
was true.

However for some VPlans we may have a blend with only the first lane
used:

    BLEND ir<%phi> = ir<%foo.res> ir<%bar.res>/ir<%c>
    CLONE ir<%gep> = getelementptr ir<%p>, ir<%phi>
    vp<%5> = vector-pointer ir<%gep>

And in the legacy cost model we cost a blend as a phi if it's uniform:

    // If we know that this instruction will remain uniform, check the cost of
    // the scalar version.
    if (isUniformAfterVectorization(I, VF))
      VF = ElementCount::getFixed(1);

So this replaces the VF.isScalar() check with
vputils::onlyFirstLaneUsed, which matches how the VPlan cost model
mirrored the legacy model beforehand.

A VPInstruction::Select will also emit a scalar select for a vector VF
if only the first lane is used, so this also updates
VPBlendRecipe::computeCost to reflect that too.
2025-12-16 06:30:54 +00:00
Florian Hahn
83eea87a36
[VPlan] Create header phis once, after constructing VPlan0 (NFC). (#168291)
Together with https://github.com/llvm/llvm-project/pull/168289 &
https://github.com/llvm/llvm-project/pull/166099 we can construct header
phis once up front, after creating VPlan0, as the
induction/reduction/first-order-recurrence classification applies across
all VFs.

Depends on https://github.com/llvm/llvm-project/pull/168289 &
https://github.com/llvm/llvm-project/pull/166099 

PR: https://github.com/llvm/llvm-project/pull/168291
2025-12-15 22:12:10 +00:00
Florian Hahn
dbb4f5c2dd
[VPlan] Set VF scale factor in tryToCreatePartialReduction (NFCI).
Split off unrelated change from approved
https://github.com/llvm/llvm-project/pull/168291/ to land separately as
suggested.
2025-12-15 21:18:07 +00:00
Florian Hahn
bcbbe2c2bc
[VPlan] Pass backedge value directly to FOR and reduction phis (NFC).
Pass backedge values directly to VPFirstOrderRecurrencePHIRecipe and
VPReductionPHIRecipe, as they must be provided and available.

Split off from https://github.com/llvm/llvm-project/pull/168291.
2025-12-14 20:59:22 +00:00
Florian Hahn
53cf22f3a1
[VPlan] Simplify live-ins early using SCEV. (#155304)
Use SCEV to simplify all live-ins during VPlan0 construction. This
enables us to remove special SCEV queries when constructing
VPWidenRecipes and improves results in some cases.

This leads to simplifications in a number of cases in real-world
applications (~250 files changed across LLVM, SPEC, ffmpeg)

PR: https://github.com/llvm/llvm-project/pull/155304
2025-12-14 20:15:05 +00:00
Luke Lau
4ea8157773
Revert "[VPlan] Remove legacy costing inside VPBlendRecipe::computeCost (#171846)"
This reverts commit fd5f53aa9b21060063484fc6c346316a34a6464c.

It's triggering legacy cost model assertions reported in
https://github.com/llvm/llvm-project/pull/171846#issuecomment-3647640019
2025-12-13 20:05:34 +08:00
Florian Hahn
333ee931df
[LV] Update stale comment after 4e05d702f02a. (NFC)
Address post-commit suggestion, update stale comment after 4e05d702f.
2025-12-12 21:36:56 +00:00
Florian Hahn
4e05d702f0
[LV] Always include middle block cost in isOutsideLoopWorkProfitable. (#171102)
Always include the cost of the middle block in
isOutsideLoopWorkProfitable. This addresses the TODO from
https://github.com/llvm/llvm-project/pull/168949 and removes the
temporary restriction.

isOutsideLoopWorkProfitable already scales the cost outside loops
according to the expected trip counts.

In practice this increases the minimum iteration threshold in a few
cases. On a large IR corpus based on C/C++ workloads, ~50 out of 179450
vector loops have their thresholds increased slightly.


PR: https://github.com/llvm/llvm-project/pull/171102
2025-12-11 21:41:47 +00:00
Luke Lau
fd5f53aa9b
[VPlan] Remove legacy costing inside VPBlendRecipe::computeCost (#171846)
A VPBlendRecipe always emits selects, even when the VF is scalar.

However the legacy cost model always costs all scalar non-header phis as
a phi, and the VPlan cost model has to account for this.

This can cause the cost to be a little off, for example not including
the cost of the select in @smax_call_uniform leading to unprofitable
vectorization.

This removes this from the VPlan cost model and handles checks for the
case in planContainsAdditionalSimplifications instead.

I considered trying to make the legacy cost model more accurate but I'm
not sure if it's possible. We need information as to whether or not the
scalar VF we are costing is the original loop in which case it's
actually a phi, or if it's a VPBlendRecipe that emits a select,
potentially from a VF=1, UF>=1 VPlan.
2025-12-12 00:25:58 +08:00