2585 Commits

Author SHA1 Message Date
David Green
77941eba7f
[CostModel] Add a DstTy to getShuffleCost (#141634)
A shuffle will take two input vectors and a mask, to produce a new
vector of size <MaskElts x SrcEltTy>. Historically it has been assumed
that the SrcTy and the DstTy are the same for getShuffleCost, with that
being relaxed in recent years. If the Tp passed to getShuffleCost is the
SrcTy, then the DstTy can be calculated from the Mask elts and the src
elt size, but the Mask is not always provided and the Tp is not reliably
always the SrcTy. This has led to situations notably in the SLP
vectorizer but also in the generic cost routines where assumption about
how vectors will be legalized are built into the generic cost routines -
for example whether they will widen or promote, with the cost modelling
assuming they will widen but the default lowering to promote for integer
vectors.

This patch attempts to start improving that - it originally tried to
alter more of the cost model but that too quickly became too many
changes at once, so this patch just plumbs in a DstTy to getShuffleCost
so that DstTy and SrcTy can be reliably distinguished. The callers of
getShuffleCost have been updated to try and include a DstTy that is more
accurate. Otherwise it tries to be fairly non-functional, keeping the
SrcTy used as the primary type used in shuffle cost routines, only using
DstTy where it was in the past (for InsertSubVector for example).

Some asserts have been added that help to check for consistent values
when a Mask and a DstTy are provided to getShuffleCost. Some of them
took a while to get right, and some non-mask calls might still be
incorrect. Hopefully this will provide a useful base to build more
shuffles that alter size.
2025-06-21 12:29:29 +01:00
Philip Reames
c103bbc836
[LV] Consider whether vscale is a known power of two for iteration check (#144963)
Going mostly by the comment here - but it says "vscale is not
necessarily a power-of-2". Both in tree targets have vscale as a power
of two, and we have an existing TTI hook for that.
2025-06-20 11:37:27 -07:00
Luke Lau
a2b8a93ff9
[VPlan] Pass NumUnrolledElems as operand to VPWidenPointerInductionRecipe. NFC (#119859)
Similarly to VPWidenIntOrFpInductionRecipe, if we want to support it in
EVL tail folding we need to increment the induction by EVL steps instead
of VF*UF steps, but currently this is hard-wired in
VPWidenPointerInductionRecipe.

This adds an operand for the number of elements unrolled and plumbs it
through, so that we can swap it out in
VPlanTransforms::tryAddExplicitVectorLength further down the line.
2025-06-20 15:46:52 +01:00
Ramkumar Ramachandra
c8c4bd1ebc
[LV] Stengthen loop-invariance checks in isPredicatedInst (#140744)
Check loop-invariance against SCEV as well.
2025-06-20 14:01:48 +01:00
Philip Reames
d8e6d74c69
[LV] Consider EVL legality for TTI tail folding preference (#144790) 2025-06-19 16:15:23 -07:00
Philip Reames
b96370131d
[TTI] Plumb CostKind through getPartialReductionCost (#144953)
Purely for the sake of being idiomatic with other TTI costing routines,
no direct motivation beyond that.
2025-06-19 15:29:56 -07:00
David Sherwood
af51c9d9df
[LV][NFC] Add branch weight test showing incorrect behaviour (#144682)
This patch adds a test that shows incorrect branch weights being set in
function

EpilogueVectorizerEpilogueLoop::emitMinimumVectorEpilogueIterCountCheck
2025-06-19 10:49:40 +01:00
Florian Hahn
071a6feabd
[TTI] Remove PPC hasActiveVectorLength impl, simplify interface (NFC). (#142310)
PPCTTIImpl defines hasActiveVectorLength and also getVPMemoryOpCost, but
they appear unused (i.e. no changes to tests).

Remove them, as they complicate the interface for hasActiveVectorLength.
This simplifies the only use in LV as now no placeholder values need to
be passed.

PR: https://github.com/llvm/llvm-project/pull/142310
2025-06-18 19:02:17 +01:00
Paul Walker
d3441f7348
[LV] Change getSmallBestKnownTC to return an ElementCount (NFC) (#141793)
This is prep work for enabling better UF calculations when using vscale
based VFs to vectorise loops with vscale based tripcounts.

NOTE: NFC because All uses remain fixed-length until a following PR
changes LoopVectorize's version of getSmallConstantTripCount().
2025-06-18 11:45:20 +01:00
Mel Chen
ba40a7bc2e
[LoopVectorize] Vectorize fixed-order recurrence with vscale x 1. (#142772)
When the fixed-order recurrence phi is live-out from the loop, the
vectorizer uses VPInstruction::ExtractPenultimateElement to extract the
penultimate element from the recurrence vector. However, this is not
feasible when the VF is vscale x 1, since vscale could be 1, making the
vector contain only one element.

This patch changes the behavior for vscale x 1 by extracting the last
element from the vector produced by splicing the recurrence phi and the
previous value. This ensures we can still determine the correct live-out
value of the recurrence phi.
2025-06-18 16:03:20 +08:00
Luke Lau
9dd1c66e8f
[VPlan] Expand VPWidenIntOrFpInductionRecipe into separate recipes (#118638)
The motivation of this PR is to make #115274 easier to implement, and
should allow us to add EVL support by just passing EVL to the VF
operand.

The current difficulty with widening IVs with EVL is that
VPWidenIntOrFpInductionRecipe generates its own backedge value. Since
it's a VPHeaderPHIRecipe the VF operand must be in the preheader, which
means we can't use the EVL since it's defined in the loop body.

The gist in this PR is to take the approach in #114305 and expand
VPWidenIntOrFpInductionRecipe into several recipes for the initial
value, phi and backedge value just before execution. I.e. this example:

```
  vector.ph:
  Successor(s): vector loop

  <x1> vector loop: {
    vector.body:
      WIDEN-INDUCTION %i = phi %start, %step, %vf
      ...
      EMIT branch-on-count ...
    No successors
  }
```

gets expanded to:

``` 
vector.ph:
  ...
  vp<%induction.start> = ...
  vp<%induction.increment> = ...

Successor(s): vector loop

<x1> vector loop: {
  vector.body:
    ir<%i> = WIDEN-PHI vp<%induction.start>, vp<%vec.ind.next>
    ...
    vp<%vec.ind.next> = add ir<%i>, vp<%induction.increment>
    EMIT branch-on-count ...
  No successors
}
```

This allows us to a value defined in the loop in the backedge value, and
also means we can just reuse the existing backedge fixups in
VPlan::execute without having to specially handle it ourselves.

After this #115274 should just become a matter of setting the VF operand
to EVL (and building the increment step in the loop body, not the
preheader).
2025-06-17 18:24:07 +01:00
Florian Hahn
d3bc834ece
[LV] Update check to find epilogue resume value to check all incoming.
This fixes a crash where all incoming values for the epilogue resume
value are zero, because there are no remaining iterations to execute for
the epilogue loop.
2025-06-16 21:10:12 +01:00
David Sherwood
a75e0627f9
[LV] Use vscale for tuning when updating profile information (#143690)
In fixVectorizedLoop we call setProfileInfoAfterUnrolling to update the
profile information after vectorising, however for scalable VFs we
pessimistically assume vscale=1. We can improve upon this by using the
value of vscale used for tuning, i.e. when targeting neoverse-v1 the
expected value is 2.
2025-06-16 10:02:38 +01:00
Sam Tebbs
3dd61c1876
[LV] Fix MVE regression from #132190 (#141736)
Register pressure was only considered if the vector bandwidth was being
maximised (chosen either by the target or user options), but #132190
inadvertently caused high pressure VFs to be pruned even when max
bandwidth wasn't enabled. This PR returns to the previous behaviour.
2025-06-16 09:58:03 +01:00
Ramkumar Ramachandra
fbade95ebf
[LV] Strip unnecessary make_{pair,optional} (NFC) (#141924) 2025-06-16 08:55:46 +01:00
Florian Hahn
df54a2d935
[VPlan] Only skip induction phis in planContainsAdditionalSimps (NFC).
Skip induction phis when checking for simplifications, as they may not
be lowered directly be lowered to a corresponding PHI recipe. Reductions
and first-order recurrences will get lowered to phi recipes, unless they
are removed. Considering them for simplifications allows removing them
if there are no remaining users.

NFC as currently reduction and recurrence phis are not
simplified/removed if dead.
2025-06-15 19:31:30 +01:00
Florian Hahn
577199f922
Reapply "[VPlan] Set branch weight metadata on middle term in VPlan (NFC) (#143035)"
This reverts commit 0604dc199c019b23746f4a54885ba0c75569cdae.

The recommitted version addresses post-commit comments and adjusts the
place the branch weights are added. It now runs before VPlans are optimized
for VF and UF, which may remove the vector loop region, causing a crash
trying to get the middle block after that. Test case added in
72f99b75afc12bb.

Original message:
Manage branch weights for the BranchOnCond in the middle block in VPlan.
This requires updating VPInstruction to inherit from VPIRMetadata, which
in general makes sense as there are a number of opcodes that could take
metadata.

There are other branches (part of the skeleton) that also need branch
weights adding.

PR: https://github.com/llvm/llvm-project/pull/143035
2025-06-14 17:20:46 +01:00
Florian Hahn
732ebf803b
[VPlan] Address post-commit comments for f68848015f62.
Assign sentinel value to named variable to clarify naming and update
comments.

Addresses post-commit comments from
https://github.com/llvm/llvm-project/pull/142291.
2025-06-14 10:44:20 +01:00
Florian Hahn
1ded2c599f
[LV] Use createIterationCountCheck during epilogue skeleton creation.
Use helper already used for minimum trip count checks for the regular
ILV skeleton creation also for epilogue skeleton creation.
2025-06-13 21:01:12 +01:00
Florian Hahn
f68848015f
[VPlan] Manage Sentinel value for FindLastIV in VPlan. (#142291)
Similar to modeling the start value as operand, also model the sentinel
value as operand explicitly. This makes all require information for
code-gen available directly in VPlan.

PR: https://github.com/llvm/llvm-project/pull/142291
2025-06-13 19:17:01 +01:00
David Sherwood
541e5118ce
[LV] Use getFixedValue instead of getKnownMinValue when appropriate (#143526)
There are many places in VPlan and LoopVectorize where we use
getKnownMinValue to discover the number of elements in a vector. Where
we expect the vector to have a fixed length, I have used the stronger
getFixedValue call. I believe this is clearer and adds extra protection
in the form of an assert in getFixedValue that the vector is not
scalable.

While looking at VPFirstOrderRecurrencePHIRecipe::computeCost I also
took the liberty of simplifying the code.

In theory I believe this patch should be NFC, but I'm reluctant to add
that to the title in case we're just missing tests for some of the VPlan
changes. I built and ran the LLVM test suite when targeting neoverse-v1
and it seemed ok.
2025-06-13 11:43:50 +01:00
Florian Hahn
d65904675e
[LV] Move logic to create trip count check to helper (NFC).
Move the logic to create the iteration count check to a separate helper,
so it can be re-used by when creating the skeleton for epilogue
vectorization as well.
2025-06-12 21:35:56 +01:00
Hans Wennborg
0604dc199c Revert "[VPlan] Set branch weight metadata on middle term in VPlan (NFC) (#143035)"
This caused assertion failures:

  llvm/lib/Transforms/Vectorize/VPlan.h:4021:
  llvm::VPBasicBlock* llvm::VPlan::getMiddleBlock():
  Assertion `LoopRegion && "cannot call the function after vector loop region has been removed"' failed.

See comment on the PR.

> Manage branch weights for the BranchOnCond in the middle block in VPlan.
> This requires updating VPInstruction to inherit from VPIRMetadata, which
> in general makes sense as there are a number of opcodes that could take
> metadata.
>
> There are other branches (part of the skeleton) that also need branch
> weights adding.
>
> PR: https://github.com/llvm/llvm-project/pull/143035

This reverts commit db8d34db26e9ea92c08d6e813eca9cce40c48478.
2025-06-12 13:52:05 +02:00
Luke Lau
7ef77eb998
[LV] Support scalable interleave groups for factors 3,5,6 and 7 (#141865)
Currently the loop vectorizer can only vectorize interleave groups for
power-of-2 factors at scalable VFs by recursively interleaving
[de]interleave2 intrinsics.

However after https://github.com/llvm/llvm-project/pull/124825 and
#139893, we now have [de]interleave intrinsics for all factors up to 8,
which is enough to support all types of segmented loads and stores on
RISC-V.

Now that the interleaved access pass has been taught to lower these in
#139373 and #141512, this patch teaches the loop vectorizer to emit
these intrinsics for factors up to 8, which enables scalable
vectorization for non-power-of-2 factors.

As far as I'm aware, no in-tree target will vectorize a scalable
interelave group above factor 8 because the maximum interleave factor is
capped at 4 on AArch64 and 8 on RISC-V, and the
`-max-interleave-group-factor` CLI option defaults to 8, so the
recursive [de]interleaving code has been removed for now.

Factors of 3 with scalable VFs are also turned off in AArch64 since
there's no lowering for [de]interleave3 just yet either.
2025-06-12 11:09:09 +01:00
Florian Hahn
db8d34db26
[VPlan] Set branch weight metadata on middle term in VPlan (NFC) (#143035)
Manage branch weights for the BranchOnCond in the middle block in VPlan.
This requires updating VPInstruction to inherit from VPIRMetadata, which
in general makes sense as there are a number of opcodes that could take
metadata.

There are other branches (part of the skeleton) that also need branch
weights adding.

PR: https://github.com/llvm/llvm-project/pull/143035
2025-06-12 10:04:08 +01:00
Florian Hahn
5623b7f2d5
[LV] Use GeneratedRTChecks to check if safety checks were added (NFC).
Directly check via GeneratedRTChecks if any checks have been added,
instead of needing to go through ILV. This simplifies the code and
enables further refactoring in follow-up patches.
2025-06-11 21:08:36 +01:00
Stephen Tozer
aa8a1fa6f5
[DLCov][NFC] Annotate intentionally-blank DebugLocs in existing code (#136192)
Following the work in PR #107279, this patch applies the annotative
DebugLocs, which indicate that a particular instruction is intentionally
missing a location for a given reason, to existing sites in the compiler
where their conditions apply. This is NFC in ordinary LLVM builds (each
function `DebugLoc::getFoo()` is inlined as `DebugLoc()`), but marks the
instruction in coverage-tracking builds so that it will be ignored by
Debugify, allowing only real errors to be reported. From a developer
standpoint, it also communicates the intentionality and reason for a
missing DebugLoc.

Some notes for reviewers:

- The difference between `I->dropLocation()` and
`I->setDebugLoc(DebugLoc::getDropped())` is that the former _may_ decide
to keep some debug info alive, while the latter will always be empty; in
this patch, I always used the latter (even if the former could
technically be correct), because the former could result in some
(barely) different output, and I'd prefer to keep this patch purely NFC.
- I've generally documented the uses of `DebugLoc::getUnknown()`, with
the exception of the vectorizers - in summary, they are a huge cause of
dropped source locations, and I don't have the time or the domain
knowledge currently to solve that, so I've plastered it all over them as
a form of "fixme".
2025-06-11 17:42:10 +01:00
Florian Hahn
62b3e89afc
[LV] Remove unused LoopBypassBlocks from ILV (NFC).
After recent refactorings to move parts of skeleton creation
LoopBypassBlocks isn't used any more. Remove it.
2025-06-10 21:37:29 +01:00
Florian Hahn
4a6d31f4bf
[LV] Pass resume phi to fixReductionScalarResumeWhenVectorizing (NFC).
fixReductionScalarResumeWhenVectorizingEpilog updates the resume phis in
the scalar preheader. Instead of looking at all recipes in the middle
block and finding their resume-phi users we can iterate over all resume
phis in the scalar preheader directly.

This slightly simplifies the code and removes the need to look for the
resume phi.

Also slightly simplifies https://github.com/llvm/llvm-project/pull/141860.
2025-06-09 22:11:20 +01:00
Florian Hahn
f9b98e386e
[LV] Remove unused LoopMiddleBlock arg from fixReductionScalarRes (NFC)
The argument isn't used, remove it.
2025-06-09 21:47:03 +01:00
Florian Hahn
6108d50aed
[VPlan] Add ReductionStartVector VPInstruction. (#142290)
Add a new VPInstruction::ReductionStartVector opcode to create the start
values for wide reductions. This more accurately models the start value
creation in VPlan and simplifies VPReductionPHIRecipe::execute. Down the
line it also allows removing VPReductionPHIRecipe::RdxDesc.

PR: https://github.com/llvm/llvm-project/pull/142290
2025-06-09 20:59:12 +01:00
Florian Hahn
c2ea9404ab
[LV] Simplify finding EPResumeValue (NFC).
It should be sufficient to check that the resume phi has the correct
type, as the vector trip count as incoming value and starts at 0
otherwise. There is no need to find the middle block.
2025-06-09 19:37:33 +01:00
Ramkumar Ramachandra
6716d4eaa8
[LV] Prefer DenseMap::lookup over find (NFC) (#141809)
Apart from the stylistic improvement, lookup has the nice property of
returning a default-constructed object on failure-to-find, while find
returns the end iterator, which cannot be dereferenced.
2025-06-03 14:37:19 +01:00
Florian Hahn
5520ab3d50
[VPlan] Add ComputeAnyOfResult VPInstruction (NFC) (#141932)
Add a dedicated opcode for any-of reduction, similar to
https://github.com/llvm/llvm-project/pull/132689 and
https://github.com/llvm/llvm-project/pull/132690.

The patch also explictly adds the start value to not require
RecurrenceDescriptor during execute. It also allows freezing the start
value to make it poison-safe.

PR: https://github.com/llvm/llvm-project/pull/141932
2025-06-03 14:33:53 +01:00
Luke Lau
ddfeecf4c5
[VPlan] Convert to concrete recipes before dissolving loop regions. NFCI (#141999)
After updating #118638 on tip of tree, expanding
VPWidenIntOrFpInductionRecipes fails because it needs the loop region to
get the latch to insert the increment into:

VPBasicBlock *ExitingBB =
Plan->getVectorLoopRegion()->getExitingBasicBlock();
Builder.setInsertPoint(ExitingBB,
ExitingBB->getTerminator()->getIterator());
    auto *Next = Builder.createNaryOp(AddOp, {Prev, Inc}, Flags,
WidenIVR->getDebugLoc(), "vec.ind.next");

However after #117506, the region is dissolved so it doesn't work.

This shuffles the dissolveLoopRegions steps to be after
convertToConcreteRecipes so we can use the region when expanding
VPWidenIntOrFpInductionRecipes
2025-06-03 12:05:13 +01:00
Florian Hahn
11713e86b0
[LV] Move VPlan-based calculateRegisterUsage to VPlanAnalysis (NFC). (#135673)
Move VPlan-based calculateRegisterUsage from LoopVectorize
to VPlanAnalysis.cpp. It is a VPlan-based analysis and this helps
to reduce the size of LoopVectorize.

PR: https://github.com/llvm/llvm-project/pull/135673
2025-06-02 17:40:50 +01:00
Florian Hahn
0f00a96fed
[VPlan] Simplify branch on False in VPlan transform (NFC). (#140409)
Simplify branch on false, starting with the branch from the middle block
to the scalar preheader. Initially this helps simplifying the initial
VPlan construction.

Depends on https://github.com/llvm/llvm-project/pull/140405.

PR: https://github.com/llvm/llvm-project/pull/140409
2025-05-31 20:32:45 +01:00
Jon Roelofs
798058fca5
[Remarks] Remove an upcast footgun. NFC (#142191)
CodeRegion's were previously passed as Value*, but then immediately
upcast to BasicBlock. Let's keep the type information around until the
use cases for non-BasicBlock code regions actually materialize.
2025-05-31 11:07:54 -07:00
Florian Hahn
10bd4cd9cd
[VPlan] Remove ResumePhi opcode, use regular PHI instead (NFC). (#140405)
Use regular VPPhi instead of a separate opcode for resume phis. This
removes an unneeded specialized opcode and unifies the code
(verification, printing, updating when CFG is changed).

Depends on https://github.com/llvm/llvm-project/pull/140132.

PR: https://github.com/llvm/llvm-project/pull/140405
2025-05-30 12:50:08 +01:00
Florian Hahn
417e43ad43
[LV] Set PhiTy once in adjustRecipesForReductions (NFC). 2025-05-30 08:33:15 +01:00
Ramkumar Ramachandra
663aea2601
[LV] Clean up unused template args of min/max (NFC) (#141778) 2025-05-29 09:57:22 +02:00
Elvis Wang
332fe08f1d
[VPlan] Implement VPlan-based cost model for VPReduction, VPExtendedReduction and VPMulAccumulateReduction. (#113903)
This patch implement the VPlan-based cost model for VPReduction,
VPExtendedReduction and VPMulAccumulateReduction.

With this patch, we can calculate the reduction cost by the VPlan-based
cost model so remove the reduction costs in `precomputeCost()`.

Ref: Original instruction based implementation:
https://reviews.llvm.org/D93476
2025-05-29 11:15:16 +08:00
Florian Hahn
440a8adb86
[VPlan] Use VPIRFlags to manage FMFs for ComputeReductionResult (NFC).
Manage fast-math flags using VPIRFlags from VPInstruciton, in inline
with other VPInstructions. With this change, we now print the correctly
flags for ComputeReductionResult, other than that NFC.
2025-05-28 20:54:58 +01:00
Florian Hahn
ad58ea3ba8
[VPlan] Bail out before construction VPlan0 if MinVF > MaxVF.
This reduces the cases where we need to create initial VPlans
unnecessarily after 567b3172da2d52f5df70a37f3de06b7000b25968.

buildVPlansWithVPRecipes is called with MinVF > MaxVF if the target does
not support scalable vectors.

Recovers some of the compile-time impact
http://llvm-compile-time-tracker.com/compare.php?from=3033f202f6707937cd28c2473479db134993f96f&to=1a0b9e5834f7fd4abf058864e656f8e26b7a26ff&stat=instructions:u
2025-05-27 21:19:11 +01:00
Florian Hahn
d56deea1e4
[VPlan] Connect Entry to scalar preheader during initial construction. (#140132)
Update initial construction to connect the Plan's entry to the scalar
preheader during initial construction. This moves a small part of the
 skeleton creation out of ILV and will also enable replacing
 VPInstruction::ResumePhi with regular VPPhi recipes.

Resume phis need 2 incoming values to start with, the second being the
bypass value from the scalar ph (and used to replicate the incoming
value for other bypass blocks). Adding the extra edge ensures we
incoming values for resume phis match the incoming blocks.

PR: https://github.com/llvm/llvm-project/pull/140132
2025-05-27 16:07:56 +01:00
Florian Hahn
567b3172da
[VPlan] Construct initial once and pass clones to tryToBuildVPlan (NFC). (#141363)
Update to only build an initial, plain-CFG VPlan once, and then
transform & optimize clones.

This requires changes to ::clone() for VPInstruction and
VPWidenPHIRecipe to allow for proper cloning of the recipes in the
initial VPlan.

PR: https://github.com/llvm/llvm-project/pull/141363
2025-05-26 13:42:47 +01:00
Florian Hahn
dcef154b5c
[VPlan] Replace VPRegionBlock with explicit CFG before execute (NFCI). (#117506)
Building on top of https://github.com/llvm/llvm-project/pull/114305,
replace VPRegionBlocks with explicit CFG before executing.

This brings the final VPlan closer to the IR that is generated and
helps to simplify codegen.

It will also enable further simplifications of phi handling during
execution and transformations that do not have to preserve the 
canonical IV required by loop regions. This for example could include
replacing the canonical IV with an EVL based phi while completely
removing the original canonical IV.

PR: https://github.com/llvm/llvm-project/pull/117506
2025-05-24 19:17:16 +01:00
Florian Hahn
a9b2998e31
[VPlan] Skip cost assert if VPlan converted to single-scalar recipes.
Check if a VPlan transform converted recipes to single-scalar
VPReplicateRecipes (after 07c085af3efcd67503232f99a1652efc6e54c1a9). If
that's the case, the legacy cost model incorrectly overestimates the cost.

Fixes https://github.com/llvm/llvm-project/issues/141237.
2025-05-24 11:09:27 +01:00
Florian Hahn
95ba5508e5
Reapply "[VPlan] Move predication to VPlanTransform (NFC). (#128420)"
This reverts commit 793bb6b257fa4d9f4af169a4366cab3da01f2e1f.

The recommitted version contains a fix to make sure only the original
phis are processed in convertPhisToBlends nu collecting them in a vector
first. This fixes a crash when no mask is needed, because there is only
a single incoming value.

Original message:
This patch moves the logic to predicate and linearize a VPlan to a
dedicated VPlan transform. It mostly ports the existing logic directly.

There are a number of follow-ups planned in the near future to
further improve on the implementation:
* Edge and block masks are cached in VPPredicator, but the block masks
are still made available to VPRecipeBuilder, so they can be accessed
during recipe construction. As a follow-up, this should be replaced by
adding mask operands to all VPInstructions that need them and use that
during recipe construction.
* The mask caching in a map also means that this map needs updating each
time a new recipe replaces a VPInstruction; this would also be handled
by adding mask operands.

PR: https://github.com/llvm/llvm-project/pull/128420
2025-05-22 08:16:15 +01:00
Florian Hahn
793bb6b257
Revert "[VPlan] Move predication to VPlanTransform (NFC). (#128420)"
This reverts commit b263c08e1a0b54a871915930aa9a1a6ba205b099.

Looks like this triggers a crash in one of the Fortran tests. Reverting
while I investigate
    https://lab.llvm.org/buildbot/#/builders/41/builds/6825
2025-05-21 19:24:21 +01:00