100 Commits

Author SHA1 Message Date
Florian Hahn
7509cad693
[VPlan] Support masked VPInsts, use for predication (NFC) (#142285)
Add support for mask operands to most VPInstructions, using
getNumOperandsForOpcode.

This allows VPlan predication to predicate VPInstructions directly. The
mask will then be dropped or handled when creating wide recipes.

Depends on https://github.com/llvm/llvm-project/pull/142284.
Depends on https://github.com/llvm/llvm-project/pull/168784.

PR: https://github.com/llvm/llvm-project/pull/142285
2026-02-08 18:23:36 +00:00
Florian Hahn
fdce0ea708
[VPlan] Add ExitingIVValue VPInstruction. (#175651)
Add a new VPInstruction opcode to compute the exiting value of an
induction variable after vectorization. This replaces the pattern of
extracting the last lane from the last part of the induction backedge
value when applicable.

This allows us to always use the pre-computed IV end value. It will also
allow unifying end value creation for both induction resume and exit
values.

PR: https://github.com/llvm/llvm-project/pull/175651
2026-02-06 12:27:31 +00:00
Luke Lau
33a2c3ee9c
[VPlan] Ignore poison incoming values when creating blend (#180005)
We have an optimization in VPPredicator when creating blends where if
all the incoming values are the same, we just return that value.

This extends it to handle cases like "phi [%x, %x, poison, %x]" by
ignoring poison values.
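
The check described above can be sketched in standalone C++. This is only a model, not the VPPredicator code: `Incoming`, `commonNonPoisonValue` and the use of `std::nullopt` for poison are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for an incoming value: std::nullopt models poison,
// any other value models a distinct SSA value identified by an integer ID.
using Incoming = std::optional<int>;

// Returns the single value shared by all non-poison incoming values, or
// std::nullopt if they differ (a real blend is needed) or if every incoming
// value is poison.
std::optional<int> commonNonPoisonValue(const std::vector<Incoming> &Values) {
  std::optional<int> Common;
  for (const Incoming &V : Values) {
    if (!V)
      continue; // Poison may be replaced by any value, so skip it.
    if (Common && *Common != *V)
      return std::nullopt; // Two different non-poison values.
    Common = V;
  }
  return Common;
}

// Usage: mirrors "phi [%x, %x, poison, %x]" with %x modelled as ID 1.
void blendExample() {
  assert(commonNonPoisonValue({1, 1, std::nullopt, 1}) == 1);
}
```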

This is split off from #176143 to prevent regressions when maintaining
SSA by adding PHIs with a poison incoming value.
2026-02-06 19:09:43 +08:00
Florian Hahn
792f7b089a
[VPlan] Refine exit select check in transformtoPartialReduction.
Make sure we find the actual select for the exit users and only use it
for the final link in the chain. This fixes a miscompile after
90b3712d8a20efa2cbaadc177da576e485dce038.
2026-02-03 21:07:02 +00:00
Florian Hahn
90b3712d8a
Reapply "[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851)"
This reverts commit d1e477b00b49c63ff4dd513eeb14a5b18bc055d7.

Recommit with extra checks making sure extends are VPWidenCastRecipes,
rejecting VPReplicateRecipes.

Original message:
As a first step, move the existing partial reduction detection logic to
VPlan, trying to preserve the existing code structure & behavior as
closely as possible.

With this, partial reductions are detected and created together in a
single step.

This allows forming partial reductions and bundling them up if
profitable together in a follow-up.

PR: https://github.com/llvm/llvm-project/pull/167851
2026-02-01 16:27:27 +00:00
Martin Storsjö
d1e477b00b
Revert "[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851)"
This reverts commit f4e8cc1a2229dca76d21c8d37439c4c194b06b86.

This change wasn't NFC; it causes failed asserts when building
ffmpeg for i686 windows, see
https://github.com/llvm/llvm-project/pull/167851 for details.
2026-02-01 14:35:02 +02:00
Florian Hahn
f4e8cc1a22
[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851)
As a first step, move the existing partial reduction detection logic to
VPlan, trying to preserve the existing code structure & behavior as
closely as possible.

With this, partial reductions are detected and created together in a
single step.

This allows forming partial reductions and bundling them up if
profitable together in a follow-up.

PR: https://github.com/llvm/llvm-project/pull/167851
2026-01-31 19:44:46 +00:00
Florian Hahn
e36cd26618
[VPlan] Remove non-reductions after simplifications. (#176795)
In some cases, we identify patterns as reductions, even though they can
be simplified to a non-reduction.

Mark VPReductionPHIRecipe as not reading from memory & not having
side-effects, to clean them up.

We also need to remove ComputeReductionResult VPInstructions with
live-in arguments. This means there is actually no reduction, and we
need to fold it to the live in. Otherwise we would incorrectly reduce
the live-in.

PR: https://github.com/llvm/llvm-project/pull/176795
2026-01-28 15:51:08 +00:00
Damian Heaton
762ba885f9
[LV] Add support for llvm.vector.partial.reduce.fadd (#163975)
Allows the Loop Vectorizer to generate `llvm.vector.partial.reduce.fadd`
intrinsics when sequences which match its requirements are found.
2026-01-28 15:05:34 +00:00
Luke Lau
c2457742c0
[VPlan] Make class_match variadic, reuse for m_LiveIn. NFC (#178196)
2026-01-27 21:22:51 +08:00
Florian Hahn
1650782144
[VPlan] Share and re-use logic to find FindIVResult (NFC).
Move logic to look for FindIVResult pattern out of LoopVectorize to
allow for re-use in current code and follow-up patches.
2026-01-24 20:55:41 +00:00
Florian Hahn
14a209f852
[VPlan] Replace ComputeFindIVRes with ComputeRdxRes + cmp + sel (NFC) (#176672)
Replace ComputeFindIVResult with ComputeReductionResult + explicit
compare + select, to model finding the first/last induction more
explicitly and simply. This boils down to a min/max reduction + compare
and select of the sentinel value.
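
A minimal scalar sketch of that decomposition for the find-last case, assuming the sentinel is the minimum signed value and that lanes which never matched still hold it. `Sentinel` and `findLastIVResult` are hypothetical names, not LLVM APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Assumed sentinel: a value no real induction value takes in this model.
constexpr int64_t Sentinel = std::numeric_limits<int64_t>::min();

// Models computing a find-last-IV result from per-lane partial results:
//   1. max reduction across lanes (the reduction-result step),
//   2. compare the reduced value against the sentinel,
//   3. select the start value if no lane ever matched.
int64_t findLastIVResult(const std::vector<int64_t> &PartialIVs,
                         int64_t Start) {
  int64_t Reduced = Sentinel;
  for (int64_t V : PartialIVs)
    Reduced = std::max(Reduced, V);
  bool Found = Reduced != Sentinel;
  return Found ? Reduced : Start;
}

// Usage: lanes 1 and 2 matched at IV values 7 and 3, lanes 0 and 3 never did.
void findIVExample() {
  assert(findLastIVResult({Sentinel, 7, 3, Sentinel}, -1) == 7);
}
```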

PR: https://github.com/llvm/llvm-project/pull/176672
2026-01-22 19:28:47 +00:00
Florian Hahn
6cc18a8e43
[VPlan] Support more GEP-like recipes in getSCEVExprForVPValue (NFCI)
Support VPWidenGEPRecipe, VPInstructions and VPReplicateRecipe with
GEP-like opcodes in getSCEVExprForVPValue via a new matcher binding
source element type and operands.

This is used in code paths when computing SCEV expressions in the
VPlan-based cost model, which should produce costs matching the legacy
cost model.
2026-01-18 22:20:25 +00:00
Florian Hahn
459990dcf7
[VPlan] Replace PhiR operand of ComputeFindIVResult with VPIRFlags. #174026 (#175461)
Replace the Phi recipe operand of ComputeFindIVResult with VPIRFlags,
building on top of https://github.com/llvm/llvm-project/pull/174026.

PR: https://github.com/llvm/llvm-project/pull/175461
2026-01-17 16:23:33 +00:00
Florian Hahn
370eeef877
[VPlan] Add matchers for reduction result VPInstructions (NFC).
Add dedicated matchers for reduction result VPInstructions, to be
re-used in follow-up patches, including
https://github.com/llvm/llvm-project/pull/167851.
2026-01-16 11:28:30 +00:00
Florian Hahn
d528686f43
[VPlan] Add VPConstantInt for VPIRValues wrapping ConstantInts (NFC) (#175458)
Follow-up to https://github.com/llvm/llvm-project/pull/174282: Introduce
a new VPConstantInt overlay for VPIRValue, to make it easier to check
and access constant int IR values.

PR: https://github.com/llvm/llvm-project/pull/175458
2026-01-16 11:27:07 +00:00
Luke Lau
0ae23ca9e6
[VPlan] Split out optimizeEVLMasks. NFC (#174925)
Addresses part of #153144 and splits off part of #166164

There are two parts to the EVL transform:

1) Convert the loop so the number of elements processed each iteration
is EVL, not VF. The IV and header mask are replaced with EVL-based
variants.
2) Optimize users of the EVL based header mask to VP intrinsic based
recipes.

(1) changes the semantics of the vector loop region, whereas (2) needs
to preserve them. This splits (2) out so we don't mix the two up, and
allows us to move (1) earlier in the pipeline in a future PR.
2026-01-14 07:01:14 +00:00
Ramkumar Ramachandra
bd2cfc52e4
[PatternMatch] Implement match_fn using bind_back (NFC) (#175811)
Use llvm::bind_back landed in d2a521750 ([ADT] Introduce
bind_{front,back}, [not_]equal_to, #175056) to simplify implementations
of match_fn in PatternMatch and VPlanPatternMatch.
2026-01-13 20:55:38 +00:00
Elvis Wang
cd2caf6580
[LV] Simplify extract-lane with scalar operand to the scalar value itself. (#174534)
This patch simplifies extract-lane(%lane_num, %X) to %X when %X is a
scalar value. Extracting from a scalar is redundant since there is only
one value to extract.
2026-01-12 10:03:44 +08:00
Florian Hahn
31b93d6e38
[VPlan] Add specialized VPValue subclasses for different types (NFC) (#172758)
This patch adds VPValue sub-classes for the different cases we currently
have:
 * VPIRValue: A live-in VPValue that wraps an underlying IR value
 * VPSymbolicValue: A symbolic VPValue not tied to an underlying value,
   e.g. the vector trip count or VF VPValues
 * VPRecipeValue: A VPValue defined by a VPDef/VPRecipeBase.

This has multiple benefits:
 * clearer constructors for each kind of VPValue
 * limited scope: for example allows moving VPDef member to VPRecipeValue,
   reducing size of other VPValues.
 * stricter type checking for member variables (e.g. using VPLiveIn in
   the Value -> live-in map in VPlan, or using VPSymbolicValue for symbolic
   member VPValues)

There probably are additional opportunities for cleanups as follow-ups.

PR: https://github.com/llvm/llvm-project/pull/172758
2026-01-07 20:29:05 +00:00
David Sherwood
97ee9b66c0
[LV] Teach m_One, m_ZeroInt patterns to look through broadcasts (#170159)
In VPlanPatternMatch.h I have changed the int_pred_ty code to look
through broadcasts in order to catch more cases, i.e. multiplying by a
splat of one, etc.
2026-01-07 10:35:08 +00:00
Aiden Grossman
c2d060c50e
[VPlan] Mark variable unused in release build [[maybe_unused]] (#174648)
To prevent compiler warnings when building without assertions turned on.
2026-01-06 21:27:55 +00:00
Ramkumar Ramachandra
d12e99376f
Reland [VPlan] Simplify pow-of-2 (mul|udiv) -> (shl|lshr) (#174581)
The original patch, landed as a2db31b0 ([VPlan] Simplify pow-of-2
(mul|udiv) -> (shl|lshr), #172477) had a critical commutative matcher
bug, which has now been fixed. An assert has also been strengthened,
following a post-commit review.
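
The arithmetic identity behind this simplification can be sketched on plain 32-bit unsigned integers. The helper names below are hypothetical and this is not the VPlan transform itself. Note that the multiply form is commutative while the divide form only applies when the divisor is the power of two.

```cpp
#include <cassert>
#include <cstdint>

// True if V has exactly one bit set.
bool isPowerOf2(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// log2 of a power of two, i.e. the index of its single set bit.
unsigned log2OfPowerOf2(uint32_t V) {
  assert(isPowerOf2(V) && "expected a power of two");
  unsigned Shift = 0;
  while ((V >> Shift) != 1)
    ++Shift;
  return Shift;
}

// x * C  ==>  x << log2(C); both forms wrap modulo 2^32 on overflow.
uint32_t mulAsShl(uint32_t X, uint32_t C) { return X << log2OfPowerOf2(C); }

// x udiv C  ==>  x lshr log2(C).
uint32_t udivAsLshr(uint32_t X, uint32_t C) { return X >> log2OfPowerOf2(C); }

// Usage: the shifted forms agree with the original operations.
void powerOf2Example() {
  assert(mulAsShl(13, 8) == 13u * 8u);
  assert(udivAsLshr(100, 4) == 100u / 4u);
}
```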
2026-01-06 20:36:26 +00:00
Alex Bradbury
5a456c17d9
Revert "[VPlan] Simplify pow-of-2 (mul|udiv) -> (shl|lshr)" (#174559)
Reverts llvm/llvm-project#172477

This is causing failures for RVA23 (including some tests running away in
their execution causing OOM, hence the builder dying). I will attempt to
follow up on the PR with a reproducer of some kind.
https://lab.llvm.org/buildbot/#/builders/210/builds/7243
2026-01-06 10:26:51 +00:00
Ramkumar Ramachandra
a2db31b06f
[VPlan] Simplify pow-of-2 (mul|udiv) -> (shl|lshr) (#172477)
2026-01-06 08:27:48 +00:00
Florian Hahn
16830b2164
[VPlan] Remove VPWidenSelectRecipe, use VPWidenRecipe instead (NFCI). (#174234)
All extra state has been removed from VPWidenSelectRecipe at this point.
There's no benefit of having a separate recipe and Select can easily be
handled by the existing VPWidenRecipe.

PR: https://github.com/llvm/llvm-project/pull/174234
2026-01-05 22:33:37 +00:00
Florian Hahn
524b1788c4
[VPlan] Add BranchOnTwoConds, use for early exit plans. (#172750)
This PR introduces a new BranchOnTwoConds VPInstruction, that takes 2
boolean operands and must be placed in a block with 3 successors.

If condition I is true, branches to successor I, otherwise falls through
to check the next condition. If both conditions are false, branch to the
third successor.
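
The successor selection described above amounts to a two-step cascade, which is also the shape of a lowering to two single-condition branches. A minimal sketch, with `selectSuccessor` as a hypothetical name:

```cpp
#include <cassert>

// Returns the index (0, 1 or 2) of the successor taken by a branch with two
// conditions: the first true condition wins, and successor 2 is taken when
// both conditions are false.
unsigned selectSuccessor(bool Cond0, bool Cond1) {
  if (Cond0)
    return 0; // First condition has priority.
  if (Cond1)
    return 1; // Only checked when the first condition is false.
  return 2;
}

// Usage: when both conditions hold, the first successor is taken.
void branchExample() { assert(selectSuccessor(true, true) == 0); }
```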

This new branch recipe is used for early-exit loops, to simplify the
representation in VPlan initially, by avoiding the need for splitting the
middle block early on, in a way that preserves the single-exit block
property of regions. All exits still go through the latch block, but
they can go to more than 2 successors.

This idea was part of one of the original proposals for how to model
early exits in VPlan, but at that point in time, there was no good way
to handle this during code-gen, and we went with the early split-middle
block approach initially.

Now that we dissolve regions before ::execute, the new recipe can be
lowered nicely after regions have been removed, to a set of VPBBs and
BranchOnCond recipes. The initial lowering preserves the original
structure with the split middle blocks. Follow-ups will improve the
lowering to avoid this splitting, providing performance gains.

PR: https://github.com/llvm/llvm-project/pull/172750
2025-12-29 19:39:38 +00:00
Florian Hahn
7de080482c
[VPlan] Handle min/max intrinsics in getSCEVExprForVPValue (NFCI)
Use m_Intrinsic to handle min/max intrinsics in getSCEVExprForVPValue.
This also extends Argument_match and IntrinsicID_match to VPInstruction
for completeness, and unifies the handling to avoid looking up functions
from the underlying IR instruction.

Tested via the VPlan-based cost-model, but the same costs should be
computed.

As part of the extension, fix a bug in Argument_match that had an
incorrect offset for the operands of VPReplicateRecipe; the function is
the last argument.
2025-12-28 22:28:16 +00:00
Mel Chen
f196b1d66f
[VPlan] Extract reverse operation for reverse accesses (#146525)
This patch introduces VPInstruction::Reverse and extracts the reverse
operations of loaded/stored values from reverse memory accesses. This
extraction facilitates future support for permutation elimination within
VPlan.
2025-12-18 14:57:48 +00:00
Florian Hahn
3fc7419236
[VPlan] Replace ExtractLast(Elem|LanePerPart) with ExtractLast(Lane/Part) (#164124)
Replace ExtractLastElement and ExtractLastLanePerPart with more generic
and specific ExtractLastLane and ExtractLastPart, which model distinct
parts of extracting across parts and lanes. ExtractLastElement ==
ExtractLastLane(ExtractLastPart) and ExtractLastLanePerPart ==
ExtractLastLane, the latter clarifying the name of the opcode. A new
m_ExtractLastElement matcher is provided for convenience.
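
The relationship between the opcodes can be modelled with nested vectors, where the outer vector holds one entry per unrolled part. The type aliases and function names are hypothetical; only the composition mirrors the commit message.

```cpp
#include <cassert>
#include <vector>

using Part = std::vector<int>;      // Lanes of one vector value.
using Unrolled = std::vector<Part>; // One Part per unrolled copy.

// Models ExtractLastPart: pick the last unrolled part.
const Part &extractLastPart(const Unrolled &Parts) {
  assert(!Parts.empty() && "need at least one part");
  return Parts.back();
}

// Models ExtractLastLane: pick the last lane of a single part.
int extractLastLane(const Part &P) {
  assert(!P.empty() && "need at least one lane");
  return P.back();
}

// ExtractLastElement == ExtractLastLane(ExtractLastPart).
int extractLastElement(const Unrolled &Parts) {
  return extractLastLane(extractLastPart(Parts));
}

// Usage: two parts of four lanes each; the last element is lane 3 of part 1.
void extractExample() {
  assert(extractLastElement({{0, 1, 2, 3}, {4, 5, 6, 7}}) == 7);
}
```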

The patch should be NFC modulo printing changes.

PR: https://github.com/llvm/llvm-project/pull/164124
2025-12-07 15:15:43 +00:00
Florian Hahn
99addbf73d
[LV] Vectorize selecting last IV of min/max element. (#141431)
Add support for vectorizing loops that select the index of the minimum
or maximum element. The patch implements vectorizing those patterns by
combining Min/Max and FindFirstIV reductions.

It extends matching Min/Max reductions to allow in-loop users that are
FindLastIV reductions. It records a flag indicating that the Min/Max
reduction is used by another reduction. The extra user is then checked as
part of the new `handleMultiUseReductions` VPlan transformation.

It processes any reduction that has other reduction users. The reduction
using the min/max reduction currently must be a FindLastIV reduction,
which needs adjusting to compute the correct result:
 1. We need to find the last IV for which the condition based on the
     min/max reduction is true,
 2. Compare the partial min/max reduction result to its final value and,
 3. Select the lanes of the partial FindLastIV reductions which
     correspond to the lanes matching the min/max reduction result.
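
The three steps above can be illustrated with a scalar model of a VF-lane loop that finds the last index of the minimum element. This is a sketch under stated assumptions, not the vectorizer's code: `lastIndexOfMin`, the `-1` sentinel and the lane assignment `I % VF` are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Returns the last index of the minimum element, or -1 for an empty input.
long lastIndexOfMin(const std::vector<int> &A, std::size_t VF) {
  assert(VF > 0 && "need at least one lane");
  std::vector<int> PartialMin(VF, std::numeric_limits<int>::max());
  std::vector<long> PartialIdx(VF, -1); // -1 is the sentinel "no index yet".

  // Step 1, loop body: lane L handles elements L, L + VF, L + 2 * VF, ...
  for (std::size_t I = 0; I < A.size(); ++I) {
    std::size_t L = I % VF;
    if (A[I] <= PartialMin[L]) { // "<=" keeps the last matching index.
      PartialMin[L] = A[I];
      PartialIdx[L] = static_cast<long>(I);
    }
  }

  // Final min reduction across lanes.
  int Min = std::numeric_limits<int>::max();
  for (int M : PartialMin)
    Min = std::min(Min, M);

  // Steps 2 and 3: compare each partial min to the final value, keep only the
  // indices of matching lanes, then reduce them with max.
  long Result = -1;
  for (std::size_t L = 0; L < VF; ++L) {
    long Candidate = PartialMin[L] == Min ? PartialIdx[L] : -1;
    Result = std::max(Result, Candidate);
  }
  return Result;
}
```

The result does not depend on the lane count: `lastIndexOfMin({5, 2, 9, 2, 7, 2, 8}, 4)` and the same call with `VF = 1` both return 5.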

Depends on https://github.com/llvm/llvm-project/pull/140451

PR: https://github.com/llvm/llvm-project/pull/141431
2025-11-28 22:26:19 +00:00
Florian Hahn
8f36135aea
[VPlan] Add m_Intrinsic matcher that takes a variable intrinsic ID (NFC)
Add a variant of m_Intrinsic that matches a variable runtime ID.
2025-11-27 21:23:29 +00:00
Florian Hahn
f8eca64a28
Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)"
This reverts commit a6edeedbfa308876d6f2b1648729d52970bb07e6.

The following fixes have landed, addressing issues causing the original
revert:
* https://github.com/llvm/llvm-project/pull/169298
* https://github.com/llvm/llvm-project/pull/167897
* https://github.com/llvm/llvm-project/pull/168949

Original message:
Building on top of https://github.com/llvm/llvm-project/pull/148817,
introduce a new abstract LastActiveLane opcode that gets lowered to
Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1).

When folding the tail, update all extracts for uses outside the loop to
extract the value of the last active lane.

See also https://github.com/llvm/llvm-project/issues/148603

PR: https://github.com/llvm/llvm-project/pull/149042
2025-11-26 20:03:55 +00:00
Florian Hahn
d58ebe339c
Revert "Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)""
This reverts commit 72e51d389f66d9cc6b55fd74b56fbbd087672a43.

Missed some test updates.
2025-11-26 19:41:39 +00:00
Florian Hahn
72e51d389f
Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)"
This reverts commit a6edeedbfa308876d6f2b1648729d52970bb07e6.

The following fixes have landed, addressing issues causing the original
revert:
* https://github.com/llvm/llvm-project/pull/169298
* https://github.com/llvm/llvm-project/pull/167897
* https://github.com/llvm/llvm-project/pull/168949

Original message:
Building on top of https://github.com/llvm/llvm-project/pull/148817,
introduce a new abstract LastActiveLane opcode that gets lowered to
Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1).

When folding the tail, update all extracts for uses outside the loop to
extract the value of the last active lane.

See also https://github.com/llvm/llvm-project/issues/148603

PR: https://github.com/llvm/llvm-project/pull/149042
2025-11-26 19:31:25 +00:00
Florian Hahn
21378fb75a
[VPlan] Merge fcmp uno feeding AnyOf. (#166823)
Fold
  any-of (fcmp uno %A, %A), (fcmp uno %B, %B), ... ->
  any-of (fcmp uno %A, %B), ...

This pattern is generated to check if any vector lane is NaN, and
combining multiple compares is beneficial on architectures that have
dedicated instructions.
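
The equivalence can be checked on scalars with the standard `<cmath>` classification functions: `fcmp uno %X, %X` is true exactly when `%X` is NaN, and `fcmp uno %A, %B` is true when either operand is NaN. The function names below are hypothetical and model the any-of over vector lanes with a loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Before the fold: any-of over (fcmp uno A, A) and (fcmp uno B, B).
bool anyNaNSeparate(const std::vector<float> &A, const std::vector<float> &B) {
  assert(A.size() == B.size() && "lane counts must match");
  bool Any = false;
  for (std::size_t I = 0; I < A.size(); ++I)
    Any = Any || std::isnan(A[I]) || std::isnan(B[I]);
  return Any;
}

// After the fold: any-of over a single (fcmp uno A, B).
bool anyNaNMerged(const std::vector<float> &A, const std::vector<float> &B) {
  assert(A.size() == B.size() && "lane counts must match");
  bool Any = false;
  for (std::size_t I = 0; I < A.size(); ++I)
    Any = Any || std::isunordered(A[I], B[I]);
  return Any;
}

// Usage: both forms agree when one lane of B is NaN.
void unoExample() {
  const float NaN = std::numeric_limits<float>::quiet_NaN();
  assert(anyNaNSeparate({1.0f, 2.0f}, {3.0f, NaN}) ==
         anyNaNMerged({1.0f, 2.0f}, {3.0f, NaN}));
}
```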

Alive2 Proof: https://alive2.llvm.org/ce/z/vA_aoM

Combine suggested as part of
https://github.com/llvm/llvm-project/pull/161735

PR: https://github.com/llvm/llvm-project/pull/166823
2025-11-23 15:52:19 +00:00
Rahul Joshi
4703195c8d
[NFC][LLVM] Namespace cleanup in SLPVectorizer (#168623)
- Remove file local functions out of `llvm` or anonymous namespace and
make them static.
- Use namespace qualifier to define `BoUpSLP` class and several template
specializations.
2025-11-19 07:34:09 -08:00
Florian Hahn
e009de26b6
[LV] Use VPlan pattern matching in adjustRecipesForReductions (NFC)
Replace the assert checking if CurrentLinkI is a CmpInst with a pattern
matching check in the if condition. This uses VPlan-level pattern matching
instead of inspecting the underlying instruction type.
2025-11-15 21:45:40 +00:00
Florian Hahn
a6edeedbfa
Revert "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)"
This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b.

This appears to be causing some runtime failures on RISCV
https://lab.llvm.org/buildbot/#/builders/210/builds/5221
2025-11-13 22:34:55 +00:00
Florian Hahn
53a65ba6b9
[VPlan] Don't look up recipe for IV step via RecipeBuilder. (NFC)
Directly update induction increments with step value created for wide
inductions in createWidenInductionRecipes, which does not require
looking up via RecipeBuilder.
2025-11-12 22:08:56 +00:00
Florian Hahn
62d1a080e6
[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)
Building on top of https://github.com/llvm/llvm-project/pull/148817,
introduce a new abstract LastActiveLane opcode that gets lowered to
Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1).

When folding the tail, update all extracts for uses outside the loop to
extract the value of the last active lane.
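
The lowering chain Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1) can be modelled on a boolean vector. The sketch assumes that FirstActiveLane yields the number of lanes when no lane is active, and that the mask is a prefix mask (active lanes first) as produced by tail folding. All names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mask = std::vector<bool>;

// Not(Mask): flip every lane.
Mask notMask(const Mask &M) {
  Mask R(M.size());
  for (std::size_t I = 0; I < M.size(); ++I)
    R[I] = !M[I];
  return R;
}

// FirstActiveLane: index of the first true lane. Assumption: returns the
// number of lanes when no lane is active.
std::size_t firstActiveLane(const Mask &M) {
  for (std::size_t I = 0; I < M.size(); ++I)
    if (M[I])
      return I;
  return M.size();
}

// LastActiveLane = FirstActiveLane(Not(Mask)) - 1. Only meaningful for a
// prefix mask whose first lane is active.
std::size_t lastActiveLane(const Mask &M) {
  std::size_t FirstInactive = firstActiveLane(notMask(M));
  assert(FirstInactive > 0 && "first lane must be active");
  return FirstInactive - 1;
}

// ExtractLane(LastActiveLane(Mask), V).
int extractLastActive(const Mask &M, const std::vector<int> &V) {
  assert(M.size() == V.size() && "lane counts must match");
  return V[lastActiveLane(M)];
}
```

With a fully active mask the first inactive lane is the lane count, so the result is the last lane, matching the behaviour without tail folding.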

See also https://github.com/llvm/llvm-project/issues/148603

PR: https://github.com/llvm/llvm-project/pull/149042
2025-11-12 15:11:00 +00:00
Luke Lau
97d4e96cc5
[VPlan] Perform optimizeMaskToEVL in terms of pattern matching (#155394)
Currently in optimizeMaskToEVL we convert every widened load, store or
reduction to a VP predicated recipe with EVL, regardless of whether or
not it uses the header mask.

As a result, we have to be careful when working on other parts of VPlan
to make sure that the EVL transform doesn't break or transform something
incorrectly, because it's not a semantics-preserving transform.
Forgetting to do so has caused miscompiles before, like the case that
was fixed in #113667.

This PR rewrites it to work in terms of pattern matching, so it now only
converts a recipe to a VP predicated recipe if it is exactly masked with
the header mask.

After this the transform should be a true optimisation and not change
any semantics, so it shouldn't miscompile things if other parts of VPlan
change.

This fixes #152541, and allows us to move addExplicitVectorLength into
tryToBuildVPlanWithVPRecipes in #153144

It also splits out the load/store transforms into separate patterns for
reversed and non-reversed, which should make #146525 easier to implement
and reason about.
2025-11-03 16:53:18 +08:00
Florian Hahn
b9ce7656e9
[VPlan] Add VPInstruction to unpack vector values to scalars. (#155670)
Add a new Unpack VPInstruction (name to be improved) to explicitly
extract scalar values from vectors.

Test changes are movements of the extracts: they are now generated
together and also directly after the producer.

Depends on https://github.com/llvm/llvm-project/pull/155102 (included in
PR)

PR: https://github.com/llvm/llvm-project/pull/155670
2025-10-19 18:49:05 +00:00
Ramkumar Ramachandra
0a4702407b
[VPlan] Improve code around canConstantBeExtended (NFC) (#161652)
Follow up on 7c4f188 ([LV] Support multiplies by constants when forming
scaled reductions), introducing m_APInt, and improving code around
canConstantBeExtended: we change canConstantBeExtended to take an APInt.
2025-10-16 13:03:13 +01:00
Florian Hahn
4f23767852
[VPlan] Add m_FirstActiveLane matcher (NFC).
Add m_FirstActiveLane, to slightly simplify pattern matching in
preparation for https://github.com/llvm/llvm-project/pull/149042.
2025-10-15 18:55:26 +01:00
Florian Hahn
7f54fccc0e
[VPlan] Add ExtractLastLanePerPart, use in narrowToSingleScalar. (#163056)
When narrowing stores of a single-scalar, we currently use
ExtractLastElement, which extracts the last element across all parts.
This is not correct if the store's address is not uniform across all
parts. If it is only uniform-per-part, the last lane per part must be
extracted. Add a new ExtractLastLanePerPart opcode to handle this
correctly. Most transforms apply to both ExtractLastElement and
ExtractLastLanePerPart, with the only difference being their treatment
during unrolling.

Fixes https://github.com/llvm/llvm-project/issues/162498.

PR: https://github.com/llvm/llvm-project/pull/163056
2025-10-15 13:46:09 +01:00
Ramkumar Ramachandra
869c76dda3
[VPlan] Allow zero-operand m_BranchOn(Cond|Count) (NFC) (#162721)
2025-10-13 08:50:09 +01:00
Ramkumar Ramachandra
b716d35388
[VPlanPatternMatch] Introduce m_ConstantInt (#159558)
2025-09-21 13:27:46 +01:00
Ramkumar Ramachandra
f1ba44f50a
[VPlan] Strip dead code in cst live-in match (NFC) (#159589)
A live-in constant can never be of vector type.
2025-09-18 19:28:42 +01:00
Graham Hunter
6b99a7bbed
[LV] Provide utility routine to find uncounted exit recipes (#152530)
Splitting out just the recipe finding code from #148626 into a utility
function (along with the extra pattern matchers). Hopefully this makes
reviewing a bit easier.

Added a gtest, since this isn't actually used anywhere yet.
2025-09-18 15:45:23 +00:00