373 Commits

Author SHA1 Message Date
Florian Hahn
b822a32659
[VPlan] Fix crash when trying to narrow interleave group storing const.
Use dyn_cast_null to handle the case where an interleave groups stores a
constant in any of its lanes.
2025-06-29 21:29:12 +01:00
Florian Hahn
f5ed863176
Revert "[VPlan] Allow derived IVs and scalar-steps in narrowing interleave."
This reverts commit 2787759ef2e41b19f8bfde06fe9a26b25d1f5834.

This exposed a crash on some build bots. Revert to investigate.
2025-06-29 14:40:03 +01:00
Florian Hahn
2787759ef2
[VPlan] Allow derived IVs and scalar-steps in narrowing interleave.
Both VPDerivedIVRecipe and VPScalarIVSteps recipe should be supported in
narrowInterleaveGroups:
 * VPDerivedIVRecipe is based on the canonical IV and independent of VF,
 * VPScalarIVSteps takes the VF as operand, so it will be updated by
   narrowInterleaveGroup.
2025-06-29 13:18:51 +01:00
Florian Hahn
bdb299a67e
[VPlan] Simplify code in single scalar transform code (NFC).
Adjust code as suggested post-commit 3b7b95f78e2.

3b7b95f78e (r160997427)
2025-06-28 22:53:14 +01:00
Florian Hahn
3b7b95f78e
[VPlan] Support VPWidenSelectRecipe in narrowToSingleScalar.
VPWidenSelectRecipes are single scalars if all their operands are. Add
support for narrowing them to a single scalar VPReplicateRecipe.

This fixes a crash after
https://github.com/llvm/llvm-project/pull/142433 (aa24029319083) when
due to a replicate recipe not being converted to single-scalar being
hoisted to the vector preheader.
2025-06-27 15:42:42 +01:00
Florian Hahn
aa24029319
[VPlan] Unroll VPReplicateRecipe by VF. (#142433)
Explicitly unroll VPReplicateRecipes outside replicate regions by VF,
replacing them by VF single-scalar recipes. Extracts for operands are
added as needed and the scalar results are combined to a vector using a
new BuildVector VPInstruction.

It also adds a few folds to simplify unnecessary extracts/BuildVectors.

It also adds a BuildStructVector opcode for handling of calls that have
struct return types.

VPReplicateRecipe in replicate regions can will be unrolled as follow
up, turing non-single-scalar VPReplicateRecipes into 'abstract', i.e.
not executable.

PR: https://github.com/llvm/llvm-project/pull/142433
2025-06-26 11:19:09 +01:00
Florian Hahn
58b939abe5
[VPlan] Support matching constants in narrowInterleaveGroups.
Matching constants can trivially be broadcasted, allow them if the same
constant is used for all recipes in a bundle.
2025-06-22 08:45:40 +01:00
Florian Hahn
60d1276b0e
[VPlan] Pass operand index to canNarrowLoad. (NFC)
Explicitly pass the operand we are checking to canNarrowLoad. This
simplifies the check if the operands match across recipes and enables
future optimizations.
2025-06-21 15:41:26 +01:00
Florian Hahn
f8ffb4e7cd
[VPlan] Simplify ExtractLastElement(Broadcast(A)) -> A.
Remove trivial ExtractLastElement VPInstructions.
2025-06-20 21:08:14 +01:00
Luke Lau
521adc9fa2 [VPlan] Use createScalarZExtOrTrunc when expanding expandVPWidenIntOrFpInduction
Split off from #144666
2025-06-20 19:18:49 +01:00
Florian Hahn
e8be733a3c
[VPlan] Remove redundant ExtractLastElement from vector-to-scalar VPI.
Recipes that are vector-to-scalar are guaranteed to generate a scalar
value, so the extract is redundant after VPlan unrolling. Remove it.

This removes unneeded ExtractLastElement VPInstruction of reduction
result computations.
2025-06-20 12:45:20 +01:00
Philip Reames
53ea522d1b
[LV] Introduce and use VPBuilder::createScalarZExtOrTrunc [nfc] (#144946)
Reduce redundant code, make the flow slightly easier to read.
2025-06-19 14:12:14 -07:00
Florian Hahn
23b8f11b27
[VPlan] Remove redundant VPWidenRecipe constructors (NFC)
Since the removal of VPWidenEVLRecipe, the constructors taking a
VPDefOpcode are not needed any more. Remove them.
2025-06-18 20:59:16 +01:00
Luke Lau
9dd1c66e8f
[VPlan] Expand VPWidenIntOrFpInductionRecipe into separate recipes (#118638)
The motivation of this PR is to make #115274 easier to implement, and
should allow us to add EVL support by just passing EVL to the VF
operand.

The current difficulty with widening IVs with EVL is that
VPWidenIntOrFpInductionRecipe generates its own backedge value. Since
it's a VPHeaderPHIRecipe the VF operand must be in the preheader, which
means we can't use the EVL since it's defined in the loop body.

The gist in this PR is to take the approach in #114305 and expand
VPWidenIntOrFpInductionRecipe into several recipes for the initial
value, phi and backedge value just before execution. I.e. this example:

```
  vector.ph:
  Successor(s): vector loop

  <x1> vector loop: {
    vector.body:
      WIDEN-INDUCTION %i = phi %start, %step, %vf
      ...
      EMIT branch-on-count ...
    No successors
  }
```

gets expanded to:

``` 
vector.ph:
  ...
  vp<%induction.start> = ...
  vp<%induction.increment> = ...

Successor(s): vector loop

<x1> vector loop: {
  vector.body:
    ir<%i> = WIDEN-PHI vp<%induction.start>, vp<%vec.ind.next>
    ...
    vp<%vec.ind.next> = add ir<%i>, vp<%induction.increment>
    EMIT branch-on-count ...
  No successors
}
```

This allows us to a value defined in the loop in the backedge value, and
also means we can just reuse the existing backedge fixups in
VPlan::execute without having to specially handle it ourselves.

After this #115274 should just become a matter of setting the VF operand
to EVL (and building the increment step in the loop body, not the
preheader).
2025-06-17 18:24:07 +01:00
Florian Hahn
30b16ec341
[VPlan] Simplify trivial VPFirstOrderRecurrencePHI recipes.
VPFirstOrderRecurrencePHIRecipes where the incoming values are the same
can be simplified and removed.

Fixes https://github.com/llvm/llvm-project/issues/144212.

The new test is added together with other related tests from
first-order-recurrence.ll
2025-06-16 22:54:26 +01:00
Florian Hahn
577199f922
Reapply "[VPlan] Set branch weight metadata on middle term in VPlan (NFC) (#143035)"
This reverts commit 0604dc199c019b23746f4a54885ba0c75569cdae.

The recommitted version addresses post-commit comments and adjusts the
place the branch weights are added. It now runs before VPlans are optimized
for VF and UF, which may remove the vector loop region, causing a crash
trying to get the middle block after that. Test case added in
72f99b75afc12bb.

Original message:
Manage branch weights for the BranchOnCond in the middle block in VPlan.
This requires updating VPInstruction to inherit from VPIRMetadata, which
in general makes sense as there are a number of opcodes that could take
metadata.

There are other branches (part of the skeleton) that also need branch
weights adding.

PR: https://github.com/llvm/llvm-project/pull/143035
2025-06-14 17:20:46 +01:00
Florian Hahn
6108d50aed
[VPlan] Add ReductionStartVector VPInstruction. (#142290)
Add a new VPInstruction::ReductionStartVector opcode to create the start
values for wide reductions. This more accurately models the start value
creation in VPlan and simplifies VPReductionPHIRecipe::execute. Down the
line it also allows removing VPReductionPHIRecipe::RdxDesc.

PR: https://github.com/llvm/llvm-project/pull/142290
2025-06-09 20:59:12 +01:00
Florian Hahn
2eab83f618
[VPlan] Remove CanonicalIV when dissolving loop regions (NFC). (#142372)
Directly replace the canonical IV when we dissolve the containing
region. That ensures that it won't get removed before the region gets
removed, which would result in an invalid region.

This removes the current ordering constraint between
convertToConcreteRecipes and dissolving regions.

PR: https://github.com/llvm/llvm-project/pull/142372
2025-06-03 10:05:28 +01:00
Ramkumar Ramachandra
b8c4eea3d8
[VPlan] Simplify PredPHI LiveIn -> LiveIn (#142271)
5f39be5 ([VPlan] Use InstSimplifyFolder instead of TargetFolder) updated
simplifyRecipe to fold live-ins to Values that are not necessarily
Constant, but forgot to update the corresponding PredPHI folder, which
still folds PredPHI constant -> constant. Update it to fold PredPHI
LiveIn -> LiveIn.

Fixes #141968.
2025-06-02 14:56:35 +01:00
Florian Hahn
3b474bc510
[VPlan] Use VPSingleDef in simplifyRecipe (NFC).
All simplifications are applied to VPSingleDefRecipes. Check for them
early to skip unnecessary work and remove a number of getVPSingleValue
calls.
2025-06-01 15:32:02 +01:00
Florian Hahn
33bbce5e34
[VPlan] Get plan once in simplifyRecipe (NFC).
Also check once if the plan is unrolled at the end, to make it easier to
add more transforms that apply after unrolling.
2025-06-01 12:46:08 +01:00
Kazu Hirata
c0bf51e3ad [Vectorize] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:1865:17: error:
  unused variable 'Preds' [-Werror,-Wunused-variable]
2025-05-31 14:40:19 -07:00
Florian Hahn
0f00a96fed
[VPlan] Simplify branch on False in VPlan transform (NFC). (#140409)
Simplify branch on false, starting with the branch from the middle block
to the scalar preheader. Initially this helps simplifying the initial
VPlan construction.

Depends on https://github.com/llvm/llvm-project/pull/140405.

PR: https://github.com/llvm/llvm-project/pull/140409
2025-05-31 20:32:45 +01:00
Florian Hahn
78eafb14f7
[VPlan] Add getIndexFor(Predecessor|Successor) helpers (NFC).
Move code to get the index of a predecessor and successor to helpers in
VPBlockBase, to avoid duplication and enable future reuse.

Split off from https://github.com/llvm/llvm-project/pull/140409.
2025-05-31 12:53:05 +01:00
Florian Hahn
10bd4cd9cd
[VPlan] Remove ResumePhi opcode, use regular PHI instead (NFC). (#140405)
Use regular VPPhi instead of a separate opcode for resume phis. This
removes an unneeded specialized opcode and unifies the code
(verification, printing, updating when CFG is changed).

Depends on https://github.com/llvm/llvm-project/pull/140132.

PR: https://github.com/llvm/llvm-project/pull/140405
2025-05-30 12:50:08 +01:00
Ramkumar Ramachandra
5f39be5917
[VPlan] Use InstSimplifyFolder instead of TargetFolder (#141222)
For more powerful folding with operands that are not necessarily
all-constant, use InstSimplifyFolder instead of TargetFolder in
tryToConstantFold, and rename the function tryToFoldLiveIns.
2025-05-28 11:00:14 +02:00
Florian Hahn
d56deea1e4
[VPlan] Connect Entry to scalar preheader during initial construction. (#140132)
Update initial construction to connect the Plan's entry to the scalar
preheader during initial construction. This moves a small part of the
 skeleton creation out of ILV and will also enable replacing
 VPInstruction::ResumePhi with regular VPPhi recipes.

Resume phis need 2 incoming values to start with, the second being the
bypass value from the scalar ph (and used to replicate the incoming
value for other bypass blocks). Adding the extra edge ensures we
incoming values for resume phis match the incoming blocks.

PR: https://github.com/llvm/llvm-project/pull/140132
2025-05-27 16:07:56 +01:00
Kazu Hirata
89308de4b0
[llvm] Value-initialize values with *Map::try_emplace (NFC) (#141522)
try_emplace value-initializes values, so we do not need to pass
nullptr to try_emplace when the value types are raw pointers or
std::unique_ptr<T>.
2025-05-26 15:13:02 -07:00
Florian Hahn
c0506a11f4
[VPlan] Separate out logic to manage IR flags to VPIRFlags (NFC). (#140621)
This patch moves the logic to manage IR flags to a separate VPIRFlags
class. For now, VPRecipeWithIRFlags is the only class that inherits
VPIRFlags. The new class allows for simpler passing of flags when
constructing recipes, simplifying the constructors for various recipes
(VPInstruction in particular, which now just has 2 constructors, one
taking an extra VPIRFlags argument.

This mirrors the approach taken for VPIRMetadata and makes it easier to
extend in the future. The patch also adds a unified flagsValidForOpcode
to check if the flags in a VPIRFlags match the provided opcode.

PR: https://github.com/llvm/llvm-project/pull/140621
2025-05-25 11:13:11 +01:00
Florian Hahn
dcef154b5c
[VPlan] Replace VPRegionBlock with explicit CFG before execute (NFCI). (#117506)
Building on top of https://github.com/llvm/llvm-project/pull/114305,
replace VPRegionBlocks with explicit CFG before executing.

This brings the final VPlan closer to the IR that is generated and
helps to simplify codegen.

It will also enable further simplifications of phi handling during
execution and transformations that do not have to preserve the 
canonical IV required by loop regions. This for example could include
replacing the canonical IV with an EVL based phi while completely
removing the original canonical IV.

PR: https://github.com/llvm/llvm-project/pull/117506
2025-05-24 19:17:16 +01:00
Florian Hahn
bf15aadcbc
[VPlan] Don't try to narrow predicated VPReplicateRecipe.
We cannot convert predicated recipes to uniform ones at the moment.
This fixes a crash reported for https://github.com/llvm/llvm-project/pull/139150.
2025-05-21 22:13:55 +01:00
Ramkumar Ramachandra
cf1f116f78
[VPlan] Introduce constant folder in simplifyRecipe (#125365)
Introduce a VPlan-level constant folder in simplifyRecipe that tries to
fold a recipe to a constant using TargetFolder.
2025-05-20 14:16:01 +01:00
Florian Hahn
07c085af3e
[VPlan] Add narrowToSingleScalarRecipe transform. (#139150)
Add a new convertToUniformRecipes transform which uses VPlan-based
uniformity analysis to determine if wide recipes and replicate recipes
can be converted to uniform recipes.

There are a few places where we ad-hoc convert recipes to uniform
recipes, which this transform will eventually replace. There are a few
more generalizations required to do so which I plan to do as follow-ups.

By converting the recipes to uniform recipes, we effectively materialize
the information from the VPlan-based analysis.

Note that there is one regression at the moment in SystemZ/pr47665.ll
due to trivial constant folding opportunities in the input IR. This will
be fixed by VPlan-based constant folding
(https://github.com/llvm/llvm-project/pull/125365/)

PR: https://github.com/llvm/llvm-project/pull/139150
2025-05-18 09:32:27 +01:00
Florian Hahn
04fde85057
[VPlan] Rename isUniform(AfterVectorization) to isSingleScalar (NFC). (#140134)
Update the naming in VPReplicateRecipe and vputils to the more accurate
isSingleScalar, as the functions check for cases where only a single
scalar is needed, either because it produces the same value for all
lanes or has only their first lane used.

Discussed in https://github.com/llvm/llvm-project/pull/139150.

PR: https://github.com/llvm/llvm-project/pull/140134
2025-05-16 16:38:39 +01:00
Elvis Wang
664c937b43
[VPlan] Implement VPExtendedReduction, VPMulAccumulateReductionRecipe and corresponding vplan transformations. (#137746)
This patch introduce two new recipes.

* VPExtendedReductionRecipe
  - cast + reduction.

* VPMulAccumulateReductionRecipe
  - (cast) + mul + reduction.

This patch also implements the transformation that match following
patterns via vplan and converts to abstract recipes for better cost
estimation.

* VPExtendedReduction
  - reduce(cast(...))

* VPMulAccumulateReductionRecipe
  - reduce.add(mul(...))
  - reduce.add(mul(ext(...), ext(...))
  - reduce.add(ext(mul(ext(...), ext(...))))

The converted abstract recipes will be lower to the concrete recipes
(widen-cast + widen-mul + reduction) just before recipe execution.

Note that this patch still relies on legacy cost model the calculate the
cost for these patters.
Will enable vplan-based cost decision in #113903.

Split from #113903.
2025-05-16 10:25:38 +08:00
Florian Hahn
2f55123cbb
[VPlan] Handle early exit before forming regions. (NFC) (#138393)
Move early-exit handling up front to original VPlan construction, before
introducing early exits.

This builds on https://github.com/llvm/llvm-project/pull/137709, which
adds exiting edges to the original VPlan, instead of adding exit blocks
later.

This retains the exit conditions early, and means we can handle early
exits before forming regions, without the reliance on VPRecipeBuilder.

Once we retain all exits initially, handling early exits before region
construction ensures the regions are valid; otherwise we would leave
edges exiting the region from elsewhere than the latch.

Removing the reliance on VPRecipeBuilder removes the dependence on
mapping IR BBs to VPBBs and unblocks predication as VPlan transform:
https://github.com/llvm/llvm-project/pull/128420.

Depends on https://github.com/llvm/llvm-project/pull/137709 (included in
PR).

PR: https://github.com/llvm/llvm-project/pull/138393
2025-05-12 12:53:20 +01:00
Florian Hahn
e854c381c6
[VPlan] Manage noalias/alias_scope metadata in VPlan. (#136450)
Use VPIRMetadata added in
https://github.com/llvm/llvm-project/pull/135272
to also manage no-alias metadata added by versioning.

Note that this means we have to build the no-alias metadata up-front
once. If it is not used, it will be discarded automatically.

This also fixes a case where incorrect metadata was added to wide
loads/stores that got converted from an interleave group.

Compile-time impact is neutral:

https://llvm-compile-time-tracker.com/compare.php?from=38bf1af41c5425a552a53feb13c71d82873f1c18&to=2fd7844cfdf5ec0f1c2ce0b9b3ae0763245b6922&stat=instructions:u
2025-05-09 11:19:12 +01:00
Ramkumar Ramachandra
c4f723a7c3
[LV] Strip unmaintainable MinBWs assert (#136858)
tryToWiden attempts to replace an Instruction with a Constant from SCEV,
but forgets to erase the Instruction from the MinBWs map, leading to an
assert in VPlanTransforms::truncateToMinimalBitwidths. Going forward,
the assertion in truncateToMinimalBitwidths is unmaintainable, as LV
could simplify the expression at any point: fix the bug by stripping the
unmaintable assertion.

Fixes #125278.
2025-05-08 11:49:54 +01:00
Luke Lau
1484f82cbc
[VPlan] Add VPInstruction::StepVector and use it in VPWidenIntOrFpInductionRecipe (#129508)
Split off from #118638, this adds VPInstruction::StepVector, which
generates integer step vectors (0,1,2,...,VF). This is a step towards
eventually modelling all the separate parts of
VPWidenIntOrFpInductionRecipe in VPlan.

This is then used by VPWidenIntOrFpInductionRecipe, where we materialize
it just before unrolling so the operands stay in a fixed position.

The need for a separate operand in VPWidenIntOrFpInductionRecipe, as
well as the need to update it in
optimizeVectorInductionWidthForTCAndVFUF, should be removed with #118638
when everything is expanded in convertToConcreteRecipes.
2025-05-08 18:47:44 +08:00
Florian Hahn
7f4e36ebf6
[VPlan] Create PHI VPInstruction using VPBuilder (NFC).
Use builder to create scalar PHI VPInstructions.
2025-05-07 20:47:37 +01:00
Ramkumar Ramachandra
49842426f3
[LV] Fix MinBWs in WidenIntrinsic case (#137005)
There is a bug in the computation and handling of MinBWs in the case of
VPWidenIntrinsicRecipe: a crash is observed in
VPlanTransforms::truncateToMinimalBitwidths due to a mismatch between
the number of recipes processed and the number of entries in MinBWs. Fix
handling of calls in llvm::computeMinimumValueSizes, and handle
VPWidenIntrinsicRecipe in truncateToMinimalBitwidths, fixing the bug.

Fixes #87407.
2025-04-29 09:47:38 +01:00
Florian Hahn
043b04acff
Reapply "[VPlan] Fold NOT into predicate of wide compares." (#130347)
This reverts commit 8dd160f4767f971572eac065c8650d9202ff5bf9.

The recommit contains an adjustment to planContainsAdditionalSimplifications,
which considers changes to the original predicate for compares.

Original commit message:

Add simplification to fold negation into a compare, if the negation is
the only user of the compare. This removes a number of redundant
negations.

Alive2 Proofs for FPCMP test changes:  https://alive2.llvm.org/ce/z/WGDz9U

PR: https://github.com/llvm/llvm-project/pull/129430
2025-04-28 20:01:37 +01:00
Kazu Hirata
5cfd81b0cc
[llvm] Use range constructors of *Set (NFC) (#137552) 2025-04-27 15:59:57 -07:00
Florian Hahn
826f237cb4
[VPlan] Don't added separate vector latch block (NFC).
Simplify initial VPlan construction by not creating a separate
vector.latch block, which isn't needed and will get folded away later.
This has been suggested as independent clean-up multiple times.
2025-04-26 22:03:18 +01:00
Florian Hahn
df21288247
[VPlan] Replace ExtractFromEnd with Extract(Last|Penultimate)Element (NFC). (#137030)
ExtractFromEnd only has 2 uses, extracting the last and penultimate
elements. Replace it with 2 separate opcodes, removing the need to
materialize and handle a constant argument.

PR: https://github.com/llvm/llvm-project/pull/137030
2025-04-25 16:27:29 +01:00
Florian Hahn
7cce38beea
[VPlan] Remove dead SE argument from handleUncountableEarlyExit (NFC).
ScalarEvolution is not used by the function, remove the dead arg.
2025-04-24 19:59:05 +01:00
Florian Hahn
06d4876982
[VPlan] Replace checking IR loop with checking VPlan predecessors (NFC).
Update check to use VPEarlyExitBlock's predecessors, which removes a
dependence on underlying IR and is more in line with the comment below.
2025-04-24 12:29:34 +01:00
Florian Hahn
3fbbe9b8d0
[VPlan] Add exit phi operands during initial construction (NFC). (#136455)
Add incoming exit phi operands during the initial VPlan construction.
This ensures all users are added to the initial VPlan and is also needed
in preparation to retaining exiting edges during initial construction.

PR: https://github.com/llvm/llvm-project/pull/136455
2025-04-23 20:40:42 +01:00
Florian Hahn
8c83355d5b
[VPlan] Handle VPIRPhi in VPRecipeBase::isPhi (NFC).
Also handle VPIRPhi in VPRecipeBase::isPhi, to simplify existing code
dealing with VPIRPhis.

Suggested as part of https://github.com/llvm/llvm-project/pull/136455.
2025-04-21 21:04:20 +01:00
Florian Hahn
5739a22fbb
[VPlan] Also duplicated scalar-steps when it enables sinking scalars. (#136021)
Extend sinking logic to duplicate scalar steps recipe if it enables
sinking, that is if all users in a destination block require all lanes.

This should be the last step before removing legacy sinkScalarOperands.

PR: https://github.com/llvm/llvm-project/pull/136021
2025-04-21 18:36:43 +01:00