25 Commits

Author SHA1 Message Date
Florian Hahn
eb0c7e752f
[VPlan] Replace BranchOnCount with Compare + BranchOnCond (NFC). (#172181)
Expand BranchOnCount to BranchOnCond + ICmp in convertToConcreteRecipes
to simplify codegen.

PR: https://github.com/llvm/llvm-project/pull/172181
2025-12-16 19:19:31 +00:00
Florian Hahn
b8eaceb39b
[VPlan] Explicitly replicate VPInstructions by VF. (#155102)
Extend replicateByVF added in #142433 (aa240293190) to also explicitly
unroll replicating VPInstructions.

Now the only remaining case where we replicate for all lanes is
VPReplicateRecipes in replicate regions.

PR: https://github.com/llvm/llvm-project/pull/155102
2025-09-12 17:06:26 +01:00
Florian Hahn
5faed1ad84
[VPlan] Add VPlan-based addMinIterCheck, replace ILV for non-epilogue. (#153643)
This patch adds a new VPlan-based addMinimumIterationCheck, which
replaced the ILV version for the non-epilogue case.

The VPlan-based version constructs a SCEV expression to compute the
minimum iterations, use that to check if the check is known true or
false. Otherwise it creates a VPExpandSCEV recipe and emits a
compare-and-branch.

When using epilogue vectorization, we still need to create the minimum
trip-count-check during the legacy skeleton creation. The patch moves
the definitions out of ILV.

PR: https://github.com/llvm/llvm-project/pull/153643
2025-08-26 15:52:31 +01:00
Florian Hahn
424258947e
[VPlan] Materialize VF and VFxUF using VPInstructions. (#152879)
Materialize VF and VFxUF computation using VPInstruction
instead of directly creating IR.

This is one of the last few steps needed to model the full vector
skeleton in VPlan.

This is mostly NFC, although in some cases we remove some unused
computations.

PR: https://github.com/llvm/llvm-project/pull/152879
2025-08-12 14:13:13 +01:00
Florian Hahn
82d633e9ff
[VPlan] Materialize vector trip count using VPInstructions. (#151925)
Materialize the vector trip count computation using VPInstruction
instead of directly creating IR. This is one of the last few steps
needed to model the full vector skeleton in VPlan. It also simplifies
vector-trip count computations for scalable vectors, as we can re-use
the UF x VF computation.

PR: https://github.com/llvm/llvm-project/pull/151925
2025-08-08 11:44:32 +01:00
Florian Hahn
89ae085859
[VPlan] Remove VPVectorPointer for part 0 after unrolling. (#149735)
VPVectorPointer for part 0 is just the pointer operand. Simplify it
after unrolling. This removes a large number of redundant GEPs with
index 0.

PR: https://github.com/llvm/llvm-project/pull/149735
2025-07-27 13:53:26 +01:00
Florian Hahn
24bd4e59b9
[VPlan] Use regular phi printing for resume phis.
As discussed in https://github.com/llvm/llvm-project/pull/140405, remove
custom printing for resume-phis and update tests.
2025-06-07 21:49:54 +01:00
Florian Hahn
9ea4924720
[VPlan] Use EMIT-SCALAR for single-scalar VPPhis (NFC).
Follow-up to https://github.com/llvm/llvm-project/pull/141428, to also
use EMIT-SCALAR for VPPhis that are single scalars.
2025-05-29 11:20:07 +01:00
Florian Hahn
5b85e4b08d
[VPlan] Use EMIT-SCALAR when printing single-scalar VPInstructions. (#141428)
By using SINGLE-SCALAR when printing, it is clear in the debug output
that those VPInstructions only produce a single scalar.

Split off in preparation for
https://github.com/llvm/llvm-project/pull/140623.

PR: https://github.com/llvm/llvm-project/pull/141428
2025-05-29 09:29:06 +01:00
Florian Hahn
dcef154b5c
[VPlan] Replace VPRegionBlock with explicit CFG before execute (NFCI). (#117506)
Building on top of https://github.com/llvm/llvm-project/pull/114305,
replace VPRegionBlocks with explicit CFG before executing.

This brings the final VPlan closer to the IR that is generated and
helps to simplify codegen.

It will also enable further simplifications of phi handling during
execution and transformations that do not have to preserve the 
canonical IV required by loop regions. This for example could include
replacing the canonical IV with an EVL based phi while completely
removing the original canonical IV.

PR: https://github.com/llvm/llvm-project/pull/117506
2025-05-24 19:17:16 +01:00
Florian Hahn
5fa64d65e9
[VPlan] Use printPhiOperands for VPPhi.
Split off from  https://github.com/llvm/llvm-project/pull/139151 to land
printing improvements separately.

Updates printing of VPPhi operands to be consistent with
VPWidenPHIRecipe.
2025-05-10 12:49:29 +01:00
Florian Hahn
783a846507
[VPlan] Add VF as operand to VPScalarIVStepsRecipe.
Similarly to other recipes, update VPScalarIVStepsRecipe to also take
the runtime VF as argument. This removes some unnecessary runtime VF
computations for scalable vectors. It will also allow dropping the
UF == 1 restriction for narrowing interleave groups required in
577631f0a528.
2025-03-28 21:48:59 +00:00
Florian Hahn
c482b8faea
[VPlan] Only execute VPExpandSCEVRecipes once and remove them (NFC).
Instead of executing the whole entry VPIRBB twice, first only execute
the VPExpandSCEVRecipes and replace their uses with the expanded
VPValue, which will be a live-in. This allows removing special logic in
VPExpandSCEVRecipe to support executing twice and allows moving the
ExpandedSCEVs map out of VPTransformState.

It will also allow adding other recipes to the entry VPBB in the future.
2025-03-23 09:06:01 +00:00
Florian Hahn
02575f887b
[VPlan] Use VPInstruction for VPScalarPHIRecipe. (NFCI) (#129767)
Now that all phi nodes manage their incoming blocks through the
VPlan-predecessors, there should be no need for having a dedicate
recipe, it should be sufficient to allow PHI opcodes in VPInstruction.

Follow-ups will also migrate VPWidenPHIRecipe and possibly others,
building on top of https://github.com/llvm/llvm-project/pull/129388.

PR: https://github.com/llvm/llvm-project/pull/129767
2025-03-13 18:35:07 +00:00
Florian Hahn
e5f5517f91 [VPlan] Create IR basic block for middle.block in VPlan.
Create a IR BB directly for the middle.block, instead of creating the IR
BB during skeleton creation and then replacing the middle VPBB with a
VPIRBB.

This moves another part of skeleton creation to VPlan and simplififes
the code slightly by removing code to disconnect the middle block and
vector preheader + the corresponding DT update.

NFC modulo IR block naming and block creation order, which changes the
IR names for the blocks.
2025-02-15 21:54:16 +01:00
Florian Hahn
7f3428d3ed
[VPlan] Compute induction end values in VPlan. (#112145)
Use createDerivedIV to compute IV end values directly in VPlan, instead
of creating them up-front.

This allows updating IV users outside the loop as follow-up.

Depends on https://github.com/llvm/llvm-project/pull/110004 and
https://github.com/llvm/llvm-project/pull/109975.

PR: https://github.com/llvm/llvm-project/pull/112145
2024-12-29 19:05:08 +00:00
Florian Hahn
6c8f41d336
[VPlan] Hook IR blocks into VPlan during skeleton creation (NFC) (#114292)
As a first step to move towards modeling the full skeleton in VPlan,
start by wrapping IR blocks created during legacy skeleton creation in
VPIRBasicBlocks and hook them into the VPlan. This means the skeleton
CFG is represented in VPlan, just before execute. This allows moving
parts of skeleton creation into recipes in the VPBBs gradually.

Note that this allows retiring some manual DT updates, as this will be
handled automatically during VPlan execution.

PR: https://github.com/llvm/llvm-project/pull/114292
2024-12-12 15:58:16 +00:00
Florian Hahn
156da98683
[VPlan] Move printing final VPlan to ::execute (NFC).
This moves printing of the final VPlan to ::execute. This ensures the
final VPlan is printed, including recipes that get introduced by late,
lowering transforms and skeleton construction.

Split off from https://github.com/llvm/llvm-project/pull/114292, to
simplify the diff.
2024-12-07 09:39:10 +00:00
Florian Hahn
6797b0f0c0
[VPlan] Use RPOT for VPlan codegen and printing.
This split off changes for more complex CFGs in VPlan from both
    https://github.com/llvm/llvm-project/pull/114292
    https://github.com/llvm/llvm-project/pull/112138

This simplifies their respective diffs.
2024-12-06 21:49:00 +00:00
Florian Hahn
b021464d35
[VPlan] Introduce scalar loop header in plan, remove VPLiveOut. (#109975)
Update VPlan to include the scalar loop header. This allows retiring
VPLiveOut, as the remaining live-outs can now be handled by adding
operands to the wrapped phis in the scalar loop header.

Note that the current version only includes the scalar loop header, no
other loop blocks and also does not wrap it in a region block.

PR: https://github.com/llvm/llvm-project/pull/109975
2024-10-31 21:36:44 +01:00
Florian Hahn
8ec406757c
[VPlan] Implement unrolling as VPlan-to-VPlan transform. (#95842)
This patch implements explicit unrolling by UF  as VPlan transform. In
follow up patches this will allow simplifying VPTransform state (no need
to store unrolled parts) as well as recipe execution (no need to
generate code for multiple parts in an each recipe). It also allows for
more general optimziations (e.g. avoid generating code for recipes that
are uniform-across parts).

It also unifies the logic dealing with unrolled parts in a single place,
rather than spreading it out across multiple places (e.g. VPlan post
processing for header-phi recipes previously.)

In the initial implementation, a number of recipes still take the
unrolled part as additional, optional argument, if their execution
depends on the unrolled part.

The computation for start/step values for scalable inductions changed
slightly. Previously the step would be computed as scalar and then
splatted, now vscale gets splatted and multiplied by the step in a
vector mul.

This has been split off https://github.com/llvm/llvm-project/pull/94339
which also includes changes to simplify VPTransfomState and recipes'
::execute.

The current version mostly leaves existing ::execute untouched and
instead sets VPTransfomState::UF to 1.

A follow-up patch will clean up all references to VPTransformState::UF.

Another follow-up patch will simplify VPTransformState to only store a
single vector value per VPValue.

PR: https://github.com/llvm/llvm-project/pull/95842
2024-09-21 19:47:37 +01:00
Florian Hahn
db0603cb7b
[LV] Only OR unique edges when creating block-in masks.
This removes redundant ORs of matching masks.

Follow-up to f0df4fbd0c7b to reduce the number of redundant ORs for
masks.
2024-08-12 10:17:40 +01:00
Florian Hahn
55d7e59023
[VPlan] Replace hard-coded value number in test with pattern.
Make test more robust w.r.t. future changes.
2024-08-12 09:36:12 +01:00
Florian Hahn
f0df4fbd0c
[LV] Support generating masks for switch terminators. (#99808)
Update createEdgeMask to created masks where the terminator in Src is a
switch. We need to handle 2 separate cases:

1. Dst is not the default desintation. Dst is reached if any of the
cases with destination == Dst are taken. Join the conditions for each
case where destination == Dst using a logical OR.
2. Dst is the default destination. Dst is reached if none of the cases
with destination != Dst are taken. Join the conditions for each case
where the destination is != Dst using a logical OR and negate it.

Edge masks are created for every destination of cases and/or 
default when requesting a mask where the source is a switch.

Fixes https://github.com/llvm/llvm-project/issues/48188.

PR: https://github.com/llvm/llvm-project/pull/99808
2024-08-11 20:38:36 +02:00
Florian Hahn
855703537e
[LV] Add more tests with switches.
Extra tests for
https://github.com/llvm/llvm-project/pull/99808, including cost model
tests.
2024-08-01 19:30:48 +01:00