307 Commits

Author SHA1 Message Date
Florian Hahn
f66509bf52
[VPlan] Clarify comment for replaceVPBBWithIRVPBB and add assert (NFCI).
Follow-up to suggestion during
https://github.com/llvm/llvm-project/pull/100735.

More specifically
9a40ed0919 (diff-6d0b73adfa9f8465923d2225ab6674ddcdeab71666f7a73dfaec7fa1246b3a1f)
2024-09-14 21:51:19 +01:00
Florian Hahn
f0c5caa814
[VPlan] Add VPIRInstruction, use for exit block live-outs. (#100735)
Add a new VPIRInstruction recipe to wrap existing IR instructions not to
be modified during execution, execept for PHIs. For PHIs, a single
VPValue
operand is allowed, and it is used to add a new incoming value for the
single predecessor VPBB. Expect PHIs, VPIRInstructions cannot have any
operands.

Depends on https://github.com/llvm/llvm-project/pull/100658.

PR: https://github.com/llvm/llvm-project/pull/100735
2024-09-14 21:21:55 +01:00
David Sherwood
f3029b330a
[NFC][LoopVectorize] Avoid passing ScalarEvolution to VPlanTransforms::optimize (#108380)
Whilst trying to write some VPlan unit tests I realised
that we don't need to pass a ScalarEvolution object into
VPlanTransforms::optimize because the only thing we
actually need is a LLVMContext.
2024-09-13 12:09:00 +01:00
Florian Hahn
a794ee4559
[VPlan] Add VPValue for VF, use it for VPWidenIntOrFpInductionRecipe. (#95305)
Similar to VFxUF, also add a VF VPValue to VPlan and use it to get the
runtime VF in VPWidenIntOrFpInductionRecipe. Code for VF is only
generated if there are users of VF, to avoid unnecessary test changes.

PR: https://github.com/llvm/llvm-project/pull/95305
2024-09-10 10:41:35 +01:00
Florian Hahn
1a5a1e9781
[VPlan] Assert that VFxUF is always used.
Add assertion to ensure invariant discussed in
https://github.com/llvm/llvm-project/pull/95305.
2024-09-09 14:26:09 +01:00
Florian Hahn
96e1320a9a
[VPlan] Move properlyDominates to VPDominatorTree (NFCI).
This allows for easier re-use in additional places in the future. Also
move code to VPlanAnalysis.cpp
2024-08-28 13:58:12 +01:00
Ramkumar Ramachandra
71ede8d831
VPlan: factor out VPlanUtils into its own file (NFC) (#105857) 2024-08-28 13:54:41 +01:00
Florian Hahn
4e04286d61
[VPlan] Only use selectVectorizationFactor for cross-check (NFCI). (#103033)
Use getBestVF to select VF up-front and only use
selectVectorizationFactor to get the VF legacy VF to check the
vectorization decision matches the VPlan-based cost model.

PR: https://github.com/llvm/llvm-project/pull/103033
2024-08-21 13:09:01 +02:00
Florian Hahn
f2fcd9cb97
[VPlan] Rename getBestPlanFor -> getPlanFor (NFC).
As suggested in https://github.com/llvm/llvm-project/pull/103033, more
accurately rename to getPlanFor , as it simplify returns the VPlan for
VF, relying on the fact that there is a single VPlan for each VF at the
moment.
2024-08-19 13:05:19 +01:00
Florian Hahn
cd60d10a10
[VPlan] Move some LoopVectorizationPlanner helpers to VPlan.cpp (NFC).
Members not requiring access to LoopVectorizationLegality or
LoopVectorizationCostModel can safely be moved out of the very large
LoopVectorization.cpp and are more accurately placed in VPlan.cpp
2024-08-19 09:58:46 +01:00
Mel Chen
e3d9b01a36
[VPlan][NFC] Make VPValue pointer const. (#101334) 2024-08-02 09:34:25 +08:00
Florian Hahn
b8741cc185
[VPlan] Relax assertion retrieving a scalar from VPTransformState::get.
The current assertion VPTransformState::get when retrieving a single
scalar only does not account for cases where a def has multiple users,
some demanding all scalar lanes, some demanding only a single scalar.

For an example, see the modified test case. Relax the assertion by also
allowing requesting scalar lanes only when the Def doesn't have only its
first lane used.

Fixes https://github.com/llvm/llvm-project/issues/88849.
2024-07-19 11:33:57 +01:00
Florian Hahn
b841e2eca3
Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)"
This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92.

A number of crashes have been fixed by separate fixes, including
ttps://github.com/llvm/llvm-project/pull/96622. This version of the
PR also pre-computes the costs for branches (except the latch) instead
of computing their costs as part of costing of replicate regions, as
there may not be a direct correspondence between original branches and
number of replicate regions.

Original message:
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to
compute their own costs step-by-step.

It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.

The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.

Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.

PR: https://github.com/llvm/llvm-project/pull/92555
2024-07-10 14:22:21 +01:00
Florian Hahn
72937203dd
[VPlan] Create vector header and latch VPBBs in createInitialVPlan (NFC)
The empty header and latch blocks can be created together with the
vector loop region.

This is in preparation for splitting up the very large
tryToBuildVPlanWithVPRecipes into several distinct functions, as
suggested multiple times, including in
https://github.com/llvm/llvm-project/pull/94760
2024-07-09 12:41:12 +01:00
Florian Hahn
99d6c6d936
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block also requires
modeling the successor blocks. This is done using the recently
introduced VPIRBasicBlock.

Note that the middle.block is still created as part of the skeleton and
then patched in during VPlan execution. Unfortunately the skeleton needs
to create the middle.block early on, as it is also used for induction
resume value creation and is also needed to properly update the
dominator tree during skeleton creation.

After this patch lands, I plan to move induction resume value and phi
node creation in the scalar preheader to VPlan. Once that is done, we
should be able to create the middle.block in VPlan directly.

This is a re-worked version based on the earlier
https://reviews.llvm.org/D150398 and the main change is the use of
VPIRBasicBlock.

Depends on https://github.com/llvm/llvm-project/pull/92525

PR: https://github.com/llvm/llvm-project/pull/92651
2024-07-05 10:08:42 +01:00
Florian Hahn
ef1773ad57
[VPlan] Rewrite cloneSESE to use 2 depth-first passes (NFCI).
Rewrite cloneSESE to perform 2 depth-first passes with the first one
cloning blocks and the second one updating the predecessors and
successors.

This is needed to preserve the correct predecessor/successor ordering
with https://github.com/llvm/llvm-project/pull/92651 and has been split
off as suggested.
2024-06-23 20:37:51 +01:00
Florian Hahn
31a94bd783
[VPlan] Rename Preheader -> Entry in createInitialVPlan (NFCI).
Split off from https://github.com/llvm/llvm-project/pull/92651 as
suggested.
2024-06-23 20:28:15 +01:00
Florian Hahn
f1f3c34b47
Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)""
This reverts commit 242cc200ccb24e22eaf54aed7b0b0c84cfc54c0b and
eea150c84053035163f307b46549a2997a343ce9, as it is causing a build bot
failure and there have been a number of crashes reported at
https://github.com/llvm/llvm-project/pull/92555
2024-06-21 19:54:21 +01:00
Florian Hahn
eea150c840
[VPlan] Include IV phi and backedge cost in VPlan cost computation.
In WebAssembly, costs != 0 are assigned to be backedge and induction
phis, so make sure we include those costs in the VPlan-based cost model.

This fixes a downstream crash with WebAssembly after 242cc200ccb
(https://github.com/llvm/llvm-project/pull/92555)
2024-06-20 20:44:17 +01:00
Florian Hahn
242cc200cc
Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)"
This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92.

Extra tests for crashes discovered when building Chromium have been
added in fb86cb7ec157689e, 3be7312f81ad2.

Original message:
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to
compute their own costs step-by-step.

It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.

The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.

Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.

PR: https://github.com/llvm/llvm-project/pull/92555
2024-06-20 17:32:52 +01:00
Florian Hahn
3808ba78de
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle block
is only created after skeleton creation. Initially a regular
VPBasicBlock is created, which will later be replaced by a
VPIRBasicBlock once the middle IR basic block has been created.

Note that this slightly changes the order of instructions created in the
middle block; code generated by recipe execution in the middle block
will now be inserted before the terminator (and in between the compare
to used by the terminator). The original order will be restored in
https://github.com/llvm/llvm-project/pull/92651.


PR: https://github.com/llvm/llvm-project/pull/95816
2024-06-20 13:42:20 +01:00
Florian Hahn
c008647b3a
[VPlan] Introduce isHeaderMask helper (NFCI).
Split off from https://github.com/llvm/llvm-project/pull/92555
and slightly generalized to more precisely check for a header mask.

Use it to replace manual checks in collectHeaderMasks.
2024-06-19 20:17:01 +01:00
Florian Hahn
40a72f8cc4
[VPlan] Support extracting any lane of uniform value.
If the value we are extracting a lane from is uniform, only the first
lane will be set. Return lane 0 for any requested lane.

This fixes a crash when trying to extract the last lane for a
first-order recurrence resume value.

Fixes https://github.com/llvm/llvm-project/issues/95520.
2024-06-14 22:16:52 +01:00
Arthur Eubanks
6f538f6a2d Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)""
This reverts commit 90fd99c0795711e1cf762a02b29b0a702f86a264.
This reverts commit 43e6f46936e177e47de6627a74b047ba27561b44.

Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.
2024-06-14 17:47:08 +00:00
Florian Hahn
90fd99c079
Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)"
This reverts commit 46080abe9b136821eda2a1a27d8a13ceac349f8c.

Extra tests have been added in 52d29eb287.

Original message:
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to
compute their own costs step-by-step.

It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.

The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.

Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.

PR: https://github.com/llvm/llvm-project/pull/92555
2024-06-14 12:33:48 +01:00
Arthur Eubanks
46080abe9b Revert "[VPlan] First step towards VPlan cost modeling. (#92555)"
This reverts commit 00798354c553d48d27006a2b06a904bd6013e31b.

Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.
2024-06-13 16:37:21 +00:00
Florian Hahn
00798354c5
[VPlan] First step towards VPlan cost modeling. (#92555)
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to 
compute their own costs step-by-step.

It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.

The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.

Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and 
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.


PR: https://github.com/llvm/llvm-project/pull/92555
2024-06-13 14:26:18 +01:00
Florian Hahn
5785048321
[VPlan] Add VPIRBasicBlock, use to model pre-preheader. (#93398)
This patch adds a new special type of VPBasicBlock that wraps an
existing IR basic block. Recipes of the block get added before the
terminator of the wrapped IR basic block. Making it a subclass of
VPBasicBlock avoids duplicating various APIs to manage recipes in a
block, as well as makes sure the traversals filtering VPBasicBlocks
automatically apply as well.

Initially VPIRBasicBlock are only used for the pre-preheader (wrapping
the original preheader of the scalar loop).

As follow-up, this will be used to move more parts of the skeleton
inside VPlan, starting with the branch and condition in the middle
block.

Separated out of https://github.com/llvm/llvm-project/pull/92651

PR: https://github.com/llvm/llvm-project/pull/93398
2024-05-30 11:23:32 -07:00
Florian Hahn
8b037862b6
[VPlan] Preserve DT (and SCEV) in VPlan-native path (#93287)
As a follow-up to b2f65e80, use the DTU to also update and preserve
the DT in the native path. This should also allow preserving SCEV in the
native path

PR: https://github.com/llvm/llvm-project/pull/93287
2024-05-27 17:03:53 -07:00
Florian Hahn
b2f65e809e
[VPlan] Use DomTreeUpdater to automatically update DT for vector loop. (#92525)
Use DTU to queue DominatorTree updates directly when connecting basic
blocks during VPlan execution.

This simplifies DT updates and will also automatically allow updating
the DT for the VPlan-native path as additional benefit in a follow-up.

PR: https://github.com/llvm/llvm-project/pull/92525
2024-05-24 10:10:12 +01:00
Florian Hahn
67d840b60f
[VPlan] Relax over-aggressive assertion in VPTransformState::get().
There are cases where a vector value has some users that demand the
the single scalar value only (NeedsScalar), while other users demand the
vector value (see attached test cases). In those cases, the NeedsScalar
users should only demand the first lane.

Fixes https://github.com/llvm/llvm-project/issues/91883.
2024-05-14 19:10:49 +01:00
Florian Hahn
6254b6dd89
[VPlan] Version VPValue names in VPSlotTracker. (#81411)
This patch restructures the way names for printing VPValues are handled.
It moves the logic to generate names for printing to VPSlotTracker.

VPSlotTracker will now version names of the same underlying value if it
is used by multiple VPValues, by adding a .V suffix to the name.

This fixes cases where at the moment the same name is printed for
different VPValues.

PR: https://github.com/llvm/llvm-project/pull/81411
2024-04-15 12:27:45 +01:00
Florian Hahn
a8ec1eb843
[VPlan] Dont assign slots to VPValues with an underlying value.
This makes sure the numbering for VPValues without underlying
values is consecutive.
2024-04-09 21:30:51 +01:00
Alexey Bataev
413a66f339
[LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172)
This patch introduces generating VP intrinsics in the Loop Vectorizer.

Currently the Loop Vectorizer supports vector predication in a very
limited capacity via tail-folding and masked load/store/gather/scatter
intrinsics. However, this does not let architectures with active vector
length predication support take advantage of their capabilities.
Architectures with general masked predication support also can only take
advantage of predication on memory operations. By having a way for the
Loop Vectorizer to generate Vector Predication intrinsics, which (will)
provide a target-independent way to model predicated vector
instructions. These architectures can make better use of their
predication capabilities.

Our first approach (implemented in this patch) builds on top of the
existing tail-folding mechanism in the LV (just adds a new tail-folding
mode using EVL), but instead of generating masked intrinsics for memory
operations it generates VP intrinsics for loads/stores instructions. The
patch adds a new VPlanTransforms to replace the wide header predicate
compare with EVL and updates codegen for load/stores to use VP
store/load with EVL.

Other important part of this approach is how the Explicit Vector Length
is computed. (VP intrinsics define this vector length parameter as
Explicit Vector Length (EVL)). We use an experimental intrinsic
`get_vector_length`, that can be lowered to architecture specific
instruction(s) to compute EVL.

Also, added a new recipe to emit instructions for computing EVL. Using
VPlan in this way will eventually help build and compare VPlans
corresponding to different strategies and alternatives.

Differential Revision: https://reviews.llvm.org/D99750
2024-04-04 18:30:17 -04:00
Florian Hahn
e5abd963c7
[VPlan] Remove VPTransformState::addMetadata with ArrayRef arg (NFCI).
addMeadata is only over called with a single element, clean up the
variant that takes multiple values.
2024-04-03 09:43:12 +01:00
Florian Hahn
8a614c1d31
[VPlan] Rename getVPValueOrAddLiveIn -> getOrAddLiveIn (NFCI).
The helper now only deals with live-ins, clarify the name.
2024-03-28 21:02:15 +00:00
Florian Hahn
06bb8c9f20
[VPlan] Explicitly handle scalar pointer inductions. (#83068)
Add a new PtrAdd opcode to VPInstruction that corresponds to
IRBuilder::CreatePtrAdd, which creates a GEP with source element type
i8.

This is then used to model scalarizing VPWidenPointerInductionRecipe by
introducing scalar-steps to model the index increment followed by a
PtrAdd.

Note that PtrAdd needs to be able to generate code for only the first
lane or for all lanes. This may warrant introducing a separate recipe
for scalarizing that can be created without relying on the underlying
IR.

Depends on https://github.com/llvm/llvm-project/pull/80271

PR: https://github.com/llvm/llvm-project/pull/83068
2024-03-26 16:01:57 +01:00
Florian Hahn
2435dcd83a
[VPlan] Add initial pattern match implementation for VPInstruction. (#80563)
Add an initial version of a pattern match for VPValues and recipes,
starting with VPInstruction.

PR: https://github.com/llvm/llvm-project/pull/80563
2024-03-03 21:48:58 +00:00
Florian Hahn
911055e34f
[VPlan] Consistently use (Part, 0) for first lane scalar values (#80271)
At the moment, some VPInstructions create only a single scalar value,
but use VPTransformatState's 'vector' storage for this value. Those
values are effectively uniform-per-VF (or in some cases
uniform-across-VF-and-UF). Using the vector/per-part storage doesn't
interact well with other recipes, that more accurately using (Part,
Lane) to look up scalar values and prevents VPInstructions creating
scalars from interacting with other recipes working with scalars.

This PR tries to unify handling of scalars by using (Part, 0) for scalar
values where only the first lane is demanded. This allows using
VPInstructions with other recipes like VPScalarCastRecipe and is also
needed when using VPInstructions in more cases otuside the vector loop
region to generate scalars.

Depends on https://github.com/llvm/llvm-project/pull/80269
2024-02-26 19:06:43 +00:00
Florian Hahn
85da9f80b8
[VPlan] Remove unused VPTransformState::VPValue2Value (NFCI).
Clean up unused member variable.
2024-02-25 12:14:44 +00:00
Florian Hahn
3d66d6932e
[VPlan] Support live-ins without underlying IR in type analysis. (#80723)
A VPlan contains multiple live-ins without underlying IR, like VFxUF or
VectorTripCount. Trying to infer the scalar type of those causes a crash
at the moment.

Update VPTypeAnalysis to take a VPlan in its constructor and assign
types to those live-ins up front. All those live-ins share the type of
the canonical IV.

PR: https://github.com/llvm/llvm-project/pull/80723
2024-02-21 19:37:15 +00:00
Florian Hahn
9923d29cfa
[VPlan] Merge main VPlan verifer with HCFG verifier.
Unify VPlan verifiers in verifyVPlanIsValid. This adds verification for
various properties on blocks to the verifier used for VPlans generated
by the inner loop vectorizer. It also adds def-use checks for the
verifier used in the VPlan native path.

This drops the separate flag to enable HCFG verification. Instead, all
VPlans are verified once they have been created, if assertions are
enabled.

This also removes VPWidenPHIRecipe from VPHeaderPHIRecipe; it is used to
model any phi node in the native path.
2024-02-20 16:43:57 +00:00
Florian Hahn
3444240540
[VPlan] Mark vputils::onlyFirstPartUsed arg as const (NFC)
Split off https://github.com/llvm/llvm-project/pull/80269 as
suggested.
2024-02-03 15:59:09 +00:00
Florian Hahn
6936479020
[VPlan] Mark vputils::onlyFirstLaneUsed arg as const (NFC)
Split off https://github.com/llvm/llvm-project/pull/80269 as suggested.
2024-02-03 15:56:40 +00:00
Florian Hahn
2906f3626b
[VPlan] Update ::onlyScalarsGenerated to take IsScalable bool (NFCI).
Instead of passing in a full VF, just pass IsScalable as bool.
2024-02-03 14:51:14 +00:00
Florian Hahn
1b37e8087e
[VPlan] use getVPValueOrAddLiveIn in VPlan::duplicate.
Instead of creating live-ins manually, use getOrAddLiveIn which
automatically takes care of adding them to VPLiveInsToFree. Also use it
to create the VPValue for the trip-count. This fixes a leak:
https://lab.llvm.org/buildbot/#/builders/168/builds/18308/steps/10/logs/stdio
2024-01-28 12:39:39 +00:00
Florian Hahn
ec402a2e53
[VPlan] Implement cloning of VPlans. (#73158)
This patch implements cloning for VPlans and recipes. Cloning is used in
the epilogue vectorization path, to clone the VPlan for the main vector
loop. This means we won't re-use a VPlan when executing the VPlan for
the epilogue vector loop, which in turn will enable us to perform
optimizations based on UF & VF.
2024-01-27 13:30:52 +00:00
Florian Hahn
731c2049a4
[VPlan] Relax IV user assertion after 0ab539f for epilogue vec.
After 0ab539fd6748adf2f638e10514dd9419597d8863, the canonical IV in the
epilogue vector loop may be used by a trunc. Relax the corresponding
assert.

This should fix some build-bot failures, including
    https://lab.llvm.org/buildbot/#/builders/187/builds/14113
    https://lab.llvm.org/buildbot/#/builders/98/builds/32350
    https://lab.llvm.org/buildbot/#/builders/239/builds/5473
2024-01-26 13:19:25 +00:00
Florian Hahn
3683852d49
[VPlan] Use replaceUsesWithIf in replaceAllUseswith and add comment (NFCI).
Follow-up to post-commit commens for b1bfe221e6.
2024-01-21 12:56:16 +00:00
Florian Hahn
241fe83704
[VPlan] Introduce ComputeReductionResult VPInstruction opcode. (#70253)
This patch introduces a new ComputeReductionResult opcode to compute the
final reduction result in the middle block. The code from fixReduction
has been moved to ComputeReductionResult, after some earlier cleanup
changes to model parts of fixReduction explicitly elsewhere as needed.

The recipe may be broken down further in the future.

Note that  the phi nodes to merge the reduction result from the trip 
count check and the middle block, to be used as resume value for the
scalar remainder loop are also generated based on 
ComputeReductionResult.

Once we have a VPValue for the reduction result, this can also be
modeled explicitly and moved out of the recipe.
2024-01-04 22:53:18 +00:00