A VPlan contains multiple live-ins without underlying IR, like VFxUF or
VectorTripCount. Trying to infer the scalar type of those causes a crash
at the moment.
Update VPTypeAnalysis to take a VPlan in its constructor and assign
types to those live-ins up front. All those live-ins share the type of
the canonical IV.
PR: https://github.com/llvm/llvm-project/pull/80723
Update truncateToMinimalBitwidths to handle truncating ICMPs. For ICMPs,
the new target type will be the same as the original type. In that case,
only truncate the operands, but skip the extend. This is in line with
what the original truncateToMinimalBitwidths did for compares.
Fixes https://github.com/llvm/llvm-project/issues/81415.
Move collectPoisonGeneratingFlags from InnerLoopVectorizer to
VPlanTransforms and also update its name. collectPoisonGeneratingFlags
already directly drops poison-generating flags, not only collecting it.
This means it is more appropriate to integerate it directly into the
VPlan transform pipeline.
The current implementation still calls back to legal to check if a block
needs predication, which should be improved in the future.
Update createScalarIVSteps to take an insert point as parameter. This
ensures that the inserted scalar steps are in the same order as the
recipes they replace (vs in reverse order as currently). This helps to
reduce the diff for follow-up changes.
Move simplification of VPBlendRecipes from early VPlan construction to
VPlan-to-VPlan based recipe simplification. This simplifies initial
construction.
Note that some in-loop reduction tests are failing at the moment, due to
the reduction predicate being created after the reduction recipe. I will
provide a patch for that soon.
PR: https://github.com/llvm/llvm-project/pull/76090
Add a new recipe to model scalar cast instructions, without relying on
an underlying instruction.
This allows creating scalar casts, without relying on an underlying
instruction (like the current VPReplicateRecipe). The new recipe is
used to explicitly model both truncating the induction step and the
VPDerivedIVRecipe, thus simplifying both the recipe and code
needed to introduce it.
Truncating VPWidenIntOrFpInductionRecipes should also be modeled using
the new recipe, as follow-up.
PR: https://github.com/llvm/llvm-project/pull/78113
Instead of using the debug location of the underlying instruction, use
the debug location from the recipe. This removes an unneeded dependency
of the underlying instruction.
This patch introduces a new common base class for recipes defining a
single result VPValue. This has been discussed/mentioned at various
previous reviews as potential follow-up and helps to replace various
getVPSingleValue calls.
PR: https://github.com/llvm/llvm-project/pull/77023
Compiler crashes when the assertion triggered for zext nneg instruction,
that checks that the instruction cannot produce poison. Changed the base
class for widencast recipe to handle dropping nneg flag to avoid
compiler crash.
MinBWs contains entries that specify the minimum required bitwidth. In
some cases, the old and new bitwidths can be equal (see test case) and
in those cases no truncations are needed, so skip those cases.
Fixes#74307.
In some cases MinBWs may contain entries for live-ins that are not used
by VPWidenRecipe or VPWidenSelectRecipes. In those cases, the live-ins
won't get processed, so make sure we include them in the count when used
as operands in VPWidenCast and VPWidenSelectRecipe.
Fixes https://github.com/llvm/llvm-project/issues/74231
This patch replaces the IR based truncateToMinimalBitwidths with a VPlan
version. This has 3 benefits:
1) the VPlan-based version is simpler; we don't need to implement
special codegen for each supported instruction type like the IR based
one.
2) Removes a dependency on the cost-model after VPlan execution and
3) Removes a use of getVPValue that uses underlying values after VPlan
execution (See removed FIXME).
Depends on D149081.
Depends on D149079.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D149903
Add simplification for redundant trunc(zext A) pairs. Generally apply a
transform from D149903.
Depends on D159200.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D159202
This patch updates the mask creation code to always create compares of
the form (ICMP_ULE, wide canonical IV, backedge-taken-count) up front
when tail folding and introduce active-lane-mask as later
transformation.
This effectively makes (ICMP_ULE, wide canonical IV, backedge-taken-count)
the canonical form for tail-folding early on. Introducing more specific
active-lane-mask recipes is treated as a VPlan-to-VPlan optimization.
This has the advantage of keeping the logic (and complexity) of
introducing active-lane-mask recipes in a single place, instead of
spreading the logic out across multiple functions. It also simplifies
initial VPlan construction and enables treating introducing EVL as
similar optimization.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D158779
Add first VPlan-based recipe simplification to fold (MUL A, 1) -> A.
Among other things, this enables additional simplifications after
applying versioned strides, as follow up to D147783.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D159200
Address post-commit simplification suggestion for 8a56179bcd8c: Replace
IsTruncated by conditionally setting TruncResultTy only if truncation
is required.
Replace ConditionalAssume set by treating conditional assumes like other
predicated instructions (i.e. create a VPReplicateRecipe with a mask)
and later remove any assume recipes with masks during VPlan cleanup.
This reduces coupling of VPlan construction and Legal by removing a
shared set between the 2 and results in a cleaner code structure
overall.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D157034
Split up tryToBuildVPlanWithVPRecipes into intial plan creation and
optimizations, by introducing a VPLanTransform::optimize helper.
Depends on D154640.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D154644
After D150027, all relevant recipes should model their IR flags
directly. Instead of removing the flags after codegen as part of
fixReductions, drop poison generating flags directly from the recipes.
Depends on D150027.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D150028
We started seeing new failure after D142886. Looks like it enabled new cases and we hit an assert:
assert(Current->getNumDefinedValues() == 1 &&
"only recipes with a single defined value expected");
When we do instruction sinking for the first order recurrence we hit an assert if instruction doesn't have single def. In case instruction doesn't produce any new def there is no new users and nothing to sink.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D151204
New that def-use chains are modeled directly in VPlan, we can simply use
the operands of the recipe we are replacing. There is no need to use the
operands of the underlying instruction to look up a VPValue.
To generate cast instructions, the result type is needed. To allow
creating widened casts without underlying instruction, introduce a new
VPWidenCastRecipe that also holds the result type.
This functionality will be used in a follow-up patch to
implement truncateToMinimalBitwidths as VPlan-to-VPlan transform.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D149081
This reverts the revert commit 3d8ed8b5192a59104bfbd5bf7ac84d035ee0a4a5.
The new version of the patch adds a set to avoid duplicating work in
isFixedOrderRecurrence, which was previously done through the removed
SinkAfter map.
Original commit message:
Building on D142885 and D142589, retire the SinkAfter map from the
recurrence handling code. It is replaced by checking whether it is
possible to sink all users of a recurrence directly in VPlan. This
results in simpler code overall and allows to handle additional cases
(see the improvements in @test_crash).
Depends on D142885.
Depends on D142589.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142886
This reverts the revert commit 8c2276f89887d0a27298a1bbbd2181fa54bbb509.
The updated patch re-orders the getDefiningRecipe check in getVPalue to avoid
a use-after-free.
Original commit message:
Before this patch, a VPlan contained 2 mappings for Values -> VPValue:
1) Value2VPValue and 2) VPExternalDefs.
This duplication is unnecessary and there are already cases where
external defs are added to Value2VPValue. This patch replaces all uses
of VPExternalDefs with Value2VPValue.
It clarifies the naming of getOrAddVPValue (to getOrAddExternalVPValue)
and addVPValue (to addExternalVPValue).
At the moment, this is NFC, but will enable additional simplifications
in D147783.
Depends on D147891.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147892
Asan detects heap-use-after-free, see D147892.
This reverts commit 4fc190351e5af901b6107d162d07e1fbca90934f.
This reverts commit 668045eb77628be13e448ffbb855473ffca1cc43.
Before this patch, a VPlan contained 2 mappings for Values -> VPValue:
1) Value2VPValue and 2) VPExternalDefs.
This duplication is unnecessary and there are already cases where
external defs are added to Value2VPValue. This patch replaces all uses
of VPExternalDefs with Value2VPValue.
It clarifies the naming of getOrAddVPValue (to getOrAddExternalVPValue)
and addVPValue (to addExternalVPValue).
At the moment, this is NFC, but will enable additional simplifications
in D147783.
Depends on D147891.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147892
Update the isCanonical() implementations to check the VPValue step
operand instead of the step in the induction descriptor.
At the moment this is NFC, but it enables further optimizations if the
step is replaced by a constant in D147783.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147891
Building on D142885 and D142589, retire the SinkAfter map from the
recurrence handling code. It is replaced by checking whether it is
possible to sink all users of a recurrence directly in VPlan. This
results in simpler code overall and allows to handle additional cases
(see the improvements in @test_crash).
Depends on D142885.
Depends on D142589.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142886
The function doesn't use anything from VPRecipeBuilder, so move the
definition to where it is actually used and turn it into a simple static
function.
It also makes the VPRecipeBuilder argument for createAndOptimizeReplicateRegions
unnecessary.