103 Commits

Author SHA1 Message Date
Florian Hahn
47abbf4fe9
[VPlan] Update VPInst::onlyFirstLaneUsed to check users. (#80269)
A VPInstruction only has its first lane used if all users use its first
lane only. Use vputils::onlyFirstLaneUsed to continue checking the
recipe's users to handle more cases.

Besides allowing additional introduction of scalar steps when
interleaving in some cases, this also enables using an Add VPInstruction
to model the increment - as a follow up.
2024-02-03 16:19:10 +00:00
Florian Hahn
2906f3626b
[VPlan] Update ::onlyScalarsGenerated to take IsScalable bool (NFCI).
Instead of passing in a full VF, just pass IsScalable as bool.
2024-02-03 14:51:14 +00:00
Graham Hunter
d4c0171423
[LV] Fix handling of interleaving linear args (#78725)
Currently when interleaving vector calls with linear arguments,
the Part is ignored and all vector calls use the initial value
from the first lane of the current iteration.

Fix this to extract from the correct part of the linear vector.
2024-01-26 11:30:35 +00:00
Florian Hahn
0ab539fd67
[VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (#78113)
Add a new recipe to model scalar cast instructions, without relying on
an underlying instruction.

This allows creating scalar casts, without relying on an underlying
instruction (like the current VPReplicateRecipe). The new recipe is 
used to explicitly model both truncating the induction step and the
VPDerivedIVRecipe, thus simplifying both the recipe and code
needed to introduce it.

Truncating VPWidenIntOrFpInductionRecipes should also be modeled using
the new recipe, as follow-up.

PR: https://github.com/llvm/llvm-project/pull/78113
2024-01-26 11:13:05 +00:00
Florian Hahn
42fb1fac9e
[VPlan] Use DebugLoc from recipe in VPWidenCallRecipe (NFCI).
Instead of using the debug location of the underlying instruction, use
the debug location from the recipe. This removes an unneeded dependency
of the underlying instruction.
2024-01-19 13:33:03 +00:00
Florian Hahn
abdb61f5fd
[VPlan] Introduce VPSingleDefRecipe. (#77023)
This patch introduces a new common base class for recipes defining a
single result VPValue. This has been discussed/mentioned at various
previous reviews as potential follow-up and helps to replace various
getVPSingleValue calls.

PR: https://github.com/llvm/llvm-project/pull/77023
2024-01-19 10:27:53 +00:00
Florian Hahn
6011d6b2cc
[VPlan] Use start value of reduction phi to determine type (NFCI).
Instead of accessing the underlying original IR value, check the type of
the start value from the recipe directly.
2024-01-16 14:39:51 +00:00
Florian Hahn
51afb10174
[LV] Create block in mask up-front if needed. (#76635)
At the moment, block and edge masks are created on demand, which means
that they are inserted at the point where they are demanded and then
cached. It is possible that the mask for a block is looked up later at a
point that's not dominated by the point where the mask has been
inserted.

To avoid this, create masks up front on entry to the corresponding basic
block and leave it to VPlan simplification to remove unneeded masks.

Note that we need to create masks for all blocks, if any of the blocks
in the loop needs predication, as computing the mask of a block depends
on the masks of its predecessor.

Needed for #76090.

https://github.com/llvm/llvm-project/pull/76635
2024-01-09 10:50:08 +00:00
Florian Hahn
18ec3304a9
[VPlan] Manage InBounds via VPRecipeWithIRFlags for VectorPtrRecipe.
As suggested as follow-up in
https://github.com/llvm/llvm-project/pull/72164, manage inbounds via
VPRecipeWithIRFlags.

Note that in some cases we can now preserve inbounds in a few more
cases.
2024-01-07 13:58:05 +00:00
Florian Hahn
3fb0d8dc80
Recommit "[VPlan] Mark Select VPInstructions as not having sideeffects."
With #70253 landed, selects for reduction results are explicitly used by
ComputeReductionResult and Selects can be marked as not having
side-effects again.

This reverts the revert commit 173032902c960d4d0d67b521d8c149553d8e8ba3.
2024-01-06 12:08:06 +00:00
Florian Hahn
241fe83704
[VPlan] Introduce ComputeReductionResult VPInstruction opcode. (#70253)
This patch introduces a new ComputeReductionResult opcode to compute the
final reduction result in the middle block. The code from fixReduction
has been moved to ComputeReductionResult, after some earlier cleanup
changes to model parts of fixReduction explicitly elsewhere as needed.

The recipe may be broken down further in the future.

Note that  the phi nodes to merge the reduction result from the trip 
count check and the middle block, to be used as resume value for the
scalar remainder loop are also generated based on 
ComputeReductionResult.

Once we have a VPValue for the reduction result, this can also be
modeled explicitly and moved out of the recipe.
2024-01-04 22:53:18 +00:00
Alexandros Lamprineas
e512df3ecc
[LV] Fix crash when vectorizing function calls with linear args. (#76274)
llvm/lib/IR/Type.cpp:694:
    Assertion `isValidElementType(ElementType) && "Element type of a
    VectorType must be an integer, floating point, or pointer type."'
    failed.
Stack dump:
    llvm::FixedVectorType::get(llvm::Type*, unsigned int)
    llvm::VPWidenCallRecipe::execute(llvm::VPTransformState&)
    llvm::VPBasicBlock::execute(llvm::VPTransformState*)
    llvm::VPRegionBlock::execute(llvm::VPTransformState*)
    llvm::VPlan::execute(llvm::VPTransformState*)
    ...

Happens with function calls of void return type.
2024-01-02 18:14:16 +00:00
Florian Hahn
f18536d642
[VPlan] Model address separately. (#72164)
Move vector pointer generation to a separate VPVectorPointerRecipe.
This untangles address computation from the memory recipes future
and is also needed to enable explicit unrolling in VPlan.

https://github.com/llvm/llvm-project/pull/72164
2024-01-01 19:51:15 +00:00
Shih-Po Hung
3d422a9859
[VPlan] Implement mayHaveSideEffects/mayWriteToMemory for VPInterleav… (#71360)
…eRecipe

This helps VPlanTransforms::removeDeadRecipes to work on
VPInterleaveRecipe
2023-12-15 00:23:14 +08:00
Florian Hahn
173032902c
Revert "[VPlan] Mark Select VPInstructions as not having sideeffects."
This reverts commit 19918ac34dc5d304ec6ad413ceae1d4394abe28f.

Fixes #75298. There is still a case where we miss the correct users
outside the main vector loop for reductions, and that is tail-folded
loops with reductions where the final value is stored after the loop.

This should be handled explicitly in #70253
2023-12-13 21:05:24 +00:00
Florian Hahn
19918ac34d
[VPlan] Mark Select VPInstructions as not having sideeffects.
Select VPInstructions don't have sideeffects, mark them accordingly.
2023-12-11 12:26:32 +00:00
Florian Hahn
a5891fa4d2
[VPlan] Initial modeling of VF * UF as VPValue. (#74761)
This patch starts initial modeling of VF * UF in VPlan.
Initially, introduce a dedicated VFxUF VPValue, which is then
populated during VPlan::prepareToExecute. Initially, the VF * UF
applies only to the main vector loop region. Once we extend the
scope of VPlan in the future, we may want to associate different VFxUFs
with different vector loop regions (e.g. the epilogue vector loop)

This allows explicitly parameterizing recipes that rely on the
VF * UF, like the canonical induction increment. At the moment, this
mainly helps to avoid generating some duplicated calls to vscale with
scalable vectors. It should also allow using EVL as induction increments
explicitly in D99750. Referring to VF * UF is also needed in other
places that we plan to migrate to VPlan, like the minimum trip count
check during skeleton creation.

The first version creates the value for VF * UF directly in
prepareToExecute to limit the scope of the patch. A follow-on patch will
model VF * UF computation explicitly in VPlan using recipes.

Moved from Phabricator (https://reviews.llvm.org/D157322)
2023-12-08 18:30:30 +00:00
Florian Hahn
5ea6a3fc6d
[VPlan] Compute scalable VF in preheader for induction increment. (#74762)
UF * VF is loop invariant and can be computed directly in the preheader.
This prepares the code for #74761 and reduces the test changes.
2023-12-08 12:18:31 +00:00
Florian Hahn
633fe60149
[VPlan] Print flags for VPWidenCastRecipe.
Update VPWidenCastRecipe to also print flags. Simplify nneg printing
test and replace hard-coded value number references with patterns.
2023-12-08 10:48:54 +00:00
Florian Hahn
bbd1941a38
[VPlan] Add disjoint flag to VPRecipeWithIRFlags. (#74364)
A new disjoint flag was added for OR instructions in #72583. 

Update VPRecipeWithIRFlags to also support the new flag. This
allows printing and preserving the disjoint flag in vectorized code.
2023-12-05 15:21:59 +00:00
Alexey Bataev
056367bb19
[LV]Support dropping of nneg flag for zext widencast recipes. (#74112)
Compiler crashes when the assertion triggered for zext nneg instruction,
that checks that the instruction cannot produce poison. Changed the base
class for widencast recipe to handle dropping nneg flag to avoid
compiler crash.
2023-12-05 09:17:23 -05:00
Florian Hahn
70535f5e60
[VPlan] Replace IR based truncateToMinimalBitwidths with VPlan version.
This patch replaces the IR based truncateToMinimalBitwidths with a VPlan
version. This has 3 benefits:
1) the VPlan-based version is simpler; we don't need to implement
   special codegen for each supported instruction type like the IR based
   one.
2) Removes a dependency on the cost-model after VPlan execution and
3) Removes a use of getVPValue that uses underlying values after VPlan
   execution (See removed FIXME).

Depends on D149081.

Depends on D149079.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D149903
2023-12-02 16:12:38 +00:00
Graham Hunter
4d64a2bcd3
[LV] Refactor vector function variant selection to prepare for uniform args (#68879)
Parameters marked as uniform take a scalar value, assuming the value is
invariant in the scalar loop. In order to support this, we need to stop
asking for a vector function variant with a default shape assuming that all
arguments will become vector arguments, and instead consider all available
variants and their parameter types.
2023-11-20 13:30:03 +00:00
Florian Hahn
b0b88643a1
[VPlan] Add initial anlysis to infer scalar type of VPValues. (#69013)
This patch adds initial type inferrence for VPValues. It infers the
scalar type of a VPValue, by bottom-up traversing through defining
recipes until root nodes with known types are reached (e.g. live-ins or
load recipes). The types are then propagated top down through
operations.

This is intended as building block for a VPlan-based cost model, which
will need access to type information for VPValues/recipes.

Initial testing is done by asserting the inferred type matches the type
of the result value generated for a widen and replicate recipes.
2023-10-27 14:38:28 +01:00
Florian Hahn
f7a8a78cb7
[VPlan] Also print operands of canonical IV (NFC).
Also print the operands of VPCanonicalIVPHIRecipe. That was missed
earlier.
2023-10-16 20:28:23 +01:00
Florian Hahn
d9f83169d1
[VPlan] Ensure start value of phis is the first op at construction (NFC)
Header phi recipes have the start value (incoming from outside the loop)
as first operand. This wasn't the case for VPWidenPHIRecipes. Instead
the start value was picked during execute() by doing extra work.

To be in line with other recipes, ensure the operand order is as
expected during construction.
2023-09-22 21:24:15 +01:00
Florian Hahn
f23246a0bb
[LV] Directly add fast-math flags to select recipe (NFC).
Now that VPInstruction can manage fast math flags via
VPRecipeWithIRFlags, use them directly to model the fast-math flags of
the select created for the final reduction value instead of adding them
late.
2023-09-21 11:05:55 +01:00
Florian Hahn
1d1cba44ea
[VPlan] Remove stray indent when printing scalar steps recipe.
VPScalarIVStepsRecipe will now be printed as
      vp<%6> = SCALAR-STEPS vp<%3>, ir<1>
instead of
      vp<%6>      = SCALAR-STEPS vp<%3>, ir<1>
2023-09-17 10:15:52 +01:00
Jeremy Morse
6942c64e81 [NFC][RemoveDIs] Prefer iterator-insertion over instructions
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.

At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152537
2023-09-11 11:48:45 +01:00
Florian Hahn
3e2d564c3d
[VPlan] Use VPRecipeWithFlags for VPScalarIVStepsRecipe (NFC).
This directly models the flags as part of the recipe, which allows
dropping them using the VPlan infrastructure when required.

It also allows removing the full reference to InductionDescriptor and
limit it to only the opcode.
2023-09-08 15:46:12 +01:00
Florian Hahn
785e7063b9
[VPlan] Don't rely on underlying instr in VPWidenRecipe (NFCI).
VPWidenRecipe only needs the opcode to widen, all other information
(flags, debug loc and operands) is already modeled directly via the
recipe.

This removes the remaining uses of the underlying instruction from
VPWidenRecipe::execute.
2023-09-06 16:27:09 +01:00
Florian Hahn
165e24aa2a
[VPlan] Move DebugLoc to VPRecipeBase (NFCI).
Add a dedicated debug location to VPRecipeBase to remove another
unneeded use of the underlying LLVM IR instruction and also consolidate
various DL fields in sub classes.

Each recipe can have debug location and it shouldn't rely on reference
to the underlying LLVM IR instructions to retain it. See various recipes
that had separate DL fields already.
2023-09-05 15:45:16 +01:00
Florian Hahn
168e23c741
[VPlan] Remove reference to Instr when setting debug loc. (NFCI)
This allows untangling references to underlying IR for various recipes.
2023-09-05 10:59:13 +01:00
Florian Hahn
3fa1b254b7
[VPlan] Print blend recipe as operand directly, instead of IR PHI.
Update VPBlendRecipe::print() to print the result directly, instead of
relying on the stored Phi pointer. This brings the recipe in line with
how other recipes are printed.
2023-09-04 12:35:58 +01:00
Florian Hahn
fd66195777
[VPlan] Manage compare predicates in VPRecipeWithIRFlags.
Extend VPRecipeWithIRFlags to also manage predicates for compares. This
allows removing the custom ICmpULE opcode from VPInstruction which was a
workaround for missing proper predicate handling.

This simplifies the code a bit while also allowing compares with any
predicates. It also fixes a case where the compare predixcate wasn't
printed properly for VPReplicateRecipes.

Discussed/split off from D150398.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D158992
2023-09-02 21:45:24 +01:00
Florian Hahn
32cb8f519e
[VPlan] Generalize variable names for ICmpULE operands (NFC)
ICmp codegen for VPInstructionD will be extended for other predicates,
and the operands could be any values (not just IV and TC as implied by
the names). Suggested cleanup from 150398.
2023-08-28 15:47:04 +01:00
Florian Hahn
34d25924c4
[VPlan] Mark some VPInstruction opcodes as not having side effects.
Mark some VPInstruction opcodes as not having side effects, preparation
for D157037.
2023-08-22 20:05:57 +01:00
Florian Hahn
56f5738d85
[LV] Move induction ::execute impls to VPlanRecipes.cpp (NFC).
All dependencies on code from LoopVectorize.cpp have been
removed/refactored. Move the ::execute implementations to other recipe
definitions in VPlanRecipes.cpp
2023-08-20 21:00:05 +01:00
Florian Hahn
ada2a455fc
[VPlan] Use VPBasicBlock to get incoming block for exit phi fixup (NFC)
Retrieve block via VPlan infrastructure as suggested as independent
cleanup in D150398.
2023-08-17 18:17:45 +01:00
Mel Chen
463e7cb892 [LV][VPlan] Refactor VPReductionRecipe to use reference for member RdxDesc
This commit refactors the implementation of VPReductionRecipe to use
reference instead of pointer for member RdxDesc. Because the member
RdxDesc in VPReductionRecipe should not be a nullptr, using a reference
will provide clearer semantics.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D158058
2023-08-16 19:37:49 -07:00
Florian Hahn
aacaf3d580
[VPlan] Simplify VPDerivedIV truncation handling (NFCI).
Address post-commit simplification suggestion for 8a56179bcd8c: Replace
IsTruncated by conditionally setting TruncResultTy only if truncation
is required.
2023-08-14 17:33:10 +01:00
Florian Hahn
8a56179bcd
[VPlan] Store induction kind & binop directly in VPDerviedIVRecipe(NFC)
Limit the information stored in VPDerivedIVRecipe to the ingredients
really needed.
2023-08-10 10:57:32 +01:00
Florian Hahn
698ae66092
[VPlan] Replace FMF in VPInstruction with VPRecipeWithIRFlags (NFC).
Update VPInstruction to use VPRecipeWithIRFlags to manage FMFs for
VPInstruction.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D157144
2023-08-08 20:13:11 +01:00
Florian Hahn
b6d994de0f
[VPlan] Address post-commit suggestions for af635a554 (NFC). 2023-08-08 12:59:34 +01:00
Florian Hahn
af635a5547
[VPlan] Model wrap flags directly, remove *NUW opcodes (NFC)
Model wrap flags directly using VPRecipeWithIRFlags and clean up the
duplicated *NUW opcodes.

D157144 will build on this and also model FMFs for VPInstruction.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D157194
2023-08-08 12:12:30 +01:00
Florian Hahn
93c5bae00e
[VPlan] Use printOperands for VPInstruction.
Use the printOperands for printing VPInstruction's operands to be more
in line with other recipes and ensure consistent printing after D15719.

Also removes some stray spaces in print output.
2023-08-08 11:31:21 +01:00
Florian Hahn
0b17e9d285
[VPlan] Move VPRecipeWithIRFlags::getFastMathFlags. (NFCI)
Split off suggested refactoring from D157144. Also adds a assert to make
sure this is only used when OpType is FPMathOp.
2023-08-07 12:35:53 +01:00
Mel Chen
425e9e81a0 [LV] Rename the Select[I|F]Cmp reduction pattern to [I|F]AnyOf. (NFC)
Regarding this NFC change, please refer to the discussion in this thread. https://reviews.llvm.org/D150851#4467261

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D155786
2023-08-03 00:37:19 -07:00
Florian Hahn
2265bb064b
[LV] Update generateInstruction to return produced value (NFC).
Update generateInstruction to return the produced value instead of
setting it for each opcode. This reduces the amount of duplicated code
and is a preparation for D153696.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D154240
2023-07-05 19:53:59 +01:00
Florian Hahn
0a246a0c72
[LV] Use VPValues when creating GEP with all invariant indices.
Update VPWidenGEPRecipe::execute to use the VPValue operands of the
recipe when creating the GEP instruction.

Fixes #63340.
2023-06-16 16:14:01 +01:00