366 Commits

Author SHA1 Message Date
Florian Hahn
5520ab3d50
[VPlan] Add ComputeAnyOfResult VPInstruction (NFC) (#141932)
Add a dedicated opcode for any-of reduction, similar to
https://github.com/llvm/llvm-project/pull/132689 and
https://github.com/llvm/llvm-project/pull/132690.

The patch also explictly adds the start value to not require
RecurrenceDescriptor during execute. It also allows freezing the start
value to make it poison-safe.

PR: https://github.com/llvm/llvm-project/pull/141932
2025-06-03 14:33:53 +01:00
Florian Hahn
0f00a96fed
[VPlan] Simplify branch on False in VPlan transform (NFC). (#140409)
Simplify branch on false, starting with the branch from the middle block
to the scalar preheader. Initially this helps simplifying the initial
VPlan construction.

Depends on https://github.com/llvm/llvm-project/pull/140405.

PR: https://github.com/llvm/llvm-project/pull/140409
2025-05-31 20:32:45 +01:00
Ramkumar Ramachandra
f057a593a7
[VPlan] Improve code in VPWidenCallRecipe (NFC) (#141926)
Use operands() instead of {op_begin(), op_end()}. Also rename
arg_operands to args to match CallBase.
2025-05-31 15:41:20 +02:00
Florian Hahn
e4ef651695
[VPlan] Simplify VPReductionPHIRecipe::execute (NFC).
Simplify VPReductionPHIRecipe::execute by handling the simple cases
first, by directly using State.get() to the appropriate start value.
2025-05-30 15:56:44 +01:00
Florian Hahn
10bd4cd9cd
[VPlan] Remove ResumePhi opcode, use regular PHI instead (NFC). (#140405)
Use regular VPPhi instead of a separate opcode for resume phis. This
removes an unneeded specialized opcode and unifies the code
(verification, printing, updating when CFG is changed).

Depends on https://github.com/llvm/llvm-project/pull/140132.

PR: https://github.com/llvm/llvm-project/pull/140405
2025-05-30 12:50:08 +01:00
Florian Hahn
9ea4924720
[VPlan] Use EMIT-SCALAR for single-scalar VPPhis (NFC).
Follow-up to https://github.com/llvm/llvm-project/pull/141428, to also
use EMIT-SCALAR for VPPhis that are single scalars.
2025-05-29 11:20:07 +01:00
Florian Hahn
5b85e4b08d
[VPlan] Use EMIT-SCALAR when printing single-scalar VPInstructions. (#141428)
By using SINGLE-SCALAR when printing, it is clear in the debug output
that those VPInstructions only produce a single scalar.

Split off in preparation for
https://github.com/llvm/llvm-project/pull/140623.

PR: https://github.com/llvm/llvm-project/pull/141428
2025-05-29 09:29:06 +01:00
Elvis Wang
332fe08f1d
[VPlan] Implement VPlan-based cost model for VPReduction, VPExtendedReduction and VPMulAccumulateReduction. (#113903)
This patch implement the VPlan-based cost model for VPReduction,
VPExtendedReduction and VPMulAccumulateReduction.

With this patch, we can calculate the reduction cost by the VPlan-based
cost model so remove the reduction costs in `precomputeCost()`.

Ref: Original instruction based implementation:
https://reviews.llvm.org/D93476
2025-05-29 11:15:16 +08:00
Florian Hahn
249301c779
[LoopUtils] Pass sentinel value directly to createFindLastIVRed (NFC).
Now that there is only a single FindLastIV recurrence kind, simply pass
the sentinel value instead of the full recurrence descriptor to tighten
the interface.
2025-05-28 22:00:11 +01:00
Florian Hahn
0d7b34bfc1
[LoopUtils] Pass start value directly to createAnyOfReduction (NFC).
Now that there is only a single AnyOf recurrence kind, simply pass the
start value instead of the full recurrence descriptor, to tighten the
interface.
2025-05-28 21:28:02 +01:00
Florian Hahn
440a8adb86
[VPlan] Use VPIRFlags to manage FMFs for ComputeReductionResult (NFC).
Manage fast-math flags using VPIRFlags from VPInstruciton, in inline
with other VPInstructions. With this change, we now print the correctly
flags for ComputeReductionResult, other than that NFC.
2025-05-28 20:54:58 +01:00
Ramkumar Ramachandra
a8edb6a548
[VPlan] Improve cast code in VPlanRecipes (NFC) (#141240) 2025-05-27 22:31:46 +02:00
Simon Pilgrim
63eb00483f VPlanRecipes.cpp - fix "not all control paths return a value" MSVC warning 2025-05-25 15:16:01 +01:00
Florian Hahn
c0506a11f4
[VPlan] Separate out logic to manage IR flags to VPIRFlags (NFC). (#140621)
This patch moves the logic to manage IR flags to a separate VPIRFlags
class. For now, VPRecipeWithIRFlags is the only class that inherits
VPIRFlags. The new class allows for simpler passing of flags when
constructing recipes, simplifying the constructors for various recipes
(VPInstruction in particular, which now just has 2 constructors, one
taking an extra VPIRFlags argument.

This mirrors the approach taken for VPIRMetadata and makes it easier to
extend in the future. The patch also adds a unified flagsValidForOpcode
to check if the flags in a VPIRFlags match the provided opcode.

PR: https://github.com/llvm/llvm-project/pull/140621
2025-05-25 11:13:11 +01:00
Florian Hahn
dcef154b5c
[VPlan] Replace VPRegionBlock with explicit CFG before execute (NFCI). (#117506)
Building on top of https://github.com/llvm/llvm-project/pull/114305,
replace VPRegionBlocks with explicit CFG before executing.

This brings the final VPlan closer to the IR that is generated and
helps to simplify codegen.

It will also enable further simplifications of phi handling during
execution and transformations that do not have to preserve the 
canonical IV required by loop regions. This for example could include
replacing the canonical IV with an EVL based phi while completely
removing the original canonical IV.

PR: https://github.com/llvm/llvm-project/pull/117506
2025-05-24 19:17:16 +01:00
Mel Chen
1b711b27d2
[VPlan] Clean up the function VPInstruction::generate for ComputeReductionResult, nfc (#140245)
When reducing unrolled parts, explicitly check for min/max reductions
using the function RecurrenceDescriptor::isMinMaxRecurrenceKind. Only if
the reduction is not min/max reduction, call
RecurrenceDescriptor::getOpcode() to handle other cases via CreateBinOp.

Based on https://github.com/llvm/llvm-project/pull/140242
Related to https://github.com/llvm/llvm-project/pull/118393
2025-05-19 17:31:23 +08:00
Mel Chen
f594cd0936
[IVDescriptor][LV] Return Instruction::Or for IAnyOf/FAnyOf in getOpcode(), nfc (#140242) 2025-05-19 16:17:04 +08:00
Florian Hahn
04fde85057
[VPlan] Rename isUniform(AfterVectorization) to isSingleScalar (NFC). (#140134)
Update the naming in VPReplicateRecipe and vputils to the more accurate
isSingleScalar, as the functions check for cases where only a single
scalar is needed, either because it produces the same value for all
lanes or has only their first lane used.

Discussed in https://github.com/llvm/llvm-project/pull/139150.

PR: https://github.com/llvm/llvm-project/pull/140134
2025-05-16 16:38:39 +01:00
Elvis Wang
664c937b43
[VPlan] Implement VPExtendedReduction, VPMulAccumulateReductionRecipe and corresponding vplan transformations. (#137746)
This patch introduce two new recipes.

* VPExtendedReductionRecipe
  - cast + reduction.

* VPMulAccumulateReductionRecipe
  - (cast) + mul + reduction.

This patch also implements the transformation that match following
patterns via vplan and converts to abstract recipes for better cost
estimation.

* VPExtendedReduction
  - reduce(cast(...))

* VPMulAccumulateReductionRecipe
  - reduce.add(mul(...))
  - reduce.add(mul(ext(...), ext(...))
  - reduce.add(ext(mul(ext(...), ext(...))))

The converted abstract recipes will be lower to the concrete recipes
(widen-cast + widen-mul + reduction) just before recipe execution.

Note that this patch still relies on legacy cost model the calculate the
cost for these patters.
Will enable vplan-based cost decision in #113903.

Split from #113903.
2025-05-16 10:25:38 +08:00
Florian Hahn
045fdda39d
[VPlan] Replace TTI::getOperandInfo with Ctx.getOperandInfo (NFC).
Update to use VPlan-based implementation of getOperandInfo, removing
uses of underlying IR references.
2025-05-12 22:44:54 +01:00
Florian Hahn
fb017a52e7
[VPlan] Use load/store opcode for VPWiden(Load|Store)EVLRecipe (NFC).
Removes unnecessary uses of Ingredient.
2025-05-12 21:30:18 +01:00
Florian Hahn
9a9a78eacb
[VPlan] Handle most bin-ops in VPReplicateRecipe::computeCost. (NFC)
Directly compute costs for binary ops and GEPs in
VPReplicateRecipe::computeCost. This simply ports the legacy cost
computation for uniform/replicating binary ops to the VPlan cost model.
2025-05-11 13:51:14 +01:00
Florian Hahn
5fa64d65e9
[VPlan] Use printPhiOperands for VPPhi.
Split off from  https://github.com/llvm/llvm-project/pull/139151 to land
printing improvements separately.

Updates printing of VPPhi operands to be consistent with
VPWidenPHIRecipe.
2025-05-10 12:49:29 +01:00
Florian Hahn
f2e62cfca5
[VPlan] Add VPPhi subclass for VPInstruction with PHI opcodes.(NFC) (#139151)
Similarly to VPInstructionWithType and VPIRPhi, add VPPhi as a subclass
for VPInstruction. This allows implementing the VPPhiAccessors trait,
making available helpers for generic printing of incoming values /
blocks and accessors for incoming blocks and values.

It will also allow properly verifying def-uses for values used by
VPInstructions with PHI opcodes via
https://github.com/llvm/llvm-project/pull/124838.

PR: https://github.com/llvm/llvm-project/pull/139151
2025-05-10 11:08:00 +01:00
Florian Hahn
e854c381c6
[VPlan] Manage noalias/alias_scope metadata in VPlan. (#136450)
Use VPIRMetadata added in
https://github.com/llvm/llvm-project/pull/135272
to also manage no-alias metadata added by versioning.

Note that this means we have to build the no-alias metadata up-front
once. If it is not used, it will be discarded automatically.

This also fixes a case where incorrect metadata was added to wide
loads/stores that got converted from an interleave group.

Compile-time impact is neutral:

https://llvm-compile-time-tracker.com/compare.php?from=38bf1af41c5425a552a53feb13c71d82873f1c18&to=2fd7844cfdf5ec0f1c2ce0b9b3ae0763245b6922&stat=instructions:u
2025-05-09 11:19:12 +01:00
Florian Hahn
d06d43a9e8
[VPlan] Add printPhiOperands to VPPhiAccessors, use for wide phis.
(NFC modulo debug output changes)

Add generic helper to print phi operands (incoming values) together with
their incoming blocks.

As more and more transforms are added, keeping the incoming blocks of
phis becomes more important. Print incoming blocks via VPPhiAcessors, to
make debugging easier.
2025-05-08 20:56:48 +01:00
Luke Lau
1484f82cbc
[VPlan] Add VPInstruction::StepVector and use it in VPWidenIntOrFpInductionRecipe (#129508)
Split off from #118638, this adds VPInstruction::StepVector, which
generates integer step vectors (0,1,2,...,VF). This is a step towards
eventually modelling all the separate parts of
VPWidenIntOrFpInductionRecipe in VPlan.

This is then used by VPWidenIntOrFpInductionRecipe, where we materialize
it just before unrolling so the operands stay in a fixed position.

The need for a separate operand in VPWidenIntOrFpInductionRecipe, as
well as the need to update it in
optimizeVectorInductionWidthForTCAndVFUF, should be removed with #118638
when everything is expanded in convertToConcreteRecipes.
2025-05-08 18:47:44 +08:00
Maryam Moghadas
a750893fea
[VPlan][LV] Fix invalid truncation in VPScalarIVStepsRecipe (#137832)
Replace CreateTrunc with CreateSExtOrTrunc in VPScalarIVStepsRecipe to
safely handle type conversion. This prevents assertion failures from
invalid truncation when StartIdx0 has a smaller integer type than
IntStepTy. The assertion was introduced by commit 783a846.
Fixes https://github.com/llvm/llvm-project/issues/137185
2025-05-06 12:48:21 -04:00
Florian Hahn
75532b21b1
[VPlan] Replace getPreheaderBBFor with getCFGPredecessor. (NFC)
Replace existing uses of getPreheaderBBFor with the newly added more
general getCFGPredecessor.
2025-05-05 21:47:19 +01:00
Florian Hahn
b807a2bc13
[VPlan] Move VPReplicateRecipe::execute to VPlanRecipes.cpp (NFC).
Consolidate ::execute implementation in VPlanRecipes.cpp, in line with
other ::execute implementations.
2025-05-04 22:02:01 +01:00
Florian Hahn
6e20519717
[VPlan] Add VPPhiAccessors to provide interface for phi recipes (NFC) (#129388)
Add a VPPhiAccessors class to provide interfaces to access incoming
values and blocks.

The first user is VPWidenPhiRecipe, with the other phi-like recipes
following soon.

This will also be used to verify def-use chains where users are phi-like
recipes, simplifying https://github.com/llvm/llvm-project/pull/124838.

PR: https://github.com/llvm/llvm-project/pull/129388
2025-05-04 13:47:42 +01:00
Kazu Hirata
6ab7cb7899
[Transforms] Remove unused local variables (NFC) (#138442) 2025-05-04 00:35:22 -07:00
Samuel Tebbs
fa769655e7 [LV] NFC: Make VPPartialReductionRecipe a VPReductionRecipe 2025-04-30 19:44:40 +01:00
Luke Lau
2cd829fc2c
[VectorUtils][VPlan] Consolidate VPWidenIntrinsicRecipe::onlyFirstLaneUsed and isVectorIntrinsicWithScalarOpAtArg (#137497)
We can reuse isVectorIntrinsicWithScalarOpAtArg in VectorUtils to
determine if only the first lane will be used for a
VPWidenIntrinsicRecipe, provided that we also move the VP EVL operand
check into it.

This was needed by a local patch I was working on that created a
VPWidenIntrinsicRecipe with a VP intrinsic, and prevents the need to
update the scalar arguments in two places.
2025-05-01 01:25:41 +08:00
Luke Lau
c5d780bb72
[VPlan] Remove no longer needed VP intrinsic handling in VPWidenIntrinsicRecipe::computeCost. NFCI (#137573)
Whenever calls were transformed to VP intrinsics with EVL tail folding
in #110412, this workaround was added in computeCost to avoid an
assertion when checking ICA.getArgs().

However it turned out that the actual arguments were never used and this
assertion was removed in #115983 afterwards, so it's now fine to leave
the arguments empty and use the type based cost instead.

The type based cost and value based cost are the same for these VP
intrinsics.

This was tested by adding back in the transformation code in #110412 and
checking that no assertions were still hit.
2025-04-29 21:55:20 +08:00
Florian Hahn
df21288247
[VPlan] Replace ExtractFromEnd with Extract(Last|Penultimate)Element (NFC). (#137030)
ExtractFromEnd only has 2 uses, extracting the last and penultimate
elements. Replace it with 2 separate opcodes, removing the need to
materialize and handle a constant argument.

PR: https://github.com/llvm/llvm-project/pull/137030
2025-04-25 16:27:29 +01:00
Florian Hahn
5d136f90a9
[VPlan] Manage instruction metadata in VPlan. (#135272)
Add a new helper to manage IR metadata that can be progated to generated
instructions for recipes.

This helps to remove a number of remaining uses of getUnderlyingInstr
during VPlan execution.

PR: https://github.com/llvm/llvm-project/pull/135272
2025-04-24 11:57:19 +01:00
Florian Hahn
71f2c1e204
[VPlan] Use early exit in ::extractLastLaneOfFirstOperand (NFC).
Reduce indent level, as suggested in
https://github.com/llvm/llvm-project/pull/136455.
2025-04-23 21:55:35 +01:00
Florian Hahn
3fbbe9b8d0
[VPlan] Add exit phi operands during initial construction (NFC). (#136455)
Add incoming exit phi operands during the initial VPlan construction.
This ensures all users are added to the initial VPlan and is also needed
in preparation to retaining exiting edges during initial construction.

PR: https://github.com/llvm/llvm-project/pull/136455
2025-04-23 20:40:42 +01:00
Florian Hahn
bf33f03f5a
[VPlan] Move Ingredient accesses closer to uses (NFC).
Move accessing Ingredient closer to its only use for
VPWidenMemoryRecipes.
2025-04-22 20:26:15 +01:00
David Sherwood
ef72b93626
[LV] Use requested calling convention for vector math routines (#136122)
Some vector math routines, e.g. ArmPL, specify a particular
calling convention on the routines which can help improve
performance by specifying what registers have to be preserved
across the call.
2025-04-22 09:33:52 +01:00
Florian Hahn
8c83355d5b
[VPlan] Handle VPIRPhi in VPRecipeBase::isPhi (NFC).
Also handle VPIRPhi in VPRecipeBase::isPhi, to simplify existing code
dealing with VPIRPhis.

Suggested as part of https://github.com/llvm/llvm-project/pull/136455.
2025-04-21 21:04:20 +01:00
Florian Hahn
3e5a9d9aa0
[VPlan] Rename setFlags -> applyFlags (NFC).
Update name to apply flags to instructions, as suggested in
https://github.com/llvm/llvm-project/pull/135272.

Also changes the arg to a reference.
2025-04-21 18:57:56 +01:00
David Green
e183459b8b
[CostModel] Make sure getCmpSelInstrCost is passed a CondTy (#135535)
It is already required along certain code paths that the CondTy is
valid. Fix some of the uses to make sure it is passed.
2025-04-21 05:33:30 +01:00
Florian Hahn
fc71b2c8db
[VPlan] Get opcode from recipe in VPWidenMemRecipe::computeCost (NFC).
Remove some uses of the underlying ingredient by getting the opcode
directly via the recipe ID.
2025-04-20 14:25:32 +01:00
Florian Hahn
366ff3a898
[VPlan] Get address space from inferred pointer type (NFC)
Remove a use of the underlying ingredient by getting the address space
from the inferred pointer type.
2025-04-20 10:38:30 +01:00
Florian Hahn
54b33eba16
[VPlan] Add opcode to create step for wide inductions. (#119284)
This patch adds a WideIVStep opcode that can be used to create a vector
with the steps to increment a wide induction. The opcode has 2 operands
* the vector step
* the scale of the vector step

The opcode is later converted into a sequence of recipes that convert
the scale and step to the target type, if needed, and then multiply
vector step by scale.

This simplifies code that needs to materialize step vectors, e.g.
replacing wide IVs as follow up to
https://github.com/llvm/llvm-project/pull/108378 with an increment of
the wide IV step.

PR: https://github.com/llvm/llvm-project/pull/119284
2025-04-14 23:20:44 +02:00
Florian Hahn
4fa3b2a184
[VPlan] Use TypeAnalysis instead of underlying instr in VPPredInst (NFC)
Removes another unnecessary use of the underlying instructions during
VPlan execution.
2025-04-12 20:54:27 +01:00
Florian Hahn
1331f17184
[VPlan] Replace getUnderlyingInstr() with type inference (NFC)
Remove an unnecessary use of getUnderlyingInstr().
2025-04-10 23:45:09 +01:00
Florian Hahn
6a9e8fc50c
[VPlan] Introduce VPInstructionWithType, use instead of VPScalarCast(NFC) (#129706)
There are some opcodes that currently require specialized recipes, due
to their result type not being implied by their operands, including
casts.

This leads to duplication from defining multiple full recipes.

This patch introduces a new VPInstructionWithType subclass that also
stores the result type. The general idea is to have opcodes needing to
specify a result type to use this general recipe. The current patch
replaces VPScalarCastRecipe with VInstructionWithType, a similar patch
for VPWidenCastRecipe will follow soon.

There are a few proposed opcodes that should also benefit, without the
need of workarounds:
* https://github.com/llvm/llvm-project/pull/129508
* https://github.com/llvm/llvm-project/pull/119284

PR: https://github.com/llvm/llvm-project/pull/129706
2025-04-10 22:30:40 +01:00