The motivation for this PR is to make #115274 easier to implement; it should allow us to add EVL support by simply passing EVL as the VF operand.
The current difficulty with widening IVs with EVL is that
VPWidenIntOrFpInductionRecipe generates its own backedge value. Since
it's a VPHeaderPHIRecipe, its VF operand must be defined in the preheader,
which means we can't use the EVL, since that is defined in the loop body.
The gist of this PR is to take the approach from #114305 and expand
VPWidenIntOrFpInductionRecipe into separate recipes for the initial
value, the phi and the backedge value just before execution. That is, this example:
```
vector.ph:
Successor(s): vector loop
<x1> vector loop: {
vector.body:
WIDEN-INDUCTION %i = phi %start, %step, %vf
...
EMIT branch-on-count ...
No successors
}
```
gets expanded to:
```
vector.ph:
...
vp<%induction.start> = ...
vp<%induction.increment> = ...
Successor(s): vector loop
<x1> vector loop: {
vector.body:
ir<%i> = WIDEN-PHI vp<%induction.start>, vp<%vec.ind.next>
...
vp<%vec.ind.next> = add ir<%i>, vp<%induction.increment>
EMIT branch-on-count ...
No successors
}
```
This allows us to use a value defined in the loop as the backedge value, and
also means we can reuse the existing backedge fixups in
VPlan::execute without having to handle it specially ourselves.
After this, #115274 should just become a matter of setting the VF operand
to EVL (and building the increment step in the loop body, not the
preheader).
This reverts commit 0604dc199c019b23746f4a54885ba0c75569cdae.
The recommitted version addresses post-commit comments and adjusts where
the branch weights are added. They are now added before VPlans are optimized
for VF and UF, since that optimization may remove the vector loop region, and
trying to get the middle block after that point caused a crash. A test case
was added in 72f99b75afc12bb.
Original message:
Manage branch weights for the BranchOnCond in the middle block in VPlan.
This requires updating VPInstruction to inherit from VPIRMetadata, which
in general makes sense as there are a number of opcodes that could take
metadata.
There are other branches (part of the skeleton) that also need branch
weights adding.
PR: https://github.com/llvm/llvm-project/pull/143035
Similar to modeling the start value as an operand, also model the sentinel
value explicitly as an operand. This makes all information required for
code-gen available directly in VPlan.
PR: https://github.com/llvm/llvm-project/pull/142291
There are many places in VPlan and LoopVectorize where we use
getKnownMinValue to discover the number of elements in a vector. Where
we expect the vector to have a fixed length, I have used the stronger
getFixedValue call. I believe this is clearer and adds extra protection
in the form of an assert in getFixedValue that the vector is not
scalable.
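As a minimal illustration of the difference (not code from the patch), getFixedValue asserts that the count is not scalable, while getKnownMinValue only reports the known minimum:
```
#include "llvm/Support/TypeSize.h"
using namespace llvm;

void elementCountExample() {
  ElementCount Fixed = ElementCount::getFixed(4);
  ElementCount Scalable = ElementCount::getScalable(4);
  (void)Fixed.getKnownMinValue();    // 4
  (void)Fixed.getFixedValue();       // 4, and asserts !isScalable()
  (void)Scalable.getKnownMinValue(); // 4 (actual count is 4 * vscale)
  // Scalable.getFixedValue();       // would trip the assertion
}
```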
While looking at VPFirstOrderRecurrencePHIRecipe::computeCost I also
took the liberty of simplifying the code.
In theory I believe this patch should be NFC, but I'm reluctant to add
that to the title in case we're just missing tests for some of the VPlan
changes. I built and ran the LLVM test suite when targeting neoverse-v1
and it seemed ok.
The use of VectorBuilder here was simply obscuring what was actually
going on. For vp.load and vp.store, the resulting code is significantly
more idiomatic. For the vp.reduce cases, we remove several layers of
indirection, including passing parameters via implicit state on the
builder. In both cases, the code is significantly easier to follow.
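For context, emitting a vp.load directly through IRBuilder, rather than through VectorBuilder's implicit mask/EVL state, looks roughly like this; a sketch only, with placeholder parameter names rather than the actual code from the patch:
```
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
using namespace llvm;

// Sketch: emit llvm.vp.load directly, passing the mask and EVL as explicit
// operands instead of threading them through VectorBuilder's builder state.
static CallInst *emitVPLoad(IRBuilderBase &Builder, Type *DataTy, Value *Ptr,
                            Value *Mask, Value *EVL) {
  return Builder.CreateIntrinsic(Intrinsic::vp_load,
                                 {DataTy, Ptr->getType()}, {Ptr, Mask, EVL});
}
```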
This caused assertion failures:
```
llvm/lib/Transforms/Vectorize/VPlan.h:4021:
llvm::VPBasicBlock* llvm::VPlan::getMiddleBlock():
Assertion `LoopRegion && "cannot call the function after vector loop region has been removed"' failed.
```
See comment on the PR.
> Manage branch weights for the BranchOnCond in the middle block in VPlan.
> This requires updating VPInstruction to inherit from VPIRMetadata, which
> in general makes sense as there are a number of opcodes that could take
> metadata.
>
> There are other branches (part of the skeleton) that also need branch
> weights adding.
>
> PR: https://github.com/llvm/llvm-project/pull/143035
This reverts commit db8d34db26e9ea92c08d6e813eca9cce40c48478.
Currently the loop vectorizer can only vectorize interleave groups with
power-of-2 factors at scalable VFs, by recursively emitting
[de]interleave2 intrinsics.
However after https://github.com/llvm/llvm-project/pull/124825 and
#139893, we now have [de]interleave intrinsics for all factors up to 8,
which is enough to support all types of segmented loads and stores on
RISC-V.
Now that the interleaved access pass has been taught to lower these in
#139373 and #141512, this patch teaches the loop vectorizer to emit
these intrinsics for factors up to 8, which enables scalable
vectorization for non-power-of-2 factors.
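As a rough sketch of what emitting these intrinsics means for a factor-4 load group (illustrative only; the intrinsic ID name and the surrounding code are assumptions, not the patch itself):
```
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
using namespace llvm;

// Sketch only: load a factor-4 interleaved group as one wide vector and
// split it with a single deinterleave call instead of a deinterleave2 tree.
// Intrinsic::vector_deinterleave4 is assumed to be the ID added by the
// PRs referenced above.
static Value *loadGroupMember0(IRBuilderBase &Builder, VectorType *WideVecTy,
                               Value *Ptr, Align Alignment) {
  Value *Wide = Builder.CreateAlignedLoad(WideVecTy, Ptr, Alignment);
  CallInst *DI = Builder.CreateIntrinsic(Intrinsic::vector_deinterleave4,
                                         {WideVecTy}, {Wide});
  // Member I of the group is result I of the returned struct.
  return Builder.CreateExtractValue(DI, 0);
}
```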
As far as I'm aware, no in-tree target will vectorize a scalable
interleave group above factor 8, because the maximum interleave factor is
capped at 4 on AArch64 and 8 on RISC-V, and the
`-max-interleave-group-factor` CLI option defaults to 8, so the
recursive [de]interleaving code has been removed for now.
Factors of 3 with scalable VFs are also turned off on AArch64 since
there's no lowering for [de]interleave3 just yet either.
Manage branch weights for the BranchOnCond in the middle block in VPlan.
This requires updating VPInstruction to inherit from VPIRMetadata, which
in general makes sense as there are a number of opcodes that could take
metadata.
There are other branches (part of the skeleton) that also need branch
weights adding.
PR: https://github.com/llvm/llvm-project/pull/143035
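For reference, at the IR level attaching branch weights amounts to setting !prof metadata on the branch; a minimal sketch, not the VPlan-side code, and the weight values are placeholders:
```
#include "llvm/IR/Instructions.h"
#include "llvm/IR/MDBuilder.h"
using namespace llvm;

// Sketch: attach !prof branch weights to a conditional branch (e.g. the
// middle block's BranchOnCond). The weight values here are placeholders.
static void addBranchWeights(BranchInst *MiddleBr) {
  MDBuilder MDB(MiddleBr->getContext());
  MDNode *Weights =
      MDB.createBranchWeights(/*TrueWeight=*/1, /*FalseWeight=*/3);
  MiddleBr->setMetadata(LLVMContext::MD_prof, Weights);
}
```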
Add a new VPInstruction::ReductionStartVector opcode to create the start
values for wide reductions. This more accurately models the start value
creation in VPlan and simplifies VPReductionPHIRecipe::execute. Down the
line it also allows removing VPReductionPHIRecipe::RdxDesc.
PR: https://github.com/llvm/llvm-project/pull/142290
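For intuition, this is a sketch of the value being created for an integer add reduction (not the new opcode's implementation): the identity splat with the scalar start value inserted into lane 0:
```
#include "llvm/IR/Constants.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Sketch: wide start value for an integer add reduction. Zero is the
// identity element, so only lane 0 carries the scalar start value.
static Value *buildAddReductionStart(IRBuilderBase &Builder, VectorType *VecTy,
                                     Value *ScalarStart) {
  Value *Identity = Constant::getNullValue(VecTy);
  return Builder.CreateInsertElement(Identity, ScalarStart, uint64_t(0),
                                     "rdx.start");
}
```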
This patch implements the VPlan-based cost model for VPReduction,
VPExtendedReduction and VPMulAccumulateReduction.
With this patch, we can calculate the reduction cost with the VPlan-based
cost model, so the reduction costs in `precomputeCost()` are removed.
Ref: Original instruction based implementation:
https://reviews.llvm.org/D93476
Now that there is only a single FindLastIV recurrence kind, simply pass
the sentinel value instead of the full recurrence descriptor to tighten
the interface.
Now that there is only a single AnyOf recurrence kind, simply pass the
start value instead of the full recurrence descriptor, to tighten the
interface.
Manage fast-math flags for VPInstruction using VPIRFlags, in line
with other VPInstructions. With this change, we now print the correct
flags for ComputeReductionResult; other than that, this is NFC.
This patch moves the logic to manage IR flags to a separate VPIRFlags
class. For now, VPRecipeWithIRFlags is the only class that inherits from
VPIRFlags. The new class allows for simpler passing of flags when
constructing recipes, simplifying the constructors for various recipes
(VPInstruction in particular, which now just has 2 constructors, one
taking an extra VPIRFlags argument).
This mirrors the approach taken for VPIRMetadata and makes it easier to
extend in the future. The patch also adds a unified flagsValidForOpcode
to check if the flags in a VPIRFlags match the provided opcode.
PR: https://github.com/llvm/llvm-project/pull/140621
Building on top of https://github.com/llvm/llvm-project/pull/114305,
replace VPRegionBlocks with explicit CFG before executing.
This brings the final VPlan closer to the IR that is generated and
helps to simplify codegen.
It will also enable further simplifications of phi handling during
execution and transformations that do not have to preserve the
canonical IV required by loop regions. This for example could include
replacing the canonical IV with an EVL based phi while completely
removing the original canonical IV.
PR: https://github.com/llvm/llvm-project/pull/117506
This patch introduces two new recipes:
* VPExtendedReductionRecipe
- cast + reduction.
* VPMulAccumulateReductionRecipe
- (cast) + mul + reduction.
This patch also implements a VPlan-based transformation that matches the
following patterns and converts them to the abstract recipes for better cost
estimation (a source-level example of these patterns is sketched after the list).
* VPExtendedReduction
- reduce(cast(...))
* VPMulAccumulateReductionRecipe
- reduce.add(mul(...))
- reduce.add(mul(ext(...), ext(...))
- reduce.add(ext(mul(ext(...), ext(...))))
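For example, a source loop whose reduction matches the multiply-accumulate patterns above is a classic dot product; this is just an illustration of the kind of input, not code from the patch:
```
#include <cstdint>

// Dot product: each i16 element is sign-extended, the products are summed
// into a 64-bit accumulator, matching reduce.add(mul(ext(...), ext(...))).
int64_t dot(const int16_t *A, const int16_t *B, int N) {
  int64_t Sum = 0;
  for (int I = 0; I < N; ++I)
    Sum += int64_t(A[I]) * int64_t(B[I]);
  return Sum;
}
```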
The converted abstract recipes will be lowered to the concrete recipes
(widen-cast + widen-mul + reduction) just before recipe execution.
Note that this patch still relies on the legacy cost model to calculate the
cost for these patterns.
Will enable vplan-based cost decision in #113903.
Split from #113903.
Directly compute costs for binary ops and GEPs in
VPReplicateRecipe::computeCost. This simply ports the legacy cost
computation for uniform/replicating binary ops to the VPlan cost model.
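Roughly, the ported computation queries the scalar cost from TTI and scales it by the number of replicated lanes; a sketch under assumed names and simplifications, not the exact computeCost code:
```
#include "llvm/Analysis/TargetTransformInfo.h"
using namespace llvm;

// Sketch: cost of a uniform/replicated scalar add at a fixed VF. The actual
// VPReplicateRecipe::computeCost also handles GEPs, operand info and
// scalable VFs.
static InstructionCost
replicatedAddCost(const TargetTransformInfo &TTI, Type *ScalarTy,
                  ElementCount VF, bool IsUniform,
                  TargetTransformInfo::TargetCostKind CostKind) {
  InstructionCost ScalarCost =
      TTI.getArithmeticInstrCost(Instruction::Add, ScalarTy, CostKind);
  return IsUniform ? ScalarCost : ScalarCost * VF.getFixedValue();
}
```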
Similarly to VPInstructionWithType and VPIRPhi, add VPPhi as a subclass
for VPInstruction. This allows implementing the VPPhiAccessors trait,
making available helpers for generic printing of incoming values /
blocks and accessors for incoming blocks and values.
It will also allow properly verifying def-uses for values used by
VPInstructions with PHI opcodes via
https://github.com/llvm/llvm-project/pull/124838.
PR: https://github.com/llvm/llvm-project/pull/139151
(NFC modulo debug output changes)
Add generic helper to print phi operands (incoming values) together with
their incoming blocks.
As more and more transforms are added, keeping track of the incoming blocks of
phis becomes more important. Print incoming blocks via VPPhiAccessors, to
make debugging easier.
Split off from #118638, this adds VPInstruction::StepVector, which
generates integer step vectors (0,1,2,...,VF). This is a step towards
eventually modelling all the separate parts of
VPWidenIntOrFpInductionRecipe in VPlan.
This is then used by VPWidenIntOrFpInductionRecipe, where we materialize
it just before unrolling so the operands stay in a fixed position.
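At the IR level, such a step vector can be produced with IRBuilder::CreateStepVector; the following is only a sketch of the kind of code the opcode lowers to, not the actual VPlan implementation:
```
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Sketch: materialize an i32 step vector <0, 1, 2, ...> of VF elements.
static Value *buildStepVector(IRBuilderBase &Builder, ElementCount VF) {
  Type *StepVecTy = VectorType::get(Builder.getInt32Ty(), VF);
  return Builder.CreateStepVector(StepVecTy, "step.vec");
}
```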
The need for a separate operand in VPWidenIntOrFpInductionRecipe, as
well as the need to update it in
optimizeVectorInductionWidthForTCAndVFUF, should be removed with #118638
when everything is expanded in convertToConcreteRecipes.
Replace CreateTrunc with CreateSExtOrTrunc in VPScalarIVStepsRecipe to
safely handle type conversion. This prevents assertion failures from
invalid truncation when StartIdx0 has a smaller integer type than
IntStepTy. The assertion was introduced by commit 783a846.
Fixes https://github.com/llvm/llvm-project/issues/137185
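The shape of the fix, using the names from the description above (illustrative, not the exact recipe code):
```
// Before the fix (asserted when StartIdx0 was narrower than IntStepTy):
//   Value *StartIdx = Builder.CreateTrunc(StartIdx0, IntStepTy);
// After the fix (extends when narrower, truncates when wider):
Value *StartIdx = Builder.CreateSExtOrTrunc(StartIdx0, IntStepTy);
```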
We can reuse isVectorIntrinsicWithScalarOpAtArg in VectorUtils to
determine if only the first lane will be used for a
VPWidenIntrinsicRecipe, provided that we also move the VP EVL operand
check into it.
This was needed by a local patch I was working on that created a
VPWidenIntrinsicRecipe with a VP intrinsic, and prevents the need to
update the scalar arguments in two places.
When calls were transformed to VP intrinsics with EVL tail folding
in #110412, this workaround was added in computeCost to avoid an
assertion when checking ICA.getArgs().
However, it turned out that the actual arguments were never used, and the
assertion was later removed in #115983, so it's now fine to leave
the arguments empty and use the type-based cost instead.
The type-based and value-based costs are the same for these VP
intrinsics.
This was tested by adding back in the transformation code in #110412 and
checking that no assertions were still hit.
ExtractFromEnd only has 2 uses, extracting the last and penultimate
elements. Replace it with 2 separate opcodes, removing the need to
materialize and handle a constant argument.
PR: https://github.com/llvm/llvm-project/pull/137030
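For context, extracting the last element of a possibly scalable vector at execution time looks roughly like this; a sketch only, independent of the new opcodes themselves:
```
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Sketch: extract the last element of a possibly scalable vector by indexing
// at (runtime element count - 1).
static Value *extractLastElement(IRBuilderBase &Builder, Value *Vec) {
  auto *VecTy = cast<VectorType>(Vec->getType());
  Value *NumElts = Builder.CreateElementCount(Builder.getInt64Ty(),
                                              VecTy->getElementCount());
  Value *LastIdx = Builder.CreateSub(NumElts, Builder.getInt64(1));
  return Builder.CreateExtractElement(Vec, LastIdx, "last.elt");
}
```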
Add a new helper to manage IR metadata that can be propagated to generated
instructions for recipes.
This helps to remove a number of remaining uses of getUnderlyingInstr
during VPlan execution.
PR: https://github.com/llvm/llvm-project/pull/135272