llvm-project

Author	SHA1	Message	Date
Shih-Po Hung	97fa3e5936	[NFC][VPlan] Rename VPEVLBasedIVPHIRecipe to VPCurrentIterationPHIRecipe (#177114 ) This is groundwork for #151300, which aims to support first-faulting loads in non-tail-folded early-exit loops. Per #175900, we need a variable-length stepping transform that can be shared between EVL and non-EVL loops. The idea is to have an EVL-independent counter and transform for tracking the cumulative number of processed elements. This patch renames the existing counter (VPEVLBasedIVPHIRecipe) and transform (canonicalizeEVLLoops) to be EVL-independent: - Rename VPEVLBasedIVPHIRecipe to VPCurrentIterationPHIRecipe to reflect its general purpose of tracking processed element count. - Rename canonicalizeEVLLoops to convertToVariableLengthStep. This is NFC.	2026-02-18 07:04:58 +00:00
Florian Hahn	ede1a9626b	[LV] Vectorize early exit loops with multiple exits. (#174864 ) Building on top of the recent changes to introduce BranchOnTwoConds, this patch adds support for vectorizing loops with multiple early exits, all dominating a countable latch. The early exits must form a dominance chain, so we can simply check which early exit has been taken in dominance order. Currently LoopVectorizationLegality ensures that all exits other than the latch must be uncountable. handleUncountableEarlyExits now collects those uncountable exits and processes each exit. In the vector region, we compute if any exit has been taken, by taking the OR of all early exit conditions (EarlyExitConds) and checking if there's any active lane. If the early exit is taken, we exit the loop and compute which early exit has been taken. The first taken early exit is the one where its exit condition is true in the first active lane of EarlyExitConds. We create a chain of dispatch blocks outside the loop to check this for the early exit blocks ordered by dominance. Depends on https://github.com/llvm/llvm-project/pull/174016. PR: https://github.com/llvm/llvm-project/pull/174864	2026-02-13 16:44:23 +00:00
Florian Hahn	a55fbab0cf	[VPlan] Run initial recipe simplification on VPlan0. (#176828 ) In some cases, LV gets simplifiable IR as input. Directly apply simplifications on the initial VPlan0 to avoid vectorization in cases where the loop body can be folded away. Using the end-to-end pipeline, this is relatively rare, but when reducing test cases, the reduction often ends up with cases with trivial folds. Rejecting those will result in more robust & realistic test cases. As follow-up, I also plan to add initial dead recipe removal. Depends on https://github.com/llvm/llvm-project/pull/176795. PR: https://github.com/llvm/llvm-project/pull/176828	2026-02-13 12:01:22 +00:00
Florian Hahn	7509cad693	[VPlan] Support masked VPInsts, use for predication (NFC) (#142285 ) Add support for mask operands to most VPInstructions, using getNumOperandsForOpcode. This allows VPlan predication to predicate VPInstructions directly. The mask will then be dropped or handled when creating wide recipes. Depends on https://github.com/llvm/llvm-project/pull/142284. Depends on https://github.com/llvm/llvm-project/pull/168784. PR: https://github.com/llvm/llvm-project/pull/142285	2026-02-08 18:23:36 +00:00
Florian Hahn	fdce0ea708	[VPlan] Add ExitingIVValue VPInstruction. (#175651 ) Add a new VPInstruction opcode to compute the exiting value of an induction variable after vectorization. This replaces the pattern of extracting the last lane from the last part of the induction backedge value when applicable. This allows us to always use the pre-computed IV end value. It will also allow unifying end value creation for both induction resume and exit values. PR: https://github.com/llvm/llvm-project/pull/175651	2026-02-06 12:27:31 +00:00
Jakub Kuderski	55fbb71db1	[llvm] Fix new clang-tidy warning llvm-type-switch-case-types. NFC. (#178502 ) Pre-committing this before landing the new check in https://github.com/llvm/llvm-project/pull/177892	2026-01-28 15:44:04 -05:00
Florian Hahn	14a209f852	[VPlan] Replace ComputeFindIVRes with ComputeRdxRes + cmp + sel (NFC) (#176672 ) Replace ComputeFindIVResult with ComputeReductionResult + explicit compare + select, to model finding the first/last induction more explicitly and simply; it boils down to a min/max reduction + compare and select of the sentinel value. PR: https://github.com/llvm/llvm-project/pull/176672	2026-01-22 19:28:47 +00:00
Florian Hahn	d528686f43	[VPlan] Add VPConstantInt for VPIRValues wrapping ConstantInts (NFC) (#175458 ) Follow-up to https://github.com/llvm/llvm-project/pull/174282: Introduce a new VPConstantInt overlay for VPIRValue, to make it easier to check and access constant int IR values. PR: https://github.com/llvm/llvm-project/pull/175458	2026-01-16 11:27:07 +00:00
Graham Hunter	2abd6d6d7a	[LV] Vectorize conditional scalar assignments (#158088 ) Based on Michael Maitland's previous work: https://github.com/llvm/llvm-project/pull/121222 This PR uses the existing recurrences code instead of introducing a new pass just for CSA autovec. I've also made recipes that are more generic.	2026-01-14 14:59:18 +00:00
Florian Hahn	564c0c169f	[VPlan] Cache other type for VPWidenRecipe with Select opcode (NFC). Add back caching + assertions after landing https://github.com/llvm/llvm-project/pull/174234.	2026-01-12 20:39:52 +00:00
Florian Hahn	8c830d3117	[VPlan] Merge cases inferring type of operand 0 (NFC). Merge all cases that infer the scalar type of operand 0 in inferScalarTypeForRecipe(const VPInstruction).	2026-01-08 21:03:00 +00:00
Florian Hahn	31b93d6e38	[VPlan] Add specialized VPValue subclasses for different types (NFC) (#172758 ) This patch adds VPValue sub-classes for the different cases we currently have: * VPIRValue: A live-in VPValue that wraps an underlying IR value * VPSymbolicValue: A symbolic VPValue not tied to an underlying value, e.g. the vector trip count or VF VPValues * VPRecipeValue: A VPValue defined by a VPDef/VPRecipeBase. This has multiple benefits: * clearer constructors for each kind of VPValue * limited scope: for example allows moving VPDef member to VPRecipeValue, reducing size of other VPValues. * stricter type checking for member variables (e.g. using VPLiveIn in the Value -> live-in map in VPlan, or using VPSymbolicValue for symbolic member VPValues) There probably are additional opportunities for cleanups as follow-ups. PR: https://github.com/llvm/llvm-project/pull/172758	2026-01-07 20:29:05 +00:00
Florian Hahn	16830b2164	[VPlan] Remove VPWidenSelectRecipe, use VPWidenRecipe instead (NFCI). (#174234 ) All extra state has been removed from VPWidenSelectRecipe at this point. There's no benefit of having a separate recipe and Select can easily be handled by the existing VPWidenRecipe. PR: https://github.com/llvm/llvm-project/pull/174234	2026-01-05 22:33:37 +00:00
Florian Hahn	524b1788c4	[VPlan] Add BranchOnTwoConds, use for early exit plans. (#172750 ) This PR introduces a new BranchOnTwoConds VPInstruction, that takes 2 boolean operands and must be placed in a block with 3 successors. If condition I is true, branches to successor I, otherwise falls through to check the next condition. If both conditions are false, branch to the third successor. This new branch recipe is used for early-exit loops, to simplify the representation in VPlan initially, by avoiding the need for splitting the middle block early on, in a way that preserves the single-exit block property of regions. All exits still go through the latch block, but they can go to more than 2 successors. This idea was part of one of the original proposals for how to model early exits in VPlan, but at that point in time, there was no good way to handle this during code-gen, and we went with the early split-middle block approach initially. Now that we dissolve regions before ::execute, the new recipe can be lowered nicely after regions have been removed, to a set of VPBBs and BranchOnCond recipes. The initial lowering preserves the original structure with the split middle blocks. Follow-ups will improve the lowering to avoid this splitting, providing performance gains. PR: https://github.com/llvm/llvm-project/pull/172750	2025-12-29 19:39:38 +00:00
Mel Chen	f196b1d66f	[VPlan] Extract reverse operation for reverse accesses (#146525 ) This patch introduces VPInstruction::Reverse and extracts the reverse operations of loaded/stored values from reverse memory accesses. This extraction facilitates future support for permutation elimination within VPlan.	2025-12-18 14:57:48 +00:00
Florian Hahn	65deac0872	[VPlan] Remove vector type checking in inferScalartType (NFC). inferScalarTypeForRecipe always infers a scalar type, so BaseTy must be a scalar type. Remove unneeded cast.	2025-12-11 22:10:31 +00:00
Florian Hahn	3fc7419236	[VPlan] Replace ExtractLast(Elem\|LanePerPart) with ExtractLast(Lane/Part) (#164124 ) Replace ExtractLastElement and ExtractLastLanePerPart with more generic and specific ExtractLastLane and ExtractLastPart, which model distinct parts of extracting across parts and lanes. ExtractLastElement == ExtractLastLane(ExtractLastPart) and ExtractLastLanePerPart == ExtractLastLane, the latter clarifying the name of the opcode. A new m_ExtractLastElement matcher is provided for convenience. The patch should be NFC modulo printing changes. PR: https://github.com/llvm/llvm-project/pull/164124	2025-12-07 15:15:43 +00:00
Florian Hahn	db85babddd	[VPlan] Use m_Intrinsic to match assumes/noalias_scope_decl (NFC). Use pattern matching to check for intrinsics to slightly simplify code.	2025-11-27 18:50:34 +00:00
Florian Hahn	f8eca64a28	Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit a6edeedbfa308876d6f2b1648729d52970bb07e6. The following fixes have landed, addressing issues causing the original revert: * https://github.com/llvm/llvm-project/pull/169298 * https://github.com/llvm/llvm-project/pull/167897 * https://github.com/llvm/llvm-project/pull/168949 Original message: Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-26 20:03:55 +00:00
Florian Hahn	d58ebe339c	Revert "Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )"" This reverts commit 72e51d389f66d9cc6b55fd74b56fbbd087672a43. Missed some test updates.	2025-11-26 19:41:39 +00:00
Florian Hahn	72e51d389f	Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit a6edeedbfa308876d6f2b1648729d52970bb07e6. The following fixes have landed, addressing issues causing the original revert: * https://github.com/llvm/llvm-project/pull/169298 * https://github.com/llvm/llvm-project/pull/167897 * https://github.com/llvm/llvm-project/pull/168949 Original message: Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-26 19:31:25 +00:00
Sam Tebbs	071d1fb8be	[LV] Use VPReductionRecipe for partial reductions (#147513 ) Partial reductions can easily be represented by the VPReductionRecipe class by setting their scale factor to something greater than 1. This PR merges the two together and gives VPReductionRecipe a VFScaleFactor so that it can choose to generate the partial reduction intrinsic at execute time. Stacked PRs: 1. https://github.com/llvm/llvm-project/pull/147026 2. https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/156976 4. https://github.com/llvm/llvm-project/pull/160154 5. https://github.com/llvm/llvm-project/pull/147302 6. https://github.com/llvm/llvm-project/pull/162503 7. -> https://github.com/llvm/llvm-project/pull/147513 Replaces https://github.com/llvm/llvm-project/pull/146073 .	2025-11-26 16:18:22 +00:00
Florian Hahn	a6edeedbfa	Revert "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b. This appears to be causing some runtime failures on RISCV https://lab.llvm.org/buildbot/#/builders/210/builds/5221	2025-11-13 22:34:55 +00:00
Florian Hahn	62d1a080e6	[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 ) Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-12 15:11:00 +00:00
Florian Hahn	b9ce7656e9	[VPlan] Add VPInstruction to unpack vector values to scalars. (#155670 ) Add a new Unpack VPInstruction (name to be improved) to explicitly extract scalar values from vectors. Test changes are movements of the extracts: they are now generated together and also directly after the producer. Depends on https://github.com/llvm/llvm-project/pull/155102 (included in PR) PR: https://github.com/llvm/llvm-project/pull/155670	2025-10-19 18:49:05 +00:00
Florian Hahn	8769119027	[VPlan] Add VPRecipeBase::getRegion helper (NFC). Multiple places retrieve the region for a recipe. Add a helper to make the code more compact and clearer.	2025-10-18 21:25:34 +01:00
Florian Hahn	7f54fccc0e	[VPlan] Add ExtractLastLanePerPart, use in narrowToSingleScalar. (#163056 ) When narrowing stores of a single-scalar, we currently use ExtractLastElement, which extracts the last element across all parts. This is not correct if the store's address is not uniform across all parts. If it is only uniform-per-part, the last lane per part must be extracted. Add a new ExtractLastLanePerPart opcode to handle this correctly. Most transforms apply to both ExtractLastElement and ExtractLastLanePerPart, with the only difference being their treatment during unrolling. Fixes https://github.com/llvm/llvm-project/issues/162498. PR: https://github.com/llvm/llvm-project/pull/163056	2025-10-15 13:46:09 +01:00
Florian Hahn	a7b4dd42bd	[LV] Don't create partial reductions if factor doesn't match accumulator (#158603 ) Check if the scale-factor of the accumulator is the same as the requested ScaleFactor in tryToCreatePartialReductions. This prevents creating partial reductions if not all instructions in the reduction chain form partial reductions, e.g. because we do not form a partial reduction for the loop exit instruction. Currently code-gen works fine, because the scale factor of VPPartialReduction is not used during ::execute, but it means we compute incorrect cost/register pressure, because the partial reduction won't reduce to the specified scaling factor. PR: https://github.com/llvm/llvm-project/pull/158603	2025-09-24 12:21:03 +01:00
Florian Hahn	4949cb4a5e	[VPlan] Track VPValues instead of VPRecipes in calculateRegisterUsage. (#155301 ) Update calculateRegisterUsageForPlan to track live-ness of VPValues instead of recipes. This gives slightly more accurate results for recipes that define multiple values (i.e. VPInterleaveRecipe). When tracking the live-ness of recipes, all VPValues defined by a VPInterleaveRecipe are considered alive until the last use of any of them. When tracking the live-ness of individual VPValues, we can accurately track the individual values until their last use. Note the changes in large-loop-rdx.ll and pr47437.ll. This patch restores the original behavior before introducing VPlan-based liveness tracking. PR: https://github.com/llvm/llvm-project/pull/155301	2025-09-15 20:55:11 +01:00
Mel Chen	13357e8a12	[LV][EVL] Support interleaved access with tail folding by EVL (#152070 ) The InterleavedAccess pass already supports transforming vector-predicated (vp) load/store intrinsics. With this patch, we start enabling interleaved access under tail folding by EVL. This patch introduces a new base class, VPInterleaveBase, and a concrete class, VPInterleaveEVLRecipe. Both the existing VPInterleaveRecipe and the new VPInterleaveEVLRecipe inherit from and implement VPInterleaveBase. Compared to VPInterleaveRecipe, VPInterleaveEVLRecipe adds an EVL operand to emit vp.load/vp.store intrinsics. Currently, tail folding by EVL is only supported for scalable vectorization. Therefore, VPInterleaveEVLRecipe will only emit interleave/deinterleave intrinsics. Reverse accesses are not yet implemented, as masked reverse interleaved access under tail folding is not yet supported. Fixed #123201	2025-09-01 21:20:06 +08:00
Florian Hahn	03a23f02a9	[VPlan] Store LoopRegion in variable in calculateRegisterUsage... (NFC)	2025-08-23 17:43:25 +01:00
Elvis Wang	d611a9ca15	[LV][VPlan] Reduce register usage of VPEVLBasedIVPHIRecipe. (#154482 ) `VPEVLBasedIVPHIRecipe` is lowered to a scalar-phi VPInstruction and generates a scalar phi, so it only occupies a scalar register, just like other phi recipes. This patch fixes the register usage for `VPEVLBasedIVPHIRecipe` from vector to scalar, which is close to the generated vector IR. https://godbolt.org/z/6Mzd6W6ha shows no register spills when choosing `<vscale x 16>`. Note that this test is basically copied from AArch64.	2025-08-21 07:39:01 +08:00
Ramkumar Ramachandra	0db57ab586	[VPlan] Improve code using onlyScalarValuesUsed (NFC) (#154564 )	2025-08-20 22:38:00 +01:00
Luke Lau	aea82a780a	[VPlan] Remove some getCanonicalIV() uses. NFC (#152969 ) A lot of the time getCanonicalIV() is used to get the canonical IV type, e.g. to instantiate a VPTypeAnalysis or to get the LLVMContext. However VPTypeAnalysis has a constructor that takes the VPlan directly and there's a method on VPlan to get the LLVMContext directly, so use those instead where possible. This lets us remove a constructor on VPTypeAnalysis. Also remove an unused LLVMContext argument in UnrollState whilst we're here.	2025-08-11 18:12:05 +08:00
Florian Hahn	86813aa786	[VPlan] Add dedicated user for resume phi with epilogue vectorization. Epilogue vectorization currently relies on the resume phi for the canonical induction being always available, which is why VPPhi are considered to have side-effects, to prevent their removal. This patch adds a new ResumeForEpilogue opcode to mark the resume phi as used for epilogue vectorization. This allows treating VPPhis in general as not having side-effects, enabling removal of unused VPPhis.	2025-08-10 21:21:16 +01:00
Luke Lau	94a6cd464e	[VPlan] Expand VPWidenPointerInductionRecipe into separate recipes (#148274 ) This is the VPWidenPointerInductionRecipe equivalent of #118638, with the motivation of allowing us to use the EVL as the induction step. There is a new VPInstruction added, WidePtrAdd to allow adding the step vector to the induction phi, since VPInstruction::PtrAdd only handles scalars or multiple scalar lanes. Originally this transformation was copied from the original recipe's execute code, but it's since been simplified by teaching `unrollWidenInductionByUF` to unroll the recipe, which brings it in line with VPWidenIntOrFpInductionRecipe.	2025-08-05 16:54:02 +08:00
Florian Hahn	80c43b6c07	[VPlan] Add ExtractLane VPInst to extract across multiple parts. (#148817 ) This patch adds a new ExtractLane VPInstruction which extracts across multiple parts using a wide index, to be used in combination with FirstActiveLane. The patch updates early-exit codegen to use it instead of ExtractElement, which is only per-part. With this change, interleaving should work correctly with early-exit loops. The patch removes the restrictions added in 6f43754e9 (#145877), but does not yet automatically select interleave counts > 1 for early-exit loops. I'll share a patch as follow-up. The cost of extracting a lane adds non-trivial overhead in the exit block, so that should be considered when picking the interleave count. PR: https://github.com/llvm/llvm-project/pull/148817	2025-07-27 08:08:25 +01:00
Florian Hahn	004c67ea25	[LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239 ) Update LV to vectorize maxnum/minnum reductions without fast-math flags, by adding an extra check in the loop if any inputs to maxnum/minnum are NaN, due to maxnum/minnum behavior w.r.t. signaling NaNs. Signed-zeros are already handled consistently by maxnum/minnum. If any input is NaN, exit the vector loop, compute the reduction result up to the vector iteration that contained NaN inputs and resume in the scalar loop. New recurrence kinds are added for reductions using maxnum/minnum without fast-math flags. PR: https://github.com/llvm/llvm-project/pull/148239	2025-07-18 21:58:19 +01:00
Nicholas Guy	20fc297ce3	[LoopVectorizer] Only check register pressure for VFs that have been enabled via maxBandwidth (#149056 ) Currently if MaxBandwidth is enabled, the register pressure is checked for each VF. This changes that to only perform said check if the VF would not have otherwise been considered by the LoopVectorizer if maxBandwidth was not enabled. Theoretically this allows for higher VFs to be considered than would otherwise be deemed "safe" (from a regpressure perspective), but more concretely this reduces the amount of work done at compile-time when maxBandwidth is enabled.	2025-07-18 09:21:20 +01:00
Kazu Hirata	7c83d66719	[llvm] Remove unused includes (NFC) (#148768 ) These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.	2025-07-14 22:19:14 -07:00
Florian Hahn	6b3d2b629c	[VPlan] Add VPExpressionRecipe, replacing extended reduction recipes. (#144281 ) This patch adds a new recipe to combine multiple recipes into an 'expression' recipe, which should be considered as a single entity for cost-modeling and transforms. The recipe needs to be 'decomposed', i.e. replaced by its individual recipes before execute. This subsumes VPExtendedReductionRecipe and VPMulAccumulateReductionRecipe and should make it easier to extend to include more types of bundled patterns, like e.g. extends folded into loads or various arithmetic instructions, if supported by the target. It allows avoiding re-creating the original recipes when converting to concrete recipes, together with removing the need to record various information. The current version of the patch still retains the original printing matching VPExtendedReductionRecipe and VPMulAccumulateReductionRecipe, but this specialized print could be replaced with printing the bundled recipes directly. PR: https://github.com/llvm/llvm-project/pull/144281	2025-07-01 20:44:50 +01:00
Florian Hahn	026aae7047	[VPlan] Infer reduction result types w/o accessing underlying phis.(NFC) Remove another use of the underlying IR phi.	2025-06-30 21:29:29 +01:00
Florian Hahn	20fbbd7675	[LV] Add support for cmp reductions with decreasing IVs. (#140451 ) Similar to FindLastIV, add FindFirstIVSMin to support select (icmp(), x, y) reductions where one of x or y is a decreasing induction, producing a SMin reduction. It uses signed max as sentinel value. PR: https://github.com/llvm/llvm-project/pull/140451	2025-06-29 11:17:03 +01:00
Florian Hahn	aa24029319	[VPlan] Unroll VPReplicateRecipe by VF. (#142433 ) Explicitly unroll VPReplicateRecipes outside replicate regions by VF, replacing them by VF single-scalar recipes. Extracts for operands are added as needed and the scalar results are combined to a vector using a new BuildVector VPInstruction. It also adds a few folds to simplify unnecessary extracts/BuildVectors. It also adds a BuildStructVector opcode for handling of calls that have struct return types. VPReplicateRecipes in replicate regions will be unrolled as a follow-up, turning non-single-scalar VPReplicateRecipes into 'abstract', i.e. not executable. PR: https://github.com/llvm/llvm-project/pull/142433	2025-06-26 11:19:09 +01:00
Florian Hahn	6108d50aed	[VPlan] Add ReductionStartVector VPInstruction. (#142290 ) Add a new VPInstruction::ReductionStartVector opcode to create the start values for wide reductions. This more accurately models the start value creation in VPlan and simplifies VPReductionPHIRecipe::execute. Down the line it also allows removing VPReductionPHIRecipe::RdxDesc. PR: https://github.com/llvm/llvm-project/pull/142290	2025-06-09 20:59:12 +01:00
Florian Hahn	5520ab3d50	[VPlan] Add ComputeAnyOfResult VPInstruction (NFC) (#141932 ) Add a dedicated opcode for any-of reduction, similar to https://github.com/llvm/llvm-project/pull/132689 and https://github.com/llvm/llvm-project/pull/132690. The patch also explicitly adds the start value to not require RecurrenceDescriptor during execute. It also allows freezing the start value to make it poison-safe. PR: https://github.com/llvm/llvm-project/pull/141932	2025-06-03 14:33:53 +01:00
Florian Hahn	11713e86b0	[LV] Move VPlan-based calculateRegisterUsage to VPlanAnalysis (NFC). (#135673 ) Move VPlan-based calculateRegisterUsage from LoopVectorize to VPlanAnalysis.cpp. It is a VPlan-based analysis and this helps to reduce the size of LoopVectorize. PR: https://github.com/llvm/llvm-project/pull/135673	2025-06-02 17:40:50 +01:00
Florian Hahn	10bd4cd9cd	[VPlan] Remove ResumePhi opcode, use regular PHI instead (NFC). (#140405 ) Use regular VPPhi instead of a separate opcode for resume phis. This removes an unneeded specialized opcode and unifies the code (verification, printing, updating when CFG is changed). Depends on https://github.com/llvm/llvm-project/pull/140132. PR: https://github.com/llvm/llvm-project/pull/140405	2025-05-30 12:50:08 +01:00
Elvis Wang	664c937b43	[VPlan] Implement VPExtendedReduction, VPMulAccumulateReductionRecipe and corresponding vplan transformations. (#137746 ) This patch introduces two new recipes. * VPExtendedReductionRecipe - cast + reduction. * VPMulAccumulateReductionRecipe - (cast) + mul + reduction. This patch also implements the transformation that matches the following patterns via vplan and converts them to abstract recipes for better cost estimation. * VPExtendedReduction - reduce(cast(...)) * VPMulAccumulateReductionRecipe - reduce.add(mul(...)) - reduce.add(mul(ext(...), ext(...))) - reduce.add(ext(mul(ext(...), ext(...)))) The converted abstract recipes will be lowered to the concrete recipes (widen-cast + widen-mul + reduction) just before recipe execution. Note that this patch still relies on the legacy cost model to calculate the cost for these patterns. Will enable vplan-based cost decision in #113903. Split from #113903.	2025-05-16 10:25:38 +08:00
Florian Hahn	efae492ad1	[VPlan] Add VPTypeAnalysis constructor taking a VPlan (NFC). Add constructor that retrieves the scalar type from the trip count expression, if no canonical IV is available. Used in the verifier, in preparation for late verification, when the canonical IV has been dissolved.	2025-05-15 22:19:36 +01:00
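
Commit 62d1a080e6 above describes lowering LastActiveLane to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). The following is a minimal Python model of that arithmetic, not LLVM code. The function names are hypothetical, the mask is assumed to be a prefix mask (all active lanes come first, as with a tail-folding header mask), and `first_active_lane` is assumed to return the lane count when no lane is set.

```python
from typing import List


def first_active_lane(mask: List[bool]) -> int:
    """Index of the first True lane.

    Assumption of this sketch: returns len(mask) when no lane is set.
    """
    for lane, active in enumerate(mask):
        if active:
            return lane
    return len(mask)


def last_active_lane(mask: List[bool]) -> int:
    """Model of the lowering: Not(Mask) -> FirstActiveLane -> Sub(result, 1).

    Only meaningful for a non-empty prefix mask (active lanes first).
    """
    not_mask = [not m for m in mask]
    return first_active_lane(not_mask) - 1


def extract_lane(lane: int, values: List[int]) -> int:
    """Model of ExtractLane(lane, V) over a single vector of values."""
    return values[lane]


if __name__ == "__main__":
    header_mask = [True, True, True, False]  # 3 of 4 lanes active in the tail
    live_out = extract_lane(last_active_lane(header_mask), [10, 20, 30, 40])
    print(live_out)
```

With a non-prefix mask such as `[True, False, True]` this model returns lane 0 rather than lane 2, which is why the prefix-mask assumption matters here.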


95 Commits
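
Commit ede1a9626b above explains how the taken early exit is selected: OR all early exit conditions, check whether any lane is active, and pick the first exit, in dominance order, whose condition is true in the first active lane of the combined mask. A rough Python sketch of that selection follows. It is an illustrative model only: `first_taken_exit` is a hypothetical name, each condition is modelled as a list of per-lane booleans, and the input list is assumed to be already ordered by dominance.

```python
from typing import List, Optional


def first_taken_exit(exit_conds: List[List[bool]]) -> Optional[int]:
    """Return the index of the taken early exit, or None if no exit is taken.

    exit_conds[i][lane] is the condition of early exit i in that lane;
    exits are assumed to be listed in dominance order.
    """
    vf = len(exit_conds[0])
    # OR of all early exit conditions, lane by lane.
    combined = [any(cond[lane] for cond in exit_conds) for lane in range(vf)]
    if not any(combined):
        return None  # no active lane: stay in the vector loop
    lane = combined.index(True)  # first active lane of the combined mask
    # Dispatch chain: check the exits in dominance order at that lane.
    for idx, cond in enumerate(exit_conds):
        if cond[lane]:
            return idx
    raise AssertionError("unreachable: some condition is true in this lane")


if __name__ == "__main__":
    exit0 = [False, False, True, False]
    exit1 = [False, True, False, False]
    print(first_taken_exit([exit0, exit1]))
```

In the example, exit 1 fires in lane 1 before exit 0 fires in lane 2, so the second exit is selected even though the first one dominates it.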