llvm-project

Author	SHA1	Message	Date
Florian Hahn	f68848015f	[VPlan] Manage Sentinel value for FindLastIV in VPlan. (#142291 ) Similar to modeling the start value as operand, also model the sentinel value as operand explicitly. This makes all required information for code-gen available directly in VPlan. PR: https://github.com/llvm/llvm-project/pull/142291	2025-06-13 19:17:01 +01:00
David Sherwood	541e5118ce	[LV] Use getFixedValue instead of getKnownMinValue when appropriate (#143526 ) There are many places in VPlan and LoopVectorize where we use getKnownMinValue to discover the number of elements in a vector. Where we expect the vector to have a fixed length, I have used the stronger getFixedValue call. I believe this is clearer and adds extra protection in the form of an assert in getFixedValue that the vector is not scalable. While looking at VPFirstOrderRecurrencePHIRecipe::computeCost I also took the liberty of simplifying the code. In theory I believe this patch should be NFC, but I'm reluctant to add that to the title in case we're just missing tests for some of the VPlan changes. I built and ran the LLVM test suite when targeting neoverse-v1 and it seemed ok.	2025-06-13 11:43:50 +01:00
Florian Hahn	d65904675e	[LV] Move logic to create trip count check to helper (NFC). Move the logic to create the iteration count check to a separate helper, so it can be re-used when creating the skeleton for epilogue vectorization as well.	2025-06-12 21:35:56 +01:00
Hans Wennborg	0604dc199c	Revert "[VPlan] Set branch weight metadata on middle term in VPlan (NFC) (#143035 )" This caused assertion failures: llvm/lib/Transforms/Vectorize/VPlan.h:4021: llvm::VPBasicBlock* llvm::VPlan::getMiddleBlock(): Assertion `LoopRegion && "cannot call the function after vector loop region has been removed"' failed. See comment on the PR. > Manage branch weights for the BranchOnCond in the middle block in VPlan. > This requires updating VPInstruction to inherit from VPIRMetadata, which > in general makes sense as there are a number of opcodes that could take > metadata. > > There are other branches (part of the skeleton) that also need branch > weights adding. > > PR: https://github.com/llvm/llvm-project/pull/143035 This reverts commit db8d34db26e9ea92c08d6e813eca9cce40c48478.	2025-06-12 13:52:05 +02:00
Luke Lau	7ef77eb998	[LV] Support scalable interleave groups for factors 3,5,6 and 7 (#141865 ) Currently the loop vectorizer can only vectorize interleave groups for power-of-2 factors at scalable VFs by recursively interleaving [de]interleave2 intrinsics. However after https://github.com/llvm/llvm-project/pull/124825 and #139893, we now have [de]interleave intrinsics for all factors up to 8, which is enough to support all types of segmented loads and stores on RISC-V. Now that the interleaved access pass has been taught to lower these in #139373 and #141512, this patch teaches the loop vectorizer to emit these intrinsics for factors up to 8, which enables scalable vectorization for non-power-of-2 factors. As far as I'm aware, no in-tree target will vectorize a scalable interleave group above factor 8 because the maximum interleave factor is capped at 4 on AArch64 and 8 on RISC-V, and the `-max-interleave-group-factor` CLI option defaults to 8, so the recursive [de]interleaving code has been removed for now. Factors of 3 with scalable VFs are also turned off in AArch64 since there's no lowering for [de]interleave3 just yet either.	2025-06-12 11:09:09 +01:00
Florian Hahn	db8d34db26	[VPlan] Set branch weight metadata on middle term in VPlan (NFC) (#143035 ) Manage branch weights for the BranchOnCond in the middle block in VPlan. This requires updating VPInstruction to inherit from VPIRMetadata, which in general makes sense as there are a number of opcodes that could take metadata. There are other branches (part of the skeleton) that also need branch weights adding. PR: https://github.com/llvm/llvm-project/pull/143035	2025-06-12 10:04:08 +01:00
Florian Hahn	5623b7f2d5	[LV] Use GeneratedRTChecks to check if safety checks were added (NFC). Directly check via GeneratedRTChecks if any checks have been added, instead of needing to go through ILV. This simplifies the code and enables further refactoring in follow-up patches.	2025-06-11 21:08:36 +01:00
Stephen Tozer	aa8a1fa6f5	[DLCov][NFC] Annotate intentionally-blank DebugLocs in existing code (#136192 ) Following the work in PR #107279, this patch applies the annotative DebugLocs, which indicate that a particular instruction is intentionally missing a location for a given reason, to existing sites in the compiler where their conditions apply. This is NFC in ordinary LLVM builds (each function `DebugLoc::getFoo()` is inlined as `DebugLoc()`), but marks the instruction in coverage-tracking builds so that it will be ignored by Debugify, allowing only real errors to be reported. From a developer standpoint, it also communicates the intentionality and reason for a missing DebugLoc. Some notes for reviewers: - The difference between `I->dropLocation()` and `I->setDebugLoc(DebugLoc::getDropped())` is that the former _may_ decide to keep some debug info alive, while the latter will always be empty; in this patch, I always used the latter (even if the former could technically be correct), because the former could result in some (barely) different output, and I'd prefer to keep this patch purely NFC. - I've generally documented the uses of `DebugLoc::getUnknown()`, with the exception of the vectorizers - in summary, they are a huge cause of dropped source locations, and I don't have the time or the domain knowledge currently to solve that, so I've plastered it all over them as a form of "fixme".	2025-06-11 17:42:10 +01:00
Florian Hahn	62b3e89afc	[LV] Remove unused LoopBypassBlocks from ILV (NFC). After recent refactorings to move parts of skeleton creation LoopBypassBlocks isn't used any more. Remove it.	2025-06-10 21:37:29 +01:00
Florian Hahn	4a6d31f4bf	[LV] Pass resume phi to fixReductionScalarResumeWhenVectorizing (NFC). fixReductionScalarResumeWhenVectorizingEpilog updates the resume phis in the scalar preheader. Instead of looking at all recipes in the middle block and finding their resume-phi users we can iterate over all resume phis in the scalar preheader directly. This slightly simplifies the code and removes the need to look for the resume phi. Also slightly simplifies https://github.com/llvm/llvm-project/pull/141860.	2025-06-09 22:11:20 +01:00
Florian Hahn	f9b98e386e	[LV] Remove unused LoopMiddleBlock arg from fixReductionScalarRes (NFC) The argument isn't used, remove it.	2025-06-09 21:47:03 +01:00
Florian Hahn	6108d50aed	[VPlan] Add ReductionStartVector VPInstruction. (#142290 ) Add a new VPInstruction::ReductionStartVector opcode to create the start values for wide reductions. This more accurately models the start value creation in VPlan and simplifies VPReductionPHIRecipe::execute. Down the line it also allows removing VPReductionPHIRecipe::RdxDesc. PR: https://github.com/llvm/llvm-project/pull/142290	2025-06-09 20:59:12 +01:00
Florian Hahn	c2ea9404ab	[LV] Simplify finding EPResumeValue (NFC). It should be sufficient to check that the resume phi has the correct type, has the vector trip count as incoming value and starts at 0 otherwise. There is no need to find the middle block.	2025-06-09 19:37:33 +01:00
Ramkumar Ramachandra	6716d4eaa8	[LV] Prefer DenseMap::lookup over find (NFC) (#141809 ) Apart from the stylistic improvement, lookup has the nice property of returning a default-constructed object on failure-to-find, while find returns the end iterator, which cannot be dereferenced.	2025-06-03 14:37:19 +01:00
Florian Hahn	5520ab3d50	[VPlan] Add ComputeAnyOfResult VPInstruction (NFC) (#141932 ) Add a dedicated opcode for any-of reduction, similar to https://github.com/llvm/llvm-project/pull/132689 and https://github.com/llvm/llvm-project/pull/132690. The patch also explicitly adds the start value to not require RecurrenceDescriptor during execute. It also allows freezing the start value to make it poison-safe. PR: https://github.com/llvm/llvm-project/pull/141932	2025-06-03 14:33:53 +01:00
Luke Lau	ddfeecf4c5	[VPlan] Convert to concrete recipes before dissolving loop regions. NFCI (#141999 ) After updating #118638 on tip of tree, expanding VPWidenIntOrFpInductionRecipes fails because it needs the loop region to get the latch to insert the increment into: VPBasicBlock ExitingBB = Plan->getVectorLoopRegion()->getExitingBasicBlock(); Builder.setInsertPoint(ExitingBB, ExitingBB->getTerminator()->getIterator()); auto Next = Builder.createNaryOp(AddOp, {Prev, Inc}, Flags, WidenIVR->getDebugLoc(), "vec.ind.next"); However after #117506, the region is dissolved so it doesn't work. This shuffles the dissolveLoopRegions steps to be after convertToConcreteRecipes so we can use the region when expanding VPWidenIntOrFpInductionRecipes	2025-06-03 12:05:13 +01:00
Florian Hahn	11713e86b0	[LV] Move VPlan-based calculateRegisterUsage to VPlanAnalysis (NFC). (#135673 ) Move VPlan-based calculateRegisterUsage from LoopVectorize to VPlanAnalysis.cpp. It is a VPlan-based analysis and this helps to reduce the size of LoopVectorize. PR: https://github.com/llvm/llvm-project/pull/135673	2025-06-02 17:40:50 +01:00
Florian Hahn	0f00a96fed	[VPlan] Simplify branch on False in VPlan transform (NFC). (#140409 ) Simplify branch on false, starting with the branch from the middle block to the scalar preheader. Initially this helps simplifying the initial VPlan construction. Depends on https://github.com/llvm/llvm-project/pull/140405. PR: https://github.com/llvm/llvm-project/pull/140409	2025-05-31 20:32:45 +01:00
Jon Roelofs	798058fca5	[Remarks] Remove an upcast footgun. NFC (#142191 ) CodeRegion's were previously passed as Value*, but then immediately upcast to BasicBlock. Let's keep the type information around until the use cases for non-BasicBlock code regions actually materialize.	2025-05-31 11:07:54 -07:00
Florian Hahn	10bd4cd9cd	[VPlan] Remove ResumePhi opcode, use regular PHI instead (NFC). (#140405 ) Use regular VPPhi instead of a separate opcode for resume phis. This removes an unneeded specialized opcode and unifies the code (verification, printing, updating when CFG is changed). Depends on https://github.com/llvm/llvm-project/pull/140132. PR: https://github.com/llvm/llvm-project/pull/140405	2025-05-30 12:50:08 +01:00
Florian Hahn	417e43ad43	[LV] Set PhiTy once in adjustRecipesForReductions (NFC).	2025-05-30 08:33:15 +01:00
Ramkumar Ramachandra	663aea2601	[LV] Clean up unused template args of min/max (NFC) (#141778 )	2025-05-29 09:57:22 +02:00
Elvis Wang	332fe08f1d	[VPlan] Implement VPlan-based cost model for VPReduction, VPExtendedReduction and VPMulAccumulateReduction. (#113903 ) This patch implements the VPlan-based cost model for VPReduction, VPExtendedReduction and VPMulAccumulateReduction. With this patch, we can calculate the reduction cost with the VPlan-based cost model, so the reduction costs in `precomputeCost()` are removed. Ref: Original instruction based implementation: https://reviews.llvm.org/D93476	2025-05-29 11:15:16 +08:00
Florian Hahn	440a8adb86	[VPlan] Use VPIRFlags to manage FMFs for ComputeReductionResult (NFC). Manage fast-math flags using VPIRFlags from VPInstruction, in line with other VPInstructions. With this change, we now print the correct flags for ComputeReductionResult, other than that NFC.	2025-05-28 20:54:58 +01:00
Florian Hahn	ad58ea3ba8	[VPlan] Bail out before construction VPlan0 if MinVF > MaxVF. This reduces the cases where we need to create initial VPlans unnecessarily after 567b3172da2d52f5df70a37f3de06b7000b25968. buildVPlansWithVPRecipes is called with MinVF > MaxVF if the target does not support scalable vectors. Recovers some of the compile-time impact http://llvm-compile-time-tracker.com/compare.php?from=3033f202f6707937cd28c2473479db134993f96f&to=1a0b9e5834f7fd4abf058864e656f8e26b7a26ff&stat=instructions:u	2025-05-27 21:19:11 +01:00
Florian Hahn	d56deea1e4	[VPlan] Connect Entry to scalar preheader during initial construction. (#140132 ) Update initial construction to connect the Plan's entry to the scalar preheader during initial construction. This moves a small part of the skeleton creation out of ILV and will also enable replacing VPInstruction::ResumePhi with regular VPPhi recipes. Resume phis need 2 incoming values to start with, the second being the bypass value from the scalar ph (and used to replicate the incoming value for other bypass blocks). Adding the extra edge ensures the incoming values for resume phis match the incoming blocks. PR: https://github.com/llvm/llvm-project/pull/140132	2025-05-27 16:07:56 +01:00
Florian Hahn	567b3172da	[VPlan] Construct initial once and pass clones to tryToBuildVPlan (NFC). (#141363 ) Update to only build an initial, plain-CFG VPlan once, and then transform & optimize clones. This requires changes to ::clone() for VPInstruction and VPWidenPHIRecipe to allow for proper cloning of the recipes in the initial VPlan. PR: https://github.com/llvm/llvm-project/pull/141363	2025-05-26 13:42:47 +01:00
Florian Hahn	dcef154b5c	[VPlan] Replace VPRegionBlock with explicit CFG before execute (NFCI). (#117506 ) Building on top of https://github.com/llvm/llvm-project/pull/114305, replace VPRegionBlocks with explicit CFG before executing. This brings the final VPlan closer to the IR that is generated and helps to simplify codegen. It will also enable further simplifications of phi handling during execution and transformations that do not have to preserve the canonical IV required by loop regions. This for example could include replacing the canonical IV with an EVL based phi while completely removing the original canonical IV. PR: https://github.com/llvm/llvm-project/pull/117506	2025-05-24 19:17:16 +01:00
Florian Hahn	a9b2998e31	[VPlan] Skip cost assert if VPlan converted to single-scalar recipes. Check if a VPlan transform converted recipes to single-scalar VPReplicateRecipes (after 07c085af3efcd67503232f99a1652efc6e54c1a9). If that's the case, the legacy cost model incorrectly overestimates the cost. Fixes https://github.com/llvm/llvm-project/issues/141237.	2025-05-24 11:09:27 +01:00
Florian Hahn	95ba5508e5	Reapply "[VPlan] Move predication to VPlanTransform (NFC). (#128420 )" This reverts commit 793bb6b257fa4d9f4af169a4366cab3da01f2e1f. The recommitted version contains a fix to make sure only the original phis are processed in convertPhisToBlends by collecting them in a vector first. This fixes a crash when no mask is needed, because there is only a single incoming value. Original message: This patch moves the logic to predicate and linearize a VPlan to a dedicated VPlan transform. It mostly ports the existing logic directly. There are a number of follow-ups planned in the near future to further improve on the implementation: * Edge and block masks are cached in VPPredicator, but the block masks are still made available to VPRecipeBuilder, so they can be accessed during recipe construction. As a follow-up, this should be replaced by adding mask operands to all VPInstructions that need them and use that during recipe construction. * The mask caching in a map also means that this map needs updating each time a new recipe replaces a VPInstruction; this would also be handled by adding mask operands. PR: https://github.com/llvm/llvm-project/pull/128420	2025-05-22 08:16:15 +01:00
Florian Hahn	793bb6b257	Revert "[VPlan] Move predication to VPlanTransform (NFC). (#128420 )" This reverts commit b263c08e1a0b54a871915930aa9a1a6ba205b099. Looks like this triggers a crash in one of the Fortran tests. Reverting while I investigate https://lab.llvm.org/buildbot/#/builders/41/builds/6825	2025-05-21 19:24:21 +01:00
Kazu Hirata	a28d753e96	[Vectorize] Fix a warning This patch fixes: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8564:20: error: unused variable 'LoopRegionOf' [-Werror,-Wunused-variable]	2025-05-21 08:03:16 -07:00
Florian Hahn	b263c08e1a	[VPlan] Move predication to VPlanTransform (NFC). (#128420 ) This patch moves the logic to predicate and linearize a VPlan to a dedicated VPlan transform. It mostly ports the existing logic directly. There are a number of follow-ups planned in the near future to further improve on the implementation: * Edge and block masks are cached in VPPredicator, but the block masks are still made available to VPRecipeBuilder, so they can be accessed during recipe construction. As a follow-up, this should be replaced by adding mask operands to all VPInstructions that need them and use that during recipe construction. * The mask caching in a map also means that this map needs updating each time a new recipe replaces a VPInstruction; this would also be handled by adding mask operands. PR: https://github.com/llvm/llvm-project/pull/128420	2025-05-21 15:47:33 +01:00
Sam Tebbs	70501ed2f0	[LoopVectorizer] Prune VFs based on plan register pressure (#132190 ) This PR moves the register usage checking to after the plans are created, so that any recipes that optimise register usage (such as partial reductions) can be properly costed and not have their VF pruned unnecessarily. Depends on https://github.com/llvm/llvm-project/pull/137746	2025-05-19 13:27:17 +01:00
Florian Hahn	e81fab6847	[VPlan] Verify final VPlan, just before execution. (NFC) Add additional verifier call just before execution, to make sure the final VPlan is valid. Note that this currently requires disabling a small number of checks when running late.	2025-05-17 17:53:06 +01:00
Elvis Wang	664c937b43	[VPlan] Implement VPExtendedReduction, VPMulAccumulateReductionRecipe and corresponding vplan transformations. (#137746 ) This patch introduces two new recipes. * VPExtendedReductionRecipe - cast + reduction. * VPMulAccumulateReductionRecipe - (cast) + mul + reduction. This patch also implements the transformation that matches the following patterns via vplan and converts them to abstract recipes for better cost estimation. * VPExtendedReduction - reduce(cast(...)) * VPMulAccumulateReductionRecipe - reduce.add(mul(...)) - reduce.add(mul(ext(...), ext(...)) - reduce.add(ext(mul(ext(...), ext(...)))) The converted abstract recipes will be lowered to the concrete recipes (widen-cast + widen-mul + reduction) just before recipe execution. Note that this patch still relies on the legacy cost model to calculate the cost for these patterns. Will enable vplan-based cost decision in #113903. Split from #113903.	2025-05-16 10:25:38 +08:00
George Chaltas	c4f7ab1d2e	[LV] Initialize IR block pointers in ILV. (NFC) (#139807 ) Setting uninitialized pointers to nullptr in InnerLoopVectorizer() constructor. These were noticed during a review of the code. Seems like a good idea to clean them up.	2025-05-15 09:18:45 +01:00
Florian Hahn	7a9fd62278	[VPlan] Use VPlan operand order for VPBlendRecipes. (#139475 ) Don't use the order of incoming values of IR phis when creating VPBlendRecipes. Instead, simply use the incoming operands and blocks from the VPWidenPHIRecipe. Note that this changes the order of the incoming operands/masks for some blends. PR: https://github.com/llvm/llvm-project/pull/139475	2025-05-14 14:56:35 +01:00
Florian Hahn	98683b0a48	[VPlan] Construct VPBlendRecipe from VPWidenPHIRecipe (NFC). Update VPRecipeBuilder to construct VPBlendRecipe from VPWidenPHIRecipe, starting to thread recipes through the builder instead of the underlying IR instruction up-front. Landing first part of approved https://github.com/llvm/llvm-project/pull/139475 separately as NFC as suggested.	2025-05-14 11:17:26 +01:00
Florian Hahn	8767d55ff3	[VPlan] Consistently use VPlanTransforms::runPass if possible (NFC). Update some more transforms to use ::runPass.	2025-05-13 20:50:27 +01:00
Ramkumar Ramachandra	4f0be9414c	[LV] Improve code in selectInterleaveCount (NFC) (#128002 ) Use the fact that getSmallBestKnownTC returns an exact trip count, if possible, and falls back to returning an estimate, to factor some code in selectInterleaveCount.	2025-05-12 17:20:10 +01:00
Florian Hahn	2f55123cbb	[VPlan] Handle early exit before forming regions. (NFC) (#138393 ) Move early-exit handling up front to original VPlan construction, before introducing early exits. This builds on https://github.com/llvm/llvm-project/pull/137709, which adds exiting edges to the original VPlan, instead of adding exit blocks later. This retains the exit conditions early, and means we can handle early exits before forming regions, without the reliance on VPRecipeBuilder. Once we retain all exits initially, handling early exits before region construction ensures the regions are valid; otherwise we would leave edges exiting the region from elsewhere than the latch. Removing the reliance on VPRecipeBuilder removes the dependence on mapping IR BBs to VPBBs and unblocks predication as VPlan transform: https://github.com/llvm/llvm-project/pull/128420. Depends on https://github.com/llvm/llvm-project/pull/137709 (included in PR). PR: https://github.com/llvm/llvm-project/pull/138393	2025-05-12 12:53:20 +01:00
Mel Chen	688bccb290	[TTI][LV] Simplify the prototype of preferPredicatedReductionSelect. nfc (#139265 )	2025-05-12 17:24:37 +08:00
Florian Hahn	7500cead4e	[VPlan] Flatten the CFG separately after creating wide recipes (NFC). Move flattening of the CFG out of the loop that creates the wide recipes. This simplifies the already large loop and prepares for moving flattening to a separate transform.	2025-05-11 21:30:01 +01:00
Florian Hahn	2acecfe653	[VPlan] Use VPBBs to look up masks for newly created recipes (NFC). Update recipe construction to use VPBBs to look up masks, in preparation for https://github.com/llvm/llvm-project/pull/128420.	2025-05-11 13:04:33 +01:00
Florian Hahn	cfde685e22	[VPlan] Sink VPB2IRBB lookups to VPRecipeBuilder (NFC). This allows migrating some more code to be based on VPBBs in VPRecipeBuilder, in preparation for https://github.com/llvm/llvm-project/pull/128420.	2025-05-10 22:00:58 +01:00
Florian Hahn	8c6c525a6b	[LV] Don't consider FORs as profitable to scalarize. Fixed-order recurrence phis cannot be scalarized, they will always be widened at the moment. Make sure they are not incorrectly considered profitable to scalarize, similar to 41c1a7be3f1a2556e. Fixes https://github.com/llvm/llvm-project/issues/139060. Fixes https://github.com/llvm/llvm-project/issues/139065.	2025-05-09 20:29:22 +01:00
Florian Hahn	e854c381c6	[VPlan] Manage noalias/alias_scope metadata in VPlan. (#136450 ) Use VPIRMetadata added in https://github.com/llvm/llvm-project/pull/135272 to also manage no-alias metadata added by versioning. Note that this means we have to build the no-alias metadata up-front once. If it is not used, it will be discarded automatically. This also fixes a case where incorrect metadata was added to wide loads/stores that got converted from an interleave group. Compile-time impact is neutral: https://llvm-compile-time-tracker.com/compare.php?from=38bf1af41c5425a552a53feb13c71d82873f1c18&to=2fd7844cfdf5ec0f1c2ce0b9b3ae0763245b6922&stat=instructions:u	2025-05-09 11:19:12 +01:00
Luke Lau	1484f82cbc	[VPlan] Add VPInstruction::StepVector and use it in VPWidenIntOrFpInductionRecipe (#129508 ) Split off from #118638, this adds VPInstruction::StepVector, which generates integer step vectors (0,1,2,...,VF). This is a step towards eventually modelling all the separate parts of VPWidenIntOrFpInductionRecipe in VPlan. This is then used by VPWidenIntOrFpInductionRecipe, where we materialize it just before unrolling so the operands stay in a fixed position. The need for a separate operand in VPWidenIntOrFpInductionRecipe, as well as the need to update it in optimizeVectorInductionWidthForTCAndVFUF, should be removed with #118638 when everything is expanded in convertToConcreteRecipes.	2025-05-08 18:47:44 +08:00
Min-Yih Hsu	e0537c0768	[LV][EVL] Attach a new metadata on EVL vectorized loops (#131000 ) This patch attaches a new metadata, `llvm.loop.isvectorized.withevl`, on loops vectorized with explicit vector length. This will help other optimizations down in the pipeline that focus on EVL-vectorized loops. This approach is much safer than, say, IR pattern matching to figure out if a loop is EVL-vectorized or not.	2025-05-06 10:06:37 -07:00

2566 Commits