llvm-project

Author	SHA1	Message	Date
Ramkumar Ramachandra	840e9a4ddd	[VPlan] Fix wrap-flags on WidenInduction unroll (#187710 ) Due to a somewhat recent change, IntOrFpInduction recipes have associated VPIRFlags. The VPlanUnroll logic for WidenInduction recipes predates this change, and computes incomplete wrap-flags: update it to simply use the flags on IntOrFpInduction recipes; PointerInduction recipes have no associated flags, and indeed, no flags should be used.	2026-03-27 13:26:04 +00:00
Florian Hahn	90c1c588f8	[VPlan] Don't set WrapFlags for truncated IVs. (#188966 ) The wrap flags from the IV bin-op are not guaranteed to apply to truncated inductions, which are evaluated in narrower types. Instead of dropping them late (in expandVPWidenIntOrFpInduction), do not add them at the outset, to prevent invalid transforms based on incorrect flags in the future. PR: https://github.com/llvm/llvm-project/pull/188966	2026-03-27 12:39:03 +00:00
Florian Hahn	f8fe67c998	[VPlan] Expose cloneFrom and mergeBlocksIntoPredecessors. (NFC) (#188818 ) Move cloneFrom from a file-static function in VPlan.cpp to a public static method VPBlockUtils::cloneFrom, and move mergeBlocksIntoPredecessors from a file-static function in VPlanTransforms.cpp to a public static method VPlanTransforms::mergeBlocksIntoPredecessors. This is in preparation for dissolving replicate regions which needs both utilities. Split off from approved https://github.com/llvm/llvm-project/pull/170212. PR: https://github.com/llvm/llvm-project/pull/188818	2026-03-26 19:07:33 +00:00
Ramkumar Ramachandra	76a9692254	[VPlan] Sink single-scalar replicates in licm (#187047 ) Refine the replicate bail-out in licm to permit single-scalar replicates.	2026-03-26 14:42:57 +00:00
Florian Hahn	40304d8fef	Reapply "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252 )" (#188589 ) This reverts commit e30f9c19464bcf1bf1e9f69b63884fb78ad2d05d. Re-land, now that the reported crash causing the revert has been fixed as part of 77fb84889 (#187504). Original message: Replace manual region dissolution code in simplifyBranchConditionForVFAndUF with using general removeBranchOnConst. simplifyBranchConditionForVFAndUF now just creates a (BranchOnCond true) or updates BranchOnTwoConds. The loop then gets automatically removed by running removeBranchOnConst. This removes a bunch of special logic to handle header phi replacements and CFG updates. With the new code, there's no restriction on what kind of header phi recipes the loop contains. Note that VPEVLBasedIVRecipe needs to be marked as readnone. This is technically unrelated, but I could not find an independent test that would be impacted. The code to deal with epilogue resume values now needs updating, because we may simplify a reduction directly to the start value. PR: https://github.com/llvm/llvm-project/pull/181252	2026-03-26 10:14:10 +00:00
Luke Lau	065a39b9f7	[VPlan] Tighten SafeAVL matching in convertEVLExitCond. NFC (#179164 ) Follow-up from https://github.com/llvm/llvm-project/pull/178181#discussion_r2743630145	2026-03-25 18:11:51 +08:00
Benjamin Maxwell	249b086545	[LV] Fix crash when extends are not widened in partial reduction matching (#187782 ) Fixes https://github.com/llvm/llvm-project/pull/185821#issuecomment-4098933551	2026-03-23 10:30:19 +00:00
Florian Hahn	c079372099	[VPlan] Add m_VPPhi pattern matcher and use in removeDeadRecipes (NFC). Add m_VPPhi to match VPPhi instructions with exactly 2 operands. Split off from https://github.com/llvm/llvm-project/pull/156262.	2026-03-22 19:49:47 +00:00
Ramkumar Ramachandra	1dfd268f10	[VPlan] Simplify mul x, -1 -> sub 0, x (#187551 ) Simplify exactly as InstCombine does. A follow-up would include simplifying add x, (sub 0, y) -> sub x, y. Alive2 proof: https://alive2.llvm.org/ce/z/Af7QiD	2026-03-20 12:07:51 +00:00
Benjamin Maxwell	4b17135d14	[LV] Simplify `matchExtendedReductionOperand()` (NFCI) (#185821 ) This updates `matchExtendedReductionOperand` so the simple case of `UpdateR(PrevValue, ext(...))` is matched first as an early exit. The binop matching is then flattened to remove the extra layer of the `MatchExtends` lambda.	2026-03-20 09:29:28 +00:00
Graham Hunter	b227fab5a6	[NFC][LV] Introduce enums for uncountable exit detail and style (#184808 ) Recursively splitting out some work from #183318; this covers the enums for early exit loop type (none, readonly, readwrite) and the style used (just readonly and masked-handle-ee-in-scalar-tail for now) and refactoring for basic use of those enums.	2026-03-19 14:17:25 +00:00
Elvis Wang	53f8f3b017	Reland [LV] Replace remaining LogicalAnd to vp.merge in EVL optimization. (#184068 ) (#187199 ) This patch replaces the remaining LogicalAnd with vp.merge in the second pass to not break the `m_RemoveMask` pattern in the optimizeMaskToEVL. Also skip cost model comparison when the plan contains `vp_merge` which won't be calculated by the legacy model. This can help to remove header mask for FindLast reduction (CSA) loops. Original PR: https://github.com/llvm/llvm-project/pull/184068 Original build-bot failure: https://lab.llvm.org/buildbot/#/builders/213/builds/2497	2026-03-19 07:56:42 +08:00
Florian Hahn	fce100e26e	[VPlan] Fix masked_cond expansion. masked_cond is used to combine early-exit conditions with masks from predicate. The early-exit condition should only be evaluated if the mask is true. Emit the mask first, to avoid incorrect poison propagation. Fixes https://github.com/llvm/llvm-project/issues/187061.	2026-03-18 20:26:04 +00:00
Luke Lau	bf46a95f2c	[VPlan] Use target's index type for {First,Last}ActiveLane instead of i64 (#186361 ) Fixes #186005 On RV32 with zve32x, i.e. no legal 64 bit types either scalar or vector, @llvm.cttz.elts.i64 cannot be lowered and so returns an illegal cost for scalable VFs. However VPInstruction::FirstActiveLane and VPInstruction::LastActiveLane always use a hardcoded i64 type. This causes a legacy/VPlan cost model mismatch in the live-out.ll test, and in early-exit-live-out.ll prevents the scalable VF from being chosen. This PR teaches the two VPInstructions to use the target's index type, i.e. the width of a pointer in the default address space, so it will generate a 32 bit cttz.elts on RV32. This should be large enough to hold the maximum number of elements in a vector, as if the vector was any bigger it would imply it isn't accessible by memory. I considered using the canonical IV type but I don't think that will work since the canonical IV can be i64 on RV32, and it causes regressions due to extra zexting on 64-bit targets with a 32-bit IV.	2026-03-18 15:01:21 +00:00
Elvis Wang	3eb8b788b7	Revert "[LV] Replace remaining LogicalAnd to vp.merge in EVL optimization." (#187170 ) Reverts llvm/llvm-project#184068 This hit the cost model assertion in rva23 stage2 build bot. https://lab.llvm.org/buildbot/#/builders/213/builds/2497	2026-03-18 09:21:40 +08:00
Elvis Wang	52089f895e	[LV] Replace remaining LogicalAnd to vp.merge in EVL optimization. (#184068 ) This patch replaces the remaining LogicalAnd with vp.merge in the second pass to not break the `m_RemoveMask` pattern in the optimizeMaskToEVL. This can help to remove header mask for FindLast reduction (CSA) loops. PR: https://github.com/llvm/llvm-project/pull/184068	2026-03-18 08:39:27 +08:00
Ramkumar Ramachandra	56d7920c09	[VPlan] Factor collectGroupedReplicateMemOps (NFC) (#186820 ) Factor out a collectGroupedReplicateMemOps from collectComplementaryPredicatedMemOps, so it can be re-used in other places.	2026-03-17 09:15:46 +00:00
Elvis Wang	51b3b9b039	[LV] Optimize x && (x && y) -> x && y (#185806 ) This patch removes the extra logical-and in `x && (x && y)` and `x && (y && x)` to `x && y`. This helps to simplify mask calculation in the FindLast reduction and exposes more opportunities to replace to EVL. PR link: https://github.com/llvm/llvm-project/pull/185806	2026-03-17 13:03:04 +08:00
Ramkumar Ramachandra	92e44b247f	Reland [VPlan] Extend interleave-group-narrowing to WidenCast (#186454 ) The patch was initially landed as bd5f9384, but then reverted due to an underlying issue in narrowInterleaveGroups, described in #185860. The issue has since been fixed. The reland is simply a conflict-resolved version of the original patch, which includes an additional test update. WidenCast is very similar to Widen recipes. Fixes #128062.	2026-03-16 12:21:48 +00:00
Ramkumar Ramachandra	616bf5abd1	[VPlan] Introduce VPlan::getDataLayout (NFC) (#186418 )	2026-03-13 16:17:04 +00:00
Florian Hahn	cbb8e08192	[VPlan] Don't narrow wide loads for scalable VFs when narrowing IGs. (#186181 ) For scalable VFs, the narrowed plan processes vscale iterations at once, so a shared wide load cannot be narrowed to a uniform scalar; bail out, as there currently is no way to create a narrowed load that loads vscale elements. Fixes https://github.com/llvm/llvm-project/issues/185860. PR: https://github.com/llvm/llvm-project/pull/186181	2026-03-13 16:04:42 +00:00
Florian Hahn	579aca8755	[VPlan] Prevent uses of materialized VPSymbolicValues. (NFC) (#182318 ) After VPSymbolicValues (like VF and VFxUF) are materialized via replaceAllUsesWith, they should not be accessed again. This patch: 1. Tracks materialization state in VPSymbolicValue. 2. Asserts if the materialized VPValue is used again. Currently it adds asserts to various member functions, preventing calling them on materialized symbolic values. Note that this still allows some uses (e.g. comparing VPSymbolicValue references or pointers), but this should be relatively harmless given that it is impossible to (re-)add any users. If we want to further tighten the checks, we could add asserts to the accessors or override operator&, but that will require more changes and not add much extra guards I think. Depends on https://github.com/llvm/llvm-project/pull/182146 to fix a current access violation. PR: https://github.com/llvm/llvm-project/pull/182318	2026-03-13 14:39:46 +00:00
Ramkumar Ramachandra	540ea54ad7	Revert "[VPlan] Extend interleave-group-narrowing to WidenCast" (#186072 ) This reverts commit bd5f9384 (#183204) to buy us time to investigate an AArch64 SVE-fixed-length buildbot miscompile. Ref: https://lab.llvm.org/buildbot/#/builders/143/builds/14601	2026-03-12 11:37:09 +00:00
Benjamin Maxwell	430e2b7b79	[LV] Simplify the chain traversal in `getScaledReductions()` (NFCI) (#184830 ) I found the logic of this function quite hard to reason about. This patch attempts to rectify this by splitting out matching an extended reduction operand and traversing reduction chain. - `matchExtendedReductionOperand()` contains all the logic to match an extended operand. - `getScaledReductions()` validates each operation in the chain, starting backwards from the exit value, walking up through the operand that is not extended.	2026-03-11 06:39:20 +00:00
Florian Hahn	c79a058a6a	[VPlan] Materialize VectorTripCount in narrowInterleaveGroups. (#182146 ) When narrowInterleaveGroups transforms a plan, VF and VFxUF are materialized (replaced with concrete values). This patch also materializes the VectorTripCount in the same transform. This ensures that VectorTripCount is properly computed when the narrow interleave transform is applied, instead of using the original VF + UF to compute the vector trip count. The previous behavior generated correct code, but executed fewer iterations in the vector loop. The change also enables stricter verification to prevent accesses of UF, VF, VFxUF etc after materialization as follow-up. Note that in some cases we now miss branch folding, but that should be addressed separately, https://github.com/llvm/llvm-project/pull/181252 Fixes one of the violations accessing a VectorTripCount after UF and VF being materialized PR: https://github.com/llvm/llvm-project/pull/182146	2026-03-10 12:33:30 +00:00
Sander de Smalen	0da00c325b	[LV] Support float and pointer FindLast reductions (#184101 ) This duplicates #182313 with some very small modifications on top, as @dheaton-arm is unable to finish the PR and I'm unable to push to his branch. Expands support for the `FindLast` Recurrence Kind to floating-point and pointer types, thereby enabling conditional scalar assignment (CSA) for these types. Originally authored by @dheaton-arm --------- Co-authored-by: Damian Heaton <Damian.Heaton@arm.com>	2026-03-09 10:27:06 +00:00
Aiden Grossman	e30f9c1946	Revert "Reapply "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252 )"" This reverts commit 6aa115bba55054b0dc81ebfc049e8c7a29e614b2. This is causing crashes. See #185345 for details.	2026-03-09 04:24:01 +00:00
Florian Hahn	2207296d3f	[VPlan] Fold constant trunc after EVL simplification. This fixes a crash for the new test after 6aa115bba55054b0dc81ebfc049e8c7a29e614b2.	2026-03-08 19:31:20 +00:00
Florian Hahn	6aa115bba5	Reapply "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252 )" This reverts commit d7e037c8383e66e5c07897f144f6d8ef47258682. Recommit with a small fix to properly handle ordered reductions when connecting the epilogue. Original message: Replace manual region dissolution code in simplifyBranchConditionForVFAndUF with using general removeBranchOnConst. simplifyBranchConditionForVFAndUF now just creates a (BranchOnCond true) or updates BranchOnTwoConds. The loop then gets automatically removed by running removeBranchOnConst. This removes a bunch of special logic to handle header phi replacements and CFG updates. With the new code, there's no restriction on what kind of header phi recipes the loop contains. Note that VPEVLBasedIVRecipe needs to be marked as readnone. This is technically unrelated, but I could not find an independent test that would be impacted. The code to deal with epilogue resume values now needs updating, because we may simplify a reduction directly to the start value. PR: https://github.com/llvm/llvm-project/pull/181252	2026-03-08 11:13:40 +00:00
Florian Hahn	2ce5f91425	[VPlan] Optimize resume values of IVs together with other exit values. (#174239 ) Remove updateScalarResumePhis and create extracts for live-outs early in addInitialSkeleton. Instead of extracting from the header phi recipes for the resume values (which is incorrect), extract the last lane of the backedge value. Then update optimizeInductionExitUsers to optimize both the scalar resume values for IVs and IV exit values together. This removes the need to pass state between transforms and addresses a TODO. PR: https://github.com/llvm/llvm-project/pull/174239	2026-03-06 17:05:53 +00:00
Florian Hahn	d316fb0797	[VPlan] Replicate VPScalarIVStepsRecipe by VF outside replicate regions. (#170053 ) Extend replicateByVF to also handle VPScalarIVStepsRecipe. To do so, the patch adds a new lane operand to VPScalarIVStepsRecipe, which is only added when replicating. This enables removing a number of lane 0 computations. The lane operand will also be used to explicitly replicate replicate regions in a follow-up. Depends on https://github.com/llvm/llvm-project/pull/169796 Depends on https://github.com/llvm/llvm-project/pull/170906 PR: https://github.com/llvm/llvm-project/pull/170053	2026-03-05 12:42:20 +00:00
Ramkumar Ramachandra	ca0d100e79	[VPlan] Use VPlan::getZero to improve code (NFC) (#184591 )	2026-03-04 21:21:35 +00:00
Florian Hahn	c370f5af6c	[VPlan] Preserve IsSingleScalar for hoisted predicated load. (#184453 ) The predicated loads may be single scalar (e.g. for VF = 1). We should preserve IsSingleScalar when hoisting them. As all loops access the same address, IsSingleScalar must match across all loads in the group. This fixes an assertion when interleaving-only with hoisted loads. Fixes https://github.com/llvm/llvm-project/issues/184372 PR: https://github.com/llvm/llvm-project/pull/184453	2026-03-04 14:32:00 +00:00
Florian Hahn	bbde3e3b59	[VPlan] Preserve IsSingleScalar for sunken predicated stores. (#184329 ) The predicated stores may be single scalar (e.g. for VF = 1). We should preserve IsSingleScalar. As all stores access the same address, IsSingleScalar must match across all stores in the group. This fixes an assertion when interleaving-only with sunken stores. Fixes https://github.com/llvm/llvm-project/issues/184317 PR: https://github.com/llvm/llvm-project/pull/184329	2026-03-03 14:08:00 +00:00
Ramkumar Ramachandra	b4743b2641	[VPlan] Introduce VPlan::get(Zero\|AllOnes) (NFC) (#184085 )	2026-03-03 09:47:05 +00:00
Luke Lau	bcc272b322	[LV] Remove DataAndControlFlowWithoutRuntimeCheck. NFC (#183762 ) After #144963 and #183292 we never emit the runtime check, so DataAndControlFlowWithoutRuntimeCheck is equivalent to DataAndControlFlow. With that we only need to store one tail folding style instead of two, because we don't need to distinguish whether or not the IV update overflows (to a non-zero value)	2026-03-02 21:14:04 +08:00
Jan Patrick Lehr	60fec80bdc	Revert "[VPlan] Remove unused VPExpandSCEVRecipe before expansion" (#184108 ) Reverts llvm/llvm-project#181329 Breaks: https://lab.llvm.org/buildbot/#/builders/123/builds/36163 Local revert fixes the issue seen in the buildbot.	2026-03-02 12:45:48 +00:00
Mel Chen	c62c00c524	[VPlan] Remove unused VPExpandSCEVRecipe before expansion (#181329 ) VPExpandSCEVRecipe may become unused after VPlan optimizations. This patch removes VPExpandSCEVRecipes with no users before expansion in expandSCEVs, avoiding generating dead code during VPlan execution.	2026-03-02 09:04:59 +00:00
Florian Hahn	320220e48b	[VPlan] Support arbitrary predicated early exits. (#182396 ) This removes the restriction requiring a single predicated early exit. Using MaskedCond, we only combine early-exit conditions with block masks from non-exiting control flow. This means we have to ensure that we check the early exit conditions in program order, to make sure we take the first exit in program order that exits at the first lane for the combined exit condition. To do so, sort the exits by their reverse post-order numbers. Depends on https://github.com/llvm/llvm-project/pull/182395 PR: https://github.com/llvm/llvm-project/pull/182396	2026-03-01 16:07:05 +00:00
Florian Hahn	72525fb4ee	[VPlan] Materialize UF after unrolling (NFCI). Move materialization of the symbolic UF directly to unrollByUF. At this point, unrolling materializes the decision and it is natural to also materialize the symbolic UF here.	2026-02-28 12:44:15 +00:00
Luke Lau	6f9c68d320	[VPlan] Don't adjust trip count for DataAndControlFlowWithoutRuntimeCheck (#183729 ) Previously, the canonical IV increment may have overflowed to a non-zero value due to vscale being a non power-of-two. So we used to emit a runtime check for this. If you didn't want the runtime check, DataAndControlFlowWithoutRuntimeCheck skipped it and instead tweaked the trip count so it wouldn't overflow. However #144963 stopped the check from ever being emitted because vscale is always a power-of-two on AArch64 and RISC-V, so it never overflowed to a non-zero value. And in #183292 the code to emit the check was removed. But we never restored the trip count back to normal when the target's vscale was a power-of-two. Now that vscale is always a power-of-two, this PR avoids adjusting it. A follow up NFC can then remove DataAndControlFlowWithoutRuntimeCheck.	2026-02-28 04:01:58 +00:00
Florian Hahn	d7e037c838	Revert "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252 )" This reverts commit 9c53215d213189d1f62e8f6ee7ba73a089ac2269. Appears to cause crashes with ordered reductions, revert while I investigate	2026-02-27 21:29:41 +00:00
Florian Hahn	9c53215d21	[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252 ) Replace manual region dissolution code in simplifyBranchConditionForVFAndUF with using general removeBranchOnConst. simplifyBranchConditionForVFAndUF now just creates a (BranchOnCond true) or updates BranchOnTwoConds. The loop then gets automatically removed by running removeBranchOnConst. This removes a bunch of special logic to handle header phi replacements and CFG updates. With the new code, there's no restriction on what kind of header phi recipes the loop contains. Note that VPEVLBasedIVRecipe needs to be marked as readnone. This is technically unrelated, but I could not find an independent test that would be impacted. The code to deal with epilogue resume values now needs updating, because we may simplify a reduction directly to the start value. PR: https://github.com/llvm/llvm-project/pull/181252	2026-02-27 16:49:54 +00:00
Luke Lau	c5c0fe663c	[VPlan] Remove non-power-of-2 scalable VF comment. NFC (#183719 ) No longer holds after #183080	2026-02-27 10:45:17 +00:00
Sander de Smalen	a1f83ba1b6	[LV] NFCI: Move extend optimization to transformToPartialReduction. (#182860 ) The reason for doing this in `transformToPartialReduction` is so that we can create the VPExpressions directly when transforming reductions into partial reductions (to be done in a follow-up PR). I also intend to see if we can merge the in-loop reductions with partial reductions, so that there will be no need for the separate `convertToAbstractRecipes` VPlan Transform pass.	2026-02-27 08:38:13 +00:00
Ramkumar Ramachandra	bd5f9384d8	[VPlan] Extend interleave-group-narrowing to WidenCast (#183204 ) WidenCast is very similar to Widen recipes. Fixes #128062.	2026-02-26 14:50:25 +00:00
Florian Hahn	c5d6feb315	[VPlan] Limit interleave group narrowing to consecutive wide loads. Tighten check in canNarrowLoad to require consecutive wide loads; we cannot properly narrow gathers at the moment. Fixes https://github.com/llvm/llvm-project/issues/183345.	2026-02-26 12:52:31 +00:00
Florian Hahn	32b8b9ba1e	[VPlan] Simplify ExitingIVValue and use for tail-folded IVs. (#182507 ) Now that we have ExitingIVValue, we can also use it for tail-folded loops; the only difference is that we have to compute the end value with the original trip count instead of the vector trip count. This allows removing the induction increment operand only used when tail-folding. PR: https://github.com/llvm/llvm-project/pull/182507	2026-02-26 11:48:04 +00:00
Benjamin Maxwell	3c566a698a	[LV] Fix miscompile with conditional scalar assignment + tail folding (#182492 ) Previously, we could miscompile when vectorizing conditional scalar assignments with forced tail folding, as the backedge select could be based on the header mask, not the assignment conditional. This resulted in a number of failures in the LLVM test suite when building with `-O3 -march=armv8-a+sve -mllvm -prefer-predicate-over-epilogue=predicate-dont-vectorize`. The patch reworks `handleFindLastReductions()` to correctly handle tail folding.	2026-02-26 09:00:16 +00:00
Florian Hahn	bf4705c05b	[VPlan] Supported conditionally executed single early exits. (#182395 ) Add support for a single early exit that is executed conditionally. We need to make sure the mask from any non-exiting control flow is combined with the early exit condition. To do so, introduce a MaskedCond VPInstruction, which is inserted as user of the early-exit condition, at the point of the early-exit branch. The VPInstruction will get masked automatically if needed by the predicator, ensuring that we properly account for it when checking whether the early exit has been taken. Note that this does not allow for instructions that require predication after the early exit. This requires additional work in progress: https://github.com/llvm/llvm-project/pull/172454 As an alternative to MaskedCond, we could also predicate before handling early exiting blocks: https://github.com/llvm/llvm-project/pull/181830 PR: https://github.com/llvm/llvm-project/pull/182395	2026-02-25 14:28:04 +00:00


714 Commits