llvm-project

Author	SHA1	Message	Date
Florian Hahn	61521a94af	[VPlan] Ensure countable region in narrowInterleaveGroups. This tightens the legality checks. Currently should not have any impact, but is needed to avoid mis-compiles in follow-up changes.	2026-02-10 21:24:54 +00:00
Florian Hahn	a1fc5b4a48	[VPlan] Reject partial reductions with invalid costs in getScaledReds. (#180438 ) Check if costs for partial reductions are valid up-front in getScaledReductions instead of when transforming each link in the chain in transformToPartialReduction. This ensures that we either transform all entries in the chain together, or none via the existing invalidation logic. This fixes a crash when a link in the chain would have invalid cost, as in the added test cases. Fixes https://github.com/llvm/llvm-project/issues/180340. PR: https://github.com/llvm/llvm-project/pull/180438	2026-02-10 21:16:21 +00:00
Sander de Smalen	3157758190	[LV] Handle partial sub-reductions with sub in middle block. (#178919 ) Sub-reductions can be implemented in two ways: (1) negate the operand in the vector loop (the default way). (2) subtract the reduced value from the init value in the middle block. Note that both ways keep the reduction itself as an 'add' reduction, which is necessary because only llvm.vector.partial.reduce.add exists. The ISD nodes for partial reductions don't support folding the sub/negation into its operands because the following is not a valid transformation: `sub(0, mul(ext(a), ext(b))) -> mul(ext(a), ext(sub(0, b)))`. It can therefore be better to choose option (2) such that the partial reduction is always positive (starting at '0') and to do a final subtract in the middle block. For AArch64 there are no dot-product instructions that can do a `partial.reduce.sub(acc, mul(ext(a), ext(b)))` operation. I'm not sure if such instructions exist for other targets. (If so, we may want to make this decision a target option.) This PR also increases the AArch64 cost of a partial sub-reduction when this exists in an 'add-sub' reduction chain. Fixes https://github.com/llvm/llvm-project/issues/178703	2026-02-10 11:00:32 +00:00
Mel Chen	7e5d9189d2	[VPlan] Simplify true && x -> x (#179426 )	2026-02-10 08:49:03 +00:00
Florian Hahn	d1ec04dfd4	[VPlan] Simplify single-entry VPWidenPHIRecipe. Include VPWidenPHIRecipe in phi simplification if there's a single incoming value.	2026-02-09 22:10:13 +00:00
Luke Lau	8cd86ff284	[VPlan] Propagate FastMathFlags from phis to blends (#180226 ) If a phi has fast math flags, we can propagate them to the widened select. To do this, this patch makes VPPhi and VPBlendRecipe subclasses of VPRecipeWithIRFlags, and propagates the flags through PlainCFGBuilder and VPPredicator. Alive2 proofs for some of the FMFs (it looks like it can't reason about the full "fast" set yet) nnan: https://alive2.llvm.org/ce/z/f0bRd4 nsz: https://alive2.llvm.org/ce/z/u9P96T The actual motivation for this is to eventually be able to move the special casing for tail folding in LoopVectorizationPlanner::addReductionResultComputation into the CFG in #176143, which requires passing through FMFs.	2026-02-09 19:38:58 +08:00
Florian Hahn	7509cad693	[VPlan] Support masked VPInsts, use for predication (NFC) (#142285 ) Add support for mask operands to most VPInstructions, using getNumOperandsForOpcode. This allows VPlan predication to predicate VPInstructions directly. The mask will then be dropped or handled when creating wide recipes. Depends on https://github.com/llvm/llvm-project/pull/142284. Depends on https://github.com/llvm/llvm-project/pull/168784. PR: https://github.com/llvm/llvm-project/pull/142285	2026-02-08 18:23:36 +00:00
Ramkumar Ramachandra	22a16623e1	[VPlan] Fix comments in simplifyRecipe around BinaryOr (NFC) (#180050 )	2026-02-06 13:43:30 +00:00
Ramkumar Ramachandra	901d175d18	[VPlan] Simplify x & AllOnes -> x (#180049 )	2026-02-06 13:42:58 +00:00
Florian Hahn	fdce0ea708	[VPlan] Add ExitingIVValue VPInstruction. (#175651 ) Add a new VPInstruction opcode to compute the exiting value of an induction variable after vectorization. This replaces the pattern of extracting the last lane from the last part of the induction backedge value when applicable. This allows us to always use the pre-computed IV end value. It will also allow unifying end value creation for both induction resume and exit values. PR: https://github.com/llvm/llvm-project/pull/175651	2026-02-06 12:27:31 +00:00
Florian Hahn	05a2b146fb	[LV] Optimize FindLast recurrences to FindIV (NFCI). (#177870 ) This patch restructures Find(First|Last)IV handling. Instead of differentiating between FindLast, FindFirstIV and FindLastIV up front, this patch simplifies the logic in IVDescriptor to just identify the FindLast pattern up-front. It then adds a new VPlan transformation to optimize FindLast reductions to FindIV reductions if there is a suitable sentinel value. It also merges the Find(Last|First)IV recurrence kinds into a single FindIV kind. This is simpler and more accurate, given selecting the first/last induction of the final IV reduction is directly controlled by the corresponding recurrence kind of the ComputeReductionResult. The new structure also allows further optimizations, like vectorizing FindLastIV with another boolean reduction that tracks if the condition in the loop was ever true, if there is no suitable sentinel value. PR: https://github.com/llvm/llvm-project/pull/177870	2026-02-05 13:57:20 +00:00
Florian Hahn	792f7b089a	[VPlan] Refine exit select check in transformToPartialReduction. Make sure we find the actual select for the exit users and only use it for the final link in the chain. This fixes a miscompile after 90b3712d8a20efa2cbaadc177da576e485dce038.	2026-02-03 21:07:02 +00:00
Florian Hahn	8240cf337a	[VPlan] Always set flags for overflowing ops etc via VPIRFlags. (#179138 ) Enforce that all VPInstructions set the correct OpType of the VPIRFlags. Flag mis-matches (e.g. VPInstruction Add without `OverflowingBinOp` being set) can cause crashes (e.g. in CSE) or potentially mis-compiles. Add a few helpers in VPBuilder to create common instructions with correct flags. PR: https://github.com/llvm/llvm-project/pull/179138	2026-02-03 12:33:23 +00:00
Mel Chen	8c6658aca6	[VPlan] Sink recipes from the vector loop region in licm. (#168031 ) When a recipe can be safely sunk and all of its users are outside the vector loop region in the same dedicated exit block, the recipe does not need to be executed on every iteration. This patch extends the VPlan-based LICM (Loop Invariant Code Motion) to also sink such recipes from the vector loop region into the exit block. This reduces redundant computation and improves cost model accuracy. TODO: Support nested loop sinking. TODO: Support sinking `VPReplicateRecipe` (requires `replicateByVF` fixes). TODO: Support recipes with multiple defined values (e.g., interleaved loads). TODO: Clone recipes without users to all exit blocks. TODO: Support PHI node users by checking incoming value blocks. TODO: Support sinking when users are in multiple blocks. TODO: Clone recipes when users are on multiple exit paths. Co-authored-by: Luke Lau <luke@igalia.com> Co-authored-by: Luke Lau <luke_lau@icloud.com>	2026-02-03 07:57:15 +00:00
Luke Lau	bb14eabaca	[VPlan] Split out EVL exit cond transform from canonicalizeEVLLoops. NFC (#178181 ) This is split out from #177114. In order to make canonicalizeEVLLoops a generic "convert to variable stepping" transform, move the code that changes the exit condition to a separate transform since not all variable stepping loops will want to transform the exit condition. Run it before canonicalizeEVLLoops before VPEVLBasedIVPHIRecipe is expanded. Also relax the assertion for VPInstruction::ExplicitVectorLength to just bail instead, since eventually VPEVLBasedIVPHIRecipe will be used by other loops that aren't EVL tail folded.	2026-02-02 04:45:43 +00:00
Florian Hahn	beb0e7e150	[VPlan] Fold (x | !x) -> true. (#177887 ) PR: https://github.com/llvm/llvm-project/pull/177887	2026-02-01 20:12:21 +00:00
Florian Hahn	90b3712d8a	Reapply "[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851 )" This reverts commit d1e477b00b49c63ff4dd513eeb14a5b18bc055d7. Recommit with extra checks making sure extends are VPWidenCastRecipes, rejecting VPReplicateRecipes. Original message: As a first step, move the existing partial reduction detection logic to VPlan, trying to preserve the existing code structure & behavior as closely as possible. With this, partial reductions are detected and created together in a single step. This allows forming partial reductions and bundling them up if profitable together in a follow-up. PR: https://github.com/llvm/llvm-project/pull/167851	2026-02-01 16:27:27 +00:00
Martin Storsjö	d1e477b00b	Revert "[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851 )" This reverts commit f4e8cc1a2229dca76d21c8d37439c4c194b06b86. This change wasn't NFC; it causes failed asserts when building ffmpeg for i686 windows, see https://github.com/llvm/llvm-project/pull/167851 for details.	2026-02-01 14:35:02 +02:00
Florian Hahn	f4e8cc1a22	[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851 ) As a first step, move the existing partial reduction detection logic to VPlan, trying to preserve the existing code structure & behavior as closely as possible. With this, partial reductions are detected and created together in a single step. This allows forming partial reductions and bundling them up if profitable together in a follow-up. PR: https://github.com/llvm/llvm-project/pull/167851	2026-01-31 19:44:46 +00:00
Andrei Elovikov	d8621d665d	Reapply "[VPlan] Add hidden `-vplan-print-after-all` option" (#178547 ) Re-commit of https://github.com/llvm/llvm-project/pull/175839 after fixing build without `LLVM_ENABLE_DUMP`. This consists of the following changes: * Merge several overloads of `VPlanTransforms::runPass` into a single function to avoid code duplication. * Add helper macro `RUN_VPLAN_PASS` to capture the transformation name and pass it to the helper above for printing. * Add new `-vplan-print-after-all` option (somewhat similar to existing `-vplan-verify-each`). * Add two empty passes `printAfterInitialConstruction`/`printFinalVPlan` so that initial/final VPlans would be supported in `-vplan-print-after-all` This follows the original future plans in https://github.com/llvm/llvm-project/pull/123640.	2026-01-30 19:55:09 +00:00
Sander de Smalen	b4c7518a0f	[LV] Add support for extended fadd reductions (#178447 ) This makes use of the llvm.vector.partial.reduce.fadd intrinsics added in #163975 to handle the following with FDOT: `float32_t fdot(float16_t *src, int N) { float32_t sum = 0.0f; for (int i=0; i<N; ++i) sum += src[i]; return sum; }`	2026-01-30 08:27:57 +00:00
Florian Hahn	eabcdb572b	Revert "[VPlan] Add hidden `-vplan-print-after-all` option (#175839 )" (#178544 ) This reverts commit 97e1df149de213b760aae4060ee9e25dc9908125. It looks like the commit caused some build bot failures. Revert back to green so the failures can be investigated. https://lab.llvm.org/buildbot/#/builders/159/builds/39803 https://lab.llvm.org/buildbot/#/builders/2/builds/43204	2026-01-28 23:49:24 +00:00
Andrei Elovikov	97e1df149d	[VPlan] Add hidden `-vplan-print-after-all` option (#175839 ) This consists of the following changes: * Merge several overloads of `VPlanTransforms::runPass` into a single function to avoid code duplication. * Add helper macro `RUN_VPLAN_PASS` to capture the transformation name and pass it to the helper above for printing. * Add new `-vplan-print-after-all` option (somewhat similar to existing `-vplan-verify-each`). * Add two empty passes `printAfterInitialConstruction`/`printFinalVPlan` so that initial/final VPlans would be supported in `-vplan-print-after-all` This follows the original future plans in https://github.com/llvm/llvm-project/pull/123640.	2026-01-28 22:25:54 +00:00
Jakub Kuderski	55fbb71db1	[llvm] Fix new clang-tidy warning llvm-type-switch-case-types. NFC. (#178502 ) Pre-committing this before landing the new check in https://github.com/llvm/llvm-project/pull/177892	2026-01-28 15:44:04 -05:00
Florian Hahn	e36cd26618	[VPlan] Remove non-reductions after simplifications. (#176795 ) In some cases, we identify patterns as reductions, even though they can be simplified to a non-reduction. Mark VPReductionPHIRecipe as not reading from memory & not having side-effects, to clean them up. We also need to remove ComputeReductionResult VPInstructions with live-in arguments. This means there is actually no reduction, and we need to fold it to the live in. Otherwise we would incorrectly reduce the live-in. PR: https://github.com/llvm/llvm-project/pull/176795	2026-01-28 15:51:08 +00:00
Damian Heaton	762ba885f9	[LV] Add support for llvm.vector.partial.reduce.fadd (#163975 ) Allows the Loop Vectorizer to generate `llvm.vector.partial.reduce.fadd` intrinsics when sequences which match its requirements are found.	2026-01-28 15:05:34 +00:00
Jim Lin	0ed8e7230f	[VPlan] Create SCEV before any VPIRInstructions to check for overflow (#177911 ) This PR fixes the assertion failure at VPlanTransforms.cpp:4862, which occurred because SCEV was created after VPIRInstructions. The trip count in scalable-predication.ll was changed from the constant value 256 to the non-constant value %n so that the VPIRInstructions are not optimized out; with them optimized out, the assertion failure cannot be triggered. The order in ir-bb<entry> changes from: ir-bb<entry>: EMIT vp<%2> = EXPAND SCEV (1 umax %n) EMIT vp<%3> = sub ir<-1>, vp<%2> EMIT vp<%4> = EXPAND SCEV (4 * vscale)<nuw> EMIT vp<%5> = icmp ult vp<%3>, vp<%4> EMIT branch-on-cond vp<%5> Successor(s): scalar.ph, vector.ph to: ir-bb<entry>: EMIT vp<%2> = EXPAND SCEV (1 umax %n) EMIT vp<%3> = EXPAND SCEV (4 * vscale)<nuw> EMIT vp<%4> = sub ir<-1>, vp<%2> EMIT vp<%5> = icmp ult vp<%4>, vp<%3> EMIT branch-on-cond vp<%5> Successor(s): scalar.ph, vector.ph	2026-01-28 03:16:50 +00:00
Ramkumar Ramachandra	a40f9710b7	[VPlan] Refine VPValue types in tryToFoldLiveIns (NFC) (#178183 ) tryToFoldLiveIns operates on live-ins (that is, both VPIRValues and VPSymbolicValues), and returns a VPIRValue. Clarify this.	2026-01-27 13:23:06 +00:00
Florian Hahn	a871b707b7	Reapply "[VPlan] Move VDef subclass ID to VPRecipeBase (NFC). (#174282 )" Move SubclassID to VPRecipeBase, and store VPRecipeBase directly in VPRecipeValue, instead of VPDef. This allows for some additional simplifications and VPDef now just holds various helpers to deal with removing and adding VPValues. This reverts commit 16395da0ff577750571b99fe28281ce6fb6a3ae8. PR: https://github.com/llvm/llvm-project/pull/174282	2026-01-24 13:22:48 +00:00
Florian Hahn	16395da0ff	Revert "[VPlan] Fold VPDef into VPRecipeBase (NFC). (#174282 )" This reverts commit f3ae334f4b7a8cf4fe0eb6ee7b2f2ef0879f522d. Committed with out-of-date message, revert to reland with updated message.	2026-01-24 13:16:45 +00:00
Florian Hahn	f3ae334f4b	[VPlan] Fold VPDef into VPRecipeBase (NFC). (#174282 ) A separate VPDef is not needed any longer, fold it into VPRecipeBase to simplify code and class hierarchy. Depends on https://github.com/llvm/llvm-project/pull/172758. PR: https://github.com/llvm/llvm-project/pull/174282	2026-01-24 13:16:12 +00:00
Florian Hahn	dd363d0629	[VPlan] Replace UnrollPart for VPScalarIVSteps with start index op (NFC) (#170906 ) Replace the unroll part operand for VPScalarIVStepsRecipe with the start index. This simplifies https://github.com/llvm/llvm-project/pull/170053 and is also a first step to break down the recipe into its components. PR: https://github.com/llvm/llvm-project/pull/170906	2026-01-21 22:13:13 +00:00
Florian Hahn	e36ddff7a4	[VPlan] Add scalable check to SinkStoreInfo helper. Bail out on scalable vectors in helper. This does not currently cause issues, but it fixes a potential crash that would be exposed by a follow-up change. A test that would expose the issue in the future has been added in 8c5352cf3e14ec0c56f592091899d229de8436a7.	2026-01-16 21:07:40 +00:00
Elvis Wang	aa11629192	[LV] Prevent `extract-lane` generate unused IRs with single vector operand. (#172798 ) When `extract-lane` has only a single vector operand, we can simplify it to `extractelement`. This patch makes `extract-lane` generate a simple `extractelement` in that case, so that no unused IR is generated. This patch is mostly NFC; the unused IR would otherwise be removed by later IR passes.	2026-01-16 13:59:51 +08:00
Florian Hahn	f14577fa6f	[VPlan] Fold boolean select to xor if possible. Fold select c, false, true -> not c. This allows for more accurate cost estimation and fixes the underlying issue for the cost divergence between legacy and VPlan-based cost model that caused the revert of 01d34eb38fa058 in ed004cf42bf57c. https://alive2.llvm.org/ce/z/yVuSgW.	2026-01-15 22:13:47 +00:00
Luke Lau	0ae23ca9e6	[VPlan] Split out optimizeEVLMasks. NFC (#174925 ) Addresses part of #153144 and splits off part of #166164 There are two parts to the EVL transform: 1) Convert the loop so the number of elements processed each iteration is EVL, not VF. The IV and header mask are replaced with EVL-based variants. 2) Optimize users of the EVL based header mask to VP intrinsic based recipes. (1) changes the semantics of the vector loop region, whereas (2) needs to preserve them. This splits (2) out so we don't mix the two up, and allows us to move (1) earlier in the pipeline in a future PR.	2026-01-14 07:01:14 +00:00
Ramkumar Ramachandra	d69335bac9	[LLVM] Clean up code using [not_]equal_to (NFC) (#175824 ) Use llvm::[not_]equal_to landed in d2a521750 ([ADT] Introduce bind_{front,back}, [not_]equal_to, #175056) across LLVM for cleaner code.	2026-01-13 21:19:39 +00:00
Florian Hahn	4a807e8dd9	[VPlan] Optimize BranchOnTwoConds to chain of 2 simple branches. (#174016 ) This patch improves the lowering for BranchOnTwoConds added in https://github.com/llvm/llvm-project/pull/172750 by replacing the branch on OR with a chain of 2 branches. On Apple M cores, the new lowering is ~8-10% faster for std::find-like loops. It also makes it easier to determine the early exits in VPlan. I am also planning on extensions to support loops with multiple early exits and early-exits at different positions, which should also be slightly easier to do with the new representation. PR: https://github.com/llvm/llvm-project/pull/174016	2026-01-13 20:14:15 +00:00
Florian Hahn	d27d75ee94	[VPlan] Use createHeaderPHIRecipes in native path (NFCI). Simplify tryToBuildVPlan by using createHeaderPHIRecipes in the native path as well.	2026-01-13 20:12:21 +00:00
Luke Lau	123b6a2766	[VPlan] Give VPInstruction::ExplicitVectorLength name. NFC (#175493 ) This makes it a tad easier to read VPlan dumps, e.g. WIDEN vp.store vp<%7>, ir<%val>, vp<%5> -> WIDEN vp.store vp<%7>, ir<%val>, vp<%evl>	2026-01-13 00:05:30 +08:00
Florian Hahn	8f182526de	[VPlan] Don't fold UDiv in replicate regions. (#175460 ) The UDiv fold added in d12e993 (#174581) is currently also applied to replicate regions, which means we may end up with VPInstructions in replicate regions, which is currently not supported. Fixes https://github.com/llvm/llvm-project/issues/175295. PR: https://github.com/llvm/llvm-project/pull/175460	2026-01-12 12:16:48 +00:00
Elvis Wang	cd2caf6580	[LV] Simplify extract-lane with scalar operand to the scalar value itself. (#174534 ) This patch simplifies extract-lane(%lane_num, %X) to %X when %X is a scalar value. Extracting from a scalar is redundant since there is only one value to extract.	2026-01-12 10:03:44 +08:00
Ramkumar Ramachandra	78f1de803a	[VPlan] Strip iterator-invalidation guard in findHeaderMask (NFC) (#174930 ) It is unnecessary, as the users are never modified.	2026-01-08 09:42:18 +00:00
Florian Hahn	31b93d6e38	[VPlan] Add specialized VPValue subclasses for different types (NFC) (#172758 ) This patch adds VPValue sub-classes for the different cases we currently have: * VPIRValue: A live-in VPValue that wraps an underlying IR value * VPSymbolicValue: A symbolic VPValue not tied to an underlying value, e.g. the vector trip count or VF VPValues * VPRecipeValue: A VPValue defined by a VPDef/VPRecipeBase. This has multiple benefits: * clearer constructors for each kind of VPValue * limited scope: for example allows moving VPDef member to VPRecipeValue, reducing size of other VPValues. * stricter type checking for member variables (e.g. using VPLiveIn in the Value -> live-in map in VPlan, or using VPSymbolicValue for symbolic member VPValues) There probably are additional opportunities for cleanups as follow-ups. PR: https://github.com/llvm/llvm-project/pull/172758	2026-01-07 20:29:05 +00:00
Ramkumar Ramachandra	d12e99376f	Reland [VPlan] Simplify pow-of-2 (mul|udiv) -> (shl|lshr) (#174581 ) The original patch, landed as a2db31b0 ([VPlan] Simplify pow-of-2 (mul|udiv) -> (shl|lshr), #172477) had a critical commutative matcher bug, which has now been fixed. An assert has also been strengthened, following a post-commit review.	2026-01-06 20:36:26 +00:00
Alex Bradbury	5a456c17d9	Revert "[VPlan] Simplify pow-of-2 (mul|udiv) -> (shl|lshr)" (#174559 ) Reverts llvm/llvm-project#172477 This is causing failures for RVA23 (including some tests running away in their execution causing OOM, hence the builder dying). I will attempt to follow up on the PR with a reproducer of some kind. https://lab.llvm.org/buildbot/#/builders/210/builds/7243	2026-01-06 10:26:51 +00:00
Ramkumar Ramachandra	a2db31b06f	[VPlan] Simplify pow-of-2 (mul|udiv) -> (shl|lshr) (#172477 )	2026-01-06 08:27:48 +00:00
Florian Hahn	16830b2164	[VPlan] Remove VPWidenSelectRecipe, use VPWidenRecipe instead (NFCI). (#174234 ) All extra state has been removed from VPWidenSelectRecipe at this point. There's no benefit of having a separate recipe and Select can easily be handled by the existing VPWidenRecipe. PR: https://github.com/llvm/llvm-project/pull/174234	2026-01-05 22:33:37 +00:00
Florian Hahn	0b46cf7dcd	[VPlan] Handle BranchOnTwoConds in simplifyBranchCondition. This fixes a crash after introducing BranchOnTwoConds (524b1788, https://github.com/llvm/llvm-project/pull/172750) when trying to replace BranchOnTwoConds with a VPBranchOnCond, without dissolving the region. In that case, we need to update the appropriate condition operand.	2025-12-30 18:47:22 +00:00
Florian Hahn	524b1788c4	[VPlan] Add BranchOnTwoConds, use for early exit plans. (#172750 ) This PR introduces a new BranchOnTwoConds VPInstruction, that takes 2 boolean operands and must be placed in a block with 3 successors. If condition I is true, branches to successor I, otherwise falls through to check the next condition. If both conditions are false, branch to the third successor. This new branch recipe is used for early-exit loops, to simplify the representation in VPlan initially, by avoiding the need for splitting the middle block early on, in a way that preserves the single-exit block property of regions. All exits still go through the latch block, but they can go to more than 2 successors. This idea was part of one of the original proposals for how to model early exits in VPlan, but at that point in time, there was no good way to handle this during code-gen, and we went with the early split-middle block approach initially. Now that we dissolve regions before ::execute, the new recipe can be lowered nicely after regions have been removed, to a set of VPBBs and BranchOnCond recipes. The initial lowering preserves the original structure with the split middle blocks. Follow-ups will improve the lowering to avoid this splitting, providing performance gains. PR: https://github.com/llvm/llvm-project/pull/172750	2025-12-29 19:39:38 +00:00
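
Commit 3157758190 states that `sub(0, mul(ext(a), ext(b))) -> mul(ext(a), ext(sub(0, b)))` is not a valid transformation. The following is a minimal sketch of one counterexample, not LLVM code. It assumes `ext` is a sign-extension from i8 to i32 and that the narrow negation wraps; the function names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// sub(0, mul(ext(a), ext(b))): widen both inputs, multiply, then negate in i32.
int32_t negateAfterMul(int8_t a, int8_t b) {
  return 0 - static_cast<int32_t>(a) * static_cast<int32_t>(b);
}

// mul(ext(a), ext(sub(0, b))): negate b in i8 first (wrapping), then widen.
int32_t negateBeforeExt(int8_t a, int8_t b) {
  // Perform the wrapping i8 negation on the raw bits in unsigned arithmetic,
  // so the sketch does not depend on signed narrowing behaviour.
  uint8_t bits;
  std::memcpy(&bits, &b, 1);
  uint8_t negBits = static_cast<uint8_t>(0u - bits);
  int8_t negB;
  std::memcpy(&negB, &negBits, 1);
  return static_cast<int32_t>(a) * static_cast<int32_t>(negB);
}

// The two forms agree for ordinary inputs but differ when b is the minimum
// i8 value, because -(-128) is not representable in i8.
void checkCounterexample() {
  assert(negateAfterMul(3, 5) == negateBeforeExt(3, 5));
  assert(negateAfterMul(1, INT8_MIN) == 128);
  assert(negateBeforeExt(1, INT8_MIN) == -128);
}
```

Under these assumptions the fold changes the result for `b == INT8_MIN`, which is consistent with the commit's choice to keep the reduction as an add and, where profitable, do the final subtract in the middle block.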

637 Commits