llvm-project

Author	SHA1	Message	Date
Florian Hahn	eabcdb572b	Revert "[VPlan] Add hidden `-vplan-print-after-all` option (#175839 )" (#178544 ) This reverts commit 97e1df149de213b760aae4060ee9e25dc9908125. It looks like the commit caused some build bot failures. Revert back to green so the failures can be investigated. https://lab.llvm.org/buildbot/#/builders/159/builds/39803 https://lab.llvm.org/buildbot/#/builders/2/builds/43204	2026-01-28 23:49:24 +00:00
Andrei Elovikov	97e1df149d	[VPlan] Add hidden `-vplan-print-after-all` option (#175839 ) This consists of the following changes: * Merge several overloads of `VPlanTransforms::runPass` into a single function to avoid code duplication. * Add helper macro `RUN_VPLAN_PASS` to capture the transformation name and pass it to the helper above for printing. * Add new `-vplan-print-after-all` option (somewhat similar to existing `-vplan-verify-each`). * Add two empty passes `printAfterInitialConstruction`/`printFinalVPlan` so that initial/final VPlans would be supported in `-vplan-print-after-all` This follows the original future plans in https://github.com/llvm/llvm-project/pull/123640.	2026-01-28 22:25:54 +00:00
Jakub Kuderski	55fbb71db1	[llvm] Fix new clang-tidy warning llvm-type-switch-case-types. NFC. (#178502 ) Pre-commiting this before landing the new check in https://github.com/llvm/llvm-project/pull/177892	2026-01-28 15:44:04 -05:00
Florian Hahn	e36cd26618	[VPlan] Remove non-reductions after simplifications. (#176795 ) In some cases, we identify patterns as reductions, even though they can be simplified to a non-reduction. Mark VPReductionPHIRecipe as not reading from memory & not having side-effects, to clean them up. We also need to remove ComputeReductionResult VPInstructions with live-in arguments. This means there is actually no reduction, and we need to fold it to the live in. Otherwise we would incorrectly reduce the live-in. PR: https://github.com/llvm/llvm-project/pull/176795	2026-01-28 15:51:08 +00:00
Damian Heaton	762ba885f9	[LV] Add support for llvm.vector.partial.reduce.fadd (#163975 ) Allows the Loop Vectorizer to generate `llvm.vector.partial.reduce.fadd` intrinsics when sequences which match its requirements are found.	2026-01-28 15:05:34 +00:00
Jim Lin	0ed8e7230f	[VPlan] Create SCEV before any VPIRInstructions to check for overflow (#177911 ) This PR addresses the assertion failure at VPlanTransforms.cpp:4862, which occurred because the SCEV was created after VPIRInstructions. The trip count in scalable-predication.ll was changed from the constant 256 to the non-constant value %n so that the VPIRInstructions are not optimized out; otherwise the assertion failure cannot be triggered. The order of recipes in ir-bb<entry> changes from: ir-bb<entry>: EMIT vp<%2> = EXPAND SCEV (1 umax %n) EMIT vp<%3> = sub ir<-1>, vp<%2> EMIT vp<%4> = EXPAND SCEV (4 * vscale)<nuw> EMIT vp<%5> = icmp ult vp<%3>, vp<%4> EMIT branch-on-cond vp<%5> Successor(s): scalar.ph, vector.ph to: ir-bb<entry>: EMIT vp<%2> = EXPAND SCEV (1 umax %n) EMIT vp<%3> = EXPAND SCEV (4 * vscale)<nuw> EMIT vp<%4> = sub ir<-1>, vp<%2> EMIT vp<%5> = icmp ult vp<%4>, vp<%3> EMIT branch-on-cond vp<%5> Successor(s): scalar.ph, vector.ph	2026-01-28 03:16:50 +00:00
Ramkumar Ramachandra	a40f9710b7	[VPlan] Refine VPValue types in tryToFoldLiveIns (NFC) (#178183 ) tryToFoldLiveIns operates on live-ins (that is, both VPIRValues and VPSymbolicValues), and returns a VPIRValue. Clarify this.	2026-01-27 13:23:06 +00:00
Florian Hahn	a871b707b7	Reapply "[VPlan] Move VDef subclass ID to VPRecipeBase (NFC). (#174282 )" Move SubclassID to VPRecipeBase, and store VPRecipeBase directly in VPRecipeValue, instead of VPDef. This allows for some additional simplifications and VPDef now just holds various helpers to deal with removing and adding VPValues. This reverts commit 16395da0ff577750571b99fe28281ce6fb6a3ae8. PR: https://github.com/llvm/llvm-project/pull/174282	2026-01-24 13:22:48 +00:00
Florian Hahn	16395da0ff	Revert "[VPlan] Fold VPDef into VPRecipeBase (NFC). (#174282 )" This reverts commit f3ae334f4b7a8cf4fe0eb6ee7b2f2ef0879f522d. Committed with out-of-date message, revert to reland with updated message.	2026-01-24 13:16:45 +00:00
Florian Hahn	f3ae334f4b	[VPlan] Fold VPDef into VPRecipeBase (NFC). (#174282 ) A separate VPDef is not needed any longer, fold it into VPRecipeBase to simplify code and class hierarchy. Depends on https://github.com/llvm/llvm-project/pull/172758. PR: https://github.com/llvm/llvm-project/pull/174282	2026-01-24 13:16:12 +00:00
Florian Hahn	dd363d0629	[VPlan] Replace UnrollPart for VPScalarIVSteps with start index op (NFC) (#170906 ) Replace the unroll part operand for VPScalarIVStepsRecipe with the start index. This simplifies https://github.com/llvm/llvm-project/pull/170053 and is also a first step to break down the recipe into its components. PR: https://github.com/llvm/llvm-project/pull/170906	2026-01-21 22:13:13 +00:00
Florian Hahn	e36ddff7a4	[VPlan] Add scalable check to SinkStoreInfo helper. Bail out on scalable vectors in helper. Currently this is not causing issues, but fixes a potential crash that would be exposed by a follow-up change. A test that would expose the issue in the future has been added in 8c5352cf3e14ec0c56f592091899d229de8436a7.	2026-01-16 21:07:40 +00:00
Elvis Wang	aa11629192	[LV] Prevent `extract-lane` generate unused IRs with single vector operand. (#172798 ) When `extract-lane` has only a single vector operand, it can be simplified to `extractelement`. This patch makes `extract-lane` generate a simple `extractelement` in that case to avoid generating unused IR. This patch is mostly NFC; the unused IR should be removed by later IR passes.	2026-01-16 13:59:51 +08:00
Florian Hahn	f14577fa6f	[VPlan] Fold boolean select to xor if possible. Fold select c, false, true -> not c. This allows for more accurate cost estimation and fixes the underlying issue for the cost divergence between legacy and VPlan-based cost model that caused the revert of 01d34eb38fa058 in ed004cf42bf57c. https://alive2.llvm.org/ce/z/yVuSgW.	2026-01-15 22:13:47 +00:00
Luke Lau	0ae23ca9e6	[VPlan] Split out optimizeEVLMasks. NFC (#174925 ) Addresses part of #153144 and splits off part of #166164 There are two parts to the EVL transform: 1) Convert the loop so the number of elements processed each iteration is EVL, not VF. The IV and header mask are replaced with EVL-based variants. 2) Optimize users of the EVL based header mask to VP intrinsic based recipes. (1) changes the semantics of the vector loop region, whereas (2) needs to preserve them. This splits (2) out so we don't mix the two up, and allows us to move (1) earlier in the pipeline in a future PR.	2026-01-14 07:01:14 +00:00
Ramkumar Ramachandra	d69335bac9	[LLVM] Clean up code using [not_]equal_to (NFC) (#175824 ) Use llvm::[not_]equal_to landed in d2a521750 ([ADT] Introduce bind_{front,back}, [not_]equal_to, #175056) across LLVM for cleaner code.	2026-01-13 21:19:39 +00:00
Florian Hahn	4a807e8dd9	[VPlan] Optimize BranchOnTwoConds to chain of 2 simple branches. (#174016 ) This patch improves the lowering for BranchOnTwoConds added in https://github.com/llvm/llvm-project/pull/172750 by replacing the branch on OR with a chain of 2 branches. On Apple M cores, the new lowering is ~8-10% faster for std::find-like loops. It also makes it easier to determine the early exits in VPlan. I am also planning on extensions to support loops with multiple early exits and early-exits at different positions, which should also be slightly easier to do with the new representation. PR: https://github.com/llvm/llvm-project/pull/174016	2026-01-13 20:14:15 +00:00
Florian Hahn	d27d75ee94	[VPlan] Use createHeaderPHIRecipes in native path (NFCI). Simplify tryToBuildVPlan by using createHeaderPHIRecipes in the native path as well.	2026-01-13 20:12:21 +00:00
Luke Lau	123b6a2766	[VPlan] Give VPInstruction::ExplicitVectorLength name. NFC (#175493 ) This makes it a tad easier to read VPlan dumps, e.g. WIDEN vp.store vp<%7>, ir<%val>, vp<%5> -> WIDEN vp.store vp<%7>, ir<%val>, vp<%evl>	2026-01-13 00:05:30 +08:00
Florian Hahn	8f182526de	[VPlan] Don't fold UDiv in replicate regions. (#175460 ) The UDiv fold added in d12e993 (#174581) is currently also applied to replicate regions, which means we may end up with VPInstructions in replicate regions, which is currently not supported. Fixes https://github.com/llvm/llvm-project/issues/175295. PR: https://github.com/llvm/llvm-project/pull/175460	2026-01-12 12:16:48 +00:00
Elvis Wang	cd2caf6580	[LV] Simplify extract-lane with scalar operand to the scalar value itself. (#174534 ) This patch simplifies extract-lane(%lane_num, %X) to %X when %X is a scalar value. Extracting from a scalar is redundant since there is only one value to extract.	2026-01-12 10:03:44 +08:00
Ramkumar Ramachandra	78f1de803a	[VPlan] Strip iterator-invalidation guard in findHeaderMask (NFC) (#174930 ) It is unnecessary, as the users are never modified.	2026-01-08 09:42:18 +00:00
Florian Hahn	31b93d6e38	[VPlan] Add specialized VPValue subclasses for different types (NFC) (#172758 ) This patch adds VPValue sub-classes for the different cases we currently have: * VPIRValue: A live-in VPValue that wraps an underlying IR value * VPSymbolicValue: A symbolic VPValue not tied to an underlying value, e.g. the vector trip count or VF VPValues * VPRecipeValue: A VPValue defined by a VPDef/VPRecipeBase. This has multiple benefits: * clearer constructors for each kind of VPValue * limited scope: for example allows moving VPDef member to VPRecipeValue, reducing size of other VPValues. * stricter type checking for member variables (e.g. using VPLiveIn in the Value -> live-in map in VPlan, or using VPSymbolicValue for symbolic member VPValues) There probably are additional opportunities for cleanups as follow-ups. PR: https://github.com/llvm/llvm-project/pull/172758	2026-01-07 20:29:05 +00:00
Ramkumar Ramachandra	d12e99376f	Reland [VPlan] Simplify pow-of-2 (mul\|udiv) -> (shl\|lshr) (#174581 ) The original patch, landed as a2db31b0 ([VPlan] Simplify pow-of-2 (mul\|udiv) -> (shl\|lshr), #172477) had a critical commutative matcher bug, which has now been fixed. An assert has also been strengthened, following a post-commit review.	2026-01-06 20:36:26 +00:00
Alex Bradbury	5a456c17d9	Revert "[VPlan] Simplify pow-of-2 (mul\|udiv) -> (shl\|lshr)" (#174559 ) Reverts llvm/llvm-project#172477 This is causing failures for RVA23 (including some tests running away in their execution causing OOM, hence the builder dying). I will attempt to follow up on the PR with a reproducer of some kind. https://lab.llvm.org/buildbot/#/builders/210/builds/7243	2026-01-06 10:26:51 +00:00
Ramkumar Ramachandra	a2db31b06f	[VPlan] Simplify pow-of-2 (mul\|udiv) -> (shl\|lshr) (#172477 )	2026-01-06 08:27:48 +00:00
Florian Hahn	16830b2164	[VPlan] Remove VPWidenSelectRecipe, use VPWidenRecipe instead (NFCI). (#174234 ) All extra state has been removed from VPWidenSelectRecipe at this point. There's no benefit of having a separate recipe and Select can easily be handled by the existing VPWidenRecipe. PR: https://github.com/llvm/llvm-project/pull/174234	2026-01-05 22:33:37 +00:00
Florian Hahn	0b46cf7dcd	[VPlan] Handle BranchOnTwoConds in simplifyBranchCondition. This fixes a crash after introducing BranchOnTwoConds (524b1788, https://github.com/llvm/llvm-project/pull/172750) when trying to replace BranchOnTwoConds with a VPBranchOnCond, without dissolving the region. In that case, we need to update the appropriate condition operand.	2025-12-30 18:47:22 +00:00
Florian Hahn	524b1788c4	[VPlan] Add BranchOnTwoConds, use for early exit plans. (#172750 ) This PR introduces a new BranchOnTwoConds VPInstruction, that takes 2 boolean operands and must be placed in a block with 3 successors. If condition I is true, branches to successor I, otherwise falls through to check the next condition. If both conditions are false, branch to the third successor. This new branch recipe is used for early-exit loops, to simplify the representation in VPlan initially, by avoiding the need for splitting the middle block early on, in a way that preserves the single-exit block property of regions. All exits still go through the latch block, but they can go to more than 2 successors. This idea was part of one of the original proposals for how to model early exits in VPlan, but at that point in time, there was no good way to handle this during code-gen, and we went with the early split-middle block approach initially. Now that we dissolve regions before ::execute, the new recipe can be lowered nicely after regions have been removed, to a set of VPBBs and BranchOnCond recipes. The initial lowering preserves the original structure with the split middle blocks. Follow-ups will improve the lowering to avoid this splitting, providing performance gains. PR: https://github.com/llvm/llvm-project/pull/172750	2025-12-29 19:39:38 +00:00
Florian Hahn	c43ccefc9f	[VPlan] Use PSE to construct SCEVs in getSCEVExprForVPValue (NFCI). getSCEVExprForVPValue is used to create SCEVs for expressions from the original loop, which may be predicated. Use PSE to construct predicated SCEVs if possible. This matches the legacy LV code behavior. Currently should be NFC, but will enable migrating more SCEV/cost-based computations to VPlan. The patch requires exposing a new getPredicatedSCEV helper to PredicatedScalarEvolution which just takes a SCEV, to avoid needing to go through IR values, which isn't an option for getSCEVExprForVPValue.	2025-12-21 22:39:49 +00:00
Mel Chen	f196b1d66f	[VPlan] Extract reverse operation for reverse accesses (#146525 ) This patch introduces VPInstruction::Reverse and extracts the reverse operations of loaded/stored values from reverse memory accesses. This extraction facilitates future support for permutation elimination within VPlan.	2025-12-18 14:57:48 +00:00
Florian Hahn	eb0c7e752f	[VPlan] Replace BranchOnCount with Compare + BranchOnCond (NFC). (#172181 ) Expand BranchOnCount to BranchOnCond + ICmp in convertToConcreteRecipes to simplify codegen. PR: https://github.com/llvm/llvm-project/pull/172181	2025-12-16 19:19:31 +00:00
Elvis Wang	1eba2cbe72	[LV] Convert uniform-address unmasked scatters to scalar store. (#166114 ) This patch optimizes vector scatters that have a uniform (single-scalar) address by replacing them with "extract-last-lane + scalar store" when the scatter is unmasked. Notes: - The legacy cost model can scalarize a store if both the address and the value are uniform. In VPlan we materialize the stored value via ExtractLastLane, so only the address must be uniform. - Some of the loops won't be vectorized any more since no vector instructions will be generated.	2025-12-16 12:24:22 +08:00
Ramkumar Ramachandra	0636225b93	[VPlan] Directly unroll VectorPointerRecipe (#168886 ) In an effort to get rid of VPUnrollPartAccessor and directly unroll recipes, start by directly unrolling VectorPointerRecipe, allowing for VPlan-based simplifications and simplification of the corresponding execute.	2025-12-15 10:54:06 +00:00
Ramkumar Ramachandra	85fafd5db0	[SCEVExp] Get DL from SE, strip constructor arg (NFC) (#171823 )	2025-12-11 14:26:47 +00:00
Florian Hahn	c61a481a23	[VPlan] Use SCEV to prove non-aliasing for stores at different offsets. (#170347 ) Extend the logic added in https://github.com/llvm/llvm-project/pull/168771 to also allow sinking stores past stores in the same noalias set by checking if we can prove no-alias via the distance between accesses, checked via SCEV. PR: https://github.com/llvm/llvm-project/pull/170347	2025-12-09 16:19:13 +00:00
Florian Hahn	0768068ff0	[VPlan] Remove ExtractLastLane for plans with scalar VFs. (#171145 ) ExtractLastLane is a no-op for scalar VFs. Update simplifyRecipe to remove them. This also requires adjusting the code in VPlanUnroll.cpp to split off handling of ExtractLastLane/ExtractPenultimateElement for scalar VFs, which now needs to match ExtractLastPart. PR: https://github.com/llvm/llvm-project/pull/171145	2025-12-09 11:59:40 +00:00
Ramkumar Ramachandra	c5b90103da	[VPlan] Use nuw when computing {VF,VScale}xUF (#170710 ) These quantities should never unsigned-wrap. This matches the behavior if only VFxUF is used (and not VF): when computing both VF and VFxUF, nuw should hold for each step separately.	2025-12-08 15:46:02 +00:00
Florian Hahn	3fc7419236	[VPlan] Replace ExtractLast(Elem\|LanePerPart) with ExtractLast(Lane/Part) (#164124 ) Replace ExtractLastElement and ExtractLastLanePerPart with more generic and specific ExtractLastLane and ExtractLastPart, which model distinct parts of extracting across parts and lanes. ExtractLastElement == ExtractLastLane(ExtractLastPart) and ExtractLastLanePerPart == ExtractLastLane, the latter clarifying the name of the opcode. A new m_ExtractLastElement matcher is provided for convenience. The patch should be NFC modulo printing changes. PR: https://github.com/llvm/llvm-project/pull/164124	2025-12-07 15:15:43 +00:00
Florian Hahn	f02dc4d198	[VPlan] Don't try to hoist multi-defs for first-order recurrences. Currently the hoisting implementation expects single-defs. Bail out on multi-defs (VPInterleaveRecipe), to fix an assertion. Fixes https://github.com/llvm/llvm-project/issues/170666	2025-12-04 21:09:16 +00:00
Ramkumar Ramachandra	ef58670f03	Revert [VPlan] Consolidate logic for narrowToSingleScalars (#170720 ) This reverts commit 7b3ec51, as a crash was reported: https://llvm.godbolt.org/z/dK6ff5zvr -- this will give us time to investigate a re-land.	2025-12-04 19:14:51 +00:00
Stephen Tozer	fda85a1423	[DebugInfo][LoopVectorizer][NFC] Use unknown annotations for more instructions (#170522 ) Some recent patches have added more non-annotated empty locations to the loop vectorizer, resulting in errors reported on the DebugLoc coverage tracking buildbot: https://lab.llvm.org/staging/#/builders/222/builds/1938 This patch adds "unknown" annotations in place of the empty locations, allowing the buildbot to ignore them for now.	2025-12-04 09:32:01 +00:00
Ramkumar Ramachandra	7b3ec5191a	[VPlan] Consolidate logic for narrowToSingleScalars (NFCI) (#167360 ) The logic for narrowing to single scalar recipes is in two different places: narrowToSingleScalarRecipes and legalizeAndOptimizeInductions. Consolidate them.	2025-12-03 20:25:52 +00:00
Florian Hahn	4b6ad11876	[VPlan] Sink predicated stores with complementary masks. (#168771 ) Extend the logic to hoist predicated loads (https://github.com/llvm/llvm-project/pull/168373) to sink predicated stores with complementary masks in a similar fashion. The patch refactors some of the existing logic for legality checks to be shared between hoisting and sinking, and adds a new sinking transform on top. With respect to the legality checks, for sinking stores the code also checks if there are any stores that may alias, not only loads. PR: https://github.com/llvm/llvm-project/pull/168771	2025-12-02 11:43:37 +00:00
Florian Hahn	25ab47bd40	[VPlan] Use wide IV if scalar lanes > 0 are used with scalable vectors. (#169796 ) For scalable vectors, VPScalarIVStepsRecipe cannot create all scalar step values. At the moment, it creates a vector, in addition to the first lane. The only supported case for this is when only the last lane is used. A recipe should not set both scalar and vector values. Instead, we can simply use a vector induction. It would also be possible to preserve the current vector code-gen, by creating VPInstructions based on the first lane of VPScalarIVStepsRecipe, but using a vector induction seems simpler. PR: https://github.com/llvm/llvm-project/pull/169796	2025-12-01 17:33:36 +00:00
David Sherwood	17677ad7eb	[LV] Don't create WidePtrAdd recipes for scalar VFs (#169344 ) While attempting to remove the use of undef from more loop vectoriser tests I discovered a bug where this assert was firing: ``` llvm::Constant* llvm::Constant::getSplatValue(bool) const: Assertion `this->getType()->isVectorTy() && "Only valid for vectors!"' failed. ... #8 0x0000aaaab9e2fba4 llvm::Constant::getSplatValue #9 0x0000aaaab9dfb844 llvm::ConstantFoldBinaryInstruction ``` This seems to be happening because we are incorrectly generating WidePtrAdd recipes for scalar VFs. The PR fixes this by checking whether a plan has a scalar VF only in legalizeAndOptimizeInductions. This PR also removes the use of undef from the test `both` in Transforms/LoopVectorize/iv_outside_user.ll, which is what started triggering the assert. Fixes #169334	2025-12-01 08:12:41 +00:00
Florian Hahn	b76089c7f3	[VPlan] Skip uses-scalars restriction if one of ops needs broadcast. (#168246 ) Update the logic in narrowToSingleScalar to allow narrowing even if not all users use scalars, if at least one of the operands already needs broadcasting. In that case, there won't be any additional broadcasts introduced. This should allow removing the special handling for stores, which can introduce additional broadcasts currently. Fixes https://github.com/llvm/llvm-project/issues/169668. PR: https://github.com/llvm/llvm-project/pull/168246	2025-11-28 10:26:27 +00:00
Florian Hahn	8459508227	[VPlan] Handle scalar VPWidenPointerInd in convertToConcreteRecipes. (#169338 ) In some cases, VPWidenPointerInductions become only used by scalars after legalizeAndOptimizeInductions was already run, for example due to some VPlan optimizations. Move the code to scalarize VPWidenPointerInductions to a helper and use it if needed. This fixes a crash after #148274 in the added test case. Fixes https://github.com/llvm/llvm-project/issues/169780	2025-11-27 21:52:15 +00:00
Luke Lau	1c7ec06b16	[VPlan] Optimize LastActiveLane to EVL - 1 (#169766 ) With EVL tail folding, the LastActiveLane can be computed with EVL - 1. This removes the need for a header mask and vfirst.m for loops with live outs on RISC-V: # %bb.5: # %for.cond.cleanup7 - vsetvli zero, zero, e32, m2, ta, ma - vmv.v.x v8, s1 - vmsleu.vv v10, v8, v22 - vfirst.m a0, v10 - srli a1, a0, 63 - czero.nez a0, a0, a1 - czero.eqz a1, s8, a1 - or a0, a0, a1 - addi a0, a0, -1 - vsetvli zero, zero, e64, m4, ta, ma - vslidedown.vx v8, v12, a0 + addi s1, s1, -1 + vslidedown.vx v8, v12, s1	2025-11-27 17:03:08 +08:00
Florian Hahn	f8eca64a28	Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit a6edeedbfa308876d6f2b1648729d52970bb07e6. The following fixes have landed, addressing issues causing the original revert: * https://github.com/llvm/llvm-project/pull/169298 * https://github.com/llvm/llvm-project/pull/167897 * https://github.com/llvm/llvm-project/pull/168949 Original message: Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-26 20:03:55 +00:00

616 Commits