llvm-project

Author	SHA1	Message	Date
Florian Hahn	edb690dc5b	Reapply "[VPlan] Add canonical IV during construction (NFC)." This reverts commit d431921677ae923d189ff2d6f188f676a2964ed8. Missing gtests have been updated. Original message: This addresses an existing TODO and simply moves the current code to add canonical IV recipes to the initial skeleton construction, at the same place where the corresponding region will be introduced.	2025-05-03 10:54:59 +01:00
Florian Hahn	8d02529f77	[VPlan] Consistently use ArrayRef<VPValue *> for operands in ctors (NFC) (#137798) Now that there is an ArrayRef constructor taking iterator_ranges, use it consistently to slightly simplify code. Depends on https://github.com/llvm/llvm-project/pull/137796. PR: https://github.com/llvm/llvm-project/pull/137798	2025-05-01 21:19:10 +01:00
Samuel Tebbs	fa769655e7	[LV] NFC: Make VPPartialReductionRecipe a VPReductionRecipe	2025-04-30 19:44:40 +01:00
Florian Hahn	d431921677	Revert "[VPlan] Add canonical IV during construction (NFC)." This reverts commit e17122fffa8d233fcf9f717354ecda46173f1b8d. Revert as this seems to break some unit tests on some bots.	2025-04-29 22:55:11 +01:00
Florian Hahn	e17122fffa	[VPlan] Add canonical IV during construction (NFC). This addresses an existing TODO and simply moves the current code to add canonical IV recipes to the initial skeleton construction, at the same place where the corresponding region will be introduced.	2025-04-29 22:38:59 +01:00
Florian Hahn	7e71466900	[VPlan] Preserve dbg location on canonical IVs in native path. Pass the debug location of the primary IV to addCanonicalIVRecipes in the native path, matching the behavior of inner loop vectorization.	2025-04-29 21:40:42 +01:00
Florian Hahn	d2ce88a939	[VPlan] Create initial skeleton before creating regions. (NFC) Move out the logic to prepare for vectorization to a separate transform, before creating loop regions. This was discussed as follow-up in https://github.com/llvm/llvm-project/pull/136455. This just moves the existing code around slightly and will simplify follow-up patches to include the exiting edges during initial VPlan construction.	2025-04-28 21:51:32 +01:00
Florian Hahn	043b04acff	Reapply "[VPlan] Fold NOT into predicate of wide compares." (#130347) This reverts commit 8dd160f4767f971572eac065c8650d9202ff5bf9. The recommit contains an adjustment to planContainsAdditionalSimplifications, which considers changes to the original predicate for compares. Original commit message: Add simplification to fold negation into a compare, if the negation is the only user of the compare. This removes a number of redundant negations. Alive2 Proofs for FPCMP test changes: https://alive2.llvm.org/ce/z/WGDz9U PR: https://github.com/llvm/llvm-project/pull/129430	2025-04-28 20:01:37 +01:00
Kazu Hirata	5cfd81b0cc	[llvm] Use range constructors of *Set (NFC) (#137552)	2025-04-27 15:59:57 -07:00
Kazu Hirata	1f56716a7e	[llvm] Use hash_combine_range with ranges (NFC) (#137530)	2025-04-27 12:31:28 -07:00
Florian Hahn	2e934170b0	[LV] Remove LoopVectorizationLegality from InnerLoopVectorizer (NFC). a51e28278 removed the last real use of Legal in InnerLoopVectorizer. Now that it isn't used any longer, remove it to avoid new users being introduced.	2025-04-27 20:30:48 +01:00
Florian Hahn	826f237cb4	[VPlan] Don't added separate vector latch block (NFC). Simplify initial VPlan construction by not creating a separate vector.latch block, which isn't needed and will get folded away later. This has been suggested as independent clean-up multiple times.	2025-04-26 22:03:18 +01:00
Florian Hahn	df21288247	[VPlan] Replace ExtractFromEnd with Extract(Last|Penultimate)Element (NFC). (#137030) ExtractFromEnd only has 2 uses, extracting the last and penultimate elements. Replace it with 2 separate opcodes, removing the need to materialize and handle a constant argument. PR: https://github.com/llvm/llvm-project/pull/137030	2025-04-25 16:27:29 +01:00
Florian Hahn	7cce38beea	[VPlan] Remove dead SE argument from handleUncountableEarlyExit (NFC). ScalarEvolution is not used by the function, remove the dead arg.	2025-04-24 19:59:05 +01:00
Florian Hahn	e268f71c59	[VPlan] Remove unneeded early continue. (NFC) As suggested in https://github.com/llvm/llvm-project/pull/136455, now unreachable exit blocks won't have any phi nodes.	2025-04-24 08:59:30 +01:00
Florian Hahn	15bb1db4a9	[VPlan] Remove ILV::sinkScalarOperands. (#136023) Remove legacy ILV sinkScalarOperands, which is superseded by the sinkScalarOperands VPlan transforms. There are a few cases that aren't handled by VPlan's sinkScalarOperands, because the recipes doesn't support replicating. Those are pointer inductions and blends. We could probably improve this further, by allowing replication for more recipes, but I don't think the extra complexity is warranted. Depends on https://github.com/llvm/llvm-project/pull/136021. PR: https://github.com/llvm/llvm-project/pull/136023	2025-04-24 08:37:49 +01:00
Florian Hahn	3fbbe9b8d0	[VPlan] Add exit phi operands during initial construction (NFC). (#136455) Add incoming exit phi operands during the initial VPlan construction. This ensures all users are added to the initial VPlan and is also needed in preparation to retaining exiting edges during initial construction. PR: https://github.com/llvm/llvm-project/pull/136455	2025-04-23 20:40:42 +01:00
Ramkumar Ramachandra	bdf21ca8ac	[LV] Fix missing entry in willGenerateVectors (#136712) willGenerateVectors switches on opcodes of a recipe, but Histogram is missing in the switch statement, which could cause a crash in some cases. The crash was initially observed when developing another patch.	2025-04-23 19:06:38 +01:00
Nicholas Guy	1ce709cb84	[LV] Fix crash when building partial reductions using types that aren't known scale factors (#136680)	2025-04-23 13:19:18 +01:00
David Green	98b6f8dc69	[CostModel] Remove optional from InstructionCost::getValue() (#135596) InstructionCost is already an optional value, containing an Invalid state that can be checked with isValid(). There is little point in returning another optional from getValue(). Most uses do not make use of it being a std::optional, dereferencing the value directly (either isValid has been checked previously or the Cost is assumed to be valid). The one case that does in AMDGPU used value_or which has been replaced by a isValid() check.	2025-04-23 07:46:27 +01:00
Florian Hahn	8c83355d5b	[VPlan] Handle VPIRPhi in VPRecipeBase::isPhi (NFC). Also handle VPIRPhi in VPRecipeBase::isPhi, to simplify existing code dealing with VPIRPhis. Suggested as part of https://github.com/llvm/llvm-project/pull/136455.	2025-04-21 21:04:20 +01:00
Florian Hahn	3e5a9d9aa0	[VPlan] Rename setFlags -> applyFlags (NFC). Update name to apply flags to instructions, as suggested in https://github.com/llvm/llvm-project/pull/135272. Also changes the arg to a reference.	2025-04-21 18:57:56 +01:00
David Green	e183459b8b	[CostModel] Make sure getCmpSelInstrCost is passed a CondTy (#135535) It is already required along certain code paths that the CondTy is valid. Fix some of the uses to make sure it is passed.	2025-04-21 05:33:30 +01:00
Florian Hahn	e232d28eff	[VPlan] Move plain CFG construction to VPlanConstruction. (NFC) Follow-up as discussed in https://github.com/llvm/llvm-project/pull/129402. After bc03d6cce257, the VPlanHCFGBuilder doesn't actually build a HCFG any longer. Move what remains directly into VPlanConstruction.cpp.	2025-04-18 21:52:05 +01:00
Kazu Hirata	b2ba53172e	[Transforms] Construct SmallVector with iterator ranges (NFC) (#136259)	2025-04-18 10:27:05 -07:00
Shih-Po Hung	e5263e3ec8	[LV][NFC] Clean up tail-folding check for early-exit loops (#133931) This patch moves the check for a single latch exit from computeMaxVF() to LoopVectorizationLegality::canFoldTailByMasking(), as it duplicates the logic when foldTailByMasking() returns false. It also updates the NoScalarEpilogueNeeded logic to return false for loops that are neither single-latch-exit nor early-exit. This avoids applying tail-folding in unsupported cases and prevents triggering assertions during analysis.	2025-04-18 10:57:11 +08:00
Elvis Wang	69ade7c090	[LV] Check if the VF is scalar by VFRange in `handleUncountableEarlyExit`. (#135294) This patch check if the plan contains scalar VF by VFRange instead of Plan. This patch also clamp the range to contains either only scalar or only vector VFs to prevent mis-compile. Split from #113903.	2025-04-18 06:51:36 +08:00
Sander de Smalen	f9c01b59e3	[LV] Fix '-1U' bits for smallest type in getSmallestAndWidestTypes (#135783) For loops without loads/stores, where the smallest/widest types are calculated from the reduction, the smallest type returned is always -1U and it actually returns the smallest type as the widest type. This PR fixes the calculation. This follows from https://github.com/llvm/llvm-project/pull/132190#discussion_r2044232607	2025-04-17 13:26:15 +01:00
John Brawn	eafbb879f6	[LoopVectorize] Don't replicate blocks with optsize (#129265) Any VPlan we generate that contains a replicator region will result in replicated blocks in the output, causing a large code size increase. Reject such VPlans when optimizing for size, as the code size impact is usually worse than having a scalar epilogue, which we already forbid with optsize. This change requires a lot of test changes. For tests of optsize specifically I've updated the test with the new output, otherwise the tests have been adjusted to not rely on optsize. Fixes #66652	2025-04-17 11:50:49 +01:00
Florian Hahn	41c1a7be3f	[LV] Don't add fixed-order recurrence phis to forced scalars. Fixed-order recurrence phis cannot be forced to be scalar, they will always be widened at the moment. Make sure we don't add them to ForcedScalars, otherwise the legacy cost model will compute incorrect costs. This fixes an assertion reported with https://github.com/llvm/llvm-project/pull/129645.	2025-04-16 22:58:10 +02:00
Florian Hahn	bc03d6cce2	[VPlan] Introduce all loop regions as VPlan transform. (NFC) (#129402) Further simplify VPlan CFG builder by moving introduction of inner regions to a VPlan transform, building on https://github.com/llvm/llvm-project/pull/128419. The HCFG builder now only constructs plain CFGs. I will move it to VPlanConstruction as follow-up. Depends on https://github.com/llvm/llvm-project/pull/128419. PR: https://github.com/llvm/llvm-project/pull/129402	2025-04-16 13:30:45 +02:00
Mel Chen	e96111d3e9	[LV] Remove redundant check. nfc (#135605) Remove the redundant `!TheLoop->contains(Op->getParent())` check since `!TheLoop->contains(Op)` has already been verified.	2025-04-15 08:21:28 +08:00
Florian Hahn	54b33eba16	[VPlan] Add opcode to create step for wide inductions. (#119284) This patch adds a WideIVStep opcode that can be used to create a vector with the steps to increment a wide induction. The opcode has 2 operands * the vector step * the scale of the vector step The opcode is later converted into a sequence of recipes that convert the scale and step to the target type, if needed, and then multiply vector step by scale. This simplifies code that needs to materialize step vectors, e.g. replacing wide IVs as follow up to https://github.com/llvm/llvm-project/pull/108378 with an increment of the wide IV step. PR: https://github.com/llvm/llvm-project/pull/119284	2025-04-14 23:20:44 +02:00
Mel Chen	9df153bc14	[LV] Remove unused requiresScalarEpilogue function. nfc (#135341)	2025-04-14 14:16:04 +08:00
Sam Tebbs	b658a2e74a	[LV] Reduce register usage for scaled reductions (#133090) This PR accounts for scaled reductions in `calculateRegisterUsage` to reflect the fact that the number of lanes in their output is smaller than the VF. Depends on https://github.com/llvm/llvm-project/pull/126437	2025-04-11 14:31:08 +01:00
Florian Hahn	e27a21f6a7	[VPlan] Add hasScalarTail, use instead of !CM.foldTailByMasking() (NFC). (#134674) Now that VPlan is able to fold away redundant branches to the scalar preheader, we can directly check in VPlan if the scalar tail may execute. hasScalarTail returns true if the tail may execute. We know that the scalar tail won't execute if the scalar preheader doesn't have any predecessors, i.e. is not reachable. This removes some late uses of the legacy cost model. PR: https://github.com/llvm/llvm-project/pull/134674	2025-04-11 12:50:59 +01:00
Florian Hahn	6a9e8fc50c	[VPlan] Introduce VPInstructionWithType, use instead of VPScalarCast(NFC) (#129706) There are some opcodes that currently require specialized recipes, due to their result type not being implied by their operands, including casts. This leads to duplication from defining multiple full recipes. This patch introduces a new VPInstructionWithType subclass that also stores the result type. The general idea is to have opcodes needing to specify a result type to use this general recipe. The current patch replaces VPScalarCastRecipe with VInstructionWithType, a similar patch for VPWidenCastRecipe will follow soon. There are a few proposed opcodes that should also benefit, without the need of workarounds: * https://github.com/llvm/llvm-project/pull/129508 * https://github.com/llvm/llvm-project/pull/119284 PR: https://github.com/llvm/llvm-project/pull/129706	2025-04-10 22:30:40 +01:00
Florian Hahn	6f92339d9e	[LV] Compute register usage for interleaving on VPlan. (#126437) Add a version of calculateRegisterUsage that works estimates register usage for a VPlan. This mostly just ports the existing code, with some updates to figure out what recipes will generate vectors vs scalars. There are number of changes in the computed register usages, but they should be more accurate w.r.t. to the generated vector code. There are the following changes: * Scalar usage increases in most cases by 1, as we always create a scalar canonical IV, which is alive across the loop and is not considered by the legacy implementation * Output is ordered by insertion, now scalar registers are added first due the canonical IV phi. * Using the VPlan, we now also more precisely know if an induction will be vectorized or scalarized. Depends on https://github.com/llvm/llvm-project/pull/126415 PR: https://github.com/llvm/llvm-project/pull/126437	2025-04-08 20:52:50 +01:00
Florian Hahn	a51e282784	[LV] Check if plan has an early exit via plan's exit blocks. (NFC) (#134720) Add a dedicated function to check if a plan is for a loop with an early exit. This can easily be determined by checking the exit blocks. This allows removing a use of Legal->hasUncountableEarlyExit() from InnerLoopVectorizer. PR: https://github.com/llvm/llvm-project/pull/134720	2025-04-08 12:52:38 +01:00
Ramkumar Ramachandra	6a42fb8fbf	[LV] Clarify code in isPredicatedInst (NFC) (#134251)	2025-04-08 10:46:17 +01:00
Florian Hahn	ad9f15ab53	[VPlan] Introduce and use VPValue::replaceUsesOfWith (NFC). Adds an API matching LLVM's IR Value, which simplifies some code a bit.	2025-04-07 22:07:52 +01:00
Mel Chen	409df9f74c	[TTI][LV] Change the prototype of preferInLoopReduction. nfc (#132698) This patch changes the preferInLoopReduction function to take a RecurKind instead of an unsigned Opcode. This makes it possible to distinguish non-arithmetic reductions such as min/max, AnyOf, and FindLastIV, and also helps unify IAnyOf with FAnyOf and IFindLastIV with FFindLastIV. Related patch #118393 #131830	2025-04-07 19:10:16 +08:00
Florian Hahn	449e2f5d66	[LV] Remove more DT updates from legacy code path (NFCI). Remove some legacy DT updates. Those should already be handled when updating the DT during VPlan execution.	2025-04-06 14:35:21 +01:00
Florian Hahn	283a78a088	Reapply "[LV] Don't add blocks to loop in GeneratedRTChecks (NFC)." This reverts commit 46a2f4174a051f29a09dbc3844df763571c67309. Recommits 2fd6f8fb5e3a with corresponding VPlan change to ensure LoopInfo is updated for all blocks during VPlan execution if needed.	2025-04-06 12:18:11 +01:00
Florian Hahn	46a2f4174a	Revert "[LV] Don't add blocks to loop in GeneratedRTChecks (NFC)." This reverts commit 2fd6f8fb5e3a52e901276d97c285b8de66742985. This missed a possible case, causing buildbot failures.	2025-04-05 21:47:14 +01:00
Florian Hahn	2fd6f8fb5e	[LV] Don't add blocks to loop in GeneratedRTChecks (NFC). Blocks will get added to parent loops as needed during VPlan execution.	2025-04-05 21:10:26 +01:00
Florian Hahn	5fbd0658a0	[VPlan] Add initial CFG simplification, removing BranchOnCond true. (#106748) Add an initial CFG simplification transform, which removes the dead edges for blocks terminated with BranchOnCond true. At the moment, this removes the edge between middle block and scalar preheader when folding the tail. PR: https://github.com/llvm/llvm-project/pull/106748	2025-04-04 15:44:26 +01:00
Florian Hahn	2bdc1a1337	[LV] Use frozen start value for FindLastIV if needed. (#132691) FindLastIV introduces multiple uses of the start value, where in the original source there was only a single use, when the epilogue is vectorized. Each use of undef may produce a different result, so introducing multiple uses can produce incorrect results when the input is undef/poison. If the start value may be undef or poison, freeze it and use the frozen value, which will be the same at all uses. See the following scenarios in Alive2: * Both main and epilogue vector loops execute, go to exit block: https://alive2.llvm.org/ce/z/_TSvRr * Both main and epilogue vector loops execute, go to scalar loop: https://alive2.llvm.org/ce/z/CsPj5v * Only epilogue vector loop executes, go to exit block: https://alive2.llvm.org/ce/z/5XqkNV * Only epilogue vector loop executes, go to scalar loop: https://alive2.llvm.org/ce/z/JUpqRN The latter 2 show requiring freezing the resume phi. That means we cannot freeze in the preheader. We could move the freeze to the main iteration count check, but that would be a bit fragile to find and other transforms can sink the freeze if needed. Depends on https://github.com/llvm/llvm-project/pull/132689 and https://github.com/llvm/llvm-project/pull/132690. Fixes https://github.com/llvm/llvm-project/issues/126836 PR: https://github.com/llvm/llvm-project/pull/132691	2025-04-04 11:48:01 +01:00
Florian Hahn	cdff7f0b6e	[LV] Retrieve middle VPBB via scalar ph to fix epilogue resumephis (NFC) If ScalarPH has predecessors, we may need to update its reduction resume values. If there is a middle block, it must be the first predecessor. Note that the first predecessor may not be the middle block, if the middle block doesn't branch to the scalar preheader. In that case, fixReductionScalarResumeWhenVectorizingEpilog will be a no-op. In preparation for https://github.com/llvm/llvm-project/pull/106748.	2025-04-03 21:46:48 +01:00
Ramkumar Ramachandra	6bbdc70066	[LV] Use getCallWideningDecision in more places (NFC) (#134236)	2025-04-03 14:53:19 +01:00

2511 Commits