llvm-project

Author	SHA1	Message	Date
Florian Hahn	8fa440a1e0	[LV] Add tests for speculatively loading ptrs with UB/poison ops. Test cases for https://github.com/llvm/llvm-project/issues/142957.	2025-06-06 21:18:41 +01:00
Paul Walker	6955a7d134	[NFC][LLVM][Instrumentation][LoopVectorize] Regenerate test checks.	2025-06-05 11:38:30 +00:00
Luke Lau	5458ea5122	[LV] Regenerate UTC variable names in RISCV/interleaved-accesses.ll. NFC	2025-06-04 01:07:32 +01:00
Ramkumar Ramachandra	86f18d394e	[LV] Re-org tests; introduce iv-select-cmp-decreasing.ll (#141769) Having FindFirstIV tests in if-reduction.ll is misleading, and iv-select-cmp.ll is already too large.	2025-06-03 18:13:51 +01:00
Florian Hahn	11713e86b0	[LV] Move VPlan-based calculateRegisterUsage to VPlanAnalysis (NFC). (#135673) Move VPlan-based calculateRegisterUsage from LoopVectorize to VPlanAnalysis.cpp. It is a VPlan-based analysis and this helps to reduce the size of LoopVectorize. PR: https://github.com/llvm/llvm-project/pull/135673	2025-06-02 17:40:50 +01:00
Ramkumar Ramachandra	b8c4eea3d8	[VPlan] Simplify PredPHI LiveIn -> LiveIn (#142271) 5f39be5 ([VPlan] Use InstSimplifyFolder instead of TargetFolder) updated simplifyRecipe to fold live-ins to Values that are not necessarily Constant, but forgot to update the corresponding PredPHI folder, which still folds PredPHI constant -> constant. Update it to fold PredPHI LiveIn -> LiveIn. Fixes #141968.	2025-06-02 14:56:35 +01:00
Florian Hahn	0ba63b2f22	[SCEV] Add additional test coverage for loop-guards reasoning. Add additional tests showing missed opportunities when using loop guards for reasoning in SCEV, depending on the order the guards appear in the IR.	2025-06-01 22:39:37 +01:00
Madhur Amilkanthwar	67ff713052	[NFC][LV] Remove incorrect comment about lack of support (#142126)	2025-05-30 13:25:55 +02:00
Florian Hahn	9ea4924720	[VPlan] Use EMIT-SCALAR for single-scalar VPPhis (NFC). Follow-up to https://github.com/llvm/llvm-project/pull/141428, to also use EMIT-SCALAR for VPPhis that are single scalars.	2025-05-29 11:20:07 +01:00
Florian Hahn	5b85e4b08d	[VPlan] Use EMIT-SCALAR when printing single-scalar VPInstructions. (#141428) By using EMIT-SCALAR when printing, it is clear in the debug output that those VPInstructions only produce a single scalar. Split off in preparation for https://github.com/llvm/llvm-project/pull/140623. PR: https://github.com/llvm/llvm-project/pull/141428	2025-05-29 09:29:06 +01:00
Elvis Wang	332fe08f1d	[VPlan] Implement VPlan-based cost model for VPReduction, VPExtendedReduction and VPMulAccumulateReduction. (#113903) This patch implements the VPlan-based cost model for VPReduction, VPExtendedReduction and VPMulAccumulateReduction. With this patch, we can calculate the reduction cost by the VPlan-based cost model, so we remove the reduction costs in `precomputeCost()`. Ref: Original instruction based implementation: https://reviews.llvm.org/D93476	2025-05-29 11:15:16 +08:00
Florian Hahn	440a8adb86	[VPlan] Use VPIRFlags to manage FMFs for ComputeReductionResult (NFC). Manage fast-math flags using VPIRFlags from VPInstruction, in line with other VPInstructions. With this change, we now print the correct flags for ComputeReductionResult; other than that, NFC.	2025-05-28 20:54:58 +01:00
Paul Walker	9aebf4c399	[NFC][LLVM] Tests for vectorisation of loops with vscale base trip counts.	2025-05-28 12:42:41 +00:00
Ramkumar Ramachandra	5f39be5917	[VPlan] Use InstSimplifyFolder instead of TargetFolder (#141222) For more powerful folding with operands that are not necessarily all-constant, use InstSimplifyFolder instead of TargetFolder in tryToConstantFold, and rename the function tryToFoldLiveIns.	2025-05-28 11:00:14 +02:00
Florian Hahn	d56deea1e4	[VPlan] Connect Entry to scalar preheader during initial construction. (#140132) Update initial construction to connect the Plan's entry to the scalar preheader during initial construction. This moves a small part of the skeleton creation out of ILV and will also enable replacing VPInstruction::ResumePhi with regular VPPhi recipes. Resume phis need 2 incoming values to start with, the second being the bypass value from the scalar ph (and used to replicate the incoming value for other bypass blocks). Adding the extra edge ensures the incoming values for resume phis match the incoming blocks. PR: https://github.com/llvm/llvm-project/pull/140132	2025-05-27 16:07:56 +01:00
Luke Lau	841c8d48a6	[LV] Add tests for more interleave group factors on AArch64 and RISC-V. NFC The plan is to eventually add support for scalably vectorizing these for non-power-of-2 factors, see https://github.com/llvm/llvm-project/pull/139893 Simultaneously, we need to add a test to make sure we don't generate @llvm.vector.[de]interleave3 for AArch64 if we can't lower it (yet)	2025-05-26 18:21:27 +01:00
Philip Reames	041d189f01	[RISCV][TTI] Adjust costing in getPartialReductionCost for zvqdotq (#141430) Two changes: 1) Handle fixed vector cases now that 77a3f8 has landed. 2) Fix a mistake in the original costing - the VF passed in is the input VF, not the output VF. Given that, we should be costing the accumulator type with VF/4. Note that (2) does not cause any visible test differences as the vectorizer (outside of maximize-bandwidth mode) does not consider wide enough VFs for the costing difference to matter.	2025-05-26 08:23:56 -07:00
Florian Hahn	dcef154b5c	[VPlan] Replace VPRegionBlock with explicit CFG before execute (NFCI). (#117506) Building on top of https://github.com/llvm/llvm-project/pull/114305, replace VPRegionBlocks with explicit CFG before executing. This brings the final VPlan closer to the IR that is generated and helps to simplify codegen. It will also enable further simplifications of phi handling during execution and transformations that do not have to preserve the canonical IV required by loop regions. This for example could include replacing the canonical IV with an EVL based phi while completely removing the original canonical IV. PR: https://github.com/llvm/llvm-project/pull/117506	2025-05-24 19:17:16 +01:00
Florian Hahn	e089d48944	[VPlan] VPWidenGEPRecipe uses first lane of invariant indices (NFC) Update VPWidenGEPRecipe::onlyFirstLaneUsed to return true for indices that are defined outside the loop regions, if the base pointer is not invariant.	2025-05-24 17:32:05 +01:00
Florian Hahn	69f2ff3e9b	[LV] Add test case showing unnecessary broadcast of invariant GEP idx.	2025-05-24 15:19:21 +01:00
Luke Lau	4b4699a13c	[InstCombine] Don't cover up poison elements for shifts when folding shuffles thru binops (#141303) As noted in the TODO, we don't need to cover up the poison elements placed in the unused lanes for shifts, since it's not UB unlike div/rem. New poison elements are only introduced in cases like ShMask = <1,1,2,2> and C = <5,5,6,6> --> NewC = <poison,5,6,poison> And the resulting shuffle won't use the poison lanes.	2025-05-24 13:47:18 +01:00
Florian Hahn	a9b2998e31	[VPlan] Skip cost assert if VPlan converted to single-scalar recipes. Check if a VPlan transform converted recipes to single-scalar VPReplicateRecipes (after 07c085af3efcd67503232f99a1652efc6e54c1a9). If that's the case, the legacy cost model incorrectly overestimates the cost. Fixes https://github.com/llvm/llvm-project/issues/141237.	2025-05-24 11:09:27 +01:00
Philip Reames	a21fb74c0c	[RISCV][TTI] Implement getPartialReductionCost for the vqdotq cases (#140974) Doing so tells the loop vectorizer that the partial.reduce intrinsic is profitable to use over the plain extend/multiply/reduce.add sequence.	2025-05-23 07:15:06 -07:00
Florian Hahn	95ba5508e5	Reapply "[VPlan] Move predication to VPlanTransform (NFC). (#128420)" This reverts commit 793bb6b257fa4d9f4af169a4366cab3da01f2e1f. The recommitted version contains a fix to make sure only the original phis are processed in convertPhisToBlends by collecting them in a vector first. This fixes a crash when no mask is needed, because there is only a single incoming value. Original message: This patch moves the logic to predicate and linearize a VPlan to a dedicated VPlan transform. It mostly ports the existing logic directly. There are a number of follow-ups planned in the near future to further improve on the implementation: * Edge and block masks are cached in VPPredicator, but the block masks are still made available to VPRecipeBuilder, so they can be accessed during recipe construction. As a follow-up, this should be replaced by adding mask operands to all VPInstructions that need them and use that during recipe construction. * The mask caching in a map also means that this map needs updating each time a new recipe replaces a VPInstruction; this would also be handled by adding mask operands. PR: https://github.com/llvm/llvm-project/pull/128420	2025-05-22 08:16:15 +01:00
Mohammad Bashir	bcdce987c0	Fix regression tests with bad FileCheck checks (#140373) Fixes https://github.com/llvm/llvm-project/issues/140149	2025-05-22 07:59:57 +03:00
Philip Reames	c21416d1f9	[RISCV][TTI] Add test coverage for getPartialReductionCost [nfc] Adding testing in advance of a change to cost the zvqdotq instructions such that we emit them from LV.	2025-05-21 15:12:23 -07:00
Florian Hahn	bf15aadcbc	[VPlan] Don't try to narrow predicated VPReplicateRecipe. We cannot convert predicated recipes to uniform ones at the moment. This fixes a crash reported for https://github.com/llvm/llvm-project/pull/139150.	2025-05-21 22:13:55 +01:00
Ramkumar Ramachandra	cf1f116f78	[VPlan] Introduce constant folder in simplifyRecipe (#125365) Introduce a VPlan-level constant folder in simplifyRecipe that tries to fold a recipe to a constant using TargetFolder.	2025-05-20 14:16:01 +01:00
Sam Tebbs	70501ed2f0	[LoopVectorizer] Prune VFs based on plan register pressure (#132190) This PR moves the register usage checking to after the plans are created, so that any recipes that optimise register usage (such as partial reductions) can be properly costed and not have their VF pruned unnecessarily. Depends on https://github.com/llvm/llvm-project/pull/137746	2025-05-19 13:27:17 +01:00
Florian Hahn	07c085af3e	[VPlan] Add narrowToSingleScalarRecipe transform. (#139150) Add a new convertToUniformRecipes transform which uses VPlan-based uniformity analysis to determine if wide recipes and replicate recipes can be converted to uniform recipes. There are a few places where we ad-hoc convert recipes to uniform recipes, which this transform will eventually replace. There are a few more generalizations required to do so which I plan to do as follow-ups. By converting the recipes to uniform recipes, we effectively materialize the information from the VPlan-based analysis. Note that there is one regression at the moment in SystemZ/pr47665.ll due to trivial constant folding opportunities in the input IR. This will be fixed by VPlan-based constant folding (https://github.com/llvm/llvm-project/pull/125365/) PR: https://github.com/llvm/llvm-project/pull/139150	2025-05-18 09:32:27 +01:00
Florian Hahn	ba93685ea2	[VPlan] Also use original parent loop for exit VPBBs. When vectorizing a loop with early exits that is nested within another one, one of the loop exits may be outside both loops, so adding it to the parent loop is incorrect. Also use the original parent loop for exit blocks.	2025-05-16 21:12:39 +01:00
Elvis Wang	664c937b43	[VPlan] Implement VPExtendedReduction, VPMulAccumulateReductionRecipe and corresponding vplan transformations. (#137746) This patch introduces two new recipes. * VPExtendedReductionRecipe - cast + reduction. * VPMulAccumulateReductionRecipe - (cast) + mul + reduction. This patch also implements the transformation that matches the following patterns via vplan and converts them to abstract recipes for better cost estimation. * VPExtendedReduction - reduce(cast(...)) * VPMulAccumulateReductionRecipe - reduce.add(mul(...)) - reduce.add(mul(ext(...), ext(...)) - reduce.add(ext(mul(ext(...), ext(...)))) The converted abstract recipes will be lowered to the concrete recipes (widen-cast + widen-mul + reduction) just before recipe execution. Note that this patch still relies on the legacy cost model to calculate the cost for these patterns. Will enable vplan-based cost decision in #113903. Split from #113903.	2025-05-16 10:25:38 +08:00
Min-Yih Hsu	0ab67ec191	[LV][EVL] Introduce the EVLIndVarSimplify Pass for EVL-vectorized loops (#131005) When we enable EVL-based loop vectorization w/ predicated tail-folding, each vectorized loop has effectively two induction variables: one calculates the step using (VF x vscale) and the other one increases the IV by values returned from experimental.get.vector.length. The former, also known as canonical IV, is more favorable for analyses as it's "countable" in the sense of SCEV; the latter (EVL-based IV), however, is more favorable to codegen, at least for those that support scalable vectors like AArch64 SVE and RISC-V. The idea is that we use canonical IV all the way until the end of all vectorizers, where we replace it with EVL-based IV using EVLIVSimplify introduced here, so that we can have the best of both worlds. This Pass is enabled by default in RISC-V. However, since we do not yet vectorize loops with predicated tail-folding by default, this Pass is a no-op at this moment.	2025-05-14 13:49:50 -07:00
Florian Hahn	7a9fd62278	[VPlan] Use VPlan operand order for VPBlendRecipes. (#139475) Don't use the order of incoming values of IR phis when creating VPBlendRecipes. Instead, simply use the incoming operands and blocks from the VPWidenPHIRecipe. Note that this changes the order of the incoming operands/masks for some blends. PR: https://github.com/llvm/llvm-project/pull/139475	2025-05-14 14:56:35 +01:00
Florian Hahn	5fa64d65e9	[VPlan] Use printPhiOperands for VPPhi. Split off from https://github.com/llvm/llvm-project/pull/139151 to land printing improvements separately. Updates printing of VPPhi operands to be consistent with VPWidenPHIRecipe.	2025-05-10 12:49:29 +01:00
Florian Hahn	8c6c525a6b	[LV] Don't consider FORs as profitable to scalarize. Fixed-order recurrence phis cannot be scalarized, they will always be widened at the moment. Make sure they are not incorrectly considered profitable to scalarize, similar to 41c1a7be3f1a2556e. Fixes https://github.com/llvm/llvm-project/issues/139060. Fixes https://github.com/llvm/llvm-project/issues/139065.	2025-05-09 20:29:22 +01:00
Ramkumar Ramachandra	f058333941	[LV] Regen a test with UTC (#139235)	2025-05-09 14:26:20 +01:00
Florian Hahn	e854c381c6	[VPlan] Manage noalias/alias_scope metadata in VPlan. (#136450) Use VPIRMetadata added in https://github.com/llvm/llvm-project/pull/135272 to also manage no-alias metadata added by versioning. Note that this means we have to build the no-alias metadata up-front once. If it is not used, it will be discarded automatically. This also fixes a case where incorrect metadata was added to wide loads/stores that got converted from an interleave group. Compile-time impact is neutral: https://llvm-compile-time-tracker.com/compare.php?from=38bf1af41c5425a552a53feb13c71d82873f1c18&to=2fd7844cfdf5ec0f1c2ce0b9b3ae0763245b6922&stat=instructions:u	2025-05-09 11:19:12 +01:00
Florian Hahn	d06d43a9e8	[VPlan] Add printPhiOperands to VPPhiAccessors, use for wide phis. (NFC modulo debug output changes) Add generic helper to print phi operands (incoming values) together with their incoming blocks. As more and more transforms are added, keeping track of the incoming blocks of phis becomes more important. Print incoming blocks via VPPhiAccessors, to make debugging easier.	2025-05-08 20:56:48 +01:00
Florian Hahn	339dc9500b	[VPlan] Retain exit conditions and edges in initial VPlan (NFC). (#137709) Update initial VPlan construction to include exit conditions and edges. The loop region is now first constructed without entry/exiting. Those are set after inserting the region in the CFG, to preserve the original predecessor/successor order of blocks. For now, all early exits are disconnected before forming the regions, but a follow-up will update uncountable exit handling to also happen here. This is required to enable VPlan predication and remove the dependence on any IR BBs (https://github.com/llvm/llvm-project/pull/128420). PR: https://github.com/llvm/llvm-project/pull/137709	2025-05-08 18:10:52 +01:00
Ramkumar Ramachandra	c4f723a7c3	[LV] Strip unmaintainable MinBWs assert (#136858) tryToWiden attempts to replace an Instruction with a Constant from SCEV, but forgets to erase the Instruction from the MinBWs map, leading to an assert in VPlanTransforms::truncateToMinimalBitwidths. Going forward, the assertion in truncateToMinimalBitwidths is unmaintainable, as LV could simplify the expression at any point: fix the bug by stripping the unmaintainable assertion. Fixes #125278.	2025-05-08 11:49:54 +01:00
Luke Lau	1484f82cbc	[VPlan] Add VPInstruction::StepVector and use it in VPWidenIntOrFpInductionRecipe (#129508) Split off from #118638, this adds VPInstruction::StepVector, which generates integer step vectors (0,1,2,...,VF). This is a step towards eventually modelling all the separate parts of VPWidenIntOrFpInductionRecipe in VPlan. This is then used by VPWidenIntOrFpInductionRecipe, where we materialize it just before unrolling so the operands stay in a fixed position. The need for a separate operand in VPWidenIntOrFpInductionRecipe, as well as the need to update it in optimizeVectorInductionWidthForTCAndVFUF, should be removed with #118638 when everything is expanded in convertToConcreteRecipes.	2025-05-08 18:47:44 +08:00
Florian Hahn	127f48668b	[LV] Add test showing incorrect metadata merging when narrowing IGs. Add test showing that incorrect tbaa metadata is added to the widened loads and stores when narrowing interleave groups. The widened loads/stores currently have the TBAA metadata of the first load/store, even though the wide accesses also access data with types of the second load/store.	2025-05-08 11:13:25 +01:00
Paul Walker	01813e8929	[LLVM][VecLib] Refactor LIBMVEC integration to be target neutral. (#138262) Renames LIBMVEC-X86 to LIBMVEC and updates TLI to only add the existing x86 specific mapping when targeting x86.	2025-05-07 11:05:25 +01:00
Min-Yih Hsu	e0537c0768	[LV][EVL] Attach a new metadata on EVL vectorized loops (#131000) This patch attaches a new metadata, `llvm.loop.isvectorized.withevl`, on loops vectorized with explicit vector length. This will help other optimizations down in the pipeline that focus on EVL-vectorized loops. This approach is much safer than, say, IR pattern matching to figure out if a loop is EVL-vectorized or not.	2025-05-06 10:06:37 -07:00
Maryam Moghadas	a750893fea	[VPlan][LV] Fix invalid truncation in VPScalarIVStepsRecipe (#137832) Replace CreateTrunc with CreateSExtOrTrunc in VPScalarIVStepsRecipe to safely handle type conversion. This prevents assertion failures from invalid truncation when StartIdx0 has a smaller integer type than IntStepTy. The assertion was introduced by commit 783a846. Fixes https://github.com/llvm/llvm-project/issues/137185	2025-05-06 12:48:21 -04:00
Florian Hahn	9a26b2903b	[VPlan] Don't rely on region check in isUniformAfterVectorization. (#137883) Generalize isUniformAfterVectorization check to not rely on the region, but purely work on checking operands and opcodes. This will be needed when dissolving the vector region (https://github.com/llvm/llvm-project/pull/117506) and improves codegen slightly in some cases. PR: https://github.com/llvm/llvm-project/pull/137883	2025-05-02 15:42:21 +01:00
Sam Tebbs	2876dbcd66	[AArch64] Don't allow mixed partial reductions without i8mm (#137602) Partial reductions with mixed extends should only be allowed if i8mm is present.	2025-05-01 16:06:37 +01:00
Samuel Tebbs	fa769655e7	[LV] NFC: Make VPPartialReductionRecipe a VPReductionRecipe	2025-04-30 19:44:40 +01:00
Luke Lau	2cd829fc2c	[VectorUtils][VPlan] Consolidate VPWidenIntrinsicRecipe::onlyFirstLaneUsed and isVectorIntrinsicWithScalarOpAtArg (#137497) We can reuse isVectorIntrinsicWithScalarOpAtArg in VectorUtils to determine if only the first lane will be used for a VPWidenIntrinsicRecipe, provided that we also move the VP EVL operand check into it. This was needed by a local patch I was working on that created a VPWidenIntrinsicRecipe with a VP intrinsic, and prevents the need to update the scalar arguments in two places.	2025-05-01 01:25:41 +08:00

3111 Commits