llvm-project

Author	SHA1	Message	Date
Florian Hahn	e80e7e717e	[VPlan] Use scalar VPPhi instead of VPWidenPHIRecipe in createPlainCFG. (#150847 ) The initial VPlan closely reflects the original scalar loop, so using VPWidenPHIRecipe here is premature. Widened phi recipes should only be introduced together with other widened recipes. PR: https://github.com/llvm/llvm-project/pull/150847	2025-08-06 14:43:03 +01:00
Florian Hahn	777c320e6c	[VPlan] Address comments missed in #142309 . Address additional comments from https://github.com/llvm/llvm-project/pull/142309.	2025-08-06 11:52:08 +01:00
Andrew Rogers	a3c386d241	[llvm] annotate recently added interfaces for DLL export (#152179 ) ## Purpose This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates symbols that were recently added to LLVM and fixes incorrectly annotated symbols. ## Background This effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst). ## Overview The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`. The following manual adjustments were also applied after running IDS: - Add `LLVM_EXPORT_TEMPLATE` and `LLVM_TEMPLATE_ABI` annotations to explicitly instantiated instances of `llvm::object::SFrameParser`. ## Validation On Windows 11: ``` cmake -B build -S llvm -G Ninja -DLLVM_ENABLE_PROJECTS="llvm;clang;clang-tools-extra;lldb;lld" -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_BUILD_LLVM_DYLIB_VIS=ON -DLLVM_LINK_LLVM_DYLIB=ON -DLLVM_BUILD_TESTS=ON -DCLANG_LINK_CLANG_DYLIB=OFF -DCMAKE_BUILD_TYPE=Release ninja -C build ```	2025-08-05 23:12:07 -07:00
Florian Hahn	d478502a42	[VPlan] Ensure that IV resume phi for epilogue is always first. (NFCI) Update handling of canonical IV resume phi for the epilogue loop to make sure the resume phi for the canonical IV is always the first phi in the scalar preheader. This makes it easier to retrieve it in preparePlanForEpilogueVectorLoop. For now, we keep an assert to make sure we use the same resume phi as before. This will be removed in the future.	2025-08-05 21:06:41 +01:00
Florian Hahn	47258ca470	[VPlan] Use VPPhi instead of dyn_cast + opcode check in isPhi (NFC).	2025-08-05 19:20:12 +01:00
Luke Lau	94a6cd464e	[VPlan] Expand VPWidenPointerInductionRecipe into separate recipes (#148274 ) This is the VPWidenPointerInductionRecipe equivalent of #118638, with the motivation of allowing us to use the EVL as the induction step. There is a new VPInstruction added, WidePtrAdd, to allow adding the step vector to the induction phi, since VPInstruction::PtrAdd only handles scalars or multiple scalar lanes. Originally this transformation was copied from the original recipe's execute code, but it's since been simplified by teaching `unrollWidenInductionByUF` to unroll the recipe, which brings it in line with VPWidenIntOrFpInductionRecipe.	2025-08-05 16:54:02 +08:00
Florian Hahn	c9dd14d1d4	[VPlan] Compute interleave count for VPlan. (#149702 ) Move selectInterleaveCount to LoopVectorizationPlanner and retrieve some information directly from VPlan. Register pressure was already computed for a VPlan, and with this patch we now also check for reductions directly on VPlan, as well as checking how many load and store operations remain in the loop. This should be mostly NFC, but we may compute slightly different interleave counts, except for some edge cases, e.g. where dead loads have been removed. This shouldn't happen in practice, and the patch doesn't cause changes across a large test corpus on AArch64. Computing the interleave count based on VPlan allows for making better decisions in presence of VPlan optimizations, for example when operations on interleave groups are narrowed. Note that there are a few test changes for tests that were still checking the legacy cost-model output when it was computed in selectInterleaveCount. PR: https://github.com/llvm/llvm-project/pull/149702	2025-08-05 09:42:55 +01:00
Mel Chen	8761b6cf8f	[VPlan] Use VPTypeAnalysis to get the step type of widen pointer induction (#147925 ) This patch uses VPTypeAnalysis to determine its type since the induction step is not always a live-in value in the VPlan and may be defined by a recipe.	2025-08-05 09:13:44 +08:00
Florian Hahn	0433e1e15f	[VPlan] Add VPlan::getTrue/getFalse convenience helpers (NFC). Makes it slightly more convenient to create true/false constants.	2025-08-04 21:04:55 +01:00
Florian Hahn	215e6beae0	[LV] Use MapVector for ScalarCostsTy for deterministic iter order (NFC) We iterate over the scalar costs of instruction when printing costs, and currently the iteration order is not deterministic. Currently no tests check the output with multiple instructions in the map, but those will come soon.	2025-08-04 19:31:07 +01:00
Alexey Bataev	e27831ff9b	[SLP] Fix a check for main/alternate interchanged instruction If the instruction is checked for matching the main instruction, need to check if the opcode of the main instruction is compatible with the operands of the instruction. If they are not, need to check the alternate instruction and its operands for compatibility and return alternate instruction as a match. Fixes #151699 Fixed check for non-supported binary operations.	2025-08-04 11:20:54 -07:00
Michael Halkenhäuser	70af09e3a1	Revert "[SLP] Fix a check for main/alternate interchanged instruction" (#151997 ) This reverts commit 3ee8d047109ea4bb479095f4b153c2120a8d726c. Revert reason: FAILED build for openmp-offload-amdgpu-runtime-2 https://lab.llvm.org/buildbot/#/builders/10/builds/10827	2025-08-04 12:57:20 -04:00
Alexey Bataev	3ee8d04710	[SLP] Fix a check for main/alternate interchanged instruction If the instruction is checked for matching the main instruction, need to check if the opcode of the main instruction is compatible with the operands of the instruction. If they are not, need to check the alternate instruction and its operands for compatibility and return alternate instruction as a match. Fixes #151699	2025-08-04 08:31:35 -07:00
Simon Pilgrim	88c6448fa2	Revert "[VectorCombine] Shrink loads used in shufflevector rebroadcasts" (#151960 ) Reverts llvm/llvm-project#128938 while a crash regression is investigated	2025-08-04 15:03:53 +01:00
Alexey Bataev	7cd1ce3aa0	[SLP]Check vector-like instruction for dominance in copyables Need to check if the vector-like instruction is dominated by main operation in the copyables to prevent broken def-use chain Fixes #151456	2025-08-04 06:14:19 -07:00
Florian Hahn	66a8341f6d	[VPlan] Skip disconnected exit blocks in hasEarlyExit. (#151718 ) Currently hasEarlyExit returns true, if there are multiple exit blocks. ExitBlocks contains the wrapped original IR exit blocks. Without checking the predecessors we incorrectly return true for loops with multiple countable exits, that have been vectorized by requiring a scalar epilogue. In that case, the exit blocks will get disconnected. Fix this by filtering out disconnected exit blocks. Currently this should only impact the 'early-exit vectorized' statistic. PR: https://github.com/llvm/llvm-project/pull/151718	2025-08-04 11:31:00 +01:00
Leon Clark	1feed444aa	[VectorCombine] Shrink loads used in shufflevector rebroadcasts (#128938 ) Attempt to shrink the size of vector loads where only some of the incoming lanes are used for rebroadcasts in shufflevector instructions. --------- Co-authored-by: Leon Clark <leoclark@amd.com> Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-08-04 10:49:27 +01:00
Kazu Hirata	3549134836	[Vectorize] Remove an unnecessary cast (NFC) (#151850 ) getNumElements() already returns unsigned.	2025-08-03 08:44:50 -07:00
Florian Hahn	559d1dff89	[VPlan] Materialize BackedgeTakenCount using VPInstructions. Explicitly compute the backedge-taken count using VPInstruction. This is needed to model the full skeleton in VPlan. NFC modulo some instruction re-ordering.	2025-08-03 12:21:28 +01:00
Simon Pilgrim	b983ce8145	[VPlan] handleMaxMinNumReductions - fix gcc Wparentheses warning. NFC.	2025-08-03 11:50:31 +01:00
Florian Hahn	39c30665e9	[VPlan] Update type of cloned instruction in scalarizeInstruction. The operands of the replicate recipe may have been narrowed, resulting in a narrower result type. Update the type of the cloned instruction to the correct type. Fixes https://github.com/llvm/llvm-project/issues/151392.	2025-08-02 19:49:59 +01:00
Florian Hahn	08f50e9665	[VPlan] Use vector tripcount if computable when simplifying conds. (#151034 ) Update isConditionTrueViaVFAndUF to use the vector trip count if computable. This is the case when it has been materialized to a constant. Otherwise fall back to the trip count. PR: https://github.com/llvm/llvm-project/pull/151034	2025-08-02 16:31:31 +01:00
Ramkumar Ramachandra	af0be76a35	[VPlan] Replace reverse RPOT with PO traversal (NFC) (#151757 )	2025-08-02 08:46:27 +01:00
Florian Hahn	eee9755881	[LV] Refine check to find epilogue IV resume value. Make sure to check that the vector trip count is contained in the list of incoming values to serve as tie-breaker with phis with all-zero incoming values. Fixes https://github.com/llvm/llvm-project/issues/151686.	2025-08-01 20:54:39 +01:00
Florian Hahn	c300a99ea8	[LV] Use MapVector for InstsToScalarize for deterministic iter order (NFC) We iterate over InstsToScalarize when printing costs, and currently the iteration order is not deterministic. Currently no tests check the output with multiple instructions in InstsToScalarize, but those will come soon.	2025-08-01 14:29:53 +01:00
Florian Hahn	7d815c7642	[LV] Remove unused variables after 965231ca0a9a. (NFC) Clean up unused/dead variables after 965231ca0a9a (https://github.com/llvm/llvm-project/pull/151311)	2025-08-01 10:12:20 +01:00
Florian Hahn	965231ca0a	[LV] Replace UncountableEdge with UncountableEarlyExitingBlock (NFC). (#151311 ) Only the uncountable exiting BB is used. Store it instead of a pair of Exiting BB and Exit BB. PR: https://github.com/llvm/llvm-project/pull/151311	2025-08-01 09:37:02 +01:00
Mel Chen	86916ff0f0	[LV] Fix gap mask requirement for interleaved access (#151105 ) When interleaved stores contain gaps, a mask is required to skip the gaps, regardless of whether scalar epilogues are allowed. This patch corrects the condition under which a gap mask is needed, ensuring consistency between the legacy and VPlan-based cost models and avoiding assertion failures. Related #149981	2025-08-01 14:24:30 +08:00
Ramkumar Ramachandra	d07f48e4da	[VPlan] Use m_BinaryOr matcher for clarity (NFC) (#151541 )	2025-08-01 06:56:27 +01:00
Luke Lau	7250b66240	[VPlan] Create AVL as a phi from TC -> 0 with EVL tail folding (#151481 ) This implements the first half of #151459, by changing the AVL so it's no longer computed as `trip-count - EVL-based IV`, but instead a separate scalar phi that is decremented by EVL each iteration. This shortens the dependency chain for computing the AVL and should eventually allow us to convert the branch condition to `branch-count avl-next, 0`. `simplifyBranchConditionForVFAndUF` had to be updated to prevent a regression because this introduces a VPPhi in the header block.	2025-08-01 11:00:05 +08:00
Luke Lau	08c5944222	[VPlan] Fix header phi VPInstruction verification. NFC (#151472 ) Noticed this when checking the invariant that all phis in the header block must be header phis. I think there's a missing set of parentheses here, since otherwise it only cast<VPInstruction> when RecipeI isn't a VPInstruction.	2025-07-31 23:09:20 +08:00
Samuel Tebbs	339b0a1d74	[LV][NFCI] Format fcc419b05f62	2025-07-31 14:37:59 +01:00
Samuel Tebbs	fcc419b05f	[LV][NFCI] Swap reduction recipe operand order https://github.com/llvm/llvm-project/pull/147026 will enable sub reductions, which require that the phi value is the first operand since they aren't commutative. This re-orders the operands when executing reductions, which actually matches other existing code in VPReductionRecipe::execute.	2025-07-31 14:35:10 +01:00
Nathan Gauër	67273393b1	[VectorCombine][TTI] Prevent extract/ins rewrite to GEP (#150216 ) Using GEP to index into a vector is not disallowed, but not recommended. The SPIR-V backend needs to generate structured access into types, which is impossible with an untyped GEP instruction unless we add more info to the IR. Finding a solution is a work-in-progress, but in the meantime, we'd like to reduce the number of failures. Preventing this optimization from rewriting extract/insert instructions into a GEP helps us lower more code to SPIR-V. This change should be OK as it's only active when targeting SPIR-V and disabling a non-recommended transformation. Related to #145002	2025-07-31 14:14:00 +02:00
Ramkumar Ramachandra	b7d00b827e	[VPlan] Uniformly use VPlanPatternMatch in transforms (NFC) (#151488 )	2025-07-31 12:01:40 +01:00
Ramkumar Ramachandra	20f6ec4b29	[VPlan] Make VPBuilder APIs uniformly take ArrayRef (NFC) (#151484 )	2025-07-31 11:33:04 +01:00
Mel Chen	6752415ce8	[VectorUtils] Simplify the code by new function InterleaveGroup::isFull. nfc (#151112 )	2025-07-31 16:02:53 +08:00
Shih-Po Hung	cc8c941e17	[VPlan] Convert EVL loops to variable-length stepping after dissolution (#147222 ) Loop regions require fixed-length steps and rounded-up trip counts, but after dissolution creates explicit control flow, EVL loops can leverage variable-length stepping with original trip counts. This patch adds a post-dissolution transform pass to convert EVL loops from fixed-length to variable-length stepping.	2025-07-30 16:50:57 +08:00
Luke Lau	b663e563cc	[VPlan] Fix header masks in EVL tail folding (#150202 ) With EVL tail folding, the EVL may not always be VF on the second-to-last iteration. Recipes that have been converted to VP intrinsics via optimizeMaskToEVL account for this, but recipes that are left behind will still use the old header mask which may end up having a different vector length. This is effectively the same as #95368, and fixes this by converting header masks from icmp ule wide-canonical-iv, backedge-trip-count -> icmp ult step-vector, evl. Without it, recipes that fall through optimizeMaskToEVL may use the wrong vector length, e.g. in #150074 and #149981. We really need to split off optimizeMaskToEVL into VPlanTransforms::optimize and move transformRecipestoEVLRecipes into tryToBuildVPlanWithVPRecipes, so we don't mix up what is needed for correctness and what is needed to optimize away the mask computations. We should be able to still generate a correct albeit suboptimal VPlan without running optimizeMaskToEVL. I've added a TODO for this, which I think we can do after #148274 Fixes #150197	2025-07-30 11:31:04 +08:00
Florian Hahn	55f9eccee9	[LV] Revert back to use Loop::isLoopInvariant in isPredicatedInst. (#150828 ) This partially reverts https://github.com/llvm/llvm-project/pull/140744, restoring the original TheLoop->isLoopInvariant check instead of the more powerful Legal->isInvariant, which uses SCEV. This causes a mis-compile, because SCEV can prove that the stored value is loop-invariant, which in turn converts the store to a uniform store. But in VPlan, we aren't yet able to determine that the stored value is loop-invariant, so we extract the last lane, which is incorrect, because it does not account for the mask of the store. Restoring the original code is a safe fix and avoids this subtle divergence. Fixes https://github.com/llvm/llvm-project/issues/149347. PR: https://github.com/llvm/llvm-project/pull/150828	2025-07-29 20:32:31 +01:00
Paul Walker	3ede2decbe	[LLVM][LV] Improve UF calculation for vscale based scalar loops. (#146102 ) Update getSmallConstantTripCount() to return scalable ElementCount values that are used to accurately determine the maximum value for UF, namely: TripCount / VF ==> X * VScale / Y * VScale ==> X / Y This improves the chances of being able to remove the scalar loop and also fixes an issue where a UF=2 is chosen for a scalar loop with exactly VF(= X * VScale) iterations.	2025-07-29 12:49:38 +01:00
David Sherwood	6fbc397964	[IR] Add new CreateVectorInterleave interface (#150931 ) This PR adds a new interface to IRBuilder called CreateVectorInterleave, which can be used to create vector.interleave intrinsics of factors 2-8. For convenience I have also moved getInterleaveIntrinsicID and getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp where it can be used by IRBuilder.	2025-07-29 08:47:07 +01:00
Florian Hahn	c93d166c58	[VPlan] Simplify (MUL %x, 0) -> 0. Simplify trivial multiplies. https://alive2.llvm.org/ce/z/DabRkA	2025-07-28 21:50:57 +01:00
Luke Lau	92d09245d6	[VPlan] Fall back to scalar epilogue if possible when EVL isn't legal (#150908 ) When enabling predicated vectorization by default on RISC-V, there's a bunch of performance regressions on llvm-test-suite's LoopInterleaving microbenchmarks: https://lnt.lukelau.me/db_default/v4/nts/788?show_delta=yes&show_previous=yes&show_stddev=yes&show_mad=yes&show_all=yes&show_all_samples=yes&show_sample_counts=yes&show_small_diff=yes&num_comparison_runs=0&test_filter=&test_min_value_filter=&aggregation_fn=min&MW_confidence_lv=0.05&compare_to=791&baseline=730&submit=Update Most of these regressions stem from the interleave_count pragma, which causes EVL tail folding interleaving to be unsupported (since we don't support unrolling with EVL). Currently if DataWithEVL isn't legal we fall back to DataWithoutLaneMask as the tail folding style, but this is very slow on RISC-V. The order of performance roughly is something like: DataWithEVL > None (scalar-epilogue) > Data[WithoutLaneMask] So this patch tries to prevent the regressions by falling back to a scalar epilogue where possible, i.e. the existing vectorization we have today. Note we may still need to fall back to DataWithoutLaneMask, e.g. if the trip count is low etc or it's forced by -prefer-predicate-over-epilogue=predicate-dont-vectorize.	2025-07-28 20:10:36 +08:00
Florian Hahn	2f2df751d4	[LV] Use SCEV::getElementCount in selectEpilogueVectorizationFactor. (#150018 ) Follow-up to https://github.com/llvm/llvm-project/pull/149789 to use getElementCount to compute the remaining iterations in selectEpilogueVectorizationFactor. PR: https://github.com/llvm/llvm-project/pull/150018	2025-07-28 12:12:27 +01:00
Florian Hahn	f8b1c7333f	[VPlan] Add getContext helper to VPlan (NFC).	2025-07-27 18:53:53 +01:00
Florian Hahn	4386848776	[VPlan] Add explicit VPUnrollPartAccessor<1> instantiation. This should fix a build-failure with GCC, including https://lab.llvm.org/buildbot/#/builders/105/builds/10685.	2025-07-27 14:05:23 +01:00
Florian Hahn	89ae085859	[VPlan] Remove VPVectorPointer for part 0 after unrolling. (#149735 ) VPVectorPointer for part 0 is just the pointer operand. Simplify it after unrolling. This removes a large number of redundant GEPs with index 0. PR: https://github.com/llvm/llvm-project/pull/149735	2025-07-27 13:53:26 +01:00
Florian Hahn	c9a87b45a3	[VPlan] Retrieve latch terminator from VPlan. (NFC) Remove an unnecessary lookup via original IR loop.	2025-07-27 09:48:59 +01:00
Florian Hahn	bc7487d8ed	[VPlan] Cast header and latch to VPBasicBlock early (NFC). There are only VPBasicBlocks when prepareForVectorization is called. Cast them early instead of having multiple casts later on.	2025-07-27 09:47:50 +01:00
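
Commit 7250b66240 above changes the AVL (application vector length) from `trip-count - EVL-based IV` to a separate scalar phi that starts at the trip count and is decremented by the EVL each iteration. The sketch below is only a scalar simulation of the two shapes, not LLVM code: the function names are hypothetical, and it assumes the EVL is simply `min(AVL, VF)`, which real targets need not guarantee (see commit b663e563cc above).

```python
def evls_with_avl_from_iv(trip_count, vf):
    """Old shape: AVL is recomputed each iteration as trip_count - iv."""
    iv = 0
    evls = []
    while iv < trip_count:
        avl = trip_count - iv      # depends on the EVL-based IV
        evl = min(avl, vf)         # assumption: EVL = min(AVL, VF)
        evls.append(evl)
        iv += evl
    return evls


def evls_with_avl_phi(trip_count, vf):
    """New shape: AVL is its own phi, counting down from trip_count to 0."""
    avl = trip_count
    evls = []
    while avl != 0:                # loop could exit on 'avl-next == 0'
        evl = min(avl, vf)         # assumption: EVL = min(AVL, VF)
        evls.append(evl)
        avl -= evl                 # decremented by EVL each iteration
    return evls
```

Under that assumption both shapes yield the same EVL sequence; the second one no longer needs the IV to compute the AVL, which is the shorter dependency chain the commit message refers to.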


6326 Commits