llvm-project

Author	SHA1	Message	Date
Nicolai Hähnle	11a4b2d950	Cleanup the LLVM exported symbols namespace (#161240 ) There's a pattern throughout LLVM of cl::opts being exported. That in itself is probably a bit unfortunate, but what's especially bad about it is that a lot of those symbols are in the global namespace. Move them into the llvm namespace. While doing this, I noticed some other variables in the global namespace and moved them as well.	2025-10-01 15:32:07 -07:00
Ramkumar Ramachandra	280abaf9da	[VPlan] Handle scalar-VF in transforms (NFC) (#161365 )	2025-09-30 19:35:12 +01:00
Sam Tebbs	88658dbbc5	[LV] Add ExtNegatedMulAccReduction expression type (#160154 ) This PR adds the ExtNegatedMulAccReduction expression type for VPExpressionRecipe so that extend-multiply-accumulate reductions with a negated multiply can be bundled. Stacked PRs: 1. https://github.com/llvm/llvm-project/pull/156976 2. -> https://github.com/llvm/llvm-project/pull/160154 3. https://github.com/llvm/llvm-project/pull/147302	2025-09-30 10:10:37 +01:00
Florian Hahn	71be13a6f0	[VPlan] Rewrite VPExpandSCEVExprs in replaceSymbolicStrides. Extend replaceSymbolicStrides to also replace SCEVUnknowns in VPExpandSCEVExprs using the information from StridesMaps. This results in simpler SCEV expansions in some cases.	2025-09-28 21:55:31 +01:00
Florian Hahn	70a26da639	[VPlan] Set correct flags when creating and cloning VPWidenCastRecipe. Make sure that we set the correct wrap flags when creating new VPWidenCastRecipes for truncs and preserve the flags from the recipe directly when cloning, to make sure they are not dropped. Fixes https://github.com/llvm/llvm-project/issues/160396	2025-09-25 09:00:47 +01:00
Ramkumar Ramachandra	019913e4fa	[VPlan] Add WidenGEP::getSourceElementType (NFC) (#159029 )	2025-09-22 10:02:08 +01:00
Ramkumar Ramachandra	b716d35388	[VPlanPatternMatch] Introduce m_ConstantInt (#159558 )	2025-09-21 13:27:46 +01:00
Florian Hahn	50b9ca4dda	[VPlan] Simplify Plan's entry in removeBranchOnConst. (#154510) After https://github.com/llvm/llvm-project/pull/153643, there may be a BranchOnCond with constant condition in the entry block. Simplify those in removeBranchOnConst. This removes a number of redundant conditional branches from entry blocks. In some cases, it may also make the original scalar loop unreachable, because we know it will never execute. In that case, we need to remove the loop from LoopInfo, because all unreachable blocks may dominate each other, making LoopInfo invalid. In those cases, we can also completely remove the loop, for which I'll share a follow-up patch. Depends on https://github.com/llvm/llvm-project/pull/153643. PR: https://github.com/llvm/llvm-project/pull/154510	2025-09-18 19:25:05 +01:00
Ramkumar Ramachandra	0384f6c9db	[VPlanPatternMatch] Introduce match functor (NFC) (#159521 ) Follow up on 7fb3a91 ([PatternMatch] Introduce match functor) to introduce the VPlanPatternMatch version of the match functor to shorten some idioms. Co-authored-by: Luke Lau <luke@igalia.com>	2025-09-18 10:36:12 +01:00
Ramkumar Ramachandra	46fcece2a8	[VPlan] Extend CSE to eliminate GEPs (#156699 ) The motivation for this patch is to close the gap between the VPlan-based CSE and the legacy CSE, to make it easier to remove the legacy CSE. Before this patch, stubbing out the legacy CSE leads to 22 test failures, and after this patch, there are only 12 failures, and all of them seem to have a single root cause: VPlanTransforms::createInterleaveGroups() and VPInterleaveGroup::execute(). The improvements from this patch are of course welcome. While developing the patch, a miscompile was found when GEP source-element-types differ, and this has been fixed. Co-authored-by: Florian Hahn <flo@fhahn.com> Co-authored-by: Luke Lau <luke@igalia.com>	2025-09-16 10:14:32 +00:00
Ramkumar Ramachandra	148a83543b	[LV] Introduce m_One and improve (0|1)-match (NFC) (#157419)	2025-09-15 10:34:06 +00:00
Florian Hahn	ef7e03a2d1	[VPlan] Limit ExtractLastElem fold to recipes guaranteed single-scalar. vputils::isSingleScalar(A) may return true for recipes that produce only a single scalar value, but they could still end up as a vector instruction, because the recipe could not be converted to a single-scalar VPInstruction/VPReplicateRecipe. For now, only apply the fold for recipes guaranteed to produce a single value, i.e. single-scalar VPInstructions and VPReplicateRecipes. Fixes https://github.com/llvm/llvm-project/issues/158319.	2025-09-13 18:15:38 +01:00
Florian Hahn	b8eaceb39b	[VPlan] Explicitly replicate VPInstructions by VF. (#155102 ) Extend replicateByVF added in #142433 (aa240293190) to also explicitly unroll replicating VPInstructions. Now the only remaining case where we replicate for all lanes is VPReplicateRecipes in replicate regions. PR: https://github.com/llvm/llvm-project/pull/155102	2025-09-12 17:06:26 +01:00
Florian Hahn	1efa997317	[VPlan] Handle stores to single-scalar addr in narrowToSingleScalars. Move handling of stores to single-scalar/uniform address from replicateByVF to narrowToSingleScalar.	2025-09-10 21:58:29 +01:00
Florian Hahn	055e4ff35a	[VPlan] Don't narrow op multiple times in narrowInterleaveGroups. Track which ops have already been narrowed, to avoid narrowing the same operation multiple times. Repeated narrowing will lead to incorrect results, because we could first narrow from an interleave group -> wide load, and then narrow the wide load -> single-scalar load. Fixes https://github.com/llvm/llvm-project/issues/156190.	2025-09-10 19:22:42 +01:00
Florian Hahn	c3e76b2770	[VPlan] Keep common flags during CSE. (#157664 ) During CSE, we don't have to drop all poison-generating flags on mis-match, we can keep the ones common on both recipes. PR: https://github.com/llvm/llvm-project/pull/157664	2025-09-10 10:20:48 +00:00
Stephen Tozer	d4f7995488	[VPlan] Use Unknown instead of empty location in VPlanTransforms (#157702 ) The default values for DebugLocs in LoopVectorizer/VPlan were recently updated from empty DebugLocs to DebugLoc::getUnknown, as part of the DebugLoc Coverage Tracking work. However, there are some cases where we also pass an explicit empty DebugLoc, in many cases as a filler argument. This patch updates all of these to `getUnknown` for now, until either valid locations or a suitable categorization can be assigned to each instead. This change is NFC outside of DebugLoc coverage tracking builds.	2025-09-10 10:33:58 +01:00
Mel Chen	4d9a7fa9ba	[VPlan] Remove dead recipes before simplifying blends (#157622 ) In simplifyBlends, when normalizing a blend recipe, the first mask that is used only by the blend and is not all-false is chosen, and its corresponding incoming value becomes the initial value, with the others blended into it. At the same time, the mask that is chosen can be eliminated. However, a multi-user mask might be used by a dead recipe, which prevents this optimization. This patch moves removeDeadRecipes before simplifyBlends to eliminate dead recipes, allowing simplifyBlends to remove more dead masks.	2025-09-10 08:03:18 +00:00
Florian Hahn	c4b17bf9ed	[VPlan] Slightly extend ExtractLastElement fold to single-scalars. Update ExtractLastElement fold to support single scalar recipes, if all their users only use scalars.	2025-09-09 22:08:08 +01:00
Florian Hahn	132bacde22	[VPlan] Also allow extracts as users when converting to single scalars. Extracts technically do not use scalars, but vectors, but if the operand is a single scalar we do not need a vector and they should not block forming single scalars.	2025-09-08 22:11:39 +01:00
Luke Lau	3f9e0736ac	[VPlan] Move findCommonEdgeMask optimization to simplifyBlends (#156304) Following up from #150368, this moves folding common edge masks into simplifyBlends. One test in uniform-blend.ll ended up regressing but after looking at it closely, it came from a weird (x && !x) edge mask. So I've just included a simplification in this PR to fold that to false.	2025-09-05 01:29:22 +00:00
Ramkumar Ramachandra	e4c0b3e111	[VPlan] Simplify x && false -> false, x | 0 -> x (#156345) The OR x, 0 -> x simplification has been introduced to avoid regressions.	2025-09-04 10:29:59 +01:00
Luke Lau	c33ccfa52b	[VPlan] Reassociate (x & y) & z -> x & (y & z) (#155383 ) This PR reassociates logical ands in order to enable more simplifications. The driving motivation for this is that with tail folding all blocks inside the loop body will end up using the header mask. However this can end up nestled deep within a chain of logical ands from other edges. Typically the header mask will be a leaf nested in the LHS, e.g. (headermask & y) & z. So pulling it out allows it to be simplified further, e.g. allows it to be optimised away to VP intrinsics with EVL tail folding.	2025-09-03 01:09:19 +00:00
Ramkumar Ramachandra	d8fd511480	[VPlan] Introduce CSE pass (#151872 ) Introduce a simple common-subexpression-elimination pass at the VPlan-level, running late during the execution of the VPlan. The long-term vision is to get rid of the legacy non-VPlan-based cse routine in LV, but this patch doesn't yet fully subsume it.	2025-09-02 12:23:29 +01:00
Sam Tebbs	37127f74f4	[LV] Bundle sub reductions into VPExpressionRecipe (#147255 ) This PR bundles sub reductions into the VPExpressionRecipe class and adjusts the cost functions to take the negation into account. Stacked PRs: 1. https://github.com/llvm/llvm-project/pull/147026 2. -> https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/147302 4. https://github.com/llvm/llvm-project/pull/147513	2025-09-01 17:25:01 +01:00
Mel Chen	13357e8a12	[LV][EVL] Support interleaved access with tail folding by EVL (#152070) The InterleavedAccess pass already supports transforming vector-predicated (vp) load/store intrinsics. With this patch, we start enabling interleaved access under tail folding by EVL. This patch introduces a new base class, VPInterleaveBase, and a concrete class, VPInterleaveEVLRecipe. Both the existing VPInterleaveRecipe and the new VPInterleaveEVLRecipe inherit from and implement VPInterleaveBase. Compared to VPInterleaveRecipe, VPInterleaveEVLRecipe adds an EVL operand to emit vp.load/vp.store intrinsics. Currently, tail folding by EVL is only supported for scalable vectorization. Therefore, VPInterleaveEVLRecipe will only emit interleave/deinterleave intrinsics. Reverse accesses are not yet implemented, as masked reverse interleaved access under tail folding is not yet supported. Fixes #123201	2025-09-01 21:20:06 +08:00
Luke Lau	eb7f6a5f8a	[VPlan] Simplify (x && y) || (x && z) -> x && (y || z) (#156308) Split off from #155383, since it turns out this has a diff on its own.	2025-09-01 21:12:23 +08:00
Kerry McLaughlin	f0e9bba024	[LoopVectorize] Generate wide active lane masks (#147535 ) This patch adds a new flag (-enable-wide-lane-mask) which allows LoopVectorize to generate wider-than-VF active lane masks when it is safe to do so (i.e. the mask is used for data and control flow). The transform in extractFromWideActiveLaneMask creates vector extracts from the first active lane mask in the header & loop body, modifying the active lane mask phi operands to use the extracts. An additional operand is passed to the ActiveLaneMask instruction, the value of which is used as a multiplier of VF when generating the mask. By default this is 1, and is updated to UF by extractFromWideActiveLaneMask. The motivation for this change is to improve interleaved loops when SVE2.1 is available, where we can make use of the whilelo instruction which returns a predicate pair. This is based on a PR that was created by @momchil-velikov (#81140) and contains tests which were added there.	2025-09-01 13:53:30 +01:00
Ramkumar Ramachandra	4cf770275f	[VPlan] Introduce replaceSymbolicStrides (NFC) (#155842 ) Introduce VPlanTransforms::replaceSymbolicStrides factoring some code from LoopVectorize.	2025-09-01 09:03:46 +00:00
Ramkumar Ramachandra	0a193cb687	[VPlan] Use IsaPred to improve code (NFC) (#156037 )	2025-09-01 09:16:35 +01:00
Florian Hahn	465b17c450	[VPlan] Support scalable VFs in narrowInterleaveGroups. (#154842 ) Update narrowInterleaveGroups to support scalable VFs. After the transform, the vector loop will process a single iteration of the original vector loop for fixed-width vectors and vscale iterations for scalable vectors.	2025-08-31 20:45:07 +01:00
Florian Hahn	13aff91e7c	Revert "[VPlan] Support plans with vector pointers in narrowInterleaveGroups." This reverts commit 806a797c52d8018639f5cdcce5eb375b17c87f5e as it introduced a miscompile.	2025-08-31 19:37:24 +01:00
Florian Hahn	806a797c52	[VPlan] Support plans with vector pointers in narrowInterleaveGroups. After narrowing interleave groups and related memory operations, all vector pointers should be removed. Remove the check. In preparation for https://github.com/llvm/llvm-project/pull/149706.	2025-08-29 20:55:40 +01:00
Florian Hahn	5ebd59806b	[VPlan] Fold BinaryAnd x, 0 -> 0 in simplifyRecipe. This also fixes a cost-model divergence in the newly added tests in constant-fold.ll	2025-08-27 22:35:08 +01:00
Florian Hahn	5faed1ad84	[VPlan] Add VPlan-based addMinIterCheck, replace ILV for non-epilogue. (#153643) This patch adds a new VPlan-based addMinimumIterationCheck, which replaces the ILV version for the non-epilogue case. The VPlan-based version constructs a SCEV expression to compute the minimum iterations and uses it to determine whether the check is known true or false. Otherwise it creates a VPExpandSCEV recipe and emits a compare-and-branch. When using epilogue vectorization, we still need to create the minimum trip-count-check during the legacy skeleton creation. The patch moves the definitions out of ILV. PR: https://github.com/llvm/llvm-project/pull/153643	2025-08-26 15:52:31 +01:00
Ramkumar Ramachandra	1e0e0e0a56	[VPlan] Improve style around container-inserts (NFC) (#155174 )	2025-08-26 14:12:59 +01:00
Luke Lau	f3520c538d	[VPlan] Replace EVL branch condition with (branch-on-count AVLNext, 0) (#152167 ) This changes the branch condition to use the AVL's backedge value instead of the EVL-based IV. This allows us to emit bnez on RISC-V and removes a use of the trip count, which should reduce register pressure. To match phis with VPlanPatternMatch I've had to relax the assert that the number of operands must exactly match the pattern for the Phi opcode, and I've copied over m_ZExtOrSelf from the LLVM IR PatternMatch.h. Fixes #151459	2025-08-26 11:19:19 +00:00
Florian Hahn	c950a72974	[VPlan] Support scalar VF for ExtractLane and FirstActiveLane. Extend ExtractLane and FirstActiveLane to support scalar VFs. This allows correct handling when interleaving with VF = 1. Alive2 proofs: - Fixed codegen with this patch: https://alive2.llvm.org/ce/z/8Y5_Vc (verifies as correct) - Original codegen: https://alive2.llvm.org/ce/z/twdg3X (doesn't verify) Fixes https://github.com/llvm/llvm-project/issues/154967.	2025-08-25 21:45:21 +01:00
Ramkumar Ramachandra	66be00d635	[VPlan] Introduce m_Cmp; match more compares (#154771 ) Extend [Specific]Cmp_match to handle floating-point compares, and introduce m_Cmp that matches both integer and floating-point compares. Use it in simplifyRecipe to match and simplify the general case of compares. The change has necessitated a bugfix in VPReplicateRecipe::execute.	2025-08-24 13:27:06 +01:00
Florian Hahn	954097dd61	[VPlan] Use SCEV to check subtract in getOptimizableIVOf. Simplify checks for IV subtractions in getOptimizableIVOf by using SCEV. This slightly generalizes the patterns we can handle.	2025-08-23 22:00:01 +01:00
Florian Hahn	9f87cd68a4	[VPlan] Add m_ExtractLastElement matcher. (NFC)	2025-08-23 21:21:03 +01:00
Luke Lau	c97c6869b6	[VPlan] Allow folding not (cmp eq) -> icmp ne with other select users (#154497 ) Currently we only allow folding not (cmp eq) -> icmp ne if the not is the only user of the compare. However a common scenario is that some select might also use the compare. We can still fold the not if we also swizzle the arms of the selects. This helps avoid regressions in #150368	2025-08-22 07:59:14 +08:00
Florian Hahn	300d2c6d20	[VPlan] Move SCEV expansion to VPlan transform. (NFCI). Move the logic to expand SCEVs directly to a late VPlan transform that expands SCEVs in the entry block. This turns VPExpandSCEVRecipe into an abstract recipe without execute, which clarifies how the recipe is handled, i.e. it is not executed like regular recipes. It also helps to simplify construction, as now scalar evolution isn't required to be passed to the recipe.	2025-08-21 22:03:26 +01:00
Florian Hahn	e41aaf5a64	[VPlan] Use VPIRMetadata for VPInterleaveRecipe. (#153084 ) Use VPIRMetadata for VPInterleaveRecipe to preserve noalias metadata added by versioning. This still uses InterleaveGroup's logic to preserve existing metadata from IR. This can be migrated separately. Fixes https://github.com/llvm/llvm-project/issues/153006. PR: https://github.com/llvm/llvm-project/pull/153084	2025-08-21 18:58:10 +01:00
Florian Hahn	21cca5ea9d	[VPlan] Rely on VPlan opts to simplify multiply by 1 (NFCI).	2025-08-21 18:43:47 +01:00
Luke Lau	5ef28e0a88	[VPlan] Add m_c_Add to VPlanPatternMatch. NFC (#154730 ) Same thing as #154705, and useful for simplifying the matching in #152167	2025-08-21 11:26:08 +00:00
Luke Lau	955c475ae6	[VPlan] Add m_Sub to VPlanPatternMatch. NFC (#154705 ) To mirror PatternMatch.h, and we'll also be able to use it in #152167	2025-08-21 09:33:46 +00:00
Shih-Po Hung	cf0e86118d	[VPlan] Handle canonical VPWidenIntOrFpInduction in branch-condition simplification (#153539 ) SimplifyBranchConditionForVFAndUF only recognized canonical IVs and a few PHI recipes in the loop header. With more IV-step optimizations, the canonical widen-canonical-iv can be replaced by a canonical VPWidenIntOrFpInduction, which the pass did not handle, causing regressions (missed simplifications). This patch replaces canonical VPWidenIntOrFpInduction with a StepVector in the vector preheader since the vector loop region only executes once.	2025-08-21 07:34:54 +08:00
Luke Lau	cabf6433c6	[VPlan] EVL transform VPVectorEndPointerRecipe alongside load/store recipes. NFC (#152542) This is the first step in untangling the variable step transform and header mask optimizations as described in #152541. Currently we replace all VF users globally in the plan, including VPVectorEndPointerRecipe. However this leaves reversed loads and stores in an incorrect state until they are adjusted in optimizeMaskToEVL. This moves the VPVectorEndPointerRecipe transform so that it is updated in lockstep with the actual load/store recipe. One thought that crossed my mind was that VPInterleaveRecipe could also use VPVectorEndPointerRecipe, in which case we would have also been computing the wrong address because we don't transform it to an EVL recipe which accounts for the reversed address.	2025-08-19 08:16:48 +00:00
Luke Lau	144736b07e	[VPlan] Don't fold live ins with both scalar and vector operands (#154067) If we end up with an extract_element VPInstruction where both operands are live-ins, we will try to fold the live-ins even though the first operand is a vector whilst the live-in is scalar. This fixes it by just returning the vector live-in instead of calling the folder, and removes the handling for insertelement where we aren't able to do the fold. From some quick testing we previously never hit this fold anyway, and were probably just missing test coverage. Fixes #154045	2025-08-19 04:10:53 +00:00
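
Several of the commits above add boolean rewrites to VPlan's simplification code: x && false -> false, x | 0 -> x, (x && !x) -> false, (x & y) & z -> x & (y & z), (x && y) || (x && z) -> x && (y || z), and not (cmp eq) -> cmp ne with the select arms swapped. The sketch below is one way to sanity-check those identities by exhaustive enumeration over plain C++ booleans. It is an illustration only: the helper names are hypothetical and are not LLVM or VPlan APIs, and the model assumes ordinary two-valued logic, so it says nothing about poison or undef semantics.

```cpp
#include <cassert>

// Hypothetical helper, not an LLVM API: models a select as a ternary.
constexpr int selectValue(bool Cond, int TrueVal, int FalseVal) {
  return Cond ? TrueVal : FalseVal;
}

// Hypothetical helper: returns true if every mask rewrite listed in the
// commit messages preserves the value for all inputs x, y, z in {0, 1}.
constexpr bool rewritesHoldForAllInputs() {
  for (int Bits = 0; Bits < 8; ++Bits) {
    const bool X = Bits & 1, Y = Bits & 2, Z = Bits & 4;
    // x && false -> false
    if ((X && false) != false) return false;
    // x | 0 -> x
    if ((X | false) != X) return false;
    // x && !x -> false
    if ((X && !X) != false) return false;
    // (x & y) & z -> x & (y & z)
    if (((X & Y) & Z) != (X & (Y & Z))) return false;
    // (x && y) || (x && z) -> x && (y || z)
    if (((X && Y) || (X && Z)) != (X && (Y || Z))) return false;
  }
  return true;
}

// Hypothetical helper: checks that not (a == b) equals (a != b), and that a
// select on the original compare equals a select on the inverted compare
// with its arms swapped.
constexpr bool notEqFoldHolds() {
  for (int A = 0; A < 3; ++A)
    for (int B = 0; B < 3; ++B) {
      const bool Eq = (A == B);
      const bool Ne = (A != B);
      if ((!Eq) != Ne) return false;
      if (selectValue(Eq, 10, 20) != selectValue(Ne, 20, 10)) return false;
    }
  return true;
}

// Both checks run at compile time (requires C++14 or later).
static_assert(rewritesHoldForAllInputs(), "a mask rewrite changed a value");
static_assert(notEqFoldHolds(), "the not(cmp eq) fold changed a value");
```

Because the checks are `constexpr`, a compiler accepting the file already confirms the identities under this simplified model. The functions can also be called at run time, for example from a unit test.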

481 Commits