llvm-project

Author	SHA1	Message	Date
Florian Hahn	0c028bbf33	[LV] Always add uniform pointers to uniforms list. Always add pointers proved to be uniform via legal/SCEV to worklist. This extends the existing logic to handle a few more pointers known to be uniform.	2025-09-18 22:56:19 +01:00
Alexey Bataev	8c41859a21	[SLP]Clear the operands deps of non-schedulable nodes, if previously all operands were copyable If all operands of the non-schedulable nodes were previously only copyables, need to clear the dependencies of the original schedule data for such copyable operands and recalculate them to correctly handle number of dependencies. Fixes #159406	2025-09-18 12:11:33 -07:00
Ramkumar Ramachandra	f1ba44f50a	[VPlan] Strip dead code in cst live-in match (NFC) (#159589 ) A live-in constant can never be of vector type.	2025-09-18 19:28:42 +01:00
Florian Hahn	50b9ca4dda	[VPlan] Simplify Plan's entry in removeBranchOnConst. (#154510 ) After https://github.com/llvm/llvm-project/pull/153643, there may be a BranchOnCond with constant condition in the entry block. Simplify those in removeBranchOnConst. This removes a number of redundant conditional branches from entry blocks. In some cases, it may also make the original scalar loop unreachable, because we know it will never execute. In that case, we need to remove the loop from LoopInfo, because all unreachable blocks may dominate each other, making LoopInfo invalid. In those cases, we can also completely remove the loop, for which I'll share a follow-up patch. Depends on https://github.com/llvm/llvm-project/pull/153643. PR: https://github.com/llvm/llvm-project/pull/154510	2025-09-18 19:25:05 +01:00
Graham Hunter	6b99a7bbed	[LV] Provide utility routine to find uncounted exit recipes (#152530 ) Splitting out just the recipe finding code from #148626 into a utility function (along with the extra pattern matchers). Hopefully this makes reviewing a bit easier. Added a gtest, since this isn't actually used anywhere yet.	2025-09-18 15:45:23 +00:00
Ramkumar Ramachandra	f68f3b9a7e	[VPlan] Allow zero-operand m_VPInstruction (NFC) (#159550 )	2025-09-18 12:40:31 +01:00
Ramkumar Ramachandra	0384f6c9db	[VPlanPatternMatch] Introduce match functor (NFC) (#159521 ) Follow up on 7fb3a91 ([PatternMatch] Introduce match functor) to introduce the VPlanPatternMatch version of the match functor to shorten some idioms. Co-authored-by: Luke Lau <luke@igalia.com>	2025-09-18 10:36:12 +01:00
Ramkumar Ramachandra	7fb3a91418	[PatternMatch] Introduce match functor (NFC) (#159386 ) A common idiom is the usage of the PatternMatch match function within a functional algorithm like all_of. Introduce a match functor to shorten this idiom. Co-authored-by: Luke Lau <luke@igalia.com>	2025-09-17 21:04:33 +01:00
Piotr Fusik	2ce04d0a41	[SLP][NFC] Refactor a long `if` into an early `return` (#156410 )	2025-09-17 18:31:46 +02:00
Hassnaa Hamdi	e8aa0b688a	[LV]: Ensure fairness when selecting epilogue VF. (#155547 ) Consider IC when deciding if epilogue profitable for scalable vectors, same as fixed-width vectors.	2025-09-17 14:48:10 +01:00
Sander de Smalen	17e008db17	[IR] NFC: Remove 'experimental' from partial.reduce.add intrinsic (#158637 ) The partial reduction intrinsics are no longer experimental, because they've been used in production for a while and are unlikely to change.	2025-09-17 11:44:47 +01:00
Mikhail Gudim	66a8f47066	[SLPVectorizer][NFC] Save stride in a map. (#157706 ) In order to avoid recalculating stride of strided load twice save it in a map.	2025-09-16 09:02:09 -04:00
Ramkumar Ramachandra	46fcece2a8	[VPlan] Extend CSE to eliminate GEPs (#156699 ) The motivation for this patch is to close the gap between the VPlan-based CSE and the legacy CSE, to make it easier to remove the legacy CSE. Before this patch, stubbing out the legacy CSE leads to 22 test failures, and after this patch, there are only 12 failures, and all of them seem to have a single root cause: VPlanTransforms::createInterleaveGroups() and VPInterleaveGroup::execute(). The improvements from this patch are of course welcome. While developing the patch, a miscompile was found when GEP source-element-types differ, and this has been fixed. Co-authored-by: Florian Hahn <flo@fhahn.com> Co-authored-by: Luke Lau <luke@igalia.com>	2025-09-16 10:14:32 +00:00
Florian Hahn	1858532c48	[VPlan] Handle predicated UDiv in VPReplicateRecipe::computeCost. Account for predicated UDiv,SDiv,URem,SRem in VPReplicateRecipe::computeCost: compute costs of extra phis and apply getPredBlockCostDivisor. Fixes https://github.com/llvm/llvm-project/issues/158660	2025-09-15 21:46:50 +01:00
Florian Hahn	4949cb4a5e	[VPlan] Track VPValues instead of VPRecipes in calculateRegisterUsage. (#155301 ) Update calculateRegisterUsageForPlan to track live-ness of VPValues instead of recipes. This gives slightly more accurate results for recipes that define multiple values (i.e. VPInterleaveRecipe). When tracking the live-ness of recipes, all VPValues defined by an VPInterleaveRecipe are considered alive until the last use of any of them. When tracking the live-ness of individual VPValues, we can accurately track the individual values until their last use. Note the changes in large-loop-rdx.ll and pr47437.ll. This patch restores the original behavior before introducing VPlan-based liveness tracking. PR: https://github.com/llvm/llvm-project/pull/155301	2025-09-15 20:55:11 +01:00
Alexey Bataev	f2301be0e8	[SLP]Add a check if the user itself is commutable If the commutable instruction can be represented as a non-commutable vector instruction (like add 0, %v can be represented as a part of sub nodes with operation sub %v, 0), its operands might still be reordered and this should be accounted when checking for copyables in operands Fixes #158293	2025-09-15 12:50:03 -07:00
Ramkumar Ramachandra	d012642be1	[VPlan] Match more GEP-like in m_GetElementPtr (#158019 ) The m_GetElementPtr matcher is incorrect and incomplete. Fix it to match all possible GEPs to avoid misleading users. It currently just has one use, and the change is non-functional for that use.	2025-09-15 20:06:37 +01:00
Ramkumar Ramachandra	148a83543b	[LV] Introduce m_One and improve (0\|1)-match (NFC) (#157419 )	2025-09-15 10:34:06 +00:00
Uyiosa Iyekekpolor	994a6a39e1	[VectorCombine] Fix scalarizeExtExtract for big-endian (#157962 ) The scalarizeExtExtract transform assumed little-endian lane ordering, causing miscompiles on big-endian targets such as AIX/PowerPC under -O3 -flto. This patch updates the shift calculation to handle endianness correctly for big-endian targets. No functional change for little-endian targets. Fixes #158197. --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-09-15 10:08:16 +00:00
Florian Hahn	fb60d0337c	[VPlan] Return non-option cost from getCostForRecipeWithOpcode (NFC). getCostForRecipeWithOpcode must only be called with supported opcodes. Directly return the cost, and add llvm_unreachable to catch unhandled cases.	2025-09-14 22:24:57 +01:00
Mikhail Gudim	ee3a4f4c94	[SLPVectorizer] Test -1 stride loads. (#158358 ) Add a test to generate -1 stride load and flags to force this behaviour.	2025-09-14 15:29:28 -04:00
Florian Hahn	91d4c0dfdf	Reapply "[VPlan] Compute cost of scalar (U\|S)Div, (U\|S)Rem in computeCost (NFCI)." This reverts commit 9490d58fa92bb338db96af331194c9ba26eb0201. Recommits de7e3a58952 with a fix for an unhandled case, causing crashes in some configs.	2025-09-14 13:15:07 +01:00
Aiden Grossman	9490d58fa9	Revert "[VPlan] Compute cost of scalar (U\|S)Div, (U\|S)Rem in computeCost (NFCI)." This reverts commit de7e3a589525179f3b02b84b194aac6cf581425c. This broke quite a few upstream buildbots and premerge. Reverting for now to get things back to green. https://lab.llvm.org/buildbot/#/builders/137/builds/25467	2025-09-13 22:32:48 +00:00
Florian Hahn	de7e3a5895	[VPlan] Compute cost of scalar (U\|S)Div, (U\|S)Rem in computeCost (NFCI). Directly compute the cost of UDiv, SDiv, URem, SRem in VPlan.	2025-09-13 22:09:06 +01:00
Florian Hahn	30e9cbacab	[VPlan] Move logic to compute scalarization overhead to cost helper(NFC) Extract the logic to compute the scalarization overhead to a helper for easy re-use in the future.	2025-09-13 20:41:44 +01:00
Florian Hahn	ef7e03a2d1	[VPlan] Limit ExtractLastElem fold to recipes guaranteed single-scalar. vputils::isSingleScalar(A) may return true for recipes that produce only a single scalar value, but they could still end up as vector instructions, because the recipe could not be converted to a single-scalar VPInstruction/VPReplicateRecipe. For now, only apply the fold for recipes guaranteed to produce a single value, i.e. single-scalar VPInstructions and VPReplicateRecipes. Fixes https://github.com/llvm/llvm-project/issues/158319.	2025-09-13 18:15:38 +01:00
Florian Hahn	b8eaceb39b	[VPlan] Explicitly replicate VPInstructions by VF. (#155102 ) Extend replicateByVF added in #142433 (aa240293190) to also explicitly unroll replicating VPInstructions. Now the only remaining case where we replicate for all lanes is VPReplicateRecipes in replicate regions. PR: https://github.com/llvm/llvm-project/pull/155102	2025-09-12 17:06:26 +01:00
Graham Hunter	54fc5367f6	[LV] Fix crash in uncountable exit with side effects checking Fixes an ICE reported on PR #145663, as an assert was found to be reachable with a specific combination of unreachable blocks.	2025-09-12 10:41:05 +00:00
Luke Lau	4bb250d6a3	[VPlan] Always consider register pressure on RISC-V (#156951 ) Stacked on #156923 In https://godbolt.org/z/8svWaredK, we spill a lot on RISC-V because whilst the largest element type is i8, we generate a bunch of pointer vectors for gathers and scatters. This means the VF chosen is quite high e.g. <vscale x 16 x i8>, but we end up using a bunch of <vscale x 16 x i64> m8 registers for the pointers. This was briefly fixed by #132190 where we computed register pressure in VPlan and used it to prune VFs that were likely to spill. The legacy cost model wasn't able to do this pruning because it didn't have visibility into the pointer vectors that were needed for the gathers/scatters. However VF pruning was restricted again to just the case when max bandwidth was enabled in #141736 to avoid an AArch64 regression, and restricted again in #149056 to only prune VFs that had max bandwidth enabled. On RISC-V we take advantage of register grouping for performance and choose a default of LMUL 2, which means there are 16 registers to work with – half the number as SVE, so we encounter higher register pressure more frequently. As such, we likely want to always consider pruning VFs with high register pressure and not just the VFs from max bandwidth. This adds a TTI hook to opt into this behaviour for RISC-V which fixes the motivating godbolt example above. When last checked this significantly reduces the number of spills on SPEC CPU 2017, up to 80% on 538.imagick_r.	2025-09-12 06:21:54 +00:00
Graham Hunter	e285602fda	[LV] Enforce addrec in current loop for uncountable exit load address check Addresses post-commit review raised for #145663	2025-09-11 11:18:22 +00:00
Hongyu Chen	c62ea6598e	[VectorCombine] Add Ext and Trunc support in foldBitOpOfCastConstant (#157822 ) Follow-up of https://github.com/llvm/llvm-project/pull/155216. This patch doesn't preserve the flags. I will implement it in the follow-up patch.	2025-09-11 17:08:47 +08:00
Garth Lei	8a8a810506	[SLP][NFC] Remove unused local variable in lambda (#156835 )	2025-09-11 02:05:55 +00:00
Elvis Wang	3e898bc40f	[LV] Fix cost misaligned when gather/scatter w/ addr is uniform. (#157387 ) This patch fixes the assertion that fires when `isUniform` (from the legacy model) and `isSingleScalar` (from the VPlan-based model) disagree. A simplified test that triggers the assertion: ``` loop: loadA = load %a => %a is loop invariant. loadB = load %loadA ... ``` The legacy cost model cannot determine that the address of `%loadB` is uniform, but in the VPlan-based cost model the addresses of both `%loadA` and `%loadB` are single-scalar. Full test that caused the crash: https://llvm.godbolt.org/z/zEG8YKjqh. --------- Co-authored-by: Luke Lau <luke@igalia.com>	2025-09-11 07:49:54 +08:00
Florian Hahn	1efa997317	[VPlan] Handle stores to single-scalar addr in narrowToSingleScalars. Move handling of stores to single-scalar/uniform address from replicateByVF to narrowToSingleScalar.	2025-09-10 21:58:29 +01:00
Florian Hahn	055e4ff35a	[VPlan] Don't narrow op multiple times in narrowInterleaveGroups. Track which ops already have been narrowed, to avoid narrowing the same operation multiple times. Repeated narrowing will lead to incorrect results, because we could first narrow from an interleave group -> wide load, and then narrow the wide load -> single-scalar load. Fixes https://github.com/llvm/llvm-project/issues/156190.	2025-09-10 19:22:42 +01:00
Alexey Bataev	0dddfab54c	[SLP]Recalculate deps if the original instruction scheduled after being copyable If the original instruction is going to be scheduled after the same instruction has been scheduled as copyable, need to recalculate dependencies. Otherwise, the dependencies may be calculated incorrectly.	2025-09-10 10:18:45 -07:00
Graham Hunter	3c810b76b9	[LV] Add initial legality checks for early exit loops with side effects (#145663 ) This adds initial support to LoopVectorizationLegality to analyze loops with side effects (particularly stores to memory) and an uncountable exit. This patch alone doesn't enable any new transformations, but does give clearer reasons for rejecting vectorization for such a loop. The intent is for a loop like the following to pass the specific checks, and only be rejected at the end until the transformation code is committed: ``` // Assume a is marked restrict // Assume b is known to be large enough to access up to b[N-1] for (int i = 0; i < N; ++i) { a[i]++; if (b[i] > threshold) break; } ```	2025-09-10 13:54:52 +01:00
Florian Hahn	c3e76b2770	[VPlan] Keep common flags during CSE. (#157664 ) During CSE, we don't have to drop all poison-generating flags on mis-match, we can keep the ones common on both recipes. PR: https://github.com/llvm/llvm-project/pull/157664	2025-09-10 10:20:48 +00:00
Stephen Tozer	d4f7995488	[VPlan] Use Unknown instead of empty location in VPlanTransforms (#157702 ) The default values for DebugLocs in LoopVectorizer/VPlan were recently updated from empty DebugLocs to DebugLoc::getUnknown, as part of the DebugLoc Coverage Tracking work. However, there are some cases where we also pass an explicit empty DebugLoc, in many cases as a filler argument. This patch updates all of these to `getUnknown` for now, until either valid locations or a suitable categorization can be assigned to each instead. This change is NFC outside of DebugLoc coverage tracking builds.	2025-09-10 10:33:58 +01:00
Mel Chen	4d9a7fa9ba	[VPlan] Remove dead recipes before simplifying blends (#157622 ) In simplifyBlends, when normalizing a blend recipe, the first mask that is used only by the blend and is not all-false is chosen, and its corresponding incoming value becomes the initial value, with the others blended into it. At the same time, the mask that is chosen can be eliminated. However, a multi-user mask might be used by a dead recipe, which prevents this optimization. This patch moves removeDeadRecipes before simplifyBlends to eliminate dead recipes, allowing simplifyBlends to remove more dead masks.	2025-09-10 08:03:18 +00:00
Florian Hahn	c4b17bf9ed	[VPlan] Slightly extend ExtractLastElement fold to single-scalars. Update ExtractLastElement fold to support single scalar recipes, if all their users only use scalars.	2025-09-09 22:08:08 +01:00
Ramkumar Ramachandra	5544afd253	[LoopUtils] Simplify expanded RT-checks (#157518 ) Follow up on 528b13d ([SCEVExp] Add helper to clean up dead instructions after expansion.) to hoist the SCEVExpander::eraseDeadInstructions call from LoopVectorize into the LoopUtils APIs add[Diff]RuntimeChecks, so that other callers (LoopDistribute and LoopVersioning) can benefit from the patch.	2025-09-09 11:38:54 +00:00
Florian Hahn	9b1b93766d	Reapply "[SCEVExp] Add helper to clean up dead instructions after expansion. (#157308 )" This reverts commit eeb43806eb1b40e690aeeba496ee974172202df9. Recommit with a fix for MSan failure ( https://lab.llvm.org/buildbot/#/builders/169/builds/14799), by adding a set to track deleted values. Using the InsertedInstructions set is not sufficient, as it uses asserting value handles as keys, which may dereference the value at construction. Original message: Add new helper to erase dead instructions inserted during SCEV expansion but not being used due to InstSimplifyFolder simplifications. Together with https://github.com/llvm/llvm-project/pull/157307 this also allows removing some specialized folds, e.g. https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp#L2205 PR: https://github.com/llvm/llvm-project/pull/157308	2025-09-09 09:47:41 +01:00
Florian Hahn	132bacde22	[VPlan] Also allow extracts as users when converting to single scalars. Extracts technically do not use scalars, but vectors, but if the operand is a single scalar we do not need a vector and they should not block forming single scalars.	2025-09-08 22:11:39 +01:00
Alexey Bataev	d0ea176cce	[SLP]Do not consider SExt/ZExt profitable for demotion, if the user is a bitcast to float If the user node of the SExt/ZExt node is a bitcast to a floating-point type, the node itself should not be considered legal to demote, since the cast is still required to match the size of the floating-point type. Fixes #157277	2025-09-08 07:59:01 -07:00
Florian Hahn	eeb43806eb	Revert "[SCEVExp] Add helper to clean up dead instructions after expansion. (#157308 )" This reverts commit 528b13df571c86a2c5b8305d7974f135d785e30f. Triggers MSan errors in some configurations, e.g. https://lab.llvm.org/buildbot/#/builders/169/builds/14799	2025-09-08 14:52:28 +01:00
Hongyu Chen	75b0c89e62	[InstCombine][VectorCombine][NFC] Unify uses of lossless inverse cast (#156597 ) This patch addresses https://github.com/llvm/llvm-project/pull/155216#discussion_r2297724663. This patch adds a helper function to put the inverse cast on constants, with cast flags preserved(optional). Follow-up patches will add trunc/ext handling on VectorCombine and flags preservation on InstCombine.	2025-09-08 13:30:06 +00:00
Simon Pilgrim	ad3a0ae9e1	[VectorCombine] foldSelectShuffle - early-out cases where the max vector register width isn't large enough (#157430 ) Technically this could happen with vector units that can't handle all legal scalar widths - but it's good enough to use a generic crash test without a suitable target. Fixes #157335	2025-09-08 12:04:23 +00:00
Florian Hahn	528b13df57	[SCEVExp] Add helper to clean up dead instructions after expansion. (#157308 ) Add new helper to erase dead instructions inserted during SCEV expansion but not being used due to InstSimplifyFolder simplifications. Together with https://github.com/llvm/llvm-project/pull/157307 this also allows removing some specialized folds, e.g. https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp#L2205 PR: https://github.com/llvm/llvm-project/pull/157308	2025-09-08 10:53:20 +01:00
Luke Lau	fe6e178401	[VPlan] Don't build recipes for unconditional switches (#157323 ) In #157322 we crash because we try to infer a type for a VPReplicate switch recipe. My understanding was that these switches should be removed by VPlanPredicator, but this switch survived through it because it was unconditional, i.e. had no cases other than the default case. This fixes #157322 by not emitting any recipes for unconditional switches to begin with, similar to how we treat unconditional branches.	2025-09-08 09:01:43 +00:00

6554 Commits