llvm-project

Author	SHA1	Message	Date
Hassnaa Hamdi	9491f75e1d	Reland: [LV]: Teach LV to recursively (de)interleave. (#122989 ) This commit relands the changes from "[LV]: Teach LV to recursively (de)interleave. #89018" Reason for revert: - The patch exposed a bug in the IA pass, the bug is now fixed and landed by commit: #122643	2025-01-17 10:34:57 +00:00
Mel Chen	9720be95d6	[LV][EVL] Disable fixed-order recurrence idiom with EVL tail folding. (#122458 ) Currently, llvm.splice may produce unexpected behavior if the EVL of the second-to-last iteration is not VF*UF. Issue #122461	2025-01-17 16:55:35 +08:00
vporpo	e902c6960c	[SandboxVec][BottomUpVec] Implement InstrMaps (#122848 ) InstrMaps is a helper data structure that maps scalars to vectors and the reverse. This is used by the vectorizer to figure out which vectors it can extract scalar values from.	2025-01-16 15:26:35 -08:00
Luke Lau	5c15caa83f	[VPlan] Verify scalar types in VPlanVerifier. NFCI (#122679 ) VTypeAnalysis contains some assertions which can be useful for reasoning that the types of various operands match. This patch teaches VPlanVerifier to invoke VTypeAnalysis to check them, and catches some issues with VPInstruction types that are also fixed here: * Handles the missing cases for CalculateTripCountMinusVF, CanonicalIVIncrementForPart and AnyOf * Fixes ICmp and ActiveLaneMask to return i1 (to align with `icmp` and `@llvm.get.active.lane.mask` in the LangRef) The VPlanVerifier unit tests also need to be fleshed out a bit more to satisfy the stricter assertions	2025-01-16 18:57:08 +08:00
Vasileios Porpodas	acf6072fae	Reapply "[SandboxVec][Interval][NFC] Move a few definitions from header to .cpp" This reverts commit 069fbeb82f56f0ce7c0382dfd5d4fa4dc1983a13.	2025-01-15 16:38:37 -08:00
Vasileios Porpodas	069fbeb82f	Revert "[SandboxVec][Interval][NFC] Move a few definitions from header to .cpp" This reverts commit 24c603505f91b2979d13e0b963fbd3c0174a005f.	2025-01-15 15:30:19 -08:00
Vasileios Porpodas	24c603505f	[SandboxVec][Interval][NFC] Move a few definitions from header to .cpp	2025-01-15 15:23:28 -08:00
Florian Hahn	ef1260acc0	[VPlan] Make VPBlock constructors private (NFC). 16d19aaed moved to manage block creation via VPlan directly, with VPlan owning the created blocks. Follow up to make the VPBlock constructors private, to require creation via VPlan helpers and thus preventing issues due to manually constructing blocks.	2025-01-15 21:34:24 +00:00
David Sherwood	edc02351dd	[NFC][LoopVectorize] Add more loop early exit asserts (#122732 ) This patch is split off #120567, adding asserts in addScalarResumePhis and addExitUsersForFirstOrderRecurrences that the loop does not contain an uncountable early exit, since the code cannot yet handle them correctly.	2025-01-15 14:29:06 +08:00
LiqinWeng	0294dab79e	[LV][VPlan] Add fast flags for selectRecipe (#121023 ) Change class VPWidenSelectRecipe to inherit from class VPRecipeWithIRFlags, which allows the select recipe to carry fast-math flags. The patch in #119847 will add the fast-math flags to the recipe.	2025-01-15 10:10:11 +08:00
Florian Hahn	1de3dc7d23	[LV] Bail out early if BTC+1 wraps. Currently we fail to detect the case where BTC + 1 wraps, i.e. the vector trip count is 0. In those cases, the minimum iteration count check will fail, and the vector code will never be executed. Explicitly check for this condition in computeMaxVF and avoid trying to vectorize altogether. Note that a number of tests needed to be updated, because the vector loop would never be executed given the input IR. Fixes https://github.com/llvm/llvm-project/issues/122558.	2025-01-14 22:07:38 +00:00
Simon Pilgrim	87750c9de4	[VectorCombine] foldPermuteOfBinops - match identity shuffles only if they match the destination type Fixes regression identified after #122118	2025-01-14 16:09:50 +00:00
Luke Lau	f925e54554	[VPlan] Fix mutating whilst iterating over users in EVL transform (#122885 ) This fixes a miscompilation extracted from 525.x264_r, where we were failing to update the runtime VF of a VPReverseVectorPointerRecipe. We were removing a use of VF whilst iterating over the users() iterator, which messed up the iterator in-flight and caused us to miss some recipes. This fixes it by copying the users into a SmallVector first. Fixes #122681 Fixes #122682	2025-01-14 22:17:51 +08:00
Simon Pilgrim	6a9e9878a2	[VectorCombine] foldPermuteOfBinops - ensure potential identity mask isn't length changing.	2025-01-14 12:17:21 +00:00
Ramkumar Ramachandra	e409204a89	VectorCombine: teach foldExtractedCmps about samesign (#122883 ) Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid of one of the FIXMEs it introduced by replacing a predicate comparison with CmpPredicate::getMatching.	2025-01-14 12:04:14 +00:00
Ramkumar Ramachandra	0fe8469e08	SLPVectorizer: strip bad FIXME (NFC) (#122888 ) Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid of the FIXME it introduced in SLPVectorizer: the FIXME is bad, and we'd get no testable impact by using CmpPredicate::getMatching here.	2025-01-14 11:27:55 +00:00
Simon Pilgrim	0bf1591d01	[VectorCombine] foldPermuteOfBinops - fold "shuffle (binop (shuffle, other)), undef" --> "binop (shuffle), (shuffle)". (#122118 ) foldPermuteOfBinops currently requires both binop operands to be one-use shuffles to fold the shuffles across the binop, but there will be cases where it's still profitable to fold across the binop with only one foldable shuffle.	2025-01-14 10:43:22 +00:00
Luke Lau	cb2560d33b	[VPlan] Verify plan before optimizations. NFC (#122678 ) I've been exploring verifying the VPlan before and after the EVL transformation steps, and noticed that the VPlan comes out in an invalid state between construction and optimisation. In adjustRecipesForReductions, we leave behind some dead recipes which are invalid: 1) When we replace a link with a reduction recipe, the old link ends up becoming a use-before-def: WIDEN ir<%l7> = add ir<%sum.02>, ir<%indvars.iv>.1 WIDEN ir<%l8> = add ir<%l7>.1, ir<%l3> WIDEN ir<%l9> = add ir<%l8>.1, ir<%l5> ... REDUCE ir<%l7>.1 = ir<%sum.02> + reduce.add (ir<%indvars.iv>.1) REDUCE ir<%l8>.1 = ir<%l7>.1 + reduce.add (ir<%l3>) REDUCE ir<%l9>.1 = ir<%l8>.1 + reduce.add (ir<%l5>) 2) When transforming an AnyOf reduction phi to a boolean, we leave behind a select with mismatching operand types, which will trigger the assertions in VTypeAnalysis after #122679 This adds an extra verification step and deletes the dead recipes eagerly to keep the plan valid.	2025-01-14 12:44:24 +08:00
vporpo	7c51c310ad	[SandboxVec][BottomUpVec] Clean up dead address instrs (#122536 ) When we vectorize loads or stores we only keep the address of the first lane. The rest may become dead. This patch adds the address operands of vectorized loads or stores to the dead candidates set, such that they get erased if dead.	2025-01-13 18:25:25 -08:00
offsake	83be69cf9a	[VPlan][Coverity] Fix coverity CID1579964. (#121805 ) Fix for the Coverity hit with CID1579964 in VPlan.cpp. Coverity message with some context follows. [Cov] var_compare_op: Comparing TermBr to null implies that TermBr might be null. 434 } else if (TermBr && !TermBr->isConditional()) { 435 TermBr->setSuccessor(0, NewBB); 436 } else { 437 // Set each forward successor here when it is created, excluding 438 // backedges. A backward successor is set when the branch is created. 439 unsigned idx = PredVPSuccessors.front() == this ? 0 : 1; [Cov] CID 1579964: (#1 of 1): Dereference after null check (FORWARD_NULL) [Cov] var_deref_model: Passing null pointer TermBr to getSuccessor, which dereferences it.	2025-01-13 20:29:51 +00:00
Alexey Bataev	066b88879a	[SLP]Correctly set vector operand for extracts with poisons When extracts are vectorized and some of them are poison values instead of instructions, the vectorized operand must be set not to poison, but to the main vector operand of the main extract instruction. Fixes #122583	2025-01-13 10:57:07 -08:00
Alexey Bataev	092d628383	[SLP]Check for div/rem instructions before extending with poisons Need to check if the instructions can be safely extended with poison before actually doing this to avoid incorrect transformations. Fixes #122691	2025-01-13 09:28:27 -08:00
Alexey Bataev	af524de1fa	[SLP]Do not include subvectors for fully matched buildvectors If the buildvector node fully matches another node, subvectors must be excluded when building the final shuffle; only a shuffle of the original node must be emitted. Fixes #122584	2025-01-13 07:24:16 -08:00
Mel Chen	3397950f2d	[LV] Fix FindLastIV reduction for epilogue vectorization. (#120395 ) Following 0e528ac404e13ed2d952a2d83aaf8383293c851e, this patch adjusts the resume value of VPReductionPHIRecipe for FindLastIV reductions. Replacing the resume value with: ResumeValue = ResumeValue == StartValue ? SentinelValue : ResumeValue; This addressed the correctness issue when the start value might not be less than the minimum value of a monotonically increasing induction variable. Thanks Florian Hahn for the help. --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2025-01-13 20:58:38 +08:00
Sam Tebbs	795e35a653	Reland "[LoopVectorizer] Add support for partial reductions" with non-phi operand fix. (#121744 ) This relands the reverted #120721 with a fix for cases where neither reduction operand is the reduction phi. Only 63114239cc8d26225a0ef9920baacfc7cc00fc58 and 63114239cc8d26225a0ef9920baacfc7cc00fc58 are new on top of the reverted PR. --------- Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>	2025-01-13 11:20:35 +00:00
Mel Chen	56a37a3c76	[SLPVectorizer] Refactor HorizontalReduction::createOp (NFC) (#121549 ) This patch simplifies select-based integer min/max reductions by utilizing `llvm::getMinMaxReductionPredicate`, and generates intrinsic-based min/max reductions by utilizing `llvm::getMinMaxReductionIntrinsicOp`.	2025-01-13 16:11:31 +08:00
Florian Hahn	8df64ed777	[LV] Don't consider IV increments uniform if exit value is used outside. In some cases, there might be a chain of uniform instructions producing the exit value. To generate correct code in all cases, consider the IV increment not uniform, if there are users outside the loop. Instead, let VPlan narrow the IV, if possible using the logic from 3ff1d01985752. Test case from #122602 verified with Alive2: https://alive2.llvm.org/ce/z/bA4EGj Fixes https://github.com/llvm/llvm-project/issues/122496. Fixes https://github.com/llvm/llvm-project/issues/122602.	2025-01-12 22:03:21 +00:00
Florian Hahn	3ff1d01985	Recommit "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 0ebb3ac7c92c4c1c44e7f3d17832d75ec5a42a67. Re-applies commit with typos fixed.	2025-01-12 20:10:28 +00:00
Florian Hahn	0ebb3ac7c9	Revert "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 1afba19913253dda865a8e57b37b9f4dabead1ac. Typo breaking the build	2025-01-12 19:37:45 +00:00
Florian Hahn	1afba19913	[VPlan] Try to narrow wide and replicating recipes to uniform recipes. Use the existing VPlan-based analysis to identify recipes that only have their first lane demanded and transform them to uniform replicate recipes. This simplifies the generated code in some places and prepares for fixing https://github.com/llvm/llvm-project/issues/122496.	2025-01-12 19:32:01 +00:00
Florian Hahn	7f59b4e998	[VPlan] Skip non-induction phi recipes in legalizeAndOptimizeInductions. The body of the loop only applies to wide induction recipes; skip any other header phi recipes up-front.	2025-01-11 20:33:02 +00:00
Vasileios Porpodas	25b90c4ef6	[SandboxVec][SeedCollector][NFC] Remove redundant 'else' and move the assertion within the 'if'	2025-01-10 14:54:44 -08:00
vporpo	9248428db7	[SandboxVec][DAG][NFC] Refactor setNextNode() and setPrevNode() (#122363 ) This patch updates DAG's `setNextNode()` and `setPrevNode()` to update both nodes of the link.	2025-01-10 13:32:33 -08:00
Han-Kuan Chen	35e76b6a4f	Revert "[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 )" This reverts commit f3d6cdc5aebafac3961d4fccbd2ca0e302c6082c.	2025-01-10 10:09:54 -08:00
Alexey Bataev	681c83a2f9	[SLP]Fix mask generation after cost estimation When estimating the cost of entries shuffles for buildvectors, need to rebuild original mask, not a generated submask, used for subregisters analysis. Fixes #122430	2025-01-10 09:32:35 -08:00
Alex MacLean	986f2ac48f	[SLPVectorizer] minor tweaks around lambdas for compatibility with older compilers (#122348 ) Older versions of MSVC do not have great lambda support and are not able to handle uses of class data or lambdas with implicit return types in some cases. These minor changes improve the source's compatibility with older MSVC and don't hurt readability either.	2025-01-10 09:18:28 -08:00
Alexey Bataev	3c9c94a24f	Revert "[SLP]Fix mask generation after cost estimation" This reverts commit 547ba9730bf05df3383150f730a689f2c8336206 to fix buildbots reported in https://lab.llvm.org/buildbot/#/builders/123/builds/11370, https://lab.llvm.org/buildbot/#/builders/133/builds/9492	2025-01-10 08:46:42 -08:00
Alexey Bataev	547ba9730b	[SLP]Fix mask generation after cost estimation When estimating the cost of entries shuffles for buildvectors, need to rebuild original mask, not a generated submask, used for subregisters analysis. Fixes #122430	2025-01-10 08:17:56 -08:00
Mel Chen	e0f14e11c7	[SLPVectorizer] Refine the scope of RdxOpcode in HorizontalReduction::createOp (NFC) (#122239 ) This patch is one part of unifying IAnyOf and FAnyOf reduction. #118393 The related patch is #118777.	2025-01-10 16:01:36 +08:00
Han-Kuan Chen	f3d6cdc5ae	[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 ) Add TreeEntry::hasState. Add assert for getTreeEntry. Remove the OpValue parameter from the canReuseExtract function. Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.	2025-01-09 23:41:52 -08:00
Han-Kuan Chen	5454ac28b3	Revert "[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 )" This reverts commit 760f550de25792db83cd39c88ef57ab6d80a41a0.	2025-01-09 18:41:47 -08:00
Han-Kuan Chen	36b423e0f8	[SLP] NFC. Refactor getSameOpcode and reduce for loop iterations. (#122241 ) Replace Cnt and AltIndex with MainOp and AltOp. Reduce the number of iterations in the for loop.	2025-01-10 09:06:07 +08:00
Han-Kuan Chen	760f550de2	[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 ) Add TreeEntry::hasState. Add assert for getTreeEntry. Remove the OpValue parameter from the canReuseExtract function. Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.	2025-01-10 09:05:39 +08:00
Florian Hahn	7ffb691595	[VPlan] Remove dead ToRemove (NFC).	2025-01-09 22:02:32 +00:00
vporpo	6312beef78	[SandboxVec][BottomUpVec] Use SeedCollector and slice seeds (#120826 ) With this patch we switch from the temporary dummy seeds to actual seeds provided by the seed collector. The seeds get sliced and each slice is used as the starting point for vectorization.	2025-01-09 11:53:48 -08:00
Alexey Bataev	5ff36748cf	[SLP]Fix mask processing for reused gathered scalars Need to sync the mask between cost and actual emission to avoid bugs in mask calculation Fixes #122324	2025-01-09 11:24:48 -08:00
Florian Hahn	b0697dc1de	[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994 ) Currently we emit early-exit related debug messages/remarks even when there is a single exit. Update to only check isVectorizableEarlyExitLoop if there isn't a single exit block. PR: https://github.com/llvm/llvm-project/pull/121994	2025-01-09 12:05:19 +00:00
Benjamin Maxwell	f88ef1bd1b	[LV] Teach LoopVectorizationLegality about struct vector calls (#119221 ) This is a split-off from #109833 and only adds code relating to checking if a struct-returning call can be vectorized. This initial patch only allows the case where all users of the struct return are `extractvalue` operations that can be widened. ``` %call = tail call { float, float } @foo(float %in_val) %extract_a = extractvalue { float, float } %call, 0 %extract_b = extractvalue { float, float } %call, 1 ``` Note: The tests require the VFABI changes from #119000 to pass.	2025-01-09 09:27:29 +00:00
Alexey Bataev	5b76a2e51b	[SLP]Correctly calculate mask for the inserted vector	2025-01-08 15:18:06 -08:00
Alexey Bataev	0d921f96d4	[SLP][NFC]Introduce and use createInsertVector helper function, NFC	2025-01-08 14:26:13 -08:00
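
Commit 1de3dc7d23 above bails out when BTC + 1 wraps. The sketch below illustrates only that wrap condition, not the code in computeMaxVF. It assumes a 32-bit unsigned backedge-taken count, and the helper names tripCountFromBTC and btcPlusOneWraps are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: trip count derived from the backedge-taken count (BTC).
// Unsigned arithmetic wraps modulo 2^32, so BTC == UINT32_MAX yields 0.
uint32_t tripCountFromBTC(uint32_t BTC) {
  return static_cast<uint32_t>(BTC + 1u);
}

// Hypothetical check: true when BTC + 1 wrapped around to 0, which is the
// case the commit message describes as "the vector trip count is 0".
bool btcPlusOneWraps(uint32_t BTC) {
  return tripCountFromBTC(BTC) == 0;
}
```

Per the commit message, a wrapped count makes the minimum iteration count check fail, so the vector code would never run; detecting the wrap up front avoids vectorizing at all.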


5436 Commits
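
Commit f925e54554 above fixes a bug caused by removing a use while iterating over users(), and does so by copying the users into a SmallVector first. The sketch below shows the same snapshot-then-mutate idea using std::vector. Def, Node and replaceAllUses are hypothetical stand-ins, not the VPlan classes or the actual patch.

```cpp
#include <algorithm>
#include <vector>

struct Node;

// Hypothetical definition that tracks the nodes using it.
struct Def {
  std::vector<Node *> Users;
  void removeUser(Node *N) {
    Users.erase(std::remove(Users.begin(), Users.end(), N), Users.end());
  }
};

// Hypothetical user with a single operand.
struct Node {
  Def *Operand = nullptr;
  // Re-pointing the operand mutates the old definition's user list.
  void setOperand(Def *NewDef) {
    if (Operand)
      Operand->removeUser(this);
    Operand = NewDef;
    if (NewDef)
      NewDef->Users.push_back(this);
  }
};

// Rewrites every user of Old to use New and returns how many were rewritten.
// Iterating Old.Users directly would be unsafe, because setOperand erases
// from that same vector; iterating a copy keeps the traversal stable.
int replaceAllUses(Def &Old, Def &New) {
  std::vector<Node *> Snapshot(Old.Users.begin(), Old.Users.end());
  int Rewritten = 0;
  for (Node *N : Snapshot) {
    N->setOperand(&New);
    ++Rewritten;
  }
  return Rewritten;
}
```

The copy costs one allocation per rewrite, which is usually an acceptable price for not skipping users when the underlying list shrinks mid-loop.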