llvm-project

Author	SHA1	Message	Date
vporpo	1c207f1b6e	[SandboxVec][DAG] Fix DAG when old interval is mem free (#126983 ) This patch fixes a bug in `DependencyGraph::extend()` when the old interval contains no memory instructions. When this is the case we should do a full dependency scan of the new interval.	2025-02-12 15:06:30 -08:00
vporpo	31cb807537	[SandboxVec][BottomUpVec] Fix diamond shuffle with multiple vector inputs (#126965 ) When the operand comes from multiple inputs then we need additional packing code. When the operands are scalar then we can use a single InsertElementInst. But when the operands are vectors then we need a chain of ExtractElementInst and InsertElementInst instructions to insert the vector value into the destination vector. This is what this patch implements.	2025-02-12 14:33:05 -08:00
Vasileios Porpodas	e75e61728e	[SandboxVec] Fix warnings introduced by 7a7f9190d03e	2025-02-12 12:43:24 -08:00
vporpo	7a7f9190d0	[SandboxVec][Legality] Fix mask on diamond reuse with shuffle (#126963 ) This patch fixes a bug in the creation of shuffle masks when vectorizing vectors in case of a diamond reuse with shuffle. The mask needs to enumerate all elements of a vector, not treat the original vector value as a single element. That is: if vectorizing two <2 x float> vectors into a <4 x float> the mask needs to have 4 indices, not just 2.	2025-02-12 12:29:09 -08:00
vporpo	6d7a84d72b	[SandboxVec][Scheduler] Fix top of schedule (#126820 ) This patch fixes the way the top-of-schedule variable gets set and updated. Before this patch it used to get updated whenever we scheduled a bundle, which is wrong, as the top-of-schedule needs to be maintained across scheduling attempts. It should get reset only when we clear the schedule or when we destroy the current schedule and re-schedule.	2025-02-12 11:52:01 -08:00
Alexey Bataev	bb3d789dfe	[SLP][NFC]Improve dump of the ScheduleData, NFC	2025-02-12 06:51:30 -08:00
Alexey Bataev	e1935a2b15	Revert "[SLP][NFC]Improve dump of the ScheduleData, NFC" This reverts commit 108e6bca693e5f44d2d17da5a6e06203a0290de7 to fix error revealed by buildbots https://lab.llvm.org/buildbot/#/builders/159/builds/15888.	2025-02-12 06:34:27 -08:00
Alexey Bataev	108e6bca69	[SLP][NFC]Improve dump of the ScheduleData, NFC	2025-02-12 06:25:04 -08:00
David Sherwood	3e62321ed9	[LoopVectorize] Make collectInLoopReductions more efficient (#126769 ) We call collectInLoopReductions in multiple places asking the same question with exactly the same answer. For example, this was being called from a loop in calculateRegisterUsage and this patch hoists the call out to above the loop. In addition I've changed collectInLoopReductions so that it bails out if we've already built up a list.	2025-02-12 14:05:34 +00:00
Alexey Bataev	10844fb9b0	[SLP]Fix attempt to build the reorder mask for non-adjusted reuse mask When building the reorder for a non-single-use reuse mask, we need to check if the size of the mask is a multiple of the number of unique scalars. Otherwise, the compiler may crash when trying to reorder nodes. Fixes #126304	2025-02-11 13:41:25 -08:00
Kazu Hirata	042e860a8a	[Vectorize] Avoid repeated hash lookups (NFC) (#126681 )	2025-02-11 09:09:43 -08:00
Florian Hahn	e258bca950	[VPlan] Only skip expansion for SCEVUnknown if it isn't an instruction. (#125235 ) Update getOrCreateVPValueForSCEVExpr to only skip expansion of SCEVUnknown if the underlying value isn't an instruction. Instructions may be defined in a loop and using them without expansion may break LCSSA form. SCEVExpander will take care of preserving LCSSA if needed. We could also try to pass LoopInfo, but there are some users of the function where it won't be available and main benefit from skipping expansion is slightly more concise VPlans. Note that SCEVExpander is now used to expand SCEVUnknown with floats. Adjust the check in expandCodeFor to only check the types and casts if the type of the value is different to the requested type. Otherwise we crash when trying to expand a float and requesting a float type. Fixes https://github.com/llvm/llvm-project/issues/121518. PR: https://github.com/llvm/llvm-project/pull/125235	2025-02-11 13:03:12 +01:00
Florian Hahn	3706dfef66	[LV] Forget LCSSA phi with new pred before other SCEV invalidation. (#119897 ) `forgetLcssaPhiWithNewPredecessor` performs additional invalidation if there is an existing SCEV for the phi, but earlier `forgetBlockAndLoopDispositions` or `forgetLoop` may already invalidate the SCEV for the phi. Change the order to first call `forgetLcssaPhiWithNewPredecessor` to ensure it runs before its SCEV gets invalidated too eagerly. Fixes https://github.com/llvm/llvm-project/issues/119665. PR: https://github.com/llvm/llvm-project/pull/119897	2025-02-10 16:29:42 +00:00
Ricardo Jesus	5f84b6edd9	[AArch64] Add MATCH loops to LoopIdiomVectorizePass (#101976 ) This patch adds a new loop to LoopIdiomVectorizePass, enabling it to recognise and vectorise loops such as: ```cpp template<class InputIt, class ForwardIt> InputIt find_first_of(InputIt first, InputIt last, ForwardIt s_first, ForwardIt s_last) { for (; first != last; ++first) for (ForwardIt it = s_first; it != s_last; ++it) if (*first == *it) return first; return last; } ``` These loops match the C++ standard library function `std::find_first_of`.	2025-02-10 08:23:34 +00:00
Elvis Wang	2e3729bf40	[LV] Prevent query the computeCost() when VF=1 in emitInvalidCostRemarks(). (#117288 ) We should only query the computeCost() when the VF is vector.	2025-02-10 08:40:28 +08:00
Hassnaa Hamdi	e9a20f77ee	Reland "[LV]: Teach LV to recursively (de)interleave." (#125094 ) This patch relands the changes from "[LV]: Teach LV to recursively (de)interleave.#122989" Reason for revert: - The patch exposed an assert in the vectorizer related to a VF difference between the legacy cost model and the VPlan-based cost model, caused by the uncalculated cost of a VPInstruction that VPlanTransforms creates as a replacement for an 'or disjoint' instruction. VPlanTransforms makes that instruction change when there are memory interleaving and predicated blocks, but the change did not cause problems before because in most cases the cost difference between the legacy and new models is not noticeable. - Issue is fixed by #125434 Original patch: https://github.com/llvm/llvm-project/pull/89018 Reviewed-by: paulwalker-arm, Mel-Chen	2025-02-09 19:21:54 +00:00
Florian Hahn	32c4493d5f	[VPlan] Add incoming values for all predecessor to ResumePHI (NFCI). Follow-up as discussed when using VPInstruction::ResumePhi for all resume values (#112147). This patch explicitly adds incoming values for each predecessor in VPlan. This simplifies codegen and allows transformations adjusting the predecessors of blocks with NFC modulo incoming block order in phis.	2025-02-09 11:20:20 +00:00
vporpo	69b8cf4f06	[SandboxVec][BottomUpVec] Add cost estimation and tr-accept-or-revert pass (#126325 ) The TransactionAcceptOrRevert pass is the final pass in the Sandbox Vectorizer's default pass pipeline. Its job is to check the cost before/after vectorization and accept or revert the IR to its original state. Since we are now starting the transaction in BottomUpVec, tests that run a custom pipeline need to accept the transaction. This is done with the help of the TransactionAlwaysAccept pass (tr-accept).	2025-02-08 08:34:18 -08:00
Florian Hahn	6ff8a06de9	[VPlan] Run recipe removal and simplification after optimizeForVFAndUF. (#125926 ) Run recipe simplification and dead recipe removal after VPlan-based unrolling and optimizeForVFAndUF, to clean up any redundant or dead recipes introduced by them. Currently this is NFC, as it removes the corresponding removeDeadRecipes run in optimizeForVFAndUF and no additional simplifications kick in after unrolling yet. That is changing with https://github.com/llvm/llvm-project/pull/123655. Note that with this change, pattern-matching is now applied after EVL-based recipes have been introduced. Trying to match VPWidenEVLRecipe when not explicitly requested might apply a pattern with 2 operands to one with 3 due to the extra EVL operand and VPWidenEVLRecipe being a subclass of VPWidenRecipe. To prevent this, update Recipe_match::match to only match VPWidenEVLRecipe if it is in the requested recipe types (RecipeTy). PR: https://github.com/llvm/llvm-project/pull/125926	2025-02-08 13:33:46 +00:00
Florian Hahn	ee806646ad	[VPlan] Consistently use hasScalarVFOnly (NFC). Consistently use hasScalarVFOnly instead of using hasVF(ElementCount::getFixed(1)). Also add an assert to ensure all cases are covered by hasScalarVFOnly.	2025-02-08 12:19:25 +00:00
Florian Hahn	16df836a52	[VPlan] Mark hasVF & hasScalableVF as const (NFC).	2025-02-08 11:32:23 +00:00
Kazu Hirata	5901bda5a0	[Vectorize] Avoid repeated hash lookups (NFC) (#126345 )	2025-02-08 00:48:51 -08:00
Florian Hahn	1611059f5d	[VPlan] Compute cost for binary op VPInstruction with underlying values. (#125434 ) As exposed by https://github.com/llvm/llvm-project/pull/125094, we are missing cost computation for some binary VPInstructions we created based on original IR instructions. Their cost should be considered. PR: https://github.com/llvm/llvm-project/pull/125434	2025-02-07 15:27:31 +00:00
David Sherwood	3872e55758	[LoopVectorize] Fix build error (#126218 ) Fixes issue caused by 1930524bbde3cd26ff527bbdb5e1f937f484edd6 Unused variable UsesMask in LoopVectorize.cpp	2025-02-07 10:16:32 +00:00
David Sherwood	1930524bbd	[LoopVectorize] Fix cost model assert when vectorising calls (#125716 ) The legacy and vplan cost models did not agree because VPWidenCallRecipe::computeCost only calculates the cost of the call instruction, whereas LoopVectorizationCostModel::setVectorizedCallDecision in some cases adds on the cost of a synthesised mask argument. However, this mask is always 'splat(i1 true)' which should be hoisted out of the loop during codegen. In order to synchronise the two cost models I have two options: 1) Also add the cost of the splat to the vplan model, or 2) Remove the cost of the splat from the legacy model. I chose 2) because I feel this more closely represents what the final code will look like. There is an argument that we should take account of such broadcast costs in the preheader when deciding if it's profitable to vectorise a loop, however there isn't currently a mechanism to do this. We currently only take account of the runtime checks when assessing profitability and what the minimum trip count should be. However, I don't believe this work needs doing as part of this PR.	2025-02-07 09:36:52 +00:00
James Chesterman	ac158aa13b	[LoopVectorizer] Allow partial reductions to be made in predicated loops (#124268 ) Does a select on the input rather than the output. This way the mask has the same number of lanes as the other operand in the select instruction.	2025-02-07 09:09:10 +00:00
Mel Chen	4d3148d926	[LV][EVL] Fix the check for legality of folding with EVL. (#125678 ) The current legality check for folding with EVL has incomplete verification for VF. This patch fixes the VF check, ensuring that tail folding with EVL is enabled only when a scalable VF is available. This allows loops that prefer tail folding with EVL but cannot use scalable VF vectorization to still be vectorized using a fixed VF, rather than abandoning vectorization entirely.	2025-02-07 12:53:10 +08:00
Luke Lau	d0f122b9c5	[LV] Update incoming blocks in VPWidenPHIRecipe in reassociateBlocks (#125481 ) This is extracted from #118638 After c7ebe4f we will crash in fixNonInductionPHIs if we use a VPWidenPHIRecipe with the vector preheader as an incoming block, because the phi will reference the old non-IRBB vector preheader. This fixes this by updating VPBlockUtils::reassociateBlocks to update any VPWidenPHIRecipes's incoming blocks. This assumes that if the VPWidenPHIRecipe is in a VPRegionBlock, it's in the entry block, and that we are replacing a VPBasicBlock with another VPBasicBlock.	2025-02-07 08:50:35 +08:00
vporpo	a0d86b23c0	[SandboxVec][Scheduler] Notify scheduler about instruction creation (#126141 ) This patch implements the vectorizer's callback for getting notified about new instructions being created. This updates the scheduler state, which may involve removing dependent instructions from the ready list and updating the "scheduled" flag. Since we need to remove elements from the ready list, this patch also implements the `remove()` operation.	2025-02-06 15:45:44 -08:00
vporpo	166b2e8837	[SandboxVec][DAG] Update DAG when a new instruction is created (#126124 ) The DAG will now receive a callback whenever a new instruction is created and will update itself accordingly.	2025-02-06 14:12:03 -08:00
Florian Hahn	049aa179dc	[VPlan] Simplify operand tuple matching in VPlanPatternMatch (NFC). Remove some indirection when matching recipe and matcher operands by directly using fold over parameter pack.	2025-02-06 21:00:44 +00:00
vporpo	788c88e2f6	[SandboxVec][DependencyGraph] Fix dependency node iterators (#125616 ) This patch fixes a bug in the dependency node iterators that would incorrectly not skip nodes that are not in the current DAG. This resulted in iterators returning nullptr when dereferenced. The fix is to update the existing "skip" function to not only skip non-instruction values but also to skip instructions not in the DAG.	2025-02-06 12:30:49 -08:00
Simon Pilgrim	eb2b453eb7	[VectorCombine] foldInsExtVectorToShuffle - ensure we call getShuffleCost with the input operand type, not the result Typo in #121216 Fixes #126085	2025-02-06 17:41:24 +00:00
hanbeom	8c1dbac304	[VectorCombine] Allow shuffling between vectors the same type but different element sizes (#121216 ) `foldInsExtVectorToShuffle` function combines the extract/insert of a vector into a vector through a shuffle. However, we only supported coupling between vectors of the same size. This commit allows combining extract/insert for vectors of the same type but with different sizes by converting the length of the vectors. Proof: https://alive2.llvm.org/ce/z/ELNLr7 Fixed https://github.com/llvm/llvm-project/issues/120772	2025-02-06 10:38:50 +00:00
Florian Hahn	585b75ec9a	[VPlan] Simplify matching recipe ty and opcode in pattern match (NFC). Use parameter pack fold to simplify matching of recipe types and opcodes for RecipeTys parameter pack.	2025-02-05 20:03:42 +00:00
David Sherwood	f07cd36a5d	[LoopVectorize] Add the cost of VPInstruction::AnyOf to vplan (#125058 ) This patch adds an initial implementation of VPInstruction::computeCost with support for only one instruction so far - VPInstruction::AnyOf. This is only used when vectorising loops with uncountable early exits.	2025-02-05 16:31:14 +00:00
Mel Chen	8d037b9256	[LV][EVL] Skip tryAddExplicitVectorLength for plans with scalar VF. (#125497 ) Plans with a scalar VF should not be transformed into plans folded by EVL. TODO: Move the scalar VF check into `LoopVectorizationCostModel::foldTailWithEVL()`.	2025-02-05 15:02:33 +08:00
Alexey Bataev	7dca2c628c	[SLP]Gather scalarized calls If the calls won't be vectorized, but will be scalarized after vectorization, they should be built as buildvector nodes, not vector nodes. Vectorization of such calls leads to incorrect cost estimation and does not allow spill costs to be calculated correctly. Reviewers: lukel97, preames Reviewed By: preames Pull Request: https://github.com/llvm/llvm-project/pull/125070	2025-02-04 19:09:57 -05:00
Alexey Bataev	88e7b8b81c	[SLP]Use TTI::getScalarizationOverhead where possible Better to use TTI::getScalarizationOverhead instead of TTI::getVectorInstrCost to correctly calculate the costs of buildvectors/extracts. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/125725	2025-02-04 18:49:43 -05:00
Florian Hahn	7043895911	[VPlan] Remove dead VPBB argument from tryTo[Create]Widen[Recipe] (NFC) The functions now use VPBuilder to insert recipes and the VPBB argument is unused. Clean it up.	2025-02-04 21:40:07 +00:00
Alexey Bataev	fe7e280820	[SLP][NFC]Move functions definitions, NFC Move functions to use them later in the following patches	2025-02-04 07:19:18 -08:00
Min-Yih Hsu	635ab515d5	[VectorCombine] Fold vector.interleave2 with two constant splats (#125144 ) If we're interleaving 2 constant splats, for instance `<vscale x 8 x i32> <splat of 666>` and `<vscale x 8 x i32> <splat of 777>`, we can create a larger splat `<vscale x 8 x i64> <splat of ((777 << 32) | 666)>` first before casting it back into `<vscale x 16 x i32>`.	2025-02-03 19:05:49 -08:00
vporpo	c5f99e1bd4	[SandboxVec][Legality] Fix legality of SelectInst (#125005 ) SelectInsts need special treatment because they are not always straightforward to vectorize. This patch disables vectorization unless they are trivially vectorizable.	2025-02-03 17:33:09 -08:00
Florian Hahn	f8fa93193b	[LV] Add VPBuilder::insert, use to insert created vector pointer (NFC). Split off from https://github.com/llvm/llvm-project/pull/124432 as suggested. Adds VPBuilder::insert, inspired by IRBuilderBase.	2025-02-03 22:20:40 +00:00
Florian Hahn	30f3752e54	[VPlan] Only use SCEV for live-ins in tryToWiden. (#125436 ) Replacing a recipe with a live-in may not be correct in all cases, e.g. when replacing recipes involving header-phi recipes, like reductions. For now, only use SCEV to simplify live-ins. More powerful input simplification can be built on top of https://github.com/llvm/llvm-project/pull/124432 in the future. Fixes https://github.com/llvm/llvm-project/issues/119173. Fixes https://github.com/llvm/llvm-project/issues/125374. PR: https://github.com/llvm/llvm-project/pull/125436	2025-02-03 17:01:02 +00:00
Alexey Bataev	0c70a26f46	[SLP]Clear root node reordering only if the root node is not re-used in graph The reordering of the root node can be safely cleared only if the root node is not reused, otherwise the graph might be broken Fixes #125357	2025-02-03 06:05:19 -08:00
Simon Pilgrim	e3fbf19eb4	[SLP] getSpillCost - fully populate IntrinsicCostAttributes to improve cost analysis. (#124129 ) (REAPPLIED) We were only constructing the IntrinsicCostAttributes with the arg type info, and not the args themselves, preventing more detailed cost analysis (constant / uniform args etc.) Just pass the whole IntrinsicInst to the constructor and let it resolve everything it can. Noticed while having yet another attempt at #63980 Reapplied cleanup now that #125223 and #124984 have landed.	2025-02-03 09:55:41 +00:00
David Sherwood	50d5d06d38	[LoopVectorize][NFC] Cache the result of getVScaleForTuning (#124732 ) We currently call getVScaleForTuning in many places, doing a lot of work asking the same question with the same answer. I've refactored the code to cache the value if the max scalable VF != 0 and pull out the cached value from LoopVectorizationCostModel.	2025-02-03 09:49:26 +00:00
Martin Storsjö	d00579be39	Revert "[SLP]Reduce number of alternate instruction, where possible" This reverts commit d5a7a483a65f830a0c7a931781bc90046dc67ff4. That commit triggers failed asserts, see https://github.com/llvm/llvm-project/pull/123360 for details.	2025-02-02 15:56:08 +02:00
Florian Hahn	5008277322	[VPlan] Move auxiliary declarations out of VPlan.h (NFC). (#124104 ) Nothing in VPlan.h directly depends on VPTransformState, VPCostContext, VPFRange, VPlanPrinter or VPSlotTracker. Move them out to a separate header to reduce the size of widely used VPlan.h. This is a first step towards more cleanly separating declarations in VPlan. Besides reducing VPlan.h's size, this also allows including additional VPlan-related headers in VPlanHelpers.h for use there. An example is using VPDominatorTree in VPTransformState (https://github.com/llvm/llvm-project/pull/117138). PR: https://github.com/llvm/llvm-project/pull/124104	2025-02-02 13:44:07 +00:00
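
The arithmetic behind the "[VectorCombine] Fold vector.interleave2 with two constant splats" entry above can be checked with a small standalone sketch. This is not the LLVM implementation: `packSplats`, `interleave2` and `splitWide` are hypothetical helper names, fixed-length `std::vector`s stand in for scalable vectors, and a little-endian lane order is assumed (the low 32 bits of each 64-bit element become the even lane).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pack two 32-bit splat values into one 64-bit element.
// Assumes little-endian lane order, so `even` occupies the low 32 bits.
constexpr std::uint64_t packSplats(std::uint32_t even, std::uint32_t odd) {
  return (static_cast<std::uint64_t>(odd) << 32) | even;
}

// Reference semantics of interleaving two equally sized vectors:
// the result is a[0], b[0], a[1], b[1], ...
std::vector<std::uint32_t> interleave2(const std::vector<std::uint32_t> &a,
                                       const std::vector<std::uint32_t> &b) {
  std::vector<std::uint32_t> out;
  out.reserve(a.size() * 2);
  for (std::size_t i = 0; i < a.size(); ++i) {
    out.push_back(a[i]);
    out.push_back(b[i]);
  }
  return out;
}

// Model of casting a vector of 64-bit elements back to 32-bit elements
// under the little-endian assumption: low half first, then high half.
std::vector<std::uint32_t> splitWide(const std::vector<std::uint64_t> &wide) {
  std::vector<std::uint32_t> out;
  out.reserve(wide.size() * 2);
  for (std::uint64_t w : wide) {
    out.push_back(static_cast<std::uint32_t>(w & 0xFFFFFFFFu));
    out.push_back(static_cast<std::uint32_t>(w >> 32));
  }
  return out;
}

// The packed constant from the commit message, evaluated at compile time.
static_assert(packSplats(666, 777) == ((777ULL << 32) | 666ULL),
              "wide splat value matches (777 << 32) | 666");
```

Under these assumptions, splitting a wide splat of `packSplats(666, 777)` yields the same lanes as `interleave2` applied to a splat of 666 and a splat of 777, which is why a single wide splat followed by a cast can replace the interleave.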

5570 Commits