llvm-project

Author	SHA1	Message	Date
Florian Hahn	baa77e30f0	[LV] Remove some redundant casts (NFC).	2025-02-24 21:46:29 +00:00
Luke Lau	e23ab73335	[VPlan] Don't convert widen recipes to VP intrinsics in EVL transform (#127180 ) This is a copy of #126177, since it was automatically and permanently closed because I messed up the source branch on my remote. This patch proposes to avoid converting widening recipes to VP intrinsics during the EVL transform. IIUC we initially did this to avoid `vl` toggles on RISC-V. However we now have the RISCVVLOptimizer pass which mostly makes this redundant. Emitting regular IR instead of VP intrinsics allows more generic optimisations, both in the middle end and DAGCombiner, and we generally have better patterns in the RISC-V backend for non-VP nodes. Sticking to regular IR instructions is likely a lot less work than reimplementing all of these optimisations for VP intrinsics, and on SPEC CPU 2017 we get noticeably better code generation.	2025-02-22 19:38:11 +08:00
Florian Hahn	6ff8a06de9	[VPlan] Run recipe removal and simplification after optimizeForVFAndUF. (#125926 ) Run recipe simplification and dead recipe removal after VPlan-based unrolling and optimizeForVFAndUF, to clean up any redundant or dead recipes introduced by them. Currently this is NFC, as it removes the corresponding removeDeadRecipes run in optimizeForVFAndUF and no additional simplifications kick in after unrolling yet. That is changing with https://github.com/llvm/llvm-project/pull/123655. Note that with this change, pattern-matching is now applied after EVL-based recipes have been introduced. Trying to match VPWidenEVLRecipe when not explicitly requested might apply a pattern with 2 operands to one with 3 due to the extra EVL operand and VPWidenEVLRecipe being a subclass of VPWidenRecipe. To prevent this, update Recipe_match::match to only match VPWidenEVLRecipe if it is in the requested recipe types (RecipeTy). PR: https://github.com/llvm/llvm-project/pull/125926	2025-02-08 13:33:46 +00:00
Florian Hahn	ee806646ad	[VPlan] Consistently use hasScalarVFOnly (NFC). Consistently use hasScalarVFOnly instead of using hasVF(ElementCount::getFixed(1)). Also add an assert to ensure all cases are covered by hasScalarVFOnly.	2025-02-08 12:19:25 +00:00
Florian Hahn	5008277322	[VPlan] Move auxiliary declarations out of VPlan.h (NFC). (#124104 ) Nothing in VPlan.h directly depends on VPTransformState, VPCostContext, VPFRange, VPlanPrinter or VPSlotTracker. Move them out to a separate header to reduce the size of widely used VPlan.h. This is a first step towards more cleanly separating declarations in VPlan. Besides reducing VPlan.h's size, this also allows including additional VPlan-related headers in VPlanHelpers.h for use there. An example is using VPDominatorTree in VPTransformState (https://github.com/llvm/llvm-project/pull/117138). PR: https://github.com/llvm/llvm-project/pull/124104	2025-02-02 13:44:07 +00:00
David Sherwood	3bc2dade36	[LoopVectorize] Enable vectorisation of early exit loops with live-outs (#120567 ) This work feeds part of PR https://github.com/llvm/llvm-project/pull/88385, and adds support for vectorising loops with uncountable early exits and outside users of loop-defined variables. When calculating the final value from an uncountable early exit we need to calculate the vector lane that triggered the exit, and hence determine the value at the point we exited. All code for calculating the last value when exiting the loop early now lives in a new vector.early.exit block, which sits between the middle.split block and the original exit block. Doing this required two fixes: 1. The vplan verifier incorrectly assumed that the block containing a definition always dominates the block of the user. That's not true if you can arrive at the use block from multiple incoming blocks. This is possible for early exit loops where both the early exit and the latch jump to the same block. 2. We were adding the new vector.early.exit to the wrong parent loop. It needs to have the same parent as the actual early exit block from the original loop. I've added a new ExtractFirstActive VPInstruction that extracts the first active lane of a vector, i.e. the lane of the vector predicate that triggered the exit. NOTE: The IR generated for dealing with live-outs from early exit loops is unoptimised, as opposed to normal loops. This inevitably leads to poor quality code, but this can be fixed up later.	2025-01-30 10:37:00 +00:00
Florian Hahn	2b55ef187c	[VPlan] Add helper to run VPlan passes, verify after run (NFC). (#123640 ) Add new runPass helpers to run a VPlan transformation. This makes it easier to add additional checks/functionality for each transform run. In this patch, an option is added to run the verifier after each VPlan transform. Follow-ups will use the same helper to also support printing VPlans after each transform. Note that the verifier at the moment requires there to be a canonical IV and vector loop region, so the final lowering transforms aren't run via runPass yet. PR: https://github.com/llvm/llvm-project/pull/123640	2025-01-29 10:50:01 +00:00
Florian Hahn	09a29fcc8d	[VPlan] Don't collect live-ins in collectUsersInExitBlocks. (NFC) (#123819 ) Live-ins don't need to be handled, other than adding to the exit phi recipe. Do that early and assert that otherwise the exit value is defined in the vector loop region. This should enable simply skipping other exit values that do not need further fixing, e.g. if handling the exit value from the early exit directly in handleUncountableEarlyExit. PR: https://github.com/llvm/llvm-project/pull/123819	2025-01-27 16:12:07 +00:00
Florian Hahn	2c87133c62	Reapply "[VPlan] Update final IV exit value via VPlan. (#112147 )" This reverts the revert commit 58326f1d5b5b379590af92dd129b2f3b3e96af46. The build failure in sanitizer stage2 builds has been fixed with 0d39fe6f5bb3edf0bddec09a8c6417377390aeac. Original commit message: Model updating IV users directly in VPlan, replace fixupIVUsers. Now simple extracts are created for all phis in the exit block during initial VPlan construction. A later VPlan transform (optimizeInductionExitUsers) replaces extracts of inductions with their pre-computed values if possible. This completes the transition towards modeling all live-outs directly in VPlan. There are a few follow-ups: * emit extracts initially also for resume phis, and optimize them together with IV exit users * support for VPlans with multiple exits in optimizeInductionExitUsers. Depends on https://github.com/llvm/llvm-project/pull/110004, https://github.com/llvm/llvm-project/pull/109975 and https://github.com/llvm/llvm-project/pull/112145.	2025-01-19 19:32:03 +00:00
Florian Hahn	58326f1d5b	Revert "[VPlan] Update final IV exit value via VPlan. (#112147 )" This reverts commit c2d15ac4d4432788557e77c15ce572ac655a8fec. Causes build failures on PPC stage2 & fuchsia bots https://lab.llvm.org/buildbot/#/builders/168/builds/7650 https://lab.llvm.org/buildbot/#/builders/11/builds/11248	2025-01-18 13:40:33 +00:00
Florian Hahn	c2d15ac4d4	[VPlan] Update final IV exit value via VPlan. (#112147 ) Model updating IV users directly in VPlan, replace fixupIVUsers. Now simple extracts are created for all phis in the exit block during initial VPlan construction. A later VPlan transform (optimizeInductionExitUsers) replaces extracts of inductions with their pre-computed values if possible. This completes the transition towards modeling all live-outs directly in VPlan. There are a few follow-ups: * emit extracts initially also for resume phis, and optimize them together with IV exit users * support for VPlans with multiple exits in optimizeInductionExitUsers. Depends on https://github.com/llvm/llvm-project/pull/110004, https://github.com/llvm/llvm-project/pull/109975 and https://github.com/llvm/llvm-project/pull/112145.	2025-01-18 13:22:34 +00:00
Luke Lau	f925e54554	[VPlan] Fix mutating whilst iterating over users in EVL transform (#122885 ) This fixes a miscompilation extracted from 525.x264_r, where we were failing to update the runtime VF of a VPReverseVectorPointerRecipe. We were removing a use of VF whilst iterating over the users() iterator, which messed up the iterator in-flight and caused us to miss some recipes. This fixes it by copying the users into a SmallVector first. Fixes #122681 Fixes #122682	2025-01-14 22:17:51 +08:00
Florian Hahn	8df64ed777	[LV] Don't consider IV increments uniform if exit value is used outside. In some cases, there might be a chain of uniform instructions producing the exit value. To generate correct code in all cases, consider the IV increment not uniform, if there are users outside the loop. Instead, let VPlan narrow the IV, if possible using the logic from 3ff1d01985752. Test case from #122602 verified with Alive2: https://alive2.llvm.org/ce/z/bA4EGj Fixes https://github.com/llvm/llvm-project/issues/122496. Fixes https://github.com/llvm/llvm-project/issues/122602.	2025-01-12 22:03:21 +00:00
Florian Hahn	3ff1d01985	Recommit "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 0ebb3ac7c92c4c1c44e7f3d17832d75ec5a42a67. Re-applies commit with typos fixed.	2025-01-12 20:10:28 +00:00
Florian Hahn	0ebb3ac7c9	Revert "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 1afba19913253dda865a8e57b37b9f4dabead1ac. Typo breaking the build	2025-01-12 19:37:45 +00:00
Florian Hahn	1afba19913	[VPlan] Try to narrow wide and replicating recipes to uniform recipes. Use the existing VPlan-based analysis to identify recipes that only have their first lane demanded and transform them to uniform replicate recipes. This simplifies the generated code in some places and prepares for fixing https://github.com/llvm/llvm-project/issues/122496.	2025-01-12 19:32:01 +00:00
Florian Hahn	7f59b4e998	[VPlan] Skip non-induction phi recipes in legalizeAndOptimizeInductions. The body of the loop only applies to wide induction recipes, skip any other header phi recipes up-front.	2025-01-11 20:33:02 +00:00
Florian Hahn	7ffb691595	[VPlan] Remove dead ToRemove (NFC).	2025-01-09 22:02:32 +00:00
Florian Hahn	f9369cc602	[VPlan] Make sure last IV increment value is available if needed. Legalize extract-from-ends using uniform VPReplicateRecipe of wide inductions to use regular VPReplicateRecipe, so the correct end value is available. Fixes https://github.com/llvm/llvm-project/issues/121745.	2025-01-06 22:40:41 +00:00
Florian Hahn	f4230b4332	[VPlan] Add and use debug location for VPScalarCastRecipe. Update the recipe so it always takes a debug location and sets it.	2025-01-05 20:08:51 +00:00
Florian Hahn	f48884ded8	[VPlan] Remove loop region in optimizeForVFAndUF. (#108378 ) Update optimizeForVFAndUF to completely remove the vector loop region when possible. At the moment, we cannot remove the region if it contains * widened IVs: the recipe is needed to generate the step vector * reductions: ComputeReductionResults requires the reduction phi recipe for codegen. Both cases can be addressed by more explicit modeling. The patch also includes a number of updates to allow executing VPlans without a vector loop region. Depends on https://github.com/llvm/llvm-project/pull/110004	2025-01-05 15:50:42 +00:00
Luke Lau	7700695739	[VPlan] Fix crash with EVL tail folding intrinsic with no corresponding VP (#121542 ) This fixes a crash when building SPEC CPU 2017 with EVL tail folding when widening @llvm.log10 intrinsics. @llvm.log10 and some other intrinsics don't have a corresponding VP intrinsic, so this fixes the crash by removing the assert and bailing instead.	2025-01-05 11:41:56 +08:00
Florian Hahn	5f5792aedb	[VPlan] Use removeDeadRecipes in optimizeForVFAndUF (NFCI) Split off from https://github.com/llvm/llvm-project/pull/108378.	2025-01-02 20:10:46 +00:00
Florian Hahn	ddef380cd6	[VPlan] Move simplifyRecipe(s) definitions up to allow re-use (NFC) Move definitions to allow easy reuse in https://github.com/llvm/llvm-project/pull/108378.	2024-12-31 13:23:19 +00:00
Florian Hahn	16d19aaedf	[VPlan] Manage created blocks directly in VPlan. (NFC) (#120918 ) This patch changes the way blocks are managed by VPlan. Previously all blocks reachable from entry would be cleaned up when a VPlan is destroyed. With this patch, each VPlan keeps track of blocks created for it in a list and this list is then used to delete all blocks in the list when the VPlan is destroyed. To do so, block creation is funneled through helpers directly in VPlan. The main advantage of doing so is it simplifies CFG transformations, as those do not have to take care of deleting any blocks, just adjusting the CFG. This helps to simplify https://github.com/llvm/llvm-project/pull/108378 and https://github.com/llvm/llvm-project/pull/106748. This also simplifies handling of 'immutable' blocks a VPlan holds references to, which at the moment only include the scalar header block. PR: https://github.com/llvm/llvm-project/pull/120918	2024-12-30 12:08:12 +00:00
Florian Hahn	c7a777322d	[VPlan] Replace else-if dyn_cast with cast (NFC). The recipes handled here are either VPWidenIntrinsic or VPWidenCast, so replace the else-if dyn_cast with a single else + cast.	2024-12-23 19:46:22 +00:00
LiqinWeng	b1fab4f849	[LV][VPlan] Initialize the variable 'VPID' of the createEVLRecipe (#120926 ) Resolve the compilation error caused by the merge issue: #119510	2024-12-23 09:23:22 +08:00
LiqinWeng	8a51471d83	[LV][VPlan] Extract the implementation of transform Recipe to EVLRecipe into a small function. NFC (#119510 )	2024-12-23 08:28:19 +08:00
Florian Hahn	e1833e3a7e	[VPlan] Simplify redundant VPDerivedIVRecipe (NFC). Split DerivedIV simplification off from https://github.com/llvm/llvm-project/pull/112145 and use to remove the need for extra checks in createScalarIVSteps. Required an extra simplification run after IV transforms.	2024-12-22 09:39:19 +00:00
LiqinWeng	86fa35ce7e	[LV][VPlan] Use opcode to retrieve the VPID of the CallRecipe, rather than underlying instruction (#120816 ) This patch may cause the flags in the CallRecipe to be lost after EVL transformation, and it has been addressed in the patch: #119847	2024-12-22 10:28:20 +08:00
Florian Hahn	9b496deb90	[VPlan] Set and use debug location for VPPredInstPHIRecipe. Update the recipe so it always sets its debug location and uses it during IR generation.	2024-12-21 21:57:47 +00:00
Luke Lau	c2a879ecaa	[VPlan] Fix VPTypeAnalysis cache clobbering in EVL transform (#120252 ) When building SPEC CPU 2017 with RISC-V and EVL tail folding, this assertion in VPTypeAnalysis would trigger during the transformation to EVL recipes: `d8a0709b10/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp (L135-L142)` It was caused by this recipe: ``` WIDEN ir<%shr> = vp.or ir<%add33>, ir<0>, vp<%6> ``` Having its type inferred as i16, when ir<%add33> and ir<0> had inferred types of i32 somehow. The cause of this turned out to be because the VPTypeAnalysis cache was getting clobbered: In this transform we were erasing recipes but keeping around the same mapping from VPValue* to Type. In the meantime, new recipes would be created which would have the same address as the old value. They would then incorrectly get the old erased VPValue's cached type: ``` --- before --- 0x600001ec5030: WIDEN ir<%mul21.neg> = vp.mul vp<%11>, ir<0>, vp<%6> 0x600001ec5450: <badref> <- some value that was erased --- after --- 0x600001ec5030: WIDEN ir<%mul21.neg> = vp.mul vp<%11>, ir<0>, vp<%6> 0x600001ec5450: WIDEN ir<%shr> = vp.or ir<%add33>, ir<0>, vp<%6> <- a new value that happens to have the same address ``` This fixes this by deferring the erasing of recipes till after the transformation. The test case might be a bit flakey since it just happens to have the right conditions to recreate this. I tried to add an assert in inferScalarType that every VPValue in the cache was valid, but couldn't find a way of telling if a VPValue had been erased. --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2024-12-18 11:28:28 +08:00
Luke Lau	4a7f60d328	[VPlan] Handle VPWidenCastRecipe without underlying value in EVL transform (#120194 ) This fixes a crash that shows up when building SPEC CPU 2017 with EVL tail folding on RISC-V. A VPWidenCastRecipe doesn't always have an underlying value, and in the case of this crash this happens whenever a widened cast is created via truncateToMinimalBitwidths. Fix this by just using the opcode stored in the recipe itself. I think a similar issue exists with VPWidenIntrinsicRecipe and how it's widened, but I haven't run into any crashes with it just yet.	2024-12-18 11:28:07 +08:00
Florian Hahn	734a204fbd	[VPlan] Manage VPWidenIntOrFPInduction debug location via recipe (NFC). Properly set VPWidenIntOrFpInductionRecipe's debug location in the recipe and use it, instead of using the debug location of the underlying IR instruction.	2024-12-15 13:45:28 +00:00
Florian Hahn	2564f1e199	[VPlan] Simplify Not(Not(A)) -> A. Follow-up simplification to 5fae408d3a4c073ee4.	2024-12-14 20:08:26 +00:00
Florian Hahn	6c8f41d336	[VPlan] Hook IR blocks into VPlan during skeleton creation (NFC) (#114292 ) As a first step to move towards modeling the full skeleton in VPlan, start by wrapping IR blocks created during legacy skeleton creation in VPIRBasicBlocks and hook them into the VPlan. This means the skeleton CFG is represented in VPlan, just before execute. This allows moving parts of skeleton creation into recipes in the VPBBs gradually. Note that this allows retiring some manual DT updates, as this will be handled automatically during VPlan execution. PR: https://github.com/llvm/llvm-project/pull/114292	2024-12-12 15:58:16 +00:00
Luke Lau	b26fe5b7e9	[VPlan] Use variadic isa<> in a few more places. NFC (#119538 )	2024-12-12 13:26:39 +08:00
Florian Hahn	5fae408d3a	[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138 ) A more lightweight variant of https://github.com/llvm/llvm-project/pull/109193, which dispatches to multiple exit blocks via the middle blocks. The patch also introduces a bit of required scaffolding to enable early-exit vectorization, including an option. At the moment, early-exit vectorization doesn't come with legality checks, and is only used if the option is provided and the loop has metadata forcing vectorization. This is only intended to be used for testing during bring-up, with @david-arm enabling auto early-exit vectorization plugging in the changes from https://github.com/llvm/llvm-project/pull/88385. PR: https://github.com/llvm/llvm-project/pull/112138	2024-12-11 21:11:05 +00:00
LiqinWeng	b759020cc8	[LV][EVL] Support cast instruction with EVL-vectorization (#108351 )	2024-12-11 10:01:41 +08:00
Florian Hahn	afef545efa	[VPlan] Address post-commit for #114305 . Apply suggested renaming and adjust placement as suggested in https://github.com/llvm/llvm-project/pull/114305. Also drop unneeded RPOT creation.	2024-12-08 21:24:19 +00:00
Florian Hahn	a7fda0e1e4	[VPlan] Introduce VPScalarPHIRecipe, use for can & EVL IV codegen (NFC). (#114305 ) Introduce a general recipe to generate a scalar phi. Lower VPCanonicalIVPHIRecipe and VPEVLBasedIVRecipe to VPScalarPHIRecipe before plan execution, avoiding the need for duplicated ::execute implementations. There are other cases that could benefit, including in-loop reduction phis and pointer induction phis. Builds on a similar idea as https://github.com/llvm/llvm-project/pull/82270. PR: https://github.com/llvm/llvm-project/pull/114305	2024-12-03 14:53:51 +00:00
Florian Hahn	77767986ed	[LV] Use IsaPred in a few more places (NFC). Simplifies the code slightly by removing explicit lambdas.	2024-12-01 18:47:53 +00:00
LiqinWeng	4a3f46de50	[LV][EVL] Support call instruction with EVL-vectorization (#110412 )	2024-11-28 10:05:08 +08:00
Florian Hahn	590f451b60	[VPlan] Allow setting IR name for VPDerivedIVRecipe (NFCI). Allow setting the name to use for the generated IR value of the derived IV in preparations for https://github.com/llvm/llvm-project/pull/112145. This is analogous to VPInstruction::Name.	2024-11-24 20:39:12 +00:00
Shih-Po Hung	632c5d2991	[VPlan] Support VPReverseVectorPointer in DataWithEVL vectorization (#113667 ) VPReverseVectorPointer relies on the runtime VF, but in DataWithEVL tail-folding, EVL (which can be less than VF at runtime) should be used instead. This patch updates the logic to check the users of VF and replaces the second operand if the user is VPReverseVectorPointer.	2024-11-22 17:18:39 +08:00
Stephen Tozer	caa9a82797	[DebugInfo][LoopVectorizer] Avoid dropping !dbg in optimizeForVFAndUF (#114243 ) Prior to this patch, optimizeForVFAndUF may optimize the conditional branch for a VPBasicblock to have a constant condition, but unnecessarily drops the DILocation attachment when it does so; this patch changes it to preserve the DILocation.	2024-11-14 09:33:46 +00:00
Florian Hahn	ccb40b0b7a	[VPlan] Add insertOnEdge to VPBlockUtils (NFC). Add a new helper to insert a new VPBlockBase on an edge between 2 blocks. Suggested in https://github.com/llvm/llvm-project/pull/114292 and also useful for some existing code.	2024-11-09 21:19:39 +00:00
Florian Hahn	95eeae195e	[VPlan] Add PredIdx and SuccIdx arguments to connectBlocks (NFC). Add extra arguments to connectBlocks which allow selecting which existing predecessor/successor to update. This avoids having to disconnect blocks first unnecessarily. Suggested in https://github.com/llvm/llvm-project/pull/114292.	2024-11-09 17:18:40 +00:00
Mel Chen	4480a22c2b	[LV][EVL] Emit vp.merge intrinsic to enable out-loop reduction in EVL vectorization. (#101641 ) Following #90184, this patch emits vp.merge intrinsic, which is used to set the inactive lanes in a select operation to the RHS instead of undef. Currently, it is applied to out-loop reduction for EVL vectorization. This patch performs transformation to convert select(header_mask, LHS, RHS) into vp.merge(all-true, LHS, RHS, EVL) And always use the predicated reduction select to set the incoming value of the reduction phi to support out-loop reduction when using tail folding with EVL. TODO: Postpone the adjustment of the predicated reduction select to VPlanTransform. The current adjustment might be too early, which could lead to a situation where the predicated reduction select is adjusted, but the EVL recipes cannot be successfully generated during VPlanTransform.	2024-11-06 14:53:49 +08:00
Kazu Hirata	aa825b74af	[Vectorize] Remove unused includes (NFC) (#114643 ) Identified with misc-include-cleaner.	2024-11-03 08:58:51 -08:00
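
Commit f925e54554 above fixes a bug where a use was removed while iterating over a value's users, so some users were skipped. The fix it describes is to copy the users into a SmallVector before rewriting them. The sketch below shows that pattern with plain standard-library containers. `Def`, `User`, and `replaceAllUses` are hypothetical stand-ins invented for illustration; they are not the real VPlan classes or the code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real VPlan classes.
struct Def;

struct User {
  Def *Operand = nullptr; // the single value this user reads
};

struct Def {
  std::vector<User *> Users; // every user currently reading this value

  void addUser(User *U) {
    Users.push_back(U);
    U->Operand = this;
  }
  void removeUser(User *U) {
    Users.erase(std::remove(Users.begin(), Users.end(), U), Users.end());
  }
};

// Redirect every user of Old to New. Rewriting a use mutates Old.Users, so
// the loop walks a snapshot taken up front instead of the live list.
inline int replaceAllUses(Def &Old, Def &New) {
  std::vector<User *> Snapshot(Old.Users.begin(), Old.Users.end());
  int Rewritten = 0;
  for (User *U : Snapshot) {
    Old.removeUser(U); // safe: Snapshot is unaffected by this erase
    New.addUser(U);
    ++Rewritten;
  }
  assert(Old.Users.empty() && "every user should have been moved");
  return Rewritten;
}
```

Looping directly over `Old.Users` while calling `removeUser` would invalidate the loop's iterators. Taking the snapshot costs one extra copy of the pointer list and in exchange guarantees that every user is visited exactly once.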


275 Commits