llvm-project

Author	SHA1	Message	Date
Florian Hahn	7f59b4e998	[VPlan] Skip non-induction phi recipes in legalizeAndOptimizeInductions. The body of the loop only applies to wide induction recipes, skip any other header phi recipes up-front	2025-01-11 20:33:02 +00:00
Florian Hahn	7ffb691595	[VPlan] Remove dead ToRemove (NFC).	2025-01-09 22:02:32 +00:00
Florian Hahn	f9369cc602	[VPlan] Make sure last IV increment value is available if needed. Legalize extract-from-ends using uniform VPReplicateRecipe of wide inductions to use regular VPReplicateRecipe, so the correct end value is available. Fixes https://github.com/llvm/llvm-project/issues/121745.	2025-01-06 22:40:41 +00:00
Florian Hahn	f4230b4332	[VPlan] Add and use debug location for VPScalarCastRecipe. Update the recipe so it always takes a debug location and sets it.	2025-01-05 20:08:51 +00:00
Florian Hahn	f48884ded8	[VPlan] Remove loop region in optimizeForVFAndUF. (#108378 ) Update optimizeForVFAndUF to completely remove the vector loop region when possible. At the moment, we cannot remove the region if it contains * widened IVs: the recipe is needed to generate the step vector * reductions: ComputeReductionResults requires the reduction phi recipe for codegen. Both cases can be addressed by more explicit modeling. The patch also includes a number of updates to allow executing VPlans without a vector loop region. Depends on https://github.com/llvm/llvm-project/pull/110004	2025-01-05 15:50:42 +00:00
Luke Lau	7700695739	[VPlan] Fix crash with EVL tail folding intrinsic with no corresponding VP (#121542 ) This fixes a crash when building SPEC CPU 2017 with EVL tail folding when widening @llvm.log10 intrinsics. @llvm.log10 and some other intrinsics don't have a corresponding VP intrinsic, so this fixes the crash by removing the assert and bailing instead.	2025-01-05 11:41:56 +08:00
Florian Hahn	5f5792aedb	[VPlan] Use removeDeadRecipes in optimizeForVFAndUF (NFCI) Split off from https://github.com/llvm/llvm-project/pull/108378.	2025-01-02 20:10:46 +00:00
Florian Hahn	ddef380cd6	[VPlan] Move simplifyRecipe(s) definitions up to allow re-use (NFC) Move definitions to allow easy reuse in https://github.com/llvm/llvm-project/pull/108378.	2024-12-31 13:23:19 +00:00
Florian Hahn	16d19aaedf	[VPlan] Manage created blocks directly in VPlan. (NFC) (#120918 ) This patch changes the way blocks are managed by VPlan. Previously all blocks reachable from entry would be cleaned up when a VPlan is destroyed. With this patch, each VPlan keeps track of blocks created for it in a list and this list is then used to delete all blocks in the list when the VPlan is destroyed. To do so, block creation is funneled through helpers directly in VPlan. The main advantage of doing so is it simplifies CFG transformations, as those do not have to take care of deleting any blocks, just adjusting the CFG. This helps to simplify https://github.com/llvm/llvm-project/pull/108378 and https://github.com/llvm/llvm-project/pull/106748. This also simplifies handling of 'immutable' blocks a VPlan holds references to, which at the moment only include the scalar header block. PR: https://github.com/llvm/llvm-project/pull/120918	2024-12-30 12:08:12 +00:00
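The ownership scheme described in 16d19aaedf can be sketched as follows. This is a minimal illustration only, not the VPlan implementation: `Plan`, `Block`, `createBlock`, and `bypass` are hypothetical stand-ins. It shows a plan that records every block it creates, so a CFG transform only rewires edges and never deletes anything itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a VPlan block: only successor edges are modeled.
struct Block {
  std::vector<Block *> Succs;
};

// Hypothetical stand-in for VPlan: owns every block created through it.
class Plan {
  std::vector<std::unique_ptr<Block>> CreatedBlocks;

public:
  // All block creation is funneled through this helper, so the plan
  // can free every block in its destructor, reachable or not.
  Block *createBlock() {
    CreatedBlocks.push_back(std::make_unique<Block>());
    return CreatedBlocks.back().get();
  }

  std::size_t numCreatedBlocks() const { return CreatedBlocks.size(); }
};

// A CFG transform: route Pred directly to Succ, bypassing Dead.
// It only adjusts edges; Dead stays owned by the plan until the plan dies.
void bypass(Block *Pred, Block *Dead, Block *Succ) {
  std::replace(Pred->Succs.begin(), Pred->Succs.end(), Dead, Succ);
  Dead->Succs.clear();
}
```

After `bypass(A, B, C)`, block `B` is unreachable but still counted by `numCreatedBlocks()`. It is released only when the `Plan` is destroyed, which is the property the commit message relies on.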
Florian Hahn	c7a777322d	[VPlan] Replace else-if dyn_cast with cast (NFC). The recipes handled here are either VPWidenIntrinsic or VPWidenCast, so replace the else-if dyn_cast with a single else + cast.	2024-12-23 19:46:22 +00:00
LiqinWeng	b1fab4f849	[LV][VPlan] Initialize the variable 'VPID' of the createEVLRecipe (#120926 ) Resolve the compilation error caused by the merge issue: #119510	2024-12-23 09:23:22 +08:00
LiqinWeng	8a51471d83	[LV][VPlan] Extract the implementation of transform Recipe to EVLRecipe into a small function. NFC (#119510 )	2024-12-23 08:28:19 +08:00
Florian Hahn	e1833e3a7e	[VPlan] Simplify redundant VPDerivedIVRecipe (NFC). Split DerivedIV simplification off from https://github.com/llvm/llvm-project/pull/112145 and use to remove the need for extra checks in createScalarIVSteps. Required an extra simplification run after IV transforms.	2024-12-22 09:39:19 +00:00
LiqinWeng	86fa35ce7e	[LV][VPlan] Use opcode to retrieve the VPID of the CallRecipe, rather than underlying instruction (#120816 ) This patch may cause the flags in the CallRecipe to be lost after EVL transformation, and it has been addressed in the patch: #119847	2024-12-22 10:28:20 +08:00
Florian Hahn	9b496deb90	[VPlan] Set and use debug location for VPPredInstPHIRecipe. Update the recipe so it always sets its debug location and uses it during IR generation.	2024-12-21 21:57:47 +00:00
Luke Lau	c2a879ecaa	[VPlan] Fix VPTypeAnalysis cache clobbering in EVL transform (#120252 ) When building SPEC CPU 2017 with RISC-V and EVL tail folding, this assertion in VPTypeAnalysis would trigger during the transformation to EVL recipes: `d8a0709b10/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp (L135-L142)` It was caused by this recipe: ``` WIDEN ir<%shr> = vp.or ir<%add33>, ir<0>, vp<%6> ``` Having its type inferred as i16, when ir<%add33> and ir<0> had inferred types of i32 somehow. The cause of this turned out to be because the VPTypeAnalysis cache was getting clobbered: In this transform we were erasing recipes but keeping around the same mapping from VPValue* to Type. In the meantime, new recipes would be created which would have the same address as the old value. They would then incorrectly get the old erased VPValue's cached type: ``` --- before --- 0x600001ec5030: WIDEN ir<%mul21.neg> = vp.mul vp<%11>, ir<0>, vp<%6> 0x600001ec5450: <badref> <- some value that was erased --- after --- 0x600001ec5030: WIDEN ir<%mul21.neg> = vp.mul vp<%11>, ir<0>, vp<%6> 0x600001ec5450: WIDEN ir<%shr> = vp.or ir<%add33>, ir<0>, vp<%6> <- a new value that happens to have the same address ``` This fixes this by deferring the erasing of recipes till after the transformation. The test case might be a bit flakey since it just happens to have the right conditions to recreate this. I tried to add an assert in inferScalarType that every VPValue in the cache was valid, but couldn't find a way of telling if a VPValue had been erased. --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2024-12-18 11:28:28 +08:00
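The failure mode and fix described in c2a879ecaa can be sketched in isolation. This is a hedged illustration: `Value`, `TypeCache`, and `replaceAndInfer` are hypothetical names, not the VPTypeAnalysis API. The cache is keyed by address, so freeing an old value while the cache is still live lets a new allocation reuse that address and inherit a stale entry. Deferring the erase keeps the old address occupied.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a VPValue: carries only its scalar bit width.
struct Value {
  int Bits;
};

// Hypothetical stand-in for a pointer-keyed type cache.
struct TypeCache {
  std::unordered_map<const Value *, int> Cached;

  int inferBits(const Value *V) {
    auto It = Cached.find(V);
    if (It != Cached.end())
      return It->second; // A stale hit here would be the reported bug.
    return Cached[V] = V->Bits;
  }
};

// Replace the value in Slot with a new one of width NewBits and return the
// width the cache infers for the replacement. The old value is moved to
// ToErase instead of being freed, so the new value cannot share its address.
int replaceAndInfer(TypeCache &Cache, std::unique_ptr<Value> &Slot,
                    int NewBits,
                    std::vector<std::unique_ptr<Value>> &ToErase) {
  Cache.inferBits(Slot.get()); // The old value's width is now cached.
  auto New = std::make_unique<Value>(Value{NewBits});
  ToErase.push_back(std::move(Slot)); // Deferred erase.
  Slot = std::move(New);
  return Cache.inferBits(Slot.get());
}
```

The caller clears `ToErase` once the whole transformation is done. In this sketch the cache would still hold keys for the erased values at that point, so it is assumed to be discarded together with them.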
Luke Lau	4a7f60d328	[VPlan] Handle VPWidenCastRecipe without underlying value in EVL transform (#120194 ) This fixes a crash that shows up when building SPEC CPU 2017 with EVL tail folding on RISC-V. A VPWidenCastRecipe doesn't always have an underlying value, and in the case of this crash this happens whenever a widened cast is created via truncateToMinimalBitwidths. Fix this by just using the opcode stored in the recipe itself. I think a similar issue exists with VPWidenIntrinsicRecipe and how it's widened, but I haven't run into any crashes with it just yet.	2024-12-18 11:28:07 +08:00
Florian Hahn	734a204fbd	[VPlan] Manage VPWidenIntOrFPInduction debug location via recipe (NFC). Properly set VPWidenIntOrFpInductionRecipe's debug location in the recipe and use it, instead of using the debug location of the underlying IR instruction.	2024-12-15 13:45:28 +00:00
Florian Hahn	2564f1e199	[VPlan] Simplify Not(Not(A)) -> A. Follow-up simplification to 5fae408d3a4c073ee4.	2024-12-14 20:08:26 +00:00
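The Not(Not(A)) -> A fold in 2564f1e199 is a standard double-negation simplification. A minimal sketch on a toy expression tree follows; `Expr` and `simplifyNot` are hypothetical and do not reflect how VPlan recipes are represented.

```cpp
#include <cassert>

// Hypothetical toy expression node: either a leaf or a logical Not of Op.
struct Expr {
  enum Kind { Leaf, Not } K;
  const Expr *Op; // Operand of a Not; nullptr for a leaf.
};

// Fold Not(Not(A)) to A. Any other expression is returned unchanged.
const Expr *simplifyNot(const Expr *E) {
  if (E->K == Expr::Not && E->Op->K == Expr::Not)
    return E->Op->Op;
  return E;
}
```

A single `Not` is left alone. Only the nested pair collapses to its innermost operand.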
Florian Hahn	6c8f41d336	[VPlan] Hook IR blocks into VPlan during skeleton creation (NFC) (#114292 ) As a first step to move towards modeling the full skeleton in VPlan, start by wrapping IR blocks created during legacy skeleton creation in VPIRBasicBlocks and hook them into the VPlan. This means the skeleton CFG is represented in VPlan, just before execute. This allows moving parts of skeleton creation into recipes in the VPBBs gradually. Note that this allows retiring some manual DT updates, as this will be handled automatically during VPlan execution. PR: https://github.com/llvm/llvm-project/pull/114292	2024-12-12 15:58:16 +00:00
Luke Lau	b26fe5b7e9	[VPlan] Use variadic isa<> in a few more places. NFC (#119538 )	2024-12-12 13:26:39 +08:00
Florian Hahn	5fae408d3a	[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138 ) A more lightweight variant of https://github.com/llvm/llvm-project/pull/109193, which dispatches to multiple exit blocks via the middle blocks. The patch also introduces a bit of required scaffolding to enable early-exit vectorization, including an option. At the moment, early-exit vectorization doesn't come with legality checks, and is only used if the option is provided and the loop has metadata forcing vectorization. This is only intended to be used for testing during bring-up, with @david-arm enabling auto early-exit vectorization plugging in the changes from https://github.com/llvm/llvm-project/pull/88385. PR: https://github.com/llvm/llvm-project/pull/112138	2024-12-11 21:11:05 +00:00
LiqinWeng	b759020cc8	[LV][EVL] Support cast instruction with EVL-vectorization (#108351 )	2024-12-11 10:01:41 +08:00
Florian Hahn	afef545efa	[VPlan] Address post-commit for #114305 . Apply suggested renaming and adjust placement as suggested in https://github.com/llvm/llvm-project/pull/114305. Also drop unneeded RPOT creation.	2024-12-08 21:24:19 +00:00
Florian Hahn	a7fda0e1e4	[VPlan] Introduce VPScalarPHIRecipe, use for can & EVL IV codegen (NFC). (#114305 ) Introduce a general recipe to generate a scalar phi. Lower VPCanonicalIVPHIRecipe and VPEVLBasedIVRecipe to VPScalarPHIRecipe before plan execution, avoiding the need for duplicated ::execute implementations. There are other cases that could benefit, including in-loop reduction phis and pointer induction phis. Builds on a similar idea as https://github.com/llvm/llvm-project/pull/82270. PR: https://github.com/llvm/llvm-project/pull/114305	2024-12-03 14:53:51 +00:00
Florian Hahn	77767986ed	[LV] Use IsaPred in a few more places (NFC). Simplifies the code slightly by removing explicit lambdas.	2024-12-01 18:47:53 +00:00
LiqinWeng	4a3f46de50	[LV][EVL] Support call instruction with EVL-vectorization (#110412 )	2024-11-28 10:05:08 +08:00
Florian Hahn	590f451b60	[VPlan] Allow setting IR name for VPDerivedIVRecipe (NFCI). Allow setting the name to use for the generated IR value of the derived IV in preparations for https://github.com/llvm/llvm-project/pull/112145. This is analogous to VPInstruction::Name.	2024-11-24 20:39:12 +00:00
Shih-Po Hung	632c5d2991	[VPlan] Support VPReverseVectorPointer in DataWithEVL vectorization (#113667 ) VPReverseVectorPointer relies on the runtime VF, but in DataWithEVL tail-folding, EVL (which can be less than VF at runtime) should be used instead. This patch updates the logic to check the users of VF and replaces the second operand if the user is VPReverseVectorPointer.	2024-11-22 17:18:39 +08:00
Stephen Tozer	caa9a82797	[DebugInfo][LoopVectorizer] Avoid dropping !dbg in optimizeForVFAndUF (#114243 ) Prior to this patch, optimizeForVFAndUF may optimize the conditional branch for a VPBasicBlock to have a constant condition, but unnecessarily drops the DILocation attachment when it does so; this patch changes it to preserve the DILocation.	2024-11-14 09:33:46 +00:00
Florian Hahn	ccb40b0b7a	[VPlan] Add insertOnEdge to VPBlockUtils (NFC). Add a new helper to insert a new VPBlockBase on an edge between 2 blocks. Suggested in https://github.com/llvm/llvm-project/pull/114292 and also useful for some existing code.	2024-11-09 21:19:39 +00:00
Florian Hahn	95eeae195e	[VPlan] Add PredIdx and SuccIdx arguments to connectBlocks (NFC). Add extra arguments to connectBlocks which allow selecting which existing predecessor/successor to update. This avoids having to disconnect blocks first unnecessarily. Suggested in https://github.com/llvm/llvm-project/pull/114292.	2024-11-09 17:18:40 +00:00
Mel Chen	4480a22c2b	[LV][EVL] Emit vp.merge intrinsic to enable out-loop reduction in EVL vectorization. (#101641 ) Following #90184, this patch emits vp.merge intrinsic, which is used to set the inactive lanes in a select operation to the RHS instead of undef. Currently, it is applied to out-loop reduction for EVL vectorization. This patch performs transformation to convert select(header_mask, LHS, RHS) into vp.merge(all-true, LHS, RHS, EVL) And always use the predicated reduction select to set the incoming value of the reduction phi to support out-loop reduction when using tail folding with EVL. TODO: Postpone the adjustment of the predicated reduction select to VPlanTransform. The current adjustment might be too early, which could lead to a situation where the predicated reduction select is adjusted, but the EVL recipes cannot be successfully generated during VPlanTransform.	2024-11-06 14:53:49 +08:00
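The select-to-vp.merge rewrite in 4480a22c2b depends on how vp.merge treats lanes at or beyond EVL: they take the RHS value rather than being left undefined. The lane-wise behavior can be modeled as below. This is a sketch of the semantics on fixed-size arrays only, and `vpMergeModel` is a hypothetical helper, not an LLVM API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Lane-wise model of vp.merge(Mask, OnTrue, OnFalse, EVL):
// a lane takes OnTrue only if it is below EVL and its mask bit is set.
// Every other lane, including all lanes at or beyond EVL, takes OnFalse.
template <std::size_t N>
std::array<int, N> vpMergeModel(const std::array<bool, N> &Mask,
                                const std::array<int, N> &OnTrue,
                                const std::array<int, N> &OnFalse,
                                std::size_t EVL) {
  std::array<int, N> Result{};
  for (std::size_t I = 0; I < N; ++I)
    Result[I] = (I < EVL && Mask[I]) ? OnTrue[I] : OnFalse[I];
  return Result;
}
```

With an all-true mask and EVL = 2 on four lanes, the first two lanes come from LHS and the last two from RHS. For an out-of-loop reduction this means the inactive tail lanes keep the value coming from the reduction phi.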
Kazu Hirata	aa825b74af	[Vectorize] Remove unused includes (NFC) (#114643 ) Identified with misc-include-cleaner.	2024-11-03 08:58:51 -08:00
Florian Hahn	b021464d35	[VPlan] Introduce scalar loop header in plan, remove VPLiveOut. (#109975 ) Update VPlan to include the scalar loop header. This allows retiring VPLiveOut, as the remaining live-outs can now be handled by adding operands to the wrapped phis in the scalar loop header. Note that the current version only includes the scalar loop header, no other loop blocks and also does not wrap it in a region block. PR: https://github.com/llvm/llvm-project/pull/109975	2024-10-31 21:36:44 +01:00
Mel Chen	8420dbf2b9	[VPlan] Refine the constructor of VPWidenIntrinsicRecipe. nfc (#113890 ) Infers member MayReadFromMemory, MayWriteToMemory, and MayHaveSideEffects based on intrinsic attributes. --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2024-10-30 12:22:28 +08:00
Florian Hahn	ef217a0f6b	[VPlan] Introduce and use getVectorPreheader (NFC). Introduce a dedicated function to retrieve the vector preheader. This ensures the correct block is used, even if the skeleton is extended.	2024-10-23 21:01:52 -07:00
Florian Hahn	2dfb1c664c	[VPlan] Try to hoist Previous (and operands), if sinking fails for FORs. (#108945 ) In some cases, Previous (and its operands) can be hoisted. This allows supporting additional cases where sinking all users of the FOR fails, e.g. due to having to sink recipes with side-effects. This fixes a crash where we fail to create a scalar VPlan for a first-order recurrence, but can create a vector VPlan, because the trunc instruction of an IV which generates the previous value of the recurrence has been optimized to a truncated induction recipe, thus hoisting it to the beginning. Fixes https://github.com/llvm/llvm-project/issues/106523. PR: https://github.com/llvm/llvm-project/pull/108945	2024-10-23 13:12:03 -07:00
Alexey Bataev	f148d5791b	[LV]Initial support for safe distance in predicated DataWithEVL vectorization mode. Enabled initial support for max safe distance in DataWithEVL mode. If a max safe distance is required, special code must be emitted when vectorizing the loop in DataWithEVL tail folding mode: CMP = icmp ult AVL, MAX_SAFE_DISTANCE; SAFE_AVL = select CMP, AVL, MAX_SAFE_DISTANCE; EVL = call i32 @llvm.experimental.get.vector.length(i64 SAFE_AVL). Reviewers: fhahn Reviewed By: fhahn Pull Request: https://github.com/llvm/llvm-project/pull/102897	2024-10-18 15:51:49 -04:00
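The three-instruction sequence in f148d5791b clamps AVL to the maximum safe distance before asking for a vector length. A scalar sketch of that arithmetic follows. It makes two assumptions not taken from the commit: both quantities are counted in elements, and `getVectorLengthModel` (a hypothetical helper) approximates the intrinsic as min(AVL, MaxLanes), whereas the real result is target-dependent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Simplified, assumed model of @llvm.experimental.get.vector.length:
// never returns more than the requested AVL or the available lanes.
uint32_t getVectorLengthModel(uint64_t AVL, uint32_t MaxLanes) {
  return static_cast<uint32_t>(std::min<uint64_t>(AVL, MaxLanes));
}

// Mirrors the emitted sequence from the commit message.
uint32_t computeSafeEVL(uint64_t AVL, uint64_t MaxSafeDistance,
                        uint32_t MaxLanes) {
  bool Cmp = AVL < MaxSafeDistance;               // CMP = icmp ult AVL, MAX_SAFE_DISTANCE
  uint64_t SafeAVL = Cmp ? AVL : MaxSafeDistance; // SAFE_AVL = select CMP, AVL, MAX_SAFE_DISTANCE
  return getVectorLengthModel(SafeAVL, MaxLanes); // EVL = get.vector.length(SAFE_AVL)
}
```

Under this model, 100 remaining elements with a safe distance of 4 and 8 lanes give an EVL of 4, so one vector iteration never spans more elements than the dependence distance allows.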
Florian Hahn	bbff5b8891	[VPlan] Use alloc-type to compute interleave group offset. Use getAllocTypeSize to compute the offset to the start of interleave groups instead of getScalarSizeInBits, which may return 0 for pointers. This is in line with the analysis building the interleave groups and fixes a mis-compile reported for https://github.com/llvm/llvm-project/pull/106431.	2024-10-16 07:21:58 +01:00
Florian Hahn	34cdd67c85	[VPlan] Use VPWidenIntrinsicRecipe to vp.select. (#110489 ) Use VPWidenIntrinsicRecipe (https://github.com/llvm/llvm-project/pull/110486) to create vp.select intrinsics. This potentially offers an alternative to duplicating EVL recipes for all existing recipes. There are some recipes that will need duplicates (at least at the moment), due to extra code-gen needs (e.g. widening loads and stores). But in cases the intrinsic can directly be used, creating the widened intrinsic directly would reduce the need to duplicate some recipes. PR: https://github.com/llvm/llvm-project/pull/110489	2024-10-15 21:48:15 +01:00
Florian Hahn	6fbbe152fa	[VPlan] Introduce VPWidenIntrinsicRecipe to separate from libcall. (#110486 ) This patch splits off intrinsic handling to a new VPWidenIntrinsicRecipe. VPWidenIntrinsicRecipes only need access to the intrinsic ID to widen and the scalar result type (in case the intrinsic is overloaded on the result type). It does not need access to an underlying IR call instruction or function. This means VPWidenIntrinsicRecipe can be created easily without access to underlying IR.	2024-10-08 22:37:20 +01:00
Florian Hahn	7f74651837	[VPlan] Use pointer to member 0 as VPInterleaveRecipe's pointer arg. (#106431 ) Update VPInterleaveRecipe to always use the pointer to member 0 as pointer argument. This in many cases helps to remove unneeded index adjustments and simplifies VPInterleaveRecipe::execute. In some rare cases, the address of member 0 does not dominate the insert position of the interleave group. In those cases a PtrAdd VPInstruction is emitted to compute the address of member 0 based on the address of the insert position. Alternatively we could hoist the recipe computing the address of member 0.	2024-10-06 22:53:13 +01:00
Alexey Bataev	8e9011b3b8	[LV][NFC]Fix formatting	2024-09-25 06:05:35 -07:00
Alexey Bataev	60ed2361c0	[LV][EVL]Explicitly model AVL as sub, original TC, EVL_PHI. Patch explicitly models AVL as sub original TC, EVL_PHI instead of having it in EXPLICIT-VECTOR-LENGTH VPInstruction. Required for correct safe dependence distance support. Reviewers: fhahn, ayalz Reviewed By: ayalz Pull Request: https://github.com/llvm/llvm-project/pull/108869	2024-09-25 08:58:29 -04:00
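The explicit AVL = sub original TC, EVL_PHI modeling in 60ed2361c0 can be traced with a scalar simulation. This is a sketch under assumptions: `simulateEVLLoop` is hypothetical, the vector length is modeled as min(AVL, MaxLanes), MaxLanes is assumed to be greater than zero, and the loop here exits on the EVL-based counter purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the EVL used by each vector iteration of a tail-folded loop.
// EVLPhi counts the elements already processed, as the EVL-based IV does.
std::vector<uint64_t> simulateEVLLoop(uint64_t TripCount, uint64_t MaxLanes) {
  std::vector<uint64_t> EVLs;
  uint64_t EVLPhi = 0;
  while (EVLPhi < TripCount) {
    uint64_t AVL = TripCount - EVLPhi;      // AVL = sub original TC, EVL_PHI
    uint64_t EVL = std::min(AVL, MaxLanes); // Assumed vector-length model.
    EVLs.push_back(EVL);
    EVLPhi += EVL;                          // Advance by the lanes actually used.
  }
  return EVLs;
}
```

For a trip count of 10 and 4 lanes this yields EVLs of 4, 4, and 2. Having AVL as its own value is what lets a later transform clamp it, for example against a maximum safe distance, before the vector length is computed.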
Florian Hahn	c95583f15f	[VPlan] Add createPtrAdd helper (NFC). Preparation for https://github.com/llvm/llvm-project/pull/106431.	2024-09-24 19:55:35 +01:00
Florian Hahn	53266f73f0	[VPlan] Run DCE after unrolling. This cleans up a number of dead recipes after unrolling if only their first or last parts are used. This simplifies a number of tests. Fixes https://github.com/llvm/llvm-project/issues/109581.	2024-09-22 22:08:46 +01:00
Florian Hahn	8ec406757c	[VPlan] Implement unrolling as VPlan-to-VPlan transform. (#95842 ) This patch implements explicit unrolling by UF as VPlan transform. In follow up patches this will allow simplifying VPTransformState (no need to store unrolled parts) as well as recipe execution (no need to generate code for multiple parts in each recipe). It also allows for more general optimizations (e.g. avoid generating code for recipes that are uniform across parts). It also unifies the logic dealing with unrolled parts in a single place, rather than spreading it out across multiple places (e.g. VPlan post processing for header-phi recipes previously). In the initial implementation, a number of recipes still take the unrolled part as an additional, optional argument, if their execution depends on the unrolled part. The computation for start/step values for scalable inductions changed slightly. Previously the step would be computed as scalar and then splatted, now vscale gets splatted and multiplied by the step in a vector mul. This has been split off from https://github.com/llvm/llvm-project/pull/94339 which also includes changes to simplify VPTransformState and recipes' ::execute. The current version mostly leaves existing ::execute untouched and instead sets VPTransformState::UF to 1. A follow-up patch will clean up all references to VPTransformState::UF. Another follow-up patch will simplify VPTransformState to only store a single vector value per VPValue. PR: https://github.com/llvm/llvm-project/pull/95842	2024-09-21 19:47:37 +01:00
Florian Hahn	bd8fe9972e	[VPlan] Move licm to end of VPlan optimizations. This moves licm after expanding replicate regions. This fixes a crash when trying to hoist predicated VPReplicateRecipes which later get expanded to replicate regions. Hoisting replicate regions out was not intended (see the discussion at the review and the comment on shallow traversal in licm()). Fixes https://github.com/llvm/llvm-project/issues/109510.	2024-09-21 12:45:45 +01:00
Florian Hahn	a861ed411a	[VPlan] Add initial loop-invariant code motion transform. (#107894 ) Add initial transform to move out loop-invariant recipes. This also helps to fix a divergence between legacy and VPlan-based cost model due to legacy using ScalarEvolution::isLoopInvariant in some cases. Fixes https://github.com/llvm/llvm-project/issues/107501. PR: https://github.com/llvm/llvm-project/pull/107894	2024-09-20 11:22:03 +01:00

259 Commits