llvm-project

Author	SHA1	Message	Date
Florian Hahn	8a91f6bcda	[VPlan] Use CurrentParentLoop instead of looking up via CFG (NFC). There is no need to look up the current parent loop via LoopInfo and the vector preheader; we can simply use CurrentParentLoop.	2025-03-18 22:11:47 +00:00
Florian Hahn	6a030b3005	[VPlan] Remove unused VPlanIngredient (NFC). VPlanIngredient is not used anymore, remove it.	2025-03-15 10:52:43 +00:00
Florian Hahn	02575f887b	[VPlan] Use VPInstruction for VPScalarPHIRecipe. (NFCI) (#129767) Now that all phi nodes manage their incoming blocks through the VPlan-predecessors, there should be no need for having a dedicated recipe; it should be sufficient to allow PHI opcodes in VPInstruction. Follow-ups will also migrate VPWidenPHIRecipe and possibly others, building on top of https://github.com/llvm/llvm-project/pull/129388. PR: https://github.com/llvm/llvm-project/pull/129767	2025-03-13 18:35:07 +00:00
Mel Chen	5d5e706691	[VPlan] Restrict hoisting of broadcast operations using VPDominatorTree (#117138 ) This patch restricts broadcast operations from being hoisted to the vector preheader unless the basic block that defines the broadcasted value properly dominates the vector preheader. This prevents potential use-before-definition issues when the broadcasted value is defined within the plan. VPDominatorTree is used to confirm this restriction while still allowing safe hoisting for broadcasted values defined outside the plan. Issue https://github.com/llvm/llvm-project/issues/117139	2025-03-13 07:16:04 -07:00
Mel Chen	ffe202ca00	Revert "[LV] Limits the splat operations be hoisted must not be defined by a recipe. (#117138 )" This reverts commit 1ff10fa82fff83bb2f0a5c1ffde6203b52bc9619.	2025-03-13 07:16:04 -07:00
Mel Chen	1ff10fa82f	[LV] Limits the splat operations be hoisted must not be defined by a recipe. (#117138 ) Issue https://github.com/llvm/llvm-project/issues/117139	2025-03-11 17:59:12 +08:00
Florian Hahn	fd267082ee	[VPlan] Refactor VPlan creation, add transform introducing region (NFC). (#128419 ) Create an empty VPlan first, then let the HCFG builder create a plain CFG for the top-level loop (w/o a top-level region). The top-level region is introduced by a separate VPlan-transform. This is instead of creating the vector loop region before building the VPlan CFG for the input loop. This simplifies the HCFG builder (which should probably be renamed) and moves along the roadmap ('buildLoop') outlined in [1]. As follow-up, I plan to also preserve the exit branches in the initial VPlan out of the CFG builder, including connections to the exit blocks. The conversion from plain CFG with potentially multiple exits to a single entry/exit region will be done as VPlan transform in a follow-up. This is needed to enable VPlan-based predication. Currently early exit support relies on building the block-in masks on the original CFG, because exiting branches and conditions aren't preserved in the VPlan. So in order to switch to VPlan-based predication, we will have to preserve them in the initial plain CFG, so the exit conditions are available explicitly when we convert to single entry/exit regions. Another follow-up is updating the outer loop handling to also introduce VPRegionBlocks for nested loops as transform. Currently the existing logic in the builder will take care of creating VPRegionBlocks for nested loops, but not the top-level loop. [1] https://llvm.org/devmtg/2023-10/slides/techtalks/Hahn-VPlan-StatusUpdateAndRoadmap.pdf PR: https://github.com/llvm/llvm-project/pull/128419	2025-03-09 15:05:35 +00:00
Florian Hahn	9f37cdca52	[VPlan] Update VPTransformState accessors to take const VPValue (NFC). This will enable using const VPValue * pointers in more places.	2025-03-01 13:15:37 +00:00
John Brawn	8150ab93f7	[LoopVectorize] Use CodeSize as the cost kind for minsize (#124119 ) Functions marked with minsize should aim for minimum code size, so the vectorizer should use CodeSize for the cost kind and also the cost we compare should be the cost for the entire loop: it shouldn't be divided by the number of vector elements and block costs shouldn't be divided by the block probability. Possibly we should also be doing this for optsize as well, but there are a lot of tests that assume the current behaviour and the definition of optsize is less clear than minsize (for minsize the goal is to "keep the code size of this function as small as possible" whereas for optsize it's "keep the code size of this function low").	2025-02-27 11:07:02 +00:00
Florian Hahn	4277c21059	[VPlan] Introduce explicit broadcasts for live-ins. (#124644) Add a new VPInstruction::Broadcast opcode and use it to materialize explicit broadcasts of live-ins. The initial patch only materializes the broadcasts if the vector preheader dominates all uses that need it. Later patches will pick the best valid insert point, thus retiring implicit hoisting of broadcasts from VPTransformState::get(). PR: https://github.com/llvm/llvm-project/pull/124644	2025-02-26 13:57:51 +00:00
Florian Hahn	522b05afb6	[VPlan] Construct immutable VPIRBBs for exit blocks at construction (NFC) (#128374) Construct immutable VPIRBasicBlocks for all exit blocks up front and keep a list of them. Same as the scalar header, they are leaf nodes of the VPlan and won't change. Some exit blocks may be unreachable, e.g. if the scalar epilogue always executes or depending on optimizations. This simplifies both the way we retrieve the exit blocks as well as hooking up the exit blocks. PR: https://github.com/llvm/llvm-project/pull/128374	2025-02-25 14:23:27 +00:00
Florian Hahn	baa77e30f0	[LV] Remove some redundant casts (NFC).	2025-02-24 21:46:29 +00:00
Florian Hahn	38376dee92	[VPlan] Build initial VPlan 0 using HCFGBuilder for inner loops. (NFC) (#124432) Use HCFGBuilder to build an initial VPlan 0, which wraps all input instructions in VPInstructions and update tryToBuildVPlanWithVPRecipes to replace the VPInstructions with widened recipes. At the moment, widened recipes are created based on the underlying instruction of the VPInstruction. Masks are also still created based on the input IR basic blocks and the loop CFG is flattened in the main loop processing the VPInstructions. This patch also includes support for Switch instructions in HCFGBuilder using just a VPInstruction with Instruction::Switch opcode. There are multiple follow-ups planned: * Perform predication on the VPlan directly, * Unify code constructing VPlan 0 to be shared by both inner and outer loop code paths. * Construct VPlan 0 once, clone subsequent ones for VFs PR: https://github.com/llvm/llvm-project/pull/124432	2025-02-18 16:12:29 +01:00
Benjamin Maxwell	e0e67a6207	[LV] Add initial support for vectorizing literal struct return values (#109833 ) This patch adds initial support for vectorizing literal struct return values. Currently, this is limited to the case where the struct is homogeneous (all elements have the same type) and not packed. The users of the call also must all be `extractvalue` instructions. The intended use case for this is vectorizing intrinsics such as: ``` declare { float, float } @llvm.sincos.f32(float %x) ``` Mapping them to structure-returning library calls such as: ``` declare { <4 x float>, <4 x float> } @Sleef_sincosf4_u10advsimd(<4 x float>) ``` Or their widened form (such as `@llvm.sincos.v4f32` in this case). Implementing this required two main changes: 1. Supporting widening `extractvalue` 2. Adding support for vectorized struct types in LV * This is mostly limited to parts of the cost model and scalarization Since the supported use case is narrow, the required changes are relatively small.	2025-02-17 09:51:35 +00:00
Florian Hahn	e5f5517f91	[VPlan] Create IR basic block for middle.block in VPlan. Create an IR BB directly for the middle.block, instead of creating the IR BB during skeleton creation and then replacing the middle VPBB with a VPIRBB. This moves another part of skeleton creation to VPlan and simplifies the code slightly by removing code to disconnect the middle block and vector preheader + the corresponding DT update. NFC modulo IR block naming and block creation order, which changes the IR names for the blocks.	2025-02-15 21:54:16 +01:00
Florian Hahn	5008277322	[VPlan] Move auxiliary declarations out of VPlan.h (NFC). (#124104 ) Nothing in VPlan.h directly depends on VPTransformState, VPCostContext, VPFRange, VPlanPrinter or VPSlotTracker. Move them out to a separate header to reduce the size of widely used VPlan.h. This is a first step towards more cleanly separating declarations in VPlan. Besides reducing VPlan.h's size, this also allows including additional VPlan-related headers in VPlanHelpers.h for use there. An example is using VPDominatorTree in VPTransformState (https://github.com/llvm/llvm-project/pull/117138). PR: https://github.com/llvm/llvm-project/pull/124104	2025-02-02 13:44:07 +00:00
David Sherwood	3bc2dade36	[LoopVectorize] Enable vectorisation of early exit loops with live-outs (#120567 ) This work feeds part of PR https://github.com/llvm/llvm-project/pull/88385, and adds support for vectorising loops with uncountable early exits and outside users of loop-defined variables. When calculating the final value from an uncountable early exit we need to calculate the vector lane that triggered the exit, and hence determine the value at the point we exited. All code for calculating the last value when exiting the loop early now lives in a new vector.early.exit block, which sits between the middle.split block and the original exit block. Doing this required two fixes: 1. The vplan verifier incorrectly assumed that the block containing a definition always dominates the block of the user. That's not true if you can arrive at the use block from multiple incoming blocks. This is possible for early exit loops where both the early exit and the latch jump to the same block. 2. We were adding the new vector.early.exit to the wrong parent loop. It needs to have the same parent as the actual early exit block from the original loop. I've added a new ExtractFirstActive VPInstruction that extracts the first active lane of a vector, i.e. the lane of the vector predicate that triggered the exit. NOTE: The IR generated for dealing with live-outs from early exit loops is unoptimised, as opposed to normal loops. This inevitably leads to poor quality code, but this can be fixed up later.	2025-01-30 10:37:00 +00:00
Florian Hahn	2b55ef187c	[VPlan] Add helper to run VPlan passes, verify after run (NFC). (#123640 ) Add new runPass helpers to run a VPlan transformation. This makes it easier to add additional checks/functionality for each transform run. In this patch, an option is added to run the verifier after each VPlan transform. Follow-ups will use the same helper to also support printing VPlans after each transform. Note that the verifier at the moment requires there to be a canonical IV and vector loop region, so the final lowering transforms aren't run via runPass yet. PR: https://github.com/llvm/llvm-project/pull/123640	2025-01-29 10:50:01 +00:00
Jeremy Morse	34b139594a	[NFC][DebugInfo] Switch more call-sites to using iterator-insertion (#124283 ) To finalise the "RemoveDIs" work removing debug intrinsics, we're updating call sites that insert instructions to use iterators instead. This set of changes are those where it's not immediately obvious that just calling getIterator to fetch an iterator is correct, and one or two places where more than one line needs to change. Overall the same rule holds though: iterators generated for the start of a block such as getFirstNonPHIIt need to be passed into insert/move methods without being unwrapped/rewrapped, everything else can use getIterator.	2025-01-27 16:44:14 +00:00
Florian Hahn	6383a12e3b	[VPlan] Refactor HCFG builder to preserve original vector latch (NFC). Update HCFG builder to preserve the original latch block of the initial VPlan, ensuring there is always a latch. It also skips creating the BranchOnCond for the latch of the top-level loop, instead of removing it later. Exiting via the latch is controlled by later recipes. This further unifies HCFG construction and prepares for use to also build an initial VPlan (VPlan0) for inner loops.	2025-01-25 13:32:01 +00:00
Jeremy Morse	6292a808b3	[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; however, to ensure some type safety, we'd like to have all calls to getFirstNonPHI use the iterator-returning version. This patch changes a bunch of call-sites calling getFirstNonPHI to use getFirstNonPHIIt, which returns an iterator. All these call sites are where it's obviously safe to fetch the iterator then dereference it. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer getFirstNonPHI, but not before adding concise documentation of what considerations are needed (very few). --------- Co-authored-by: Stephen Tozer <Melamoto@gmail.com>	2025-01-24 13:27:56 +00:00
Jeremy Morse	8e70273509	[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; however, to ensure some type safety, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).	2025-01-24 10:53:11 +00:00
John Brawn	edf3a55bce	[LoopVectorize][NFC] Centralize the setting of CostKind (#121937 ) In each class which calculates instruction costs (VPCostContext, LoopVectorizationCostModel, GeneratedRTChecks) set the CostKind once in the constructor instead of in each function that calculates a cost. This is in preparation for potentially changing the CostKind when compiling for optsize.	2025-01-17 15:06:18 +00:00
offsake	83be69cf9a	[VPlan][Coverity] Fix coverity CID1579964. (#121805 ) Fix for the Coverity hit with CID1579964 in VPlan.cpp. Coverity message with some context follows. [Cov] var_compare_op: Comparing TermBr to null implies that TermBr might be null. 434 } else if (TermBr && !TermBr->isConditional()) { 435 TermBr->setSuccessor(0, NewBB); 436 } else { 437 // Set each forward successor here when it is created, excluding 438 // backedges. A backward successor is set when the branch is created. 439 unsigned idx = PredVPSuccessors.front() == this ? 0 : 1; [Cov] CID 1579964: (#1 of 1): Dereference after null check (FORWARD_NULL) [Cov] var_deref_model: Passing null pointer TermBr to getSuccessor, which dereferences it.	2025-01-13 20:29:51 +00:00
Florian Hahn	f48884ded8	[VPlan] Remove loop region in optimizeForVFAndUF. (#108378 ) Update optimizeForVFAndUF to completely remove the vector loop region when possible. At the moment, we cannot remove the region if it contains * widened IVs: the recipe is needed to generate the step vector * reductions: ComputeReductionResults requires the reduction phi recipe for codegen. Both cases can be addressed by more explicit modeling. The patch also includes a number of updates to allow executing VPlans without a vector loop region. Depends on https://github.com/llvm/llvm-project/pull/110004	2025-01-05 15:50:42 +00:00
Florian Hahn	20d491bb99	[VPlan] Remove re-using vector PH in VPBasicBlock::execute (NFC). Remove logic to re-use the previous basic block for the vector pre header from VPBasicBlock::execute. The preheader is now modeled as VPIRBasicBlock, so the code is no longer needed. Split off from https://github.com/llvm/llvm-project/pull/108378.	2025-01-03 19:56:44 +00:00
Florian Hahn	c7ebe4fd0a	[VPlan] Replace VPBBs with VPIRBBs during skeleton creation (NFC). Move replacement of VPBBs for vector preheader, middle block and scalar preheader from VPlan::execute to skeleton creation, which actually creates the IR basic blocks. For now, the vector preheader can only be replaced after prepareToExecute as it may create new instructions in the vector preheader.	2025-01-01 22:05:43 +00:00
Florian Hahn	b06a45c66f	[VPlan] Add all blocks to outer loop if present during ::execute (NFCI). This ensures that all blocks created during VPlan execution are properly added to an enclosing loop, if present. Split off from https://github.com/llvm/llvm-project/pull/108378 and also needed once more of the skeleton blocks are created directly via VPlan. This also allows removing the custom logic for early-exit loop vectorization added as part of https://github.com/llvm/llvm-project/pull/117008.	2024-12-31 19:34:34 +00:00
Florian Hahn	16d19aaedf	[VPlan] Manage created blocks directly in VPlan. (NFC) (#120918) This patch changes the way blocks are managed by VPlan. Previously all blocks reachable from entry would be cleaned up when a VPlan is destroyed. With this patch, each VPlan keeps track of blocks created for it in a list and this list is then used to delete all blocks in the list when the VPlan is destroyed. To do so, block creation is funneled through helpers directly in VPlan. The main advantage of doing so is it simplifies CFG transformations, as those do not have to take care of deleting any blocks, just adjusting the CFG. This helps to simplify https://github.com/llvm/llvm-project/pull/108378 and https://github.com/llvm/llvm-project/pull/106748. This also simplifies handling of 'immutable' blocks a VPlan holds references to, which at the moment only include the scalar header block. PR: https://github.com/llvm/llvm-project/pull/120918	2024-12-30 12:08:12 +00:00
LiqinWeng	b5f0ec80d5	[VPlan] Remove redundant printing final in VPlan::execute (#121048 ) Multiple prints will cause problems when testing ir-bb	2024-12-25 10:11:02 +08:00
Florian Hahn	5ca3794e82	[VPlan] Move initial VPlan block creation to constructor. (NFC) This sets up the initial blocks needed to initialize a VPlan directly in the constructor. This will allow tracking of all created blocks directly in VPlan, simplifying block deletion.	2024-12-18 22:00:30 +00:00
Florian Hahn	58cfa39861	[VPlan] Remove legacy VPlan() constructors (NFC). The constructors were retained to reduce the diff during transition. Remove them now.	2024-12-17 08:22:22 +00:00
Florian Hahn	95e509a989	[VPlan] Add VPWidenInduction recipe as common base class (NFC). (#120008 ) This helps to simplify some existing code and new code (https://github.com/llvm/llvm-project/pull/112145) PR: https://github.com/llvm/llvm-project/pull/120008	2024-12-16 09:40:03 +00:00
Florian Hahn	c95af0844d	[VPlan] Move ::getVectorLoopRegion out of ifdef (NFC). Fixes a build failure with assertions disabled after 6c8f41d336747.	2024-12-12 16:21:21 +00:00
Florian Hahn	6c8f41d336	[VPlan] Hook IR blocks into VPlan during skeleton creation (NFC) (#114292 ) As a first step to move towards modeling the full skeleton in VPlan, start by wrapping IR blocks created during legacy skeleton creation in VPIRBasicBlocks and hook them into the VPlan. This means the skeleton CFG is represented in VPlan, just before execute. This allows moving parts of skeleton creation into recipes in the VPBBs gradually. Note that this allows retiring some manual DT updates, as this will be handled automatically during VPlan execution. PR: https://github.com/llvm/llvm-project/pull/114292	2024-12-12 15:58:16 +00:00
Luke Lau	b26fe5b7e9	[VPlan] Use variadic isa<> in a few more places. NFC (#119538 )	2024-12-12 13:26:39 +08:00
Florian Hahn	5fae408d3a	[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138 ) A more lightweight variant of https://github.com/llvm/llvm-project/pull/109193, which dispatches to multiple exit blocks via the middle blocks. The patch also introduces a bit of required scaffolding to enable early-exit vectorization, including an option. At the moment, early-exit vectorization doesn't come with legality checks, and is only used if the option is provided and the loop has metadata forcing vectorization. This is only intended to be used for testing during bring-up, with @david-arm enabling auto early-exit vectorization plugging in the changes from https://github.com/llvm/llvm-project/pull/88385. PR: https://github.com/llvm/llvm-project/pull/112138	2024-12-11 21:11:05 +00:00
Florian Hahn	e9834209aa	[VPlan] Move convertToConcreteRecipes to end of VPlan-opt phase (NFCI). Adjust placement as suggested in https://github.com/llvm/llvm-project/pull/114305, after some refactoring to prepare for the move.	2024-12-10 09:13:13 +00:00
Florian Hahn	0e70289f37	[VPlan] Create canonical IV resume value for epilogue in VPlan. (NFCI) Update the code to create induction resume PHIs to also create a resume phi for the canonical induction during epilogue vectorization. This unifies the code for handling induction resume values and removes the need to manually create a resume PHI and return it during epilogue creation. Overall it helps to move the code for updating the canonical induction resume value to the place where all other header phi resume values are updated. This is NFC, modulo order of the created phis.	2024-12-09 23:11:38 +00:00
Florian Hahn	ec22b1ab47	[VPlan] Iterate over blocks in VPlan::execute in RPOT (NFC). This prepares for more complex CFGs in VPlan, as in https://github.com/llvm/llvm-project/pull/114292 https://github.com/llvm/llvm-project/pull/112138	2024-12-07 10:19:27 +00:00
Florian Hahn	156da98683	[VPlan] Move printing final VPlan to ::execute (NFC). This moves printing of the final VPlan to ::execute. This ensures the final VPlan is printed, including recipes that get introduced by late, lowering transforms and skeleton construction. Split off from https://github.com/llvm/llvm-project/pull/114292, to simplify the diff.	2024-12-07 09:39:10 +00:00
Florian Hahn	6797b0f0c0	[VPlan] Use RPOT for VPlan codegen and printing. This splits off changes for more complex CFGs in VPlan from both https://github.com/llvm/llvm-project/pull/114292 and https://github.com/llvm/llvm-project/pull/112138. This simplifies their respective diffs.	2024-12-06 21:49:00 +00:00
Florian Hahn	a7fda0e1e4	[VPlan] Introduce VPScalarPHIRecipe, use for can & EVL IV codegen (NFC). (#114305) Introduce a general recipe to generate a scalar phi. Lower VPCanonicalIVPHIRecipe and VPEVLBasedIVRecipe to VPScalarPHIRecipe before plan execution, avoiding the need for duplicated ::execute implementations. There are other cases that could benefit, including in-loop reduction phis and pointer induction phis. Builds on a similar idea as https://github.com/llvm/llvm-project/pull/82270. PR: https://github.com/llvm/llvm-project/pull/114305	2024-12-03 14:53:51 +00:00
Florian Hahn	0dbdc6dc35	[VPlan] Simplify code to re-use existing basic blocks (NFCI). Restructure and slightly simplify code to re-use existing basic blocks.	2024-11-24 19:14:29 +00:00
Finn Plummer	8663b8777e	[NFC][VectorUtils][TargetTransformInfo] Add `isVectorIntrinsicWithOverloadTypeAtArg` api (#114849) This change allows target intrinsics to specify and overwrite overloaded types. - Updates `ReplaceWithVecLib` to not provide TTI as there most probably won't be a use-case - Updates `SLPVectorizer` to use available TTI - Updates `VPTransformState` to pass down TTI - Updates `VPlanRecipe` to use passed-down TTI This change will let us add scalarization for `asdouble`: #114847	2024-11-21 11:04:25 -08:00
Florian Hahn	a5a1612deb	[VPlan] Consistently use DEBUG_TYPE loop-vectorize. This ensures debug messages in VPlan.cpp are included in the commonly used -debug-only=loop-vectorize.	2024-11-10 09:17:03 +00:00
Florian Hahn	8a7a7b5ffc	[VPlan] Remove unneeded code connecting blocks in VPBB:splitAt (NFC). insertBlockAfter already takes care of transferring successors. Remove unneeded code to transfer them manually.	2024-11-08 21:52:18 +00:00
Florian Hahn	596fd103f8	[VPlan] Share logic to connect predecessors in VPBB/VPIRBB execute (NFC) This moves the common logic to connect IRBBs created for a VPBB to their predecessors in the VPlan CFG, making it easier to keep in sync in the future.	2024-11-04 19:01:39 +00:00
Kazu Hirata	aa825b74af	[Vectorize] Remove unused includes (NFC) (#114643 ) Identified with misc-include-cleaner.	2024-11-03 08:58:51 -08:00
David Sherwood	4ed7bcb4a6	[VPlan][NFC] Add new getMiddleBlock interface to VPlan (#113558) This work is in preparation for PRs #112138 and #88385 where the middle block is not guaranteed to be the immediate successor to the region block. I've simply added new getMiddleBlock() interfaces to VPlan that for now just return cast<VPBasicBlock>(VectorRegion->getSingleSuccessor()). Once PR #112138 lands we'll need to do more work to discover the middle block.	2024-11-01 10:50:52 +00:00

378 Commits