llvm-project

Author	SHA1	Message	Date
vporpo	e094c0fa67	[SandboxVec][Legality] Don't vectorize when instructions repeat (#124479 ) This patch adds a legality check that checks for repeated instrs in a bundle and won't vectorize if such pattern is found.	2025-01-29 15:54:15 -08:00
Simon Pilgrim	5921295dca	Revert "[SLP] getSpillCost - fully populate IntrinsicCostAttributes to improve cost analysis." (#124962 ) Reverts llvm/llvm-project#124129 as it's currently causing a regression at #124499 - avoids the regression until a proper fix can be added to getSpillCost	2025-01-29 22:17:53 +00:00
Alexey Bataev	4a1a697427	[SLP][NFC]Unify ScalarToTreeEntries and MultiNodeScalars, NFC Currently, SLP has 2 distinct storages to manage mapping between vectorized instructions and their corresponding vectorized TreeEntry nodes. It leads to inefficient lookup for the matching TreeEntries and makes it harder to correctly track instructions associated with multiple nodes. There is a plan to extend this support to instructions that require scheduling, to allow support for copyable elements. Merging ScalarToTreeEntry and MultiNodeScalars will reduce the maintenance cost of the feature. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/124914	2025-01-29 09:05:54 -05:00
Florian Hahn	2b55ef187c	[VPlan] Add helper to run VPlan passes, verify after run (NFC). (#123640 ) Add new runPass helpers to run a VPlan transformation. This makes it easier to add additional checks/functionality for each transform run. In this patch, an option is added to run the verifier after each VPlan transform. Follow-ups will use the same helper to also support printing VPlans after each transform. Note that the verifier at the moment requires there to be a canonical IV and vector loop region, so the final lowering transforms aren't run via runPass yet. PR: https://github.com/llvm/llvm-project/pull/123640	2025-01-29 10:50:01 +00:00
vporpo	79cbad188a	[SandboxVec] Clear Context's state within runOnFunction() (#124842 ) `sandboxir::Context` is defined at a pass-level scope with the `SandboxVectorizerPass` class because the function pass manager `FPM` object depends on it, and that is in pass-level scope to avoid recreating the pass pipeline every single time `runOnFunction()` is called. This means that the Context's state lives on across function passes. The problem is twofold: (i) the LLVM IR to Sandbox IR map can grow very large including objects from different functions, which is of no use to the vectorizer, as it's a function-level pass. (ii) this can result in stale data in the LLVM IR to Sandbox IR object map, as other passes may delete LLVM IR objects. To fix both issues this patch introduces a `Context::clear()` function that clears the `LLVMValueToValueMap`.	2025-01-28 18:28:08 -08:00
Florian Hahn	6338bde568	[VPlan] Use cast<VPRecipeBase> in verifier (NFC). All users of VPValue must be a VPRecipeBase, use cast.	2025-01-28 21:01:02 +00:00
Alexey Bataev	947d8ebbf3	[SLP]Unify getNumberOfParts use Adds getNumberOfParts and uses it instead of similar code across code base, fixes analysis of non-vectorizable types in computeMinimumValueSizes. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/124774	2025-01-28 12:16:44 -05:00
Alexey Bataev	a1ab5b4c87	[SLP]Check the MainOp matches the requirements for the instructions Need to include MainOp into the analysis of the instructions in getSameOpcode to be sure that it is checked for the requirements to prevent crashes during further analysis.	2025-01-28 06:00:52 -08:00
Alexey Bataev	1d5fbe83c3	[SLP]Adjust NumberOfParts value for adjusted number of buildvector scalars Need to adjust NumParts value, when GatheredScalars scalars are adjusted after extractelements analysis, to fix compiler crash	2025-01-28 05:45:13 -08:00
Jeremy Morse	304a99091c	[NFC][DebugInfo] Use iterators for insertion at some final callsites These are the callsites that have materialised in the last three weeks since I last built with deprecation warnings.	2025-01-28 11:37:11 +00:00
Nicholas Guy	cdea38f91a	Reland "[LoopVectorizer] Add support for chaining partial reductions #120272 " (#124282 ) Change `getScaledReduction` to take an existing vector, rather than creating and returning a new one each call. Rename `getScaledReduction` to `getScaledReductions` to more accurately reflect what it's now doing. --------- Co-authored-by: Karlo Basioli <68535415+basioli-k@users.noreply.github.com>	2025-01-28 10:40:35 +00:00
David Sherwood	0f61558b97	[LoopVectorize][NFC] Remove unused variable in addUsersInExitBlocks (#124553 ) We were allocating a VPTypeAnalysis object on the stack, but never using it for anything.	2025-01-28 09:11:27 +00:00
vporpo	334a1cdbfa	[SandboxIR] createFunction() should always create a function (#124665 ) This patch removes the assertion that checks for an existing function. If one exists it will remove it and create a new one. This helps remove a crash when a function declaration object already exists and we are about to create a SandboxIR object for the definition.	2025-01-27 20:16:30 -08:00
Han-Kuan Chen	08d14e10ca	[SLP] Fix CommonMask will be transformed into an incorrect mask if createShuffle is called multiple times. (#124244 ) We have two types of mask in SLP: a scalar mask and a vector mask. When vectorizing four i32 additions into <4 x i32>, SLP creates a mask of length 4. When vectorizing four <2 x i32> additions into <8 x i32>, SLP also creates a mask of length 4. We refer to the first case as a scalar mask (because the mask element represents a scalar, i32), and the second case as a vector mask (because the mask element represents a vector, <2 x i32>). At some point, we must convert the scalar mask into a vector mask (otherwise, calling TTI cost functions or IRBuilderBase functions may yield incorrect results). Since both ShuffleCostEstimator and ShuffleInstructionBuilder can modify the CommonMask, we have decided to perform the mask transformation only within createShuffle. However, we do not store the transformed result, as createShuffle may be called multiple times.	2025-01-28 12:02:37 +08:00
Florian Hahn	713482fccf	[VPlan] Use State.get to extract lane mask for BranchOnMask. Simplifies the code slightly and avoids redundant extracts/broadcasts if the operand is live-in or already scalar.	2025-01-27 21:35:36 +00:00
Jeremy Morse	34b139594a	[NFC][DebugInfo] Switch more call-sites to using iterator-insertion (#124283 ) To finalise the "RemoveDIs" work removing debug intrinsics, we're updating call sites that insert instructions to use iterators instead. This set of changes are those where it's not immediately obvious that just calling getIterator to fetch an iterator is correct, and one or two places where more than one line needs to change. Overall the same rule holds though: iterators generated for the start of a block such as getFirstNonPHIIt need to be passed into insert/move methods without being unwrapped/rewrapped, everything else can use getIterator.	2025-01-27 16:44:14 +00:00
Florian Hahn	09a29fcc8d	[VPlan] Don't collect live-ins in collectUsersInExitBlocks. (NFC) (#123819 ) Live-ins don't need to be handled, other than adding to the exit phi recipe. Do that early and assert that otherwise the exit value is defined in the vector loop region. This should enable simply skipping other exit values that do not need further fixing, e.g. if handling the exit value from the early exit directly in handleUncountableEarlyExit. PR: https://github.com/llvm/llvm-project/pull/123819	2025-01-27 16:12:07 +00:00
Alexey Bataev	f1d5e70a00	[SLP][NFC]Do not check poison values for corresponding vectorized entries No need to check poison values if they have been vectorized and/or mark them as vectorized, it should work only for instructions.	2025-01-27 06:38:23 -08:00
Vasileios Porpodas	1c4341d176	[SandboxVec][DAG] Fix interval check without Node This patch moves the check of whether a node exists before the check of whether it is contained in the interval.	2025-01-26 11:54:09 -08:00
Florian Hahn	1395cd015f	[VPlan] Support multi-exit loops in HCFG builder. Update HCFG construction to support multi-exit loops. If there is no unique exit block, map the middle block of the initial plan to the exit block from the latch. This further unifies HCFG construction and prepares for use to also build an initial VPlan (VPlan0) for inner loops. Effectively NFC as this isn't used on the default code path yet.	2025-01-25 21:55:15 +00:00
Vasileios Porpodas	b178c2d63e	[SandboxVec][DAG] Fix trim schedule Fix trimSchedule by skipping instructions without a DAG Node.	2025-01-25 09:42:14 -08:00
vporpo	5cb2db3b51	[SandboxVec][Scheduler] Forbid crossing BBs (#124369 ) This patch updates the scheduler to forbid scheduling across BBs. It should eventually be able to handle this, but we disable it for now.	2025-01-25 08:19:27 -08:00
Florian Hahn	6383a12e3b	[VPlan] Refactor HCFG builder to preserve original vector latch (NFC). Update HCFG builder to preserve the original latch block of the initial VPlan, ensuring there is always a latch. It also skips creating the BranchOnCond for the latch of the top-level loop, instead of removing it later. Exiting via the latch is controlled by later recipes. This further unifies HCFG construction and prepares for use to also build an initial VPlan (VPlan0) for inner loops.	2025-01-25 13:32:01 +00:00
vporpo	6409799bdc	[SandboxVec][Legality] Pack from different BBs (#124363 ) When the inputs of the pack come from different BBs we need to make sure we emit the pack instructions at the correct place.	2025-01-24 15:39:37 -08:00
vporpo	ac75d32280	[SandboxVec][VecUtils] Filter out instructions not in BB in VecUtils:getLowest() (#124360 ) This patch changes the functionality of `VecUtils::getLowest(Vals, BB)` such that it filters out any instructions in `Vals` that are not in BB. This is useful when Vals contains instructions from different BBs, because in that case we are only interested in one BB.	2025-01-24 14:52:57 -08:00
vporpo	cff7ad56ba	[SandboxVec][Utils] Implement Utils::verifyFunction() (#124356 ) This patch implements a wrapper function for the LLVM IR verifier for functions, and calls it (flag-guarded) within the bottom-up-vectorizer for finding IR bugs as soon as they happen.	2025-01-24 14:35:20 -08:00
vporpo	4b209c5d87	[SandboxIR][Region] Add cost modeling to the region (#124354 ) This patch implements cost modeling for Region. All instructions that are added or removed get their cost counted in the Scoreboard. This is used for checking if the region before or after a transformation is more profitable.	2025-01-24 14:28:55 -08:00
vporpo	b41987beae	[SandboxVec][DAG] Fix MemDGNode chain maintenance when move destination is non-mem (#124227 ) This patch fixes a bug in the maintenance of the MemDGNode chain of the DAG. Whenever we move a memory instruction, the DAG gets notified about the move and maintains the chain of memory nodes. The bug was that if the destination of the move was not a memory instruction, then the memory node's next node would end up pointing to itself.	2025-01-24 13:59:32 -08:00
Simon Pilgrim	a12d7e4b61	[SLP] getVectorCallCosts - don't provide scalar argument data for vector IntrinsicCostAttributes (#124254 ) getVectorCallCosts determines the cost of a vector intrinsic, based off an existing scalar intrinsic call - but we were including the scalar argument data to the IntrinsicCostAttributes, which meant that not only was the cost calculation not type-only based, it was making incorrect assumptions about constant values etc. This also exposed an issue that x86 relied on fallback calculations for funnel shift costs - this is great when we have the argument data as that improves the accuracy of uniform shift amounts etc., but meant that type-only costs would default to Cost=2 for all custom lowered funnel shifts, which was far too cheap. This is the reverse of #124129 where we weren't including argument data when we could. Fixes #63980	2025-01-24 15:13:13 +00:00
Jeremy Morse	6292a808b3	[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737 ) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety, we'd like to have all calls to getFirstNonPHI use the iterator-returning version. This patch changes a bunch of call-sites calling getFirstNonPHI to use getFirstNonPHIIt, which returns an iterator. All these call sites are where it's obviously safe to fetch the iterator then dereference it. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer getFirstNonPHI, but not before adding concise documentation of what considerations are needed (very few). --------- Co-authored-by: Stephen Tozer <Melamoto@gmail.com>	2025-01-24 13:27:56 +00:00
Jeremy Morse	8e70273509	[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583 ) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).	2025-01-24 10:53:11 +00:00
Elvis Wang	aff1242b8e	[LV] Align debug location of the widen-phi to the original phi. (#120338 ) This patch aligns the debug location of the widen-phi to the debug location of the original phi. Split from: #120054	2025-01-24 17:49:54 +08:00
vporpo	d2234ca163	[SandboxVec][BottomUpVec] Fix packing when PHIs are present (#124206 ) Before this patch we might have emitted pack instructions in between PHI nodes. This patch fixes it by fixing the insert point of the new packs.	2025-01-23 16:29:01 -08:00
vporpo	c7053ac202	[SandboxVec][BottomUpVec] Disable crossing BBs (#124039 ) Crossing BBs is not currently supported by the structures of the vectorizer. This patch fixes instances where this was happening, including: - a walk of use-def operands that updates the UnscheduledSuccs counter, - the dead instruction removal is now done per BB, - the scheduler, which will reject bundles that cross BBs.	2025-01-23 15:08:13 -08:00
Vitaly Buka	0e213834df	Revert "[LoopVectorizer] Add support for chaining partial reductions (#120272 )" (#124198 ) Introduced stack buffer overflow, see #120272. `getScaledReduction` can return an empty vector, and there is no check for that. This reverts commit c9b7303b9b18129c4ee6b56aaa2a0a9f59be2d09. This reverts commit caf0540b91b0fee31353dc7049ae836e0f814cff.	2025-01-23 14:00:33 -08:00
Florian Hahn	7a831eb924	[VPlan] Remove unused VPLane::getNumCachedLanes. (NFC) The function isn't used, remove it.	2025-01-23 20:38:47 +00:00
Karlo Basioli	c9b7303b9b	Add [[maybe_unused]] to a variable used only in assert in VPlan.h (#124173 )	2025-01-23 18:52:53 +00:00
Alexey Bataev	c7e6ca76cb	[SLP][NFC]Add dump() method for ScheduleData struct type for better debugging	2025-01-23 09:49:37 -08:00
Nicholas Guy	caf0540b91	[LoopVectorizer] Add support for chaining partial reductions (#120272 ) Chaining partial reductions, where multiple partial reductions share an accumulator, allow for more values to be combined together as part of the reduction without discarding the semantics of the partial reduction itself.	2025-01-23 17:24:57 +00:00
Simon Pilgrim	d8cd8d56ea	[SLP] getSpillCost - fully populate IntrinsicCostAttributes to improve cost analysis. (#124129 ) We were only constructing the IntrinsicCostAttributes with the arg type info, and not the args themselves, preventing more detailed cost analysis (constant / uniform args etc.) Just pass the whole IntrinsicInst to the constructor and let it resolve everything it can. Noticed while having yet another attempt at #63980	2025-01-23 16:57:13 +00:00
Alexey Bataev	fa299294c0	[SLP][NFC]Modernize code base in several places	2025-01-23 08:43:07 -08:00
Nicholas Guy	26b61e143b	[LoopVectorizer] Propagate underlying instruction to the cloned instances of VPPartialReductionRecipes (#123638 )	2025-01-23 14:57:31 +00:00
Florian Hahn	05fbc3830d	[VPlan] Move VPBlockUtils to VPlanUtils.h (NFC) Nothing in VPlan.h directly uses VPBlockUtils.h. Move it out to the more appropriate VPlanUtils.h to reduce the size of the widely included VPlan.h.	2025-01-23 11:28:11 +00:00
Han-Kuan Chen	d3aea77f50	[SLP] Move transformMaskAfterShuffle into BaseShuffleAnalysis and use it as much as possible. (#123896 )	2025-01-23 09:47:38 +08:00
vporpo	8110af75b1	[SandboxVec][BottomUpVec] Fix codegen when packing constants. (#124033 ) Before this patch packing a bundle of constants would crash because `getInsertPointAfterInstrs()` expected instructions. This patch fixes this.	2025-01-22 16:38:10 -08:00
vporpo	2dc1c95595	[SandboxVec][VecUtils] Implement VecUtils::getLowest() (#124024 ) VecUtils::getLowest(Vals) returns the lowest instruction in the BB among Vals. If the instructions are not in the same BB, or if none of them is an instruction, it returns nullptr.	2025-01-22 16:08:15 -08:00
vporpo	fd087135ef	[SandboxVec][Legality] Diamond reuse multi input (#123426 ) This patch implements the diamond pattern where we are vectorizing toward the top of the diamond from both edges, but the second edge may use elements from a different vector or just scalar values. This requires some additional packing code (see lit test).	2025-01-22 15:23:47 -08:00
David Sherwood	4a2ebd6661	[LV][NFC] Refactor structures used to maintain uncountable exit info (#123219 ) I've removed the HasUncountableEarlyExit variable, since we can already determine whether or not a loop has an early exit by seeing if we found an uncountable exit. I have also deleted the old UncountableExitingBlocks and UncountableExitBlocks lists and replaced them with a single uncountable edge. This means we don't need to worry about keeping the list entries in sync and makes it clear which exiting block corresponds to which exit block.	2025-01-22 09:40:08 +00:00
Vasileios Porpodas	4089314907	[SandboxVec][DAG][NFC] Remove early return in notifyMoveInstr() It used to early return when destination is same as origin. But it's redundant because in that case the callback won't get called in the first place.	2025-01-21 18:38:39 -08:00
Florian Hahn	6c787ff6cf	Revert "[LV]: Teach LV to recursively (de)interleave. (#122989 )" This reverts commit 9491f75e1d912b277247450d1c7b6d56f7faf885. This triggers an assert when building with SVE enabled. https://lab.llvm.org/buildbot/#/builders/143/builds/4795	2025-01-21 21:36:16 +00:00
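
The mask conversion described in commit 08d14e10ca above can be illustrated with a small standalone sketch. It is not the SLP vectorizer's code: `expandMask` and its parameters are hypothetical names chosen for illustration, and the sketch assumes that each element of the short mask selects one whole sub-vector of `VecSize` lanes. `PoisonMaskElem` mirrors the -1 sentinel LLVM uses for undefined shuffle-mask elements.

```cpp
#include <cassert>
#include <vector>

// Sentinel for an undefined mask element (LLVM uses -1 for this as well).
constexpr int PoisonMaskElem = -1;

// Hypothetical helper, not the SLP implementation: expand a mask whose
// elements each select a whole VecSize-wide sub-vector into a mask that
// selects the individual lanes of those sub-vectors.
std::vector<int> expandMask(const std::vector<int> &Mask, unsigned VecSize) {
  assert(VecSize > 0 && "sub-vector width must be positive");
  std::vector<int> Expanded;
  Expanded.reserve(Mask.size() * VecSize);
  for (int Elem : Mask) {
    for (unsigned Lane = 0; Lane < VecSize; ++Lane) {
      // A poison element stays poison in every lane it expands to.
      if (Elem == PoisonMaskElem)
        Expanded.push_back(PoisonMaskElem);
      else
        Expanded.push_back(Elem * static_cast<int>(VecSize) +
                           static_cast<int>(Lane));
    }
  }
  return Expanded;
}
```

For four <2 x i32> values, `expandMask({1, 0, 3, 2}, 2)` yields the length-8 mask `{2, 3, 0, 1, 6, 7, 4, 5}`. Expanding an already expanded mask scales it again (length 16 here), which is consistent with the commit's point that the transformed result is not stored when createShuffle may run more than once.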

5505 Commits