llvm-project

Author	SHA1	Message	Date
Ramkumar Ramachandra	1abb055c57	[IVDesc] Make getCastInsts return an ArrayRef (NFC) (#169021 ) To make it clear that the return value is immutable.	2025-11-24 08:57:55 +00:00
Luke Lau	5da0445420	[LV] Consolidate shouldOptimizeForSize and remove unused BFI/PSI. NFC (#168697 ) #158690 plans on passing BFI as a lazy lambda to avoid computing BlockFrequencyInfo when not needed. In preparation for that, this PR removes BFI and PSI from some constructors that aren't used. It also consolidates the two calls to llvm::shouldOptimizeForSize so that the result is computed once and passed where needed. This also renames OptForSize in LoopVectorizationLegality to clarify that it's to prevent runtime SCEV checks, see https://reviews.llvm.org/D68082	2025-11-19 21:29:26 +08:00
Florian Hahn	a6edeedbfa	Revert "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b. This appears to be causing some runtime failures on RISCV https://lab.llvm.org/buildbot/#/builders/210/builds/5221	2025-11-13 22:34:55 +00:00
Florian Hahn	62d1a080e6	[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 ) Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop the extract the value of the last actice lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-12 15:11:00 +00:00
Florian Hahn	af9a4263a1	[LAA] Only use inbounds/nusw in isNoWrap if the GEP is dereferenced. (#161445 ) Update isNoWrap to only use the inbounds/nusw flags from GEPs that are guaranteed to be dereferenced on every iteration. This fixes a case where we incorrectly determine no dependence. I think the issue is isolated to code that evaluates the resulting AddRec at BTC, just using it to compute the distance between accesses should still be fine; if the access does not execute in a given iteration, there's no dependence in that iteration. But isolating the code is not straight-forward, so be conservative for now. The practical impact should be very minor (only one loop changed across a corpus with 27k modules from large C/C++ workloads. Fixes https://github.com/llvm/llvm-project/issues/160912. PR: https://github.com/llvm/llvm-project/pull/161445	2025-11-04 17:08:12 +00:00
Florian Hahn	5e3ac2a6f2	[LV] Bail out on loops with switch as latch terminator. Currently we cannot vectorize loops with latch blocks terminated by a switch. In the future this could be handled by materializing appropriate compares. Fixes https://github.com/llvm/llvm-project/issues/156894.	2025-10-12 21:20:35 +01:00
Craig Topper	59fe8401f4	[LV] Use StringRef::consume_front. NFC (#161454 )	2025-10-01 08:58:30 -07:00
Graham Hunter	54fc5367f6	[LV] Fix crash in uncountable exit with side effects checking Fixes an ICE reported on PR #145663, as an assert was found to be reachable with a specific combination of unreachable blocks.	2025-09-12 10:41:05 +00:00
Graham Hunter	e285602fda	[LV] Enforce addrec in current loop for uncountable exit load address check Addresses post-commit review raised for #145663	2025-09-11 11:18:22 +00:00
Graham Hunter	3c810b76b9	[LV] Add initial legality checks for early exit loops with side effects (#145663 ) This adds initial support to LoopVectorizationLegality to analyze loops with side effects (particularly stores to memory) and an uncountable exit. This patch alone doesn't enable any new transformations, but does give clearer reasons for rejecting vectorization for such a loop. The intent is for a loop like the following to pass the specific checks, and only be rejected at the end until the transformation code is committed: ``` // Assume a is marked restrict // Assume b is known to be large enough to access up to b[N-1] for (int i = 0; i < N; ++) { a[i]++; if (b[i] > threshold) break; } ```	2025-09-10 13:54:52 +01:00
Florian Hahn	b9b0ea5f62	[LV] Pass DT to isGuaranteedNotToBePoison in canVectorizeWithIfCvt. Pass DT to slightly improve analysis results. Note that the context instruction is already passed.	2025-09-05 20:42:56 +01:00
Shih-Po Hung	9876b06bc7	[LV] Add initial legality checks for loops with unbound loads. (#152422 ) This patch splits out the legality checks from PR #151300, following the landing of PR #128593. It is a step toward supporting vectorization of early-exit loops that contain potentially faulting loads. In this commit, an early-exit loop is considered legal for vectorization if it satisfies the following criteria: 1. it is a read-only loop. 2. all potentially faulting loads are unit-stride, which is the only type currently supported by vp.load.ff.	2025-09-05 08:20:16 +08:00
Tobias Stadler	8135b7c1ab	[LV] Emit all remarks for unvectorizable instructions (#153833 ) If ExtraAnalysis is requested, emit all remarks caused by unvectorizable instructions - instead of only the first. This is in line with how other places handle DoExtraAnalysis and it can be quite helpful to get info about all instructions in a loop that prevent vectorization.	2025-08-18 18:04:53 +01:00
Florian Hahn	7d815c7642	[LV] Remove unused variables after 965231ca0a9a. (NFC) Clean up unused/dead variables after 965231ca0a9a (https://github.com/llvm/llvm-project/pull/151311)	2025-08-01 10:12:20 +01:00
Florian Hahn	965231ca0a	[LV] Replace UncountableEdge with UncountableEarlyExitingBlock (NFC). (#151311 ) Only the uncountable exiting BB is used. Store it instead of a piar of Exiting BB and Exit BB. PR: https://github.com/llvm/llvm-project/pull/151311	2025-08-01 09:37:02 +01:00
Florian Hahn	fd97dfbb78	[LV] Don't mark ptrs as safe to speculate if fed by UB/poison op. (#143204 ) Add additional checks before marking pointers safe to load speculatively. If some computations feeding the pointer may trigger UB, we cannot load the pointer speculatively, because we cannot compute the address speculatively. The UB triggering instructions will be predicated, but if the predicated block does not execute the result is poison. Similarly, we also cannot load the pointer speculatively if it may be poison. The patch also checks if any of the operands defined outside the loop may be poison when entering the loop. We don't need to check if any operation inside the loop may produce poison due to flags, as those will be dropped if needed. There are some types of instructions inside the loop that can produce poison independent of flags. Currently loads are also checked, not sure if there's a convenient API to check for all such operands. Fixes https://github.com/llvm/llvm-project/issues/142957. PR: https://github.com/llvm/llvm-project/pull/143204	2025-06-20 13:05:19 +01:00
Jeremy Morse	9eb0020555	[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389 ) Seeing how we can't generate any debug intrinsics any more: delete a variety of codepaths where they're handled. For the most part these are plain deletions, in others I've tweaked comments to remain coherent, or added a type to (what was) type-generic-lambdas. This isn't all the DbgInfoIntrinsic call sites but it's most of the simple scenarios. Co-authored-by: Nikita Popov <github@npopov.com>	2025-06-17 15:55:14 +01:00
Shih-Po Hung	e5263e3ec8	[LV][NFC] Clean up tail-folding check for early-exit loops (#133931 ) This patch moves the check for a single latch exit from computeMaxVF() to LoopVectorizationLegality::canFoldTailByMasking(), as it duplicates the logic when foldTailByMasking() returns false. It also updates the NoScalarEpilogueNeeded logic to return false for loops that are neither single-latch-exit nor early-exit. This avoids applying tail-folding in unsupported cases and prevents triggering assertions during analysis.	2025-04-18 10:57:11 +08:00
chrisPyr	71f4c7dabe	[NFC]Make file-local cl::opt global variables static (#126486 ) #125983	2025-03-03 13:46:33 +07:00
Ramkumar Ramachandra	4ac43b541c	[LV] Restrict widest induction type to be IntegerType (NFC) (#128173 ) As the name of the function suggests, convertPointerToIntegerType should return an IntegerType instead of a Type, and should only ever be called with integer or ptr type. Fix the callers getWiderType, and addInductionPhi to narrow the type of WidestIndTy to IntegerType, stripping unclear casts. While at it, rename convertPointerToIntegerType and getWiderType for clarity.	2025-02-24 19:10:33 +00:00
Benjamin Maxwell	e0e67a6207	[LV] Add initial support for vectorizing literal struct return values (#109833 ) This patch adds initial support for vectorizing literal struct return values. Currently, this is limited to the case where the struct is homogeneous (all elements have the same type) and not packed. The users of the call also must all be `extractvalue` instructions. The intended use case for this is vectorizing intrinsics such as: ``` declare { float, float } @llvm.sincos.f32(float %x) ``` Mapping them to structure-returning library calls such as: ``` declare { <4 x float>, <4 x float> } @Sleef_sincosf4_u10advsimd(<4 x float>) ``` Or their widened form (such as `@llvm.sincos.v4f32` in this case). Implementing this required two main changes: 1. Supporting widening `extractvalue` 2. Adding support for vectorized struct types in LV * This is mostly limited to parts of the cost model and scalarization Since the supported use case is narrow, the required changes are relatively small.	2025-02-17 09:51:35 +00:00
David Sherwood	4a2ebd6661	[LV][NFC] Refactor structures used to maintain uncountable exit info (#123219 ) I've removed the HasUncountableEarlyExit variable, since we can already determine whether or not a loop has an early exit by seeing if we found an uncountable exit. I have also deleted the old UncountableExitingBlocks and UncountableExitBlocks lists and replaced them with a single uncountable edge. This means we don't need to worry about keeping the list entries in sync and makes it clear which exiting block corresponds to which exit block.	2025-01-22 09:40:08 +00:00
Florian Hahn	b0697dc1de	[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994 ) Currently we emit early-exit related debug messages/remarks even when there is a single exit. Update to only check isVectorizableEarlyExitLoop if there isn't a single exit block. PR: https://github.com/llvm/llvm-project/pull/121994	2025-01-09 12:05:19 +00:00
Benjamin Maxwell	f88ef1bd1b	[LV] Teach LoopVectorizationLegality about struct vector calls (#119221 ) This is a split-off from #109833 and only adds code relating to checking if a struct-returning call can be vectorized. This initial patch only allows the case where all users of the struct return are `extractvalue` operations that can be widened. ``` %call = tail call { float, float } @foo(float %in_val) %extract_a = extractvalue { float, float } %call, 0 %extract_b = extractvalue { float, float } %call, 1 ``` Note: The tests require the VFABI changes from #119000 to pass.	2025-01-09 09:27:29 +00:00
Finn Plummer	45c01e8a33	[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635 ) - update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for all uses, to allow specifiction of target specific intrinsics - add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api - update TTI api to provide `isTargetIntrinsicWith...` functions and consistently name them - move `isTriviallyScalarizable` to VectorUtils - update all uses of the api and provide the TTI parameter Resolves #117030	2024-12-19 11:54:26 -08:00
David Sherwood	c18fda02e1	[LoopVectorize] Use new single string variant of reportVectorizationFailure (#120414 )	2024-12-19 10:07:13 +00:00
Florian Hahn	5fae408d3a	[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138 ) A more lightweight variant of https://github.com/llvm/llvm-project/pull/109193, which dispatches to multiple exit blocks via the middle blocks. The patch also introduces a bit of required scaffolding to enable early-exit vectorization, including an option. At the moment, early-exit vectorization doesn't come with legality checks, and is only used if the option is provided and the loop has metadata forcing vectorization. This is only intended to be used for testing during bring-up, with @david-arm enabling auto early-exit vectorization plugging in the changes from https://github.com/llvm/llvm-project/pull/88385. PR: https://github.com/llvm/llvm-project/pull/112138	2024-12-11 21:11:05 +00:00
Ellis Hoag	6ab26eab4f	Check hasOptSize() in shouldOptimizeForSize() (#112626 )	2024-10-28 09:45:03 -07:00
Graham Hunter	6f1a8c2da2	[LV] Vectorize histogram operations (#99851 ) This patch implements autovectorization support for the 'all-in-one' histogram intrinsic, which seems to have more support than the 'standalone' intrinsic. See https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788/ for an overview of the work and my notes on the tradeoffs between the two approaches.	2024-09-27 13:08:55 +01:00
David Sherwood	f4eeae1244	[LoopVectorize] Address comments on PR #107004 left post-commit (#109300 ) * Rename Speculative -> Uncountable and update tests. * Add comments explaining why it's safe to ignore the predicates when building up a list of exiting blocks. * Reshuffle some code to do (hopefully) cheaper checks first.	2024-09-23 13:36:25 +01:00
David Sherwood	02ee96eca9	[Analysis] Teach isDereferenceableAndAlignedInLoop about SCEV predicates (#106562 ) Currently if a loop contains loads that we can prove at compile time are dereferenceable when certain conditions are satisfied the function isDereferenceableAndAlignedInLoop will still return false because getSmallConstantMaxTripCount will return 0 when SCEV predicates are required. This patch changes getSmallConstantMaxTripCount to take an optional Predicates pointer argument so that we can permit functions such as isDereferenceableAndAlignedInLoop to consider more cases.	2024-09-23 09:56:37 +01:00
Benjamin Kramer	57777a5066	[LoopVectorize] Silence unused variable warning	2024-09-19 11:01:58 +02:00
David Sherwood	e762d4dac7	[LoopVectorize] Teach LoopVectorizationLegality about more early exits (#107004 ) This patch is split off from PR #88385 and concerns only the code related to the legality of vectorising early exit loops. It is the first step in adding support for vectorisation of a simple class of loops that typically involves searching for something, i.e. for (int i = 0; i < n; i++) { if (p[i] == val) return i; } return n; or for (int i = 0; i < n; i++) { if (p1[i] != p2[i]) return i; } return n; In this initial commit LoopVectorizationLegality will only consider early exit loops legal for vectorising if they follow these criteria: 1. There are no stores in the loop. 2. The loop must have only one early exit like those shown in the above example. I have referred to such exits as speculative early exits, to distinguish from existing support for early exits where the exit-not-taken count is known exactly at compile time. 3. The early exit block dominates the latch block. 4. The latch block must have an exact exit count. 5. There are no loads after the early exit block. 6. The loop must not contain reductions or recurrences. I don't see anything fundamental blocking vectorisation of such loops, but I just haven't done the work to support them yet. 7. We must be able to prove at compile-time that loops will not contain faulting loads. Tests have been added here: Transforms/LoopVectorize/AArch64/simple_early_exit.ll	2024-09-19 09:41:25 +01:00
ErikHogeman	78e1e6ace6	[LV] Check for vector-to-scalar casts in legalizer (#106244 ) The code makes assumptions later on the operations and their inputs being scalar in the loops that are processed, so we should make sure this is the case in the legalizer.	2024-09-06 11:20:14 +02:00
Madhur Amilkanthwar	cd46829e54	[LV] Fix emission of debug message in legality check (#101924 ) Successful vectorization message is emitted even after "Result" is false. "Result" = false indicates failure of one of the legality check and thus successful message should not be printed.	2024-09-04 16:28:39 +05:30
Daniil Fukalov	0da2ba811a	[NFC] Cleanup in ADT and Analysis headers. (#104484 ) Remove unused directly includes and forward declarations in ADT and Analysis headers.	2024-08-17 13:11:18 +02:00
Florian Hahn	f0df4fbd0c	[LV] Support generating masks for switch terminators. (#99808 ) Update createEdgeMask to created masks where the terminator in Src is a switch. We need to handle 2 separate cases: 1. Dst is not the default desintation. Dst is reached if any of the cases with destination == Dst are taken. Join the conditions for each case where destination == Dst using a logical OR. 2. Dst is the default destination. Dst is reached if none of the cases with destination != Dst are taken. Join the conditions for each case where the destination is != Dst using a logical OR and negate it. Edge masks are created for every destination of cases and/or default when requesting a mask where the source is a switch. Fixes https://github.com/llvm/llvm-project/issues/48188. PR: https://github.com/llvm/llvm-project/pull/99808	2024-08-11 20:38:36 +02:00
Florian Hahn	edf46f365c	[SCEV] Use const SCEV * explicitly in more places. Use const SCEV * explicitly in more places to prepare for https://github.com/llvm/llvm-project/pull/91961. Split off as suggested.	2024-08-03 20:10:01 +01:00
Ramkumar Ramachandra	e1a3aa8c0f	LV/Legality: fix style after cursory reading (NFC) (#100363 )	2024-07-24 16:32:24 +01:00
Kazu Hirata	5c83498984	[Transforms] Use range-based for loops (NFC) (#99607 )	2024-07-21 13:11:25 -07:00
Graham Hunter	22a7f6dcc4	Revert "[LV] Autovectorization for the all-in-one histogram intrinsic" (#98493 ) Reverts llvm/llvm-project#91458 to deal with post-commit reviewer requests.	2024-07-11 16:39:30 +01:00
Graham Hunter	1860fd049e	[LV] Autovectorization for the all-in-one histogram intrinsic (#91458 ) This patch implements limited loop vectorization support for the 'all-in-one' histogram intrinsic. The feature is disabled by default, and when enabled will only vectorize if there are no other users of values in the gather-modify-scatter sequence.	2024-07-11 15:33:30 +01:00
Florian Hahn	0577cdaa32	[LV] Split checking if tail-folding is possible, collecting masked ops. (#77612 ) Introduce new canFoldTail helper which only checks if tail-folding is possible, but without modifying MaskedOps. Just because tail-folding is possible doesn't mean the tail will be folded; that's up to the cost-model to decide. Separating the check if tail-folding is possible and preparing for tail-folding makes sure that MaskedOps is only populated when tail-folding is actually selected. PR: https://github.com/llvm/llvm-project/pull/77612	2024-07-08 16:34:42 +01:00
Nikita Popov	2d209d964a	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902 ) This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.	2024-06-27 16:38:15 +02:00
Florian Hahn	e949b54a5b	[LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (#93499 ) Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount which returns the minimum of the countable exits. When analyzing dependences and computing runtime checks, we need the smallest upper bound on the number of iterations. In terms of memory safety, it shouldn't matter if any uncomputable exits leave the loop, as long as we prove that there are no dependences given the minimum of the countable exits. The same should apply also for generating runtime checks. Note that this shifts the responsiblity of checking whether all exit counts are computable or handling early-exits to the users of LAA. Depends on https://github.com/llvm/llvm-project/pull/93498 PR: https://github.com/llvm/llvm-project/pull/93499	2024-06-04 22:23:30 +01:00
Florian Hahn	b54a78d69b	[LV,LAA] Don't vectorize loops with load and store to invar address. Code checking stores to invariant addresses and reductions made an incorrect assumption that the case of both a load & store to the same invariant address does not need to be handled. In some cases when vectorizing with runtime checks, there may be dependences with a load and store to the same address, storing a reduction value. Update LAA to separately track if there was a store-store and a load-store dependence with an invariant addresses. Bail out early if there as a load-store dependence with invariant address. If there was a store-store one, still apply the logic checking if they all store a reduction.	2024-05-04 20:53:54 +01:00
Niwin Anto	eaf0d82529	[LV] Disable fold tail by masking when IV is used outside (#81609 ) When induction variable are used outside the loop body, tail folding by masking mis-compiles, because for users outside of the loop the final value of the induction is computed separately from the vector loop. Fixes https://github.com/llvm/llvm-project/issues/76069 Fixes https://github.com/llvm/llvm-project/issues/51677	2024-03-04 11:33:30 +00:00
Graham Hunter	b070629c10	[LV] Increase max VF if vectorized function variants exist (#66639 ) If there are function calls in the candidate loop and we have vectorized variants available, try some wider VFs in case the conservative initial maximum based on the widest types in the loop won't actually allow us to make use of those function variants.	2023-11-13 10:27:10 +00:00
Simon Pilgrim	3ca4fe80d4	[Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC. startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)	2023-11-06 16:50:18 +00:00
Florian Hahn	aac8acb115	[VPlan] Model masked assumes as replicate recipes, drop them (NFCI). Replace ConditionalAssume set by treating conditional assumes like other predicated instructions (i.e. create a VPReplicateRecipe with a mask) and later remove any assume recipes with masks during VPlan cleanup. This reduces coupling of VPlan construction and Legal by removing a shared set between the 2 and results in a cleaner code structure overall. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D157034	2023-08-06 20:56:24 +01:00

1 2 3 4

169 Commits