As the name of the function suggests, convertPointerToIntegerType should
return an IntegerType instead of a Type, and should only ever be called
with integer or pointer types. Fix the callers getWiderType and
addInductionPhi to narrow the type of WidestIndTy to IntegerType,
stripping unneeded casts. While at it, rename convertPointerToIntegerType
and getWiderType for clarity.
This patch adds initial support for vectorizing literal struct return
values. Currently, this is limited to the case where the struct is
homogeneous (all elements have the same type) and not packed. The users
of the call also must all be `extractvalue` instructions.
The intended use case for this is vectorizing intrinsics such as:
```
declare { float, float } @llvm.sincos.f32(float %x)
```
Mapping them to structure-returning library calls such as:
```
declare { <4 x float>, <4 x float> } @Sleef_sincosf4_u10advsimd(<4 x float>)
```
Or their widened form (such as `@llvm.sincos.v4f32` in this case).
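For illustration, with VF = 4 the scalar call and its extracts could be
widened roughly as follows (a sketch; value names are illustrative):
```
; Scalar loop body:
%res = call { float, float } @llvm.sincos.f32(float %x)
%sin = extractvalue { float, float } %res, 0
%cos = extractvalue { float, float } %res, 1

; Widened form (VF = 4):
%wide.res = call { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %wide.x)
%wide.sin = extractvalue { <4 x float>, <4 x float> } %wide.res, 0
%wide.cos = extractvalue { <4 x float>, <4 x float> } %wide.res, 1
```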
Implementing this required two main changes:
1. Supporting widening `extractvalue`
2. Adding support for vectorized struct types in LV
   * This is mostly limited to parts of the cost model and scalarization
Since the supported use case is narrow, the required changes are
relatively small.
I've removed the HasUncountableEarlyExit variable, since we can
already determine whether or not a loop has an early exit by seeing
if we found an uncountable exit.
I have also deleted the old UncountableExitingBlocks and
UncountableExitBlocks lists and replaced them with a single
uncountable edge. This means we don't need to worry about keeping the
list entries in sync and makes it clear which exiting block
corresponds to which exit block.
Currently we emit early-exit related debug messages/remarks even when
there is a single exit. Update the code to only call
isVectorizableEarlyExitLoop when the loop does not have a single exit
block.
PR: https://github.com/llvm/llvm-project/pull/121994
This is a split-off from #109833 and only adds code relating to checking
if a struct-returning call can be vectorized.
This initial patch only allows the case where all users of the struct
return are `extractvalue` operations that can be widened.
```
%call = tail call { float, float } @foo(float %in_val)
%extract_a = extractvalue { float, float } %call, 0
%extract_b = extractvalue { float, float } %call, 1
```
Note: The tests require the VFABI changes from #119000 to pass.
- update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for
all uses, to allow specification of target-specific intrinsics
- add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` API
- update the TTI API to provide `isTargetIntrinsicWith...` functions and
consistently name them
- move `isTriviallyScalarizable` to VectorUtils
- update all uses of the API and provide the TTI parameter
Resolves #117030
A more lightweight variant of
https://github.com/llvm/llvm-project/pull/109193,
which dispatches to multiple exit blocks via the middle blocks.
The patch also introduces a bit of required scaffolding to enable
early-exit vectorization, including an option. At the moment, early-exit
vectorization doesn't come with legality checks, and is only used if the
option is provided and the loop has metadata forcing vectorization. This
is only intended to be used for testing during bring-up, with automatic
early-exit vectorization to be enabled later by plugging in @david-arm's
changes from https://github.com/llvm/llvm-project/pull/88385.
PR: https://github.com/llvm/llvm-project/pull/112138
* Rename Speculative -> Uncountable and update tests.
* Add comments explaining why it's safe to ignore the predicates when
building up a list of exiting blocks.
* Reshuffle some code to do (hopefully) cheaper checks first.
Currently, if a loop contains loads that we can prove at compile time
are dereferenceable when certain conditions are satisfied, the function
isDereferenceableAndAlignedInLoop will still return false, because
getSmallConstantMaxTripCount will return 0 when SCEV predicates are
required. This patch changes getSmallConstantMaxTripCount to take an
optional Predicates pointer argument so that we can permit functions
such as isDereferenceableAndAlignedInLoop to consider more cases.
This patch is split off from PR #88385 and concerns only the code
related to the legality of vectorising early exit loops. It is the first
step in adding support for vectorisation of a simple class of loops that
typically involves searching for something, i.e.
  for (int i = 0; i < n; i++) {
    if (p[i] == val)
      return i;
  }
  return n;
or
  for (int i = 0; i < n; i++) {
    if (p1[i] != p2[i])
      return i;
  }
  return n;
In this initial commit LoopVectorizationLegality will only consider
early exit loops legal for vectorising if they follow these criteria:
1. There are no stores in the loop.
2. The loop must have only one early exit like those shown in the above
example. I have referred to such exits as speculative early exits, to
distinguish from existing support for early exits where the
exit-not-taken count is known exactly at compile time.
3. The early exit block dominates the latch block.
4. The latch block must have an exact exit count.
5. There are no loads after the early exit block.
6. The loop must not contain reductions or recurrences. I don't see
anything fundamental blocking vectorisation of such loops, but I just
haven't done the work to support them yet.
7. We must be able to prove at compile-time that loops will not contain
faulting loads.
Tests have been added here:
Transforms/LoopVectorize/AArch64/simple_early_exit.ll
The code later assumes that the operations and their inputs in the
processed loops are scalar, so we should make sure this is the case in
the legalizer.
The successful-vectorization message is emitted even when "Result" is
false. "Result" being false indicates that one of the legality checks
failed, so the success message should not be printed.
Update createEdgeMask to create masks where the terminator in Src is a
switch. We need to handle 2 separate cases (see the sketch below):
1. Dst is not the default destination. Dst is reached if any of the
cases with destination == Dst are taken. Join the conditions for each
case where destination == Dst using a logical OR.
2. Dst is the default destination. Dst is reached if none of the cases
with destination != Dst are taken. Join the conditions for each case
where destination != Dst using a logical OR and negate the result.
Edge masks are created for every destination of the cases and/or the
default when requesting a mask where the source is a switch.
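A minimal sketch of the two cases (illustrative values and block names):
```
switch i32 %x, label %default [
  i32 0, label %dst
  i32 5, label %dst
]
; Mask for the edge to %dst:     (%x == 0) | (%x == 5)
; Mask for the edge to %default: !((%x == 0) | (%x == 5))
```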
Fixes https://github.com/llvm/llvm-project/issues/48188.
PR: https://github.com/llvm/llvm-project/pull/99808
This patch implements limited loop vectorization support for the 'all-in-one' histogram intrinsic. The feature is disabled by default, and when enabled will only vectorize if there are no other users of values in the gather-modify-scatter sequence.
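As a rough sketch (illustrative names), the scalar gather-modify-scatter
sequence corresponds to an indexed bucket increment, in C terms
`buckets[indices[i]]++`:
```
%idx = load i32, ptr %gep.indices
%idx.ext = zext i32 %idx to i64
%gep.bucket = getelementptr inbounds i32, ptr %buckets, i64 %idx.ext
%old = load i32, ptr %gep.bucket
%inc = add i32 %old, 1
store i32 %inc, ptr %gep.bucket
```
If intermediate values such as %old or %inc had additional users, the
loop would be left unvectorized.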
Introduce a new canFoldTail helper which only checks if tail-folding is
possible, but without modifying MaskedOps.
Just because tail-folding is possible doesn't mean the tail will be
folded; that's up to the cost-model to decide. Separating the check if
tail-folding is possible and preparing for tail-folding makes sure that
MaskedOps is only populated when tail-folding is actually selected.
PR: https://github.com/llvm/llvm-project/pull/77612
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount, which returns
the minimum of the exit counts of the countable exits.
When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if any uncomputable exits leave the loop,
as long as we prove that there are no dependences given the minimum of
the countable exits. The same should apply also for generating runtime
checks.
Note that this shifts the responsibility of checking whether all exit
counts are computable or handling early exits to the users of LAA.
Depends on https://github.com/llvm/llvm-project/pull/93498
PR: https://github.com/llvm/llvm-project/pull/93499
Code checking stores to invariant addresses and reductions made an
incorrect assumption that the case of both a load & store to the same
invariant address does not need to be handled.
In some cases when vectorizing with runtime checks, there may be
dependences with a load and store to the same address, storing a
reduction value.
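For illustration, such a loop might look like this (a minimal sketch
with illustrative names), where %sum.addr is the invariant address
holding the reduction value:
```
loop:
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
  %sum = load float, ptr %sum.addr
  %gep = getelementptr inbounds float, ptr %a, i64 %iv
  %val = load float, ptr %gep
  %sum.next = fadd float %sum, %val
  store float %sum.next, ptr %sum.addr
  %iv.next = add nuw nsw i64 %iv, 1
  %ec = icmp eq i64 %iv.next, %n
  br i1 %ec, label %exit, label %loop
```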
Update LAA to separately track whether there was a store-store and a
load-store dependence with an invariant address.
Bail out early if there was a load-store dependence with an invariant
address. If there was a store-store one, still apply the logic checking
if they all store a reduction.
If there are function calls in the candidate loop and we have vectorized
variants available, try some wider VFs in case the conservative initial
maximum based on the widest types in the loop won't actually allow us
to make use of those function variants.
Replace ConditionalAssume set by treating conditional assumes like other
predicated instructions (i.e. create a VPReplicateRecipe with a mask)
and later remove any assume recipes with masks during VPlan cleanup.
This reduces coupling of VPlan construction and Legal by removing a
shared set between the two and results in a cleaner code structure
overall.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D157034
After 572cfa3fde5433, isUniform now checks VF-based uniformity instead of
just invariance as before.
As follow-up cleanup suggested in D148841, separate the invariance check
out and update callers that currently check only for invariance.
This also moves the implementation of isUniform from LoopAccessAnalysis
to LoopVectorizationLegality, as LoopAccessAnalysis doesn't use the more
general isUniform.
This patch uses SCEV to check if a value is uniform across a given VF.
The basic idea is to construct SCEVs where the AddRecs of the loop are
adjusted to reflect the version in the vectorized loop (Step multiplied
by VF). We construct a SCEV for the value of vector lane 0
(offset 0) and compare it to the expressions for lanes 1 to the last
vector lane (VF - 1). If they are equal, consider the expression
uniform. While re-writing expressions, we also need to catch
expressions for which we cannot determine uniformity (e.g. SCEVUnknown).
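A small worked example of the lane comparison (a sketch assuming VF = 4
and a canonical induction variable starting at 0 with step 1):
```
%d = udiv i64 %iv, 4
; AddRec for %iv adjusted for the vector loop (step multiplied by VF): {0,+,4}
; Lane 0:            {0,+,4} /u 4
; Lane k (k = 1..3): ({0,+,4} + k) /u 4
; These evaluate to the same value within each vector iteration, so %d
; is considered uniform across VF = 4.
```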
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D148841
The original commit wasn't quite NFC, and this was caught by an arguably overly strong assert. Specifically, I'd failed to strip the integer cast off the SCEV before saving it in the map. The result - other than a failed assert - is that we'd speculate on the casted unknown, not the unknown. The only case I can think of where that might change behavior would be a sext(i1 load). I doubt that case is interesting in practice, but it's good to be strictly NFC on this change regardless.
Original commit message follows..
The existing code makes it hard to tell that collectStridedAccess is really about identifying some loop invariant SCEV which is *profitable* to speculate is equal to one. The odd dual usage structure of Value and SCEV confuses this point.
We could choose to loosen the profitability analysis if desired. I'm not proposing doing so at this time as it exposes too many cases where the speculation is unprofitable.
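For example (a sketch with illustrative names), the loop-invariant
%stride below is the kind of value that is profitable to speculate equal
to one, guarded by a runtime check:
```
%mul = mul i64 %iv, %stride
%gep = getelementptr inbounds float, ptr %p, i64 %mul
%v = load float, ptr %gep
; Speculating %stride == 1 turns this into a contiguous, unit-stride access.
```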
Differential Revision: https://reviews.llvm.org/D147750
This reverts commit d5b840131223f2ffef4e48ca769ad1eb7bb1869a. Running this through broader testing after rebasing is revealing a crash. Reverting while I investigate.
The existing code makes it hard to tell that collectStridedAccess is really about identifying some loop invariant SCEV which is *profitable* to speculate is equal to one. The odd dual usage structure of Value and SCEV confuses this point.
We could choose to loosen the profitability analysis if desired. I'm not proposing doing so at this time as it exposes too many cases where the speculation is unprofitable.
Differential Revision: https://reviews.llvm.org/D147750
This reverts the revert commit 3d8ed8b5192a59104bfbd5bf7ac84d035ee0a4a5.
The new version of the patch adds a set to avoid duplicating work in
isFixedOrderRecurrence, which was previously done through the removed
SinkAfter map.
Original commit message:
Building on D142885 and D142589, retire the SinkAfter map from the
recurrence handling code. It is replaced by checking whether it is
possible to sink all users of a recurrence directly in VPlan. This
results in simpler code overall and allows to handle additional cases
(see the improvements in @test_crash).
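For illustration (a sketch with illustrative names), %use below reads
the previous iteration's value through the recurrence phi %for and must
be sunk past the definition of %next when vectorizing:
```
loop:
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
  %for = phi float [ 0.0, %entry ], [ %next, %loop ]
  %use = fadd float %for, 1.0
  %gep.a = getelementptr inbounds float, ptr %a, i64 %iv
  %next = load float, ptr %gep.a
  %gep.b = getelementptr inbounds float, ptr %b, i64 %iv
  store float %use, ptr %gep.b
  %iv.next = add nuw nsw i64 %iv, 1
  %ec = icmp eq i64 %iv.next, %n
  br i1 %ec, label %exit, label %loop
```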
Depends on D142885.
Depends on D142589.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142886
Building on D142885 and D142589, retire the SinkAfter map from the
recurrence handling code. It is replaced by checking whether it is
possible to sink all users of a recurrence directly in VPlan. This
results in simpler code overall and allows to handle additional cases
(see the improvements in @test_crash).
Depends on D142885.
Depends on D142589.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142886
(JFYI - This has been heavily reframed since original attempt at landing.)
This change updates the InductionDescriptor logic to allow matching a pointer IV with a non-constant stride, but also updates the LoopVectorizer to bailout on such descriptors by default. This preserves the default vectorizer behavior.
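For reference, a minimal sketch of such a pointer IV (illustrative
names), where %stride is loop-invariant but not a compile-time constant:
```
%p = phi ptr [ %start, %entry ], [ %p.next, %loop ]
%v = load float, ptr %p
%p.next = getelementptr inbounds i8, ptr %p, i64 %stride
```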
In review, it was pointed out that there's multiple unfortunate performance implications which need to be addressed before this can be enabled. Having a flag allows us to exercise the behavior, and write test cases for logic which is otherwise unreachable (or hard to reach).
This will also enable non-constant stride pointer recurrences for other consumers. I've audited said code, and don't see any obvious issues.
Differential Revision: https://reviews.llvm.org/D147336
LLVM has the ability to vectorize using function variants that require
a mask by creating an all-true mask, and to vectorize a conditional
call via scalarization. Now we want to join the two parts together
and use a masked variant when a mask is required.
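As a sketch (the variant name and signature here are hypothetical), a
conditional call would then be widened to a call of the masked vector
variant, passing the block mask, while an unconditional call would pass
an all-true mask:
```
%wide.r = call <4 x float> @foo_masked_v4(<4 x float> %wide.x, <4 x i1> %mask)
```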
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D136251
This patch adds support for scalarizing calls to a function when
there is a vector variant that cannot be used, either because there
isn't a masked variant or because the cost model indicated a VF
without a masked variant was better.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D134422
Use LoopAccessInfoManager directly instead of various GetLAA lambdas.
Depends on D134608.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D134609
Fixes #57572
Generally the LICM pass is responsible for moving code that calculates
an invariant address out of the loop, as it only needs to be calculated
once. But in the rare case where that does not happen, we will not be
able to vectorize the loop.
Differential Revision: https://reviews.llvm.org/D133687
This is a purely NFC restructure in advance of a change which actually exposes zero strides. This is mostly because I find this interface confusing each time I look at it.