llvm-project

Author	SHA1	Message	Date
Benjamin Maxwell	bb3066d42b	[LAA] Move scalable vector check into `getStrideFromAddRec()` (#154013 ) This moves the check closer to the `.getFixedValue()` call and fixes #153797 (which is a regression from #126971).	2025-08-19 06:40:07 +01:00
Ramkumar Ramachandra	4443b37877	[LAA] Pre-commit tests exercising different types (#151091 ) Pre-commit tests exercising different types of source/sink in depend_diff_types.ll, in preparation to weaken the HasSameSize check in LoopAccessAnalysis. Co-authored-by: Igor Kirillov <igor.kirillov@arm.com>	2025-08-11 10:19:10 +01:00
Florian Hahn	2ae996cbbe	[LAA] Support assumptions in evaluatePtrAddRecAtMaxBTCWillNotWrap (#147047 ) This patch extends the logic added in https://github.com/llvm/llvm-project/pull/128061 to support dereferenceability information from assumptions as well. Unfortunately both assumption cache and the dominator tree need to be threaded through multiple layers to make them available where needed. PR: https://github.com/llvm/llvm-project/pull/147047	2025-08-01 14:18:07 +01:00
Florian Hahn	9e7782db73	[LV,LAA] Add tests where RT checks are known false after expansion.	2025-07-26 14:17:35 +01:00
Florian Hahn	6c50e2b2dd	[SCEV] Don't require NUW at first add when checking A+C1 < (A+C2)<nuw> (#149795 ) Relax the NUW requirements for isKnownPredicateViaNoOverflow, if the second operand (Y) is an ADD. The code only simplifies the condition if C1 < C2, so if the second ADD is NUW, it doesn't matter whether the first operand also has the NUW flag, as it cannot wrap if C1 < C2. https://alive2.llvm.org/ce/z/b3dM7N PR: https://github.com/llvm/llvm-project/pull/149795	2025-07-23 09:33:34 +01:00
Florian Hahn	a11c5dd34b	[LAA] Add test variant with backward dep with overlap in loop. The original test @backward_dep_known_distance_less_than_btc was incorrectly named, as all loads are completely before the first store. Add a variant where this is not the case: @backward_dep_known_distance_less_than_btc	2025-07-21 11:43:35 +01:00
Florian Hahn	5a4586f468	Reapply "[LAA] Remove loop-invariant check added in 234cc40adc61." This reverts commit d43a80936d437d217d5a6dbbaa5fb131c27e7085. With the correctness issue blocking the recommit finally fixed (5d01697ec6cb), again unconditionally check if accesses are completely before or after each other.	2025-07-14 21:21:22 +01:00
Florian Hahn	d749095b94	[LAA] Add tests where we could derive NoDep due to no overlap. Add additional tests where we can prove that the accesses are either completely before or after each other.	2025-07-14 15:48:41 +01:00
Ramkumar Ramachandra	fb845f93c0	[LAA] Hoist setting condition for RT-checks (#128045 ) Strip ShouldRetyWithRuntimeCheck from the DepedenceDistanceStrideAndSizeInfo struct, and free isDependent from the responsibility of setting the condition for when runtime-checks are needed, transferring this responsibility to getDependenceDistanceStrideAndSize. We can have multiple DepType::Unknown dependences that, by themselves, do not trigger the retrying with runtime memory checks, and therefore block vectorization. But once a single FoundNonConstantDistanceDependence is found, the analysis seems to switch to the "LAA: Retrying with memory checks" path and allows all these dependences to be handled via runtime checks. There is hence no rationale for predicating FoundNonConstantDependenceDistance on DepType::Unknown, and removing this predication is one of the side-effects of this patch.	2025-07-07 12:02:41 +01:00
Florian Hahn	0afbf17213	[LAA,LV] Add tests with early-exit loops and deref assumptions. Adds additional test coverage for early-exit loops with deref assumptions, as suggested in https://github.com/llvm/llvm-project/pull/128436.	2025-07-03 19:41:13 +01:00
Ramkumar Ramachandra	619f7afd71	[LAA] Clean up APInt-overflow related code (#140048 ) Co-authored-by: Florian Hahn <flo@fhahn.com>	2025-06-30 14:48:56 +01:00
Florian Hahn	c5a49fb62d	[LAA] Add tests with 128 bit inductions and 128 bit pointers. Adds extra test coverage for https://github.com/llvm/llvm-project/pull/140048.	2025-06-29 11:28:35 +01:00
Florian Hahn	5d01697ec6	[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. (#128061 ) Evaluating AR at the symbolic max BTC may wrap and create an expression that is less than the start of the AddRec due to wrapping (for example consider MaxBTC = -2). If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get incremented by EltSize before returning, so this effectively sets ScEnd to unsigned max. Note that LAA separately checks that accesses cannot not wrap (52ded672492, https://github.com/llvm/llvm-project/pull/127543), so unsigned max represents an upper bound. When there is a computable backedge-taken count, we are guaranteed to execute the number of iterations, and if any pointer would wrap it would be UB (or the access will never be executed, so cannot alias). It includes new tests from the previous discussion that show a case we wrap with a BTC, but it is UB due to the pointer after the object wrapping (in `evaluate-at-backedge-taken-count-wrapping.ll`) When we have only a maximum backedge taken count, we instead try to use dereferenceability information to determine if the pointer access must be in bounds for the maximum backedge taken count. PR: https://github.com/llvm/llvm-project/pull/128061	2025-06-23 20:23:40 +01:00
Florian Hahn	f4ca223196	[LAA] Update early-exit test to cover last valid & first invalid access. Make sure tests cover loops accessing the last valid and the first invalid memory location. Note that the test `@all_exits_dominate_latch_countable_exits_at_most_1000_iterations_kno` has been modified to execute at most 1000 iterations; it previously executed 1001 iterations.	2025-06-23 15:06:32 +01:00
Nikita Popov	6157028fea	[BasicAA][ValueTracking] Increase depth for underlying object search (#143714 ) This depth limits a linear search (rather than the usual potentially exponential one) and is not particularly important for compile-time in practice. The change in #137297 is going to increase the length of GEP chains, so I'd like to increase this limit a bit to reduce the chance of regressions (https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2419 showed a 13% increase in SearchLimitReached). There is no particular significance to the new value of 10. Compile-time is neutral.	2025-06-12 09:19:50 +02:00
John Brawn	81d3189891	[LAA] Keep pointer checks on partial analysis (#139719 ) Currently if there's any memory access that AccessAnalysis couldn't analyze then all of the runtime pointer check results are discarded. This patch makes this able to be controlled with the AllowPartial option, which makes it so we generate the runtime check information for those pointers that we could analyze, as transformations may still be able to make use of the partial information. Of the transformations that use LoopAccessAnalysis, only LoopVersioningLICM changes behaviour as a result of this change. This is because the others either: * Check canVectorizeMemory, which will return false when we have partial pointer information as analyzeLoop() will return false. * Examine the dependencies returned by getDepChecker(), which will be empty as we exit analyzeLoop if we have partial pointer information before calling areDepsSafe(), which is what fills in the dependency information.	2025-06-04 16:47:20 +01:00
Florian Hahn	0ba63b2f22	[SCEV] Add additional test coverage for loop-guards reasoning. Add additional tests showing missed opportunities when using loop guards for reasoning in SCEV, depending on the order the guards appear in the IR.	2025-06-01 22:39:37 +01:00
Ramkumar Ramachandra	bb2791609d	[LAA] Tweak debug output for UTC stability (#140764 ) UpdateTestChecks has a make_analyzer_generalizer to replace pointer addressess from the debug output of LAA with a pattern, which is an acceptable solution when there is one RUN line. However, when there are multiple RUN lines with a common pattern, UTC fails to recognize common output due to mismatched pointer addresses. Instead of hacking UTC scrub the output before comparing the outputs from the different RUN lines, fix the issue once and for all by making LAA not output unstable pointer addresses in the first place. The removal of the now-dead make_analyzer_generalizer is left as a non-trivial exercise for a follow-up.	2025-05-21 12:01:49 +01:00
Ramkumar Ramachandra	b18ebd17b8	[LAA] Improve forked-pointers.ll test (#140383 )	2025-05-18 09:34:27 +01:00
Florian Hahn	9da103ab9e	[LAA] Update remaining tests after 384a5b00a7.	2025-05-07 21:02:44 +01:00
vaibhav	384a5b00a7	[LAA] Use MaxStride instead of CommonStride to calculate MaxVF (#98142 ) We bail out from MaxVF calculation if the strides are not the same. Instead, we are dependent on runtime checks, though not yet implemented. We could instead use the MaxStride to conservatively use an upper bound. This handles cases like the following: ```c #define LEN 256 * 256 float a[LEN]; void gather() { for (int i = 0; i < LEN - 1024 - 255; i++) { #pragma clang loop interleave(disable) #pragma clang loop unroll(disable) for (int j = 0; j < 256; j++) a[i + j + 1024] += a[j * 4 + i]; } } ``` --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2025-05-07 21:02:21 +01:00
Vikram Hegde	f91a6e6dab	[SCEV] Reject comparision of pointers to different address spaces in SCEVWrapPredicate::implies (#137935 )	2025-04-30 21:03:31 +05:30
Ramkumar Ramachandra	13b443f2ef	[LAA] Improve convergent tests (#136758 ) LoopAccessAnalysis has code for handling function calls where the function is marked with the 'convergent' attribute, but the test coverage is insufficient. Fix this by adding a test showing the case of no-runtime-checks adapted from LoopDistribute, and clean up the existing test with runtime-checks. Also regenerate the test file with UpdateTestChecks.	2025-04-29 09:50:33 +01:00
Alexey Bataev	ea1b525ceb	[LAA] Add tests with non-power-of-2 store-load forward distance (#136710 )	2025-04-28 15:10:55 -07:00
Alexey Bataev	88f8637d22	Revert "[LAA] Add tests with non-power-of-2 store-load forward distance (#136710 )" This reverts commit 51bbebb6677bb0ea14ca62cc140492965c2a6e19 to fix buildbots https://lab.llvm.org/buildbot/#/builders/137/builds/17662	2025-04-28 14:36:44 -07:00
Alexey Bataev	51bbebb667	[LAA] Add tests with non-power-of-2 store-load forward distance (#136710 )	2025-04-28 17:02:02 -04:00
Alexey Bataev	78777a204a	[LV]Split store-load forward distance analysis from other checks, NFC (#121156 ) The patch splits the store-load forwarding distance analysis from other dependency analysis in LAA. Currently it supports only power-of-2 distances, required to support non-power-of-2 distances in future. Part of #100755	2025-03-31 07:28:44 -04:00
Florian Hahn	8bdcd0a96e	[LAA] Add missing test coverage for retrying with runtime checks. Adds extra test coverage showing change by https://github.com/llvm/llvm-project/pull/128045.	2025-03-27 19:09:10 +00:00
Florian Hahn	dfb661cd1c	[LAA] Add extra tests for #128061 . Extend test coverage for https://github.com/llvm/llvm-project/pull/128061.	2025-03-13 21:42:32 +00:00
Florian Hahn	275baedfde	[LAA] Consider accessed addrspace when mapping underlying obj to access. (#129087 ) In some cases, it is possible for the same underlying object to be accessed via pointers to different address spaces. This could lead to pointers from different address spaces ending up in the same dependency set, which isn't allowed (and triggers an assertion). Update the mapping from underlying object -> last access to also include the accessing address space. Fixes https://github.com/llvm/llvm-project/issues/124759. PR: https://github.com/llvm/llvm-project/pull/129087	2025-02-28 20:56:12 +00:00
Florian Hahn	52ded67249	[LAA] Always require non-wrapping pointers for runtime checks. (#127543 ) Currently we only check if the pointers involved in runtime checks do not wrap if we need to perform dependency checks. If that's not the case, we generate runtime checks, even if the pointers may wrap (see test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll). If the pointer wraps, then we swap start and end of the runtime check, leading to incorrect checks. An Alive2 proof of what the runtime checks are checking conceptually (on i4 to have it complete in reasonable time) showing the incorrect result should be https://alive2.llvm.org/ce/z/KsHzn8 Depends on https://github.com/llvm/llvm-project/pull/127410 to avoid more regressions. PR: https://github.com/llvm/llvm-project/pull/127543	2025-02-20 19:00:23 +01:00
Florian Hahn	01d0793a69	[LAA] Make Ptr argument optional in isNoWrap. (#127410 ) Update isNoWrap to make the IR Ptr argument optional. This allows using isNoWrap when dealing with things like pointer-selects, where a select is translated to multiple pointer SCEV expressions, but there is no IR value that can be used. We don't try to retrieve pointer values for the pointer SCEVs and using info from the IR would not be safe. For example, we cannot use inbounds, because the pointer may never be accessed. PR: https://github.com/llvm/llvm-project/pull/127410	2025-02-19 14:51:19 +01:00
Ramkumar Ramachandra	6646b65082	[LAA] Rework and rename stripGetElementPtr (#125315 ) The stripGetElementPtr function is mysteriously named, and calls into another mysterious getGEPInductionOperand which does something complicated with GEP indices. The real purpose of the badly-named stripGetElementPtr function is to get a loop-variant GEP index, if there is one. The getGEPInductionOperand is totally redundant, as stripping off zeros from the end of GEP indices has no effect on computing the loop-variant GEP index, as constant zeros are always loop-invariant. Moreover, the GEP induction operand is simply the first non-zero index from the end, which stripGetElementPtr returns when it finds that any of the GEP indices are loop-variant: this is a completely unrelated value to the GEP index that is loop-variant. The implicit assumption here is that there is only ever one loop-variant index, and it is the first non-zero one from the end. The logic is unnecessarily complicated for what stripGetElementPtr wants to achieve, and the header comments are confusing as well. Strip getGEPInductionOperand, rework and rename stripGetElementPtr.	2025-02-18 10:25:47 +00:00
Florian Hahn	88e72c401b	[LAA] Add test where GEPs may wrap.	2025-02-17 21:49:40 +01:00
Florian Hahn	e080366a76	[LAA] Inline hasComputableBounds in only caller, simplify isNoWrap. Inline hasComputableBounds into createCheckForAccess. This removes a level of indirection and allows for passing the AddRec directly to isNoWrap, removing the need to retrieve the AddRec for the pointer again. The early continue for invariant SCEVs now also applies to forked pointers (i.e. when there's more than one entry in TranslatedPtrs) when ShouldCheckWrap is true, as those trivially won't wrap. The change is NFC otherwise. replaceSymbolicStrideSCEV is now called earlier.	2025-02-16 19:56:13 +01:00
Florian Hahn	044b52832a	[LAA] Perform checks for no-wrap separately from getPtrStride. (#126971 ) Reorganize the code in isNoWrap to perform the no-wrap checks without relying on getPtrStride directly. getPtrStride now uses isNoWrap. The new structure allows deriving no-wrap in more cases in LAA, because there are some cases where getPtrStride bails out early because it cannot return a constant stride, but we can still prove no-wrap for the pointer. An example are AddRecs with non-ConstantInt strides with inbound GEPs, in the improved test cases. This enables vectorization with runtime checks in a few more cases. PR: https://github.com/llvm/llvm-project/pull/126971	2025-02-14 20:06:37 +01:00
Florian Hahn	1199bbb396	[LAA] Add forked pointers tests with dep checks and runtime checks (NFC) Add missing test coverage where generating runtime checks is tried again after dependence analysis.	2025-02-14 19:45:52 +01:00
Ramkumar Ramachandra	8327c2cfdb	LAA: fix logic for MaxTargetVectorWidth (#125487 ) Uses the fixed register width if scalable vectorization is not enabled (via TargetTransformInfo::enableScalableVectorization) and improves results if there are scalable vector registers, but they shouldn't be used.	2025-02-13 11:40:05 +00:00
Florian Hahn	82605285b8	[LAA] Also clear CheckingGroups in RuntimePointerChecking::reset. This fixes a crash when trying to print access-info in the newly added test cases.	2025-02-12 21:49:22 +01:00
Ramkumar Ramachandra	8fe7860610	LAA/test: cover invariant stores with unit stride (#124586 ) LoopAccessAnalysis is missing coverage of the special-case of invariant stores with unit stride. It was previously determined that stride-versioning for stores is not profitable, but test coverage is missing. Fix this.	2025-01-28 10:02:28 +00:00
Ramkumar Ramachandra	3a4376b8f9	LAA: handle 0 return from getPtrStride correctly (#124539 ) getPtrStride returns 0 when the PtrScev is loop-invariant, and this is not an erroneous value: it returns std::nullopt to communicate that it was not able to find a valid pointer stride. In analyzeLoop, we call getPtrStride with a value_or(0) conflating the zero return value with std::nullopt. Fix this, handling loop-invariant loads correctly.	2025-01-27 14:21:14 +00:00
Ramkumar Ramachandra	a94f08174c	LAA: regen a test with UTC (NFC) (#122748 )	2025-01-14 09:02:33 +00:00
Ramkumar Ramachandra	8b4561467e	LAA: add missed swap when inverting src, sink (#122254 ) When inverting source and sink on a negative induction step, the types of the source and sink should also be swapped. This fixes a bug in the code that follows, that computes properties based on these types. With 234cc40 ([LAA] Limit no-overlap check to at least one loop-invariant accesses.), that code is guarded by a loop-invariant condition: however, the commit did not add any new tests exercising the guarded code, and hence the bugfix in this patch requires additional tests to exercise that guarded codepath.	2025-01-13 13:07:19 +00:00
Florian Hahn	8fce5d96a7	[SCEV] Update changed test after df8efbdbb. Test needed updating due to changes on main since branch was tested.	2024-12-20 21:13:35 +00:00
Florian Hahn	df8efbdbbf	[SCEV] Remove existing predicates implied by newly added ones. (#118185 ) When adding a new predicate to a union predicate, some of the existing predicates may be implied by the new predicate. Remove any existing predicates that are already implied by the new predicate. Depends on https://github.com/llvm/llvm-project/pull/118184 to show the main benefit. PR: https://github.com/llvm/llvm-project/pull/118185	2024-12-20 20:49:37 +00:00
Florian Hahn	95eb49a090	[SCEV] Bail out on mixed int/pointer in SCEVWrapPredicate::implies. Fixes a crash when trying to extend the pointer start value to a narrow integer type after b6c29fdffd65.	2024-12-18 11:22:39 +00:00
Florian Hahn	7bfcf93527	[SCEV] Use Step and Start to check if SCEVWrapPredicate is implied. (#118184 ) A SCEVWrapPredicate A implies B, if * they have the same flag, * both steps are positive and * B's start and step are ULE/SLE (for NSUW/NSSW) than A's. See https://alive2.llvm.org/ce/z/n2T4ss (first pair with known constants as strides, second pair with variable strides). Note that this is limited to steps of the same size, due to NSUW having slightly different semantics than regular NUW. We should be able to remove this restriction for NSSW (which matches NSW) in the future. PR: https://github.com/llvm/llvm-project/pull/118184	2024-12-16 15:51:22 +00:00
Florian Hahn	1c702d3854	[SCEV] Add tests where one wrap predicate implies another.	2024-11-30 14:12:54 +00:00
Nikita Popov	e636434bdf	[BasicAA][LAA] Don't use same-block phis in cross iteration mode (#116802 ) In 4de3184f07fd8c548125d315dd306d4afa7c9698 we exposed BasicAA's cross-iteration mode for use in LAA, so we can handle selects with equal conditions correctly (where the select condition is not actually equal across iterations). However, if we replace the selects with equivalent phis, the issue still exists. In the phi case, we effectively still have an assumption that the condition(s) that control which phi arg is used will be the same across iterations. Fix this by disabling this phi handling in cross-iteration mode. (I'm not entirely sure whether this is also needed when BasicAA enables cross-iteration mode during internal phi recursion, but I wouldn't be surprised if that's the case.)	2024-11-27 09:38:51 +01:00
Nikita Popov	681939e154	[LAA] Add phi test variant for cross-iteration dependence (NFC)	2024-11-19 14:28:25 +01:00

1 2 3 4 5

242 Commits