llvm-project

Author	SHA1	Message	Date
Florian Hahn	5a4586f468	Reapply "[LAA] Remove loop-invariant check added in 234cc40adc61." This reverts commit d43a80936d437d217d5a6dbbaa5fb131c27e7085. With the correctness issue blocking the recommit finally fixed (5d01697ec6cb), again unconditionally check if accesses are completely before or after each other.	2025-07-14 21:21:22 +01:00
Florian Hahn	9693056aac	[LAA] Move code to check if accesses are completely before/after (NFC). Factor out code to check if accesses are completely before/after each other. This reduces the diff for an upcoming re-commit, and moving to a function also helps to reduce the nesting level via early exits.	2025-07-11 19:53:57 +01:00
Ramkumar Ramachandra	20864c4379	[LAA] Strip outdated comment in isDependent (NFC) (#146367 ) The comment has been outdated since 87ddd3a1 ([LAA] Rename and fix semantics of MaxSafeDepDistBytes to MinDepDistBytes).	2025-07-07 13:54:37 +01:00
Ramkumar Ramachandra	fb845f93c0	[LAA] Hoist setting condition for RT-checks (#128045 ) Strip ShouldRetyWithRuntimeCheck from the DepedenceDistanceStrideAndSizeInfo struct, and free isDependent from the responsibility of setting the condition for when runtime-checks are needed, transferring this responsibility to getDependenceDistanceStrideAndSize. We can have multiple DepType::Unknown dependences that, by themselves, do not trigger the retrying with runtime memory checks, and therefore block vectorization. But once a single FoundNonConstantDistanceDependence is found, the analysis seems to switch to the "LAA: Retrying with memory checks" path and allows all these dependences to be handled via runtime checks. There is hence no rationale for predicating FoundNonConstantDependenceDistance on DepType::Unknown, and removing this predication is one of the side-effects of this patch.	2025-07-07 12:02:41 +01:00
Ramkumar Ramachandra	619f7afd71	[LAA] Clean up APInt-overflow related code (#140048 ) Co-authored-by: Florian Hahn <flo@fhahn.com>	2025-06-30 14:48:56 +01:00
Florian Hahn	b8769104f1	[LAA] Address follow-up suggestions for #128061 . Adjust naming and add argument comments as suggested.	2025-06-24 12:00:17 +01:00
Florian Hahn	5d01697ec6	[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. (#128061 ) Evaluating AR at the symbolic max BTC may wrap and create an expression that is less than the start of the AddRec due to wrapping (for example consider MaxBTC = -2). If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get incremented by EltSize before returning, so this effectively sets ScEnd to unsigned max. Note that LAA separately checks that accesses cannot wrap (52ded672492, https://github.com/llvm/llvm-project/pull/127543), so unsigned max represents an upper bound. When there is a computable backedge-taken count, we are guaranteed to execute the number of iterations, and if any pointer would wrap it would be UB (or the access will never be executed, so cannot alias). It includes new tests from the previous discussion that show a case we wrap with a BTC, but it is UB due to the pointer after the object wrapping (in `evaluate-at-backedge-taken-count-wrapping.ll`). When we have only a maximum backedge taken count, we instead try to use dereferenceability information to determine if the pointer access must be in bounds for the maximum backedge taken count. PR: https://github.com/llvm/llvm-project/pull/128061	2025-06-23 20:23:40 +01:00
Ramkumar Ramachandra	c8c4bd1ebc	[LV] Strengthen loop-invariance checks in isPredicatedInst (#140744 ) Check loop-invariance against SCEV as well.	2025-06-20 14:01:48 +01:00
Kazu Hirata	03f616eb3a	[llvm] Compare std::optional<T> to values directly (NFC) (#143340 ) This patch transforms: X && *X == Y to: X == Y where X is of std::optional<T>, and Y is of T or similar.	2025-06-08 22:37:59 -07:00
John Brawn	81d3189891	[LAA] Keep pointer checks on partial analysis (#139719 ) Currently if there's any memory access that AccessAnalysis couldn't analyze then all of the runtime pointer check results are discarded. This patch makes this able to be controlled with the AllowPartial option, which makes it so we generate the runtime check information for those pointers that we could analyze, as transformations may still be able to make use of the partial information. Of the transformations that use LoopAccessAnalysis, only LoopVersioningLICM changes behaviour as a result of this change. This is because the others either: * Check canVectorizeMemory, which will return false when we have partial pointer information as analyzeLoop() will return false. * Examine the dependencies returned by getDepChecker(), which will be empty as we exit analyzeLoop if we have partial pointer information before calling areDepsSafe(), which is what fills in the dependency information.	2025-06-04 16:47:20 +01:00
Ramkumar Ramachandra	ba57ff66a3	[LAA] Improve code in findForkedSCEVs (NFC) (#140384 )	2025-06-03 11:00:37 +01:00
Jon Roelofs	798058fca5	[Remarks] Remove an upcast footgun. NFC (#142191 ) CodeRegion's were previously passed as Value*, but then immediately upcast to BasicBlock. Let's keep the type information around until the use cases for non-BasicBlock code regions actually materialize.	2025-05-31 11:07:54 -07:00
Kazu Hirata	89308de4b0	[llvm] Value-initialize values with *Map::try_emplace (NFC) (#141522 ) try_emplace value-initializes values, so we do not need to pass nullptr to try_emplace when the value types are raw pointers or std::unique_ptr<T>.	2025-05-26 15:13:02 -07:00
Florian Hahn	c554fc9245	[LAA] Use m_scev_AffineAddRec in LAA (NFC).	2025-05-26 19:58:22 +01:00
Kazu Hirata	0918361d8b	[Analysis] Remove unused includes (NFC) (#141319 ) These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.	2025-05-23 23:59:56 -07:00
Ramkumar Ramachandra	5a1311d516	[LAA] Strip isNoWrapGEP: dead code (NFC) (#140308 ) isNoWrap is the only caller of isNoWrapGEP, and it has a subsuming check on the GEP immediately after.	2025-05-22 22:47:17 +01:00
Florian Hahn	4a6b1fb9da	[LAA] Remove dead SE arg from canCheckPtrAtRT (NFC).	2025-05-22 20:05:35 +01:00
Ramkumar Ramachandra	bb2791609d	[LAA] Tweak debug output for UTC stability (#140764 ) UpdateTestChecks has a make_analyzer_generalizer to replace pointer addresses from the debug output of LAA with a pattern, which is an acceptable solution when there is one RUN line. However, when there are multiple RUN lines with a common pattern, UTC fails to recognize common output due to mismatched pointer addresses. Instead of hacking UTC to scrub the output before comparing the outputs from the different RUN lines, fix the issue once and for all by making LAA not output unstable pointer addresses in the first place. The removal of the now-dead make_analyzer_generalizer is left as a non-trivial exercise for a follow-up.	2025-05-21 12:01:49 +01:00
Florian Hahn	35ee462fef	[LAA] Add assert check CanDoRTIfNeeded can be computed w/o RT.Need (NFC) Add assert to ensure that CanDoRTIfNeeded can be computed w/o RtCheck.Need, to prepare for adjusting the condition.	2025-05-18 22:12:28 +01:00
Ramkumar Ramachandra	c807395011	[LAA/SLP] Don't truncate APInt in getPointersDiff (#139941 ) Change getPointersDiff to return an std::optional<int64_t>, and fill this value using APInt::trySExtValue. This simple change requires changes to other functions in LAA, and major changes in SLPVectorizer changing types from 32-bit to 64-bit. Fixes #139202.	2025-05-15 10:08:05 +01:00
Igor Kirillov	a3fb54c1ae	[LAA][NFC] Unify naming of DepCandidates to DepCands (#139534 ) The MemoryDepChecker::DepCandidates instance in each LoopAccessInfo had multiple names (AccessSets, DepCands, DependentAccesses), which was confusing. This patch renames all references to DepCands for consistency.	2025-05-13 08:52:46 +01:00
Ramkumar Ramachandra	c1e678b134	[LAA] Improve code in replaceSymbolicStrideSCEV (NFC) (#139532 ) Prefer DenseMap::lookup over DenseMap::find.	2025-05-12 14:18:26 +01:00
Ramkumar Ramachandra	68dccb9fa0	[LAA] Strip dead code in getStrideFromPointer (NFC) (#139140 ) The SCEV multiply by 1 doesn't make sense, because SCEV would fold it: therefore, the OrigPtr == Ptr branch effectively rejects a multiply. However, in this branch, we have a pointer SCEV that cannot be a multiply, and hence the code is dead. Strip it.	2025-05-09 09:20:50 +01:00
Ramkumar Ramachandra	458991197d	[SCEVPatternMatch] Extend with more matchers (#138836 )	2025-05-09 09:20:14 +01:00
vaibhav	384a5b00a7	[LAA] Use MaxStride instead of CommonStride to calculate MaxVF (#98142 ) We bail out from MaxVF calculation if the strides are not the same. Instead, we are dependent on runtime checks, though not yet implemented. We could instead use the MaxStride to conservatively use an upper bound. This handles cases like the following: ```c #define LEN 256 * 256 float a[LEN]; void gather() { for (int i = 0; i < LEN - 1024 - 255; i++) { #pragma clang loop interleave(disable) #pragma clang loop unroll(disable) for (int j = 0; j < 256; j++) a[i + j + 1024] += a[j * 4 + i]; } } ``` --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2025-05-07 21:02:21 +01:00
Kazu Hirata	2f3067ed69	[llvm] Remove unused local variables (NFC) (#138454 )	2025-05-04 09:38:16 -07:00
Ramkumar Ramachandra	faf87e1414	[LAA] Prefer set-contains over set-count (NFC) (#136749 ) Improve code by preferring {SmallSet,SmallPtrSet}::contains() over the count() function, when used in a boolean context.	2025-04-29 13:56:04 +01:00
Kazu Hirata	47d8fec9b8	[llvm] Use llvm::append_range (NFC) (#136066 ) This patch replaces: llvm::copy(Src, std::back_inserter(Dst)); with: llvm::append_range(Dst, Src); for brevity. One side benefit is that llvm::append_range eventually calls llvm::SmallVector::reserve if Dst is of llvm::SmallVector.	2025-04-16 19:30:01 -07:00
Florian Hahn	995fd47944	[LAA] Make sure MaxVF for Store-Load forward safe dep distances is pow2. MaxVF computed in couldPreventStoreLoadForward may not be a power of 2, as CommonStride may not be a power-of-2. This can cause crashes after 78777a20. Use bit_floor to make sure it is a suitable power-of-2. Fixes https://github.com/llvm/llvm-project/issues/134696.	2025-04-12 20:05:37 +01:00
Ramkumar Ramachandra	fd6260f13b	[EquivClasses] Shorten members_{begin,end} idiom (#134373 ) Introduce members() iterator-helper to shorten the members_{begin,end} idiom. A previous attempt of this patch was #130319, which had to be reverted due to unit-test failures when attempting to call members() on the end iterator. In this patch, members() accepts either an ECValue or an ElemTy, which is more intuitive and doesn't suffer from the same issue.	2025-04-04 14:34:08 +01:00
Florian Hahn	32f24029c7	Reapply "[EquivalenceClasses] Replace findValue with contains (NFC)." This reverts the revert commit 616f447fc84bdc7655117f1b303d895dc3b93e4d. It includes updates to remaining users in Polly and Clang, to avoid failures when building those projects.	2025-03-31 22:27:59 +01:00
Florian Hahn	616f447fc8	Revert "[EquivalenceClasses] Replace findValue with contains (NFC)." Breaks clang builds. This reverts commit 8e390dedd71d0c2bcbe8775aee2e234ef7a5b787.	2025-03-31 20:38:12 +01:00
Florian Hahn	8e390dedd7	[EquivalenceClasses] Replace findValue with contains (NFC). Replace remaining use of findValue with more compact and limited contains().	2025-03-31 20:11:00 +01:00
Florian Hahn	5877bef385	[LAA] Remove unneeded findValue calls (NFC). Use findLeader directly instead of going through findValue, getLeaderValue. This is simpler and more efficient.	2025-03-31 19:19:27 +01:00
Alexey Bataev	78777a204a	[LV] Split store-load forward distance analysis from other checks, NFC (#121156 ) The patch splits the store-load forwarding distance analysis from other dependency analysis in LAA. Currently it supports only power-of-2 distances; the split is required to support non-power-of-2 distances in the future. Part of #100755	2025-03-31 07:28:44 -04:00
Kazu Hirata	8f5c3deadd	[Analysis] Use llvm::append_range (NFC) (#133602 )	2025-03-29 16:52:36 -07:00
Kazu Hirata	03205121d2	[Analysis] Avoid repeated hash lookups (NFC) (#131421 )	2025-03-15 09:11:34 -07:00
Vitaly Buka	5bc166728a	Revert "Reland [EquivClasses] Introduce members iterator-helper" (#130380 ) Reverts llvm/llvm-project#130319 Multiple bot failures.	2025-03-07 17:46:53 -08:00
Ramkumar Ramachandra	21d973dbb3	Reland [EquivClasses] Introduce members iterator-helper (#130319 ) Changes: Fix the expectations in EquivalenceClassesTest.MemberIterator, also fixing a build failure.	2025-03-07 21:09:31 +00:00
Ramkumar Ramachandra	86dfd90193	Revert "[EquivClasses] Introduce members iterator-helper" (#130313 ) This reverts commit 259624bf6d, as it causes a build failure.	2025-03-07 17:38:38 +00:00
Ramkumar Ramachandra	259624bf6d	[EquivClasses] Introduce members iterator-helper (#130139 )	2025-03-07 17:24:14 +00:00
Florian Hahn	275baedfde	[LAA] Consider accessed addrspace when mapping underlying obj to access. (#129087 ) In some cases, it is possible for the same underlying object to be accessed via pointers to different address spaces. This could lead to pointers from different address spaces ending up in the same dependency set, which isn't allowed (and triggers an assertion). Update the mapping from underlying object -> last access to also include the accessing address space. Fixes https://github.com/llvm/llvm-project/issues/124759. PR: https://github.com/llvm/llvm-project/pull/129087	2025-02-28 20:56:12 +00:00
Kazu Hirata	303825d2ab	[Analysis] Avoid repeated hash lookups (NFC) (#128394 )	2025-02-23 08:47:02 -08:00
Florian Hahn	52ded67249	[LAA] Always require non-wrapping pointers for runtime checks. (#127543 ) Currently we only check if the pointers involved in runtime checks do not wrap if we need to perform dependency checks. If that's not the case, we generate runtime checks, even if the pointers may wrap (see test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll). If the pointer wraps, then we swap start and end of the runtime check, leading to incorrect checks. An Alive2 proof of what the runtime checks are checking conceptually (on i4 to have it complete in reasonable time) showing the incorrect result should be https://alive2.llvm.org/ce/z/KsHzn8 Depends on https://github.com/llvm/llvm-project/pull/127410 to avoid more regressions. PR: https://github.com/llvm/llvm-project/pull/127543	2025-02-20 19:00:23 +01:00
Kazu Hirata	c0c172213b	[Analysis] Avoid repeated hash lookups (NFC) (#127955 )	2025-02-20 08:55:35 -08:00
Ramkumar Ramachandra	6eba2775e2	[LAA] Scale strides using type-size (NFC) (#124529 ) Change getDependenceDistanceStrideAndSize to scale strides by TypeByteSize, scaling the returned CommonStride and MaxStride. Even though there is a seemingly-functional change of setting CommonStride when scaled strides are equal, it ends up being a non-functional change due to aggressive HasSameSize checking.	2025-02-20 15:19:17 +00:00
Florian Hahn	01d0793a69	[LAA] Make Ptr argument optional in isNoWrap. (#127410 ) Update isNoWrap to make the IR Ptr argument optional. This allows using isNoWrap when dealing with things like pointer-selects, where a select is translated to multiple pointer SCEV expressions, but there is no IR value that can be used. We don't try to retrieve pointer values for the pointer SCEVs and using info from the IR would not be safe. For example, we cannot use inbounds, because the pointer may never be accessed. PR: https://github.com/llvm/llvm-project/pull/127410	2025-02-19 14:51:19 +01:00
Ramkumar Ramachandra	6646b65082	[LAA] Rework and rename stripGetElementPtr (#125315 ) The stripGetElementPtr function is mysteriously named, and calls into another mysterious getGEPInductionOperand which does something complicated with GEP indices. The real purpose of the badly-named stripGetElementPtr function is to get a loop-variant GEP index, if there is one. The getGEPInductionOperand is totally redundant, as stripping off zeros from the end of GEP indices has no effect on computing the loop-variant GEP index, as constant zeros are always loop-invariant. Moreover, the GEP induction operand is simply the first non-zero index from the end, which stripGetElementPtr returns when it finds that any of the GEP indices are loop-variant: this is a completely unrelated value to the GEP index that is loop-variant. The implicit assumption here is that there is only ever one loop-variant index, and it is the first non-zero one from the end. The logic is unnecessarily complicated for what stripGetElementPtr wants to achieve, and the header comments are confusing as well. Strip getGEPInductionOperand, rework and rename stripGetElementPtr.	2025-02-18 10:25:47 +00:00
Florian Hahn	a8b177aa60	[LAA] Remove unneeded hasNoOverflow call (NFC). The function already calls hasNoOverflow above.	2025-02-17 21:14:01 +01:00
Ramkumar Ramachandra	6d86a8a1a1	LAA: scope responsibility of isNoWrapAddRec (NFC) (#127479 ) Free isNoWrapAddRec from the AddRec check, and rename it to isNoWrapGEP.	2025-02-17 16:58:09 +00:00
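Commit 03f616eb3a above rewrites `X && *X == Y` as `X == Y` for `std::optional<T>`. The sketch below illustrates why the two forms agree, using `std::optional<int>` from the standard library. The helper names `equalsLongForm` and `equalsDirect` are hypothetical and do not appear in the patch.

```cpp
#include <optional>

// Hypothetical helpers for illustration only; not taken from the LLVM patch.

// Long form: test that X holds a value, then compare the contained value.
bool equalsLongForm(const std::optional<int> &X, int Y) {
  return X && *X == Y;
}

// Direct form: comparing an optional against a plain value yields false
// when X is empty, and compares the contained value otherwise.
bool equalsDirect(const std::optional<int> &X, int Y) {
  return X == Y;
}
```

Both helpers return the same result for engaged and empty optionals, which is what makes the rewrite an NFC change.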

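Commit 5d01697ec6 above sets ScEnd to -(EltSize + 1) and notes that the later increment by EltSize effectively yields unsigned max. A minimal sketch of that arithmetic follows, assuming a 64-bit unsigned value. The function name `endAfterIncrement` is hypothetical, and the real code operates on SCEV expressions rather than plain integers.

```cpp
#include <cstdint>

// Hypothetical stand-in for the arithmetic described in the commit message.
// Unsigned arithmetic wraps modulo 2^64, so 0 - (EltSize + 1) is the
// two's-complement encoding of -(EltSize + 1).
uint64_t endAfterIncrement(uint64_t EltSize) {
  uint64_t ScEnd = uint64_t{0} - (EltSize + 1); // sentinel: -(EltSize + 1)
  ScEnd += EltSize;                             // the later increment
  return ScEnd;                                 // equals 2^64 - 1
}
```

For any element size, the sentinel plus the increment equals -1 modulo 2^64, which is `UINT64_MAX`.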

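Commit 995fd47944 above rounds MaxVF down to a power of two with bit_floor. The sketch below shows that rounding operation in isolation. `bitFloor` is a hypothetical, portable stand-in written for this example; it is not the LLVM or standard-library implementation, and it does not show how LAA computes MaxVF.

```cpp
#include <cstdint>

// Hypothetical portable stand-in for a bit_floor operation:
// returns the largest power of two that is <= V, or 0 when V is 0.
uint64_t bitFloor(uint64_t V) {
  if (V == 0)
    return 0;
  uint64_t Result = 1;
  // Double Result while the doubled value still fits under V.
  // Comparing against V / 2 avoids overflow for very large V.
  while (Result <= V / 2)
    Result *= 2;
  return Result;
}
```

Under this sketch, a value such as 6 rounds down to 4, while a value that is already a power of two is returned unchanged.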
488 Commits