llvm-project

Author	SHA1	Message	Date
Kazu Hirata	47d8fec9b8	[llvm] Use llvm::append_range (NFC) (#136066 ) This patch replaces: llvm::copy(Src, std::back_inserter(Dst)); with: llvm::append_range(Dst, Src); for brevity. One side benefit is that llvm::append_range eventually calls llvm::SmallVector::reserve if Dst is an llvm::SmallVector.	2025-04-16 19:30:01 -07:00
Florian Hahn	995fd47944	[LAA] Make sure MaxVF for Store-Load forward safe dep distances is pow2. MaxVF computed in couldPreventStoreLoadForward may not be a power of 2, as CommonStride may not be a power of 2. This can cause crashes after 78777a20. Use bit_floor to make sure it is a suitable power of 2. Fixes https://github.com/llvm/llvm-project/issues/134696.	2025-04-12 20:05:37 +01:00
Ramkumar Ramachandra	fd6260f13b	[EquivClasses] Shorten members_{begin,end} idiom (#134373 ) Introduce members() iterator-helper to shorten the members_{begin,end} idiom. A previous attempt of this patch was #130319, which had to be reverted due to unit-test failures when attempting to call members() on the end iterator. In this patch, members() accepts either an ECValue or an ElemTy, which is more intuitive and doesn't suffer from the same issue.	2025-04-04 14:34:08 +01:00
Florian Hahn	32f24029c7	Reapply "[EquivalenceClasses] Replace findValue with contains (NFC)." This reverts the revert commit 616f447fc84bdc7655117f1b303d895dc3b93e4d. It includes updates to remaining users in Polly and Clang, to avoid failures when building those projects.	2025-03-31 22:27:59 +01:00
Florian Hahn	616f447fc8	Revert "[EquivalenceClasses] Replace findValue with contains (NFC)." Breaks clang builds. This reverts commit 8e390dedd71d0c2bcbe8775aee2e234ef7a5b787.	2025-03-31 20:38:12 +01:00
Florian Hahn	8e390dedd7	[EquivalenceClasses] Replace findValue with contains (NFC). Replace remaining use of findValue with more compact and limited contains().	2025-03-31 20:11:00 +01:00
Florian Hahn	5877bef385	[LAA] Remove unneeded findValue calls (NFC). Use findLeader directly instead of going through findValue, getLeaderValue. This is simpler and more efficient.	2025-03-31 19:19:27 +01:00
Alexey Bataev	78777a204a	[LV] Split store-load forward distance analysis from other checks, NFC (#121156 ) The patch splits the store-load forwarding distance analysis from other dependency analysis in LAA. It currently supports only power-of-2 distances; the split is required to support non-power-of-2 distances in the future. Part of #100755	2025-03-31 07:28:44 -04:00
Kazu Hirata	8f5c3deadd	[Analysis] Use llvm::append_range (NFC) (#133602 )	2025-03-29 16:52:36 -07:00
Kazu Hirata	03205121d2	[Analysis] Avoid repeated hash lookups (NFC) (#131421 )	2025-03-15 09:11:34 -07:00
Vitaly Buka	5bc166728a	Revert "Reland [EquivClasses] Introduce members iterator-helper" (#130380 ) Reverts llvm/llvm-project#130319 Multiple bot failures.	2025-03-07 17:46:53 -08:00
Ramkumar Ramachandra	21d973dbb3	Reland [EquivClasses] Introduce members iterator-helper (#130319 ) Changes: Fix the expectations in EquivalenceClassesTest.MemberIterator, also fixing a build failure.	2025-03-07 21:09:31 +00:00
Ramkumar Ramachandra	86dfd90193	Revert "[EquivClasses] Introduce members iterator-helper" (#130313 ) This reverts commit 259624bf6d, as it causes a build failure.	2025-03-07 17:38:38 +00:00
Ramkumar Ramachandra	259624bf6d	[EquivClasses] Introduce members iterator-helper (#130139 )	2025-03-07 17:24:14 +00:00
Florian Hahn	275baedfde	[LAA] Consider accessed addrspace when mapping underlying obj to access. (#129087 ) In some cases, it is possible for the same underlying object to be accessed via pointers to different address spaces. This could lead to pointers from different address spaces ending up in the same dependency set, which isn't allowed (and triggers an assertion). Update the mapping from underlying object -> last access to also include the accessing address space. Fixes https://github.com/llvm/llvm-project/issues/124759. PR: https://github.com/llvm/llvm-project/pull/129087	2025-02-28 20:56:12 +00:00
Kazu Hirata	303825d2ab	[Analysis] Avoid repeated hash lookups (NFC) (#128394 )	2025-02-23 08:47:02 -08:00
Florian Hahn	52ded67249	[LAA] Always require non-wrapping pointers for runtime checks. (#127543 ) Currently we only check if the pointers involved in runtime checks do not wrap if we need to perform dependency checks. If that's not the case, we generate runtime checks, even if the pointers may wrap (see test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll). If the pointer wraps, then we swap start and end of the runtime check, leading to incorrect checks. An Alive2 proof of what the runtime checks are checking conceptually (on i4 to have it complete in reasonable time) showing the incorrect result should be https://alive2.llvm.org/ce/z/KsHzn8 Depends on https://github.com/llvm/llvm-project/pull/127410 to avoid more regressions. PR: https://github.com/llvm/llvm-project/pull/127543	2025-02-20 19:00:23 +01:00
Kazu Hirata	c0c172213b	[Analysis] Avoid repeated hash lookups (NFC) (#127955 )	2025-02-20 08:55:35 -08:00
Ramkumar Ramachandra	6eba2775e2	[LAA] Scale strides using type-size (NFC) (#124529 ) Change getDependenceDistanceStrideAndSize to scale strides by TypeByteSize, scaling the returned CommonStride and MaxStride. Even though there is a seemingly-functional change of setting CommonStride when scaled strides are equal, it ends up being a non-functional change due to aggressive HasSameSize checking.	2025-02-20 15:19:17 +00:00
Florian Hahn	01d0793a69	[LAA] Make Ptr argument optional in isNoWrap. (#127410 ) Update isNoWrap to make the IR Ptr argument optional. This allows using isNoWrap when dealing with things like pointer-selects, where a select is translated to multiple pointer SCEV expressions, but there is no IR value that can be used. We don't try to retrieve pointer values for the pointer SCEVs and using info from the IR would not be safe. For example, we cannot use inbounds, because the pointer may never be accessed. PR: https://github.com/llvm/llvm-project/pull/127410	2025-02-19 14:51:19 +01:00
Ramkumar Ramachandra	6646b65082	[LAA] Rework and rename stripGetElementPtr (#125315 ) The stripGetElementPtr function is mysteriously named, and calls into another mysterious getGEPInductionOperand which does something complicated with GEP indices. The real purpose of the badly-named stripGetElementPtr function is to get a loop-variant GEP index, if there is one. The getGEPInductionOperand is totally redundant, as stripping off zeros from the end of GEP indices has no effect on computing the loop-variant GEP index, as constant zeros are always loop-invariant. Moreover, the GEP induction operand is simply the first non-zero index from the end, which stripGetElementPtr returns when it finds that any of the GEP indices are loop-variant: this is a completely unrelated value to the GEP index that is loop-variant. The implicit assumption here is that there is only ever one loop-variant index, and it is the first non-zero one from the end. The logic is unnecessarily complicated for what stripGetElementPtr wants to achieve, and the header comments are confusing as well. Strip getGEPInductionOperand, rework and rename stripGetElementPtr.	2025-02-18 10:25:47 +00:00
Florian Hahn	a8b177aa60	[LAA] Remove unneeded hasNoOverflow call (NFC). The function already calls hasNoOverflow above.	2025-02-17 21:14:01 +01:00
Ramkumar Ramachandra	6d86a8a1a1	LAA: scope responsibility of isNoWrapAddRec (NFC) (#127479 ) Free isNoWrapAddRec from the AddRec check, and rename it to isNoWrapGEP.	2025-02-17 16:58:09 +00:00
Florian Hahn	e080366a76	[LAA] Inline hasComputableBounds in only caller, simplify isNoWrap. Inline hasComputableBounds into createCheckForAccess. This removes a level of indirection and allows for passing the AddRec directly to isNoWrap, removing the need to retrieve the AddRec for the pointer again. The early continue for invariant SCEVs now also applies to forked pointers (i.e. when there's more than one entry in TranslatedPtrs) when ShouldCheckWrap is true, as those trivially won't wrap. The change is NFC otherwise. replaceSymbolicStrideSCEV is now called earlier.	2025-02-16 19:56:13 +01:00
Florian Hahn	e60de25c4e	[LAA] Replace symbolic strides for translated pointers earlier (NFC). Move up replaceSymbolicStrideSCEV before isNoWrap. It needs to be called after hasComputableBounds, as this may create an AddRec via PSE, which replaceSymbolicStrideSCEV will look up. This is in preparation for simplifying isNoWrap.	2025-02-15 19:44:39 +01:00
Florian Hahn	4664a4c66b	[LAA] Use getPointer/setPointer in createCheckForAccess (NFC). Use getPointer/setPointer to clarify we are accessing/modifying the current value.	2025-02-15 16:17:42 +01:00
Kazu Hirata	778001514f	[Analysis] Fix a warning This patch fixes: llvm/lib/Analysis/LoopAccessAnalysis.cpp:1530:9: error: unused variable 'Ty' [-Werror,-Wunused-variable]	2025-02-14 12:41:43 -08:00
Florian Hahn	9ad83f7fcf	[LAA] Get pointer address space from AddRec (NFC). Retrieve the address space from the pointer AddRec instead of the IR pointer value, to prepare to make the IR pointer value optional.	2025-02-14 20:39:52 +01:00
Florian Hahn	044b52832a	[LAA] Perform checks for no-wrap separately from getPtrStride. (#126971 ) Reorganize the code in isNoWrap to perform the no-wrap checks without relying on getPtrStride directly. getPtrStride now uses isNoWrap. The new structure allows deriving no-wrap in more cases in LAA, because there are some cases where getPtrStride bails out early because it cannot return a constant stride, but we can still prove no-wrap for the pointer. One example is AddRecs with non-ConstantInt strides and inbounds GEPs, as in the improved test cases. This enables vectorization with runtime checks in a few more cases. PR: https://github.com/llvm/llvm-project/pull/126971	2025-02-14 20:06:37 +01:00
Florian Hahn	424fcc5df7	[LAA] Split off code to compute stride from AddRec for reuse (NFC). Refactors the code to expose the core logic from getPtrStride to compute the stride for a given AddRec. Split off from https://github.com/llvm/llvm-project/pull/126971 as suggested.	2025-02-13 22:06:12 +01:00
Ramkumar Ramachandra	8327c2cfdb	LAA: fix logic for MaxTargetVectorWidth (#125487 ) Uses the fixed register width if scalable vectorization is not enabled (via TargetTransformInfo::enableScalableVectorization) and improves results if there are scalable vector registers, but they shouldn't be used.	2025-02-13 11:40:05 +00:00
Ramkumar Ramachandra	db43dd7f4f	LAA: simplify LoopAccessInfoManager::clear (NFC) (#125488 ) DenseMap::erase() doesn't invalidate the iterator.	2025-02-03 16:06:21 +00:00
Ramkumar Ramachandra	7444ccdd26	LAA: improve code in getStrideFromPointer (NFC) (#124780 ) Strip dead code, inline a constant, and modernize style.	2025-01-31 20:06:25 +00:00
Ramkumar Ramachandra	3a4376b8f9	LAA: handle 0 return from getPtrStride correctly (#124539 ) getPtrStride returns 0 when the PtrScev is loop-invariant, and this is not an erroneous value: it returns std::nullopt to communicate that it was not able to find a valid pointer stride. In analyzeLoop, we call getPtrStride with a value_or(0) conflating the zero return value with std::nullopt. Fix this, handling loop-invariant loads correctly.	2025-01-27 14:21:14 +00:00
David Sherwood	b7286dbef9	Reland "[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop #96752 " (#123616 ) The last attempt failed a sanitiser build because we were creating a reference to a null Predicates pointer in isDereferenceableAndAlignedInLoop. This was exposed by the unit test IsDerefReadOnlyLoop in unittests/Analysis/LoadsTest.cpp. I fixed this by falling back on getConstantMaxBackedgeTakenCount if Predicates is null - see line 316 in llvm/lib/Analysis/Loads.cpp. There are no other changes.	2025-01-27 11:59:38 +00:00
David Sherwood	a00938eedd	Revert "[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop (#96752 )" (#123057 ) This reverts commit bfedf6460c2cad6e6f966b457d8d27084579dcd8.	2025-01-15 13:56:42 +00:00
David Sherwood	bfedf6460c	[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop (#96752 ) Currently when we encounter a negative step in the induction variable isDereferenceableAndAlignedInLoop bails out because the element size is signed greater than the step. This patch adds support for negative steps in cases where we detect the start address for the load is of the form base + offset. In this case the address decrements in each iteration so we need to calculate the access size differently. I have done this by calling getStartAndEndForAccess from LoopAccessAnalysis.cpp. The motivation for this patch comes from PR #88385 where a reviewer requested reusing isDereferenceableAndAlignedInLoop, but that PR itself does support reverse loops. The changed test in LoopVectorize/X86/load-deref-pred.ll now passes because previously we were calculating the total access size incorrectly, whereas now it is 412 bytes and fits perfectly into the alloca.	2025-01-15 12:47:43 +00:00
Ramkumar Ramachandra	8b4561467e	LAA: add missed swap when inverting src, sink (#122254 ) When inverting source and sink on a negative induction step, the types of the source and sink should also be swapped. This fixes a bug in the code that follows, that computes properties based on these types. With 234cc40 ([LAA] Limit no-overlap check to at least one loop-invariant accesses.), that code is guarded by a loop-invariant condition: however, the commit did not add any new tests exercising the guarded code, and hence the bugfix in this patch requires additional tests to exercise that guarded codepath.	2025-01-13 13:07:19 +00:00
Ramkumar Ramachandra	17912f336b	LAA: refactor dependence class to prep for scaled strides (NFC) (#122113 ) Rearrange the DepDistanceAndSizeInfo struct in preparation to scale strides. getDependenceDistanceStrideAndSize now returns the data of CommonStride, MaxStride, and clarifies when to retry with runtime checks, in place of (unscaled) strides.	2025-01-09 16:05:17 +00:00
Nikita Popov	bc0976ed1f	[LAA] Strip non-inbounds offset in getPointerDiff() (NFC) (#118665 ) I believe that this code doesn't care whether the offsets are known to be inbounds a priori. For the same reason the change is not testable, as the SCEV based fallback code will look through non-inbounds offsets anyway. So make it clear that there is no special inbounds requirement here.	2024-12-10 13:05:34 +01:00
Ramkumar Ramachandra	aa5cdcea39	LAA: improve code in a couple of routines (NFC) (#108092 )	2024-11-28 16:15:45 +00:00
Florian Hahn	a353e258ba	[LAA] Don't require Stride == 1/-1 for inbounds pointer AddRecs nowrap. (#113126 ) If we have a pointer AddRec, the maximum increment is 2^(pointer-index-width - 1) - 1. This means that if incrementing the AddRec wraps, the distance between the previously accessed location and the wrapped location is > 2^(pointer-index-width - 1), i.e. if the GEP for the AddRec is inbounds, this would be poison due to the object being larger than half the pointer index type space. The poison would be immediate UB when the memory access gets executed. Similar reasoning can be applied for decrements. PR: https://github.com/llvm/llvm-project/pull/113126	2024-11-05 22:45:56 +01:00
Ramkumar Ramachandra	d897ea37db	LAA: check nusw on GEP in place of inbounds (#112223 ) With the introduction of the nusw flag in GEPNoWrapFlags, it should be safe to weaken the check in LoopAccessAnalysis to just check the nusw flag on the GEP, instead of inbounds.	2024-10-22 09:58:54 +01:00
Ramkumar Ramachandra	f719cfa868	LAA: be less conservative in isNoWrap (#112553 ) isNoWrap has exactly one caller which handles Assume = true separately, but too conservatively. Instead, pass Assume to isNoWrap, so it is threaded into getPtrStride, which has the correct handling for the Assume flag. Also note that the Stride == 1 check in isNoWrap is incorrect: getPtrStride returns Strides == 1 or -1, except when isNoWrapAddRec or Assume are true, assuming ShouldCheckWrap is true; we can include the case of -1 Stride, and when isNoWrapAddRec is true. With this change, passing Assume = true to getPtrStride could return a non-unit stride, and we correctly handle that case as well.	2024-10-22 09:55:51 +01:00
Kazu Hirata	0614b3cfac	[Analysis] Simplify code with DenseMap::operator[] (NFC) (#111331 )	2024-10-07 07:00:45 -07:00
Florian Hahn	dec4cfdb09	[LAA] Use loop guards when checking invariant accesses. Apply loop guards to start and end pointers like done in other places to improve results.	2024-10-04 12:23:13 +01:00
Benjamin Maxwell	50a1ab12ab	[LAA] Don't assume libcalls with output/input pointers can be vectorized (#108980 ) LoopAccessAnalysis currently does not check/track aliasing from the output pointers, but assumes vectorizing library calls with a mapping is safe. This can result in incorrect codegen if something like the following is vectorized: ``` for(int i=0; i<N; i++) { /* No aliasing between input and output pointers detected. */ sincos(cos_out[0], sin_out+i, cos_out+i); } ``` Where for VF >= 2 `cos_out[1]` to `cos_out[VF-1]` is the cosine of the original value of `cos_out[0]` not the updated value.	2024-09-23 16:05:55 +01:00
Florian Hahn	d43a80936d	Revert "[LAA] Remove loop-invariant check added in 234cc40adc61." This reverts commit a80053322b765eec93951e21db490c55521da2d8. The new asserts exposed an underlying issue where the expanded bounds could wrap, causing the parts of the code to incorrectly determine that accesses do not overlap. Reproducer below based on @mstorsjo's test case. opt -passes='print<access-info>' target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64" define i32 @j(ptr %P, i32 %x, i32 %y) { entry: %gep.P.4 = getelementptr inbounds nuw i8, ptr %P, i32 4 %gep.P.8 = getelementptr inbounds nuw i8, ptr %P, i32 8 br label %loop loop: %1 = phi i32 [ %x, %entry ], [ %sel, %loop.latch ] %iv = phi i32 [ %y, %entry ], [ %iv.next, %loop.latch ] %gep.iv = getelementptr inbounds i64, ptr %gep.P.8, i32 %iv %l = load i32, ptr %gep.iv, align 4 %c.1 = icmp eq i32 %l, 3 br i1 %c.1, label %loop.latch, label %if.then if.then: ; preds = %for.body store i64 0, ptr %gep.iv, align 4 %l.2 = load i32, ptr %gep.P.4 br label %loop.latch loop.latch: %sel = phi i32 [ %l.2, %if.then ], [ %1, %loop ] %iv.next = add nsw i32 %iv, 1 %c.2 = icmp slt i32 %iv.next, %sel br i1 %c.2, label %loop, label %exit exit: %res = phi i32 [ %iv.next, %loop.latch ] ret i32 %res }	2024-08-27 11:55:47 +01:00
Florian Hahn	a80053322b	[LAA] Remove loop-invariant check added in 234cc40adc61. 234cc40adc61 introduced a loop-invariance check to limit the compile-time impact of the newly added checks. This patch removes the restriction and avoids extra compile-time impact by sinking the check to exits where we would return an unknown dependence. This notably reduces how often the extra checks are executed while not missing out on any improvements from them. https://llvm-compile-time-tracker.com/compare.php?from=33e7cd6ff23f6c904314d17c68dc58168fd32d09&to=7c55e66d4f31ce8262b90c119a8e84e1f9515ff1&stat=instructions:u	2024-08-26 10:24:00 +01:00
Florian Hahn	d7c84d7b71	[LAA] Collect loop guards only once in MemoryDepChecker (NFCI). This on its own gives small compile-time improvements in some configs and enables using loop guards at more places in the future while keeping compile-time impact low. https://llvm-compile-time-tracker.com/compare.php?from=c44202574ff9a8c0632aba30c2765b134557435f&to=55ffc3dd920fa9af439fd39f8f9cc13509531420&stat=instructions:u	2024-08-21 08:28:52 +01:00
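
Note on 3a4376b8f9 (LAA: handle 0 return from getPtrStride correctly): the message describes value_or(0) collapsing two distinct results, a stride of 0 and std::nullopt. A minimal standalone sketch of that pitfall follows. It does not use LLVM; PtrKind, queryStride, and both predicates are hypothetical names invented for illustration and are not the LoopAccessAnalysis code.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical classification of a pointer, for illustration only.
enum class PtrKind { Invariant, UnitStride, Unknown };

// Hypothetical stand-in for a stride query:
//   0            -> the pointer is loop-invariant (a valid answer)
//   std::nullopt -> no valid stride could be determined
std::optional<std::int64_t> queryStride(PtrKind Kind) {
  switch (Kind) {
  case PtrKind::Invariant:
    return 0;
  case PtrKind::UnitStride:
    return 1;
  case PtrKind::Unknown:
    return std::nullopt;
  }
  return std::nullopt;
}

// Conflating form: value_or(0) maps "unknown" onto the same value as
// "loop-invariant", so the two cases can no longer be told apart.
bool looksInvariantConflated(PtrKind Kind) {
  return queryStride(Kind).value_or(0) == 0;
}

// Distinguishing form: only a present value equal to 0 counts as invariant.
bool isInvariant(PtrKind Kind) {
  std::optional<std::int64_t> Stride = queryStride(Kind);
  return Stride.has_value() && *Stride == 0;
}
```

With these definitions, the conflated predicate reports an Unknown pointer as invariant, while the distinguishing one does not.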

461 Commits