470 Commits

Author SHA1 Message Date
Florian Hahn
35ee462fef
[LAA] Add assert check CanDoRTIfNeeded can be computed w/o RT.Need (NFC)
Add assert to ensure that CanDoRTIfNeeded can be computed w/o
RtCheck.Need, to prepare for adjusting the condition.
2025-05-18 22:12:28 +01:00
Ramkumar Ramachandra
c807395011
[LAA/SLP] Don't truncate APInt in getPointersDiff (#139941)
Change getPointersDiff to return a std::optional<int64_t>, and fill
this value using APInt::trySExtValue. This simple change requires
changes to other functions in LAA, and major changes in SLPVectorizer,
changing types from 32-bit to 64-bit.

Fixes #139202.
2025-05-15 10:08:05 +01:00
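The idea behind the getPointersDiff change above (report "no value" instead of silently truncating) can be illustrated with a minimal standalone sketch. It uses plain integers rather than APInt, and the names `tryNarrowToInt32` and `isConsecutive` are hypothetical, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper (not LLVM API): narrow a 64-bit element distance to
// 32 bits only when the value is representable, otherwise report failure.
std::optional<int32_t> tryNarrowToInt32(int64_t Distance) {
  if (Distance < std::numeric_limits<int32_t>::min() ||
      Distance > std::numeric_limits<int32_t>::max())
    return std::nullopt; // does not fit: the caller must give up, not guess
  return static_cast<int32_t>(Distance);
}

// Hypothetical caller: two accesses are consecutive only when the distance
// is known and equal to exactly one element.
bool isConsecutive(std::optional<int64_t> Distance) {
  return Distance.has_value() && *Distance == 1;
}
```

A distance of 2^32 + 1 shows why the wider type matters: truncated to 32 bits it would read as 1 and look consecutive, whereas the 64-bit optional keeps the real value.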
Igor Kirillov
a3fb54c1ae
[LAA][NFC] Unify naming of DepCandidates to DepCands (#139534)
The MemoryDepChecker::DepCandidates instance in each LoopAccessInfo had multiple names (AccessSets, DepCands, DependentAccesses), which was confusing. This patch renames all references to DepCands for consistency.
2025-05-13 08:52:46 +01:00
Ramkumar Ramachandra
c1e678b134
[LAA] Improve code in replaceSymbolicStrideSCEV (NFC) (#139532)
Prefer DenseMap::lookup over DenseMap::find.
2025-05-12 14:18:26 +01:00
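As a rough illustration of why a lookup-style call reads better than find in the commit above, here is a minimal sketch using std::unordered_map as a stand-in for DenseMap. `lookupOrDefault`, `strideViaFind`, and `strideViaLookup` are hypothetical names; the helper only mimics the behaviour of returning a default-constructed value for a missing key.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a lookup-style accessor: returns the mapped
// value, or a default-constructed one when the key is absent.
template <typename MapT>
typename MapT::mapped_type lookupOrDefault(const MapT &Map,
                                           const typename MapT::key_type &Key) {
  auto It = Map.find(Key);
  return It == Map.end() ? typename MapT::mapped_type() : It->second;
}

using StrideMap = std::unordered_map<std::string, int>;

// find-based form: iterator, end comparison, and dereference at every call site.
int strideViaFind(const StrideMap &Strides, const std::string &Ptr) {
  auto It = Strides.find(Ptr);
  if (It == Strides.end())
    return 0;
  return It->second;
}

// lookup-based form: one expression with the same result.
int strideViaLookup(const StrideMap &Strides, const std::string &Ptr) {
  return lookupOrDefault(Strides, Ptr);
}
```

Both forms return the same value for present and absent keys; the second keeps the iterator handling in one place.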
Ramkumar Ramachandra
68dccb9fa0
[LAA] Strip dead code in getStrideFromPointer (NFC) (#139140)
The SCEV multiply by 1 doesn't make sense, because SCEV would fold it:
therefore, the OrigPtr == Ptr branch effectively rejects a multiply.
However, in this branch, we have a pointer SCEV that cannot be a
multiply, and hence the code is dead. Strip it.
2025-05-09 09:20:50 +01:00
Ramkumar Ramachandra
458991197d
[SCEVPatternMatch] Extend with more matchers (#138836)
2025-05-09 09:20:14 +01:00
vaibhav
384a5b00a7
[LAA] Use MaxStride instead of CommonStride to calculate MaxVF (#98142)
We bail out from MaxVF calculation if the strides are not the same.
Instead, we are dependent on runtime checks, though not yet implemented.
We could instead use the MaxStride to conservatively use an upper bound.

This handles cases like the following:
```c
#define LEN (256 * 256)
float a[LEN];

void gather() {
  for (int i = 0; i < LEN - 1024 - 255; i++) {
  #pragma clang loop interleave(disable)
  #pragma clang loop unroll(disable)
    for (int j = 0; j < 256; j++)
      a[i + j + 1024] += a[j * 4 + i];
  }
}
```

---------

Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-05-07 21:02:21 +01:00
Kazu Hirata
2f3067ed69
[llvm] Remove unused local variables (NFC) (#138454)
2025-05-04 09:38:16 -07:00
Ramkumar Ramachandra
faf87e1414
[LAA] Prefer set-contains over set-count (NFC) (#136749)
Improve code by preferring {SmallSet,SmallPtrSet}::contains() over the
count() function, when used in a boolean context.
2025-04-29 13:56:04 +01:00
Kazu Hirata
47d8fec9b8
[llvm] Use llvm::append_range (NFC) (#136066)
This patch replaces:

  llvm::copy(Src, std::back_inserter(Dst));

with:

  llvm::append_range(Dst, Src);

for brevity.

One side benefit is that llvm::append_range eventually calls
llvm::SmallVector::reserve if Dst is of llvm::SmallVector.
2025-04-16 19:30:01 -07:00
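The replacement described in the append_range commit above can be sketched without LLVM headers. `appendRange` below is a hypothetical stand-in written against standard containers, not the llvm::append_range implementation.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-in for an append_range-style helper: append every
// element of Src to the end of Dst with a single range insert.
template <typename Container, typename Range>
void appendRange(Container &Dst, const Range &Src) {
  Dst.insert(Dst.end(), std::begin(Src), std::end(Src));
}
```

Compared with copying through std::back_inserter, a range insert sees the whole source at once, so the destination can typically grow once instead of once per element.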
Florian Hahn
995fd47944
[LAA] Make sure MaxVF for Store-Load forward safe dep distances is pow2.
MaxVF computed in couldPreventStoreLoadForward may not be a power of 2,
as CommonStride may not be a power-of-2.

This can cause crashes after 78777a20. Use bit_floor to make sure it is
a suitable power-of-2.

Fixes https://github.com/llvm/llvm-project/issues/134696.
2025-04-12 20:05:37 +01:00
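To make the bit_floor step in the commit above concrete, here is a minimal standalone sketch of rounding a value down to a power of 2. `bitFloor` and `isPowerOf2` are hypothetical helper names written for illustration; they are not the LLVM or standard-library implementations.

```cpp
#include <cassert>
#include <cstdint>

// Returns the largest power of two that is <= Value, or 0 when Value is 0.
uint64_t bitFloor(uint64_t Value) {
  if (Value == 0)
    return 0;
  uint64_t Result = 1;
  // Compare against Value / 2 so that doubling Result can never overflow.
  while (Result <= Value / 2)
    Result *= 2;
  return Result;
}

// True when exactly one bit is set.
bool isPowerOf2(uint64_t Value) {
  return Value != 0 && (Value & (Value - 1)) == 0;
}
```

For example, a candidate of 6 is rounded down to 4, while a candidate of 8 is left unchanged.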
Ramkumar Ramachandra
fd6260f13b
[EquivClasses] Shorten members_{begin,end} idiom (#134373)
Introduce members() iterator-helper to shorten the members_{begin,end}
idiom. A previous attempt of this patch was #130319, which had to be
reverted due to unit-test failures when attempting to call members() on
the end iterator. In this patch, members() accepts either an ECValue or
an ElemTy, which is more intuitive and doesn't suffer from the same
issue.
2025-04-04 14:34:08 +01:00
Florian Hahn
32f24029c7
Reapply "[EquivalenceClasses] Replace findValue with contains (NFC)."
This reverts the revert commit 616f447fc84bdc7655117f1b303d895dc3b93e4d.

It includes updates to remaining users in Polly and Clang, to avoid
failures when building those projects.
2025-03-31 22:27:59 +01:00
Florian Hahn
616f447fc8
Revert "[EquivalenceClasses] Replace findValue with contains (NFC)."
Breaks clang builds.

This reverts commit 8e390dedd71d0c2bcbe8775aee2e234ef7a5b787.
2025-03-31 20:38:12 +01:00
Florian Hahn
8e390dedd7
[EquivalenceClasses] Replace findValue with contains (NFC).
Replace remaining use of findValue with more compact and limited
contains().
2025-03-31 20:11:00 +01:00
Florian Hahn
5877bef385
[LAA] Remove unneeded findValue calls (NFC).
Use findLeader directly instead of going through findValue,
getLeaderValue. This is simpler and more efficient.
2025-03-31 19:19:27 +01:00
Alexey Bataev
78777a204a
[LV]Split store-load forward distance analysis from other checks, NFC (#121156)
The patch splits the store-load forwarding distance analysis from other
dependency analysis in LAA. It currently supports only power-of-2
distances; the split is needed to support non-power-of-2 distances in the future.

Part of #100755
2025-03-31 07:28:44 -04:00
Kazu Hirata
8f5c3deadd
[Analysis] Use llvm::append_range (NFC) (#133602)
2025-03-29 16:52:36 -07:00
Kazu Hirata
03205121d2
[Analysis] Avoid repeated hash lookups (NFC) (#131421)
2025-03-15 09:11:34 -07:00
Vitaly Buka
5bc166728a
Revert "Reland [EquivClasses] Introduce members iterator-helper" (#130380)
Reverts llvm/llvm-project#130319

Multiple bot failures.
2025-03-07 17:46:53 -08:00
Ramkumar Ramachandra
21d973dbb3
Reland [EquivClasses] Introduce members iterator-helper (#130319)
Changes: Fix the expectations in EquivalenceClassesTest.MemberIterator,
also fixing a build failure.
2025-03-07 21:09:31 +00:00
Ramkumar Ramachandra
86dfd90193
Revert "[EquivClasses] Introduce members iterator-helper" (#130313)
This reverts commit 259624bf6d, as it causes a build failure.
2025-03-07 17:38:38 +00:00
Ramkumar Ramachandra
259624bf6d
[EquivClasses] Introduce members iterator-helper (#130139)
2025-03-07 17:24:14 +00:00
Florian Hahn
275baedfde
[LAA] Consider accessed addrspace when mapping underlying obj to access. (#129087)
In some cases, it is possible for the same underlying object to be
accessed via pointers to different address spaces. This could lead to
pointers from different address spaces ending up in the same dependency
set, which isn't allowed (and triggers an assertion).

Update the mapping from underlying object -> last access to also include
the accessing address space.

Fixes https://github.com/llvm/llvm-project/issues/124759.

PR: https://github.com/llvm/llvm-project/pull/129087
2025-02-28 20:56:12 +00:00
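The mapping change in the address-space commit above can be pictured with a small standalone sketch in which the "last access" table is keyed by both the underlying object and the address space. All types and the `recordAccess` function are hypothetical placeholders, not the LAA data structures.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <utility>

using ObjectId = int;          // stand-in for an underlying object
using AddressSpace = unsigned; // stand-in for a pointer address space
using AccessId = int;          // stand-in for a memory access

using LastAccessMap = std::map<std::pair<ObjectId, AddressSpace>, AccessId>;

// Records Access as the latest access for (Object, AS) and returns the
// previous access with the same key, if any. Accesses to the same object
// through a different address space never see each other.
std::optional<AccessId> recordAccess(LastAccessMap &LastAccess, ObjectId Object,
                                     AddressSpace AS, AccessId Access) {
  auto Key = std::make_pair(Object, AS);
  auto It = LastAccess.find(Key);
  if (It == LastAccess.end()) {
    LastAccess.emplace(Key, Access);
    return std::nullopt;
  }
  AccessId Previous = It->second;
  It->second = Access;
  return Previous;
}
```

With the object alone as the key, the second access in a different address space would have been linked to the first; with the pair as the key, it starts its own chain.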
Kazu Hirata
303825d2ab
[Analysis] Avoid repeated hash lookups (NFC) (#128394)
2025-02-23 08:47:02 -08:00
Florian Hahn
52ded67249
[LAA] Always require non-wrapping pointers for runtime checks. (#127543)
Currently we only check if the pointers involved in runtime checks do
not wrap if we need to perform dependency checks. If that's not the
case, we generate runtime checks, even if the pointers may wrap (see
test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll).

If the pointer wraps, then we swap start and end of the runtime check,
leading to incorrect checks.

An Alive2 proof of what the runtime checks are checking conceptually (on
i4 to have it complete in reasonable time) showing the incorrect result
should be https://alive2.llvm.org/ce/z/KsHzn8

Depends on https://github.com/llvm/llvm-project/pull/127410 to avoid
more regressions.

PR: https://github.com/llvm/llvm-project/pull/127543
2025-02-20 19:00:23 +01:00
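Why a wrapping pointer breaks a bounds-based runtime check, as described in the commit above, can be shown conceptually in a tiny 8-bit address space. This is only an illustrative sketch: `Range`, `accessRange`, `wraps`, and `checkSaysNoOverlap` are hypothetical names, and the check is a generic interval comparison rather than the code LAA emits.

```cpp
#include <cassert>
#include <cstdint>

// Half-open address range [Start, End) in an 8-bit address space.
struct Range {
  uint8_t Start;
  uint8_t End;
};

// Computes the range touched by Length bytes starting at Base. The end is
// reduced modulo 256, so it can land below the start when the access wraps.
Range accessRange(uint8_t Base, uint8_t Length) {
  return {Base, static_cast<uint8_t>(Base + Length)};
}

// True when the computed end has wrapped around below the start.
bool wraps(Range R) { return R.End < R.Start; }

// Generic interval check that is only valid for non-wrapping ranges.
bool checkSaysNoOverlap(Range A, Range B) {
  return A.End <= B.Start || B.End <= A.Start;
}
```

An access of 10 bytes at address 250 touches 250..255 and 0..3, so it overlaps a 4-byte access at address 0, yet the interval check reports no overlap because the wrapped end (4) sits below the start (250).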
Kazu Hirata
c0c172213b
[Analysis] Avoid repeated hash lookups (NFC) (#127955)
2025-02-20 08:55:35 -08:00
Ramkumar Ramachandra
6eba2775e2
[LAA] Scale strides using type-size (NFC) (#124529)
Change getDependenceDistanceStrideAndSize to scale strides by
TypeByteSize, scaling the returned CommonStride and MaxStride. Even
though there is a seemingly-functional change of setting CommonStride
when scaled strides are equal, it ends up being a non-functional change
due to aggressive HasSameSize checking.
2025-02-20 15:19:17 +00:00
Florian Hahn
01d0793a69
[LAA] Make Ptr argument optional in isNoWrap. (#127410)
Update isNoWrap to make the IR Ptr argument optional. This allows using
isNoWrap when dealing with things like pointer-selects, where a select
is translated to multiple pointer SCEV expressions, but there is no IR
value that can be used. We don't try to retrieve pointer values for the
pointer SCEVs and using info from the IR would not be safe. For example,
we cannot use inbounds, because the pointer may never be accessed.

PR: https://github.com/llvm/llvm-project/pull/127410
2025-02-19 14:51:19 +01:00
Ramkumar Ramachandra
6646b65082
[LAA] Rework and rename stripGetElementPtr (#125315)
The stripGetElementPtr function is mysteriously named, and calls into
another mysterious getGEPInductionOperand which does something
complicated with GEP indices. The real purpose of the badly-named
stripGetElementPtr function is to get a loop-variant GEP index, if there
is one. The getGEPInductionOperand is totally redundant, as stripping
off zeros from the end of GEP indices has no effect on computing the
loop-variant GEP index, as constant zeros are always loop-invariant.
Moreover, the GEP induction operand is simply the first non-zero index
from the end, which stripGetElementPtr returns when it finds that any of
the GEP indices are loop-variant: this is a completely unrelated value
to the GEP index that is loop-variant. The implicit assumption here is
that there is only ever one loop-variant index, and it is the first
non-zero one from the end.

The logic is unnecessarily complicated for what stripGetElementPtr wants
to achieve, and the header comments are confusing as well. Strip
getGEPInductionOperand, rework and rename stripGetElementPtr.
2025-02-18 10:25:47 +00:00
Florian Hahn
a8b177aa60
[LAA] Remove unneeded hasNoOverflow call (NFC).
The function already calls hasNoOverflow above.
2025-02-17 21:14:01 +01:00
Ramkumar Ramachandra
6d86a8a1a1
LAA: scope responsibility of isNoWrapAddRec (NFC) (#127479)
Free isNoWrapAddRec from the AddRec check, and rename it to isNoWrapGEP.
2025-02-17 16:58:09 +00:00
Florian Hahn
e080366a76
[LAA] Inline hasComputableBounds in only caller, simplify isNoWrap.
Inline hasComputableBounds into createCheckForAccess. This removes a
level of indirection and allows for passing the AddRec directly to
isNoWrap, removing the need to retrieve the AddRec for the pointer
again.

The early continue for invariant SCEVs now also applies to forked
pointers (i.e. when there's more than one entry in TranslatedPtrs) when
ShouldCheckWrap is true, as those trivially won't wrap.

The change is NFC otherwise. replaceSymbolicStrideSCEV is now called
earlier.
2025-02-16 19:56:13 +01:00
Florian Hahn
e60de25c4e
[LAA] Replace symbolic strides for translated pointers earlier (NFC).
Move up replaceSymbolicStrideSCEV before isNoWrap. It needs to be called
after hasComputableBounds, as this may create an AddRec via PSE, which
replaceSymbolicStrideSCEV will look up.

This is in preparation for simplifying isNoWrap.
2025-02-15 19:44:39 +01:00
Florian Hahn
4664a4c66b
[LAA] Use getPointer/setPointer in createCheckForAccess (NFC).
Use getPointer/setPointer to clarify we are accessing/modifying the
current value.
2025-02-15 16:17:42 +01:00
Kazu Hirata
778001514f
[Analysis] Fix a warning
This patch fixes:

  llvm/lib/Analysis/LoopAccessAnalysis.cpp:1530:9: error: unused
  variable 'Ty' [-Werror,-Wunused-variable]
2025-02-14 12:41:43 -08:00
Florian Hahn
9ad83f7fcf
[LAA] Get pointer address space from AddRec (NFC).
Retrieve the address space from the pointer AddRec instead of the IR
pointer value, to prepare to make the IR pointer value optional.
2025-02-14 20:39:52 +01:00
Florian Hahn
044b52832a
[LAA] Perform checks for no-wrap separately from getPtrStride. (#126971)
Reorganize the code in isNoWrap to perform the no-wrap checks without
relying on getPtrStride directly. getPtrStride now uses isNoWrap.

The new structure allows deriving no-wrap in more cases in LAA, because
there are some cases where getPtrStride bails out early because it
cannot return a constant stride, but we can still prove no-wrap for the
pointer.

One example is AddRecs with non-ConstantInt strides and inbounds GEPs,
as in the improved test cases.

This enables vectorization with runtime checks in a few more cases.

PR: https://github.com/llvm/llvm-project/pull/126971
2025-02-14 20:06:37 +01:00
Florian Hahn
424fcc5df7
[LAA] Split off code to compute stride from AddRec for reuse (NFC).
Refactors the code to expose the core logic from getPtrStride to compute
the stride for a given AddRec.

Split off from https://github.com/llvm/llvm-project/pull/126971 as
suggested.
2025-02-13 22:06:12 +01:00
Ramkumar Ramachandra
8327c2cfdb
LAA: fix logic for MaxTargetVectorWidth (#125487)
Uses the fixed register width if scalable vectorization is not enabled
(via TargetTransformInfo::enableScalableVectorization) and improves
results if there are scalable vector registers, but they shouldn't be
used.
2025-02-13 11:40:05 +00:00
Ramkumar Ramachandra
db43dd7f4f
LAA: simplify LoopAccessInfoManager::clear (NFC) (#125488)
DenseMap::erase() doesn't invalidate the iterator.
2025-02-03 16:06:21 +00:00
Ramkumar Ramachandra
7444ccdd26
LAA: improve code in getStrideFromPointer (NFC) (#124780)
Strip dead code, inline a constant, and modernize style.
2025-01-31 20:06:25 +00:00
Ramkumar Ramachandra
3a4376b8f9
LAA: handle 0 return from getPtrStride correctly (#124539)
getPtrStride returns 0 when the PtrScev is loop-invariant, and this is
not an erroneous value: it returns std::nullopt to communicate that it
was not able to find a valid pointer stride. In analyzeLoop, we call
getPtrStride with a value_or(0) conflating the zero return value with
std::nullopt. Fix this, handling loop-invariant loads correctly.
2025-01-27 14:21:14 +00:00
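The conflation fixed by the getPtrStride commit above is easy to reproduce with std::optional alone. In this minimal sketch the function names are hypothetical, and the stride is passed in directly rather than computed.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Conflating form: an unknown stride (std::nullopt) and a known stride of 0
// both come out as "invariant".
bool looksInvariantConflated(std::optional<int64_t> Stride) {
  return Stride.value_or(0) == 0;
}

// Separated form: only a known stride of exactly 0 counts as invariant.
bool isInvariant(std::optional<int64_t> Stride) {
  return Stride.has_value() && *Stride == 0;
}
```

The two functions agree on every known stride and differ only on std::nullopt, which is exactly the case the conflating form gets wrong.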
David Sherwood
b7286dbef9
Reland "[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop #96752" (#123616)
The last attempt failed a sanitiser build because we were
creating a reference to a null Predicates pointer in
isDereferenceableAndAlignedInLoop. This was exposed by
the unit test IsDerefReadOnlyLoop in
unittests/Analysis/LoadsTest.cpp. I fixed this by falling
back on getConstantMaxBackedgeTakenCount if Predicates is
null - see line 316 in llvm/lib/Analysis/Loads.cpp. There
are no other changes.
2025-01-27 11:59:38 +00:00
David Sherwood
a00938eedd
Revert "[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop (#96752)" (#123057)
This reverts commit bfedf6460c2cad6e6f966b457d8d27084579dcd8.
2025-01-15 13:56:42 +00:00
David Sherwood
bfedf6460c
[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop (#96752)
Currently when we encounter a negative step in the induction
variable isDereferenceableAndAlignedInLoop bails out because
the element size is signed greater than the step. This patch
adds support for negative steps in cases where we detect the
start address for the load is of the form base + offset. In
this case the address decrements in each iteration so we need
to calculate the access size differently. I have done this by
calling getStartAndEndForAccess from LoopAccessAnalysis.cpp.

The motivation for this patch comes from PR #88385 where a
reviewer requested reusing isDereferenceableAndAlignedInLoop,
but that PR itself does support reverse loops.

The changed test in LoopVectorize/X86/load-deref-pred.ll now
passes because previously we were calculating the total access
size incorrectly, whereas now it is 412 bytes and fits
perfectly into the alloca.
2025-01-15 12:47:43 +00:00
Ramkumar Ramachandra
8b4561467e
LAA: add missed swap when inverting src, sink (#122254)
When inverting source and sink on a negative induction step, the types
of the source and sink should also be swapped. This fixes a bug in the
code that follows, that computes properties based on these types. With
234cc40 ([LAA] Limit no-overlap check to at least one loop-invariant
accesses.), that code is guarded by a loop-invariant condition: however,
the commit did not add any new tests exercising the guarded code, and
hence the bugfix in this patch requires additional tests to exercise
that guarded codepath.
2025-01-13 13:07:19 +00:00
Ramkumar Ramachandra
17912f336b
LAA: refactor dependence class to prep for scaled strides (NFC) (#122113)
Rearrange the DepDistanceAndSizeInfo struct in preparation to scale
strides. getDependenceDistanceStrideAndSize now returns the data of
CommonStride, MaxStride, and clarifies when to retry with runtime
checks, in place of (unscaled) strides.
2025-01-09 16:05:17 +00:00
Nikita Popov
bc0976ed1f
[LAA] Strip non-inbounds offset in getPointerDiff() (NFC) (#118665)
I believe that this code doesn't care whether the offsets are known to
be inbounds a priori. For the same reason the change is not testable, as
the SCEV based fallback code will look through non-inbounds offsets
anyway. So make it clear that there is no special inbounds requirement
here.
2024-12-10 13:05:34 +01:00
Ramkumar Ramachandra
aa5cdcea39
LAA: improve code in a couple of routines (NFC) (#108092)
2024-11-28 16:15:45 +00:00