253 Commits

Author SHA1 Message Date
Florian Hahn
9aadbb716d
[LAA] Check if access is part of loop in isNoWrap.
blockNeedsPredication does not support blocks outside of the loop. Bail
out for users outside the loop.

Fixes https://github.com/llvm/llvm-project/issues/174760.
2026-01-19 20:21:09 +00:00
Florian Hahn
af9a4263a1
[LAA] Only use inbounds/nusw in isNoWrap if the GEP is dereferenced. (#161445)
Update isNoWrap to only use the inbounds/nusw flags from GEPs that are
guaranteed to be dereferenced on every iteration. This fixes a case
where we incorrectly determine no dependence.

I think the issue is isolated to code that evaluates the resulting
AddRec at BTC, just using it to compute the distance between accesses
should still be fine; if the access does not execute in a given
iteration, there's no dependence in that iteration. But isolating the
code is not straight-forward, so be conservative for now. The practical
impact should be very minor (only one loop changed across a corpus with
27k modules from large C/C++ workloads.

Fixes https://github.com/llvm/llvm-project/issues/160912.

PR: https://github.com/llvm/llvm-project/pull/161445
2025-11-04 17:08:12 +00:00
Florian Hahn
3cb8a52e2d
[LAA] Add additional tests for #161445.
Add extra test variants for
https://github.com/llvm/llvm-project/pull/161445.
2025-10-27 02:39:09 +00:00
Florian Hahn
7ceef762c8
[LAA] Check if Ptr can be freed between Assume and CtxI. (#161725)
When using information from dereferenceable assumptions, we need to make
sure that the memory is not freed between the assume and the specified
context instruction. Instead of just checking canBeFreed, check if there
any calls that may free between the assume and the context instruction.

This patch introduces a willNotFreeBetween to check for calls that may
free between an assume and a context instructions, to also be used in
https://github.com/llvm/llvm-project/pull/161255.

PR: https://github.com/llvm/llvm-project/pull/161725
2025-10-03 13:44:58 +00:00
Florian Hahn
9133fc8cb0
[LAA,LV] Add early-exit tests with deref assumes and nofree via context.
Add tests with early exits and dereferenceable assumptions that need
proving no-free via the context.
2025-10-02 19:24:56 +01:00
Ramkumar Ramachandra
6e52e538cd
[LAA] Test different-type-sizes in safe-dep-dist (#161244)
The isSafeDependenceDistance routine is guarded by a HasSameSize check
which can be removed, as the test demonstrates.
2025-10-02 09:17:36 +01:00
Florian Hahn
a099c91eb8
[LAA] Add tests for using inbounds flags only used in predicated blocks.
Test for https://github.com/llvm/llvm-project/issues/160912.
2025-09-30 22:06:17 +01:00
Florian Hahn
9d42c75256
[LAA] Fix picking context instr in evaluatePtrAddRec for multiple preds.
A loop may have more than one predecessor out of the loop. In that case,
just pick the first non-phi instruction in the loop header.
2025-09-30 20:04:29 +01:00
Ramkumar Ramachandra
08c1e9e80a
[LAA] Revert 56a1cbb and 1aded51, due to crash (#160993)
This reverts commits 56a1cbb ([LAA] Fix non-NFC parts of 1aded51),
1aded51 ([LAA] Prepare to handle diff type sizes (NFC)). The original
NFC patch caused some regressions, which the later patch tried to fix.
However, the later patch is the cause of some crashes, and it would be
best to revert both for now, and re-land after thorough testing.
2025-09-27 10:42:20 +01:00
Ramkumar Ramachandra
56a1cbbd1c
[LAA] Fix non-NFC parts of 1aded51 (#160701)
1aded51 ([LAA] Prepare to handle diff type sizes (NFC)) was supposed to
be a non-functional patch, but introduced functional changes as
known-non-negative and known-non-positive is not equivalent to
!known-non-zero. Fix this.
2025-09-25 15:52:02 +01:00
Florian Hahn
b5fc4fd009
[LAA] Add test requiring runtime checks with -max-dependences flag.
Adds extra test coverage for
https://github.com/llvm/llvm-project/pull/91196.
2025-08-21 17:31:53 +01:00
Benjamin Maxwell
bb3066d42b
[LAA] Move scalable vector check into getStrideFromAddRec() (#154013)
This moves the check closer to the `.getFixedValue()` call and fixes
#153797 (which is a regression from #126971).
2025-08-19 06:40:07 +01:00
Ramkumar Ramachandra
4443b37877
[LAA] Pre-commit tests exercising different types (#151091)
Pre-commit tests exercising different types of source/sink in
depend_diff_types.ll, in preparation to weaken the HasSameSize check in
LoopAccessAnalysis.

Co-authored-by: Igor Kirillov <igor.kirillov@arm.com>
2025-08-11 10:19:10 +01:00
Florian Hahn
2ae996cbbe
[LAA] Support assumptions in evaluatePtrAddRecAtMaxBTCWillNotWrap (#147047)
This patch extends the logic added in
https://github.com/llvm/llvm-project/pull/128061 to support
dereferenceability information from assumptions as well.

Unfortunately both assumption cache and the dominator tree need to be
threaded through multiple layers to make them available where needed.

PR: https://github.com/llvm/llvm-project/pull/147047
2025-08-01 14:18:07 +01:00
Florian Hahn
9e7782db73
[LV,LAA] Add tests where RT checks are known false after expansion. 2025-07-26 14:17:35 +01:00
Florian Hahn
6c50e2b2dd
[SCEV] Don't require NUW at first add when checking A+C1 < (A+C2)<nuw> (#149795)
Relax the NUW requirements for isKnownPredicateViaNoOverflow, if the
second operand (Y) is an ADD. The code only simplifies the condition if
C1 < C2, so if the second ADD is NUW, it doesn't matter whether the
first operand also has the NUW flag, as it cannot wrap if C1 < C2.

https://alive2.llvm.org/ce/z/b3dM7N


PR: https://github.com/llvm/llvm-project/pull/149795
2025-07-23 09:33:34 +01:00
Florian Hahn
a11c5dd34b
[LAA] Add test variant with backward dep with overlap in loop.
The original test @backward_dep_known_distance_less_than_btc was
incorrectly named, as all loads are completely before the first store.

Add a variant where this is not the case: @backward_dep_known_distance_less_than_btc
2025-07-21 11:43:35 +01:00
Florian Hahn
5a4586f468
Reapply "[LAA] Remove loop-invariant check added in 234cc40adc61."
This reverts commit d43a80936d437d217d5a6dbbaa5fb131c27e7085.

With the correctness issue blocking the recommit finally fixed
(5d01697ec6cb), again unconditionally check if accesses are completely
before or after each other.
2025-07-14 21:21:22 +01:00
Florian Hahn
d749095b94
[LAA] Add tests where we could derive NoDep due to no overlap.
Add additional tests where we can prove that the accesses are either
completely before or after each other.
2025-07-14 15:48:41 +01:00
Ramkumar Ramachandra
fb845f93c0
[LAA] Hoist setting condition for RT-checks (#128045)
Strip ShouldRetyWithRuntimeCheck from the
DepedenceDistanceStrideAndSizeInfo struct, and free isDependent from the
responsibility of setting the condition for when runtime-checks are
needed, transferring this responsibility to
getDependenceDistanceStrideAndSize.

We can have multiple DepType::Unknown dependences that, by themselves,
do not trigger the retrying with runtime memory checks, and therefore
block vectorization. But once a single
FoundNonConstantDistanceDependence is found, the analysis seems to
switch to the "LAA: Retrying with memory checks" path and allows all
these dependences to be handled via runtime checks. There is hence no
rationale for predicating FoundNonConstantDependenceDistance on
DepType::Unknown, and removing this predication is one of the
side-effects of this patch.
2025-07-07 12:02:41 +01:00
Florian Hahn
0afbf17213
[LAA,LV] Add tests with early-exit loops and deref assumptions.
Adds additional test coverage for early-exit loops with deref
assumptions, as suggested in
https://github.com/llvm/llvm-project/pull/128436.
2025-07-03 19:41:13 +01:00
Ramkumar Ramachandra
619f7afd71
[LAA] Clean up APInt-overflow related code (#140048)
Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-06-30 14:48:56 +01:00
Florian Hahn
c5a49fb62d
[LAA] Add tests with 128 bit inductions and 128 bit pointers.
Adds extra test coverage for
https://github.com/llvm/llvm-project/pull/140048.
2025-06-29 11:28:35 +01:00
Florian Hahn
5d01697ec6
[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. (#128061)
Evaluating AR at the symbolic max BTC may wrap and create an expression
that is less than the start of the AddRec due to wrapping (for example
consider MaxBTC = -2).

If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get
incremented by EltSize before returning, so this effectively sets ScEnd
to unsigned max. Note that LAA separately checks that accesses cannot
not wrap (52ded672492,
https://github.com/llvm/llvm-project/pull/127543), so unsigned max
represents an upper bound.

When there is a computable backedge-taken count, we are guaranteed to
execute the number of iterations, and if any pointer would wrap it would
be UB (or the access will never be executed, so cannot alias). It
includes new tests from the previous discussion that show a case we wrap
with a BTC, but it is UB due to the pointer after the object wrapping
(in `evaluate-at-backedge-taken-count-wrapping.ll`)

When we have only a maximum backedge taken count, we instead try to use
dereferenceability information to determine if the pointer access must be in
bounds for the maximum backedge taken count.

PR: https://github.com/llvm/llvm-project/pull/128061
2025-06-23 20:23:40 +01:00
Florian Hahn
f4ca223196
[LAA] Update early-exit test to cover last valid & first invalid access.
Make sure tests cover loops accessing the last valid and the first
invalid memory location. Note that the test
`@all_exits_dominate_latch_countable_exits_at_most_1000_iterations_kno`
has been modified to execute at most 1000 iterations; it previously
executed 1001 iterations.
2025-06-23 15:06:32 +01:00
Nikita Popov
6157028fea
[BasicAA][ValueTracking] Increase depth for underlying object search (#143714)
This depth limits a linear search (rather than the usual potentially
exponential one) and is not particularly important for compile-time in
practice.

The change in #137297 is going to increase the length of GEP chains, so
I'd like to increase this limit a bit to reduce the chance of
regressions (https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2419
showed a 13% increase in SearchLimitReached). There is no particular
significance to the new value of 10.

Compile-time is neutral.
2025-06-12 09:19:50 +02:00
John Brawn
81d3189891
[LAA] Keep pointer checks on partial analysis (#139719)
Currently if there's any memory access that AccessAnalysis couldn't
analyze then all of the runtime pointer check results are discarded.
This patch makes this able to be controlled with the AllowPartial
option, which makes it so we generate the runtime check information
for those pointers that we could analyze, as transformations may still
be able to make use of the partial information.

Of the transformations that use LoopAccessAnalysis, only
LoopVersioningLICM changes behaviour as a result of this change. This is
because the others either:
* Check canVectorizeMemory, which will return false when we have partial
pointer information as analyzeLoop() will return false.
* Examine the dependencies returned by getDepChecker(), which will be
empty as we exit analyzeLoop if we have partial pointer information
before calling areDepsSafe(), which is what fills in the dependency
information.
2025-06-04 16:47:20 +01:00
Florian Hahn
0ba63b2f22
[SCEV] Add additional test coverage for loop-guards reasoning.
Add additional tests showing missed opportunities when using loop guards
for reasoning in SCEV, depending on the order the guards appear in the
IR.
2025-06-01 22:39:37 +01:00
Ramkumar Ramachandra
bb2791609d
[LAA] Tweak debug output for UTC stability (#140764)
UpdateTestChecks has a make_analyzer_generalizer to replace pointer
addressess from the debug output of LAA with a pattern, which is an
acceptable solution when there is one RUN line. However, when there are
multiple RUN lines with a common pattern, UTC fails to recognize common
output due to mismatched pointer addresses. Instead of hacking UTC scrub
the output before comparing the outputs from the different RUN lines,
fix the issue once and for all by making LAA not output unstable pointer
addresses in the first place.

The removal of the now-dead make_analyzer_generalizer is left as a
non-trivial exercise for a follow-up.
2025-05-21 12:01:49 +01:00
Ramkumar Ramachandra
b18ebd17b8
[LAA] Improve forked-pointers.ll test (#140383) 2025-05-18 09:34:27 +01:00
Florian Hahn
9da103ab9e
[LAA] Update remaining tests after 384a5b00a7. 2025-05-07 21:02:44 +01:00
vaibhav
384a5b00a7
[LAA] Use MaxStride instead of CommonStride to calculate MaxVF (#98142)
We bail out from MaxVF calculation if the strides are not the same.
Instead, we are dependent on runtime checks, though not yet implemented.
We could instead use the MaxStride to conservatively use an upper bound.

This handles cases like the following:
```c
#define LEN 256 * 256
float a[LEN];

void gather() {
  for (int i = 0; i < LEN - 1024 - 255; i++) {
  #pragma clang loop interleave(disable)
  #pragma clang loop unroll(disable)
    for (int j = 0; j < 256; j++)
      a[i + j + 1024] += a[j * 4 + i];
  }
}
```

---------

Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-05-07 21:02:21 +01:00
Vikram Hegde
f91a6e6dab
[SCEV] Reject comparision of pointers to different address spaces in SCEVWrapPredicate::implies (#137935) 2025-04-30 21:03:31 +05:30
Ramkumar Ramachandra
13b443f2ef
[LAA] Improve convergent tests (#136758)
LoopAccessAnalysis has code for handling function calls where the
function is marked with the 'convergent' attribute, but the test
coverage is insufficient. Fix this by adding a test showing the case of
no-runtime-checks adapted from LoopDistribute, and clean up the existing
test with runtime-checks. Also regenerate the test file with
UpdateTestChecks.
2025-04-29 09:50:33 +01:00
Alexey Bataev
ea1b525ceb [LAA] Add tests with non-power-of-2 store-load forward distance (#136710) 2025-04-28 15:10:55 -07:00
Alexey Bataev
88f8637d22 Revert "[LAA] Add tests with non-power-of-2 store-load forward distance (#136710)"
This reverts commit 51bbebb6677bb0ea14ca62cc140492965c2a6e19 to fix
buildbots https://lab.llvm.org/buildbot/#/builders/137/builds/17662
2025-04-28 14:36:44 -07:00
Alexey Bataev
51bbebb667
[LAA] Add tests with non-power-of-2 store-load forward distance (#136710) 2025-04-28 17:02:02 -04:00
Alexey Bataev
78777a204a
[LV]Split store-load forward distance analysis from other checks, NFC (#121156)
The patch splits the store-load forwarding distance analysis from other
dependency analysis in LAA. Currently it supports only power-of-2
distances, required to support non-power-of-2 distances in future.

Part of #100755
2025-03-31 07:28:44 -04:00
Florian Hahn
8bdcd0a96e
[LAA] Add missing test coverage for retrying with runtime checks.
Adds extra test coverage showing change by
https://github.com/llvm/llvm-project/pull/128045.
2025-03-27 19:09:10 +00:00
Florian Hahn
dfb661cd1c
[LAA] Add extra tests for #128061.
Extend test coverage for
https://github.com/llvm/llvm-project/pull/128061.
2025-03-13 21:42:32 +00:00
Florian Hahn
275baedfde
[LAA] Consider accessed addrspace when mapping underlying obj to access. (#129087)
In some cases, it is possible for the same underlying object to be
accessed via pointers to different address spaces. This could lead to
pointers from different address spaces ending up in the same dependency
set, which isn't allowed (and triggers an assertion).

Update the mapping from underlying object -> last access to also include
the accessing address space.

Fixes https://github.com/llvm/llvm-project/issues/124759.

PR: https://github.com/llvm/llvm-project/pull/129087
2025-02-28 20:56:12 +00:00
Florian Hahn
52ded67249
[LAA] Always require non-wrapping pointers for runtime checks. (#127543)
Currently we only check if the pointers involved in runtime checks do
not wrap if we need to perform dependency checks. If that's not the
case, we generate runtime checks, even if the pointers may wrap (see
test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll).

If the pointer wraps, then we swap start and end of the runtime check,
leading to incorrect checks.

An Alive2 proof of what the runtime checks are checking conceptually (on
i4 to have it complete in reasonable time) showing the incorrect result
should be https://alive2.llvm.org/ce/z/KsHzn8

Depends on https://github.com/llvm/llvm-project/pull/127410 to avoid
more regressions.

PR: https://github.com/llvm/llvm-project/pull/127543
2025-02-20 19:00:23 +01:00
Florian Hahn
01d0793a69
[LAA] Make Ptr argument optional in isNoWrap. (#127410)
Update isNoWrap to make the IR Ptr argument optional. This allows using
isNoWrap when dealing with things like pointer-selects, where a select
is translated to multiple pointer SCEV expressions, but there is no IR
value that can be used. We don't try to retrieve pointer values for the
pointer SCEVs and using info from the IR would not be safe. For example,
we cannot use inbounds, because the pointer may never be accessed.

PR: https://github.com/llvm/llvm-project/pull/127410
2025-02-19 14:51:19 +01:00
Ramkumar Ramachandra
6646b65082
[LAA] Rework and rename stripGetElementPtr (#125315)
The stripGetElementPtr function is mysteriously named, and calls into
another mysterious getGEPInductionOperand which does something
complicated with GEP indices. The real purpose of the badly-named
stripGetElementPtr function is to get a loop-variant GEP index, if there
is one. The getGEPInductionOperand is totally redundant, as stripping
off zeros from the end of GEP indices has no effect on computing the
loop-variant GEP index, as constant zeros are always loop-invariant.
Moreover, the GEP induction operand is simply the first non-zero index
from the end, which stripGetElementPtr returns when it finds that any of
the GEP indices are loop-variant: this is a completely unrelated value
to the GEP index that is loop-variant. The implicit assumption here is
that there is only ever one loop-variant index, and it is the first
non-zero one from the end.

The logic is unnecessarily complicated for what stripGetElementPtr wants
to achieve, and the header comments are confusing as well. Strip
getGEPInductionOperand, rework and rename stripGetElementPtr.
2025-02-18 10:25:47 +00:00
Florian Hahn
88e72c401b [LAA] Add test where GEPs may wrap. 2025-02-17 21:49:40 +01:00
Florian Hahn
e080366a76 [LAA] Inline hasComputableBounds in only caller, simplify isNoWrap.
Inline hasComputableBounds into createCheckForAccess. This removes a
level of indirection and allows for passing the AddRec directly to
isNoWrap, removing the need to retrieve the AddRec for the pointer
again.

The early continue for invariant SCEVs now also applies to forked
pointers (i.e. when there's more than one entry in TranslatedPtrs) when
ShouldCheckWrap is true, as those trivially won't wrap.

The change is NFC otherwise. replaceSymbolicStrideSCEV is now called
earlier.
2025-02-16 19:56:13 +01:00
Florian Hahn
044b52832a
[LAA] Perform checks for no-wrap separately from getPtrStride. (#126971)
Reorganize the code in isNoWrap to perform the no-wrap checks without
relying on getPtrStride directly. getPtrStride now uses isNoWrap.

The new structure allows deriving no-wrap in more cases in LAA, because
there are some cases where getPtrStride bails out early because it
cannot return a constant stride, but we can still prove no-wrap for the
pointer.

An example are AddRecs with non-ConstantInt strides with inbound GEPs,
in the improved test cases.

This enables vectorization with runtime checks in a few more cases.

PR: https://github.com/llvm/llvm-project/pull/126971
2025-02-14 20:06:37 +01:00
Florian Hahn
1199bbb396 [LAA] Add forked pointers tests with dep checks and runtime checks (NFC)
Add missing test coverage where generating runtime checks is tried again
after dependence analysis.
2025-02-14 19:45:52 +01:00
Ramkumar Ramachandra
8327c2cfdb
LAA: fix logic for MaxTargetVectorWidth (#125487)
Uses the fixed register width if scalable vectorization is not enabled
(via TargetTransformInfo::enableScalableVectorization) and improves
results if there are scalable vector registers, but they shouldn't be
used.
2025-02-13 11:40:05 +00:00
Florian Hahn
82605285b8 [LAA] Also clear CheckingGroups in RuntimePointerChecking::reset.
This fixes a crash when trying to print access-info in the newly added
test cases.
2025-02-12 21:49:22 +01:00