191 Commits

Author SHA1 Message Date
Ramkumar Ramachandra
d897ea37db
LAA: check nusw on GEP in place of inbounds (#112223)
With the introduction of the nusw flag in GEPNoWrapFlags, it should be
safe to weaken the check in LoopAccessAnalysis to just check the nusw
flag on the GEP, instead of inbounds.
2024-10-22 09:58:54 +01:00
Ramkumar Ramachandra
f719cfa868
LAA: be less conservative in isNoWrap (#112553)
isNoWrap has exactly one caller which handles Assume = true separately,
but too conservatively. Instead, pass Assume to isNoWrap, so it is
threaded into getPtrStride, which has the correct handling for the
Assume flag. Also note that the Stride == 1 check in isNoWrap is
incorrect: getPtrStride returns Strides == 1 or -1, except when
isNoWrapAddRec or Assume are true, assuming ShouldCheckWrap is true; we
can include the case of -1 Stride, and when isNoWrapAddRec is true. With
this change, passing Assume = true to getPtrStride could return a
non-unit stride, and we correctly handle that case as well.
2024-10-22 09:55:51 +01:00
Florian Hahn
dec4cfdb09
[LAA] Use loop guards when checking invariant accesses.
Apply loop guards to start and end pointers like done in other places to
improve results.
2024-10-04 12:23:13 +01:00
Florian Hahn
972353fdfa
[LAA] Add tests where results can be improved using loop guards.
2024-10-04 11:26:16 +01:00
Ramkumar Ramachandra
7eea55fd4b
LoopLoadElim: re-org tests after invalid #96656 (#97598)
After pr96656.ll were added to LAA and LoopVersioning, it was decided
that the bug is in a caller of LoopVersioning, not in LAA or
LoopVersioning itself. The new candidate was LoopLoadElim, but #96656
has since been marked invalid. Hence, re-organize the added tests to
avoid confusion, and move the testcase from the investigation to
LoopLoadElim.
2024-09-30 15:46:34 +01:00
Florian Hahn
606a9342f1
[LAA] Add test cases where evaluating AddRecs at symbolic max BTC wraps.
The underlying issue was discovered by an assert added in
a80053322b765eec939 by a test case provided by @mstorsjo.
2024-08-29 12:29:10 +01:00
Florian Hahn
d43a80936d
Revert "[LAA] Remove loop-invariant check added in 234cc40adc61."
This reverts commit a80053322b765eec93951e21db490c55521da2d8.

The new asserts exposed an underlying issue where the expanded bounds
could wrap, causing parts of the code to incorrectly determine that
accesses do not overlap.

Reproducer below based on @mstorsjo's test case.

opt -passes='print<access-info>'

target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"

define i32 @j(ptr %P, i32 %x, i32 %y) {
entry:
  %gep.P.4 = getelementptr inbounds nuw i8, ptr %P, i32 4
  %gep.P.8 = getelementptr inbounds nuw i8, ptr %P, i32 8
  br label %loop

loop:
  %1 = phi i32 [ %x, %entry ], [ %sel, %loop.latch ]
  %iv = phi i32 [ %y, %entry ], [ %iv.next, %loop.latch ]
  %gep.iv = getelementptr inbounds i64, ptr %gep.P.8, i32 %iv
  %l = load i32, ptr %gep.iv, align 4
  %c.1 = icmp eq i32 %l, 3
  br i1 %c.1, label %loop.latch, label %if.then

if.then:                                          ; preds = %for.body
  store i64 0, ptr %gep.iv, align 4
  %l.2 = load i32, ptr %gep.P.4
  br label %loop.latch

loop.latch:
  %sel = phi i32 [ %l.2, %if.then ], [ %1, %loop ]
  %iv.next = add nsw i32 %iv, 1
  %c.2 = icmp slt i32 %iv.next, %sel
  br i1 %c.2, label %loop, label %exit

exit:
  %res = phi i32 [ %iv.next, %loop.latch ]
  ret i32 %res
}
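
The wrapping problem described above can be illustrated with plain integer
arithmetic. This is only a sketch: the addresses and the helper names
(end_wrapping, no_overlap) are made up for illustration and are not taken
from the reproducer or from LAA. The only detail borrowed from the
reproducer is the 32-bit pointer width in its datalayout.

```python
PTR_BITS = 32                      # pointer width, as in the reproducer's datalayout
MASK = (1 << PTR_BITS) - 1

def end_wrapping(start, extent):
    """End of an access range computed in PTR_BITS-wide arithmetic (may wrap)."""
    return (start + extent) & MASK

def no_overlap(src_start, src_end, sink_start, sink_end):
    """Interval test on half-open ranges [start, end)."""
    return src_end <= sink_start or sink_end <= src_start

# Hypothetical addresses chosen so that the source range crosses 2**32.
src_start = 0xFFFFFFF0
extent = 0x20
sink_start, sink_end = 0xFFFFFFF8, 0xFFFFFFFC

wrapped_end = end_wrapping(src_start, extent)  # wraps around to a small value
exact_end = src_start + extent                 # what the end would be without wrapping

# With the wrapped end, the interval test wrongly reports "no overlap".
wrongly_disjoint = no_overlap(src_start, wrapped_end, sink_start, sink_end)
# With the exact end, the same test reports that the ranges overlap.
actually_disjoint = no_overlap(src_start, exact_end, sink_start, sink_end)
```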
2024-08-27 11:55:47 +01:00
Florian Hahn
a80053322b
[LAA] Remove loop-invariant check added in 234cc40adc61.
234cc40adc61 introduced a loop-invariance check to limit the
compile-time impact of the newly added checks.

This patch removes the restriction and avoids extra compile-time impact
by sinking the check to exits where we would return an unknown
dependence. This notably reduces the number of times the extra checks
are executed while not missing out on any improvements from them.

https://llvm-compile-time-tracker.com/compare.php?from=33e7cd6ff23f6c904314d17c68dc58168fd32d09&to=7c55e66d4f31ce8262b90c119a8e84e1f9515ff1&stat=instructions:u
2024-08-26 10:24:00 +01:00
Ramkumar Ramachandra
a80dd44b0d
LAA: pre-commit tests for stride-versioning (#97570)
Add tests for when the Stride is unknown and equal to TC, with different
kinds of casts. In these cases, LAA should not speculate on Stride.
2024-08-21 12:11:19 +01:00
Florian Hahn
844c188c79
[LAA] Refine stride checks for SCEVs during dependence analysis. (#99577)
Update getDependenceDistanceStrideAndSize to reason about different
combinations of strides directly and explicitly.

Update getPtrStride to return 0 for invariant pointers.

Then proceed by checking the strides.

If either source or sink are not strided by a constant (i.e. not a
non-wrapping AddRec) or invariant, the accesses may overlap
with earlier or later iterations and we cannot generate runtime
checks to disambiguate them.

Otherwise they are either loop invariant or strided. In that case, we
can generate a runtime check to disambiguate them.

If both are strided by constants, we proceed as previously.

This is an alternative to
https://github.com/llvm/llvm-project/pull/99239 and also replaces
additional checks if the underlying object is loop-invariant.

Fixes https://github.com/llvm/llvm-project/issues/87189.

PR: https://github.com/llvm/llvm-project/pull/99577
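
A rough sketch of the case split described above, as one possible reading
of the message rather than the actual implementation. The function name,
the returned labels, and the use of None for "neither constant-strided
nor invariant" are assumptions made for illustration; only the convention
that stride 0 means an invariant pointer comes from the message.

```python
def classify_strides(src_stride, sink_stride):
    """Classify a source/sink pair by stride.

    None models a pointer that is neither strided by a constant nor
    loop-invariant (hypothetical encoding); 0 models an invariant pointer.
    """
    if src_stride is None or sink_stride is None:
        # Accesses may overlap with earlier or later iterations, and
        # no runtime check can disambiguate them.
        return "may-overlap-no-runtime-check"
    if src_stride == 0 or sink_stride == 0:
        # At least one access is invariant; a runtime check is possible.
        return "runtime-check-possible"
    # Both are strided by constants: continue with the existing analysis.
    return "constant-stride-analysis"
```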
2024-07-26 13:10:16 +01:00
Jay Foad
8ebe499e07
[LLVM] Fix typo "depedent"
2024-07-23 12:52:20 +01:00
Florian Hahn
4199f80df5
[LAA] Adjust test from a4f8705b05 so RT checks aren't always false.
Updated @B_indices_loaded_in_loop_A_stored to use a different offset
for one of the accesses we create runtime checks for; the original
version had a runtime check that was always true as the accesses always
overlapped.
2024-07-16 21:56:57 +01:00
Florian Hahn
3ccda93671
[LAA] Update pointer-bounds cache to also consider access type.
The same pointer may be accessed with different types and the bound
includes the size of the accessed type to compute the end. Update the
cache to correctly disambiguate between different accessed types.
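
A minimal sketch of why the cache key needs the access type, assuming a
simplified model in which a pointer is a name, the access type is reduced
to its size in bytes, and the bound is (start, start + span + size). The
class and parameter names are hypothetical and do not mirror LAA's data
structures.

```python
class BoundsCache:
    """Caches (start, end) bounds per (pointer, access size) pair."""

    def __init__(self):
        self._cache = {}

    def get(self, ptr, start, span, access_size):
        # Keying on ptr alone would hand back the bounds computed for
        # whichever access size was seen first.
        key = (ptr, access_size)
        if key not in self._cache:
            # The end includes the size of the accessed type.
            self._cache[key] = (start, start + span + access_size)
        return self._cache[key]

cache = BoundsCache()
bounds_i32 = cache.get("p", start=0, span=96, access_size=4)
bounds_i64 = cache.get("p", start=0, span=96, access_size=8)
```

With the access size in the key, the two lookups produce different end
bounds for the same pointer.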
2024-07-14 17:24:12 +01:00
Florian Hahn
41209075da
[LAA] Add tests for accesses to the same pointer with different types.
Add tests with accesses to the same pointer with different types. At the
moment, runtime checks for those accesses are incorrectly based on the
smaller type.
2024-07-14 15:01:44 +01:00
Florian Hahn
a4f8705b05
[LAA] Precommit test with loops where indices are loaded in each iter.
Add tests which are not safe to vectorize because %indices are loaded in
the loop and the same indices could be loaded in later iterations.

Tests for https://github.com/llvm/llvm-project/issues/87189.
2024-07-13 21:25:32 +01:00
Graham Hunter
22a7f6dcc4
Revert "[LV] Autovectorization for the all-in-one histogram intrinsic" (#98493)
Reverts llvm/llvm-project#91458 to deal with post-commit reviewer
requests.
2024-07-11 16:39:30 +01:00
Graham Hunter
1860fd049e
[LV] Autovectorization for the all-in-one histogram intrinsic (#91458)
This patch implements limited loop vectorization support for the 'all-in-one' histogram intrinsic. The feature is disabled by default, and when enabled will only vectorize if there are no other users of values in the gather-modify-scatter sequence.
2024-07-11 15:33:30 +01:00
Ramkumar Ramachandra
6334d0af3b
LAA, LVer: add pre-commit tests for #96656 (#96925)
The issue is in LoopAccessAnalysis, but the regression was seen in the
user LoopVersioning. Hence, add pre-commit tests for both, in
preparation to fix the issue in LoopAccessAnalysis.
2024-06-28 10:04:23 +01:00
Ramkumar Ramachandra
0f111ba790
LoopInfo: introduce Loop::getLocStr; unify debug output (#93051)
Introduce a Loop::getLocStr stolen from LoopVectorize's static function
getDebugLocString in order to have uniform debug output headers across
LoopVectorize, LoopAccessAnalysis, and LoopDistribute. The motivation
for this change is to have UpdateTestChecks recognize the headers and
automatically generate CHECK lines for debug output, with minimal
special-casing.
2024-06-25 13:12:15 +01:00
Ramkumar Ramachandra
5ae50698a0
LAA: strip unnecessary getUniqueCastUse (#92119)
733b8b2 ([LAA] Simplify identification of speculatable strides [nfc])
refactored getStrideFromPointer() to compute directly on SCEVs, and
return an SCEV expression instead of a Value. However, it left behind a
call to getUniqueCastUse(), which is completely unnecessary. Remove
this, showing a positive test update, and simplify the surrounding
program logic.
2024-06-24 22:49:02 +01:00
Florian Hahn
e949b54a5b
[LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (#93499)
Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount which returns
the minimum of the countable exits.

When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if any uncomputable exits leave the loop,
as long as we prove that there are no dependences given the minimum of
the countable exits. The same should apply also for generating runtime
checks.

Note that this shifts the responsibility of checking whether all exit
counts are computable or handling early-exits to the users of LAA.

Depends on https://github.com/llvm/llvm-project/pull/93498

PR: https://github.com/llvm/llvm-project/pull/93499
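
As an illustration of "the minimum of the countable exits", here is a
sketch that uses plain integers in place of SCEV expressions. The function
name and the use of None for an uncomputable exit count are assumptions
made for this example.

```python
def symbolic_max_btc(exit_counts):
    """Return the smallest computable exit count, or None if none is computable.

    exit_counts holds one entry per loop exit; None stands for an exit
    whose count cannot be computed.
    """
    computable = [count for count in exit_counts if count is not None]
    return min(computable) if computable else None
```

For example, symbolic_max_btc([100, None, 37]) yields 37: the uncomputable
exit is ignored, which is why the users of LAA have to handle such exits
themselves.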
2024-06-04 22:23:30 +01:00
Florian Hahn
461cc8612f
[LAA] Add test where stride is also used for BTC.
Add missing test coverage for follow-up to
https://github.com/llvm/llvm-project/pull/93499.
2024-05-30 21:05:31 -07:00
Florian Hahn
234cc40adc
[LAA] Limit no-overlap check to at least one loop-invariant accesses.
Limit the logic added in https://github.com/llvm/llvm-project/pull/9230
to cases where either sink or source are loop-invariant, to avoid
compile-time increases. This is not needed for correctness.

I am working on follow-up changes to reduce the compile-time impact in
general to allow us to enable this again for any source/sink.

This should fix the compile-time regression introduced by this change:

* compile-time improvement with this change:
  https://llvm-compile-time-tracker.com/compare.php?from=4351787fb650da6d1bfb8d6e58753c90dcd4c418&to=b89010a2eb5f98494787c1c3b77f25208c59090c&stat=instructions:u

* compile-time improvement with original patch reverted on top of this
  change:
  https://llvm-compile-time-tracker.com/compare.php?from=b89010a2eb5f98494787c1c3b77f25208c59090c&to=19a1103fe68115cfd7d6472c6961f4fabe81a593&stat=instructions:u
2024-05-28 09:23:02 -07:00
Florian Hahn
0f08ef1b66
[LAA] Add tests with various early exits.
2024-05-27 18:50:26 -07:00
Ramkumar Ramachandra
9e814669a0
[LAA] rewrite a test to make it more robust (#93197)
The test select-dependence.ll can be eliminated completely by dce, as it
returns a constant, and doesn't write any arguments. Lift out the local
allocas into arguments, so that it is less nonsensical. While at it,
rename the variables for greater readability, and regenerate the test
with UpdateTestChecks.
2024-05-24 17:22:35 +01:00
Ramkumar Ramachandra
f1acd9d577
[LAA] increase test coverage in symbolic-stride (#92253)
The test symbolic-stride.ll does not exercise all codepaths in
getStrideFromPointer, particularly when the operand is an
SCEVIntegralCastExpr. Cover these codepaths as well. This patch serves
as pre-commit tests for #92119.
2024-05-24 10:58:39 +01:00
Paul Walker
5bd210ace6
[NFC][LLVM] Autogenerate check lines for some Analysis/LoopAccessAnalysis tests.
2024-05-22 10:37:06 +00:00
Florian Hahn
1b377dbeb7
[LAA] Check accesses don't overlap early to determine NoDep (#92307)
Use getStartAndEndForAccess to compute the start and end of both src 
and sink (factored out to helper in bce3680f45b57f). If they do not
overlap (i.e. SrcEnd <= SinkStart || SinkEnd <= SrcStart), there is no
dependence, regardless of stride.

PR: https://github.com/llvm/llvm-project/pull/92307
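
The overlap condition above can be written out as a small sketch. It
assumes half-open ranges [start, end) over plain integers; the function
name is hypothetical and the real code works on SCEV expressions rather
than constants.

```python
def accesses_may_overlap(src_start, src_end, sink_start, sink_end):
    """Return False when the two ranges are provably disjoint."""
    no_overlap = src_end <= sink_start or sink_end <= src_start
    return not no_overlap
```

For example, the ranges [0, 16) and [16, 32) are disjoint, so there is no
dependence regardless of stride, while [0, 17) and [16, 32) may overlap.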
2024-05-21 11:00:11 +01:00
Florian Hahn
d108fa03d4
[LAA] Add tests with invariant accesses using vector types.
Extra tests for https://github.com/llvm/llvm-project/pull/92307
2024-05-20 14:53:23 +01:00
Florian Hahn
ec36145f58
[LAA] Add tests with invariant dependences before strided ones.
Add extra test coverage for loops with strided and invariant accesses to
the same object.
2024-05-15 18:54:35 +01:00
Florian Hahn
179efe5abc
[LAA] Delay applying loop guards until after isSafeDependenceDistance.
Applying the loop guards to the distance may prevent
isSafeDependenceDistance from determining NoDep, unless loop guards are
also applied to the backedge-taken-count.

Instead of applying the guards to both Dist and the
backedge-taken-count, just apply them after handling
isSafeDependenceDistance and constant distances; there is no benefit to
applying the guards before then.

This fixes a regression flagged by @bjope due to
ecae3ed958481cba7d60868cf3504292f7f4fdf5.
2024-05-14 19:47:24 +01:00
Florian Hahn
86f655cb4e
[LAA] Add tests showing unnecessary RT check due to applying loop guards
Test courtesy of @bjope showing a regression due to
ecae3ed958481cba7d60868cf3504292f7f4fdf5.
2024-05-14 18:27:37 +01:00
Florian Hahn
f52ca63278
[LAA] Drop x86_64 target triple to fix test on builds with X86.
Follow-up to fix test after 28767afd53353d9333b0adf6f0fafa1592092532.
2024-05-10 12:27:24 +01:00
Florian Hahn
28767afd53
[LAA] Support backward dependences with non-constant distance. (#91525)
Following up to 933f49248, also update the code reasoning about
backwards dependences to support non-constant distances.

Update the code to use the signed minimum distance instead of a constant
distance.

This means we check the lower bound of the dependence distance and the
distance may be larger at runtime (and safe for vectorization). Whether
to classify it as Unknown or Backwards depends on the vector width and
LAA was updated to take TTI to get the maximum vector register width.

If the minimum dependence distance is larger than the max vector width,
we consider it as backwards-vectorizable. Otherwise we classify them as
Unknown, so we re-try with runtime checks.

PR: https://github.com/llvm/llvm-project/pull/91525
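
The classification rule can be sketched as follows. The function name,
the string labels, and the choice to express both quantities in the same
unit are assumptions for illustration; the real code derives the maximum
vector register width from TTI.

```python
def classify_backward_dependence(min_distance, max_vector_width):
    """Classify a backward dependence from its signed minimum distance.

    Both arguments are assumed to be in the same unit (for example bits).
    """
    if min_distance > max_vector_width:
        # Even the smallest possible distance exceeds the widest vector.
        return "BackwardVectorizable"
    # Otherwise fall back to Unknown so runtime checks can be tried.
    return "Unknown"
```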
2024-05-10 11:47:13 +01:00
Florian Hahn
ecae3ed958
[LAA] Apply loop guards to dependence distance.
After supporting non-constant dependence distances in 933f49248bf,
applying information from loop guards can help further disambiguate
dependencies.
2024-05-09 18:12:55 +01:00
Florian Hahn
1464aee376
[LAA] Add tests with non-constant backward deps with known min value.
Add a set of tests with non-constant backward dependences, where the
minimum value is known (via the start value of the outer AddRec).
2024-05-08 20:16:46 +01:00
Florian Hahn
50b45b2422
[LAA] Add tests with forward dependences known via assumes.
2024-05-08 15:45:38 +01:00
Florian Hahn
5f73d29cb7
[LAA] Add tests showing extra unnecessary runtime checks.
Pre-commit tests for an upcoming patch.
2024-05-06 13:14:33 +01:00
Florian Hahn
148b721772
[LAA] Update check line in test to fully match message.
2024-05-06 13:04:36 +01:00
Florian Hahn
82219e547b
[LAA] Pass maximum stride to isSafeDependenceDistance. (#90036)
As discussed in https://github.com/llvm/llvm-project/pull/88039, support
different strides with isSafeDependenceDistance by passing the maximum
of both strides.

isSafeDependenceDistance tries to prove that
    |Dist| > BackedgeTakenCount * Step
holds. Choosing the maximum stride computes the maximum range accessed by
the loop for all strides.

PR: https://github.com/llvm/llvm-project/pull/90036
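
A sketch of the inequality with the maximum of both strides, using plain
integers in place of SCEV expressions. The function and parameter names
are hypothetical, and taking absolute values of the strides is an
assumption of this example.

```python
def is_safe_dependence_distance(dist, backedge_taken_count, stride_a, stride_b):
    """Check |Dist| > BackedgeTakenCount * Step with Step = max stride."""
    step = max(abs(stride_a), abs(stride_b))
    return abs(dist) > backedge_taken_count * step
```

For instance, with a backedge-taken count of 99 and strides of 4 and 2,
the accessed range spans 99 * 4 = 396, so a distance of 400 passes the
check while a distance of 396 does not.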
2024-04-30 12:59:08 +01:00
Florian Hahn
933f49248b
[LAA] Support different strides & non constant dep distances using SCEV. (#88039)
Extend LoopAccessAnalysis to support different strides and as a
consequence non-constant distances between dependences using SCEV to
reason about the direction of the dependence.

In multiple places, logic to rule out dependences using the stride has
been updated to only be used if StrideA == StrideB, i.e. there's a
common stride.

We now also may bail out at multiple places where we may have to set
FoundNonConstantDistanceDependence. This is done when we need to bail
out and the distance is not constant to preserve original behavior.

Fixes https://github.com/llvm/llvm-project/issues/87336

PR: https://github.com/llvm/llvm-project/pull/88039
2024-04-25 21:38:07 +01:00
Florian Hahn
d5f2753067
[LAA] Tests with different strides where BTC can rule out dependence.
Tests to add support for different strides with isSafeDependenceDistance
as follow-up to https://github.com/llvm/llvm-project/pull/88039.
2024-04-25 10:44:36 +01:00
Florian Hahn
5138ccd0e4
[LAA] Add extra tests with strides with different signs.
Extra tests with strides with different signs for
https://github.com/llvm/llvm-project/pull/88039.
2024-04-22 15:50:59 +01:00
Florian Hahn
977c0a6d29
[LAA] Add tests with non-constant strides & distances.
Add a number of LAA test cases with both forward and backward
dependences with non-constant strides and dependence distances.

This includes test coverage for
https://github.com/llvm/llvm-project/issues/87336

Also includes a LoopLoadElimination test to make sure the pass does not
crash on non-constant dependence distances.
2024-04-08 19:18:38 +01:00
Florian Hahn
a3ad5faa32
[LAA] Fix typo IndidrectUnsafe -> IndirectUnsafe.
Fix typo in textual analysis output.
2024-03-12 14:44:04 +00:00
Florian Hahn
b274b23665
[ValueTracking] Treat phi as underlying obj when not decomposing further (#84339)
At the moment, getUnderlyingObjects simply continues for phis that do
not refer to the same underlying object in loops, without adding them to
the list of underlying objects, effectively ignoring those phis.

Instead of ignoring those phis, add them to the list of underlying
objects. This fixes a miscompile where LoopAccessAnalysis fails to
identify a memory dependence, because no underlying objects can be found
for a set of memory accesses.

Fixes https://github.com/llvm/llvm-project/issues/82665.

PR: https://github.com/llvm/llvm-project/pull/84339
2024-03-12 08:55:03 +00:00
Florian Hahn
4cfd4a7896
[LAA] Add test case for #82665.
Test case for https://github.com/llvm/llvm-project/issues/82665.
2024-03-07 13:53:03 +00:00
Fangrui Song
3d18c8cd26
[test] Replace aarch64-*-{eabi,gnueabi}{,hf} with aarch64
Similar to d39b4ce3ce8a3c256e01bdec2b140777a332a633
Using "eabi" or "gnueabi" for aarch64 targets is a common mistake and
warned by Clang Driver. We want to avoid them elsewhere as well. Just
use the common "aarch64" without other triple components.
2024-02-12 18:29:55 -08:00
Nikita Popov
1aee1e1f4c
[Analysis] Convert tests to opaque pointers (NFC)
2024-02-05 12:04:39 +01:00
Nikita Popov
cd7ea4ea65
[LAA] Drop alias scope metadata that is not valid across iterations (#79161)
LAA currently adds memory locations with their original AATags to AST.
However, scoped alias AATags may be valid only within one loop
iteration, while LAA reasons across iterations.

Fix this by determining which alias scopes are defined inside the loop,
and drop AATags that reference these scopes.

Fixes https://github.com/llvm/llvm-project/issues/79137.
2024-01-24 11:20:16 +01:00