2034 Commits

Author SHA1 Message Date
v01dXYZ
cff8d716bd
[SCEV] forgetValue: support (with-overflow-inst op0, op1) (#98015)
The use-def walk in forgetValue() was skipping instructions with
non-SCEVable types. However, SCEV may look past with.overflow
intrinsics returning aggregates.

Fixes #97586.
2024-07-09 09:14:33 +02:00
Florian Hahn
2f89d4a8c7
[SCEV] Split collecting and applying rewrite info from loop guards (NFC) (#97316)
Introduce a new LoopGuards class to track info from loop guards and split 
off collecting rewrite info to LoopGuards::collect. This allows users of 
applyLoopGuards to collect rewrite info once in cases where the same
loop guards are applied multiple times.

This is used to collect rewrite info once in howFarToZero, which saves a
bit of compile-time:
stage1-O3: -0.04%
stage1-ReleaseThinLTO: -0.02%
stage1-ReleaseLTO-g: -0.04%
stage2-O3: -0.02%

https://llvm-compile-time-tracker.com/compare.php?from=117b53ae38428ca66eaa886fb432e6f09db88fe4&to=4ffb7b2e1c99081ccebe6f236c48a0be2f64b6ff&stat=instructions:u

Notably this improves mafft by -0.9% with -O3, -0.11% with LTO and
-0.12% with stage2-O3.

PR: https://github.com/llvm/llvm-project/pull/97316
2024-07-02 19:32:28 +01:00
Nikita Popov
d23959b2f5 [SCEV] Cache DataLayout in class (NFC)
PR #96919 caused a minor compile-time regression, mostly because
SCEV now goes through an extra out-of-line function to fetch the
data layout, and does this a lot. Cache the DataLayout in SCEV
to avoid these repeated calls.
2024-06-28 12:19:27 +02:00
vaibhav
7e59b20034
[SCEV] Support addrec in right hand side in howManyLessThans (#92560)
Fixes #92554 (std::reverse will auto-vectorize now)

When calculating number of times a exit condition containing a
comparison is executed, we mostly assume that RHS of comparison should
be loop invariant, but it may be another add-recurrence.

~In that case, we can try the computation with `LHS = LHS - RHS` and
`RHS = 0`.~ (It is not valid unless proven that it doesn't wrap)

**Edit:**
We can calculate back edge count for loop structure like:

```cpp
left = left_start
right = right_start
while(left < right){
  // ...do something...
  left += s1; // the stride of left is s1 (> 0)
  right -= s2; // the stride of right is -s2 (s2 > 0)
}
// left and right converge somewhere in the middle of their start values
```
We can calculate the backedge-count as ceil((End - left_start) /u (s1-
(-s2)) where, End = max(left_start, right_start).

**Alive2**: https://alive2.llvm.org/ce/z/ggxx58
2024-06-25 14:33:57 -07:00
Kazu Hirata
1462605ab0
[Analysis] Use range-based for loops (NFC) (#96587) 2024-06-25 06:57:30 -07:00
Mohammed Keyvanzadeh
7b57a1b401
[llvm] format and terminate namespaces with closing comment (#94917)
Namespaces are terminated with a closing comment in the majority of the
codebase so do the same here for consistency. Also format code within
some namespaces to make clang-format happy.
2024-06-21 23:50:53 +03:30
Enna1
1fafa32f6e
[SCEV] Avoid unnecessary call to getExitingBlock() in computeExitLimit() (#96188)
In `computeExitLimit()`, we use `getExitingBlock()` to check if loop has
exactly one exiting block.
Since `computeExitLimit()` is only used in
`computeBackedgeTakenCount()`, and `getExitingBlocks()` is called to get
all exiting blocks in `computeBackedgeTakenCount()`, we can simply check
if loop has exactly one exiting block by checking if the number of
exiting blocks equals 1 in `computeBackedgeTakenCount()` and pass it as
an argument to `computeExitLimit()`.

This change helps to improve the compile time for files containing large
loops.
2024-06-21 09:43:34 +08:00
Nikita Popov
290a939fc3 [SCEV] Handle nusw/nuw gep flags for addrecs
Set the nw flag is gep has any nowrap flags. Transfer the nuw
flag. Also set nuw for the nusw + nneg combination.
2024-06-20 15:59:42 +02:00
Nikita Popov
7f09aa9e36 [SCEV] Transfer gep nusw and nuw flags
nusw implies nsw offset and nuw base+offset arithmetic if offset
is non-negative. nuw implies nuw offset and base+offset arithmetic.
As usual, we can only transfer is poison implies UB.
2024-06-20 14:45:02 +02:00
Philip Reames
04cd069062
[SCEV] Use context sensitive reasoning in howFarToZero (#94525)
This change builds on 0a357ad which supported non-constant strides in
howFarToZero, but used only context insensitive reasoning.

This change does two things:
1) Directly use context sensitive queries to prove facts established
   before the loop.  Note that we technically only need facts known
   at the latch, but using facts known on entry is a conservative
   approximation which will cover most everything.
2) For the non-zero check, we can usually prove non-zero from the
   finite assumption implied by mustprogress.  This eliminates the
   need to do the context sensitive query in the common case.
2024-06-19 08:59:26 -07:00
Philip Reames
cb76896d6e
[SCEVExpander] Recognize urem idiom during expansion (#96005)
If we have a urem expression, emitting it as a urem is significantly
better that letting the fully expansion kick in. We have the risk of a
udiv or mul which could have previously been shared, but loosing that
seems like a reasonable tradeoff for being able to round trip a urem w/o
modification.
2024-06-19 08:40:04 -07:00
Philip Reames
0a357adc75
[SCEV] Support non-constant step in howFarToZero (#94411)
VF * vscale is the canonical step for a scalably vectorized loop, and
LFTR canonicalizes to NE loop tests, so having our trip count logic be
unable to compute trip counts for such loops is unfortunate.

The existing code needed minimal generalization to handle non-constant
strides. The tricky cases to be sure we handle correctly are: zero, and
-1 (due to the special case of abs(-1) being non-positive).

This patch does the full generalization in terms of code structure, but
in practice, this seems unlikely to benefit
anything beyond the (C * vscale) case. I did some quick investigation,
and it seems the context free non-zero, and sign checks are basically
never disproved for arbitrary scales. I think we have alternate tactics
available for these, but I'm going to return to that in a separate
patch.
2024-06-05 08:05:07 -07:00
Florian Hahn
4812e9a487
[SCEV] Preserve flags in SCEVLoopGuardRewriter for add and mul. (#91472)
SCEVLoopGuardRewriter only replaces operands with equivalent values, so
we should be able to transfer the flags from the original expression.

PR: https://github.com/llvm/llvm-project/pull/91472
2024-06-03 13:25:55 +01:00
Florian Hahn
39e5036c0e
[SCEV] Add predicated version of getSymbolicMaxBackedgeTakenCount. (#93498)
This patch adds a predicated version of
getSymbolicMaxBackedgeTakenCount.

The intended use for this is loop access analysis for loops with
uncountable exits. When analyzing dependences and computing runtime
checks, we need the smallest upper bound on the number of iterations. In
terms of memory safety, it shouldn't matter if any uncomputable exits
leave the loop, as long as we prove that there are no dependences given
the minimum of the countable exits. The same should apply also for
generating runtime checks.

PR: https://github.com/llvm/llvm-project/pull/93498
2024-05-28 16:25:54 -07:00
Florian Hahn
ef67f31e88
[SCEV] Compute symbolic max backedge taken count in BTI directly. (NFC)
Move symbolic max backedge taken count computation to BackedgeTakenInfo,
use existing ExitNotTaken info.

In preparation for https://github.com/llvm/llvm-project/pull/93498.
2024-05-28 10:51:21 -07:00
Florian Hahn
bb4c8f9219
[SCEV] Don't add predicates already implied by UnionPredicate. (#93397)
Update SCEVUnionPredicate::add to only add predicates from another union
predicate, if they aren't alread implied by the union predicate we add
them to.

Note that there exists logic elsewhere to avoid adding predicates if
they are already implied, but this logic misses cases when only some
predicates of a union predicate are implied by the current set of
predicates.

PR: https://github.com/llvm/llvm-project/pull/93397
2024-05-26 18:31:36 -07:00
Nikita Popov
ca478bc6cc
[SCEV] Support ule/sle exit counts via widening (#92206)
If we have an exit condition of the form IV <= Limit, we will first try
to convert it into IV < Limit+1 or IV-1 < Limit based on range info (in
icmp simplification). If that fails, we try to convert it to IV < Limit
+ 1 based on controlling exits in non-infinite loops.

However, if all else fails, we can still determine the exit count by
rewriting to ext(IV) < ext(Limit) + 1, where the zero/sign extension
ensures that the addition does not overflow.

Proof: https://alive2.llvm.org/ce/z/iR-iYd
2024-05-23 07:54:08 +02:00
Nikita Popov
b6e102e08c
[SCEV] Don't use non-deterministic constant folding for trip counts (#90942)
When calculating the exit count exhaustively, if any of the involved
operations is non-deterministic, the exit count we compute at
compile-time and the exit count at run-time may differ. Using these
non-deterministic constant folding results is only correct if we
actually replace all uses of the instruction with the value. SCEV (or
its consumers) generally don't do this.

Handle this by adding a new AllowNonDeterministic flag to the constant
folding API, and disabling it in SCEV. If non-deterministic results are
not allowed, do not fold FP lib calls in general, and FP operations
returning NaNs in particular. This could be made more precise (some FP
libcalls like fabs are fully deterministic), but I don't think this that
precise handling here is worthwhile.

Fixes the interesting part of
https://github.com/llvm/llvm-project/issues/89885.
2024-05-20 07:40:54 +02:00
Florian Hahn
3f0e1d4cf0
[SCEV] Swap order of arguments to MatchBinaryAddToConst (NFCI). (#91945)
The argument order to MatchBinaryAddToConst doesn't match the comment
and also is counter-intuitive (passing RHS before LHS, C2 before C1).

This patch adjusts the order to be inline with the calls above, which
should be equivalent, but more natural:
https://alive2.llvm.org/ce/z/ZWGp-Z

PR: https://github.com/llvm/llvm-project/pull/91945
2024-05-13 13:04:22 +01:00
Eli Friedman
f893dccbba
Replace uses of ConstantExpr::getCompare. (#91558)
Use ICmpInst::compare() where possible, ConstantFoldCompareInstOperands
in other places. This only changes places where the either the fold is
guaranteed to succeed, or the code doesn't use the resulting compare if
we fail to fold.
2024-05-09 16:50:01 -07:00
Andreas Jonson
ff3523f67b
[IR] Drop poison-generating return attributes when necessary (#89138)
Rename has/dropPoisonGeneratingFlagsOrMetadata to
has/dropPoisonGeneratingAnnotations and make it also handle
nonnull, align and range return attributes on calls, similar
to the existing handling for !nonnull, !align and !range metadata.
2024-04-18 15:27:36 +09:00
Harald van Dijk
60de56c743
[ValueTracking] Restore isKnownNonZero parameter order. (#88873)
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
2024-04-16 15:21:09 +01:00
Yingwei Zheng
e0a628715a
[ValueTracking] Convert isKnownNonZero to use SimplifyQuery (#85863)
This patch converts `isKnownNonZero` to use SimplifyQuery. Then we can
use the context information from `DomCondCache`.

Fixes https://github.com/llvm/llvm-project/issues/85823.
Alive2: https://alive2.llvm.org/ce/z/QUvHVj
2024-04-12 23:47:20 +08:00
Andreas Jonson
9250aedb5c
[SCEV] Add range attribute handling (#88449) 2024-04-12 18:21:48 +09:00
annamthomas
54a9f0007c
[SCEV] Fix BinomialCoefficient Iteration to fit in W bits (#88010)
BinomialCoefficient computes the value of W-bit IV at iteration It of a loop. When W is 1, we can call multiplicative inverse on 0 which triggers an assert since 1b76120.
    
Since the arithmetic is supposed to wrap if It or K does not fit in W bits, do the truncation into W bits after we do the shift.
    
 Fixes #87798
2024-04-10 09:02:23 -04:00
Jay Foad
1b761205f2
[APInt] Add a simpler overload of multiplicativeInverse (#87610)
The current APInt::multiplicativeInverse takes a modulus which can be
any value, but all in-tree callers use a power of two. Moreover, most
callers want to use two to the power of the width of an existing APInt,
which is awkward because 2^N is not representable as an N-bit APInt.

Add a new overload of multiplicativeInverse which implicitly uses
2^BitWidth as the modulus.
2024-04-04 16:11:06 +01:00
Justin Lebar
fab2bb8bfd
Add llvm::min/max_element and use it in llvm/ and mlir/ directories. (#84678)
For some reason this was missing from STLExtras.
2024-03-10 20:00:13 -07:00
Philip Reames
1a37147af5
[SCEV] Match both (-1)b + a and a + (-1)b as a - b (#84247)
In our analysis of guarding conditions, we were converting a-b == 0 into
a == b alternate form, but we were only checking for one of the two
forms for the sub. There's no requirement that the multiply only be on
the LHS of the add.
2024-03-06 15:57:34 -08:00
Philip Reames
0d38f21e4a [SCEV] Extend type hint in analysis output to all backedge kinds
This extends the work from 7755c26 to all of the different backend
taken count kinds that we print for the scev analysis printer.  As
before, the goal is to cut down on confusion as i4 -1 is a very
different (unsigned) value from i32 -1.
2024-03-06 13:08:05 -08:00
Philip Reames
8b5b294ec2 [SCEV] Print predicate backedge count only if new information available
When printing the result of SCEV's analysis, we can avoid printing
the predicated backedge taken count and the predicates if the predicates
are empty and no new information is provided.  This helps to reduce the
verbosity of the output.
2024-03-06 10:24:32 -08:00
Philip Reames
7755c26195 [SCEV] Include type when printing constant max backedge taken count
When printing the result of the analysis, i8 -1 and i64 -1 are quite
different in terms of analysis quality.  In a recent conversion with
a new contributor, we ran into exactly this confusion.

Adding the type for constant scevs more globally seems worthwhile, but
introduces a much larger test diff.  I'm splitting this off first since
it addresses the immediate need, and then going to do some further
changes to clarify a few related bits of analysis result output.
2024-03-06 08:48:25 -08:00
Philip Reames
df9ba13579
[LV] Handle scalable VFs in optimizeForVFAndUF (#82669)
Given a scalable VF of the form <NumElts * VScale>, this patch adds the
ability to discharge a backedge test for a loop whose trip count is
between (NumElts, MinVScale*NumElts).

A couple of notes on this:
* Annoyingly, I could not figure out to write a test for this case. My
attempt is checked in as test32_i8 in f67ef1a, but LV uses a fixed
vector in that case, and ignored the force flags.
* This depends on 9eb5f94f to avoid appearing like a regression. Since
SCEV doesn't know any upper bound on vscale without the vscale_range
attribute (it doesn't query TTI), the ranges overflow on the multiply.
Arguably, this is fixing a bug in the current LV code since in theory
vscale can be large enough to overflow for real, but no actual target is
going to see that case.
2024-03-04 13:49:35 -08:00
Nikita Popov
43dd1e84df [SCEV] Move canReuseInstruction() helper into SCEV (NFC)
To allow reusing it in IndVars.
2024-02-02 16:48:00 +01:00
Philip Reames
e4d01bb227
[SCEV] Special case sext in isKnownNonZero (#77834)
The existing logic in isKnownNonZero relies on unsigned ranges, which
can be problematic when our range calculation is imprecise. Consider the
following:
  %offset.nonzero = or i32 %offset, 1
  -->  %offset.nonzero U: [1,0) S: [1,0)
  %offset.i64 = sext i32 %offset.nonzero to i64
  -->  (sext i32 %offset.nonzero to i64) U: [-2147483648,2147483648)
                                         S: [-2147483648,2147483648)

Note that the unsigned range for the sext does contain zero in this case
despite the fact that it can never actually be zero.

Instead, we can push the query down one level - relying on the fact that
the sext is an invertible operation and that the result can only be zero
if the input is. We could likely generalize this reasoning for other
invertible operations, but special casing sext seems worthwhile.
2024-01-12 07:45:28 -08:00
Simon Pilgrim
3736e1d1cd [SCEV] Ensure shift amount is in range before calling getZExtValue()
Fixes #76234
2023-12-22 14:16:54 +00:00
Nikita Popov
90d82412ea
[SCEV] Use loop guards when checking that RHS >= Start (#75039)
Loop guards tend to provide better results when it comes to reasoning
about ranges than isLoopEntryGuardedByCond(). See the test change for
the motivating case.

I have retained both the loop guard check and the implied cond based
check for now, though the latter only seems to impact a single test and
only via side effects (nowrap flag calculation) at that.
2023-12-12 09:41:54 +01:00
Nikita Popov
ff0e4fb89a
[SCEV] Use or disjoint flag (#74467)
Use the disjoint flag to convert or to add instead of calling the
haveNoCommonBitsSet() ValueTracking query. This ensures that we can
reliably undo add -> or canonicalization, even in cases where the
necessary information has been lost or is too complex to reinfer in
SCEV.

I have updated the bulk of the test coverage to add the necessary
disjoint flags in advance.
2023-12-05 17:01:46 +01:00
Nikita Popov
de176d8c54
[SCEV][LV] Invalidate LCSSA exit phis more thoroughly (#69909)
This an alternative to #69886. The basic problem is that SCEV can look
through trivial LCSSA phis. When the phi node later becomes non-trivial,
we do invalidate it, but this doesn't catch uses that are not covered by
the IR use-def walk, such as those in BECounts.

Fix this by adding a special invalidation method for LCSSA phis, which
will also invalidate all the SCEVUnknowns/SCEVAddRecExprs used by the
LCSSA phi node and defined in the loop.

We should probably also use this invalidation method in other places
that add predecessors to exit blocks, such as loop unrolling and loop
peeling.

Fixes #69097.
Fixes #66616.
Fixes #63970.
2023-11-17 09:34:24 +01:00
Björn Pettersson
8fc0aca5d1
[SCEV] Support larger than 64-bit types in ashr(add(shl(x, n), c), m) (#71600)
In commit 5a9a02f67b771fb2edcf06 scalar evolution got support for
computing SCEV:s for (ashr(add(shl(x, n), c), m)) constructs. The code
however used APInt::getZExtValue without first checking that the APInt
would fit inside an uint64_t. When for example using 128-bit types we
ended up in assertion failures (or maybe miscompiles in non-assert
builds).
This patch simply avoid converting from APInt to uint64_t when creating
the truncated constant. We can just truncate the APInt instead.
2023-11-08 11:29:12 +01:00
Philip Reames
a7f35d54ee
[SCEV] Extend isImpliedCondOperandsViaRanges to independent predicates (#71110)
As far as I can tell, there's nothing in this code which actually
assumes the two predicates in (FoundLHS FoundPred FoundRHS) => (LHS Pred
RHS) are the same.

Noticed while investigating something else, this is purely an
oppurtunistic optimization while I'm looking at the code. Unfortunately,
this doesn't solve my original problem. :)
2023-11-07 07:25:47 -08:00
Nikita Popov
8f76522a61 [SCEV] Remove mul handling from BuildConstantFromSCEV()
We can't support this once mul constant expressions are removed,
and this is not useful in any practical sense (as this code is
primarily intended for GEP expressions).
2023-11-06 16:58:04 +01:00
Nikita Popov
a8ac6a9868 [SCEV] Remove newline after predicates in dump
update_analyze_test_checks.py will now insert check lines for
empty lines, which means that all the existing test coverage will
have a spurious change to check for the newline after "Predicates:".

I don't think we actually want to have that newline, so drop it
before it gets into more test coverage.
2023-11-03 15:43:30 +01:00
Daniil Suchkov
1344b65c90
[SCEV] Fix incorrect NUW inference (#70521)
This patch fixes a miscompile in LSR caused by incorrect inference of
NUW flag for AddRec: we shouldn't infer no-wrap flags based on a
comparison which doesn't fully control the loop exit.
2023-10-31 11:43:57 -07:00
Danila Malyutin
ba1349fc31
[SCEV] Fix "quick and dirty" difference that could lead to assert (#70688)
The old algorithm would remove all operands matching %step SCEV when it
intended to only remove a single one. This lead to assert when
SCEVAddExpr was of the form %step + %step and potential miscompiles in
similar cases. Such SCEVs could be created when construction reached
depth thresholds.

Fixes #70348
2023-10-31 00:50:57 +03:00
Kazu Hirata
f9306f6de3
[ADT] Rename llvm::erase_value to llvm::erase (NFC) (#70156)
C++20 comes with std::erase to erase a value from std::vector.  This
patch renames llvm::erase_value to llvm::erase for consistency with
C++20.

We could make llvm::erase more similar to std::erase by having it
return the number of elements removed, but I'm not doing that for now
because nobody seems to care about that in our code base.

Since there are only 50 occurrences of erase_value in our code base,
this patch replaces all of them with llvm::erase and deprecates
llvm::erase_value.
2023-10-24 23:03:13 -07:00
Nikita Popov
d4300154b6 Revert "[ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)"
This reverts commit b5743d4798b250506965e07ebab806a3c2d767cc.

This causes some minor compile-time impact. Revert for now, better
to do the change more gradually.
2023-10-16 14:04:09 +02:00
Nikita Popov
b5743d4798 [ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)
Remove the old overloads that accept KnownBits by reference, in
favor of those that return it by value.
2023-10-16 13:00:31 +02:00
Nikita Popov
80fa5a6377 [ValueTracking] Use SimplifyQuery in haveNoCommonBitsSet() (NFC)
Pass SimplifyQuery instead of unpacked list of arguments.
2023-10-10 11:39:59 +02:00
Nikita Popov
32ec6d91a1
[SCEV] Make invalidation in SCEVCallbackVH more thorough (#68316)
When a SCEVCallbackVH is RAUWed, we currently do a def-use walk and
remove dependent instructions from the ValueExprMap. However, unlike
SCEVs usual invalidation, this does not forget memoized values.

The end result is that we might end up removing a SCEVUnknown from the
map, while that expression still has users. Due to that, we may later
fail to invalide those expressions. In particular, invalidation of loop
dispositions only does something if there is an expression for the
value, which would not be the case here.

Fix this by using the standard forgetValue() API, instead of rolling a
custom variant.

Fixes https://github.com/llvm/llvm-project/issues/68285.
2023-10-10 10:55:57 +02:00
Nikita Popov
1c3fdb3d1e Revert "[SCEV] Don't invalidate past dependency-breaking instructions"
Unforuntately, the assumption underlying this optimization is
incorrect for getSCEVAtScope(): A SCEVUnknown instruction with
operands that have constant loop exit values can evaluate to
a constant, thus creating a dependency from an "always unknown"
instruction.

Losing this optimization is quite unfortunate, but it doesn't
seem like there is any simple workaround for this.

Fixes #68260.

This reverts commit 3ddd1ffb721dd0ac3faa4a53c76b6904e862b7ab.
2023-10-09 16:35:01 +02:00