The integer case in genLoopLimit reduces down to a special case for narrowing the bitwidth of the limit, and then performing the same expansion we would for a pointer IV.
Differential Revision: https://reviews.llvm.org/D146638
Note that the comments being removed appear to be very out of sync with the actual code in question.
Differential Revision: https://reviews.llvm.org/D146468
This option is switched off by default, and it seems that it has never worked correctly.
What it basically does is: it remembers current BECount SCEV, and after all transforms
tries to validate some facts for it. However, between these two points this SCEV may
become invalid (e.g. because some SCEVUnknown it references may be deleted as dead
code). So basically it may work with broken pointers.
Besides, its implementation does strange things (e.g. forgetLoop) which are invasive and
may affect behavior in other parts of the system (specifically verification), concealing some
other problems. Another issue is that it may use SCEVCouldNotCompute object without
checking this.
The option is not used in any unit tests, and if switched on by default, the following tests
fail:
```
********************
Failed Tests (14):
LLVM :: Transforms/IndVarSimplify/2005-06-15-InstMoveCrash.ll
LLVM :: Transforms/IndVarSimplify/2007-06-06-DeleteDanglesPtr.ll
LLVM :: Transforms/IndVarSimplify/2008-10-03-CouldNotCompute.ll
LLVM :: Transforms/IndVarSimplify/2009-05-24-useafterfree.ll
LLVM :: Transforms/IndVarSimplify/2011-10-27-lftrnull.ll
LLVM :: Transforms/IndVarSimplify/ARM/code-size.ll
LLVM :: Transforms/IndVarSimplify/X86/deterministic-scev-verify.ll
LLVM :: Transforms/IndVarSimplify/X86/pr57187.ll
LLVM :: Transforms/IndVarSimplify/X86/verify-scev.ll
LLVM :: Transforms/IndVarSimplify/bbi-63564.ll
LLVM :: Transforms/IndVarSimplify/invalidate-modified-lcssa-phi.ll
LLVM :: Transforms/IndVarSimplify/loop-predication.ll
LLVM :: Transforms/IndVarSimplify/post-inc-range.ll
LLVM :: Transforms/IndVarSimplify/turn-to-invariant.ll
********************
Unexpectedly Passed Tests (1):
LLVM :: Transforms/IndVarSimplify/pr55689.ll
```
None of these looks like real problems found by verification, these are
bugs in the verifying code itself (such as use of deleted SCEVs and
SCEVCouldNotCompute's).
I think it all gives enough justification for its removal.
https://github.com/llvm/llvm-project/issues/61302
Differential Revision: https://reviews.llvm.org/D145894
Reviewed By: nikic
DFAJumpThreading
JumpThreading
LibCallsShrink
LoopVectorize
SLPVectorizer
DeadStoreElimination
AggressiveDCE
CorrelatedValuePropagation
IndVarSimplify
These are part of the optimization pipeline, of which the legacy version is deprecated and being removed.
The motivation is that 'createInvariantCond' unconditionally
builds icmp in the loop block, while it could always do it
in preheader. Build it in preheader instead.
Patch by Aleksandr Popov!
Differential Revision: https://reviews.llvm.org/D141994
Reviewed By: nikic
This patch does two things, both related to support of multi-exit loops with
many exits that have known symbolic max exit count. They can theoretically
go independently, but I don't know how to write a test showing separate
impact.
Part 1: `SkipLastIter` can be set to `true` not when a particular exit has exit
count same as the whole loop (and therefore it must exit on the last iteration),
but when the aggregate of first few exits has umin same as whole loop exit count.
It means that it's not known which of them will exit exactly, but one of them will.
Part 2: when `SkipLastIter` is set, and exit count is `umin(a, b, c)`, instead of
`umin(a, b, c) - 1` use `umin(a - 1, b - 1, c - 1)`. We don't care about overflows
here, but the further logic knows how to deal with umin by element, but the
`SCEVAddExpr` node will confuse it.
Differential Revision: https://reviews.llvm.org/D141361
Reviewed By: nikic
When exit by condition `C1` dominates exit by condition `C2`, and
max symbolic exit count for `C1` matches those for loop, we will
apply more optimistic logic to `C2` by setting `SkipLastIter` for it,
meaning that it will do 1 iteration less because the dominating branch
must exit on the last loop iteration.
But when we have a single exit by condition `C1 & C2`, we cannot
apply the same logic, because there is no dominating condition.
However, if we can prove that the symbolic max exit count of `C1 & C2`
matches those of `C1`, it means that for `C2` we can assume that it
doesn't matter on the last iteration (because the whole thing is `false`
because `C1` must be `false`). Therefore, in this situation, we can handle
`C2` as if it had `SkipLastIter`.
Differential Revision: https://reviews.llvm.org/D139934
Reviewed By: nikic
Similar to how `makeArrayRef` is deprecated in favor of deduction guides, do the
same for `makeMutableArrayRef`.
Once all of the places in-tree are using the deduction guides for
`MutableArrayRef`, we can mark `makeMutableArrayRef` as deprecated.
Differential Revision: https://reviews.llvm.org/D141814
This patch allows optimizeLoopExitWithUnknownExitCount to deal with
branches by conditions that are not immediately ICmp's, but aggregates
of ICmp's joined by arithmetic or logical AND/OR. Each ICmp is optimized
independently.
Differential Revision: https://reviews.llvm.org/D139832
Reviewed By: nikic
This patch updates propgatesPoison to take a Use as argument and
propagatesPoison now returns true if the passed in operand causes the
user to yield poison if the operand is poison
This allows propagating poison if the condition of a select is poison.
This helps improve results for programUndefinedIfUndefOrPoison.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D111643
There was a crippled version of this transform for Inverted predicate, so the same
query was done twice. Advanced version of this transform wasn't implemented for
inverted condition. Thus, the code was hard to read. The only real purpose of the
Inverted param was to make a simple isKnownPredicateAt query.
Instead if this, use evaluatePredicateAt to solve the task for both inverted and
non-inverted predicate. This slightly changes the order of queries, but effectively
it should save some time by avoiding duplicating queries, and simplifies the code a lot.
I also could not find any evidence that we ever eliminate anything with Inverted = true,
but conservatively preserved the current behavior. Maybe we can remove it and save
some compile time.
Differential Revision: https://reviews.llvm.org/D139814
Reviewed By: nikic
Old logic: when loop symbolic max exit count matched *exact* block exit count,
assume that all subsequent blocks will do 1 iteration less.
New logic: when loop symbolic max exit count matched *symbolic max* block exit count,
assume that all subsequent blocks will do 1 iteration less.
The new logic is still legal and is more permissive in situations when exact
block exit count is not known.
Differential Revision: https://reviews.llvm.org/D139692
Reviewed By: nikic
Additional SCEV verification highlighted a case where the cached loop
dispositions where incorrect after simplifying a phi node in IndVars.
Fix it by invalidating the phi before replacing it.
Fixes#58750
Moving an instruction can invalidate the cached block dispositions of
the corresponding SCEV. Invalidate the cached dispositions.
Also fixes a copy-paste error in forgetBlockAndLoopDispositions where
the start expression S was removed from BlockDispositions in the loop
but not the current values. This was also exposed by the new test case.
Fixes#58439.
isSafeToExpand() for addrecs depends on whether the SCEVExpander
will be used in CanonicalMode. At least one caller currently gets
this wrong, resulting in PR50506.
Fix this by a) making the CanonicalMode argument on the freestanding
functions required and b) adding member functions on SCEVExpander
that automatically take the SCEVExpander mode into account. We can
use the latter variant nearly everywhere, and thus make sure that
there is no chance of CanonicalMode mismatch.
Fixes https://github.com/llvm/llvm-project/issues/50506.
Differential Revision: https://reviews.llvm.org/D129630
After replacing a loop phi with the preheader value, it's usually
possible to simplify some of the using instructions, so do that as
part of replaceLoopPHINodesWithPreheaderValues().
Doing this as part of IndVars is valuable, because it may make GEPs
in the loop have constant offsets and allow the following SROA run
to succeed (as demonstrated in the PhaseOrdering test).
Differential Revision: https://reviews.llvm.org/D129293
Currently we only call replaceLoopPHINodesWithPreheaderValues() if
optimizeLoopExits() replaces the exit with an unconditional exit.
However, it is very common that this already happens as part of
eliminateIVComparison(), in which case we're leaving behind the
dead header phi.
Tweak the early bailout for already-constant exits to also call
replaceLoopPHINodesWithPreheaderValues().
Differential Revision: https://reviews.llvm.org/D129214
Remove the assertion about the pointer element type, only check
that the stride is one. Ultimately, the actual pointer type here
doesn't matter, because SCEVExpander would insert appropriate
casts if necessary.
rGf39978b84f1d3a1da6c32db48f64c8daae64b3ad led to and/or exposed
an issue with IndVarSimplification for a loop where a i32 phi node is
no longer replaced by a widened (i64) phi node, because the SCEVs of a
sign-extend no longer folded the same way. I'm unsure how to properly
explain this because it's all rather complicated, but in short: SCEVs
don't fold as nicely as they used to and this caused a difference.
While investigating this, I found that IndVarSimplify can actually
optimise the case in the way we want to if it chooses the widened IV to
be 'signed' (the i32 IV is both sign and zero-extended). Oddly enough,
there is some level of indeterminism in the way the algorithm works,
it just picks the sign of the 'first' zext/sext user, where the order of
the users-iterator is not guaranteed to be the same on each invocation
of the pass (e.g. shown by first running loop-rotate, which puts the
users in a different order).
While I think the fix is valid in the sense that consistently picking
_any_ order is better than having an nondeterministic order, I can
use a bit of advice from people more familiar in this area of the
code-base.
For example, I'm not sure if this fix is hiding another issue where the
IndVarSimplify pass could actually draw the same conclusions (i.e. that
it only needs an i64 phi node) if it does a bit more work, regardless
of whether it chooses the induction variable to be signed or unsigned.
I'm also not sure if choosing signed is better than unsigned, or whether
that just happens to be beneficial only in this individual case.
Any feedback would be much appreciated!
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D112573
This reverts commit 7cd273c339cfe8427404f881ae280bd9fae6ff78.
Several patches with tests fixes have been applied:
0cada82f0a30e5ae22dce66b58604ab9b47a3897 "[Test] Remove incorrect test in GVN"
97cb13615d6d9df254e3c0f3deef9eaedfe189b6 "[Test] Separate IndVars test into AArch64 and X86 parts"
985cc490f17d28b20392ee214895d947b85120ef "[Test] Remove separated test in IndVars",
and test failures caused by 5ec2386 should be resolved now.
This reapplies patch db289340c841990055a164e8eb2a3b5ff25677bf.
The test failures on build with expensive checks caused by the patch happened due
to the fact that we sorted loop Phis in replaceCongruentIVs using llvm::sort,
which shuffles the given container if the expensive checks are enabled,
so equivalent Phis in the sorted vector had different mutual order from run
to run. replaceCongruentIVs tries to replace narrow Phis with truncations
of wide ones. In some test cases there were several Phis with the same
width, so if their order differs from run to run, the narrow Phis would
be replaced with a different Phi, depending on the shuffling result.
The patch ae14fae0ff4304022beda5ab484f84ac0fdda807 fixed this issue by
replacing llvm::sort with llvm::stable_sort.
In IndVarSimplify after simplifying and extending loop IVs we call 'replaceCongruentIVs'.
This function optionally takes a TTI argument to be able to replace narrow IVs uses
with truncates of the widest one.
For some reason the TTI wasn't passed to the function, so it couldn't perform such
transform.
This patch fixes it.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D113024
This relaxes the one-use requirement on the rotation transform specifically for the case where we know we're zexting an IV of the loop. This allows us to discover trip count information in SCEV, which seems worth a single extra loop invariant truncate. Honestly, I'd prefer if SCEV could just compute the trip count directly (e.g. D109457), but this unblocks practical benefit.