We are replacing a narrow IV increment with a wider one. If the original
(narrow) increment did not wrap, the wider one should not wrap either.
Set the flags to be the union of both wide increment and original
increment; this ensures we preserve flags SCEV could infer for the wider
increment.
Fixes https://github.com/llvm/llvm-project/issues/71517.
As far as I can tell, there's nothing in this code which actually
assumes the two predicates in (FoundLHS FoundPred FoundRHS) => (LHS Pred
RHS) are the same.
Noticed while investigating something else, this is purely an
oppurtunistic optimization while I'm looking at the code. Unfortunately,
this doesn't solve my original problem. :)
zext nneg was recently added to the IR in #67982. This patch teaches
SimplifyIndVars to prefer zext nneg over *both* sext and plain zext,
when a local SCEV query indicates the source is non-negative.
The choice to prefer zext nneg over sext looks slightly aggressive
here, but probably isn't so much in practice. For cases where we'd
"remember" the range fact, instcombine would convert the sext into
a zext nneg anyways. The only cases where this produces a different
result overall are when SCEV knows a non-local fact, and it doesn't
get materialized into the IR. Those are exactly the cases where
using zext nneg are most useful. We do run the risk of e.g. a
missing combine - since we haven't updated most of them yet - but
that seems like a manageable risk.
Note that there are much deeper algorithmic changes we could make
to this code to exploit zext nneg, but this seemed like a reasonable
and low risk starting point.
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: barannikov88, kwk
Differential Revision: https://reviews.llvm.org/D150762
The motivation is that 'createInvariantCond' unconditionally
builds icmp in the loop block, while it could always do it
in preheader. Build it in preheader instead.
Patch by Aleksandr Popov!
Differential Revision: https://reviews.llvm.org/D141994
Reviewed By: nikic
When exit by condition `C1` dominates exit by condition `C2`, and
max symbolic exit count for `C1` matches those for loop, we will
apply more optimistic logic to `C2` by setting `SkipLastIter` for it,
meaning that it will do 1 iteration less because the dominating branch
must exit on the last loop iteration.
But when we have a single exit by condition `C1 & C2`, we cannot
apply the same logic, because there is no dominating condition.
However, if we can prove that the symbolic max exit count of `C1 & C2`
matches those of `C1`, it means that for `C2` we can assume that it
doesn't matter on the last iteration (because the whole thing is `false`
because `C1` must be `false`). Therefore, in this situation, we can handle
`C2` as if it had `SkipLastIter`.
Differential Revision: https://reviews.llvm.org/D139934
Reviewed By: nikic
Recent improvements in symbolic exit count computation revealed some problems with
SCEV's ability to find invariant predicate during first iterations. Ultimately it is based on its
ability to prove some facts for value on the last iteration. This last value, when it includes
`umin` as part of exit count, isn't always simplified enough. The motivating example is following:
https://github.com/llvm/llvm-project/issues/59615
Could not prove:
```
Pred = 36, LHS = (-1 + (-1 * (2147483645 umin (-1 + %var)<nsw>))<nsw> + %var), RHS = %var
FoundPred = 36, FoundLHS = {1,+,1}<nuw><nsw><%bb3>, FoundRHS = %var
```
Can prove:
```
Pred = 36, LHS = (-1 + (-1 * (-1 + %var)<nsw>)<nsw> + %var), RHS = %var
FoundPred = 36, FoundLHS = {1,+,1}<nuw><nsw><%bb3>, FoundRHS = %var
```
Here ` (2147483645 umin (-1 + %var)<nsw>)` is exit count composed of two parts from
two different exits: `2147483645 ` and `(-1 + %var)<nsw>`. When it was only one (latter)
analyzeable exit, for it everything was easily provable. Unfortunately, in general case `umin`
in one of `add`'s operands doesn't guarantee that the whole sum reduces, especially in presence
of negative steps and lack of `nuw`. I don't think there is a generic legal way to somehow play
around this `umin`.
So the ad-hoc solution is following: if we failed to find an equivalent predicate that is invariant
during first `MaxIter` iterations, and `MaxIter = umin(a, b, c...)`, try to find solution for at least one
of `a`, `b`, `c`... Because they all are `uge` than `MaxIter`, whatever is true during `a (b, c)` iterations
is also true during `MaxIter` iterations.
Differential Revision: https://reviews.llvm.org/D140456
Reviewed By: nikic
These limitations are too strict, and their only purpose is to avoid code
size explosion. These restrictions seem obsolete, and the size problem
is solved in other places through cheap expansion limits.
The motivation is that the old code cannot deal with comparisons against
induction variant's increment.
Differential Revision: https://reviews.llvm.org/D138412
Reviewed By: lebedev.ri, reames
Instruction being hoisted could have nuw/nsw flags inferred from the old
context, and we cannot simply move it to the new location keeping them
because we are going to introduce new uses to them that didn't exist before.
Example in https://github.com/llvm/llvm-project/issues/57187 shows how
this can produce branch by poison from initially well-defined program.
This patch forcefully recomputes poison-generating flag in the new context.
Differential Revision: https://reviews.llvm.org/D132022
Reviewed By: fhahn, nikic
This reverts commit 354fa0b48008eca701a110badd6974bf449df257.
Returning as is. The patch was reverted due to a miscompile, but
this patch is not causing it. This patch made it possible to infer
some nuw flags in code guarded by `false` condition, and then someone
else to managed to propagate the flag from dead code outside.
Returning the patch to be able to reproduce the issue.
This reverts commit 34ae308c73e4d76dbdab25a6206d3fbc5ebafdf5.
Our internal testing found a miscompile. Not sure if it's caused by
this patch or it revealed something else. Reverting while investigating.
Sometimes SCEV cannot infer nuw/nsw from something as simple as
```
len in [0, MAX_INT]
...
iv = phi(0, iv.next)
guard(iv <s len)
guard(iv <u len)
iv.next = iv + 1
```
just because flag strenthening only relies on definition and does not use local facts.
This patch adds support for the simplest case: inference of flags of `add(x, constant)`
if we can contextually prove that `x <= max_int - constant`.
In case if it has negative CT impact, we can add an option to switch it off. I woudln't
expect that though.
Differential Revision: https://reviews.llvm.org/D129643
Reviewed By: apilipenko
After replacing a loop phi with the preheader value, it's usually
possible to simplify some of the using instructions, so do that as
part of replaceLoopPHINodesWithPreheaderValues().
Doing this as part of IndVars is valuable, because it may make GEPs
in the loop have constant offsets and allow the following SROA run
to succeed (as demonstrated in the PhaseOrdering test).
Differential Revision: https://reviews.llvm.org/D129293
Currently we only call replaceLoopPHINodesWithPreheaderValues() if
optimizeLoopExits() replaces the exit with an unconditional exit.
However, it is very common that this already happens as part of
eliminateIVComparison(), in which case we're leaving behind the
dead header phi.
Tweak the early bailout for already-constant exits to also call
replaceLoopPHINodesWithPreheaderValues().
Differential Revision: https://reviews.llvm.org/D129214
Currently, we hardly ever actually run SCEV verification, even in
tests with -verify-scev. This is because the NewPM LPM does not
verify SCEV. The reason for this is that SCEV verification can
actually change the result of subsequent SCEV queries, which means
that you see different transformations depending on whether
verification is enabled or not.
To allow verification in the LPM, this limits verification to
BECounts that have actually been cached. It will not calculate
new BECounts.
BackedgeTakenInfo::getExact() is still not entirely readonly,
it still calls getUMinFromMismatchedTypes(). But I hope that this
is not problematic in the same way. (This could be avoided by
performing the umin in the other SCEV instance, but this would
require duplicating some of the code.)
Differential Revision: https://reviews.llvm.org/D120551
For unreachable loops, any BECount is legal, and since D98706 SCEV
can make use of this for loops that are unreachable due to constant
branches. To avoid false positives, adjust SCEV verification to only
check BECounts in reachable loops.
Fixes https://github.com/llvm/llvm-project/issues/50523.
Differential Revision: https://reviews.llvm.org/D120651
This reverts commit 7cd273c339cfe8427404f881ae280bd9fae6ff78.
Several patches with tests fixes have been applied:
0cada82f0a30e5ae22dce66b58604ab9b47a3897 "[Test] Remove incorrect test in GVN"
97cb13615d6d9df254e3c0f3deef9eaedfe189b6 "[Test] Separate IndVars test into AArch64 and X86 parts"
985cc490f17d28b20392ee214895d947b85120ef "[Test] Remove separated test in IndVars",
and test failures caused by 5ec2386 should be resolved now.
The widen-loop-comp.ll in indvars has a target triple with
specified aarch64 architecture. This caused test failures with
db28934 "[IndVars] Pass TTI to replaceCongruentIVs" applied, because
with the patch indvars performed some target-specific
transforms, and for example if a build supported only X86,
then indvars would not have applied those transforms.
However, the checks in the test were generated as for aarch64.
Thus the test failures on such builds.
This patch separates widen-loop-comp.ll into two parts.
The first one is intended to be run only if a build supports aarch64.
This is now in AArch64 directory with a lit config.
The second one was added recently to show db28934 improvements.
This one is now in X86 directory.
This patch should resolve build issues caused by
5ec23863320ca12bfabb6dcff1d0425cb614b7a5.
This reapplies patch db289340c841990055a164e8eb2a3b5ff25677bf.
The test failures on build with expensive checks caused by the patch happened due
to the fact that we sorted loop Phis in replaceCongruentIVs using llvm::sort,
which shuffles the given container if the expensive checks are enabled,
so equivalent Phis in the sorted vector had different mutual order from run
to run. replaceCongruentIVs tries to replace narrow Phis with truncations
of wide ones. In some test cases there were several Phis with the same
width, so if their order differs from run to run, the narrow Phis would
be replaced with a different Phi, depending on the shuffling result.
The patch ae14fae0ff4304022beda5ab484f84ac0fdda807 fixed this issue by
replacing llvm::sort with llvm::stable_sort.
In IndVarSimplify after simplifying and extending loop IVs we call 'replaceCongruentIVs'.
This function optionally takes a TTI argument to be able to replace narrow IVs uses
with truncates of the widest one.
For some reason the TTI wasn't passed to the function, so it couldn't perform such
transform.
This patch fixes it.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D113024
Only the most recent cpus support really 1cy 64-bit multiplies, and the X64 cost table represents a realistic worst case. The 1cy value was also discouraging vectorization when most vXi64 PMULDQ expansions aren't actually slower than scalarization.
Noticed while investigating PR51436.
Since d6de1e1a71406c75a4ea4d5a2fe84289f07ea3a1, no attributes is quivalent to
setting attribute to false.
This is a preliminary commit for https://reviews.llvm.org/D99080