The code already guards against values coming from a previous iteration
using properlyDominates(). However, addrecs are considered to properly
dominate the loop they are defined in.
Handle this special case separately, by checking for expressions that
have computable loop evolution (this should cover cases like a zext of
an addrec as well).
I considered changing the definition of properlyDominates() instead, but
decided against it. The current definition is useful in other context,
e.g. when deciding whether an expression is safe to expand in a given
block.
Fixes https://github.com/llvm/llvm-project/issues/126012.
Use CmpPredicate::getMatching in isImpliedCondBalancedTypes to pass
samesign information to isImpliedViaOperations, and teach it to call
CmpPredicate::getPreferredSignedPredicate, effectively making it
optimize with samesign information.
This is a followup to #117152. That patch introduced a check for
UB/poison on BEValue. However, the SCEV we're actually going to use is
Shifted. In some cases, it's possible for Shifted to contain UB, while
BEValue doesn't.
In the test case the values are:
BEValue: (-1 * (zext i8 (-83 + ((-83 /u {1,+,1}<%loop>) *
{-1,+,-1}<%loop>)) to i32))<nuw><nsw>
Shifted: (-173 + (-1 * (zext i8 ((-83 /u {0,+,1}<%loop>) *
{0,+,-1}<%loop>) to i32))<nuw><nsw>)<nuw><nsw>
Fixes https://github.com/llvm/llvm-project/issues/123550.
In preparation to teach implied-cond functions about samesign, migrate
integer-compare predicates that flow through to the functions from
CmpInst::Predicate to CmpPredicate.
When `collectFromBlock` is called without a predecessor (in particular
for loops that don't have a unique predecessor outside the loop) we
never start climbing the predecessor chain, and thus don't mark the
starting block as visited.
Fixes https://github.com/llvm/llvm-project/issues/120615.
When assumptions are present `Terms.size()` does not actually count the
number of conditions collected from dominating branches; introduce a
separate counter.
Fixes https://github.com/llvm/llvm-project/issues/120237
This patch adds initial matchers for unary and binary SCEV expressions
and specializes it for SExt, ZExt and binary add expressions.
Also adds matchers for SCEVConstant and SCEVUnknown.
This patch only converts a few instances to use the new matchers to make
sure everything builds as expected for now.
The goal of the matchers is to hopefully make it slightly easier to
write code matching SCEV patterns.
Depends on https://github.com/llvm/llvm-project/pull/119389
PR: https://github.com/llvm/llvm-project/pull/119390
A SCEVWrapPredicate A implies B, if
* they have the same flag,
* both steps are positive and
* B's start and step are ULE/SLE (for NSUW/NSSW) than A's.
See https://alive2.llvm.org/ce/z/n2T4ss (first pair with known constants
as strides, second pair with variable strides).
Note that this is limited to steps of the same size, due to NSUW having
slightly different semantics than regular NUW. We should be able to
remove this restriction for NSSW (which matches NSW) in the future.
PR: https://github.com/llvm/llvm-project/pull/118184
Add initial pattern matching for SCEV constants. Follow-up patches will
add additional matchers for various SCEV expressions.
This patch only converts a few instances to use the new matchers to make
sure everything builds as expected for now.
PR: https://github.com/llvm/llvm-project/pull/119389
De-duplicate the functions getSignedPredicate and getUnsignedPredicate,
nearly identical versions of which were present in CmpInst and ICmpInst,
creating less confusion.
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.
The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.
Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
There are a number of places where we call getSmallConstantMaxTripCount
without passing a vector of predicates:
getSmallBestKnownTC
isIndvarOverflowCheckKnownFalse
computeMaxVF
isMoreProfitable
I've changed all of these to now pass in a predicate vector so that
we get the benefit of making better vectorisation choices when we
know the max trip count for loops that require SCEV predicate checks.
I've tried to add tests that cover all the cases affected by these
changes.
Store predicates in ExitLimit and ExitNotTaken in a SmallVector instead
of a SmallPtrSet. This guarantees the predicates can be iterated on in a
predictable manner. This ensures the predicates can be printed and
generated in a predictable order.
This shifts de-duplication of predicates to construction time for
ExitLimit. ExitNotTaken just takes predicates from ExitLimit, so they
should also be free of duplicates.
This was exposed by 2f7ccaf4a8565628a4c7d2b5a49bb45478940be6
(https://github.com/llvm/llvm-project/pull/108777).
Should fix https://lab.llvm.org/buildbot/#/builders/110/builds/1494.
This time with 100% more building unit tests. Original commit message follows.
[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)
If we use SmallDenseMaps instead of DenseMaps at these locations,
we get a substantial speedup because there's less spurious malloc
traffic. Discovered by instrumenting DenseMap with some accounting
code, then selecting sites where we'll get the most bang for our buck.
If we use SmallDenseMaps instead of DenseMaps at these locations,
we get a substantial speedup because there's less spurious malloc
traffic. Discovered by instrumenting DenseMap with some accounting
code, then selecting sites where we'll get the most bang for our buck.
Currently if a loop contains loads that we can prove at compile time
are dereferenceable when certain conditions are satisfied the function
isDereferenceableAndAlignedInLoop will still return false because
getSmallConstantMaxTripCount will return 0 when SCEV predicates
are required. This patch changes getSmallConstantMaxTripCount to take
an optional Predicates pointer argument so that we can permit
functions such as isDereferenceableAndAlignedInLoop to consider more
cases.
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
There are a few places in ScalarEvolution.cpp where we copy predicates
from one list to another and they have a similar pattern:
for (const auto *P : ENT.Predicates)
Predicates->push_back(P);
We can avoid the loop by writing them like this:
Predicates->append(ENT.Predicates.begin(), ENT.Predicates.end());
which may end up being more efficient since we only have to try
reserving more space once.
Due to a reviewer request on PR #88385 I have created this patch
to add a getPredicatedExitCount function, which is similar to
getExitCount except that it uses the predicated backedge taken
information. With PR #88385 we will start to care about more
loops with multiple exits, and want the ability to query exit
counts for a particular exiting block. Such loops may require
predicates in order to be vectorised.
New tests added here:
Analysis/ScalarEvolution/predicated-exit-count.ll
Whilst dealing with review comments on
https://github.com/llvm/llvm-project/pull/96752
I discovered that SCEV does not know about the dereferenceable attribute
on function arguments so I have updated getRangeRef to make use of it
by calling getPointerDereferenceableBytes.