This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
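For illustration, a minimal before/after at a hypothetical call site (the
element type and inline size are made up):
```
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallSet.h"

void markSeen(int *P) {
  // Before: compiles only because SmallSet "redirects" to SmallPtrSet
  // for pointer element types.
  llvm::SmallSet<int *, 8> SeenBefore;
  SeenBefore.insert(P);

  // After: names the pointer set directly; behavior is identical.
  llvm::SmallPtrSet<int *, 8> SeenAfter;
  SeenAfter.insert(P);
}
```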
Similarly to https://github.com/llvm/llvm-project/pull/131538, we can
also try to check whether a wrap predicate is known to be false given
the backedge-taken count.
For now, the check is only performed directly when we try to create
predicated AddRecs. This both helps to avoid spending compile time on
optimizations where we know the predicate is false, and can also enable
additional vectorization (e.g. by deciding to scalarize memory accesses
when we would otherwise try to create a predicated AddRec with an
always-false predicate).
The initial version is quite restricted, but can be extended in
follow-ups to cover more cases.
PR: https://github.com/llvm/llvm-project/pull/151134
For the attached test:
Before the loop-idiom pass, we have a store in the inner loop that is
considered simple and has no side effects on the loop. After the
loop-idiom pass, we get a memset in the outer loop that is considered
to introduce side effects on the loop. The backedge-taken count
therefore differs before and after the pass, hence the crash with
-verify-scev.
To fix the issue, we treat non-volatile memory intrinsics as having no
side effects for the purposes of forward progress.
Fixes #149377
Try to push the constant operand into a ZExt:
A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not
unsigned-wrap.
The actual code handles inner adds with an arbitrary number of
operands, hence the getAddExpr in the return.
This helps SCEV reasoning in some cases, commonly when adding an offset
to a zero-extended SCEV that subtracts the same offset.
Note that this is restricted to cases where we can fold away an operand
of the inner Add. This is needed to avoid bad interactions with the
patterns used when forming ZExts, which try to push the ZExt into the
add's operands.
https://alive2.llvm.org/ce/z/q7d303
PR: https://github.com/llvm/llvm-project/pull/151227
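A standalone brute-force check of the fold above for one concrete
instance (A = 5, inner type i8, outer type i32; plain arithmetic, not
the SCEV implementation):
```
#include <cassert>
#include <cstdint>

int main() {
  const uint32_t A = 5; // outer i32 constant; trunc(A) = 5 in i8
  for (uint32_t B = 0; B <= 0xFF; ++B) {
    uint8_t Inner = static_cast<uint8_t>(-A + B); // -A + B in i8
    // Guard: trunc(A) + (-A + B) must not unsigned-wrap in i8.
    bool NoWrap = Inner <= 0xFFu - static_cast<uint8_t>(A);
    if (NoWrap)
      assert(A + static_cast<uint32_t>(Inner) == B); // == zext(B)
  }
  return 0;
}
```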
Relax the NUW requirements for isKnownPredicateViaNoOverflow if the
second operand (Y) is an ADD. The code only simplifies the condition if
C1 < C2, so if the second ADD is NUW, it doesn't matter whether the
first operand also has the NUW flag, as it cannot wrap when C1 < C2.
https://alive2.llvm.org/ce/z/b3dM7N
PR: https://github.com/llvm/llvm-project/pull/149795
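To see why no flag is needed on the first add, the unsigned less-than
case can be brute-forced at i8 (standalone arithmetic, not the SCEV
code):
```
#include <cassert>
#include <cstdint>

int main() {
  for (uint32_t X = 0; X <= 0xFF; ++X)
    for (uint32_t C1 = 0; C1 <= 0xFF; ++C1)
      for (uint32_t C2 = C1 + 1; C2 <= 0xFF; ++C2)
        if (X + C2 <= 0xFF) // the second add (X + C2) is NUW
          // C1 < C2, so X + C1 cannot wrap either and ULT holds,
          // with no flag needed on the first add.
          assert(static_cast<uint8_t>(X + C1) <
                 static_cast<uint8_t>(X + C2));
  return 0;
}
```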
Currently the comparator is inconsistent when we have two external
globals and one internal global, due to
```
if (IsGVNameSemantic(LGV) && IsGVNameSemantic(RGV))
  return LGV->getName().compare(RGV->getName());
```
The internal global compares equal to (not strictly less than) the
external globals, but the external globals do not compare equal to each
other.
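The inconsistency is easy to reproduce outside LLVM; a minimal
standalone sketch (hypothetical types, not the actual comparator):
```
#include <cassert>
#include <string>

struct GV {
  bool NameIsSemantic; // roughly: external linkage
  std::string Name;
};

// Mirrors the problematic logic: only pairs of "semantic-name" globals
// are ordered by name; every other pair compares equal.
int cmp(const GV &L, const GV &R) {
  if (L.NameIsSemantic && R.NameIsSemantic)
    return L.Name.compare(R.Name);
  return 0;
}

int main() {
  GV ExtA{true, "a"}, Int{false, "i"}, ExtB{true, "b"};
  // Int is equivalent to both externals, but the externals are not
  // equivalent to each other: not a strict weak ordering.
  assert(cmp(ExtA, Int) == 0 && cmp(Int, ExtB) == 0 &&
         cmp(ExtA, ExtB) != 0);
  return 0;
}
```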
Fixes #147862.
Update SimplifyICmpOperands to only try subtracting 1 from the RHS
first if the RHS is an op we can fold the subtraction directly into.
Otherwise, try adding 1 to the LHS first, as that lets us preserve NUW
flags.
This improves results in a few cases, including the modified test case
from berkeley-abc and new code to be added in
https://github.com/llvm/llvm-project/pull/128061.
Note that there are more cases where the results can be improved by
better ordering here, which I'll try to investigate as a follow-up.
PR: https://github.com/llvm/llvm-project/pull/144404
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique or writing an easily-invalidable KnownBits
analysis, make the Depth argument in the ValueTracking APIs uniformly
the last argument, with a default value. This will aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
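The shape of the change, as a standalone mock (the real entry points
take DataLayout, AssumptionCache, etc.; only the parameter ordering is
the point here):
```
#include <cstdint>

struct KnownBits { uint64_t Zero = 0, One = 0; };

// Depth is uniformly the last parameter and defaulted, so callers that
// used to pass 0 explicitly can simply drop the argument.
KnownBits computeKnownBitsLike(const void *V, unsigned Depth = 0) {
  (void)V;
  (void)Depth;
  return {};
}

int Dummy;
KnownBits K = computeKnownBitsLike(&Dummy); // Depth omitted
```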
These are identified by misc-include-cleaner. I've filtered out those
that break builds. I'm also staying away from llvm-config.h, config.h,
and Compiler.h, whose removal would likely cause platform- or
compiler-specific build failures.
try_emplace can default-construct values, so we do not need to do so
on our own. Plus, try_emplace(Key) is much simpler/shorter than
insert({Key, LongValueType()}).
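For example (std::map shown here; DenseMap::try_emplace serves the same
purpose):
```
#include <map>
#include <string>
#include <vector>

int main() {
  std::map<std::string, std::vector<int>> M;
  // Before: spells out a default-constructed value.
  M.insert({"a", std::vector<int>()});
  // After: try_emplace default-constructs the value in place.
  M.try_emplace("b");
  return 0;
}
```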
This patch corrects the behavior of the Dependence Analysis for memory
accesses that do not start at the same offset or do not have similar
strides. When offsets or strides cannot be disambiguated at compile
time, DA collects a set of runtime assumptions under which the
dependence test becomes valid. The default remains the same as before
the patch: DA rejects the dependence test as undecidable instead of
collecting runtime assumptions.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
Co-authored-by: Ryotaro Kasuga <kasuga.ryotaro@fujitsu.com>
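As an illustration of the kind of case involved (a made-up kernel, not
from the patch):
```
// The dependence between A[i] and A[i + n] depends on the runtime value
// of n; instead of giving up, DA can record an assumption on n under
// which its dependence test is valid.
void shift(int *A, long n, long len) {
  for (long i = 0; i < len; ++i)
    A[i + n] = A[i] + 1;
}
```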
Introduce m_scev_AffineAddRec to match affine AddRecs, a class_match for
SCEVConstant, and demonstrate their utility in LSR and SCEV. While at
it, rename m_Specific to m_scev_Specific for clarity.
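A usage sketch (matcher names are from the patch; the helper itself is
illustrative):
```
#include "llvm/Analysis/ScalarEvolutionPatternMatch.h"
using namespace llvm;
using namespace llvm::SCEVPatternMatch;

// Returns true if S is an affine AddRec {Start,+,Step} with the given
// Start, binding the step.
static bool isAffineAddRecStartingAt(const SCEV *S, const SCEV *Start) {
  const SCEV *Step;
  return match(S, m_scev_AffineAddRec(m_scev_Specific(Start),
                                      m_SCEV(Step)));
}
```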
The code already guards against values coming from a previous iteration
using properlyDominates(). However, addrecs are considered to properly
dominate the loop they are defined in.
Handle this special case separately, by checking for expressions that
have computable loop evolution (this should cover cases like a zext of
an addrec as well).
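A hedged sketch of the idea (illustrative helper, not the verbatim
fix):
```
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/ScalarEvolution.h"
using namespace llvm;

// An addrec properlyDominates() the loop it is defined in, yet takes a
// different value on every iteration, so expressions with computable
// loop evolution (addrecs, zexts of addrecs, ...) must be excluded
// before treating a dominating expression as fixed at loop entry.
static bool isFixedAtLoopEntry(ScalarEvolution &SE, const SCEV *S,
                               const Loop *L) {
  if (SE.hasComputableLoopEvolution(S, L))
    return false;
  return SE.properlyDominates(S, L->getHeader());
}
```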
I considered changing the definition of properlyDominates() instead,
but decided against it. The current definition is useful in other
contexts, e.g. when deciding whether an expression is safe to expand in
a given block.
Fixes https://github.com/llvm/llvm-project/issues/126012.
Use CmpPredicate::getMatching in isImpliedCondBalancedTypes to pass
samesign information to isImpliedViaOperations, and teach it to call
CmpPredicate::getPreferredSignedPredicate, effectively making it
optimize with samesign information.
This is a followup to #117152. That patch introduced a check for
UB/poison on BEValue. However, the SCEV we're actually going to use is
Shifted. In some cases, it's possible for Shifted to contain UB, while
BEValue doesn't.
In the test case the values are:
BEValue: (-1 * (zext i8 (-83 + ((-83 /u {1,+,1}<%loop>) * {-1,+,-1}<%loop>)) to i32))<nuw><nsw>
Shifted: (-173 + (-1 * (zext i8 ((-83 /u {0,+,1}<%loop>) * {0,+,-1}<%loop>) to i32))<nuw><nsw>)<nuw><nsw>
Fixes https://github.com/llvm/llvm-project/issues/123550.
In preparation to teach implied-cond functions about samesign, migrate
integer-compare predicates that flow through to the functions from
CmpInst::Predicate to CmpPredicate.
When `collectFromBlock` is called without a predecessor (in particular
for loops that don't have a unique predecessor outside the loop) we
never start climbing the predecessor chain, and thus don't mark the
starting block as visited.
Fixes https://github.com/llvm/llvm-project/issues/120615.
When assumptions are present, `Terms.size()` does not actually count
the number of conditions collected from dominating branches; introduce
a separate counter.
Fixes https://github.com/llvm/llvm-project/issues/120237
This patch adds initial matchers for unary and binary SCEV expressions
and specializes them for SExt, ZExt, and binary add expressions.
Also adds matchers for SCEVConstant and SCEVUnknown.
This patch only converts a few instances to use the new matchers to make
sure everything builds as expected for now.
The goal of the matchers is to hopefully make it slightly easier to
write code matching SCEV patterns.
Depends on https://github.com/llvm/llvm-project/pull/119389
PR: https://github.com/llvm/llvm-project/pull/119390
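A small usage sketch (matcher names from the patch; the matched pattern
is illustrative):
```
#include "llvm/Analysis/ScalarEvolutionPatternMatch.h"
using namespace llvm;
using namespace llvm::SCEVPatternMatch;

// Matches S = C + zext(X) for a constant C, binding both parts.
static bool matchConstPlusZExt(const SCEV *S, const SCEVConstant *&C,
                               const SCEV *&X) {
  return match(S, m_scev_Add(m_SCEVConstant(C),
                             m_scev_ZExt(m_SCEV(X))));
}
```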
A SCEVWrapPredicate A implies another SCEVWrapPredicate B if:
* they have the same flag,
* both steps are positive, and
* B's start and step are no greater than A's (ULE for NUSW, SLE for
  NSSW).
See https://alive2.llvm.org/ce/z/n2T4ss (first pair with known constants
as strides, second pair with variable strides).
Note that this is limited to steps of the same size, due to NUSW having
slightly different semantics than regular NUW. We should be able to
remove this restriction for NSSW (which matches NSW) in the future.
PR: https://github.com/llvm/llvm-project/pull/118184