Move SCEVPtrToIntSinkingRewriter out of getLosslessPtrToIntExpr to be
re-used for PtrToAddr. Also streamline code in getLosslessPtrToIntExpr
by moving zero handling to the rewriter and removing special handling
for SCEVUnknown in getLosslessPtrToIntExpr. Instead, always use the
rewriter, which will automatically handle the case where the expression
is a SCEVUnknown.
This makes it slightly easier to add support for PtrToAddr as a follow-up
to https://github.com/llvm/llvm-project/pull/158032
PR: https://github.com/llvm/llvm-project/pull/174435
Split out from https://github.com/llvm/llvm-project/pull/171456.
This explicitly allows implicit truncation in a number of places,
prior to switching the default. This limits the scope of the
initial change.
getSCEVExprForVPValue is used to create SCEVs for expressions from the
original loop, which may be predicated. Use PSE to construct predicated
SCEVs if possible. This matches the legacy LV code behavior.
This should currently be NFC, but it will enable migrating more
SCEV/cost-based computations to VPlan.
The patch requires exposing a new getPredicatedSCEV helper to
PredicatedScalarEvolution which just takes a SCEV, to avoid needing to
go through IR values, which isn't an option for getSCEVExprForVPValue.
This is a slightly different API than ConstantRange's
areInsensitiveToSignednessOfICmpPredicate. The only actual difference
(beyond naming) is the handling of empty ranges (i.e. unreachable code).
I wanted to keep the existing SCEV behavior for the unreachable code as
we should be folding that to poison, not reasoning about samesign. I
tried the other variant locally, and saw no test changes.
At the moment, the effectiveness of guards that contain divisibility
information (A % B == 0) depends on the order of the conditions.
This patch makes using divisibility information independent of the
order, by collecting and applying the divisibility information
separately.
We first collect all conditions in a vector, then collect the
divisibility information from all guards.
When processing other guards, we apply divisibility info collected
earlier.
After all guards have been processed, we apply the divisibility info to
the existing rewrite expressions. This ensures we apply the divisibility
info to the largest rewrite expression.
This helps to improve results in a few cases, one in
https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2921 and another one
in a different large C/C++ based IR corpus.
PR: https://github.com/llvm/llvm-project/pull/163021
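The two-pass idea described above can be sketched on plain integers. This is
only a toy model, not the SCEV implementation: the Guard struct and
maxValueUnderGuards are hypothetical names, and guards are assumed to be
either "X % D == 0" or "X <= Bound" on a single unsigned value X.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical toy guard on a single unsigned value X; not an LLVM type.
struct Guard {
  enum Kind { DivisibleBy, UpperBound } K;
  uint64_t Value; // divisor for DivisibleBy, inclusive bound for UpperBound
};

// Largest value X can take under all guards, independent of guard order.
uint64_t maxValueUnderGuards(const std::vector<Guard> &Guards) {
  // Pass 1: collect the divisibility information from all guards.
  uint64_t Divisor = 1;
  for (const Guard &G : Guards) {
    if (G.K == Guard::DivisibleBy) {
      assert(G.Value != 0 && "divisor must be non-zero");
      Divisor = std::lcm(Divisor, G.Value);
    }
  }

  // Pass 2: process the remaining guards (here: take the tightest bound).
  uint64_t Max = UINT64_MAX;
  for (const Guard &G : Guards)
    if (G.K == Guard::UpperBound && G.Value < Max)
      Max = G.Value;

  // Finally apply the collected divisibility info to the combined result by
  // rounding down to the previous multiple of the divisor.
  return Max - Max % Divisor;
}
```

With the guards "X % 4 == 0" and "X <= 10", the sketch returns 8 regardless of
which guard comes first in the vector.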
Move getPreviousSCEVDivisibleByDivisor from a lambda to a static
function and clarify the name (DividesBy -> DivisibleBy).
Split off refactoring from https://github.com/llvm/llvm-project/pull/163021.
Add a new variant of m_scev_Mul that binds a SCEVMulExpr and use it in
SCEVURem_match and also update 2 more places in ScalarEvolution.cpp that
can use m_scev_Mul as well.
PR: https://github.com/llvm/llvm-project/pull/163364
Follow-up to https://github.com/llvm/llvm-project/pull/160941.
Even if we don't have a context instruction for the caller, we should be
able to provide context instructions for SCEVUnknowns. Unless I am
missing something, a SCEVUnknown only becomes available at the point where
its underlying IR instruction has been defined. If it is an argument, it
should be safe to use the first instruction in the entry block or the
instruction itself if it wraps an instruction.
This allows getConstantMultiple to make better use of alignment
assumptions.
PR: https://github.com/llvm/llvm-project/pull/163260
My use case is simplifying the control flow generated by LoopVectorize
when vectorising loops whose trip count is a function of the runtime
vector length. This can be problematic because:
* CSE is a pre-LoopVectorize transform and so it's common for an IR
function to include several calls to llvm.vscale(). (NOTE: Code
generation will typically remove the duplicates)
* Pre-LoopVectorize instcombines will rewrite some multiplies as shifts.
This leads to a mismatch between VL based maths of the scalar loop and
that created for the vector loop, which prevents some obvious
simplifications.
SCEV does not suffer these issues because it effectively does CSE during
construction and shifts are represented as multiplies.
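As an illustration of why uniquing during construction helps, here is a toy
expression builder. It is a sketch under assumptions, not SCEV itself:
ExprBuilder and its member names are hypothetical, and only constants,
vscale and multiplies are modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <utility>

// Hypothetical toy expression builder; names are illustrative, not LLVM's.
class ExprBuilder {
public:
  using Id = int; // structurally identical expressions share one Id

  Id constant(uint64_t V) { return intern(Const, V, 0); }
  Id vscale() { return intern(VScale, 0, 0); }

  Id mul(Id L, Id R) {
    if (L > R)
      std::swap(L, R); // canonical operand order
    return intern(Mul, static_cast<uint64_t>(L), static_cast<uint64_t>(R));
  }

  // A left shift by a constant is built as a multiply by a power of two.
  Id shl(Id L, unsigned Amount) {
    assert(Amount < 64 && "shift amount out of range");
    return mul(L, constant(uint64_t(1) << Amount));
  }

private:
  enum Kind { Const, VScale, Mul };
  std::map<std::tuple<Kind, uint64_t, uint64_t>, Id> Unique;

  // Return the existing Id for this node, or create a fresh one.
  Id intern(Kind K, uint64_t A, uint64_t B) {
    auto Key = std::make_tuple(K, A, B);
    auto It = Unique.find(Key);
    if (It != Unique.end())
      return It->second;
    Id New = static_cast<Id>(Unique.size());
    Unique.emplace(Key, New);
    return New;
  }
};
```

In this model, "vscale * 4" and "vscale << 2" end up as the same node, and
repeated vscale() calls return the same Id, so the mismatches listed above
do not arise.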
When adding a new predicate to a union, we currently do a bidirectional
implication for all the contained predicates. This means that the number
of implication checks is quadratic in the number of total predicates (if
they don't end up being eliminated).
Fix this by not checking for implication if the number of predicates
grows too large. The expectation is that if there is a large number of
predicates, we should be discarding them later anyway, as expanding them
would be too expensive.
Fixes https://github.com/llvm/llvm-project/issues/156114.
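A minimal sketch of the capped implication check, using a toy predicate
"X < Bound" in place of SCEV predicates. LessThan, PredicateUnion and the
CheckLimit threshold are hypothetical; the actual limit used by the patch
is not stated in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy predicate "X < Bound"; not LLVM's SCEVPredicate.
struct LessThan {
  unsigned Bound;
  // "X < A" implies "X < B" whenever A <= B.
  bool implies(const LessThan &Other) const { return Bound <= Other.Bound; }
};

// All contained predicates must hold (a conjunction).
class PredicateUnion {
  std::vector<LessThan> Preds;
  std::size_t CheckLimit; // assumed threshold, chosen by the caller

public:
  explicit PredicateUnion(std::size_t Limit) : CheckLimit(Limit) {}

  void add(LessThan New) {
    // Only pay for the bidirectional implication checks while the union is
    // small; past the limit, just append.
    if (Preds.size() < CheckLimit) {
      // New is redundant if an existing predicate already implies it.
      for (const LessThan &P : Preds)
        if (P.implies(New))
          return;
      // Existing predicates implied by New become redundant.
      std::vector<LessThan> Kept;
      for (const LessThan &P : Preds)
        if (!New.implies(P))
          Kept.push_back(P);
      Preds = Kept;
    }
    Preds.push_back(New);
  }

  std::size_t size() const { return Preds.size(); }
};
```

Each add() does at most a linear scan while the union is below the limit and
constant work afterwards, so the total cost no longer grows quadratically
with the number of predicates.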
Reverts llvm/llvm-project#157656
There are multiple reports that this is causing miscompiles in the MSan
test suite after bootstrapping and that this is causing miscompiles in
rustc. Let's revert for now, and work to capture a reproducer next week.
If we have a phi where one of its source blocks is an unreachable
block, we don't want to traverse back into the unreachable region. Doing
so allows e.g. finding a trivial self loop when walking back the
predecessor chain.
This reverts commit f0df1e3dd4ec064821f673ced7d83e5a2cf6afa1.
Recommit with extra check for SCEVCouldNotCompute. Test has been added in
b16930204b.
Original message:
Remove the fall-back to constant max BTC if the backedge-taken-count
cannot be computed.
The constant max backedge-taken count is computed considering loop
guards, so to avoid regressions we need to apply loop guards as needed.
Also remove the special handling for Mul in willNotOverflow, as this
should no longer be needed after 914374624f
(https://github.com/llvm/llvm-project/pull/155300).
PR: https://github.com/llvm/llvm-project/pull/155672
Remove the fall-back to constant max BTC if the backedge-taken-count
cannot be computed.
The constant max backedge-taken count is computed considering loop
guards, so to avoid regressions we need to apply loop guards as needed.
Also remove the special handling for Mul in willNotOverflow, as this
should no longer be needed after 914374624f
(https://github.com/llvm/llvm-project/pull/155300).
PR: https://github.com/llvm/llvm-project/pull/155672
Add support for identifying multiplication overflow in SCEV.
This is needed in LoopAccessAnalysis and that limitation was worked
around by 484417a.
This allows early-exit vectorization to work as expected in the
vect.stats.ll test without needing the workaround.
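For illustration, one common way to decide whether a multiplication
overflows is to redo it in a type twice as wide. The sketch below does this
on plain 32-bit integers; signedMulWillNotOverflow is a hypothetical helper,
and whether SCEV uses exactly this formulation is an assumption.

```cpp
#include <cstdint>

// Returns true if A * B is exactly representable as a signed 32-bit value.
bool signedMulWillNotOverflow(int32_t A, int32_t B) {
  // Multiply in a type twice as wide, where the product cannot overflow.
  int64_t Wide = static_cast<int64_t>(A) * static_cast<int64_t>(B);
  // The narrow multiply is safe iff the wide product fits the narrow range.
  return Wide >= INT32_MIN && Wide <= INT32_MAX;
}
```

For example, 46340 * 46340 fits in 32 bits, while 46341 * 46341 and
INT32_MIN * -1 do not.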
This patch adds a new VPlan-based addMinimumIterationCheck, which
replaces the ILV version for the non-epilogue case.
The VPlan-based version constructs a SCEV expression to compute the
minimum iterations and uses it to determine whether the check is known
true or false. Otherwise it creates a VPExpandSCEV recipe and emits a
compare-and-branch.
When using epilogue vectorization, we still need to create the minimum
trip-count-check during the legacy skeleton creation. The patch moves
the definitions out of ILV.
PR: https://github.com/llvm/llvm-project/pull/153643
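The known-true / known-false / runtime-check split can be pictured with a
small model. This is a sketch only: CheckResult and classifyMinIterCheck
are hypothetical names, the check is assumed to have the form
"TripCount < MinIters", and the trip count is assumed to be known only as an
inclusive range [TCMin, TCMax].

```cpp
#include <cstdint>

enum class CheckResult { KnownTrue, KnownFalse, NeedsRuntimeCheck };

// Classify the check "TripCount < MinIters" from bounds on the trip count.
CheckResult classifyMinIterCheck(uint64_t TCMin, uint64_t TCMax,
                                 uint64_t MinIters) {
  if (TCMax < MinIters)
    return CheckResult::KnownTrue; // too few iterations on every execution
  if (TCMin >= MinIters)
    return CheckResult::KnownFalse; // always enough iterations
  // Not decidable statically: a compare-and-branch has to be emitted.
  return CheckResult::NeedsRuntimeCheck;
}
```

Only the last case corresponds to the situation where an expanded
expression and a compare-and-branch are required.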
We just replaced SmallSet<T *, N> with SmallPtrSet<T *, N>, bypassing
the redirection found in SmallSet.h. With that, we no longer need to
include SmallSet.h in many files.
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
    template <typename PointeeType, unsigned N>
    class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {};
We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
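The effect of the redirection can be reproduced with toy stand-ins. The
classes below live in a hypothetical sketch namespace and use std::set for
storage; they only mirror the shape of the partial specialization quoted
above, not the real llvm/ADT implementations or their insert() signatures.

```cpp
#include <set>
#include <type_traits>

namespace sketch { // toy stand-ins, not the real llvm/ADT classes

template <typename PtrType, unsigned N> class SmallPtrSet {
  std::set<PtrType> Storage;

public:
  // Returns true if the pointer was newly inserted.
  bool insert(PtrType P) { return Storage.insert(P).second; }
};

// Primary template, used for non-pointer element types.
template <typename T, unsigned N> class SmallSet {
  std::set<T> Storage;

public:
  bool insert(const T &V) { return Storage.insert(V).second; }
};

// The "redirection": pointer element types select this partial
// specialization, which simply inherits from SmallPtrSet.
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType *, N> : public SmallPtrSet<PointeeType *, N> {};

} // namespace sketch

static_assert(std::is_base_of<sketch::SmallPtrSet<int *, 4>,
                              sketch::SmallSet<int *, 4>>::value,
              "pointer element types are redirected to SmallPtrSet");
static_assert(!std::is_base_of<sketch::SmallPtrSet<int *, 4>,
                               sketch::SmallSet<int, 4>>::value,
              "non-pointer element types use the primary template");
```

Spelling the type as SmallPtrSet directly makes the chosen container
visible at the use site instead of hiding it behind the specialization.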
Similarly to https://github.com/llvm/llvm-project/pull/131538, we can
also try to check whether a predicate is known to wrap given the
backedge-taken count.
For now, this just checks directly when we try to create predicated
AddRecs. This both helps to avoid spending compile-time on optimizations
where we know the predicate is false, and can also help to allow
additional vectorization (e.g. by deciding to scalarize memory accesses
when otherwise we would try to create a predicated AddRec with a
predicate that's always false).
The initial version is quite restricted, but can be extended in
follow-ups to cover more cases.
PR: https://github.com/llvm/llvm-project/pull/151134
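To make "known to wrap given the backedge-taken count" concrete, here is a
toy model on 8-bit unsigned values. knownToWrapUnsigned8 is a hypothetical
helper, and the recurrence {Start,+,Step} is assumed to be evaluated for
iterations 0 through BTC; the check actually performed by the patch is not
spelled out in this message.

```cpp
#include <cstdint>

// Returns true if the 8-bit recurrence {Start,+,Step} exceeds 255 at some
// iteration in [0, BTC], i.e. a no-unsigned-wrap predicate on it would
// always be false.
bool knownToWrapUnsigned8(uint8_t Start, uint8_t Step, uint64_t BTC) {
  if (Step == 0)
    return false; // the value never changes, so it cannot wrap
  // Number of steps that still fit below the 8-bit limit.
  uint64_t Headroom = 255u - Start;
  uint64_t StepsThatFit = Headroom / Step;
  // Wraps iff Start + Step * BTC > 255, written without overflow.
  return BTC > StepsThatFit;
}
```

When such a function returns true, creating a predicated AddRec guarded by
the corresponding no-wrap predicate would be pointless, which is the
situation the patch tries to detect early.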