The expansion of a SCEVUnknown is trivial (it's just the wrapped value).
If we try to reuse an existing value it might be a more complex
expression that simplifies to the SCEVUnknown.
This is inspired by https://github.com/llvm/llvm-project/issues/114879,
because SCEVExpander replacing a constant with a phi node is just silly.
(I don't consider this a fix for that issue though.)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
This macros is always defined: either 0 or 1. The correct pattern is to
use #if.
Re-apply #110185 with more fixes for debug build with the ABI breaking
checks disabled.
…ead of #ifdef (#110883)"
This reverts commit 1905cdbf4ef15565504036c52725cb0622ee64ef, which
causes lots of failures where LLVM doesn't have the right header guards.
The errors can be seen on
[BuildKite](https://buildkite.com/llvm-project/upstream-bazel/builds/112362#01924eae-231c-4d06-ba87-2c538cf40e04),
where the source uses `#ifndef NDEBUG`, but the content in question is
defined when `LLVM_ENABLE_ABI_BREAKING_CHECKS == 1`.
For example, `llvm/include/llvm/Support/GenericDomTreeConstruction.h`
has the following:
```cpp
// Helper struct used during edge insertions.
struct InsertionInfo {
// ...
#ifdef LLVM_ENABLE_ABI_BREAKING_CHECKS
SmallVector<TreeNodePtr, 8> VisitedUnaffected;
#endif
};
// ...
InsertionInfo II;
// ...
#ifndef NDEBUG
II.VisitedUnaffected.push_back(SuccTN);
#endif
```
When expanding SCEV adds to geps, transfer the nuw flag to the resulting
gep. (Note that this doesn't apply to IV increment GEPs, which go
through a different code path.)
As pointed out in the review of #102133, SCEVExpander currently
incorrectly reuses GEP instructions that have poison-generating flags
set. Fix this by clearing the flags on the reused instruction.
The current isHighCostExpansion cost model for addrecs computes the cost
for some kind of polynomial expansion that does not appear to have any
relation to addrec expansion whatsoever.
A literal expansion of an affine addrec is a phi and add (plus the
expansion of start and step). For a non-affine addrec, we get another
phi+add for each additional addrec nested in the step recurrence.
This partially `fixes` https://github.com/llvm/llvm-project/issues/53205
(the runtime unroll test case in this PR).
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
If we have a urem expression, emitting it as a urem is significantly
better that letting the fully expansion kick in. We have the risk of a
udiv or mul which could have previously been shared, but loosing that
seems like a reasonable tradeoff for being able to round trip a urem w/o
modification.
The patch didn't consistently clean up `#ifdef LLVM_ENABLE_ABI_BREAKING_CHECKS` and '#if defined(LLVM_ENABLE_ABI_BREAKING_CHECKS)' paths, causing a lot of build failures
`LLVM_ENABLE_ABI_BREAKING_CHECKS` is always defined:
72c901f5e5/llvm/include/llvm/Config/abi-breaking.h.cmake (L16C2-L16C15)
It uses `cmakedefine01` rather than `cmakedefine`, so
`LLVM_ENABLE_ABI_BREAKING_CHECKS` is always defined,
so the preprocessed code is probably not what the author wanted.
Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount which returns
the minimum of the countable exits.
When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if any uncomputable exits leave the loop,
as long as we prove that there are no dependences given the minimum of
the countable exits. The same should apply also for generating runtime
checks.
Note that this shifts the responsiblity of checking whether all exit
counts are computable or handling early-exits to the users of LAA.
Depends on https://github.com/llvm/llvm-project/pull/93498
PR: https://github.com/llvm/llvm-project/pull/93499
Rename has/dropPoisonGeneratingFlagsOrMetadata to
has/dropPoisonGeneratingAnnotations and make it also handle
nonnull, align and range return attributes on calls, similar
to the existing handling for !nonnull, !align and !range metadata.
I'd reverted this in 6c7805d5d1 after a bad stage. Original commit
messsage follows:
[NFC][RemoveDIs] Bulk update utilities to insert with iterators
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.
All of this should be NFC.
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.
All of this should be NFC.
SCEVExpanderCleaner will currently remove instructions created by
SCEVExpander, but not restore poison generating flags that it may have
dropped. As such, running LIR can currently spuriously drop flags
without performing any transforms.
Fix this by keeping track of original instruction flags in SCEVExpander.
Fixes https://github.com/llvm/llvm-project/issues/82337.
widenIVUse may hoist a wide induction increment and introduce new uses,
but does not recompute the wrap flags. In some cases this can make the
new uses of the wide IV inc more poisonous.
Update the code to recompute flags if needed when hoisting an IV. If
both the narrow and wide IV increment's flags match and we can re-use
the flags from the increments, there's no need to recompute the flags,
as the replacement won't make the new uses of the wide IV's increment
more poisonous.
Note that this also updates a stale comment which claimed that the widen
increment is only used if it dominates the new use.
The helper should also be used to guard the code added in da437330be,
which I am planning on doing separately once the helper lands.
Fixes https://github.com/llvm/llvm-project/issues/82243.
Similar to the non-ptr case, directly create the getelementptr
instruction. Going through expandAddToGEP() no longer makes sense
with opaque pointers, where generating the necessary instruction
is trivial.
This avoids recursive expansion of (the SCEV of) StepV while the
IR is in an inconsistent state, in particular with an incomplete
IV phi node, which utilities may not be prepared to deal with.
Fixes https://github.com/llvm/llvm-project/issues/80954.
SCEV treats "or disjoint" the same as "add nsw nuw". However, when
expanding, we cannot generally replace an add SCEV node with an "or
disjoint" instruction. Just dropping the poison flag is insufficient in
this case, we would have to actually convert the or into an add.
This is a partial fix for #79861.
We are replacing with a wider increment. If both OrigInc and
IsomorphicInc are NUW/NSW, then we can preserve them on the wider
increment; the narrower IsomorphicInc would wrap before the wider
OrigInc, so the replacement won't make IsomorphicInc's uses more
poisonous.
PR: https://github.com/llvm/llvm-project/pull/79512
Move logic to replace congruent IV increments to helper function, to
reduce the indentation by using early returns. This is in preparation
for a follow-up patch.
LSR uses SCEVExpander to generate induction formulas. The expander
internally tries to reuse existing IR expressions. To do that, it needs
to strip any poison generating flags (nsw, nuw, exact, nneg, etc..)
which may not be valid for the newly added users.
This is conservatively correct, but has the effect that LSR will strip
nneg flags on zext instructions involved in trip counts in loop
preheaders. To avoid this, this patch adjusts the expanded to reinfer
the flags on the CSE candidate if legal for all possible users.
This should fix the regression reported in
https://github.com/llvm/llvm-project/issues/71200.
This should arguably be done inside canReuseInstruction instead, but
doing it outside is more conservative compile time wise. Both
canReuseInstruction and isGuaranteedNotToBePoison walk operand lists, so
right now we are performing work which is roughly O(N^2) in the size of
the operand graph. We should fix that before making the per operand step
more expensive. My tenative plan is to land this, and then rework the
code to sink the logic into more core interfaces.
zext nneg was recently added to the IR in #67982. Teaching SCEVExpander
to emit nneg when possible is valuable since SCEV may have proved
non-trivial facts about loop bounds which would otherwise be lost when
materializing the value.
gcc warned about it:
../lib/Transforms/Utils/ScalarEvolutionExpander.cpp: In lambda function:
../lib/Transforms/Utils/ScalarEvolutionExpander.cpp:2104:22: warning: unused variable 'ARPtrTy' [-Wunused-variable]
2104 | if (PointerType *ARPtrTy = dyn_cast<PointerType>(ARTy)) {
| ^~~~~~~
Fix the warning by removing the variable and turn dyn_cast into isa.
Remove all the expandCodeFor() uses that specify an explicit type,
as well as InsertNoopCastOfTo() calls and most uses of
getEffectiveSCEVType().
The only place where no-op casts can now be inserted are public
expandCodeFor() uses.
SCEVExpander currently has special handling for the case where the
start or the step of an addrec do not dominate the loop header,
which is not used by any lit test.
Initially I thought that this is entirely dead code, because
addrec operands are required to be loop invariant. However,
SCEV currently allows creating an addrec with operands that are
loop invariant but defined *after* the loop.
This doesn't seem like a useful case to allow, and we don't
appear to be using this outside a single easy to adjust unit test.
A lot of SCEV expressions only work on integers -- in which case
the effective type will always be the same as the type.
There is a lot more cleanup to do here.
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.
At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152537
SCEVExpander tries to reuse existing instruction with the same
SCEV expression. However, doing this replacement blindly is not
safe, because the instruction might be more poisonous.
What we were already doing is to drop poison-generating flags on
the reused instruction. But this is not the only way that more
poison can be introduced. The poison-generating flag might not
be directly on the reused instruction, or the poison contribution
might come from something like 0 * %var, which folds to 0 but can
still introduce poison.
This patch fixes the issue in a principled way, by determining which
values can contribute poison to the SCEV expression, and then
checking whether any additional values can contribute poison to the
instruction being reused. Poison-generating flags are dropped if
doing that enables reuse.
This is a pretty big hammer and does cause some regressions in
tests, but less than I would have expected. I wasn't able to come
up with a less intrusive fix that still satisfies the correctness
requirements.
Fixes https://github.com/llvm/llvm-project/issues/63763.
Fixes https://github.com/llvm/llvm-project/issues/63926.
Fixes https://github.com/llvm/llvm-project/issues/64333.
Fixes https://github.com/llvm/llvm-project/issues/63727.
Differential Revision: https://reviews.llvm.org/D158181
This method is only used to determine whether a related expansion
exists, the actual value is unused. Clarify that by renaming
get -> has and returning bool.