During inter-procedural SCCP, also infer attributes on arguments, not
just return values. This allows other non-interprocedural passes to make
use of the information later.
Don't just leave the result as unknown. I think this currently
works out thanks to undef resolution, but the correct thing to
do is set it to overdefined explicitly.
IPSCCP can currently return worse results than SCCP for arguments that
are tracked interprocedurally, because information from attributes is
not used for them.
Fix this by intersecting in the attribute information when propagating
lattice values from calls.
Add NotConstant(Null) roots for nonnull arguments and then propagate
them through nuw/inbounds GEPs.
Having this functionality in SCCP is useful because it allows reliably
eliminating null comparisons, independently of how deeply nested they
are in selects/phis. This handles cases that would hit a cutoff in
ValueTracking otherwise.
The implementation is something of a MVP, there are a number of obvious
extensions (e.g. allocas are also non-null).
This is a confusingly named helper than means "is not unknown,
undef or constant". Prefer the more obvious ValueLattice API
instead. Most of these checks are for values which are forced to
overdefined by undef resolution, in which case only actual
overdefined values are relevant.
Add preliminary support for vectors of integers by using the
`ValueLatticeElement::asConstantRange()` helper instead of a custom
implementation, and relxing various integer type checks.
This enables just the part that works automatically, e.g. icmps with a
constant vector operand aren't supported yet.
The change in ssa.copy handling is because asConstantRange() returns an
unknown LV for empty range, while SCCP's getConstantRange() returned a
full range. I've made the change to preserve the existing behavior.
When performing some range operation (e.g. and) on a constant range that
includes undef, we currently just ignore the undef value, which is
obviously incorrect. Instead, we can do one of two things:
* Say that the result range also includes undef.
* Treat undef as a full range.
This patch goes with the second approach -- I'd expect it to be a bit
better overall, e.g. it allows preserving the fact that a zext of a
range with undef isn't a full range.
Fixes https://github.com/llvm/llvm-project/issues/93096.
I'd reverted this in 6c7805d5d1 after a bad stage. Original commit
messsage follows:
[NFC][RemoveDIs] Bulk update utilities to insert with iterators
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.
All of this should be NFC.
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.
All of this should be NFC.
Introduce isDesirableCastOp() which determines whether IR builder
and constant folding should produce constant expressions for a
given cast type. This mirrors what we do for binary operators.
Mark zext/sext as undesirable, which prevents most creations of such
constant expressions. This is still somewhat incomplete and there
are a few more places that can create zext/sext expressions.
This is part of the work for
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
The reason for the odd result in the constantexpr-fneg.c test is
that initially the "a[]" global is created with an [0 x i32] type,
at which point the icmp expression cannot be folded. Later it is
replaced with an [1 x i32] global and the icmp gets folded away.
But at that point we no longer fold the zext.
isLegalToHoistInto() currently return true for callbr instructions.
That means that a callbr with one successor will be considered a
proper loop preheader, which may result in instructions that use
the callbr return value being hoisted past it.
Fix this by adding callbr to isExceptionTerminator (with a rename
to isSpecialTerminator), which also fixes similar assumptions in
other places.
Fixes https://github.com/llvm/llvm-project/issues/64215.
Differential Revision: https://reviews.llvm.org/D158609
Scalable vector GEPs are not constants and trying to create one for
these GEPs causes an assertion failure.
Reviewed By: nikic, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D157590
If a value is already the last element of the worklist, then I think that we don't have to add it again, it is not needed to process it repeatedly.
For some long Triton-generated LLVM IR, this can cause a ~100x speedup.
Differential Revision: https://reviews.llvm.org/D153561
The ConstantRange specifies the range of the scalar elements in the
vector. When converting into a Constant, we need to create a vector
splat with the correct type. For that purpose, pass in the expected
type for the constant.
Fixes https://github.com/llvm/llvm-project/issues/63380.
In replaceSignedInst, if a signed instruction can be repalced with
unsigned instruction, we created a new instruction and removed the old
instruction's value state. If the following instructions has this new
instruction as a use operand, transformations like replaceSignedInst and
refineInstruction would be blocked. The reason is there is no value
state for the new instrution.
This patch set the new instruction's value state with the removed
instruction's value state. I believe it is correct bacause when we
repalce a signed instruction with unsigned instruction, the value state
is not changed.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D152337
For constant range supported intrinsics, we got consantrange from args
no matter if they are unknown or undef. And the constant range computed
from unknown or undef value state is full range.
I think compute with full constant range is harmful since although we
can do mergeIn after these args value state are changed, the merge
operation of two constant ranges is union.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D152499
Using AvgLoopIters on any loop is too imprecise making the cost model
favor users inside loop nests regardless of the actual tripcount.
Differential Revision: https://reviews.llvm.org/D150375
Reapply with extra check for struct types, which caused buildbot
failures last time.
-----
The freeze instruction has not been handled by SCCPInstVisitor.
This patch adds SCCPInstVisitor::visitFreezeInst(FreezeInst &I)
method to handle freeze instructions.
Differential Revision: https://reviews.llvm.org/D151659
The SCCPSolver is using a structure (AnalysisResultsForFn) where it keeps
pointers to various analyses needed by the IPSCCP pass. These analyses are
requested all at the same time, which can become problematic in some cases.
For example one could be retrieved via getCachedAnalysis() prior to the
actual execution of the analysis. In more detail:
The IPSCCP pass uses a DomTreeUpdater to preserve the PostDominatorTree
in case the PostDominatorTreeAnalysis had run before IPSCCP. Starting with
commit 1b1232047e83b the IPSCCP pass may use BlockFrequencyAnalysis for
some functions in the module. As a result, the PostDominatorTreeAnalysis
may not run until the BlockFrequencyAnalysis has run, since the latter
analysis depends on the former. Currently, we setup the DomTreeUpdater
using getCachedAnalysis to retrieve a PostDominatorTree. This happens
before BlockFrequencyAnalysis has run, therefore the cached analysis can
become invalid by the time we use it.
Differential Revision: https://reviews.llvm.org/D151666
The freeze instruction has not been handled by SCCPInstVisitor.
This patch adds SCCPInstVisitor::visitFreezeInst(FreezeInst &I)
method to handle freeze instructions.
Differential Revision: https://reviews.llvm.org/D151659
As reported on https://reviews.llvm.org/D150375#4367861 and
following, this change causes PDT invalidation issues. Revert
it and dependent commits.
This reverts commit 0524534d5220da5ecb2cd424a46520184d2be366.
This reverts commit ced90d1ff64a89a13479a37a3b17a411a3259f9f.
This reverts commit 9f992cc9350a7f7072a6dbf018ea07142ea7a7ed.
This reverts commit 1b1232047e83b69561fd64b9547cb0a0d374473a.
Using AvgLoopIters on any loop is too imprecise making the cost model
favor users inside loop nests regardless of the actual tripcount.
Differential Revision: https://reviews.llvm.org/D150375
To track the return values of specializations, we need to invalidate all
the lattice values across the use-def chain which originates from the
callsites, recompute and propagate.
Differential Revision: https://reviews.llvm.org/D146158
This reverts commit 43acb61a08fffada31fb2e20e45fcc8492ef76b9.
Recommit the patch after fixing the issue causing the revert in 4e607ec4987.
Extra tests have been added in 5c6cb61ad416a544.
Original commit message:
Extend the NUW/NSW inference logic add in 72121a20cd and cdeaf5f28c3dc
to all overflowing binary operators.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D142721
This brings the operand order in line with the NUW handling, which was
missed out in 72121a20cda4dc91d0ef5548f930.
At the moment this is NFC as we only additions, but it
should fix miscompiles with 024115ab14822a recommitted.
This reverts commit 024115ab14822a97c09adcd2545c14e78b843b36.
I suspect that this may be causing some buildbot bootstrapping failures.
Revert while I investigate.
Extend the NUW/NSW inference logic add in 72121a20cd and cdeaf5f28c3dc
to all overflowing binary operators.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D142721
This patch updates SCCP to use the value ranges of AddInst operands to
try to prove the AddInst does not overflow in the signed sense and
adds the NSW flag. The reasoning is done with
makeGuaranteedNoWrapRegion (thanks @nikic for point it out!).
Follow-ups will include extending this to more
OverflowingBinaryOperators.
Depends on D142387.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D142390