In line with updated shufflevector semantics, this represents the
poison elements rather than undef elements now. This commit is a
pure rename, without any logic changes.
This adds support for using dominating conditions in computeKnownBits()
when called from InstCombine. The implementation uses a
DomConditionCache, which stores which branches may provide information
that is relevant for a given value.
DomConditionCache is similar to AssumptionCache, but does not try to do
any kind of automatic tracking. Relevant branches have to be explicitly
registered and invalidated values explicitly removed. The necessary
tracking is done inside InstCombine.
The reason why this doesn't just do exactly the same thing as
AssumptionCache is that a lot more transforms touch branches and branch
conditions than assumptions. AssumptionCache is an immutable analysis
and mostly gets away with this because only a handful of places have to
register additional assumptions (mostly as a result of cloning). This is
very much not the case for branches.
This change regresses compile-time by about ~0.2%. It also improves
stage2-O0-g builds by about ~0.2%, which indicates that this change results
in additional optimizations inside clang itself.
Fixes https://github.com/llvm/llvm-project/issues/74242.
My initial patch contained a typo, resulting in the wrong value
being checked for non-negativeness.
-----
If the lshr operand is non-negative, we can treat it the same
way as an ashr. Ideally we would represent this as "lshr nneg",
but for now just perform the necessary ValueTracking query.
Proof: https://alive2.llvm.org/ce/z/Ahg4ri
If the lshr operand is non-negative, we can treat it the same
way as an ashr. Ideally we would represent this as "lshr nneg",
but for now just perform the necessary ValueTracking query.
Proof: https://alive2.llvm.org/ce/z/Ahg4ri
This patch updates the last few places in LLVM using findDbgValues that
don't also collect and handle DPValue objects. This largely involves
instcombine and mem2reg changes, and are largely mechanical, calling
existing utilities on collections of DPValues instead of just
DbgValuesInsts.
A variety of tests have had RemoveDIs RUN lines added to them to cover
these behaviours. We have some technical debt of the instcombine sinking
code for DPValues not being implemented yet, so I've left FIXME stubs
indicating that we intend to cover tests with RemoveDIs but haven't yet.
Part of the "RemoveDIs" project to remove debug intrinsics requires
passing block-positions around in iterators rather than as instruction
pointers, allowing some debug-info to reside in BasicBlock::iterator.
This means getInsertionPointAfterDef has to return an iterator, and as
it can return no-instruction that means returning an optional iterator.
This patch changes the signature for getInsertionPtAfterDef and then
patches up the various places that use it to handle the different type.
This would overall be an NFC patch, however in
InstCombinerImpl::freezeOtherUses I've started skipping any debug
intrinsics at the returned insert-position. This should not have any
_meaningful_ effect on the compiler output: at worst it means variable
assignments that are skipped will now cover the freeze instruction and
anything inserted before it, which should be inconsequential.
Sadly: this makes the function signature ugly. This is probably the
ugliest piece of fallout for the "RemoveDIs" work, but it serves the
overall purpose of improving compile times and not allowing `-g` to
affect compiler output, so should be worthwhile in the end.
With the current logic of `if(isFreeToInvert(Op)) return Not(Op)` its
fairly easy to either 1) cause regressions or 2) infinite loops
if the folds we have for `Not(Op)` ever de-sync with the cases we
know are freely invertible.
This patch adds `getFreeInverted` which is able to build the free
inverted op along with check for free inversion to alleviate this
problem.
It might seem obvious, but it's not a good idea to convert a
debug-intrinsic instruction into an UnreachableInst, as this means
things operate differently with and without the -g option. However this
can happen due to the "mutate the next instruction" API calls we make.
With RemoveDIs eliminating debug intrinsics, this behaviour is at risk
of changing, hence this patch ensures we only ever mutate the next _non_
debuginfo instruction into an Unreachable.
The tests instrumented with the --try... flag all exercise this, I've
added some metadata to a SCCP test to ensure it's exercised.
This patch re-implements a variety of debug-info maintenence functions
to use DPValues instead of DbgValueInst's: supporting the "new"
non-intrinsic representation of debug-info. As per [0], we need to have
parallel implementations of various utilities for a time, and these are
the most fundamental utilities used throughout the compiler.
I've added --try-experimental-debuginfo-iterators to a variety of RUN
lines: this is a flag that turns on "new debug-info" if it's built into
LLVM, and not otherwise. This should ensure that we have the same
behaviour for the same IR inputs, but using a different internal
representation. For the most part these changes affect SROA/Mem2Reg
promotion of dbg.declares into dbg.value intrinsics (now DPValues),
we're leaving dbg.declares as instructions until later in the day.
There's also some salvaging changes made.
I believe the tests that I've added cover almost all the code being
updated here. The only thing I'm not confident about is SimplifyCFG,
which calls rewriteDebugUsers down a variety of code paths. Those
changes can't immediately get full coverage as an additional patch is
needed that updates handling of Unreachable instructions, will upload
that shortly.
[0]
https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939/9
At the moment, all alloc-like functions are assumed to return non-null
pointers, if their return value is only used in a compare. This is based
on being allowed to substitute the allocation function with one that
doesn't fail to allocate the required memory.
aligned_alloc however must also return null if the required alignment
cannot be satisfied, so I don't think the same reasoning as above can be
applied to it.
This patch adds a bail-out for aligned_alloc calls to
isAllocSiteRemovable.
This is the floating-point analog of SimplifyDemandedBits. If we know
the edge cases are assumed impossible in uses, it's possible to prune
upstream edge case handling.
Start by only using this on returns in functions with nofpclass
returns (where I'm surprised there are no other combines), but this
can be extended to include any other nofpclass use or FPMathOperator
with flags.
Partially addresses issue #64870https://reviews.llvm.org/D158648
As per my proposal for how to eliminate debug intrinsics [0], for various
places in InstCombine prefer to insert using an instruction iterator rather
than an instruction pointer. This is so that we can eventually pass more
information in the iterator class. These call-sites where I've changed the
spelling are those that necessary to build a stage2clang to produce an
identical binary in the coming no-debug-intrinsics mode.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152543
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.
At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152537
As outlined in my proposal of how to get rid of debug intrinsics, this
patch adds a moveBefore method that signals the caller /intends/ the order
of moved instructions is to stay the same. This semantic difference has an
effect on debug-info, as it signals whether debug-info needs to move with
instructions or not.
The patch just replaces a few calls to moveBefore with calls to
moveBeforePreserving -- and the latter just calls the former, so it's all
NFC right now. A future patch will add an implementation of
moveBeforePreserving that takes action to correctly preserve debug-info,
but that's tightly coupled with our non-instruction debug-info
representation that's still being reviewed.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D156369
This is partial revert of cbca9ce91c6440f8815742b8a73a27aa81e806e6.
That commit removed the code guarding against min/max SPF patterns,
because those are now canonicalized to min/max intrinsics. However,
this is only true for integer min/max, while FP min/max can not
always be canonicalized to an intrinsic. As such, restore a
simplified version of the guard that handles only the FP case.
Fixes https://github.com/llvm/llvm-project/issues/64937.
When the 'int Four = Two;' is sunk into the 'case 0:' block,
the debug value for 'Three' is set incorrectly to 'poison'.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D158171
This check is redundant as it is covered by the call to
isPotentiallyReachable.
Depends on D155726.
Differential Revision: https://reviews.llvm.org/D155718
The zext constant expression was detected by the fold, but then
handled as a sext. Use ZExtOperator instead of ZExtInst to handle
constant expressions.
Fixes https://github.com/llvm/llvm-project/issues/64669.
Set phi inputs to poison whenever we find a dead edge (either
during initial worklist population or the main InstCombine run),
instead of only doing this for successors of dead blocks.
This means that the phi operand is set to poison even if for
critical edges without an intermediate block.
There are quite a few test changes, because the pattern is fairly
common in vectorizer output, for cases where we know the vectorized
loop will be entered.
Improve foldOpIntoPhi() for icmp operator to check if incoming PHI value can be replaced with constant based on implied condition.
Depends on D156619.
Differential Revision: https://reviews.llvm.org/D156620
In degenerate cases, it is possible for unreachable code removal
to remove the current instruction. However, we still return the
instruction to report a change, resulting in a use after free.
Instead, perform the change reporting in the same way as
eraseInstFromFunction() does, by directly setting MadeIRChange
and returning nullptr.
Fixes https://github.com/llvm/llvm-project/issues/64235.
Fixes a regression introduced by D75362 for irreducible control
flow. In that case, we may visit the predecessor that renders
the current block live only later, and incorrectly determine
that a block is dead.
Instead, switch to using the same DeadEdges based implementation
we also use during the main InstCombine iteration.
This temporarily regresses some cases that need replacement of
dead phi operands with poison, which is currently only done during
the main run, but not worklist population. This will be addressed
in a followup, to keep it separate from the correctness fix here.
Fixes https://github.com/llvm/llvm-project/issues/64259.
InstCombine is a worklist-driven algorithm, which works roughly
as follows:
* All instructions are initially pushed to the worklist.
The initial order is in RPO program order.
* All newly inserted instructions get added to the worklist.
* When an instruction is folded, its users get added back to the
worklist.
* When the use-count of an instruction decreases, it gets added
back to the worklist.
* And a few of other heuristics on when we should revisit
instructions.
On top of the worklist algorithm, InstCombine layers an additional
fix-point iteration: If any fold was performed in the previous
iteration, then InstCombine will re-populate the worklist from
scratch and fold the entire function again. This continues until
a fix-point is reached.
In the vast majority of cases, InstCombine will reach a fix-point
within a single iteration: However, a second iteration is performed
to verify that this is indeed the fixpoint. We can see this in the
statistics for llvm-test-suite:
"instcombine.NumOneIteration": 411380,
"instcombine.NumTwoIterations": 117921,
"instcombine.NumThreeIterations": 236,
"instcombine.NumFourOrMoreIterations": 2,
The way to read these numbers is that in 411380 cases, InstCombine
performs no folds. In 117921 cases it performs a fold and reaches
the fix-point within one iteration (the second iteration verifies
the fixpoint). In the remaining 238 cases, more than one iteration
is needed to reach the fixpoint.
In other words, only in 0.04% of cases are additional iterations
needed to reach a fixpoint. Conversely, in 22.3% of cases InstCombine
performs a completely useless extra iteration to verify the fix point.
This patch removes the fixpoint iteration from InstCombine, and always
only perform a single iteration. This results in a major compile-time
improvement of around 4% at negligible codegen impact.
This explicitly does accept that we will not reach a fixpoint in all
cases. However, this is mitigated by two factors: First, the data
suggests that this happens very rarely in practice. Second,
InstCombine runs many times during the optimization pipeline
(8 times even without LTO), so there are many chances to recover
such cases.
In order to prevent accidental optimization regressions in the
future, this implements a verify-fixpoint option, which is enabled
by default when instcombine is specified in -passes and disabled
when InstCombinePass() is constructed from C++. This means that
test cases need to explicitly use the no-verify-fixpoint option
if they fail to reach a fixed point (for a well understand reason
we cannot / do not want to avoid).
Differential Revision: https://reviews.llvm.org/D154579