This pass steps through a block forwards and backwards, identifying those
variable assignment records that are redundant, and erases them, saving us
a decent wedge of compile-time. This patch re-implements it to use the
replacement for DbgValueInsts, DPValues, in an almost identical way.
Alas the test I've added the try-remove-dis flag to is the only one I've
been able to find that manually runs this pass.
Some passes has limitation that only support simple terminators:
branch/unreachable/return. Right now, they ask the pass manager to add
LowerSwitch pass to eliminate `switch`. Let's manage such kind of pass
dependency by ourselves. Also add the assertion in the related passes.
As per the stack of patches this is attached to, allow users of
BasicBlock::splitBasicBlock to provide an iterator for a position, instead
of just an instruction pointer. This is to fit with my proposal for how to
get rid of debug intrinsics [0]. There are other call-sites that would need
to change, but this is sufficient for a stage2clang self host and some
other C++ projects to build identical binaries, in the context of the whole
remove-DIs project.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152545
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.
At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152537
As outlined in my proposal of how to get rid of debug intrinsics, this
patch adds a moveBefore method that signals the caller /intends/ the order
of moved instructions is to stay the same. This semantic difference has an
effect on debug-info, as it signals whether debug-info needs to move with
instructions or not.
The patch just replaces a few calls to moveBefore with calls to
moveBeforePreserving -- and the latter just calls the former, so it's all
NFC right now. A future patch will add an implementation of
moveBeforePreserving that takes action to correctly preserve debug-info,
but that's tightly coupled with our non-instruction debug-info
representation that's still being reviewed.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D156369
isLegalToHoistInto() currently return true for callbr instructions.
That means that a callbr with one successor will be considered a
proper loop preheader, which may result in instructions that use
the callbr return value being hoisted past it.
Fix this by adding callbr to isExceptionTerminator (with a rename
to isSpecialTerminator), which also fixes similar assumptions in
other places.
Fixes https://github.com/llvm/llvm-project/issues/64215.
Differential Revision: https://reviews.llvm.org/D158609
Add an API that allows removing multiple incoming phi values based
on a predicate callback, as suggested on D157621.
This makes sure that the removal is linear time rather than quadratic,
and avoids subtleties around iterator invalidation.
I have replaced some of the more straightforward users with the new
API, though there's a couple more places that should be able to use it.
Differential Revision: https://reviews.llvm.org/D158064
Refactor to use BasicBlockUtils functions and make life easier for
a subsequent patch for updating the dominator tree.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D154053
Add a more "flexible" `SplitBlockAndInsertIfThenElse` function
and re-implement some others on top of it.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D154052
SplitBlockAndInsertIfThen utility creates two new blocks,
they're called ThenBlock and Tail (true and false destinations of a conditional
branch correspondingly). The function has a bool parameter Unreachable,
and if it's set, then ThenBlock is terminated with an unreachable.
At the end of the function the new blocks are added to the loop of the split
block. However, in case ThenBlock is terminated with an unreachable,
it cannot belong to any loop.
Differential Revision: https://reviews.llvm.org/D152434
The method is marked for deprecation. Delete the method and move all of
its consumers to use the DomTreeUpdater version.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D149428
The patch adds new member MaybeEVL into InterestingMemoryOperand to represent
the effective vector length for vp intrinsics. It may be extended for some target intrinsics in the future.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D146208
This reverts commit c5ea42bcf48c8f3d3e35a6bff620b06d2a499108.
Recommit the patch with a fix for loops where the exiting terminator is
not a branch instruction. In that case, ExitInfos may be empty. In
addition to checking if there's a single exiting block also check if
there's a single ExitInfo.
A test case has been added in f92b35392ed8e4631.
Remove LLVM flag -experimental-assignment-tracking. Assignment tracking is
still enabled from Clang with the command line -Xclang
-fexperimental-assignment-tracking which tells Clang to ask LLVM to run the
pass declare-to-assign. That pass converts conventional debug intrinsics to
assignment tracking metadata. With this patch it now also sets a module flag
debug-info-assignment-tracking with the value `i1 true` (using the flag conflict
rule `Max` since enabling assignment tracking on IR that contains only
conventional debug intrinsics should cause no issues).
Update the docs and tests too.
Reviewed By: CarlosAlbertoEnciso
Differential Revision: https://reviews.llvm.org/D142027
The scope of DT updates are very limited when unrolling loops: the DT
should only need updating for
* new blocks added
* exiting blocks we simplified branches
This can be done manually without too much extra work.
MergeBlockIntoPredecessor also needs to be updated to support direct
DT updates.
This fixes excessive time spent in DTU for same cases. In an internal
example, time spent in LoopUnroll with this patch goes from ~200s to 2s.
It also is slightly positive for CTMark:
* NewPM-O3: -0.13%
* NewPM-ReleaseThinLTO: -0.11%
* NewPM-ReleaseLTO-g: -0.13%
Notable improvements are mafft (~ -0.50%) and lencod (~ -0.30%), with no
workload regressed.
https://llvm-compile-time-tracker.com/compare.php?from=78a9ee7834331fb4360457cc565fa36f5452f7e0&to=687e08d011b0dc6d3edd223612761e44225c7537&stat=instructions:u
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D141487
Currently, SROA is CFG-preserving.
Not doing so does not affect any pipeline test. (???)
Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call.
By design, we can't really SROA alloca if their address escapes somehow,
but we have logic to deal with `load` of `select`/`PHI`,
where at least one of the possible addresses prevents promotion,
by speculating the `load`s and `select`ing between loaded values.
As one would expect, that requires ensuring that the speculation is actually legal.
Even ignoring complexity bailouts, that logic does not deal with everything,
e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`.
There can also be cases where the load is genuinely non-speculate.
So if we can't prove that the load can be speculated,
unfold the select, produce two-entry phi node, and perform predicated load.
Now, that transformation must obviously update Dominator Tree,
since we require it later on. Doing so is trivial.
Additionally, we don't want to do this for the final SROA invocation (D136806).
In the end, this ends up having negative (!) compile-time cost:
https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u
Though indeed, this only deals with `select`s, `PHI`s are still using speculation.
Should we update some more analysis?
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138238
This reverts commit 739611870d3b06605afe25cc07833f6a62de9545,
and recommits 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5
with a fixed assertion - we should check that DTU is there,
not just assert false...
The assertion about not modifying the CFG seems to not hold,
will recommit in a bit.
https://lab.llvm.org/buildbot#builders/139/builds/32412
This reverts commit 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5.
This reverts commit 4f90f4ada33718f9025d0870a4fe3fe88276b3da.
Currently, SROA is CFG-preserving.
Not doing so does not affect any pipeline test. (???)
Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call.
By design, we can't really SROA alloca if their address escapes somehow,
but we have logic to deal with `load` of `select`/`PHI`,
where at least one of the possible addresses prevents promotion,
by speculating the `load`s and `select`ing between loaded values.
As one would expect, that requires ensuring that the speculation is actually legal.
Even ignoring complexity bailouts, that logic does not deal with everything,
e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`.
There can also be cases where the load is genuinely non-speculate.
So if we can't prove that the load can be speculated,
unfold the select, produce two-entry phi node, and perform predicated load.
Now, that transformation must obviously update Dominator Tree,
since we require it later on. Doing so is trivial.
Additionally, we don't want to do this for the final SROA invocation (D136806).
In the end, this ends up having negative (!) compile-time cost:
https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u
Though indeed, this only deals with `select`s, `PHI`s are still using speculation.
Should we update some more analysis?
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138238
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.
In the std::optional world, we are not guranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.
Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:
using NoneType = std::nullopt_t;
inline constexpr std::nullopt_t None = std::nullopt;
to ease the migration from llvm::Optional to std::optional.
Differential Revision: https://reviews.llvm.org/D138376
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
Update the RemoveRedundantDbgInstrs utility to avoid sometimes losing
information when deleting dbg.assign intrinsics.
removeRedundantDbgInstrsUsingBackwardScan - treat dbg.assign intrinsics that
are not linked to any instruction just like dbg.values. That is, in a block of
contiguous debug intrinsics, delete all other than the last definition for a
fragment. Leave linked dbg.assign intrinsics in place.
removeRedundantDbgInstrsUsingForwardScan - Don't delete linked dbg.assign
intrinsics and don't delete the next intrinsic found even if it would otherwise
be eligible for deletion.
remomveUndefDbgAssignsFromEntryBlock - Delete undef and unlinked dbg.assign
intrinsics encountered in the entry block that come before non-undef
non-unlinked intrinsics for the same variable.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133294
The existing way of creating the predicate in the guard blocks uses
a boolean value per outgoing block. This increases the number of live
booleans as the number of outgoing blocks increases. The new way added
in this change is to store one integer to represent the outgoing block
we want to branch to, then at each guard block, an integer equality
check is performed to decide which a specific outgoing block is taken.
Using an integer reduces the number of live values and decreases
register pressure especially in cases where there are a large number
of outgoing blocks. The integer based approach is used when the
number of outgoing blocks crosses a threshold, which is currently set
to 32.
Patch by Ruiling Song.
Differential review: https://reviews.llvm.org/D127831
Replacing the following instances of UndefValue with PoisonValue, where the UndefValue is used as an arbitrary value:
- llvm/lib/CodeGen/WinEHPrepare.cpp
`demotePHIsOnFunclets`: RAUW arbitrary value for lingering uses of removed PHI nodes
- llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
`FoldSingleEntryPHINodes`: Removes a self-referential single entry phi node.
- llvm/lib/Transforms/Utils/CallGraphUpdater.cpp
`finalize`: Remove all references to removed functions.
- llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
`cleanup`: the result is not used then the inserted instructions are removed.
- llvm/tools/bugpoint/CrashDebugger.cpp
`TestInts`: the program is cloned and instructions are removed to narrow down source of crash.
Differential Revision: https://reviews.llvm.org/D133640
As callbr is now allowed to have duplicate destinations, we can
have a callbr with a unique successor. Make sure it doesn't get
dropped, as we still need to preserve the side-effect.
After D129205, we support SplitBlockPredecessors() for predecessors
with callbr terminators. This means that it is now also safe to
invoke critical edge splitting for an edge coming from a callbr
terminator. Remove checks in various passes that were protecting
against that.
Differential Revision: https://reviews.llvm.org/D129256
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.
The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.
I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.
Differential Revision: https://reviews.llvm.org/D129205
Fixes a bug preventing moving the loop's metadata to an outer loop's header,
which happens if the loop's exit is also the header of an outer loop.
Adjusts test for above.
Fixes#55416.
Differential Revision: https://reviews.llvm.org/D125574
This solves a problem with non-deterministic output from opt due
to not performing dominator tree updates in a deterministic order.
The problem that was analysed indicated that JumpThreading was using
the DomTreeUpdater via llvm::MergeBasicBlockIntoOnlyPred. When
preparing the list of updates to send to DomTreeUpdater::applyUpdates
we iterated over a SmallPtrSet, which didn't give a well-defined
order of updates to perform.
The added domtree-updates.ll test case is an example that would
result in non-deterministic printouts of the domtree. Semantically
those domtree:s are equivalent, but it show the fact that when we
use the domtree iterator the order in which nodes are visited depend
on the order in which dominator tree updates are performed.
Since some passes (at least EarlyCSE) are iterating over nodes in the
dominator tree in a similar fashion as the domtree printer, then the
order in which transforms are applied by such passes, transitively,
also depend on the order in which dominator tree updates are
performed. And taking EarlyCSE as an example the end result could be
different depending on in which order the transforms are applied.
Reviewed By: nikic, kuhar
Differential Revision: https://reviews.llvm.org/D110292
Added support for peeling loops with exits that are followed either by an
unreachable-terminated block or block that has a terminatnig deoptimize call.
All blocks in the sequence must have an unique successor, maybe except
for the last one.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D110922