Instead of a hash map mapping pairs of blocks and successor index to the
probability, store the probabilities as flat array and start indices
into this array in a per-block information vector.
Also drop value handles: no stored pointers => no stale pointers. If a
block is removed, the block number is not reused unless the function is
renumbered, and BPI doesn't support renumbering.
This is possible since now all successor operands are stored
consecutively.
There is just one out-of-line function call instead of one call to
getSuccessor() per operand.
`Constant::isZeroValue` currently behaves same as
`Constant::isNullValue` for all types except floating-point, where it
additionally returns true for negative zero (`-0.0`). However, in
practice, almost all callers operate on integer/pointer types where the
two are equivalent, and the few FP-relevant callers have no meaningful
dependence on the `-0.0` behavior.
This PR removes `isZeroValue` to eliminate the confusing API. All
callers are changed to `isNullValue` with no test failures.
`isZeroValue` will be reintroduced in a future change with clearer
semantics: when null pointers may have non-zero bit patterns,
`isZeroValue` will check for bitwise-all-zeros, while `isNullValue` will
check for the semantic null (which
may be non-zero).
The collection of library function names in TargetLibraryInfo faces
similar challenges as RuntimeLibCalls in the IR component. The number of
function names is large, there are numerous customizations based on the
triple (including alternate names), and there is a lot of replicated
data in the signature table.
The ultimate goal would be to capture all lbrary function related
information in a .td file. This PR brings the current .def file to
TableGen, almost as a 1:1 replacement. However, there are some
improvements which are not possible in the current implementation:
- the function names are now stored as a long string together with an
offset table.
- the table of signatures is now deduplicated, using an offset table for
access.
The size of the object file decreases about 34kB with these changes. The
hash table of all function names is still constructed dynamically. A
static table like for RuntimeLibCalls is the next logical step.
The main motivation for this change is that I have to add a large number
of custom names for z/OS (like in RuntimeLibCalls.td), and the current
infrastructur does not support this very well.
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
There is no need to iterate all predecessors of current block, check if
current block is the invoke unwind destination of any predecessor. We
can directly call `BasicBlock::isEHPad()` to check if current block is
an exception handling block.
The `LoopBlock` stored in `LoopWorkList` consist of basic block and its
loop data information. When iterate `LoopWorkList`, if estimated weight
of a loop is not stored in `EstimatedLoopWeight`, `getLoopExitBlocks()`
is called to get all exit blocks of the loop. The estimated weight of a
loop is calculated by iterating over edges leading from basic block to
all exit blocks of the loop. If at least one edge has unknown estimated
weight, the estimated weight of loop is unknown and will not be stored
in `EstimatedLoopWeight`. `LoopWorkList` can contain different blocks in
a same loop, so there is wasted work that calls `getLoopExitBlocks()`
for same loop multiple times.
Since computing the exit blocks of loop is expensive and the loop
structure is not mutated in Branch Probability Analysis, we can cache
the result and improve compile time.
With this change, the overall compile time for a file containing a very
large loop is dropped by around 82%.
Use ICmpInst::compare() where possible, ConstantFoldCompareInstOperands
in other places. This only changes places where the either the fold is
guaranteed to succeed, or the code doesn't use the resulting compare if
we fail to fold.
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of
53 under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
When using `BranchProbabilityPrinterPass`, if a BB has no name, we get pretty unusable information like `edge -> has probability...` (i.e. we have no idea what the vertices of that edge are).
This patch uses `printAsOperand`, which uses the same naming scheme as `Function::dump`, so for example during debugging sessions, the IR obtained from a function and the names used by `BranchProbabilityPrinterPass` will match.
A shortcoming is that `printAsOperand` will result in the numbering algorithm re-running for every edge and every vertex (when `BranchProbabilityPrinterPass` is run on a function). If, for the given scenario, this is a problem, we can revisit this subsequently.
Another nuance is that the entry basic block will be numbered, which may be slightly confusing when it's anonymous, but it's easily identifiable - the first edge would have it as source (and the number should be easily recognizable)
- Change `BranchProbabilityPrinterPass` output to match expectations of `update_analyze_test_checks.py`.
- Add `Branch Probability Analysis` to list of supported analyses.
- Process `llvm/test/Analysis/BranchProbabilityInfo/basic.ll` with `update_analyze_test_checks.py` as proof of concept. Leaving the other tests unchanged to reduce the amount of churn.
The motivation is need to update branch probability info after
swapping successors of branch instruction.
Differential Revision: https://reviews.llvm.org/D148237
Reviewed By: nikic
When debugging and using debug-pass-manager (e.g. in regression tests)
we prefer a consistent order in which analysis passes are executed.
But when for example doing
return MyClass(AM.getResult<LoopAnalysis>(F),
AM.getResult<DominatorTreeAnalysis>(F));
then the order in which LoopAnalysis and DominatorTreeAnalysis isn't
guaranteed, and might for example depend on which compiler that is
used when building LLVM.
I've not scanned the full source tree, but this fixes some occurances
of the above pattern found in lib/Analysis.
This problem was discussed briefly in review for D146206.
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This commit fixes LLVMAnalysis and its dependencies.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
We currently instrument CallBrInst but do not annotate it with
the branch weight. This patch enables PGO annotation of CallBrInst.
Differential Revision: https://reviews.llvm.org/D133040
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.
Reviewed By: bogner, davidxl
Differential Revision: https://reviews.llvm.org/D128860
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.
Reviewed By: bogner, davidxl
Differential Revision: https://reviews.llvm.org/D128860
Use ConstantFoldBinaryOpOperands() instead, to prepare for the case
where not all binary operators have a constant expression form.
I believe this code actually intended to set OnlyIfReduced=true,
however ConstantExpr::get() actually accepts a Flags argument at
that position (and OnlyIfReducedTy as the next argument), so this
ended up creating a constant expression with some random flag
(probably exact or nuw depending on which).
This adds and uses look-up tables for non-loop branch probabilities, which have
have probabilities directly encoded into the tables for the different condition
codes. Compared to having this logic inlined in different functions, as it used
to be the case, I think this is compacter and thus also easier to check/cross
reference. This also adds a test for pointer heuristics that was missing.
Differential Revision: https://reviews.llvm.org/D114009
The function BranchProbabilityInfo::SccInfo::getSccExitBlocks is
supposed to collect all exit blocks for SCC rather than all exiting
blocks. This patch fixes the typo.
Reviewed By: ebrevnov
Differential Revision: https://reviews.llvm.org/D113344
Current approach doesn't work well in cases when multiple paths are predicted to be "cold". By "cold" paths I mean those containing "unreachable" instruction, call marked with 'cold' attribute and 'unwind' handler of 'invoke' instruction. The issue is that heuristics are applied one by one until the first match and essentially ignores relative hotness/coldness
of other paths.
New approach unifies processing of "cold" paths by assigning predefined absolute weight to each block estimated to be "cold". Then we propagate these weights up/down IR similarly to existing approach. And finally set up edge probabilities based on estimated block weights.
One important difference is how we propagate weight up. Existing approach propagates the same weight to all blocks that are post-dominated by a block with some "known" weight. This is useless at least because it always gives 50\50 distribution which is assumed by default anyway. Worse, it causes the algorithm to skip further heuristics and can miss setting more accurate probability. New algorithm propagates the weight up only to the blocks that dominates and post-dominated by a block with some "known" weight. In other words, those blocks that are either always executed or not executed together.
In addition new approach processes loops in an uniform way as well. Essentially loop exit edges are estimated as "cold" paths relative to back edges and should be considered uniformly with other coldness/hotness markers.
Reviewed By: yrouban
Differential Revision: https://reviews.llvm.org/D79485
Constant hoisting may hide the constant value behind bitcast for And's
operand. Track down the constant to make the BFI result consistent
regardless of hoisting.
Differential Revision: https://reviews.llvm.org/D91450
This patch teaches the jump threading pass to call BPI->eraseBlock
when it folds a conditional branch.
Without this patch, BranchProbabilityInfo could end up with stale edge
probabilities for the basic block containing the conditional branch --
one edge probability with less than 1.0 and the other for a removed
edge.
This patch is one of the steps before we can safely re-apply D91017.
Differential Revision: https://reviews.llvm.org/D91511