- **Reapply "[BOLT] Add --pad-funcs-before=func:n (#117924)"**
- **[BOLT] Fix --pad-funcs{,-before} state misinteraction**
When --pad-funcs-before was introduced, it introduced a bug whereby the
first one to get parsed could influence the other.
Ensure that each has its own state and test that they don't interact in
this manner by testing how the `_subsequent` symbol moves when both
arguments are supplied with different padding values.
Fixed by having a function (and static state) for each of before/after.
14dcf8214f9c66172d17c1cfaec6aec0030748e0 introduced a subtle bug with
the static `FunctionPadding` map.
If either `opts::FunctionPadSpec` or `opts::FunctionPadBeforeSpec` are set,
the map is going to be populated with the respective spec in the first
invocation of `BinaryEmitter::emitFunction`. The subsequent invocations
will pick up the padding from the map irrespective of whether
`opts::FunctionPadSpec` or `opts::FunctionPadBeforeSpec` is passed as a
parameter.
This breaks an internal test, hence reverting the patch.
This change extracts the comparator for sorting functions by index into
a helper function `compareBinaryFunctionByIndex()`
Not sure why the comparator used in
`BinaryContext::getSortedFunctions()` is not same as the other two
places. I think they should use the same comparator, so I also change
`BinaryContext::getSortedFunctions()` to use
`compareBinaryFunctionByIndex()` for sorting functions.
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.
In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.
Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch continue the migration
on libCore, libRewrite and libPasses to use the new BOLTError
class whenever a failure occurs.
Test Plan: NFC
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we add a new class
BOLTError and auxiliary functions `createFatalBOLTError()` and
`createNonFatalBOLTError()` that allow BOLT code to bubble up the
problem to the caller by using the Error class as a return
type (or Expected). Also changes passes to use these.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we change the
interface to `BinaryFunctionPass` to return an Error on
`runOnFunctions()`. This gives passes the ability to report a
serious problem to the caller (RewriteInstance class), so the
caller may decide how to best handle the exceptional situation.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
A new function sorting algorithm (cdsort) in LLVM is an optimized
version of BOLT's hfsort+. In order to avoid code duplication and
simplify maintenance, getting rid of hfsort+.
Perf-wise this is likely a neutral change, though differences on
individual benchmarks are possible, since the generated function layout
has changed. I tested cdsort vs hfsort+ on a number of open-source and
prod binaries built in different modes and record an average neutral
perf difference, perhaps with more "green" counters.
This commit establishes the general structure of the CDSplit strategy in
SplitFunctions without incorporating the exact splitting logic. With
-split-functions -split-strategy=cdsplit, the SplitFunctions pass will
run twice: the first time is before function reordering and functions
are hot-cold split; the second time is after function reordering and
functions are hot-warm-cold split based on the fixed function ordering.
Currently, all functions are hot-warm split after the entry block in the
second splitting pass. Subsequent commits will introduce the precise
splitting logic. NFC.
Unify naming for the layout algorithms by renaming "cds" to "cdsort".
This is
NFC unless someone is already using the new algorithm (which is
unlikely).
* Place types and functions in the llvm::codelayout namespace
* Change EdgeCountT from pair<pair<uint64_t, uint64_t>, uint64_t> to a struct and utilize structured bindings.
It is not conventional to use the "T" suffix for structure types.
* Remove a redundant copy in ChainT::merge.
* Change {ExtTSPImpl,CDSortImpl}::run to use return value instead of an output parameter
* Rename applyCDSLayout to computeCacheDirectedLayout: (a) avoid rare
abbreviation "CDS" (cache-directed sort) (b) "compute" is more conventional
for the specific use case
* Change the parameter types from std::vector to ArrayRef so that
SmallVector arguments can be used.
* Similarly, rename applyExtTspLayout to computeExtTspLayout.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D159526
I noticed that `-reorder-functions=exec-count` doesn't work as expected due to
a bug in the comparison function (which isn't symmetric). It is questionable
whether anyone would want to ever use the sorting method (as sorting by say
density is much better in all cases) but it is probably better to fix the bug.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D152959
This is a new algorithm for function layout (reordering) based on the call graph
extracted from a profile data; see diffs down the stack for more details.
This layout is very similar to the existing hfsort+, but perhaps a little better
on some benchmarks. The goals of the change is as follows:
(i) rename and replace hfsort+ with a newer (hopefully better) implementation.
I'd prefer to keep both algs together for some time to simplify evaluation and
transition, but do want to remove hfsort+ once we're confident that there are
no regressions.
(ii) unify the implementation of code layout algorithms across LLVM. Currently
Passes/HfsortPlus.cpp and Utils/CodeLayout.cpp share many implementation-specific
details; this diff unifies the code.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D153039
Allow partial name matching wrt LTO suffixes in `function-order`
user-supplied function list, the same as permitted by profile matching.
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D142269
Break out stats-printing code from ReorderFunctions::reorder for brevity.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D142250
Use isSplit() instead of isCold() when building the call graph and
update parameter names to reflect this.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132047
Some of the passes that calculates tentative layout like LongJmp and
Golang are expecting that only functions with valid index will be
located in hot text section. But currently functions with valid profiles
and not set index are breaking this logic, to fix this we can move the
hasValidProfile() condition from AssignSections pass to ReorderFunctions.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D127223
Summary:
Refactor bolt/*/Passes to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).
(cherry picked from FBD33344642)
Summary:
Moves source files into separate components, and make explicit
component dependency on each other, so LLVM build system knows how to
build BOLT in BUILD_SHARED_LIBS=ON.
Please use the -c merge.renamelimit=230 git option when rebasing your
work on top of this change.
To achieve this, we create a new library to hold core IR files (most
classes beginning with Binary in their names), a new library to hold
Utils, some command line options shared across both RewriteInstance
and core IR files, a new library called Rewrite to hold most classes
concerned with running top-level functions coordinating the binary
rewriting process, and a new library called Profile to hold classes
dealing with profile reading and writing.
To remove the dependency from BinaryContext into X86-specific classes,
we do some refactoring on the BinaryContext constructor to receive a
reference to the specific backend directly from RewriteInstance. Then,
the dependency on X86 or AArch64-specific classes is transfered to the
Rewrite library. We can't have the Core library depend on targets
because targets depend on Core (which would create a cycle).
Files implementing the entry point of a tool are transferred to the
tools/ folder. All header files are transferred to the include/
folder. The src/ folder was renamed to lib/.
(cherry picked from FBD32746834)