35 Commits

Author SHA1 Message Date
Maksim Panchenko
92301180f7
[BOLT] Use compact EH format for fixed-address executables (#117274)
Use ULEB128 format for emitting LSDAs for fixed-address executables,
similar to what we use for PIEs/DSOs. Main difference is that we don't
use landing pad trampolines when landing pads are not contained in a
single fragment. Instead, we fallback to emitting larger fixed-address
LSDAs, which is still better than adding trampoline instructions.
2024-11-22 00:28:55 -08:00
Maksim Panchenko
105ecd8bb2
[BOLT] Avoid EH trampolines for PIEs/DSOs (#117106)
We used to emit EH trampolines for PIE/DSO whenever a function fragment
contained a landing pad outside of it. However, it is common to have all
landing pads in a cold fragment even when their throwers are in a hot
one.

To reduce the number of trampolines, analyze landing pads for any given
function fragment, and if they all belong to the same (possibly
different) fragment, designate that fragment as a landing pad fragment
for the "thrower" fragment. Later, emit landing pad fragment symbol as
an LPStart for the thrower LSDA.
2024-11-21 18:18:30 -08:00
Maksim Panchenko
dd09a7db03
[BOLT] Add split function support for the Linux kernel (#90541)
While rewriting the Linux kernel, we try to fit optimized functions into
their original boundaries. When a function becomes larger, we skip it
during the rewrite and end up with less than optimal code layout. To
overcome that issue, add support for --split-function option so that hot
part of the function could be fit into the original space. The cold part
should go to reserved space in the binary.
2024-05-01 21:45:44 -07:00
Amir Ayupov
fd38366e45
[BOLT][NFC] Clean includes, add license headers (#87200) 2024-03-31 19:29:45 -07:00
Amir Ayupov
52cf07116b
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.

In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.

Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:53:53 -08:00
Amir Ayupov
a5f3d1a803
[BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we change the
interface to `BinaryFunctionPass` to return an Error on
`runOnFunctions()`. This gives passes the ability to report a
serious problem to the caller (RewriteInstance class), so the
caller may decide how to best handle the exceptional situation.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:36:12 -08:00
ShatianWang
1577483413
[BOLT] Don't split likely fallthrough in CDSplit (#76164)
This diff speeds up CDSplit by not considering any hot-warm splitting
point that could break a fall-through branch from a basic block to its
most likely successor.

Co-authored-by: spupyrev <spupyrev@fb.com>
2023-12-21 16:17:10 -05:00
ShatianWang
296088bdf3
[BOLT][NFC] Remove unused code for CDSplit (#74136)
This diff removes JumpInfo related code that is no longer needed by
CDSplit from SplitFunctions.cpp.
2023-12-01 15:21:30 -05:00
ShatianWang
4483cf2d8b
[BOLT] CDSplit main logic part 2/2 (#74032)
This diff implements the main splitting logic of CDSplit. CDSplit
processes functions in a binary in parallel. For each function BF, it
assumes that all other functions are hot-cold split. For each possible
hot-warm split point of BF, it computes its corresponding SplitScore,
and chooses the split point with the best SplitScore. The SplitScore of
each split point is computed in the following way: each call edge or
jump edge has an edge score that is proportional to its execution count,
and inversely proportional to its distance. The SplitScore of a split
point is a sum of edge scores over a fixed set of edges whose distance
can change due to hot-warm splitting BF. This set contains all cover
calls in the form of X->Y or Y->X given function order [... X ... BF ...
Y ...]; we refer to the sum of edge scores over the set of cover calls
as CoverCallScore. This set also contains all jump edges (branches)
within BF as well as all call edges originated from BF; we refer to the
sum of edge scores over this set of edges as LocalScore. CDSplit finds
the split index maximizing CoverCallScore + LocalScore.
2023-11-30 23:17:11 -05:00
ShatianWang
56bbf8135e
[BOLT] CDSplit main logic part 1/2 (#73895)
This diff defines and initializes auxiliary variables used by CDSplit
and implements two important helper functions. The first helper function
approximates the block level size increase if a function is hot-warm
split at a given split index (X86 specific). The second helper function
finds all calls in the form of X->Y or Y->X for each BF given function
order [... X ... BF ... Y ...]. These calls are referred to as "cover
calls". Their distance will decrease if BF's hot fragment size is
further reduced by hot-warm splitting. NFC.
2023-11-30 20:55:36 -05:00
ShatianWang
c43d0432ef
[BOLT] Create .text.warm for 3-way splitting (#73863)
This commit explicitly adds a warm code section, .text.warm, when
-split-functions -split-strategy=cdsplit is used. This replaces the
previous approach of using .text.cold.0 as warm and .text.cold.1 as cold
in 3-way function splitting. NFC.
2023-11-29 22:42:36 -05:00
ShatianWang
076bd22f57
[BOLT] Add structure of CDSplit to SplitFunctions (#73430)
This commit establishes the general structure of the CDSplit strategy in
SplitFunctions without incorporating the exact splitting logic. With
-split-functions -split-strategy=cdsplit, the SplitFunctions pass will
run twice: the first time is before function reordering and functions
are hot-cold split; the second time is after function reordering and
functions are hot-warm-cold split based on the fixed function ordering.
Currently, all functions are hot-warm split after the entry block in the
second splitting pass. Subsequent commits will introduce the precise
splitting logic. NFC.
2023-11-29 15:43:21 -05:00
Amir Ayupov
c30ab9dcff [BOLT][NFC] Log reversing splitting decision
Expose log for testing purposes.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D144674
2023-02-27 15:28:06 -08:00
Amir Ayupov
2563fd63c6 [BOLT][NFC] Use std::optional in MCPlusBuilder
Reviewed By: maksfb, #bolt

Differential Revision: https://reviews.llvm.org/D139260
2022-12-06 14:51:38 -08:00
Fabian Parzefall
3ac46f377a [BOLT] Emit LSDA call sites for all fragments
For exception handling, LSDA call sites have to be emitted for each
fragment individually. With this patch, call sites and respective LSDA
symbols are generated and associated with each fragment of their
function, such that they can be used by the emitter.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D132052
2022-09-08 17:10:29 -07:00
Fabian Parzefall
ae2b4da166 [BOLT] Fragment all blocks (not just outlineable blocks)
To enable split strategies that require view of the entire CFG (e.g. to
estimate cost of path from entry block), with this patch, all blocks of
a function are passed to `SplitStrategy::fragment`. Because this might
move non-outlineable blocks into a split fragment, these blocks are
moved back into the main fragment after fragmenting. This also gives
strategies the option to specify whether empty fragments should be
kept or removed.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D132423
2022-09-08 17:10:13 -07:00
Fabian Parzefall
4fdbe9853c [BOLT] Introduce SplitStrategy ABC
This introduces an abstract base class for splitting strategies to
document the interface a strategy needs to implement, and also to avoid
code bloat of the `splitFunction` method.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D132054
2022-09-08 14:59:18 -07:00
Kazu Hirata
a0c7ca8ad4 [BOLT] Use range-based for loops (NFC)
LLVM Coding Standards discourage for_each unless callable objects
already exist.
2022-09-03 11:17:32 -07:00
Fabian Parzefall
e001a4e489 [BOLT] Insert EH trampolines for multiple fragments
This patch adds exception handling trampolines when a function is split
into more than two fragments. Trampolines are tracked per-fragment, such
that they can be removed if splitting is reversed.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132048
2022-08-18 21:55:08 -07:00
Fabian Parzefall
48ff38ce5d [BOLT] Add randomN split strategy
This adds a strategy to split functions into a random number of
fragments at randomly chosen split points.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D130647
2022-08-18 21:55:07 -07:00
Fabian Parzefall
f428db7a00 [BOLT] Add split all blocks strategy
This adds a function splitting strategy that splits each outlineable
basic block into its own fragment. This is exposed through a new command
line option `--split-strategy`.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D129827
2022-08-18 21:55:07 -07:00
Fabian Parzefall
0f74d191d1 [BOLT] Generate sections for multiple fragments
This patch adds support to generate any number of sections that are
assigned to fragments of functions that are split more than two-way.
With this, a function's *nth* split fragment goes into section
`.text.cold.n`.

This also changes `FunctionLayout::erase` to make sure, that there are
no empty fragments at the end of the function. This sometimes happens
when blocks are erased from the function. To avoid creating symbols
pointing to these fragments, they need to be removed.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D130521
2022-08-18 21:55:06 -07:00
Fabian Parzefall
8477bc6761 [BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's
layout. It also lays the groundwork for splitting functions into
multiple fragments (as opposed to a strict hot/cold split).

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129518
2022-07-16 17:23:24 -07:00
Fabian Parzefall
d55dfeaf32 [BOLT] Replace uses of layout with basic block list
As we are moving towards support for multiple fragments, loops that
iterate over all basic blocks of a function, but do not depend on the
order of basic blocks in the final layout, should iterate over binary
functions directly, rather than the layout.

Eventually, all loops using the layout list should either iterate over
the function, or be aware of multiple layouts. This patch replaces
references to binary function's block layout with the binary function
itself where only little code changes are necessary.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129585
2022-07-14 13:07:05 -07:00
Maksim Panchenko
ed74304506 [BOLT] Fix EH trampoline backout code
When SplitFunctions pass adds a trampoline code for exception landing
pads (limited to shared objects), it may increase the size of the hot
fragment making it larger than the whole function pre-split. When this
happens, the pass reverts the splitting action by restoring the original
block order and marking all blocks hot.

However, if createEHTrampolines() added new blocks to the CFG and
modified invoke instructions, simply restoring the original block layout
will not suffice as the new CFG has more blocks.

For proper backout of the split, modify the original layout by merging
in trampoline blocks immediately before their matching targets. As a
result, the number of blocks increases, but the number of instructions
and the function size remains the same as pre-split.

Add an assertion for the number of blocks when updating a function
layout.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128696
2022-06-29 14:35:57 -07:00
Fabian Parzefall
e341e9f094 [BOLT] Add option to randomize function split point
For test purposes, we want to split functions at a random split point
to be able to test different layouts without relying on the profile.
This patch introduces an option, that randomly chooses a split point
to partition blocks of a function into hot and cold regions.

Reviewed By: Amir, yota9

Differential Revision: https://reviews.llvm.org/D128773
2022-06-29 13:02:05 -07:00
Fabian Parzefall
96f6ec5090 [BOLT] Mark option values of --split-functions deprecated
The SplitFunctions pass does not distinguish between various splitting
modes anymore. This change updates the command line interface to
reflect this behavior by deprecating values passed to the
--split-function option.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128558
2022-06-24 17:01:13 -07:00
Amir Ayupov
d2c8769936 [BOLT][NFC] Use range-based STL wrappers
Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts
accepting ranges.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128154
2022-06-23 22:16:27 -07:00
Maksim Panchenko
f263a66ba0 [BOLT] Split functions with exceptions in shared objects and PIEs
Add functionality to allow splitting code with C++ exceptions in shared
libraries and PIEs. To overcome a limitation in exception ranges format,
for functions with fragments spanning multiple sections, add trampoline
landing pads in the same section as the corresponding throwing range.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127936
2022-06-19 16:48:48 -07:00
Fangrui Song
b92436efcb [bolt] Remove unneeded cl::ZeroOrMore for cl::opt options 2022-06-05 13:29:49 -07:00
Amir Ayupov
f92ab6af35 [BOLT][NFC] Fix braces usage in Passes
Summary:
Refactor bolt/*/Passes to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).

(cherry picked from FBD33344642)
2021-12-28 16:36:17 -08:00
Maksim Panchenko
2f09f445b2 [BOLT][NFC] Fix file-description comments
Summary: Fix comments at the start of source files.

(cherry picked from FBD33274597)
2021-12-21 10:21:41 -08:00
Maksim Panchenko
40c2e0fafe [BOLT][NFC] Reformat with clang-format
Summary: Selectively apply clang-format to BOLT code base.

(cherry picked from FBD33119052)
2021-12-14 16:52:51 -08:00
Maksim Panchenko
ebe51c4d23 [BOLT] Use more ADT data structures for BinaryFunction
Summary:
Switched members of BinaryFunction to ADT where it was possible and
made sense. As a result, the size of BinaryFunction on x86-64 Linux
reduced from 1624 bytes to 1448.

(cherry picked from FBD32981555)
2021-12-08 22:59:09 -08:00
Rafael Auler
a34c753fe7 Rebase: [NFC] Refactor sources to be buildable in shared mode
Summary:
Moves source files into separate components, and make explicit
component dependency on each other, so LLVM build system knows how to
build BOLT in BUILD_SHARED_LIBS=ON.

Please use the -c merge.renamelimit=230 git option when rebasing your
work on top of this change.

To achieve this, we create a new library to hold core IR files (most
classes beginning with Binary in their names), a new library to hold
Utils, some command line options shared across both RewriteInstance
and core IR files, a new library called Rewrite to hold most classes
concerned with running top-level functions coordinating the binary
rewriting process, and a new library called Profile to hold classes
dealing with profile reading and writing.

To remove the dependency from BinaryContext into X86-specific classes,
we do some refactoring on the BinaryContext constructor to receive a
reference to the specific backend directly from RewriteInstance. Then,
the dependency on X86 or AArch64-specific classes is transfered to the
Rewrite library. We can't have the Core library depend on targets
because targets depend on Core (which would create a cycle).

Files implementing the entry point of a tool are transferred to the
tools/ folder. All header files are transferred to the include/
folder. The src/ folder was renamed to lib/.

(cherry picked from FBD32746834)
2021-10-08 11:47:10 -07:00