Allow the user to allocate space in a binary that could be used by BOLT
for allocating new sections. The reservation is specified by two special
symbols recognizable by BOLT: __bolt_reserved_{start,end}.
The reserved space will be useful for optimizing the Linux kernel where
we cannot allocate a new executable segment. However, the support is not
limited to kernel binaries as some user-space application may find it
useful too.
Fragment matching relies on symbol names to identify and register split
function fragments. However, as split fragments are often local symbols,
name aliasing is possible. For such cases, use symbol table to resolve
ambiguities.
This requires the presence of FILE symbols in the input binary. As BOLT
requires non-stripped binary, this is a reasonable assumption. Note that
`strip -g` removes FILE symbols by default, but `--keep-file-symbols`
can be used to preserve them.
Depends on: https://github.com/llvm/llvm-project/pull/89861
Test Plan:
Updated X86/fragment-lite.s
When we rewrite dynamic relocations, there could be cases where they
reference code locations inside functions that were rewritten. When this
happens, we need to precisely map old address to a new one. Until we can
reliably perform the mapping, detect such condition and issue an error
refusing to write a broken binary.
Under normal circumstances, we terminate basic blocks on a trap
instruction. However, Linux kernel may resume execution after hitting a
trap (ud2 on x86). Thus, we introduce "--terminal-trap" option that will
specify if the trap instruction should terminate the control flow. The
option is on by default except for the Linux kernel mode when it's off.
Update instruction locations in the __bug_table section after new code
is emitted. If an instruction with associated bug ID was deleted,
overwrite its location with zero.
For DWARF5 BOLT was not retreiving address and instead was setting an
index.
Changed so that an address is used, and added DWARF4 test because it was
missing.
Relax assumptions that YAML output is not supported in BAT mode.
Set up basic infrastructure for emitting YAML for functions not covered
by BAT, such as from `.bolt.org.text` section (code identical to input binary
sans external refs), or non-rewritten functions in non-relocation mode (where
the function stays in the same section but BAT mapping is not emitted).
This diff only produces YAML profile for non-BAT functions (skipped,
non-simple). YAML profile for BAT functions is added in follow-up diffs:
- https://github.com/llvm/llvm-project/pull/76911 emits YAML profile with
internal control flow information only (branch profile),
- https://github.com/llvm/llvm-project/pull/76896 adds cross-function profile
(calls profile).
Test Plan: Added bolt/test/X86/bolt-address-translation-yaml.test
Reviewers: ayermolo, dcci, maksfb, rafaelauler
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/76910
Runtime code modification used by static keys is the most ubiquitous
self-modifying feature of the Linux kernel. The idea is to to eliminate
the condition check and associated conditional jump on a hot path if
that condition (based on a boolean value of a static key) does not
change often. Whenever they condition changes, the kernel runtime
modifies all code paths associated with that key flipping the code
between nop and (unconditional) jump.
commit 43a2ec483fe08064b53a6293682e9bab97df61a0
Author: Jonas Devlieghere <jonas@devlieghere.com>
Date: Tue Mar 19 08:30:47 2024 -0700
removed parameter Translator from the constructor of DwarfStreamer.
This patch fixes the build by updating the constructor of DIEStreamer
accordingly.
.pci_fixup section contains a table with entries allowing to invoke a
fixup hook whenever a problem is encountered with a PCI device. The
hookup code typically points to the start of a function. As we are not
relocating functions in the kernel (at least not yet), verify this
assumption while reading the table and ignore any functions with a fixup
code in the middle.
Read .altinstructions and annotate instructions that have alternative
sequences with "AltInst" annotation. Note that some instructions may
have more than one alternatives, in which case they will have multiple
annotations in the form "AltInst", "AltInst2", "AltInst3", etc.
Read Linux exception table and ignore functions with exceptions for now.
Proper support requires an introduction of new control flow since some
instructions with memory access can cause a control flow change.
Hence looking at disassembly or CFG with exceptions annotations is
valuable for code analysis, delay marking functions with exceptions as
non-simple until immediately before emitting the code.
To avoid accidentally setting the label twice for the same instruction,
which can lead to a "lost" label, introduce getOrSetInstLabel()
function. Rename existing functions to getInstLabel()/setInstLabel() to
make it explicit that they operate on instruction labels. Add an
assertion in setInstLabel() that the instruction did not have a prior
label set.
DWARF5 spec supports the .debug_names acceleration table. This is the
formalized version of combination of gdb-index/pubnames/types. Added
implementation of it to BOLT. It supports both monolothic and split
dwarf, with and without Type Units. It does not include parent indices.
This will be in followup PR. Unlike LLVM output this will put all the
CUs and TUs into one Module.
The change in #80950 exposed a memory leak in BinarySection. Let
BinarySection manage memory passed via updateContents() unless a valid
SectionID is set indicating that the contents are managed by JITLink.
Static calls are calls that are getting patched during runtime. Hence,
for every such call the kernel runtime needs the location of the call or
jmp instruction that will be patched. Instruction locations together
with a corresponding key are stored in the static call site table. As
BOLT rewrites these instructions it needs to update the table.
Update ORC information based on the new code layout and emit
corresponding ORC sections for the Linux kernel.
We rewrite ORC sections in place, which puts a limit on the size of new
section contents. Since ORC info changes for the new code layout and the
number of ORC entries can become larger, we free up space in the tables
by removing redundant ORC terminators. As a result, we effectively emit
fewer entries and have to add duplicate terminators at the end to match
the original section sizes. Ideally, we need to update ORC boundaries to
reflect the reduced size and optimize runtime lookup, but we will need
relocations for this, and the benefits will be marginal, if any.
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.
In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.
Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch continue the migration
on libCore, libRewrite and libPasses to use the new BOLTError
class whenever a failure occurs.
Test Plan: NFC
Co-authored-by: Rafael Auler <rafaelauler@fb.com>