252 Commits

Author SHA1 Message Date
Amir Ayupov
1529ec085a
[BOLT][NFC] Move out PrintProgramStats from Profile into Rewrite (#93075)
Eliminate the dependence of Profile on Passes.

Test Plan: NFC
2024-05-22 13:53:41 -07:00
shaw young
96378b3da8
[BOLT] Add NamedRegionTimer to inferStaleProfile (#93078) 2024-05-22 11:04:12 -07:00
Amir Ayupov
935b946b1f
[BOLT] Process cross references between ignored functions in BAT mode (#92484)
To align YAML and fdata profiles produced in BAT mode, lift two
restrictions applied in non-relocation mode when BAT is present:
1) register secondary entry points from ignored functions,
2) treat functions with secondary entry points as simple.

This allows constructing CFG for non-simple functions in non-relocation
mode and emitting YAML profile for them, which can then be used for
optimizations in relocation mode.

Test Plan: added test ignored-interprocedural-reference.s
2024-05-21 20:22:12 -07:00
Amir Ayupov
32c9d5ef4f Revert "[BOLT] Add NamedRegionTimer to inferStaleProfile (#92621)"
This reverts commit 9f2313829fd210f9923375e93bc11fe9685c26d5.

Creates a dependency cycle: lib/Rewrite depends on lib/Profile.
2024-05-21 13:55:32 -07:00
shaw young
9f2313829f
[BOLT] Add NamedRegionTimer to inferStaleProfile (#92621) 2024-05-21 13:26:57 -07:00
Amir Ayupov
bb627b0a0c
[BOLT] Ignore special symbols as function aliases in updateELFSymbolTable
Exempt special symbols (hot text/data and _end symbol) from normal
handling. We only need to set their value and make them absolute.

If these symbols are handled as normal symbols and if they alias
functions we may create non-sensical symbols, e.g. __hot_start.cold.

Test Plan: updated hot-end-symbol.s

Reviewers: maksfb, rafaelauler, ayermolo, dcci

Reviewed By: dcci, maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/92713
2024-05-20 16:55:11 -07:00
Maksim Panchenko
9cd218e427
[BOLT] Refactor BOLT reserved space discovery (#90893)
Move code that checks for __bolt_reserved_{start,end} into a new
discoverBOLTReserved() function and call it from discoverFileObjects()
so that the reserved space info is accessible to passes. NFC for the
current set of binaries.
2024-05-02 13:17:29 -07:00
Maksim Panchenko
ad7ee900c7
[BOLT][NFC] Add BOLTReserved to BinaryContext (#90766)
Use BOLTReserved to track binary space preallocated for BOLT.
2024-05-01 18:22:38 -07:00
Maksim Panchenko
49bb993959
[BOLT] Fix build-time assertion in RewriteInstance (#90540)
We use pwrite() in RewriteInstance to update contents of existing
sections. pwrite() requires file position to be set past the written
offset which we guarantee at the start of rewriteFile(). Then we had an
implicit assumption in patchBuildID() that the file position will be set
again in patchELFSymTabs() after being reset in patchELFPHDRTable().
That assumption was broken in #90300. The fix is to save and restore
file position in patchELFPHDRTable(). Then we don't have to update it
again in patchELFSymTabs().
2024-04-30 10:51:08 -07:00
Amir Ayupov
c4c4e17c99
[BOLT] Use heuristic for matching split local functions (#90424)
Use known order of BOLT split function symbols: fragment symbols
immediately precede the parent fragment symbol.

Depends On: https://github.com/llvm/llvm-project/pull/89648

Test Plan: Added register-fragments-bolt-symbols.s
2024-04-29 16:18:13 -07:00
Maksim Panchenko
3a0d894faf
[BOLT] Add support for BOLT-reserved space in a binary (#90300)
Allow the user to allocate space in a binary that could be used by BOLT
for allocating new sections. The reservation is specified by two special
symbols recognizable by BOLT: __bolt_reserved_{start,end}.

The reserved space will be useful for optimizing the Linux kernel where
we cannot allocate a new executable segment. However, the support is not
limited to kernel binaries as some user-space application may find it
useful too.
2024-04-29 14:44:04 -07:00
Amir Ayupov
a1e9608b0f
[BOLT] Use symbol table info in registerFragment (#89648)
Fragment matching relies on symbol names to identify and register split
function fragments. However, as split fragments are often local symbols,
name aliasing is possible. For such cases, use symbol table to resolve
ambiguities.

This requires the presence of FILE symbols in the input binary. As BOLT
requires non-stripped binary, this is a reasonable assumption. Note that
`strip -g` removes FILE symbols by default, but `--keep-file-symbols`
can be used to preserve them.

Depends on: https://github.com/llvm/llvm-project/pull/89861

Test Plan:
Updated X86/fragment-lite.s
2024-04-29 11:14:31 -07:00
Maksim Panchenko
3ec858bc5d
[BOLT] Refactor patchELFPHDRTable() (#90290)
Mostly NFC accept for one assertion that was converted into an error.
2024-04-26 16:29:42 -07:00
Maksim Panchenko
12d322db46
[BOLT][NFC] Use getEHFrameHdrSectionName() (#90257)
Reference section name via wrapper.
2024-04-26 14:13:23 -07:00
Fangrui Song
e982032199
[BOLT,RISCV] Remove empty name special case from #68977
The special case is unneeded after #89693.

Pull Request: https://github.com/llvm/llvm-project/pull/90004
2024-04-25 20:42:40 -07:00
Amir Ayupov
090c92e015
[BOLT] Emit synthetic FILE symbol for local cold fragments of global symbols (#89794) 2024-04-25 04:53:15 +02:00
Maksim Panchenko
418e4b0c4f
[BOLT] Detect incorrect update of dynamic relocations (#89681)
When we rewrite dynamic relocations, there could be cases where they
reference code locations inside functions that were rewritten. When this
happens, we need to precisely map old address to a new one. Until we can
reliably perform the mapping, detect such condition and issue an error
refusing to write a broken binary.
2024-04-24 14:03:33 -07:00
Maksim Panchenko
0af8caeb2f
[BOLT][NFC] Remove another unused function (#89011)
RewriteInstance::isKSymtabSection() is deprecated.
2024-04-16 17:58:47 -07:00
Nathan Sidwell
bd7b170e97
[BOLT][NFC] Remove extraneous braces (#88620)
A small cleanup -- no braces needed here.
2024-04-15 13:12:53 -04:00
Nathan Sidwell
603fa4c6b9
[BOLT][NFC] Be more obvious about selecting X86 (#88527)
Use `isX86()` rather than `!isAArch64() && !isRISCV()`, and similar.
2024-04-15 13:11:29 -04:00
Nathan Sidwell
4dd20b0728
[BOLT][NFC] Refactor relocation loop (#88424)
Use the std `if () continue;` idiom before falling into the
processing.
2024-04-12 11:35:23 -04:00
Nathan Sidwell
6ec467297d
[BOLT][NFC] Adjust misleading comment & formatting (#88409)
This originally dealt with tbss, but now handles any bss-like section.
So the comment is inaccurate. Also, the `{}` on the messaging seem
unnecessary.
2024-04-12 08:34:43 -04:00
Nathan Sidwell
5bed6afc21
[BOLT][NFC] Remove unneeded if (#88322)
No need need to special-case zero. Section 0 will map to section 0.
2024-04-11 14:44:11 -04:00
Nathan Sidwell
364963a0a3
[BOLT][NFC] Do not assume text section name in more places (#88303)
Fixes a couple more places where ".text" is presumed for the main
code section name.
2024-04-11 06:29:51 -04:00
Amir Ayupov
c0febca3a6
[BOLT][NFC] Refactor BC::createBinaryContext for #81346 (#87172) 2024-03-30 20:43:23 -07:00
Maksim Panchenko
7de82ca369
[BOLT] Don't terminate on trap instruction for Linux kernel (#87021)
Under normal circumstances, we terminate basic blocks on a trap
instruction. However, Linux kernel may resume execution after hitting a
trap (ud2 on x86). Thus, we introduce "--terminal-trap" option that will
specify if the trap instruction should terminate the control flow. The
option is on by default except for the Linux kernel mode when it's off.
2024-03-29 16:41:15 -07:00
Maksim Panchenko
51268a57fd
[BOLT] Enable --keep-nops option for Linux kernel by default (#86349)
Preserve nop instructions in the Linux kernel since they could be used
for runtime patching.
2024-03-22 15:29:26 -07:00
Amir Ayupov
6280681137
[BOLT] Output basic YAML profile in BAT mode
Relax assumptions that YAML output is not supported in BAT mode.
Set up basic infrastructure for emitting YAML for functions not covered
by BAT, such as from `.bolt.org.text` section (code identical to input binary
sans external refs), or non-rewritten functions in non-relocation mode (where
the function stays in the same section but BAT mapping is not emitted).

This diff only produces YAML profile for non-BAT functions (skipped,
non-simple). YAML profile for BAT functions is added in follow-up diffs:
- https://github.com/llvm/llvm-project/pull/76911 emits YAML profile with
  internal control flow information only (branch profile),
- https://github.com/llvm/llvm-project/pull/76896 adds cross-function profile
  (calls profile).

Test Plan: Added bolt/test/X86/bolt-address-translation-yaml.test

Reviewers: ayermolo, dcci, maksfb, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76910
2024-03-21 14:32:13 -07:00
Maksim Panchenko
5daf2001a1
[BOLT] Fix memory leak in BinarySection (#82520)
The change in #80950 exposed a memory leak in BinarySection. Let
BinarySection manage memory passed via updateContents() unless a valid
SectionID is set indicating that the contents are managed by JITLink.
2024-02-21 11:54:34 -08:00
Amir Ayupov
d2c9a19dd8
[BOLT][NFC] Pass BF/BB hashes to BAT
Test Plan: NFC

Reviewers: dcci, rafaelauler, maksfb, ayermolo

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76906
2024-02-15 12:49:43 -08:00
Amir Ayupov
52cf07116b
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.

In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.

Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:53:53 -08:00
Amir Ayupov
13d60ce2f2
[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch continue the migration
on libCore, libRewrite and libPasses to use the new BOLTError
class whenever a failure occurs.

Test Plan: NFC

Co-authored-by: Rafael Auler <rafaelauler@fb.com>
2024-02-12 14:51:15 -08:00
Maksim Panchenko
7fe97f0420
[BOLT] Always run CheckLargeFunctions in non-relocation mode (#80922)
We run CheckLargeFunctions pass in non-relocation mode to prevent the
emission of functions that later could not be written to the output due
to their large size. The main reason behind the pass is to prevent the
emission of metadata for such functions since this metadata becomes
incorrect if the function is left unmodified.

Currently, the pass is enabled in non-relocation mode only when debug
info output is also enabled. As we emit increasingly more kinds of
metadata, e.g. for the Linux kernel, it becomes more challenging to
track metadata that needs to be fixed. Hence, I'm enabling the pass to
always run in non-relocation mode.
2024-02-08 14:21:49 -08:00
Job Noorman
e7c0e59bbc
[BOLT] Fix crash for relocs in data sections against ABS symbols (#76026)
Fixes #75771
2024-02-07 07:53:02 +00:00
Maksim Panchenko
a693ae5306
[BOLT] Enable re-writing of Linux kernel binary (#80228)
Write modified Linux kernel binary to disk. The output is not supposed
to be functional at the moment, but it will allow for future patches to
test the output binary.
2024-02-01 12:11:26 -08:00
Maksim Panchenko
116e801a15
[BOLT] Adjust section sizes based on file offsets (#80226)
When we adjust section sizes while rewriting a binary, we should be
using section offsets and not addresses to determine if section overlap.
NFC for existing binaries.
2024-02-01 12:08:41 -08:00
Amir Ayupov
bed3608c22
[BOLT][NFC] Factor out RI::disassemblePLTInstruction (#80302) 2024-02-01 08:26:21 -08:00
Maksim Panchenko
2abcbbd96a
[BOLT] Detect Linux kernel based on ELF program headers (#80086)
Check if program header addresses fall into the kernel space to detect a
Linux kernel binary on x86-64.

Delete opts::LinuxKernelMode and use BinaryContext::IsLinuxKernel
instead.
2024-01-30 18:04:29 -08:00
Maksim Panchenko
aa1968c2eb
[BOLT] Add metadata pre-emit finalization interface (#79925)
Some metadata needs to be updated/finalized before the binary context is
emitted into the binary. Add the interface and use it for Linux ORC
update invocation.
2024-01-29 17:27:33 -08:00
Amir Ayupov
2bb511e277
[BOLT][NFC] Print BAT section size (#76897)
Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-11 11:04:04 -08:00
Kazu Hirata
ad8fd5b185 [BOLT] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 23:34:49 -08:00
Nathan Sidwell
9596676e65
[BOLT] Determine address size from binary (#74870)
Query the executable for address size.
2023-12-09 14:39:57 -05:00
ShatianWang
c43d0432ef
[BOLT] Create .text.warm for 3-way splitting (#73863)
This commit explicitly adds a warm code section, .text.warm, when
-split-functions -split-strategy=cdsplit is used. This replaces the
previous approach of using .text.cold.0 as warm and .text.cold.1 as cold
in 3-way function splitting. NFC.
2023-11-29 22:42:36 -05:00
Vladislav Khmelevsky
c5a306f07e
[BOLT] Fix LSDA section handling (#71821)
Currently BOLT finds LSDA secition by it's name .gcc_except_table.main .
But sometimes it might have suffix e.g. .gcc_except_table.main. Find
LSDA section by it's address, rather by it's name.
Fixes #71804
2023-11-15 23:21:50 +04:00
Maksim Panchenko
2db9b6a93f
[BOLT] Make instruction size a first-class annotation (#72167)
When NOP instructions are used to reserve space in the code, e.g. for
patching, it becomes critical to preserve their original size while
emitting the code. On x86, we rely on "Size" annotation for NOP
instructions size, as the original instruction size is lost in the
disassembly/assembly process.

This change makes instruction size a first-class annotation and is
affectively NFCI. A follow-up diff will use the annotation for code
emission.
2023-11-13 14:33:39 -08:00
Vladislav Khmelevsky
cf18f142c0
[BOLT] Read .rela.dyn in static non-pie binary (#71635)
Static non-pie binary doesn't have DYNAMIC segment and BOLT skips
reading .rela.dyn section because of it. But such binaries might have
this section for example to store IFUNC relocation which is resolved
by linked-in startup files, so force reading this section for static
executables.
2023-11-10 11:47:12 +04:00
spaette
1a2f83366b
[BOLT] Fix typos (#68121)
Closes https://github.com/llvm/llvm-project/issues/63097

Before merging please make sure the change to
bolt/include/bolt/Passes/StokeInfo.h is correct.

bolt/include/bolt/Passes/StokeInfo.h

```diff
  //  This Pass solves the two major problems to use the Stoke program without
- //  proting its code:
+ //  probing its code:
```

I'm still not happy about the awkward wording in this comment.

bolt/include/bolt/Passes/FixRelaxationPass.h

```
$ ed -s bolt/include/bolt/Passes/FixRelaxationPass.h <<<'9,12p'
// This file declares the FixRelaxations class, which locates instructions with
// wrong targets and fixes them. Such problems usually occures when linker
// relaxes (changes) instructions, but doesn't fix relocations types properly
// for them.
$
```


bolt/docs/doxygen.cfg.in
bolt/include/bolt/Core/BinaryContext.h
bolt/include/bolt/Core/BinaryFunction.h
bolt/include/bolt/Core/BinarySection.h
bolt/include/bolt/Core/DebugData.h
bolt/include/bolt/Core/DynoStats.h
bolt/include/bolt/Core/Exceptions.h
bolt/include/bolt/Core/MCPlusBuilder.h
bolt/include/bolt/Core/Relocation.h
bolt/include/bolt/Passes/FixRelaxationPass.h
bolt/include/bolt/Passes/InstrumentationSummary.h
bolt/include/bolt/Passes/ReorderAlgorithm.h
bolt/include/bolt/Passes/StackReachingUses.h
bolt/include/bolt/Passes/StokeInfo.h
bolt/include/bolt/Passes/TailDuplication.h
bolt/include/bolt/Profile/DataAggregator.h
bolt/include/bolt/Profile/DataReader.h
bolt/lib/Core/BinaryContext.cpp
bolt/lib/Core/BinarySection.cpp
bolt/lib/Core/DebugData.cpp
bolt/lib/Core/DynoStats.cpp
bolt/lib/Core/Relocation.cpp
bolt/lib/Passes/Instrumentation.cpp
bolt/lib/Passes/JTFootprintReduction.cpp
bolt/lib/Passes/ReorderData.cpp
bolt/lib/Passes/RetpolineInsertion.cpp
bolt/lib/Passes/ShrinkWrapping.cpp
bolt/lib/Passes/TailDuplication.cpp
bolt/lib/Rewrite/BoltDiff.cpp
bolt/lib/Rewrite/DWARFRewriter.cpp
bolt/lib/Rewrite/RewriteInstance.cpp
bolt/lib/Utils/CommandLineOpts.cpp
bolt/runtime/instr.cpp
bolt/test/AArch64/got-ld64-relaxation.test
bolt/test/AArch64/unmarked-data.test
bolt/test/X86/Inputs/dwarf5-cu-no-debug-addr-helper.s
bolt/test/X86/Inputs/linenumber.cpp
bolt/test/X86/double-jump.test
bolt/test/X86/dwarf5-call-pc-function-null-check.test
bolt/test/X86/dwarf5-split-dwarf4-monolithic.test
bolt/test/X86/dynrelocs.s
bolt/test/X86/fallthrough-to-noop.test
bolt/test/X86/tail-duplication-cache.s
bolt/test/runtime/X86/instrumentation-ind-calls.s
2023-11-09 11:29:46 -08:00
Job Noorman
96b5e092dc
[BOLT] Support instrumentation hook via DT_FINI_ARRAY (#67348)
BOLT currently hooks its its instrumentation finalization function via
`DT_FINI`. However, this method of calling finalization routines is not
supported anymore on newer ABIs like RISC-V. `DT_FINI_ARRAY` is
preferred there.

This patch adds support for hooking into `DT_FINI_ARRAY` instead if the
binary does not have a `DT_FINI` entry. If it does, `DT_FINI` takes
precedence so this patch should not change how the currently supported
instrumentation targets behave.

`DT_FINI_ARRAY` points to an array in memory of `DT_FINI_ARRAYSZ` bytes.
It consists of pointer-length entries that contain the addresses of
finalization functions. However, the addresses are only filled-in by the
dynamic linker at load time using relative relocations. This makes
hooking via `DT_FINI_ARRAY` a bit more complicated than via `DT_FINI`.

The implementation works as follows:
- While scanning the binary: find the section where `DT_FINI_ARRAY`
points to, read its first dynamic relocation and use its addend to find
the address of the fini function we will use to hook;
- While writing the output file: overwrite the addend of the dynamic
relocation with the address of the runtime library's fini function.

Updating the dynamic relocation required a bit of boiler plate: since
dynamic relocations are stored in a `std::multiset` which doesn't
support getting mutable references to its items, functions were added to
`BinarySection` to take an existing relocation and insert a new one.
2023-11-08 11:01:10 +00:00
Vladislav Khmelevsky
e2f1a95f2a
[BOLT][AArch64] Handle IFUNCS properly (#71104)
Currently we were testing only the binaries compiled with O0, which
results in indirect call to the IFUNC trampoline and the trampoline has
associated IFUNC symbol with it. Compile with O3 results in direct
calling the IFUNC trampoline and no symbols are associated with it, the
IFUNC symbol address becomes the same as IFUNC resolver address. Since
no symbol was associated the BF was not created before PLT analyze and
be the algorithm we're going to analyze target relocation. As we're
expecting the JUMP relocation we're also expecting the associated symbol
with it to be presented. But for IFUNC relocation the IRELATIVE
relocation is used and no symbol is associated with it, the addend value
is pointing on the target symbol, so we need to find BF using it and use
it's symbol in this situation. Currently this is checked only for
AArch64 platform, so I've limited it in code to use this logic only for
this platform, although I wouldn't be surprised if other platforms needs
to activate this logic too.
2023-11-08 11:41:43 +04:00
maksfb
8244ff6739
[BOLT] Fix incorrect basic block output addresses (#70000)
Some optimization passes may duplicate basic blocks and assign the same
input offset to a number of different blocks in a function. This is done
e.g. to correctly map debugging ranges for duplicated code.

However, duplicate input offsets present a problem when we use
AddressMap to generate new addresses for basic blocks. The output
address is calculated based on the input offset and will be the same for
blocks with identical offsets. The result is potentially incorrect debug
info and BAT records.

To address the issue, we have to eliminate the dependency on input
offsets while generating output addresses for a basic block. Each block
has a unique label, hence we extend AddressMap to include address lookup
based on MCSymbol and use the new functionality to update block
addresses.
2023-10-24 12:22:43 -07:00