122 Commits

Author SHA1 Message Date
Jared Wyles
2ccf7ed277
[JITLink] Switch to SymbolStringPtr for Symbol names (#115796)
Use SymbolStringPtr for Symbol names in LinkGraph. This reduces string interning
on the boundary between JITLink and ORC, and allows pointer comparisons (rather
than string comparisons) between Symbol names. This should improve the
performance and readability of code that bridges between JITLink and ORC (e.g.
ObjectLinkingLayer and ObjectLinkingLayer::Plugins).

To enable use of SymbolStringPtr a std::shared_ptr<SymbolStringPool> is added to
LinkGraph and threaded through to its construction sites in LLVM and Bolt. All
LinkGraphs that are to have symbol names compared by pointer equality must point
to the same SymbolStringPool instance, which in ORC sessions should be the pool
attached to the ExecutionSession.
---------

Co-authored-by: Lang Hames <lhames@gmail.com>
2024-12-06 10:22:09 +11:00
Enna1
4d2bc0adc6
[BOLT] Extract comparator for sorting functions by index into helper function (#116217)
This change extracts the comparator for sorting functions by index into
a helper function `compareBinaryFunctionByIndex()`

Not sure why the comparator used in
`BinaryContext::getSortedFunctions()` is not same as the other two
places. I think they should use the same comparator, so I also change
`BinaryContext::getSortedFunctions()` to use
`compareBinaryFunctionByIndex()` for sorting functions.
2024-11-27 09:01:12 +08:00
sinan
c3bbc3a57d
[BOLT] Fix logs with no hex convension (#112650)
Add `utohexstr` to ensure that offsets/addresses are correctly formatted
as hexadecimal values.
2024-10-18 09:46:41 +08:00
Kristof Beyls
6d216fb7b8
[perf2bolt] Improve heuristic to map in-process addresses to specific… (#109397)
… segments in Elf binary.

The heuristic is improved by also taking into account that only
executable segments should contain instructions.

Fixes #109384.
2024-09-23 15:14:51 +02:00
Davide Italiano
e49549ff19 Revert "[BOLT] Abort on out-of-section symbols in GOT (#100801)"
This reverts commit a4900f0d936f0e86bbd04bd9de4291e1795f1768.
2024-08-07 20:52:19 -07:00
Sayhaan Siddiqui
62e894e0d7
[BOLT][DWARF][NFC] Move Arch assignment out of createBinaryContext (#102054)
Moves the assignment of Arch out of createBinaryContext to prevent data
races when parallelized.
2024-08-07 16:55:39 +00:00
Vladislav Khmelevsky
a4900f0d93
[BOLT] Abort on out-of-section symbols in GOT (#100801)
This patch aborts BOLT execution if it finds out-of-section (section
end) symbol in GOT table. In order to handle such situations properly in
future, we would need to have an arch-dependent way to analyze
relocations or its sequences, e.g., for ARM it would probably be ADRP +
LDR analysis in order to get GOT entry address. Currently, it is also
challenging because GOT-related relocation symbols are replaced to
__BOLT_got_zero. Anyway, it seems to be quite a rare case, which seems
to be only? related to static binaries. For the most part, it seems that
it should be handled on the linker stage, since static binary should not
have GOT table at all. LLD linker with relaxations enabled would replace
instruction addresses from GOT directly to target symbols, which
eliminates the problem.

Anyway, in order to achieve detection of such cases, this patch fixes a
few things in BOLT:
1. For the end symbols, we're now using the section provided by ELF
binary. Previously it would be tied with a wrong section found by symbol
address.
2. The end symbols would have limited registration we would only
add them in name->data GlobalSymbols map, since using address->data
BinaryDataMap map would likely be impossible due to address duality of
such symbols.
3. The outdated BD->getSection (currently returning refence, not
pointer) check in postProcessSymbolTable is replaced by getSize check in
order to allow zero-sized top-level symbols if they are located in
zero-sized sections. For the most part, such things could only be found
in tests, but I don't see a reason not to handle such cases.
4. Updated section-end-sym test and removed x86_64 requirement since
there is no reason for this (tested on aarch64 linux)

The test was provided by peterwaller-arm (thank you) in #100096 and
slightly modified by me.
2024-08-07 16:26:12 +04:00
Amir Ayupov
9d2dd009b6
[BOLT] Support more than two jump table parents
Multi-way splitting can cause multiple fragments to access the same jump
table. Relax the assumption that a jump table can only have up to two
parents.

Test Plan: added bolt/test/X86/three-way-split-jt.s

Reviewers: ayermolo, dcci, rafaelauler, maksfb

Reviewed By: rafaelauler, dcci

Pull Request: https://github.com/llvm/llvm-project/pull/99988
2024-07-24 07:16:39 -07:00
Amir Ayupov
83ea7ce3a1
[BOLT][NFC] Track fragment relationships using EquivalenceClasses
Three-way splitting can create references between split fragments (warm
to cold or vice versa) that are not handled by
`isChildOf/isParentOf/isChildOrParentOf`. Generalize fragment
relationships to allow checking if two functions belong to one group,
potentially in presence of ICF which can join multiple groups.

Test Plan: NFC for existing tests

Reviewers: maksfb, ayermolo, rafaelauler, dcci

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/99979
2024-07-24 07:15:10 -07:00
Jordan Brantner
d251a328b8
[BOLT] Fix typo from alterantive to alternative (#99704)
Fix typo from `alterantive` -> `alternative`

Signed-off-by: Jordan Brantner <brantnej@oregonstate.edu>
2024-07-22 18:35:20 -07:00
Fangrui Song
86e21e1af2 [BOLT] Remove unused bool arguments from createMCObjectStreamer callers 2024-07-20 21:30:49 -07:00
Amir Ayupov
3023b15fb1 [BOLT] Support POSSIBLE_PIC_FIXED_BRANCH
Detect and support fixed PIC indirect jumps of the following form:
```
movslq  En(%rip), %r1
leaq  PIC_JUMP_TABLE(%rip), %r2
addq  %r2, %r1
jmpq  *%r1
```

with PIC_JUMP_TABLE that looks like following:

```
  JT:  ----------
   E1:| L1 - JT  |
      |----------|
   E2:| L2 - JT  |
      |----------|
      |          |
         ......
   En:| Ln - JT  |
       ----------
```

The code could be produced by compilers, see
https://github.com/llvm/llvm-project/issues/91648.

Test Plan: updated jump-table-fixed-ref-pic.test

Reviewers: maksfb, ayermolo, dcci, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/91667
2024-07-18 20:57:05 -07:00
Fangrui Song
4c79fac140
[BOLT] Remove workaround for flushPendingLabels
The code emits an empty MCDataFragment to ensure that the labels are
attached to `SplitSection`. The workaround, due to the removed
`flushPendingLabels` mechanism (see
75006466296ed4b0f845cbbec4bf77c21de43b40), is now unneeded.

Pull Request: https://github.com/llvm/llvm-project/pull/97632
2024-07-03 16:40:49 -07:00
Fangrui Song
35668e2c9c
Remove llvm/MC/MCAsmLayout.h and the unused parameter in MCAssembler::layout
This restores 63ec52f867ada8d841dd872acf3d0cb62e2a99e8 and
46f7929879a59ec72dc75679b4201e2d314efba9, NFC changes that were
unnecessarily reverted.

This completes the work that merges MCAsmLayout into MCAssembler.

Pull Request: https://github.com/llvm/llvm-project/pull/97449
2024-07-02 16:56:35 -07:00
Davide Italiano
ac0b48a0db Revert "MCAssembler::layout: remove the unused MCAsmLayout parameter"
This reverts commit 63ec52f867ada8d841dd872acf3d0cb62e2a99e8.
2024-07-02 08:54:05 -07:00
Fangrui Song
63ec52f867 MCAssembler::layout: remove the unused MCAsmLayout parameter
Almost complete the MCAsmLayout removal work started by 67957a45ee1ec42ae1671cdbfa0d73127346cc95.
2024-07-01 18:17:05 -07:00
Fangrui Song
dbf12b2f77 [MC] Remove MCAsmLayout::{getSymbolOffset,getBaseSymbol}
The MCAsmLayout::* forwarders added by
67957a45ee1ec42ae1671cdbfa0d73127346cc95 have all been removed.
2024-07-01 11:51:26 -07:00
Amir Ayupov
c460e454d1
[BOLT][NFCI] Fix return type of BC::getSignedValueAtAddress (#91664) 2024-05-24 16:04:06 -07:00
Amir Ayupov
a38f0157f2
[BOLT] Set InitialDynoStats after EstimateEdgeCounts (#93218)
InitialDynoStats used to be assigned inside `runAllPasses`, but the
assignment executed before any of the passes. As we've moved
`EstimateEdgeCounts` into a pass out of ProfileReader, it needs to
execute before initial dyno stats are set.

Thus move `InitialDynoStats` into BinaryContext and assignment into
`DynoStatsSetPass`.
2024-05-23 11:37:06 -07:00
shaw young
c8fc234ee2
[BOLT][NFC] Eliminate uses of throwing std::map::at (#92950)
Remove calls to std::unordered_map::at, std::map::at, and
std::vector::at.
2024-05-22 09:27:14 -07:00
Amir Ayupov
935b946b1f
[BOLT] Process cross references between ignored functions in BAT mode (#92484)
To align YAML and fdata profiles produced in BAT mode, lift two
restrictions applied in non-relocation mode when BAT is present:
1) register secondary entry points from ignored functions,
2) treat functions with secondary entry points as simple.

This allows constructing CFG for non-simple functions in non-relocation
mode and emitting YAML profile for them, which can then be used for
optimizations in relocation mode.

Test Plan: added test ignored-interprocedural-reference.s
2024-05-21 20:22:12 -07:00
Nathan Sidwell
603fa4c6b9
[BOLT][NFC] Be more obvious about selecting X86 (#88527)
Use `isX86()` rather than `!isAArch64() && !isRISCV()`, and similar.
2024-04-15 13:11:29 -04:00
Maksim Panchenko
43d0891d3b
[BOLT] Fix handling of trailing entries in jump tables (#88444)
If a jump table has entries at the end that are a result of
__builtin_unreachable() targets, BOLT can confuse them with function
pointers. In such case, we should exclude these targets from the table
as we risk incorrectly updating the function pointers. It is safe to
exclude them as branching on such targets is considered an undefined
behavior.
2024-04-11 16:11:00 -07:00
Amir Ayupov
fd38366e45
[BOLT][NFC] Clean includes, add license headers (#87200) 2024-03-31 19:29:45 -07:00
Amir Ayupov
c0febca3a6
[BOLT][NFC] Refactor BC::createBinaryContext for #81346 (#87172) 2024-03-30 20:43:23 -07:00
Maksim Panchenko
6b1cf00400
[BOLT] Add support for Linux kernel static keys jump table (#86090)
Runtime code modification used by static keys is the most ubiquitous
self-modifying feature of the Linux kernel. The idea is to to eliminate
the condition check and associated conditional jump on a hot path if
that condition (based on a boolean value of a static key) does not
change often. Whenever they condition changes, the kernel runtime
modifies all code paths associated with that key flipping the code
between nop and (unconditional) jump.
2024-03-21 14:05:21 -07:00
Maksim Panchenko
7c206c7812
[BOLT] Refactor interface for instruction labels. NFCI (#83209)
To avoid accidentally setting the label twice for the same instruction,
which can lead to a "lost" label, introduce getOrSetInstLabel()
function. Rename existing functions to getInstLabel()/setInstLabel() to
make it explicit that they operate on instruction labels. Add an
assertion in setInstLabel() that the instruction did not have a prior
label set.
2024-02-27 18:44:28 -08:00
Amir Ayupov
52cf07116b
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.

In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.

Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:53:53 -08:00
Amir Ayupov
fa7dd4919a
[BOLT][NFC] Add BOLTError and return it from passes (1/2) (#81522)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we add a new class
BOLTError and auxiliary functions `createFatalBOLTError()` and
`createNonFatalBOLTError()` that allow BOLT code to bubble up the
problem to the caller by using the Error class as a return
type (or Expected). Also changes passes to use these.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:39:59 -08:00
Alexander Yermolovich
7d272722fb
[BOLT][DWARF] Add option to specify DW_AT_comp_dir (#79395)
Added an --comp-dir-override option that overrides DW_AT_comp_dir in the
unit die. This allows for llvm-bolt to be invoked from any category and
still find .dwo files.
2024-01-25 15:00:52 -08:00
Kazu Hirata
ad8fd5b185 [BOLT] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 23:34:49 -08:00
ShatianWang
d333c0e062
[BOLT] Extend calculateEmittedSize() for block size calculation (#73076)
This commit modifies BinaryContext::calculateEmittedSize() to update 
the BinaryBasicBlock::OutputAddressRange of each basic block in the
function in place. BinaryBasicBlock::getOutputSize() now gives the 
emitted size of the basic block.
2023-11-23 15:28:31 -05:00
llongint
f3e54f2f97
[BOLT][NFC] Extract a function for dump MCInst (#67225)
In GDB debugging, obtaining the assembly representation of MCInst is
more intuitive.
2023-11-21 20:30:44 +08:00
JohnLee1243
ae51ec84bb
[Bolt] Solving pie support issue (#65494)
Now PIE is default supported after clang 14. It cause parsing error when
using perf2bolt. The reason is the base address can not get correctly.
Fix the method of geting base address. If SegInfo.Alignment is not equal
to pagesize, alignDown(SegInfo.FileOffset, SegInfo.Alignment) can not
equal to FileOffset. So the SegInfo.FileOffset and FileOffset should be
aligned by SegInfo.Alignment first and then judge whether they are
equal.
The .text segment's offset from base address in VAS is aligned by
pagesize. So MMapAddress's offset from base address is
alignDown(SegInfo.Address, pagesize) instead of
alignDown(SegInfo.Address, SegInfo.Alignment). So the base address
calculate way should be changed.

Co-authored-by: Li Zhuohang <lizhuohang3@huawei.com>
2023-11-16 15:05:06 +08:00
Maksim Panchenko
2db9b6a93f
[BOLT] Make instruction size a first-class annotation (#72167)
When NOP instructions are used to reserve space in the code, e.g. for
patching, it becomes critical to preserve their original size while
emitting the code. On x86, we rely on "Size" annotation for NOP
instructions size, as the original instruction size is lost in the
disassembly/assembly process.

This change makes instruction size a first-class annotation and is
affectively NFCI. A follow-up diff will use the annotation for code
emission.
2023-11-13 14:33:39 -08:00
spaette
1a2f83366b
[BOLT] Fix typos (#68121)
Closes https://github.com/llvm/llvm-project/issues/63097

Before merging please make sure the change to
bolt/include/bolt/Passes/StokeInfo.h is correct.

bolt/include/bolt/Passes/StokeInfo.h

```diff
  //  This Pass solves the two major problems to use the Stoke program without
- //  proting its code:
+ //  probing its code:
```

I'm still not happy about the awkward wording in this comment.

bolt/include/bolt/Passes/FixRelaxationPass.h

```
$ ed -s bolt/include/bolt/Passes/FixRelaxationPass.h <<<'9,12p'
// This file declares the FixRelaxations class, which locates instructions with
// wrong targets and fixes them. Such problems usually occures when linker
// relaxes (changes) instructions, but doesn't fix relocations types properly
// for them.
$
```


bolt/docs/doxygen.cfg.in
bolt/include/bolt/Core/BinaryContext.h
bolt/include/bolt/Core/BinaryFunction.h
bolt/include/bolt/Core/BinarySection.h
bolt/include/bolt/Core/DebugData.h
bolt/include/bolt/Core/DynoStats.h
bolt/include/bolt/Core/Exceptions.h
bolt/include/bolt/Core/MCPlusBuilder.h
bolt/include/bolt/Core/Relocation.h
bolt/include/bolt/Passes/FixRelaxationPass.h
bolt/include/bolt/Passes/InstrumentationSummary.h
bolt/include/bolt/Passes/ReorderAlgorithm.h
bolt/include/bolt/Passes/StackReachingUses.h
bolt/include/bolt/Passes/StokeInfo.h
bolt/include/bolt/Passes/TailDuplication.h
bolt/include/bolt/Profile/DataAggregator.h
bolt/include/bolt/Profile/DataReader.h
bolt/lib/Core/BinaryContext.cpp
bolt/lib/Core/BinarySection.cpp
bolt/lib/Core/DebugData.cpp
bolt/lib/Core/DynoStats.cpp
bolt/lib/Core/Relocation.cpp
bolt/lib/Passes/Instrumentation.cpp
bolt/lib/Passes/JTFootprintReduction.cpp
bolt/lib/Passes/ReorderData.cpp
bolt/lib/Passes/RetpolineInsertion.cpp
bolt/lib/Passes/ShrinkWrapping.cpp
bolt/lib/Passes/TailDuplication.cpp
bolt/lib/Rewrite/BoltDiff.cpp
bolt/lib/Rewrite/DWARFRewriter.cpp
bolt/lib/Rewrite/RewriteInstance.cpp
bolt/lib/Utils/CommandLineOpts.cpp
bolt/runtime/instr.cpp
bolt/test/AArch64/got-ld64-relaxation.test
bolt/test/AArch64/unmarked-data.test
bolt/test/X86/Inputs/dwarf5-cu-no-debug-addr-helper.s
bolt/test/X86/Inputs/linenumber.cpp
bolt/test/X86/double-jump.test
bolt/test/X86/dwarf5-call-pc-function-null-check.test
bolt/test/X86/dwarf5-split-dwarf4-monolithic.test
bolt/test/X86/dynrelocs.s
bolt/test/X86/fallthrough-to-noop.test
bolt/test/X86/tail-duplication-cache.s
bolt/test/runtime/X86/instrumentation-ind-calls.s
2023-11-09 11:29:46 -08:00
Maksim Panchenko
0df154671b
[BOLT] Use Label annotation instead of EHLabel pseudo. NFCI. (#70179)
When we need to attach EH label to an instruction, we can now use Label
annotation instead of EHLabel pseudo instruction.
2023-11-06 14:43:14 -08:00
maksfb
e28c393bd1
[BOLT] Reduce the number of emitted symbols. NFCI. (#70175)
We emit a symbol before an instruction for a number of reasons, e.g. for
tracking LocSyms, debug line, or if the instruction has a label
annotation. Currently, we may emit multiple symbols per instruction.

Reuse the same label instead of creating and emitting new ones when
possible. I'm planning to refactor EH labels as well in a separate diff.

Change getLabel() to return a pointer instead of std::optional<> since
an empty label should be treated identically to no label.
2023-11-06 11:41:47 -08:00
Amir Ayupov
3a72bcbf33
[BOLT] Fix build issues after #69836 (#70087)
Fix clang build (`return Error => return std::move(Error)`)
2023-10-24 11:47:12 -07:00
Job Noorman
86bc486785
[BOLT][RISCV] Use target features from object file (#69836)
We used to hard-code target features for RISC-V. However, most features
(with the exception of relax) are stored in the object file. This patch
extracts those features to ensure BOLT's output doesn't use any features
not present in the input file.
2023-10-23 06:40:25 +00:00
Job Noorman
c6f065d9d9
[BOLT][RISCV] Recognize mapping syms with encoded ISA (#68964)
RISC-V supports mapping syms for code that encode the exact ISA for
which the code is valid. They have the form `$x<ISA>` where `<ISA>` is
the textual encoding of an ISA specification.

BOLT currently doesn't recognize these mapping symbols causing many
binaries compiled with newer versions of GCC (which emits them) to not
be properly processed. This patch makes sure BOLT recognizes them as
code markers.

Note that LLVM does not emit these kinds of mapping symbols yet so the
test is based on a binary produced by GCC.
2023-10-13 10:34:13 +00:00
Job Noorman
ff5e2babcb
[BOLT] Improve handling of relocations targeting specific instructions (#66395)
On RISC-V, there are certain relocations that target a specific
instruction instead of a more abstract location like a function or basic
block. Take the following example that loads a value from symbol `foo`:

```
nop
1: auipc t0, %pcrel_hi(foo)
ld t0, %pcrel_lo(1b)(t0)
```

This results in two relocation:
- auipc: `R_RISCV_PCREL_HI20` referencing `foo`;
- ld: `R_RISCV_PCREL_LO12_I` referencing to local label `1` which points
to the auipc instruction.

It is of utmost importance that the `R_RISCV_PCREL_LO12_I` keeps
referring to the auipc instruction; if not, the program will fail to
assemble. However, BOLT currently does not guarantee this.

BOLT currently assumes that all local symbols are jump targets and
always starts a new basic block at symbol locations. The example above
results in a CFG the looks like this:

```
.BB0:
    nop
.BB1:
    auipc t0, %pcrel_hi(foo)
    ld t0, %pcrel_lo(.BB1)(t0)
```

While this currently works (i.e., the `R_RISCV_PCREL_LO12_I` relocation
points to the correct instruction), it has two downsides:
- Too many basic blocks are created (the example above is logically only
  one yet two are created);
- If instructions are inserted in `.BB1` (e.g., by instrumentation),
  things will break since the label will not point to the auipc anymore.

This patch proposes to fix this issue by teaching BOLT to track labels
that should always point to a specific instruction. This is implemented
as follows:
- Add a new annotation type (`kLabel`) that allows us to annotate
  instructions with an `MCSymbol *`;
- Whenever we encounter a relocation type that is used to refer to a
  specific instruction (`Relocation::isInstructionReference`), we
  register it without a symbol;
- During disassembly, whenever we encounter an instruction with such a
  relocation, create a symbol for its target and store it in an offset
  to symbol map (to ensure multiple relocations referencing the same
  instruction use the same label);
- After disassembly, iterate this map to attach labels to instructions
  via the new annotation type;
- During emission, emit these labels right before the instruction.

I believe the use of annotations works quite well for this use case as
it allows us to reliably track instruction labels. If we were to store
them as offsets in basic blocks, it would be error prone to keep them
updated whenever instructions are inserted or removed.

I have chosen to add labels as first-class annotations (as opposed to a
generic one) because the documentation of `MCAnnotation` suggests that
generic annotations are to be used for optional metadata that can be
discarded without affecting correctness. As this is not the case for
labels, a first-class annotation seemed more appropriate.
2023-10-06 06:46:16 +00:00
Rafael Auler
853e126ce3 [BOLT] Support input binaries that use R_X86_GOTPC64
In large code model, the address of GOT is calculated by the
static linker via R_X86_GOTPC64 reloc applied against a MOVABSQ
instruction. In the final binary, it can be disassembled as a regular
immediate, but because such immediate is the result of PC-relative
pointer arithmetic, we need to parse this relocation and update this
calculation whenever we move code, otherwise we break the code trying
to read GOT.

A test case showing how GOT is accessed was provided.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D158911
2023-10-02 23:12:44 -07:00
Job Noorman
c5ba61978c [BOLT][RISCV] Add support for linker relaxation
Calls on RISC-V are typically compiled to `auipc`/`jalr` pairs to allow
a maximum target range (32-bit pc-relative). In order to optimize calls
to near targets, linker relaxation may replace those pairs with, for
example, single `jal` instructions.

To allow BOLT to freely reassign function addresses in relaxed binaries,
this patch proposes the following approach:
- Expand all relaxed calls back to `auipc`/`jalr`;
- Rely on JITLink to relax those back to shorter forms where possible.

This is implemented by detecting all possible call instructions and
replacing them with `PseudoCALL` (or `PseudoTAIL`) instructions. The
RISC-V backend then expands those and adds the necessary relocations for
relaxation.

Since BOLT generally ignores pseudo instruction, this patch makes
`MCPlusBuilder::isPseudo` virtual so that `RISCVMCPlusBuilder` can
override it to exclude `PseudoCALL` and `PseudoTAIL`.

To ensure JITLink knows about the correct section addresses while
relaxing, reassignment of addresses has been moved to a post-allocation
pass. Note that this is probably the time it had to be done in the
first place since in `notifyResolved` (where it was done before), all
symbols are supposed to be resolved already.

Depends on D159082

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D159089
2023-09-15 11:57:28 +02:00
Job Noorman
fc395884de [BOLT][RISCV] Recognize mapping symbols
The RISC-V psABI [1] defines them similarly to AArch64.

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc#mapping-symbol

Reviewed By: yota9, Amir

Differential Revision: https://reviews.llvm.org/D153277
2023-07-29 09:18:36 +02:00
Maksim Panchenko
1e4ee588fb [BOLT] Accept function start as valid jump table entry
Jump tables may contain a function start address. One real-world example
is when a target basic block contains a recursive tail call that is
later optimized/folded into a jump table target.

While analyzing a jump table, we treat start address similar to an
address past the end of the containing function (a result of
__builtin_unreachable), i.e. we require another "regular" entry for the
heuristic to proceed.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D156206
2023-07-26 13:25:08 -07:00
Job Noorman
f873029386 [BOLT] Add minimal RISC-V 64-bit support
Just enough features are implemented to process a simple "hello world"
executable and produce something that still runs (including libc calls).
This was mainly a matter of implementing support for various
relocations. Currently, the following are handled:

- R_RISCV_JAL
- R_RISCV_CALL
- R_RISCV_CALL_PLT
- R_RISCV_BRANCH
- R_RISCV_RVC_BRANCH
- R_RISCV_RVC_JUMP
- R_RISCV_GOT_HI20
- R_RISCV_PCREL_HI20
- R_RISCV_PCREL_LO12_I
- R_RISCV_RELAX
- R_RISCV_NONE

Executables linked with linker relaxation will probably fail to be
processed. BOLT relocates .text to a high address while leaving .plt at
its original (low) address. This causes PC-relative PLT calls that were
relaxed to a JAL to not fit their offset in an I-immediate anymore. This
is something that will be addressed in a later patch.

Changes to the BOLT core are relatively minor. Two things were tricky to
implement and needed slightly larger changes. I'll explain those below.

The R_RISCV_CALL(_PLT) relocation is put on the first instruction of a
AUIPC/JALR pair, the second does not get any relocation (unlike other
PCREL pairs). This causes issues with the combinations of the way BOLT
processes binaries and the RISC-V MC-layer handles relocations:
- BOLT reassembles instructions one by one and since the JALR doesn't
  have a relocation, it simply gets copied without modification;
- Even though the MC-layer handles R_RISCV_CALL properly (adjusts both
  the AUIPC and the JALR), it assumes the immediates of both
  instructions are 0 (to be able to or-in a new value). This will most
  likely not be the case for the JALR that got copied over.

To handle this difficulty without resorting to RISC-V-specific hacks in
the BOLT core, a new binary pass was added that searches for
AUIPC/JALR pairs and zeroes-out the immediate of the JALR.

A second difficulty was supporting ABS symbols. As far as I can tell,
ABS symbols were not handled at all, causing __global_pointer$ to break.
RewriteInstance::analyzeRelocation was updated to handle these
generically.

Tests are provided for all supported relocations. Note that in order to
test the correct handling of PLT entries, an ELF file produced by GCC
had to be used. While I tried to strip the YAML representation, it's
still quite large. Any suggestions on how to improve this would be
appreciated.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D145687
2023-06-16 12:19:36 +02:00
Job Noorman
05634f7346 [BOLT] Move from RuntimeDyld to JITLink
RuntimeDyld has been deprecated in favor of JITLink. [1] This patch
replaces all uses of RuntimeDyld in BOLT with JITLink.

Care has been taken to minimize the impact on the code structure in
order to ease the inspection of this (rather large) changeset. Since
BOLT relied on the RuntimeDyld API in multiple places, this wasn't
always possible though and I'll explain the changes in code structure
first.

Design note: BOLT uses a JIT linker to perform what essentially is
static linking. No linked code is ever executed; the result of linking
is simply written back to an executable file. For this reason, I
restricted myself to the use of the core JITLink library and avoided ORC
as much as possible.

RuntimeDyld contains methods for loading objects (loadObject) and symbol
lookup (getSymbol). Since JITLink doesn't provide a class with a similar
interface, the BOLTLinker abstract class was added to implement it. It
was added to Core since both the Rewrite and RuntimeLibs libraries make
use of it. Wherever a RuntimeDyld object was used before, it was
replaced with a BOLTLinker object.

There is one major difference between the RuntimeDyld and BOLTLinker
interfaces: in JITLink, section allocation and the application of fixups
(relocation) happens in a single call (jitlink::link). That is, there is
no separate method like finalizeWithMemoryManagerLocking in RuntimeDyld.
BOLT used to remap sections between allocating (loadObject) and linking
them (finalizeWithMemoryManagerLocking). This doesn't work anymore with
JITLink. Instead, BOLTLinker::loadObject accepts a callback that is
called before fixups are applied which is used to remap sections.

The actual implementation of the BOLTLinker interface lives in the
JITLinkLinker class in the Rewrite library. It's the only part of the
BOLT code that should directly interact with the JITLink API.

For loading object, JITLinkLinker first creates a LinkGraph
(jitlink::createLinkGraphFromObject) and then links it (jitlink::link).
For the latter, it uses a custom JITLinkContext with the following
properties:
- Use BOLT's ExecutableFileMemoryManager. This one was updated to
  implement the JITLinkMemoryManager interface. Since BOLT never
  executes code, its finalization step is a no-op.
- Pass config: don't use the default target passes since they modify
  DWARF sections in a way that seems incompatible with BOLT. Also run a
  custom pre-prune pass that makes sure sections without symbols are not
  pruned by JITLink.
- Implement symbol lookup. This used to be implemented by
  BOLTSymbolResolver.
- Call the section mapper callback before the final linking step.
- Copy symbol values when the LinkGraph is resolved. Symbols are stored
  inside JITLinkLinker to ensure that later objects (i.e.,
  instrumentation libraries) can find them. This functionality used to
  be provided by RuntimeDyld but I did not find a way to use JITLink
  directly for this.

Some more minor points of interest:
- BinarySection::SectionID: JITLink doesn't have something equivalent to
  RuntimeDyld's Section IDs. Instead, sections can only be referred to
  by name. Hence, SectionID was updated to a string.
- There seem to be no tests for Mach-O. I've tested a small hello-world
  style binary but not more than that.
- On Mach-O, JITLink "normalizes" section names to include the segment
  name. I had to parse the section name back from this manually which
  feels slightly hacky.

[1] https://reviews.llvm.org/D145686#4222642

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D147544
2023-06-15 11:13:52 +02:00
Amir Ayupov
702fe36b70 [BOLT][NFC] Const-ify getDynamicRelocationAt
Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D152662
2023-06-12 09:55:16 -07:00
Amir Ayupov
068e9889b1 [BOLT] Add isParentOf and isParentOrChildOf BF checks
Add helper methods and simplify cases where we want to check if two functions
are parent-child of each other (function-fragment relationship).

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D142668
2023-05-19 17:51:54 -07:00