154 Commits

Author SHA1 Message Date
Maksim Panchenko
4db0cc4c55
[BOLT] Allow sections in --print-only flag (#109622)
While printing functions, expand --print-only flag to accept section
names. E.g., "--print-only=\.init" will only print functions from
".init" section.
2024-09-25 23:44:06 +02:00
Maksim Panchenko
abd69b3653
[BOLT] Handle internal calls in ValidateInternalCalls (#105736)
Move handling of all internal calls into the designated pass. Preserve
NOPs and mark functions as non-simple on non-X86 platforms.
2024-08-27 11:31:32 -07:00
Maksim Panchenko
8f3050684e
[BOLT] Reduce CFI warning verbosity (#105336)
CFI programs may have more saves than restores and this is completely
benign from BOLT's perspective. Reduce the verbosity and print the
warning only under `-v=1` and above.
2024-08-20 13:41:19 -07:00
Amir Ayupov
f83a89c1b1
[BOLT] Turn non-empty CFI StateStack assert into a warning (#102216)
clang-15 can produce binaries with mismatched RememberState/RestoreState
CFIs. This is benign for unwinding, so replace an assert with a warning.
2024-08-06 17:23:43 -07:00
Amir Ayupov
3023b15fb1 [BOLT] Support POSSIBLE_PIC_FIXED_BRANCH
Detect and support fixed PIC indirect jumps of the following form:
```
movslq  En(%rip), %r1
leaq  PIC_JUMP_TABLE(%rip), %r2
addq  %r2, %r1
jmpq  *%r1
```

with PIC_JUMP_TABLE that looks like following:

```
  JT:  ----------
   E1:| L1 - JT  |
      |----------|
   E2:| L2 - JT  |
      |----------|
      |          |
         ......
   En:| Ln - JT  |
       ----------
```

The code could be produced by compilers, see
https://github.com/llvm/llvm-project/issues/91648.

Test Plan: updated jump-table-fixed-ref-pic.test

Reviewers: maksfb, ayermolo, dcci, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/91667
2024-07-18 20:57:05 -07:00
Fangrui Song
2718654c54
[MC] Support .cfi_label
GNU assembler 2.26 introduced the .cfi_label directive. It does not
expand to any CFI instructions, but defines a label in
.eh_frame/.debug_frame, which can be used by runtime patching code to
locate the FDE. .cfi_label is not allowed for CIE's initial
instructions, and can therefore be used to force the next instruction to
be placed in a FDE instead of a CIE.

In glibc since 2018, sysdeps/riscv/start.S utilizes .cfi_label to force
DW_CFA_undefined to be placed in a FDE. arc/csky/loongarch ports have
copied this use.
```
.cfi_startproc
// DW_CFA_undefined is allowed for CIE's initial instructions.
// Without .cfi_label, gas would place DW_CFA_undefined in a CIE.
.cfi_label .Ldummy
.cfi_undefined ra
.cfi_endproc
```

No CFI instruction is associated with .cfi_label, so the `case
MCCFIInstruction::OpLabel:` code in BOLT is unreachable and onlt to make
-Wswitch happy.

Close #97222

Pull Request: https://github.com/llvm/llvm-project/pull/97922
2024-07-07 12:41:13 -07:00
Amir Ayupov
344228ebf4 [BOLT] Drop macro-fusion alignment (#97358)
9d0754ada5dbbc0c009bcc2f7824488419cc5530 dropped MC support required for
optimal macro-fusion alignment in BOLT. Remove the support in BOLT as
performance measurements with large binaries didn't show a significant
improvement.

Test Plan:
macro-fusion alignment was never upstreamed, so no upstream tests are
affected.
2024-07-02 09:20:41 -07:00
Nathan Sidwell
6c5b62b846
[BOLT][NFC] Separate isReversibleBranch's 2 semantics (#95572)
`isUnsupportedBranch` was renamed (and inverted)  to `isReversibleBranch`, as that was how it was being used. But one use  in `BinaryFunction::disassemble` was using the original meaning to detect unsupported branches, and the `isUnsupportedBranch` had 2 separate semantic checks.

Move the unsupported branch check from `isReversibleBranch` to a new entry point: `isUnsupportedInstruction`. Call that from `BinaryFunction::disassemble`.

Move the dynamic branch check from X86's isReversibleBranch to the base class, as it is not an architecture-specific check.

Remove unnecessary `isReversibleBranch` calls from Instrumentation and X86 MCPlusBuilder.
2024-06-28 07:45:37 -04:00
Maksim Panchenko
d16b21b17d
[BOLT][Linux] Support ORC for alternative instructions (#96709)
Alternative instruction sequences in the Linux kernel can modify the
stack and thus they need their own ORC unwind entries. Since there's
only one ORC table, it has to be "shared" among multiple instruction
sequences. The kernel achieves this by putting a restriction on
instruction boundaries. If ORC state changes at a given IP, only one of
the alternative sequences can have an instruction starting/ending at
this IP. Then, developers can insert NOPs to guarantee the above
requirement is met.

The most common use of ORC with alternatives is "pushf; pop %rax"
sequence used for paravirtualization. Note that newer kernel versions
no longer use .parainstructions; instead, they utilize alternatives for
the same purpose.

Before we implement a better support for alternatives, we can safely
skip ORC entries associated with them.

Fixes #87052.
2024-06-27 19:26:11 -07:00
Maksim Panchenko
ca06b61084
[BOLT] Omit CFI state while printing functions without CFI (#96723)
If a function has no CFI program attached to it, do not print redundant
empty CFI state for every basic block.
2024-06-27 17:26:58 -07:00
Nikita Popov
b23fe1088f [bolt] Add missing <stack> include (NFC) 2024-06-21 14:02:15 +02:00
shaw young
4be3083bb3
[BOLT] Remove mutable from BB::LayoutIndex (#93224)
Removed mutability from BB::LayoutIndex, subsequently removed const from
BB::SetLayout, and changed BF::dfs to track visited blocks with a set as
opposed to tracking and altering LayoutIndexes for more consistent code.
2024-05-31 11:52:22 -07:00
Amir Ayupov
f239490592
[BOLT][NFC] Define getExprValue helper (#91663)
Move out common code extracting the address of a MCExpr. To be reused in
#91667.

Test Plan: NFC
2024-05-24 15:33:25 -07:00
Amir Ayupov
720cade2b6
[BOLT][NFC] Avoid computing BF hash twice in YAML reader (#75096)
We compute BF hashes in `YAMLProfileReader::readProfile` when first
matching profile functions with binary functions, and second time in
`YAMLProfileReader::parseFunctionProfile` during the profile assignment
(we need to do that to account for LTO private functions with
mismatching suffix).

Avoid recomputing the hash if it's been set.
2024-05-24 14:00:03 -07:00
Amir Ayupov
935b946b1f
[BOLT] Process cross references between ignored functions in BAT mode (#92484)
To align YAML and fdata profiles produced in BAT mode, lift two
restrictions applied in non-relocation mode when BAT is present:
1) register secondary entry points from ignored functions,
2) treat functions with secondary entry points as simple.

This allows constructing CFG for non-simple functions in non-relocation
mode and emitting YAML profile for them, which can then be used for
optimizations in relocation mode.

Test Plan: added test ignored-interprocedural-reference.s
2024-05-21 20:22:12 -07:00
Nathan Sidwell
76fdc2e527
[BOLT][NFC] Rename isUnsupportedBranch to isReversibleBranch (#92447)
`isUnsupportedBranch` is not a very informative name, and doesn't match
its corresponding `reverseBranchCondition`, as I noted in PR #92018.
Here's a renaming to a more mnemonic name.
2024-05-17 15:40:40 -04:00
Nathan Sidwell
725014d866
[BOLT][NFC] Simplify CFG validation (#91977)
Remove 'Valid' local boolean that has a single use, and return directly instead.
2024-05-14 09:36:34 -04:00
Amir Ayupov
db29f20fdd
[BOLT] Ignore returns in DataAggregator
Returns are ignored in perf/pre-aggregated/fdata profile reader (see
DataReader::convertBranchData). They are also omitted in
YAMLProfileWriter by virtue of not having the profile attached to them
in the reader, and YAMLProfileWriter converting the profile attached to
BinaryFunctions. Thus, return profile is universally ignored across all
profile types except BAT YAML.

To make returns ignored for YAML produced in BAT mode, we can:
1) ignore them in YAMLProfileReader,
2) omit them from YAML profile in profile conversion/writing.

The first option is prone to profile staleness issue, where the profiled
binary doesn't match the one to be optimized, and thus returns in the
profile can no longer be reliably detected (as we don't distinguish them
from calls in the profile).

The second option is robust to staleness but requires disassembling the
branch source instruction.

Test Plan: Updated bolt-address-translation-yaml.test

Reviewers: rafaelauler, dcci, ayermolo, maksfb

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/90807
2024-05-08 12:02:18 -07:00
Amir Ayupov
fd38366e45
[BOLT][NFC] Clean includes, add license headers (#87200) 2024-03-31 19:29:45 -07:00
Amir Ayupov
d12e45ad16
[BOLT][NFC] Split out DomTree construction from BF::calculateLoopInfo (#87181) 2024-03-31 06:24:19 -07:00
Amir Ayupov
d8fe2e4bb0
[BOLT] Fix enumeration of secondary entry points
Make them start with 1 instead of 0 (reserved for primary entry point).

Test Plan:
```
bin/llvm-lit -a tools/bolt/test/X86/yaml-secondary-entry-discriminator.s
```

Reviewers: rafaelauler, ayermolo, maksfb, dcci

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/86848
2024-03-27 15:23:49 -07:00
Maksim Panchenko
6b1cf00400
[BOLT] Add support for Linux kernel static keys jump table (#86090)
Runtime code modification used by static keys is the most ubiquitous
self-modifying feature of the Linux kernel. The idea is to to eliminate
the condition check and associated conditional jump on a hot path if
that condition (based on a boolean value of a static key) does not
change often. Whenever they condition changes, the kernel runtime
modifies all code paths associated with that key flipping the code
between nop and (unconditional) jump.
2024-03-21 14:05:21 -07:00
Maksim Panchenko
d7d564b2fc
[BOLT] Add BinaryFunction::registerBranch(). NFC (#83337)
Add an external interface to register a branch in a function that is in
disassembled state. Allows to make custom modifications to the
disassembler. E.g., a pre-CFG pass can add an instruction and register a
branch that will later be used during the CFG construction.
2024-02-28 20:04:28 -08:00
Maksim Panchenko
3f2a9e5910
[BOLT] Sort TakenBranches immediately before use. NFCI (#83333)
Move code that sorts TakenBranches right before the branches are used.
We can populate TakenBranches in pre-CFG post-processing and hence have
to postpone the sorting to a later point in the processing pipeline.
Will add such a pass later. For now it's NFC.
2024-02-28 19:51:44 -08:00
Maksim Panchenko
7c206c7812
[BOLT] Refactor interface for instruction labels. NFCI (#83209)
To avoid accidentally setting the label twice for the same instruction,
which can lead to a "lost" label, introduce getOrSetInstLabel()
function. Rename existing functions to getInstLabel()/setInstLabel() to
make it explicit that they operate on instruction labels. Add an
assertion in setInstLabel() that the instruction did not have a prior
label set.
2024-02-27 18:44:28 -08:00
Amir Ayupov
52cf07116b
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.

In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.

Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:53:53 -08:00
Amir Ayupov
13d60ce2f2
[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch continue the migration
on libCore, libRewrite and libPasses to use the new BOLTError
class whenever a failure occurs.

Test Plan: NFC

Co-authored-by: Rafael Auler <rafaelauler@fb.com>
2024-02-12 14:51:15 -08:00
Amir Ayupov
b039ccc684
[BOLT] Provide backwards compatibility for YAML profile with std::hash (#74253)
Provide backwards compatibility for YAML profile that uses `std::hash`:
xxh3 hash is the default for newly produced profile (sets `std-hash:
false`),
whereas the profile that doesn't specify `std-hash` will be treated as
`std-hash: true`, preserving old behavior.
2023-12-11 12:27:32 -08:00
Maksim Panchenko
4f3081296f
[BOLT][NFC] Fix comment (#73983)
Fix off-by-one error in comment.
2023-11-30 14:31:38 -08:00
Maksim Panchenko
4bcbbe1f70
[BOLT] Refactor fixBranches() (#73752)
Simplify code in fixBranches(). Mostly NFC, accept the x86-specific
check for code fragments now takes into account presence of more than
two fragments. Should only matter when we split code into multiple
fragments and can run fixBranches() more than once.

Also, don't replace a branch target with the same one, as such operation
may allocate memory for extra MCSymbolRefExpr.
2023-11-29 16:24:16 -08:00
spupyrev
e7dd596c68
[BOLT] Use deterministic xxh3 for computing BF/BB hashes (#72542)
std::hash and ADT/Hashing::hash_value are non-deterministic functions
whose
results might vary across implementation/process/execution. Using xxh3
instead
for computing hashes of BinaryFunctions and BinaryBasicBlock for stale
profile
matching.
(A possible alternative is to use ADT/StableHashing.h based on FNV
hashing but
xxh3 seems to be more popular in LLVM)

This is to address https://github.com/llvm/llvm-project/issues/65241.
2023-11-27 14:45:46 -08:00
Maksim Panchenko
f4834255d3
[BOLT] Reset output addresses for deleted blocks (#73429)
This is a follow-up to #73076. We need to reset output addresses for
deleted blocks, otherwise the address translation may mistakenly
attribute input address of a deleted block to a non-zero address.

While working on a test case, I've discovered that DWARF output ranges
were already broken for deleted basic blocks: #73428. I will provide a
test case for this PR with a DWARF address range fix PR.
2023-11-25 23:23:47 -08:00
Maksim Panchenko
365114292a
[BOLT][NFC] Refactor function state check (#73420)
Remove redundant check in updateOutputValues().
2023-11-25 21:09:54 -08:00
ShatianWang
d333c0e062
[BOLT] Extend calculateEmittedSize() for block size calculation (#73076)
This commit modifies BinaryContext::calculateEmittedSize() to update 
the BinaryBasicBlock::OutputAddressRange of each basic block in the
function in place. BinaryBasicBlock::getOutputSize() now gives the 
emitted size of the basic block.
2023-11-23 15:28:31 -05:00
Maksim Panchenko
f653f6d57a
[BOLT][NFC] Delete unused declarations (#72596) 2023-11-16 23:36:19 -08:00
Vladislav Khmelevsky
5b59540661
[BOLT] Enhance fixed indirect branch handling (#71324)
Previously HasFixedIndirectBranch was set in BF to set isSimple to false
later because of unreachable bb ellimination pass which might remove the
BB with it's symbols accessed by other instructions than calls. It seems
to be that better solution would be to add extra entry point on target
offset instead of marking BF as non-simple.
2023-11-16 09:30:55 +04:00
Maksim Panchenko
e823136d43
[BOLT] Refactor --keep-nops option. NFC. (#72228)
Run RemoveNops pass only if --keep-nops is set to false (default).
2023-11-14 11:28:13 -08:00
Maksim Panchenko
f633f325a1
[BOLT] Fix NOP instruction emission on x86 (#72186)
Use MCAsmBackend::writeNopData() interface to emit NOP instructions on
x86. There are multiple forms of NOP instruction on x86 with different
sizes. Currently, LLVM's assembly/disassembly does not support all forms
correctly which can lead to a breakage of input code semantics, e.g. if
the program relies on NOP instructions for reserving a patch space.

Add "--keep-nops" option to preserve NOP instructions.
2023-11-13 18:12:39 -08:00
Maksim Panchenko
2db9b6a93f
[BOLT] Make instruction size a first-class annotation (#72167)
When NOP instructions are used to reserve space in the code, e.g. for
patching, it becomes critical to preserve their original size while
emitting the code. On x86, we rely on "Size" annotation for NOP
instructions size, as the original instruction size is lost in the
disassembly/assembly process.

This change makes instruction size a first-class annotation and is
affectively NFCI. A follow-up diff will use the annotation for code
emission.
2023-11-13 14:33:39 -08:00
Vladislav Khmelevsky
c6c04a83a7
[BOLT] Run EliminateUnreachableBlocks in parallel (#71299)
The wall time for this pass decreased on my laptop from ~80 sec to 5
sec processing the clang.
2023-11-10 00:46:04 +04:00
Maksim Panchenko
11f52f783a
[BOLT][DWARF] Fix invalid address ranges (#71474)
When NOP instructions are removed by BOLT and a DWARF address range
falls past the removed instructions, it may lead to invalid DWARF ranges
in the output binary. E.g. the range may fall outside of the basic block
boundaries.

This fix makes sure the modified range fits within the containing basic
block. A proper fix requires tracking instructions within the block and
will come in a different PR.
2023-11-09 09:55:49 -08:00
Maksim Panchenko
254ccb95e8
[BOLT] Follow-up to "Fix incorrect basic block output addresses" (#71630)
In 8244ff6739a09cb75e6e7fd1c24b85e2b1397266, I've introduced an
assertion that incorrectly used BasicBlock::empty(). Some basic blocks
may contain only pseudo instructions and thus BB->empty() will evaluate
to false, while the actual code size will be zero.
2023-11-08 10:53:36 -08:00
maksfb
74e0a26fd1
[BOLT] Modify MCPlus annotation internals. NFCI. (#70412)
When annotating MCInst instructions, attach extra annotation operands
directly to the annotated instruction, instead of attaching them to an
instruction pointed to by a special kInst operand.

With this change, it's no longer necessary to allocate MCInst and most
of the first-class annotations come with free memory as currently MCInst
is declared with:

    SmallVector<MCOperand, 10> Operands;

i.e. more operands than are normally being used.

We still create a kInst operand with a nullptr instruction value to
designate the beginning of annotation operands. However, this special
operand might not be needed if we can rely on MCInstrDesc::NumOperands.
2023-11-06 12:14:22 -08:00
maksfb
7f031d1c7c
[BOLT] Fix address mapping for ICP code (#70136)
When we create new code for indirect code promotion optimization, we
should mark it as originating from the indirect jump instruction for
BOLT address translation (BAT) to map it to the original instruction.
2023-11-06 11:25:49 -08:00
maksfb
6e26246c22
[BOLT][DWARF] Refactor address ranges processing (#71225)
Create BinaryFunction::translateInputToOutputRange() and use it for
updating DWARF debug ranges and location lists while de-duplicating the
existing code. Additionally, move DWARF-specific code out of
BinaryFunction and add print functions to facilitate debugging.

Note that this change is deliberately kept "bug-level" compatible with
the existing solution to keep it NFCI and make it easier to track any
possible regressions in the future updates to the ranges-handling code.
2023-11-06 11:10:20 -08:00
Vladislav Khmelevsky
838331a081
[BOLT] Set NOOP size only on X86 (NFC) (#71307)
Small fix, we have problems with noop size only on x86, no reason to do
it on other platforms.
2023-11-06 15:21:50 +04:00
maksfb
8244ff6739
[BOLT] Fix incorrect basic block output addresses (#70000)
Some optimization passes may duplicate basic blocks and assign the same
input offset to a number of different blocks in a function. This is done
e.g. to correctly map debugging ranges for duplicated code.

However, duplicate input offsets present a problem when we use
AddressMap to generate new addresses for basic blocks. The output
address is calculated based on the input offset and will be the same for
blocks with identical offsets. The result is potentially incorrect debug
info and BAT records.

To address the issue, we have to eliminate the dependency on input
offsets while generating output addresses for a basic block. Each block
has a unique label, hence we extend AddressMap to include address lookup
based on MCSymbol and use the new functionality to update block
addresses.
2023-10-24 12:22:43 -07:00
Job Noorman
6795bfce4d
[BOLT][RISCV] Handle CIE's produced by GNU as (#69578)
On RISC-V, GNU as produces the following initial instruction in CIE's:

```
DW_CFA_def_cfa_register: r2
```

While I believe it is technically illegal to use this instruction
without first using a `DW_CFA_def_cfa` (since the offset is undefined),
both `readelf` and `llvm-dwarfdump` accept this and implicitly set the
offset to 0.

In BOLT, however, this triggers an assert (in `CFISnapshot::advanceTo`)
as it (correctly) believes the offset is not set. This patch fixes this
by setting the offset to 0 whenever executing `DW_CFA_def_cfa_register`
while the offset is undefined.

Note that this is probably the simplest workaround but it has a
downside: while emitting CFI start, we check if the initial instructions
are contained within `MCAsmInfo::getInitialFrameState` and omit them if
they are. This will not be true for GNU CIE's (since they differ from
LLVM's) which causes an unnecessary `DW_CFA_def_cfa_register` to be
emitted.

While technically correct, it would probably be better to replace the
GNU CIE with the one used by LLVM to avoid this situation. This would
solve the problem this patch solves while also preventing unnecessary
CFI instructions. However, this is a bit trickier to implement correctly
so I propose to keep this for a later time.

Note on testing: the test creates a simple function with three basic
blocks and forces the CFI state of the last one to be different from the
others using an arbitrary CFI instruction. Then,
`--reorder-blocks=reverse` is used to force `CFISnapshot::advanceTo` to
be called. This causes an assert on the current main branch.
2023-10-20 15:49:17 +00:00
Vladislav Khmelevsky
b7944f7c04
[BOLT] Return proper minimal alignment from BF (#67707)
Currently minimal alignment of function is hardcoded to 2 bytes.
Add 2 more cases:
1. In case BF is data in code return the alignment of CI as minimal
alignment
2. For aarch64 and riscv platforms return the minimal value of 4 (added
test for aarch64)
Otherwise fallback to returning the 2 as it previously was.
2023-10-12 09:33:08 +04:00
Job Noorman
ff5e2babcb
[BOLT] Improve handling of relocations targeting specific instructions (#66395)
On RISC-V, there are certain relocations that target a specific
instruction instead of a more abstract location like a function or basic
block. Take the following example that loads a value from symbol `foo`:

```
nop
1: auipc t0, %pcrel_hi(foo)
ld t0, %pcrel_lo(1b)(t0)
```

This results in two relocation:
- auipc: `R_RISCV_PCREL_HI20` referencing `foo`;
- ld: `R_RISCV_PCREL_LO12_I` referencing to local label `1` which points
to the auipc instruction.

It is of utmost importance that the `R_RISCV_PCREL_LO12_I` keeps
referring to the auipc instruction; if not, the program will fail to
assemble. However, BOLT currently does not guarantee this.

BOLT currently assumes that all local symbols are jump targets and
always starts a new basic block at symbol locations. The example above
results in a CFG the looks like this:

```
.BB0:
    nop
.BB1:
    auipc t0, %pcrel_hi(foo)
    ld t0, %pcrel_lo(.BB1)(t0)
```

While this currently works (i.e., the `R_RISCV_PCREL_LO12_I` relocation
points to the correct instruction), it has two downsides:
- Too many basic blocks are created (the example above is logically only
  one yet two are created);
- If instructions are inserted in `.BB1` (e.g., by instrumentation),
  things will break since the label will not point to the auipc anymore.

This patch proposes to fix this issue by teaching BOLT to track labels
that should always point to a specific instruction. This is implemented
as follows:
- Add a new annotation type (`kLabel`) that allows us to annotate
  instructions with an `MCSymbol *`;
- Whenever we encounter a relocation type that is used to refer to a
  specific instruction (`Relocation::isInstructionReference`), we
  register it without a symbol;
- During disassembly, whenever we encounter an instruction with such a
  relocation, create a symbol for its target and store it in an offset
  to symbol map (to ensure multiple relocations referencing the same
  instruction use the same label);
- After disassembly, iterate this map to attach labels to instructions
  via the new annotation type;
- During emission, emit these labels right before the instruction.

I believe the use of annotations works quite well for this use case as
it allows us to reliably track instruction labels. If we were to store
them as offsets in basic blocks, it would be error prone to keep them
updated whenever instructions are inserted or removed.

I have chosen to add labels as first-class annotations (as opposed to a
generic one) because the documentation of `MCAnnotation` suggests that
generic annotations are to be used for optional metadata that can be
discarded without affecting correctness. As this is not the case for
labels, a first-class annotation seemed more appropriate.
2023-10-06 06:46:16 +00:00