171 Commits

Author SHA1 Message Date
Haibo Jiang
21a5729b87
[BOLT] Do not use HLT as split point when build the CFG (#150963)
For x86, the halt instruction is defined as a terminator instruction.
When building the CFG, the instruction sequence following the hlt
instruction is treated as an independent MBB. Since there is no jump
information, the predecessor of this MBB cannot be identified, and it is
considered an unreachable MBB that will be removed.

Using this fix, the instruction sequences before and after hlt are
refused to be placed in different blocks.
2025-08-15 14:35:13 -07:00
Fangrui Song
dcf485609c
MC: Centralize X86 PC-relative fixup adjustment in MCAssembler
Move the X86 PC-relative fixup adjustment from
X86MCCodeEmitter::emitImmediate to MCAssembler, leveraging a generalized
evaluateFixup. This saves a MCBinaryExpr. For `call foo`, the fixup
expression is now `foo` instead of `foo-4`. There is no change in
generated relocations.

In bolt/lib/Target/X86/X86MCPlusBuilder.cpp, createRelocation needs to
decrease the addend.

Both max-rss and instructions:u show a minor decrease.
https://llvm-compile-time-tracker.com/compare.php?from=ea600576a6f94d6f28925c4b99962cc26b463c29&to=016e8fd4ddf851e5555f606c6394241d68f1a7bb&stat=max-rss&linkStats=on

Next: Update targets that use FKF_IsAlignedDownTo32Bits to define
`evaluateFixup` and remove FKF_IsAlignedDownTo32Bits from the generic
code.

Pull Request: https://github.com/llvm/llvm-project/pull/147113
2025-07-08 09:22:30 -07:00
Fangrui Song
244e053b6c MC: Remove llvm/MC/MCFixupKindInfo.h
The file used to define `MCFixupKindInfo`, a simple structure,
which is now in MCAsmBackend.h.
2025-07-05 11:24:11 -07:00
Fangrui Song
5b7f1c17d9 BOLT: Replace deprecated MCFixupKindInfo::FKF_IsPCRel with MCFixup::isPCRel
MCFixup::PCRel is now set at creation and the MCFixupKindInfo::FKF_IsPCRel flag
is no longer set.
2025-07-04 17:33:20 -07:00
Fangrui Song
2bfc488d34 X86MCCodeEmitter: Remove unneeded MCFixupKindInfo::FKF_IsPCRel 2025-07-04 16:30:07 -07:00
Fangrui Song
109b7d965c MC: Remove unneeded VK_None argument to MCSymbolRefExpr::create calls
The MCSymbolRefExpr::create overload with the specifier parameter is
discouraged and being phased out. Expressions with relocation specifiers
should use MCSpecifierExpr instead.
2025-06-27 21:22:46 -07:00
Fangrui Song
30922f740e
Move relocation specifier constants to AArch64::
Rename these relocation specifier constants, aligning with the naming
convention used by other targets (`S_` instead of `VK_`).

* ELF/COFF: AArch64MCExpr::VK_ => AArch64::S_ (VK_ABS/VK_PAGE_ABS are
  also used by Mach-O as a hack)
* Mach-O: AArch64MCExpr::M_ => AArch64::S_MACHO_
* shared: AArch64MCExpr::None => AArch64::S_None

Apologies for the churn following the recent rename in #132595. This
change ensures consistency after introducing MCSpecifierExpr to replace
MCTargetSpecifier subclasses.

Pull Request: https://github.com/llvm/llvm-project/pull/144633
2025-06-24 19:06:22 -07:00
Fangrui Song
17e8465a3e
AArch64: Replace AArch64MCExpr with MCSpecifierExpr
Replace AArch64MCExpr, which encodes expressions with relocation
specifiers, with the new generic MCSpecifierExpr interface, aligning
with other targets by phasing out target-specific XXXMCExpr classes.

Temporarily convert AArch64MCExpr to a namespace to avoid renaming
`AArch64MCExpr::VK_` constants in this PR. A follow-up patch will rename
these to `AArch64::S_` to match the convention used by other targets.

Move helper functions to AArch64MCAsmInfo.h, with the goal of eventually
removing AArch64MCExpr.h.

Pull Request: https://github.com/llvm/llvm-project/pull/144632
2025-06-20 20:06:32 -07:00
Fangrui Song
f11dd116e0 RISCV: Replace RISCVMCExpr with MCSpecifierExpr 2025-06-15 16:51:09 -07:00
Fangrui Song
fedf6c68dd RISCV: Move RISCVMCExpr functions to RISCVMCAsmInfo or RISCVMCAsmBackend
* Move getPCRelHiFixup closer to the only caller RISCVAsmBackend::evaluateTargetFixup.
* Declare getSpecifierForName in RISCVMCAsmInfo, in align with other
  targets that have migrated to the new relocation specifier representation.
2025-06-15 16:22:39 -07:00
Fangrui Song
4635b6076d RISCV: Rename RISCVMCExpr::VK_ to RISCV::S_ 2025-06-15 16:01:28 -07:00
Fangrui Song
cdd0a6c781 BOLT: Replace MCTargetExpr with MCSpecifierExpr to fix bolt-icf.test on aarch64 host 2025-06-07 22:35:20 -07:00
Anatoly Trosinenko
e1328fd9ad
[BOLT] Gadget scanner: clarify MCPlusBuilder callbacks interface (#136147)
Clarify the semantics of `getAuthenticatedReg` and remove a redundant
`isAuthenticationOfReg` method, as combined auth+something instructions
(such as `retaa` on AArch64) should be handled carefully, especially
when searching for authentication oracles: usually, such instructions
cannot be authentication oracles and only some of them actually write an
authenticated pointer to a register (such as "ldra x0, [x1]!").

Use `std::optional<MCPhysReg>` returned type instead of plain MCPhysReg
and returning `getNoRegister()` as a "not applicable" indication.

Document a few existing methods, add information about preconditions.
2025-05-26 18:31:20 +03:00
Kazu Hirata
c0e7a59204
[BOLT] Remove redundant control flow statements (NFC) (#141182) 2025-05-22 22:36:23 -07:00
Anatoly Trosinenko
48a2836b4d
[BOLT] Gadget scanner: detect signing oracles (#134146)
Implement the detection of signing oracles. In this patch, a signing
oracle is defined as a sign instruction that accepts a "non-protected"
pointer, but for a slightly different definition of "non-protected"
compared to control flow instructions.

A second BitVector named TrustedRegs is added to the register state
computed by the data-flow analysis. The difference between a
"safe-to-dereference" and a "trusted" register states is that to make
an unsafe register trusted by authentication, one has to make sure
that the authentication succeeded. For example, on AArch64 without
FEAT_PAuth2 and FEAT_EPAC, an authentication instruction produces an
invalid pointer on failure, so that subsequent memory access triggers
an error, but re-signing such pointer would "fix" the signature.

Note that while a separate "trusted" register state may be redundant
depending on the specific semantics of auth and sign operations, it is
still important to check signing operations: while code like this

    resign:
      autda x0, x1
      pacda x0, x2
      ret

is probably safe provided `autda` generates an error on authentication
failure, this function

    sign_anything:
      pacda x0, x1
      ret

is inherently unsafe.
2025-05-20 13:42:53 +03:00
Fangrui Song
2f05451198
RISCV: Replace most Specifier constants with relocation types
... as they map directly and we don't utilize -Wswitch.
Retained VK_*_LO constants for lowering to LO_I or LO_S.

The Sparc port has eliminated all Specifier constants (commit
003fa7731d81a47c98e9c55f80d509933c9b91f6), and the LoongArch port is
nearly free of them (#138632).

Pull Request: https://github.com/llvm/llvm-project/pull/138644
2025-05-17 10:14:33 -07:00
Kazu Hirata
d5b170c39b
[BOLT] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#139403) 2025-05-10 13:39:15 -07:00
Fangrui Song
c239acb5b6 MCFixup: Make FixupKindInfo smaller and change getFixupKindInfo to return value
We will increase the use of raw relocation types and eliminate fixup
kinds that correspond to relocation types. The getFixupKindInfo
functions will return an rvalue instead. Let's update the return type
from a const reference to a value type.
2025-04-18 20:55:43 -07:00
Kazu Hirata
2af5e01456
[BOLT][RISCV] Fix MCPlusBuilder instrumentation ifaces (#136211)
a) Due to the different capabilities of the functions implemented,
rename the createCmpJE function
b) Refactor the convertIndirectCallToLoad function to override the
interface.

Patch by WangJee, originally posted in #136129
2025-04-17 15:27:44 -07:00
wangjue
dbb79c30c9
[BOLT][Instrumentation] Initial instrumentation support for RISCV64 (#133882)
This patch adds code generation for RISCV64 instrumentation.The work
    involved includes the following three points:

a) Implements support for instrumenting direct function call and jump
    on RISC-V which relies on , Atomic instructions
    (used to increment counters) are only available on RISC-V when the A
    extension is used.

b) Implements support for instrumenting direct function inderect call
    by implementing the createInstrumentedIndCallHandlerEntryBB and
createInstrumentedIndCallHandlerExitBB interfaces. In this process, we
    need to accurately record the target address and IndCallID to ensure
    the correct recording of the indirect call counters.

c)Implemented the RISCV64 Bolt runtime library, implemented some system
call interfaces through embedded assembly. Get the difference between
runtime addrress of .text section andstatic address in section header
table, which in turn can be used to search for indirect call
description.

However, the community code currently has problems with relocation in
    some scenarios, but this has nothing to do with instrumentation. We
    may continue to submit patches to fix the related bugs.
2025-04-16 23:01:00 -07:00
Anatoly Trosinenko
8521bd2424
[BOLT][AArch64] Handle PAuth call instructions in isIndirectCall (#133227)
Handle `BLRA*` opcodes in AArch64MCPlusBuilder::isIndirectCall, update
getRegUsedAsCallDest accordingly.
2025-04-08 13:23:10 +03:00
Anatoly Trosinenko
0fc7aec349
[BOLT] Gadget scanner: detect address materialization and arithmetic (#132540)
In addition to authenticated pointers, consider the contents of a
register safe if it was
* written by PC-relative address computation
* updated by an arithmetic instruction whose input address is safe
2025-04-07 13:13:11 +03:00
Maksim Panchenko
e4cbb7780b
[BOLT][AArch64] Fix symbolization of unoptimized TLS access (#134332)
TLS relocations may not have a valid BOLT symbol associated with them.
While symbolizing the operand, we were checking for the symbol value,
and since there was no symbol the check resulted in a crash.

Handle TLS case while performing operand symbolization on AArch64.
2025-04-04 11:42:21 -07:00
Rodrigo Rocha
b9891715af
[BOLT] Handle generation of compare and jump sequences (#131949)
This patch fixes the following two issues with the createCmpJE for
AArch64:
1. Avoids overwriting the value of the input register RegNo by use XZR
as the destination register.
   subs xzr, RegNo, #Imm
   which is equivalent to a simple
   cmp RegNo, #Imm
2. The immediate operand to the Bcc instruction must be EQ instead of
#Imm.

This patch also adds a new function for createCmpJNE and unit tests for
the both createCmpJE and createCmpJNE for X86 and AArch64.
2025-04-03 18:34:24 -07:00
Anatoly Trosinenko
c818ae7399
[BOLT] Gadget scanner: detect non-protected indirect calls (#131899)
Implement the detection of non-protected indirect calls and branches
similar to pac-ret scanner.
2025-04-03 16:40:34 +03:00
Maksim Panchenko
b2d272ccfb
[BOLT][X86] Fix getTargetSymbol() (#133834)
In 96e5ee2, I inadvertently broke the way non-trivial symbol references
got updated from non-optimized code. The breakage was a consequence of
`getTargetSymbol(MCExpr *)` not returning a symbol when the parameter
was a binary expression. Fix `getTargetSymbol()` to cover such cases.
2025-03-31 18:31:33 -07:00
Maksim Panchenko
96e5ee23a7
[BOLT][AArch64] Add partial support for lite mode (#133014)
In lite mode, we only emit code for a subset of functions while
preserving the original code in .bolt.org.text. This requires updating
code references in non-emitted functions to ensure that:

* Non-optimized versions of the optimized code never execute.
* Function pointer comparison semantics is preserved.

On x86-64, we can update code references in-place using "pending
relocations" added in scanExternalRefs(). However, on AArch64, this is
not always possible due to address range limitations and linker address
"relaxation".

There are two types of code-to-code references: control transfer (e.g.,
calls and branches) and function pointer materialization.
AArch64-specific control transfer instructions are covered by #116964.

For function pointer materialization, simply changing the immediate
field of an instruction is not always sufficient. In some cases, we need
to modify a pair of instructions, such as undoing linker relaxation and
converting NOP+ADR into ADRP+ADD sequence.

To achieve this, we use the instruction patch mechanism instead of
pending relocations. Instruction patches are emitted via the regular MC
layer, just like regular functions. However, they have a fixed address
and do not have an associated symbol table entry. This allows us to make
more complex changes to the code, ensuring that function pointers are
correctly updated. Such mechanism should also be portable to RISC-V and
other architectures.

To summarize, for AArch64, we extend the scanExternalRefs() process to
undo linker relaxation and use instruction patches to partially
overwrite unoptimized code.
2025-03-27 21:33:25 -07:00
Anatoly Trosinenko
b6b40e9ac9
[BOLT] Gadget scanner: reformulate the state for data-flow analysis (#131898)
In preparation for implementing support for detection of non-protected
call instructions, refine the definition of state which is computed for
each register by data-flow analysis.

Explicitly marking the registers which are known to be trusted at
function entry is crucial for finding non-protected calls. In addition,
it fixes less-common false negatives for pac-ret, such as `ret x1` in
`f_nonx30_ret_non_auted` test case.
2025-03-25 21:45:02 +03:00
Fangrui Song
42a8813757 [RISCV] Rename VariantKind to Specifier
Follow the X86 and Mips renaming.

> "Relocation modifier" suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation.
> "Relocation specifier" is clear, aligns with Arm and IBM AIX's documentation, and fits the assembler's role seamlessly.

In addition, rename *MCExpr::getKind, which confusingly shadows the base class getKind.
2025-03-20 22:25:57 -07:00
Maksim Panchenko
70bf5e514b
[BOLT][AArch64] Symbolize ADRP after relaxation (#131414)
When the linker relaxes a GOT load, it changes ADRP+LDR instruction pair
into ADRP+ADD. It is relatively straightforward to detect and symbolize
the second instruction in the disassembler. However, it is not always
possible to properly symbolize the ADRP instruction without looking at
the second instruction. Hence, we have the FixRelaxationPass that adjust
the operand of ADRP by looking at the corresponding ADD.

This PR tries to properly symbolize ADRP earlier in the pipeline, i.e.
in AArch64MCSymbolizer. This change makes it easier to adjust the
instruction once we add AArch64 support in `scanExternalRefs()`.
Additionally, we get a benefit of looking at proper operands while
observing the function state prior to running FixRelaxationPass.

To disambiguate the operand of ADRP that has a GOT relocation against
it, we look at the contents/value of the operand. If it contains an
address of a page that is valid for GOT, we assume that the operand
wasn't modified by the linker and leave it up to FixRelaxationPass to do
a proper adjustment. If the page referenced by ADRP cannot point to GOT,
then it's an indication that the linker has modified the operand and we
substitute the operand with a non-GOT reference to the symbol.
2025-03-18 14:31:31 -07:00
Kazu Hirata
c72f7958b0 [BOLT] Fix the build
This is a follow-up for:

  commit 3c4b9317916ccd2e18c30b1540589518a4c7c88a
  Author: Fangrui Song <i@maskray.me>
  Date:   Mon Mar 17 20:05:28 2025 -0700
2025-03-17 20:18:34 -07:00
Anatoly Trosinenko
4f2ee07454
[BOLT][AArch64] Do not crash on authenticated branch instructions (#129898)
When an indirect branch instruction is decoded, analyzeIndirectBranch
method is asked if this is a well-known code pattern. On AArch64, the
only special pattern which is detected is Jump Table, emitted as a
branch to the sum of a constant base address and a variable offset.
Therefore, `Inst.getOpcode()` being one of `AArch64::BRA*` means Inst
cannot belong to such Jump Table pattern, thus returning early.
2025-03-17 12:00:05 +03:00
Kazu Hirata
4b1b629d60 [BOLT] Fix a warning
This patch fixes:

  bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp:128:20: error:
  unused variable 'SymbolPageAddr' [-Werror,-Wunused-variable]
2025-03-14 19:20:03 -07:00
Maksim Panchenko
bac21719a8
[BOLT] Pass unfiltered relocations to disassembler. NFCI (#131202)
Instead of filtering and modifying relocations in readRelocations(),
preserve the relocation info and use it in the symbolizing disassembler.
This change mostly affects AArch64, where we need to look at original
linker relocations in order to properly symbolize instruction operands.
2025-03-14 18:44:33 -07:00
Paschalis Mpeis
2f9d94981c
[BOLT] Change Relocation Type to 32-bit NFCI (#130792) 2025-03-14 18:15:59 +00:00
Maksim Panchenko
a28daa7c1a
[BOLT][AArch64] Keep relocations for linker-relaxed instructions. NFCI (#129980)
We used to filter out relocations corresponding to NOP+ADR instruction
pairs that were a result of linker "relaxation" optimization. However,
these relocations will be useful for reversing the linker optimization.
Keep the relocations and ignore them while symbolizing ADR instruction
operands.
2025-03-05 23:06:01 -08:00
Maksim Panchenko
b971d4d7c8
[BOLT][AArch64] Add symbolizer for AArch64 disassembler. NFCI (#127969)
Add AArch64MCSymbolizer that symbolizes `MCInst` operands during
disassembly. The symbolization was previously done in
`BinaryFunction::disassemble()`, but it is also required by
`scanExternalRefs()` for "lite" mode functionality. Hence, similar to
x86, I've implemented the symbolizer interface that uses
`BinaryFunction` relocations to properly create instruction operands. I
expect the result of the disassembly to be identical after the change.

AArch64 disassembler was not calling `tryAddingSymbolicOperand()` for
`MOV` instructions. Fix that. Additionally, the disassembler marks `ldr`
instructions as branches by setting `IsBranch` parameter to true. Ignore
the parameter and rely on `MCPlusBuilder` interface instead.

I've modified `--check-encoding` flag to check symolization of operands
of instructions that have relocations against them.
2025-03-03 12:44:28 -08:00
Maksim Panchenko
8910e41c86
[BOLT][AArch64] Refactor ADR to ADRP+ADD conversion pass. NFCI (#129399)
In preparation of using the new interface in more places, refactor the
ADR conversion pass.
2025-03-01 14:10:59 -08:00
Kristof Beyls
850b492976
[BOLT][binary-analysis] Add initial pac-ret gadget scanner (#122304)
This adds an initial pac-ret gadget scanner to the
llvm-bolt-binary-analysis-tool.

The scanner is taken from the prototype that was published last year at
https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype,
and has been discussed in RFC

https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148
and in the EuroLLVM 2024 keynote "Does LLVM implement security
hardenings correctly? A BOLT-based static analyzer to the rescue?"
[Video](https://youtu.be/Sn_Fxa0tdpY)
[Slides](https://llvm.org/devmtg/2024-04/slides/Keynote/Beyls_EuroLLVM2024_security_hardening_keynote.pdf)

In the spirit of incremental development, this PR aims to add a minimal
implementation that is "fully working" on its own, but has major
limitations, as described in the bolt/docs/BinaryAnalysis.md
documentation in this proposed commit. These and other limitations will
be fixed in follow-on PRs, mostly based on code already existing in the
prototype branch. I hope incrementally upstreaming will make it easier
to review the code.

Note that I believe that this could also form the basis of a scanner to
analyze correct implementation of PAuthABI.
2025-02-24 07:26:28 +00:00
Maksim Panchenko
1b4bd4e1a5
[BOLT][AArch64] Remove assertions from jump table heuristic (#124372)
The code for jump table detection on AArch64 asserts liberally whenever
the input instruction sequence does not match the expected pattern. As a
result, BOLT fails to process binaries with such sequences instead of
ignoring functions with unknown control flow.

Remove asserts in analyzeIndirectBranchFragment() and mark indirect
jumps as instructions with unknown control flow instead.
2025-01-24 16:43:02 -08:00
Maksim Panchenko
34c6c5e72f
[BOLT][AArch64] Fix PLT optimization (#124192)
Preserve C++ exception metadata while running PLT optimization on
AArch64.
2025-01-24 14:20:24 -08:00
Alexey Moksyakov
ad599c25d9
[BOLT][AArch64] Add isPush & isPop (#120713)
This functionality is needed for inliner pass and also for correct dyno
stats.

Needed for [PR](https://github.com/llvm/llvm-project/pull/120187)
2025-01-20 10:42:48 +08:00
Nicholas
ee4282259d
[BOLT][AArch64]support inline-small-functions for AArch64 (#120187)
Add some functions in `AArch64MCPlusBuilder.cpp` to support inline for
AArch64.
2025-01-17 17:55:55 +08:00
Nicholas
1fa02b9684
[BOLT][AArch64] Speedup computeInstructionSize (#121106)
AArch64 instructions have a fixed size 4 bytes, no need to compute.
2025-01-17 09:48:17 +08:00
Nikita Popov
e7244d8659
[BOLT][CMake] Don't export bolt libraries in LLVMExports.cmake (#121936)
Bolt makes use of add_llvm_library and as such ends up exporting its
libraries from LLVMExports.cmake, which is not correct.

Bolt doesn't have its own exports file, and I assume that there is no
desire to have one either -- Bolt libraries are not intended to be
consumed as a cmake module, right?

As such, this PR adds a NO_EXPORT option to simplify exclude these
libraries from the exports file.
2025-01-08 09:41:09 +01:00
Alexey Moksyakov
e11d49cbf5
[BOLT][AArch64] Adds tls relocations support (#117465)
Co-authored-by: yavtuk <yavtuk@ya.ru>
2024-12-20 15:54:36 +03:00
Maksim Panchenko
be89e794f7
[BOLT][AArch64] Add support for long absolute LLD thunks/veneers (#113408)
Absolute thunks generated by LLD reference function addresses recorded
as data in code. Since they are generated by the linker, they don't have
relocations associated with them and thus the addresses are left
undetected. Use pattern matching to detect such thunks and handle them
in VeneerElimination pass.
2024-11-12 11:27:08 -08:00
Tristan Ross
abc2eae682
[BOLT] Enable standalone build (#97130)
Continue from #87196 as author did not have much time, I have taken over
working on this PR. We would like to have this so it'll be easier to
package for Nix.

Can be tested by copying cmake, bolt, third-party, and llvm directories
out into their own directory with this PR applied and then build bolt.

---------

Co-authored-by: pca006132 <john.lck40@gmail.com>
2024-07-25 08:18:14 -07:00
Amir Ayupov
3023b15fb1 [BOLT] Support POSSIBLE_PIC_FIXED_BRANCH
Detect and support fixed PIC indirect jumps of the following form:
```
movslq  En(%rip), %r1
leaq  PIC_JUMP_TABLE(%rip), %r2
addq  %r2, %r1
jmpq  *%r1
```

with PIC_JUMP_TABLE that looks like following:

```
  JT:  ----------
   E1:| L1 - JT  |
      |----------|
   E2:| L2 - JT  |
      |----------|
      |          |
         ......
   En:| Ln - JT  |
       ----------
```

The code could be produced by compilers, see
https://github.com/llvm/llvm-project/issues/91648.

Test Plan: updated jump-table-fixed-ref-pic.test

Reviewers: maksfb, ayermolo, dcci, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/91667
2024-07-18 20:57:05 -07:00
Paschalis Mpeis
587308c343
[BOLT][AArch64] Provide createDummyReturnFunction (#96626)
AArch64 needs this function when instrumenting statically-linked binaries.

Sample commands:
```bash
clang -Wl,-q test.c -static -o out
llvm-bolt -instrument -instrumentation-sleep-time=5 out -o out.instr
```
2024-07-15 07:20:47 +01:00