25 Commits

Author SHA1 Message Date
Anatoly Trosinenko
f7f37fb71e
[BOLT] Gadget scanner: make use of C++17 features and LLVM helpers (#141665)
Perform trivial syntactical cleanups:

- make use of structured binding declarations
- use LLVM utility functions when appropriate
- omit braces around single expression inside single-line LLVM_DEBUG()

This patch is NFC aside from minor debug output changes.
2025-10-01 14:12:45 +03:00
Anatoly Trosinenko
55bd45852c
[BOLT] Gadget scanner: optionally assume auth traps on failure (#139778)
On AArch64 it is possible for an auth instruction to either return an
invalid address value on failure (without FEAT_FPAC) or generate an
error (with FEAT_FPAC). It thus may be possible to never emit explicit
pointer checks, if the target CPU is known to support FEAT_FPAC.

This commit implements an --auth-traps-on-failure command line option,
which essentially makes "safe-to-dereference" and "trusted" register
properties identical and disables scanning for authentication oracles
completely.
2025-10-01 14:03:29 +03:00
Anatoly Trosinenko
58edd27670
[BOLT] Gadget scanner: account for BRK when searching for auth oracles (#137975)
An authenticated pointer can be explicitly checked by the compiler via a
sequence of instructions that executes BRK on failure. It is important
to recognize such a BRK instruction as checking every register (as it is
expected to immediately trigger an abnormal program termination) to
prevent false positive reports about authentication oracles:

      autia   x2, x3
      autia   x0, x1
      ; neither x0 nor x2 are checked at this point
      eor     x16, x0, x0, lsl #1
      tbz     x16, #62, on_success ; marks x0 as checked
      ; end of BB: for x2 to be checked here, it must be checked in both
      ; successor basic blocks
    on_failure:
      brk     0xc470
    on_success:
      ; x2 is checked
      ldr     x1, [x2] ; marks x2 as checked
2025-08-25 14:24:19 +03:00
Dmitry Vasilyev
c00df536e3
[BOLT] Fixed cmdline-args.test to work on Windows (#151209)
Added regex to ignore `.exe` in the executable name. 
Ignored OS-dependent message "No such file or directory".
2025-07-31 18:25:39 +04:00
Anatoly Trosinenko
7a5af4f6b8
[BOLT] Gadget scanner: detect untrusted LR before tail call (#137224)
Implement the detection of tail calls performed with an untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, detection of this one involves some amount
of guessing which branch instructions should be checked as tail calls.
2025-06-26 12:37:25 +03:00
Paschalis Mpeis
249f074b22
[BOLT][AArch64] Make gs-pacret-autiasp.s deterministic (#145527)
In gs-pacret-autiasp.s, the undefined call `bl g` causes inconsistent
basic block splitting: on some platforms BOLT emits two blocks, on
others one.

Defining a dummy `g` symbol forces a single basic block everywhere.
2025-06-26 09:33:49 +01:00
Anatoly Trosinenko
a8a2c6fa88
[BOLT] Gadget scanner: fix LR to be safe in leaf functions without CFG (#141824)
After a label in a function without CFG information, use a reasonably
pessimistic estimation of register state (assume that any register that
can be clobbered in this function was actually clobbered) instead of the
most pessimistic "all registers are unsafe". This is the same estimation
as used by the dataflow variant of the analysis when the preceding
instruction is not known for sure.

Without this, leaf functions without CFG information are likely to have
false positive reports about non-protected return instructions, as
1) LR is unlikely to be signed and authenticated in a leaf function and
2) LR is likely to be used by a return instruction near the end of the
   function and
3) the register state is likely to be reset at least once during the
   linear scan through the function
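
The effect of the two estimations can be illustrated with a minimal
sketch. It is not the code from this patch: the `Item` stream, the
`allReturnsLookProtected` helper and the `UseClobberEstimate` flag are
hypothetical, authentication instructions are not modelled at all, and
only LR is assumed to be safe on function entry.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

constexpr unsigned NumRegs = 32;
constexpr unsigned LR = 30; // x30 on AArch64
using RegSet = std::bitset<NumRegs>;

// Hypothetical, heavily simplified instruction stream of one function.
enum class Kind { Label, Write, Ret };
struct Item {
  Kind K;
  unsigned Reg; // written register (Write) or return target (Ret)
};

// Linear scan without CFG: returns true if every return instruction
// uses a register that is still considered safe at that point.
bool allReturnsLookProtected(const std::vector<Item> &Fn,
                             bool UseClobberEstimate) {
  RegSet EntrySafe;
  EntrySafe.set(LR); // assumption: only LR is safe on entry

  // Collect every register that can be clobbered anywhere in the function.
  RegSet Clobbered;
  for (const Item &I : Fn)
    if (I.K == Kind::Write)
      Clobbered.set(I.Reg);

  RegSet Safe = EntrySafe;
  for (const Item &I : Fn) {
    switch (I.K) {
    case Kind::Label:
      // The preceding instruction is unknown after a label, so reset the
      // state: either "everything clobberable was clobbered" or
      // "all registers are unsafe".
      Safe = UseClobberEstimate ? (EntrySafe & ~Clobbered) : RegSet();
      break;
    case Kind::Write:
      Safe.reset(I.Reg);
      break;
    case Kind::Ret:
      if (!Safe.test(I.Reg))
        return false;
      break;
    }
  }
  return true;
}
```

In this model, a leaf function that never writes LR keeps LR safe
across labels only when the clobber-based estimation is used, while a
function that does write LR is reported in both modes.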
2025-06-25 13:11:23 +03:00
Anatoly Trosinenko
20a72083fd
[BOLT] Gadget scanner: improve handling of unreachable basic blocks (#136183)
Instead of refusing to analyze an instruction completely when it is
unreachable according to the CFG reconstructed by BOLT, use a pessimistic
assumption about register state when possible. Nevertheless, unreachable
basic blocks found in optimized code likely mean imprecise CFG
reconstruction, thus report a warning once per function.
2025-06-25 12:29:41 +03:00
Anatoly Trosinenko
e873fd157e
[BOLT] Gadget scanner: do not crash on debug-printing CFI instructions (#136151)
Some instruction-printing code used under LLVM_DEBUG does not handle CFI
instructions well. While CFI instructions seem to be harmless for the
correctness of the analysis results, they do not convey any useful
information to the analysis either, so skip them early.
2025-06-19 15:52:54 +03:00
Anatoly Trosinenko
2b4d757290
[BOLT] Gadget scanner: detect authentication oracles (#135663)
Implement the detection of authentication instructions whose results can
be inspected by an attacker to know whether authentication succeeded.

As the properties of output registers of authentication instructions are
inspected, add a second set of analysis-related classes to iterate over
the instructions in reverse order.
2025-06-19 15:15:26 +03:00
Anatoly Trosinenko
e1328fd9ad
[BOLT] Gadget scanner: clarify MCPlusBuilder callbacks interface (#136147)
Clarify the semantics of `getAuthenticatedReg` and remove a redundant
`isAuthenticationOfReg` method, as combined auth+something instructions
(such as `retaa` on AArch64) should be handled carefully, especially
when searching for authentication oracles: usually, such instructions
cannot be authentication oracles and only some of them actually write an
authenticated pointer to a register (such as "ldra x0, [x1]!").

Use a `std::optional<MCPhysReg>` return type instead of returning a plain
MCPhysReg with `getNoRegister()` as a "not applicable" indication.
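
The general shape of such an interface can be sketched as follows. This
is an illustration only, not the MCPlusBuilder code: `PhysReg`, `Opcode`,
`Inst` and `authenticatedRegOf` are hypothetical stand-ins, and the
classification of opcodes is simplified.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Stand-in for a physical register id (assumption: a small unsigned integer).
using PhysReg = uint16_t;

// Hypothetical instruction model, not the MCInst API.
enum class Opcode { AuthInPlace, AuthLoadWriteback, AuthAndReturn, Other };

struct Inst {
  Opcode Op;
  PhysReg Written; // register receiving the authenticated pointer, if any
};

// Returns the register that receives an authenticated pointer, or
// std::nullopt when the instruction does not write one to a register.
std::optional<PhysReg> authenticatedRegOf(const Inst &I) {
  switch (I.Op) {
  case Opcode::AuthInPlace:
  case Opcode::AuthLoadWriteback:
    return I.Written;
  case Opcode::AuthAndReturn: // authenticates, but writes no pointer back
  case Opcode::Other:
    return std::nullopt;
  }
  return std::nullopt;
}
```

With an optional return value, the caller has to test for the "not
applicable" case explicitly instead of comparing against a sentinel
register.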

Document a few existing methods, add information about preconditions.
2025-05-26 18:31:20 +03:00
Anatoly Trosinenko
f578f56fea
[BOLT] Gadget scanner: refactor issue reporting (#135662)
Remove `getAffectedRegisters` and `setOverwritingInstrs` methods from
the base `Report` class. Instead, rename the `Report` class to
`Diagnostic` and make it always represent the brief version of the
report, which is kept unchanged since initially found. Throughout its
life-cycle, an instance of `Diagnostic` is first wrapped into
`PartialReport<ReqT>` together with an optional request for extra
details. Then, on the second run of the analysis, it is re-wrapped into
`FinalReport` together with the requested detailed information.
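
The life-cycle described above can be sketched roughly as follows. The
class names follow this message, but the fields, the `RegRequest` type
and the `finalize` helper are assumptions made for illustration and do
not reproduce the actual definitions.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Brief version of the report, unchanged since it was first found.
struct Diagnostic {
  std::string Text;
};

// First run: the diagnostic plus an optional request for extra details.
template <typename ReqT> struct PartialReport {
  Diagnostic Issue;
  std::optional<ReqT> RequestedDetails;
};

// Second run: the same diagnostic plus the details that were requested.
struct FinalReport {
  Diagnostic Issue;
  std::vector<std::string> Details;
};

// Hypothetical request type: "which instructions wrote this register?"
struct RegRequest {
  unsigned Reg;
};

// Re-wraps a partial report; details are attached only if requested.
FinalReport finalize(PartialReport<RegRequest> P,
                     const std::vector<std::string> &FoundInstrs) {
  FinalReport F{std::move(P.Issue), {}};
  if (P.RequestedDetails)
    F.Details = FoundInstrs;
  return F;
}
```

Keeping the request type as a template parameter lets each gadget kind
ask for its own kind of detail without touching `Diagnostic` itself.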
2025-05-22 18:27:46 +03:00
Anatoly Trosinenko
48a2836b4d
[BOLT] Gadget scanner: detect signing oracles (#134146)
Implement the detection of signing oracles. In this patch, a signing
oracle is defined as a sign instruction that accepts a "non-protected"
pointer, but for a slightly different definition of "non-protected"
compared to control flow instructions.

A second BitVector named TrustedRegs is added to the register state
computed by the data-flow analysis. The difference between the
"safe-to-dereference" and "trusted" register states is that to make
an unsafe register trusted by authentication, one has to make sure
that the authentication succeeded. For example, on AArch64 without
FEAT_PAuth2 and FEAT_EPAC, an authentication instruction produces an
invalid pointer on failure, so that subsequent memory access triggers
an error, but re-signing such a pointer would "fix" the signature.

Note that while a separate "trusted" register state may be redundant
depending on the specific semantics of auth and sign operations, it is
still important to check signing operations: while code like this

    resign:
      autda x0, x1
      pacda x0, x2
      ret

is probably safe provided `autda` generates an error on authentication
failure, this function

    sign_anything:
      pacda x0, x1
      ret

is inherently unsafe.
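
The distinction between the two register states can be modelled with a
small straight-line sketch. It is a simplification under stated
assumptions, not the analysis added by this patch: the `Op` kinds, the
`findSigningOracles` helper and the `AuthTrapsOnFailure` parameter are
hypothetical, and control flow is ignored entirely.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr unsigned NumRegs = 32;

struct RegState {
  std::bitset<NumRegs> SafeToDeref; // a bad value faults when dereferenced
  std::bitset<NumRegs> Trusted;     // known to hold a valid pointer
};

// Hypothetical instruction kinds, each operating on a single register.
enum class Op { Auth, Check, Sign, Clobber };
struct Inst {
  Op Kind;
  unsigned Reg;
};

// Returns the indexes of sign instructions whose operand is not trusted.
std::vector<std::size_t> findSigningOracles(const std::vector<Inst> &Insts,
                                            bool AuthTrapsOnFailure) {
  RegState S;
  std::vector<std::size_t> Oracles;
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    const Inst &In = Insts[I];
    switch (In.Kind) {
    case Op::Auth:
      S.SafeToDeref.set(In.Reg);
      // Only a trapping auth proves that authentication succeeded.
      if (AuthTrapsOnFailure)
        S.Trusted.set(In.Reg);
      break;
    case Op::Check:
      // An explicit check upgrades a safe-to-dereference register.
      if (S.SafeToDeref.test(In.Reg))
        S.Trusted.set(In.Reg);
      break;
    case Op::Sign:
      if (!S.Trusted.test(In.Reg))
        Oracles.push_back(I);
      // The result is a signed pointer, which is neither safe nor trusted.
      S.SafeToDeref.reset(In.Reg);
      S.Trusted.reset(In.Reg);
      break;
    case Op::Clobber:
      S.SafeToDeref.reset(In.Reg);
      S.Trusted.reset(In.Reg);
      break;
    }
  }
  return Oracles;
}
```

In this model, the `resign` sequence is reported only when auth is
assumed not to trap on failure, whereas `sign_anything` is reported in
both cases.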
2025-05-20 13:42:53 +03:00
Anatoly Trosinenko
f5401c6a16
[BOLT] Gadget scanner: analyze functions without CFG information (#133461)
Support simple analysis of the functions for which BOLT is unable to
reconstruct the CFG. This patch is inspired by the approach implemented
by Kristof Beyls in the original prototype of gadget scanner, but a
CFG-unaware counterpart of the data-flow analysis is implemented
instead of a separate version of the gadget detector, as multiple gadget
kinds are detected now.
2025-05-20 13:01:04 +03:00
Anatoly Trosinenko
2927050dd4
[BOLT] Gadget scanner: refine class names and debug output (NFC) (#135073)
Scanning functions without CFG information as well as the detection of
authentication oracles requires introducing more classes related to
register state analysis. To make the future code easier to understand,
rename several classes beforehand.

To detect authentication oracles, one has to query the properties of
*output* operands of authentication instructions *after* the instruction
is executed - this requires adding another analysis that iterates over
the instructions in reverse order, and a corresponding state class.

As the main difference of the existing `State` class is that it stores
the properties of source register operands of the instructions before
the instruction's execution, rename it to `SrcState` and
`PacRetAnalysis` to `SrcSafetyAnalysis`.

Apply minor adjustments to the debug output along the way.
2025-04-10 20:54:05 +03:00
Anatoly Trosinenko
0fc7aec349
[BOLT] Gadget scanner: detect address materialization and arithmetic (#132540)
In addition to authenticated pointers, consider the contents of a
register safe if it was
* written by PC-relative address computation
* updated by an arithmetic instruction whose input address is safe
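
A minimal sketch of these two rules on a straight-line sequence is shown
below. The `Op` kinds and the `computeSafeRegs` helper are hypothetical,
and modelling arithmetic as "add an immediate" is an assumption made for
brevity, not a statement about which instructions the patch recognizes.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

constexpr unsigned NumRegs = 32;
using RegSet = std::bitset<NumRegs>;

// Hypothetical instruction kinds.
enum class Op { PCRelAddr, AddImm, Load };
struct Inst {
  Op Kind;
  unsigned Dst;
  unsigned Src; // ignored for PCRelAddr
};

// Propagates the "safe" property through a straight-line sequence.
RegSet computeSafeRegs(const std::vector<Inst> &Insts, RegSet Safe = RegSet()) {
  for (const Inst &I : Insts) {
    switch (I.Kind) {
    case Op::PCRelAddr:
      Safe.set(I.Dst); // address materialized relative to PC
      break;
    case Op::AddImm:
      Safe.set(I.Dst, Safe.test(I.Src)); // safe only if the input is safe
      break;
    case Op::Load:
      Safe.reset(I.Dst); // a value loaded from memory is not trusted
      break;
    }
  }
  return Safe;
}
```

For example, a register derived from a PC-relative address by adding a
constant stays safe, while a register derived from a loaded value does
not.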
2025-04-07 13:13:11 +03:00
Anatoly Trosinenko
c818ae7399
[BOLT] Gadget scanner: detect non-protected indirect calls (#131899)
Implement the detection of non-protected indirect calls and branches
similar to the pac-ret scanner.
2025-04-03 16:40:34 +03:00
Anatoly Trosinenko
b6b40e9ac9
[BOLT] Gadget scanner: reformulate the state for data-flow analysis (#131898)
In preparation for implementing support for detection of non-protected
call instructions, refine the definition of state which is computed for
each register by data-flow analysis.

Explicitly marking the registers which are known to be trusted at
function entry is crucial for finding non-protected calls. In addition,
it fixes less-common false negatives for pac-ret, such as `ret x1` in
`f_nonx30_ret_non_auted` test case.
2025-03-25 21:45:02 +03:00
Anatoly Trosinenko
72d1058af0
[BOLT] Gadget scanner: refactor analysis of RET instructions (#131897)
In preparation for implementing detection of more gadget kinds,
refactor checking for non-protected return instructions.
2025-03-21 19:54:57 +03:00
Anatoly Trosinenko
03557169e0
[BOLT] Gadget scanner: streamline issue reporting (#131896)
In preparation for adding more gadget kinds to detect, streamline
issue reporting.

Rename classes representing issue reports. In particular, rename
`Annotation` base class to `Report`, as it has nothing to do with
"annotations" in `MCPlus` terms anymore. Remove references to "return
instructions" from variable names and report messages, use generic
terms instead. Rename NonPacProtectedRetAnalysis to PAuthGadgetScanner.

Remove `GeneralDiagnostic` as a separate class, make `GenericReport`
(former `GenDiag`) store `std::string Text` directly. Remove unused
`operator=` and `operator==` methods, as `Report`s are created on the
heap and referenced via `shared_ptr`s.

Introduce `GadgetKind` class - currently, it only wraps a `const char *`
description to display to the user. This description is intended to be
a per-gadget-kind constant (or a few hard-coded constants), so there is no
need to store it in a `std::string` field in each report instance. To handle
both free-form `GenericReport`s and statically-allocated messages
without unnecessary overhead, move printing of the report header to the
base class (and take the message argument as a `StringRef`).
2025-03-21 11:19:53 +03:00
Anatoly Trosinenko
482b95217e
[BOLT] Gadget scanner: factor out utility code (#131895)
Factor out the code for mapping from physical registers to consecutive
array indexes.
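
The idea of such a mapping can be sketched as follows. The `RegIndexMap`
class and its interface are hypothetical and only illustrate the
technique: a sparse set of register ids is mapped to dense indexes that
are suitable for a compact bit vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using PhysReg = uint16_t; // stand-in for a physical register id

constexpr std::size_t NotTracked = static_cast<std::size_t>(-1);

// Maps an arbitrary set of register ids to indexes 0, 1, 2, ...
class RegIndexMap {
  std::vector<std::size_t> IndexOfReg; // indexed by register id
  std::size_t NumTracked = 0;

public:
  explicit RegIndexMap(const std::vector<PhysReg> &Tracked) {
    for (PhysReg R : Tracked) {
      if (R >= IndexOfReg.size())
        IndexOfReg.resize(R + 1, NotTracked);
      if (IndexOfReg[R] == NotTracked) // ignore duplicates
        IndexOfReg[R] = NumTracked++;
    }
  }

  // Dense index of a tracked register, or std::nullopt otherwise.
  std::optional<std::size_t> indexOf(PhysReg R) const {
    if (R >= IndexOfReg.size() || IndexOfReg[R] == NotTracked)
      return std::nullopt;
    return IndexOfReg[R];
  }

  std::size_t size() const { return NumTracked; }
};
```

The per-register state can then be stored in an array of `size()`
elements instead of one sized by the largest register id.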

Introduce helper functions to print instructions and registers to
prevent mixing of analysis logic and implementation details of debug
output.

Remove the debug printing from `Gadget::generateReport`, as it doesn't
seem to add important information to what was already printed in the
report itself.
2025-03-20 19:35:31 +03:00
Kristof Beyls
6c61c55756
[BOLT] pacret-scanner: fix regression test failure (#128576)
... which is caused by a seemingly recent change in BOLT's basic block
calculation, where function calls seem to be ending basic blocks? I
don't have a pointer to the commit that caused this change. I'll be
looking for that later. For now, I'm trying to get the regression tests
passing again.
2025-02-24 21:08:43 +00:00
Kristof Beyls
55c76ea391
[BOLT] pacret-scanner: fix regression tests... (#128565)
by making the regex to match basic block names more general. See failing
test case that was reported on some system in comment
https://github.com/llvm/llvm-project/pull/122304#issuecomment-2679460678

These test cases were introduced in PR #122304, commit
850b49297615a613ac83adca2c9cf823a4b8ef95 .
2025-02-24 20:24:12 +00:00
Kristof Beyls
850b492976
[BOLT][binary-analysis] Add initial pac-ret gadget scanner (#122304)
This adds an initial pac-ret gadget scanner to the
llvm-bolt-binary-analysis-tool.

The scanner is taken from the prototype that was published last year at
https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype,
and has been discussed in RFC

https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148
and in the EuroLLVM 2024 keynote "Does LLVM implement security
hardenings correctly? A BOLT-based static analyzer to the rescue?"
[Video](https://youtu.be/Sn_Fxa0tdpY)
[Slides](https://llvm.org/devmtg/2024-04/slides/Keynote/Beyls_EuroLLVM2024_security_hardening_keynote.pdf)

In the spirit of incremental development, this PR aims to add a minimal
implementation that is "fully working" on its own, but has major
limitations, as described in the bolt/docs/BinaryAnalysis.md
documentation in this proposed commit. These and other limitations will
be fixed in follow-on PRs, mostly based on code already existing in the
prototype branch. I hope incrementally upstreaming will make it easier
to review the code.

Note that I believe that this could also form the basis of a scanner to
analyze correct implementation of PAuthABI.
2025-02-24 07:26:28 +00:00
Kristof Beyls
ceb7214be0
[BOLT] Introduce binary analysis tool based on BOLT (#115330)
This initial commit does not add any specific binary analyses yet, it
merely contains the boilerplate to introduce a new BOLT-based tool.

This basically combines the first 4 patches from the prototype pac-ret
and stack-clash binary analyzer discussed in RFC
https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148
and published at
https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype

The introduction of such a BOLT-based binary analysis tool was proposed
and discussed in at least the following places:
- The RFC pointed to above
- EuroLLVM 2024 round table
https://discourse.llvm.org/t/summary-of-bolt-as-a-binary-analysis-tool-round-table-at-eurollvm/78441
The round table showed quite a few people interested in being able to
build a custom binary analysis quickly with a tool like this.
- Also at the US LLVM dev meeting a few weeks ago, I heard interest from
a few people, asking when the tool would be available upstream.
- The presentation "Adding Pointer Authentication ABI support for your
ELF platform"
(https://llvm.swoogo.com/2024devmtg/session/2512720/adding-pointer-authentication-abi-support-for-your-elf-platform)
explicitly mentioned interest to extend the prototype tool to verify
correct implementation of pauthabi.
2024-12-12 10:06:27 +00:00