llvm-project

Author	SHA1	Message	Date
Anatoly Trosinenko	f7f37fb71e	[BOLT] Gadget scanner: make use of C++17 features and LLVM helpers (#141665) Perform trivial syntactical cleanups: (1) make use of structured binding declarations; (2) use LLVM utility functions when appropriate; (3) omit braces around a single expression inside a single-line LLVM_DEBUG(). This patch is NFC aside from minor debug output changes.	2025-10-01 14:12:45 +03:00
Anatoly Trosinenko	55bd45852c	[BOLT] Gadget scanner: optionally assume auth traps on failure (#139778) On AArch64, an auth instruction may either return an invalid address value on failure (without FEAT_FPAC) or generate an error (with FEAT_FPAC). It may thus be possible to omit explicit pointer checks entirely if the target CPU is known to support FEAT_FPAC. This commit implements an --auth-traps-on-failure command line option, which essentially makes the "safe-to-dereference" and "trusted" register properties identical and disables scanning for authentication oracles completely.	2025-10-01 14:03:29 +03:00
Anatoly Trosinenko	58edd27670	[BOLT] Gadget scanner: account for BRK when searching for auth oracles (#137975) An authenticated pointer can be explicitly checked by the compiler via a sequence of instructions that executes BRK on failure. It is important to recognize such a BRK instruction as checking every register (as it is expected to immediately trigger an abnormal program termination) to prevent false positive reports about authentication oracles. Consider the sequence `autia x2, x3`, `autia x0, x1`, `eor x16, x0, x0, lsl #1`, `tbz x16, #62, on_success`, followed by the blocks `on_failure: brk 0xc470` and `on_success: ldr x1, [x2]`. After the two `autia` instructions, neither x0 nor x2 is checked. The `tbz` marks x0 as checked. For x2 to be checked at the end of that basic block, it must be checked in both successor basic blocks: the `brk` checks it in `on_failure`, and in `on_success` the `ldr` marks x2 as checked.	2025-08-25 14:24:19 +03:00
Dmitry Vasilyev	c00df536e3	[BOLT] Fixed cmdline-args.test to work on Windows (#151209) Added a regex to ignore `.exe` in the executable name. Ignored the OS-dependent message "No such file or directory".	2025-07-31 18:25:39 +04:00
Anatoly Trosinenko	7a5af4f6b8	[BOLT] Gadget scanner: detect untrusted LR before tail call (#137224) Implement the detection of tail calls performed with an untrusted link register, which violates the assumption made on entry to every function. Unlike other pauth gadgets, detecting this one involves some amount of guessing which branch instructions should be checked as tail calls.	2025-06-26 12:37:25 +03:00
Paschalis Mpeis	249f074b22	[BOLT][AArch64] Make gs-pacret-autiasp.s deterministic (#145527) In gs-pacret-autiasp.s, the undefined call `bl g` causes inconsistent basic block splitting: on some platforms BOLT emits two blocks, on others one. Defining a dummy `g` symbol forces a single basic block everywhere.	2025-06-26 09:33:49 +01:00
Anatoly Trosinenko	a8a2c6fa88	[BOLT] Gadget scanner: fix LR to be safe in leaf functions without CFG (#141824) After a label in a function without CFG information, use a reasonably pessimistic estimation of the register state (assume that any register that can be clobbered in this function was actually clobbered) instead of the most pessimistic "all registers are unsafe". This is the same estimation as used by the dataflow variant of the analysis when the preceding instruction is not known for sure. Without this, leaf functions without CFG information are likely to have false positive reports about non-protected return instructions, as (1) LR is unlikely to be signed and authenticated in a leaf function, (2) LR is likely to be used by a return instruction near the end of the function, and (3) the register state is likely to be reset at least once during the linear scan through the function.	2025-06-25 13:11:23 +03:00
Anatoly Trosinenko	20a72083fd	[BOLT] Gadget scanner: improve handling of unreachable basic blocks (#136183) Instead of refusing to analyze an instruction completely when it is unreachable according to the CFG reconstructed by BOLT, use a pessimistic assumption about the register state when possible. Nevertheless, unreachable basic blocks found in optimized code likely indicate imprecise CFG reconstruction, so report a warning once per function.	2025-06-25 12:29:41 +03:00
Anatoly Trosinenko	e873fd157e	[BOLT] Gadget scanner: do not crash on debug-printing CFI instructions (#136151) Some instruction-printing code used under LLVM_DEBUG does not handle CFI instructions well. While CFI instructions seem to be harmless for the correctness of the analysis results, they do not convey any useful information to the analysis either, so skip them early.	2025-06-19 15:52:54 +03:00
Anatoly Trosinenko	2b4d757290	[BOLT] Gadget scanner: detect authentication oracles (#135663) Implement the detection of authentication instructions whose results can be inspected by an attacker to learn whether authentication succeeded. As the properties of the output registers of authentication instructions are inspected, add a second set of analysis-related classes to iterate over the instructions in reverse order.	2025-06-19 15:15:26 +03:00
Anatoly Trosinenko	e1328fd9ad	[BOLT] Gadget scanner: clarify MCPlusBuilder callbacks interface (#136147) Clarify the semantics of `getAuthenticatedReg` and remove a redundant `isAuthenticationOfReg` method. Combined auth+something instructions (such as `retaa` on AArch64) should be handled carefully, especially when searching for authentication oracles: usually, such instructions cannot be authentication oracles, and only some of them actually write an authenticated pointer to a register (such as `ldra x0, [x1]!`). Use a `std::optional<MCPhysReg>` return type instead of returning a plain `MCPhysReg` with `getNoRegister()` as a "not applicable" indication. Document a few existing methods and add information about preconditions.	2025-05-26 18:31:20 +03:00
Anatoly Trosinenko	f578f56fea	[BOLT] Gadget scanner: refactor issue reporting (#135662) Remove the `getAffectedRegisters` and `setOverwritingInstrs` methods from the base `Report` class. Instead, rename the `Report` class to `Diagnostic` and make it always represent the brief version of the report, which is kept unchanged from the moment the issue is first found. Throughout its life-cycle, an instance of `Diagnostic` is first wrapped into `PartialReport<ReqT>` together with an optional request for extra details. Then, on the second run of the analysis, it is re-wrapped into `FinalReport` together with the requested detailed information.	2025-05-22 18:27:46 +03:00
Anatoly Trosinenko	48a2836b4d	[BOLT] Gadget scanner: detect signing oracles (#134146) Implement the detection of signing oracles. In this patch, a signing oracle is defined as a sign instruction that accepts a "non-protected" pointer, but for a slightly different definition of "non-protected" compared to control flow instructions. A second BitVector named TrustedRegs is added to the register state computed by the data-flow analysis. The difference between the "safe-to-dereference" and "trusted" register states is that to make an unsafe register trusted by authentication, one has to make sure that the authentication succeeded. For example, on AArch64 without FEAT_PAuth2 and FEAT_EPAC, an authentication instruction produces an invalid pointer on failure, so that a subsequent memory access triggers an error, but re-signing such a pointer would "fix" the signature. Note that while a separate "trusted" register state may be redundant depending on the specific semantics of auth and sign operations, it is still important to check signing operations: while a function like `resign:` consisting of `autda x0, x1`, `pacda x0, x2`, `ret` is probably safe provided `autda` generates an error on authentication failure, a function like `sign_anything:` consisting of `pacda x0, x1`, `ret` is inherently unsafe.	2025-05-20 13:42:53 +03:00
Anatoly Trosinenko	f5401c6a16	[BOLT] Gadget scanner: analyze functions without CFG information (#133461) Support simple analysis of the functions for which BOLT is unable to reconstruct the CFG. This patch is inspired by the approach implemented by Kristof Beyls in the original prototype of the gadget scanner, but a CFG-unaware counterpart of the data-flow analysis is implemented instead of a separate version of the gadget detector, as multiple gadget kinds are detected now.	2025-05-20 13:01:04 +03:00
Anatoly Trosinenko	2927050dd4	[BOLT] Gadget scanner: refine class names and debug output (NFC) (#135073) Scanning functions without CFG information and detecting authentication oracles both require introducing more classes related to register state analysis. To make the future code easier to understand, rename several classes beforehand. To detect authentication oracles, one has to query the properties of the output operands of authentication instructions after the instruction is executed; this requires adding another analysis that iterates over the instructions in reverse order, and a corresponding state class. As the distinguishing feature of the existing `State` class is that it stores the properties of the source register operands of an instruction before that instruction's execution, rename it to `SrcState` and rename `PacRetAnalysis` to `SrcSafetyAnalysis`. Apply minor adjustments to the debug output along the way.	2025-04-10 20:54:05 +03:00
Anatoly Trosinenko	0fc7aec349	[BOLT] Gadget scanner: detect address materialization and arithmetic (#132540) In addition to authenticated pointers, consider the contents of a register safe if it was written by a PC-relative address computation or updated by an arithmetic instruction whose input address is safe.	2025-04-07 13:13:11 +03:00
Anatoly Trosinenko	c818ae7399	[BOLT] Gadget scanner: detect non-protected indirect calls (#131899) Implement the detection of non-protected indirect calls and branches, similar to the pac-ret scanner.	2025-04-03 16:40:34 +03:00
Anatoly Trosinenko	b6b40e9ac9	[BOLT] Gadget scanner: reformulate the state for data-flow analysis (#131898) In preparation for implementing support for detection of non-protected call instructions, refine the definition of the state that is computed for each register by the data-flow analysis. Explicitly marking the registers which are known to be trusted at function entry is crucial for finding non-protected calls. In addition, it fixes less-common false negatives for pac-ret, such as `ret x1` in the `f_nonx30_ret_non_auted` test case.	2025-03-25 21:45:02 +03:00
Anatoly Trosinenko	72d1058af0	[BOLT] Gadget scanner: refactor analysis of RET instructions (#131897) In preparation for implementing detection of more gadget kinds, refactor checking for non-protected return instructions.	2025-03-21 19:54:57 +03:00
Anatoly Trosinenko	03557169e0	[BOLT] Gadget scanner: streamline issue reporting (#131896) In preparation for adding more gadget kinds to detect, streamline issue reporting. Rename the classes representing issue reports. In particular, rename the `Annotation` base class to `Report`, as it has nothing to do with "annotations" in `MCPlus` terms anymore. Remove references to "return instructions" from variable names and report messages and use generic terms instead. Rename NonPacProtectedRetAnalysis to PAuthGadgetScanner. Remove `GeneralDiagnostic` as a separate class and make `GenericReport` (formerly `GenDiag`) store `std::string Text` directly. Remove the unused `operator=` and `operator==` methods, as `Report`s are created on the heap and referenced via `shared_ptr`s. Introduce a `GadgetKind` class; currently, it only wraps a `const char *` description to display to the user. This description is intended to be a per-gadget-kind constant (or a few hard-coded constants), so there is no need to store it in a `std::string` field in each report instance. To handle both free-form `GenericReport`s and statically-allocated messages without unnecessary overhead, move printing of the report header to the base class (and take the message argument as a `StringRef`).	2025-03-21 11:19:53 +03:00
Anatoly Trosinenko	482b95217e	[BOLT] Gadget scanner: factor out utility code (#131895) Factor out the code for mapping from physical registers to consecutive array indexes. Introduce helper functions to print instructions and registers, so that analysis logic is not mixed with implementation details of debug output. Remove the debug printing from `Gadget::generateReport`, as it doesn't seem to add important information to what was already printed in the report itself.	2025-03-20 19:35:31 +03:00
Kristof Beyls	6c61c55756	[BOLT] pacret-scanner: fix regression test failure (#128576) ... which is caused by a seemingly recent change in BOLT's basic block calculation, where function calls seem to be ending basic blocks. I don't have a pointer to the commit that caused this change. I'll be looking for that later. For now, I'm trying to get the regression tests passing again.	2025-02-24 21:08:43 +00:00
Kristof Beyls	55c76ea391	[BOLT] pacret-scanner: fix regression tests... (#128565) by making the regex that matches basic block names more general. See the failing test case that was reported on some system in the comment https://github.com/llvm/llvm-project/pull/122304#issuecomment-2679460678 These test cases were introduced in PR #122304, commit 850b49297615a613ac83adca2c9cf823a4b8ef95.	2025-02-24 20:24:12 +00:00
Kristof Beyls	850b492976	[BOLT][binary-analysis] Add initial pac-ret gadget scanner (#122304) This adds an initial pac-ret gadget scanner to the llvm-bolt-binary-analysis-tool. The scanner is taken from the prototype that was published last year at https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype, and has been discussed in RFC https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148 and in the EuroLLVM 2024 keynote "Does LLVM implement security hardenings correctly? A BOLT-based static analyzer to the rescue?" [Video](https://youtu.be/Sn_Fxa0tdpY) [Slides](https://llvm.org/devmtg/2024-04/slides/Keynote/Beyls_EuroLLVM2024_security_hardening_keynote.pdf) In the spirit of incremental development, this PR aims to add a minimal implementation that is "fully working" on its own, but has major limitations, as described in the bolt/docs/BinaryAnalysis.md documentation in this proposed commit. These and other limitations will be fixed in follow-on PRs, mostly based on code already existing in the prototype branch. I hope that upstreaming incrementally will make the code easier to review. Note that I believe that this could also form the basis of a scanner to analyze correct implementation of PAuthABI.	2025-02-24 07:26:28 +00:00
Kristof Beyls	ceb7214be0	[BOLT] Introduce binary analysis tool based on BOLT (#115330) This initial commit does not add any specific binary analyses yet; it merely contains the boilerplate to introduce a new BOLT-based tool. This basically combines the first 4 patches from the prototype pac-ret and stack-clash binary analyzer discussed in RFC https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148 and published at https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype The introduction of such a BOLT-based binary analysis tool was proposed and discussed in at least the following places: (1) the RFC pointed to above; (2) the EuroLLVM 2024 round table https://discourse.llvm.org/t/summary-of-bolt-as-a-binary-analysis-tool-round-table-at-eurollvm/78441 which showed quite a few people interested in being able to build a custom binary analysis quickly with a tool like this; (3) the US LLVM dev meeting a few weeks ago, where I heard interest from a few people asking when the tool would be available upstream; (4) the presentation "Adding Pointer Authentication ABI support for your ELF platform" (https://llvm.swoogo.com/2024devmtg/session/2512720/adding-pointer-authentication-abi-support-for-your-elf-platform), which explicitly mentioned interest in extending the prototype tool to verify correct implementation of pauthabi.	2024-12-12 10:06:27 +00:00
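
Several of the messages above distinguish "safe-to-dereference" registers from "trusted" ones and describe a sign instruction with an untrusted input as a signing oracle. The following is a minimal toy sketch of that distinction, not the scanner's actual code: `ToyRegState`, `ToyOptions`, and every function name are hypothetical, `std::bitset` stands in for the BitVector mentioned in the log, and the rule that a check promotes a safe-to-dereference register to trusted is a simplifying assumption.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Toy register file: 32 registers indexed 0..31 (an assumption of this sketch).
constexpr std::size_t NumRegs = 32;

// Hypothetical per-register properties, loosely mirroring the two sets
// described in the commit messages.
struct ToyRegState {
  std::bitset<NumRegs> SafeToDeref; // a failed auth would fault on dereference
  std::bitset<NumRegs> Trusted;     // authentication is known to have succeeded
};

// Hypothetical knob modelled on the idea behind --auth-traps-on-failure.
struct ToyOptions {
  bool AuthTrapsOnFailure = false;
};

// Effect of an authentication instruction that writes Reg.
void applyAuth(ToyRegState &S, unsigned Reg, const ToyOptions &Opts) {
  assert(Reg < NumRegs && "register index out of range");
  S.SafeToDeref.set(Reg);
  // If auth does not trap, success has not been established yet.
  S.Trusted.set(Reg, Opts.AuthTrapsOnFailure);
}

// Effect of an explicit check of Reg (for example, a load through it).
void applyCheck(ToyRegState &S, unsigned Reg) {
  assert(Reg < NumRegs && "register index out of range");
  if (S.SafeToDeref.test(Reg))
    S.Trusted.set(Reg);
}

// Any other write to Reg clobbers both properties.
void applyClobber(ToyRegState &S, unsigned Reg) {
  assert(Reg < NumRegs && "register index out of range");
  S.SafeToDeref.reset(Reg);
  S.Trusted.reset(Reg);
}

// In this toy model, signing a register that is not trusted is reported.
bool isSigningOracle(const ToyRegState &S, unsigned Reg) {
  return !S.Trusted.test(Reg);
}
```

Under these assumptions, an auth followed directly by a sign is flagged unless the trapping option is set, while a sign of a register that was never authenticated is flagged in either mode.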

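The BRK commit above states that a register counts as checked at the end of a basic block only if it is checked in every successor, and that BRK is treated as checking every register. A small sketch of that merge rule follows; it is an illustration under stated assumptions rather than the analysis in BOLT. `RegSet`, `mergeSuccessors`, `stateBeforeTrap`, and `stateBeforeCheck` are hypothetical names, and returning an empty set for a block without successors is a pessimistic choice made for this sketch.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t NumRegs = 32; // toy register count (assumption)
using RegSet = std::bitset<NumRegs>;

// State at the end of a block: a register is checked only if it is checked
// on entry to every successor, i.e. the intersection of the successor sets.
RegSet mergeSuccessors(const std::vector<RegSet> &SuccessorEntryStates) {
  if (SuccessorEntryStates.empty())
    return RegSet(); // nothing is known to be checked (pessimistic)
  RegSet Result;
  Result.set(); // start from "all checked", the neutral element of &
  for (const RegSet &S : SuccessorEntryStates)
    Result &= S;
  return Result;
}

// A trapping instruction such as BRK is modelled as checking every register,
// since execution is not expected to continue past it.
RegSet stateBeforeTrap() {
  RegSet All;
  All.set();
  return All;
}

// Stepping backwards over an instruction that checks Reg.
RegSet stateBeforeCheck(RegSet After, unsigned Reg) {
  assert(Reg < NumRegs && "register index out of range");
  After.set(Reg);
  return After;
}
```

Replaying the example from that commit: the `on_failure` block yields `stateBeforeTrap()`, the `on_success` block yields `stateBeforeCheck(RegSet(), 2)` for the load through x2, and merging the two leaves x2 checked at the end of the preceding block while x0 is only marked by the `tbz` itself.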
25 Commits