llvm-project

Author	SHA1	Message	Date
Amir Ayupov	a850912de1	[BOLT] Require CFG in BAT mode (#150488 ) `getFallthroughsInTrace` requires CFG for functions not covered by BAT, even in BAT/fdata mode. BAT-covered functions go through special handling in fdata (`BAT->getFallthroughsInTrace`) and YAML (`DataAggregator::writeBATYAML`) modes. Since all modes (BAT/no-BAT, YAML/fdata) now need disassembly/CFG construction: - drop special BAT/fdata handling that omitted disassembly/CFG in `RewriteInstance::run`, enabling CFG for all non-BAT functions, - switch `getFallthroughsInTrace` to check if a function has CFG, - which allows emitting profile for non-simple functions in all modes. Previously, traces in non-simple functions were reported as invalid/ mismatching disassembled function contents. This change reduces the number of such invalid traces and increases the number of profiled functions. These functions may participate in function reordering via call graph profile. Test Plan: updated unclaimed-jt-entries.s	2025-07-25 13:54:37 +02:00
Maksim Panchenko	d5d94ba8bc	[BOLT] More refactoring of PHDR handling. NFC (#148932 ) Replace ad-hoc adjustment of the program header count with info from the new segment list.	2025-07-24 11:47:27 -07:00
Maksim Panchenko	218fd69261	[BOLT] Decouple new segment creation from PHDR rewrite. NFCI (#146111 ) Refactor handling of PHDR table rewrite to make modifications easier.	2025-07-02 11:22:12 -07:00
Maksim Panchenko	ad7d675991	[BOLT] Refactor mapCodeSections(). NFC (#146434 ) Factor out non-relocation specific code into a separate function.	2025-06-30 17:09:41 -07:00
Maksim Panchenko	fb24b4d46a	[BOLT] Push code to higher addresses under options (#146180 ) When --hot-functions-at-end is used in combination with --use-old-text, allocate code at the highest possible addresses within old .text. This feature is mostly useful for HHVM, where it is beneficial to have hot static code placed as close as possible to jitted code.	2025-06-28 13:53:56 -07:00
Maksim Panchenko	d00c83ef22	[BOLT] Skip creation of new segments (#146023 ) When all section contents are updated in-place, we can skip creation of new segment(s), save disk space, and free up low memory addresses. Currently, this feature only works with --use-gnu-stack.	2025-06-27 09:12:08 -07:00
Maksim Panchenko	4308292d1e	[BOLT] Refactor NewTextSegmentAddress handling (#145950 ) Refactor the code for NewTextSegmentAddress to correctly point at the true start of the segment when PHDR table is placed at the beginning. We used to offset NewTextSegmentAddress by PHDR table plus cache line alignment. NFC for proper binaries. Some YAML binaries from our tests will diverge due to bad segment address/offset alignment.	2025-06-26 12:09:11 -07:00
Amir Ayupov	f0d32575a1	[BOLT][NFCI] Use FileSymbols for local symbol disambiguation (#89088 ) Remove SymbolToFileName mapping from every local symbol to its containing FILE symbol name, and reuse FileSymbols to disambiguate local symbols instead. Also removes the check for `ld-temp.o` file symbol which was added to prevent LTO build mode from affecting the disambiguated name. This may cause incompatibility when using the profile collected on a binary built in a different mode than the input binary. Addresses #90661. Speeds up discover file objects by 5-10% for large binaries: - binary with ~1.2M symbols: 12.6422s -> 12.0297s - binary with ~4.5M symbols: 48.8851s -> 43.7315s	2025-06-20 14:29:32 -07:00
Amir Ayupov	4959e8a1da	[BOLT][NFCI] Use heuristic for matching split global functions (#90429 ) This change speeds up fragment matching for large BOLTed binaries where all fragments of global parent functions are put under `bolt-pseudo.o` file symbol: - before: iterating over symbols under `bolt-pseudo.o` only to fail to find a parent, - after: bail out immediately and use a global parent by name. Test Plan: NFC, updated register-fragments-bolt-symbols.s	2025-06-20 12:46:56 -07:00
Maksim Panchenko	06f13f8684	[BOLT] Fix references in ignored functions in CFG state (#140678 ) When we call setIgnored() on functions that already have CFG built, these functions are not going to get emitted and we risk missing external function references being updated. To mitigate the potential issues, run scanExternalRefs() on such functions to create patches/relocations. Since scanExternalRefs() relies on function relocations, we have to preserve relocations until the function is emitted. As a result, the memory overhead without debug info update could reach up to 2%.	2025-06-02 12:33:54 -07:00
Maksim Panchenko	c9022a29b4	[BOLT][AArch64] Detect veneers with missing data markers (#142069 ) The linker may omit data markers for long absolute veneers causing BOLT to treat data as code. Detect such veneers and introduce data markers artificially before BOLT's disassembler kicks in.	2025-05-29 19:24:34 -07:00
Kazu Hirata	d08833f23f	[BOLT] Remove unused local variables (NFC) (#140421 ) While I'm at it, this patch removes GetExecutablePath, which becomes unused after removing the sole use.	2025-05-17 17:43:29 -07:00
Amir Ayupov	9d5d715330	[BOLT][heatmap] Add synthetic hot text section (#139824 ) In heatmap mode, report samples and utilization of the section(s) between hot text markers `[__hot_start, __hot_end)`. The intended use is with multi-way splitting where there are several sections that contain "hot" code (e.g. `.text.warm` with CDSplit). Addresses the comment on #139193 https://github.com/llvm/llvm-project/pull/139193#pullrequestreview-2835274682 Test Plan: updated heatmap-preagg.test	2025-05-14 09:47:14 -07:00
Amir Ayupov	0289ca09be	[BOLT] Print heatmap from perf2bolt (#139194 ) Add perf2bolt `--heatmap` option to produce heatmaps during profile aggregation. Distinguish exclusive mode (`llvm-bolt-heatmap`) and optional mode (`perf2bolt --heatmap`), which impacts perf.data handling: exclusive mode covers all addresses, whereas optional mode consumes attached profile only covering function addresses. Test Plan: updated perf2bolt tests: - pre-aggregated-perf.test: pre-aggregated data, - bolt-address-translation-yaml.test: pre-aggregated + BOLTed input, - perf_test.test: no-LBR perf data.	2025-05-13 13:23:18 -07:00
Amir Ayupov	7f4febde10	[BOLT][heatmap] Compute section utilization and partition score (#139193 ) Heatmap groups samples into buckets of configurable size (`--block-size` flag, with 64 bytes as the default, equal to the X86 cache line size). Buckets are mapped to containing sections; buckets that cover multiple sections are attributed to the first overlapping section. Buckets not mapped to a section are reported as unmapped. Heatmap reports section hotness, which is the percentage of samples attributed to the section. Define section utilization as the percentage of buckets with non-zero samples relative to the total number of section buckets. Also define section partition score as the product of section hotness (where total excludes unmapped buckets) and mapped utilization, ranging from 0 to 1 (higher is better). The intended use of the new metrics is with a production profile collected from a BOLT-optimized binary. In this case the partition score of .text (hot text if function splitting is enabled) reflects optimization profile representativeness and the quality of hot-cold splitting. A partition score of 1 means that all samples fall into hot text, and all buckets (cache lines) in hot text are exercised, equivalent to perfect hot-cold splitting. Test Plan: updated heatmap-preagg.test	2025-05-13 13:20:13 -07:00
Kazu Hirata	d5b170c39b	[BOLT] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#139403 )	2025-05-10 13:39:15 -07:00
YongKang Zhu	316a6ff3d0	[BOLT][RelVTable] Skip special handling on non virtual function pointer relocations (#137406 ) Besides virtual function pointers vtable could contain other kinds of entries like those for RTTI data that also require relocations. We need to skip special handling on relocations for non virtual function pointers in relative vtable. Co-authored-by: Maksim Panchenko <maks@meta.com>	2025-04-29 08:13:44 -07:00
Rafael Auler	3bcb724903	[BOLT] Add --custom-allocation-vma flag (#136385 ) Add an advanced-user flag so we are able to rewrite binaries when we fail to identify a suitable location to put new code. User then can supply a custom location via --custom-allocation-vma. This happens more obviously if the binary has segments mapped to very high addresses.	2025-04-18 21:02:09 -07:00
Rafael Auler	5c4e6c6113	[BOLT] Don't choke on nobits symbols (#136384 )	2025-04-18 17:29:24 -07:00
wangjue	dbb79c30c9	[BOLT][Instrumentation] Initial instrumentation support for RISCV64 (#133882 ) This patch adds code generation for RISCV64 instrumentation. The work involved includes the following three points: a) Implements support for instrumenting direct function calls and jumps on RISC-V, which relies on atomic instructions (used to increment counters) that are only available on RISC-V when the A extension is used. b) Implements support for instrumenting indirect function calls by implementing the createInstrumentedIndCallHandlerEntryBB and createInstrumentedIndCallHandlerExitBB interfaces. In this process, we need to accurately record the target address and IndCallID to ensure the correct recording of the indirect call counters. c) Implements the RISCV64 BOLT runtime library, with some system call interfaces implemented through embedded assembly. Gets the difference between the runtime address of the .text section and the static address in the section header table, which in turn can be used to search for the indirect call description. However, the community code currently has problems with relocation in some scenarios, but this has nothing to do with instrumentation. We may continue to submit patches to fix the related bugs.	2025-04-16 23:01:00 -07:00
alekuz01	38faf32d23	[BOLT] Enable hugify for AArch64 (#117158 ) Add required hugify instrumentation and runtime libraries support for AArch64. Fixes #58226 Unblocks #62695	2025-04-15 12:59:05 +01:00
YongKang Zhu	2a83c0cc13	[BOLT] Support relative vtable (#135449 ) To handle relative vftable, which is enabled with clang option `-fexperimental-relative-c++-abi-vtables`, we look for PC relative relocations whose fixup locations fall in vtable address ranges. For such relocations, actual target is just virtual function itself, and the addend is to record the distance between vtable slot for target virtual function and the first virtual function slot in vtable, which is to match generated code that calls virtual function. So we can skip the logic of handling "function + offset" and directly save such relocations for future fixup after new layout is known.	2025-04-14 10:24:47 -07:00
Maksim Panchenko	e4cbb7780b	[BOLT][AArch64] Fix symbolization of unoptimized TLS access (#134332 ) TLS relocations may not have a valid BOLT symbol associated with them. While symbolizing the operand, we were checking for the symbol value, and since there was no symbol the check resulted in a crash. Handle TLS case while performing operand symbolization on AArch64.	2025-04-04 11:42:21 -07:00
Anatoly Trosinenko	c818ae7399	[BOLT] Gadget scanner: detect non-protected indirect calls (#131899 ) Implement the detection of non-protected indirect calls and branches similar to pac-ret scanner.	2025-04-03 16:40:34 +03:00
Maksim Panchenko	96e5ee23a7	[BOLT][AArch64] Add partial support for lite mode (#133014 ) In lite mode, we only emit code for a subset of functions while preserving the original code in .bolt.org.text. This requires updating code references in non-emitted functions to ensure that: * Non-optimized versions of the optimized code never execute. * Function pointer comparison semantics is preserved. On x86-64, we can update code references in-place using "pending relocations" added in scanExternalRefs(). However, on AArch64, this is not always possible due to address range limitations and linker address "relaxation". There are two types of code-to-code references: control transfer (e.g., calls and branches) and function pointer materialization. AArch64-specific control transfer instructions are covered by #116964. For function pointer materialization, simply changing the immediate field of an instruction is not always sufficient. In some cases, we need to modify a pair of instructions, such as undoing linker relaxation and converting NOP+ADR into ADRP+ADD sequence. To achieve this, we use the instruction patch mechanism instead of pending relocations. Instruction patches are emitted via the regular MC layer, just like regular functions. However, they have a fixed address and do not have an associated symbol table entry. This allows us to make more complex changes to the code, ensuring that function pointers are correctly updated. Such mechanism should also be portable to RISC-V and other architectures. To summarize, for AArch64, we extend the scanExternalRefs() process to undo linker relaxation and use instruction patches to partially overwrite unoptimized code.	2025-03-27 21:33:25 -07:00
Anatoly Trosinenko	03557169e0	[BOLT] Gadget scanner: streamline issue reporting (#131896 ) In preparation for adding more gadget kinds to detect, streamline issue reporting. Rename classes representing issue reports. In particular, rename `Annotation` base class to `Report`, as it has nothing to do with "annotations" in `MCPlus` terms anymore. Remove references to "return instructions" from variable names and report messages, use generic terms instead. Rename NonPacProtectedRetAnalysis to PAuthGadgetScanner. Remove `GeneralDiagnostic` as a separate class, make `GenericReport` (former `GenDiag`) store `std::string Text` directly. Remove unused `operator=` and `operator==` methods, as `Report`s are created on the heap and referenced via `shared_ptr`s. Introduce `GadgetKind` class - currently, it only wraps a `const char *` description to display to the user. This description is intended to be a per-gadget-kind constant (or a few hard-coded constants), so no need to store it to `std::string` field in each report instance. To handle both free-form `GenericReport`s and statically-allocated messages without unnecessary overhead, move printing of the report header to the base class (and take the message argument as a `StringRef`).	2025-03-21 11:19:53 +03:00
Ash Dobrescu	3bba268013	[BOLT] Support computed goto and allow map addrs inside functions (#120267 ) Create entry points for addresses referenced by dynamic relocations and allow getNewFunctionOrDataAddress to map addrs inside functions. By adding addresses referenced by dynamic relocations as entry points, this patch fixes an issue where BOLT fails on code using computed gotos. This also fixes a mapping issue with the bugfix from this PR: https://github.com/llvm/llvm-project/pull/117766.	2025-03-19 14:55:59 +00:00
Maksim Panchenko	bac21719a8	[BOLT] Pass unfiltered relocations to disassembler. NFCI (#131202 ) Instead of filtering and modifying relocations in readRelocations(), preserve the relocation info and use it in the symbolizing disassembler. This change mostly affects AArch64, where we need to look at original linker relocations in order to properly symbolize instruction operands.	2025-03-14 18:44:33 -07:00
Paschalis Mpeis	2f9d94981c	[BOLT] Change Relocation Type to 32-bit NFCI (#130792 )	2025-03-14 18:15:59 +00:00
chrisPyr	038fff3f24	[NFC][BOLT] Make file-local cl::opt global variables static (#126472 ) #125983	2025-03-05 22:11:05 -08:00
YongKang Zhu	5401c675eb	[BOLT][instr] Avoid WX segment (#128982 ) BOLT instrumented binary today has a readable (R), writeable (W) and also executable (X) segment, which Android system won't load due to its WX attribute. Such RWX segment was produced because BOLT has a two step linking, first for everything in the updated or rewritten input binary and next for runtime library. Each linking will layout sections in the order of RX sections followed by RO sections and then followed by RW sections. So we could end up having a RW section `.bolt.instr.counters` surrounded by a number of RO and RX sections, and a new text segment was then formed by including all RX sections which includes the RW section in the middle, and hence the RWX segment. One way to fix this is to separate the RW `.bolt.instr.counters` section into its own segment by a). assigning the starting addresses for section `.bolt.instr.counters` and its following section with regular page aligned addresses and b). creating two extra program headers accordingly.	2025-02-27 16:13:57 -08:00
Kristof Beyls	850b492976	[BOLT][binary-analysis] Add initial pac-ret gadget scanner (#122304 ) This adds an initial pac-ret gadget scanner to the llvm-bolt-binary-analysis-tool. The scanner is taken from the prototype that was published last year at https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype, and has been discussed in RFC https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148 and in the EuroLLVM 2024 keynote "Does LLVM implement security hardenings correctly? A BOLT-based static analyzer to the rescue?" [Video](https://youtu.be/Sn_Fxa0tdpY) [Slides](https://llvm.org/devmtg/2024-04/slides/Keynote/Beyls_EuroLLVM2024_security_hardening_keynote.pdf) In the spirit of incremental development, this PR aims to add a minimal implementation that is "fully working" on its own, but has major limitations, as described in the bolt/docs/BinaryAnalysis.md documentation in this proposed commit. These and other limitations will be fixed in follow-on PRs, mostly based on code already existing in the prototype branch. I hope incrementally upstreaming will make it easier to review the code. Note that I believe that this could also form the basis of a scanner to analyze correct implementation of PAuthABI.	2025-02-24 07:26:28 +00:00
YongKang Zhu	9fa77c1854	[BOLT][Linker][NFC] Remove lookupSymbol() in favor of lookupSymbolInfo() (#128070 ) Sometimes we need to know the size of a symbol besides its address, so maybe we can start using the existing `BOLTLinker::lookupSymbolInfo()` (that returns symbol address and size) and remove `BOLTLinker::lookupSymbol()` (that only returns symbol address). And for both we need to check return value as it is wrapped in `std::optional<>`, which makes the difference even smaller.	2025-02-20 17:14:33 -08:00
Alexander Yermolovich	3c357a49d6	[BOLT] Add support for safe-icf (#116275 ) Identical Code Folding (ICF) folds functions that are identical into one function, and updates symbol addresses to the new address. This reduces the size of a binary, but can lead to problems, for example when function pointers are compared. Such comparisons can appear either explicitly in the code or in IR generated by optimization passes like Indirect Call Promotion (ICP). After ICF, what used to be two different addresses become the same address. This can lead to a different code path being taken. This is where safe ICF comes in. The linker (LLD) does it using the address-significant section generated by clang. If a symbol is in it, or an object doesn't have this section, symbols are not folded. BOLT does not have the information regarding which objects do not have this section, so it can't re-use this mechanism. This implementation scans the code section and conservatively marks function symbols as unsafe. It treats symbols as unsafe if they are used in a non-control flow instruction. It also scans through the data relocation sections and does the same for relocations that reference a function symbol. The latter handles the case when a function pointer is stored in a local or global variable, etc. If a relocation address points within a vtable, these symbols are skipped.	2024-12-16 21:49:53 -08:00
Maksim Panchenko	b560b87ba1	[BOLT] Clean up jump table handling in non-reloc mode. NFCI (#119614 ) This change affects non-relocation mode only. Prior to having CheckLargeFunctions pass, we could have emitted code for functions that was discarded at the end due to size limitations. Since we didn't know at the time of emission if the code would be discarded or not, we had to emit jump tables in separate sections and handle them separately. However, now we always run CheckLargeFunctions and make sure all emitted code is used. Thus, we can get rid of the special jump table handling.	2024-12-13 13:14:02 -08:00
Kristof Beyls	ceb7214be0	[BOLT] Introduce binary analysis tool based on BOLT (#115330 ) This initial commit does not add any specific binary analyses yet, it merely contains the boilerplate to introduce a new BOLT-based tool. This basically combines the 4 first patches from the prototype pac-ret and stack-clash binary analyzer discussed in RFC https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148 and published at https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype The introduction of such a BOLT-based binary analysis tool was proposed and discussed in at least the following places: - The RFC pointed to above - EuroLLVM 2024 round table https://discourse.llvm.org/t/summary-of-bolt-as-a-binary-analysis-tool-round-table-at-eurollvm/78441 The round table showed quite a few people interested in being able to build a custom binary analysis quickly with a tool like this. - Also at the US LLVM dev meeting a few weeks ago, I heard interest from a few people, asking when the tool would be available upstream. - The presentation "Adding Pointer Authentication ABI support for your ELF platform" (https://llvm.swoogo.com/2024devmtg/session/2512720/adding-pointer-authentication-abi-support-for-your-elf-platform) explicitly mentioned interest to extend the prototype tool to verify correct implementation of pauthabi.	2024-12-12 10:06:27 +00:00
Jared Wyles	2ccf7ed277	[JITLink] Switch to SymbolStringPtr for Symbol names (#115796 ) Use SymbolStringPtr for Symbol names in LinkGraph. This reduces string interning on the boundary between JITLink and ORC, and allows pointer comparisons (rather than string comparisons) between Symbol names. This should improve the performance and readability of code that bridges between JITLink and ORC (e.g. ObjectLinkingLayer and ObjectLinkingLayer::Plugins). To enable use of SymbolStringPtr a std::shared_ptr<SymbolStringPool> is added to LinkGraph and threaded through to its construction sites in LLVM and Bolt. All LinkGraphs that are to have symbol names compared by pointer equality must point to the same SymbolStringPool instance, which in ORC sessions should be the pool attached to the ExecutionSession. --------- Co-authored-by: Lang Hames <lhames@gmail.com>	2024-12-06 10:22:09 +11:00
Peter Waller	b5ed375f9d	[BOLT] Skip _init; avoiding GOT breakage for static binaries (#117751 ) _init is used during startup of binaries. Unfortunately, its address can be shared (at least on AArch64 glibc static binaries) with a data reference that lives in the GOT. The GOT rewriting is currently unable to distinguish between data addresses and function addresses. This leads to the data address being incorrectly rewritten, causing a crash on startup of the binary: Unexpected reloc type in static binary. To avoid this, don't consider _init for being moved, by skipping it. ~We could add further conditions to narrow the skipped case for known crashes, but as a straw man I thought it'd be best to keep the condition as simple as possible and see if there are any objections to this.~ (Edit: this broke the test bolt/test/runtime/X86/retpoline-synthetic.test, because _init was skipped from the retpoline pass and it has an indirect call in it, so I include a check for static binaries now, which avoids the test failure, but perhaps this could/should be narrowed further?) For now, skip _init for static binaries on any architecture; we could add further conditions to narrow the skipped case for known crashes, but as a straw man I thought it'd be best to keep the condition as simple as possible and see if there are any objections to this. Updates #100096.	2024-11-28 14:59:07 +00:00
Maksim Panchenko	996553228f	[BOLT] Overwrite .eh_frame and .gcc_except_table (#116755 ) Under --use-old-text or --strict, we completely rewrite contents of EH frames and exception tables sections. If new contents of either section do not exceed the size of the original section, rewrite the section in-place.	2024-11-19 12:59:05 -08:00
Maksim Panchenko	08ef939637	[BOLT] Overwrite .eh_frame_hdr in-place (#116730 ) If the new EH frame header can fit into the original .eh_frame_hdr section, overwrite it in-place and pad with zeroes.	2024-11-18 20:42:38 -08:00
Maksim Panchenko	1b8e0cf090	[BOLT] Never emit "large" functions (#115974 ) "Large" functions are functions that are too big to fit into their original slots after code modifications. CheckLargeFunctions pass is designed to prevent such functions from emission. Extend this pass to work with functions with constant islands. Now that CheckLargeFunctions covers all functions, it guarantees that we will never see such functions after code emission on all platforms (previously it was guaranteed on x86 only). Hence, we can get rid of RewriteInstance extensions that were meant to support "large" functions.	2024-11-13 09:58:44 -08:00
Jacob Bramley	16cd5cdf4d	[BOLT] Ignore AArch64 markers outside their sections. (#74106 ) AArch64 uses $d and $x symbols to delimit data embedded in code. However, sometimes we see $d symbols, typically in .eh_frame, with addresses that belong to different sections. These occasionally fall inside .text functions and cause BOLT to stop disassembling, which in turn causes DWARF CFA processing to fail. As a workaround, we just ignore symbols with addresses outside the section they belong to. This behaviour is consistent with objdump and similar tools.	2024-11-07 15:16:14 +03:00
Youngsuk Kim	0a5edb4de4	[bolt] Don't call llvm::raw_string_ostream::flush() (NFC) Don't call raw_string_ostream::flush(), which is essentially a no-op. As specified in the docs, raw_string_ostream is always unbuffered. ( 65b13610a5226b84889b923bae884ba395ad084d for further reference )	2024-09-23 17:07:11 -05:00
Kristof Beyls	6d216fb7b8	[perf2bolt] Improve heuristic to map in-process addresses to specific segments in Elf binary (#109397 ) The heuristic is improved by also taking into account that only executable segments should contain instructions. Fixes #109384.	2024-09-23 15:14:51 +02:00
sinan	31ac3d092b	[BOLT] Add .iplt support to x86 (#106513 ) Add X86 support for parsing .iplt section and symbols.	2024-09-23 18:22:43 +08:00
Davide Italiano	e49549ff19	Revert "[BOLT] Abort on out-of-section symbols in GOT (#100801 )" This reverts commit a4900f0d936f0e86bbd04bd9de4291e1795f1768.	2024-08-07 20:52:19 -07:00
Sayhaan Siddiqui	62e894e0d7	[BOLT][DWARF][NFC] Move Arch assignment out of createBinaryContext (#102054 ) Moves the assignment of Arch out of createBinaryContext to prevent data races when parallelized.	2024-08-07 16:55:39 +00:00
Vladislav Khmelevsky	a4900f0d93	[BOLT] Abort on out-of-section symbols in GOT (#100801 ) This patch aborts BOLT execution if it finds out-of-section (section end) symbol in GOT table. In order to handle such situations properly in future, we would need to have an arch-dependent way to analyze relocations or its sequences, e.g., for ARM it would probably be ADRP + LDR analysis in order to get GOT entry address. Currently, it is also challenging because GOT-related relocation symbols are replaced to __BOLT_got_zero. Anyway, it seems to be quite a rare case, which seems to be only? related to static binaries. For the most part, it seems that it should be handled on the linker stage, since static binary should not have GOT table at all. LLD linker with relaxations enabled would replace instruction addresses from GOT directly to target symbols, which eliminates the problem. Anyway, in order to achieve detection of such cases, this patch fixes a few things in BOLT: 1. For the end symbols, we're now using the section provided by ELF binary. Previously it would be tied with a wrong section found by symbol address. 2. The end symbols would have limited registration: we would only add them in name->data GlobalSymbols map, since using address->data BinaryDataMap map would likely be impossible due to address duality of such symbols. 3. The outdated BD->getSection (currently returning reference, not pointer) check in postProcessSymbolTable is replaced by getSize check in order to allow zero-sized top-level symbols if they are located in zero-sized sections. For the most part, such things could only be found in tests, but I don't see a reason not to handle such cases. 4. Updated section-end-sym test and removed x86_64 requirement since there is no reason for this (tested on aarch64 linux) The test was provided by peterwaller-arm (thank you) in #100096 and slightly modified by me.	2024-08-07 16:26:12 +04:00
Vladislav Khmelevsky	097ddd3565	[BOLT] Fix relocations handling (#100890 ) After porting BOLT to RISCV some of the relocations were broken on both AArch64 and X86. On AArch64 the example of broken relocations would be GOT, during handling them, we should replace the symbol to __BOLT_got_zero in order to address GOT entry, not the symbol that addresses this entry. This is done further in code, so it is too early to add rel here. On X86 it is a mistake to add relocations without addend. This is the exact problem that is raised on #97937. Due to different code generation I had to use gcc-generated yaml test, since with clang I wasn't able to reproduce problem. Added tests for both architectures and made the problematic condition riscV-specific.	2024-08-07 16:25:46 +04:00
sinan	6c8933e1a0	[BOLT] Skip PLT search for zero-value weak reference symbols (#69136 ) Take a common weak reference pattern for example ``` __attribute__((weak)) void undef_weak_fun(); if (&undef_weak_fun) undef_weak_fun(); ``` In this case, an undefined weak symbol `undef_weak_fun` has an address of zero, and Bolt incorrectly changes the relocation for the corresponding symbol to symbol@PLT, leading to incorrect runtime behavior.	2024-08-07 18:02:42 +08:00
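
The heatmap metrics defined in commit 7f4febde10 above (section hotness, section utilization, partition score) can be illustrated with a small standalone sketch. This is not BOLT's implementation: `SectionMetrics` and `computeSectionMetrics` are hypothetical names, the per-bucket sample counts are assumed to be already attributed to one section, and "mapped utilization" is assumed to mean the fraction of that section's buckets with non-zero samples.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type; all values are fractions in [0, 1].
struct SectionMetrics {
  double Hotness;        // section samples / all samples in mapped buckets
  double Utilization;    // buckets with samples / all buckets of the section
  double PartitionScore; // Hotness * Utilization (higher is better)
};

// SectionBuckets holds the sample count of every bucket attributed to one
// section. TotalMappedSamples is the sample total over all mapped buckets,
// i.e. unmapped buckets are excluded, as the commit message describes.
SectionMetrics computeSectionMetrics(const std::vector<uint64_t> &SectionBuckets,
                                     uint64_t TotalMappedSamples) {
  uint64_t SectionSamples = 0;
  std::size_t NonZeroBuckets = 0;
  for (uint64_t Samples : SectionBuckets) {
    SectionSamples += Samples;
    if (Samples != 0)
      ++NonZeroBuckets;
  }

  // Guard against empty inputs so the sketch never divides by zero.
  const double Hotness =
      TotalMappedSamples == 0
          ? 0.0
          : static_cast<double>(SectionSamples) /
                static_cast<double>(TotalMappedSamples);
  const double Utilization =
      SectionBuckets.empty()
          ? 0.0
          : static_cast<double>(NonZeroBuckets) /
                static_cast<double>(SectionBuckets.size());

  return {Hotness, Utilization, Hotness * Utilization};
}
```

For a section with bucket counts `{10, 0, 30, 0}` and 80 mapped samples in total, this sketch yields a hotness of 0.5, a utilization of 0.5, and a partition score of 0.25. A score of 1 requires both that every mapped sample lands in the section and that every bucket of the section is exercised.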


271 Commits