llvm-project

Author	SHA1	Message	Date
Fangrui Song	34c7b7ccae	MCSymbol: Remove setUndefined The name is misleading, as setting Fragment to nullptr does not necessarily make it undefined - common and equated symbols have a nullptr fragment as well.	2025-08-17 15:57:27 -07:00
Haibo Jiang	21a5729b87	[BOLT] Do not use HLT as split point when build the CFG (#150963) For x86, the halt instruction is defined as a terminator instruction. When building the CFG, the instruction sequence following the hlt instruction is treated as an independent MBB. Since there is no jump information, the predecessor of this MBB cannot be identified, and it is considered an unreachable MBB that will be removed. With this fix, the instruction sequences before and after hlt are no longer placed in different blocks.	2025-08-15 14:35:13 -07:00
Kazu Hirata	b6cfa023b4	[BOLT] Use std::optional::value_or (NFC) (#151628)	2025-08-01 07:01:58 -07:00
Dmitry Vasilyev	497d177375	[BOLT] Allow to compile with MSVC (#151189) This change is necessary to build BOLT with MSVC on Windows.	2025-07-30 13:57:11 +04:00
Amir Ayupov	0d5325bb20	[BOLT] Directly use call count in buildCallGraph (#134966) In call graph construction, call block count is used for call graph edge weight. Change that to use call count directly if it's available, falling back to block count if not. Test Plan: This change together with disabling `fix-block-counts` improves profile quality metrics, e.g. for large binaries and sampled LBR profiles: `br_inst_retired.near_taken:uppp` trigger event - Ads1: - Profiled functions 58096 - CFG imbalance 2.63% -> 2.45% - CG imbalance 8.23% -> 7.44% - Ads2: - Profiled functions 54358 - CFG imbalance 3.12% -> 2.77% - CG imbalance 8.22% -> 7.06% - uwsgi: - Profiled functions 78103 - CFG imbalance 4.42% -> 4.03% - CG imbalance 100.00% -> 100.00% `cycles:u` trigger event: - web: - Profiled functions 31306 - CG flow imbalance: 31.16% -> 20.29% - CFG flow imbalance: 7.04% -> 6.44%	2025-07-14 14:28:52 -07:00
Fangrui Song	109b7d965c	MC: Remove unneeded VK_None argument to MCSymbolRefExpr::create calls The MCSymbolRefExpr::create overload with the specifier parameter is discouraged and being phased out. Expressions with relocation specifiers should use MCSpecifierExpr instead.	2025-06-27 21:22:46 -07:00
Sterling-Augustine	23f1ba3ee4	Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… (#145959) (#146112) Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… (#145959) This reapplies cbf781f0bdf2f680abbe784faedeefd6f84c246e, with fixes for the shared-library build and the unconventional sanitizer-runtime build. Original Description: This is the culmination of a series of changes described in [1]. Although somewhat large by line count, it is almost entirely mechanical, creating a new library in DebugInfo/DWARF/LowLevel. This new library has very minimal dependencies, allowing it to be used from more places than the normal DebugInfo/DWARF library--in particular from MC. 1. https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2	2025-06-27 11:05:49 -07:00
Sterling-Augustine	5d03e7a204	Revert "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… (#145959) …145081)" This reverts commit cbf781f0bdf2f680abbe784faedeefd6f84c246e. Breaks a couple of buildbots.	2025-06-26 13:09:20 -07:00
Sterling-Augustine	cbf781f0bd	[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#145081) This is the culmination of a series of changes described in [1]. Although somewhat large by line count, it is almost entirely mechanical, creating a new library in DebugInfo/DWARF/LowLevel. This new library has very minimal dependencies, allowing it to be used from more places than the normal DebugInfo/DWARF library--in particular from MC. I am happy to put it in another location, or to structure it differently if that makes sense. Some have suggested in BinaryFormat, but it is not a great fit there. But if that makes more sense to the reviewers, I can do that. Another possibility would be to use pass-through headers to allow clients who don't care to depend only on DebugInfo/DWARF. This would be a much less invasive change, and perhaps easier for clients. But also a system that hides details. Either way, I'm open. 1. https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2	2025-06-26 11:23:46 -07:00
Amir Ayupov	0c77468288	[BOLT] Expose external entry count for functions (#141674) Record the number of function invocations from external code - code outside the binary, which may include JIT code and DSOs. Accounting external entry counts improves the fidelity of call graph flow conservation analysis. Test Plan: updated shrinkwrapping.test	2025-06-10 14:31:22 -07:00
Fangrui Song	cdd0a6c781	BOLT: Replace MCTargetExpr with MCSpecifierExpr to fix bolt-icf.test on aarch64 host	2025-06-07 22:35:20 -07:00
Maksim Panchenko	06f13f8684	[BOLT] Fix references in ignored functions in CFG state (#140678) When we call setIgnored() on functions that already have CFG built, these functions are not going to get emitted and we risk missing external function references being updated. To mitigate the potential issues, run scanExternalRefs() on such functions to create patches/relocations. Since scanExternalRefs() relies on function relocations, we have to preserve relocations until the function is emitted. As a result, the memory overhead without debug info update could reach up to 2%.	2025-06-02 12:33:54 -07:00
Kazu Hirata	c0e7a59204	[BOLT] Remove redundant control flow statements (NFC) (#141182)	2025-05-22 22:36:23 -07:00
Maksim Panchenko	778801cc84	[BOLT] Never call fixBranches() on non-simple functions (#141112) We should never call fixBranches() on a function with invalid CFG. E.g., ValidateInternalCalls modifies CFG for its internal analysis purposes. At the same time, it marks the function as non-simple with an assumption that fixBranches() will never run on that function. However, calculateEmittedSize() by default calls fixBranches() which can lead to all sorts of issues, including assertions firing in fixBranches(). The fix is to use the original size for non-simple functions in calculateEmittedSize() since we are supposed to emit the function unmodified. Additionally, add an assertion at the start of fixBranches().	2025-05-22 14:01:54 -07:00
Kazu Hirata	0641ca1cd2	[BOLT] Avoid creating a temporary instance of std::string (NFC) (#140987) lookupTarget takes StringRef and internally creates an instance of std::string with the StringRef as part of constructing Triple, so we don't need to create a temporary instance of std::string on our own.	2025-05-21 20:32:40 -07:00
Kazu Hirata	7c8b39740b	[BOLT] Use llvm::is_contained (NFC) (#140984)	2025-05-21 20:32:09 -07:00
Maksim Panchenko	51e222ef48	[BOLT][AArch64] Fix crash for conditional tail calls (#140669) When conditional tail call is located in old code while BOLT is operating in lite mode, the call will require optional pending relocation with a type that is currently not supported resulting in a build-time crash. Before a proper fix is implemented, ignore conditional tail calls for relocation purposes and mark their target functions to be patched, i.e. to be served as veneers/thunks.	2025-05-20 10:38:00 -07:00
Kazu Hirata	d08833f23f	[BOLT] Remove unused local variables (NFC) (#140421) While I'm at it, this patch removes GetExecutablePath, which becomes unused after removing the sole use.	2025-05-17 17:43:29 -07:00
Kazu Hirata	e401fb8c47	[BOLT] Use llvm::replace (NFC) (#140199)	2025-05-16 07:30:29 -07:00
Kazu Hirata	a83668c3dd	[BOLT] Use llvm::upper_bound (NFC) (#140174)	2025-05-15 23:29:37 -07:00
Amir Ayupov	0289ca09be	[BOLT] Print heatmap from perf2bolt (#139194) Add perf2bolt `--heatmap` option to produce heatmaps during profile aggregation. Distinguish exclusive mode (`llvm-bolt-heatmap`) and optional mode (`perf2bolt --heatmap`), which impacts perf.data handling: exclusive mode covers all addresses, whereas optional mode consumes attached profile only covering function addresses. Test Plan: updated perf2bolt tests: - pre-aggregated-perf.test: pre-aggregated data, - bolt-address-translation-yaml.test: pre-aggregated + BOLTed input, - perf_test.test: no-LBR perf data.	2025-05-13 13:23:18 -07:00
Amir Ayupov	e039d16ee5	[BOLT][NFC] Disambiguate sample as basic sample (#139350) Sample is a general term covering both basic (IP) and branch (LBR) profiles. Find and replace ambiguous uses of sample in a basic sample sense. Rename `RawBranchCount` into `RawSampleCount` reflecting its use for both kinds of profile. Rename `PF_LBR` profile type as `PF_BRANCH` reflecting non-LBR based branch profiles (non-brstack SPE, synthesized brstack ETM/PT). Follow-up to #137644. Test Plan: NFC	2025-05-12 17:15:16 -07:00
Kazu Hirata	d5b170c39b	[BOLT] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#139403)	2025-05-10 13:39:15 -07:00
Maksim Panchenko	254c13d872	[BOLT][AArch64] Patch functions targeted by optional relocs (#138750) On AArch64, we create optional/weak relocations that may not be processed due to the relocated value overflow. When the overflow happens, we used to enforce patching for all functions in the binary via --force-patch option. This PR relaxes the requirement, and enforces patching only for functions that are targets of optional relocations. Moreover, if the compact code model is used, the relocation overflow is guaranteed not to happen and the patching will be skipped.	2025-05-08 10:53:47 -07:00
Gergely Bálint	5b20b5721a	[BOLT][AArch64] Allow binary-analysis and heatmap tool to run with pac-ret binaries (#136664) OpNegateRAState support is only needed for tools that produce binaries.	2025-04-30 13:41:11 +01:00
Anatoly Trosinenko	37e8c6c6ee	[BOLT] Do not return Def-ed registers from MCPlusBuilder::getUsedRegs (#129890) Update the implementation of `MCPlusBuilder::getUsedRegs` to match its description in the header file, add unit tests.	2025-04-23 13:32:59 +03:00
Kazu Hirata	c6e7bb19f7	[BOLT] Use llvm::unique (NFC) (#136513)	2025-04-20 18:29:51 -07:00
Maksim Panchenko	0977a7130b	[BOLT] Skip FDE emission for patch functions (#136224) Patch functions are used to fix instructions in the original code, i.e., they are not functions in a traditional sense, but rather pieces of emitted code that are embedded into real functions. We used to emit FDEs for all functions, including patch functions. However, FDEs for patches are not only unnecessary, but they can lead to problems with libraries and runtimes that consume FDEs, e.g. C++ exception handling runtime. Note that we use named patches to fix function entry points and in that case they behave more like regular functions. Thus we issue FDEs for those.	2025-04-17 19:58:32 -07:00
wangjue	dbb79c30c9	[BOLT][Instrumentation] Initial instrumentation support for RISCV64 (#133882) This patch adds code generation for RISCV64 instrumentation. The work consists of three parts: a) Implements support for instrumenting direct function calls and jumps on RISC-V, which relies on atomic instructions (used to increment counters) that are only available on RISC-V when the A extension is used. b) Implements support for instrumenting indirect function calls by implementing the createInstrumentedIndCallHandlerEntryBB and createInstrumentedIndCallHandlerExitBB interfaces. In this process, we need to accurately record the target address and IndCallID to ensure the correct recording of the indirect call counters. c) Implements the RISCV64 BOLT runtime library, including some system call interfaces written in embedded assembly. It gets the difference between the runtime address of the .text section and its static address in the section header table, which in turn can be used to search for indirect call descriptions. The community code currently has problems with relocation in some scenarios, but this is unrelated to instrumentation. We may continue to submit patches to fix the related bugs.	2025-04-16 23:01:00 -07:00
YongKang Zhu	823adc7a2d	[BOLT] Validate secondary entry point (#135731) Some functions have their sizes as zero in input binary's symbol table, like those compiled by assembler. When figuring out function sizes, we may create label symbol if it doesn't point to any constant island. However, before function size is known, marker symbol can not be correctly associated to a function and therefore all such checks would fail and we could end up adding a code label pointing to constant island as secondary entry point and later mistakenly marking the function as not simple. Querying the global marker symbol array has big throughput overhead. Instead we can run an extra check when post processing entry points to identify such label symbols that actually point to constant islands.	2025-04-15 13:19:15 -07:00
YongKang Zhu	2a83c0cc13	[BOLT] Support relative vtable (#135449) To handle relative vftable, which is enabled with clang option `-fexperimental-relative-c++-abi-vtables`, we look for PC relative relocations whose fixup locations fall in vtable address ranges. For such relocations, actual target is just virtual function itself, and the addend is to record the distance between vtable slot for target virtual function and the first virtual function slot in vtable, which is to match generated code that calls virtual function. So we can skip the logic of handling "function + offset" and directly save such relocations for future fixup after new layout is known.	2025-04-14 10:24:47 -07:00
Kazu Hirata	7940b0546b	[BOLT] Fix warning This patch fixes: bolt/lib/Core/BinaryContext.cpp:582:8: error: unused variable 'printEntryDiagnostics' [-Werror,-Wunused-variable] bolt/lib/Core/BinaryContext.cpp:842:10: error: unused variable 'isSibling' [-Werror,-Wunused-variable]	2025-04-12 23:35:49 -07:00
Amir Ayupov	ba93fe97c2	[BOLT][NFC] Simplify getOrCreate/analyze/populate/emitJumpTable (#132108)	2025-04-10 21:17:04 -07:00
Paschalis Mpeis	3d24046b33	[BOLT] Skip out-of-range pending relocations (#116964) When a pending relocation is created it is also marked whether it is optional or not. It can be optional when such relocation is added as part of an optimization (i.e., `scanExternalRefs`). When bolt tries to `flushPendingRelocations`, it safely skips any optional relocations that cannot be encoded due to being out of range. A pre-requisite to that is the usage of the `-force-patch` flag. Otherwise, BOLT will bail out with a relevant message. Background: BOLT, as part of scanExternalRefs, identifies external references from calls and creates some pending relocations for them. When flushed, those will update references to point to the optimized functions. This optimization can be disabled using `--no-scan`. BOLT can assert if any of these pending relocations cannot be encoded. This patch does not disable this optimization but instead selectively applies it given that a pending relocation is optional and `-force-patch` was enabled.	2025-04-04 17:31:14 +01:00
Alexey Moksyakov	19a319667b	[bolt][aarch64] Adding test with unsupported indirect branches (#127655) This test contains a set of common indirect branch patterns. Support for them will be added step by step.	2025-04-01 13:49:09 +03:00
Kazu Hirata	0c7be9392f	[BOLT] Use *Set::insert_range (NFC) (#133601)	2025-03-29 16:52:16 -07:00
Maksim Panchenko	96e5ee23a7	[BOLT][AArch64] Add partial support for lite mode (#133014) In lite mode, we only emit code for a subset of functions while preserving the original code in .bolt.org.text. This requires updating code references in non-emitted functions to ensure that: * Non-optimized versions of the optimized code never execute. * Function pointer comparison semantics is preserved. On x86-64, we can update code references in-place using "pending relocations" added in scanExternalRefs(). However, on AArch64, this is not always possible due to address range limitations and linker address "relaxation". There are two types of code-to-code references: control transfer (e.g., calls and branches) and function pointer materialization. AArch64-specific control transfer instructions are covered by #116964. For function pointer materialization, simply changing the immediate field of an instruction is not always sufficient. In some cases, we need to modify a pair of instructions, such as undoing linker relaxation and converting NOP+ADR into ADRP+ADD sequence. To achieve this, we use the instruction patch mechanism instead of pending relocations. Instruction patches are emitted via the regular MC layer, just like regular functions. However, they have a fixed address and do not have an associated symbol table entry. This allows us to make more complex changes to the code, ensuring that function pointers are correctly updated. Such mechanism should also be portable to RISC-V and other architectures. To summarize, for AArch64, we extend the scanExternalRefs() process to undo linker relaxation and use instruction patches to partially overwrite unoptimized code.	2025-03-27 21:33:25 -07:00
Maksim Panchenko	bac21719a8	[BOLT] Pass unfiltered relocations to disassembler. NFCI (#131202) Instead of filtering and modifying relocations in readRelocations(), preserve the relocation info and use it in the symbolizing disassembler. This change mostly affects AArch64, where we need to look at original linker relocations in order to properly symbolize instruction operands.	2025-03-14 18:44:33 -07:00
Paschalis Mpeis	2f9d94981c	[BOLT] Change Relocation Type to 32-bit NFCI (#130792)	2025-03-14 18:15:59 +00:00
Maksim Panchenko	a28daa7c1a	[BOLT][AArch64] Keep relocations for linker-relaxed instructions. NFCI (#129980) We used to filter out relocations corresponding to NOP+ADR instruction pairs that were a result of linker "relaxation" optimization. However, these relocations will be useful for reversing the linker optimization. Keep the relocations and ignore them while symbolizing ADR instruction operands.	2025-03-05 23:06:01 -08:00
chrisPyr	038fff3f24	[NFC][BOLT] Make file-local cl::opt global variables static (#126472) #125983	2025-03-05 22:11:05 -08:00
Maksim Panchenko	b971d4d7c8	[BOLT][AArch64] Add symbolizer for AArch64 disassembler. NFCI (#127969) Add AArch64MCSymbolizer that symbolizes `MCInst` operands during disassembly. The symbolization was previously done in `BinaryFunction::disassemble()`, but it is also required by `scanExternalRefs()` for "lite" mode functionality. Hence, similar to x86, I've implemented the symbolizer interface that uses `BinaryFunction` relocations to properly create instruction operands. I expect the result of the disassembly to be identical after the change. AArch64 disassembler was not calling `tryAddingSymbolicOperand()` for `MOV` instructions. Fix that. Additionally, the disassembler marks `ldr` instructions as branches by setting `IsBranch` parameter to true. Ignore the parameter and rely on `MCPlusBuilder` interface instead. I've modified the `--check-encoding` flag to check symbolization of operands of instructions that have relocations against them.	2025-03-03 12:44:28 -08:00
Maksim Panchenko	5a11912ece	[BOLT] Refactor interface for creating instruction patches. NFCI (#129404) Add BinaryContext::createInstructionPatch() interface for patching parts of the original binary with new instruction sequences. Refactor PatchEntries pass to use the new interface.	2025-03-01 19:20:17 -08:00
Maksim Panchenko	074c2c6713	[BOLT] Refactor MCInst target symbol lookup. NFCI (#129131) In analyzeInstructionForFuncReference(), use MCPlusBuilder interface while scanning symbolic operands of MCInst. Should be NFC on x86, but will make the function work on other architectures. Note that it's currently unused on non-x86 as its functionality is exclusive to safe ICF that runs on x86 only.	2025-02-28 17:57:54 -08:00
Amir Ayupov	3968ebd00d	[BOLT] Keep multi-entry functions simple in aggregation mode (#128253) BOLT used to mark multi-entry functions non-simple in non-relocation mode with the reasoning that we can't move them due to potentially undetected references. However, in aggregation mode it doesn't apply as BOLT doesn't perform optimizations. Relax this constraint in case of an aggregation job. Test Plan: added entry-point-fallthru.s	2025-02-25 10:53:45 -08:00
YongKang Zhu	9fa77c1854	[BOLT][Linker][NFC] Remove lookupSymbol() in favor of lookupSymbolInfo() (#128070) Sometimes we need to know the size of a symbol besides its address, so maybe we can start using the existing `BOLTLinker::lookupSymbolInfo()` (that returns symbol address and size) and remove `BOLTLinker::lookupSymbol()` (that only returns symbol address). And for both we need to check return value as it is wrapped in `std::optional<>`, which makes the difference even smaller.	2025-02-20 17:14:33 -08:00
Maksim Panchenko	0ba391a85f	[BOLT] Improve constant island disassembly (#127971) * Add label that identifies constant island. * Support cases where the island is located after the function.	2025-02-20 11:16:01 -08:00
Amir Ayupov	b884be8640	[BOLT] Exit with error code on missing DWO CU (#125976) If BOLT fails to locate DWO CU when using split DWARF, this signifies an issue with the input (missing .dwo) rather than an internal assertion.	2025-02-06 10:01:12 -08:00
Maksim Panchenko	3115278c4e	[BOLT] Fixup for commit 137c378/#125961	2025-02-06 00:26:20 -08:00
Maksim Panchenko	137c3781e6	[BOLT][AArch64] Include constant islands in disassembly (#125961) When printing disassembly of a function with constant islands, include the island info in the dump. At the moment, only print islands in pre-CFG state. Include islands that are interleaved with instructions.	2025-02-05 22:41:40 -08:00

493 Commits