llvm-project

Author	SHA1	Message	Date
Haibo Jiang	21a5729b87	[BOLT] Do not use HLT as split point when build the CFG (#150963 ) For x86, the halt instruction is defined as a terminator instruction. When building the CFG, the instruction sequence following the hlt instruction is treated as an independent MBB. Since there is no jump information, the predecessor of this MBB cannot be identified, and it is considered an unreachable MBB that will be removed. With this fix, the instruction sequences before and after hlt are no longer placed in different blocks.	2025-08-15 14:35:13 -07:00
Fangrui Song	dcf485609c	MC: Centralize X86 PC-relative fixup adjustment in MCAssembler Move the X86 PC-relative fixup adjustment from X86MCCodeEmitter::emitImmediate to MCAssembler, leveraging a generalized evaluateFixup. This saves an MCBinaryExpr. For `call foo`, the fixup expression is now `foo` instead of `foo-4`. There is no change in generated relocations. In bolt/lib/Target/X86/X86MCPlusBuilder.cpp, createRelocation needs to decrease the addend. Both max-rss and instructions:u show a minor decrease. https://llvm-compile-time-tracker.com/compare.php?from=ea600576a6f94d6f28925c4b99962cc26b463c29&to=016e8fd4ddf851e5555f606c6394241d68f1a7bb&stat=max-rss&linkStats=on Next: Update targets that use FKF_IsAlignedDownTo32Bits to define `evaluateFixup` and remove FKF_IsAlignedDownTo32Bits from the generic code. Pull Request: https://github.com/llvm/llvm-project/pull/147113	2025-07-08 09:22:30 -07:00
Fangrui Song	244e053b6c	MC: Remove llvm/MC/MCFixupKindInfo.h The file used to define `MCFixupKindInfo`, a simple structure, which is now in MCAsmBackend.h.	2025-07-05 11:24:11 -07:00
Fangrui Song	2bfc488d34	X86MCCodeEmitter: Remove unneeded MCFixupKindInfo::FKF_IsPCRel	2025-07-04 16:30:07 -07:00
Fangrui Song	109b7d965c	MC: Remove unneeded VK_None argument to MCSymbolRefExpr::create calls The MCSymbolRefExpr::create overload with the specifier parameter is discouraged and being phased out. Expressions with relocation specifiers should use MCSpecifierExpr instead.	2025-06-27 21:22:46 -07:00
Fangrui Song	c239acb5b6	MCFixup: Make FixupKindInfo smaller and change getFixupKindInfo to return value We will increase the use of raw relocation types and eliminate fixup kinds that correspond to relocation types. The getFixupKindInfo functions will return an rvalue instead. Let's update the return type from a const reference to a value type.	2025-04-18 20:55:43 -07:00
Rodrigo Rocha	b9891715af	[BOLT] Handle generation of compare and jump sequences (#131949 ) This patch fixes the following two issues with createCmpJE for AArch64: 1. Avoids overwriting the value of the input register RegNo by using XZR as the destination register: subs xzr, RegNo, #Imm, which is equivalent to a simple cmp RegNo, #Imm. 2. The immediate operand to the Bcc instruction must be EQ instead of #Imm. This patch also adds a new function, createCmpJNE, and unit tests for both createCmpJE and createCmpJNE for X86 and AArch64.	2025-04-03 18:34:24 -07:00
Maksim Panchenko	b2d272ccfb	[BOLT][X86] Fix getTargetSymbol() (#133834 ) In 96e5ee2, I inadvertently broke the way non-trivial symbol references got updated from non-optimized code. The breakage was a consequence of `getTargetSymbol(MCExpr *)` not returning a symbol when the parameter was a binary expression. Fix `getTargetSymbol()` to cover such cases.	2025-03-31 18:31:33 -07:00
Maksim Panchenko	96e5ee23a7	[BOLT][AArch64] Add partial support for lite mode (#133014 ) In lite mode, we only emit code for a subset of functions while preserving the original code in .bolt.org.text. This requires updating code references in non-emitted functions to ensure that: * Non-optimized versions of the optimized code never execute. * Function pointer comparison semantics are preserved. On x86-64, we can update code references in-place using "pending relocations" added in scanExternalRefs(). However, on AArch64, this is not always possible due to address range limitations and linker address "relaxation". There are two types of code-to-code references: control transfer (e.g., calls and branches) and function pointer materialization. AArch64-specific control transfer instructions are covered by #116964. For function pointer materialization, simply changing the immediate field of an instruction is not always sufficient. In some cases, we need to modify a pair of instructions, such as undoing linker relaxation and converting NOP+ADR into an ADRP+ADD sequence. To achieve this, we use the instruction patch mechanism instead of pending relocations. Instruction patches are emitted via the regular MC layer, just like regular functions. However, they have a fixed address and do not have an associated symbol table entry. This allows us to make more complex changes to the code, ensuring that function pointers are correctly updated. Such a mechanism should also be portable to RISC-V and other architectures. To summarize, for AArch64, we extend the scanExternalRefs() process to undo linker relaxation and use instruction patches to partially overwrite unoptimized code.	2025-03-27 21:33:25 -07:00
Paschalis Mpeis	2f9d94981c	[BOLT] Change Relocation Type to 32-bit NFCI (#130792 )	2025-03-14 18:15:59 +00:00
Maksim Panchenko	34c6c5e72f	[BOLT][AArch64] Fix PLT optimization (#124192 ) Preserve C++ exception metadata while running PLT optimization on AArch64.	2025-01-24 14:20:24 -08:00
Amir Ayupov	3023b15fb1	[BOLT] Support POSSIBLE_PIC_FIXED_BRANCH Detect and support fixed PIC indirect jumps of the following form: ``` movslq En(%rip), %r1 leaq PIC_JUMP_TABLE(%rip), %r2 addq %r2, %r1 jmpq *%r1 ``` where PIC_JUMP_TABLE looks like the following: ``` JT: E1: | L1 - JT | E2: | L2 - JT | ... En: | Ln - JT | ``` Such code can be produced by compilers; see https://github.com/llvm/llvm-project/issues/91648. Test Plan: updated jump-table-fixed-ref-pic.test Reviewers: maksfb, ayermolo, dcci, rafaelauler Reviewed By: rafaelauler Pull Request: https://github.com/llvm/llvm-project/pull/91667	2024-07-18 20:57:05 -07:00
Paschalis Mpeis	587308c343	[BOLT][AArch64] Provide createDummyReturnFunction (#96626 ) AArch64 needs this function when instrumenting statically-linked binaries. Sample commands: ```bash clang -Wl,-q test.c -static -o out llvm-bolt -instrument -instrumentation-sleep-time=5 out -o out.instr ```	2024-07-15 07:20:47 +01:00
Amir Ayupov	344228ebf4	[BOLT] Drop macro-fusion alignment (#97358 ) 9d0754ada5dbbc0c009bcc2f7824488419cc5530 dropped MC support required for optimal macro-fusion alignment in BOLT. Remove the support in BOLT as performance measurements with large binaries didn't show a significant improvement. Test Plan: macro-fusion alignment was never upstreamed, so no upstream tests are affected.	2024-07-02 09:20:41 -07:00
Nathan Sidwell	6c5b62b846	[BOLT][NFC] Separate isReversibleBranch's 2 semantics (#95572 ) `isUnsupportedBranch` was renamed (and inverted) to `isReversibleBranch`, as that was how it was being used. But one use in `BinaryFunction::disassemble` was using the original meaning to detect unsupported branches, and the `isUnsupportedBranch` had 2 separate semantic checks. Move the unsupported branch check from `isReversibleBranch` to a new entry point: `isUnsupportedInstruction`. Call that from `BinaryFunction::disassemble`. Move the dynamic branch check from X86's isReversibleBranch to the base class, as it is not an architecture-specific check. Remove unnecessary `isReversibleBranch` calls from Instrumentation and X86 MCPlusBuilder.	2024-06-28 07:45:37 -04:00
Paschalis Mpeis	a13bc9714a	[BOLT][AArch64] Implement PLTCall optimization (#93584 ) `convertCallToIndirectCall` applies the PLTCall optimization and returns an (updated if needed) iterator to the converted call instruction. Since AArch64 needs to inject additional instructions to implement this pass, the relevant BasicBlock and an iterator are passed to `convertCallToIndirectCall`. `NumCallsOptimized` is updated only on successful application of the pass. Tests: - Inputs/plt-tailcall.c: an example of a tail call optimized PLT call. - AArch64/plt-call.test: the actual AArch64 test, which runs the PLTCall optimization on the above input file and verifies that the pass is applied to the calls to 'printf' and 'puts'.	2024-06-11 19:21:11 +01:00
Nathan Sidwell	3fefb3c598	[BOLT][NFC] Infailable fns return void (#92018 ) Both `reverseBranchCondition` and `replaceBranchTarget` return a success boolean. But all callers except one ignore the return value, and that one caller emits a fatal error on failure. Thus, just return nothing.	2024-06-07 06:59:52 -04:00
Amir Ayupov	be83f5c150	[BOLT][NFC] Simplify analyzeIndirectBranch (#91662 ) Simplify mutually exclusive sanity checks in analyzeIndirectBranch, where an UNKNOWN IndirectBranchType is to be returned. Reduces confusion and code duplication when adding a new IndirectBranchType (to be added in #91667). Test Plan: NFC	2024-05-24 15:13:10 -07:00
Amir Ayupov	4658803958	[BOLT][NFC] Add isRIPRel and isIndexed helpers (#91661 ) Move out common X86MemOperand checks into helper lambdas. To be reused in #91667. Test Plan: NFC	2024-05-24 14:49:41 -07:00
Nathan Sidwell	76fdc2e527	[BOLT][NFC] Rename isUnsupportedBranch to isReversibleBranch (#92447 ) `isUnsupportedBranch` is not a very informative name, and doesn't match its corresponding `reverseBranchCondition`, as I noted in PR #92018. Here's a renaming to a more mnemonic name.	2024-05-17 15:40:40 -04:00
Maksim Panchenko	7de82ca369	[BOLT] Don't terminate on trap instruction for Linux kernel (#87021 ) Under normal circumstances, we terminate basic blocks on a trap instruction. However, the Linux kernel may resume execution after hitting a trap (ud2 on x86). Thus, we introduce the "--terminal-trap" option, which specifies whether the trap instruction should terminate the control flow. The option is on by default, except in Linux kernel mode, where it is off.	2024-03-29 16:41:15 -07:00
Maksim Panchenko	6b1cf00400	[BOLT] Add support for Linux kernel static keys jump table (#86090 ) Runtime code modification used by static keys is the most ubiquitous self-modifying feature of the Linux kernel. The idea is to eliminate the condition check and associated conditional jump on a hot path if that condition (based on a boolean value of a static key) does not change often. Whenever the condition changes, the kernel runtime modifies all code paths associated with that key, flipping the code between nop and (unconditional) jump.	2024-03-21 14:05:21 -07:00
Maksim Panchenko	49b8a99a0f	[BOLT] Add createCondBranch() and createLongUncondBranch() (#85315 ) Add MCPlusBuilder interface for creating two new branch types.	2024-03-14 15:28:22 -07:00
Maksim Panchenko	bba790db47	[BOLT] Refactor instruction creation interface. NFCI (#85292 ) Refactor MCPlusBuilder's create{Instruction}() functions that used to return bool. We almost never check the return value as we rely on llvm_unreachable() to detect unimplemented functionality. There were a couple of cases that checked the return value, but they would hit the unreachable condition first (at least in debug builds) before the return value gets checked.	2024-03-14 13:17:17 -07:00
Maksim Panchenko	59ab86bb2f	[BOLT] Clear operands when creating new instructions. NFCI (#85191 ) Reset operand list whenever we create a new instruction via a parameter passed by reference. Most functions were already doing this, but there are several places missing the reset. Potentially, if we do not clear the list, it could lead to invalid instruction operands. But the existing code is unaffected.	2024-03-14 11:00:08 -07:00
Maksim Panchenko	082fe9a5dd	[BOLT] Remove duplicate expression (#80380 ) Reported by the cppcheck static analyzer in #80111. Fixes #80111.	2024-02-01 19:05:11 -08:00
Job Noorman	8fb83bf5f1	[BOLT][NFC] Add MCSubtargetInfo to MCPlusBuilder (#68223 ) On RISC-V, it's helpful to have access to `MCSubtargetInfo` while generating instructions in `MCPlusBuilder`. For example, a return instruction might be generated differently based on if the target supports compressed instructions (`c.jr ra`) or not (`jalr ra`).	2023-10-06 06:39:58 +00:00
Rafael Auler	853e126ce3	[BOLT] Support input binaries that use R_X86_GOTPC64 In the large code model, the address of the GOT is calculated by the static linker via an R_X86_GOTPC64 reloc applied against a MOVABSQ instruction. In the final binary, it can be disassembled as a regular immediate, but because such an immediate is the result of PC-relative pointer arithmetic, we need to parse this relocation and update this calculation whenever we move code; otherwise, we break the code trying to read the GOT. A test case showing how the GOT is accessed was provided. Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D158911	2023-10-02 23:12:44 -07:00
Job Noorman	eafe4ee2e8	[BOLT] Rename isLoad/isStore to mayLoad/mayStore As discussed in D159266, for some instructions it's impossible to know statically if they will load/store (e.g., predicated instructions). Therefore, mayLoad/mayStore are more appropriate names.	2023-09-01 09:36:05 +02:00
Elvina Yakubova	6e4c230525	[BOLT][Instrumentation] Initial instrumentation support for AArch64 This commit adds code generation for AArch64 instrumentation, including direct and indirect calls support. Reviewed By: rafauler, yota9 Differential Revision: https://reviews.llvm.org/D151899	2023-08-24 19:34:57 +03:00
Denis Revunov	28fd2ca142	[BOLT] Fix trap value for non-X86 The trap value used by BOLT was assumed to be a single-byte instruction. It made some functions unaligned on AArch64 (e.g., the exceptions-instrumentation test) and caused emission failures. Fix that by changing the fill value to a StringRef. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D158191	2023-08-24 01:29:41 +03:00
zhoujiapeng	9fee2ac044	[BOLT][NFC] Split createRelocation in X86 and share the second part This commit splits the createRelocation function for the X86 architecture into two parts, retaining the first half and moving the second half to a new function called extractFixupExpr. The purpose of this change is to make extractFixupExpr a shared function between AArch64 and X86 architectures, increasing code reusability and maintainability. Child revision: https://reviews.llvm.org/D156018 Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D157217	2023-08-23 00:29:25 +08:00
Maksim Panchenko	5c4d306a10	[BOLT][NFC] Change signature of MCPlusBuilder::isUnsupportedBranch() Make MCPlusBuilder::isUnsupportedBranch() take MCInst, not opcode. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D152765	2023-06-13 12:20:36 -07:00
Maksim Panchenko	43f56a2f27	[BOLT] Fix handling of code references from unmodified code In lite mode (default for X86), BOLT optimizes and relocates functions with profile. The rest of the code is preserved, but if it references relocated code, such references have to be updated. The update is handled by the scanExternalRefs() function. Note that we cannot solely rely on relocations written by the linker, as not all code references are exposed to the linker. Additionally, the linker can modify certain instructions, and relocations will no longer match the code. With this change, start using a symbolic disassembler for scanning code for references in scanExternalRefs(). Unlike the previous approach, the symbolizer properly detects and creates references for instructions with multiple/ambiguous symbolic operands and handles cases where a relocation doesn't match any operand. See test cases for examples. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D152631	2023-06-12 10:46:51 -07:00
Shengchen Kan	3f1e9468f6	[X86][MC][bolt] Share code between encoding optimization and assembler relaxation, NFCI PUSH[16|32|64]i[8|32] are not arithmetic instructions, so I renamed the functions. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D151028	2023-05-21 09:31:50 +08:00
Shengchen Kan	89ca4eb002	[X86][NFC] Correct the instruction names for PUSH16i, PUSH32i Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D151012	2023-05-20 17:33:42 +08:00
Amir Ayupov	b6f07d3ae8	[BOLT][NFC] Add MCPlusBuilder defOperands/useOperands helpers Make intent more explicit with the use of new helper methods. Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D150810	2023-05-17 21:52:33 -07:00
spupyrev	3e3a926be8	[BOLT][NFC] Add hash computation for basic blocks Extend the YAML profile format with block hashes, which are used for stale profile matching. To avoid code duplication, create a new class with a collection of utilities for computing hashes. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D144306	2023-05-02 14:03:47 -07:00
Amir Ayupov	edda85771a	[BOLT][NFC] Move addRelocation{X86,AArch64} into MCPlusBuilder The two methods don't belong in BinaryFunction methods. Move the dispatch tables into target-specific MCPlusBuilder methods. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D131813	2023-03-14 17:34:25 -07:00
Amir Ayupov	223ec28da4	[BOLT][NFC] Return instruction list from createInstrIncMemory Leverage move semantics for `std::vector`. This also makes it consistent with `createInstrumentationSnippet`. Reviewed By: Elvina Differential Revision: https://reviews.llvm.org/D145465	2023-03-13 12:56:39 -07:00
Maksim Panchenko	fb28196a64	[BOLT] Fix intermittent crash with instrumentation When createInstrumentedIndirectCall() was invoked for tail calls, we attached annotation instruction twice to the new call instruction. First in createDirectCall(), and then again while copying over the metadata operands. As a result, the annotations were not properly stripped for such calls before the call to freeAnnotations() in LowerAnnotations pass. That led to a use-after-free while restoring the offsets with the setOffset() call. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D144806	2023-02-27 14:11:10 -08:00
Shengchen Kan	471c0e000a	[BOLT][X86][NFC] Simplify the code of X86MCPlusBuilder::getAliasSized Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D144551	2023-02-23 10:41:28 +08:00
Amir Ayupov	48a215ae6c	[BOLT][NFC] Return struct from evaluateX86MemoryOperand Simplify `MCPlusBuilder::evaluateX86MemoryOperand`: make it return the memory operand analysis as a struct, `X86MemOperand`. Reviewed By: #bolt, rafauler Differential Revision: https://reviews.llvm.org/D144310	2023-02-22 12:06:50 -08:00
Jay Foad	fbb003378b	[BOLT] Use MCInstrDesc::operands() instead of OpInfo operands() is the preferred accessor since D142213. OpInfo will be removed in D142219. Differential Revision: https://reviews.llvm.org/D142530	2023-01-25 17:26:48 +00:00
Amir Ayupov	2563fd63c6	[BOLT][NFC] Use std::optional in MCPlusBuilder Reviewed By: maksfb, #bolt Differential Revision: https://reviews.llvm.org/D139260	2022-12-06 14:51:38 -08:00
Kazu Hirata	e324a80fab	[BOLT] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 23:12:38 -08:00
Kazu Hirata	1fa870b1bd	Use None consistently (NFC) This patch replaces NoneType() and NoneType::None with None in preparation for migration from llvm::Optional to std::optional. In the std::optional world, we are not guaranteed to be able to default-construct std::nullopt_t or peek what's inside it, so neither NoneType() nor NoneType::None has a corresponding expression in the std::optional world. Once we consistently use None, we should even be able to replace the contents of llvm/include/llvm/ADT/None.h with something like: using NoneType = std::nullopt_t; inline constexpr std::nullopt_t None = std::nullopt; to ease the migration from llvm::Optional to std::optional. Differential Revision: https://reviews.llvm.org/D138376	2022-11-20 00:24:40 -08:00
Fangrui Song	0972a390b9	LLVM_FALLTHROUGH => [[fallthrough]]. NFC	2022-08-09 04:06:52 +00:00
Kazu Hirata	f081ec20b5	[bolt] Remove redundaunt virtual specifiers (NFC) Identified with modernize-use-override.	2022-07-30 10:35:51 -07:00
Rafael Auler	a3cfdd746e	[BOLT] Increase coverage of shrink wrapping [5/5] Add -experimental-shrink-wrapping flag to control when we want to move callee-saved registers even when addresses of the stack frame are captured and used in pointer arithmetic, making it more challenging to do alias analysis to prove that we do not access optimized stack positions. This alias analysis is not yet implemented, hence, it is experimental. In practice, though, no compiler would emit code to do pointer arithmetic to access a saved callee-saved register unless there is a memory bug or we are failing to identify a callee-saved reg, so I'm not sure how useful it would be to formally prove that. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126115	2022-07-11 17:30:13 -07:00
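
The jump-table layout described in commit 3023b15fb1 above (each entry holds `Li - JT`, loaded with a sign-extending `movslq` and added to the table base) can be sketched as follows. This is an illustrative reading of the commit message only, not BOLT code: the names `resolvePICJumpTable` and `resolveFixedPICBranch` are hypothetical, and the 32-bit signed entry width is an assumption drawn from the `movslq` in the sequence.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a BOLT API): recover every branch target of a
// PIC jump table whose entries store the signed 32-bit offset `Li - JT`.
std::vector<std::uint64_t>
resolvePICJumpTable(std::uint64_t TableAddr,
                    const std::vector<std::int32_t> &Entries) {
  std::vector<std::uint64_t> Targets;
  Targets.reserve(Entries.size());
  for (std::int32_t Entry : Entries) {
    // movslq sign-extends the entry to 64 bits; addq then adds the table
    // base, so the target is JT + (Li - JT) = Li.
    Targets.push_back(TableAddr +
                      static_cast<std::uint64_t>(static_cast<std::int64_t>(Entry)));
  }
  return Targets;
}

// Hypothetical helper: in the "fixed" form the entry index is known
// statically (movslq En(%rip), ...), so only one target is possible.
// Throws std::out_of_range if Index is past the end of the table.
std::uint64_t resolveFixedPICBranch(std::uint64_t TableAddr,
                                    const std::vector<std::int32_t> &Entries,
                                    std::size_t Index) {
  return TableAddr +
         static_cast<std::uint64_t>(static_cast<std::int64_t>(Entries.at(Index)));
}
```

Because the entry is fixed, the indirect `jmpq` in that pattern has a single statically known destination, which is what distinguishes it from a general PIC jump table dispatch.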

84 Commits