llvm-project

Author	SHA1	Message	Date
Amir Ayupov	3968ebd00d	[BOLT] Keep multi-entry functions simple in aggregation mode (#128253 ) BOLT used to mark multi-entry functions non-simple in non-relocation mode with the reasoning that we can't move them due to potentially undetected references. However, in aggregation mode it doesn't apply as BOLT doesn't perform optimizations. Relax this constraint in case of an aggregation job. Test Plan: added entry-point-fallthru.s	2025-02-25 10:53:45 -08:00
YongKang Zhu	9fa77c1854	[BOLT][Linker][NFC] Remove lookupSymbol() in favor of lookupSymbolInfo() (#128070 ) Sometimes we need to know the size of a symbol besides its address, so maybe we can start using the existing `BOLTLinker::lookupSymbolInfo()` (that returns symbol address and size) and remove `BOLTLinker::lookupSymbol()` (that only returns symbol address). And for both we need to check return value as it is wrapped in `std::optional<>`, which makes the difference even smaller.	2025-02-20 17:14:33 -08:00
Maksim Panchenko	0ba391a85f	[BOLT] Improve constant island disassembly (#127971 ) * Add label that identifies constant island. * Support cases where the island is located after the function.	2025-02-20 11:16:01 -08:00
Maksim Panchenko	3115278c4e	[BOLT] Fixup for commit 137c378/#125961	2025-02-06 00:26:20 -08:00
Maksim Panchenko	137c3781e6	[BOLT][AArch64] Include constant islands in disassembly (#125961 ) When printing disassembly of a function with constant islands, include the island info in the dump. At the moment, only print islands in pre-CFG state. Include islands that are interleaved with instructions.	2025-02-05 22:41:40 -08:00
Maksim Panchenko	ef232a7e34	[BOLT][AArch64] Remove nops in functions with defined control flow (#124705 ) When a function has an indirect branch with unknown control flow, we preserve nops in order to keep all instruction offsets (from the start of the function) the same in case the indirect branch is used by a PC-relative jump table. However, when we know the control flow of the function, we should be able to safely remove nops.	2025-01-28 11:03:49 -08:00
Alexander Yermolovich	3c357a49d6	[BOLT] Add support for safe-icf (#116275 ) Identical Code Folding (ICF) folds functions that are identical into one function, and updates symbol addresses to the new address. This reduces the size of a binary, but can lead to problems, for example when function pointers are compared. The comparison can be written explicitly in the code or generated in IR by optimization passes like Indirect Call Promotion (ICP). After ICF, what used to be two different addresses become the same address. This can lead to a different code path being taken. This is where safe ICF comes in. The linker (LLD) does it using the address-significant section generated by clang. If a symbol is in it, or an object doesn't have this section, symbols are not folded. BOLT does not have the information regarding which objects do not have this section, so it can't re-use this mechanism. This implementation scans the code section and conservatively marks function symbols as unsafe. It treats symbols as unsafe if they are used in a non-control-flow instruction. It also scans through the data relocation sections and does the same for relocations that reference a function symbol. The latter handles the case when a function pointer is stored in a local or global variable, etc. If a relocation address points within a vtable, these symbols are skipped.	2024-12-16 21:49:53 -08:00
Enna1	4d2bc0adc6	[BOLT] Extract comparator for sorting functions by index into helper function (#116217 ) This change extracts the comparator for sorting functions by index into a helper function `compareBinaryFunctionByIndex()`. Not sure why the comparator used in `BinaryContext::getSortedFunctions()` is not the same as in the other two places. I think they should use the same comparator, so I also change `BinaryContext::getSortedFunctions()` to use `compareBinaryFunctionByIndex()` for sorting functions.	2024-11-27 09:01:12 +08:00
Daniel Sanders	74003f11b3	[mc] Add CFI directive to emit val_offset() rules (#113971 ) These specify that the value of the given register in the previous frame is the CFA plus some offset. This isn't very common but can be necessary if the original value is normally reconstructed from the stack/frame pointer instead of being saved on the stack and reloaded from there.	2024-11-11 11:38:36 -08:00
Kazu Hirata	41baa69a7e	[BOLT] Fix warnings (#114116 ) This patch fixes: bolt/lib/Core/BinaryFunction.cpp:2537:13: error: enumeration value 'OpNegateRAStateWithPC' not handled in switch [-Werror,-Wswitch] bolt/lib/Core/BinaryFunction.cpp:2661:13: error: enumeration value 'OpNegateRAStateWithPC' not handled in switch [-Werror,-Wswitch] bolt/lib/Core/BinaryFunction.cpp:2805:13: error: enumeration value 'OpNegateRAStateWithPC' not handled in switch [-Werror,-Wswitch]	2024-10-29 13:52:22 -07:00
Kazu Hirata	7928e14f5e	[BOLT] Avoid repeated map lookups (NFC) (#112118 )	2024-10-12 22:06:49 -07:00
Maksim Panchenko	4db0cc4c55	[BOLT] Allow sections in --print-only flag (#109622 ) While printing functions, expand --print-only flag to accept section names. E.g., "--print-only=\.init" will only print functions from ".init" section.	2024-09-25 23:44:06 +02:00
Maksim Panchenko	abd69b3653	[BOLT] Handle internal calls in ValidateInternalCalls (#105736 ) Move handling of all internal calls into the designated pass. Preserve NOPs and mark functions as non-simple on non-X86 platforms.	2024-08-27 11:31:32 -07:00
Maksim Panchenko	8f3050684e	[BOLT] Reduce CFI warning verbosity (#105336 ) CFI programs may have more saves than restores and this is completely benign from BOLT's perspective. Reduce the verbosity and print the warning only under `-v=1` and above.	2024-08-20 13:41:19 -07:00
Amir Ayupov	f83a89c1b1	[BOLT] Turn non-empty CFI StateStack assert into a warning (#102216 ) clang-15 can produce binaries with mismatched RememberState/RestoreState CFIs. This is benign for unwinding, so replace an assert with a warning.	2024-08-06 17:23:43 -07:00
Amir Ayupov	3023b15fb1	[BOLT] Support POSSIBLE_PIC_FIXED_BRANCH Detect and support fixed PIC indirect jumps of the following form: ``` movslq En(%rip), %r1 leaq PIC_JUMP_TABLE(%rip), %r2 addq %r2, %r1 jmpq *%r1 ``` with PIC_JUMP_TABLE that looks like following: ``` JT: ---------- E1:\| L1 - JT \| \|----------\| E2:\| L2 - JT \| \|----------\| \| \| ...... En:\| Ln - JT \| ---------- ``` The code could be produced by compilers, see https://github.com/llvm/llvm-project/issues/91648. Test Plan: updated jump-table-fixed-ref-pic.test Reviewers: maksfb, ayermolo, dcci, rafaelauler Reviewed By: rafaelauler Pull Request: https://github.com/llvm/llvm-project/pull/91667	2024-07-18 20:57:05 -07:00
Fangrui Song	2718654c54	[MC] Support .cfi_label GNU assembler 2.26 introduced the .cfi_label directive. It does not expand to any CFI instructions, but defines a label in .eh_frame/.debug_frame, which can be used by runtime patching code to locate the FDE. .cfi_label is not allowed for CIE's initial instructions, and can therefore be used to force the next instruction to be placed in a FDE instead of a CIE. In glibc since 2018, sysdeps/riscv/start.S utilizes .cfi_label to force DW_CFA_undefined to be placed in a FDE. arc/csky/loongarch ports have copied this use. ``` .cfi_startproc // DW_CFA_undefined is allowed for CIE's initial instructions. // Without .cfi_label, gas would place DW_CFA_undefined in a CIE. .cfi_label .Ldummy .cfi_undefined ra .cfi_endproc ``` No CFI instruction is associated with .cfi_label, so the `case MCCFIInstruction::OpLabel:` code in BOLT is unreachable and is only there to make -Wswitch happy. Close #97222 Pull Request: https://github.com/llvm/llvm-project/pull/97922	2024-07-07 12:41:13 -07:00
Amir Ayupov	344228ebf4	[BOLT] Drop macro-fusion alignment (#97358 ) 9d0754ada5dbbc0c009bcc2f7824488419cc5530 dropped MC support required for optimal macro-fusion alignment in BOLT. Remove the support in BOLT as performance measurements with large binaries didn't show a significant improvement. Test Plan: macro-fusion alignment was never upstreamed, so no upstream tests are affected.	2024-07-02 09:20:41 -07:00
Nathan Sidwell	6c5b62b846	[BOLT][NFC] Separate isReversibleBranch's 2 semantics (#95572 ) `isUnsupportedBranch` was renamed (and inverted) to `isReversibleBranch`, as that was how it was being used. But one use in `BinaryFunction::disassemble` was using the original meaning to detect unsupported branches, and the `isUnsupportedBranch` had 2 separate semantic checks. Move the unsupported branch check from `isReversibleBranch` to a new entry point: `isUnsupportedInstruction`. Call that from `BinaryFunction::disassemble`. Move the dynamic branch check from X86's isReversibleBranch to the base class, as it is not an architecture-specific check. Remove unnecessary `isReversibleBranch` calls from Instrumentation and X86 MCPlusBuilder.	2024-06-28 07:45:37 -04:00
Maksim Panchenko	d16b21b17d	[BOLT][Linux] Support ORC for alternative instructions (#96709 ) Alternative instruction sequences in the Linux kernel can modify the stack and thus they need their own ORC unwind entries. Since there's only one ORC table, it has to be "shared" among multiple instruction sequences. The kernel achieves this by putting a restriction on instruction boundaries. If ORC state changes at a given IP, only one of the alternative sequences can have an instruction starting/ending at this IP. Then, developers can insert NOPs to guarantee the above requirement is met. The most common use of ORC with alternatives is "pushf; pop %rax" sequence used for paravirtualization. Note that newer kernel versions no longer use .parainstructions; instead, they utilize alternatives for the same purpose. Before we implement a better support for alternatives, we can safely skip ORC entries associated with them. Fixes #87052.	2024-06-27 19:26:11 -07:00
Maksim Panchenko	ca06b61084	[BOLT] Omit CFI state while printing functions without CFI (#96723 ) If a function has no CFI program attached to it, do not print redundant empty CFI state for every basic block.	2024-06-27 17:26:58 -07:00
Nikita Popov	b23fe1088f	[bolt] Add missing <stack> include (NFC)	2024-06-21 14:02:15 +02:00
shaw young	4be3083bb3	[BOLT] Remove mutable from BB::LayoutIndex (#93224 ) Removed mutability from BB::LayoutIndex, subsequently removed const from BB::SetLayout, and changed BF::dfs to track visited blocks with a set as opposed to tracking and altering LayoutIndexes for more consistent code.	2024-05-31 11:52:22 -07:00
Amir Ayupov	f239490592	[BOLT][NFC] Define getExprValue helper (#91663 ) Move out common code extracting the address of a MCExpr. To be reused in #91667. Test Plan: NFC	2024-05-24 15:33:25 -07:00
Amir Ayupov	720cade2b6	[BOLT][NFC] Avoid computing BF hash twice in YAML reader (#75096 ) We compute BF hashes in `YAMLProfileReader::readProfile` when first matching profile functions with binary functions, and second time in `YAMLProfileReader::parseFunctionProfile` during the profile assignment (we need to do that to account for LTO private functions with mismatching suffix). Avoid recomputing the hash if it's been set.	2024-05-24 14:00:03 -07:00
Amir Ayupov	935b946b1f	[BOLT] Process cross references between ignored functions in BAT mode (#92484 ) To align YAML and fdata profiles produced in BAT mode, lift two restrictions applied in non-relocation mode when BAT is present: 1) register secondary entry points from ignored functions, 2) treat functions with secondary entry points as simple. This allows constructing CFG for non-simple functions in non-relocation mode and emitting YAML profile for them, which can then be used for optimizations in relocation mode. Test Plan: added test ignored-interprocedural-reference.s	2024-05-21 20:22:12 -07:00
Nathan Sidwell	76fdc2e527	[BOLT][NFC] Rename isUnsupportedBranch to isReversibleBranch (#92447 ) `isUnsupportedBranch` is not a very informative name, and doesn't match its corresponding `reverseBranchCondition`, as I noted in PR #92018. Here's a renaming to a more mnemonic name.	2024-05-17 15:40:40 -04:00
Nathan Sidwell	725014d866	[BOLT][NFC] Simplify CFG validation (#91977 ) Remove 'Valid' local boolean that has a single use, and return directly instead.	2024-05-14 09:36:34 -04:00
Amir Ayupov	db29f20fdd	[BOLT] Ignore returns in DataAggregator Returns are ignored in perf/pre-aggregated/fdata profile reader (see DataReader::convertBranchData). They are also omitted in YAMLProfileWriter by virtue of not having the profile attached to them in the reader, and YAMLProfileWriter converting the profile attached to BinaryFunctions. Thus, return profile is universally ignored across all profile types except BAT YAML. To make returns ignored for YAML produced in BAT mode, we can: 1) ignore them in YAMLProfileReader, 2) omit them from YAML profile in profile conversion/writing. The first option is prone to profile staleness issue, where the profiled binary doesn't match the one to be optimized, and thus returns in the profile can no longer be reliably detected (as we don't distinguish them from calls in the profile). The second option is robust to staleness but requires disassembling the branch source instruction. Test Plan: Updated bolt-address-translation-yaml.test Reviewers: rafaelauler, dcci, ayermolo, maksfb Reviewed By: maksfb Pull Request: https://github.com/llvm/llvm-project/pull/90807	2024-05-08 12:02:18 -07:00
Amir Ayupov	fd38366e45	[BOLT][NFC] Clean includes, add license headers (#87200 )	2024-03-31 19:29:45 -07:00
Amir Ayupov	d12e45ad16	[BOLT][NFC] Split out DomTree construction from BF::calculateLoopInfo (#87181 )	2024-03-31 06:24:19 -07:00
Amir Ayupov	d8fe2e4bb0	[BOLT] Fix enumeration of secondary entry points Make them start with 1 instead of 0 (reserved for primary entry point). Test Plan: ``` bin/llvm-lit -a tools/bolt/test/X86/yaml-secondary-entry-discriminator.s ``` Reviewers: rafaelauler, ayermolo, maksfb, dcci Reviewed By: maksfb Pull Request: https://github.com/llvm/llvm-project/pull/86848	2024-03-27 15:23:49 -07:00
Maksim Panchenko	6b1cf00400	[BOLT] Add support for Linux kernel static keys jump table (#86090 ) Runtime code modification used by static keys is the most ubiquitous self-modifying feature of the Linux kernel. The idea is to eliminate the condition check and associated conditional jump on a hot path if that condition (based on a boolean value of a static key) does not change often. Whenever the condition changes, the kernel runtime modifies all code paths associated with that key, flipping the code between nop and (unconditional) jump.	2024-03-21 14:05:21 -07:00
Maksim Panchenko	d7d564b2fc	[BOLT] Add BinaryFunction::registerBranch(). NFC (#83337 ) Add an external interface to register a branch in a function that is in disassembled state. Allows to make custom modifications to the disassembler. E.g., a pre-CFG pass can add an instruction and register a branch that will later be used during the CFG construction.	2024-02-28 20:04:28 -08:00
Maksim Panchenko	3f2a9e5910	[BOLT] Sort TakenBranches immediately before use. NFCI (#83333 ) Move code that sorts TakenBranches right before the branches are used. We can populate TakenBranches in pre-CFG post-processing and hence have to postpone the sorting to a later point in the processing pipeline. Will add such a pass later. For now it's NFC.	2024-02-28 19:51:44 -08:00
Maksim Panchenko	7c206c7812	[BOLT] Refactor interface for instruction labels. NFCI (#83209 ) To avoid accidentally setting the label twice for the same instruction, which can lead to a "lost" label, introduce getOrSetInstLabel() function. Rename existing functions to getInstLabel()/setInstLabel() to make it explicit that they operate on instruction labels. Add an assertion in setInstLabel() that the instruction did not have a prior label set.	2024-02-27 18:44:28 -08:00
Amir Ayupov	52cf07116b	[BOLT][NFC] Log through JournalingStreams (#81524 ) Make core BOLT functionality more friendly to being used as a library instead of in our standalone driver llvm-bolt. To accomplish this, we augment BinaryContext with journaling streams that are to be used by most BOLT code whenever something needs to be logged to the screen. Users of the library can decide if logs should be printed to a file, no file or to the screen, as before. To illustrate this, this patch adds a new option `--log-file` that allows the user to redirect BOLT logging to a file on disk or completely hide it by using `--log-file=/dev/null`. Future BOLT code should now use `BinaryContext::outs()` for printing important messages instead of `llvm::outs()`. A new test log.test enforces this by verifying that no strings are printed to the screen once the `--log-file` option is used. In previous patches we also added a new BOLTError class to report common and fatal errors, so code shouldn't call exit(1) now. To easily handle problems as before (by quitting with exit(1)), callers can now use `BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code needs to deal with BOLT errors. To test this, we have fatal.s that checks we are correctly quitting and printing a fatal error to the screen. Because this is a significant change by itself, not all code was yet ported. Code from Profiler libs (DataAggregator and friends) still prints errors directly to the screen. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:53:53 -08:00
Amir Ayupov	13d60ce2f2	[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523 ) As part of the effort to refactor old error handling code that would directly call exit(1), in this patch continue the migration on libCore, libRewrite and libPasses to use the new BOLTError class whenever a failure occurs. Test Plan: NFC Co-authored-by: Rafael Auler <rafaelauler@fb.com>	2024-02-12 14:51:15 -08:00
Amir Ayupov	b039ccc684	[BOLT] Provide backwards compatibility for YAML profile with std::hash (#74253 ) Provide backwards compatibility for YAML profile that uses `std::hash`: xxh3 hash is the default for newly produced profile (sets `std-hash: false`), whereas the profile that doesn't specify `std-hash` will be treated as `std-hash: true`, preserving old behavior.	2023-12-11 12:27:32 -08:00
Maksim Panchenko	4f3081296f	[BOLT][NFC] Fix comment (#73983 ) Fix off-by-one error in comment.	2023-11-30 14:31:38 -08:00
Maksim Panchenko	4bcbbe1f70	[BOLT] Refactor fixBranches() (#73752 ) Simplify code in fixBranches(). Mostly NFC, except the x86-specific check for code fragments now takes into account the presence of more than two fragments. Should only matter when we split code into multiple fragments and can run fixBranches() more than once. Also, don't replace a branch target with the same one, as such an operation may allocate memory for an extra MCSymbolRefExpr.	2023-11-29 16:24:16 -08:00
spupyrev	e7dd596c68	[BOLT] Use deterministic xxh3 for computing BF/BB hashes (#72542 ) std::hash and ADT/Hashing::hash_value are non-deterministic functions whose results might vary across implementation/process/execution. Using xxh3 instead for computing hashes of BinaryFunctions and BinaryBasicBlock for stale profile matching. (A possible alternative is to use ADT/StableHashing.h based on FNV hashing but xxh3 seems to be more popular in LLVM) This is to address https://github.com/llvm/llvm-project/issues/65241.	2023-11-27 14:45:46 -08:00
Maksim Panchenko	f4834255d3	[BOLT] Reset output addresses for deleted blocks (#73429 ) This is a follow-up to #73076. We need to reset output addresses for deleted blocks, otherwise the address translation may mistakenly attribute input address of a deleted block to a non-zero address. While working on a test case, I've discovered that DWARF output ranges were already broken for deleted basic blocks: #73428. I will provide a test case for this PR with a DWARF address range fix PR.	2023-11-25 23:23:47 -08:00
Maksim Panchenko	365114292a	[BOLT][NFC] Refactor function state check (#73420 ) Remove redundant check in updateOutputValues().	2023-11-25 21:09:54 -08:00
ShatianWang	d333c0e062	[BOLT] Extend calculateEmittedSize() for block size calculation (#73076 ) This commit modifies BinaryContext::calculateEmittedSize() to update the BinaryBasicBlock::OutputAddressRange of each basic block in the function in place. BinaryBasicBlock::getOutputSize() now gives the emitted size of the basic block.	2023-11-23 15:28:31 -05:00
Maksim Panchenko	f653f6d57a	[BOLT][NFC] Delete unused declarations (#72596 )	2023-11-16 23:36:19 -08:00
Vladislav Khmelevsky	5b59540661	[BOLT] Enhance fixed indirect branch handling (#71324 ) Previously, HasFixedIndirectBranch was set in BF so that isSimple could later be set to false, because the unreachable BB elimination pass might remove a BB whose symbols are accessed by instructions other than calls. A better solution seems to be adding an extra entry point at the target offset instead of marking the BF as non-simple.	2023-11-16 09:30:55 +04:00
Maksim Panchenko	e823136d43	[BOLT] Refactor --keep-nops option. NFC. (#72228 ) Run RemoveNops pass only if --keep-nops is set to false (default).	2023-11-14 11:28:13 -08:00
Maksim Panchenko	f633f325a1	[BOLT] Fix NOP instruction emission on x86 (#72186 ) Use MCAsmBackend::writeNopData() interface to emit NOP instructions on x86. There are multiple forms of NOP instruction on x86 with different sizes. Currently, LLVM's assembly/disassembly does not support all forms correctly which can lead to a breakage of input code semantics, e.g. if the program relies on NOP instructions for reserving a patch space. Add "--keep-nops" option to preserve NOP instructions.	2023-11-13 18:12:39 -08:00
Maksim Panchenko	2db9b6a93f	[BOLT] Make instruction size a first-class annotation (#72167 ) When NOP instructions are used to reserve space in the code, e.g. for patching, it becomes critical to preserve their original size while emitting the code. On x86, we rely on the "Size" annotation for NOP instruction size, as the original instruction size is lost in the disassembly/assembly process. This change makes instruction size a first-class annotation and is effectively NFCI. A follow-up diff will use the annotation for code emission.	2023-11-13 14:33:39 -08:00
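
Note on commit 3023b15fb1 (POSSIBLE_PIC_FIXED_BRANCH): each jump table entry holds `Ln - JT`, which `movslq` loads as a sign-extended 32-bit value, and the jump target is recovered by adding the table address back. The following is a minimal Python sketch of that arithmetic only. It is not BOLT's implementation; the function name, the little-endian layout assumption, and all addresses are hypothetical.

```python
import struct

def decode_pic_jump_table(table_bytes, table_address):
    """Return absolute targets for a table of signed 32-bit (Ln - JT) entries.

    Assumes little-endian 4-byte entries, matching a movslq load on x86-64.
    """
    count = len(table_bytes) // 4
    offsets = struct.unpack(f"<{count}i", table_bytes[:count * 4])
    # Target = JT + (Ln - JT); offsets may be negative when Ln precedes JT.
    return [table_address + offset for offset in offsets]

# Hypothetical layout: two labels below the table and one above it.
JT = 0x2000
labels = [0x1010, 0x1040, 0x2100]
raw = b"".join(struct.pack("<i", label - JT) for label in labels)
targets = decode_pic_jump_table(raw, JT)
```

Because the entries are relative to the table itself, the same bytes decode correctly wherever the table is loaded, which is what makes the sequence position independent.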


165 Commits