llvm-project

Author	SHA1	Message	Date
Alexey Moksyakov	6f748698f7	Revert "[bolt][aarch64] simplify rodata/literal load for X86 & AArch6… (#172822) A few tests are broken on Ubuntu; the cause still needs to be found. This reverts commit 999c9382571d6aadf9b786263862bf4085dd2dba. Co-authored-by: yavtuk <yavtuk@ya.ru>	2025-12-18 14:19:17 +03:00
Alexey Moksyakov	999c938257	[bolt][aarch64] simplify rodata/literal load for X86 & AArch64 (#165723) This patch fixes an issue with literal loads on AArch64 (bolt/test/AArch64/materialize-constant.s): the address range for a literal is limited to +/- 1MB, and emitCI places the constants at the end of the function, where one can fall outside the available range. SimplifyRODataLoads is enabled by default for X86 & AArch64.	2025-12-18 09:59:01 +03:00
Maksim Panchenko	adaca1348e	[BOLT] Introduce getOutputBinaryFunctions(). NFCI (#172174 ) To gain better control over the functions that go into the output file and their order, introduce `BinaryContext::getOutputBinaryFunctions()`. The new API returns a modifiable list of functions in output order. This list is filled by a new `PopulateOutputFunctions` pass and includes emittable functions from the input file, plus functions added by BOLT (injected functions). The new functionality allows to freely intermix input functions with injected ones in the output, which will be used in new PRs. The new function replaces `BinaryContext::getSortedFunctions()`, but unlike its predecessor, it includes injected functions in the returned list.	2025-12-14 16:29:01 -08:00
Gergely Bálint	a25e3674ae	[BOLT] Rename Pointer Auth DWARF rewriter passes (#164622 ) Rename passes to names that better reflect their intent, and describe their relationship to each other. InsertNegateRAStatePass renamed to PointerAuthCFIFixup, MarkRAStates renamed to PointerAuthCFIAnalyzer. Added the --print-<passname> flags for these passes.	2025-12-04 11:29:40 +01:00
YongKang Zhu	718a3b268f	[BOLT][AArch64] Run LDR relaxation (#165787 ) Replace the current `ADRRelaxationPass` with `AArch64RelaxationPass`, which, besides the existing ADR relaxation, will also run LDR relaxation that for now only handles these two forms of LDR instructions: `ldr Xt, [label]` and `ldr Wt, [label]`.	2025-11-04 06:49:04 -08:00
Gergely Bálint	889bfd9172	Reapply "[BOLT][AArch64] Handle OpNegateRAState to enable optimizing binaries with pac-ret hardening" (#162353 ) (#162435 ) Reapply "[BOLT][AArch64] Handle OpNegateRAState to enable optimizing binaries with pac-ret hardening (#120064)" (#162353) This reverts commit c7d776b06897567e2d698e447d80279664b67d47. #120064 was reverted for breaking builders. Fix: changed the mismatched type in MarkRAStates.cpp to `auto`. --- Original message: OpNegateRAState is an AArch64-specific DWARF CFI used to change the value of the RA_SIGN_STATE pseudoregister. The RA_SIGN_STATE register records whether the current return address has been signed with PAC. OpNegateRAState requires special handling in BOLT because its placement depends on the function layout. Since BOLT reorders basic blocks during optimization, these CFIs must be regenerated after layout is finalized. This patch introduces two new passes: - MarkRAStates (runs before optimizations): assigns a signedness annotation to each instruction based on OpNegateRAState CFIs in the input binary. - InsertNegateRAStates (runs after optimizations): reads the annotations and emits new OpNegateRAState CFIs where RA state changes between instructions. Design details are described in: `bolt/docs/PacRetDesign.md`.	2025-10-08 11:05:41 +02:00
Gergely Bálint	c7d776b068	Revert "[BOLT][AArch64] Handle OpNegateRAState to enable optimizing binaries with pac-ret hardening" (#162353 ) Reverts llvm/llvm-project#120064. @gulfemsavrun reported that the patch broke toolchain builders.	2025-10-07 21:59:18 +02:00
Gergely Bálint	32eaf5b59c	[BOLT][AArch64] Handle OpNegateRAState to enable optimizing binaries with pac-ret hardening (#120064 ) OpNegateRAState is an AArch64-specific DWARF CFI used to change the value of the RA_SIGN_STATE pseudoregister. The RA_SIGN_STATE register records if the current return address has been signed with PAC. OpNegateRAState requires special handling in BOLT because its placement depends on the function layout. Since BOLT reorders basic blocks during optimization, these CFIs must be regenerated after layout is finalized. This patch introduces two new passes: - MarkRAStates (runs before optimizations): assigns a signedness annotation to each instruction based on OpNegateRAState CFIs in the input binary. - InsertNegateRAStates (runs after optimizations): reads the annotations and emits new OpNegateRAState CFIs where RA state changes between instructions. Design details are described in: `bolt/docs/PacRetDesign.md`.	2025-10-07 10:22:14 +02:00
YafetBeyene	244588b9d7	[BOLT][AArch64] Inlining of Memcpy (#154929 ) The pass for inlining memcpy in BOLT was currently X86-specific and was using the instruction `rep movsb`. This patch implements a static size analysis system for AArch64 memcpy inlining that extracts copy sizes from preceding instructions to then use it to generate the optimal width-specific load/store sequences.	2025-09-09 14:09:23 +01:00
YafetBeyene	fda24dbc16	[BOLT] Add dump-dot-func option for selective function CFG dumping (#153007) ## Change: * Added `--dump-dot-func` command-line option that allows users to dump CFGs only for specific functions instead of dumping all functions (the only option currently available being `--dump-dot-all`) ## Usage: * Users can now specify function names or regex patterns (e.g., `--dump-dot-func=main,helper` or `--dump-dot-func="init.`") to generate .dot files only for functions of interest. This aims to save time when analysing specific functions in large binaries (e.g., only dumping graphs for performance-critical functions identified through profiling) and avoids the output clutter of generating thousands of unnecessary .dot files. ## Testing The introduced test `dump-dot-func.test` confirms the new option does the following: - [x] 1. `dump-dot-func` can correctly filter a specified function - [x] 2. Can achieve the above with regexes - [x] 3. Can do 1. with a list of functions - [x] No option specified creates no dot files - [x] Passing in a non-existent function generates no dumping messages - [x] `dump-dot-all` continues to work as expected	2025-08-22 10:51:09 +01:00
Maksim Panchenko	7d6fda4fd3	[BOLT] Run PatchEntries pass before LongJmp (#137236) With the --force-patch option, every original function entry point is overwritten with a trampoline to a new version of the function to prevent the execution of the original code. If the function size is too small for the trampoline code, we are forced to bail out on rewriting the function. That presented a problem on AArch64 due to the LongJmp pass, which assumed the presence of the new copy of the function. If the new copy was not emitted, it could have led to a relocation overflow. Run the PatchEntries pass before LongJmp and make the latter aware of the functions that are not going to be emitted. Make --force-patch option behavior on AArch64 consistent with other architectures.	2025-05-01 15:09:09 -07:00
ShatianWang	7e33bebe7c	[BOLT] Report flow conservation scores (#127954 ) Add two additional profile quality stats for CG (call graph) and CFG (control flow graph) flow conservations besides the CFG discontinuity stats introduced in #109683. The two new stats quantify how different "in-flow" is from "out-flow" in the following cases where they should be equal. The smaller the reported stats, the better the flow conservations are. CG flow conservation: for each function that is not a program entry, the number of times the function is called according to CG ("in-flow") should be equal to the number of times the transition from an entry basic block of the function to another basic block within the function is recorded ("out-flow"). CFG flow conservation: for each basic block that is not a function entry or exit, the number of times the transition into this basic block from another basic block within the function is recorded ("in-flow") should be equal to the number of times the transition from this basic block to another basic block within the function is recorded ("out-flow"). Use `-v=1` for more detailed bucketed stats, and use `-v=2` to dump functions / basic blocks with bad flow conservations.	2025-02-28 11:06:52 -05:00
Alexander Yermolovich	3c357a49d6	[BOLT] Add support for safe-icf (#116275) Identical Code Folding (ICF) folds functions that are identical into one function, and updates symbol addresses to the new address. This reduces the size of a binary, but can lead to problems, for example when function pointers are compared. The comparison can appear either explicitly in the code or in IR generated by optimization passes like Indirect Call Promotion (ICP). After ICF, what used to be two different addresses becomes the same address. This can lead to a different code path being taken. This is where safe ICF comes in. The linker (LLD) does it using the address-significant section generated by clang. If a symbol is in it, or an object doesn't have this section, symbols are not folded. BOLT does not have the information regarding which objects do not have this section, so it can't re-use this mechanism. This implementation scans the code section and conservatively marks function symbols as unsafe. It treats symbols as unsafe if they are used in a non-control-flow instruction. It also scans through the data relocation sections and does the same for relocations that reference a function symbol. The latter handles the case when a function pointer is stored in a local or global variable, etc. If a relocation address points within a vtable, these symbols are skipped.	2024-12-16 21:49:53 -08:00
Paschalis Mpeis	2df48fa78b	[BOLT][AArch64] Enable function print after ADRRelaxation (#119869 ) Introduce `--print-adr-relaxation` to print after ADR Relaxation pass.	2024-12-16 12:06:56 +00:00
ShatianWang	4cab01f072	[BOLT] Profile quality stats -- CFG discontinuity (#109683 ) In a perfect profile, each positive-execution-count block in the function’s CFG should be reachable from a positive-execution-count function entry block through a positive-execution-count path. This new pass checks how well the BOLT input profile satisfies this “CFG continuity” property. More specifically, for each of the hottest 1000 functions, the pass calculates the function’s fraction of basic block execution counts that is “unreachable”. It then reports the 95th percentile of the distribution of the 1000 unreachable fractions in a single BOLT-INFO line. The smaller the reported value is, the better the BOLT profile satisfies the CFG continuity property. The default value of 1000 above can be changed via the hidden BOLT option `-num-functions-for-continuity-check=[N]`. If more detailed stats are needed, `-v=1` can be added to the BOLT invocation: the hottest N functions will be grouped into 5 equally-sized buckets, from the hottest to the coldest; for each bucket, various summary statistics of the distribution of the fractions and the raw unreachable execution counts will be reported.	2024-10-08 19:07:43 -04:00
Vladislav Khmelevsky	445023f173	Revert "[BOLT] Move ADRRelaxationPass (#101371 )" (#102333 ) This reverts commit 750b12f06badc4cdf767139c70090db62358bb44. The pass should run after splitting phase, but before nop removal	2024-08-07 21:03:51 +04:00
Vladislav Khmelevsky	750b12f06b	[BOLT] Move ADRRelaxationPass (#101371) For non-simple functions we need nop instructions to be present to transform ADR into an ADRP+ADD sequence, so run this pass before the remove-nops pass.	2024-08-07 16:23:38 +04:00
Daniel Hill	b686600a57	[BOLT] Skip instruction shortening (#93032 ) Add the ability to disable the instruction shortening pass through --shorten-instructions=false	2024-07-19 16:52:01 -07:00
Amir Ayupov	a38f0157f2	[BOLT] Set InitialDynoStats after EstimateEdgeCounts (#93218 ) InitialDynoStats used to be assigned inside `runAllPasses`, but the assignment executed before any of the passes. As we've moved `EstimateEdgeCounts` into a pass out of ProfileReader, it needs to execute before initial dyno stats are set. Thus move `InitialDynoStats` into BinaryContext and assignment into `DynoStatsSetPass`.	2024-05-23 11:37:06 -07:00
Amir Ayupov	f3dc732b36	[BOLT][NFC] Make estimateEdgeCounts a BinaryFunctionPass (#93074 )	2024-05-22 11:59:00 -07:00
Amir Ayupov	5fb59e7447	[BOLT] Print program stats in perf2bolt/aggregate-only mode (#89763 )	2024-04-25 19:08:51 +02:00
Nathan Sidwell	e2d4823959	[BOLT][NFC] Make RepRet X86-specific (#88286 ) Bolt's RepRet pass is x86-specific, no need to add it for non-x86 targets.	2024-04-11 06:35:28 -04:00
Maksim Panchenko	51268a57fd	[BOLT] Enable --keep-nops option for Linux kernel by default (#86349 ) Preserve nop instructions in the Linux kernel since they could be used for runtime patching.	2024-03-22 15:29:26 -07:00
Amir Ayupov	52cf07116b	[BOLT][NFC] Log through JournalingStreams (#81524) Make core BOLT functionality more friendly to being used as a library instead of in our standalone driver llvm-bolt. To accomplish this, we augment BinaryContext with journaling streams that are to be used by most BOLT code whenever something needs to be logged to the screen. Users of the library can decide if logs should be printed to a file, no file or to the screen, as before. To illustrate this, this patch adds a new option `--log-file` that allows the user to redirect BOLT logging to a file on disk or completely hide it by using `--log-file=/dev/null`. Future BOLT code should now use `BinaryContext::outs()` for printing important messages instead of `llvm::outs()`. A new test log.test enforces this by verifying that no strings are printed to the screen once the `--log-file` option is used. In previous patches we also added a new BOLTError class to report common and fatal errors, so code shouldn't call exit(1) now. To easily handle problems as before (by quitting with exit(1)), callers can now use `BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code needs to deal with BOLT errors. To test this, we have fatal.s that checks we are correctly quitting and printing a fatal error to the screen. Because this is a significant change by itself, not all code was yet ported. Code from Profiler libs (DataAggregator and friends) still prints errors directly to the screen. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:53:53 -08:00
Amir Ayupov	13d60ce2f2	[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523 ) As part of the effort to refactor old error handling code that would directly call exit(1), in this patch continue the migration on libCore, libRewrite and libPasses to use the new BOLTError class whenever a failure occurs. Test Plan: NFC Co-authored-by: Rafael Auler <rafaelauler@fb.com>	2024-02-12 14:51:15 -08:00
Amir Ayupov	a5f3d1a803	[BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521 ) As part of the effort to refactor old error handling code that would directly call exit(1), in this patch we change the interface to `BinaryFunctionPass` to return an Error on `runOnFunctions()`. This gives passes the ability to report a serious problem to the caller (RewriteInstance class), so the caller may decide how to best handle the exceptional situation. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:36:12 -08:00
ShatianWang	076bd22f57	[BOLT] Add structure of CDSplit to SplitFunctions (#73430 ) This commit establishes the general structure of the CDSplit strategy in SplitFunctions without incorporating the exact splitting logic. With -split-functions -split-strategy=cdsplit, the SplitFunctions pass will run twice: the first time is before function reordering and functions are hot-cold split; the second time is after function reordering and functions are hot-warm-cold split based on the fixed function ordering. Currently, all functions are hot-warm split after the entry block in the second splitting pass. Subsequent commits will introduce the precise splitting logic. NFC.	2023-11-29 15:43:21 -05:00
Maksim Panchenko	e823136d43	[BOLT] Refactor --keep-nops option. NFC. (#72228 ) Run RemoveNops pass only if --keep-nops is set to false (default).	2023-11-14 11:28:13 -08:00
Vladislav Khmelevsky	061adc18e7	[BOLT][NFC] Hide pass print options (#67718) Most of the print options are hidden; hide them all.	2023-09-30 13:45:25 +04:00
Job Noorman	f873029386	[BOLT] Add minimal RISC-V 64-bit support Just enough features are implemented to process a simple "hello world" executable and produce something that still runs (including libc calls). This was mainly a matter of implementing support for various relocations. Currently, the following are handled: - R_RISCV_JAL - R_RISCV_CALL - R_RISCV_CALL_PLT - R_RISCV_BRANCH - R_RISCV_RVC_BRANCH - R_RISCV_RVC_JUMP - R_RISCV_GOT_HI20 - R_RISCV_PCREL_HI20 - R_RISCV_PCREL_LO12_I - R_RISCV_RELAX - R_RISCV_NONE Executables linked with linker relaxation will probably fail to be processed. BOLT relocates .text to a high address while leaving .plt at its original (low) address. This causes PC-relative PLT calls that were relaxed to a JAL to not fit their offset in an I-immediate anymore. This is something that will be addressed in a later patch. Changes to the BOLT core are relatively minor. Two things were tricky to implement and needed slightly larger changes. I'll explain those below. The R_RISCV_CALL(_PLT) relocation is put on the first instruction of a AUIPC/JALR pair, the second does not get any relocation (unlike other PCREL pairs). This causes issues with the combinations of the way BOLT processes binaries and the RISC-V MC-layer handles relocations: - BOLT reassembles instructions one by one and since the JALR doesn't have a relocation, it simply gets copied without modification; - Even though the MC-layer handles R_RISCV_CALL properly (adjusts both the AUIPC and the JALR), it assumes the immediates of both instructions are 0 (to be able to or-in a new value). This will most likely not be the case for the JALR that got copied over. To handle this difficulty without resorting to RISC-V-specific hacks in the BOLT core, a new binary pass was added that searches for AUIPC/JALR pairs and zeroes-out the immediate of the JALR. A second difficulty was supporting ABS symbols. As far as I can tell, ABS symbols were not handled at all, causing __global_pointer$ to break. RewriteInstance::analyzeRelocation was updated to handle these generically. Tests are provided for all supported relocations. Note that in order to test the correct handling of PLT entries, an ELF file produced by GCC had to be used. While I tried to strip the YAML representation, it's still quite large. Any suggestions on how to improve this would be appreciated. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D145687	2023-06-16 12:19:36 +02:00
Rafael Auler	62a2feff57	[BOLT] Fix state of MCSymbols in lowering pass We have mostly harmless data races when running BinaryContext::calculateEmittedSize() in parallel, while performing split function pass. However, it is possible to end up in a state where some MCSymbols are still registered and our clean up failed. This happens rarely but it does happen, and when it happens, it is a difficult to diagnose heisenbug. To avoid this, add a new clean pass to perform a last check on MCSymbols, before they undergo our final emission pass, to verify that they are in a sane state. If we fail to do this, we might resolve some symbols to zero and crash the output binary. Reviewed By: #bolt, Amir Differential Revision: https://reviews.llvm.org/D137984	2023-05-16 14:54:16 -07:00
Vladislav Khmelevsky	17ed8f2928	[BOLT][AArch64] Handle adrp+ld64 linker relaxations The linker might relax adrp + ldr GOT address loading to adrp + add for local non-preemptible symbols (e.g. hidden/protected symbols in an executable). Since the linker usually doesn't update relocations properly after relaxation, we have to handle such cases ourselves. To do that, while reading relocations we change the LD64 reloc to ADD if an instruction mismatch is found, and introduce FixRelaxationPass, which searches for ADRP+ADD pairs and, after performing some checks, replaces the ADRP target symbol with the already fixed ADD's one. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D138097	2022-12-23 01:20:18 +04:00
Maksim Panchenko	be9d3edee8	[BOLT][NFC] Remove unused PrintInstructions argument PrintInstructions was unused in BinaryFunction::print() and dump(). Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D140440	2022-12-20 15:57:13 -08:00
Alexey Moksyakov	1fb186198a	adds huge pages support of PIE/no-PIE binaries This patch adds the huge pages support (-hugify) for PIE/no-PIE binaries. Also returned functionality to support the kernels < 5.10 where there is a problem in a dynamic loader with the alignment of pages addresses. Differential Revision: https://reviews.llvm.org/D129107	2022-11-04 15:14:21 +03:00
Rafael Auler	4f158995b9	[BOLT] Add pass to fix ambiguous memory references This adds a round of checks to memory references, looking for incorrect references to jump table objects. Fix them by replacing the jump table reference with another object reference + offset. This solves bugs related to regular data references in code accidentally being bound to a jump table, and this reference being updated to a new (incorrect) location because we moved this jump table. Fixes #55004 Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D134098	2022-10-12 18:39:50 -07:00
Vladislav Khmelevsky	35efe1d806	[BOLT][AArch64] Handle gold linker veneers The gold linker veneers are written between functions without symbols, so we need to handle them specially in BOLT. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D129260	2022-07-13 14:47:22 +03:00
Amir Ayupov	66b01a8934	[BOLT] Fix getDynoStats to handle BCs with no functions Address fuzzer crash Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D120696	2022-06-30 01:18:45 -07:00
Rafael Auler	fc2d96c334	Revert "[BOLT][AArch64] Handle gold linker veneers" This reverts commit 425dda76e9fac93117289fd68a2abdfb1e4a0ba5. This commit is currently causing BOLT to crash in one of our binaries and needs a bit more checking to make sure it is safe to land.	2022-06-28 19:23:28 -07:00
Vladislav Khmelevsky	425dda76e9	[BOLT][AArch64] Handle gold linker veneers The gold linker veneers are written between functions without symbols, so we need to handle them specially in BOLT. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D128082	2022-06-28 16:14:05 +03:00
Fangrui Song	b92436efcb	[bolt] Remove unneeded cl::ZeroOrMore for cl::opt options	2022-06-05 13:29:49 -07:00
Fangrui Song	36c7d79dc4	Remove unneeded cl::ZeroOrMore for cl::opt options Similar to 557efc9a8b68628c2c944678c6471dac30ed9e8e. This commit handles options where cl::ZeroOrMore is more than one line below cl::opt.	2022-06-04 00:10:42 -07:00
spupyrev	5904836b8a	[BOLT] Cache-Aware Tail Duplication A new "cache-aware" strategy for tail duplication. Differential Revision: https://reviews.llvm.org/D123050	2022-06-03 09:08:45 -07:00
Amir Ayupov	687e4af1c0	[BOLT] CMOVConversion pass Convert simple hammocks into cmov based on misprediction rate. Test Plan: - Assembly test: `cmov-conversion.s` - Testing on a binary: # Bootstrap clang with `-x86-cmov-converter-force-all` and `-Wl,--emit-relocs` (Release build) # Collect perf.data: - `clang++ <opts> bolt/lib/Core/BinaryFunction.cpp -E > bf.cpp` - `perf record -e cycles:u -j any,u -- clang-15 bf.cpp -O2 -std=c++14 -c -o bf.o` # Optimize clang-15 with and w/o -cmov-conversion: - `llvm-bolt clang-15 -p perf.data -o clang-15.bolt` - `llvm-bolt clang-15 -p perf.data -cmov-conversion -o clang-15.bolt.cmovconv` # Run perf experiment: - test: `clang-15.bolt.cmovconv`, - control: `clang-15.bolt`, - workload (clang options): `bf.cpp -O2 -std=c++14 -c -o bf.o` Results: ``` task-clock [delta: -360.21 ± 356.75, delta(%): -1.7760 ± 1.7589, p-value: 0.047951, balance: -6] instructions [delta: 44061118 ± 13246382, delta(%): 0.0690 ± 0.0207, p-value: 0.000001, balance: 50] icache-misses [delta: -5534468 ± 2779620, delta(%): -0.4331 ± 0.2175, p-value: 0.028014, balance: -28] branch-misses [delta: -1624270 ± 1113244, delta(%): -0.3456 ± 0.2368, p-value: 0.030300, balance: -22] ``` Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D120177	2022-03-08 10:44:31 -08:00
Amir Ayupov	194b164eb5	[BOLT][NFC] Fix compiler warnings Summary: - variable 'TotalSize' set but not used - variable 'TotalCallsTopN' set but not used - use of bitwise '\|' with boolean operands Reviewed By: maksfb FBD33911129	2022-02-04 15:57:33 -08:00
Maksim Panchenko	330c8e42ab	[BOLT][NFC] Refactor command line options in BinaryPassManager Summary: Reformat code and put options in lexicographical order. Comparing to clang-format output, manual formatting looks cleaner to me. (cherry picked from FBD33481692)	2022-01-07 11:36:22 -08:00
Maksim Panchenko	ee0e9ccb52	[BOLTRewrite][NFC] Fix braces usages Summary: Refactor bolt/*/Rewrite to follow the braces rule for if/else/loop from LLVM Coding Standards. (cherry picked from FBD33305364)	2021-12-23 12:38:33 -08:00
Maksim Panchenko	2f09f445b2	[BOLT][NFC] Fix file-description comments Summary: Fix comments at the start of source files. (cherry picked from FBD33274597)	2021-12-21 10:21:41 -08:00
Maksim Panchenko	ccb99dd126	[BOLT] Fix profile and tests for nop-removal pass Summary: Since nops are now removed in a separate pass, the profile is consumed on a CFG with nops. If previously a profile was generated without nops, the offsets in the profile could be different if branches included nops either as a source or a destination. This diff adjusts offsets to make the profile reading backwards compatible. (cherry picked from FBD33231254)	2021-12-18 17:05:00 -08:00
Vladislav Khmelevsky	08f56926c2	[BOLT] Move disassemble optimizations to optimization passes Summary: The patch moves shortenInstructions and nop removal to separate binary passes. As a result, when the llvm-bolt optimization stage begins, the instructions of the binary functions will be exactly the same as they were in the binary. This is needed for golang support in llvm-bolt. Some of the tests must be changed, since bb alignment nops might create unreachable BBs in original functions. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD32896517)	2021-12-18 17:03:35 -08:00
Maksim Panchenko	40c2e0fafe	[BOLT][NFC] Reformat with clang-format Summary: Selectively apply clang-format to BOLT code base. (cherry picked from FBD33119052)	2021-12-14 16:52:51 -08:00
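
The CFG flow conservation property described in the 7e33bebe7c entry (#127954) can be illustrated with a small sketch. This is not BOLT's implementation: the function name `cfg_flow_gaps`, the edge-map representation, and the relative-gap formula are all assumptions made for illustration; the commit message only states that in-flow and out-flow should be equal for interior blocks and that smaller reported values are better.

```python
from collections import defaultdict


def cfg_flow_gaps(edges, entry_blocks, exit_blocks):
    """Return {block: relative in/out-flow gap} for interior basic blocks.

    `edges` maps (src, dst) -> recorded transition count within one function.
    The relative-gap formula below is an illustrative assumption, not the
    statistic BOLT actually reports.
    """
    in_flow = defaultdict(int)
    out_flow = defaultdict(int)
    for (src, dst), count in edges.items():
        out_flow[src] += count
        in_flow[dst] += count

    gaps = {}
    for block in set(in_flow) | set(out_flow):
        # Function entries and exits are excluded from the property.
        if block in entry_blocks or block in exit_blocks:
            continue
        largest = max(in_flow[block], out_flow[block])
        diff = abs(in_flow[block] - out_flow[block])
        gaps[block] = diff / largest if largest else 0.0
    return gaps


# Hypothetical profile: block "a" loses 10 of its 100 incoming counts.
edges = {
    ("entry", "a"): 100,
    ("entry", "b"): 20,
    ("a", "exit"): 90,
    ("b", "exit"): 20,
}
gaps = cfg_flow_gaps(edges, entry_blocks={"entry"}, exit_blocks={"exit"})
```

In this hypothetical profile, block `b` conserves flow exactly, while block `a` shows a mismatch between what enters and what leaves it.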

53 Commits