llvm-project

Author	SHA1	Message	Date
Vladislav Khmelevsky	097ddd3565	[BOLT] Fix relocations handling (#100890 ) After porting BOLT to RISC-V some of the relocations were broken on both AArch64 and X86. On AArch64, an example of broken relocations is GOT relocations: when handling them, we should replace the symbol with __BOLT_got_zero in order to address the GOT entry, not the symbol that addresses this entry. This is done further in code, so it is too early to add rel here. On X86 it is a mistake to add relocations without addend. This is the exact problem that is raised on #97937. Due to different code generation I had to use a gcc-generated yaml test, since with clang I wasn't able to reproduce the problem. Added tests for both architectures and made the problematic condition RISC-V-specific.	2024-08-07 16:25:46 +04:00
sinan	6c8933e1a0	[BOLT] Skip PLT search for zero-value weak reference symbols (#69136 ) Take a common weak reference pattern for example ``` __attribute__((weak)) void undef_weak_fun(); if (&undef_weak_fun) undef_weak_fun(); ``` In this case, an undefined weak symbol `undef_weak_fun` has an address of zero, and BOLT incorrectly changes the relocation for the corresponding symbol to symbol@PLT, leading to incorrect runtime behavior.	2024-08-07 18:02:42 +08:00
sinan	734c0488b6	[BOLT] Support map other function entry address (#101466 ) Allow BOLT to map the old address to a new binary address if the old address is the entry of the function.	2024-08-07 15:57:25 +08:00
Amir Ayupov	3f51bec466	[BOLT][NFC] Print timers in perf2bolt invocation When BOLT is run in AggregateOnly mode (perf2bolt), it exits with code zero so destructors are not run thus TimerGroup never prints the timers. Add explicit printing just before the exit to honor options requesting timers (`--time-rewrite`, `--time-aggr`). Test Plan: updated bolt/test/timers.c Reviewers: ayermolo, maksfb, rafaelauler, dcci Reviewed By: dcci Pull Request: https://github.com/llvm/llvm-project/pull/101270	2024-07-31 22:14:52 -07:00
Amir Ayupov	fb97b4f962	[BOLT][NFC] Add timers for MetadataManager invocations Test Plan: added bolt/test/timers.c Reviewers: ayermolo, maksfb, rafaelauler, dcci Reviewed By: dcci Pull Request: https://github.com/llvm/llvm-project/pull/101267	2024-07-31 22:12:34 -07:00
Sayhaan Siddiqui	33960ce5a8	[BOLT][DWARF] Sort GDBIndexTUEntryVector (#101264 ) Sorts GDBIndexTUEntryVector in decreasing order by hash to ensure determinism when parallelized.	2024-07-31 11:35:38 -07:00
Sayhaan Siddiqui	79dcd93b70	[BOLT][DWARF] Remove option to write to DWP (#100771 ) Remove the --write-dwp option as well as related code and tests.	2024-07-30 16:58:01 -07:00
Vladislav Khmelevsky	803eaf2926	[BOLT][NFC] Fix test requirement (#100867 ) Tests that are using instrumentation should have bolt-runtime in requirements	2024-07-27 18:44:58 +04:00
Sayhaan Siddiqui	9a3e66e314	[BOLT][DWARF][NFC] Fix DebugStrOffsetsWriter (#100672 ) Fix DebugStrOffsetsWriter so updateAddressMap can't be called after it is finalized.	2024-07-26 18:58:25 -07:00
Tristan Ross	abc2eae682	[BOLT] Enable standalone build (#97130 ) Continues #87196: the original author did not have much time, so I have taken over working on this PR. We would like to have this so it'll be easier to package for Nix. Can be tested by copying the cmake, bolt, third-party, and llvm directories out into their own directory with this PR applied and then building bolt. --------- Co-authored-by: pca006132 <john.lck40@gmail.com>	2024-07-25 08:18:14 -07:00
Amir Ayupov	4d19676de4	[BOLT] Add profile-use-pseudo-probes option Move pseudo probe profile generation under --profile-use-pseudo-probes option. Note that updating pseudo probes is independent from this flag. Test Plan: updated pseudoprobe-decoding-inline.test Reviewers: maksfb, rafaelauler, ayermolo, dcci, WenleiHe Reviewed By: WenleiHe Pull Request: https://github.com/llvm/llvm-project/pull/100299	2024-07-24 07:31:01 -07:00
Amir Ayupov	9d2dd009b6	[BOLT] Support more than two jump table parents Multi-way splitting can cause multiple fragments to access the same jump table. Relax the assumption that a jump table can only have up to two parents. Test Plan: added bolt/test/X86/three-way-split-jt.s Reviewers: ayermolo, dcci, rafaelauler, maksfb Reviewed By: rafaelauler, dcci Pull Request: https://github.com/llvm/llvm-project/pull/99988	2024-07-24 07:16:39 -07:00
Sayhaan Siddiqui	7cd7a1eab4	[BOLT][DWARF][NFC] Split processUnitDIE into two lambdas (#99957 ) Split processUnitDIE into two lambdas to separate the processing of DWO CUs and CUs in the main binary.	2024-07-23 12:59:40 -07:00
Eisuke Kawashima	8bc02bf5c6	fix(bolt/**.py): fix comparison to None (#94012 ) from PEP8 (https://peps.python.org/pep-0008/#programming-recommendations): > Comparisons to singletons like None should always be done with is or is not, never the equality operators. Co-authored-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>	2024-07-19 16:59:56 -07:00
klensy	1ee8238f0e	[BOLT][test] Fix Filecheck typos (#93979 ) Fixes a few FileCheck typos in tests and adds a missing(?) FileCheck call in a test. Co-authored-by: klensy <nightouser@gmail.com>	2024-07-19 16:57:14 -07:00
Shaw Young	296a956369	[BOLT] Match functions with call graph (#98125 ) Implemented call graph function matching. First, two call graphs are constructed for both profiled and binary functions. Then functions are hashed based on the names of their callee/caller functions. Finally, functions are matched based on these neighbor hashes and the longest common prefix of their names. The `match-with-call-graph` flag turns this matching on. Test Plan: Added match-with-call-graph.test. Matched 164 functions in a large binary with 10171 profiled functions.	2024-07-19 14:00:28 -07:00
Amir Ayupov	c905db67a0	[BOLT] Attach pseudo probes to blocks in YAML profile Read pseudo probes in regular and BAT YAML profile generation, and attach them to YAML profile basic blocks. This exposes GUID, probe id, and probe type in profile for future use in stale profile matching. Test Plan: updated pseudoprobe-decoding-inline.test Reviewers: dcci, rafaelauler, ayermolo, maksfb Reviewed By: rafaelauler Pull Request: https://github.com/llvm/llvm-project/pull/99554	2024-07-18 21:01:40 -07:00
Amir Ayupov	9b007a199d	[BOLT] Expose pseudo probe function checksum and GUID (#99389 ) Add a BinaryFunction field for pseudo probe function GUID. Populate it during pseudo probe section parsing, and emit it in YAML profile (both regular and BAT), along with function checksum. To be used for stale function matching. Test Plan: update pseudoprobe-decoding-inline.test	2024-07-18 20:58:16 -07:00
Amir Ayupov	3023b15fb1	[BOLT] Support POSSIBLE_PIC_FIXED_BRANCH Detect and support fixed PIC indirect jumps of the following form: ``` movslq En(%rip), %r1; leaq PIC_JUMP_TABLE(%rip), %r2; addq %r2, %r1; jmpq *%r1 ``` with a PIC_JUMP_TABLE laid out as follows: ``` JT: E1: | L1 - JT |  E2: | L2 - JT |  ...  En: | Ln - JT | ``` The code could be produced by compilers, see https://github.com/llvm/llvm-project/issues/91648. Test Plan: updated jump-table-fixed-ref-pic.test Reviewers: maksfb, ayermolo, dcci, rafaelauler Reviewed By: rafaelauler Pull Request: https://github.com/llvm/llvm-project/pull/91667	2024-07-18 20:57:05 -07:00
Amir Ayupov	3fe50b6dde	[BOLT] Store FileSymRefs in a multimap With aggressive ICF, it's possible to have different local symbols (under different FILE symbols) to be mapped to the same address. FileSymRefs only keeps a single SymbolRef per address, which prevents fragment matching from finding the correct symbol to perform parent function lookup. Work around this issue by switching FileSymRefs to a multimap. In future, uses of FileSymRefs can be replaced with SortedSymbols which keeps essentially the same information. Test Plan: added ambiguous_fragment.test Reviewers: dcci, ayermolo, maksfb, rafaelauler Reviewed By: rafaelauler Pull Request: https://github.com/llvm/llvm-project/pull/98992	2024-07-16 22:14:43 -07:00
Sayhaan Siddiqui	e140a8a3c8	[BOLT][DWARF][NFC] Refactor address writers (#98094 ) Refactors address writers to create an instance for each CU and its DWO CU.	2024-07-15 23:03:43 -07:00
Daniel Bertalan	c6b3f50194	[bolt][test] Require asserts in X86/match-functions-with-calls-as-anchors.test (#98882 ) Otherwise, it fails due to the unsupported `--debug` flag in non-asserts builds.	2024-07-15 21:40:50 +02:00
Paschalis Mpeis	587308c343	[BOLT][AArch64] Provide createDummyReturnFunction (#96626 ) AArch64 needs this function when instrumenting statically-linked binaries. Sample commands: ```bash clang -Wl,-q test.c -static -o out llvm-bolt -instrument -instrumentation-sleep-time=5 out -o out.instr ```	2024-07-15 07:20:47 +01:00
Shaw Young	131eb30584	[BOLT] Match blocks with calls as anchors (#96596 ) Added another hash level – call hash – following opcode hash matching for stale block matching. Call hash strings are the concatenation of the lexicographically ordered names of each block's called functions. This change bolsters block matching in cases where some instructions have been removed or added but calls remain constant. Test Plan: added match-functions-with-calls-as-anchors.test.	2024-07-10 15:46:47 -07:00
Sayhaan Siddiqui	7e10ad99ad	[BOLT][DWARF] Cleanup buffer initialization for DWO range writer (#97843 ) Cleanup buffer initialization for DWO range writer instances to remove empty buffer at the beginning.	2024-07-10 11:35:40 -07:00
Amir Ayupov	c641fc3a4c	[BOLT][test] Fix tests for aarch64 buildbot (#97620 ) Fix broken tests in [bolt-aarch64-ubuntu-clang-shared](https://lab.llvm.org/buildbot/#/builders/126/builds/138)	2024-07-09 20:02:01 -07:00
Amir Ayupov	dc1da93958	[BOLT][BAT] Add support for three-way split functions (#93760 ) In three-way split functions, if only .warm fragment is present, BAT incorrectly overwrites the map for .warm fragment by empty .cold fragment. Test Plan: updated register-fragments-bolt-symbols.s	2024-07-05 15:18:49 -07:00
Ádám Kallai	e2cee2c1e6	[BOLT][AArch64] Fixes assertion errors occurred when perf2bolt was executed (#83394 ) BOLT only checks for the most common indirect branch pattern during branch analysis. Extended the logic with two other indirect patterns which slightly differ from the expected one. Those patterns may be hit when statically linking libc (pattern 2 requires the 'lld' linker). As a workaround, mark them as UNKNOWN branches for now. Fixes: #83114	2024-07-05 16:24:22 +04:00
Alexander Yermolovich	361350fc89	[BOLT][DWARF] Deduplicate Foreign TU list (#97629 ) There could be multiple TUs with the same hash in various DWO files. In bigger binaries this could be in the thousands. Although they could be structurally different and we need to output Entries for all of them, for the purposes of figuring out a TU hash we only need one entry in Foreign TU list.	2024-07-04 07:20:06 -07:00
Sayhaan Siddiqui	5828b04b03	[BOLT][DWARF] Refactor legacy ranges writers (#96006 ) Refactors legacy ranges writers to create a writer for each instance of a DWO file. We now write out everything into .debug_ranges after all the DWO files are processed. This also changes the order in which ranges are written out: before, we wrote them out while in the main CU processing loop, and we now iterate through the CU buckets created by partitionCUs, after the main processing loop.	2024-07-03 14:50:40 -07:00
Shaw Young	97dc50882c	[BOLT] Match functions with name similarity (#95884 ) A mapping from namespace to associated binary functions is used to match function profiles to the binary, based on the edit distance threshold set by the '--name-similarity-function-matching-threshold' flag. The flag is set to 0 (exact name matching) by default as it is expensive, requiring the processing of all BFs. Test Plan: Added name-similarity-function-matching.test. On a binary with 5M functions, rewrite passes took ~520s without the flag and ~2018s with the flag set to 20.	2024-07-03 11:39:18 -07:00
Fangrui Song	58004e5bb7	[BOLT,test] Temporarily unsupport reader-stale-yaml-std.test This test from #74253 relies on particular hash values from Hashing.h. The test fails in LLVM_ENABLE_ABI_BREAKING_CHECKS=on modes (#96282) or whenever Hashing.h implementation changes.	2024-07-01 14:58:45 -07:00
Shaw Young	49fdbbcfed	[BOLT] Match functions with exact hash (#96572 ) Added flag '--match-profile-with-function-hash' to match functions based on exact hash. After identical and LTO name matching, more functions can be recovered for inference with exact hash, in the case of function renaming with no functional changes. Collisions are possible in the unlikely case where multiple functions share the same exact hash. The flag is off by default as it requires the processing of all binary functions and subsequently is expensive. Test Plan: added hashing-based-function-matching.test.	2024-06-29 21:19:00 -07:00
Maksim Panchenko	d16b21b17d	[BOLT][Linux] Support ORC for alternative instructions (#96709 ) Alternative instruction sequences in the Linux kernel can modify the stack and thus they need their own ORC unwind entries. Since there's only one ORC table, it has to be "shared" among multiple instruction sequences. The kernel achieves this by putting a restriction on instruction boundaries. If ORC state changes at a given IP, only one of the alternative sequences can have an instruction starting/ending at this IP. Then, developers can insert NOPs to guarantee the above requirement is met. The most common use of ORC with alternatives is the "pushf; pop %rax" sequence used for paravirtualization. Note that newer kernel versions no longer use .parainstructions; instead, they utilize alternatives for the same purpose. Before we implement better support for alternatives, we can safely skip ORC entries associated with them. Fixes #87052.	2024-06-27 19:26:11 -07:00
Maksim Panchenko	ca06b61084	[BOLT] Omit CFI state while printing functions without CFI (#96723 ) If a function has no CFI program attached to it, do not print redundant empty CFI state for every basic block.	2024-06-27 17:26:58 -07:00
shawbyoung	902952ae04	Revert "[𝘀𝗽𝗿] initial version" This reverts commit bb5ab1ffe719f5e801ef08ac08be975546aa3266.	2024-06-25 08:30:29 -07:00
shawbyoung	bb5ab1ffe7	[𝘀𝗽𝗿] initial version Created using spr 1.3.4	2024-06-25 08:05:29 -07:00
shaw young	32e4906c28	Revert "[BOLT] Hash-based function matching" (#96568 ) Reverts llvm/llvm-project#95821	2024-06-24 18:44:24 -04:00
shaw young	5e097c79d8	[BOLT] Hash-based function matching (#95821 ) Using the hashes of binary and profiled functions to recover functions with changed names. Test Plan: added hashing-based-function-matching.test.	2024-06-24 15:29:44 -07:00
Maksim Panchenko	ad2905e52c	[BOLT] Skip optimization of functions with alt instructions (#95172 ) Alternative instructions in the Linux kernel may modify control flow in a function. As such, it is unsafe to optimize functions with alternative instructions until we properly support CFG alternatives. Previously, we marked functions with alt instructions before the emission, but that could be too late if we remove or replace instructions with alternatives. We could have marked functions as non-simple immediately after reading .altinstructions, but it's nice to be able to view functions after CFG is built. Thus assign the non-simple status after building CFG.	2024-06-18 12:33:37 -07:00
Hans Wennborg	69753aa43b	[bolt] stale-matching-min-matched-block.test requires asserts Because of the --debug-only flag.	2024-06-18 13:41:57 +02:00
shaw young	753498eed1	[BOLT] Add sink block to flow CFG in profile inference (#95047 ) Summary: Constructing an artificial sink block for the flow CFG in stale profile inference to allow profile inference to be run on CFGs with blocks that terminate and have successors. Testing Plan: Added infer_no_exits.test to verify that functions with exit blocks with a landing pad are covered by stale profile inference. --------- Co-authored-by: Amir Ayupov <fads93@gmail.com>	2024-06-17 16:58:26 -07:00
Maksim Panchenko	c67ecf3853	[BOLT][tests] Fix jrcxz instruction test (#95861 ) Rewrite the test case intended to check that BOLT does not separate jrcxz instruction from its destination by more than a one-byte offset.	2024-06-17 16:45:34 -07:00
shaw young	68fc8dffe4	[BOLT] Drop high discrepancy profiles in matching (#95156 ) Summary: Functions with high discrepancy (measured by matched function blocks) can be ignored with an added command line argument for better performance. Test Plan: Added stale-matching-min-matched-block.test --------- Co-authored-by: Amir Ayupov <aaupov@fb.com>	2024-06-17 15:14:35 -07:00
Paschalis Mpeis	a13bc9714a	[BOLT][AArch64] Implement PLTCall optimization (#93584 ) `convertCallToIndirectCall` applies the PLTCall optimization and returns an (updated if needed) iterator to the converted call instruction. Since AArch64 needs to inject additional instructions to implement this pass, the relevant BasicBlock and an iterator are passed to `convertCallToIndirectCall`. `NumCallsOptimized` is updated only on successful application of the pass. Tests: - Inputs/plt-tailcall.c: an example of a tail call optimized PLT call. - AArch64/plt-call.test: the actual A64 test, which runs the PLTCall optimization on the above input file and verifies the application of the pass to the calls 'printf' and 'puts'.	2024-06-11 19:21:11 +01:00
Maksim Panchenko	540893e43f	[BOLT] Add auto parsing for Linux kernel .altinstructions (#95068 ) .altinstructions section contains a list of structures where fields can have different sizes while other fields could be present or not depending on the kernel version. Add automatic detection of such variations and use it by default. The user can still overwrite the automatic detection with `--alt-inst-has-padlen` and `--alt-inst-feature-size` options.	2024-06-11 10:52:51 -07:00
Alexander Yermolovich	61589b8599	[BOLT][DWARF] Fix parent chain in debug_names entries with forward declaration. (#93865 ) Previously, when an entry was skipped in the parent chain, a child would point to the next valid entry in the chain. After discussion in https://github.com/llvm/llvm-project/pull/91808 this is not very useful. Changed the implementation so that all the children of the entry that is skipped won't have DW_IDX_parent.	2024-06-05 09:57:11 -07:00
Maksim Panchenko	c923d39509	[BOLT] Fix ValidateMemRefs pass (#94406 ) In ValidateMemRefs pass, when we validate references in the form of `Symbol + Addend`, we should check `Symbol` not `Symbol + Addend` against aliasing a jump table. Recommitting with a modified test case: https://github.com/llvm/llvm-project/pull/88838 Co-authored-by: sinan <sinan.lin@linux.alibaba.com>	2024-06-04 16:12:29 -07:00
Sayhaan Siddiqui	7103e60f65	[BOLT][DWARF][NFC] Add split-dwarf5 test with multiple CUs (#93744 ) Adds a split-dwarf test for DWARF5 with multiple CUs.	2024-06-04 10:34:29 -07:00
Mehdi Amini	c7b7875e1e	Fix lsda-section-name adding back RUN line incorrectly removed in 6ef632ad36c522b0 (#94301 )	2024-06-03 18:43:49 -07:00
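
The call-hash anchor described in commit 131eb30584 above (concatenating the lexicographically ordered names of a block's callees) can be illustrated with a minimal Python sketch. This is not BOLT's code: BOLT is written in C++, and the function names, the dictionary-based block representation, and the rule of skipping ambiguous candidates are all assumptions made for illustration.

```python
from collections import defaultdict


def call_hash_string(callees):
    """Concatenate the lexicographically ordered callee names of one block."""
    return "".join(sorted(callees))


def match_blocks_by_calls(profile_blocks, binary_blocks):
    """Pair profile blocks with binary blocks that call the same functions.

    Both arguments are hypothetical {block_name: [callee names]} mappings.
    Blocks without calls are skipped, and so are ambiguous matches
    (more than one binary block sharing the same call-hash string).
    """
    index = defaultdict(list)
    for name, callees in binary_blocks.items():
        if callees:
            index[call_hash_string(callees)].append(name)

    matches = {}
    for name, callees in profile_blocks.items():
        if not callees:
            continue
        candidates = index.get(call_hash_string(callees), [])
        if len(candidates) == 1:
            matches[name] = candidates[0]
    return matches
```

Because the key depends only on which functions a block calls, two blocks still pair up after unrelated instructions are added or removed, which is the case the commit message says this hash level is meant to cover.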

750 Commits