llvm-project

Author	SHA1	Message	Date
Maksim Panchenko	c9b1f06288	[BOLT] Introduce MetadataRewriter interface Introduce the MetadataRewriter interface to handle updates for various types of auxiliary data stored in a binary file. To implement metadata processing using this new interface, all metadata rewriters should derive from the RewriterBase class and implement one or more of the following methods, depending on the timing of metadata read and write operations: * preCFGInitializer() * postCFGInitializer() // TBD * preEmitFinalizer() // TBD * postEmitFinalizer() By adopting this approach, we aim to simplify the RewriteInstance class and improve its scalability to accommodate new extensions of file formats, including various metadata types of the Linux Kernel. Differential Revision: https://reviews.llvm.org/D154020	2023-07-06 11:09:51 -07:00
Amir Ayupov	fd49cc87d0	[BOLT][NFC] Print functions after attaching profile (-print-profile) Add an extra point of dumping functions: immediately after attaching the profile information. This dumping is enabled by newly introduced `-print-profile` and `-print-all`. The reason is that in `aggregate-only`/perf2bolt mode BOLT may not reach the point of printing the function after CFG is constructed (`-print-cfg`), while we may still want to inspect the attached profile, especially for diff'ing purposes. Reviewed By: #bolt, rafauler Differential Revision: https://reviews.llvm.org/D153996	2023-06-28 17:51:17 -07:00
Shatian Wang	a89c9b35be	[BOLT] Fixing relative ordering of cold sections under multi-way function splitting Order code sections with names in the form of ".text.cold.i" based on the value of i [Context] SplitFunctions.cpp implements splitting strategies that can potentially split each function into a maximum of N>2 fragments. When such N-way splitting happens, new code sections with names ".text.cold.1", ..., ".text.cold.i", ... ".text.cold.N-2" will be created. A section with name ".text.cold.i" contains the (i+2)th fragment of each function. As an example, if each function is split into N=3 fragments: hot, warm, cold, then code sections will now include - a section with name ".text" containing hot fragments - a section with name ".text.cold" containing warm fragments - a section with name ".text.cold.1" containing cold fragments The order of these new sections in the output binary currently depends on the order in which they are encountered by the emitter. For example, under N=3-way splitting, if the first function is 2-way split into hot and cold and the second function is 3-way split into hot, warm, and cold, then the cold fragment is encountered first, resulting in the final sections being in the following order: .text (hot), .text.cold.1 (cold), .text.cold (warm) The above is suboptimal because the distance of jumps/calls between the hot and the warm sections will be much bigger than when ordering the sections as follows: .text (hot), .text.cold (warm), .text.cold.1 (cold) This diff orders the sections with names in the form of ".text.cold" or ".text.cold.i" based on the value of i (assuming the i-value of ".text.cold" is 0). Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D152941	2023-06-22 14:26:48 -07:00
Job Noorman	38ba2824c8	[BOLT] Don't register internal func relocs as external references Currently, all relocations that point inside a function are registered as external references. If these relocations cannot be resolved as jump tables or computed gotos, the containing function gets marked as not-simple and excluded from optimizations. RISC-V uses relocations for branches and jumps (to support linker relaxation) and as such, almost no functions get marked as simple. This patch fixes this by only registering relocations that originate outside of the referenced function as external references. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D153345	2023-06-22 09:35:54 +02:00
Job Noorman	b410d24a19	[BOLT][RISCV] Implement R_RISCV_ADD32/SUB32 This patch implements the R_RISCV_ADD32 and R_RISCV_SUB32 relocations for RISC-V. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D146554	2023-06-22 09:35:54 +02:00
Amir Ayupov	82ef86c194	[BOLT] Set IsRelro section attribute based on PT_GNU_RELRO segment Handle PT_GNU_RELRO segment in accordance with Linux Standard Base spec chapter 12: > PT_GNU_RELRO > The array element specifies the location and size of a segment which may > be made read-only after relocations have been processed. Perform a readelf-style mapping check between this segment and sections, set `IsRelro` section attribute. Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D152944	2023-06-20 20:44:18 -07:00
Kazu Hirata	e7541f561d	[BOLT] Use llvm::is_contained (NFC)	2023-06-18 11:53:01 -07:00
Job Noorman	f873029386	[BOLT] Add minimal RISC-V 64-bit support Just enough features are implemented to process a simple "hello world" executable and produce something that still runs (including libc calls). This was mainly a matter of implementing support for various relocations. Currently, the following are handled: - R_RISCV_JAL - R_RISCV_CALL - R_RISCV_CALL_PLT - R_RISCV_BRANCH - R_RISCV_RVC_BRANCH - R_RISCV_RVC_JUMP - R_RISCV_GOT_HI20 - R_RISCV_PCREL_HI20 - R_RISCV_PCREL_LO12_I - R_RISCV_RELAX - R_RISCV_NONE Executables linked with linker relaxation will probably fail to be processed. BOLT relocates .text to a high address while leaving .plt at its original (low) address. This causes PC-relative PLT calls that were relaxed to a JAL to not fit their offset in an I-immediate anymore. This is something that will be addressed in a later patch. Changes to the BOLT core are relatively minor. Two things were tricky to implement and needed slightly larger changes. I'll explain those below. The R_RISCV_CALL(_PLT) relocation is put on the first instruction of an AUIPC/JALR pair; the second does not get any relocation (unlike other PCREL pairs). This causes issues with the combination of the way BOLT processes binaries and the way the RISC-V MC-layer handles relocations: - BOLT reassembles instructions one by one and since the JALR doesn't have a relocation, it simply gets copied without modification; - Even though the MC-layer handles R_RISCV_CALL properly (adjusts both the AUIPC and the JALR), it assumes the immediates of both instructions are 0 (to be able to or-in a new value). This will most likely not be the case for the JALR that got copied over. To handle this difficulty without resorting to RISC-V-specific hacks in the BOLT core, a new binary pass was added that searches for AUIPC/JALR pairs and zeroes-out the immediate of the JALR. A second difficulty was supporting ABS symbols. 
As far as I can tell, ABS symbols were not handled at all, causing __global_pointer$ to break. RewriteInstance::analyzeRelocation was updated to handle these generically. Tests are provided for all supported relocations. Note that in order to test the correct handling of PLT entries, an ELF file produced by GCC had to be used. While I tried to strip the YAML representation, it's still quite large. Any suggestions on how to improve this would be appreciated. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D145687	2023-06-16 12:19:36 +02:00
Job Noorman	05634f7346	[BOLT] Move from RuntimeDyld to JITLink RuntimeDyld has been deprecated in favor of JITLink. [1] This patch replaces all uses of RuntimeDyld in BOLT with JITLink. Care has been taken to minimize the impact on the code structure in order to ease the inspection of this (rather large) changeset. Since BOLT relied on the RuntimeDyld API in multiple places, this wasn't always possible though and I'll explain the changes in code structure first. Design note: BOLT uses a JIT linker to perform what essentially is static linking. No linked code is ever executed; the result of linking is simply written back to an executable file. For this reason, I restricted myself to the use of the core JITLink library and avoided ORC as much as possible. RuntimeDyld contains methods for loading objects (loadObject) and symbol lookup (getSymbol). Since JITLink doesn't provide a class with a similar interface, the BOLTLinker abstract class was added to implement it. It was added to Core since both the Rewrite and RuntimeLibs libraries make use of it. Wherever a RuntimeDyld object was used before, it was replaced with a BOLTLinker object. There is one major difference between the RuntimeDyld and BOLTLinker interfaces: in JITLink, section allocation and the application of fixups (relocation) happens in a single call (jitlink::link). That is, there is no separate method like finalizeWithMemoryManagerLocking in RuntimeDyld. BOLT used to remap sections between allocating (loadObject) and linking them (finalizeWithMemoryManagerLocking). This doesn't work anymore with JITLink. Instead, BOLTLinker::loadObject accepts a callback that is called before fixups are applied which is used to remap sections. The actual implementation of the BOLTLinker interface lives in the JITLinkLinker class in the Rewrite library. It's the only part of the BOLT code that should directly interact with the JITLink API. 
To load an object, JITLinkLinker first creates a LinkGraph (jitlink::createLinkGraphFromObject) and then links it (jitlink::link). For the latter, it uses a custom JITLinkContext with the following properties: - Use BOLT's ExecutableFileMemoryManager. This one was updated to implement the JITLinkMemoryManager interface. Since BOLT never executes code, its finalization step is a no-op. - Pass config: don't use the default target passes since they modify DWARF sections in a way that seems incompatible with BOLT. Also run a custom pre-prune pass that makes sure sections without symbols are not pruned by JITLink. - Implement symbol lookup. This used to be implemented by BOLTSymbolResolver. - Call the section mapper callback before the final linking step. - Copy symbol values when the LinkGraph is resolved. Symbols are stored inside JITLinkLinker to ensure that later objects (i.e., instrumentation libraries) can find them. This functionality used to be provided by RuntimeDyld but I did not find a way to use JITLink directly for this. Some more minor points of interest: - BinarySection::SectionID: JITLink doesn't have something equivalent to RuntimeDyld's Section IDs. Instead, sections can only be referred to by name. Hence, SectionID was updated to a string. - There seem to be no tests for Mach-O. I've tested a small hello-world style binary but not more than that. - On Mach-O, JITLink "normalizes" section names to include the segment name. I had to parse the section name back from this manually which feels slightly hacky. [1] https://reviews.llvm.org/D145686#4222642 Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D147544	2023-06-15 11:13:52 +02:00
Maksim Panchenko	1ebad216ef	[BOLT][NFCI] Remove redundant instance of MCAsmBackend Use instance of MCAsmBackend from BinaryContext instead of creating a new one. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D152849	2023-06-13 13:14:05 -07:00
Maksim Panchenko	c4e60a7f60	[BOLT] Fix --max-funcs=<N> option Fix an off-by-one error in the handling of the --max-funcs=<N> option. We used to process N+1 functions when N was requested. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D152751	2023-06-12 16:54:14 -07:00
Christian Ulmann	f5425c128a	[LoopInfo] Move generic LoopInfo into own files This commit splits the generic part of `LoopInfo` into separate files. These new `GenericLoopInfo` files are located in `llvm/Support` to be in line with `GenericDomTree`. Furthermore, this change ensures that MLIR's Bazel build does not have to link against `LLVMAnalysis` just to use these template headers. Depends on D148219 Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D148235	2023-04-24 06:07:05 +00:00
Nathan Sidwell	5b9f0309d6	[BOLT] Remove unsupported ELF type reloc handling Drop unsupported ELF format reloc handling -- RewriteInstance lacks this flexibility elsewhere. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D148946	2023-04-23 13:09:37 -04:00
Nathan Sidwell	ffb42e313d	[BOLT] Remove unneeded dyncasts These checks are unnecessary -- we've already bailed if the format was wrong. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D148848	2023-04-21 13:40:54 -04:00
Nathan Sidwell	9c92b023da	[BOLT][NFC] Move phdr typedef to cpp file This typedef is only used inside the RewriteInstance source file, let's not expose it in the header file -- even if private. Differential Revision: https://reviews.llvm.org/D148667	2023-04-19 15:51:17 -04:00
Nathan Sidwell	f2f0411924	[BOLT] Adjust Shdr alignment Shdr's are not necessarily size 2^n, and there is no reason to align to that boundary if they are. Differential Revision: https://reviews.llvm.org/D148666	2023-04-19 15:51:12 -04:00
Job Noorman	48ad4296f7	[BOLT] Fix use-after-free in RewriteInstance::mapCodeSections When a cold function is too large, its section gets deregistered. However, the section is still dereferenced later to get its RuntimeDyld ID. This patch moves the deregistration to after the last dereference. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D148427	2023-04-17 16:16:49 +02:00
Job Noorman	54ab954149	[BOLT] Reject symbols pointing to section end Sometimes, symbols are present that point to the end of a section (i.e., one-past the highest valid address). Currently, BOLT either rejects those symbols when they don't point to another existing section, or errs when they do and the other section is not executable. I suppose BOLT would accept the symbol when it points to an executable section. In any case, these symbols should not be considered while discovering functions and should not result in an error. This patch implements that. Note that this patch checks explicitly for symbols whose value equals the end of their section. It might make more sense to verify that the symbol's value is within [section start, section end). However, I'm not sure if this could ever happen when its value does not equal the end. Another way to implement this is to verify that the BinarySection we find at the symbol's address actually corresponds to the symbol's section. I'm not sure what the best approach is so feedback is welcome. Reviewed By: yota9, rafauler Differential Revision: https://reviews.llvm.org/D146215	2023-03-21 13:59:39 +04:00
Vladislav Khmelevsky	f9bf9f925e	[BOLT] Add .relr.dyn section support Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D146085	2023-03-17 17:24:19 +04:00
Kazu Hirata	4e585e51c1	Use *{Map,Set}::contains (NFC)	2023-03-15 22:55:35 -07:00
Vladislav Khmelevsky	207ea5f2e4	[BOLT] Add writable segment for allocatable sections The golang support creates 2 new data segments; one of them contains relocations in PIC binaries, so the section must have write permissions. Currently BOLT creates only one new segment that contains new sections with RX rights; now also create an RW segment if any new writable sections were allocated during BOLT binary processing. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D143390	2023-03-15 00:06:55 +04:00
Vladislav Khmelevsky	7117af529e	[BOLT] Improve dynamic relocations support for CI This patch fixes a few problems with supporting dynamic relocations in CI. 1. After dynamic relocations and functions are read, search for dynamic relocations located in functions. Currently we expect them only to be relative and only to be in a constant island. Mark islands of such functions as having dynamic relocations and create a CI access symbol at the relocation offset, so that a BD is created for that location. 2. During function disassembly, when handling an address reference for a constant island, check if the referred external CI has a dynamic relocation. If it has one, we continue to refer to the original CI rather than creating a local copy. 3. After the function disassembly stage, mark functions that have a dynamic reloc in CI as non-simple. We don't want such functions to be optimized, since passes such as split function would create 2 copies of the CI, which we are currently unable to support. 4. While updating output values for a BF, search for BDs located in CI and update their output locations. 5. At the dynamic relocation patching stage, search for binary data located at the relocation offset. If it was moved, use the new relocation offset value rather than the old one. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D143748	2023-03-13 13:37:28 +04:00
Amir Ayupov	c49941bd0d	[BOLT] Process fragment siblings in lite mode, keep lite mode on In lite mode, include split function fragments in the list of functions to process even if a fragment has no samples. This is required to properly detect and update split jump tables (jump tables that contain pointers to code in the main and cold fragments). Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D140457	2023-02-08 19:11:27 -08:00
yavtuk	0776fc32b1	[BOLT] Search section based on relocation symbol We need to search for the referenced section based on the relocation symbol's section to properly match end-of-section symbols. For example, on some binaries we can observe that init_array_end/fini_array_end might be "placed" into the gap, and since no section could be found for the address, the relocation would be skipped, resulting in a wrong ADRP imm after emitting new text and a SIGSEGV in the output binary. Credits for the test to Vladislav Khmelevskii aka yota9.	2023-02-08 00:15:56 +03:00
Amir Ayupov	c8482da779	[BOLT] Reintroduce allow-stripped Reject stripped binaries as a policy. The core issue with stripped binaries is that we can't detect the presence of split functions which require extra handling. Therefore BOLT can't ensure functional correctness of produced binary if the input stripped binary contains split functions. Supporting such cases is an interesting problem but it goes against BOLT's intended goal of achieving peak program performance. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D142686	2023-02-06 18:08:13 -08:00
Amir Ayupov	16492a6143	[BOLT][NFC] Rename {MachO,}RewriteInstance::create methods Follow the code style of fallible constructors in [LLVM Programmer's Manual] (https://llvm.org/docs/ProgrammersManual.html#fallible-constructors) and rename `RewriteInstance::createRewriteInstance` to `RewriteInstance::create` Reviewed By: #bolt, rafauler Differential Revision: https://reviews.llvm.org/D143119	2023-02-02 12:30:45 -08:00
Amir Ayupov	72e5b14fe7	[BOLT][NFC] Use llvm::make_second_range Reviewed By: #bolt, rafauler Differential Revision: https://reviews.llvm.org/D143019	2023-02-02 12:02:31 -08:00
Amir Ayupov	287508cd9c	[BOLT] Use LTO fuzzy name matching in function-order Allow partial name matching wrt LTO suffixes in `function-order` user-supplied function list, the same as permitted by profile matching. Reviewed By: #bolt, rafauler Differential Revision: https://reviews.llvm.org/D142269	2023-01-25 11:43:10 -08:00
Amir Ayupov	69a9bbf106	[BOLT][NFC] Replace ambiguous BinarySection::isReadOnly with isWritable Address feedback in https://reviews.llvm.org/D102284#2755060 Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D141733	2023-01-18 14:53:07 -08:00
Amir Ayupov	43f382a9f4	[BOLT][NFC] Simplify handleRelocation Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132089	2023-01-18 14:19:35 -08:00
Kazu Hirata	e8d6c537ac	[BOLT] Use std::optional instead of llvm::Optional (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-01-02 18:40:21 -08:00
Amir Ayupov	703d94d8f0	[BOLT] Respect -function-order in lite mode Process functions listed in -function-order file even in lite mode. Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D140435	2022-12-28 20:50:20 -08:00
Vladislav Khmelevsky	17ed8f2928	[BOLT][AArch64] Handle adrp+ld64 linker relaxations The linker might relax adrp + ldr GOT address loading to adrp + add for local non-preemptible symbols (e.g. hidden/protected symbols in an executable). Since the linker usually doesn't update relocations properly after relaxation, we have to handle such cases ourselves. To do that, while reading relocations we change the LD64 reloc to ADD if an instruction mismatch is found, and introduce FixRelaxationPass, which searches for ADRP+ADD pairs and, after performing some checks, replaces the ADRP target symbol with that of the already fixed ADD. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D138097	2022-12-23 01:20:18 +04:00
Maksim Panchenko	be9d3edee8	[BOLT][NFC] Remove unused PrintInstructions argument PrintInstructions was unused in BinaryFunction::print() and dump(). Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D140440	2022-12-20 15:57:13 -08:00
Amir Ayupov	72528ee4b4	[BOLT][NFC] Use std::optional in has*NameRegex	2022-12-11 22:13:47 -08:00
Amir Ayupov	6e5b4dacf3	[BOLT][NFC] Use std::optional in RI	2022-12-11 22:13:46 -08:00
Kazu Hirata	e324a80fab	[BOLT] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 23:12:38 -08:00
Kazu Hirata	1fa870b1bd	Use None consistently (NFC) This patch replaces NoneType() and NoneType::None with None in preparation for migration from llvm::Optional to std::optional. In the std::optional world, we are not guaranteed to be able to default-construct std::nullopt_t or peek what's inside it, so neither NoneType() nor NoneType::None has a corresponding expression in the std::optional world. Once we consistently use None, we should even be able to replace the contents of llvm/include/llvm/ADT/None.h with something like: using NoneType = std::nullopt_t; inline constexpr std::nullopt_t None = std::nullopt; to ease the migration from llvm::Optional to std::optional. Differential Revision: https://reviews.llvm.org/D138376	2022-11-20 00:24:40 -08:00
Alexey Moksyakov	1fb186198a	adds huge pages support of PIE/no-PIE binaries This patch adds huge pages support (-hugify) for PIE/no-PIE binaries. It also restores support for kernels < 5.10, where there is a problem in the dynamic loader with the alignment of page addresses. Differential Revision: https://reviews.llvm.org/D129107	2022-11-04 15:14:21 +03:00
Hongtao Yu	d5a963ab8b	[PseudoProbe] Replace relocation with offset for entry probe. Currently pseudo probe encoding for a function is like: - For the first probe, a relocation from it to its physical position in the code body - For subsequent probes, an incremental offset from the current probe to the previous probe The relocation could potentially cause relocation overflow during link time. I'm now replacing it with an offset from the first probe to the function start address. A source function could be lowered into multiple binary functions due to outlining (e.g, coro-split). Since those binary functions have independent link-time layout, to really avoid relocations from .pseudo_probe sections to .text sections, the offset to replace with should really be the offset from the probe's enclosing binary function, rather than from the entry of the source function. This requires some changes to the previous section-based emission scheme, which now switches to be function-based. The assembly form of the pseudo probe directive is also changed correspondingly, i.e, reflecting the binary function name. Most of the source functions end up with only one binary function. For those that don't, a sentinel probe is emitted for each of the binary functions with a different name from the source. The sentinel probe indicates the binary function name to differentiate subsequent probes from the ones from a different binary function. For example, given source function ``` Foo() { … Probe 1 … Probe 2 } ``` If it is transformed into two binary functions: ``` Foo: … Foo.outlined: … ``` The encoding for the two binary functions will be separate: ``` GUID of Foo Probe 1 GUID of Foo Sentinel probe of Foo.outlined Probe 2 ``` Then Probe 1 will be decoded against binary `Foo`'s address, and Probe 2 will be decoded against `Foo.outlined`. The sentinel probe of `Foo.outlined` makes sure there's no accidental relocation from `Foo.outlined`'s probes to `Foo`'s entry address. 
On the BOLT side, to be minimally intrusive, the pseudo probe re-encoding sticks with the old encoding format. This is fine since, unlike the linker, BOLT processes the pseudo probe section as a whole and it is free from relocation overflow issues. The change is downwards compatible as long as there's no mixed use of the old encoding and the new encoding. Reviewed By: wenlei, maksfb Differential Revision: https://reviews.llvm.org/D135912 Differential Revision: https://reviews.llvm.org/D135914 Differential Revision: https://reviews.llvm.org/D136394	2022-10-27 13:28:22 -07:00
Maksim Panchenko	20204db503	[BOLT] Add mold-style PLT support mold linker creates symbols for PLT entries and that caught BOLT by surprise. Add the support for marked PLT entries. Fixes: #58498 Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D136655	2022-10-25 11:03:52 -07:00
Rafael Auler	c0d954a068	[BOLT] Ignore duplicate global symbols We noticed some binaries with duplicated global symbol entries (same name, address and size). Ignore them as it is possibly a bug in the linker, and continue processing, unless the symbol has a different size or address. Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D136122	2022-10-19 11:52:06 -07:00
Maksim Panchenko	28d70d3f1e	[BOLT][NFC] Refactor EFMM initialization Move EFMM initialization code to emitAndLink(), where EFMM is used. Reviewed By: yavtuk Differential Revision: https://reviews.llvm.org/D136205	2022-10-18 20:31:10 -07:00
Maksim Panchenko	dc8035bddd	[BOLT][NFCI] Avoid calling registerName() twice Calling registerName() for the same symbol twice, even with a different size, has no effect other than the lookup overhead. Avoid the redundancy. Fixes facebookincubator/BOLT#299 Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D136115	2022-10-17 16:16:31 -07:00
Maksim Panchenko	4d3a0cade2	[BOLT] Section-handling refactoring/overhaul Simplify the logic of handling sections in BOLT. This change brings more direct and predictable mapping of BinarySection instances to sections in the input and output files. * Only sections from the input binary will have a non-null SectionRef. When a new section is created as a copy of the input section, its SectionRef is reset to null. * RewriteInstance::getOutputSectionName() is removed as the section name in the output file is now defined by BinarySection::getOutputName(). * Querying BinaryContext for sections by name uses their original name. E.g., getUniqueSectionByName(".rodata") will return the original section even if the new .rodata section was created. * Input file sections (with relocations applied) are emitted via MC with ".bolt.org" prefix. However, their name in the output binary is unchanged unless a new section with the same name is created. * New sections are emitted internally with ".bolt.new" prefix if there's a name conflict with an input file section. Their original name is preserved in the output file. * Section header string table is properly populated with section names that are actually used. Previously we used to include discarded section names as well. * Fix the problem when dynamic relocations were propagated to a new section with a name that matched a section in the input binary. E.g., the new .rodata with jump tables had dynamic relocations from the original .rodata. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D135494	2022-10-13 23:10:39 -07:00
Maksim Panchenko	0b213c9090	[BOLT] Fix writing out unmarked .eh_frame section When BOLT updates .eh_frame section, it concatenates newly-generated contents (from CFI directives) with the original .eh_frame that has relocations applied to it. However, if no new content is generated, the original .eh_frame has to be left intact. In that case, BOLT was still writing out the relocatable copy of the original .eh_frame section to the new segment, even though this copy was never used and was not even marked in the section header table. Detect the scenario above and skip allocating extra space for .eh_frame. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D135223	2022-10-07 11:19:51 -07:00
Maksim Panchenko	c683e281cd	[BOLT] Properly set _end symbol To properly set the "_end" symbol, we need to track the last allocatable address. Simply emitting "_end" at the end of some section is not sufficient since the order of section allocation is unknown during the emission step. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D135121	2022-10-07 11:19:14 -07:00
Maksim Panchenko	3e097fab5a	[BOLT][NFC] Remove text section assertion We can emit a binary without a new text section. Hence, the text section assertion is not needed. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D135120	2022-10-07 11:18:37 -07:00
Huan Nguyen	153eeb4a5e	[BOLT] Disable -lite when split function is present In lite mode, BOLT only transforms a subset of functions, leaving the remaining functions intact. For NoPIC, it is fine. BOLT can scan relocations and fix-up all refs that point to any function body in the subset. For no-split function PIC, it is fine. Since jump tables are intra-procedural transfers, BOLT can find both the jump table base and the target within the same function. Thus, BOLT can update and/or move jump tables. However, it is wrong to process a subset of functions in split function PIC. This is because BOLT does not know if functions in the subset are isolated, i.e., cannot be accessed by functions outside the subset, especially via a split jump table. For example, BOLT only processes three functions A, B and C. Suppose that A is reached via a jump table from A.cold, which is not processed. When A is moved (due to optimization), the jump table in A.cold is invalid. We cannot fix-up this jump table since it is only recognized in A.cold, which BOLT does not process. Solution: Disable lite mode if a split function is present. Future improvement: In lite mode, if a split function is found, BOLT processes both functions in the subset and all of their sibling fragments. Test Plan: ``` ninja check-bolt ``` Reviewed By: Amir, maksfb Differential Revision: https://reviews.llvm.org/D131283	2022-09-28 19:26:17 +02:00
Amir Ayupov	39336fc09c	[BOLT] Control aggregation mode output profile file format In perf2bolt and `-aggregate-only` BOLT mode, the output profile file is written in fdata format by default. Provide a knob `-profile-format=[fdata,yaml]` to control the format. Note that `-w` option still dumps in YAML format. Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D133995	2022-09-19 13:37:10 -07:00


275 Commits