llvm-project

Author	SHA1	Message	Date
Micah Weston	a3f61c8bfd	[SHT_LLVM_BB_ADDR_MAP][obj2yaml] Implements PGOAnalysisMap for elf2yaml and tests. (#80924) Adds support to obj2yaml for PGO Analysis Map. Adds a test to both obj2yaml and yaml2obj.	2024-02-13 21:53:05 -05:00
Rahman Lavaee	acec6419e8	[SHT_LLVM_BB_ADDR_MAP] Allow basic-block-sections and labels be used together by decoupling the handling of the two features. (#74128) Today `-split-machine-functions` and `-fbasic-block-sections={all,list}` cannot be combined with `-basic-block-sections=labels` (the labels option will be ignored). The inconsistency comes from the way basic block address map -- the underlying mechanism for basic block labels -- encodes basic block addresses (https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html). Specifically, basic block offsets are computed relative to the function begin symbol. This relies on functions being contiguous which is not the case for MFS and basic block section binaries. This means Propeller cannot use binary profiles collected from these binaries, which limits the applicability of Propeller for iterative optimization. To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section binaries, we propose modifying the encoding of this section as follows. First let us review the current encoding which emits the address of each function and its number of basic blocks, followed by basic block entries for each basic block.
| | |
|--|--|
| Address of the function | Function Address |
| Number of basic blocks in this function | NumBlocks |
| BB entry 1 |
| BB entry 2 |
| ... |
| BB entry #NumBlocks |
To make this work for basic block sections, we treat each basic block section similar to a function, except that basic block sections of the same function must be encapsulated in the same structure so we can map all of them to their single function. We modify the encoding to first emit the number of basic block sections (BB ranges) in the function. Then we emit the address map of each basic block section as before: the base address of the section, its number of blocks, and BB entries for its basic blocks. The first section in the BB address map is always the function entry section.
| | |
|--|--|
| Number of sections for this function | NumBBRanges |
| Section 1 begin address | BaseAddress[1] |
| Number of basic blocks in section 1 | NumBlocks[1] |
| BB entries for Section 1 |
|..................|
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges | NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges |
The encoding of basic block entries remains as before with the minor change that each basic block offset is now computed relative to the begin symbol of its containing BB section. This patch adds a new boolean codegen option `-basic-block-address-map`. Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD flag `--lto-basic-block-address-map` are introduced. Analogously, we add a new TargetOption field `BBAddrMap`. This means BB address maps are either generated for all functions in the compiling unit, or for none (depending on `TargetOptions::BBAddrMap`). This patch keeps the functionality of the old `-fbasic-block-sections=labels` option but does not remove it. A subsequent patch will remove the obsolete option. We refactor the `BasicBlockSections` pass by separating the BB address map and BB sections handling to their own functions (named `handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers basic blocks and places them in their assigned sections. `handleBBAddrMap` is invoked after `handleBBSections` (if requested) and only renumbers the blocks. - New tests added: - Two tests basic-block-address-map-with-basic-block-sections.ll and basic-block-address-map-with-mfs.ll to exercise the combination of `-basic-block-address-map` with `-basic-block-sections=list` and `-split-machine-functions`. - A driver sanity test for the `-fbasic-block-address-map` option (basic-block-address-map.c). - An LLD test for testing the `--lto-basic-block-address-map` option.
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`. - Renamed and modified the two existing codegen tests for basic block address map (`basic-block-sections-labels-functions-sections.ll` and `basic-block-sections-labels.ll`) - Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of `SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2 will happen in a separate PR in a few months.	2024-02-01 17:50:46 -08:00
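The per-range layout described in the commit above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the names `BBEntry`, `BBRange`, `FunctionAddrMap`, and `block_addresses` are hypothetical, and the model holds already-decoded values, not the on-disk byte encoding of `SHT_LLVM_BB_ADDR_MAP`.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class BBEntry:
    bb_id: int    # identifier of the basic block (hypothetical field name)
    offset: int   # offset from the begin address of the containing range
    size: int     # size of the block in bytes


@dataclass
class BBRange:
    base_address: int       # begin address of this basic block section
    entries: List[BBEntry]  # blocks placed in this section


@dataclass
class FunctionAddrMap:
    # ranges[0] is the function entry section, per the commit message.
    ranges: List[BBRange]

    @property
    def function_address(self) -> int:
        return self.ranges[0].base_address


def block_addresses(fn: FunctionAddrMap) -> Dict[int, Tuple[int, int]]:
    """Map each block ID to its absolute [start, end) address pair."""
    result = {}
    for bb_range in fn.ranges:
        for entry in bb_range.entries:
            start = bb_range.base_address + entry.offset
            result[entry.bb_id] = (start, start + entry.size)
    return result


# A function split into a hot entry section and a separate cold section.
fn = FunctionAddrMap(ranges=[
    BBRange(0x1000, [BBEntry(0, 0x0, 0x10), BBEntry(1, 0x10, 0x8)]),
    BBRange(0x4000, [BBEntry(2, 0x0, 0x20)]),
])
addresses = block_addresses(fn)
```

Because every offset is resolved against its own range's base address, the two ranges do not need to be contiguous, which is the property the old function-relative encoding lacked.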
Micah Weston	23faa81d3f	[SHT_LLVM_BB_ADDR_MAP] Avoids side-effects in addition since order is unspecified. (#79168) Turns out the problem with https://github.com/llvm/llvm-project/issues/60013 is due to the fact that the order of evaluation of operands is unspecified in C++: https://en.cppreference.com/w/cpp/language/eval_order. A small example of where this manifests with MSVC can be seen here https://ooo.godbolt.org/z/bxqKeqzqn. This patch does the following: * Removes the addition operations where we sequence more than one side-effect based expression. * Removes test guards to now run on Windows	2024-01-24 17:26:48 -05:00
Fangrui Song	82b4368f7f	[llvm-readobj] Print <null> for relocation target with an empty name For a relocation, we don't differentiate the two cases: * the symbol index is 0 * the symbol index is non zero, the type is not STT_SECTION, and the name is empty. Clang generates such local symbols for RISC-V linker relaxation. So we may print ``` Offset Info Type Symbol's Value Symbol's Name + Addend 000000000000001c 0000000100000039 R_RISCV_32_PCREL 0000000000000000 0 // llvm-readobj 0x1C R_RISCV_32_PCREL - 0x0 ``` while GNU readelf prints "<null>", which is clearer. Let's match the GNU behavior. Related to https://reviews.llvm.org/D81842 ``` 000000000000001c 0000000100000039 R_RISCV_32_PCREL 0000000000000000 <null> + 0 // llvm-readobj 0x1C R_RISCV_32_PCREL <null> 0x0 ``` Reviewed By: jhenderson, kito-cheng Differential Revision: https://reviews.llvm.org/D155353	2023-07-20 00:42:38 -07:00
Rahman Lavaee	3d6841b2b1	[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number. Let Propeller use specialized IDs for basic blocks, instead of MBB number. This allows optimizations not just prior to asm-printer, but throughout the entire codegen. This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version. ####Background Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows. - Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly. - Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to. - While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point. - The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks). - In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped. - Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else. 
####Solution We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs. To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies. The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions. ####Impact on Size of the `LLVM_BB_ADDR_MAP` Section Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D100808	2023-01-17 15:25:29 -08:00
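The difference between volatile block numbers and fixed IDs can be illustrated with a toy model. This is a sketch, not LLVM code: the Python classes below are hypothetical stand-ins, with `create_block` playing the role the message assigns to `MachineFunction::CreateMachineBasicBlock` and `next_bb_id` standing in for the per-function counter.

```python
from dataclasses import dataclass
from typing import List


@dataclass(eq=False)  # compare blocks by identity, not by field values
class Block:
    bb_id: int    # fixed at creation, never reassigned
    number: int   # position-based, may change when blocks are renumbered


class MachineFunction:
    def __init__(self) -> None:
        self.blocks: List[Block] = []
        self.next_bb_id = 0  # scoped to this function only

    def create_block(self) -> Block:
        # Assign the next fixed ID, then increment the counter.
        block = Block(bb_id=self.next_bb_id, number=len(self.blocks))
        self.next_bb_id += 1
        self.blocks.append(block)
        return block

    def remove_block(self, block: Block) -> None:
        self.blocks.remove(block)

    def renumber(self) -> None:
        # Numbers are recomputed from layout order; IDs are untouched.
        for index, block in enumerate(self.blocks):
            block.number = index


mf = MachineFunction()
a, b, c = mf.create_block(), mf.create_block(), mf.create_block()
mf.remove_block(b)
mf.renumber()
d = mf.create_block()
```

After the removal and renumbering, `c` has a different number than it started with but the same ID, and the new block `d` receives an ID that was never used before.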
Rahman Lavaee	96b6ee1bdc	Revert "[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number." This reverts commit 6015a045d768feab3bae9ad9c0c81e118df8b04a. Differential Revision: https://reviews.llvm.org/D139952	2022-12-13 11:13:57 -08:00
Rahman Lavaee	6015a045d7	[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number. Let Propeller use specialized IDs for basic blocks, instead of MBB number. This allows optimizations not just prior to asm-printer, but throughout the entire codegen. This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version. ####Background Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows. - Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly. - Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to. - While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point. - The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks). - In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped. - Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else. 
####Solution We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs. To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies. The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions. ####Impact on Size of the `LLVM_BB_ADDR_MAP` Section Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D100808	2022-12-06 22:50:09 -08:00
Joel E. Denny	28412d1800	[lit] Implement DEFINE and REDEFINE directives These directives define per-test lit substitutions. The concept was discussed at <https://discourse.llvm.org/t/iterating-lit-run-lines/62596/10>. For example, the following directives can be inserted into a test file to define `%{cflags}` and `%{fcflags}` substitutions with empty initial values, which serve as the parameters of another newly defined `%{check}` substitution: ``` // DEFINE: %{cflags} = // DEFINE: %{fcflags} = // DEFINE: %{check} = %clang_cc1 %{cflags} -emit-llvm -o - %s \| \ // DEFINE: FileCheck %{fcflags} %s ``` The following directives then redefine the parameters before each use of `%{check}`: ``` // REDEFINE: %{cflags} = -foo // REDEFINE: %{fcflags} = -check-prefix=FOO // RUN: %{check} // REDEFINE: %{cflags} = -bar // REDEFINE: %{fcflags} = -check-prefix=BAR // RUN: %{check} ``` Of course, `%{check}` would typically be more elaborate, increasing the benefit of the reuse. One issue is that the strings `DEFINE:` and `REDEFINE:` already appear in 5 tests. This patch adjusts those tests not to use those strings. Our prediction is that, in the vast majority of cases, if a test author mistakenly uses one of those strings for another purpose, the text appearing after the string will not happen to have the syntax required for these directives. Thus, the test author will discover the mistake immediately when lit reports the syntax error. This patch also expands the documentation on existing lit substitution behavior. Reviewed By: jhenderson, MaskRay, awarzynski Differential Revision: https://reviews.llvm.org/D132513	2022-09-21 11:32:05 -04:00
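The parameterised `%{check}` pattern from the commit above can be modelled with a few lines of Python. This is a deliberately simplified sketch and not lit's implementation: the `expand` helper, the plain dictionary, and the repeat-until-stable expansion strategy are all assumptions made for illustration, and lit's real expansion order and error handling are not reproduced.

```python
from typing import Dict


def expand(line: str, substitutions: Dict[str, str], limit: int = 10) -> str:
    """Replace substitution names until the line stops changing."""
    for _ in range(limit):
        expanded = line
        for name, value in substitutions.items():
            expanded = expanded.replace(name, value)
        if expanded == line:
            return expanded
        line = expanded
    raise ValueError("substitution expansion did not converge")


# DEFINE: parameters start empty; %{check} refers to them by name.
subs = {
    "%{cflags}": "",
    "%{fcflags}": "",
    "%{check}": "%clang_cc1 %{cflags} -emit-llvm -o - %s | FileCheck %{fcflags} %s",
}

# REDEFINE the parameters, then use %{check} in a RUN line.
subs["%{cflags}"] = "-foo"
subs["%{fcflags}"] = "-check-prefix=FOO"
run_foo = expand("%{check}", subs)

subs["%{cflags}"] = "-bar"
subs["%{fcflags}"] = "-check-prefix=BAR"
run_bar = expand("%{check}", subs)
```

The point of the model is that `%{check}` is written once and its parameters are resolved at each use, so every RUN line picks up whatever values were most recently redefined.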
Rahman Lavaee	0aa6df6575	[Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks. This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. An example use case for the feature field is encoding multi-section functions more concisely using a different format. Conceptually, the new encoding emits basic block offsets and sizes as label differences between the begin and end labels of each two consecutive basic blocks. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address. This encoding uses smaller values compared to the existing one (offsets relative to function symbol). Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary). The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.	Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D121346	2022-06-28 07:42:54 -07:00
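The aggregation step described above, and the reason smaller values help under ULEB128, can be sketched as follows. The helper names `encode_deltas`, `decode_deltas`, and `uleb128_len` are hypothetical, and the sketch works on plain integer pairs instead of the real section bytes.

```python
from typing import List, Tuple


def encode_deltas(blocks: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Turn (function-relative start, size) pairs into (gap since the
    end of the previous block, size) pairs."""
    encoded = []
    prev_end = 0
    for start, size in blocks:
        encoded.append((start - prev_end, size))
        prev_end = start + size
    return encoded


def decode_deltas(encoded: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Rebuild function-relative starts by accumulating gaps and sizes."""
    blocks = []
    cursor = 0
    for gap, size in encoded:
        start = cursor + gap
        blocks.append((start, size))
        cursor = start + size
    return blocks


def uleb128_len(value: int) -> int:
    """Number of bytes needed to store a non-negative value as ULEB128."""
    length = 1
    while value >= 0x80:
        value >>= 7
        length += 1
    return length


# Three blocks; the third starts after 4 bytes of padding.
blocks = [(0, 16), (16, 120), (140, 200)]
encoded = encode_deltas(blocks)

absolute_bytes = sum(uleb128_len(start) for start, _ in blocks)
delta_bytes = sum(uleb128_len(gap) for gap, _ in encoded)
```

In this toy input the gaps all fit in one byte each, while the third function-relative offset already needs two bytes; in large functions the same effect applies to most blocks.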
Rainer Orth	42e391e4ca	[ELF] Use SHF_SUNW_NODISCARD instead of SHF_GNU_RETAIN on Solaris Instead of the GNU extension `SHF_GNU_RETAIN`, Solaris provides equivalent functionality with `SHF_SUNW_NODISCARD`. This patch implements the necessary support. Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`. Differential Revision: https://reviews.llvm.org/D107955	2022-02-23 15:41:43 +01:00
Fangrui Song	aa3df8ddcd	[test] Avoid llvm-readelf/llvm-readobj one-dash long options and deprecated aliases (e.g. --file-headers)	2021-07-15 10:26:21 -07:00
Fangrui Song	46580d43fc	[llvm-readobj] Switch command line parsing from llvm::cl to OptTable Users should generally observe no difference as long as they don't use unintended option forms. Behavior changes: * `-t=d` is removed. Use `-t d` instead. * `--demangle=false` and `--demangle=0` cannot be used. Omit the option or use `--no-demangle`. Other flag-style options don't have `--no-` forms. * `--help-list` is removed. This is a `cl::` specific option. * llvm-readobj now supports grouped short options as well. * `--color` is removed. This is generally not useful (only apply to errors/warnings) but was inherited from Support. Some adjustment to the canonical forms (usually from GNU readelf; currently llvm-readobj has too many redundant aliases): * --dyn-syms is canonical. --dyn-symbols is a hidden alias * --file-header is canonical. --file-headers is a hidden alias * --histogram is canonical. --elf-hash-histogram is a hidden alias * --relocs is canonical. --relocations is a hidden alias * --section-groups is canonical. --elf-section-groups is a hidden alias OptTable avoids global option collision if we decide to support multiplexing for binary utilities. * Most one-dash long options are still supported. `-dt, -sd, -st, -sr` are dropped due to their conflict with grouped short options. * `--section-mapping=false` (D57365) is strange but is kept for now. * Many `cl::opt` variables were unnecessarily external. I added `static` whenever appropriate. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D105532	2021-07-12 10:14:42 -07:00
Fangrui Song	d4dcb55c70	[llvm-readobj] Make -s and -t match llvm-readelf llvm-readobj is an internal testing tool for binary formats. Its output and command line options do not need to be stable. It isn't supposed to be part of a build process. llvm-readelf was created as a user-facing utility and its interface intends to be compatible with GNU readelf (unless there are good reasons not to). The two tools have mostly compatible options. -s and -t are noticeable exceptions due to history. I think the cost of keeping the inconsistency outweighs the small benefit of historical compatibility and hinders the transition from cl::opt to OptTable, so let's change it. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D105055	2021-06-29 11:56:26 -07:00
Alexander Yermolovich	a224c5199b	[LLD][LLVM] CG Graph profile using relocations Currently when .llvm.call-graph-profile is created by llvm it explicitly encodes the symbol indices. This section is basically a black box for post processing tools. For example, if we run strip -s on the object files the symbol table changes, but indices in that section do not. In the non-visible failure mode, the indices point to the wrong symbols. In the visible failure mode, the indices point outside of the symbol table: "invalid symbol index". This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The Frequency (Weight) will still be in the .llvm.call-graph-profile, but symbol information will be in the relocation section. In LLD, information from both sections is used to reconstruct the call graph profile. Relocations themselves will never be applied. With this approach, post processing tools that handle relocations correctly work for this section also. Tools can add/remove symbols and, as long as they handle relocation sections, the information stays correct with this approach. In a quick experiment with clang-13, the size went up from 107KB to 322KB (aggregate of all the input sections). Size of clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly; it will not impact final binary size. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D104080	2021-06-24 09:09:33 -07:00
James Henderson	5c1639fe06	[yaml2obj][obj2yaml] Support custom ELF section header string table name This patch adds support for a new field in the FileHeader, which states the name to use for the section header string table. This also allows combining the string table with another string table in the object, e.g. the symbol name string table. The field is optional. By default, .shstrtab will continue to be used. This partially fixes https://bugs.llvm.org/show_bug.cgi?id=50506. Reviewed by: Higuoxing Differential Revision: https://reviews.llvm.org/D104035	2021-06-16 10:02:23 +01:00
James Henderson	fef3bfb1b2	[yaml2obj] Fix bug when referencing items in SectionHeaderTable There was an off-by-one error caused by an index (which included an index for the null section header) being used to check against the size of a list of sections (which didn't include the null section header). This is a partial fix for https://bugs.llvm.org/show_bug.cgi?id=50506. Reviewed by: MaskRay Differential Revision: https://reviews.llvm.org/D104098	2021-06-16 10:02:22 +01:00
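The class of off-by-one described above can be shown with a small sketch. It is an illustration only: `resolve_section` and the list of names are hypothetical, and the exact form of the original check in yaml2obj is not given in the commit message.

```python
from typing import List, Optional


def resolve_section(index: int, sections: List[str]) -> Optional[str]:
    """Look up a section by an index that counts the null section header.

    `sections` does not contain the null header, so index 0 has no entry
    in the list and index N refers to sections[N - 1].
    """
    if index == 0:
        return None  # the null section header
    # Comparing `index >= len(sections)` here would wrongly reject the
    # last section, because the index space is one larger than the list.
    if index < 0 or index > len(sections):
        raise IndexError(f"section index {index} out of range")
    return sections[index - 1]


sections = [".foo", ".strtab", ".shstrtab"]
last = resolve_section(3, sections)
```

The two collections differ in length by exactly one, so any bounds check has to state which of the two index spaces it is working in.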
Abhina Sreeskantharajan	c52fe0b021	[test] Use host platform specific error message substitution in lit tests This patch uses the errno python library to print out the correct error messages instead of hardcoding the error message per platform. Reviewed By: jhenderson, ASDenysPetrov Differential Revision: https://reviews.llvm.org/D97472	2021-03-05 07:21:53 -05:00
Rahman Lavaee	0252e6ead1	[obj2yaml,yaml2obj] Add NumBlocks to the BBAddrMapEntry yaml field. As discussed in D95511, this allows us to encode invalid BBAddrMap sections to be used in more rigorous testing. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D96831	2021-02-17 15:45:13 -08:00
Alex Richardson	d613d8eb0e	[yaml2obj] Handle NT_* string values in for ELF note types This is required for D74393. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D95953	2021-02-09 16:59:22 +00:00
Abhina Sreeskantharajan	e59d336e75	[test] Use host platform specific error message substitution in lit tests - continued On z/OS, other error messages are not matched correctly in lit tests. ``` EDC5121I Invalid argument. EDC5111I Permission denied. ``` This patch adds a lit substitution to fix it. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D95808	2021-02-03 09:53:22 -05:00
Georgii Rymar	68195b15a3	[yaml2obj] - Allow empty SectionHeaderTable definitions. Currently we don't allow the following definition: ``` Sections: - Type: SectionHeaderTable - Name: .foo Type: SHT_PROGBITS ``` We report an error: "SectionHeaderTable can't be empty. Use 'NoHeaders' key to drop the section header table". It was implemented in this way earlier, when `SectionHeaderTable` was a dedicated key outside of the `Sections` list. And we did not allow to select where the table is written. Currently it makes sense to allow it, because a user might want to place the default section header table at an arbitrary position, e.g. before other sections. In this case it is not convenient and error prone to require specifying all sections: ``` Sections: - Type: SectionHeaderTable Sections: - Name: .foo - Name: .strtab - Name: .shstrtab - Name: .foo Type: SHT_PROGBITS ``` This patch allows empty SectionHeaderTable definitions. Differential revision: https://reviews.llvm.org/D95341	2021-01-28 10:51:52 +03:00
Georgii Rymar	9c89dcf807	[yaml2obj, obj2yaml] - Implement section header table as a special Chunk. This was discussed in the D93678 thread. Currently we have one special chunk - Fill. This patch reimplements the "SectionHeaderTable" key to become a special chunk too. With that we are able to place the section header table at any location, just like we place sections. Differential revision: https://reviews.llvm.org/D95140	2021-01-25 13:08:08 +03:00
Georgii Rymar	51f4958057	[yaml2obj/obj2yaml] - Improve dumping/creating of ELF versioning sections. This makes the following improvements. For `SHT_GNU_versym`: * yaml2obj: set `sh_link` to index of `.dynsym` section automatically. For `SHT_GNU_verdef`: * yaml2obj: set `sh_link` to index of `.dynstr` section automatically. * yaml2obj: set `sh_info` field automatically. * obj2yaml: don't dump the `Info` field when its value matches the number of version definitions. For `SHT_GNU_verneed`: * yaml2obj: set `sh_link` to index of `.dynstr` section automatically. * yaml2obj: set `sh_info` field automatically. * obj2yaml: don't dump the `Info` field when its value matches the number of version dependencies. Also, simplifies few test cases. Differential revision: https://reviews.llvm.org/D94956	2021-01-21 10:36:48 +03:00
Georgii Rymar	d9afe8588e	[yaml2obj/obj2yaml] - Refine handling of SHT_GNU_verdef sections. This patch: 1) Makes `Version`, `Flags`, `VersionNdx` and `Hash` fields to be `Optional<>`. 2) Disallows dumping version definitions that have `vd_version != 1`. `vd_version` identifies the version of the structure itself. (https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html, https://docs.oracle.com/cd/E19683-01/816-7777/chapter6-80869/index.html) 3) Stops dumping default values for `Version`, `Flags`, `VersionNdx` and `Hash` fields. 4) Refines testing. Differential revision: https://reviews.llvm.org/D94659	2021-01-15 12:40:42 +03:00
Georgii Rymar	6d3098e7ff	[obj2yaml,yaml2obj] - Refine how we set/dump the sh_entsize field. This reuses the code from yaml2obj (moves it to ELFYAML.h). With it we can set the `sh_entsize` in a single place in `obj2yaml`. Note that it also fixes a bug of `yaml2obj`: we do not set the `sh_entsize` field for the `SHT_ARM_EXIDX` section properly. Differential revision: https://reviews.llvm.org/D93858	2021-01-13 11:52:40 +03:00
Georgii Rymar	141906fa14	[llvm-readelf/obj] - Add support of multiple SHT_SYMTAB_SHNDX sections. Currently we don't support multiple SHT_SYMTAB_SHNDX sections or the DT_SYMTAB_SHNDX tag. This patch implements support for both and fixes https://bugs.llvm.org/show_bug.cgi?id=43991. I had to introduce the `struct DataRegion` to ELF.h; it is used to represent a region that might have no known size. It is needed because we don't know the size of the extended section indices table when it is located via DT_SYMTAB_SHNDX. In this case we still want to validate that we don't read past the end of the file. Differential revision: https://reviews.llvm.org/D92923	2021-01-13 11:36:43 +03:00
Georgii Rymar	60df7c08b1	[obj2yaml,yaml2obj] - Fix issues with creating/dumping group sections. We have the following issues related to group sections: 1) yaml2obj is unable to set the custom `sh_entsize` value, because the `EntSize` key is currently ignored. 2) obj2yaml is unable to dump the group section which `sh_entsize != 4`. 3) obj2yaml always dumps the "EntSize" for group sections, though usually we are trying to omit dumping default values when dumping keys. I.e. we should not print the "EntSize" key when `sh_entsize` == 4. This patch fixes (1),(3) and adds the test case to document the behavior of (2). Differential revision: https://reviews.llvm.org/D93854	2021-01-12 14:07:42 +03:00
Georgii Rymar	8590b5ccd5	[libObject, llvm-readobj] - Reimplement `ELFFile<ELFT>::getEntry`. Currently, `ELFFile<ELFT>::getEntry` does not check an index of an entry. Because of that the code might read past the end of the symbol table silently. I've added a test to `llvm-readobj\ELF\relocations.test` to demonstrate the possible issue. Also, I've added a unit test for this method. After this change, `getEntry` stops reporting the section index and reuses the `getSectionContentsAsArray` method, which already has all the validation needed. Our related warnings now provide more and better context sometimes. Differential revision: https://reviews.llvm.org/D93209	2020-12-18 16:52:27 +03:00
Georgii Rymar	8c2cf89834	[yaml2obj/obj2yaml] - Make Value/Size fields of Symbol optional. When a field is optional we can use the `=<none>` syntax in macros. This patch makes `Value`/`Size` fields of `Symbol` optional and adds test cases for them. Differential revision: https://reviews.llvm.org/D93010	2020-12-16 13:49:57 +03:00
Georgii Rymar	98a4289810	[llvm-readobj] - For SHT_REL relocations, don't display an addend. This is https://bugs.llvm.org/show_bug.cgi?id=44257. In LLVM style we always print `0` as addend when dumping SHT_REL relocations. It is confusing, this patch stops printing it as the first comment on the bug page suggests. Differential revision: https://reviews.llvm.org/D93033	2020-12-14 12:03:00 +03:00
Georgii Rymar	7ac06444b8	[yaml2obj,obj2yaml] - Make Symbol::Section field optional. This is similar to what we did earlier for fields of the Section class. When a field is optional we can use the =<none> syntax in macros. This was split from D92478. Differential revision: https://reviews.llvm.org/D92565	2020-12-04 13:45:47 +03:00
Georgii Rymar	9aa7898200	Reland "[lib/Support/YAMLTraits] - Don't print leading zeroes when dumping Hex8/Hex16/Hex32 types." (https://reviews.llvm.org/D90930). This reverts reverting commit fc40a03323a4b265ccbed34a07e281b13c5e8367 and fixes LLD (MachO/wasm) tests that failed previously.	2020-11-18 13:08:46 +03:00
Georgii Rymar	fc40a03323	Revert "[lib/Support/YAMLTraits] - Don't print leading zeroes when dumping Hex8/Hex16/Hex32 types." This reverts commit 65fd17c241e22e1671e81efdb683687369c2feb3. It breaks LLD/MachO tests that seem to use obj2yaml to check the output.	2020-11-18 11:55:03 +03:00
Georgii Rymar	65fd17c241	[lib/Support/YAMLTraits] - Don't print leading zeroes when dumping Hex8/Hex16/Hex32 types. When we produce YAML output, we currently also print leading zeroes. The output might look like this: ``` - Name: .dynsym Type: SHT_DYNSYM Address: 0x0000000000001000 EntSize: 0x0000000000000018 ``` There is probably no reason to print leading zeroes. They just make values harder to read. This patch stops printing them. The output becomes: ``` - Name: .dynsym Type: SHT_DYNSYM Address: 0x1000 EntSize: 0x18 ``` This affects obj2yaml mostly, but also the output of the dsymutil and llvm-xray tools. Differential revision: https://reviews.llvm.org/D90930	2020-11-18 11:31:00 +03:00
Georgii Rymar	a7a447be0f	[yaml2obj] - ProgramHeaders: introduce FirstSec/LastSec instead of Sections list. Imagine we have a YAML declaration of a few sections: `foo1`, `<unnamed 2>`, `foo3`, `foo4`. To put them into a segment we can do (1): ``` Sections: - Section: foo1 - Section: foo4 ``` or we can use (2): ``` Sections: - Section: foo1 - Section: foo3 - Section: foo4 ``` or (3) : ``` Sections: - Section: foo1 ## "(index 2)" here is a name that we automatically created for an unnamed section. - Section: (index 2) - Section: foo3 - Section: foo4 ``` It looks really confusing that we don't have to list all of the sections. At first I tried to make this rule stricter and report an error when there is a gap (i.e. when a section is included in a segment, but not listed explicitly). This did not work perfectly, because such an approach conflicts with unnamed sections/fills (see (3)). This patch drops the "Sections" key and introduces 2 keys instead: `FirstSec` and `LastSec`. Both are optional. Differential revision: https://reviews.llvm.org/D90458	2020-11-09 13:00:50 +03:00
Georgii Rymar	99a6401acc	Recommit: [llvm-readelf/obj] - Allow dumping of ELF header even if some elements are corrupt. This is a recommit of D90903 with fixes for BB: 1) Used std::move<> when returning Expected<> (http://lab.llvm.org:8011/#/builders/112/builds/913) 2) Fixed the name of the temporary file in the file-headers.test (http://lab.llvm.org:8011/#/builders/36/builds/1269) (an old local temporary file was used before) For creating `ELFObjectFile` instances we have the factory method `ELFObjectFile<ELFT>::create(MemoryBufferRef Object)`. The problem with this method is that it scans the section header to locate some sections. When a file is truncated or has broken fields in the ELF header, this approach does not allow us to create the `ELFObjectFile` and dump the ELF header. This is https://bugs.llvm.org/show_bug.cgi?id=40804 This patch suggests a solution - it allows delaying the scanning of sections in `ELFObjectFile<ELFT>::create`. It now allows user code to call an object initialization (`initContent()`) later. With that it is possible, for example, for dumpers just to dump the file header and exit. By default initialization is still performed as before, which keeps the logic of existing callers untouched. I've experimented with different approaches while working on this patch. I think this approach is better than doing initialization of sections (i.e. scan of them) on demand, because normally users of the `ELFObjectFile` API expect to work with a valid object. In most cases when a section header table can't be read (because of an error), we don't have to continue to work with the object. So we probably don't need to implement a more complex API. Differential revision: https://reviews.llvm.org/D90903	2020-11-09 12:53:53 +03:00
Georgii Rymar	f59216b58f	Revert "[llvm-readelf/obj] - Allow dumping of ELF header even if some elements are corrupt." This reverts commit ea8a0b8b29eb08d3f0f6ac40942a2d8e98ab57ee. It broke BBots. http://lab.llvm.org:8011/#/builders/14/builds/1439 http://lab.llvm.org:8011/#/builders/112/builds/913	2020-11-09 11:50:50 +03:00
Georgii Rymar	ea8a0b8b29	[llvm-readelf/obj] - Allow dumping of ELF header even if some elements are corrupt. For creating `ELFObjectFile` instances we have the factory method `ELFObjectFile<ELFT>::create(MemoryBufferRef Object)`. The problem of this method is that it scans the section header to locate some sections. When a file is truncated or has broken fields in the ELF header, this approach does not allow us to create the `ELFObjectFile` and dump the ELF header. This is https://bugs.llvm.org/show_bug.cgi?id=40804 This patch suggests a solution - it allows to delay scanning sections in the `ELFObjectFile<ELFT>::create`. It now allows user code to call an object initialization (`initContent()`) later. With that it is possible, for example, for dumpers just to dump the file header and exit. By default initialization is still performed as before, what helps to keep the logic of existent callers untouched. I've experimented with different approaches when worked on this patch. I think this approach is better than doing initialization of sections (i.e. scan of them) on demand, because normally users of `ELFObjectFile` API expect to work with a valid object. In most cases when a section header table can't be read (because of an error), we don't have to continue to work with object. So we probably don't need to implement a more complex API. Differential revision: https://reviews.llvm.org/D90903	2020-11-09 11:27:07 +03:00
Rahman Lavaee	82e7c4ce45	[obj2yaml] [yaml2obj] Add yaml support for SHT_LLVM_BB_ADDR_MAP section. YAML support allows us to better test the feature in the subsequent patches. The implementation is quite similar to the .stack_sizes section. Reviewed By: jhenderson, grimar Differential Revision: https://reviews.llvm.org/D88717	2020-11-06 12:44:42 -08:00
Georgii Rymar	1af3cb5424	[llvm-readobj/libObject] - Allow dumping objects that has a broken SHT_SYMTAB_SHNDX section. Currently it is impossible to create an instance of ELFObjectFile when the SHT_SYMTAB_SHNDX can't be read. We error out when fail to parse the SHT_SYMTAB_SHNDX section in the factory method. This change delays reading of the SHT_SYMTAB_SHNDX section entries, with it llvm-readobj is now able to work with such inputs. Differential revision: https://reviews.llvm.org/D89379	2020-11-03 11:30:28 +03:00
Georgii Rymar	5ffafa870c	[yaml2obj] - Add support of Offset for .strtab/.shstrtab/.dynstr sections. These sections are implicit and handled a bit differently. Currently the "Offset" is ignored for them. This patch fixes an issue. Differential revision: https://reviews.llvm.org/D90446	2020-11-02 11:56:32 +03:00
Georgii Rymar	2bfaf19516	[yaml2obj] - Make `Section::Link` field to be `Optional<>`. `Link` is not an optional field currently. Because of this it is not convenient to write macros. This makes it optional and fixes corresponding test cases. Differential revision: https://reviews.llvm.org/D90390	2020-10-30 16:18:53 +03:00
Georgii Rymar	18b4b0b80d	[yaml2obj][test] - Merge strtab-implicit-sections-.yaml into strtab-implicit-sections.yaml and improve testing of .shstrtab This creates `strtab-implicit-sections.yaml` and merges 2 `strtab-implicit-sections` tests into it. I've also added a few tests for `.shstrtab` section related to section flags. With that we have a single place where we can test implicit string table sections and the `.shstrtab` section in particular. Differential revision: https://reviews.llvm.org/D90372	2020-10-29 15:08:04 +03:00
Georgii Rymar	840737fc82	[yaml2obj][test] - Merge dynsymtab-shlink.yaml to dynsym-section.yaml This simplifies the dynsymtab-shlink.yaml test (with use of macros) and merges it into the dynsym-section.yaml test. Differential revision: https://reviews.llvm.org/D90301	2020-10-29 13:30:07 +03:00
Georgii Rymar	fcf6287916	[yaml2obj] - Improve handling of SectionHeaderTable::NoHeaders flag. When `NoHeaders` is set, we still have following issues: 1) We emit the `.shstrtab` implicit section of size 1 (empty string table). 2) We still align the start of the section header table, what affects the output size. 3) We still write section header table bytes. This patch fixes all of these issues. Differential revision: https://reviews.llvm.org/D90295	2020-10-29 12:16:52 +03:00
Georgii Rymar	edfb2f8b23	[yaml2obj] - Support the "Offset" key for the .dynsym section. Our "implicit" sections are handled separately from regular ones. It turns out that the "Offset" key is not handled properly for them. Perhaps we can generalize handling in one place, but before doing that I'd like to add support and test cases for each implicit section. (I need this particular single change to unblock another patch that is already on review, and I guess doing it independently for each section will be cleaner, see below). In this patch I've removed `explicit-dynsym-no-dynstr.yaml` to `dynsym-section.yaml` and added the new test into. In a follow-up we probably might want to merge 2 another existent `dynsymtab-*.yaml` tests into it too. Differential revision: https://reviews.llvm.org/D90224	2020-10-28 14:22:29 +03:00
Georgii Rymar	2d59ed4e62	[yaml2obj] - Add a way to override the sh_addralign field of a section. Imagine the following declaration of a section: ``` Sections: - Name: .dynsym Type: SHT_DYNSYM AddressAlign: 0x1111111111111111 ``` The alignment is large and yaml2obj reports an error currently: "the desired output size is greater than permitted. Use the --max-size option to change the limit" This patch implements the "ShAddrAlign" key, which is similar to other "Sh*" keys we have. With it it is possible to override the `sh_addralign` field, ignoring the writing of alignment bytes. Differential revision: https://reviews.llvm.org/D90019	2020-10-27 13:03:38 +03:00
Georgii Rymar	6487ffafd1	Reland "[yaml2obj][ELF] - Simplify the code that performs sections validation." This reverts commit 1b589f4d4db27e3fcd81fdc5abeb9407753ab790 and relands the D89463 with the fix: update `MappingTraits<FileFilter>::validate()` in ClangTidyOptions.cpp to match the new signature (change the return type to "std::string" from "StringRef"). Original commit message: This: Changes the return type of MappingTraits<T>>::validate to std::string instead of StringRef. It allows to create more complex error messages. It introduces std::vector<std::pair<StringRef, bool>> getEntries(): a new virtual method of Section, which is the base class for all sections. It returns names of special section specific keys (e.g. "Entries") and flags that says if them exist in a YAML. The code in validate() uses this list of entries descriptions to generalize validation. This approach was discussed in the D89039 thread. Differential revision: https://reviews.llvm.org/D89463	2020-10-20 16:25:33 +03:00
Georgii Rymar	1b589f4d4d	Revert "[yaml2obj][ELF] - Simplify the code that performs sections validation." This reverts commit b9e2b59680ad1bbfd2b9110b3ebf3d2b22cad51b.	2020-10-20 15:16:56 +03:00
Georgii Rymar	b9e2b59680	[yaml2obj][ELF] - Simplify the code that performs sections validation. This: 1) Changes the return type of `MappingTraits<T>>::validate` to `std::string` instead of `StringRef`. It allows to create more complex error messages. 2) It introduces std::vector<std::pair<StringRef, bool>> getEntries(): a new virtual method of Section, which is the base class for all sections. It returns names of special section specific keys (e.g. "Entries") and flags that says if them exist in a YAML. The code in validate() uses this list of entries descriptions to generalize validation. This approach was discussed in the D89039 thread. Differential revision: https://reviews.llvm.org/D89463	2020-10-20 11:28:23 +03:00

185 Commits