195 Commits

Author SHA1 Message Date
Micah Weston
a3f61c8bfd
[SHT_LLVM_BB_ADDR_MAP][obj2yaml] Implements PGOAnalysisMap for elf2yaml and tests. (#80924)
Adds support to obj2yaml for PGO Analysis Map. Adds a test to both
obj2yaml and yaml2obj.
2024-02-13 21:53:05 -05:00
stephenpeckham
b5abaea3c0
[yaml2obj][XOFF] Update yaml2obj for XCOFF to create valid XCOFF files in more cases. (#77620)
yaml2obj creates invalid object files even when the input was created by
obj2yaml using a valid object file. On the other hand, yaml2obj is used
to intentionally create invalid object files for testing purposes.

This update balances using specified input values when provided and
computing file offsets and sizes if necessary.
2024-02-09 08:20:21 -06:00
stephenpeckham
5aeabf2df9
[XCOFF][obj2yaml] Support SymbolAlignmentAndType as 2 separate fields in YAML. (#76828)
XCOFF encodes a symbol type and alignment in a single 8-bit field. It is
easier to read and write YAML files if the fields can be specified
separately. This PR causes obj2yaml to write the fields separately and
allows yaml2obj to read either the single combined field or the separate
fields.
2024-02-08 10:44:19 -06:00
Rahman Lavaee
acec6419e8
[SHT_LLVM_BB_ADDR_MAP] Allow basic-block-sections and labels be used together by decoupling the handling of the two features. (#74128)
Today `-split-machine-functions` and `-fbasic-block-sections={all,list}`
cannot be combined with `-basic-block-sections=labels` (the labels
option will be ignored).
The inconsistency comes from the way basic block address map -- the
underlying mechanism for basic block labels -- encodes basic block
addresses
(https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html).
Specifically, basic block offsets are computed relative to the function
begin symbol. This relies on functions being contiguous which is not the
case for MFS and basic block section binaries. This means Propeller
cannot use binary profiles collected from these binaries, which limits
the applicability of Propeller for iterative optimization.
    
To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section
binaries, we propose modifying the encoding of this section as follows.

First let us review the current encoding which emits the address of each
function and its number of basic blocks, followed by basic block entries
for each basic block.

| | |
|--|--|
| Address of the function | Function Address |
|  Number of basic blocks in this function | NumBlocks |
|  BB entry 1
|  BB entry 2
|   ...
|  BB entry #NumBlocks
    
To make this work for basic block sections, we treat each basic block
section similar to a function, except that basic block sections of the
same function must be encapsulated in the same structure so we can map
all of them to their single function.
    
We modify the encoding to first emit the number of basic block sections
(BB ranges) in the function. Then we emit the address map of each basic
block section section as before: the base address of the section, its
number of blocks, and BB entries for its basic block. The first section
in the BB address map is always the function entry section.
| | |
|--|--|
|  Number of sections for this function   | NumBBRanges |
| Section 1 begin address                     | BaseAddress[1]  |
| Number of basic blocks in section 1 | NumBlocks[1]    |
| BB entries for Section 1
|..................|
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges |
NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges
    
The encoding of basic block entries remains as before with the minor
change that each basic block offset is now computed relative to the
begin symbol of its containing BB section.
    
This patch adds a new boolean codegen option `-basic-block-address-map`.
Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD
flag `--lto-basic-block-address-map` are introduced.
Analogously, we add a new TargetOption field `BBAddrMap`. This means BB
address maps are either generated for all functions in the compiling
unit, or for none (depending on `TargetOptions::BBAddrMap`).
    
This patch keeps the functionality of the old
`-fbasic-block-sections=labels` option but does not remove it. A
subsequent patch will remove the obsolete option.

We refactor the `BasicBlockSections` pass by separating the BB address
map and BB sections handing to their own functions (named
`handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers
basic blocks and places them in their assigned sections.
`handleBBAddrMap` is invoked after `handleBBSections` (if requested) and
only renumbers the blocks.
  - New tests added:
- Two tests basic-block-address-map-with-basic-block-sections.ll and
basic-block-address-map-with-mfs.ll to exercise the combination of
`-basic-block-address-map` with `-basic-block-sections=list` and
'-split-machine-functions`.
- A driver sanity test for the `-fbasic-block-address-map` option
(basic-block-address-map.c).
- An LLD test for testing the `--lto-basic-block-address-map` option.
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`.
- Renamed and modified the two existing codegen tests for basic block
address map (`basic-block-sections-labels-functions-sections.ll` and
`basic-block-sections-labels.ll`)
- Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of
`SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2
will happen in a separate PR in a few months.
2024-02-01 17:50:46 -08:00
Micah Weston
23faa81d3f
[SHT_LLVM_BB_ADDR_MAP] Avoids side-effects in addition since order is unspecified. (#79168)
Turns out the problem with
https://github.com/llvm/llvm-project/issues/60013 is due to the fact
that order of operation is unspecified in C++:
https://en.cppreference.com/w/cpp/language/eval_order. A small example
of where this manifests with MSVC can be seen here
https://ooo.godbolt.org/z/bxqKeqzqn.

This patch does the following:
* Removes the addition operations where we sequence more than one
side-effect based expression.
* Removes test guards to now run on Windows
2024-01-24 17:26:48 -05:00
esmeyi
e462937173 Reland "[XCOFF][obj2yaml] support parsing auxiliary symbols for XCOFF (#70642)"
This PR adds the support for parsing auxiliary symbols of XCOFF object file for obj2yaml.

The sanitizer error is clean now.
2023-12-07 03:45:06 -05:00
esmeyi
fc42a2f5c4 Revert "[XCOFF][obj2yaml] support parsing auxiliary symbols for XCOFF (#70642)"
This reverts commit 0e3faa20c467e6cd62423b22cf8650c6aa2628ba.

Due to a sanitizer error in https://lab.llvm.org/buildbot/#/builders/5/builds/39023.
Will be re-landed after repairs and thorough testing.
2023-12-07 02:01:06 -05:00
Esme
0e3faa20c4
[XCOFF][obj2yaml] support parsing auxiliary symbols for XCOFF (#70642)
This PR adds the support for parsing auxiliary symbols of XCOFF object
file for obj2yaml.
2023-12-07 11:30:30 +08:00
Job Noorman
8545093715 [obj2yaml] Emit ProgramHeader.Offset
Currently, obj2yaml doesn't emit the offset of program headers, leaving
it to yaml2obj to calculate offsets based on `FirstSec` and `LastSec`.
This causes an obj2yaml->yaml2obj round trip to often produce an ELF
file that is not equivalent to the original, especially since it seems
common to have program headers at offset 0 whose first section starts at
a higher address. Besides being non-equivalent, the produced ELF files
also do not seem to work propery and readelf complains about them.

Taking a simple hello world program in C, compiled using either GCC or
Clang, the original ELF file has the following program headers (only
showing some relevant ones):

```
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000002d8 0x00000000000002d8  R      0x8
  INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000630 0x0000000000000630  R      0x1000
  LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                 0x0000000000000161 0x0000000000000161  R E    0x1000
...

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
   03     .init .plt .text .fini
...
```

While this is the result of an obj2yaml->yaml2obj round trip:

```
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000000 0x0000000000000040 0x0000000000000040
                 0x0000000000000000 0x0000000000000000  R      0x8
readelf: Error: the PHDR segment is not covered by a LOAD segment
  INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000318 0x0000000000000000 0x0000000000000000
                 0x0000000000000318 0x0000000000000318  R      0x1000
  LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                 0x0000000000000161 0x0000000000000161  R E    0x1000
...

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02
   03     .init .plt .text .fini
...
```

Note that the offset of segment 2 changed from 0x0 to 0x318. This has
two effects:
- readelf complains "Error: the PHDR segment is not covered by a LOAD
  segment" since PHDR was originally covered by segment 2 but not
  anymore;
- Segment 2 effectively became empty according to the section to segment
  mapping.

I addition to these, the output doesn't correctly execute anymore,
crashing with a "SIGSEGV (Address boundary error)".

This patch fixes the difference in program header layout after a round
trip by explicitly emitting offsets.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D145555
2023-09-25 10:21:53 +02:00
Cyndy Ishida
15cfb4494d [macho2yaml] Fixup offset for nodes in export trie.
The offset recorded in export trie nodes is used to capture symbol
information like address and node size. This offset should be relative to the
root of the trie. This updates to `processExportNode` to follow that
expectation.

Reviewed By: pete, steven_wu

Differential Revision: https://reviews.llvm.org/D157431
2023-08-09 09:18:18 -07:00
Rahman Lavaee
3d6841b2b1 [Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.
Let Propeller use specialized IDs for basic blocks, instead of MBB number.

This allows optimizations not just prior to asm-printer, but throughout the entire codegen.
This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.

####Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR.  This is done as follows.
    - Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
    - Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
    - While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization.  Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
    - The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
    - In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR.  Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
    - Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline.  Hence, MBB numbers are not suitable and we need something else.
####Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block.  It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs.

 To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.

The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.

####Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D100808
2023-01-17 15:25:29 -08:00
Chris Bieneman
621ffbcbe4 [DX] Improve parse error messages
This change refactors the parte parsing logic to operate on StringRefs
of the part data rather than starting from an offset and splicing down.
It also improves some of the error reporting around part layout.

Specifically, this code now reports a distinct error if there isn't
enough data in the buffer to store the part size and it reports an
error if the parts overlap.

Reviewed By: bob80905

Differential Revision: https://reviews.llvm.org/D139681
2023-01-03 12:50:22 -06:00
Rahman Lavaee
96b6ee1bdc Revert "[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number."
This reverts commit 6015a045d768feab3bae9ad9c0c81e118df8b04a.

Differential Revision: https://reviews.llvm.org/D139952
2022-12-13 11:13:57 -08:00
Chris Bieneman
76fca14750 [NFC] Update DXContainer tests to use fake parts
The tests that are focused on testing the file structure should use
fake part names so that the errors get triggered consistently. As we
add parsing support for known data types and structures under the parts
the parsing errors change to be more semantically accurate.

This change ensures the general errors continue to work with less churn
on the test cases in the future.
2022-12-09 13:16:49 -06:00
Rahman Lavaee
6015a045d7 [Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.
Let Propeller use specialized IDs for basic blocks, instead of MBB number.

This allows optimizations not just prior to asm-printer, but throughout the entire codegen.
This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.

####Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR.  This is done as follows.
    - Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
    - Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
    - While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization.  Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
    - The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
    - In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR.  Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
    - Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline.  Hence, MBB numbers are not suitable and we need something else.
####Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block.  It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs.

 To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.

The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.

####Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D100808
2022-12-06 22:50:09 -08:00
Chris Bieneman
2556ba4a52 [ObjectYAML] Add support for DXContainer HASH
DXContainer files contain a part that has an MD5 of the generated
shader. This adds support to the ObjectYAML tooling to expand the hash
part data and hash iteself in preparation for adding hashing support to
DirectX code generation.

Reviewed By: python3kgae

Differential Revision: https://reviews.llvm.org/D136632
2022-10-27 12:28:45 -05:00
WANG Xuerui
4e2dfd3589 [LoongArch] Updates for the LoongArch ELF psABI v2.01 revision
The e_flags of existing object files are all 0x3 which happens to be
compatible. From this commit on, all LoongArch objects produced with
upstream LLVM will be of object file ABI v1, which is already supported
by binutils' master branch (to be released as 2.40), and is allowed by
the same binutils version to interlink with v0 objects so the existing
distributions have time to migrate.

Differential Revision: https://reviews.llvm.org/D134601
2022-10-13 19:12:26 +08:00
Chris Bieneman
49dc58f551 [DX] [ObjectYAML] Support DX shader feature flags
DXContainers contain a feature flag part, which stores a bitfield used
to denote what underlying hardware features the shader requires. This
change adds feature flags to the DXContainer YAML tooling to enable
testing generating feature flags during HLSL code generation.

Depends on D133980

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D134315
2022-09-29 12:37:11 -05:00
Weining Lu
aff68f5ad6 [LoongArch] Parse LoongArch base ABI in ObjectYAML and llvm-readobj
LoongArch e_flags definition:
https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html#_e_flags_identifies_abi_type_and_version

Differential Revision: https://reviews.llvm.org/D130238
2022-07-25 20:40:57 +08:00
Fangrui Song
df42d63d37 [obj2yaml] Refactor command line parsing
Similar to D73982 for yaml2obj.

* Hide unrelated options.
* Add an OVERVIEW: message.
* Disallow single-dash long options.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D129839
2022-07-18 00:13:55 -07:00
Fangrui Song
cfec2080b7 [obj2yaml] Add -o to specify output filename
-o is very common among tools. yaml2obj supports -o and it surprised me that
obj2yaml doesn't support -o. Just add it which doesn't take much code.

Differential Revision: https://reviews.llvm.org/D129713
2022-07-14 00:32:48 -07:00
Rahman Lavaee
0aa6df6575 [Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks.
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original  section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function.  This patch also adds a feature byte to be used with more flexibility in the future. An use-case example for the feature field is encoding multi-section functions more concisely using a different format.

Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.

This encoding uses smaller values compared to the existing one (offsets relative to function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary).
The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D121346
2022-06-28 07:42:54 -07:00
Chris Bieneman
157f1f25da [NFC] Fix spelling error M->L
Clearly I cannot spell...
2022-06-06 19:38:15 -05:00
Chris Bieneman
21c9452305 [DX][ObjYAML] Support for parsing DXIL part
This patch adds support for parsing the DXIL part data into the
ObjectYAML tooling.

The DXIL part has additional headers describing the shader and bitcode
data and stores serialized bitcode after the headers.

Depends on D124945

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D126795
2022-06-06 18:46:19 -05:00
Chris Bieneman
352c395fb6 [ObjectYAML][DX] Add dxcontainer2yaml support
This change finishes fleshing out the ObjectYAML tools to support
converting DXContainer files into yaml representations.

Depends on D124944

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D124945
2022-06-06 13:23:29 -05:00
Chris Bieneman
129c056d62 [ObjectYAML][DX] Support yaml2dxcontainer
This patch adds a the first bits of support for a yaml representation
of dxcontainer files.

Since the YAML representation's primary purpose is testing
infrastructure, the yaml representation supports both verbose and a
more friendly format by making computable sizes and offsets optional.
If provided they are validated to be correct, otherwise they are
computed on the fly during emission.

As I expand the format I'll be able to make more size fields optional,
and I will continue to make the format easier to work with.

Depends on D124804

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D124944
2022-06-01 15:34:00 -05:00
Anubhab Ghosh
9da89651a8 [llvm-objcopy][ObjectYAML][mips] Add MIPS specific ELF section indexes
This fixes https://github.com/llvm/llvm-project/issues/53998
and displays correct information in obj2yaml for SHN_MIPS_*
sections according to
https://refspecs.linuxfoundation.org/elf/mipsabi.pdf

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D123902
2022-05-25 09:01:12 -07:00
Rainer Orth
42e391e4ca [ELF] Use SHF_SUNW_NODISCARD instead of SHF_GNU_RETAIN on Solaris
Instead of the GNU extension `SHF_GNU_RETAIN`, Solaris provides equivalent
functionality with `SHF_SUNW_NODISCARD`. This patch implements the necessary
support.

Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D107955
2022-02-23 15:41:43 +01:00
Simon Atanasyan
3c840e3c00 [MIPS] Recognize DT_MIPS_XHASH dynamic table tag
LLVM tools do not emit `DT_MIPS_XHASH` dynamic table tag. But now
`llvm-objdump` and `llvm-readelf` recognize this tag and print it.

Fixes https://github.com/llvm/llvm-project/issues/53996
2022-02-23 16:03:14 +03:00
Adrian Prantl
8bd8dd16e2 Extend obj2yaml to optionally preserve raw __LINKEDIT/__DATA segments.
I am planning to upstream MachOObjectFile code to support Darwin
chained fixups. In order to test the new parser features we need a way
to produce correct (and incorrect) chained fixups. Right now the only
tool that can produce them is the Darwin linker. To avoid having to
check in binary files, this patch allows obj2yaml to print a hexdump
of the raw LINKEDIT and DATA segment, which both allows to
bootstrap the parser and enables us to easily create malformed inputs
to test error handling in the parser.

This patch adds two new options to obj2yaml:

  -raw-data-segment
  -raw-linkedit-segment

Differential Revision: https://reviews.llvm.org/D113234
2021-11-08 11:30:12 -08:00
Esme-Yi
a00ff71668 [XCOFF] Improve error message context.
Summary: This patch improves the error message context of the
XCOFF interfaces by providing more details.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D110320
2021-10-11 02:52:20 +00:00
Fangrui Song
8971b99c83 [llvm-objdump/llvm-readobj/obj2yaml/yaml2obj] Support STO_RISCV_VARIANT_CC and DT_RISCV_VARIANT_CC
STO_RISCV_VARIANT_CC marks that a symbol uses a non-standard calling
convention or the vector calling convention.

See https://github.com/riscv/riscv-elf-psabi-doc/pull/190

Differential Revision: https://reviews.llvm.org/D107949
2021-09-29 16:56:52 -07:00
Esme-Yi
945df8bc4c [obj2yaml][XCOFF] Dump sections
Summary: This patch implements parsing sections for obj2yaml on AIX.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D98003
2021-09-15 05:16:33 +00:00
Esme-Yi
909f3d7380 [yaml2obj][XCOFF] customize the string table
Summary: The patch adds support for yaml2obj customizing the string table.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D107421
2021-09-13 09:24:38 +00:00
Fangrui Song
aa3df8ddcd [test] Avoid llvm-readelf/llvm-readobj one-dash long options and deprecated aliases (e.g. --file-headers) 2021-07-15 10:26:21 -07:00
Alexander Yermolovich
a224c5199b [LLD][LLVM] CG Graph profile using relocations
Currently when .llvm.call-graph-profile is created by llvm it explicitly encodes the symbol indices. This section is basically a black box for post processing tools. For example, if we run strip -s on the object files the symbol table changes, but indices in that section do not. In non-visible behavior indices point to wrong symbols. The visible behavior indices point outside of Symbol table: "invalid symbol index".

This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The Frequency (Weight) will still be in the .llvm.call-graph-profile, but symbol information will be in relocation section. In LLD information from both sections is used to reconstruct call graph profile. Relocations themselves will never be applied.

With this approach post processing tools that handle relocations correctly work for this section also. Tools can add/remove symbols and as long as they handle relocation sections with this approach information stays correct.

Doing a quick experiment with clang-13.
The size went up from 107KB to 322KB, aggregate of all the input sections. Size of clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly, it will not impact final binary size.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D104080
2021-06-24 09:09:33 -07:00
James Henderson
b9ce8ea454 [obj2yaml] Address D104035 review comments
Accidentally missed from commit 5c1639fe064b.

Differential Revision: https://reviews.llvm.org/D104035
2021-06-16 15:01:54 +01:00
James Henderson
5c1639fe06 [yaml2obj][obj2yaml] Support custom ELF section header string table name
This patch adds support for a new field in the FileHeader, which states
the name to use for the section header string table. This also allows
combining the string table with another string table in the object, e.g.
the symbol name string table. The field is optional. By default,
.shstrtab will continue to be used.

This partially fixes https://bugs.llvm.org/show_bug.cgi?id=50506.

Reviewed by: Higuoxing

Differential Revision: https://reviews.llvm.org/D104035
2021-06-16 10:02:23 +01:00
CarlosAlbertoEnciso
d0a5d86119 [Debug-Info][CodeView] Fix GUID string generation for MSVC generated objects.
This patch is to address https://bugs.llvm.org/show_bug.cgi?id=50459.
  YAML:455:28: error: GUID strings are 38 characters long

The valid format for a GUID is {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
where X is a hex digit (0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F).
The length of the individual components must be: 8, 4, 4, 4, 12.

For some cases, the converted string generated by obj2yaml, does not
comply with those lengths. yaml2obj checks that the GUID string must
be 38 characters including the dashes and braces.

Reviewed By: amccarth

Differential Revision: https://reviews.llvm.org/D103089
2021-06-15 06:53:21 +01:00
Rahman Lavaee
9f52708660 [obj2yaml,yaml2obj] Add NumBlocks to the BBAddrMapEntry yaml field.
As discussed in D95511, this allows us to encode invalid BBAddrMap
sections to be used in more rigorous testing.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96831
2021-02-22 18:08:26 -08:00
Rahman Lavaee
0252e6ead1 [obj2yaml,yaml2obj] Add NumBlocks to the BBAddrMapEntry yaml field.
As discussed in D95511, this allows us to encode invalid BBAddrMap
sections to be used in more rigorous testing.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96831
2021-02-17 15:45:13 -08:00
Rahman Lavaee
f1ff6d210a [obj2yaml, yaml2obj] Use Hex64 for BBAddressMap fields.
This patch let the yaml encoding use Hex64 values for NumBlocks, BB AddressOffset, BB Size, and BB Metadata.
Additionally, it changes the decoded values in elf2yaml to uint64_t to match DataExtractor::getULEB128 return type.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D95767
2021-02-01 15:37:30 -08:00
Abhina Sreeskantharajan
42a21778f6 [test] Use host platform specific error message substitution in lit tests
On z/OS, the following error message is not matched correctly in lit tests.

```
EDC5129I No such file or directory.
```

This patch uses a lit config substitution to check for platform specific error messages.

Reviewed By: muiez, jhenderson

Differential Revision: https://reviews.llvm.org/D95246
2021-01-29 07:16:30 -05:00
Georgii Rymar
d5e48f1347 [yaml2obj][obj2yaml] - Improve how we set/dump the sh_entsize field.
We already set the `sh_entsize` field in a single place
for all non-implicit sections.

This patch reorders the logic slightly and with it
we finally have the only one place where the `sh_entsize` is set.

obj2yaml will not dump the `EntSize` key for `SHT_DYNSYM/SHT_SYMTAB` sections anymore,
when the value of `sh_entsize` is equal to `sizeof(Elf_Sym)`

Note that this also seems revealed an issue in llvm-objcopy:
Previously yaml2obj set the `sh_entsize` for the `.symtab` section to 0x18,
now we it sets it for `SHT_SYMTAB` sections, i.e. by type.
But the `llvm-objcopy/ELF/only-keep-debug.test` has a `.symtab` section of type `SHT_STRTAB`,
and now yaml2obj sets the `sh_entsize` to 0 for it.
I had to update the corresponding check lines for `ES`, but the behavior of
`llvm-objcopy` should be fixed instead I think.
I've added a TODO and a comment.

Differential revision: https://reviews.llvm.org/D95364
2021-01-26 13:33:02 +03:00
Abhina Sreeskantharajan
978444d531 Revert "[SystemZ][z/OS] Fix No such file or directory expression error"
This reverts commit 06f8a49693957bc27b83e0ab5f429ff874941a07.
2021-01-25 08:29:38 -05:00
Georgii Rymar
9c89dcf807 [yaml2obj, obj2yaml] - Implement section header table as a special Chunk.
This was discussed in D93678 thread.
Currently we have one special chunk - Fill.

This patch re implements the "SectionHeaderTable" key to become a special chunk too.
With that we are able to place the section header table at any location,
just like we place sections.

Differential revision: https://reviews.llvm.org/D95140
2021-01-25 13:08:08 +03:00
Georgii Rymar
51f4958057 [yaml2obj/obj2yaml] - Improve dumping/creating of ELF versioning sections.
This makes the following improvements.

For `SHT_GNU_versym`:
 * yaml2obj: set `sh_link` to index of `.dynsym` section automatically.
For `SHT_GNU_verdef`:
 * yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
 * yaml2obj: set `sh_info` field automatically.
 * obj2yaml: don't dump the `Info` field when its value matches the number of version definitions.
For `SHT_GNU_verneed`:
 * yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
 * yaml2obj: set `sh_info` field automatically.
 * obj2yaml: don't dump the `Info` field when its value matches the number of version dependencies.

Also, simplifies few test cases.

Differential revision: https://reviews.llvm.org/D94956
2021-01-21 10:36:48 +03:00
Abhina Sreeskantharajan
689aaba7ac [SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests
On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match successfully.
```
EDC5129I No such file or directory.
```

Reviewed By: muiez

Differential Revision: https://reviews.llvm.org/D94239
2021-01-18 07:14:37 -05:00
Georgii Rymar
d9afe8588e [yaml2obj/obj2yaml] - Refine handling of SHT_GNU_verdef sections.
This patch:
1) Makes `Version`, `Flags`, `VersionNdx` and `Hash` fields to be `Optional<>`.
2) Disallows dumping version definitions that have `vd_version != 1`.
   `vd_version` identifies the version of the structure itself.
   (https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html,
    https://docs.oracle.com/cd/E19683-01/816-7777/chapter6-80869/index.html)
3) Stops dumping default values for `Version`, `Flags`, `VersionNdx` and `Hash` fields.
4) Refines testing.

Differential revision: https://reviews.llvm.org/D94659
2021-01-15 12:40:42 +03:00
Georgii Rymar
6d3098e7ff [obj2yaml,yaml2obj] - Refine how we set/dump the sh_entsize field.
This reuses the code from yaml2obj (moves it to ELFYAML.h).
With it we can set the `sh_entsize` in a single place in `obj2yaml`.

Note that it also fixes a bug of `yaml2obj`: we do not
set the `sh_entsize` field for the `SHT_ARM_EXIDX` section properly.

Differential revision: https://reviews.llvm.org/D93858
2021-01-13 11:52:40 +03:00