3 Commits

Author SHA1 Message Date
Rahman Lavaee
0aa6df6575 [Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks.
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original  section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function.  This patch also adds a feature byte to be used with more flexibility in the future. An use-case example for the feature field is encoding multi-section functions more concisely using a different format.

Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.

This encoding uses smaller values compared to the existing one (offsets relative to function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary).
The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D121346
2022-06-28 07:42:54 -07:00
Rahman Lavaee
2b0c5d76a6 Introduce and use a new section type for the bb_addr_map section.
This patch lets the bb_addr_map (renamed to __llvm_bb_addr_map) section use a special section type (SHT_LLVM_BB_ADDR_MAP) instead of SHT_PROGBITS. This would help parsers, dumpers and other tools to use the sh_type ELF field to identify this section rather than relying on string comparison on the section name.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D88199
2020-10-08 11:13:19 -07:00
Rahman Lavaee
7841e21c98 Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols.
This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section.
The format of the emitted data is represented as follows. It includes a header for every function:

|  Address of the function                      |  -> 8 bytes (pointer size)
|  Number of basic blocks in this function (>0) |  -> ULEB128

The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows:

|  Offset of the basic block relative to function begin |  -> ULEB128
|  Binary size of the basic block                       |  -> ULEB128
|  BB metadata                                          |  -> ULEB128  [ MBB.isReturn() OR MBB.hasTailCall() << 1  OR  MBB.isEHPad() << 2 ]

The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels.
The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers.
Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%.

For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html

Reviewed By: MaskRay, snehasish

Differential Revision: https://reviews.llvm.org/D85408
2020-09-14 10:16:44 -07:00