When using Temporal Profiling with the BP algorithm, we encounter an
issue with the internal function reorder. In cases where the symbol
table contains entries like:
```
Symbol table '.symtab' contains 45 entries:
Num: Value Size Type Bind Vis Ndx Name
10: 0000000000000000 0 SECTION LOCAL DEFAULT 18 .text.L1
11: 0000000000000000 24 FUNC LOCAL DEFAULT 18 L1
````
The zero-sized section symbol .text.L1 gets stored in the secToSym map
first. However, when the function lookup searches for L1 (as seen in
[BPSectionOrdererBase.inc:191](https://github.com/llvm/llvm-project/blob/main/lld/include/lld/Common/BPSectionOrdererBase.inc#L191)),
it fails to find the correct entry in rootSymbolToSectionIdxs because
the section symbol has already claimed that slot.
This patch fixes the issue by skipping zero-sized symbols during the
addSections process, ensuring that function symbols are properly
registered for lookup.
```
/// Symbols can be appended with "(.__uniq.xxxx)?.llvm.yyyy" where "xxxx" and
/// "yyyy" are numbers that could change between builds. We need to use the root
/// symbol name before this suffix so these symbols can be matched with profiles
/// which may have different suffixes.
```
Just like what we are doing in BP,
https://github.com/llvm/llvm-project/blob/main/lld/MachO/BPSectionOrderer.cpp#L127
the patch removes the suffixes when parsing the order file and getting
the symbol priority to have a better symbol match.
---------
Co-authored-by: Sharon Xu <sharonxu@fb.com>
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
ICF runs before BPSectionOrderer. When a section is ICF'ed, it seems
that the original sections are marked as not live, but are still kept
around. Prior to this patch, those ICF'ed sections would be passed to BP
and ordered before being skipped when writing the output. Now, these
sections are no longer passed to BP, saving runtime and possibly
improving BP's output.
In a large binary, I found that the number of sections ordered using BP
decreased, while the number of duplicate sections drastically decreased
as expected.
```
Functions for startup: 50755 -> 50520
Functions for compression: 165734 -> 105328
Duplicate functions: 1827231 -> 55230
```
Reland #120514 after 2f6e3df08a8b7cd29273980e47310cf09c6fdbd8 fixed
iteration order issue and libstdc++/libc++ differences.
---
Both options instruct the linker to optimize section layout with the
following goals:
* `--bp-compression-sort=[data|function|both]`: Improve Lempel-Ziv
compression by grouping similar sections together, resulting in a
smaller compressed app size.
* `--bp-startup-sort=function --irpgo-profile=<file>`: Utilize a
temporal profile file to reduce page faults during program startup.
The linker determines the section order by considering three groups:
* Function sections ordered according to the temporal profile
(`--irpgo-profile=`), prioritizing early-accessed and frequently
accessed functions.
* Function sections. Sections containing similar functions are placed
together, maximizing compression opportunities.
* Data sections. Similar data sections are placed together.
Within each group, the sections are ordered using the Balanced
Partitioning algorithm.
The linker constructs a bipartite graph with two sets of vertices:
sections and utility vertices.
* For profile-guided function sections:
+ The number of utility vertices is determined by the symbol order
within the profile file.
+ If `--bp-compression-sort-startup-functions` is specified, extra
utility vertices are allocated to prioritize nearby function similarity.
* For sections ordered for compression: Utility vertices are determined
by analyzing k-mers of the section content and relocations.
The call graph profile is disabled during this optimization.
When `--symbol-ordering-file=` is specified, sections described in that
file are placed earlier.
Co-authored-by: Pengying Xu <xpy66swsry@gmail.com>
The ELF/bp-section-orderer.s test is failing on some buildbots due to
what seems like non-determinism issues, see comments on the original PR
and #125450
Reverting to green the build.
This reverts commit 0154dce8d39d2688b09f4e073fe601099a399365 and
follow-up commits 046dd4b28b9c1a75a96cf63465021ffa9fe1a979 and
c92f20416e6dbbde9790067b80e75ef1ef5d0fa4.
Add new ELF linker options for profile-guided section ordering
optimizations:
- `--irpgo-profile=<file>`: Read IRPGO profile data for use with startup
and compression optimizations
- `--bp-startup-sort={none,function}`: Order sections based on profile
data to improve star tup time
- `--bp-compression-sort={none,function,data,both}`: Order sections
using balanced partitioning to improve compressed size
- `--bp-compression-sort-startup-functions`: Additionally optimize
startup functions for compression
- `--verbose-bp-section-orderer`: Print statistics about balanced
partitioning section ordering
Thanks to the @ellishg, @thevinster, and their team's work.
---------
Co-authored-by: Fangrui Song <i@maskray.me>