19 Commits

Author SHA1 Message Date
SharonXSharon
79cc728b77
[lld][macho] Strip .__uniq. and .llvm. hashes in -order_file (#140670)
```
/// Symbols can be appended with "(.__uniq.xxxx)?.llvm.yyyy" where "xxxx" and
/// "yyyy" are numbers that could change between builds. We need to use the root
/// symbol name before this suffix so these symbols can be matched with profiles
/// which may have different suffixes.
```
Just like what we are doing in BP,
https://github.com/llvm/llvm-project/blob/main/lld/MachO/BPSectionOrderer.cpp#L127

the patch removes the suffixes when parsing the order file and getting
the symbol priority to have a better symbol match.

---------

Co-authored-by: Sharon Xu <sharonxu@fb.com>
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
2025-06-03 10:12:36 -07:00
Teresa Johnson
892bfa9366
[lld-macho] Qualify Reloc with macho namespace (NFC) (#141692)
With an upcoming change to llvm/ProfileData headers, this file ends up
with both llvm::Reloc in scope as well as the lld::macho::Reloc type
intended for use here. Qualify the use in this file with the intended
namespace to avoid compilation errors.
2025-05-27 17:05:21 -07:00
Ellis Hoag
efef83e11d
[lld][InstrProf] Skip BP ordering input sections with null data (#137906)
In MachO, `.bss` `isec`s have a length, but point to null. This causes
Balanced Partitioning to crash on these inputs.

This code was in the original implementation, but was removed in
https://github.com/llvm/llvm-project/pull/124482/files#diff-b2aac8833d29d297ae5ada1b36eb4e88f53691fd91df954c24e0c264f3131d4aL27
2025-04-29 17:15:55 -07:00
Kazu Hirata
f347a06591
[lld] Use llvm::unique (NFC) (#136453) 2025-04-19 13:35:51 -07:00
Ellis Hoag
79fff6aa32
[lld][BP] Avoid ordering ICF'ed sections (#126327)
ICF runs before BPSectionOrderer. When a section is ICF'ed, it seems
that the original sections are marked as not live, but are still kept
around. Prior to this patch, those ICF'ed sections would be passed to BP
and ordered before being skipped when writing the output. Now, these
sections are no longer passed to BP, saving runtime and possibly
improving BP's output.

In a large binary, I found that the number of sections ordered using BP
decreased, while the number of duplicate sections drastically decreased
as expected.
```
Functions for startup: 50755 -> 50520
Functions for compression: 165734 -> 105328
Duplicate functions: 1827231 -> 55230
```
2025-02-13 08:57:44 -08:00
Fangrui Song
2f6e3df08a BPSectionOrderer: stabilize iteration order and node order
Exposed by the test added in the reverted #120514

* Fix libstdc++/libc++ differences due to nth_element. https://github.com/llvm/llvm-project/pull/125450#issuecomment-2631404178
* Fix LLVM_ENABLE_REVERSE_ITERATION=1 differences
* Fix potential issue in `currentSize += D::getSize(*sections[*sectionIdxs.begin()])` where DenseSet was used, though not covered by a test
2025-02-03 10:36:51 -08:00
Hans Wennborg
f3c4b58f4b Revert "[ELF] Add BPSectionOrderer options (#120514)"
The ELF/bp-section-orderer.s test is failing on some buildbots due to
what seems like non-determinism issues, see comments on the original PR
and #125450

Reverting to green the build.

This reverts commit 0154dce8d39d2688b09f4e073fe601099a399365 and
follow-up commits 046dd4b28b9c1a75a96cf63465021ffa9fe1a979 and
c92f20416e6dbbde9790067b80e75ef1ef5d0fa4.
2025-02-03 11:41:23 +01:00
Fangrui Song
046dd4b28b [lld] BPSectionOrderer: stabilize iteration order 2025-02-02 21:58:29 -08:00
Fangrui Song
115bb87ad0 [lld] BPSectionOrderer: replace Symbol with Defined and optimize getSymbols. NFC 2025-02-02 15:43:01 -08:00
Fangrui Song
e0c7f081f1
[lld-macho] Refactor BPSectionOrderer with CRTP. NFC
PR #117514 refactored BPSectionOrderer to be used by the ELF port
but introduced some inefficiency:

* BPSectionBase/BPSymbol are wrappers around a single pointer.
  The numbers of sections and symbols could be huge, and the extra
  allocations are memory inefficient.
* Reconstructing the returned DenseMap (since BPSectionBase != InputSectin)
  is wasteful.

This patch refactors BPSectionOrderer with Curiously Recurring Template
Pattern and eliminates the inefficiency. In addition,
`symbolToSectionIdxs` is removed and `rootSymbolToSectionIdxs` building
is moved to lld/MachO: while getting sections for symbols is cheap in
Mach-O, it is awkward and inefficient in the ELF port.

While here, add a file-level comment and replace some `StringMap<*>`
(which copies strings) with `DenseMap<CachedHashStringRef, *>`.

Pull Request: https://github.com/llvm/llvm-project/pull/124482
2025-01-27 18:24:59 -08:00
Fangrui Song
e8e75e08c9 [lld-macho] Remove unneeded functions from BPSectionOrderer. NFC 2025-01-26 09:46:38 -08:00
Fangrui Song
cc88a5e615
[lld-macho,NFC] Switch to increasing priorities
--order_file, call graph profile, and BalancedPartitioning currently
build the section order vector by decreasing priority (from SIZE_MAX to
0). However, it's conventional to use an increasing key (see
OutputSection::inputOrder).

Switch to increasing priorities, remove the global variable
highestAvailablePriority, and remove the highestAvailablePriority
parameter from BPSectionOrderer. Change size_t to int.

This improves consistenty with the ELF and COFF ports. The ELF port
utilizes negative priorities for --symbol-ordering-file and call graph
profile, and non-negative priorities for --shuffle-sections (no Mach-O
counterpart yet).

Pull Request: https://github.com/llvm/llvm-project/pull/121727
2025-01-10 09:32:03 -08:00
Max
79e859e049
[lld] Move BPSectionOrderer from MachO to Common for reuse in ELF (#117514)
Add lld/Common/BPSectionOrdererBase from MachO for reuse in ELF
2024-12-18 09:24:25 -08:00
SharonXSharon
6827a00d4d
[lld][InstrProf] Do not use cstring offset hashes in function order for compression (#113606) 2024-10-28 09:47:21 -07:00
Ellis Hoag
ce91e2153f
[lld][InstrProf] Sort startup functions for compression (#107348) 2024-09-06 09:22:03 -07:00
Ellis Hoag
3380dae2f0
[lld][InstrProf] Refactor BPSectionOrderer.cpp (#107347)
Refactor some code in `BPSectionOrderer.cpp` in preparation for
https://github.com/llvm/llvm-project/pull/107348.

* Rename `constructNodesForCompression()` -> `getUnsForCompression()`
and return a `SmallVector` directly rather than populating a vector
alias
* Pass `duplicateSectionIdxs` as a pointer to make it possible to skip
finding (nearly) duplicate sections
* Combine `duplicate{Function,Data}SectionIdxs` into one variable
* Compute all `BPFunctionNode` vectors at the end (like
`nodesForStartup`)

There should be no functional change.
2024-09-05 14:55:05 -07:00
Ellis Hoag
f95bd62cf9
[lld][InstrProf] Add "Separate" irpgo-profile-sort option (#101084)
Add the "Separate" option `--irpgo-profile-sort <profile` instead of
just the "Joined" option `--irpgo-profile-sort=<profile>`. This is
useful if the path has a `,` for some reason which would break when
trying to use `-Wl,--irpgo-profile-sort=<profile-with-comma>`.

While I'm here, use `static_cast<>` instead of the C style cast
introduced in https://github.com/llvm/llvm-project/pull/100627
2024-08-01 11:20:01 -07:00
Scott Todd
4f8050806e
[lld] Add explicit conversion for enum to Twine. (#100627)
This fixes `error: ambiguous conversion for functional-style cast from
'lld::macho::InputSection::Kind' to 'llvm::Twine'`, observed when
building with clang-9 and reported here:
https://github.com/llvm/llvm-project/pull/96268#discussion_r1691909931.
2024-07-25 11:37:31 -07:00
Ellis Hoag
e3b30bc553
[lld][InstrProf] Profile guided function order (#96268)
Add the lld flags `--irpgo-profile-sort=<profile>` and
`--compression-sort={function,data,both}` to order functions to improve
startup time, and functions or data to improve compressed size,
respectively.

We use Balanced Partitioning to determine the best section order using
traces from IRPGO profiles (see
https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068
for details) to improve startup time and using hashes of section
contents to improve compressed size.

In our recent LLVM talk (https://www.youtube.com/watch?v=yd4pbSTjwuA),
we showed that this can reduce page faults during startup by 40% on a
large iOS app and we can reduce compressed size by 0.8-3%.

More details can be found in https://dl.acm.org/doi/10.1145/3660635

---------

Co-authored-by: Vincent Lee <thevinster@users.noreply.github.com>
2024-07-23 08:34:40 -07:00