221 Commits

Author SHA1 Message Date
alx32
bbfa50696e
[lld-macho] Fix bug in makeSyntheticInputSection when -dead_strip flag is specified (#86878)
Previously, `makeSyntheticInputSection` would create a new
`ConcatInputSection` without setting `live` explicitly for it. Without
`-dead_strip` this would be OK since `live` would default to `true`.
However, with `-dead_strip`, `live` would default to false, and it would
remain set to `false`.
This hasn't resulted in any issues so far since no code paths that
exposed this issue were present.
However a recent change - ObjC relative method lists
(https://github.com/llvm/llvm-project/pull/86231) exposes this issue by
creating relocations to the `SyntheticInputSection`.
When these relocations are attempted to be written, this ends up with a
crash(assert), since the `SyntheticInputSection` they refer to is marked
as dead (`live` = `false`).

With this change, we set the correct behavior - `live` will always be
`true`. We add a test case that before this change would trigger an
assert in the linker.
2024-03-27 17:27:51 -07:00
alx32
742a82a729
[lld-macho] Implement support for ObjC relative method lists (#86231)
The MachO format supports relative offsets for ObjC method lists. This
support is present already in ld64. With this change we implement this
support in lld also.

Relative method lists can be identified by a specific flag (0x80000000)
in the method list header. When this flag is present, the method list
will contain 32-bit relative offsets to the current Program Counter
(PC), instead of absolute pointers.
Additionally, when relative method lists are used, the offset to the
selector name will now be relative and point to the selector reference
(selref) instead of the name itself.
2024-03-27 14:34:27 -07:00
alx32
e1a003dbbd
[lld-macho][NFC] Refactor ObjCSelRefsSection => ObjCSelRefsHelper (#86456)
In a previous PR: https://github.com/llvm/llvm-project/pull/83878, the
intent was to make no functional changes, just refactor out the code for
reuse.
However, by creating `ObjCSelRefsSection` as a `SyntheticSection` - this
slightly changed the functionality of the application as the
`SyntheticSection` constructor registers the `SyntheticSection` as a
functional one - with an associated `SyntheticInputSection`.

With this change we remove this unintended consequence by making the
code not use a `SyntheticSection` as base, but just by having it be a
static helper.
2024-03-25 06:55:11 -07:00
Amir Ayupov
f66d631bf8 Revert "[BOLT] Add BB index to BAT (#86044)"
This reverts commit 3b3de48fd84b8269d5f45ee0a9dc6b7448368424.
2024-03-22 08:38:40 -07:00
Amir Ayupov
3b3de48fd8
[BOLT] Add BB index to BAT (#86044) 2024-03-22 06:07:17 -07:00
alx32
b609a4d7ea
[lld-macho][NFC] Refactor insertions into inputSections (#85692)
Before this change, after `InputSection` objects are created, they need
to be added to the appropriate container for tracking.
The logic for selecting the appropriate container lives in `Driver.cpp`
/ `gatherInputSections`, where the `InputSection` is added to the
matching container depending on the input config and the type of
`InputSection`.

Also, multiple other locations also insert directly into `inputSections`
array - assuming that that is the appropriate container for the
`InputSection`'s they create. Currently this is the correct assumption,
however an upcoming feature will change this.

For an upcoming feature (relative method lists), we need to route
`InputSection`'s either to `inputSections` array or to a synthetic
section, depending on weather the relative method list optimization is
enabled or not.

We can achieve the above either by duplicating some of the logic or
refactoring the routing and `InputSection`'s and reusing that.

The refactoring & code sharing approach seems the correct way to go - as
such this diff performs the refactoring while not introducing any
functional changes. Later on we can just call `addInputSection` and not
have to worry about routing logic.

---------
2024-03-21 14:50:44 -07:00
alx32
a53401e9df
[lld-macho][NFC] Refactor ObjCSelRefsSection out of ObjCStubsSection (#83878)
Currently ObjCStubsSection is handling both the logic for the
"__objc_stubs" section, as well as the logic for the "__objc_selrefs"
section.
While this is OK for now, it will be an issue for other features that
want to interact with the "__objc_selrefs" section, such as upcoming
relative method lists feature - which will also want to create /
reference entries in the "__objc_selrefs" section.
In this PR we split the logic relating to handling the "__objc_selrefs"
section into a new SyntheticSection (ObjCSelRefsSection). Non-functional
change - neither the behavior nor implementation changes, the interface
is just made more friendly to not have "__objc_selrefs" so bound to
"__objc_stubs".

---------

Co-authored-by: Alex B <alexborcan@meta.com>
2024-03-10 13:22:31 -07:00
Nico Weber
624ea349d7
[lld/MachO] Fix assert on unsorted data-in-code entries (#81758)
When the data-in-code entries are in separate sections, they are not
guaranteed to be sorted. In particular, 68b1cc36f3df marked some libc++
string functions as noinline, which leads to global ctors involving
strings now producing data-in-code sections in __TEXT,__StaticInit,
which is why this now happens in practice.

Since data-in-code entries are relatively rare and small, just sort
them. No observed performance impact.

See also crbug.com/41487860
2024-02-16 07:46:58 -05:00
Kyungwoo Lee
391393179a
[lld-macho] icf objc stubs (#79730)
This supports icf for objc stubs.
2024-02-01 14:19:11 -08:00
Kyungwoo Lee
cb46c61817
[lld-macho] dead-strip objc stubs (#79726)
This supports dead-strip for objc stubs.
2024-01-29 23:29:57 -08:00
Kyungwoo Lee
77e204c7b0
[lld-macho][arm64] implement -objc_stubs_small (#78665)
This patch implements `-objc_stubs_small` targeting arm64, aiming to
align with ld64's behavior.
1. `-objc_stubs_fast`: As previously implemented, this always uses the
Global Offset Table (GOT) to invoke `objc_msgSend`. The alignment of the
objc stub is 32 bytes.
2. `-objc_stubs_small`: This behavior depends on whether `objc_msgSend`
is defined. If it is, it directly jumps to `objc_msgSend`. If not, it
creates another stub to indirectly jump to `objc_msgSend`, minimizing
the size. The alignment of the objc stub in this case is 4 bytes.
2024-01-23 07:31:34 -08:00
Vy Nguyen
642ffbbf38 [lld-macho]Use install_name as Identifier for code-sign, if available.
Detail:
LD64 uses the name provided via  -[dylib]install_name as "Identifier", when available.
For compatiblity, LLD should do that too.

Differential Revision: https://reviews.llvm.org/D155508
2023-07-19 14:19:15 -04:00
Fangrui Song
2090d66b23 [lld-macho] Switch to xxh3_64bits
xxh3 is substantially faster than xxh64.
For lld/ELF, there is substantial speedup in `.debug_str` duplicate
elimination (D154813). Use xxh3 for lld-macho as well.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D155677
2023-07-19 09:58:44 -07:00
Fangrui Song
8d85c96e0e [lld] StringRef::{starts,ends}with => {starts,ends}_with. NFC
The latter form is now preferred to be similar to C++20 starts_with.
This replacement also removes one function call when startswith is not inlined.
2023-06-05 14:36:19 -07:00
Keith Smiley
48e5f704c5
[lld-macho] Remove linking bitcode support
Apple deprecated bitcode in the deployment process in Xcode 14.0. Last
month Apple started requiring Xcode 14.1+ to submit apps to the App
Store. Since there isn't a use for bundling bitcode outside of
submitting to the App Store we should be safe to delete this handling
entirely from LLD.

Differential Revision: https://reviews.llvm.org/D150697
2023-05-30 14:47:11 -07:00
Vincent Lee
ed59b8a11c [lld-macho] Remove partially supported 32-bit ARM arch
We never really supported 32-bit ARM arch entirely, and partial support was added for
very specific features. Regardless, it fails to even link the most basic applications that at
this point, it might be better to move this arch as unsupported. Given that Apple will be
moving towards arm64 long term, I don't see any reason for anyone to invest time in
supporting this either, and for those who still need it should use apple's ld64 linker.

Fixes #62691

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D150544
2023-05-20 13:06:03 -07:00
Jez Ng
c4d9df9f78 [lld-macho][nfc] Clean up a bunch of clang-tidy issues 2023-04-05 07:50:28 -04:00
Jez Ng
4f086218dd [lld-macho] Support re-exports of individual symbols
Specifically, we support this:

  ld64.lld -dylib foo.o libbar.dylib -exported_symbol _bar -o libfoo.dylib

Where `_bar` is defined in libbar.dylib.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D144153
2023-03-27 16:39:37 -04:00
Jez Ng
dd4a9c463b [lld-macho][nfc] Convert more alignTo() to alignToPowerOf2()
Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D145261
2023-03-07 16:59:38 -08:00
Keith Smiley
6578e0d1d0
[lld-macho] Remove duplicate minimum version info
At some point PlatformInfo's Target changed types to a type that also
has minimum deployment target info. This caused ambiguity if you tried
to get the target triple from the Target, as the actual minimum version
info was being stored separately. This bulk of this change is changing
the parsing of these values to support this.

Differential Revision: https://reviews.llvm.org/D145263
2023-03-03 13:47:01 -08:00
Kazu Hirata
55e2cd1609 Use llvm::count{lr}_{zero,one} (NFC) 2023-01-28 12:41:20 -08:00
Daniel Bertalan
948fc66f5e
[lld-macho] Set 4-byte alignment for __init_offsets
dyld refuses to run initializers if this section is unaligned.

Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1383240

Differential Revision: https://reviews.llvm.org/D137803
2022-11-10 23:32:55 +01:00
Kazu Hirata
3f82caf7b1 [lld] Fix a warning
This patch fixes:

  lld/MachO/SyntheticSections.cpp: In member function ‘virtual void
  lld::macho::ChainedFixupsSection::writeTo(uint8_t*) const’:
2022-10-30 13:33:33 -07:00
Jez Ng
0cf6515e27 [lld-macho][nfc] Use llvm::enumerate + destructuring in more places
I love C++17!

chromium_framework_less_dwarf on my 16-core Mac Pro shows no stat sig change in wall time but a slight decrease in user time:

```
           base           diff           difference (95% CI)
sys_time   1.759 ± 0.037  1.761 ± 0.033  [  -0.9% ..   +1.1%]
user_time  4.920 ± 0.043  4.886 ± 0.051  [  -1.2% ..   -0.2%]
wall_time  5.950 ± 0.117  5.900 ± 0.116  [  -1.8% ..   +0.2%]
samples    26             37
```

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D136518
2022-10-22 10:41:20 -04:00
Jez Ng
b945733026 [lld-macho] Map file should map symbols to their original bitcode file
... instead of mapping them to the intermediate object file.
This matches ld64.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D136380
2022-10-21 22:49:02 -04:00
Daniel Bertalan
0d30e92f59
[lld-macho] Add support for emitting chained fixups
This commit adds support for chained fixups, which were introduced in
Apple's late 2020 OS releases. This format replaces the dyld opcodes
used for supplying rebase and binding information, and encodes most of
that data directly in the memory location that will have the fixup
applied.

This reduces binary size and is a requirement for page-in linking, which
will be available starting with macOS 13.

A high-level overview of the format and my implementation can be found
in SyntheticSections.h.

This feature is currently gated behind the `-fixup_chains` flag, and
will be enabled by default for supported targets in a later commit.

Like in ld64, lazy binding is disabled when chained fixups are in use,
and the `-init_offsets` transformation is performed by default.

Differential Revision: https://reviews.llvm.org/D132560
2022-10-04 11:48:45 +02:00
Daniel Bertalan
3493f1a107
[lld-macho] Simplify base address calculation for init offsets (NFC) 2022-09-17 10:23:05 +02:00
Jez Ng
3925ea4172 [lld-macho][nfci] Don't include null terminator in StringRefs
So @keith observed
[here](https://reviews.llvm.org/D128108#inline-1263900) that the
StringRefs we were returning from `CStringInputSection::getStringRef()`
included the null terminator in their total length, but regular
StringRefs do not. Let's fix that so these StringRefs are less confusing
to use.

Reviewed By: #lld-macho, keith, Roger

Differential Revision: https://reviews.llvm.org/D133728
2022-09-13 21:23:48 -04:00
Daniel Bertalan
025a5b22c8
[lld-macho] Sort data-in-code entries
Previously, we would add entries to DataInCodeSection in the order they
appeared in input files. Because of this, entries would not be sorted if
sections were reordered due to e.g. `-order_file` or call graph profile
sorting. ld64 always keeps data-in-code information sorted.

This commit also fixes an incorrect assertion. The original assertion
from D103006 used to check that data-in-code entries are sorted in the
input objects -- likely because we use binary search on that data. In
D115556, the assertion was moved into `collectDataInCodeEntries`, but
the checked variable's name was not changed, so it ended up checking the
final contents of the DataInCodeSection.

We no longer crash when building LLVM with PGO using an asserts build of
LLD as the linker.

Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1265937

Numbers for linking the Chromium Framework reproducer from #48001, which
has 6829 data-in-code entries:

  x before
  + after
      N           Min           Max        Median           Avg        Stddev
  x  20     2.1076453     2.3059683     2.1132485     2.1350302   0.049905767
  +  20     2.1069031     2.3915262       2.14465     2.1728429   0.084065898
  No difference proven at 95.0% confidence

Differential Revision: https://reviews.llvm.org/D133581
2022-09-13 19:08:35 +02:00
Kazu Hirata
32aa35b504 Drop empty string literals from static_assert (NFC)
Identified with modernize-unary-static-assert.
2022-09-03 11:17:47 -07:00
Daniel Bertalan
389e0a81a1
[lld-macho] Support synthesizing __TEXT,__init_offsets
This section stores 32-bit `__TEXT` segment offsets of initializer
functions, and is used instead of `__mod_init_func` when chained fixups
are enabled.

Storing the offsets lets us avoid emitting fixups for the initializers.

Differential Revision: https://reviews.llvm.org/D132947
2022-08-31 10:13:45 +02:00
Daniel Bertalan
ae5d5426fb
[lld-macho] Rename {StubHelper,ObjCStubs}Section::setup() to setUp (NFC)
The phrasal verb is spelled "set up"; "setup" is a noun.

Suggested in https://reviews.llvm.org/D132947#inline-1280089
2022-08-30 18:30:14 +02:00
Daniel Bertalan
6b6d1abb10
[lld-macho] Move adding bindings for stub targets out of Writer (NFC)
We now re-use the existing needsBinding() helper to determine if a
branch has to go through a stub. The logic for determining which type of
binding is needed is moved inside StubsSection::addEntry().

This is an NFC refactor that simplifies my diff that adds support for
chained fixups.

Differential Revision: https://reviews.llvm.org/D132476
2022-08-25 17:37:36 +02:00
Keith Smiley
3c24fae398
[lld-macho] Add support for objc_msgSend stubs
Apple Clang in Xcode 14 introduced a new feature for reducing the
overhead of objc_msgSend calls by deduplicating the setup calls for each
individual selector. This works by clang adding undefined symbols for
each selector called in a translation unit, such as `_objc_msgSend$foo`
for calling the `foo` method on any `NSObject`. There are 2
different modes for this behavior, the default directly does the setup
for `_objc_msgSend` and calls it, and the smaller option does the
selector setup, and then calls the standard `_objc_msgSend` stub
function.

The general overview of how this works is:

- Undefined symbols with the given prefix are collected
- The suffix of each matching undefined symbol is added as a string to
  `__objc_methname`
- A pointer is added for every method name in the `__objc_selrefs`
  section
- A `got` entry is emitted for `_objc_msgSend`
- Stubs are emitting pointing to the synthesized locations

Notes:

- Both `__objc_methname` and `__objc_selrefs` can also exist from object
  files, so their contents are merged with our synthesized contents
- The compiler emits method names for defined methods, but not for
  undefined symbols you call, but stubs are used for both
- This only implements the default "fast" mode currently just to reduce
  the diff, I also doubt many folks will care to swap modes
- This only implements this for arm64 and x86_64, we don't need to
  implement this for 32 bit iOS archs, but we should implement it for
  watchOS archs in a later diff

Differential Revision: https://reviews.llvm.org/D128108
2022-08-10 17:17:17 -07:00
Fangrui Song
bccdf9197b Revert "[lld-macho] Work around odr-use of const non-inline static data member to fix -O0 build after D128298"
This reverts commit 20b2d3260d4a1878ca2a37cee6ee335a21a12d0f.
The workaround is no longer needed for C++17.
2022-08-06 16:44:14 -07:00
Jez Ng
ee61dc5f6c [lld-macho][nfc] Reduce nesting of code added in D130125 2022-07-23 13:16:00 -04:00
Kazu Hirata
1cc7f5bede Use static_assert instead of assert (NFC)
Identified with misc-static-assert.
2022-07-23 09:22:27 -07:00
Jez Ng
d23da0ec6c [lld-macho] Fold __objc_imageinfo sections
Previously, we treated it as a regular ConcatInputSection. However, ld64
actually parses its contents and uses that to synthesize a single image
info struct, generating one 8-byte section instead of `8 * number of
object files with ObjC code`.

I'm not entirely sure what impact this section has on the runtime, so I
just tried to follow ld64's semantics as closely as possible in this
diff. My main motivation though was to reduce binary size.

No significant perf change on chromium_framework on my 16-core Mac Pro:

             base           diff           difference (95% CI)
  sys_time   1.764 ± 0.062  1.748 ± 0.032  [  -2.4% ..   +0.5%]
  user_time  5.112 ± 0.104  5.106 ± 0.046  [  -0.9% ..   +0.7%]
  wall_time  6.111 ± 0.184  6.085 ± 0.076  [  -1.6% ..   +0.8%]
  samples    30             32

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D130125
2022-07-23 12:12:01 -04:00
Daniel Bertalan
54e18b2397 [lld-macho] Optimize rebase opcode generation
This commit reduces the size of the emitted rebase sections by
generating the REBASE_OPCODE_DO_REBASE_ADD_ADDR_ULEB and
REBASE_OPCODE_DO_REBASE_ULEB_TIMES_SKIPPING_ULEB opcodes.

With this change, chromium_framework's rebase section is a 40% smaller
197 kilobytes, down from the previous 320 kB. That is 6 kB smaller than
what ld64 produces for the same input.

Performance figures from my M1 Mac mini:

x before
+ after

    N           Min           Max        Median           Avg        Stddev
x  10     4.2269349     4.3300061     4.2689675     4.2690016   0.031151669
+  10      4.219331     4.2914009     4.2398136     4.2448277   0.023817308
No difference proven at 95.0% confidence

Differential Revision: https://reviews.llvm.org/D130180
2022-07-21 10:00:39 +02:00
Daniel Bertalan
8d29f0fdb9 [lld-macho] Emit REBASE_OPCODE_ADD_ADDR_IMM_SCALED if possible
An ADD_ADDR rebase opcode's argument can be encoded as an immediate if
the offset is less than 15 * word size. This change reduces the size of
chromium_framework by 100+ KiB.

Differential Revision: https://reviews.llvm.org/D128798
2022-06-29 22:28:39 +02:00
Fangrui Song
20b2d3260d [lld-macho] Work around odr-use of const non-inline static data member to fix -O0 build after D128298
```
ld.lld: error: undefined symbol: lld::macho::CodeSignatureSection::blockSize
>>> referenced by SyntheticSections.cpp:1253 (/home/maskray/llvm/lld/MachO/SyntheticSections.cpp:1253)
>>>               tools/lld/MachO/CMakeFiles/lldMachO.dir/SyntheticSections.cpp.o:(lld::macho::CodeSignatureSection::writeHashes(unsigned char*) const::$_7::operator()(unsigned long) const)
```
2022-06-21 19:22:28 -07:00
Nico Weber
0baf13e282 [lld/mac] Parallelize code signature computation
According to ministat, this is a small but measurable speedup
(using the repro in PR56121):

    N           Min           Max        Median           Avg        Stddev
x  10     3.7439518     3.7783802     3.7730219     3.7655502   0.012375226
+  10     3.6149218      3.692198     3.6519327     3.6502951   0.025905601
Difference at 95.0% confidence
	-0.115255 +/- 0.0190746
	-3.06078% +/- 0.506554%
	(Student's t, pooled s = 0.0203008)

(Without 858e8b17f7365, this change here to use parallelFor is an 18% speedup,
and doing 858e8b17f7365 on top of this change is just a 2.55% +/- 0.58% win.
Doing both results in a total speedup of 20.85% +/- 0.44%.)

Differential Revision: https://reviews.llvm.org/D128298
2022-06-21 20:41:35 -04:00
Daniel Bertalan
5792797c5b Reland "[lld-macho] Show source information for undefined references"
The error used to look like this:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x4)

If DWARF line information is available, we now show where in the source
the references are coming from:

  ld64.lld: error: unreferenced symbol: _foo
  >>> referenced by: bar.cpp:42 (/path/to/bar.cpp:42)
  >>>                /path/to/bar.o:(symbol _baz+0x4)

The reland is identical to the first time this landed. The fix was in D128294.
This reverts commit 0cc7ad417585b3185c32e395cc5e6cf082a347af.

Differential Revision: https://reviews.llvm.org/D128184
2022-06-21 18:50:06 -04:00
Nico Weber
3ade3d3724 [lld/mac] Replace while loop with for loop
No behavior change. In preparation for using a parallelFor() here.

Differential Revision: https://reviews.llvm.org/D128295
2022-06-21 15:42:06 -04:00
Nico Weber
858e8b17f7 [lld/mac] On Apple systems, call CC_SHA256 from libSystem
It's in libSystem, so it doesn't bring in any new deps, and it's
currently much faster than LLVM's current SHA256 implementation.

Makes linking (arm64) Chromium Framework with ld64.lld 17% faster.
See also PR56121.

No behavior change.

Differential Revision: https://reviews.llvm.org/D128290
2022-06-21 14:58:04 -04:00
Nico Weber
ca25baee7e [lld/mac] Extract a sha256() function
No behavior change.

Differential Revision: https://reviews.llvm.org/D128289
2022-06-21 14:02:42 -04:00
Nico Weber
0cc7ad4175 Revert "[lld-macho] Show source information for undefined references"
This reverts commit cd7624f15369f0d395c1edee1a0b9592083d2fe0.
See https://reviews.llvm.org/D128184#3597534
2022-06-20 19:15:57 -04:00
Daniel Bertalan
cd7624f153 [lld-macho] Show source information for undefined references
The error used to look like this:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x4)

If DWARF line information is available, we now show where in the source
the references are coming from:

  ld64.lld: error: unreferenced symbol: _foo
  >>> referenced by: bar.cpp:42 (/path/to/bar.cpp:42)
  >>>                /path/to/bar.o:(symbol _baz+0x4)

Differential Revision: https://reviews.llvm.org/D128184
2022-06-20 18:49:42 -04:00
Vy Nguyen
82de9bb66b [lld-macho] Addressed additional post-commit comments from D126046
- fixed newlines
- renamed helper function for clarity
- added additional comment

Differential Revision: https://reviews.llvm.org/D126792
2022-06-03 15:48:11 -04:00
Nico Weber
815825f442 [lld/mac] clang-format after f5709066e3b 2022-06-01 14:53:08 -04:00