43 Commits

Author SHA1 Message Date
Jez Ng
070af48d13 [lld-macho][nfc] Decouple tapi-link.s test from libSystem
If we fix https://github.com/llvm/llvm-project/issues/54184, we will end
up including libSystem in every %lld invocation, which would break
tapi-link.s as it assumes that libSystem isn't directly linked (instead
it goes through libReexportSystem).

Let's remove this unnecessary coupling, as well as use `split-file`
instead of having a separate file under `Inputs`.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D120939
2022-03-03 19:48:59 -05:00
Nuri Amari
aaf00f3f19 Add MachO signature verification test
Add a test to ensure that MachO files including
a LC_CODE_SIGNATURE load command produced by lld
are signed correctly.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D109840
2021-09-16 17:55:32 -07:00
Jez Ng
ac2dd06b91 [lld-macho] Deduplicate CFStrings
`__cfstring` is a special literal section, so instead of breaking it up
at symbol boundaries, we break it up at fixed-width boundaries (since
each literal is the same size). Symbols can only occur at one of those
boundaries, so this is strictly more powerful than
`.subsections_via_symbols`.

With that in place, we then run the section through ICF.

This change is about perf-neutral when linking chromium_framework.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D105045
2021-07-01 21:22:38 -04:00
Greg McGary
f27e4548fc [lld-macho] Implement ICF
ICF = Identical C(ode|OMDAT) Folding

This is the LLD ELF/COFF algorithm, adapted for MachO. So far, only `-icf all` is supported. In order to support `-icf safe`, we will need to port address-significance tables (`.addrsig` directives) to MachO, which will come in later diffs.

`check-{llvm,clang,lld}` have 0 regressions for `lld -icf all` vs. baseline ld64.

We only run ICF on `__TEXT,__text` for reasons explained in the block comment in `ConcatOutputSection.cpp`.

Here is the perf impact for linking `chromium_framekwork` on a Mac Pro (16-core Xeon W) for the non-ICF case vs. pre-ICF:
```
    N           Min           Max        Median           Avg        Stddev
x  20          4.27          4.44          4.34         4.349   0.043029977
+  20          4.37          4.46         4.405        4.4115   0.025188761
Difference at 95.0% confidence
        0.0625 +/- 0.0225658
        1.43711% +/- 0.518873%
        (Student's t, pooled s = 0.0352566)
```

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D103292
2021-06-17 10:07:44 -07:00
Nico Weber
a5645513db [lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.

Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).

Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:

    % ministat lld_*
    x lld_nostrip.txt
    + lld_strip.txt
        N           Min           Max        Median           Avg        Stddev
    x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
    +  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
    Difference at 95.0% confidence
            -0.144711 +/- 0.0336749
            -3.60967% +/- 0.839989%
            (Student's t, pooled s = 0.0358398)

This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols

It's possible it interacts with more features I didn't think of,
of course.

I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
  as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests

Implemenation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.

Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
  since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
  (`.no_dead_strip` in asm)

Fixes PR49276.

Differential Revision: https://reviews.llvm.org/D103324
2021-06-02 11:09:26 -04:00
Jez Ng
9260760235 [lld-macho] Support loading of zippered dylibs
ld64 can emit dylibs that support more than one platform (typically macOS and
macCatalyst). This diff allows LLD to read in those dylibs. Note that this is a
super bare-bones implementation -- in particular, I haven't added support for
LLD to emit those multi-platform dylibs, nor have I added a variety of
validation checks that ld64 does. Until we have a use-case for emitting zippered
dylibs, I think this is good enough.

Fixes PR49597.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D101954
2021-05-06 11:19:40 -04:00
Jez Ng
7654d8e1a9 [lld-macho][nfc] Convert the mock libSystem.tbd to TBDv4
It doesn't seem like TBDv3 allows for specifying multiple platforms, so I'm
upgrading us to TBDv4. (We need to support multiple platforms in order to test
that we can handle zippered dylibs; that functionality will be added in an
upcoming diff.)

Differential Revision: https://reviews.llvm.org/D101953
2021-05-06 11:19:40 -04:00
Jez Ng
2d28100bf2 [lld-macho] Initial scaffolding for ARM32 support
This just parses the `-arch armv7` and emits the right header flags.
The rest will be slowly fleshed out in upcoming diffs.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D101557
2021-04-30 16:17:25 -04:00
Jez Ng
7208bd4b32 [lld-macho] Skip platform checks for a few libSystem re-exports
XCode 12 ships with mismatched platforms for these libraries,
so this hack is necessary...

Fixes PR49799.

Reviewed By: #lld-macho, gkm, smeenai

Differential Revision: https://reviews.llvm.org/D100913
2021-04-20 19:54:53 -04:00
Jez Ng
3bc88eb392 [lld-macho] Add support for arm64_32
From what I can tell, it's pretty similar to arm64. The two main differences
are:

1. No 64-bit relocations
2. Stub code writes to 32-bit registers instead of 64-bit

Plus of course the various on-disk structures like `segment_command` are using
the 32-bit instead of the 64-bit variants.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D99822
2021-04-15 21:16:33 -04:00
Jez Ng
8ca366935b Revert "[lld-macho] Add support for arm64_32" and other stacked diffs
This reverts commits:
* 8914902b01a3f8bdea9c71a0d9d23e4ee0ae80e4
* 35a745d814e1cde3de25d0d959fddc31e1061a41
* 682d1dfe09436857aa3b64365b5cc6fcbf1f043b
2021-04-13 12:40:58 -04:00
Jez Ng
8914902b01 [lld-macho] Add support for arm64_32
From what I can tell, it's pretty similar to arm64. The two main differences
are:

1. No 64-bit relocations
2. Stub code writes to 32-bit registers instead of 64-bit

Plus of course the various on-disk structures like `segment_command` are using
the 32-bit instead of the 64-bit variants.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D99822
2021-04-13 10:43:28 -04:00
Vy Nguyen
f499b932bf Revert "Revert "Revert "Revert "Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)""""""
This reverts commit 4876ba5b2d6a1264ec73e5cf3fcad083f6927d19.

Third-attemp relanding D98559, new change:
  - explicitly cast enum to underlying type to avoid ambiguity (workaround to clang's bug).
2021-03-23 14:51:05 -04:00
Mehdi Amini
4876ba5b2d Revert "Revert "Revert "Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)"""""
This reverts commit 3c21166a94ea02b946e9eea75c5e9bdfa8c43ae6.
The build is broken (clang-8 host compiler):

lld/MachO/DriverUtils.cpp:271:8: error: use of overloaded operator '<<' is ambiguous (with operand types 'llvm::raw_fd_ostream' and 'lld::macho::DependencyTracker::DepOpCode')
    os << opcode;
    ~~ ^  ~~~~~~
2021-03-23 00:19:12 +00:00
Vy Nguyen
3c21166a94 Revert "Revert "Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)""""
This reverts commit 9670d2e4af4c996098089e31b03ca138bc8d27e9.

Second attemp to reland D98559. New changes:
 - inline functions removed from cpp file.
 - updated tests to use CHECK-DAG instead of CHECK-NEXT
 - fixed ambiguous "<<" operator by switching `char` to uint8_t
2021-03-22 19:34:51 -04:00
Vy Nguyen
9670d2e4af Revert "Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)"""
This reverts commit 5ad2c225f353adc92473af391775c029db23a7d9.

bots still  unhappy - revertting again
2021-03-22 14:54:01 -04:00
Vy Nguyen
5ad2c225f3 Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)""
This reverts commit 2554b95db57cfcc13864d9bbb9f4e75892067c14.

Relanding [lld-macho] Implement -dependency_info (D98559) with changes:
 - inline functions removed from cpp file.
 - updated tests to not check libSystem.tbd with other input files (because of possible indeterministic ordering)
2021-03-22 14:41:57 -04:00
Nico Weber
2554b95db5 Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)"
This reverts commit c53a1322f329e29446c7625da423f58f09ec1a55.
Test only passes depending on build dir having a lexicographically later name
than the source dir, and doesn't link on mac/win. See
https://reviews.llvm.org/D98559#2640265 onward.
2021-03-21 16:35:38 -04:00
Vy Nguyen
c53a1322f3 [lld-macho] Implement -dependency_info (partially - more opcodes needed)
Bug: https://bugs.llvm.org/show_bug.cgi?id=49278
The flag is not well documented, so this implementation is based on observed behaviour.

When specified, `-dependency_info <path>` produced a text file containing information pertaining to the current linkage, such as input files, output file, linker version, etc.

This file's layout is also not documented, but it seems to be a series of null ('\0') terminated strings in the form `<op code><path>`

`<op code>` could be:
   `0x00` : linker version
   `0x10` : input
   `0x11` : files not found(??)
   `0x40` : output

`<path>` : is the file path, except for the linker-version case.

(??) This part is a bit unclear. I think it means all the files the linker attempted to look at, but could not find.

Differential Revision: https://reviews.llvm.org/D98559
2021-03-21 14:35:46 -04:00
Jez Ng
38a6374564 [lld-macho] Only codesign by default on arm64 macOS
instead of doing it on all arm64 platforms.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D98446
2021-03-12 17:26:27 -05:00
Jez Ng
55a32812fa [lld-macho] Filter TAPI re-exports by target
Previously, we were loading re-exports without checking whether
they were compatible with our target. Prior to {D97209}, it meant that
we were defining dylib symbols that were invalid -- usually a silent
failure unless our binary actually used them. D97209 exposed this as an
explicit error.

Along the way, I've extended our TAPI compatibility check to cover the
platform as well, instead of just checking the arch. To this end, I've
replaced MachO::Architecture with MachO::Target in our Config struct.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D97867
2021-03-04 14:36:47 -05:00
Jez Ng
8601be809e [lld-macho] Fix & fold reexport-nested-libs test into stub-link.s
The reexport-nested-libs test added in D97438 was a bit wonky.

First, it was linking against libReexportSystem.tbd which targets the
iOS simulator, and which in turn attempted to re-export the iOS
simulator's libSystem. However, due to the way `-syslibroot` works, it
was actually re-exporting the macOS libSystem.

As a result, the test was not actually able to resolve the symbols in
the desired libSystem. I'm guessing that @oontvoo was confused by this
and therefore included those symbols in libReexportSystem.tbd itself.
But this means that the test wasn't actually testing the resolution of
re-exported symbols (though it did at least verify that the re-exported
libraries could be located).

After some consideration, I figured that stub-link.s could be extended
to cover what reexport-nested-libs.s was attempting to do. The test
targets macOS, so we only have one `-syslibroot` and no chance of
confusion.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D97866
2021-03-04 14:36:46 -05:00
Vy Nguyen
9a2e2de15f [lld-macho] Change loadReexport to handle the case where a TAPI re-exports to reference documents nested within other TBD.
Currently, it was delibrately impleneted to not handle this case, but as it has turnt out, we need this feature.
The concrete use case is
       `System/Library/Frameworks/Cocoa.framework/Versions/A/Cocoa` reexports
               /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit , which then rexports
                    /System/Library/PrivateFrameworks/UIFoundation.framework/Versions/A/UIFoundation

The current implemention uses a global currentTopLevelTapi, which is not reset until it finishes loading the whole tree.
This is a problem because if the top-level is set to Cocoa, then when we get to UIFoundation, it will try to find UIFoundation in the current top level, which is Cocoa and will not find it.

The right thing should be:
 - When loading a library from a TBD file, re-exports need to be looked up in the auxiliary documents within the same TBD.
 - When loading from an actual dylib, no additional TBD documents need to be examined.
 - In no case does a re-export mentioned in one TBD file need to be looked up in a document in an auxiliary document from a different TBD file

Differential Revision: https://reviews.llvm.org/D97438
2021-03-02 12:14:31 -05:00
Nico Weber
3e6b6cee00 [lld/mac] Use libSystem.dylib instead of libSystem.B.dylib in tests
For -flat_namespace, lld needs to load dylibs in LC_LOAD_DYLIB.
The current setup meant that libSystem.dylib would cause a LC_LOAD_DYLIB
with libSystem.B.dylib, but that didn't exist in our libsysroot for
tests. So just drop the .B.

See https://reviews.llvm.org/D97641#2595237 and
https://reviews.llvm.org/D97641#2595270
2021-03-01 15:23:28 -05:00
Jez Ng
4752cdc9a2 [lld-macho] Check for arch compatibility when loading ObjFiles and TBDs
The silent failures had confused me a few times.

I haven't added a similar check for platform yet as we don't yet have logic to
infer the platform automatically, and so adding that check would require
updating dozens of test files.

Reviewed By: #lld-macho, thakis, alexshap

Differential Revision: https://reviews.llvm.org/D97209
2021-02-23 22:02:38 -05:00
Greg McGary
87104faac4 [lld-macho] Add ARM64 target arch
This is an initial base commit for ARM64 target arch support. I don't represent that it complete or bug-free, but wish to put it out for review now that some basic things like branch target & load/store address relocs are working.

I can add more tests to this base commit, or add them in follow-up commits.

It is not entirely clear whether I use the "ARM64" (Apple) or "AArch64" (non-Apple) naming convention. Guidance is appreciated.

Differential Revision: https://reviews.llvm.org/D88629
2021-02-08 18:14:07 -07:00
Jez Ng
a817594de9 [lld-macho][nfc] Clean up tests
* Migrate most of our tests to use `split-file` instead of `echo`
* Remove individual `rm -f %t/libfoo.a` commands in favor of a top-level `rm -rf %t`
* Remove unused `Inputs/libfunction.s`

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D93604
2020-12-21 14:44:08 -05:00
Jez Ng
d2b845dd6c [lld-macho] Disable invalid/stub-link.s test for Mac
It seems to be failing on some Google Buildbots.

This diff also includes a minor fix for the install name of one of
libSystem's re-exports. I don't think it's the cause of the test
failure, though. The wrong install name just meant that the symbol
lookup failure would still happen, but it would have been caused by the
re-export not being found, instead of the arch failing to match.

Differential Revision: https://reviews.llvm.org/D86728
2020-08-27 11:24:22 -07:00
Jez Ng
7394460d87 [lld-macho] Handle TAPI and regular re-exports uniformly
The re-exports list in a TAPI document can either refer to other inlined
TAPI documents, or to on-disk files (which may themselves be TBD or
regular files.) Similarly, the re-exports of a regular dylib can refer
to a TBD file.

Differential Revision: https://reviews.llvm.org/D85404
2020-08-26 19:20:48 -07:00
Jez Ng
6336c042f6 [lld-macho] Make it possible to re-export .tbd files
Two things needed fixing for that to work:

1. getName() no longer returns null for DylibFiles constructed from TAPIs
2. markSubLibrary() now accepts .tbd as a possible extension

Differential Revision: https://reviews.llvm.org/D86180
2020-08-26 19:20:42 -07:00
Jez Ng
a499898e86 [lld-macho] Generate ObjC symbols from .tbd files
I followed similar logic in TapiFile.cpp.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D85255
2020-08-12 19:50:10 -07:00
Jez Ng
25367dfefb [lld-macho] Add .tbd support for frameworks
Required for e.g. linking iOS apps since they don't have a platform-native
SDK

Reviewed By: #lld-macho, compnerd, smeenai

Differential Revision: https://reviews.llvm.org/D85153
2020-08-07 11:04:54 -07:00
Jez Ng
ca85e37338 [lld-macho] Support static linking of thread-locals
Note: What ELF refers to as "TLS", Mach-O seems to refer to as "TLV", i.e.
thread-local variables.

This diff implements support for TLV relocations that reference defined
symbols. On x86_64, TLV relocations are always used with movq opcodes, so for
defined TLVs, we don't need to create a synthetic section to store the
addresses of the symbols -- we can just convert the `movq` to a `leaq`.

One notable quirk of Mach-O's TLVs is that absolute-address relocations
inside TLV-defining sections behave differently -- their addresses are
no longer absolute, but relative to the start of the target section.
(AFAICT, RIP-relative relocations are not allowed in these sections.)

Reviewed By: #lld-macho, compnerd, smeenai

Differential Revision: https://reviews.llvm.org/D85080
2020-08-07 11:04:52 -07:00
Nico Weber
425fb21e03 ld64.lld: Make janky support for tbd files actually work sometimes
Also fix a bug in the test input that made the test miss this issue.
2020-07-02 15:31:21 -04:00
Saleem Abdulrasool
6fe27b5fed lld: initial pass at supporting TBD
Add support to lld to use Text Based API stubs for linking.  This is
support is incomplete not filtering out platforms.  It also does not
account for architecture specific API handling and potentially does not
correctly handle trees of re-exports with inlined libraries being
treated as direct children of the top level library.
2020-06-08 18:15:40 -07:00
Jez Ng
f04d1c3b90 [lld-macho] Move all tests for erroneous inputs under invalid/
For consistency.

The no-id-dylib test was originally referencing the Inputs/ folder via a
relative path. Instead of updating that path, I decided to make the test
self-contained.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D80217
2020-06-02 13:19:38 -07:00
Jez Ng
b3e2fc931d [lld-macho] Support calls to functions in dylibs
Summary:
This diff implements lazy symbol binding -- very similar to the PLT
mechanism in ELF.

ELF's .plt section is broken up into two sections in Mach-O:
StubsSection and StubHelperSection. Calls to functions in dylibs will
end up calling into StubsSection, which contains indirect jumps to
addresses stored in the LazyPointerSection (the counterpart to ELF's
.plt.got).

Initially, the LazyPointerSection contains addresses that point into one
of the entry points in the middle of the StubHelperSection. The code in
StubHelperSection will push on the stack an offset into the
LazyBindingSection. The push is followed by a jump to the beginning of
the StubHelperSection (similar to PLT0), which then calls into
dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this
call looks it up in the GOT.

The stub binder will look up the bind opcodes in the LazyBindingSection
at the given offset. The bind opcodes will tell the binder to update the
address in the LazyPointerSection to point to the symbol, so that
subsequent calls don't have to redo the symbol resolution. The binder
will then jump to the resolved symbol.

Depends on D78269.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78270
2020-05-09 20:56:22 -07:00
Kellie Medlin
6cb073133c [lld] Merge Mach-O input sections
Summary: Similar to other formats, input sections in the MachO
implementation are now grouped under output sections. This is primarily
a refactor, although there's some new logic (like resolving the output
section's flags based on its inputs).

Differential Revision: https://reviews.llvm.org/D77893
2020-05-01 16:57:18 -07:00
Jez Ng
9854edd817 [lld-macho] Implement basic export trie
Build the trie by performing a three-way radix quicksort: We start by
sorting the strings by their first characters, then sort the strings
with the same first characters by their second characters, and so on
recursively. Each time the prefixes diverge, we add a node to the trie.
Thanks to @ruiu for the idea.

I used llvm-mc's radix quicksort implementation as a starting point. The
trie offset fixpoint code was taken from
MachONormalizedFileBinaryWriter.cpp.

Differential Revision: https://reviews.llvm.org/D76977
2020-04-29 15:44:27 -07:00
Jez Ng
62b8f32f76 [lld-macho][reland] Add support for emitting dylibs with a single symbol
This got reverted due to UBSAN errors in a diff lower in the stack,
which is being fixed in https://reviews.llvm.org/D79050. This diff is
otherwise identical to the original https://reviews.llvm.org/D76908
(which was committed in 9598778bd191 and reverted in b52bc2653bbc).

Differential Revision: https://reviews.llvm.org/D79051
2020-04-28 17:08:32 -07:00
Shoaib Meenai
b52bc2653b Revert "[lld-macho] Add support for emitting dylibs with a single symbol"
This reverts commit 9598778bd1910e77ccd399f2c9e979c8ecf98e55.

Reverting due to UBSan failures:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/40817/steps/check-lld%20ubsan/logs/stdio
2020-04-28 11:34:03 -07:00
Jez Ng
9598778bd1 [lld-macho] Add support for emitting dylibs with a single symbol
Summary:
Add logic for emitting the correct set of load commands and segments
when `-dylib` is passed.

I haven't gotten to implementing a real export trie yet, so we can only
emit a single symbol, but it's enough to replace the YAML test files
introduced in D76252.

Differential Revision: https://reviews.llvm.org/D76908
2020-04-27 13:33:46 -07:00
Jez Ng
060efd24c7 [lld-macho] Add basic support for linking against dylibs
This diff implements:

* dylib loading (much of which is being restored from @pcc and @ruiu's
  original work)
* The GOT_LOAD relocation, which allows us to load non-lazy dylib
  symbols
* Basic bind opcode emission, which tells `dyld` how to populate the GOT

Differential Revision: https://reviews.llvm.org/D76252
2020-04-21 13:43:19 -07:00