283 Commits

Author SHA1 Message Date
Jacek Caban
a6fcd1a663
[LLD][COFF] Set isUsedInRegularObj for target symbols in resolveAlternateNames (#154837)
Fixes: #154595

Prior to commit bbc8346e6bb543b0a87f52114fed7d766446bee1, this flag was
set by `insert()` from `addUndefined()`. Set it explicitly now.
2025-08-22 13:05:19 +02:00
Jacek Caban
7223f67ffc
[LLD][COFF] Don't resolve weak aliases when performing local import (#152000)
Fixes crashes reported in #151255.

The alias may have already been stored for later resolution, which can
lead to treating a resolved alias as if it were still undefined.
Instead, use the alias target directly for the import.

Also extended the test to make reproducing the problem more likely, and
added an assert that catches the issue.
2025-08-05 00:21:04 +02:00
Jacek Caban
dcc71f22ca
[LLD][COFF] Add support for ARM64X same-address thunks (#151255)
Fixes MSVC CRT thread-local constructors support on hybrid ARM64X
targets.

`-arm64xsameaddress` is an undocumented option that ensures the
specified function has the same address in both native and EC views of
hybrid images. To achieve this, the linker emits additional thunks and
replaces the symbols
of those functions with the thunk symbol (the same thunk is used in both
views). The thunk code jumps to the native function (similar to range
extension thunks), but additional ARM64X relocations are emitted to
replace the target with the EC function in the EC view.

MSVC appears to generate thunks even for non-hybrid ARM64EC images. As a
side effect, the native symbol is pulled in. Since this is used in the
CRT for thread-local constructors, it results in the image containing
unnecessary native code. Because these thunks do not appear to be useful
in that context, we limit this behavior to actual hybrid targets. This
may change if compatibility requires it.

The tricky part is that thunks should be skipped if the symbol is not
live in either view, and symbol replacement must be reflected in weak
aliases. This requires thunk generation to happen before resolving weak
aliases but after the GC pass. To enable this, the `markLive` call was
moved earlier, and the final weak alias resolution was postponed until
afterward. This requires more code to be aware of weak aliases, which
previously could assume they were already resolved.
2025-07-31 13:17:36 +02:00
Jacek Caban
ac31d64a64
[LLD][COFF] Avoid resolving symbols with -alternatename if the target is undefined (#149496)
This change fixes an issue with the use of `-alternatename` in the MSVC
CRT on ARM64EC, where both mangled and demangled symbol names are
specified. Without this patch, the demangled name could be resolved to
an anti-dependency alias of the target. Since chaining anti-dependency
aliases is not allowed, this results in an undefined symbol.

The root cause isn't specific to ARM64EC, it can affect other targets as
well, even when anti-dependency aliases aren't involved. The
accompanying test case demonstrates a scenario where the symbol could be
resolved from an archive. However, because the archive member is pulled
in after the first pass of alternate name resolution, and archive
members don't override weak aliases, eager resolution would incorrectly
skip it.
2025-07-28 19:26:25 +02:00
Jacek Caban
38cd66a6ce
[LLD][COFF] Move resolving alternate names to SymbolTable (NFC) (#149495) 2025-07-28 17:02:49 +02:00
Jacek Caban
d6a2a2612f
[LLD][COFF] Add support for DLL imports on ARM64EC (#141587)
Define additional `__imp_aux_` and mangled lazy symbols. Also allow
overriding EC aliases with lazy symbols, as we do for other lazy symbol
types.
2025-05-29 11:37:18 +02:00
Jacek Caban
6602bfa721
[LLD][COFF] Avoid forcing lazy symbols in loadMinGWSymbols during symbol table enumeration (#141593)
Forcing lazy symbols at this point may introduce new entries into the
symbol table. Avoid mutating `symTab` while iterating over it.
2025-05-29 11:35:16 +02:00
Jacek Caban
5b0572875c
[LLD][COFF] Add support for including native ARM64 objects in ARM64EC images (#137653)
MSVC linker accepts native ARM64 object files as input with
`-machine:arm64ec`, similar to `-machine:arm64x`. Its usefulness is very
limited; for example, both exports and imports are not reflected in the
PE structures and can't work. However, their symbol tables are otherwise
functional.

Since we already have handling of multiple symbol tables implemented for
ARM64X, the required changes are mostly about adjusting relevant checks
to account for them on the ARM64EC target.

Delay-load helper handling is a bit of a shortcut. The patch never pulls
it for native object files and just ensures that the code is fine with
that. In general, I think it would be nice to adjust the driver to pull
it only when it's actually referenced, which would allow applying the
same logic to the native symbol table on ARM64EC without worrying about
pulling too much.
2025-05-15 11:38:24 +02:00
Alexandre Ganea
c75eac7c03
[LLD][COFF] Don't dllimport from static libraries (#134443)
This reverts commit 6a1bdd9 and re-instate behavior that matches what
MSVC link.exe does, that is, error out when trying to dllimport a symbol
from a static library.

A hint is now displayed in stdout, mentioning that we should rather dllimport the symbol
from a import library.

Fixes https://github.com/llvm/llvm-project/issues/131807
2025-04-07 11:34:24 -04:00
Jacek Caban
7c26407a20
[LLD][COFF] Clarify EC vs. native symbols in diagnostics on ARM64X (#130857)
On ARM64X, symbol names alone are ambiguous as they may refer to either
a native or an EC symbol. Append '(EC symbol)' or '(native symbol)' in
diagnostic messages to distinguish them.
2025-03-15 21:15:08 +01:00
Jacek Caban
f2473bc31e
[LLD][COFF] Support -aligncomm directives on ARM64X (#129513) 2025-03-03 22:48:20 +01:00
Jacek Caban
8616c87335
[LLD][COFF] Support alternate names in both symbol tables on ARM64X (#127619)
The `.drectve` directive applies only to the namespace in which it is
defined, while the command-line argument applies only to the EC
namespace.
2025-02-21 12:52:28 +01:00
Nico Weber
d9b8120259
[lld/COFF] Fix -start-lib / -end-lib more after reviews.llvm.org/D116434 (#124294)
This is a follow-up to #120452 in a way.

Since lld/COFF does not yet insert all defined in an obj file before all
undefineds (ELF and MachO do this, see #67445 and things linked from
there), it's possible that:

1. We add an obj file a.obj
2. a.obj contains an undefined that's in b.obj, causing b.obj to be
added
3. b.obj contains an undefined that's in a part of a.obj that's not yet
in the symbol table, causing a recursive load of a.obj, which adds the
symbols in there twice, leading to duplicate symbol errors.

For normal archives, `ArchiveFile::addMember()` has a `seen` check to
prevent this. For start-lib lazy objects, we can just check if the
archive is still lazy at the recursive call.

This bug is similar to issue #59162.

(Eventually, we'll probably want to do what the MachO and ELF ports do.)

Includes a test that caused duplicate symbol diagnostics before this
code change.
2025-01-24 13:14:21 -05:00
Jacek Caban
a2c683b665
[LLD][COFF] Use EC symbol table for exports defined in module definition files (#123849) 2025-01-22 23:30:23 +01:00
Jacek Caban
455b3d6df2
[LLD][COFF] Separate EC and native exports for ARM64X (#123652)
Store exports in SymbolTable instead of Configuration.
2025-01-21 10:41:15 +01:00
Jacek Caban
b068f2fd0f
[LLD][COFF] Process bitcode files separately for each symbol table on ARM64X (#123194) 2025-01-17 11:36:12 +01:00
Jacek Caban
1bd5f34d76
[LLD][COFF] Move getChunk to LinkerDriver (NFC) (#123103)
The `getChunk` function returns all chunks, not just those specific to a
symbol table. Move it out of the `SymbolTable` class to clarify its
scope.
2025-01-16 12:55:12 +01:00
Jacek Caban
f22af59336
[LLD][COFF] Move symbol mangling and lookup helpers to SymbolTable class (NFC) (#122836)
This refactor prepares for further ARM64X hybrid support, where these
helpers will need to work with either the native or EC symbol table
based on context.
2025-01-15 15:21:06 +01:00
Jacek Caban
251ef3f503
[LLD][COFF] Use appropriate symbol table for -include argument on ARM64X (#122554)
Move `LinkerDriver::addUndefined` to` SymbolTable` to allow its use with
both symbol tables on ARM64X and rename it to `addGCRoot` to clarify its
distinct role compared to the existing `SymbolTable::addUndefined`.

Command-line `-include` arguments now apply to the EC symbol table, with
`mainSymtab` introduced in `linkerMain`. There will be more similar
cases. For `.drectve` sections, the corresponding symbol table is used
based on the context.
2025-01-13 23:16:57 +01:00
Jacek Caban
1fa0302ba2
[LLD][COFF] Emit warnings for missing load config on EC targets (#121339)
ARM64EC and ARM64X images require a load configuration to be valid.
2025-01-02 12:06:58 +01:00
Jacek Caban
8435225374
[LLD][COFF] Move addFile implementation to LinkerDriver (NFC) (#121342)
The addFile implementation does not rely on the SymbolTable object. With
#119294, the symbol table for input files is determined during the
construction of the objects representing them. To clarify that
relationship, this change moves the implementation from the SymbolTable
class to the LinkerDriver class.
2025-01-01 19:42:49 +01:00
Jacek Caban
ff29f38c02
[LLD][COFF] Store and validate load config in SymbolTable (#120324)
Improve diagnostics for invalid load configurations.
2024-12-29 11:43:45 +01:00
Nico Weber
f8bcd93224
[lld/COFF] Fix -start-lib / -end-lib after reviews.llvm.org/D116434 (#120452)
That change forgot to set `lazy` to false before calling `addFile()` in
`forceLazy()` which caused `addFile()` to parse the file we want to
force a load for to be added as a lazy object again instead of adding
the file to `ctx.objFileInstances`.

This is caught by a pretty simple test (included).
2024-12-19 11:30:54 -05:00
Nico Weber
cde996c31d [lld/COFF] Remove needless indirection
`symtab.ctx.symtab` is just `symtab`. Looks like #119296 added
this using a global find-and-replace.

This was the only instance of `symtab.ctx.symtab` in lld/.

No behavior change.
2024-12-17 16:27:16 -05:00
Jacek Caban
d3c4857179
[LLD][COFF] Store machine type in SymbolTable (NFC) (#119298)
This change prepares for hybrid ARM64X support, which requires two
`SymbolTable` instances: one for native symbols and one for EC symbols.
In such cases, `config.machine` will remain ARM64X, while the
`SymbolTable` instances will store ARM64 and ARM64EC machine types.
2024-12-15 18:43:09 +01:00
Jacek Caban
0a9810d325
[LLD][COFF] Factor out LinkerDriver::setMachine (NFC) (#119297) 2024-12-15 18:41:26 +01:00
Jacek Caban
6b493baec1
[LLD][COFF] Store reference to SymbolTable instead of COFFLinkerContext in InputFile (NFC) (#119296)
This change prepares for the introduction of separate hybrid namespaces.
Hybrid images will require two `SymbolTable` instances, making it
necessary to associate `InputFile` objects with the relevant one.
2024-12-15 12:45:34 +01:00
Fangrui Song
c7caab2238 [lld-link] Simplify some << toString 2024-12-05 20:56:19 -08:00
Fangrui Song
983f88c1ec [lld-link] Use COFFSyncStream
Add a operator<< overload for Symbol *.
2024-12-05 20:41:37 -08:00
Fangrui Song
8d225f10ef [lld-link] Replace error(...) with Err 2024-12-05 19:44:26 -08:00
Fangrui Song
4639a9a063 [lld-link] Replace log(...) with Log 2024-12-04 09:04:40 -08:00
Fangrui Song
1534f45694 [lld-link] Replace warn(...) with Warn(ctx) 2024-12-03 22:19:30 -08:00
Jacek Caban
f942949a7c
[LLD][COFF] Require explicit specification of ARM64EC target (#116281)
Inferring the ARM64EC target can lead to errors. The `-machine:arm64ec`
option may include x86_64 input files, and any valid ARM64EC input is
also valid for `-machine:arm64x`. MSVC requires an explicit `-machine`
argument with informative diagnostics; this patch adopts the same
behavior.
2024-11-24 14:33:14 +01:00
Jacek Caban
cdda76a8cf
[LLD][COFF] Fix handling of invalid ARM64EC function names (#116252)
Since these symbols cannot be mangled or demangled, there is no symbol
to check for conflicts in `checkLazyECPair`, nor is there an alias to
create in `addUndefined`. Attempting to create an import library with
such symbols results in an error; the patch includes a test to ensure
the error is handled correctly.

This is a follow-up to #115567.
2024-11-15 16:42:36 +01:00
Jacek Caban
56077e5ac0
[LLD][COFF] Add support for locally imported EC symbols (#114985)
Allow imported symbols to be recognized in both mangled and demangled
forms. Support __imp_aux_ symbols in addition to __imp_ symbols.
2024-11-06 12:09:22 +01:00
Jacek Caban
98bc5295ec
[LLD][COFF] Check both mangled and demangled symbols before adding a lazy archive symbol to the symbol table on ARM64EC (#113284)
On ARM64EC, a function symbol may appear in both mangled and demangled
forms:
- ARM64EC archives contain only the mangled name, while the demangled
symbol is defined by the object file as an alias.
- x86_64 archives contain only the demangled name (the mangled name is
usually defined by an object referencing the symbol as an alias to a
guess exit thunk).
- ARM64EC import files contain both the mangled and demangled names for
thunks.

If more than one archive defines the same function, this could lead to
different libraries being used for the same function depending on how
they are referenced. Avoid this by checking if the paired symbol is
already defined before adding a symbol to the table.
2024-10-23 13:10:07 +02:00
Jacek Caban
9b88792291
[LLD][COFF] Allow overriding EC alias symbols with lazy archive symbols (#113283)
On ARM64EC, external function calls emit a pair of weak-dependency
aliases: `func` to `#func` and `#func` to the `func` guess exit thunk
(instead of a single undefined `func` symbol, which would be emitted on
other targets). Allow such aliases to be overridden by lazy archive
symbols, just as we would for undefined symbols.
2024-10-23 12:43:38 +02:00
Jacek Caban
f1ba8943c8
[LLD][COFF] Support anti-dependency symbols (#112542)
Co-authored-by: Billy Laws <blaws05@gmail.com>

Anti-dependency symbols are allowed to be duplicated, with the first
definition taking precedence. If a regular weak alias is present, it is
preferred over an anti-dependency definition. Chaining anti-dependencies
is not allowed.
2024-10-21 11:44:31 +02:00
Kazu Hirata
3c2e1d3a00
[lld] Avoid repeated hash lookups (NFC) (#112299) 2024-10-15 07:35:42 -07:00
Mike Hommey
6a1bdd9a2e
[LLD][COFF] Do as many passes of resolveRemainingUndefines as necessary for undefined lazy symbols (#109082) 2024-10-03 22:53:26 +03:00
Jacek Caban
86d2abefcb
[LLD][COFF] Store __imp_ symbols as Defined in InputFile (#109115) 2024-09-18 19:49:06 +02:00
Mike Hommey
5e23b66699
[LLD][COFF] Handle imported weak aliases consistently (#109105)
symTab being a DenseMap, the order in which a symbol and its
corresponding import symbol are processed is not guaranteed, and when
the latter comes first, it is left undefined.
2024-09-18 14:42:42 +03:00
Jacek Caban
6ca5c397a9
[LLD][COFF] Redirect __imp_ Symbols to __imp_aux_ on ARM64EC for x64 object files (#108608)
On ARM64EC, __imp_ symbols reference the auxiliary IAT, while __imp_aux_
symbols reference the regular IAT. However, x86_64 code expects both to
reference the regular IAT. This change adjusts the symbols accordingly,
matching the behavior observed in the MSVC linker.
2024-09-17 00:01:17 +02:00
JOE1994
4b27b5800f [lld] Nits on uses of raw_string_ostream (NFC)
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Strip calls to raw_string_ostream::str(), to avoid excess layer of indirection.
2024-09-15 04:23:11 -04:00
Jacek Caban
82a36468c7
[LLD][COFF] Add support for ARM64EC auxiliary IAT (#108304)
In addition to the regular IAT, ARM64EC also includes an auxiliary IAT.
At runtime, the regular IAT is populated with the addresses of imported
functions, which may be x86_64 functions or the export thunks of ARM64EC
functions. The auxiliary IAT contains versions of functions that are
guaranteed to be directly callable by ARM64 code.

The linker fills the auxiliary IAT with the addresses of `__impchk_`
thunks. These thunks perform a call on the IAT address using
`__icall_helper_arm64ec` with the target address from the IAT. If the
imported function is an ARM64EC function, the OS may replace the address
in the auxiliary IAT with the address of the ARM64EC version of the
function (not its export thunk), avoiding the runtime call checker for
better performance.
2024-09-12 22:20:50 +02:00
Jacek Caban
99a2354993
[LLD][COFF] Add support for ARM64EC import call thunks. (#107931)
These thunks can be accessed using `__impchk_*` symbols, though they
are typically not called directly. Instead, they are used to populate the
auxiliary IAT. When the imported function is x86_64 (or an ARM64EC
function with a patched export thunk), the thunk is used to call it.
Otherwise, the OS may replace the thunk at runtime with a direct
pointer to the ARM64EC function to avoid the overhead.
2024-09-11 14:46:40 +02:00
Jacek Caban
7e0008d5ad
[LLD][COFF][NFC] Create import thunks in ImportFile::parse. (#107929) 2024-09-11 12:22:36 +02:00
Jacek Caban
519b36925c
[LLD][COFF][NFC] Store impSym as DefinedImportData in ImportFile. (#107162) 2024-09-04 11:49:50 +02:00
Jacek Caban
ec4d5a6658
[LLD][COFF] Preserve original symbol name when resolving weak aliases. (#105897)
Instead of replacing it with target's name.
2024-08-26 19:20:18 +02:00
Jacek Caban
a2d8743cc8
[LLD][COFF] Generate X64 thunks for ARM64EC entry points and patchable functions. (#105499)
This implements Fast-Forward Sequences documented in ARM64EC
ABI https://learn.microsoft.com/en-us/windows/arm/arm64ec-abi.

There are two conditions when linker should generate such thunks:

- For each exported ARM64EC functions.
It applies only to ARM64EC functions (we may also have pure x64
functions, for which no thunk is needed). MSVC linker creates
`EXP+<mangled export name>` symbol in those cases that points to the
thunk and uses that symbol for the export. It's observable from the
module: it's possible to reference such symbols as I did in the test.
Note that it uses export name, not name of the symbol that's exported
(as in `foo` in `/EXPORT:foo=bar`). This implies that if the same
function is exported multiple times, it will have multiple thunks. I
followed this MSVC behavior.

- For hybrid_patchable functions.
The linker tries to generate a thunk for each undefined `EXP+*` symbol
(and such symbols are created by the compiler as a target of weak alias
from the demangled name). MSVC linker tries to find corresponding
`*$hp_target` symbol and if fails to do so, it outputs a cryptic error
like `LINK : fatal error LNK1000: Internal error during
IMAGE::BuildImage`. I just skip generating the thunk in such case (which
causes undefined reference error). MSVC linker additionally checks that
the symbol complex type is a function (see also #102898). We generally
don't do such checks in LLD, so I made it less strict. It should be
fine: if it's some data symbol, it will not have `$hp_target` symbol, so
we will skip it anyway.
2024-08-22 22:03:05 +02:00