Fixes MSVC CRT thread-local constructors support on hybrid ARM64X
targets.
`-arm64xsameaddress` is an undocumented option that ensures the
specified function has the same address in both native and EC views of
hybrid images. To achieve this, the linker emits additional thunks and
replaces the symbols
of those functions with the thunk symbol (the same thunk is used in both
views). The thunk code jumps to the native function (similar to range
extension thunks), but additional ARM64X relocations are emitted to
replace the target with the EC function in the EC view.
MSVC appears to generate thunks even for non-hybrid ARM64EC images. As a
side effect, the native symbol is pulled in. Since this is used in the
CRT for thread-local constructors, it results in the image containing
unnecessary native code. Because these thunks do not appear to be useful
in that context, we limit this behavior to actual hybrid targets. This
may change if compatibility requires it.
The tricky part is that thunks should be skipped if the symbol is not
live in either view, and symbol replacement must be reflected in weak
aliases. This requires thunk generation to happen before resolving weak
aliases but after the GC pass. To enable this, the `markLive` call was
moved earlier, and the final weak alias resolution was postponed until
afterward. This requires more code to be aware of weak aliases, which
previously could assume they were already resolved.
String table size is too big for large binary when symbol table is
enabled. Some strings in strtab is same so it can be reused.
This patch revives 9ffeaaa authored by mstorsjo with the prioritized
string table builder to fix debug section name issue (see 4d2eda2
for more details).
---------
Co-authored-by: Wen Haohai <whh108@live.com>
Co-authored-by: James Henderson <James.Henderson@sony.com>
MSVC linker accepts native ARM64 object files as input with
`-machine:arm64ec`, similar to `-machine:arm64x`. Its usefulness is very
limited; for example, both exports and imports are not reflected in the
PE structures and can't work. However, their symbol tables are otherwise
functional.
Since we already have handling of multiple symbol tables implemented for
ARM64X, the required changes are mostly about adjusting relevant checks
to account for them on the ARM64EC target.
Delay-load helper handling is a bit of a shortcut. The patch never pulls
it for native object files and just ensures that the code is fine with
that. In general, I think it would be nice to adjust the driver to pull
it only when it's actually referenced, which would allow applying the
same logic to the native symbol table on ARM64EC without worrying about
pulling too much.
Cygwin requires these symbols for its fork emulation to know what data
to copy into the child. GNU ld defines these symbols for MinGW targets
also, so do the same here.
Cygwin also has the `.data_cygwin_nocopy` section, which is merged into
`.data` outside the `__data_start__` to `__data_end__` range. This
excludes it from fork's copying. AFAIK it's only used by the Cygwin DLL
itself (which requires a custom linker script to link, that's not
supported by LLD), but the section is included in GNU ld's default
linker script so handle it here too.
Signed-off-by: Jeremy Drake <github@jdrake.com>
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
Originally, the intent behind symtab was to represent the symbol table
seen in the PE header (without applying ARM64X relocations). However, in
most cases outside of `writeHeader()`, the code references either both
symbol tables or only the EC one, for example, `mainSymtab` in
`linkerMain()` maps to `hybridSymtab` on ARM64X.
MSVC's link.exe allows pure ARM64EC images to include native ARM64
files. This patch prepares LLD to support the same, which will require
`hybridSymtab` to be available even for ARM64EC. At that point,
`writeHeader()` will need to use the EC symbol table, and the original
reasoning for keeping it in `hybridSymtab` no longer applies.
Given this, it seems cleaner to treat the EC symbol table as the “main”
one, assigning it to `symtab`, and use `hybridSymtab` for the native
symbol table instead. Since `writeHeader()` will need to be conditional
anyway, this change simplifies the rest of the code by allowing other
parts to consistently treat `ctx.symtab` as the main symbol table.
As a further simplification, this also allows us to eliminate `symtabEC`
and use `symtab` directly; I’ll submit that as a separate PR.
The map file now uses the EC symbol table for printed entry points and
exports, matching MSVC behavior.
On ARM64X, symbol names alone are ambiguous as they may refer to either
a native or an EC symbol. Append '(EC symbol)' or '(native symbol)' in
diagnostic messages to distinguish them.
Since commit dadc6f2488684, only the constructor of the `EdataContents`
class is used. Replace it with a function and skip the call when using a
custom `.edata` section.
This change implements support for the /stub flag to align with MS
link.exe. This option is useful when a program needs to optimize the DOS
program that executes when the PE runs on DOS, avoiding the traditional
hardcoded DOS program in LLD.
This change ensures the load config in the hybrid image view is handled
correctly. It introduces a new Arm64XRelocVal class to abstract
relocation values, allowing them to be relative to a symbol. This class
will also be useful for managing ARM64X relocation offsets in the
future.
This change ensures that base relocations are sorted in the output,
aligning with MSVC linker behavior. While input files typically provide
sorted relocations, this update guarantees correct sorting even if the
input relocations are unordered.
This modifies the machine field in the hybrid view to be AMD64, aligning
it with expectations from ARM64EC modules. While this provides initial
support, additional relocations will be necessary for full
functionality. Many of these cases depend on implementing separate
namespace support first.
Move clearing of the .reloc section from addBaserels to assignAddresses
to ensure it is always cleared, regardless of the relocatable
configuration. This change also clarifies the reasoning for adding the
dynamic relocations chunk in that location.