On ARM64EC, external function calls emit a pair of weak-dependency
aliases: `func` to `#func` and `#func` to the `func` guess exit thunk
(instead of a single undefined `func` symbol, which would be emitted on
other targets). Allow such aliases to be overridden by lazy archive
symbols, just as we would for undefined symbols.
Co-authored-by: Billy Laws <blaws05@gmail.com>
Anti-dependency symbols are allowed to be duplicated, with the first
definition taking precedence. If a regular weak alias is present, it is
preferred over an anti-dependency definition. Chaining anti-dependencies
is not allowed.
In addition to the auxiliary IAT, ARM64EC modules also contain a copy of
it. At runtime, the auxiliary IAT is filled with the addresses of actual
ARM64EC functions when possible. If patching is detected, the OS may use
the IAT copy to revert the auxiliary IAT, ensuring that the call checker
is used for calls to imported functions.
In addition to the regular IAT, ARM64EC also includes an auxiliary IAT.
At runtime, the regular IAT is populated with the addresses of imported
functions, which may be x86_64 functions or the export thunks of ARM64EC
functions. The auxiliary IAT contains versions of functions that are
guaranteed to be directly callable by ARM64 code.
The linker fills the auxiliary IAT with the addresses of `__impchk_`
thunks. These thunks perform a call on the IAT address using
`__icall_helper_arm64ec` with the target address from the IAT. If the
imported function is an ARM64EC function, the OS may replace the address
in the auxiliary IAT with the address of the ARM64EC version of the
function (not its export thunk), avoiding the runtime call checker for
better performance.
These thunks can be accessed using `__impchk_*` symbols, though they
are typically not called directly. Instead, they are used to populate the
auxiliary IAT. When the imported function is x86_64 (or an ARM64EC
function with a patched export thunk), the thunk is used to call it.
Otherwise, the OS may replace the thunk at runtime with a direct
pointer to the ARM64EC function to avoid the overhead.
For x86_64 callable functions, ARM64EC requires an entry thunk generated
by the compiler. The linker interprets .hybmp sections to associate
function chunks with their entry points and writes an offset to thunks
preceding function section contents.
Additionally, ICF needs to be aware of entry thunks to not consider
chunks to be equal when they have different entry thunks, and GC needs
to mark entry thunks together with function chunks.
I used a new SectionChunkEC class instead of storing entry thunks in
SectionChunk, following the guideline to keep SectionChunk as compact as
possible. This way, there is no memory usage increase on non-EC targets.
In addition to looking for dependent (input) PDB files next to the associated .OBJ file, we now also look into the output folder as well. This mimics MSVC link.exe behavior.
Fixes#94152
This allows handling importlibs produced by llvm-dlltool in #78772.
ARM64EC import libraries use it by default, but it's supported by MSVC
link.exe on other platforms too.
This also avoids assuming null-terminated input, like in #78769.
The export names are saved as StringRefs pointing into the COFF
directives. In the case of LTO objects, this can be memory allocated
that is owned by the LTO InputFile, which gets destructed when doing the
compilation.
In the case of LTO objects from an older version of LLVM, which require
being upgraded when loaded, the directives string gets destructed, while
when using LTO objects of a matching version (the common case), the
directives string points into memory that doesn't get destructed on LTO
compilation.
Test this by linking a bundled binary LTO object file, from an older
version of LLVM.
This fixes issue #78591, and downstream issue
https://github.com/mstorsjo/llvm-mingw/issues/392.
This shouldn't have any user visible effect, but makes the logic within
the linker implementation more explicit.
Note how DWARF debug info sections were retained even if enabling a link
with PDB info only; that behaviour is preserved.
This redoes the fix from 3ab6209a3f93bdbeec8e9b9fcc00a9a4980915ff
differently, without the unwanted effect of preserving unnecessary
`__imp_` prefixed symbols.
If the referencing object is a regular object, the `__imp_` symbol will
have `isUsedInRegularObj` set on it from that already. If the
referencing object is an LTO object, we set `isUsedInRegularObj` for any
symbol starting with `__imp_`.
If the object file defining the `__imp_` symbol is a regular object, the
`isUsedInRegularObj` flag has no effect. If it is an LTO object, it
causes the symbol to be preserved.
In practice, all the Windows ARMNT IR objects show the architecture type
Thumb, not ARM.
Most other switch cases for architecture in lld/COFF check for and treat
`arm` and `thumb` equally.
When reading the bitcode input, undefined weak symbols will show up as
undefined symbols - which fails the early pass of checking for missing
symbols in symtab.reportUnresolvable(), before doing the actual LTO
compilation.
Mark such symbols as deferUndefined (added in
3785a413feef896e8a022731cc6ed405d5ebe81b /
https://reviews.llvm.org/D89004 for the -wrap option), to let them pass
through this LTO precheck. After the LTO compilation, the weak undefined
symbols will point towards an absolute null symbol as default.
Such weak undefined symbols are used for the TLS init function in the
Itanium C++ ABI, for TLS variables that potentially need to run a
constructor, when accessed across translation units.
This fixes https://github.com/llvm/llvm-project/issues/64513.
Such pointers are often used by the core parts of mingw-w64, to locally
define a function that might have been referred to with dllimport.
(MSVC style linkers can automatically provide such pointers, if there
are undefined references to `__imp_<func>` left but a definition of
`<func>` is available - although this prints the warning LNK4217. GNU ld
doesn't do this, so in mingw-w64, such things are generally handled by
manually providing the relevant `__imp_` pointers.)
Make sure that a full LTO build, that does LTO of both the `__imp_`
pointer and the object file referencing it, successfully resolves such
symbols.
This solution admittedly probably reduces the effect of the LTO
compilation if there would happen to be `__imp_` prefixed symbols
included, in LTO objects, that aren't actually used. Such symbols are
mostly used in the base toolchain, not often in user code, and usually
only the relevant object files are linked in anyway.
This fixes https://github.com/llvm/llvm-project/issues/57982.
The MinGW mode (enabled with the flag -lldmingw) does allow duplicate
weak symbols. A test in
compiler-rt/test/profile/Windows/coverage-weak-lld.cpp does currently
enable the -lldmingw flag in an MSVC context, in order to deal with
duplicate weak symbols.
Add a new, separate, lld specific flag for enabling this. In MinGW mode,
this is enabled by default, otherwise it is disabled.
This allows making the MinGW mode more restrictive in adding libpaths
from the surrounding environment; in MinGW mode, all libpaths are passed
explicitly by the compiler driver to the linker, which is attempted in
https://reviews.llvm.org/D144084.
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
This can happen when manually emitting strings into .drectve sections
with `__attribute__((section(".drectve")))`, which is a way to emulate
`#pragma comment(linker, "...")` for mingw compilers, without requiring
building with -fms-extensions.
Normally, this doesn't generate any comdat, but if compiled with
-fsanitize=address, this section does get turned into a comdat section.
This fixes#67261. This issue can be seen as a regression; a change in
the "lli" tool in 17.x triggers this case, if compiled with ASAN
enabled, triggering this unsupported corner case in LLD. With this
change, LLD can handle it.
This reverts commit 7370ff624d217b0f8f7512ca5b651a9b8095a411.
(and 47fb8ae2f9a4075de05433ef24f459b6befd1730).
This commit broke the symbol type in import libraries generated
for mingw autoexported symbols, when the source files were built
with LTO. I'll commit a testcase that showcases this issue after
the revert.
This patch relaxes the constraints on the error message saved in PDBInputFile when failing to load a pdb file.
Storing an `Error` member infers that it must be accessed exactly once, which doesn't fit in several scenarios:
- If an invalid PDB file is provided as input file but never used, a loading error is created but never handled, causing an assert at shutdown.
- PDB file created using MSVC's `/Zi` option : The loading error message must be displayed once per obj file.
Also, the state of `PDBInputFile` was altered when reading (taking) the `Error` member, causing issues:
- accessing it (taking the `Error`) makes the object look valid whereas it's not properly initialized
- read vs write concurrency on a same `PDBInputFile` in the ghash parallel algorithm
The solution adopted here was to instead store an optional error string, and generate Error objects from it on demand.
Differential Revision: https://reviews.llvm.org/D140333
Solve two issues that showed up when using LLD with Unreal Engine & FASTBuild:
1. It seems the S_OBJNAME record doesn't always record the "precomp signature". We were relying on that to match the PCH.OBJ with their dependent-OBJ.
2. MSVC link.exe is able to link a PCH.OBJ when the "precomp signatureÈ doesn't match, but LLD was failing. This was occuring since the Unreal Engine Build Tool was compiling the PCH.OBJ, but the dependent-OBJ were compiled & cached through FASTBuild. Upon a clean rebuild, the PCH.OBJs were recompiled by the Unreal Build Tool, thus the "precomp signatures" were changing; however the OBJs were already cached by FASTBuild, thus having an old "precomp signatures".
We now ignore "precomp signatures" and properly fallback to cmd-line name lookup, like MSVC link.exe does, and only fail if the PCH.OBJ type stream doesn't match the count expected by the dependent-OBJ.
Differential Revision: https://reviews.llvm.org/D136762
LLVM bitcode contains support for weak symbols, so we can add support
for overriding weak symbols in the output COFF even though COFF doesn't
have inherent support for weak symbols.
The motivation for this patch is that Chromium is trying to use libc++'s
assertion handler mechanism, which relies on weak symbols [0], but we're
unable to perform a ThinLTO build on Windows due to this problem [1].
[0]: https://reviews.llvm.org/D121478
[1]: https://crrev.com/c/3863576
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D133165
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.
See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html
Differential Revision: https://reviews.llvm.org/D108850
Similar to ELF 3a5fb57393c3bc77be9e7afc2ec9d4ec3c9bbf70.
* previously when a LazyObjFile was extracted, a new ObjFile/BitcodeFile was created; now the file is reused, just with `lazy` cleared
* avoid the confusing transfer of `symbols` from LazyObjFile to the new file
* simpler code, smaller executable (5200+ bytes smaller on x86-64)
* make eager parsing feasible (for parallel section/symbol table initialization)
Reviewed By: aganea, rnk
Differential Revision: https://reviews.llvm.org/D116434
Original commit description:
[LLD] Remove global state in lld/COFF
This patch removes globals from the lldCOFF library, by moving globals
into a context class (COFFLinkingContext) and passing it around wherever
it's needed.
See https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html for
context about removing globals from LLD.
I also haven't moved the `driver` or `config` variables yet.
Differential Revision: https://reviews.llvm.org/D109634
This reverts commit a2fd05ada9030eab2258fff25e77a05adccae128.
Original commits were b4fa71eed34d967195514fe9b0a5211fca2bc5bc
and e03c7e367adb8f228332e3c2ef8f45484597b719.
check for timer output"
Seems to be causing a number of asan test failures.
This reverts commit b4fa71eed34d967195514fe9b0a5211fca2bc5bc
and e03c7e367adb8f228332e3c2ef8f45484597b719.
This patch removes globals from the lldCOFF library, by moving globals
into a context class (COFFLinkingContext) and passing it around wherever
it's needed.
See https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html for
context about removing globals from LLD.
I also haven't moved the `driver` or `config` variables yet.
Differential Revision: https://reviews.llvm.org/D109634
In PGO, a C++ external linkage function `foo` has a private counter
`__profc_foo` and a private `__profd_foo` in a `comdat nodeduplicate`.
A `__attribute__((weak))` function `foo` has a weak hidden counter `__profc_foo`
and a private `__profd_foo` in a `comdat nodeduplicate`.
In `ld.lld a.o b.o`, say a.o defines an external linkage `foo` and b.o
defines a weak `foo`. Currently we treat `comdat nodeduplicate` as `comdat any`,
ld.lld will incorrectly consider `b.o:__profc_foo` non-prevailing. In the worst
case when `b.o:__profd_foo` is retained and `b.o:__profc_foo` isn't, there will
be dangling reference causing an `undefined hidden symbol` error.
Add SelectionKind to `Comdat` in IRSymtab and let linkers ignore nodeduplicate comdat.
Differential Revision: https://reviews.llvm.org/D106228
If linking directly against a DLL without an import library, the
DLL export symbols might not contain stdcall decorations.
If we have an undefined symbol with decoration, and we happen to have
a matching undecorated symbol (which either is lazy and can be loaded,
or already defined), then alias it against that instead.
This matches what's done in reverse, when we have a def file
declaring to export a symbol without decoration, but we only have
a defined decorated symbol. In that case we do a fuzzy match
(SymbolTable::findMangle). This case is more straightforward; if we
have a decorated undefined symbol, just strip the decoration and look
for the corresponding undecorated symbol name.
Add warnings and options for either silencing the warning or disabling
the whole feature, corresponding to how ld.bfd does it.
(This feature works for any symbol decoration mismatch, not only when
linking against a DLL directly; ld.bfd also tolerates it anywhere,
and also fixes up mismatches in the other direction, like
SymbolTable::findMangle, for any symbol, not only exports. But in
practice, at least for lld, it would primarily end up used for linking
against DLLs.)
Differential Revision: https://reviews.llvm.org/D104532