The script copies `ReleaseNotesTemplate.txt` to the corresponding
`ReleaseNotes.rst`/`.md` to clear the release notes.
The suffix of `ReleaseNotesTemplate.txt` must be `.txt`. If it is
`.rst`/`.md`, it will be treated as a documentation source file when
building documentation.
Fixed assertion failure when reading .eh_frame sections, and added
.eh_frame sections to tests.
This reverts commit 1e95349dbe329938d2962a78baa0ec421e9cd7d1.
Original commit message follows:
When code calls a function which then immediately tail calls another
function there is no need to go via the intermediate function. By
branching directly to the target function we reduce the program's working
set for a slight increase in runtime performance.
Normally it is relatively uncommon to have functions that just tail call
another function, but with LLVM control flow integrity we have jump tables
that replace the function itself as the canonical address. As a result,
when a function address is taken and called directly, for example after
a compiler optimization resolves the indirect call, or if code built
without control flow integrity calls the function, the call will go via
the jump table.
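As a rough illustration of the idea (not lld's implementation), the redirect can be modeled as a lookup through functions that do nothing but tail call another function. The names `forwards` and `resolve_branch_target` are hypothetical, and following more than one hop is an assumption of this sketch.

```python
def resolve_branch_target(target, forwards):
    """Return the function a branch to `target` can go to directly.

    `forwards` maps a function that only tail calls another function
    (for example a CFI jump table entry) to the function it tail calls.
    """
    seen = set()
    # Skip over intermediate functions; `seen` guards against cycles.
    while target in forwards and target not in seen:
        seen.add(target)
        target = forwards[target]
    return target


# Hypothetical jump table entry "f.cfi_jt" that just branches to "f".
jump_table = {"f.cfi_jt": "f"}
direct = resolve_branch_target("f.cfi_jt", jump_table)
```

A call that originally went through `f.cfi_jt` would then branch straight to `f`, which is what keeps the intermediate code out of the working set.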
The impact of this optimization was measured using a large internal
Google benchmark. The results were as follows:
CFI enabled: +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]
The optimization is enabled by default at -O2 but may also be enabled
or disabled individually with --{,no-}branch-to-branch.
This optimization is implemented for AArch64 and X86_64 only.
lld's runtime performance (real execution time) after adding this
optimization was measured using firefox-x64 from lld-speed-test [1]
with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:
```
N Min Max Median Avg Stddev
x 512 1.2264546 1.3481076 1.2970261 1.2965788 0.018620888
+ 512 1.2561196 1.3839965 1.3214632 1.3209327 0.019443971
Difference at 95.0% confidence
0.0243538 +/- 0.00233202
1.87831% +/- 0.179859%
(Student's t, pooled s = 0.0190369)
```
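The summary lines in the table can be re-derived from the per-sample averages and standard deviations. The sketch below assumes equal sample sizes and uses 1.96 as the 95% critical value (a normal approximation to Student's t with 1022 degrees of freedom); the variable names are illustrative only.

```python
import math

n = 512                               # samples per run, from the table
avg_x, sd_x = 1.2965788, 0.018620888  # row marked "x"
avg_p, sd_p = 1.3209327, 0.019443971  # row marked "+"

# Difference of the averages, absolute and relative to the "x" row.
diff = avg_p - avg_x
relative_percent = diff / avg_x * 100

# Pooled standard deviation for two samples of equal size.
pooled = math.sqrt((sd_x ** 2 + sd_p ** 2) / 2)

# Half-width of the 95% confidence interval for the difference.
half_width = 1.96 * pooled * math.sqrt(2 / n)
```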
[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057
Reviewers: zmodem, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/145579
This caused assertion failures in applyBranchToBranchOpt():
llvm/include/llvm/Support/Casting.h:578:
decltype(auto) llvm::cast(From*)
[with To = lld::elf::InputSection; From = lld::elf::InputSectionBase]:
Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
See comment on the PR (https://github.com/llvm/llvm-project/pull/138366)
This reverts commit 491b82a5ec1add78d2c93370580a2f1897b6a364.
This also reverts the follow-up "[lld] Use llvm::partition_point (NFC) (#145209)"
This reverts commit 2ac293f5ac4cf65c0c038bf75a88f1d6715e467d.
When code calls a function which then immediately tail calls another
function there is no need to go via the intermediate function. By
branching directly to the target function we reduce the program's working
set for a slight increase in runtime performance.
Normally it is relatively uncommon to have functions that just tail call
another function, but with LLVM control flow integrity we have jump tables
that replace the function itself as the canonical address. As a result,
when a function address is taken and called directly, for example after
a compiler optimization resolves the indirect call, or if code built
without control flow integrity calls the function, the call will go via
the jump table.
The impact of this optimization was measured using a large internal
Google benchmark. The results were as follows:
CFI enabled: +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]
The optimization is enabled by default at -O2 but may also be enabled
or disabled individually with --{,no-}branch-to-branch.
This optimization is implemented for AArch64 and X86_64 only.
lld's runtime performance (real execution time) after adding this
optimization was measured using firefox-x64 from lld-speed-test [1]
with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:
```
N Min Max Median Avg Stddev
x 512 1.2264546 1.3481076 1.2970261 1.2965788 0.018620888
+ 512 1.2561196 1.3839965 1.3214632 1.3209327 0.019443971
Difference at 95.0% confidence
0.0243538 +/- 0.00233202
1.87831% +/- 0.179859%
(Student's t, pooled s = 0.0190369)
```
[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057
Pull Request: https://github.com/llvm/llvm-project/pull/138366
The behavior of an undefined weak reference is implementation defined.
For static -no-pie linking, dynamic relocations are generally avoided (except
IRELATIVE). -shared linking generally emits dynamic relocations.
Dynamic -no-pie linking and -pie allow flexibility. Changes adjust the
behavior for better consistency and simpler internal representation,
e.g. https://reviews.llvm.org/D63003 and https://reviews.llvm.org/D105164
(generalized to undefined non-weak in
2fcaa00d1e2317a90c9071b735eb0e758b5dd58b).
GNU ld introduced -z [no]dynamic-undefined-weak option to fine-tune the
behavior. (The option is not very effective with -no-pie, e.g. on
x86-64, `ld.bfd a.o s.so -z dynamic-undefined-weak` generates
R_X86_64_NONE relocations instead of GLOB_DAT/JUMP_SLOT)
This patch implements -z [no]dynamic-undefined-weak option.
The effects are summarized as follows:
* Static -no-pie: no-op
* Dynamic -no-pie: nodynamic-undefined-weak suppresses GLOB_DAT/JUMP_SLOT
* Static -pie: dynamic-undefined-weak generates ABS/GLOB_DAT/JUMP_SLOT.
https://discourse.llvm.org/t/lld-weak-undefined-symbols-in-vdso-only/86749
* Dynamic -pie: nodynamic-undefined-weak suppresses ABS/GLOB_DAT/JUMP_SLOT
The -pie behavior likely stays stable while -no-pie (`!ctx.arg.isPic` in
`isStaticLinkTimeConstant`) behavior will likely change in the future.
The current default value of ctx.arg.zDynamicUndefined is selected to
prevent behavior changes.
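The four cases listed above can be condensed into a small decision function. This is only a sketch of the summary, assuming the option was given explicitly; it does not model the default value or -shared links, and the function and parameter names are hypothetical.

```python
def undefined_weak_gets_dynamic_reloc(pie, dynamic, dynamic_undefined_weak):
    """Model the summary for an explicitly specified option.

    pie: True for -pie, False for -no-pie.
    dynamic: True when linking against shared objects, False for static.
    dynamic_undefined_weak: True for -z dynamic-undefined-weak,
        False for -z nodynamic-undefined-weak.
    """
    if not pie and not dynamic:
        # Static -no-pie: the option is a no-op, no dynamic relocations.
        return False
    # Dynamic -no-pie, static -pie and dynamic -pie follow the option.
    return dynamic_undefined_weak
```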
Pull Request: https://github.com/llvm/llvm-project/pull/143831
When using `-no-pie` without a `SECTIONS` command, the linker uses the
target's default image base. If `-Ttext=` or `--section-start` specifies
an output section address below this base, the result is likely
unintended.
- With `--no-rosegment`, the PT_LOAD segment covering the ELF header cannot include `.text` if `.text`'s address is too low, causing an `error: output file too large`.
- With default `--rosegment`:
  - If a read-only section (e.g., `.rodata`) exists, a similar `error: output file too large` occurs.
  - Without read-only sections, the PT_LOAD segment covering the ELF header and program headers includes no sections, which is unusual and likely undesired. This also causes non-ascending PT_LOAD `p_vaddr` values related to the PT_LOAD that overlaps with PT_PHDR (#138584).
To prevent these issues, report an error if a section address is below
the image base and suggest `--image-base`. This check also applies when
`--image-base` is explicitly set but is skipped when a `SECTIONS`
command is used.
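A minimal sketch of the check described above, assuming section addresses are available as a name-to-address mapping. The function name and the wording of the diagnostic are hypothetical, not lld's actual output.

```python
def check_section_addresses(image_base, section_addrs, has_sections_command):
    """Return diagnostics for sections placed below the image base."""
    if has_sections_command:
        # The check is skipped when a SECTIONS command is used.
        return []
    errors = []
    for name, addr in section_addrs.items():
        if addr < image_base:
            errors.append(
                f"section '{name}' address (0x{addr:x}) is below the image "
                f"base (0x{image_base:x}); consider --image-base"
            )
    return errors


# Example: -Ttext=0x10000 with an image base of 0x200000.
errors = check_section_addresses(0x200000, {".text": 0x10000}, False)
```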
Pull Request: https://github.com/llvm/llvm-project/pull/140187
Following from the discussion in #132224, this seems like the best
approach to deal with a mix of XO and RX output sections in the same
binary. This change will also simplify the implementation of the
PURECODE section flag for AArch64.
To control this behaviour, the `--[no-]xosegment` flag is added to LLD
(similarly to `--[no-]rosegment`), which determines whether to allow
merging XO and RX sections in the same segment. The default value is
`--no-xosegment`, which is a breaking change compared to the previous
behaviour.
Release notes are also added, since this will be a breaking change.
This allows NOCROSSREFS to be specified in OVERLAY linker script
descriptions. This is a particularly useful part of the OVERLAY syntax,
since it's very rarely possible for one overlay section to sensibly
reference another.
Closes #128790
This prints a stack of reasons that symbols that match the given glob(s)
survived GC. It has no effect unless section GC occurs.
This implementation does not require -ffunction-sections or
-fdata-sections to produce readable results, although it does tend to
work better with them (as does GC).
Details about the semantics:
- Some chain of liveness reasons is reported; it isn't specified which
  chain.
- A symbol or section may be live:
  - Intrinsically (e.g., entry point)
  - Because needed by a live symbol or section
  - (Symbols only) Because part of a section live for another reason
  - (Sections only) Because they contain a live symbol
- Both global and local symbols (`STB_LOCAL`) are supported.
- References to symbol + offset are considered to point to:
  - If the referenced symbol is a section (`STT_SECTION`):
    - If a sized symbol encloses the referenced offset, the enclosing
      symbol.
    - Otherwise, the section itself, generically.
  - Otherwise, the referenced symbol.
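One way to picture how a chain of liveness reasons can be produced is to record, during marking, which item first made each item live, and then walk those records back to a root. The sketch below is a generic graph model with hypothetical names (`mark_live`, `why_live`); it does not mirror lld's data structures.

```python
from collections import deque


def mark_live(roots, edges):
    """Mark everything reachable from `roots`.

    `edges` maps a symbol or section to the items it keeps alive.
    Returns a dict mapping each live item to the item that made it live
    (None for intrinsically live roots such as the entry point).
    """
    reason = {root: None for root in roots}
    work = deque(roots)
    while work:
        item = work.popleft()
        for dep in edges.get(item, []):
            if dep not in reason:
                reason[dep] = item
                work.append(dep)
    return reason


def why_live(item, reason):
    """Return one chain from `item` back to a root, or [] if it is dead."""
    chain = []
    while item is not None and item in reason:
        chain.append(item)
        item = reason[item]
    return chain


edges = {"_start": ["main"], "main": [".text.helper"], ".text.helper": ["helper"]}
reason = mark_live(["_start"], edges)
chain = why_live("helper", reason)
```

Only the first reason found is kept per item, which matches the note that some chain is reported without specifying which one.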
Set the default processor version to v68 when the user does not specify
one in the command line. This includes changes in the LLVM backend and
linker (lld). Since lld normally sets the version based on inputs, this
change will only affect cases when there are no inputs.
Fixes #127558
When GCS was introduced to LLD, the gcs-report option allowed a user
to find out whether their relocatable objects supported
the feature. For an executable or shared library to support GCS, all
relocatable objects must declare that they support GCS.
The gcs-report checks were only done on relocatable object files,
however for a program to enable GCS, the executable and all shared
libraries that it loads must enable GCS. gcs-report-dynamic enables
checks to be performed on all shared objects loaded by LLD, and in cases
where GCS is not supported, a warning or error will be emitted.
It should be noted that only shared files directly passed to LLD are
checked for GCS support. Files that are noted in the `DT_NEEDED` tags
are assumed to have had their GCS support checked when they were
created.
The behaviour of the -zgcs-report-dynamic option matches that of GNU ld.
The behaviour is as follows unless the user explicitly sets the value:
* -zgcs-report=warning or -zgcs-report=error implies
-zgcs-report-dynamic=warning.
This approach avoids inheriting an error level if the user wishes to
continue building a module without rebuilding all the shared libraries.
The same approach was taken for the GNU ld linker, so behaviour is
identical across the toolchains.
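The inheritance rule above can be written as a small function. This is a sketch under the assumption that the report levels are the strings `none`, `warning` and `error` (only the latter two appear in the text); the function name is hypothetical.

```python
def effective_gcs_report_dynamic(gcs_report, gcs_report_dynamic=None):
    """Return the level used for shared objects passed to the linker."""
    if gcs_report_dynamic is not None:
        # An explicit -zgcs-report-dynamic value always wins.
        return gcs_report_dynamic
    if gcs_report in ("warning", "error"):
        # Inherit at most a warning, never an error.
        return "warning"
    return "none"
```

Capping the inherited level at `warning` is what lets a build continue when a shared library has not yet been rebuilt with GCS support.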
This implementation matches the error message and command line interface
used within the GNU ld Linker. See here:
724a8341f6
To support this option being introduced, two other changes are included
as part of this PR. The first converts the -zgcs-report option to
utilise an Enum, as opposed to StringRef values. This enables easier
tracking of the value the user defines when inheriting the value for the
gcs-report-dynamic option. The second is to parse the dynamic objects'
program headers to locate the GNU Attribute flag that shows GCS is
supported. This is needed so, when using the gcs-report-dynamic option,
LLD can correctly determine if a dynamic object supports GCS.
---------
Co-authored-by: Fangrui Song <i@maskray.me>
This was removed from the ABI in riscv-non-isa/riscv-elf-psabi-doc#398.
It is not emitted by LLVM, and seems to have been an internal
implementation detail in binutils.
This is a follow-up to 26ec5da744b8 which removed previous binutils
internal relocations when they were removed from the ABI.
The LLD implementation was not tested when it was added in
https://reviews.llvm.org/D39322
This allows the input section matching algorithm to be separated from
output section descriptions. This allows a group of sections to be
assigned to multiple output sections, providing an explicit version of
--enable-non-contiguous-regions's spilling that doesn't require altering
global linker script matching behavior with a flag. It also makes the
linker script language more expressive even if spilling is not intended,
since input section matching can be done in a different order than
sections are placed in an output section.
The implementation reuses the backend mechanism provided by
--enable-non-contiguous-regions, so it has roughly similar semantics and
limitations. In particular, sections cannot be spilled into or out of
INSERT, OVERWRITE_SECTIONS, or /DISCARD/. The former two aren't
intrinsic, so it may be possible to relax those restrictions later.
GNU ld since 2.41 supports `-z nosectionheader`, which is mildly useful. It omits
the section header table and non-ALLOC sections (including
.symtab/.strtab (--strip-all)).
This option is simple to implement and might be used by LLDB to test
program headers parsing without the section header table (#100900).
-z sectionheader, which is the default, is also added.
Pull Request: https://github.com/llvm/llvm-project/pull/101286
This reverts commit f55b79f59a77b4be586d649e9ced9f8667265011.
The known issues with chained fixups have been addressed by #98913,
#98305, #97156 and #95171.
Compared to the original commit, support for xrOS (which postdates
chained fixups' introduction) was added and an unnecessary test change
was removed.
----------
Original commit message:
Enable chained fixups in lld when all platform and version criteria are
met. This is an attempt at simplifying the logic used in ld 907:
93d74eafc3/src/ld/Options.cpp (L5458-L5549)
Some changes were made to simplify the logic:
- only enable chained fixups for macOS from 13.0 to avoid the arch check
- only enable chained fixups for iphonesimulator from 16.0 to avoid the
arch check
- don't enable chained fixups for not specifically listed platforms
- don't enable chained fixups for arm64_32
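The simplified criteria can be sketched as a table lookup. Only the two thresholds stated above are included; the platform strings, the table name and the function name are hypothetical, and any other platforms the real table lists are left out here.

```python
# Minimum deployment targets taken from the list above.
MIN_VERSION_FOR_CHAINED_FIXUPS = {
    "macos": (13, 0),
    "iphonesimulator": (16, 0),
}


def use_chained_fixups(platform, min_os_version, arch):
    """Decide whether chained fixups would be enabled by default."""
    if arch == "arm64_32":
        return False
    required = MIN_VERSION_FOR_CHAINED_FIXUPS.get(platform)
    if required is None:
        # Platforms that are not specifically listed never qualify.
        return False
    return min_os_version >= required
```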
Implement the two commands described by
https://sourceware.org/binutils/docs/ld/Miscellaneous-Commands.html
After `outputSections` is available, check each output section described
by at least one `NOCROSSREFS`/`NOCROSSREFS_TO` command. For each checked
output section, scan relocations from its input sections.
This step is slow, so it uses `parallelForEach(isd->sections, ...)`.
To support non SHF_ALLOC sections, `InputSectionBase::relocations`
(empty) cannot be used. In addition, we may explore eliminating this
member to speed up relocation scanning.
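The check itself can be modeled as a filter over relocations, each reduced to the output section it comes from and the output section it refers to. This sequential sketch follows the GNU ld description of the two commands (for `NOCROSSREFS_TO`, only references to the first listed section are prohibited); all names are hypothetical and the parallel scan is not modeled.

```python
def find_prohibited_crossrefs(commands, relocations):
    """Return the relocations that violate a NOCROSSREFS-style command.

    commands: list of (sections, to_first) where `sections` is a tuple of
        output section names and `to_first` is True for NOCROSSREFS_TO.
    relocations: list of (from_section, to_section, symbol).
    """
    violations = []
    for src, dst, symbol in relocations:
        if src == dst:
            continue
        for sections, to_first in commands:
            if src not in sections or dst not in sections:
                continue
            if to_first and dst != sections[0]:
                # NOCROSSREFS_TO only protects the first section.
                continue
            violations.append((src, dst, symbol))
            break
    return violations
```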
Some parse code is adapted from #95714.
Close #41825
Pull Request: https://github.com/llvm/llvm-project/pull/98773
This patch improves GNU ld compatibility.
Close #87891: Support `OUTPUT_FORMAT(binary)`, which is like
--oformat=binary. --oformat=binary takes precedence over an ELF
`OUTPUT_FORMAT`.
In addition, if more than one OUTPUT_FORMAT command is specified, only
check the first one.
Pull Request: https://github.com/llvm/llvm-project/pull/98837
The current default, build-id=fast, is only 8 bytes due to the usage of
64-bit XXH3. This is incompatible with RPM packaging tools, which
require >=16 bytes [1].
In Clang the ENABLE_LINKER_BUILD_ID define makes it pass --build-id
without a specific hash type. When also defaulting to LLD, this provides
a pretty broken default out-of-box.
Using XXH3 was a considerable performance advantage when build-id was
first implemented, because sha1 was really sha1 and rather slow.
Nowadays sha1 is just 160-bit BLAKE3 which is decently fast and not
cryptographically broken, so it should be a good default.
Note that the default remains "fast" for wasm because sha1 for wasm is
still real sha1.
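The size argument comes down to simple arithmetic on the hash widths quoted above. The table and function names below are illustrative only.

```python
# Hash widths in bits, as stated in the text.
BUILD_ID_BITS = {"fast": 64, "sha1": 160}
RPM_MIN_BYTES = 16


def build_id_bytes(style):
    """Size of the build ID in bytes for the given --build-id style."""
    return BUILD_ID_BITS[style] // 8


def accepted_by_rpm(style):
    """True if the build ID is long enough for the RPM tooling."""
    return build_id_bytes(style) >= RPM_MIN_BYTES
```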
Close https://github.com/llvm/llvm-project/issues/43483.
[1]:
b7d427728b/build/files.c (L1883)
GNU ld's relocatable linking behaviors:
* Sections with the `SHF_GROUP` flag are handled like sections matched
by the `--unique=pattern` option. They are processed like orphan
sections and ignored by input section descriptions.
* Section groups' (usually named `.group`) content is updated as the
section indexes are updated. Section groups can be discarded with
`/DISCARD/ : { *(.group) }`.
`-r --force-group-allocation` discards section groups and allows
sections with the `SHF_GROUP` flag to be matched like normal sections.
If two section group members are placed into the same output section,
their relocation sections (if present) are combined as well.
This behavior can be useful when -r output is used as a pseudo shared
object (e.g., FreeBSD's amd64 kernel modules, CHERIoT compartments).
This patch implements --force-group-allocation:
* Input SHT_GROUP sections are discarded.
* Input sections do not get the SHF_GROUP flag, so `addInputSec`
will combine relocation sections if their relocated section group
members are combined.
The default behavior is:
* Input SHT_GROUP sections are retained.
* Input SHF_GROUP sections can be matched (unlike GNU ld)
* Input SHF_GROUP sections keep the SHF_GROUP flag, so `addInputSec`
will create different OutputDesc copies.
GNU ld provides the `FORCE_GROUP_ALLOCATION` command, which is not
implemented.
Pull Request: https://github.com/llvm/llvm-project/pull/94704
When enabled, input sections that would otherwise overflow a memory
region are instead spilled to the next matching output section.
This feature parallels the one in GNU LD, but there are some differences
from its documented behavior:
- /DISCARD/ only matches previously-unmatched sections (i.e., the flag
does not affect it).
- If a section fails to fit at any of its matches, the link fails
instead of discarding the section.
- The flag --enable-non-contiguous-regions-warnings is not implemented,
as it exists to warn about such occurrences.
The implementation places stubs at possible spill locations, and
replaces them with the original input section when effecting spills.
Spilling decisions occur after address assignment. Sections are spilled
in reverse order of assignment, with each spill naively decreasing the
size of the affected memory regions. This continues until the memory
regions are brought back under size. Spilling anything causes another
pass of address assignment, and this continues to fixed point.
Spilling after rather than during assignment allows the algorithm to
consider the size effects of unspillable input sections that appear
later in the assignment. Otherwise, such sections (e.g. thunks) may
force an overflow, even if spilling something earlier could have avoided
it.
A few notable feature interactions occur:
- Stubs affect alignment, ONLY_IF_RO, etc, broadly as if a copy of the
input section were actually placed there.
- SHF_MERGE synthetic sections use the spill list of their first
contained input section (the one that gives the section its name).
- ICF occurs oblivious to spill sections; spill lists for merged-away
sections become inert and are removed after assignment.
- SHF_LINK_ORDER and .ARM.exidx are ordered according to the final
section ordering, after all spilling has completed.
- INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.
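The spill loop can be approximated with a much simpler model in which each input section has an ordered list of memory regions it could land in. This is a sketch of the iteration scheme only (assign, spill in reverse order, repeat until nothing changes); stubs, alignment and output sections are not modeled, and every name is hypothetical.

```python
def assign_with_spills(sections, capacity):
    """Place sections into regions, spilling on overflow.

    sections: list of (name, size, regions) in assignment order, where
        `regions` lists candidate memory regions in match order.
    capacity: dict mapping region name to its size limit.
    """
    choice = {name: 0 for name, _, _ in sections}
    while True:
        # "Address assignment": total up the sizes placed in each region.
        used = dict.fromkeys(capacity, 0)
        for name, size, regions in sections:
            used[regions[choice[name]]] += size
        spilled = False
        # Spill in reverse order of assignment, naively shrinking the
        # overflowing region as sections move to their next match.
        for name, size, regions in reversed(sections):
            region = regions[choice[name]]
            has_next = choice[name] + 1 < len(regions)
            if used[region] > capacity[region] and has_next:
                used[region] -= size
                choice[name] += 1
                spilled = True
        if not spilled:
            break  # fixed point reached
    for region, total in used.items():
        if total > capacity[region]:
            # Nothing fits anywhere: fail instead of discarding.
            raise ValueError(f"region {region} overflowed by {total - capacity[region]}")
    return {name: regions[choice[name]] for name, _, regions in sections}
```

With two 60-byte sections that both match `ram1` then `ram2`, and 100 bytes per region, the later section moves to `ram2` while the earlier one stays put.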
zstd excels at scaling from low-ratio-very-fast to
high-ratio-pretty-slow. Some users prioritize speed and prefer disk read
speed, while others focus on achieving the highest compression ratio
possible, similar to traditional high-ratio codecs like LZMA.
Add an optional `level` to `--compress-sections` (#84855) to cater to
these diverse needs. While we initially aimed for a one-size-fits-all
approach, this no longer seems to work.
(https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html)
When --compress-debug-sections is used together, make
--compress-sections take precedence since --compress-sections is usually
more specific.
Remove the level distinction between -O/-O1 and -O2 for
--compress-debug-sections=zlib for a more consistent user experience.
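A small model of the option handling may help. The `:level` suffix used here is an assumption about the spelling of the optional level, and letting the last matching glob win is an arbitrary choice of this sketch; the function names are hypothetical.

```python
import fnmatch


def parse_compress_sections(value):
    """Parse '<glob>=<kind>' with an optional ':<level>' suffix (assumed)."""
    glob, sep, spec = value.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in {value!r}")
    kind, _, level = spec.partition(":")
    if kind not in ("none", "zlib", "zstd"):
        raise ValueError(f"unknown compression kind {kind!r}")
    return glob, kind, int(level) if level else None


def compression_for(section, compress_sections, compress_debug_sections="none"):
    """Return (kind, level) for an output section name."""
    result = None
    for value in compress_sections:
        glob, kind, level = parse_compress_sections(value)
        if fnmatch.fnmatchcase(section, glob):
            result = (kind, level)
    if result is not None:
        # --compress-sections is more specific, so it takes precedence.
        return result
    if section.startswith(".debug"):
        return (compress_debug_sections, None)
    return ("none", None)
```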
Pull Request: https://github.com/llvm/llvm-project/pull/90567
`clang -g -gpubnames` (with optional -gsplit-dwarf) creates the
`.debug_names` section ("per-CU" index). By default lld concatenates
input `.debug_names` sections into an output `.debug_names` section.
LLDB can consume the concatenated section but the lookup performance is
not good.
This patch adds --debug-names to create a per-module index by combining
the per-CU indexes into a single index that covers the entire load
module. The produced `.debug_names` is a replacement for `.gdb_index`.
Type units (-fdebug-types-section) are not handled yet.
Co-authored-by: Fangrui Song <i@maskray.me>
--compress-sections <section-glob>=[none|zlib|zstd] is similar to
--compress-debug-sections but applies to broader sections without the
SHF_ALLOC flag. lld will report an error if a SHF_ALLOC section is
matched. An interesting use case is to compress `.strtab`/`.symtab`,
which consume a significant portion of the file size (15.1% for a
release build of Clang).
An older revision is available at https://reviews.llvm.org/D154641 .
This patch focuses on non-allocated sections for safety. Moving
`maybeCompress` the way D154641 did does not handle STT_SECTION symbols
for `-r --compress-debug-sections=zlib` (see
`relocatable-section-symbol.s` from #66804).
Since different output sections may use different compression
algorithms, we need CompressedData::type to generalize
config->compressDebugSections.
GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452
Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674
Pull Request: https://github.com/llvm/llvm-project/pull/84855
The ELF linker transitioned away from archive indexes in
https://reviews.llvm.org/D117284.
This paves the way for supporting `--start-lib`/`--end-lib` (See #77960)
The ELF linker unified library handling with `--start-lib`/`--end-lib` and removed
the ArchiveFile class in https://reviews.llvm.org/D119074.
This adds support for generating Chrome-tracing .json profile traces in
the LLD COFF driver.
Also add the necessary time scopes, so that the profile trace shows in
great detail which tasks are executed.
As an example, this is what we see when linking an Unreal Engine
executable: [screenshot of the resulting trace, not reproduced here]
Close #57618: currently we align the end of PT_GNU_RELRO to a
common-page-size boundary, but do not align the end of the associated
PT_LOAD. This is benign when runtime_page_size >= common-page-size.
However, when runtime_page_size < common-page-size, it is possible that
`alignUp(end(PT_LOAD), page_size) < alignDown(end(PT_GNU_RELRO), page_size)`.
In this case, rtld's mprotect call for PT_GNU_RELRO will apply to
unmapped regions and lead to an error, e.g.
```
error while loading shared libraries: cannot apply additional memory protection after relocation: Cannot allocate memory
```
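The failing condition can be checked with concrete numbers. The addresses below are made up for illustration, and the helper names are hypothetical; page sizes are assumed to be powers of two.

```python
def align_down(addr, align):
    """Round `addr` down to a multiple of `align` (a power of two)."""
    return addr & ~(align - 1)


def align_up(addr, align):
    """Round `addr` up to a multiple of `align` (a power of two)."""
    return align_down(addr + align - 1, align)


def relro_mprotect_hits_unmapped(load_end, relro_end, runtime_page_size):
    """True if mprotect on PT_GNU_RELRO would cover unmapped pages."""
    return align_up(load_end, runtime_page_size) < align_down(relro_end, runtime_page_size)


common_page_size = 0x10000
load_end = 0x20100                                 # end of the PT_LOAD, unaligned
relro_end = align_up(load_end, common_page_size)   # 0x30000

small_pages = relro_mprotect_hits_unmapped(load_end, relro_end, 0x1000)
equal_pages = relro_mprotect_hits_unmapped(load_end, relro_end, 0x10000)
```

With 4 KiB runtime pages the mapping ends at 0x21000 while the RELRO range extends to 0x30000, so the condition holds; with 64 KiB runtime pages both round to 0x30000 and the case is benign.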
To fix the issue, add a padding section .relro_padding like mold, which
is contained in the PT_GNU_RELRO segment and the associated PT_LOAD
segment. The section also prevents strip from corrupting PT_LOAD program
headers.
.relro_padding has the largest `sortRank` among RELRO sections.
Therefore, it is naturally placed at the end of `PT_GNU_RELRO` segment
in the absence of `PHDRS`/`SECTIONS` commands.
In the presence of `SECTIONS` commands, we place .relro_padding
immediately before a symbol assignment using DATA_SEGMENT_RELRO_END (see
also https://reviews.llvm.org/D124656), if present.
DATA_SEGMENT_RELRO_END is changed to align to max-page-size instead of
common-page-size.
Some edge cases worth mentioning:
* ppc64-toc-addis-nop.s: when PHDRS is present, do not append
.relro_padding
* avoid-empty-program-headers.s: when the only RELRO section is .tbss,
it is not part of PT_LOAD segment, therefore we do not append
.relro_padding.
---
Close #65002: GNU ld from 2.39 onwards aligns the end of PT_GNU_RELRO to
a max-page-size boundary (https://sourceware.org/PR28824) so that the
last page is protected even if runtime_page_size > common-page-size.
In my opinion, losing protection for the last page when the runtime page
size is larger than common-page-size is not really an issue. Double
mapping a page of up to max-common-page for the protection could cause
undesired VM waste. Internally we had users complaining about 2MiB
max-page-size applying to shared objects.
Therefore, the end of .relro_padding is padded to a common-page-size
boundary. Users who are really anxious can set common-page-size to match
their runtime page size.
---
17 tests need updating as there are lots of change detectors.
This patch adds support to lld for --fat-lto-objects. We add a new
--fat-lto-objects option to LLD, and slightly change how it chooses input
files in the driver when the option is set.
Fat LTO objects contain both LTO compatible IR and generated object
code. This allows users to defer the choice of whether to use LTO or not to
link-time. This feature has been available in GCC for some time, and this
patch makes the existing -ffat-lto-objects option functional in the same way
as GCC's.
If the --fat-lto-objects option is passed to LLD and the input files are fat
object files, then the linker will choose the LTO compatible bitcode sections
embedded within the fat object and link them together using LTO. Otherwise,
standard object file linking is done using the assembly section in the object
files.
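The input selection described above amounts to a two-way choice per input file. A minimal sketch, with hypothetical names and with the detection of embedded bitcode reduced to a boolean:

```python
def pick_input_kind(fat_lto_objects, has_embedded_bitcode):
    """Choose which part of an input object the link would use.

    fat_lto_objects: True when --fat-lto-objects was passed.
    has_embedded_bitcode: True when the file is a fat object.
    """
    if fat_lto_objects and has_embedded_bitcode:
        return "bitcode"  # link through LTO
    return "object"       # regular object file linking
```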
The previous version of this patch had a missing `REQUIRES: x86` line in
`fatlto.invalid.s`. Additionally, it was reported that this patch caused
a test failure in `export-dynamic-symbols.s`, however,
29112a994694baee070a2021e00f772f1913d214 disabled the
`export-dynamic-symbols.s` test on Windows due to a quotation difference
between platforms, unrelated to this patch.
Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D146778
This adds support for the LoongArch ELF psABI v2.00 [1] relocation
model to LLD. The deprecated stack-machine-based psABI v1 relocs are not
supported.
The code is tested by successfully bootstrapping a Gentoo/LoongArch
stage3, complete with common GNU userland tools and both the LLVM and
GNU toolchains (GNU toolchain is present only for building glibc,
LLVM+Clang+LLD are used for the rest). Large programs like QEMU are
tested to work as well.
[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
Reviewed By: MaskRay, SixWeining
Differential Revision: https://reviews.llvm.org/D138135
This reverts commit c9953d9891a6067549a78e7d07ca8eb6a7596792 and a
forward fix in 3a45b843dec1bca195884aa1c5bc56bd0e6755b4.
D14677 causes some failure on Windows bots that the forward fix did not
address. Thus I'm reverting until the underlying cause can be triaged.