Since Ctx &ctx is a member variable,
1f391a75af8685e6bba89421443d72ac6a186599
7a5b9ef54eb96abd8415fd893576c42e51fd95db
e2f0ec3a3a8a2981be8a1aac2004cfb9064c61e8 can be reverted.
When Branch Target Identification BTI is enabled all indirect branches
must target a BTI instruction. A long branch thunk is a source of
indirect branches. To date LLD has been assuming that the object
producer is responsible for putting a BTI instruction at all places the
linker might generate an indirect branch to. This is true for clang, but
not for GCC. GCC will elide the BTI instruction when it can prove that
there are no indirect branches from outside the translation unit(s). GNU
ld was fixed to generate a landing pad stub (gnu ld speak for thunk) for
the destination when a long range stub was needed [1].
This means that using GCC compiled objects with LLD may lead to LLD
generating an indirect branch to a location without a BTI. The ABI [2]
has also been clarified to say that it is a static linker's
responsibility to generate a landing pad when the target does not have a
BTI.
This patch implements the same mechansim as GNU ld. When the output ELF
file is setting the
GNU_PROPERTY_AARCH64_FEATURE_1_BTI property, then we check the
destination to see if it has a BTI instruction. If it does not we
generate a landing pad consisting of:
BTI c
B <destination>
The B <destination> can be elided if the thunk can be placed so that
control flow drops through. For example:
BTI c
<destination>:
This will be common when -ffunction-sections is used.
The landing pad thunks are effectively alternative entry points for the
function. Direct branches are unaffected but any linker generated
indirect branch needs to use the alternative. We place these as close as
possible to the destination section.
There is some further optimization possible. Consider the case:
.text
fn1
...
fn2
...
If we need landing pad thunks for both fn1 and fn2 we could order them
so that the thunk for fn1 immediately precedes fn1. This could save a
single branch. However I didn't think that would be worth the additional
complexity.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671
[2] https://github.com/ARM-software/abi-aa/issues/196
Remove the global variable `symtab` and add a member variable
(`std::unique_ptr<SymbolTable>`) to `Ctx` instead.
This is one step toward eliminating global states.
Pull Request: https://github.com/llvm/llvm-project/pull/109612
Ctx was introduced in March 2022 as a more suitable place for such
singletons.
llvm/Support/thread.h includes <thread>, which transitively includes
sstream in libc++ and uses ios_base::in, so we cannot use `#define in ctx.sec`.
`symtab, config, ctx` are now the only variables using
LLVM_LIBRARY_VISIBILITY.
Ctx was introduced in March 2022 as a more suitable place for such
singletons.
We now use default-initialization for `LinkerScript` and should pay
attention to non-class types (e.g. `dot` is initialized by commit
503907dc505db1e439e7061113bf84dd105f2e35).
Follow-up to #98115. For EhInputSection, RelocationScanner::scan calls
sortRels, which doesn't support the CREL iterator. We should set
supportsCrel to false to ensure that the initial_location fields in
.eh_frame FDEs are relocated.
... using the temporary section type code 0x40000020
(`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
code and break compatibility (Clang and lld of different versions are
not guaranteed to cooperate, unlike other features). CREL with implicit
addends are not supported.
---
Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
update users to check `crels`.
(The decoding performance is critical and error checking is difficult.
Follow `skipLeb` and `R_*LEB128` handling, do not use
`llvm::decodeULEB128`, whichs compiles to a lot of code.)
A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
`/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
convert CREL to RELA (`relas` instead of `crels` will be used). Since
allocating a buffer increases, the conversion is only performed when
absolutely necessary.
---
Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
links. SHT_CREL and SHT_RELA components need reencoding since
r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
relocations referencing a symbol in a discarded section are converted to
`R_*_NONE`).
* SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
* SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
* SHT_REL components: print an error for now.
SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
unsupported yet.
Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600
Pull Request: https://github.com/llvm/llvm-project/pull/98115
Implement the two commands described by
https://sourceware.org/binutils/docs/ld/Miscellaneous-Commands.html
After `outputSections` is available, check each output section described
by at least one `NOCROSSREFS`/`NOCROSSERFS_TO` command. For each checked
output section, scan relocations from its input sections.
This step is slow, therefore utilize `parallelForEach(isd->sections, ...)`.
To support non SHF_ALLOC sections, `InputSectionBase::relocations`
(empty) cannot be used. In addition, we may explore eliminating this
member to speed up relocation scanning.
Some parse code is adapted from #95714.
Close#41825
Pull Request: https://github.com/llvm/llvm-project/pull/98773
See the comment in handleTlsRelocation. For TLSDESC=>IE (the TLS symbol
is defined in another DSO), R_RISCV_TLSDESC_{LOAD_LO12,ADD_LO12_I,CALL}
referencing a non-preemptible label uses the `R_RELAX_TLS_GD_TO_LE` code
path.
If there is no TLS section, `getTlsTpOffset` will be called with null
`Out::tlsPhdr`, leading to a null pointer dereference. Since the return
value is used by `RISCV::relocateAlloc` and ignored there, just return
0.
LoongArch TLSDESC doesn't use STT_NOTYPE labels. The `if (..) return 0;`
is a no-op for LoongArch.
This patch is a follow-up to #79239 and fixes some comments.
Pull Request: https://github.com/llvm/llvm-project/pull/98569
`IsRela` is used by lld to differentiate REL and RELA static
relocations. The proposed CREL patch will reuse `IsRela` for CREL
(#91280). Rename `IsRela` to be more appropriate.
Pull Request: https://github.com/llvm/llvm-project/pull/96592
LoongArch does not yet implement transition from TLSDESC to LE/IE,
so TLSDESC dynamic relocation needs to be generated for each desc,
which is ultimately handled by the dynamic linker.
The test cases reference RISC-V: #79239
Reviewed By: MaskRay, SixWeining
Pull Request: https://github.com/llvm/llvm-project/pull/94451
After inlining, `scanSection` is significantly longer (more than 100+
instructions on x86-64 built with Clang) when `i` does not always
increment by one (MIPS).
Current Bionic processes relocations in this order:
* DT_ANDROID_REL[A]
* DT_RELR
* DT_REL[A]
* DT_JMPREL
If an IRELATIVE relocation is in DT_ANDROID_REL[A], it would read
unrelocated (incorrect) global variables associated with RELR when
--pack-dyn-relocs=android+relr is enabled. Work around this by placing
IRELATIVE in .rel[a].plt (DT_JMPREL).
Link: https://r.android.com/3014185
`relaIplt` was added so that IRELATIVE relocations are placed at the end
of .rela.dyn (since https://reviews.llvm.org/D65651) or .rela.plt
(--pack-dyn-relocs=android[+relr]). Unfortunately, handling `relaIplt`
requires special cases all over the code base. We can extend
partitionRels/computeRels to partition both RELATIVE and IRELATIVE
relocations, rendering `relaIplt` unneeded.
The change allows IRELATIVE relocations in the DT_ANDROID_REL[A] table
(untested?!), which may be processed before other types of relocations.
This seems acceptable for Bionic's DEFINE_IFUNC_FOR use cases.
In addition, this change simplies changing .rel[a].dyn to a compact
relocation format (CREL).
SHF_INFO_LINK is removed from .rel[a].dyn with IRELATIVE relocations.
(See https://reviews.llvm.org/D89828).
When adding fixups for RISCV_TLSDESC_ADD_LO and RISCV_TLSDESC_LOAD_LO,
the local label added for RISCV TLSDESC relocations have STT_TLS set,
which is incorrect. Instead, these labels should have `STT_NOTYPE`.
This patch stops adding such fixups and avoid setting the STT_TLS on
these symbols. Failing to do so can cause LLD to emit an error `has an
STT_TLS symbol but doesn't have an SHF_TLS section`. We additionally,
adjust how LLD services these relocations to avoid errors with
incompatible relocation and symbol types.
Reviewers: topperc, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/85817
With the new SystemZ port we noticed that -pie executables generated
from files containing R_390_TLS_IEENT relocations will have unnecessary
relocations in their GOT:
9e8d8: R_390_TLS_TPOFF *ABS*+0x18
This is caused by the config->isPic conditon in addTpOffsetGotEntry:
static void addTpOffsetGotEntry(Symbol &sym) {
in.got->addEntry(sym);
uint64_t off = sym.getGotOffset();
if (!sym.isPreemptible && !config->isPic) {
in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
return;
}
It is correct that we need to retain a TPOFF relocation if the target
symbol is preemptible or if we're building a shared library. But when
building a -pie executable, those values are fixed at link time and
there's no need for any remaining dynamic relocation.
Note that the equivalent MIPS-specific code in MipsGotSection::build
checks for config->shared instead of config->isPic; we should use the
same check here. (Note also that on many other platforms we're not even
using addTpOffsetGotEntry in this case as an IE->LE relaxation is
applied before; we don't have this type of relaxation on SystemZ.)
This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.
In addition to new platform code and the obvious changes, there were a
few additional changes to common code:
- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is well possible that other
platforms will need the same.
- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.
This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.
Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
Support
R_RISCV_TLSDESC_HI20/R_RISCV_TLSDESC_LOAD_LO12/R_RISCV_TLSDESC_ADD_LO12/R_RISCV_TLSDESC_CALL.
LOAD_LO12/ADD_LO12/CALL relocations reference a label at the HI20
location, which requires special handling. We save the value of HI20 to
be reused. Two interleaved TLSDESC code sequences, which compilers do
not generate, are unsupported.
For -no-pie/-pie links, TLSDESC to initial-exec or local-exec
optimizations are eligible. Implement the relevant hooks
(R_RELAX_TLS_GD_TO_LE, R_RELAX_TLS_GD_TO_IE): the first two instructions
are converted to NOP while the latter two are converted to a GOT load or
a lui+addi.
The first two instructions, which would be converted to NOP, are removed
instead in the presence of relaxation. Relaxation is eligible as long as
the R_RISCV_TLSDESC_HI20 relocation has a pairing R_RISCV_RELAX,
regardless of whether the following instructions have a R_RISCV_RELAX.
In addition, for the TLSDESC to LE optimization (`lui a0,<hi20>; addi a0,a0,<lo12>`),
`lui` can be removed (i.e. use the short form) if hi20 is 0.
```
// TLSDESC to LE/IE optimization
.Ltlsdesc_hi2:
auipc a4, %tlsdesc_hi(c) # if relax: remove; otherwise, NOP
load a5, %tlsdesc_load_lo(.Ltlsdesc_hi2)(a4) # if relax: remove; otherwise, NOP
addi a0, a4, %tlsdesc_add_lo(.Ltlsdesc_hi2) # if LE && !hi20 {if relax: remove; otherwise, NOP}
jalr t0, 0(a5), %tlsdesc_call(.Ltlsdesc_hi2)
add a0, a0, tp
```
The implementation carefully ensures that an instruction unrelated to
the current TLSDESC code sequence, if immediately follows a removable
instruction (HI20 or LOAD_LO12 OR (LE-specific) ADD_LO12), is not
converted to NOP.
* `riscv64-tlsdesc.s` is inspired by `i386-tlsdesc-gd.s` (https://reviews.llvm.org/D112582).
* `riscv64-tlsdesc-relax.s` tests linker relaxation.
* `riscv-tlsdesc-gd-mixed.s` is inspired by `x86-64-tlsdesc-gd-mixed.s` (https://reviews.llvm.org/D116900).
Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373
Reviewed By: ilovepi
Pull Request: https://github.com/llvm/llvm-project/pull/79239
Based on https://reviews.llvm.org/D45375 . Introduce a new InputFile
kind `InternalKind`, use it for
* `ctx.internalFile`: for linker-defined symbols and some synthesized
`Undefined`
* `createInternalFile`: for symbol assignments and --defsym
I picked "internal" instead of "synthetic" to avoid confusion with
SyntheticSection.
Currently a symbol's file is one of: nullptr, ObjKind, SharedKind,
BitcodeKind, BinaryKind. Now it's non-null (I plan to add an
`assert(file)` to Symbol::Symbol and change `toString(const InputFile
*)`
separately).
Debugging and error reporting gets improved. The immediate user-facing
difference is more descriptive "File" column in the --cref output. This
patch may unlock further simplification.
Currently each symbol assignment gets its own
`createInternalFile(cmd->location)`. Two symbol assignments in a linker
script do not share the same file. Making the file the same would be
nice, but would require non trivial code.