292 Commits

Author SHA1 Message Date
John Ericson
d7fd8b19e5
[LLD] Extend special OpenBSD support, but scope under ELFOSABI (#97122)
- Add support for `.openbsd.mutable`

  (rebaser's note) adapted from:

  bd249b5664
  New auto-coalescing sections removed

  In the linkers, collect objects in section "openbsd.mutable" and place
  them into a page-aligned region in the bss, with the right markers for
  kernel/ld.so to identify the region and skip making it immutable. While
  here, fix readelf/objdump versions to show all of this. ok miod kettenis

- Add support for `.openbsd.syscalls`

  (rebaser's note) adapted from:

  42a61acefa

  Collect .openbsd.syscalls sections into a new PT_OPENBSD_SYSCALLS
  segment. This will be used soon to pin system calls to designated call
  sites.

  ok deraadt@

- Scope OpenBSD special section handling under that ELFOSABI

  As a preexisting comment in `ELF/Writer.cpp` says:

  > section names shouldn't be significant in ELF in spirit.

  so scoping OSABI-specific magic name hacks to just the OSABI in
  question limits the degree to which we deviate from that "spirit" for
  all other OSABIs.

  OpenBSD in particular is very fast moving, having added a number of
  special sections, etc. in recent years. It is unclear how possible /
  reasonable it is for upstream to implement all these features in any
  event, but scoping like this at least mitigates the fallout for other
  OSABIs that wish to be more slow-moving.

Co-authored-by: deraadt <deraadt@openbsd.org>
2024-07-12 14:34:17 -04:00
Fangrui Song
ee4c12f87d
[ELF] Postpone more linker script errors
Since `assignAddresses` is executed more than once, error reporting
during `assignAddresses` would be duplicated. Generalize #66854 to cover
more errors.

Note: address-related errors exposed in one invocation might not be
errors in another invocation.
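
A minimal sketch of the idea, not lld's actual code: a pass that runs several times records its errors instead of printing them, clears the record at the start of each run, and leaves only the final run's errors to be reported. The names `Layout`, `recordedErrors`, and `layoutAndCollect`, and the overflow check itself, are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a layout step that may run several times.
struct Layout {
  std::vector<std::string> recordedErrors; // errors from the latest pass only

  void assignAddresses(uint64_t regionSize, uint64_t used) {
    // An error seen in an earlier pass may not apply to this one.
    recordedErrors.clear();
    if (used > regionSize)
      recordedErrors.push_back("region overflowed by " +
                               std::to_string(used - regionSize) + " bytes");
  }
};

// Run one pass per entry in `passes` and return what would be reported
// after the final pass.
std::vector<std::string> layoutAndCollect(uint64_t regionSize,
                                          const std::vector<uint64_t> &passes) {
  assert(!passes.empty() && "expected at least one pass");
  Layout layout;
  for (uint64_t used : passes)
    layout.assignAddresses(regionSize, used);
  return layout.recordedErrors;
}
```

In this sketch an overflow seen only in the first pass is never reported, and an overflow seen in every pass is reported once.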

Pull Request: https://github.com/llvm/llvm-project/pull/96361
2024-06-24 10:15:28 -07:00
Parth Arora
ebb326a51f [ELF] Fix unnecessary inclusion of unreferenced provide symbols
Previously, the linker unnecessarily included a PROVIDE symbol that was
referenced only by another, unused PROVIDE symbol. For example, if a
linker script contained the code below and the 'not_used_sym' PROVIDE
symbol was not included, the linker still included the 'foo' PROVIDE
symbol because it was referenced by 'not_used_sym'.

PROVIDE(not_used_sym = foo)
PROVIDE(foo = 0x1000)

This commit fixes this behavior by using a DFS-like algorithm to find
all the symbols referenced in provide expressions of included provide
symbols.
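
The traversal could look roughly like the sketch below. It is an illustration under assumptions, not the code from this commit: `ProvideMap` and `neededProvides` are hypothetical names, and each PROVIDE symbol is modeled simply as the list of symbols its expression references.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: PROVIDE(not_used_sym = foo) becomes
// {"not_used_sym", {"foo"}}; PROVIDE(foo = 0x1000) becomes {"foo", {}}.
using ProvideMap = std::map<std::string, std::vector<std::string>>;

// Return the PROVIDE symbols that need to be defined, starting from the
// symbols that input files actually reference.
std::set<std::string> neededProvides(const ProvideMap &provides,
                                     const std::vector<std::string> &referenced) {
  std::set<std::string> needed;
  std::vector<std::string> stack(referenced.begin(), referenced.end());
  while (!stack.empty()) {
    std::string sym = stack.back();
    stack.pop_back();
    auto it = provides.find(sym);
    // Skip symbols that are not PROVIDE symbols or were already visited.
    if (it == provides.end() || !needed.insert(sym).second)
      continue;
    // A needed PROVIDE symbol pulls in whatever its expression references.
    for (const std::string &ref : it->second)
      stack.push_back(ref);
  }
  return needed;
}

// Usage: with the two-line script above and no references from input
// files, neither symbol is needed.
inline void example() {
  ProvideMap provides;
  provides["not_used_sym"] = {"foo"};
  provides["foo"] = {};
  assert(neededProvides(provides, {}).empty());
}
```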

This commit also fixes the issue of an unused section not being
garbage-collected if a symbol of the section is referenced by an unused
PROVIDE symbol.

Closes #74771
Closes #84730

Co-authored-by: Fangrui Song <i@maskray.me>
2024-03-25 16:11:21 -07:00
Fangrui Song
551e20d190
[ELF] Reject error-prone meta characters in input section description
The lexer is overly permissive. When parsing file patterns in an input
section description and there is a missing `)`, we would accept many
non-sensible tokens (e.g. `}`) as patterns, leading to confusion, e.g.
`*(SORT_BY_ALIGNMENT(SORT_BY_NAME(.text*)) } PROVIDE_HIDDEN(__code_end = .)`
(#81804).

Ideally, the lexer should be stateful to report more errors like GNU ld
and get rid of hacks like `ScriptLexer::maybeSplitExpr`, but that would
require a large rewrite of the lexer. For now, just reject certain
non-wildcard meta characters to detect common mistakes.

Pull Request: https://github.com/llvm/llvm-project/pull/84130
2024-03-06 17:19:59 -08:00
Ulrich Weigand
fe3406e349
[lld] Add target support for SystemZ (s390x) (#75643)
This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.

In addition to new platform code and the obvious changes, there were a
few additional changes to common code:

- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is well possible that other
platforms will need the same.

- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.

This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.

Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
2024-02-13 11:29:21 +01:00
Fangrui Song
43b13341fb
[ELF] Add internal InputFile (#78944)
Based on https://reviews.llvm.org/D45375 . Introduce a new InputFile
kind `InternalKind`, use it for

* `ctx.internalFile`: for linker-defined symbols and some synthesized
`Undefined`
* `createInternalFile`: for symbol assignments and --defsym

I picked "internal" instead of "synthetic" to avoid confusion with
SyntheticSection.

Currently a symbol's file is one of: nullptr, ObjKind, SharedKind,
BitcodeKind, BinaryKind. Now it's non-null (I plan to add an
`assert(file)` to Symbol::Symbol and change `toString(const InputFile *)`
separately).

Debugging and error reporting are improved. The immediate user-facing
difference is a more descriptive "File" column in the --cref output. This
patch may unlock further simplification.

Currently each symbol assignment gets its own
`createInternalFile(cmd->location)`. Two symbol assignments in a linker
script do not share the same file. Making the file the same would be
nice, but would require non-trivial code.
2024-01-22 09:09:46 -08:00
Fangrui Song
7c89b20e02 [ELF] OVERLAY: support optional start address and LMA
https://reviews.llvm.org/D44780 implemented rudimentary support for
OVERLAY. The start address and `AT(ldaddr)` in `OVERLAY [start] :
[NOCROSSREFS] [AT ( ldaddr )]` are not optional.

In addition, there are two issues:

* When the start address is `.`, subsequent sections don't share the
  address of the first overlay section.
* When the first overlay section is empty and discardable, `p_paddr` is
  incorrectly zero. This is because a discarded section has a zero
  address, causing `prev->getLMA() + prev->size` where `prev` refers to
  the first section to evaluate to zero.

This patch supports optional start address and LMA and fixes the issues.
Close #77265

Pull Request: https://github.com/llvm/llvm-project/pull/77272
2024-01-08 16:12:49 -08:00
Fangrui Song
1bd5df7af6 [ELF] Correct a comment about ^=. NFC
GNU ld added ^= support in July 2023.
2023-09-15 17:52:48 -07:00
Fangrui Song
5a58e98c20
[ELF] Align the end of PT_GNU_RELRO associated PT_LOAD to a common-page-size boundary (#66042)
Close #57618: currently we align the end of PT_GNU_RELRO to a
common-page-size boundary, but do not align the end of the associated
PT_LOAD. This is benign when runtime_page_size >= common-page-size.

However, when runtime_page_size < common-page-size, it is possible that
`alignUp(end(PT_LOAD), page_size) < alignDown(end(PT_GNU_RELRO), page_size)`.
In this case, rtld's mprotect call for PT_GNU_RELRO will apply to unmapped
regions and lead to an error, e.g.

```
error while loading shared libraries: cannot apply additional memory protection after relocation: Cannot allocate memory
```
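
The inequality above can be checked with concrete numbers. The sketch below is only an illustration: the helper names are hypothetical and the addresses are made-up values, chosen so that PT_GNU_RELRO ends on a 64 KiB boundary while PT_LOAD ends earlier.

```cpp
#include <cassert>
#include <cstdint>

static bool isPowerOf2(uint64_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Round v up or down to a multiple of align (a power of two).
uint64_t alignUp(uint64_t v, uint64_t align) {
  assert(isPowerOf2(align));
  return (v + align - 1) & ~(align - 1);
}
uint64_t alignDown(uint64_t v, uint64_t align) {
  assert(isPowerOf2(align));
  return v & ~(align - 1);
}

// True if the page-aligned end of PT_GNU_RELRO lies beyond the last page
// that the associated PT_LOAD maps at the given runtime page size.
bool relroTouchesUnmappedPages(uint64_t loadEnd, uint64_t relroEnd,
                               uint64_t runtimePageSize) {
  return alignUp(loadEnd, runtimePageSize) <
         alignDown(relroEnd, runtimePageSize);
}
```

Assuming a common-page-size of 64 KiB, a PT_LOAD ending at 0x20100, and a PT_GNU_RELRO ending at 0x30000, `relroTouchesUnmappedPages(0x20100, 0x30000, 0x1000)` is true for 4 KiB runtime pages, while the same call with 0x10000 (64 KiB runtime pages) is false.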

To fix the issue, add a padding section .relro_padding like mold, which
is contained in the PT_GNU_RELRO segment and the associated PT_LOAD
segment. The section also prevents strip from corrupting PT_LOAD program
headers.

.relro_padding has the largest `sortRank` among RELRO sections.
Therefore, it is naturally placed at the end of `PT_GNU_RELRO` segment
in the absence of `PHDRS`/`SECTIONS` commands.

In the presence of `SECTIONS` commands, we place .relro_padding
immediately before a symbol assignment using DATA_SEGMENT_RELRO_END (see
also https://reviews.llvm.org/D124656), if present.
DATA_SEGMENT_RELRO_END is changed to align to max-page-size instead of
common-page-size.

Some edge cases worth mentioning:

* ppc64-toc-addis-nop.s: when PHDRS is present, do not append
  .relro_padding
* avoid-empty-program-headers.s: when the only RELRO section is .tbss,
  it is not part of PT_LOAD segment, therefore we do not append
  .relro_padding.

---

Close #65002: GNU ld from 2.39 onwards aligns the end of PT_GNU_RELRO to
a max-page-size boundary (https://sourceware.org/PR28824) so that the last
page is protected even if runtime_page_size > common-page-size.

In my opinion, losing protection for the last page when the runtime page
size is larger than common-page-size is not really an issue. Double mapping
a page of up to max-common-page for the protection could cause undesired VM
waste. Internally we had users complaining about 2MiB max-page-size applying
to shared objects.

Therefore, the end of .relro_padding is padded to a common-page-size
boundary. Users who are really anxious can set common-page-size to match
their runtime page size.

---

17 tests need updating as there are lots of change detectors.
2023-09-14 10:33:11 -07:00
Fangrui Song
65a15a56d5 [ELF] Respect orders of symbol assignments and DEFINED (#65866)
Fix #64600: the current implementation is minimal (see
https://reviews.llvm.org/D83758), and an assignment like
`__TEXT_REGION_ORIGIN__ = DEFINED(__TEXT_REGION_ORIGIN__) ? __TEXT_REGION_ORIGIN__ : 0;`
(used by avr-ld[1]) leads to a value of zero (default value in `declareSymbol`),
which is unexpected.

Assign orders to symbol assignments and references so that
for a script-defined symbol, the `DEFINED` results match users'
expectation. I am unclear about GNU ld's exact behavior, but this hopefully
matches its behavior in the majority of cases.

[1]: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/scripttempl/avr.sc
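
One way to picture the ordering rule is the sketch below. It is a simplified model under assumptions, not the data structures used in lld: `ScriptState`, `recordAssignment`, and `defined` are hypothetical names, and the model assumes that a `DEFINED` on the right-hand side is ordered before the assignment it appears in.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model of a linker script processed from top to bottom.
struct ScriptState {
  unsigned counter = 0;                        // next order number to hand out
  std::map<std::string, unsigned> assignOrder; // order of the first assignment
  std::set<std::string> definedByInputs;       // symbols defined in input files

  // Called for `name = expr;` after its right-hand side has been handled.
  unsigned recordAssignment(const std::string &name) {
    unsigned order = ++counter;
    assignOrder.emplace(name, order); // emplace keeps the earliest order
    return order;
  }

  // Called for DEFINED(name); a script-defined symbol only counts if its
  // assignment was ordered earlier than this reference.
  bool defined(const std::string &name) {
    unsigned order = ++counter;
    if (definedByInputs.count(name))
      return true;
    auto it = assignOrder.find(name);
    return it != assignOrder.end() && it->second < order;
  }
};

// Usage: models `origin = DEFINED(origin) ? origin : 0;` followed by a later
// DEFINED(origin), with no input file defining `origin`.
inline void example() {
  ScriptState s;
  bool before = s.defined("origin");
  s.recordAssignment("origin");
  bool after = s.defined("origin");
  assert(!before && after);
}
```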
2023-09-11 10:54:49 -07:00
WANG Xuerui
6084ee7420 [lld][ELF] Support LoongArch
This adds support for the LoongArch ELF psABI v2.00 [1] relocation
model to LLD. The deprecated stack-machine-based psABI v1 relocs are not
supported.

The code is tested by successfully bootstrapping a Gentoo/LoongArch
stage3, complete with common GNU userland tools and both the LLVM and
GNU toolchains (GNU toolchain is present only for building glibc,
LLVM+Clang+LLD are used for the rest). Large programs like QEMU are
tested to work as well.

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html

Reviewed By: MaskRay, SixWeining

Differential Revision: https://reviews.llvm.org/D138135
2023-07-25 17:06:07 +08:00
Fangrui Song
fae96104d4 [ELF] Support operator ^ and ^=
GNU ld added ^ support in July 2023 and it looks like ^= is planned as
well.

For now, we don't support `a^=0` (^= without a preceding space).
2023-07-15 14:10:40 -07:00
Fangrui Song
49dfbc6efc [ELF] Remove one unneeded unquote from D124266
This one is unneeded after commit d60ef9338deb734541ff1c9d0771807815d5d9e6 (2023-02-03).
2023-07-05 15:08:53 -07:00
Roger Pau Monne
7cab385a8f [lld/elf] support quote usage in section names
Section names used in ELF linker scripts can be quoted, but such
quotes must not be propagated to the binary ELF section names.  As
such strip the quotes from the section names when processing them, and
also strip them from linker script functions that take section names
as parameters.
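
As a rough illustration of the stripping step, assuming a name is quoted only when it both starts and ends with a double quote (the helper below is hypothetical and ignores escape sequences):

```cpp
#include <cassert>
#include <string>

// Strip one pair of surrounding double quotes, if present.
std::string unquote(const std::string &s) {
  if (s.size() >= 2 && s.front() == '"' && s.back() == '"')
    return s.substr(1, s.size() - 2);
  return s;
}

// Usage: a quoted script name and its unquoted form yield the same
// output section name.
inline void example() {
  assert(unquote("\".data.custom\"") == unquote(".data.custom"));
}
```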

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D124266
2023-07-05 14:56:16 -07:00
Fangrui Song
daba24ee7b [ELF] << >>: make RHS less than 64
The left/right shift linker script operators may trigger UB.
E.g. in linkerscript/end-overflow-check.test, the initial REGION1__PADDED_SR_SHIFT is
uint64_t(-3), causing the following expression to trigger an out-of-range shift in
a ubsan build of lld.

    REGION1__PADDED_SR_SIZE = MAX(1 << REGION1__PADDED_SR_SHIFT, 32);

Protect such UBs by making RHS less than 64.
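
A minimal sketch of one way to keep the right-hand side below 64, assuming the shift count is reduced modulo 64 (the message above does not say which method is used; `shiftLeft` and `shiftRight` are hypothetical helpers):

```cpp
#include <cassert>
#include <cstdint>

// Shifting a 64-bit value by 64 or more is undefined behavior in C++, so
// the count is reduced modulo 64 before shifting.
uint64_t shiftLeft(uint64_t lhs, uint64_t rhs) { return lhs << (rhs % 64); }
uint64_t shiftRight(uint64_t lhs, uint64_t rhs) { return lhs >> (rhs % 64); }

// Usage: uint64_t(-3) is 2^64 - 3, which is 61 modulo 64.
inline void example() {
  assert(shiftLeft(1, uint64_t(-3)) == (uint64_t(1) << 61));
}
```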
2023-06-15 10:34:33 -07:00
Fangrui Song
8d85c96e0e [lld] StringRef::{starts,ends}with => {starts,ends}_with. NFC
The latter form is now preferred to be similar to C++20 starts_with.
This replacement also removes one function call when startswith is not inlined.
2023-06-05 14:36:19 -07:00
Kazu Hirata
ed1539c6ad Migrate {starts,ends}with_insensitive to {starts,ends}_with_insensitive (NFC)
This patch migrates uses of StringRef::{starts,ends}with_insensitive
to StringRef::{starts,ends}_with_insensitive so that we can use names
similar to those used in std::string_view.

Note that the llvm/ directory has migrated in commit
6c3ea866e93003e16fc55d3b5cedd3bc371d1fde.

I'll post a separate patch to deprecate
StringRef::{starts,ends}with_insensitive.

Differential Revision: https://reviews.llvm.org/D150506
2023-05-16 10:12:42 -07:00
Peter Smith
e16af8a281 [LLD][ELF] Add missing program header parsing to OVERLAY
In D72756 the change to add INPUT_SECTION_FLAGS inadvertently
removed the line to parse the program header assignment information for
OutputSections within an OVERLAY.

This change adds back the missing line and adds a test for it.

Differential Revision: https://reviews.llvm.org/D150445
2023-05-15 10:04:33 +01:00
Simi Pallipurath
2f68ddc604 [lld][ARM][2/3]Big Endian support - Word invariant support
Changes:
 - Adding BE32 big endian Support for Arm.
 - Replace the writele and readle with their endian-aware versions.
 - Adding test cases for the big-endian be32 arm configuration.

Patch by: Milosz Plichta. This patch also merges all the changes from
https://reviews.llvm.org/D140203.

Reviewed By: peter.smith, MaskRay

Differential Revision: https://reviews.llvm.org/D140202
2023-03-29 10:21:00 +01:00
Justin Cady
447aa48b4a [ELF] Add REVERSE input section description keyword
The `REVERSE` keyword is described here:

https://sourceware.org/bugzilla/show_bug.cgi?id=27565

It complements `SORT` by allowing the order of input sections to be reversed.

This is particularly useful for order-dependent sections such as .init_array,
where `REVERSE` can be used to either detect static initialization order fiasco
issues or as a mechanism to maintain .ctors element order while transitioning to
the modern .init_array. Such a transition is described here:

https://discourse.llvm.org/t/is-it-possible-to-manually-specify-init-array-order/68649

Differential Revision: https://reviews.llvm.org/D145381
2023-03-07 12:44:02 -05:00
Fangrui Song
d60ef9338d [ELF] Support quoted output section names
Similar to e7a7ad134fe182aad190cb3ebc441164470e92f5 and
2bf06d9345caeb26520be8e830c092683bbdf0f7 for other linker script syntax.

Close https://github.com/llvm/llvm-project/issues/60496
2023-02-03 11:03:00 -08:00
Kazu Hirata
c68af42fa8 [lld] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 23:12:36 -08:00
Jan Svoboda
abf0c6c0c0 Use CTAD on llvm::SaveAndRestore
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D139229
2022-12-02 15:36:12 -08:00
Guillaume Chatelet
08e2a76381 [lld][NFC] rename ELF alignment into addralign
2022-12-01 16:20:12 +00:00
Fangrui Song
4191fda69c [ELF] Change most llvm::Optional to std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 19:19:15 -08:00
Fangrui Song
f596d82385 [ELF] Move driver into ctx and remove indirection. NFC
This removes one global variable and removes GOT and unique_ptr indirection.
2022-10-01 15:12:50 -07:00
Fangrui Song
9c626d4a0d [ELF] Remove symtab indirection. NFC
Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr indirection.
2022-10-01 14:46:49 -07:00
Fangrui Song
85cfd91723 [ELF] Optimize some non-constant alignTo with alignToPowerOf2. NFC
My x86-64 lld executable is 2KiB smaller. .eh_frame writing gets faster as there
were lots of divisions.
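
The difference between the two forms can be sketched as follows. These are stand-alone illustrations with hypothetical names, not the LLVM helpers themselves: the general form needs a division, while the power-of-two form only needs an add and a mask.

```cpp
#include <cassert>
#include <cstdint>

// General form: works for any non-zero alignment but needs a division.
uint64_t alignToGeneric(uint64_t value, uint64_t align) {
  assert(align != 0);
  return (value + align - 1) / align * align;
}

// Power-of-two form: the division is replaced by a mask. Only valid when
// align is a power of two.
uint64_t alignToPow2(uint64_t value, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}
```

Both return 16 for `value = 13, align = 8`; only the general form can handle an alignment such as 6, where it returns 18.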
2022-07-24 11:20:49 -07:00
Fangrui Song
b95cca03cd [ELF] Improve compound assignment tests
Also use strchr instead of is_contained.
2022-06-25 22:30:52 -07:00
Fangrui Song
0a0effdd5b [ELF] Support -= *= /= <<= >>= &= |= in symbol assignments
2022-06-25 22:22:59 -07:00
Fangrui Song
21bf6bb3d3 [ELF] Fix assertion failure when PROVIDE/HIDDEN/PROVIDE_HIDDEN does not have =
2022-06-25 20:26:47 -07:00
Fangrui Song
fe0de25b21 [ELF] Allow an expression to follow = in a symbol assignment
GNU ld doesn't require whitespace before =. Match it.
2022-06-25 20:25:34 -07:00
Fangrui Song
b0d6dd3905 [ELF] Fix precedence of ? when there are 2 or more operators on the left hand side
For 1 != 1 <= 1 ? 1 : 2, the current code incorrectly considers that ?
has a higher precedence than != (minPrec).

Also, add a test for right associativity.
2022-06-25 13:48:52 -07:00
Fangrui Song
d479b2e4db [ELF] Fix precedence of == and != in expressions
In GNU ld, the == and != operators have lower precedence than < > <= >=.
This behavior matches C.
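
To make the precedence rules concrete, here is a small precedence-climbing evaluator. It is a sketch under assumptions, not lld's parser: it handles only whitespace-separated integer tokens, the comparison operators, and `?:`, and all names in it are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Higher numbers bind tighter. As in C: relational > equality > ternary.
static int precedence(const std::string &op) {
  if (op == "<" || op == "<=" || op == ">" || op == ">=") return 3;
  if (op == "==" || op == "!=") return 2;
  if (op == "?") return 1;
  return -1; // not an operator (includes ":" and end of input)
}

static uint64_t apply(const std::string &op, uint64_t l, uint64_t r) {
  if (op == "<") return l < r;
  if (op == "<=") return l <= r;
  if (op == ">") return l > r;
  if (op == ">=") return l >= r;
  if (op == "==") return l == r;
  return l != r; // "!="
}

struct Parser {
  std::vector<std::string> toks;
  size_t pos = 0;

  std::string peek() const { return pos < toks.size() ? toks[pos] : ""; }
  std::string next() { return pos < toks.size() ? toks[pos++] : ""; }

  // Parse an operand, then any operators whose precedence is >= minPrec.
  uint64_t expr(int minPrec) {
    uint64_t lhs = std::stoull(next());
    while (precedence(peek()) >= minPrec) {
      std::string op = next();
      if (op == "?") {
        // Right associative: each arm is itself a full expression.
        uint64_t ifTrue = expr(1);
        assert(peek() == ":");
        next();
        uint64_t ifFalse = expr(1);
        return lhs ? ifTrue : ifFalse;
      }
      // Left associative: the right operand takes only tighter operators.
      uint64_t rhs = expr(precedence(op) + 1);
      lhs = apply(op, lhs, rhs);
    }
    return lhs;
  }
};

uint64_t evaluate(const std::string &text) {
  std::istringstream in(text);
  Parser p;
  for (std::string tok; in >> tok;)
    p.toks.push_back(tok);
  return p.expr(1);
}
```

With these rules `evaluate("1 != 1 <= 1 ? 1 : 2")` groups as `(1 != (1 <= 1)) ? 1 : 2` and yields 2, and `evaluate("1 ? 2 : 0 ? 3 : 4")` groups the second `?:` into the false arm and yields 2.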
2022-06-25 13:47:32 -07:00
Fangrui Song
4cb05dc3cb [ELF] Support quoted name in the TARGET command
2022-06-25 12:31:20 -07:00
Fangrui Song
363b29567e [ELF] Support quoted symbol in the ENTRY command
This matches GNU ld and matches other places we unquote the symbol name.

Fixes #56208
2022-06-25 12:19:45 -07:00
Ben Shi
8527f32f0a [lld][ELF] Support BFD name elf32-avr
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D125544
2022-05-18 00:00:14 +00:00
Fangrui Song
177fd72f5f [ELF] Disallow input section description without a filename
GNU ld does not allow `.foo : { (*foo) }`, but we may recognize it as three
input section descriptions: file "(" with any section name, file "*foo" with
any section name, file ")" with any section name. Disallow the error-prone usage.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D125523
2022-05-13 11:06:01 -07:00
Fangrui Song
5a44980f0a [ELF] Support custom sections between DATA_SEGMENT_ALIGN and DATA_SEGMENT_RELRO_END
We currently hard code RELRO sections. When a custom section is between
DATA_SEGMENT_ALIGN and DATA_SEGMENT_RELRO_END, we may report a spurious
`error: section: ... is not contiguous with other relro sections`. GNU ld
makes such sections RELRO.

glibc recently switched to default --with-default-link=no. This configuration
places `__libc_atexit` and others between DATA_SEGMENT_ALIGN and
DATA_SEGMENT_RELRO_END. This patch allows such a ld.bfd --verbose
linker script to be fed into lld.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D124656
2022-05-04 01:10:46 -07:00
Fangrui Song
6c814931bc [ELF] Don't use multiple inheritance for OutputSection. NFC
Add an OutputDesc class inheriting from SectionCommand. An OutputDesc wraps an
OutputSection. This change allows InputSection::getParent to be inlined.

Differential Revision: https://reviews.llvm.org/D120650
2022-03-08 11:23:42 -08:00
Fangrui Song
9e9c86fd67 [ELF] Change some non-null pointer parameters to references. NFC
To reduce the diff for D120650. Also, rename some `OutputSection *sec` (and
`cmd`) to the more common `osec`.
2022-02-28 11:19:00 -08:00
Fangrui Song
b01430a04f [ELF] Don't rely on Symbols.h's transitive inclusion of InputFiles.h. NFC
2022-02-23 19:18:24 -08:00
Fangrui Song
66f8ac8d36 [ELF] Support (TYPE=<value>) to customize the output section type
The current output section type syntax only allows setting the ELF section
type to SHT_PROGBITS or SHT_NOBITS (via NOLOAD). This patch allows an
arbitrary section type value to be specified. Some common SHT_* literal
names are supported as well.

```
SECTIONS {
  note (TYPE=SHT_NOTE) : { BYTE(8) *(note) }
  init_array ( TYPE=14 ) : { QUAD(14) }
  fini_array (TYPE = SHT_FINI_ARRAY) : { QUAD(15) }
}
```

When `sh_type` is specified, it is an error if an input section has a different type.
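
A sketch of how a `TYPE=` operand might be mapped to an `sh_type` value. The numeric values are the standard ELF constants; the function name, the set of accepted literal names, and the decimal-only number handling are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Translate a TYPE= operand into an ELF sh_type value. Returns false if
// the token is neither a known SHT_* name nor a short decimal number.
bool parseSectionType(const std::string &tok, uint32_t &type) {
  if (tok == "SHT_PROGBITS") { type = 1; return true; }
  if (tok == "SHT_NOTE") { type = 7; return true; }
  if (tok == "SHT_NOBITS") { type = 8; return true; }
  if (tok == "SHT_INIT_ARRAY") { type = 14; return true; }
  if (tok == "SHT_FINI_ARRAY") { type = 15; return true; }
  if (tok == "SHT_PREINIT_ARRAY") { type = 16; return true; }
  // Otherwise accept a plain decimal number such as the 14 in TYPE=14.
  // At most nine digits are allowed so the value always fits in 32 bits.
  if (tok.empty() || tok.size() > 9 ||
      tok.find_first_not_of("0123456789") != std::string::npos)
    return false;
  type = static_cast<uint32_t>(std::stoul(tok));
  return true;
}

// Usage: the name and the number from the script above agree.
inline void example() {
  uint32_t byName = 0, byNumber = 0;
  assert(parseSectionType("SHT_INIT_ARRAY", byName));
  assert(parseSectionType("14", byNumber));
  assert(byName == byNumber);
}
```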

Our syntax is compatible with GNU ld 2.39 (https://sourceware.org/bugzilla/show_bug.cgi?id=28841).

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D118840
2022-02-17 12:10:58 -08:00
Fangrui Song
27bb799095 [ELF] Clean up headers. NFC
2022-02-07 21:53:34 -08:00
Alexandre Ganea
83d59e05b2 Re-land [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

The previous land f860fe362282ed69b9d4503a20e5d20b9a041189 caused issues in https://lab.llvm.org/buildbot/#/builders/123/builds/8383, fixed by 22ee510dac9440a74b2e5b3fe3ff13ccdbf55af3.

Differential Revision: https://reviews.llvm.org/D108850
2022-01-20 14:53:26 -05:00
Alexandre Ganea
e6b153947d Revert [LLD] Remove global state in lldCommon
It seems to be causing issues on https://lab.llvm.org/buildbot/#/builders/123/builds/8383
2022-01-16 11:03:06 -05:00
Alexandre Ganea
f860fe3622 [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

Differential Revision: https://reviews.llvm.org/D108850
2022-01-16 08:57:57 -05:00
Fangrui Song
64038ef8c3 [ELF] ScriptParser: change std::vector to SmallVector
2021-12-26 20:12:55 -08:00
Fangrui Song
a1c2ee0147 [ELF] LinkerScript/OutputSection: change other std::vector members to SmallVector
11+KiB smaller .text with both libc++ and libstdc++ builds.
2021-12-26 13:53:47 -08:00
Fangrui Song
7051aeef7a [ELF] Rename BaseCommand to SectionCommand. NFC
BaseCommand was picked when PHDRS/INSERT/etc were not implemented. Rename it to
SectionCommand to match `sectionCommands` and make it clear that the commands
are used in SECTIONS (except a special case for SymbolAssignment).

Also, improve naming of some BaseCommand variables (base -> cmd).
2021-11-25 20:24:23 -08:00