292 Commits

Author SHA1 Message Date
Parth
923a3cc160
[LLD] Fix crash on parsing ':ALIGN' in linker script (#146723)
The linker was crashing due to stack overflow when parsing ':ALIGN' in
an output section description. This commit fixes the linker script
parser so that the crash does not happen.

The root cause of the stack overflow is how we parse expressions
(readExpr) in linker script and the behavior of ScriptLexer::expect(...)
utility. ScriptLexer::expect does not do anything if errors have already
been encountered during linker script parsing. In particular, it never
increments the current token position in the script file, even if the
current token is the same as the expected token. This causes an infinite
call cycle on parsing an expression such as '(4096)' when an error has
already been encountered.

readExpr() calls readPrimary()
readPrimary() calls readParenExpr()

readParenExpr():

  expect("("); // no-op, current token still points to '('
  Expression *E = readExpr(); // The cycle continues...
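
The cycle can be reproduced in a toy model. The sketch below is not lld's code: `ToyParser`, its fields, and the early return in `readParenExpr()` are assumptions made for illustration, and the guard is only one possible way to break the cycle, not necessarily what this commit does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of the failure mode. Names mirror the commit message, but this
// is a hypothetical parser, not lld's ScriptParser.
struct ToyParser {
  std::vector<std::string> tokens;
  std::size_t pos = 0;
  bool hasError = false;
  int depth = 0, maxDepth = 0; // recursion depth, tracked for demonstration

  std::string peek() const { return pos < tokens.size() ? tokens[pos] : ""; }

  // Like the expect() described above: a no-op once an error was recorded.
  void expect(const std::string &tok) {
    if (hasError)
      return;
    if (peek() != tok) {
      hasError = true;
      return;
    }
    ++pos;
  }

  long readExpr() {
    if (++depth > maxDepth)
      maxDepth = depth;
    long v = readPrimary();
    --depth;
    return v;
  }

  long readPrimary() {
    if (peek() == "(")
      return readParenExpr();
    std::string tok = peek();
    if (hasError || tok.empty() ||
        !std::isdigit(static_cast<unsigned char>(tok[0]))) {
      hasError = true;
      return 0;
    }
    ++pos;
    return std::stol(tok);
  }

  long readParenExpr() {
    expect("(");
    // Assumed guard: without it, readExpr() would see the same "(" again
    // and recurse until the stack overflows.
    if (hasError)
      return 0;
    long v = readExpr();
    expect(")");
    return v;
  }
};
```

With `hasError` already set, `readExpr()` on the tokens `(`, `4096`, `)` returns after a single level of recursion instead of looping.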

Closes #146722

Signed-off-by: Parth Arora <partaror@qti.qualcomm.com>
2025-07-06 10:22:50 -07:00
Fangrui Song
5859863bab [ELF] Postpone ASSERT error
assignAddresses is executed more than once. When an ASSERT expression
evaluates to zero, we should only report an error for the last
assignAddresses. Make a change similar to #66854 and #96361.
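
A minimal sketch of the idea, assuming a hypothetical `ToyScript` type (none of these names are lld's): each pass discards the failures recorded by the previous pass, so only the final pass decides what is reported.

```cpp
#include <string>
#include <vector>

// Hypothetical model: ASSERT failures are recorded per pass and only the
// failures from the last pass survive to be reported.
struct ToyScript {
  std::vector<std::string> pendingErrors;

  // One address-assignment pass. `assertValue` stands in for the value the
  // ASSERT expression evaluates to during this pass.
  void assignAddresses(long assertValue) {
    pendingErrors.clear(); // forget failures from earlier passes
    if (assertValue == 0)
      pendingErrors.push_back("assertion failed");
  }

  // Called once after the final pass.
  std::vector<std::string> takeErrors() const { return pendingErrors; }
};
```

An ASSERT that fails in an early pass but holds in the final one therefore produces no diagnostic.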

This change might help https://github.com/ClangBuiltLinux/linux/issues/2094
2025-05-28 20:56:13 -07:00
Kazu Hirata
19f00c0570
[lld] Remove unused includes (NFC) (#141421) 2025-05-25 10:55:39 -07:00
Daniel Thornburgh
e84b57dfbf
[LLD][ELF] Support OVERLAY NOCROSSREFS (#133807)
This allows NOCROSSREFS to be specified in OVERLAY linker script
descriptions. This is a particularly useful part of the OVERLAY syntax,
since it's very rarely possible for one overlay section to sensibly
reference another.

Closes #128790
2025-04-02 09:25:18 -07:00
Daniel Thornburgh
2d7add6e2e
[LLD][ELF] Allow memory region in OVERLAY (#133540)
This allows the contents of OVERLAYs to be attributed to memory regions.
This is the only clean way to overlap VMAs in linker scripts that choose
to primarily use memory regions to lay out addresses.

This also simplifies OVERLAY expansion to better match GNU LD.
Expressions for the first section's LMA and VMA are not generated if the
user did not provide them. This allows the LMA/VMA offset to be
preserved across multiple overlays in the same region, as with regular
sections.

Closes #129816
2025-03-31 10:44:40 -07:00
Nathan Chancellor
381599f1fe
[ELF] Allow KEEP within OVERLAY (#130661)
When attempting to add KEEP within an OVERLAY description, which the
Linux kernel would like to do for ARCH=arm to avoid dropping the
.vectors sections with '--gc-sections' [1], ld.lld errors with:

  ld.lld: error: ./arch/arm/kernel/vmlinux.lds:37: section pattern is expected
  >>>  __vectors_lma = .; OVERLAY 0xffff0000 : AT(__vectors_lma) { .vectors { KEEP(*(.vectors)) } ...
  >>>                                                                               ^

readOverlaySectionDescription() does not handle all input section
description keywords, despite GNU ld's documentation stating that "The
section definitions within the OVERLAY construct are identical to those
within the general SECTIONS construct, except that no addresses and no
memory regions may be defined for sections within an OVERLAY."

Reuse the existing parsing in readInputSectionDescription(), which
handles KEEP, allowing the Linux kernel's use case to work properly.

[1]: https://lore.kernel.org/20250221125520.14035-1-ceggers@arri.de/
2025-03-11 19:58:14 +01:00
Csanád Hajdú
6e457c2001
[LLD][ELF][AArch64] Add support for SHF_AARCH64_PURECODE ELF section flag (3/3) (#125689)
Add support for the new SHF_AARCH64_PURECODE ELF section flag:
https://github.com/ARM-software/abi-aa/pull/304

The general implementation follows the existing one for ARM targets. The
output section only has the `SHF_AARCH64_PURECODE` flag set if all input
sections have it set.
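
The "all inputs must agree" rule can be sketched as follows. The helper `mergeOutputFlags` and the constant name are hypothetical, and the flag value 0x20000000 is an assumption that should be checked against the ABI pull request linked above.

```cpp
#include <cstdint>
#include <vector>

// Assumed value of SHF_AARCH64_PURECODE; verify against the ABI document.
constexpr std::uint64_t kPurecode = 0x20000000;

// Hypothetical helper: other flags are OR-ed together, while the purecode
// bit is kept only if every input section carries it.
std::uint64_t mergeOutputFlags(const std::vector<std::uint64_t> &inputFlags) {
  std::uint64_t flags = 0;
  bool allPure = !inputFlags.empty();
  for (std::uint64_t f : inputFlags) {
    flags |= f & ~kPurecode;
    allPure = allPure && (f & kPurecode) != 0;
  }
  return allPure ? (flags | kPurecode) : flags;
}
```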

Related PRs:
* LLVM: https://github.com/llvm/llvm-project/pull/125687
* Clang: https://github.com/llvm/llvm-project/pull/125688
2025-02-21 09:01:38 -08:00
Fangrui Song
5c3c0a8cec [ELF] Replace inExpr with lexState. NFC
We may add another state State::Wild to behave more like GNU ld.
2025-02-01 15:49:08 -08:00
Parth Arora
8c2030b7d4
[LLD] [ELF] Add support for linker script unary plus operator (#121508)
This commit adds support for the linker script unary plus ('+') operator,
which improves compatibility between LLD and GNU LD.
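
As a rough illustration of where a unary plus fits in a recursive-descent expression parser, here is a toy `readPrimary`. It is a hypothetical free function over a pre-split token list, not lld's implementation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical primary-expression reader: unary '+' is accepted and simply
// yields its operand, mirroring how unary '-' negates it.
long readPrimary(const std::vector<std::string> &toks, std::size_t &pos) {
  if (pos >= toks.size())
    throw std::runtime_error("unexpected end of expression");
  const std::string &tok = toks[pos++];
  if (tok == "+")
    return readPrimary(toks, pos);
  if (tok == "-")
    return -readPrimary(toks, pos);
  return std::stol(tok, nullptr, 0); // base 0 accepts decimal and 0x-prefixed hex
}
```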

Closes #118047
2025-01-21 20:05:07 -08:00
Fangrui Song
2991a4e209 [ELF] Replace functions bAlloc/saver/uniqueSaver with member access 2024-11-16 22:34:13 -08:00
Fangrui Song
a626eb2a2f [ELF] Pass ctx to bAlloc/saver/uniqueSaver 2024-11-16 15:20:21 -08:00
Fangrui Song
e24457a330 [ELF] Migrate away from global ctx 2024-11-14 22:17:10 -08:00
Fangrui Song
ed6c106e6a [ELF] Replace errorCount with errCount(ctx)
to reduce reliance on the global context.
2024-11-07 09:06:01 -08:00
Fangrui Song
9b058bb42d [ELF] Replace errorOrWarn(...) with Err 2024-11-06 22:33:51 -08:00
Fangrui Song
f8bae3af74 [ELF] Replace warn(...) with Warn 2024-11-06 22:19:31 -08:00
Fangrui Song
09c2c5e1e9 [ELF] Replace error(...) with ErrAlways or Err
Most are migrated to ErrAlways mechanically.
In the future we should change most to Err.
2024-11-06 22:04:52 -08:00
Fangrui Song
f2b0133858 [ELF] Move static nextGroupId isInGroup to LinkerDriver 2024-10-06 17:38:35 -07:00
Fangrui Song
49865107d4 [ELF] Pass Ctx & to InputFiles 2024-10-06 11:27:24 -07:00
Fangrui Song
3590068950 [ELF] Pass Ctx & to OutputSections 2024-10-03 20:06:58 -07:00
Fangrui Song
df0864e761 [ELF] Move elf::symtab into Ctx
Remove the global variable `symtab` and add a member variable
(`std::unique_ptr<SymbolTable>`) to `Ctx` instead.

This is one step toward eliminating global states.
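
The shape of this refactoring, in a reduced sketch with hypothetical `ToyCtx` and `ToySymbolTable` types (lld's real `Ctx` and `SymbolTable` are far larger):

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a symbol table.
struct ToySymbolTable {
  std::unordered_map<std::string, long> values;
};

// The table is owned by the context instead of living in a global variable,
// so two contexts never share state.
struct ToyCtx {
  std::unique_ptr<ToySymbolTable> symtab = std::make_unique<ToySymbolTable>();
};

// Callers receive the context explicitly rather than reaching for a global.
void define(ToyCtx &ctx, const std::string &name, long value) {
  ctx.symtab->values[name] = value;
}
```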

Pull Request: https://github.com/llvm/llvm-project/pull/109612
2024-09-23 10:33:43 -07:00
Fangrui Song
33204002f6 [ELF] ScriptParser: make Ctx & a member variable. NFC
Lambda captures need adjusting.
2024-09-21 11:51:02 -07:00
Fangrui Song
cf57a670bb [ELF] ScriptParser: pass Ctx to ScriptParser and ScriptLexer. NFC 2024-09-21 11:06:06 -07:00
Fangrui Song
b4feb26606 [ELF] Move target to Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.

Follow-up to driver (2022-10) and script (2024-08).
2024-08-21 23:53:36 -07:00
Fangrui Song
4629aa1797 [ELF] Move script into Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.

We now use default-initialization for `LinkerScript` and should pay
attention to non-class types (e.g. `dot` is initialized by commit
503907dc505db1e439e7061113bf84dd105f2e35).
2024-08-21 21:23:28 -07:00
Daniel Thornburgh
7e8a9020b1
[LLD] Add CLASS syntax to SECTIONS (#95323)
This allows the input section matching algorithm to be separated from
output section descriptions. This allows a group of sections to be
assigned to multiple output sections, providing an explicit version of
--enable-non-contiguous-regions's spilling that doesn't require altering
global linker script matching behavior with a flag. It also makes the
linker script language more expressive even if spilling is not intended,
since input section matching can be done in a different order than
sections are placed in an output section.

The implementation reuses the backend mechanism provided by
--enable-non-contiguous-regions, so it has roughly similar semantics and
limitations. In particular, sections cannot be spilled into or out of
INSERT, OVERWRITE_SECTIONS, or /DISCARD/. The former two aren't
intrinsic, so it may be possible to relax those restrictions later.
2024-08-05 13:06:45 -07:00
Fangrui Song
ff7f97a819 [ELF] --defsym: support quoted LHS
and move = splitting from Driver.cpp to ScriptParser.cpp.
2024-07-28 12:38:10 -07:00
Fangrui Song
a7e8bddfc1 [ELF] Respect --sysroot for INCLUDE
If an included script is under the sysroot directory, when it opens an
absolute path file (`INPUT` or `GROUP`), add sysroot before the absolute
path. When the included script ends, the `isUnderSysroot` state is
restored.
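
A sketch of the save-and-restore behavior, using a hypothetical `ToyIncludeState` type. The prefix test is deliberately simplistic and the names are assumptions, not lld's API.

```cpp
#include <string>

// Hypothetical model of the isUnderSysroot state around an INCLUDE.
struct ToyIncludeState {
  std::string sysroot;
  bool isUnderSysroot = false;

  // Resolve a path named by INPUT or GROUP: absolute paths get the sysroot
  // prepended only while the current script lives under the sysroot.
  std::string resolveInput(const std::string &path) const {
    if (isUnderSysroot && !path.empty() && path[0] == '/')
      return sysroot + path;
    return path;
  }

  // Run `body` as if parsing the included script, then restore the state.
  template <class Fn> void withInclude(const std::string &scriptPath, Fn body) {
    bool saved = isUnderSysroot;
    isUnderSysroot = !sysroot.empty() && scriptPath.size() > sysroot.size() &&
                     scriptPath.compare(0, sysroot.size(), sysroot) == 0 &&
                     scriptPath[sysroot.size()] == '/';
    body();
    isUnderSysroot = saved;
  }
};
```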
2024-07-28 11:43:27 -07:00
Fangrui Song
a4921f10e0 [ELF] Output section phdr: support quoted names 2024-07-27 17:40:51 -07:00
Fangrui Song
9c16a4a2dc [ELF] INSERT [AFTER|BEFORE]: support quoted names 2024-07-27 17:34:37 -07:00
Fangrui Song
8f72b0cb08 [ELF] Fix INCLUDE cycle detection
Fix #93947: the cycle detection mechanism added by
https://reviews.llvm.org/D37524 also disallowed including a file twice,
which is an unnecessary limitation.

Now that we have an include stack #100493, supporting multiple inclusion
is trivial. Note: a filename can be referenced with many different
paths, e.g. a.lds, ./a.lds, ././a.lds. We don't attempt to detect the
cycle at the earliest point.
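
A sketch of stack-based cycle detection, assuming a hypothetical `ToyIncluder` type. A file is rejected only while it is still open further up the stack, so including the same file twice in sequence is allowed. As in the note above, names are compared verbatim, so `a.lds` and `./a.lds` count as different entries.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical include stack: only files currently being parsed are on it.
struct ToyIncluder {
  std::vector<std::string> includeStack;

  // Returns false if `name` is already being parsed (a cycle).
  bool enter(const std::string &name) {
    if (std::find(includeStack.begin(), includeStack.end(), name) !=
        includeStack.end())
      return false;
    includeStack.push_back(name);
    return true;
  }

  // Called when the included file has been fully parsed.
  void leave() { includeStack.pop_back(); }
};
```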
2024-07-27 17:25:13 -07:00
Fangrui Song
dbd65a07f2 [ELF] OUTPUT_ARCH: report unclosed error 2024-07-27 16:52:47 -07:00
Fangrui Song
74f843d05f [ELF] Replace unquote(next()) with readName. NFC 2024-07-27 16:47:18 -07:00
Fangrui Song
0d8bc10acb [ELF] Memory region: support quoted names 2024-07-27 16:39:15 -07:00
Fangrui Song
e689515491 [ELF] OVERLAY: support quoted output section names 2024-07-27 16:33:18 -07:00
Fangrui Song
74ef53a01a [ELF] REGION_ALIAS: support quoted names 2024-07-27 16:29:43 -07:00
Fangrui Song
c89566f317 [ELF] Replace unquote(next()) with readName. NFC 2024-07-27 16:27:05 -07:00
Fangrui Song
30ec2bf58d [ELF] PROVIDE: allow quoted names to be discarded
Extend commit ebb326a51fec37b5a47e5702e8ea157cd4f835cd for (#74771) to
support quoted names, e.g. `PROVIDE("f1" = f2 + f3);`.
2024-07-27 16:19:57 -07:00
Fangrui Song
edcc60e403 [ELF] Simplify readAssignment
After #100493, the `=` support from
fe0de25b2195b66d1ebac5d3ebdb18f9e1e776da can be simplified.
2024-07-27 16:04:38 -07:00
Hongyu Chen
f1a7d146e0
[ELF] Updated some while conditions with till (#100893)
This change follows commit b32c38ab5b for cleaner API usage.
Thanks to @MaskRay!
2024-07-27 14:16:12 -07:00
Fangrui Song
b32c38ab5b [ELF] Replace some while (peek() != ")" && !atEOF()) with till 2024-07-26 17:25:23 -07:00
Fangrui Song
10bb296dfc [ELF] Replace some while (peek() != ")" && !atEOF()) with till 2024-07-26 17:19:04 -07:00
Fangrui Song
2a89356d64 [ELF] Add till and rewrite while (... consume("}"))
After #100493, the idiom `while (!errorCount() && !consume("}"))` could
lead to inaccurate diagnostics or dead loops. Introduce till to change
the code pattern.
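
A sketch of how a `till`-style helper can make such loops terminate. `ToyLexer` and `readBlock` are hypothetical, and the real `till` may differ in its details.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical lexer over a pre-split token list.
struct ToyLexer {
  std::vector<std::string> tokens;
  std::size_t pos = 0;
  std::string error;

  bool atEOF() const { return pos >= tokens.size(); }
  std::string next() { return tokens[pos++]; }

  // Returns true while the loop body should run. Consumes the closing token
  // when it is reached, and records an error instead of spinning at EOF.
  bool till(const std::string &closer) {
    if (!error.empty())
      return false;
    if (atEOF()) {
      error = "unexpected EOF, expected " + closer;
      return false;
    }
    if (tokens[pos] == closer) {
      ++pos;
      return false;
    }
    return true;
  }
};

// Usage: collect everything up to the closing brace.
std::vector<std::string> readBlock(ToyLexer &lex) {
  std::vector<std::string> items;
  while (lex.till("}"))
    items.push_back(lex.next());
  return items;
}
```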
2024-07-26 17:13:37 -07:00
Fangrui Song
1978c21d96
[ELF] ScriptLexer: generate tokens lazily
The current tokenize-whole-file approach has a few limitations.

* Lack of state information: `maybeSplitExpr` is needed to parse
  expressions. It's infeasible to add new states to behave more like GNU
  ld.
* `readInclude` may insert tokens in the middle, leading to a time
  complexity issue with N-nested `INCLUDE`.
* line/column information for diagnostics is inaccurate, especially
  after an `INCLUDE`.
* `getLineNumber` cannot be made more efficient without significant code
  complexity and memory consumption. https://reviews.llvm.org/D104137

The patch switches to a traditional lexer that generates tokens lazily.

* `atEOF` behavior is modified: we need to call `peek` to determine EOF.
* `peek` and `next` cannot call `setError` upon `atEOF`.
* Since `consume` no longer reports an error upon `atEOF`, the idiom `while (!errorCount() && !consume(")"))`
  would cause a dead loop. Use `while (peek() != ")" && !atEOF()) { ... } expect(")")` instead.
* An include stack is introduced to handle `readInclude`. This can be
  utilized to address #93947 properly.
* `tokens` and `pos` are removed.
* `commandString` is reimplemented. Since it is used in -Map output,
  `\n` needs to be replaced with space.
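
The lazy approach can be illustrated with a small stand-alone lexer. `LazyLexer` is a hypothetical sketch with a much simpler token grammar than lld's. It shows the property noted above: `atEOF` only becomes true after `peek` has tried to produce a token.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical lazy lexer: a token is produced only when peek() asks for it.
class LazyLexer {
  std::string buf;
  std::size_t cur = 0;
  std::string curTok; // empty means "no token buffered"
  bool eof = false;

  static bool isPunct(char c) {
    return std::string("{}();,").find(c) != std::string::npos;
  }

  void lex() {
    while (cur < buf.size() && std::isspace(static_cast<unsigned char>(buf[cur])))
      ++cur;
    if (cur == buf.size()) {
      eof = true;
      return;
    }
    std::size_t start = cur;
    if (isPunct(buf[cur])) {
      ++cur; // punctuation is a single-character token
    } else {
      while (cur < buf.size() && !isPunct(buf[cur]) &&
             !std::isspace(static_cast<unsigned char>(buf[cur])))
        ++cur;
    }
    curTok = buf.substr(start, cur - start);
  }

public:
  explicit LazyLexer(std::string s) : buf(std::move(s)) {}

  // Produce the next token on demand; returns "" at end of input.
  const std::string &peek() {
    if (curTok.empty() && !eof)
      lex();
    return curTok;
  }

  std::string next() {
    std::string t = peek();
    curTok.clear();
    return t;
  }

  // Only meaningful after peek() has been called.
  bool atEOF() const { return eof; }
};
```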

Pull Request: https://github.com/llvm/llvm-project/pull/100493
2024-07-26 14:26:38 -07:00
Hongyu Chen
2ae862b74b
[ELF] Remove consumeLabel in ScriptLexer (#99567)
This commit removes `consumeLabel`, since `consume` already provides the
same functionality.
2024-07-23 22:03:46 -07:00
Hongyu Chen
b828c13f3c
[ELF] Delete peek2 in Lexer (#99790)
Thanks to Fangrui's change 28045ceab0, peek2 can be removed.
2024-07-20 16:35:38 -07:00
Fangrui Song
efa833dd0f [ELF] Simplify readExpr. NFC 2024-07-20 14:36:55 -07:00
Fangrui Song
28045ceab0 [ELF] Support (TYPE=<value>) beside output section address
Support `preinit_array . (TYPE=SHT_PREINIT_ARRAY) : { QUAD(16) }`

Follow-up to https://reviews.llvm.org/D118840

peek2() could be eliminated by a future change.
2024-07-20 14:13:02 -07:00
Fangrui Song
0778f5c1f1
[ELF] Support NOCROSSREFS and NOCROSSREFS_TO
Implement the two commands described by
https://sourceware.org/binutils/docs/ld/Miscellaneous-Commands.html

After `outputSections` is available, check each output section described
by at least one `NOCROSSREFS`/`NOCROSSREFS_TO` command. For each checked
output section, scan relocations from its input sections.
This step is slow, therefore utilize `parallelForEach(isd->sections, ...)`.

To support non SHF_ALLOC sections, `InputSectionBase::relocations`
(empty) cannot be used. In addition, we may explore eliminating this
member to speed up relocation scanning.
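
The check itself can be sketched over a flat list of references. `Reloc`, `NoCrossRefCommand`, and `checkNoCrossRefs` are hypothetical names; the sketch runs sequentially and ignores how lld actually stores relocations.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical reference from one output section to another.
struct Reloc {
  std::string from, to;
};

// Hypothetical command model. With toFirstOnly set, only references into the
// first listed section are forbidden (the NOCROSSREFS_TO form); otherwise any
// reference between two listed sections is forbidden.
struct NoCrossRefCommand {
  std::vector<std::string> sections;
  bool toFirstOnly = false;
};

std::vector<std::string> checkNoCrossRefs(const NoCrossRefCommand &cmd,
                                          const std::vector<Reloc> &relocs) {
  auto listed = [&](const std::string &s) {
    return std::find(cmd.sections.begin(), cmd.sections.end(), s) !=
           cmd.sections.end();
  };
  std::vector<std::string> violations;
  for (const Reloc &r : relocs) {
    if (r.from == r.to || !listed(r.from) || !listed(r.to))
      continue; // self-references and unlisted sections are fine
    if (cmd.toFirstOnly && r.to != cmd.sections.front())
      continue;
    violations.push_back(r.from + " -> " + r.to);
  }
  return violations;
}
```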

Some parse code is adapted from #95714.

Close #41825

Pull Request: https://github.com/llvm/llvm-project/pull/98773
2024-07-17 10:45:59 -07:00
Brian Cain
9078036685
[lld] Add emulation support for hexagon (#98857) 2024-07-16 15:01:27 -05:00
Fangrui Song
6464dd21b5
[ELF] OUTPUT_FORMAT: support "binary" and ignore extra OUTPUT_FORMAT commands
This patch improves GNU ld compatibility.

Close #87891: Support `OUTPUT_FORMAT(binary)`, which is like
--oformat=binary. --oformat=binary takes precedence over an ELF
`OUTPUT_FORMAT`.

In addition, if more than one OUTPUT_FORMAT command is specified, only
check the first one.
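
A sketch of these two rules, using a hypothetical `ToyConfig` and `handleOutputFormat` (not lld's configuration structure): only the first command has any effect, and a binary output request coming from the command line is never undone by an ELF format name in the script.

```cpp
#include <string>

// Hypothetical configuration state.
struct ToyConfig {
  bool oformatBinary = false;  // true if --oformat=binary was given
  std::string scriptFormat;    // ELF format name taken from the script
  bool sawOutputFormat = false;
};

// Handle one OUTPUT_FORMAT(name) command.
void handleOutputFormat(ToyConfig &cfg, const std::string &name) {
  if (cfg.sawOutputFormat)
    return; // later OUTPUT_FORMAT commands are ignored
  cfg.sawOutputFormat = true;
  if (name == "binary") {
    cfg.oformatBinary = true; // same effect as --oformat=binary
    return;
  }
  // An ELF format name is recorded but never clears oformatBinary.
  cfg.scriptFormat = name;
}
```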

Pull Request: https://github.com/llvm/llvm-project/pull/98837
2024-07-16 10:28:09 -07:00