llvm-project

Author	SHA1	Message	Date
Fangrui Song	a2359a865a	[ELF] Fix PROVIDE_HIDDEN -shared regression with bitcode file references The inaccurate #111945 condition fixes a PROVIDE regression (#111478) but introduces another regression: in a DSO link, if a symbol referenced only by bitcode files is defined as PROVIDE_HIDDEN, lld would not set the visibility correctly, leading to an assertion failure in DynamicReloc::getSymIndex (https://reviews.llvm.org/D123985). This is because `(sym->isUsedInRegularObj || sym->exportDynamic)` is initially false (bitcode undef does not set `isUsedInRegularObj`) then true (in `addSymbol`, after LTO compilation). Fix this by making the condition accurate: use a map to track defined symbols. Reviewers: smithp35 Reviewed By: smithp35 Pull Request: https://github.com/llvm/llvm-project/pull/112386	2024-10-15 09:20:10 -07:00
Brian Cain	77aa8257ac	[lld][Hexagon] Support predicated-add GOT_16_X mask lookup (#111896) When encountering an instruction like `if (p0) r0 = add(r0,##bar@GOT)`, lld would fail with: ``` ld.lld: error: unrecognized instruction for 16_X type: 0x7400C000 ``` This issue was encountered while building libreadline with clang 19.1.0. Fixes: #111876	2024-10-11 14:31:41 -05:00
Fangrui Song	1c6688ae34	[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined Case: `PROVIDE(f1 = bar);` when both `f1` and `bar` are in separate sections that would be discarded by GC. Due to `demoteDefined`, `shouldAddProvideSym(f1)` may initially return false (when Defined) and then return true (been demoted to Undefined). ``` addScriptReferencedSymbolsToSymTable shouldAddProvideSym(f1): false // the RHS (bar) is not added to `referencedSymbols` and may be GCed declareSymbols shouldAddProvideSym(f1): false markLive demoteSymbolsAndComputeIsPreemptible // demoted f1 to Undefined processSymbolAssignments addSymbol shouldAddProvideSym(f1): true ``` The inconsistency can cause `cmd->expression()` in `addSymbol` to be evaluated, leading to `symbol not found: bar` errors (since `bar` in the RHS is not in `referencedSymbols` and is GCed) (#111478). Fix this by adding a `sym->isUsedInRegularObj` condition, making `shouldAddProvideSym(f1)` values consistent. In addition, we need a `sym->exportDynamic` condition to keep provide-shared.s working. Fixes: ebb326a51fec37b5a47e5702e8ea157cd4f835cd Pull Request: https://github.com/llvm/llvm-project/pull/111945	2024-10-11 08:47:07 -07:00
Igor Kudrin	1037f577bd	[lld][elf] Warn if '*' pattern is used multiple times in version scripts (#102669) If this pattern is used more than once in version script(s), only one will have an effect, so it's probably a user error and can be diagnosed.	2024-10-10 16:51:27 -07:00
Sam Elliott	db1a762069	[LLD][RISCV] Error on PCREL_LO referencing other Section (#107558) The RISC-V psABI states that "The `R_RISCV_PCREL_LO12_I` or `R_RISCV_PCREL_LO12_S` relocations contain a label pointing to an instruction in the same section with an `R_RISCV_PCREL_HI20` relocation entry that points to the target symbol." Without this patch, GNU ld errors, but LLD does not -- I think because LLD is doing the right thing, certainly in the testcase provided. Nonetheless, I think an error is good here to bring LLD in line with what GNU ld is doing in showing that the object the user provided is not following the psABI as written. Fixes #107304	2024-10-08 12:45:01 +01:00
Rahman Lavaee	1f17c2d20d	[LLD] Deprecate --lto-basic-block-sections=labels (#110697) This option is now replaced by `--lto-basic-block-address-map`.	2024-10-07 09:22:36 -07:00
Nuri Amari	2edd897a42	Make WriteIndexesThinBackend multi threaded (#109847) We've noticed that for large builds executing thin-link can take on the order of 10s of minutes. We are only using a single thread to write the sharded indices and import files for each input bitcode file. While we need to ensure the index file produced lists modules in a deterministic order, that doesn't prevent us from executing the rest of the work in parallel. In this change we use a thread pool to execute as much of the backend's work as possible in parallel. In local testing on a machine with 80 cores, this change makes a thin-link for ~100,000 input files run in ~2 minutes. Without this change it takes upwards of 10 minutes. --------- Co-authored-by: Nuri Amari <nuriamari@fb.com>	2024-10-07 08:16:46 -07:00
Peter Smith	c4d9cd8b74	[LLD][ELF][AArch64] Add BTI Aware long branch thunks (#108989) When Branch Target Identification (BTI) is enabled all indirect branches must target a BTI instruction. A long branch thunk is a source of indirect branches. To date LLD has been assuming that the object producer is responsible for putting a BTI instruction at all places the linker might generate an indirect branch to. This is true for clang, but not for GCC. GCC will elide the BTI instruction when it can prove that there are no indirect branches from outside the translation unit(s). GNU ld was fixed to generate a landing pad stub (GNU ld speak for thunk) for the destination when a long range stub was needed [1]. This means that using GCC compiled objects with LLD may lead to LLD generating an indirect branch to a location without a BTI. The ABI [2] has also been clarified to say that it is a static linker's responsibility to generate a landing pad when the target does not have a BTI. This patch implements the same mechanism as GNU ld. When the output ELF file is setting the GNU_PROPERTY_AARCH64_FEATURE_1_BTI property, then we check the destination to see if it has a BTI instruction. If it does not we generate a landing pad consisting of: BTI c B <destination> The B <destination> can be elided if the thunk can be placed so that control flow drops through. For example: BTI c <destination>: This will be common when -ffunction-sections is used. The landing pad thunks are effectively alternative entry points for the function. Direct branches are unaffected but any linker generated indirect branch needs to use the alternative. We place these as close as possible to the destination section. There is some further optimization possible. Consider the case: .text fn1 ... fn2 ... If we need landing pad thunks for both fn1 and fn2 we could order them so that the thunk for fn1 immediately precedes fn1. This could save a single branch. However I didn't think that would be worth the additional complexity. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671 [2] https://github.com/ARM-software/abi-aa/issues/196	2024-10-01 13:12:29 +01:00
Shengchen Kan	31dd29cfb3	[X86,lld] Handle relocation R_X86_64_REX2_GOTPCRELX (#109783) For mov name@GOTPCREL(%rip), %reg test %reg, name@GOTPCREL(%rip) binop name@GOTPCREL(%rip), %reg where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions, we added R_X86_64_REX2_GOTPCRELX = 43 in #106681. Linker can treat R_X86_64_REX2_GOTPCRELX as R_X86_64_GOTPCREL or convert the above instructions to lea name(%rip), %reg mov $name, %reg test $name, %reg binop $name, %reg if the first byte of the instruction at the relocation `offset - 4` is `0xd5` (namely, encoded w/ REX2 prefix) when possible. Binutils patch: `3d5a60de52` Binutils mailthread: https://sourceware.org/pipermail/binutils/2023-December/131462.html ABI discussion: https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation	2024-09-29 12:52:36 +08:00
Fangrui Song	abe0dd195a	[llvm-objdump] Print ... even if a data mapping symbol is active Swap `!DisassembleZeroes` and `if (DumpARMELFData)` conditions so that in the false DisassembleZeroes case (default), `...` will be printed for long consecutive zeroes, even when a data mapping symbol is active. This is especially useful for certain lld tests that insert a huge padding within a code section. Without `...` the output will be huge. Pull Request: https://github.com/llvm/llvm-project/pull/109553	2024-09-25 10:32:40 -07:00
Fangrui Song	e82f0838ae	[ELF] --icf: don't fold a section without relocation and a section with relocations for SHT_CREL Similar to commit 686cff17cc310884e48ae963bf7507f96950cc90 for SHT_REL (#57693). CREL hasn't been tested with ICF before. And avoid a pitfall that eqClass[0] might interfere with ICF.	2024-09-18 23:06:12 -07:00
Fangrui Song	cf70a1ee81	[ELF] .llvm.sympart: support CREL When both CREL and the experimental lld partitions feature are enabled, the relocation section may look like .crel.llvm_sympart.f1, and `rels.relas` is empty. While here, support relocation sections with zero entry.	2024-09-16 13:12:45 -07:00
Brian Cain	d1ba432533	[lld] select a default eflags for hexagon (#108431) Empty archives are apparently routine in Linux kernel builds, so instead of asserting, we should handle this case with a sane default value.	2024-09-13 17:10:03 -05:00
Simon Tatham	daf208598b	[lld][AArch64] Fix getImplicitAddend in big-endian mode. (#107845) In AArch64, the endianness of instruction encodings is always little, whereas the endianness of data swaps between LE and BE modes. So getImplicitAddend must use the right one of read32() and read32le(), for data and code respectively. It was using read32() throughout, causing instructions to be read as big-endian in BE mode, getting the wrong addend. Fixed, and updated the existing test to check both endiannesses. The expected results for data must be byte-swapped, but the ones for code need no adjustment.	2024-09-10 12:38:32 +01:00
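The code/data distinction in the commit above can be sketched as follows. This is a minimal illustration, not lld's implementation: `read32be`, `RelocKind`, and `readRelocatedWord` are hypothetical names, and `read32`/`read32le` are simplified stand-ins that only mirror the names used in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins (assumptions); lld's real helpers have different
// signatures and take the endianness from the link configuration.
inline uint32_t read32le(const uint8_t *p) {
  return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 |
         uint32_t(p[3]) << 24;
}
inline uint32_t read32be(const uint8_t *p) {
  return uint32_t(p[0]) << 24 | uint32_t(p[1]) << 16 | uint32_t(p[2]) << 8 |
         uint32_t(p[3]);
}
// Target-endian read: appropriate for data words.
inline uint32_t read32(const uint8_t *p, bool isBigEndian) {
  return isBigEndian ? read32be(p) : read32le(p);
}

// Hypothetical classification of the relocated location.
enum class RelocKind { Data, Code };

// AArch64 instructions are always little-endian; only data follows the
// target's endianness.
uint32_t readRelocatedWord(const uint8_t *loc, RelocKind kind,
                           bool isBigEndian) {
  assert(loc && "location must be valid");
  return kind == RelocKind::Code ? read32le(loc) : read32(loc, isBigEndian);
}
```

With the four bytes `1f 20 03 d5` (the AArch64 NOP encoding stored little-endian), a code read yields `0xd503201f` in both modes, while a data read in big-endian mode yields the byte-swapped value.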
Min-Yih Hsu	5fe852e774	[lld][ELF] Add `-plugin-opt=time-trace=` as an alias of `--time-trace=` (#106803) Time trace profiler support was added into LLVMgold in cd3255abede5e3687c1538f2d3857deb2c51af1b. This patch adds its `-plugin-opt` counterpart, which is just an alias to `--time-trace=`, into LLD for compatibility.	2024-09-01 17:38:59 -07:00
Patryk Wychowaniec	a3816b5a57	[AVR] Fix LLD test (#106739) Since we don't generate relocations for those, it doesn't make sense to assert them here; fallout of https://github.com/llvm/llvm-project/pull/106722.	2024-08-30 10:50:56 -04:00
Jan Voung	fa4fbaefde	Reapply: Use an abbrev to reduce size of VALUE_GUID records in ThinLTO summaries (#106165) This retries #90692 which was reverted previously due to issues with lld-available being set, even if the copy of lld is not built from source. This does not change any code compared to #90692 to address the lld-available issue. The main change w.r.t. lld-available is xfailing tests in PR #99056 (until a longer term fix is available).	2024-08-27 13:53:25 -04:00
Fangrui Song	46707b0a83	[AArch64,ELF] Allow implicit $d/$x at section beginning The start state of a new section is `EMS_None`, often leading to a $d/$x at offset 0. Introduce a MCTargetOption/cl::opt "implicit-mapsyms" to allow an alternative behavior (https://github.com/ARM-software/abi-aa/issues/274): * Set the start state to `EMS_Data` or `EMS_A64`. * For text sections, add an ending $x only if the final data is not instructions. * For non-text sections, add an ending $d only if the final data is not data commands. ``` .section .text.1,"ax" nop // emit $d .long 42 // emit $x .section .text.2,"ax" nop ``` This new behavior decreases the .symtab size significantly: ``` % bloaty a64-2/bin/clang -- a64-0/bin/clang FILE SIZE VM SIZE -------------- -------------- -5.4% -1.13Mi [ = ] 0 .strtab -50.9% -4.09Mi [ = ] 0 .symtab -4.0% -5.22Mi [ = ] 0 TOTAL ``` --- This scheme works as long as the user can rule out some error scenarios: * .text.1 assembled using the traditional behavior is combined with .text.2 using the new behavior * A linker script combining non-text sections and text sections. The lack of mapping symbols in the non-text sections could make them treated as code, unless the linker inserts extra mapping symbols. The above mix-and-match scenarios aren't an issue at all for a significant portion of users. A text section may start with data commands in rare cases (e.g. -fsanitize=function) that many users don't care about. When combining `(.text.0; .word 0)` and `(.text.1; .word 0)`, the ending $x of .text.0 and the initial $d of .text.1 may have the same address. If both sections reside in the same file, ensure the ending symbol comes before the initial $d of .text.1, so that a dumb linker respecting the symbol order will place the ending $x before the initial $d. Disassemblers using stable sort will see both symbols at the same address, and the second will win. When section ordering mechanisms (e.g. --symbol-ordering-file, --call-graph-profile-sort, `.text : { second.o(.text) first.o(.text) }`) are involved, the initial data in a text section following a text section with trailing data could be misidentified as code, but the issue is local and the risk could be acceptable. Pull Request: https://github.com/llvm/llvm-project/pull/99718	2024-08-22 09:12:11 -07:00
Siu Chi Chan	f7bbc40b07	[ELF,test] Enhance hip-section-layout.s Check different object file order Change-Id: I6096c12e29e9ddb6b3053f977e4cbb24eea9b7d3	2024-08-21 16:23:05 +00:00
Fangrui Song	12d4c89e88	[ELF,test] Improve error-handling-script-linux.test * Use split-file * Remove -o /dev/null * Avoid `{ list; }` compound command not supported by the lit internal shell (#102382) * Don't test "ld.lld" before "error:" as per convention Pull Request: https://github.com/llvm/llvm-project/pull/105454	2024-08-20 19:21:13 -07:00
Sergei Barannikov	c91cc459d3	[DataLayout] Refactor the rest of `parseSpecification` (#104545) The aim is to improve test coverage of data layout string parsing. Pull Request: https://github.com/llvm/llvm-project/pull/104545	2024-08-20 11:25:49 +03:00
Sam Elliott	9b65558d2f	[lld][ELF] Combine uniqued small data sections (#104485) RISC-V GCC with `-fdata-sections` will emit `.sbss.<name>`, `.srodata.<name>`, and `.sdata.<name>` sections for small data items of different kinds. Clang/LLVM already emits `.srodata.*` sections, and we intend to emit the other two section name patterns in #87040. This change ensures that any input sections starting `.sbss` are combined into one output section called `.sbss`, and the same respectively for `.srodata` and `.sdata`. This also allows the existing RISC-V specific code for determining an output order for `.sbss` and `.sdata` sections to apply to placing the sections.	2024-08-19 17:51:46 +01:00
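The name mapping described in the commit above could look roughly like this. It is only a sketch: `hasSectionPrefix` and `outputSectionName` are hypothetical helpers, and the rule that a prefix matches either exactly or when followed by `.` is an assumption about the matching, not something the commit message states.

```cpp
#include <cassert>
#include <initializer_list>
#include <string_view>

// Assumed matching rule: `name` is exactly `prefix`, or `prefix` followed
// by '.' and a unique suffix (e.g. ".sbss.foo").
static bool hasSectionPrefix(std::string_view prefix, std::string_view name) {
  if (name == prefix)
    return true;
  return name.size() > prefix.size() &&
         name.substr(0, prefix.size()) == prefix &&
         name[prefix.size()] == '.';
}

// Maps a uniqued small data input section name to its combined output
// section name; any other name is returned unchanged.
std::string_view outputSectionName(std::string_view inputName) {
  for (std::string_view prefix : {".sbss", ".srodata", ".sdata"})
    if (hasSectionPrefix(prefix, inputName))
      return prefix;
  return inputName;
}
```

Under these assumptions `.sbss.foo` and `.sbss` both land in `.sbss`, while an unrelated name such as `.sdata2.x` is left alone.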
Fangrui Song	b6448a03d8	[ELF] Change "no PT_TLS" error to use errorOrWarn so that --noinhibit-exec downgrades the error to a warning, which helps debugging when `PHDRS` is specified without `PT_TLS`. Also update the message to make it accurate: STT_TLS may exist in the absence of PT_TLS. In addition, invoking `exitLld(1)` (through `fatal`) is problematic (#66974): When a thread is `exitLld(1)`, triggering `llvm_shutdown`, another thread may be at `relocateAlloc`, accessing `sec.relocs()` which got destroyed(tampered?), leading to incorrect `llvm_unreachable("invalid expression")`.	2024-08-12 11:56:29 -07:00
Fangrui Song	dc21cb5cc7	[ELF,test] Test STT_TLS and relocation without PT_TLS	2024-08-12 11:25:46 -07:00
Fangrui Song	a821fee312	[ELF] scanRelocations: support .crel.eh_frame Follow-up to #98115. For EhInputSection, RelocationScanner::scan calls sortRels, which doesn't support the CREL iterator. We should set supportsCrel to false to ensure that the initial_location fields in .eh_frame FDEs are relocated.	2024-08-08 12:02:44 -07:00
Fangrui Song	0766a59be3	[ELF] .llvm.call-graph-profile: support CREL https://reviews.llvm.org/D105217 added RELA support. This patch adds CREL support.	2024-08-08 00:57:43 -07:00
Oliver Stannard	a1c6467bd9	[lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985) Previously, we selected the Thumb2 PLT sequences if any input object is marked as not supporting the ARM ISA, which then causes assertion failures when calls from ARM code in other objects are seen. I think the intention here was to only use Thumb PLTs when the target does not have the ARM ISA available, signalled by no objects being marked as having it available. To do that we need to track which ISAs we have seen as we parse the build attributes, and defer the decision about PLTs until all input objects have been parsed. This bug was triggered by real code in picolibc, which has some versions of string.h functions built with Thumb2-only build attributes, so that they are compatible with v7-A, v7-R and v7-M. Fixes #99008.	2024-08-07 10:20:26 +01:00
Siu Chi Chan	048f350377	Move HIP fatbin sections farther away from .text This would avoid wasting relocation range to jump over the HIP fatbin sections and therefore alleviate relocation overflow pressure.	2024-08-06 15:17:59 +00:00
Daniel Thornburgh	7e8a9020b1	[LLD] Add CLASS syntax to SECTIONS (#95323) This allows the input section matching algorithm to be separated from output section descriptions. This allows a group of sections to be assigned to multiple output sections, providing an explicit version of --enable-non-contiguous-regions's spilling that doesn't require altering global linker script matching behavior with a flag. It also makes the linker script language more expressive even if spilling is not intended, since input section matching can be done in a different order than sections are placed in an output section. The implementation reuses the backend mechanism provided by --enable-non-contiguous-regions, so it has roughly similar semantics and limitations. In particular, sections cannot be spilled into or out of INSERT, OVERWRITE_SECTIONS, or /DISCARD/. The former two aren't intrinsic, so it may be possible to relax those restrictions later.	2024-08-05 13:06:45 -07:00
Fangrui Song	0af07c0787	[ELF] Support relocatable files using CREL with explicit addends ... using the temporary section type code 0x40000020 (`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the code and break compatibility (Clang and lld of different versions are not guaranteed to cooperate, unlike other features). CREL with implicit addends are not supported. --- Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and update users to check `crels`. (The decoding performance is critical and error checking is difficult. Follow `skipLeb` and `R_LEB128` handling, do not use `llvm::decodeULEB128`, which compiles to a lot of code.) A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass `/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and convert CREL to RELA (`relas` instead of `crels` will be used). Since allocating a buffer increases memory usage, the conversion is only performed when absolutely necessary. --- Non-alloc SHT_CREL sections may be created in -r and --emit-relocs links. SHT_CREL and SHT_RELA components need reencoding since r_offset/r_symidx/r_type/r_addend may change. (r_type may change because relocations referencing a symbol in a discarded section are converted to `R_*_NONE`). * SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`) * SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section. * SHT_REL components: print an error for now. SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and unsupported yet. Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600 Pull Request: https://github.com/llvm/llvm-project/pull/98115	2024-08-01 10:22:03 -07:00
Fangrui Song	5d972c582a	[ELF] Add -z nosectionheader GNU ld since 2.41 supports this option, which is mildly useful. It omits the section header table and non-ALLOC sections (including .symtab/.strtab (--strip-all)). This option is simple to implement and might be used by LLDB to test program headers parsing without the section header table (#100900). -z sectionheader, which is the default, is also added. Pull Request: https://github.com/llvm/llvm-project/pull/101286	2024-07-31 12:57:23 -07:00
Fangrui Song	ff7f97a819	[ELF] --defsym: support quoted LHS and move = splitting from Driver.cpp to ScriptParser.cpp.	2024-07-28 12:38:10 -07:00
Fangrui Song	a7e8bddfc1	[ELF] Respect --sysroot for INCLUDE If an included script is under the sysroot directory, when it opens an absolute path file (`INPUT` or `GROUP`), add sysroot before the absolute path. When the included script ends, the `isUnderSysroot` state is restored.	2024-07-28 11:43:27 -07:00
Fangrui Song	30fa011413	[ELF,test] Improve --sysroot and GROUP tests 3i.t (INCLUDE "%t.dir/3a.t") describes a behavior difference from GNU ld, which will be fixed by the next change.	2024-07-28 11:40:14 -07:00
Fangrui Song	a4921f10e0	[ELF] Output section phdr: support quoted names	2024-07-27 17:40:51 -07:00
Fangrui Song	9c16a4a2dc	[ELF] INSERT [AFTER|BEFORE]: support quoted names	2024-07-27 17:34:37 -07:00
Fangrui Song	8f72b0cb08	[ELF] Fix INCLUDE cycle detection Fix #93947: the cycle detection mechanism added by https://reviews.llvm.org/D37524 also disallowed including a file twice, which is an unnecessary limitation. Now that we have an include stack #100493, supporting multiple inclusion is trivial. Note: a filename can be referenced with many different paths, e.g. a.lds, ./a.lds, ././a.lds. We don't attempt to detect the cycle in the earliest point.	2024-07-27 17:25:13 -07:00
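The difference between a true INCLUDE cycle and a harmless repeated inclusion, as described in the commit above, can be sketched with an explicit stack of files currently being processed. `IncludeStack` and its methods are hypothetical names for illustration; lld's actual include stack lives in its script lexer and is not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Tracks only the files that are currently open. A file that was included
// earlier and has already been fully processed may be included again.
class IncludeStack {
  std::vector<std::string> active;

public:
  // Returns false if `path` is already being processed, i.e. a cycle.
  // Paths are compared as written, so "a.lds" and "./a.lds" count as
  // different files, in line with the note in the commit message.
  bool enter(const std::string &path) {
    if (std::find(active.begin(), active.end(), path) != active.end())
      return false;
    active.push_back(path);
    return true;
  }

  // Called when the included file has been fully processed.
  void leave() {
    assert(!active.empty() && "leave() without matching enter()");
    active.pop_back();
  }
};
```

A set of every file ever seen would reject the second of two sequential inclusions; checking only the active stack rejects a file solely while it is still open.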
Fangrui Song	aad2238f78	[ELF] Improve INCLUDE cycle tests And demonstrate the incorrect diagnostic when a linker script is included multiple times (#93947).	2024-07-27 17:14:53 -07:00
Fangrui Song	4ad3deeefc	[ELF] PHDRS: test EOF without ;	2024-07-27 16:56:27 -07:00
Fangrui Song	dbd65a07f2	[ELF] OUTPUT_ARCH: report unclosed error	2024-07-27 16:52:47 -07:00
Fangrui Song	0d8bc10acb	[ELF] Memory region: support quoted names	2024-07-27 16:39:15 -07:00
Fangrui Song	e689515491	[ELF] OVERLAY: support quoted output section names	2024-07-27 16:33:18 -07:00
Fangrui Song	74ef53a01a	[ELF] REGION_ALIAS: support quoted names	2024-07-27 16:29:43 -07:00
Fangrui Song	30ec2bf58d	[ELF] PROVIDE: allow quoted names to be discarded Extend commit ebb326a51fec37b5a47e5702e8ea157cd4f835cd for (#74771) to support quoted names, e.g. `PROVIDE("f1" = f2 + f3);`.	2024-07-27 16:19:57 -07:00
Fangrui Song	9328c20cc8	[ELF] Track line number precisely `getLineNumber` is both imprecise (when `INCLUDE` is used) and inefficient (see https://reviews.llvm.org/D104137). Track line number precisely now that we have `struct Buffer` abstraction from #100493.	2024-07-27 14:46:41 -07:00
Fangrui Song	2a89356d64	[ELF] Add till and rewrite while (... consume("}")) After #100493, the idiom `while (!errorCount() && !consume("}"))` could lead to inaccurate diagnostics or dead loops. Introduce till to change the code pattern.	2024-07-26 17:13:37 -07:00
Fangrui Song	6cf1ea99c6	[ELF,test] Improve unclosed tests	2024-07-26 16:51:42 -07:00
Fangrui Song	4f5ad22b95	[ELF,test] Improve PHDRS tests	2024-07-26 15:55:01 -07:00
Fangrui Song	1978c21d96	[ELF] ScriptLexer: generate tokens lazily The current tokenize-whole-file approach has a few limitations. * Lack of state information: `maybeSplitExpr` is needed to parse expressions. It's infeasible to add new states to behave more like GNU ld. * `readInclude` may insert tokens in the middle, leading to a time complexity issue with N-nested `INCLUDE`. * line/column information for diagnostics are inaccurate, especially after an `INCLUDE`. * `getLineNumber` cannot be made more efficient without significant code complexity and memory consumption. https://reviews.llvm.org/D104137 The patch switches to a traditional lexer that generates tokens lazily. * `atEOF` behavior is modified: we need to call `peek` to determine EOF. * `peek` and `next` cannot call `setError` upon `atEOF`. * Since `consume` no longer reports an error upon `atEOF`, the idiom `while (!errorCount() && !consume(")"))` would cause a dead loop. Use `while (peek() != ")" && !atEOF()) { ... } expect(")")` instead. * An include stack is introduced to handle `readInclude`. This can be utilized to address #93947 properly. * `tokens` and `pos` are removed. * `commandString` is reimplemented. Since it is used in -Map output, `\n` needs to be replaced with space. Pull Request: https://github.com/llvm/llvm-project/pull/100493	2024-07-26 14:26:38 -07:00
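The loop idiom recommended in the commit above can be illustrated with a toy lazy lexer. This is a sketch under stated assumptions: `LazyLexer`, `isPunct`, and `parseParenList` are hypothetical, the tokenization rules are far simpler than a real linker script lexer, and `expect` here returns a bool instead of reporting a diagnostic.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

static bool isPunct(char c) {
  return c == '(' || c == ')' || c == '{' || c == '}' || c == ';';
}

class LazyLexer {
  std::string_view rest; // unread input
  std::string_view tok;  // cached lookahead; empty when nothing is cached
  bool eof = false;

  // Produces exactly one token on demand instead of tokenizing everything.
  void lex() {
    while (!rest.empty() && std::isspace((unsigned char)rest.front()))
      rest.remove_prefix(1);
    if (rest.empty()) {
      eof = true;
      tok = {};
      return;
    }
    std::size_t n = 1;
    if (!isPunct(rest[0]))
      while (n < rest.size() && !std::isspace((unsigned char)rest[n]) &&
             !isPunct(rest[n]))
        ++n;
    tok = rest.substr(0, n);
    rest.remove_prefix(n);
  }

public:
  explicit LazyLexer(std::string_view s) : rest(s) {}
  std::string_view peek() {
    if (tok.empty() && !eof)
      lex();
    return tok; // empty at EOF; no error is set here
  }
  std::string_view next() {
    std::string_view t = peek();
    tok = {};
    return t;
  }
  // Only meaningful after peek() has been called, as the commit notes.
  bool atEOF() const { return eof; }
  bool expect(std::string_view s) { return next() == s; }
};

// Parses "( a b c )". Returns false for an unclosed list, and the loop
// terminates at EOF instead of spinning forever.
bool parseParenList(LazyLexer &lex, std::vector<std::string> &out) {
  if (!lex.expect("("))
    return false;
  while (lex.peek() != ")" && !lex.atEOF())
    out.emplace_back(lex.next());
  return lex.expect(")");
}
```

Because `peek` is evaluated before `atEOF` in the loop condition, the EOF flag is always up to date when it is tested, and the missing `)` is reported once by the final `expect`.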
wanglei	0057a969a2	[lld][ELF][LoongArch] Support R_LARCH_TLS_{LD,GD,DESC}_PCREL_S2 Reviewed By: MaskRay, SixWeining Pull Request: https://github.com/llvm/llvm-project/pull/100105	2024-07-26 14:38:36 +08:00


4330 Commits