Without thunks, programs encounter link errors complaining that a branch
target is out of range. Thunks extend the range of branch targets, which
is critical for large programs, at the cost of a modest code-size
increase.
When configured with the maximal feature set, the Hexagon port of the
Linux kernel would often hit these limits when linking with `lld`.
The relocations whose range is extended by thunks are listed below (a
sketch of the range check follows the list):
* R_HEX_B22_PCREL, R_HEX_{G,L}D_PLT_B22_PCREL, R_HEX_PLT_B22_PCREL: ±8 MiB on the baseline
* R_HEX_B15_PCREL: ±65,532 bytes
* R_HEX_B13_PCREL: ±16,380 bytes
* R_HEX_B9_PCREL: ±1,020 bytes
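A minimal sketch of that range check, assuming word-aligned code; the
helper name and constant are illustrative, not lld's actual API:
```
#include <cstdint>

// R_HEX_B22_PCREL has a signed 22-bit field scaled by 4 bytes,
// giving roughly +/-8 MiB of direct-branch reach.
constexpr int64_t b22Range = int64_t(1) << 23; // 8 MiB

// Return true when `targetAddr` is out of direct-branch reach of
// `branchAddr` and a range-extension thunk is required.
bool needsThunk(uint64_t branchAddr, uint64_t targetAddr) {
  int64_t off = int64_t(targetAddr - branchAddr);
  return off < -b22Range || off >= b22Range;
}
```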
Fixes #149689
Co-authored-by: Alexey Karyakin <akaryaki@quicinc.com>
(cherry picked from commit b42f96bc057fd9e31572069b241ba130c21144e5)
While building libclang_rt.asan-hexagon.so, lld would assert in
lld::elf::hexagonTLSSymbolUpdate().
Fixes #132766
(cherry picked from commit 3e9ceae29f39456508eef5b4af4d3c895048706a)
Support TLSDESC to initial-exec or local-exec optimizations. Introduce a
new hook RE_LOONGARCH_RELAX_TLS_GD_TO_IE_PAGE_PC, and use the existing
R_RELAX_TLS_GD_TO_IE_ABS to support TLSDESC => IE and the existing
R_RELAX_TLS_GD_TO_LE to support TLSDESC => LE.
In the normal or medium code model, there are two forms of code sequence:
* pcalau12i $a0, %desc_pc_hi20(sym_desc)
* addi.d $a0, $a0, %desc_pc_lo12(sym_desc)
* ld.d $ra, $a0, %desc_ld(sym_desc)
* jirl $ra, $ra, %desc_call(sym_desc)
or:
* pcaddi $a0, %desc_pcrel_20(sym_desc)
* ld.d $ra, $a0, %desc_ld(sym_desc)
* jirl $ra, $ra, %desc_call(sym_desc)
Convert to IE:
* pcalau12i $a0, %ie_pc_hi20(sym_ie)
* ld.[wd] $a0, $a0, %ie_pc_lo12(sym_ie)
Convert to LE:
* lu12i.w $a0, %le_hi20(sym_le) # le_hi20 != 0, otherwise NOP
* ori $a0, src, %le_lo12(sym_le) # le_hi20 != 0, src = $a0; otherwise src = $zero
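A minimal sketch of the %le_hi20/%le_lo12 split assumed by the LE
sequence above; the helper names are illustrative:
```
#include <cstdint>

// The TP-relative offset is materialized with lu12i.w (bits [31:12])
// plus ori (bits [11:0]). Because ori zero-extends rather than adds,
// the high part needs no rounding; when leHi20() is zero the lu12i.w
// becomes a NOP and ori reads from $zero instead of $a0.
uint32_t leHi20(uint64_t val) { return (val >> 12) & 0xfffff; }
uint32_t leLo12(uint64_t val) { return val & 0xfff; }
```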
For simplicity, in both tlsdescToIe and tlsdescToLe we always convert
the preceding instructions to NOPs, because the two forms of code
sequence (corresponding to the relocation combinations
R_LARCH_TLS_DESC_PC_HI20+R_LARCH_TLS_DESC_PC_LO12 and
R_LARCH_TLS_DESC_PCREL20_S2) can then share the same handling.
TODO: when relaxation is enabled, the redundant NOPs can be removed.
This will be implemented in a future patch.
Note: the instructions of a TLSDESC code sequence must not be
interleaved with other code, in the normal, medium, and extreme code
models alike; compilers do not generate such interleaving, and lld does
not support it. This is thanks to the guard in PostRASchedulerList.cpp
in llvm:
```
Calls are not scheduling boundaries before register allocation,
but post-ra we don't gain anything by scheduling across calls
since we don't need to worry about register pressure.
```
Fixed an assertion failure when reading .eh_frame sections, and added
.eh_frame sections to tests.
This reverts commit 1e95349dbe329938d2962a78baa0ec421e9cd7d1.
Original commit message follows:
When code calls a function which then immediately tail calls another
function there is no need to go via the intermediate function. By
branching directly to the target function we reduce the program's working
set for a slight increase in runtime performance.
Normally it is relatively uncommon to have functions that just tail call
another function, but with LLVM control flow integrity we have jump tables
that replace the function itself as the canonical address. As a result,
when a function address is taken and called directly, for example after
a compiler optimization resolves the indirect call, or if code built
without control flow integrity calls the function, the call will go via
the jump table.
The impact of this optimization was measured using a large internal
Google benchmark. The results were as follows:
CFI enabled: +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]
The optimization is enabled by default at -O2 but may also be enabled
or disabled individually with --{,no-}branch-to-branch.
This optimization is implemented for AArch64 and X86_64 only.
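To make the rewrite concrete, here is a hedged sketch of the AArch64
ingredient it relies on, detecting that a call target begins with an
unconditional branch; the helper is illustrative, not lld's code:
```
#include <cstdint>

// B <imm26> is encoded as 0b000101 followed by a signed 26-bit
// word-scaled immediate. If the first instruction of the call target
// matches, the caller's BL can be redirected to that branch's target.
bool isUncondBranch(uint32_t insn, int64_t &byteOff) {
  if ((insn & 0xfc000000) != 0x14000000)
    return false;
  byteOff = int64_t(int32_t(insn << 6)) >> 4; // sign-extend, scale by 4
  return true;
}
```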
lld's runtime performance (real execution time) after adding this
optimization was measured using firefox-x64 from lld-speed-test [1]
with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:
```
N Min Max Median Avg Stddev
x 512 1.2264546 1.3481076 1.2970261 1.2965788 0.018620888
+ 512 1.2561196 1.3839965 1.3214632 1.3209327 0.019443971
Difference at 95.0% confidence
0.0243538 +/- 0.00233202
1.87831% +/- 0.179859%
(Student's t, pooled s = 0.0190369)
```
[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057
Reviewers: zmodem, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/145579
This caused assertion failures in applyBranchToBranchOpt():
llvm/include/llvm/Support/Casting.h:578:
decltype(auto) llvm::cast(From*)
[with To = lld::elf::InputSection; From = lld::elf::InputSectionBase]:
Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
See comment on the PR (https://github.com/llvm/llvm-project/pull/138366)
This reverts commit 491b82a5ec1add78d2c93370580a2f1897b6a364.
This also reverts the follow-up "[lld] Use llvm::partition_point (NFC) (#145209)"
This reverts commit 2ac293f5ac4cf65c0c038bf75a88f1d6715e467d.
When code calls a function which then immediately tail calls another
function there is no need to go via the intermediate function. By
branching directly to the target function we reduce the program's working
set for a slight increase in runtime performance.
Normally it is relatively uncommon to have functions that just tail call
another function, but with LLVM control flow integrity we have jump tables
that replace the function itself as the canonical address. As a result,
when a function address is taken and called directly, for example after
a compiler optimization resolves the indirect call, or if code built
without control flow integrity calls the function, the call will go via
the jump table.
The impact of this optimization was measured using a large internal
Google benchmark. The results were as follows:
CFI enabled: +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]
The optimization is enabled by default at -O2 but may also be enabled
or disabled individually with --{,no-}branch-to-branch.
This optimization is implemented for AArch64 and X86_64 only.
lld's runtime performance (real execution time) after adding this
optimization was measured using firefox-x64 from lld-speed-test [1]
with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:
```
N Min Max Median Avg Stddev
x 512 1.2264546 1.3481076 1.2970261 1.2965788 0.018620888
+ 512 1.2561196 1.3839965 1.3214632 1.3209327 0.019443971
Difference at 95.0% confidence
0.0243538 +/- 0.00233202
1.87831% +/- 0.179859%
(Student's t, pooled s = 0.0190369)
```
[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057
Pull Request: https://github.com/llvm/llvm-project/pull/138366
* Merge the special case into isStaticLinkTimeConstant
* Generalize isUndefWeak to isUndefined; an undefined non-weak symbol
is an error case. We choose to be general, which also brings us in line
with GNU ld.
Increase specificity by using the correct unit sizes. "KBytes" is an
abbreviation for kB, i.e. 1000 bytes, and the hardware industry as well
as several operating systems have now switched to 1000-byte kBs.
This lock is unnecessary because we can add the relocations to
shards and let them be sorted later.
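A minimal sketch of that pattern, with illustrative types in place of
lld's own:
```
#include <algorithm>
#include <cstdint>
#include <vector>

struct Reloc { uint64_t offset; };

// Each worker appends to its own shard without locking; once all
// threads finish, the shards are concatenated and sorted in one pass.
std::vector<Reloc> mergeShards(std::vector<std::vector<Reloc>> &shards) {
  std::vector<Reloc> out;
  for (auto &shard : shards)
    out.insert(out.end(), shard.begin(), shard.end());
  std::sort(out.begin(), out.end(),
            [](const Reloc &a, const Reloc &b) { return a.offset < b.offset; });
  return out;
}
```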
Reviewers: smithp35, fmayer, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/135123
Original code sequence:
* pcalau12i $a0, %ie_pc_hi20(sym)
* ld.d $a0, $a0, %ie_pc_lo12(sym)
The converted code sequence is as follows:
* lu12i.w $a0, %le_hi20(sym) # le_hi20 != 0, otherwise NOP
* ori $a0, src, %le_lo12(sym) # le_hi20 != 0, src = $a0,
# otherwise, src = $zero
TODO: when relaxation is enabled, the redundant NOPs can be removed.
This will be implemented in a future patch.
Note: in the normal or medium code model, the instructions of the
original code sequence may be interleaved with other code, because the
converted code sequence computes an absolute offset. However, in the
extreme code model, the first four relocated instructions must appear
consecutively so that the current code model can be identified.
An absolute relocation, like R_RISCV_HI20 or R_PPC64_LO16, with a
symbol index of 0 should be treated as an absolute value and be
permitted in both -pie and -shared links.
This change also resolves an absolute relocation referencing an
undefined symbol in statically-linked executables.
PPC64 has unfortunate exceptions:
* R_PPC64_TOCBASE uses symbol index 0 but it should be treated as
referencing the linker-defined .TOC.
* R_PPC64_PCREL_OPT (https://reviews.llvm.org/D84360) could no longer
rely on `isAbsoluteValue` returning false.
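A hypothetical sketch of the rule and its exception (parameter names
are illustrative; this is not lld's actual logic):
```
#include <cstdint>

// A relocation with symbol index 0 normally yields an absolute,
// link-time-constant value that is permitted in -pie/-shared links,
// except for types like R_PPC64_TOCBASE that repurpose index 0 to
// mean the linker-defined .TOC. symbol.
bool isAbsoluteValue(uint32_t symIndex, bool referencesTocBase) {
  return !referencesTocBase && symIndex == 0;
}
```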
In LLD_IN_TEST=2 mode, when a thread calls Fatal, there will be no
output even if the process exits with code 1. Change a few Fatal calls
to recoverable Err.
Depends on #120010
Support `R_AARCH64_AUTH_TLSDESC_ADR_PAGE21`, `R_AARCH64_AUTH_TLSDESC_LD64_LO12`
and `R_AARCH64_AUTH_TLSDESC_ADD_LO12` static relocations and the
`R_AARCH64_AUTH_TLSDESC` dynamic relocation. IE/LE optimization is not
currently supported for AUTH TLSDESC.
Depends on #114525
Support `R_AARCH64_AUTH_GOT_ADR_PREL_LO21` and `R_AARCH64_AUTH_GOT_LD_PREL19`
GOT-generating relocations. A corresponding `RE_AARCH64_AUTH_GOT_PC` member
of `RelExpr` is added, which is an AUTH-specific variant of `R_GOT_PC`.
Depends on #113811
Support `R_AARCH64_AUTH_ADR_GOT_PAGE`, `R_AARCH64_AUTH_GOT_LO12_NC` and
`R_AARCH64_AUTH_GOT_ADD_LO12_NC` GOT-generating relocations. For preemptible
symbols, dynamic relocation `R_AARCH64_AUTH_GLOB_DAT` is emitted. Otherwise,
we unconditionally emit an `R_AARCH64_AUTH_RELATIVE` dynamic relocation,
since pointers in the signed GOT need to be signed at dynamic link time.
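A minimal sketch of that rule (the enum and function names are
illustrative): a signed GOT slot always needs a dynamic relocation,
because only the dynamic loader can sign the pointer; the only choice
is which relocation to emit.
```
enum class SignedGotReloc { AuthGlobDat, AuthRelative };

// Preemptible symbols get R_AARCH64_AUTH_GLOB_DAT; everything else
// gets R_AARCH64_AUTH_RELATIVE, never a plain link-time constant.
SignedGotReloc chooseSignedGotReloc(bool preemptible) {
  return preemptible ? SignedGotReloc::AuthGlobDat
                     : SignedGotReloc::AuthRelative;
}
```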
isExported, intended to replace exportDynamic, is primarily set in two
locations: (a) after parseSymbolVersion and (b) during demoteSymbols.
In the future, we should try removing exportDynamic. Currently, merging
exportDynamic/isExported would cause riscv-gp.s to fail:
* The first isExported computation considers the undefined symbol exported.
* The symbol is then defined as a linker-synthesized symbol.
* isExported remains true, while it should be false.
When computing whether a defined symbol is exported, we set
`exportDynamic` in the Defined and CommonSymbol ctors and merge the bit
during symbol resolution. The complexity exists for the LTO special
case canBeOmittedFromSymbolTable, which can be simplified by
introducing a new bit.
We might simplify the state by caching includeInDynsym in exportDynamic
in the future.
In #86751 we moved the IRELATIVE relocations to .rela.plt when
--pack-dyn-relocs=android was enabled but we neglected to also move
the __rela_iplt_{start,end} symbols. As a result, static binaries
linked with this flag were unable to find their IRELATIVE relocations.
Fix it by having the symbols surround the correct section.
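For context, a hedged sketch of how a static binary consumes these
symbols at startup (modeled on, but not identical to, libc's ifunc
handling):
```
#include <cstdint>
#include <elf.h>

extern const Elf64_Rela __rela_iplt_start[], __rela_iplt_end[];

// Walk the IRELATIVE range: each entry's addend is an ifunc resolver
// whose return value is stored at the relocated location.
void applyIrelativeRelocs() {
  for (const Elf64_Rela *r = __rela_iplt_start; r != __rela_iplt_end; ++r) {
    auto resolver = reinterpret_cast<uint64_t (*)()>(r->r_addend);
    *reinterpret_cast<uint64_t *>(r->r_offset) = resolver();
  }
}
```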
Reviewers: MaskRay, smithp35
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/118585
RelExpr enumerators are named `R_*`, which can be confused with ELF
relocation type names. Rename the target-specific ones to `RE_*` to
avoid confusion.
For consistency, the target-independent ones can be renamed as well,
but that's not urgent. The relocation processing mechanism based on
RelExpr has non-trivial overhead compared with mold's approach; we
might move more code into Arch/*.cpp files and reduce the number of
enumerators.
Pull Request: https://github.com/llvm/llvm-project/pull/118424
so that we can remove the global `ctx` from toString implementations.
Rename lld::toString (to lld::elf::toStr) to simplify name lookup (we
have many llvm::toString functions and another lld::toString(const
llvm::opt::Arg &)).
Move landing-pad creation to a new function that checks each thunk on
every pass to see whether it needs a landing pad. This permits a thunk
to be created without a landing pad and to gain one later, when it
drifts out of direct-branch range and requires an indirect branch.
We record all the thunks created so far in a new vector rather than
iterating over the DenseMap, as we need a deterministic order for
adding landing-pad thunks due to the short-branch fall-through. We
cannot use normalizeExistingThunk() either, as it only iterates over
live thunks. A sketch of the per-pass check follows.
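The sketch uses illustrative stand-ins for lld's internal types, not
the real Thunk machinery:
```
#include <cstdint>
#include <vector>

struct Thunk {
  uint64_t addr = 0, targetAddr = 0;
  bool needsLandingPad = false;
};

constexpr int64_t directRange = 128 * 1024 * 1024; // AArch64 B/BL reach

// A thunk created within direct-branch range may drift out of range
// on a later pass; it then needs an indirect branch and hence a BTI
// landing pad. Iterating the creation-order vector (not the DenseMap)
// keeps the order deterministic.
void recheckLandingPads(std::vector<Thunk> &thunksInCreationOrder) {
  for (Thunk &t : thunksInCreationOrder) {
    int64_t off = int64_t(t.targetAddr - t.addr);
    if (off < -directRange || off >= directRange)
      t.needsLandingPad = true;
  }
}
```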
Fixes: https://crbug.com/377438309
Original PR: https://github.com/llvm/llvm-project/pull/108989
Sending without a new test case, to fix the existing test. A new
regression test will come in a separate PR, as coming up with a small
enough reproducer for this case is non-trivial.
Since `Ctx &ctx` is a member variable, the following commits can be
reverted:
* 1f391a75af8685e6bba89421443d72ac6a186599
* 7a5b9ef54eb96abd8415fd893576c42e51fd95db
* e2f0ec3a3a8a2981be8a1aac2004cfb9064c61e8