17655 Commits

Author SHA1 Message Date
Jacek Caban
ac31d64a64
[LLD][COFF] Avoid resolving symbols with -alternatename if the target is undefined (#149496)
This change fixes an issue with the use of `-alternatename` in the MSVC
CRT on ARM64EC, where both mangled and demangled symbol names are
specified. Without this patch, the demangled name could be resolved to
an anti-dependency alias of the target. Since chaining anti-dependency
aliases is not allowed, this results in an undefined symbol.

The root cause isn't specific to ARM64EC, it can affect other targets as
well, even when anti-dependency aliases aren't involved. The
accompanying test case demonstrates a scenario where the symbol could be
resolved from an archive. However, because the archive member is pulled
in after the first pass of alternate name resolution, and archive
members don't override weak aliases, eager resolution would incorrectly
skip it.
2025-07-28 19:26:25 +02:00
Evgenii Kudriashov
75b79c9238
[LLD][X86] Match delayLoad thunk with MSVC (#149521)
Previously we saved registers in the shadow space of callee before
calling __delayLoadHelper2. Now we save arguments in the shadow space of
the caller and allocate shadow space for the callee.

Fixes #51941

---------

Co-authored-by: Benjamin Santerre <benjamin.santerre@gmail.com>
2025-07-28 17:45:16 +02:00
Jacek Caban
38cd66a6ce
[LLD][COFF] Move resolving alternate names to SymbolTable (NFC) (#149495) 2025-07-28 17:02:49 +02:00
Jacek Caban
1ab04fc94c
[LLD][COFF] Allow symbols with empty chunks to have no associated output section in the PDB writer (#149523)
If a chunk is empty and there are no other non-empty chunks in the same
section, `removeEmptySections()` will remove the entire section. In this
case, use a section index of 0, as the MSVC linker does, instead of
asserting.
2025-07-28 17:01:26 +02:00
Haohai Wen
41f333250b
[LLD][COFF] Discard .llvmbc and .llvmcmd sections (#150897)
Those sections are generated by -fembed-bitcode and do not need to be
kept in executable files.
2025-07-28 16:17:26 +08:00
Jessica Clarke
f8685a8533
[NFC][ELF] Wrap invokeELFT in do { } while (0) so it behaves as a function (#150119)
The current implementation is dangerous if used in contexts that need a
single statement, since invokeELFT(...); is in fact two statements, a
switch statement and an empty statement.
2025-07-27 00:20:55 +01:00
Aiden Grossman
296ba58b5d
[lld] Remove uses of %T from tests (#150740)
%T has been deprecated for about seven years. It is to be avoided for
the most part given it does not create a unique directory per test. So,
remove it from lld tests with eventual intention of removing support
from llvm-lit.

Most of the cases in lld were not misusing the feature, using it to get
the directory that the test objects were in or as a path to pass for
-libpath. These cases all work perfectly well with a created directory
however and allow for the removal of %T to prevent incorrect usage.
2025-07-25 23:58:56 -07:00
Jordan Rupprecht
f79efa986d
[lld][test] Fix unintentional write to a non-writeable dir (#150436)
The test added in #147970 fails trying to write `a.out` when run in a
non-writeable directory. I believe the intent was to write to /dev/null
as the output, but `-o` was omitted, so it's actually linking *in*
/dev/null and writing to `a.out`.
2025-07-24 10:17:00 -05:00
Mark Murray
d52675e0a7
[lld][AArch64][Build Attributes] Add support for AArch64 Build Attributes (#147970)
This patch enables lld to read AArch64 Build Attributes and convert them
into GNU Properties.

Changes:
    - Parses AArch64 Build Attributes from input object files.
    - Converts known attributes into corresponding GNU Properties.
    - Merges attributes when linking multiple objects.

Spec reference:
    https://github.com/ARM-software/abi-aa/pull/230/files#r1030

Co-authored-by: Sivan Shani <sivan.shani@arm.com>

---------

Co-authored-by: Sivan Shani <sivan.shani@arm.com>
2025-07-24 10:38:36 +01:00
Tobias Hieta
3d8db8ef50 Clear release notes on main for LLVM 22 2025-07-24 11:36:11 +02:00
Fangrui Song
f97adea477 ELF: Simplify isRelRoDataSection and rename the text file
PR #148920 was merged before I could share my comments.

* Fix the text filename. There are other minor suggestions, but can be
  done in #148985
* Make `isRelRoDataSection` concise, to be consistent with the majority of
  helper functions.
2025-07-23 09:34:07 -07:00
Mingming Liu
7dcd90df45
[lld][ELF] Allow data.rel.ro.hot and .data.rel.ro.unlikely to be RELRO (#148920)
https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744
proposes to partition a static data section (like `.data.rel.ro`) into
two sections, one grouping the cold ones and the other grouping the
rest.

lld requires all relro sections to be contiguous. To place
`.data.rel.ro.unlikely` in the middle of all relro sections, this change
proposes to add `.data.rel.ro.unlikely` explicitly as RELRO section.

---------

Co-authored-by: Sam Elliott <quic_aelliott@quicinc.com>
2025-07-23 08:29:17 -07:00
Zhaoxin Yang
2a5cd50c46
[lld][LoongArch] Support relaxation during TLSDESC GD/LD to IE/LE conversion (#123730)
Complement https://github.com/llvm/llvm-project/pull/123715. When
relaxation enable, remove redundant NOPs.
2025-07-23 17:12:13 +08:00
bd1976bris
9f733f4324
[LLD][COFF] Make /wholearchive thin-archive member identifiers consistent (#145487)
A thin archive is an archive/library format where the archive itself
contains only references to member object files on disk, rather than
embedding the file contents.

For the non-/wholearchive case, we use the path to the archive member as
the identifier for thin-archive members (see comments in
`enqueueArchiveMember`). This patch modifies the /wholearchive path to
behave the same way.

Apart from consistency, my motivation for fixing this is DTLTO
(#126654), where having the member identifier be the path on disk allows
distribution of bitcode members during ThinLTO.
2025-07-22 16:57:20 +01:00
Alexandre Ganea
fcacd4e880
[LLD][COFF] Follow up comments on pr146610 (#147152)
This is a follow-up PR for post-commit comments in
https://github.com/llvm/llvm-project/pull/146610

- Changed "exporteddllmain" references to "importeddllmain".
- Add support for x86 target and test coverage.
- Changed a comment to better express why we're skipping importing
`DllMain`.
2025-07-21 16:12:37 -04:00
SivanShani-Arm
8bb97d2d1e
[LLD][Docs] Document -z gcs= option in the man page (#146522)
Add documentation for the -z gcs= option to the LLD man page. This flag
controls how the GCS bit is set in the output:

- implicit (default): inferred from input objects
- never: GCS bit is never set
- always: GCS bit is always set

Clarifies behavior for users and aligns the man page with existing
functionality.
2025-07-21 15:23:18 +01:00
Aleksandr Urakov
c9cbd4e9d4
[lld] Fix -ObjC load behavior with LTO for section names with whitespace (#146654)
This is a fix additional to #92162

In some cases, section names contain a whitespace between the segment
name and the actual section name (e.g. `__TEXT, __swift5_proto`). It is
confirmed by source code of the Swift compiler

This fix allows LTO to work correctly with the `-ObjC` flag in that rare
case when only a section with a whitespace in the name is present in the
linked bitcode module, but there are no sections containing
`__TEXT,__swift`

---------

Co-authored-by: Ураков Александр Сергеевич <a.urakov@tbank.ru>
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
2025-07-21 09:08:33 +03:00
Brian Cain
3e9ceae29f
[lld] [hexagon] guard allocateAux: only if idx nonzero (#149690)
While building libclang_rt.asan-hexagon.so, lld would assert in
lld:🧝:hexagonTLSSymbolUpdate().

Fixes #132766
2025-07-20 14:39:03 -05:00
Brian Cain
b42f96bc05
[lld] Add thunks for hexagon (#111217)
Without thunks, programs will encounter link errors complaining that the
branch target is out of range. Thunks will extend the range of branch
targets, which is a critical need for large programs. Thunks provide
this flexibility at a cost of some modest code size increase.

When configured with the maximal feature set, the hexagon port of the
linux kernel would often encounter these limitations when linking with
`lld`.

The relocations which will be extended by thunks are:

* R_HEX_B22_PCREL, R_HEX_{G,L}D_PLT_B22_PCREL, R_HEX_PLT_B22_PCREL
relocations have a range of ± 8MiB on the baseline
* R_HEX_B15_PCREL: ±65,532 bytes
* R_HEX_B13_PCREL: ±16,380 bytes
* R_HEX_B9_PCREL: ±1,020 bytes

Fixes #149689 

Co-authored-by: Alexey Karyakin <akaryaki@quicinc.com>

---------

Co-authored-by: Alexey Karyakin <akaryaki@quicinc.com>
2025-07-20 11:46:31 -05:00
bd1976bris
bbbbc093fe
[DTLTO][LLD][COFF] Add support for Integrated Distributed ThinLTO (#148594)
This patch introduces support for Integrated Distributed ThinLTO (DTLTO)
in COFF LLD.

DTLTO enables the distribution of ThinLTO backend compilations via
external distribution systems, such as Incredibuild, during the
traditional link step: https://llvm.org/docs/DTLTO.html.

Note: Bitcode members of non-thin archives are not currently supported.
This will be addressed in a future change. This patch is sufficient to
allow for self-hosting an LLVM build with DTLTO if thin archives are
used.

Testing:
- LLD `lit` test coverage has been added, using a mock distributor to
avoid requiring Clang.
- Cross-project `lit` tests cover integration with Clang.

For the design discussion of the DTLTO feature, see:
https://github.com/llvm/llvm-project/pull/126654
2025-07-20 14:47:00 +01:00
quic-areg
ac7ceb3dab
[Hexagon][llvm-objdump] Improve disassembly of Hexagon bundles (#145807)
Hexagon instructions are VLIW "bundles" of up to four instruction words
encoded as a single MCInst with operands for each sub-instruction.
Previously, the disassembler's getInstruction() returned the full
bundle, which made it difficult to work with llvm-objdump.

For example, since all instructions are bundles, and bundles do not
branch, branch targets could not be printed.

This patch modifies the Hexagon disassembler to return individual
sub-instructions instead of entire bundles, enabling correct printing of
branch targets and relocations. It also introduces
`MCDisassembler::getInstructionBundle` for cases where the full bundle
is still needed.

By default, llvm-objdump separates instructions with newlines. However,
this does not work well for Hexagon syntax:

  { inst1
    inst2
    inst3
    inst4 <branch> } :endloop0

Instructions may be followed by a closing brace, a closing brace with
`:endloop`, or a newline. Branches must appear within the braces.

To address this, `PrettyPrinter::getInstructionSeparator()` is added and
overridden for Hexagon.
2025-07-18 10:27:59 -05:00
Pengying Xu
6c705d1136
[lld][elf] Skip BP ordering input sections with null data (#149265) 2025-07-18 08:01:16 -07:00
Fangrui Song
3cb0c7f45b MC: Rework .reloc directive and fix the offset when it evaluates to a constant
* Fix `.reloc constant` to mean section_symbol+constant instead of
  .+constant . The initial .reloc support from MIPS incorrectly
  interpreted the offset.
* Delay the evaluation of the offset expression after
  MCAssembler::layout, deleting a lot of code working with MCFragment.
* Delete many FIXME from https://reviews.llvm.org/D79625
* Some lld/ELF/Arch/LoongArch.cpp relaxation tests rely on .reloc .,
  R_LARCH_ALIGN generating ALIGN relocations at specific location.
  Sort the relocations.
2025-07-17 00:36:11 -07:00
Daniel Bertalan
43f10639a1
[lld-macho] Enable Linker Optimization Hints pass for arm64_32 (#148964)
The backend emits `.loh` directives for arm64_32 as well. Our pass
already handles 32-bit pointer loads correctly (there was an extraneous
sanity check for 8-byte pointer sizes, I removed that here), so we can
enable them for all arm64 subtargets, including our upcoming arm64e
support.
2025-07-16 21:29:48 +02:00
Daniel Bertalan
fb3972dd06 [lld-macho] Move Linker Optimization Hints pass to a separate file
Moving it away from the arm64 `TargetInfo` class will let us enable it
more easily for arm64_32 and the soon-to-be-added arm64e target as well.

This is the NFC part of #148964
2025-07-16 21:13:54 +02:00
Ami-zhang
9ef293ea24
[LoongArch] Add supplemental release notes for LLVM 21 (#148771) 2025-07-15 15:39:00 +08:00
Brian Cain
d2bcc51a5a
[LLD] Merge .hexagon.attributes sections (#148098)
Merge the attributes of object files being linked together. The
`.hexagon.attributes` section can be used by loaders and analysis tools.
This is similar to the .riscv.attributes, introduced in
8a900f2438b4a167b98404565ad4da2645cc9330 /
https://reviews.llvm.org/D138550.
2025-07-14 22:36:05 -05:00
Fangrui Song
8983b22ca1 ReleaseNotes: add lld/ELF notes
Move linker script changes to the middle and target-specific
options/behavior changes to the end.
2025-07-13 23:24:14 -07:00
WhatAmISupposedToPutHere
74a6e5cf91
[LLD][MinGW] Support machine:arm64x when invoked in MinGW mode. (#145343)
Mingw mode already supports building machine:arm64ec arm64x binaries,
support machine:arm64x ones too.

Signed-off-by: Sasha Finkelstein <fnkl.kernel@gmail.com>
2025-07-10 23:47:37 +03:00
Matt Arsenault
2d3d0e502d
RuntimeLibcalls: Fix dropping first libcall entry (#147461)
Fixes regression reported after #144973, which happened to
be acosf.
2025-07-08 21:09:24 +09:00
Nick Fitzgerald
463b3cb93f
[lld][WebAssembly] Abide by configured page size for memory imports (#146916)
This was an oversight in
https://github.com/llvm/llvm-project/pull/128942 where I forgot to add
the configured page size to the `WasmLimits` in the import we emit when
importing a memory.

Fixes: #146713
2025-07-07 09:40:09 -07:00
Parth
923a3cc160
[LLD] Fix crash on parsing ':ALIGN' in linker script (#146723)
The linker was crashing due to stack overflow when parsing ':ALIGN' in
an output section description. This commit fixes the linker script
parser so that the crash does not happen.

The root cause of the stack overflow is how we parse expressions
(readExpr) in linker script and the behavior of ScriptLexer::expect(...)
utility. ScriptLexer::expect does not do anything if errors have already
been encountered during linker script parsing. In particular, it never
increments the current token position in the script file, even if the
current token is the same as the expected token. This causes an infinite
call cycle on parsing an expression such as '(4096)' when an error has
already been encountered.

readExpr() calls readPrimary()
readPrimary() calls readParenExpr()

readParenExpr():

  expect("("); // no-op, current token still points to '('
  Expression *E = readExpr(); // The cycle continues...

Closes #146722

Signed-off-by: Parth Arora <partaror@qti.qualcomm.com>
2025-07-06 10:22:50 -07:00
Tomer Shafir
65e11f600d
[Clang][AArch64] Remove redundant tune args to the backend (#146896)
This change removes unnecessary tune args to the AArch64 backend. The
AArch64 backend automatically handles `tune-cpu` and adds the necessar
y features based on the models from TableGen.

It follows this fix: https://github.com/llvm/llvm-project/pull/146260
where updating a subtarget feature didn't fail the frontend test because
both the toolchain and the test suffered from a coordinated error.
2025-07-05 09:36:13 +03:00
SingleAccretion
497060fae5
[lld][WebAssembly] Add missing relocation types to the --compress-relocations path (#144578)
Fixes https://github.com/llvm/llvm-project/issues/110045.

Reloc list reference:
```
+ Already handled
A Added in this change
= Not applicable / expected (though technically legal, e. g. you can relocate v128.const...)

+ R_WASM_FUNCTION_INDEX_LEB,      0
+ R_WASM_TABLE_INDEX_SLEB,        1
= R_WASM_TABLE_INDEX_I32,         2
+ R_WASM_MEMORY_ADDR_LEB,         3
+ R_WASM_MEMORY_ADDR_SLEB,        4
= R_WASM_MEMORY_ADDR_I32,         5
+ R_WASM_TYPE_INDEX_LEB,          6
+ R_WASM_GLOBAL_INDEX_LEB,        7
= R_WASM_FUNCTION_OFFSET_I32,     8
= R_WASM_SECTION_OFFSET_I32,      9
+ R_WASM_TAG_INDEX_LEB,          10
A R_WASM_MEMORY_ADDR_REL_SLEB,   11
A R_WASM_TABLE_INDEX_REL_SLEB,   12
= R_WASM_GLOBAL_INDEX_I32,       13
+ R_WASM_MEMORY_ADDR_LEB64,      14
+ R_WASM_MEMORY_ADDR_SLEB64,     15
= R_WASM_MEMORY_ADDR_I64,        16
A R_WASM_MEMORY_ADDR_REL_SLEB64, 17
+ R_WASM_TABLE_INDEX_SLEB64,     18
= R_WASM_TABLE_INDEX_I64,        19
+ R_WASM_TABLE_NUMBER_LEB,       20
A R_WASM_MEMORY_ADDR_TLS_SLEB,   21
= R_WASM_FUNCTION_OFFSET_I64,    22
= R_WASM_MEMORY_ADDR_LOCREL_I32, 23
A R_WASM_TABLE_INDEX_REL_SLEB64, 24
A R_WASM_MEMORY_ADDR_TLS_SLEB64, 25
= R_WASM_FUNCTION_INDEX_I32,     26
```
2025-07-02 13:57:42 -07:00
bd1976bris
da01257c3a
[Test] Account for spaces in paths in the new dtlto/files.test (#146749)
This uses LIT substitutions in a response file that could contain spaces
in paths. This caused a failure on a build bot where the path to the
system Python executable was "C:\Program Files\Python310\python.exe", as
reported in #142757.

Add appropriate quoting to fix the issue.
2025-07-02 19:20:09 +01:00
Tom Tromey
cbfc10260c
Fix lld crash caused by dynamic bit offset patch (#146701)
PR #141106 changed the debuginfo metdata to allow dynamic bit offsets
and sizes. This caused a crash in lld when using LTO.

The problem is that lazyLoadOneMetadata assumes that the metadata in
question can be cast to MDNode; but in the typical case where the offset
is a constant, this is not true.

This patch changes this spot to allow non-MDNodes through.

The included test case comes from the report in #141106.
2025-07-02 10:30:34 -07:00
bd1976bris
3b4e79398d
[DTLTO][LLD][ELF] Add support for Integrated Distributed ThinLTO (#142757)
This patch introduces support for Integrated Distributed ThinLTO (DTLTO)
in ELF LLD.

DTLTO enables the distribution of ThinLTO backend compilations via
external distribution systems, such as Incredibuild, during the
traditional link step: https://llvm.org/docs/DTLTO.html.

It is expected that users will invoke DTLTO through the compiler driver
(e.g., Clang) rather than calling LLD directly. A Clang-side interface
for DTLTO will be added in a follow-up patch.

Note: Bitcode members of archives (thin or non-thin) are not currently
supported. This will be addressed in a future change. As a consequence
of this lack of support, this patch is not sufficient to allow for
self-hosting an LLVM build with DTLTO. Theoretically,
--start-lib/--end-lib could be used instead of archives in a self-host
build. However, it's unclear how --start-lib/--end-lib can be easily
used with the LLVM build system.

Testing:
- ELF LLD `lit` test coverage has been added, using a mock distributor
  to avoid requiring Clang.
- Cross-project `lit` tests cover integration with Clang.

For the design discussion of the DTLTO feature, see: #126654.
2025-07-02 16:12:27 +01:00
Alexandre Ganea
e63de82d90
[LLD][COFF] Disallow importing DllMain from import libraries (#146610)
This is a workaround for
https://github.com/llvm/llvm-project/issues/82050 by skipping the `DllMain` symbol if seen in aimport library. If this situation occurs, after this commit a warning will also be displayed. The warning can be silenced with `/ignore:exporteddllmain`
2025-07-02 08:53:18 -04:00
Zhaoxin Yang
2c1900860c
[lld][LoongArch] Support TLSDESC GD/LD to IE/LE (#123715)
Support TLSDESC to initial-exec or local-exec optimizations. Introduce a
new hook RE_LOONGARCH_RELAX_TLS_GD_TO_IE_PAGE_PC and use existing
R_RELAX_TLS_GD_TO_IE_ABS to support TLSDESC => IE, while use existing
R_RELAX_TLS_GD_TO_LE to support TLSDESC => LE.
    
In normal or medium code model, there are two forms of code sequences:
* pcalau12i  $a0, %desc_pc_hi20(sym_desc)
* addi.d     $a0, $a0, %desc_pc_lo12(sym_desc)
* ld.d       $ra, $a0, %desc_ld(sym_desc)
* jirl       $ra, $ra, %desc_call(sym_desc)
------
* pcaddi     $a0, %desc_pcrel_20(sym_desc)
* ld.d       $ra, $a0, %desc_ld(sym_desc)
* jirl       $ra, $ra, %desc_call(sym_desc)
    
Convert to IE:
* pcalau12i $a0, %ie_pc_hi20(sym_ie)
* ld.[wd]   $a0, $a0, %ie_pc_lo12(sym_ie)

Convert to LE:
* lu12i.w $a0, %le_hi20(sym_le) # le_hi20 != 0, otherwise NOP
* ori $a0 src, %le_lo12(sym_le) # le_hi20 != 0, src = $a0, otherwise src = $zero

Simplicity, whether tlsdescToIe or tlsdescToLe, we always tend to
convert the preceding instructions to NOPs, due to both forms of code
sequence (corresponding to relocation combinations:
R_LARCH_TLS_DESC_PC_HI20+R_LARCH_TLS_DESC_PC_LO12 and
R_LARCH_TLS_DESC_PCREL20_S2) have same process.
    
TODO: When relaxation enables, redundant NOPs can be removed. It will be
implemented in a future patch.
    
Note: All forms of TLSDESC code sequences should not appear interleaved
in the normal, medium or extreme code model, which compilers do not
generate and lld is unsupported. This is thanks to the guard in
PostRASchedulerList.cpp in llvm.
```
Calls are not scheduling boundaries before register allocation,
but post-ra we don't gain anything by scheduling across calls
since we don't need to worry about register pressure.
```
2025-07-02 16:09:51 +08:00
Mingjie Xu
6323541a2a
[LLD][ELF] Skip non-SHF_ALLOC sections when checking max VA and max VA difference in relaxOnce() (#145863)
For non-SHF_ALLOC sections, sh_addr is set to 0.
Skip sections without SHF_ALLOC flag, so `minVA` will not be set to 0
with non-SHF_ALLOC sections, and the size of non-SHF_ALLOC sections will
not contribute to `maxVA`.
2025-07-01 09:02:06 +08:00
Tomer Shafir
dd02fb3a51
[AArch64] Fix stale +zcm target feature to +zcm-gpr64 (#146260)
Replaces all the uses of `+zcm` with `+zcm-gpr64`. A fix for:
https://github.com/llvm/llvm-project/pull/146051
2025-06-29 15:01:05 +03:00
Peter Collingbourne
494a74882b
Reapply "ELF: Add branch-to-branch optimization."
Fixed assertion failure when reading .eh_frame sections, and added
.eh_frame sections to tests.

This reverts commit 1e95349dbe329938d2962a78baa0ec421e9cd7d1.

Original commit message follows:

When code calls a function which then immediately tail calls another
function there is no need to go via the intermediate function. By
branching directly to the target function we reduce the program's working
set for a slight increase in runtime performance.

Normally it is relatively uncommon to have functions that just tail call
another function, but with LLVM control flow integrity we have jump tables
that replace the function itself as the canonical address. As a result,
when a function address is taken and called directly, for example after
a compiler optimization resolves the indirect call, or if code built
without control flow integrity calls the function, the call will go via
the jump table.

The impact of this optimization was measured using a large internal
Google benchmark. The results were as follows:

CFI enabled:  +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]

The optimization is enabled by default at -O2 but may also be enabled
or disabled individually with --{,no-}branch-to-branch.

This optimization is implemented for AArch64 and X86_64 only.

lld's runtime performance (real execution time) after adding this
optimization was measured using firefox-x64 from lld-speed-test [1]
with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:

```
    N           Min           Max        Median           Avg        Stddev
x 512     1.2264546     1.3481076     1.2970261     1.2965788   0.018620888
+ 512     1.2561196     1.3839965     1.3214632     1.3209327   0.019443971
Difference at 95.0% confidence
        0.0243538 +/- 0.00233202
        1.87831% +/- 0.179859%
        (Student's t, pooled s = 0.0190369)
```

[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057

Reviewers: zmodem, MaskRay

Reviewed By: MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/145579
2025-06-24 22:16:18 -07:00
Fabrice de Gans
85d250c96e
Use the Windows SDK arguments over the environment (#144805)
If any of the Windows SDK (and MSVC)-related argument is passed in the
command line, they should take priority over the environment variables
like `INCLUDE` or `LIB` set by vcvarsall from the Visual Studio
Developer Environment on Windows.

These changes ensure that all of the arguments related to VC Tools and
the Windows SDK cause the driver to ignore the environment.
2025-06-24 09:39:11 -07:00
Ellis Hoag
b77c7138a8
[lld][BP] Fix duplicate section size measurment (#145384) 2025-06-24 06:31:23 -07:00
Ellis Hoag
068af5bfb4
[lld][BP] Print total size of startup symbols (#145106)
A good proxy to estimate the number of page faults during startup is the
total size of startup functions. Assuming profiles are up-to-date, we
can measure this total size pretty easily. Note that if profile data is
old, this number could be wrong.
2025-06-23 08:18:04 -07:00
Hans Wennborg
1e95349dbe Revert "ELF: Add branch-to-branch optimization."
This caused assertion failures in applyBranchToBranchOpt():

  llvm/include/llvm/Support/Casting.h:578:
  decltype(auto) llvm::cast(From*)
  [with To = lld:🧝:InputSection; From = lld:🧝:InputSectionBase]:
  Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

See comment on the PR (https://github.com/llvm/llvm-project/pull/138366)

This reverts commit 491b82a5ec1add78d2c93370580a2f1897b6a364.

This also reverts the follow-up "[lld] Use llvm::partition_point (NFC) (#145209)"

This reverts commit 2ac293f5ac4cf65c0c038bf75a88f1d6715e467d.
2025-06-23 13:26:02 +02:00
Kazu Hirata
2ac293f5ac
[lld] Use llvm::partition_point (NFC) (#145209) 2025-06-22 06:30:10 -07:00
Douglas Yung
757c80d88a Add REQUIRES: x86 to test added in 141197 to skip when x86 target is not present. 2025-06-21 22:37:02 +00:00
Haohai Wen
9cc9efc483
[lld][COFF] Remove duplicate strtab entries (#141197)
String table size is too big for large binary when symbol table is
enabled. Some strings in strtab is same so it can be reused.

This patch revives 9ffeaaa authored by mstorsjo with the prioritized
string table builder to fix debug section name issue (see 4d2eda2
for more details).

---------

Co-authored-by: Wen Haohai <whh108@live.com>
Co-authored-by: James Henderson <James.Henderson@sony.com>
2025-06-21 13:44:10 +08:00
Peter Collingbourne
491b82a5ec ELF: Add branch-to-branch optimization.
When code calls a function which then immediately tail calls another
function there is no need to go via the intermediate function. By
branching directly to the target function we reduce the program's working
set for a slight increase in runtime performance.

Normally it is relatively uncommon to have functions that just tail call
another function, but with LLVM control flow integrity we have jump tables
that replace the function itself as the canonical address. As a result,
when a function address is taken and called directly, for example after
a compiler optimization resolves the indirect call, or if code built
without control flow integrity calls the function, the call will go via
the jump table.

The impact of this optimization was measured using a large internal
Google benchmark. The results were as follows:

CFI enabled:  +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]

The optimization is enabled by default at -O2 but may also be enabled
or disabled individually with --{,no-}branch-to-branch.

This optimization is implemented for AArch64 and X86_64 only.

lld's runtime performance (real execution time) after adding this
optimization was measured using firefox-x64 from lld-speed-test [1]
with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:

```
    N           Min           Max        Median           Avg        Stddev
x 512     1.2264546     1.3481076     1.2970261     1.2965788   0.018620888
+ 512     1.2561196     1.3839965     1.3214632     1.3209327   0.019443971
Difference at 95.0% confidence
	0.0243538 +/- 0.00233202
	1.87831% +/- 0.179859%
	(Student's t, pooled s = 0.0190369)
```

[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057

Pull Request: https://github.com/llvm/llvm-project/pull/138366
2025-06-20 13:16:24 -07:00