816 Commits

Author SHA1 Message Date
Fangrui Song
5d928ffce2 [ELF] Remove error-prone RelocationBaseSection::classof 2024-10-19 21:02:03 -07:00
Fangrui Song
861bd36bce [ELF] Pass Ctx & to Symbol::getVA 2024-10-19 20:32:58 -07:00
Fangrui Song
0dbc85a59f [ELF] Pass Ctx & to Arch-specific code 2024-10-13 11:08:06 -07:00
Fangrui Song
002ca63b3f [ELF] Pass Ctx & to (read|write)(16|64) 2024-10-13 10:47:18 -07:00
Fangrui Song
38dfcd9ac9 [ELF] Pass Ctx & to read32/write32 2024-10-13 10:37:47 -07:00
Fangrui Song
c33133279b [ELF] Pass Ctx & to InputSection 2024-10-11 20:39:53 -07:00
Fangrui Song
9bf2e20b17 [ELF] Pass Ctx & to OutputSection 2024-10-11 20:28:58 -07:00
Fangrui Song
81bd712f92 [ELF] Revert Ctx & parameters from SyntheticSection
Since Ctx &ctx is a member variable,
1f391a75af8685e6bba89421443d72ac6a186599
7a5b9ef54eb96abd8415fd893576c42e51fd95db
e2f0ec3a3a8a2981be8a1aac2004cfb9064c61e8 can be reverted.
2024-10-10 23:43:21 -07:00
Fangrui Song
25cda9e069 [ELF] Pass Ctx & to SyntheticSection 2024-10-10 23:07:02 -07:00
Fangrui Song
c22588c7cd [ELF] Move InputSectionBase::file to SectionBase
... and add getCtx (file->ctx). This allows InputSectionBase and
OutputSection to access ctx without taking an extra function argument.
2024-10-10 22:15:10 -07:00
Fangrui Song
cfd3289a1f [ELF] Pass Ctx & to some free functions 2024-10-06 19:36:21 -07:00
Fangrui Song
7a5b9ef54e [ELF] Pass Ctx & to SyntheticSection::writeTo 2024-10-03 20:56:09 -07:00
Fangrui Song
1f391a75af [ELF] Pass Ctx & to SyntheticSection::finalizeContents 2024-10-03 20:45:40 -07:00
Fangrui Song
3590068950 [ELF] Pass Ctx & to OutputSections 2024-10-03 20:06:58 -07:00
Fangrui Song
6f482010ae [ELF] Replace config-> with ctx.arg. 2024-09-21 22:46:13 -07:00
Daniil Fukalov
65bc259a97
[NFC] Add explicit #include llvm-config.h where its macros are used, last part. (#107615)
(this is the part related to bolt, lld and mlir)

Without these explicit includes, removing other headers, who implicitly
include llvm-config.h, may have non-trivial side effects. For example,
`clangd` may report even `llvm-config.h` as "no used" in case it defines
a macro, that is explicitly used with #ifdef. It is actually amplified
with different build configs which use different set of macros.
2024-09-20 19:59:39 +02:00
Fangrui Song
e88b7ff016 [ELF] Move InStruct into Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.

llvm/Support/thread.h includes <thread>, which transitively includes
sstream in libc++ and uses ios_base::in, so we cannot use `#define in ctx.sec`.

`symtab, config, ctx` are now the only variables using
LLVM_LIBRARY_VISIBILITY.
2024-09-15 22:15:02 -07:00
Vitaly Buka
a248ec3178 Revert "[ELF] Move InStruct into Ctx. NFC"
The define breaks `std::in`.

https://lab.llvm.org/buildbot/#/builders/169/builds/3253

This reverts commit 2531b46264cd066d51f2571d134a63998d13710f.
2024-09-15 18:22:42 -07:00
Fangrui Song
2531b46264 [ELF] Move InStruct into Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.

`#define in ctx.sec` is used for now to avoid migrating `in.xxx`.
2024-09-15 16:59:28 -07:00
Fangrui Song
b4feb26606 [ELF] Move target to Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.

Follow-up to driver (2022-10) and script (2024-08).
2024-08-21 23:53:36 -07:00
Fangrui Song
2fe3bbdf67 [ELF] Move outputSections into Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.
2024-08-03 11:50:48 -07:00
Fangrui Song
09dd0febbb [ELF] Move Out into Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons. ctx's hidden visibility optimizes generated instructions.

bufferStart and tlsPhdr, which are not OutputSection, can now be moved
outside of `Out`.
2024-08-03 11:00:11 -07:00
Fangrui Song
0af07c0787
[ELF] Support relocatable files using CREL with explicit addends
... using the temporary section type code 0x40000020
(`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
code and break compatibility (Clang and lld of different versions are
not guaranteed to cooperate, unlike other features). CREL with implicit
addends are not supported.

---

Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
update users to check `crels`.

(The decoding performance is critical and error checking is difficult.
Follow `skipLeb` and `R_*LEB128` handling, do not use
`llvm::decodeULEB128`, whichs compiles to a lot of code.)

A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
`/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
convert CREL to RELA (`relas` instead of `crels` will be used). Since
allocating a buffer increases, the conversion is only performed when
absolutely necessary.

---

Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
links. SHT_CREL and SHT_RELA components need reencoding since
r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
relocations referencing a symbol in a discarded section are converted to
`R_*_NONE`).

* SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
* SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
* SHT_REL components: print an error for now.

SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
unsupported yet.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/98115
2024-08-01 10:22:03 -07:00
Fangrui Song
cfa97699f7 [ELF] Retain uncompressed if compressed content is larger
--compress-debug-sections in GNU ld, gas, and LLVM integrated assembler
retain the uncompressed content if the compressed content is larger.

This patch also updates the manpage (-O2 does not enable zlib level 6)
and fixes a crash of --compress-sections when the uncompressed section
is empty.
2024-05-22 15:55:21 -07:00
Daniel Thornburgh
66466ff151 Reland: [LLD] Implement --enable-non-contiguous-regions (#90007)
When enabled, input sections that would otherwise overflow a memory
region are instead spilled to the next matching output section.

This feature parallels the one in GNU LD, but there are some differences
from its documented behavior:

- /DISCARD/ only matches previously-unmatched sections (i.e., the flag
does not affect it).

- If a section fails to fit at any of its matches, the link fails
instead of discarding the section.

- The flag --enable-non-contiguous-regions-warnings is not implemented,
as it exists to warn about such occurrences.

The implementation places stubs at possible spill locations, and
replaces them with the original input section when effecting spills.
Spilling decisions occur after address assignment. Sections are spilled
in reverse order of assignment, with each spill naively decreasing the
size of the affected memory regions. This continues until the memory
regions are brought back under size. Spilling anything causes another
pass of address assignment, and this continues to fixed point.

Spilling after rather than during assignment allows the algorithm to
consider the size effects of unspillable input sections that appear
later in the assignment. Otherwise, such sections (e.g. thunks) may
force an overflow, even if spilling something earlier could have avoided
it.

A few notable feature interactions occur:

- Stubs affect alignment, ONLY_IF_RO, etc, broadly as if a copy of the
input section were actually placed there.

- SHF_MERGE synthetic sections use the spill list of their first
contained input section (the one that gives the section its name).

- ICF occurs oblivious to spill sections; spill lists for merged-away
sections become inert and are removed after assignment.

- SHF_LINK_ORDER and .ARM.exidx are ordered according to the final
section ordering, after all spilling has completed.

- INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.
2024-05-13 11:06:54 -07:00
Daniel Thornburgh
81f34afa5c
Revert "[LLD] Implement --enable-non-contiguous-regions" (#92005)
Reverts llvm/llvm-project#90007

Broke in merging I think.
2024-05-13 10:38:40 -07:00
Daniel Thornburgh
673114447b
[LLD] Implement --enable-non-contiguous-regions (#90007)
When enabled, input sections that would otherwise overflow a memory
region are instead spilled to the next matching output section.

This feature parallels the one in GNU LD, but there are some differences
from its documented behavior:

- /DISCARD/ only matches previously-unmatched sections (i.e., the flag
does not affect it).

- If a section fails to fit at any of its matches, the link fails
instead of discarding the section.

- The flag --enable-non-contiguous-regions-warnings is not implemented,
as it exists to warn about such occurrences.

The implementation places stubs at possible spill locations, and
replaces them with the original input section when effecting spills.
Spilling decisions occur after address assignment. Sections are spilled
in reverse order of assignment, with each spill naively decreasing the
size of the affected memory regions. This continues until the memory
regions are brought back under size. Spilling anything causes another
pass of address assignment, and this continues to fixed point.

Spilling after rather than during assignment allows the algorithm to
consider the size effects of unspillable input sections that appear
later in the assignment. Otherwise, such sections (e.g. thunks) may
force an overflow, even if spilling something earlier could have avoided
it.

A few notable feature interactions occur:

- Stubs affect alignment, ONLY_IF_RO, etc, broadly as if a copy of the
input section were actually placed there.

- SHF_MERGE synthetic sections use the spill list of their first
contained input section (the one that gives the section its name).

- ICF occurs oblivious to spill sections; spill lists for merged-away
sections become inert and are removed after assignment.

- SHF_LINK_ORDER and .ARM.exidx are ordered according to the final
section ordering, after all spilling has completed.

- INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.
2024-05-13 10:30:50 -07:00
Fangrui Song
04d0a691af [ELF] Fix --compress-debug-sections=zstd when zlib is disabled 2024-05-07 16:56:45 -07:00
Fangrui Song
6d44a1ef55
[ELF] Adjust --compress-sections to support compression level
zstd excels at scaling from low-ratio-very-fast to
high-ratio-pretty-slow. Some users prioritize speed and prefer disk read
speed, while others focus on achieving the highest compression ratio
possible, similar to traditional high-ratio codecs like LZMA.

Add an optional `level` to `--compress-sections` (#84855) to cater to
these diverse needs. While we initially aimed for a one-size-fits-all
approach, this no longer seems to work.
(https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html)

When --compress-debug-sections is used together, make
--compress-sections take precedence since --compress-sections is usually
more specific.

Remove the level distinction between -O/-O1 and -O2 for
--compress-debug-sections=zlib for a more consistent user experience.

Pull Request: https://github.com/llvm/llvm-project/pull/90567
2024-05-01 11:40:46 -07:00
Fangrui Song
91fef0013f [ELF] Catch zlib deflateInit2 error
The function may return Z_MEM_ERROR or Z_STREAM_ERR. The former does not
have a good way of testing. The latter will be possible with a pending
change that allows setting the compression level, which will come with a
test.
2024-05-01 11:32:04 -07:00
Fangrui Song
79095b4079 [ELF] --compress-debug-sections=zstd: replace ZSTD_c_nbWorkers parallelism with multi-frame parallelism
https://reviews.llvm.org/D133679 utilizes zstd's multithread API to
create one single frame. This provides a higher compression ratio but is
significantly slower than concatenating multiple frames.

With manual parallelism, it is easier to parallelize memcpy in
OutputSection::writeTo for parallel memcpy.

In addition, as the individual allocated decompression buffers are much
smaller, we can make a wild guess (compressed_size/4) without worrying
about a resize (due to wrong guess) would waste memory.
2024-04-29 22:05:35 -07:00
Fangrui Song
0e47dfede4
[ELF] Add isStaticRelSecType to simplify SHT_REL/SHT_RELA testing. NFC
and make it easier to introduce a new relocation format.

https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/85893
2024-03-20 09:58:56 -07:00
Fangrui Song
ea72c082bc [ELF] Change getSymbolIndex to use const reference. NFC 2024-03-18 13:58:55 -07:00
Fangrui Song
f1ca2a0967
[ELF] Add --compress-section to compress matched non-SHF_ALLOC sections
--compress-sections <section-glib>=[none|zlib|zstd] is similar to
--compress-debug-sections but applies to broader sections without the
SHF_ALLOC flag. lld will report an error if a SHF_ALLOC section is
matched. An interesting use case is to compress `.strtab`/`.symtab`,
which consume a significant portion of the file size (15.1% for a
release build of Clang).

An older revision is available at https://reviews.llvm.org/D154641 .
This patch focuses on non-allocated sections for safety. Moving
`maybeCompress` as D154641 does not handle STT_SECTION symbols for
`-r --compress-debug-sections=zlib` (see `relocatable-section-symbol.s`
from #66804).

Since different output sections may use different compression
algorithms, we need CompressedData::type to generalize
config->compressDebugSections.

GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452

Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674

Pull Request: https://github.com/llvm/llvm-project/pull/84855
2024-03-12 10:56:14 -07:00
Yaxun (Sam) Liu
3594769f20
[ELF] Define NOMINMAX to fix zlib.h caused build failure on Windows (#70368)
On Windows when zlib is enabled, zlib header introduced some Windows
headers which defines max as a macro. Since OutputSections.cpp uses
std::max with template argument, this causes compilation error.

Define macro NOMINMAX to avoid this.
2023-11-02 08:59:54 -04:00
Fangrui Song
0cbe49eade [ELF] Implement getImplicitAddend and enable checkDynamicRelocsDefault for PPC32 2023-09-15 22:49:18 -07:00
Fangrui Song
1b65b159da [ELF] Enable checkDynamicRelocsDefault for PPC64
.plt and .branch_lt have the type of SHT_NOBITS and may be relocated by dynamic
relocations with non-zero addends. They should be skipped for the
--check-dynamic-relocations check, as --apply-dynamic-relocs does not apply.

A side effect is that -z rel does not work for the two sections.

Added two --apply-dynamic-relocs --check-dynamic-relocations tests. Also checked
linking a PPC64 clang.
2023-09-15 22:38:18 -07:00
Simi Pallipurath
f146763e07 Revert "Revert "[lld][Arm] Big Endian - Byte invariant support.""
This reverts commit d8851384c6ac2a1cea15e05228dbde5f13654e23.

Reason: Applied the fix for the Asan buildbot failures.
2023-06-22 16:10:18 +01:00
Simi Pallipurath
d8851384c6 Revert "[lld][Arm] Big Endian - Byte invariant support."
This reverts commit 8cf8956897ce9bca3176c6339077b1ca17b27abc.
2023-06-20 17:27:44 +01:00
Simi Pallipurath
8cf8956897 [lld][Arm] Big Endian - Byte invariant support.
Arm has BE8 big endian configuration called a byte-invariant(every byte has the same address on little and big-endian systems).

When in BE8 mode:
  1. Instructions are big-endian in relocatable objects but
     little-endian in executables and shared objects.
  2. Data is big-endian.
  3. The data encoding of the ELF file is ELFDATA2MSB.

To support BE8 without an ABI break for relocatable objects,the linker takes on the responsibility of changing the endianness of instructions. At a high level the only difference between BE32 and BE8 in the linker is that for BE8:
  1. The linker sets the flag EF_ARM_BE8 in the ELF header.
  2. The linker endian reverses the instructions, but not data.

This patch adds BE8 big endian support for Arm. To endian reverse the instructions we'll need access to the mapping symbols. Code sections can contain a mix of Arm, Thumb and literal data. We need to endian reverse Arm instructions as words, Thumb instructions
as half-words and ignore literal data.The only way to find these transitions precisely is by using mapping symbols. The instruction reversal will need to take place after relocation. For Arm BE8 code sections (Section has SHF_EXECINSTR flag ) we inserted a step after relocation to endian reverse the instructions. The implementation strategy i have used here is to write all sections BE32  including SyntheticSections then endian reverse all code in InputSections via mapping symbols.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D150870
2023-06-20 14:08:21 +01:00
Fangrui Song
8d85c96e0e [lld] StringRef::{starts,ends}with => {starts,ends}_with. NFC
The latter form is now preferred to be similar to C++20 starts_with.
This replacement also removes one function call when startswith is not inlined.
2023-06-05 14:36:19 -07:00
Leonard Chan
b9249a69cc [lld][ELF] Do not emit warning for NOLOAD output sections
Much of NOLOAD's intended use is to explicitly change the type of an
output section, so we shouldn't flag these as warnings.

Differential Revision: https://reviews.llvm.org/D151144
2023-05-23 20:41:20 +00:00
Fangrui Song
1408504564 [ELF] Name MergeSyntheticSection using an input section instead of the output section
In a link map, the input section name gives more information. See the updated
merge-entsize.s for an example. The output file is unchanged.

Compiler generated input sections with the SHF_MERGE flag have names such as
.rodata.str1.1 and .rodata.cstN, and are not affected by -fdata-sections.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D149466
2023-05-02 09:35:00 -07:00
Alexey Lapshin
fea8c07356 [Support][Parallel] Add sequential mode to TaskGroup::spawn().
This patch allows to specify that some part of tasks should be
done in sequential order. It makes it possible to not use
condition operator for separating sequential tasks:

TaskGroup tg;
for () {
  if(condition)      ==>   tg.spawn([](){fn();}, condition)
    fn();
  else
    tg.spawn([](){fn();});
}

It also prevents execution on main thread. Which allows adding
checks for getThreadIndex() function discussed in D142318.

The patch also replaces std::stack with std::deque in the
ThreadPoolExecutor to have natural execution order in case
(parallel::strategy.ThreadsRequested == 1).

Differential Revision: https://reviews.llvm.org/D148728
2023-04-26 13:52:26 +02:00
Jez Ng
3df4c5a92f [NFC] Optimize vector usage in lld
By using emplace_back, as well as converting some loops to for-each, we can do more efficient vectorization.

Make copy constructor for TemporaryFile noexcept.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D139552
2023-01-26 20:31:42 -05:00
serge-sans-paille
984b800a03
Move from llvm::makeArrayRef to ArrayRef deduction guides - last part
This is a follow-up to https://reviews.llvm.org/D140896, split into
several parts as it touches a lot of files.

Differential Revision: https://reviews.llvm.org/D141298
2023-01-10 11:47:43 +01:00
Guillaume Chatelet
08e2a76381 [lld][NFC] rename ELF alignment into addralign 2022-12-01 16:20:12 +00:00
Fangrui Song
1a50213ce7 [ELF] --compress-debug-sections=zstd: ignore error if zstd was not built with ZSTD_MULTITHREAD 2022-09-22 13:16:50 -07:00
Alex Brachet
38b20a02fe [ELF] Fix std::min error on MacOs 2022-09-22 19:03:13 +00:00
Dmitri Gribenko
eda9fdc493 Fix -Wunused-local-typedef warning in some build configurations 2022-09-22 17:10:17 +02:00