COFF associative sections is a feature where relocatable object files
can have section snippets marked as related to another section snippet,
so they are kept or discarded in relation to that other section snippet.
When llvm-objcopy removes sections, it also removes sections that are
marked as associative to the removed section (as the associative
sections otherwise would end up orphaned).
In a linked executable module (EXE or DLL), section associativity is
meaningless - thus, we should ignore those fields from the input.
After linking, GNU ld keeps the SectionDefinition auxillary part of
symbols intact as it was in the source object file, which means that it
references section numbers in the source object files.
This fixes https://github.com/llvm/llvm-project/issues/53433.
The term BSS (Block Started by Symbol) is a standard, widely recognized
term, available in the a.out object file format and adopted by formats
like COFF, XCOFF, Mach-O (called S_ZEROFILL while `__bss` is also used),
and ELF. To avoid introducing unfamiliar terms, we should use
isBSSSection instead of isVirtualSection.
The prioritized string table builder was introduced in 9cc9efc. This
patch sets highest priority for the section name to place it at the
start of string table to avoid the issue described in 4d2eda2.
Code originally added in #120995 and later corrected in #130517 but
apparently still not correct according to #141494 and
rust-lang/rust#141913.
Revert the special handling because the test written in #120995 and
#130517 still passes without those changes. Kept the test and improved
it with a `__DATA` section to keep the current behaviour checked in case
other changes modify the behaviour and break this edge case.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Bug introduced #120995. The LLD code calculates the "size" of the Mach-O
headers, and then uses that size to place the segments, but the `__TEXT`
section stays at `fileoff` zero. When I wrote the code into llvm-objcopy
I calculated the extra space into the initial offset, which moved all
the sections back 1 page.
Besides the modified test checking for the right `fileoff` values of the
sections and the segments, I also manually checked the generated
binaries after `llvm-objcopy` using `dyld_info`, as the bug report
suggested.
Fixes#130472
… memsz>0
Currently, segments with a file size of 0 are ignored for the purposes
of --change-section-lma, regardless of their memory size. It seems
reasonable to me to modify such segments given that we're changing the
LMA for all sections and these LMAs may be used during loading. GNU
objcopy also seems to adjust such segments.
Additionally, segments with file size > 0 and memory size = 0 will no
longer be modified for the purposes of --change-section-lma as they
shouldn't be part of the loaded memory image.
Fixes#124680
Fix printing the correct file path in the error message when the output
file specified by `--dump-section` cannot be opened
Fixes: #125113 on ELF, MachO, Wasm
This fixes "unused-local-typedef" warnings in 9324e6a7a5.
This adds an option `--remove-note=[name/]type` to selectively delete
notes in ELF files, where `type` is the numeric value of the note type
and `name` is the name of the originator. The name can be omitted, in
which case all notes of the specified type will be removed. For now,
only `SHT_NOTE` sections that are not associated with segments are
handled. The implementation can be extended later as needed.
RFC: https://discourse.llvm.org/t/rfc-llvm-objcopy-feature-for-editing-notes/83491
This adds an option `--remove-note=[name/]type` to selectively delete
notes in ELF files, where `type` is the numeric value of the note type
and `name` is the name of the originator. The name can be omitted, in
which case all notes of the specified type will be removed. For now,
only `SHT_NOTE` sections that are not associated with segments are
handled. The implementation can be extended later as needed.
RFC: https://discourse.llvm.org/t/rfc-llvm-objcopy-feature-for-editing-notes/83491
LLD (and other Mach-O linkers) when preparing an encryptable binary make
space to leave all the load commands in an non-encrypted page (see [1])
When using objcopy of a small encryptable binary, the code was not
respecting this fact, and the encryptable segments were not kept beyond
the first page. This was obvious for small or empty binaries.
The changes introduced here keep track if a `LC_ENCRYPTION_INFO` or
`LC_ENCRYPTION_INFO_64` has been seen, and in such case, it adds a full
page of offset in order to leave the load commands in its own page
(similar to what LLD is doing).
[1]:
d8e7929312/lld/MachO/SyntheticSections.cpp (L90-L93)
When not in MinGW mode, the PE debug directory is placed in .rdata by
the linker instead of .buildid. In addition to .buildid always
explicitly preserve the section containing the debug directory to avoid
causing errors later in patchDebugDirectory.
This patch adds support of the following llvm-objcopy flags for MachO:
- `--globalize-symbol`, `--globalize-symbols`,
- `--keep-global-symbol`, `-G`, `--keep-global-symbols`,
- `--localize-symbol`, `-L`, `--localize-symbols`,
- `--skip-symbol`, `--skip-symbols`.
Code in `updateAndRemoveSymbols` for MachO
is kept similar to its version for ELF.
Fixes#120894
Let's say we've run llvm-strip over some MachO. The resulting MachO
became smaller and there are no indirect symbols anymore.
Then according to previous code we would not update indirectsymoff
field. This would lead to `MachOWriter::totalSize()` report size that is
larger than the new MachO. As a result we would get MachO that has zero
bytes past contents of the very last load command.
Codesign has a strict check that size of MachO file must be equal to
lastLoadCommand.offset + lastLoadCommand.size
If this is not satisfied codesign reports the following error
main executable failed strict validation
Fixes#117723
This change is enough to allow `--strip-debug` to work on object files,
without breaking the relocation information or symbol table.
A more complete version of this change would instead reconstruct the
symbol table and relocation sections, but that is much larger change.
Bug: #102002
--change-section address and its alias --adjust-section-vma allows
modification
of section addresses in a relocatable file. This used to be used, for
example,
in Fiasco microkernel.
On a relocatable file this option behaves the same as GNU objcopy, apart
from
the fact that it does not issue any warnings, for example, when an
argument is
not used.
GNU objcopy does not produce an error when passed an executable file but
the
usecase for this is not clear, and the behaviour is inconsistent. The
idea of
GNU objcopy --change-section-address is that the option should change
both LMA
and VMA in an executable file. Since this patch does not implement
executable
file support, only VMA is changed.
Otherwise, llvm-objcopy fails with use-after-free when built under
sanitizers. Simple repro: run the test
`ELF/remove-section-in-group.test` under asan. This is due to symbol
table references to empty section groups that must be removed.
Currently `llvm-objcopy/llvm-strip` in `--strip-debug` mode doesn't
remove such sections. This behavior can lead to incompatibilities with
GNU binutils (for examples ld.bfd before https://sourceware.org/PR20520
cannot process the object file contains empty .group section).
The ELF object that contains group section with `.debug_*` sections
inside can be obtained by `gcc -g3`.
Fix#97139
llvm-objcopy did not support change-section-lma argument.
This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.
This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.
In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.
--compress-sections is similar to --compress-debug-sections but applies
to arbitrary sections.
* `--compress-sections <section>=none`: decompress sections
* `--compress-sections <section>=[zlib|zstd]`: compress sections with zlib/zstd
Like `--remove-section`, the pattern is by default a glob, but a regex
when --regex is specified.
For `--remove-section` like options, `!` prevents matches and is not
dependent on ordering (see `ELF/wildcard-syntax.test`). Since
`--compress-sections a=zlib --compress-sections a=none` naturally allows
overriding, having an order-independent `!` would be confusing.
Therefore, `!` is disallowed.
Sections within a segment are effectively immutable. Report an error for
an attempt to (de)compress them. `SHF_ALLOC` sections in a relocatable
file can be compressed, but linkers usually reject them.
Note: Before this patch, a compressed relocation section is recognized
as a `RelocationSectionBase` as well and `removeSections` `!ToRemove(*ToRelSec)`
may incorrectly interpret a `CompressedSections` as `RelocationSectionBase`,
leading to ubsan failure for the new test. Fix this by setting
`OriginalFlags` in CompressedSection::CompressedSection.
Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674
Pull Request: https://github.com/llvm/llvm-project/pull/85036
This reverts commit 9e3b64b9f95aadf57568576712902a272fe66503.
Reason: Broke the UBSan buildbot. See the comments in the pull request
(https://github.com/llvm/llvm-project/pull/85036) for more information.
--compress-sections is similar to --compress-debug-sections but applies
to arbitrary sections.
* `--compress-sections <section>=none`: decompress sections
* `--compress-sections <section>=[zlib|zstd]`: compress sections with zlib/zstd
Like `--remove-section`, the pattern is by default a glob, but a regex
when --regex is specified.
For `--remove-section` like options, `!` prevents matches and is not
dependent on ordering (see `ELF/wildcard-syntax.test`). Since
`--compress-sections a=zlib --compress-sections a=none` naturally allows
overriding, having an order-independent `!` would be confusing.
Therefore, `!` is disallowed.
Sections within a segment are effectively immutable. Report an error for
an attempt to (de)compress them. `SHF_ALLOC` sections in a relocatable
file can be compressed, but linkers usually reject them.
Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674
Pull Request: https://github.com/llvm/llvm-project/pull/85036
`TargetEndianness` is long and unwieldy. "Target" in the name is confusing. Rename it to "Endianness".
I cannot find noticeable out-of-tree users of `TargetEndianness`, but
keep `TargetEndianness` to make this patch safer. `TargetEndianness`
will be removed by a subsequent change.
Add --skip-symbol and --skip-symbols options that allow to skip symbols
when executing other options that can change the symbol's name, binding
or visibility, similar to an existing option --keep-symbol that keeps a
symbol from being removed by other options.
Simplify --[de]compress-debug-sections to make it easier to add custom section [de]compression.
Change the following two behaviors to match GNU objcopy.
* --compress-debug-sections compresses SHF_ALLOC sections while GNU
doesn't.
* --decompress-debug-sections decompresses non-debug sections while GNU
doesn't.
Pull Request: https://github.com/llvm/llvm-project/pull/84885
Add options --set-symbol-visibility and --set-symbols-visibility to
manually change the visibility of symbols.
There is already an option to set the visibility of newly added symbols
via --add-symbol and --new-symbol-visibility. This option will allow to
change the visibility of already existing symbols.
(#79887) When the offset of a PT_INTERP segment equals the offset of a
PT_LOAD segment, we consider that the parent of the PT_LOAD segment is
the PT_INTERP segment. In `layoutSegments`, we place both segments to be
after the current `Offset`, ignoring the PT_LOAD alignment.
This scenario is possible with fixed section addresses, but doesn't
happen with default linker layouts (.interp precedes other sections and
is part of a PT_LOAD segment containing the ELF header and program
headers).
```
% cat a.s
.globl _start; _start: ret
.rodata; .byte 0
.tdata; .balign 4096; .byte 0
% clang -fuse-ld=lld a.s -o a -nostdlib -no-pie -z separate-loadable-segments -Wl,-Ttext=0x201000,--section-start=.interp=0x202000,--section-start=.rodata=0x202020,-z,nognustack
% llvm-objcopy a a2
% llvm-readelf -l a2 # incorrect offset(PT_LOAD)
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000200040 0x0000000000200040 0x0001c0 0x0001c0 R 0x8
INTERP 0x001001 0x0000000000202000 0x0000000000202000 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x000200 0x000200 R 0x1000
LOAD 0x001000 0x0000000000201000 0x0000000000201000 0x000001 0x000001 R E 0x1000
//// incorrect offset
LOAD 0x001001 0x0000000000202000 0x0000000000202000 0x000021 0x000021 R 0x1000
LOAD 0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 RW 0x1000
TLS 0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x000001 R 0x1000
GNU_RELRO 0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 R 0x1000
```
The same issue occurs for PT_TLS/PT_GNU_RELRO if we PT_TLS's alignment
is smaller and we place the PT_LOAD after PT_TLS/PT_GNU_RELRO segments
(not linker default, but possible with a `PHDRS` linker script command).
Fix#79887: when two segments have the same offset, order the one with a
larger alignment first. In the previous case, the PT_LOAD segment will
go before the PT_INTERP segment. In case of equal alignments, it doesn't
matter which segment is treated as the parent segment.