This reverts commit 9e3b64b9f95aadf57568576712902a272fe66503.
Reason: Broke the UBSan buildbot. See the comments in the pull request
(https://github.com/llvm/llvm-project/pull/85036) for more information.
--compress-sections is similar to --compress-debug-sections but applies
to arbitrary sections.
* `--compress-sections <section>=none`: decompress sections
* `--compress-sections <section>=[zlib|zstd]`: compress sections with zlib/zstd
Like `--remove-section`, the pattern is by default a glob, but a regex
when --regex is specified.
For `--remove-section` like options, `!` prevents matches and is not
dependent on ordering (see `ELF/wildcard-syntax.test`). Since
`--compress-sections a=zlib --compress-sections a=none` naturally allows
overriding, having an order-independent `!` would be confusing.
Therefore, `!` is disallowed.
Sections within a segment are effectively immutable. Report an error for
an attempt to (de)compress them. `SHF_ALLOC` sections in a relocatable
file can be compressed, but linkers usually reject them.
Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674
Pull Request: https://github.com/llvm/llvm-project/pull/85036
`TargetEndianness` is long and unwieldy. "Target" in the name is confusing. Rename it to "Endianness".
I cannot find noticeable out-of-tree users of `TargetEndianness`, but
keep `TargetEndianness` to make this patch safer. `TargetEndianness`
will be removed by a subsequent change.
Add --skip-symbol and --skip-symbols options that allow to skip symbols
when executing other options that can change the symbol's name, binding
or visibility, similar to an existing option --keep-symbol that keeps a
symbol from being removed by other options.
Simplify --[de]compress-debug-sections to make it easier to add custom section [de]compression.
Change the following two behaviors to match GNU objcopy.
* --compress-debug-sections compresses SHF_ALLOC sections while GNU
doesn't.
* --decompress-debug-sections decompresses non-debug sections while GNU
doesn't.
Pull Request: https://github.com/llvm/llvm-project/pull/84885
Add options --set-symbol-visibility and --set-symbols-visibility to
manually change the visibility of symbols.
There is already an option to set the visibility of newly added symbols
via --add-symbol and --new-symbol-visibility. This option will allow to
change the visibility of already existing symbols.
(#79887) When the offset of a PT_INTERP segment equals the offset of a
PT_LOAD segment, we consider that the parent of the PT_LOAD segment is
the PT_INTERP segment. In `layoutSegments`, we place both segments to be
after the current `Offset`, ignoring the PT_LOAD alignment.
This scenario is possible with fixed section addresses, but doesn't
happen with default linker layouts (.interp precedes other sections and
is part of a PT_LOAD segment containing the ELF header and program
headers).
```
% cat a.s
.globl _start; _start: ret
.rodata; .byte 0
.tdata; .balign 4096; .byte 0
% clang -fuse-ld=lld a.s -o a -nostdlib -no-pie -z separate-loadable-segments -Wl,-Ttext=0x201000,--section-start=.interp=0x202000,--section-start=.rodata=0x202020,-z,nognustack
% llvm-objcopy a a2
% llvm-readelf -l a2 # incorrect offset(PT_LOAD)
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000200040 0x0000000000200040 0x0001c0 0x0001c0 R 0x8
INTERP 0x001001 0x0000000000202000 0x0000000000202000 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x000200 0x000200 R 0x1000
LOAD 0x001000 0x0000000000201000 0x0000000000201000 0x000001 0x000001 R E 0x1000
//// incorrect offset
LOAD 0x001001 0x0000000000202000 0x0000000000202000 0x000021 0x000021 R 0x1000
LOAD 0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 RW 0x1000
TLS 0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x000001 R 0x1000
GNU_RELRO 0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 R 0x1000
```
The same issue occurs for PT_TLS/PT_GNU_RELRO if we PT_TLS's alignment
is smaller and we place the PT_LOAD after PT_TLS/PT_GNU_RELRO segments
(not linker default, but possible with a `PHDRS` linker script command).
Fix#79887: when two segments have the same offset, order the one with a
larger alignment first. In the previous case, the PT_LOAD segment will
go before the PT_INTERP segment. In case of equal alignments, it doesn't
matter which segment is treated as the parent segment.
This fixes the issue mentioned here:
https://github.com/llvm/llvm-project/issues/57407
It prevents `llvm-objcopy` from removing the `.gnu _debuglink` section
when used with the `--strip-all` flag. Since `--strip-all` is the
default of `llvm-strip` the patch also prevents `llvm-strip` from
removing the `.gnu_debuglink` section.
In the change that added `--gap-fill`, the condition to choose the
sections to write in `BinaryWriter::write()` did not exclude zero-size
sections. However, zero-size sections did not have correct offsets
assigned in `BinaryWriter::finalize()`. The result is either a failed
assertion, or memory corruption due to writing to the buffer beyond its
size.
To fix this, exclude zero-size sections from writing. Also, add a zero-size
section to the test, which would trigger the problem.
`--gap-fill <value>` fills the gaps between sections with a specified
8-bit value, instead of zero.
`--pad-to <address>` pads the output binary up to the specified load
address, using the 8-bit value from `--gap-fill` or zero.
These options are only supported for ELF input and binary output.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
As of now, llvm-objcopy silently ignores a provided regex if it doesn't
compile.
This patch adds returning an error saying that a regex couldn't be
compiled, along with the compilation error message.
---------
Co-authored-by: James Henderson <46713263+jh7370@users.noreply.github.com>
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
Without the fix gcc warned with
../lib/ObjCopy/ELF/ELFObjcopy.cpp: In function 'uint64_t getSectionFlagsPreserveMask(uint64_t, uint64_t, uint16_t)':
../lib/ObjCopy/ELF/ELFObjcopy.cpp:106:31: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
106 | ~(EMachine == EM_X86_64 ? ELF::SHF_X86_64_LARGE : 0UL);
| ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Summary:
llvm-ar is symlinked as llvm-ranlib and will act as ranlib when invoked in that mode. llvm-ar since [[ 4f2cfbe531 | compiler/llvm-project@4f2cfbe ]] supports the -X options, but doesn't seem to accept them when running as llvm-ranlib.
In AIX OS , according to https://www.ibm.com/docs/en/aix/7.2?topic=r-ranlib-command
-X mode Specifies the type of object file ranlib should examine. The mode must be one of the following:
32
Processes only 32-bit object files
64
Processes only 64-bit object files
32_64, any
Processes both 32-bit and 64-bit object files
The default is to process 32-bit object files (ignore 64-bit objects). The mode can also be set with the OBJECT_MODE environment variable. For example, OBJECT_MODE=64 causes ranlib to process any 64-bit objects and ignore 32-bit objects. The -X flag overrides the OBJECT_MODE variable.
Reviewers: James Henderson, MaskRay, Stephen Peckham
Differential Revision: https://reviews.llvm.org/D142660
Previously when objcopy generated section headers, it padded the LEB
that encodes the section size out to 5 bytes, matching the behavior of
clang. This is correct, but results in a binary that differs from the
input. This can sometimes have undesirable consequences (e.g. breaking
source maps).
This change makes the object reader remember the size of the LEB
encoding in the section header, so that llvm-objcopy can reproduce it
exactly. For sections not read from an object file (e.g. that
llvm-objcopy is adding itself), pad to 5 bytes.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D155535
Currently, objcopy cannot set the new flag SHF_X86_64_LARGE. This change introduces the named flag "large" which translates to that section flag.
An "invalid argument" error is produced if a user attempts to set the flag on an architecture other than X86_64.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D153262
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
llvm-objcopy should not insert padding before a section if its
physical addresses is not aligned to section's alignment. This
behavior will match GNU objcopy and is important for embedded images
where the physical address is used to store the initial data image.
The loader typically will copy this image using a start symbol
created by the linker. If llvm-objcopy inserts padding before such a
section, the symbol address will not match the location in the image.
This commit refines the change in https://reviews.llvm.org/D128961
which intended to align sections which type changed from NOBITS and
their offset may not be aligned. However, it affected all sections.
Fix https://github.com/llvm/llvm-project/issues/62636
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D150276
This reverts commit eb1442d0f73c76cfb5051d133f858fe760d189cf.
The test tools/llvm-objcopy/ELF/binary-paddr.test fails on
ppc64be-clang-test-suite:
https://lab.llvm.org/buildbot#builders/231/builds/13120
Reverting at author's request.
llvm-objcopy should not insert padding before a section if its
physical addresses is not aligned to section's alignment. This
behavior will match GNU objcopy and is important for embedded images
where the physical address is used to store the initial data image.
The loader typically will copy this image using a start symbol
created by the linker. If llvm-objcopy inserts padding before such a
section, the symbol address will not match the location in the image.
This commit refines the change in https://reviews.llvm.org/D128961
which intended to align sections which type changed from NOBITS and
their offset may not be aligned. However, it affected all sections.
Fix https://github.com/llvm/llvm-project/issues/62636
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D150276
This change to llvm-objcopy preserves the ELF section sh_link to .symtab
so long as none of the symbol table indices have been changed.
Previously, any invocation of llvm-objcopy including a "no-op" would
clear any section sh_link to .symtab.
Differential Revision: https://reviews.llvm.org/D150859
Add support for --dump-section on COFF files. This is helpful for
extracting specific content from an object file on Windows.
Differential Revision: https://reviews.llvm.org/D150305
Reviewed By: @alexander-shaposhnikov, @jhenderson, @hjyamauchi
The gABI prohibits multiple SH_SYMTAB sections. As a result,
llvm-objcopy was crashing in SymbolTableSection::removeSymbols(). This
patch fixes the issue by emitting an error if multiple SH_SYMTAB
sections are encountered when building an ELF object.
Fixes: https://github.com/llvm/llvm-project/issues/60448
Differential Revision: https://reviews.llvm.org/D143508
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
getRawNumberOfSymbols() assumes that a symbol table exists, which isn't
always guaranteed, while getNumberOfSymbols() handles and tolerates objects
without a symbol table. When there is a symbol table, both methods return
the same value.
Also add a test to ensure we don't regress in this regard. The test
generates a basic COFF object with symbols and overrides the symbol table
pointer with zeros to craft the input required to verify llvm-objcopy works
as expected in this scenario.
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes check-llvm.
This used %zu to print a uint64_t type. z is for size_t so on 32 bit
we tried to treat it as a 32 bit number.
Use PRIu64 instead to print as 64 bit everywhere.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716