150 Commits

Author SHA1 Message Date
Kazu Hirata
71b4f4d263
[ObjCopy] Remove unnecessary casts (NFC) (#152261)
getBufferStart() already returns char *.
2025-08-06 07:10:58 -07:00
Martin Storsjö
fcbbcffd2e
[llvm-objcopy] [COFF] Ignore associative sections in executables (#151143)
COFF associative sections is a feature where relocatable object files
can have section snippets marked as related to another section snippet,
so they are kept or discarded in relation to that other section snippet.

When llvm-objcopy removes sections, it also removes sections that are
marked as associative to the removed section (as the associative
sections otherwise would end up orphaned).

In a linked executable module (EXE or DLL), section associativity is
meaningless - thus, we should ignore those fields from the input.

After linking, GNU ld keeps the SectionDefinition auxillary part of
symbols intact as it was in the source object file, which means that it
references section numbers in the source object files.

This fixes https://github.com/llvm/llvm-project/issues/53433.
2025-07-30 16:39:04 +03:00
Fangrui Song
6201761e96 MC: Rename isVirtualSection to isBssSection
The term BSS (Block Started by Symbol) is a standard, widely recognized
term, available in the a.out object file format and adopted by formats
like COFF, XCOFF, Mach-O (called S_ZEROFILL while `__bss` is also used),
and ELF. To avoid introducing unfamiliar terms, we should use
isBSSSection instead of isVirtualSection.
2025-07-20 10:39:17 -07:00
Haohai Wen
018548ddff
[objcopy][coff] Place section name first in strtab (#145266)
The prioritized string table builder was introduced in 9cc9efc. This
patch sets highest priority for the section name to place it at the
start of string table to avoid the issue described in 4d2eda2.
2025-06-27 07:39:04 +08:00
Daniel Rodríguez Troitiño
a0662ceba8
[objcopy][MachO] Revert special handling of encryptable binaries (#144058)
Code originally added in #120995 and later corrected in #130517 but
apparently still not correct according to #141494 and
rust-lang/rust#141913.

Revert the special handling because the test written in #120995 and
#130517 still passes without those changes. Kept the test and improved
it with a `__DATA` section to keep the current behaviour checked in case
other changes modify the behaviour and break this edge case.
2025-06-16 12:06:25 -07:00
Kazu Hirata
03f616eb3a
[llvm] Compare std::optional<T> to values directly (NFC) (#143340)
This patch transforms:

  X && *X == Y

to:

  X == Y

where X is of std::optional<T>, and Y is of T or similar.
2025-06-08 22:37:59 -07:00
Abhina Sreeskantharajan
6a4b89055b [SystemZ][z/OS] add back headers needed for strnlen, autoconversion 2025-06-06 09:45:06 -04:00
Kazu Hirata
228f66807d
[llvm] Remove unused includes (NFC) (#142733)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-06-04 12:30:52 -07:00
Fangrui Song
80c61dec2d [llvm-objcopy] Fix some namespace style issues
Similar to https://reviews.llvm.org/D104693
2025-05-11 22:51:18 -07:00
Kazu Hirata
d4e730f49b
[ObjCopy] Use StringRef::starts_with (NFC) (#139408) 2025-05-10 16:04:45 -07:00
Kazu Hirata
c51a3aa6ce
[llvm] Remove unused local variables (NFC) (#138467) 2025-05-04 13:05:18 -07:00
Kazu Hirata
2f3067ed69
[llvm] Remove unused local variables (NFC) (#138454) 2025-05-04 09:38:16 -07:00
Kazu Hirata
8ba3a232d1
[llvm] Use llvm::copy (NFC) (#137470) 2025-04-26 15:50:38 -07:00
Kazu Hirata
c4e9901b5b
[llvm] Use llvm::append_range (NFC) (#135931) 2025-04-16 12:28:47 -07:00
Kazu Hirata
87322c9039
[ObjCopy] Use llvm::reverse (NFC) (#135559) 2025-04-13 14:16:26 -07:00
Kazu Hirata
6257621f41
[llvm] Use llvm::append_range (NFC) (#133658) 2025-03-30 18:43:02 -07:00
Daniel Rodríguez Troitiño
8413f4d837
[llvm-objcopy] Apply encryptable offset to first segment, not section (#130517)
Bug introduced #120995. The LLD code calculates the "size" of the Mach-O
headers, and then uses that size to place the segments, but the `__TEXT`
section stays at `fileoff` zero. When I wrote the code into llvm-objcopy
I calculated the extra space into the initial offset, which moved all
the sections back 1 page.

Besides the modified test checking for the right `fileoff` values of the
sections and the segments, I also manually checked the generated
binaries after `llvm-objcopy` using `dyld_info`, as the bug report
suggested.

Fixes #130472
2025-03-14 13:43:48 -07:00
Fangrui Song
7a66a26658
[llvm-objcopy,ELF] --discard-locals/--discard-all: allow and keep symbols referenced by relocations
In GNU objcopy, symbols referenced by relocations are retained. Our COFF
(https://reviews.llvm.org/D56480) and Mach-O
(https://reviews.llvm.org/D75104) ports port the behavior, but the ELF
port doesn't.

This PR implements the behavior for ELF.
Close #47468 (tcl has a use case that requires `strip -x tclStubLib.o`
to strip local symbols not referenced by relocations.)

Pull Request: https://github.com/llvm/llvm-project/pull/130704
2025-03-11 21:26:13 -07:00
Jonathon Penix
ffecd72479
[llvm-objcopy] Let --change-section-lma change segments wth filesz=0,… (#127724)
… memsz>0

Currently, segments with a file size of 0 are ignored for the purposes
of --change-section-lma, regardless of their memory size. It seems
reasonable to me to modify such segments given that we're changing the
LMA for all sections and these LMAs may be used during loading. GNU
objcopy also seems to adjust such segments.

Additionally, segments with file size > 0 and memory size = 0 will no
longer be modified for the purposes of --change-section-lma as they
shouldn't be part of the loaded memory image.

Fixes #124680
2025-02-27 13:45:04 -08:00
Dmitry Nechitaev
14f33c6bc1
[llvm-objcopy][mach-o] Fix section finding logic for object files (#127604)
Fix section finding logic for object files.
As by product, make --update-section functional when the input is an object file.

This PR fixes #127495
2025-02-23 11:17:58 -08:00
Amr Hesham
66bea0df75
[llvm-objcopy] Fix prints wrong path when dump-section output path doesn't exist (#125345)
Fix printing the correct file path in the error message when the output
file specified by `--dump-section` cannot be opened

Fixes: #125113 on ELF, MachO, Wasm
2025-02-08 14:14:16 +01:00
Igor Kudrin
10772807ab Reapply "[llvm-objcopy][ELF] Add an option to remove notes (#118739)"
This fixes "unused-local-typedef" warnings in 9324e6a7a5.

This adds an option `--remove-note=[name/]type` to selectively delete
notes in ELF files, where `type` is the numeric value of the note type
and `name` is the name of the originator. The name can be omitted, in
which case all notes of the specified type will be removed. For now,
only `SHT_NOTE` sections that are not associated with segments are
handled. The implementation can be extended later as needed.

RFC: https://discourse.llvm.org/t/rfc-llvm-objcopy-feature-for-editing-notes/83491
2025-01-23 15:22:04 -08:00
Igor Kudrin
621e5cd820 Revert "[llvm-objcopy][ELF] Add an option to remove notes (#118739)"
This reverts commit 9324e6a7a5c5adc5b5c38c3e3cbecd7e1e98876a.
2025-01-23 14:57:29 -08:00
Igor Kudrin
9324e6a7a5
[llvm-objcopy][ELF] Add an option to remove notes (#118739)
This adds an option `--remove-note=[name/]type` to selectively delete
notes in ELF files, where `type` is the numeric value of the note type
and `name` is the name of the originator. The name can be omitted, in
which case all notes of the specified type will be removed. For now,
only `SHT_NOTE` sections that are not associated with segments are
handled. The implementation can be extended later as needed.


RFC: https://discourse.llvm.org/t/rfc-llvm-objcopy-feature-for-editing-notes/83491
2025-01-23 14:27:40 -08:00
Daniel Rodríguez Troitiño
1a830aa1fe
[ObjCopy] Respect requirements of LC_ENCRYPTION_INFO commands (#120995)
LLD (and other Mach-O linkers) when preparing an encryptable binary make
space to leave all the load commands in an non-encrypted page (see [1])

When using objcopy of a small encryptable binary, the code was not
respecting this fact, and the encryptable segments were not kept beyond
the first page. This was obvious for small or empty binaries.

The changes introduced here keep track if a `LC_ENCRYPTION_INFO` or
`LC_ENCRYPTION_INFO_64` has been seen, and in such case, it adds a full
page of offset in order to leave the load commands in its own page
(similar to what LLD is doing).

[1]:
d8e7929312/lld/MachO/SyntheticSections.cpp (L90-L93)
2025-01-08 08:49:03 -08:00
Billy Laws
7db0a606a2
[objcopy][COFF] Do not strip .rdata section with --only-keep-debug (#121653)
When not in MinGW mode, the PE debug directory is placed in .rdata by
the linker instead of .buildid. In addition to .buildid always
explicitly preserve the section containing the debug directory to avoid
causing errors later in patchDebugDirectory.
2025-01-04 23:55:12 +02:00
Richard Dzenis
334a5766d7
[llvm-objcopy] Add support of symbol modification flags for MachO (#120895)
This patch adds support of the following llvm-objcopy flags for MachO:

- `--globalize-symbol`, `--globalize-symbols`,
- `--keep-global-symbol`, `-G`, `--keep-global-symbols`,
- `--localize-symbol`, `-L`, `--localize-symbols`,
- `--skip-symbol`, `--skip-symbols`.

Code in `updateAndRemoveSymbols` for MachO
is kept similar to its version for ELF.

Fixes #120894
2024-12-24 16:05:10 +02:00
Richard Dzenis
9c97aa5abb
[llvm-objcopy] Always update indirectsymoff in MachO (#117726)
Let's say we've run llvm-strip over some MachO. The resulting MachO
became smaller and there are no indirect symbols anymore.

Then according to previous code we would not update indirectsymoff
field. This would lead to `MachOWriter::totalSize()` report size that is
larger than the new MachO. As a result we would get MachO that has zero
bytes past contents of the very last load command.

Codesign has a strict check that size of MachO file must be equal to

    lastLoadCommand.offset + lastLoadCommand.size

If this is not satisfied codesign reports the following error

    main executable failed strict validation

Fixes #117723
2024-11-29 12:27:12 +02:00
Kazu Hirata
54dad9e269
[ObjCopy] Remove unused includes (NFC) (#116534)
Identified with misc-include-cleaner.
2024-11-17 08:39:13 -08:00
Kazu Hirata
e4e3ff5adc
[llvm] Use std::optional::value_or (NFC) (#109568) 2024-09-22 01:00:24 -07:00
Sam Clegg
2a6136e552
[llvm-objcopy][WebAssembly] Allow --strip-debug to operate on relocatable files. (#102978)
This change is enough to allow `--strip-debug` to work on object files,
without breaking the relocation information or symbol table.

A more complete version of this change would instead reconstruct the
symbol table and relocation sections, but that is much larger change.

Bug: #102002
2024-08-19 21:52:17 -07:00
Eleanor Bonnici
2b2f4ae0fb
[llvm-objcopy] Add --change-section-address (#98664)
--change-section address and its alias --adjust-section-vma allows
modification
of section addresses in a relocatable file. This used to be used, for
example,
in Fiasco microkernel.

On a relocatable file this option behaves the same as GNU objcopy, apart
from
the fact that it does not issue any warnings, for example, when an
argument is
not used.
GNU objcopy does not produce an error when passed an executable file but
the
usecase for this is not clear, and the behaviour is inconsistent. The
idea of
GNU objcopy --change-section-address is that the option should change
both LMA
and VMA in an executable file. Since this patch does not implement
executable
file support, only VMA is changed.
2024-07-30 10:57:57 +01:00
serge-sans-paille
fb5a38bb49
[llvm-objcopy] Add verification of added .note section format
Also add a --no-verify-note-sections flag to make it possible to add
invalid sections if needs be.

Pull Request: https://github.com/llvm/llvm-project/pull/90458
2024-07-11 17:59:19 +02:00
Pranav Kant
e142a55686
[llvm-objcopy] Remove references for empty section groups (#98106)
Otherwise, llvm-objcopy fails with use-after-free when built under
sanitizers. Simple repro: run the test
`ELF/remove-section-in-group.test` under asan. This is due to symbol
table references to empty section groups that must be removed.
2024-07-09 14:48:09 -07:00
Fangrui Song
2abe53a17f Revert "[llvm-objcopy] Remove empty SHT_GROUP sections (#97141)"
This reverts commit 359c64f314ad568e78ee9a3723260286e3425c2d.

This caused heap-use-after-free. See #98106.
2024-07-08 23:47:28 -07:00
Dmitriy Chestnykh
359c64f314
[llvm-objcopy] Remove empty SHT_GROUP sections (#97141)
Currently `llvm-objcopy/llvm-strip` in `--strip-debug` mode doesn't
remove such sections. This behavior can lead to incompatibilities with
GNU binutils (for examples ld.bfd before https://sourceware.org/PR20520
cannot process the object file contains empty .group section).
The ELF object that contains group section with `.debug_*` sections
inside can be obtained by `gcc -g3`.
Fix #97139
2024-07-08 11:50:23 -07:00
Fangrui Song
9bb4cd5977
[llvm-objcopy] Support CREL
llvm-objcopy may modify the symbol table and need to rewrite
relocations. For CREL, while we can reuse the decoder from #91280, we
need an encoder to support CREL.

Since MC/ELFObjectWriter.cpp has an existing encoder, and MC is at a
lower layer than Object, extract the encoder to a new header file
llvm/MC/MCELFExtras.h.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/97521
2024-07-08 09:32:08 -07:00
Eleanor Bonnici
3320036370
[llvm-objcopy] Add change-section-lma *+/-offset (#95431)
llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.

This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.

In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.
2024-07-08 17:17:09 +01:00
Braden Helmer
79ce70b803
[NFC] Mitigate pointless copies (#95052)
Fixes #95036 #95033 #94933 #94930
2024-06-11 09:54:00 +01:00
Fangrui Song
07942987b5 [llvm-objcopy] Add --compress-sections
--compress-sections is similar to --compress-debug-sections but applies
to arbitrary sections.

* `--compress-sections <section>=none`: decompress sections
* `--compress-sections <section>=[zlib|zstd]`: compress sections with zlib/zstd

Like `--remove-section`, the pattern is by default a glob, but a regex
when --regex is specified.

For `--remove-section` like options, `!` prevents matches and is not
dependent on ordering (see `ELF/wildcard-syntax.test`). Since
`--compress-sections a=zlib --compress-sections a=none` naturally allows
overriding, having an order-independent `!` would be confusing.
Therefore, `!` is disallowed.

Sections within a segment are effectively immutable. Report an error for
an attempt to (de)compress them. `SHF_ALLOC` sections in a relocatable
file can be compressed, but linkers usually reject them.

Note: Before this patch, a compressed relocation section is recognized
as a `RelocationSectionBase` as well and `removeSections` `!ToRemove(*ToRelSec)`
may incorrectly interpret a `CompressedSections` as `RelocationSectionBase`,
leading to ubsan failure for the new test. Fix this by setting
`OriginalFlags` in CompressedSection::CompressedSection.

Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674

Pull Request: https://github.com/llvm/llvm-project/pull/85036
2024-04-15 00:48:51 -07:00
Mitch Phillips
be8bc3cf43 Revert "[llvm-objcopy] Add --compress-sections"
This reverts commit 9e3b64b9f95aadf57568576712902a272fe66503.

Reason: Broke the UBSan buildbot. See the comments in the pull request
(https://github.com/llvm/llvm-project/pull/85036) for more information.
2024-04-05 11:42:52 +02:00
Fangrui Song
9e3b64b9f9 [llvm-objcopy] Add --compress-sections
--compress-sections is similar to --compress-debug-sections but applies
to arbitrary sections.

* `--compress-sections <section>=none`: decompress sections
* `--compress-sections <section>=[zlib|zstd]`: compress sections with zlib/zstd

Like `--remove-section`, the pattern is by default a glob, but a regex
when --regex is specified.

For `--remove-section` like options, `!` prevents matches and is not
dependent on ordering (see `ELF/wildcard-syntax.test`). Since
`--compress-sections a=zlib --compress-sections a=none` naturally allows
overriding, having an order-independent `!` would be confusing.
Therefore, `!` is disallowed.

Sections within a segment are effectively immutable. Report an error for
an attempt to (de)compress them. `SHF_ALLOC` sections in a relocatable
file can be compressed, but linkers usually reject them.

Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674

Pull Request: https://github.com/llvm/llvm-project/pull/85036
2024-04-04 09:33:18 -07:00
Fangrui Song
2763353891
[Object,ELFType] Rename TargetEndianness to Endianness (#86604)
`TargetEndianness` is long and unwieldy. "Target" in the name is confusing. Rename it to "Endianness".

I cannot find noticeable out-of-tree users of `TargetEndianness`, but
keep `TargetEndianness` to make this patch safer. `TargetEndianness`
will be removed by a subsequent change.
2024-03-28 09:10:34 -07:00
Ilia Kuklin
4946cc37f4
[llvm-objcopy] Add --skip-symbol and --skip-symbols options (#80873)
Add --skip-symbol and --skip-symbols options that allow to skip symbols
when executing other options that can change the symbol's name, binding
or visibility, similar to an existing option --keep-symbol that keeps a
symbol from being removed by other options.
2024-03-21 17:05:35 +05:00
Fangrui Song
122d368b2b
[llvm-objcopy] --[de]compress-debug-sections: don't compress SHF_ALLOC sections, only decompress .debug sections
Simplify --[de]compress-debug-sections to make it easier to add custom section [de]compression.
Change the following two behaviors to match GNU objcopy.

* --compress-debug-sections compresses SHF_ALLOC sections while GNU
  doesn't.
* --decompress-debug-sections decompresses non-debug sections while GNU
  doesn't.

Pull Request: https://github.com/llvm/llvm-project/pull/84885
2024-03-13 10:13:21 -07:00
Justin Lebar
fab2bb8bfd
Add llvm::min/max_element and use it in llvm/ and mlir/ directories. (#84678)
For some reason this was missing from STLExtras.
2024-03-10 20:00:13 -07:00
Ilia Kuklin
07d8a457ad
[llvm-objcopy] Add --set-symbol-visibility and --set-symbols-visibility options (#80872)
Add options --set-symbol-visibility and --set-symbols-visibility to
manually change the visibility of symbols.

There is already an option to set the visibility of newly added symbols
via --add-symbol and --new-symbol-visibility. This option will allow to
change the visibility of already existing symbols.
2024-02-28 17:38:26 +05:00
Fangrui Song
ef28379022
[llvm-objcopy] Fix file offsets when PT_INTERP/PT_LOAD offsets are equal (#80562)
(#79887) When the offset of a PT_INTERP segment equals the offset of a
PT_LOAD segment, we consider that the parent of the PT_LOAD segment is
the PT_INTERP segment. In `layoutSegments`, we place both segments to be
after the current `Offset`, ignoring the PT_LOAD alignment.

This scenario is possible with fixed section addresses, but doesn't
happen with default linker layouts (.interp precedes other sections and
is part of a PT_LOAD segment containing the ELF header and program
headers).

```
% cat a.s
.globl _start; _start: ret
.rodata; .byte 0
.tdata; .balign 4096; .byte 0
% clang -fuse-ld=lld a.s -o a -nostdlib -no-pie -z separate-loadable-segments -Wl,-Ttext=0x201000,--section-start=.interp=0x202000,--section-start=.rodata=0x202020,-z,nognustack
% llvm-objcopy a a2
% llvm-readelf -l a2   # incorrect offset(PT_LOAD)
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x0001c0 0x0001c0 R   0x8
  INTERP         0x001001 0x0000000000202000 0x0000000000202000 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000200 0x000200 R   0x1000
  LOAD           0x001000 0x0000000000201000 0x0000000000201000 0x000001 0x000001 R E 0x1000
//// incorrect offset
  LOAD           0x001001 0x0000000000202000 0x0000000000202000 0x000021 0x000021 R   0x1000
  LOAD           0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 RW  0x1000
  TLS            0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x000001 R   0x1000
  GNU_RELRO      0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 R   0x1000
```

The same issue occurs for PT_TLS/PT_GNU_RELRO if we PT_TLS's alignment
is smaller and we place the PT_LOAD after PT_TLS/PT_GNU_RELRO segments
(not linker default, but possible with a `PHDRS` linker script command).

Fix #79887: when two segments have the same offset, order the one with a
larger alignment first. In the previous case, the PT_LOAD segment will
go before the PT_INTERP segment. In case of equal alignments, it doesn't
matter which segment is treated as the parent segment.
2024-02-20 09:26:04 -08:00
Maksim Panchenko
2cbe5a33a5 [llvm-objcopy] Fix the build again after 7ddc320 2024-02-09 12:26:12 -08:00
Jon Roelofs
5afbed1968
[llvm-objcopy] Fix the build after 7ddc32052546abd41656d2e670f3902b1bf805a7. NFCI 2024-02-09 08:49:45 -08:00