Summary:
We rely on these flags to do things in the runtime and print the
contents of binaries correctly. CUDA updated their ABI encoding recently
and we didn't handle that. it's a new ABI entirely so we just select on
it when it shows up.
Fixes: https://github.com/llvm/llvm-project/issues/148703
This PR adds the SFrameParser class and uses it from llvm-readobj to
dump the section contents. Currently, it only supports parsing the
SFrame section header. Other parts of the section will be added in
follow-up patches.
llvm-readobj uses the same sframe flag syntax as GNU readelf, but I have
not attempted match the output format of the tool. I'm starting with the
"llvm" output format because it's easier to generate and lets us
tweak the format to make it useful for testing the generation code. If
needed, support for the GNU format could be added by overriding this
functionality in the GNU ELF Dumper.
For more information, see the [sframe
specification](https://sourceware.org/binutils/wiki/sframe) and the
related
[RFC](https://discourse.llvm.org/t/rfc-adding-sframe-support-to-llvm/86900).
Also replace the current static DenseMap of preserved symbol
names in the Symtab hack with this. That was broken statefulness
across compiles, so this at least fixes that. However this is
still broken, llvm-as shouldn't really depend on the triple.
Add support for big-endian RISC-V ELF files:
- Add riscv32be/riscv64be target architectures to Triple
- Support elf32-bigriscv and elf64-bigriscv output targets in
llvm-objcopy
- Update ELFObjectFile to handle BE RISC-V format strings and
architecture detection
- Add BE RISC-V support to RelocationResolver
- Add tests for new functionality
This is a subset of a bigger RISC-V big-endian support patch, containing
only the llvm-objcopy and libObject changes. Other changes will be added
later.
This patch adds dummy symbols for PLT entries for RISC-V 32-bit and
64-bit targets so llvm-objdump can show the function symbol that
corresponds to each PLT entry.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Work towards separating the ABI existence of libcalls vs. the
lowering selection. Set libcall selection through enums, rather
than through raw string names.
Replace RuntimeLibcalls.def with a tablegenerated version. This
is in preparation for splitting RuntimeLibcalls into two components.
For now match the existing functionality.
ArrayRef now has a new constructor that takes a parameter whose type
has data() and size(). This patch migrates:
ArrayRef<T>(X.data(), X.size()
to:
ArrayRef<T>(X)
Recently, we have been looking at some optimizations targeting
individual calls. In particular, we plan to extend the address mapping
technique to map to individual callsites. For example, in this piece of
code for a basic blocks:
```
<BB>:
1200: lea 0x1(%rcx), %rdx
1204: callq foo
1209: cmpq 0x10, %rdx
120d: ja L1
```
We want to emit 0x9 as the call site offset for `callq foo` (the offset
from the block entry to right after the call), so that we know if a
sampled address is before the call or after.
This PR implements the decode/encode/emit capability. The Codegen change
will be implemented in a later PR.
We need the full set of ABI options to accurately compute
the full set of libcalls. This partially resolves missing
information required to compute the set of ARM calls.
Shaders compiled with DXC/LLPC generate these relocations, and even if
that changes in the future we want to handle existing binaries. The
friction to support this and the maintenance cost long term both seem
incredibly low, considering other targets like ARM support both REL/RELA
static relocations behind the same interface.
Compression of SHT_LLVM_BB_ADDR_MAP with zstd can give 3X compression
ratio, which is especially beneficial with PGO_analysis_map. To read the
data back, we must decompress it. Though we can use llvm-objcopy to do
this, it's much better to do this decompression internally in the
library API.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/ObjCopy` and
`llvm/Object` libraries. These annotations currently have no meaningful
impact on the LLVM build; however, they are a prerequisite to support an
LLVM Windows DLL (shared library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
The bulk of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.
The following manual adjustments were also applied after running IDS on
Linux:
- Add `#include "llvm/Support/Compiler.h"` to files where it was not
auto-added by IDS due to no pre-existing block of include statements.
- Manually annotate template class `CommonArchiveMemberHeader` with
`LLVM_ABI` since IDS ignores templates and this is the simplest solution
for DLL-exporting its child classes because it has methods defined
out-of-line. Seems to be an uncommon situation.
- Explicitly make `Archive` class non-copyable due to the class-level
`LLVM_ABI` annotation.
- Add default constructor to `ELFFile` so it can be instantiated for
DLL-export as a base class.
- Add `LLVM_TEMPLATE_ABI` and `LLVM_EXPORT_TEMPLATE` to exported
instantiated templates.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
Which fixes a test failure seen on the bots, introduced by
https://github.com/llvm/llvm-project/pull/140286.
```
[ RUN ] OffloadingBundleTest.checkExtractOffloadBundleFatBinary
ObjectTests: ../llvm/llvm/include/llvm/ADT/StringRef.h:618: StringRef llvm::StringRef::drop_front(size_t) const: Assertion `size() >= N && "Dropping more elements than exist"' failed.
0 0x0a24a990 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-armv8-quick/stage1/unittests/Object/./ObjectTests+0x31a990)
1 0x0a248364 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-armv8-quick/stage1/unittests/Object/./ObjectTests+0x318364)
2 0x0a24b410 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
3 0xf46ed6f0 __default_rt_sa_restorer ./signal/../sysdeps/unix/sysv/linux/arm/sigrestorer.S:80:0
4 0xf46ddb06 ./csu/../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47:0
5 0xf471d292 __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
6 0xf46ec840 gsignal ./signal/../sysdeps/posix/raise.c:27:6
```
Also reported on 32-bit x86.
I think the cause is the code was casting the result of StringRef.find
into an int64_t. The failure value of find is StringRef::npos, which is
defined as:
static constexpr size_t npos = ~size_t(0);
* size_t(0) is 32 bits of 0s
* the inverse of that is 32 bits of 1s
* Cast to int64_t needs to widen this, and it will preserve the original
value in doing so, which is 0xffffffff.
* The result is 0x00000000ffffffff, which is >= 0, so we keep searching
and try to go off the end of the file.
Or put another way, this equivalent function returns true when compiled
for a 32-bit system:
```
bool fn() {
size_t foo = ~size_t(0);
int64_t foo64 = (int64_t)foo;
return foo64 >= 0;
}
```
Using size_t throughout fixes the problem. Also I don't see a reason it
needs to be a signed number, given that it always searches forward from
the current offset.
As requested in a previous PR, this change moves all structs related to
RTS0 to RTS0 namespace.
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
The current check may fail if the addition overflows. I've observed
failures of macho-invalid.test on 32-bit due to this.
Instead, compare against the remaining bytes until the end of the
object.
Utilize the new extensions to the LLVM Offloading API to extend to
llvm-objdump to handle dumping fatbin offload bundles generated by HIP.
This extension to llvm-objdump adds the option --offload-fatbin.
Specifying this option will take the input object/executable and extract
all offload fatbin bundle entries into distinct code object files with
names reflecting the source file name combined with the Bundle Entry ID.
Users can also use the --arch-name option to filter offload fatbin
bundle entries by their target triple.
---------
Co-authored-by: dsalinas <dsalinas@MKM-L1-DSALINAS.amd.com>
The .syntax unified directive and .codeX/.code X directives are, other
than some simple common printing code, exclusively implemented in the
targets themselves. Thus, remove the corresponding MCAF_* flags and
reimplement the directives solely within the targets. This avoids
exposing all targets to all other targets' flags.
Since MCAF_SubsectionsViaSymbols is all that remains, convert it to its
own function like other directives, simplifying its implementation.
Note that, on X86, we now always need a target streamer when parsing
assembly, as it's now used for directives that aren't COFF-specific. It
still does not however need to do anything when producing a non-COFF
object file, so this commit does not introduce any new target streamers.
There is some churn in test output, and corresponding UTC regex changes,
due to comments no longer being flushed by these various directives (and
EmitEOL is not exposed outside MCAsmStreamer.cpp so we couldn't do so
even if we wanted to), but that was a bit odd to be doing anyway.
This is motivated by Morello LLVM, which adds yet another assembler flag
to distinguish A64 and C64 instruction sets, but did not update every
switch and so emits warnings during the build. Rather than fix those
warnings it seems better to instead make the problem not exist in the
first place via this change.
Utilize the new extensions to the LLVM Offloading API to extend to
llvm-objdump to handle dumping fatbin offload bundles generated by HIP.
This extension to llvm-objdump adds the option --offload-fatbin.
Specifying this option will take the input object/executable and extract
all offload fatbin bundle entries into distinct code object files with
names reflecting the source file name combined with the Bundle Entry ID.
Users can also use the --arch-name option to filter offload fatbin
bundle entries by their target triple.
---------
Co-authored-by: dsalinas <dsalinas@MKM-L1-DSALINAS.amd.com>
Utilize the new extensions to the LLVM Offloading API to extend to
llvm-objdump to handle dumping fatbin offload bundles generated by HIP.
This extension to llvm-objdump adds the option --offload-fatbin.
Specifying this option will take the input object/executable and extract
all offload fatbin bundle entries into distinct code object files with
names reflecting the source file name combined with the Bundle Entry ID.
Users can also use the --arch-name option to filter offload fatbin
bundle entries by their target triple.
---------
Co-authored-by: dsalinas <dsalinas@MKM-L1-DSALINAS.amd.com>
This PR is one of the many PRs in the SYCL upstreaming effort focusing
on device code linking during the SYCL offload compilation process. RFC:
https://discourse.llvm.org/t/rfc-offloading-design-for-sycl-offload-kind-and-spir-targets/74088
Approved PRs so far:
1. [Clang][SYCL] Introduce clang-sycl-linker to link SYCL offloading
device code (Part 1 of many) -
[Link](https://github.com/llvm/llvm-project/pull/112245)
2. [clang-sycl-linker] Replace llvm-link with API calls -
[Link](https://github.com/llvm/llvm-project/pull/133797)
3. [SYCL][SPIR-V Backend][clang-sycl-linker] Add SPIR-V backend support
inside clang-sycl-linker -
[Link](https://github.com/llvm/llvm-project/pull/133967)
This PR adds SYCL device code linking support to clang-linker-wrapper.
### Summary for this PR
Device code linking happens inside clang-linker-wrapper. In the current
implementation, clang-linker-wrapper does the following:
1. Extracts device code. Input_1, Input_2,.....
5. Group device code according to target devices Inputs[triple_1] = ....
Inputs[triple_2] = ....
6. For each group, i.e. Inputs[triple_i], a. Gather all the offload
kinds found inside those inputs in ActiveOffloadKinds b. Link all images
inside Inputs[triple_i] by calling clang --target=triple_i .... c.
Create a copy of that linked image for each offload kind and add it to
Output[Kind] list.
In SYCL compilation flow, there is a deviation in Step 3b. We call
device code splitting inside the 'clang --target=triple_i ....' call and
the output is now a 'packaged' file containing multiple device images.
This deviation requires us to capture the OffloadKind during the linking
stage and pass it along to the linking function (clang), so that clang
can be called with a unique option '--sycl-link' that will help us to
call 'clang-sycl-linker' under the hood (clang-sycl-linker will do SYCL
specific linking).
Our current objective is to implement an end-to-end SYCL offloading flow
and get it working. We will eventually merge our approach with the
community flow.
Thanks
---------
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Adding support for Root Constant in MC, Object and obj2yaml and
yaml2obj, this PR adds:
- new structures to dxbc definition.
- serialize and desirialize logic from dxcontainer to yaml
- tests validating against dxc
- adding support to multiple parts.
Closes: https://github.com/llvm/llvm-project/issues/126633
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
An Inlined library is a dylib that is reexported from an umbrella or
top-level library. When this is encoded in tbd files, ensure we are
reading the symbol table from the inlined library when
`--add-inlinedinfo` is used as opposed to the top-level lib.
resolves: rdar://147767733
This implements arm, armeb, thumb, thumbeb PLT entries parsing support
in ELF for llvm-objdump.
Implementation is similar to AArch64MCInstrAnalysis::findPltEntries. PLT
entry signatures are based on LLD code for PLT generation
(ARM::writePlt).
llvm-objdump tests are produced from lld/test/ELF/arm-plt-reloc.s,
lld/test/ELF/armv8-thumb-plt-reloc.s.
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.
For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.
The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
This commit adds support for WebAssembly's custom-page-sizes proposal to
`wasm-ld`. An overview of the proposal can be found
[here](https://github.com/WebAssembly/custom-page-sizes/blob/main/proposals/custom-page-sizes/Overview.md).
In a sentence, it allows customizing a Wasm memory's page size, enabling
Wasm to target environments with less than 64KiB of memory (the default
Wasm page size) available for Wasm memories.
This commit contains the following:
* Adds a `--page-size=N` CLI flag to `wasm-ld` for configuring the
linked Wasm binary's linear memory's page size.
* When the page size is configured to a non-default value, then the
final Wasm binary will use the encodings defined in the
custom-page-sizes proposal to declare the linear memory's page size.
* Defines a `__wasm_first_page_end` symbol, whose address points to the
first page in the Wasm linear memory, a.k.a. is the Wasm memory's page
size. This allows writing code that is compatible with any page size,
and doesn't require re-compiling its object code. At the same time,
because it just lowers to a constant rather than a memory access or
something, it enables link-time optimization.
* Adds tests for these new features.
r? @sbc100
cc @sunfishcode