This fixes:
```
[1343/7452] Building CXX object lib\Object\CMakeFiles\LLVMObject.dir\ELFObjectFile.cpp.obj
C:\git\llvm-project\llvm\lib\Object\ELFObjectFile.cpp(808,27): warning: comparison of integers of different signs: 'unsigned int' and '_Iter_diff_t<const Elf_Shdr_Impl<ELFType<llvm::endianness::little, false>> *>' (aka 'int') [-Wsign-compare]
808 | if (*TextSectionIndex != std::distance(Sections.begin(), *TextSecOrErr))
| ~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\git\llvm-project\llvm\lib\Object\ELFObjectFile.cpp(913,12): note: in instantiation of function template specialization 'readBBAddrMapImpl<llvm::object::ELFType<llvm::endianness::little, false>>' requested here
913 | return readBBAddrMapImpl(Obj->getELFFile(), TextSectionIndex, PGOAnalyses);
| ^
```
This patch adds an assertion to readBBAddrMapImpl to confirm that
PGOAnalyses and BBAddrMaps are of the same size when PGO information is
requested (part of the API contract). This patch also updates the
docstring for readBBAddrMap to better clarify what is guaranteed.
Summary:
The linker wrapper's job is to sort various embedded inputs into a list
of files that participate in a single link job. So far, this has been
completely 1-to-1, that is, each input file participates in exactly one
link job. However, support for AMD's target-id requires that one input
file may participate in multiple link jobs. For example, if given a
`gfx90a` static library and a `gfx90a:xnack+` object file input, we
should link the gfx90a` target into the `gfx90a:xnack+` job. These are
considered separate CPUs that can be mutually linked more or less.
This patch adds the necessary logic to make this happen. It primarily
reworks the logic to copy relevant input files into a separate list. So,
it moves construction of the final list of link jobs into the extraction
phase. We also need to copy the files in the case that it is needed more
than once, as the entire workflow expects ownership of said file.
LLVM models some features found in the binary format with raw integers
and others with nested or enumerated types. This PR switches modeling of
tables and segments to use wasm::ValType rather than uint32_t. This NFC
change is in preparation for modeling more reference types, but IMO is
also cleaner and closer to the spec.
A previous commit (82f75ed) made clang ignore .gch files that were not
Clang AST files. This broke `-gmodules`, which embeds the Clang AST into
an object file containing debug info.
This changes the probing to detect any file format recognized by
`llvm::identify_magic()` as potentially containing a Clang AST.
Previous PR: https://github.com/llvm/llvm-project/pull/69204
When linker-relaxation is enabled, part of the label diff in dwarfinfo
cannot be computed before static link. Refer to RISCV, we add the
relaxDwarfLineAddr and relaxDwarfCFA to add relocations for these label
diffs. Calculate whether the label diff is mutable. For immutable label
diff, return false and do the other works by its parent function.
toFeatures and toFeatureVector both output a list of target feature
flags, just with a slightly different interface. toFeatures keeps any
unsupported extensions, and also provides a way to append negative
extensions (AddAllExtensions=true).
This patch combines them into one function, so that a later patch will
be be able to get a std::vector of features that includes all the
negative extensions, which was previously only possible through the
StrAlloc interface.
We had specified that `readBBAddrMap` will always keep PGOAnalyses and
BBAddrMaps the same length on success.
365fbbfbcf/llvm/include/llvm/Object/ELFObjectFile.h (L116-L117)
It turns out that this is not currently the case when no analyses exist
in a function. No test had caught it.
We also should not append PGOBBEntries when there is no BBFreq or
BrProb.
This patch adds:
* tests that PGOAnalyses and BBAddrMaps are same length even when no
analyses are enabled
* fixes decode so that PGOAnalyses and BBAddrMaps are same length
* updates test to not emit unnecessary PGOBBEntries
* fixes decode to not emit PGOBBEntries when unnecessary
Summary:
This patch fixes up the `makeTriple()` interface to emit append the
operating system information when it is readily avaialble from the ELF.
The main motivation for this is so the GPU architectures can be easily
identified correctly when given and ELF. E.g. we want
`amdgpu-amd-amdhsa` as the output and not `amdgpu--`.
This required adding support for the CUDA OS/ABI, which is easily found
to be `0x33` when using `readelf`.
WebAssembly doesn't have a single virtual memory space the way other object
formats or architectures do, so "addresses" mean different things depending
on the context.
Function symbol addresses in object files are offsets from the start of the code
section. This is good for linking and relocation. However when dealing with
linked binaries, offsets from the start of the file/module are more often
used (e.g. for stack traces in browsers), and are more useful for use
cases like binary size attribution. This PR changes Object to use
the file offset instead of the section offset for function symbols, but
only for linked (non-DSO) files.
This is a reland of fc5f51cf with a fix for the MSan failure (it was not caused
by this change, but it was revealed by the new tests).
WebAssembly doesn't have a single virtual memory space the way other
object formats or architectures do, so "addresses" mean different things
depending on the context.
Function symbol addresses in object files are offsets from the start of
the code section. This is good for linking and relocation. However when
dealing with linked binaries, offsets from the start of the file/module
are more often used (e.g. for stack traces in browsers), and are more
useful for use cases like binary size attribution. This PR changes
Object to use the file offset instead of the section offset for function
symbols, but only for linked (non-DSO) files.
This implements item number 4 from #76107
LLVM ObjectFile currently records the start offsets of sections as the
start of the section header, whereas most other tools (WABT, emscripten,
wasm-tools) record it as the start of the section content, after the
header. This affects binutils tools such as objdump and nm, but not
compilation/assembly (since that is driven by symbols and assembler
labels which already have their values inside the section payload rather
in the header. This patch updates LLVM to match the other tools.
The current (experimental) spec for WebAssembly shared libraries does
not include a full symbol table like the object format. This change
extracts symbol information from the normal wasm exports.
This is the first step in having the linker report undefined symbols
when linking with shared libraries. The current behaviour is to ignore
all undefined symbols when linking with `-pie` or `-shared`.
See https://github.com/emscripten-core/emscripten/issues/18198
Summary:
Recently we added support for detecting the CUDA processor with the ELF
flags. This allows us to get a string representation of it in other
code. This will be used by the offloading runtime.
Non-LTO compiles set the buffer name to "<inline asm>"
(`AsmPrinter::addInlineAsmDiagBuffer`) and pass diagnostics to
`ClangDiagnosticHandler` (through the `MCContext` handler in
`MachineModuleInfoWrapperPass::doInitialization`) to ensure that
the exit code is 1 in the presence of errors. In contrast, LTO compiles
spuriously succeed even if error messages are printed.
```
% cat a.c
void _start() {}
asm("unknown instruction");
% clang -c a.c
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
1 | unknown instruction
| ^
1 error generated.
% clang -c -flto a.c; echo $? # -flto=thin is the same
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
0
```
`CollectAsmSymbols` parses inline assembly and is transitively called by
both `ModuleSummaryIndexAnalysis::run` and `WriteBitcodeToFile`, leading
to duplicate diagnostics.
This patch updates `CollectAsmSymbols` to be similar to non-LTO
compiles.
```
% clang -c -flto=thin a.c; echo $?
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
1 | unknown instruction
| ^
1 errors generated.
1
```
The `HasErrors` check does not prevent duplicate warnings but assembler
warnings are very uncommon.
Summary:
More SPIR-V related patches are being upstreamed. We should add support
to detect when a binary file is SPIR-V. This will be used in the future
when support for SPIR-V is added to the offloading runtime or more
support for bundling.
The magic number is described in the official documentation:
https://registry.khronos.org/SPIR-V/specs/1.0/SPIRV.html#Magic. Notably,
SPIR-V files are streams of 32-bit words. This means that the magic
numbers differ depending on the endianness. Here we simply check the
strandard and byte-reversed versions.
Reviewed in PR (#71750). A part of [RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch
and Block
Analysis](https://discourse.llvm.org/t/rfc-pgo-accuracy-metrics-emitting-and-evaluating-branch-and-block-analysis/73902).
This PR adds the PGOAnalysisMap data structure and implements encoding and
decoding through Object and ObjectYAML along with associated tests. When
emitted into the bb-addr-map section, each function is followed by the associated
pgo-analysis-map for that function. The emitting of each analysis in the map is
controlled by a bit in the bb-addr-map feature byte. All existing bb-addr-map
code can ignore the pgo-analysis-map if the caller does not request the data.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
suggested in review of https://github.com/llvm/llvm-project/pull/69553
This is actually not an NFC as isFunction() does not return false for
some "invalid" object, instead it returns the errors to its caller. But
since there is no such invalid object in the LIT tests, so no case
changes.
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
Computing the symbol size as the gap between sorted symbols are not
right for XCOFF.
For XCOFF, the size info is stored in aux symbol and can be got from
existing XCOFF interface `getSymbolSize()`.
This patch changes XCOFFObjectFile to call this API to get sizes for
symbols.
For DirectX, program signatures are encoded into three different binary
sections depending on if the signature is for inputs, outputs, or
patches. All three signature types use the same data structure encoding
so they can share a lot of logic.
This patch adds support for reading and writing program signature data
as both yaml and binary data.
Fixes#57743 and #57744
Original PR: https://github.com/llvm/llvm-project/pull/67162
The commit was reverted due to UB detected by santizer:
https://lab.llvm.org/buildbot/#/builders/238/builds/5955
clang/lib/Driver/OffloadBundler.cpp:1012:25: runtime error:
load of misaligned address 0xaaaae2d90e7c for type
'const uint64_t' (aka 'const unsigned long'), which
requires 8 byte alignment
It was fixed by using memcpy instead of dereferencing int*
casted from unaligned char*.
This reverts commit a1e81d2ead02e041471ec2299d7382f80c4dbba6.
Revert "Fix test hip-offload-compress-zlib.hip"
This reverts commit ba01ce60665848478ba4e76190907153a8c26fe9.
Revert due to sanity fail at
https://lab.llvm.org/buildbot/#/builders/5/builds/37188https://lab.llvm.org/buildbot/#/builders/238/builds/5955
/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/clang/lib/Driver/OffloadBundler.cpp:1012:25: runtime error: load of misaligned address 0xaaaae2d90e7c for type 'const uint64_t' (aka 'const unsigned long'), which requires 8 byte alignment
0xaaaae2d90e7c: note: pointer points here
bc 00 00 00 94 dc 29 9a 89 fb ca 2b 78 9c 8b 8f 77 f6 71 f4 73 8f f7 77 73 f3 f1 77 74 89 77 0a
^
#0 0xaaaaba125f70 in clang::CompressedOffloadBundle::decompress(llvm::MemoryBuffer const&, bool) /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/clang/lib/Driver/OffloadBundler.cpp:1012:25
#1 0xaaaaba126150 in clang::OffloadBundler::ListBundleIDsInFile(llvm::StringRef, clang::OffloadBundlerConfig const&) /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/clang/lib/Driver/OffloadBundler.cpp:1089:7
Will reland after fixing it.
Add option -f[no-]offload-compress to clang to enable/disable
compression of device binary for HIP. By default it is disabled.
Add option -compress to clang-offload-bundler to enable compression of
offload bundle. By default it is disabled.
When enabled, zstd or zlib is used for compression when available.
When disabled, it is NFC compared to previous behavior. The same offload
bundle format is used as before.
Clang-offload-bundler automatically detects whether the input file to be
unbundled is compressed and the compression method and decompress if
necessary.
Fixes a crash in `-Wl,-emit-relocs` where the linker was not able to
write linker-synthetic absolute symbols to the symbol table.
This change adds a new symbol flag (`WASM_SYMBOL_ABS`), which means that
the symbol's offset is absolute and not relative to a given segment.
Such symbols include `__stack_low` and `__stack_low`.
Note that wasm object files never contains such symbols, only binaries
linked with `-Wl,-emit-relocs`.
Fixes: #67111
Mach-O archives seems to be able to contain both IR objects and native
objects mixed together. Apple tooling seems to deal with them correctly.
The current implementation was adding an additional restriction of all
the objects in the archive being either IR objects or native objects.
The changes in this commit remove that restriction and allow mixing both
IR and native objects, while still checking that the CPU restrictions
still apply (all objects in a slice need to be the same CPU
type/subtype).
A test that was testing for the previous behaviour had been modified to
test that mixed archives are allowed and that they create the expected
results.
Additionally, locally I checked the results of Apple's `libtool` against
`llvm-libtool-darwin` with this code, and the resulting libraries are
almost identical with expected differences in the GUID and code
signatures load commands, and some minor differences in the rest of the
binary.