Add a specification attribute to LLVM DebugInfo, which is analogous
to DWARF's DW_AT_specification. According to the DWARF spec:
"A debugging information entry that represents a declaration that
completes another (earlier) non-defining declaration may have a
DW_AT_specification attribute whose value is a reference to the
debugging information entry representing the non-defining declaration."
This patch allows types to be specifications of other types. This is
used by Swift to represent generic types. For example, given this Swift
program:
```
struct MyStruct<T> {
let t: T
}
let variable = MyStruct<Int>(t: 43)
```
The Swift compiler emits (roughly) an unsubtituted type for MyStruct<T>:
```
DW_TAG_structure_type
DW_AT_name ("MyStruct")
// "$s1w8MyStructVyxGD" is a Swift mangled name roughly equivalent to
// MyStruct<T>
DW_AT_linkage_name ("$s1w8MyStructVyxGD")
// other attributes here
```
And a specification for MyStruct<Int>:
```
DW_TAG_structure_type
DW_AT_specification (<link to "MyStruct">)
// "$s1w8MyStructVySiGD" is a Swift mangled name equivalent to
// MyStruct<Int>
DW_AT_linkage_name ("$s1w8MyStructVySiGD")
DW_AT_byte_size (0x08)
// other attributes here
```
An extra inhabitant is a bit pattern that does not represent a valid
value for instances of a given type. The number of extra inhabitants is
the number of those bit configurations.
This is used by Swift to save space when composing types. For example,
because Bool only needs 2 bit patterns to represent all of its values
(true and false), an Optional<Bool> only occupies 1 byte in memory by
using a bit configuration that is unused by Bool. Which bit patterns are
unused are part of the ABI of the language.
Since Swift generics are not monomorphized, by using dynamic libraries
you can have generic types whose size, alignment, etc, are known only
at runtime (which is why this feature is needed).
This patch adds num_extra_inhabitants to LLVM-IR debug info and in DWARF
as an Apple extension.
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
Extend `DIBasicType` and `DISubroutineType` with additional field
`annotations`, e.g. as below:
```
!5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed, annotations: !6)
!6 = !{!7}
!7 = !{!"btf:type_tag", !"tag1"}
```
The field would be used by BPF backend to generate DWARF attributes
corresponding to `btf_type_tag` type attributes, e.g.:
```
0x00000029: DW_TAG_base_type
DW_AT_name ("int")
DW_AT_encoding (DW_ATE_signed)
DW_AT_byte_size (0x04)
0x0000002d: DW_TAG_LLVM_annotation
DW_AT_name ("btf:type_tag")
DW_AT_const_value ("tag1")
```
Such DWARF entries would be used to generate BTF definitions by tools
like [pahole](https://github.com/acmel/dwarves).
Note: similar fields with similar purposes are already present in
DIDerivedType and DICompositeType.
Currently "btf_type_tag" attributes are represented in debug information
as 'annotations' fields in DIDerivedType with DW_TAG_pointer_type tag.
The annotation on a pointer corresponds to pointee having the attributes
in the final BTF.
The discussion in
[thread](https://lore.kernel.org/bpf/87r0w9jjoq.fsf@oracle.com/) came to
conclusion, that such annotations should apply to the annotated type
itself. Hence the necessity to extend `DIBasicType` & `DISubroutineType`
types with 'annotations' field to represent cases like below:
```
int __attribute__((btf_type_tag("foo"))) bar;
```
This was previously tracked as differential revision:
https://reviews.llvm.org/D143966
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
If --load-bitcode-into-experimental-debuginfo-iterators is true then debug
intrinsics are auto-upgraded to DbgRecords (the new debug info format).
The upgrade is trivial because the two representations are semantically
identical. llvm.dbg.value with 4 operands and llvm.dbg.addr intrinsics are
upgraded in the same way as usual, but converted directly into DbgRecords
instead of debug intrinsics.
Reland #82363 after fixing build failure
https://lab.llvm.org/buildbot/#/builders/5/builds/41428.
Memory sanitizer detects usage of `RawData` union member which is not
filled directly. Instead, the code relies on filling `Data` union
member, which is a struct consisting of signing schema parameters.
According to https://en.cppreference.com/w/cpp/language/union, this is
UB:
"It is undefined behavior to read from the member of the union that
wasn't most recently written".
Instead of relying on compiler allowing us to do dirty things, do not
use union and only store `RawData`. Particular ptrauth parameters are
obtained on demand via bit operations.
Original PR description below.
Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:
- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:
- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
This fixes some cases of missing debuginfo caused by an interaction
between:
f0d66559ea,
which drops the identifier from a DICompositeType in the module
containing its
vtable.
and
a61f5e3796,
which causes ThinLTO to import composite types as declarations when they
have
an identifier.
If a virtual class's DICompositeType has no identifier due to the first
change,
and contains a nested anonymous type which does have an identifier, then
the
second change can cause ThinLTO to output the classes's DICompositeType
as a
type definition that links to a non-defining declaration for the nested
type.
Since the nested anonyous type does not have a name, debuggers are
unable to
find the definition for the declaration.
Repro case:
```
cat > a.h <<EOF
class A {
public:
A();
virtual ~A();
private:
union {
int val;
};
};
EOF
cat > a.cc <<EOF
#include "a.h"
A::A() { asm(""); }
A::~A() {}
EOF
cat > main.cc <<EOF
#include "a.h"
int main(int argc, char **argv) {
A a;
return 0;
}
EOF
clang++ -O2 -g -flto=thin -mllvm -force-import-all main.cc a.cc
gdb ./a.out -batch -ex 'pt /rmt A'
```
The gdb command outputs:
```
type = class A {
private:
union {
<incomplete type>
};
}
```
and dwarfdump -i a.out shows a DW_TAG_class_type for A with an
incomplete union
type (note that there is also a duplicate entry with the full union type
that
comes after).
```
< 1><0x0000001e> DW_TAG_class_type
DW_AT_containing_type <0x0000001e>
DW_AT_calling_convention DW_CC_pass_by_reference
DW_AT_name (indexed string: 0x00000007)A
DW_AT_byte_size 0x00000010
DW_AT_decl_file 0x00000001 /path/to/./a.h
DW_AT_decl_line 0x00000001
...
< 2><0x0000002f> DW_TAG_member
DW_AT_type <0x00000037>
DW_AT_decl_file 0x00000001 /path/to/./a.h
DW_AT_decl_line 0x00000007
DW_AT_data_member_location 8
< 2><0x00000037> DW_TAG_union_type
DW_AT_export_symbols yes(1)
DW_AT_calling_convention DW_CC_pass_by_value
DW_AT_declaration yes(1)
```
This change works around this by making ThinLTO always import full
definitions
for anonymous types.
- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions
This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported
in Chromium (https://reviews.llvm.org/D144006#4651955).
The first commit is added for convenience, as it has already been
accepted.
If DISubpogram was not cloned (e.g. we are cloning a function that has
other
functions inlined into it, and subprograms of the inlined functions are
not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.
This is meant to be committed along with
https://reviews.llvm.org/D144006.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
Loading a 2GB bitcode file, I noticed that we spend minutes just running
upgradeCULocals(). Apparently it gets invoked every time a metadata
block is loaded, which will be once at the module level and then once
per function. However, the relevant metadata only exists at the module
level, so running this upgrade per function is unnecessary.
Metadata (via ValueAsMetadata) can reference constant expressions
that may no longer be supported. These references can both be in
function-local metadata and module metadata, if the same expression
is used in multiple functions. At least in theory, such references
could also be in metadata proper, rather than just inside
ValueAsMetadata references in calls.
Instead of trying to expand these expressions (which we can't
reliably do), pretend that the constant has been deleted, which
means that ValueAsMetadata references will get replaced with
undef metadata.
Fixes https://github.com/llvm/llvm-project/issues/68281.
This caused asserts:
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331:
virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *):
Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms &&
"getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.
See comment on the code review for reproducer.
> RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
>
> Similar to imported declarations, the patch tracks function-local types in
> DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
> the aforementioned metadata change and provided a support of function-local
> types scoped within a lexical block.
>
> The patch assumes that DICompileUnit's 'enums field' no longer tracks local
> types and DwarfDebug would assert if any locally-scoped types get placed there.
>
> Reviewed By: jmmartinez
>
> Differential Revision: https://reviews.llvm.org/D144006
This reverts commit f8aab289b5549086062588fba627b0e4d3a5ab15.
This fixes the bitcode upgrade failure reported in
https://reviews.llvm.org/D155924#4616789.
The expansion always happens in the entry block, so this may be
inaccurate if there are trapping constant expressions.
Test "local-type-as-template-parameter.ll" is now enabled only for
x86_64.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006
Depends on D144005
Test "local-type-as-template-parameter.ll" now requires linux-system.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006
Depends on D144005
This reverts commit d80fdc6fc1a6e717af1bcd7a7313e65de433ba85.
split-dwarf-local-impor3.ll fails because of an issue with
Dwo sections emission on Windows platform.
RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
Fixed PR51501 (tests from D112337).
1. Reuse of DISubprogram's 'retainedNodes' to track other function-local
entities together with local variables and labels (this patch cares about
function-local import while D144006 and D144008 use the same approach for
local types and static variables). So, effectively this patch moves ownership
of tracking local import from DICompileUnit's 'imports' field to DISubprogram's
'retainedNodes' and adjusts DWARF emitter for the new layout. The old layout
is considered unsupported (DwarfDebug would assert on such debug metadata).
DICompileUnit's 'imports' field is supposed to track global imported
declarations as it does before.
This addresses various FIXMEs and simplifies the next part of the patch.
2. Postpone emission of function-local imported entities from
`DwarfDebug::endFunctionImpl()` to `DwarfDebug::endModule()`.
While in `DwarfDebug::endFunctionImpl()` we do not have all the
information about a parent subprogram or a referring subprogram
(whether a subprogram inlined or not), so we can't guarantee we emit
an imported entity correctly and place it in a proper subprogram tree.
So now, we just gather needed details about the import itself and its
parent entity (either a Subprogram or a LexicalBlock) during
processing in `DwarfDebug::endFunctionImpl()`, but all the real work is
done in `DwarfDebug::endModule()` when we have all the required
information to make proper emission.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144004
This reverts commit ed578f02cf44a52adde16647150e7421f3ef70f3.
Tests llvm/test/DebugInfo/Generic/split-dwarf-local-import*.ll fail
when x86_64 target is not registered.
RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
Fixed PR51501 (tests from D112337).
1. Reuse of DISubprogram's 'retainedNodes' to track other function-local
entities together with local variables and labels (this patch cares about
function-local import while D144006 and D144008 use the same approach for
local types and static variables). So, effectively this patch moves ownership
of tracking local import from DICompileUnit's 'imports' field to DISubprogram's
'retainedNodes' and adjusts DWARF emitter for the new layout. The old layout
is considered unsupported (DwarfDebug would assert on such debug metadata).
DICompileUnit's 'imports' field is supposed to track global imported
declarations as it does before.
This addresses various FIXMEs and simplifies the next part of the patch.
2. Postpone emission of function-local imported entities from
`DwarfDebug::endFunctionImpl()` to `DwarfDebug::endModule()`.
While in `DwarfDebug::endFunctionImpl()` we do not have all the
information about a parent subprogram or a referring subprogram
(whether a subprogram inlined or not), so we can't guarantee we emit
an imported entity correctly and place it in a proper subprogram tree.
So now, we just gather needed details about the import itself and its
parent entity (either a Subprogram or a LexicalBlock) during
processing in `DwarfDebug::endFunctionImpl()`, but all the real work is
done in `DwarfDebug::endModule()` when we have all the required
information to make proper emission.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144004
RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
Fixed PR51501 (tests from D112337).
1. Reuse of DISubprogram's 'retainedNodes' to track other function-local
entities together with local variables and labels (this patch cares about
function-local import while D144006 and D144008 use the same approach for
local types and static variables). So, effectively this patch moves ownership
of tracking local import from DICompileUnit's 'imports' field to DISubprogram's
'retainedNodes' and adjusts DWARF emitter for the new layout. The old layout
is considered unsupported (DwarfDebug would assert on such debug metadata).
DICompileUnit's 'imports' field is supposed to track global imported
declarations as it does before.
This addresses various FIXMEs and simplifies the next part of the patch.
2. Postpone emission of function-local imported entities from
`DwarfDebug::endFunctionImpl()` to `DwarfDebug::endModule()`.
While in `DwarfDebug::endFunctionImpl()` we do not have all the
information about a parent subprogram or a referring subprogram
(whether a subprogram inlined or not), so we can't guarantee we emit
an imported entity correctly and place it in a proper subprogram tree.
So now, we just gather needed details about the import itself and its
parent entity (either a Subprogram or a LexicalBlock) during
processing in `DwarfDebug::endFunctionImpl()`, but all the real work is
done in `DwarfDebug::endModule()` when we have all the required
information to make proper emission.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144004
When opaque pointers are enabled and old IR with typed pointers is read,
the BitcodeReader automatically upgrades all typed pointers to opaque
pointers. This is a lossy conversion, i.e. when a function argument is a
pointer and unused, it’s impossible to reconstruct the original type
behind the pointer.
There are cases where the type information of pointers is needed. One is
reading DXIL, which is bitcode of old LLVM IR and makes a lot of use of
pointers in function signatures.
We’d like to keep using up-to-date llvm to read in and process DXIL, so
in the face of opaque pointers, we need some way to access the type
information of pointers from the read bitcode.
This patch allows extracting type information by supplying functions to
parseBitcodeFile that get called for each function signature or metadata
value. The function can access the type information via the reader’s
type IDs and the getTypeByID and getContainedTypeID functions.
The tests exemplarily shows how type info from pointers can be stored in
metadata for use after the BitcodeReader finished.
Differential Revision: https://reviews.llvm.org/D127728
This reverts commit b56df190b01335506ce30a4559d880da76d1a181.
The unit tests are implemented in a way that requires support for
writing typed pointer bitcode, which is going away soon. Please
rewrite it in a way that not have requirement, e.g. by shipping
pre-compiled bitcode, as we do for integration tests.
When opaque pointers are enabled and old IR with typed pointers is read,
the BitcodeReader automatically upgrades all typed pointers to opaque
pointers. This is a lossy conversion, i.e. when a function argument is a
pointer and unused, it’s impossible to reconstruct the original type
behind the pointer.
There are cases where the type information of pointers is needed. One is
reading DXIL, which is bitcode of old LLVM IR and makes a lot of use of
pointers in function signatures.
We’d like to keep using up-to-date llvm to read in and process DXIL, so
in the face of opaque pointers, we need some way to access the type
information of pointers from the read bitcode.
This patch allows extracting type information by supplying functions to
parseBitcodeFile that get called for each function signature or metadata
value. The function can access the type information via the reader’s
type IDs and the getTypeByID and getContainedTypeID functions.
The tests exemplarily shows how type info from pointers can be stored in
metadata for use after the BitcodeReader finished.
Differential Revision: https://reviews.llvm.org/D127728
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
getCanonicalMDString() also returns a nullptr for empty strings, which
tripped over the getSource() method. Solve the ambiguity of no source
versus an optional containing a nullptr by simply storing a pointer.
Differential Revision: https://reviews.llvm.org/D138658
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716