The CMake flag has been on by default for a month without any issues.
This makes the feature support in LLVM unconditional (but does not
enable the feature by default).
Patch 1/4 adding bitcode support.
Store whether or not a function is using Key Instructions in its DISubprogram so
that we don't need to rely on the -mllvm flag -dwarf-use-key-instructions to
determine whether or not to interpret Key Instructions metadata to decide
is_stmt placement at DWARF emission time. This makes bitcode support simple and
enables well defined mixing of non-key-instructions and key-instructions
functions in an LTO context.
This patch adds the bit (using DISubprogram::SubclassData1).
PR 144104 and 144103 use it during DWARF emission.
PR 44102 adds bitcode
support.
See pull request for overview of alternative attempts.
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/IR`,
`llvm/IRPrinter`, and `llvm/IRReader` libraries. These annotations
currently have no meaningful impact on the LLVM build; however, they are
a prerequisite to support an LLVM Windows DLL (shared library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
The bulk of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.
The following manual adjustments were also applied after running IDS on
Linux:
- Add `#include "llvm/Support/Compiler.h"` to files where it was not
auto-added by IDS due to no pre-existing block of include statements.
- Add `LLVM_ABI_FRIEND` to friend member functions declared with
`LLVM_ABI`
- Add `LLVM_TEMPLATE_ABI` and `LLVM_EXPORT_TEMPLATE` to exported
instantiated templates
- Add `LLVM_ABI` to a subset of private class methods and fields that
require export
- Add `LLVM_ABI` to a small number of symbols that require export but
are not declared in headers
- Reorder `LLVM_ABI` with `[[deprecated]]` and `[[nodiscard]]`
attributes.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
NFC for builds with LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS=OFF (default).
In an ideal world we would be able to track that the merged location is used in
multiple source atoms. We can't do this though, so instead we arbitrarily but
deterministically pick one.
In cases where the InlinedAt field is unchanged we keep the atom with the
lowest non-zero rank (highest precedence). If the ranks are equal we choose
the smaller non-zero group number (arbitrary choice).
In cases where the InlinedAt field is adjusted we generate a new atom group.
Keeping the group wouldn't make sense (a source atom is identified by the
group number and InlinedAt pair) but discarding the atom info could result
in missed is_stmts.
Add unittest in MetadataTest.cpp.
RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
Source location atoms are identified by a function-local number and the
DILocation's InlinedAt field. The front end is responsible for assigning source
atom numbers, but certain optimisations need to assign new atom numbers to some
instructions. Most often code duplication optimisations like loop
unroll. Tracking a global maximum value (waterline) means we can easily
(cheaply) get new numbers that don't clash in any function.
The waterline is managed through DILocation creation,
LLVMContext::incNextAtomGroup, and LLVMContext::updateAtomGroupWaterline.
Add unittest.
RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
Previously when getMergedLocation was passed nullptr as one of the
parameters we returned nullptr. Change that behaviour to instead return
a valid DebugLoc. This is beneficial for binaries from which SamplePGO
profiles are obtained.
getMergedLocation uses a common parent scope of the two input locations
for an output location.
It doesn't consider the case when the common parent scope is from a file
other than L1's and L2's files. In that case, it produces a merged location
with an erroneous scope (https://github.com/llvm/llvm-project/issues/122846).
In some cases, such as https://github.com/llvm/llvm-project/pull/125780#issuecomment-2651657856,
L1, L2 having a common parent scope from another file indicate that
the code at L1 and L2 is included from the same source location.
With this commit, getMergedLocation detects that L1, L2, or their common parent
scope files are different. If so, it assumes that L1 and L2 were included
from some source location, and tries to attach the output location to a scope
with the nearest common source location with regard to L1 and L2.
If the nearest common location is also from another file, getMergedLocation returns it
as a merged location, assuming that L1 and L2 belong to files that were both included
in the nearest common location.
Fixes https://github.com/llvm/llvm-project/issues/122846.
An Ada program can have types that are subranges of other types. This
patch adds a new DIType node, DISubrangeType, to represent this concept.
I considered extending the existing DISubrange to do this, but as
DISubrange does not derive from DIType, that approach seemed more
disruptive.
A DISubrangeType can be used both as an ordinary type, but also as the
type of an array index. This is also important for Ada.
Ada subrange types can also be stored using a bias. Representing this in
the DWARF required the use of an extension. GCC has been emitting this
extension for years, so I've reused it here.
When creating `EnumDecl`s from DWARF for Objective-C `NS_ENUM`s, the
Swift compiler tries to figure out if it should perform "swiftification"
of that enum (which involves renaming the enumerator cases, etc.). The
heuristics by which it determines whether we want to swiftify an enum is
by checking the `enum_extensibility` attribute (because that's what
`NS_ENUM` pretty much are). Currently LLDB fails to attach the
`EnumExtensibilityAttr` to `EnumDecl`s it creates (because there's not
enough info in DWARF to derive it), which means we have to fall back to
re-building Swift modules on-the-fly, slowing down expression evaluation
substantially. This happens around
4b3931c8ce/lib/ClangImporter/ImportEnumInfo.cpp (L37-L59)
To speed up Swift exression evaluation, this patch proposes encoding the
C/C++/Objective-C `enum_extensibility` attribute in DWARF via a new
`DW_AT_APPLE_ENUM_KIND`. This would currently be only used from the LLDB
Swift plugin. But may be of interest to other language plugins as well
(though I haven't come up with a concrete use-case for it outside of
Swift).
I'm open to naming suggestions of the various new attributes/attribute
constants proposed here. I tried to be as generic as possible if we
wanted to extend it to other kinds of enum properties (e.g., flag
enums).
The new attribute would look as follows:
```
DW_TAG_enumeration_type
DW_AT_type (0x0000003a "unsigned int")
DW_AT_APPLE_enum_kind (DW_APPLE_ENUM_KIND_Closed)
DW_AT_name ("ClosedEnum")
DW_AT_byte_size (0x04)
DW_AT_decl_file ("enum.c")
DW_AT_decl_line (23)
DW_TAG_enumeration_type
DW_AT_type (0x0000003a "unsigned int")
DW_AT_APPLE_enum_kind (DW_APPLE_ENUM_KIND_Open)
DW_AT_name ("OpenEnum")
DW_AT_byte_size (0x04)
DW_AT_decl_file ("enum.c")
DW_AT_decl_line (27)
```
Absence of the attribute means the extensibility of the enum is unknown
and abides by whatever the language rules of that CU dictate.
This does feel like a big hammer for quite a specific use-case, so I'm
happy to discuss alternatives.
Alternatives considered:
* Re-using an existing DWARF attribute to express extensibility. E.g., a
`DW_TAG_enumeration_type` could have a `DW_AT_count` or
`DW_AT_upper_bound` indicating the number of enumerators, which could
imply closed-ness. I felt like a dedicated attribute (which could be
generalized further) seemed more applicable. But I'm open to re-using
existing attributes.
* Encoding the entire attribute string (i.e., `DW_TAG_LLVM_annotation
("enum_extensibility((open))")`) on the `DW_TAG_enumeration_type`. Then
in LLDB somehow parse that out into a `EnumExtensibilityAttr`. I haven't
found a great API in Clang to parse arbitrary strings into AST nodes
(the ones I've found required fully formed C++ constructs). Though if
someone knows of a good way to do this, happy to consider that too.
Bit shift operations with a shift operand greater than or equal to the bit width
of the (promoted) value type result in undefined behavior according to C++
[expr.shift]p1. This change adds checking for this situation and avoids attempts
to constant fold DIExpressions that would otherwise provoke such behavior.
An existing test that presumably intended to exercise shifts at the UB boundary
has been updated; it now checks for shifts of 64 bits instead of 65. This issue
was reported by a static analysis tool; no actual cases of shift operations that
would result in undefined behavior in practice have been identified.
An extra inhabitant is a bit pattern that does not represent a valid
value for instances of a given type. The number of extra inhabitants is
the number of those bit configurations.
This is used by Swift to save space when composing types. For example,
because Bool only needs 2 bit patterns to represent all of its values
(true and false), an Optional<Bool> only occupies 1 byte in memory by
using a bit configuration that is unused by Bool. Which bit patterns are
unused are part of the ABI of the language.
Since Swift generics are not monomorphized, by using dynamic libraries
you can have generic types whose size, alignment, etc, are known only
at runtime (which is why this feature is needed).
This patch adds num_extra_inhabitants to LLVM-IR debug info and in DWARF
as an Apple extension.
This makes unit tests compatible with the assertion added in
https://github.com/llvm/llvm-project/pull/106524, by setting the
isSigned flag to the correct value or changing how the value is
constructed.
raw_string_ostream::flush() is essentially a no-op (also specified in docs).
Don't call it in tests that aren't meant to test 'raw_string_ostream' itself.
p.s. remove a few redundant calls to raw_string_ostream::str()
Patch [2/x] to fix structured bindings debug info in SROA.
It extracts a constant offset from the DIExpression if there is one and fills
RemainingOps with the ops that come after it.
This function will be used in a subsequent patch.
DIExpressions can get very long and have a lot of redundant operations.
This function uses simple pattern matching to fold constant math that
can be evaluated at compile time.
The hope is that other people can contribute other patterns as well.
I also couldn't see a good way of combining this with
`DIExpression::constantFold` so it stands alone.
This is part of a stack of patches and comes after
https://github.com/llvm/llvm-project/pull/69768https://github.com/llvm/llvm-project/pull/71717
Reland #82363 after fixing build failure
https://lab.llvm.org/buildbot/#/builders/5/builds/41428.
Memory sanitizer detects usage of `RawData` union member which is not
filled directly. Instead, the code relies on filling `Data` union
member, which is a struct consisting of signing schema parameters.
According to https://en.cppreference.com/w/cpp/language/union, this is
UB:
"It is undefined behavior to read from the member of the union that
wasn't most recently written".
Instead of relying on compiler allowing us to do dirty things, do not
use union and only store `RawData`. Particular ptrauth parameters are
obtained on demand via bit operations.
Original PR description below.
Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:
- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
The appendToStack() function asserts that no DW_OP_stack_value or
DW_OP_LLVM_fragment operations are present in the operations to be
appended. The function did that by iterating over all elements in the
array rather than just the operations, leading it to falsely asserting
on the following input produced by getExt(), since 159 (0x9f) is the
DWARF code for DW_OP_stack_value:
{dwarf::DW_OP_LLVM_convert, 159, dwarf::DW_ATE_signed}
Fix this by using expr_op iterators.
Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:
- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
`ValueAsMetadata::handleRAUW` is a mechanism to replace all metadata
referring to one value by a different value.
Relax an assert that used to enforce the old and new value to have the
same type.
This seems to be a sanity plausibility assert only, as the
implementation actually supports mismatching types.
This is motivated by a downstream mechanism where we use poison
ValueAsMetadata values to annotate pointee types of opaque pointer
function arguments.
When replacing one type with a different one to work around DXIL vs LLVM
incompatibilities, we need to update type annotations, and handleRAUW is
more efficient than creating new MD nodes.
Since we no longer support typed LLVM IR pointer types, the code can
be simplified into for example using PointerType::get directly instead
of using Type::getInt8PtrTy and Type::getInt32PtrTy etc.
Differential Revision: https://reviews.llvm.org/D156733
This improves the readability of debugging intrinsics. Instead of:
call void @llvm.dbg.value(metadata !2, ...)
!2 = !{}
We will see:
call void @llvm.dbg.value(metadata !{}, ...)
!2 = !{}
Note that we still get a numbered metadata entry for the node even if it's not
used elsewhere. This is to avoid adding more context to the print functions.
This is already legal IR - LLVM can parse and understand it - so there is no
need to update the parser.
The next patches in this stack will make such empty metadata operands more
common and semantically important.
Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D140900
This patch replaces the uses of PointerUnion.is function by llvm::isa,
PointerUnion.get function by llvm::cast, and PointerUnion.dyn_cast by
llvm::dyn_cast_if_present. This is according to the FIXME in
the definition of the class PointerUnion.
This patch does not remove them as they are being used in other
subprojects.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D148449
For example, if you have a chain of inlined funtions like this:
1 #include <stdlib.h>
2 int g1 = 4, g2 = 6;
3
4 static inline void bar(int q) {
5 if (q > 5)
6 abort();
7 }
8
9 static inline void foo(int q) {
10 bar(q);
11 }
12
13 int main() {
14 foo(g1);
15 foo(g2);
16 return 0;
17 }
with optimizations you could end up with a single abort call for the two
inlined instances of foo(). When merging the locations for those inlined
instances you would previously end up with a 0:0 location in main().
Leaving out that inlined chain from the location for the abort call
could make troubleshooting difficult in some cases.
This patch changes DILocation::getMergedLocation() to try to handle such
cases. The function is rewritten to first find a common starting point
for the two locations (same subprogram and inlined-at location), and
then in reverse traverses the inlined-at chain looking for matches in
each subprogram. For each subprogram, the merge function will find the
nearest common scope for the two locations, and matching line and
column (or set them to 0 if not matching).
In the example above, you will for the abort call get a location in
bar() at 6:5, inlined in foo() at 10:3, inlined in main() at 0:0 (since
the two inlined functions are on different lines, but in the same
scope).
I have not seen anything in the DWARF standard that would disallow
inlining a non-zero location at 0:0 in the inlined-at function, and both
LLDB and GDB seem to accept these locations (with D142552 needed for
LLDB to handle cases where the file, line and column number are all 0).
One incompatibility with GDB is that it seems to ignore 0-line locations
in some cases, but I am not aware of any specific issue that this patch
produces related to that.
With x86-64 LLDB (trunk) you previously got:
frame #0: 0x00007ffff7a44930 libc.so.6`abort
frame #1: 0x00005555555546ec a.out`main at merge.c:0
and will now get:
frame #0: 0x[...] libc.so.6`abort
frame #1: 0x[...] a.out`main [inlined] bar(q=<unavailable>) at merge.c:6:5
frame #2: 0x[...] a.out`main [inlined] foo(q=<unavailable>) at merge.c:10:3
frame #3: 0x[...] a.out`main at merge.c:0
and with x86-64 GDB (11.1) you will get:
(gdb) bt
#0 0x00007ffff7a44930 in abort () from /lib64/libc.so.6
#1 0x00005555555546ec in bar (q=<optimized out>) at merge.c:6
#2 foo (q=<optimized out>) at merge.c:10
#3 0x00005555555546ec in main ()
Reviewed By: aprantl, dblaikie
Differential Revision: https://reviews.llvm.org/D142556
This patch modifies SelectionDAG and FastISel to produce DBG_INSTR_REFs with
variadic expressions, and produce DBG_INSTR_REFs for debug values with variadic
location expressions. The former essentially means just prepending
DW_OP_LLVM_arg, 0 to the existing expression. The latter is achieved in
MachineFunction::finalizeDebugInstrRefs and InstrEmitter::EmitDbgInstrRef.
Reviewed By: jmorse, Orlando
Differential Revision: https://reviews.llvm.org/D133929
Following support from the previous patches in this stack being added for
variadic DBG_INSTR_REFs to exist, this patch modifies LiveDebugValues to
handle those instructions. Support already exists for DBG_VALUE_LISTs, which
covers most of the work needed to handle these instructions; this patch only
modifies the transferDebugInstrRef function to correctly track them.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133927
Prior to this patch, variadic DIExpressions (i.e. ones that contain
DW_OP_LLVM_arg) could only be created by salvaging debug values to create
stack value expressions, resulting in a DBG_VALUE_LIST being created. As of
the previous patch in this patch stack, DBG_INSTR_REF's syntax has been
changed to match DBG_VALUE_LIST in preparation for supporting variadic
expressions. This patch adds some minor changes needed to allow variadic
expressions that aren't stack values to exist, and allows variadic expressions
that are trivially reduceable to non-variadic expressions to be handled
similarly to non-variadic expressions.
Reviewed by: jmorse
Differential Revision: https://reviews.llvm.org/D133926
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
This patch adds a new function that can be used to check all the
properties, other than the machine values, of a pair of debug values for
equivalence. This is done by folding the "directness" into the
expression, converting the expression to variadic form if it is not
already in that form, and then comparing directly. In a few places which
check whether two debug values are identical to see if their ranges can
be merged, this function will correctly identify cases where two debug
values are expressed differently but have the same meaning, allowing
those ranges to be correctly merged.
Differential Revision: https://reviews.llvm.org/D136173
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes check-llvm.
getCanonicalMDString() also returns a nullptr for empty strings, which
tripped over the getSource() method. Solve the ambiguity of no source
versus an optional containing a nullptr by simply storing a pointer.
Differential Revision: https://reviews.llvm.org/D138658
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.
In the std::optional world, we are not guranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.
Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:
using NoneType = std::nullopt_t;
inline constexpr std::nullopt_t None = std::nullopt;
to ease the migration from llvm::Optional to std::optional.
Differential Revision: https://reviews.llvm.org/D138376
createFragmentExpression rejects expressions containing certain ops, like
DW_OP_plus, that may cause the expression to compute a value that can't be
split.
Teach createFragmentExpression that the value loaded from an address computed
using those ops is safe to split.
Update a unittest to account for and test this change.
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D136243
getMergedLocation returns a 'line 0' DILocaiton if the two locations
being merged don't perfecly match, even if they are in the same line but
a different column.
This commit adds support to keep the line number if it matches (but only
the column differs). The merged column number is the leftmost between the
two.
Reviewed By: dblaikie, orlando
Differential Revision: https://reviews.llvm.org/D135166
Fixed a bug with double destruction of operands and corrected a test issue.
Note that this patch leads to a slight increase in compile time (I measured
about .3%) and a slight increase in memory usage. The increased memory usage
should be offset once resizing is used to a larger extent.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D125998