The motivating use case is to support importing the function declaration
across modules to construct call graph edges for indirect calls [1]
when importing the function definition costs too much compile time
(e.g., the function is too large has no `noinline` attribute).
1. Currently, when the compiled IR module doesn't have a function
definition but its postlink combined summary contains the function
summary or a global alias summary with this function as aliasee, the
function definition will be imported from the source module by IRMover. The
implementation is in FunctionImporter::importFunctions [2].
2. In order for FunctionImporter to import a declaration of a function,
both function summary and alias summary need to carry the def / decl
state. Specifically, no existing summary field differs across
import modules, but the def / decl state is decided per
`<ImportModule, Function>`.
This change encodes the def/decl state in `GlobalValueSummary::GVFlags`.
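As a rough illustration of carrying a def/decl state inside a packed flag word, here is a minimal sketch. The `ImportKind` enum, the bit position, and the helper names are assumptions made for this example only; the real `GVFlags` layout is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical import-kind values; the names are illustrative only.
enum class ImportKind : uint8_t { Definition = 0, Declaration = 1 };

// Toy flag word: bits 0-3 stand in for the existing flags and bit 4 holds
// the def/decl state. This is NOT the actual GVFlags bit layout.
constexpr unsigned ImportKindShift = 4;

// Pack the existing flags together with the def/decl state.
inline uint32_t encodeFlags(uint32_t ExistingFlags, ImportKind Kind) {
  assert(ExistingFlags < (1u << ImportKindShift) && "existing flags overflow");
  return ExistingFlags | (static_cast<uint32_t>(Kind) << ImportKindShift);
}

// Recover the def/decl state from the packed word.
inline ImportKind decodeImportKind(uint32_t RawFlags) {
  return static_cast<ImportKind>((RawFlags >> ImportKindShift) & 1u);
}

// Recover the pre-existing flags, which are unaffected by the new bit.
inline uint32_t decodeExistingFlags(uint32_t RawFlags) {
  return RawFlags & ((1u << ImportKindShift) - 1);
}
```

Keeping the state in a spare bit means the other fields round-trip unchanged, which fits the observation that only the def/decl state varies per importing module.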
In the subsequent changes:
1. The indexing step `computeImportForModule` [3]
will compute the set of definitions and the set of declarations for each
module, and pass the information on to the bitcode writer.
2. Bitcode writer will look up the def/decl state and set the state
when it writes out the flag value. This is demonstrated in
https://github.com/llvm/llvm-project/pull/87600
3. Function importer will read the def/decl state when reading the
combined summary to figure out two sets of global values, and IRMover
will be updated to import the declaration (aka linkGlobalValuePrototype [4])
into the destination module.
- The next change is https://github.com/llvm/llvm-project/pull/87600
[1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5
[2] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)
[3] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)
[4] 3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)
As noted when #82404 was pushed (canonicalizing `sitofp` -> `uitofp`),
different signedness on fp casts can have dramatic performance
implications on different backends.
So, it makes sense to create a reliable means for the backend to pick its
cast signedness if either is correct.
Further, this allows us to start canonicalizing `sitofp` -> `uitofp`,
which may ease middle-end analysis.
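To make "either is correct" concrete: the signed and unsigned conversions agree exactly when the sign bit of the integer is clear. The sketch below uses plain C++ casts on a 32-bit value as a stand-in for the IR casts; the helper names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Convert the bits as a signed integer (analogous to a signed int-to-fp cast).
inline double castSigned(uint32_t Bits) {
  int32_t Signed;
  std::memcpy(&Signed, &Bits, sizeof(Signed)); // reinterpret without UB
  return static_cast<double>(Signed);
}

// Convert the bits as an unsigned integer (analogous to an unsigned cast).
inline double castUnsigned(uint32_t Bits) { return static_cast<double>(Bits); }

// Both casts give the same result exactly when the sign bit is clear.
inline bool eitherCastIsCorrect(uint32_t Bits) { return (Bits >> 31) == 0; }
```

When `eitherCastIsCorrect` holds, a backend is free to pick whichever conversion is cheaper on its target.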
Closes #86141
This patch adds a new flag: `--preserve-input-debuginfo-format`
This flag instructs the tool to not convert the debug info format
(intrinsics/records) of input IR, but to instead determine the format of
the input IR and overwrite the other format-determining flags so that we
process and output the file in the same format that we received it in.
This flag is turned off by llvm-link, llvm-lto, and llvm-lto2, and
should be turned off by any other tool that expects to parse multiple IR
modules and have their debug info formats match.
The motivation for this flag is to let tools that seek to test or mutate
IR as-is (verify-uselistorder, llvm-reduce, and any such downstream
tools) avoid converting the debug info format, so that no extraneous
modifications are applied to the input. This is a necessary step to using
debug records by default in all (other) LLVM tools.
The class `ScopedDbgInfoFormatSetter` was added as a convenient way to
temporarily change the debug info format of a function or module, as
part of IR printing; since this process is repeated in a number of other
places, this patch uses the format-setter class in those places as well.
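The general shape of such a format-setter is a scope guard: switch the format on construction, restore the previous one on destruction. The following is a minimal sketch against a toy module type; `ToyModule`, `ScopedFormatSetter`, and `formatSeenWhilePrinting` are stand-ins invented for this example and are not the LLVM classes.

```cpp
#include <cassert>

// Toy stand-in for a module or function that tracks its debug info format.
struct ToyModule {
  bool IsNewDbgInfoFormat = false;
  void setIsNewDbgInfoFormat(bool NewFormat) { IsNewDbgInfoFormat = NewFormat; }
};

// Scope guard: switches the format on entry and restores it on exit.
template <typename T> class ScopedFormatSetter {
  T &Obj;
  bool OldState;

public:
  ScopedFormatSetter(T &Obj, bool NewState)
      : Obj(Obj), OldState(Obj.IsNewDbgInfoFormat) {
    Obj.setIsNewDbgInfoFormat(NewState);
  }
  ~ScopedFormatSetter() { Obj.setIsNewDbgInfoFormat(OldState); }
  ScopedFormatSetter(const ScopedFormatSetter &) = delete;
  ScopedFormatSetter &operator=(const ScopedFormatSetter &) = delete;
};

// Pretend to print M in the requested format; returns the format that was
// active while "printing". The original format is restored on return.
inline bool formatSeenWhilePrinting(ToyModule &M, bool PrintFormat) {
  ScopedFormatSetter<ToyModule> Guard(M, PrintFormat);
  assert(M.IsNewDbgInfoFormat == PrintFormat && "guard should switch format");
  return M.IsNewDbgInfoFormat;
}
```

Because restoration happens in the destructor, early returns inside the printing code cannot leave the object in the temporary format.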
Add annotated vtable GUIDs as referenced variables in the per-function
summary, and update the bitcode writer to create value-ids for these
referenced vtables.
- This is part 3 of the type profiling work, described in the "Virtual Table Definition Import" [1] section of the
RFC.
[1] https://github.com/llvm/llvm-project/pull/[redacted]
Another trivial rename patch, the last big one for now, which renamed
DPMarkers to DbgMarkers. This required the field `DbgMarker` in
`Instruction` to be renamed to `DebugMarker` to avoid a clash, but
otherwise was a simple string substitution of `s/DPMarker/DbgMarker/` and
a manual renaming of `DPM` to `DM` in the few places where that acronym
was used for debug markers.
This patch renames DPLabel to DbgLabelRecord, in accordance with the
ongoing DbgRecord rename. This rename was fairly trivial, since DPLabel
isn't as widely used as DPValue and has no real conflicts in either its
full or abbreviated name. As usual, the entire replacement was done
automatically, with `s/DPLabel/DbgLabelRecord/` and `s/DPL/DLR/`.
As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.
Currently, inrange is specified as follows:
```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```
The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:
```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```
This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:
```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```
The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.
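The numbers in the examples above can be checked with a little arithmetic. The sketch below assumes 8-byte pointers and no padding between the two `[4 x ptr]` fields; the helper names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 8-byte pointers, so each [4 x ptr] field occupies 32 bytes.
constexpr int64_t PtrSize = 8;
constexpr int64_t ArrayLen = 4;
constexpr int64_t ArraySize = ArrayLen * PtrSize;

struct InRange {
  int64_t Lo; // inclusive lower bound, relative to the GEP result
  int64_t Hi; // exclusive upper bound, relative to the GEP result
};

// Byte offset of element ElemIdx of field FieldIdx in
// { [4 x ptr], [4 x ptr] }.
inline int64_t gepOffset(int64_t FieldIdx, int64_t ElemIdx) {
  assert(FieldIdx >= 0 && FieldIdx < 2 && "struct has two array fields");
  return FieldIdx * ArraySize + ElemIdx * PtrSize;
}

// Range covering the whole array field, expressed relative to the result
// pointer rather than to the source pointer.
inline InRange resultRelativeRange(int64_t FieldIdx, int64_t ElemIdx) {
  int64_t Result = gepOffset(FieldIdx, ElemIdx);
  int64_t FieldStart = FieldIdx * ArraySize;
  return {FieldStart - Result, FieldStart + ArraySize - Result};
}
```

For field 1, element 2 this gives a byte offset of 48 and a range of (-16, 16), matching the `i64 48` and `inrange(-16, 16)` in the ptradd-style form.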
This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/Target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
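The ordering requirement can be seen in a small sketch: the longest pattern has to be applied first, because each shorter pattern is a prefix of the longer ones. `replaceAll` and `applyRename` are helper names invented for this example; the actual rename was done with ordinary text tools.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of From in Text with To (case-sensitive).
inline std::string replaceAll(std::string Text, const std::string &From,
                              const std::string &To) {
  assert(!From.empty() && "pattern must be non-empty");
  for (std::string::size_type Pos = Text.find(From);
       Pos != std::string::npos; Pos = Text.find(From, Pos + To.size()))
    Text.replace(Pos, From.size(), To);
  return Text;
}

// Apply the three substitutions in the order listed above.
inline std::string applyRename(std::string Text) {
  const std::vector<std::pair<std::string, std::string>> Rules = {
      {"DPValue", "DbgVariableRecord"},
      {"DPVal", "DbgVarRec"},
      {"DPV", "DVR"}};
  for (const auto &Rule : Rules)
    Text = replaceAll(std::move(Text), Rule.first, Rule.second);
  return Text;
}
```

Running this blindly over an identifier such as `LDPV_DEFAULT` would turn it into `LDVR_DEFAULT`, which is why the directories listed above had to be excluded from the pure substitution.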
If --load-bitcode-into-experimental-debuginfo-iterators is true then debug
intrinsics are auto-upgraded to DbgRecords (the new debug info format).
The upgrade is trivial because the two representations are semantically
identical. llvm.dbg.value with 4 operands and llvm.dbg.addr intrinsics are
upgraded in the same way as usual, but converted directly into DbgRecords
instead of debug intrinsics.
--load-bitcode-into-experimental-debuginfo-iterators
false: Convert to the old debug mode after reading.
true: Upgrade to the new debug info format (*).
unset: Same as false (for now).
(*) As of this patch it actually just means "don't convert to either
mode after loading". Auto-upgrading will be implemented in an upcoming
patch.
With this flag we can incrementally add support for RemoveDIs by
overriding the "unset" behaviour in individual tools. The flag can be
removed once all tools support the new debug info mode.
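The three-state behaviour can be sketched as follows. `FlagState` and `useNewDebugInfoFormat` are names assumed for this illustration, and `ToolDefault` models a tool overriding the "unset" behaviour; this is not the actual option-handling code.

```cpp
#include <cassert>

// The flag is tri-state: explicitly false, explicitly true, or not given.
enum class FlagState { Unset, False, True };

// An explicit value on the command line always wins; otherwise the tool's
// own default applies. With ToolDefault == false, "unset" behaves like
// "false", as described above.
inline bool useNewDebugInfoFormat(FlagState CmdLine, bool ToolDefault = false) {
  switch (CmdLine) {
  case FlagState::True:
    return true;
  case FlagState::False:
    return false;
  case FlagState::Unset:
    break;
  }
  return ToolDefault;
}
```

A tool that has gained support for the new mode would pass `true` as its default, while users can still force the old mode with an explicit `false`.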
Reland #82363 after fixing build failure
https://lab.llvm.org/buildbot/#/builders/5/builds/41428.
Memory sanitizer detects usage of `RawData` union member which is not
filled directly. Instead, the code relies on filling `Data` union
member, which is a struct consisting of signing schema parameters.
According to https://en.cppreference.com/w/cpp/language/union, this is
UB:
"It is undefined behavior to read from the member of the union that
wasn't most recently written".
Instead of relying on the compiler allowing us to do dirty things, do not
use a union and only store `RawData`. Particular ptrauth parameters are
obtained on demand via bit operations.
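A minimal sketch of the "store only the raw word, extract fields on demand" approach follows. The class name, field widths, and bit positions are assumptions made for this example and do not reflect the real layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: key in bits 0-3, address-discriminated flag in
// bit 4, extra discriminator in bits 5-20.
class ToyPtrAuthData {
  uint32_t RawData = 0; // the only stored member; no union involved

public:
  ToyPtrAuthData(unsigned Key, bool IsAddressDiscriminated,
                 unsigned ExtraDiscriminator) {
    assert(Key < 16 && ExtraDiscriminator < (1u << 16) && "field overflow");
    RawData = Key | (IsAddressDiscriminated ? (1u << 4) : 0u) |
              (ExtraDiscriminator << 5);
  }

  // Each parameter is recovered with shifts and masks rather than by
  // reading an inactive union member.
  unsigned key() const { return RawData & 0xFu; }
  bool isAddressDiscriminated() const { return ((RawData >> 4) & 1u) != 0; }
  unsigned extraDiscriminator() const { return (RawData >> 5) & 0xFFFFu; }
  uint32_t raw() const { return RawData; }
};
```

Since every read goes through the same member that was written, there is no inactive union member for a sanitizer to flag.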
Original PR description below.
Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:
- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Reapplying after revert in #85382 (861ebe6446296c96578807363aa292c69d827773).
Fixed intermittent test failure by avoiding piping output in some RUN lines.
If --write-experimental-debuginfo-iterators-to-bitcode is true (default false)
and --experimental-debuginfo-iterators is also true then the new debug info
format (non-instruction records) is written to bitcode directly.
Added the following records:
FUNC_CODE_DEBUG_RECORD_LABEL
FUNC_CODE_DEBUG_RECORD_VALUE
FUNC_CODE_DEBUG_RECORD_DECLARE
FUNC_CODE_DEBUG_RECORD_ASSIGN
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
The last one has an abbrev in FUNCTION_BLOCK BLOCK_INFO. Incidentally, this uses
the last value available without widening the code-length for FUNCTION_BLOCK
from 4 to 5 bits.
Records are formatted as follows:
  All DbgRecords start with:
    1. DILocation
  FUNC_CODE_DEBUG_RECORD_LABEL
    2. DILabel
  DPValues then share common fields:
    2. DILocalVariable
    3. DIExpression
  FUNC_CODE_DEBUG_RECORD_VALUE
    4. Location Metadata
  FUNC_CODE_DEBUG_RECORD_DECLARE
    4. Location Metadata
  FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
    4. Location Value (single)
  FUNC_CODE_DEBUG_RECORD_ASSIGN
    4. Location Metadata
    5. DIAssignID
    6. DIExpression (address)
    7. Location Metadata (address)
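The field ordering above can be summarised in a short sketch that lists the operands of each record kind in order. `RecordKind` and `recordFields` are names invented for this illustration; the function only returns descriptive labels and does not write bitcode.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One enumerator per FUNC_CODE_DEBUG_RECORD_* kind listed above.
enum class RecordKind { Label, Value, Declare, ValueSimple, Assign };

// Returns the operands of a record, in order, as descriptive labels.
inline std::vector<std::string> recordFields(RecordKind Kind) {
  std::vector<std::string> Fields = {"DILocation"}; // shared by all records
  if (Kind == RecordKind::Label) {
    Fields.push_back("DILabel");
    return Fields;
  }
  // Variable records share the next two fields.
  Fields.push_back("DILocalVariable");
  Fields.push_back("DIExpression");
  Fields.push_back(Kind == RecordKind::ValueSimple ? "Location Value (single)"
                                                   : "Location Metadata");
  if (Kind == RecordKind::Assign) {
    Fields.push_back("DIAssignID");
    Fields.push_back("DIExpression (address)");
    Fields.push_back("Location Metadata (address)");
  }
  return Fields;
}
```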
Encoding the DILocation metadata reference directly appeared to yield smaller
bitcode files than encoding the operands separately (as is done with instruction
DILocations).
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE is by far the most common DbgRecord record
in optimized code (order of 5x-10x over other kinds). Unoptimized code should
only contain FUNC_CODE_DEBUG_RECORD_DECLARE.
This patch continues the ongoing rename work, replacing DPValue with
DbgRecord in comments and the names of variables, both members and
fn-local. This is the most labour-intensive part of the rename, as it is
where the most decisions have to be made about whether a given comment
or variable is referring to DPValues (equivalent to debug variable
intrinsics) or DbgRecords (a catch-all for all debug intrinsics); these
decisions are not individually difficult, but comprise a fairly large
amount of text to review.
This patch still largely performs basic string substitutions followed by
clang-format; there are almost* no places where, for example, a comment
has been expanded or modified to reflect the semantic difference between
DPValues and DbgRecords. I don't believe such a change is generally
necessary in LLVM, but it may be useful in the docs, and so I'll be
submitting docs changes as a separate patch.
*In a few places, `dbg.values` was replaced with `debug intrinsics`.
`sign-return-address` and similar module attributes should be propagated
to the function level before modules are merged, because module flags
may contradict each other and the per-module information is not
recoverable after the merge.
The generated code will then match the normal linking flow.
NOTE: For brevity the following talks about ConstantInt but
everything extends to cover ConstantFP as well.
Whilst ConstantInt::get() supports the creation of vectors whereby
each lane has the same value, it achieves this via other constants:
* ConstantVector for fixed-length vectors
* ConstantExprs for scalable vectors
However, ConstantExprs are being deprecated and ConstantVector is
not space efficient for larger vector types. By extending ConstantInt
we can represent vector splats by only storing the underlying scalar
value.
More specifically:
* ConstantInt gains an ElementCount variant of get().
* LLVMContext is extended to map <EC,APInt>->ConstantInt.
* BitcodeReader/Writer support is extended to allow vector types.
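The uniquing idea behind the `<EC,APInt>->ConstantInt` map can be sketched with toy types: a splat is keyed by its element count and scalar value, so a repeated request returns the same object and only the scalar is stored. All type and member names below are stand-ins for this example, with `int64_t` taking the place of `APInt`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

// Toy element count: minimum number of lanes plus a scalable bit.
struct ToyElementCount {
  unsigned MinElts;
  bool Scalable;
  bool operator<(const ToyElementCount &O) const {
    return std::make_pair(MinElts, Scalable) <
           std::make_pair(O.MinElts, O.Scalable);
  }
};

// A splat constant stores only the scalar value, not one value per lane.
struct ToySplatConstant {
  ToyElementCount EC;
  int64_t Value;
};

// Toy context owning the uniquing map.
class ToyContext {
  std::map<std::pair<ToyElementCount, int64_t>,
           std::unique_ptr<ToySplatConstant>>
      Splats;

public:
  const ToySplatConstant *getSplat(ToyElementCount EC, int64_t Value) {
    assert(EC.MinElts > 0 && "a vector needs at least one lane");
    std::unique_ptr<ToySplatConstant> &Slot = Splats[std::make_pair(EC, Value)];
    if (!Slot)
      Slot.reset(new ToySplatConstant{EC, Value});
    return Slot.get();
  }
  std::size_t numSplats() const { return Splats.size(); }
};
```

Fixed-length and scalable splats of the same value get distinct entries because the scalable bit is part of the key.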
Whilst this patch adds the base support, more work is required
before it's production ready. For example, there's likely to be
many places where isa<ConstantInt> assumes a scalar type. Accordingly
the default behaviour of ConstantInt::get() remains unchanged but a
set of flags are added to allow wider testing and thus help with the
migration:
--use-constant-int-for-fixed-length-splat
--use-constant-fp-for-fixed-length-splat
--use-constant-int-for-scalable-splat
--use-constant-fp-for-scalable-splat
NOTE: No change is required to the bitcode format because types and
values are handled separately.
NOTE: For similar reasons as above, code generation doesn't work
out of the box.
We've been building and testing this no-debug-intrinsic work inside of
the pass manager for a while, so that optimisation passes get exercised
and tested when we turn it on. However, by converting to the
non-intrinsic form in the bitcode loader, we accidentally caused all
parts of LLVM to potentially see non-intrinsic debug-info.
Seeing how we're trying to turn things on incrementally, it was a
mistake to go this far this fast: we can instead just focus on enabling
during optimisations for the moment, then all the other parts of LLVM
later.
Turns out I was using DbgMarker::getDbgValueRange rather than the helper
utility in Instruction::getDbgValueRange, which checks for null-ness.
Original commit message follows.
[DebugInfo][RemoveDIs] Convert debug-info modes when loading bitcode (#78967)
As part of eliminating debug-intrinsics in LLVM, we'll shortly be
pushing the conversion from "old" dbg.value mode to "new" DPValue mode
out from when the pass manager runs, to when modules are loaded. This
patch adds that conversion process and some (temporary) options to
llvm-lto{,2} to help test it.
Specifically: now whenever we load a bitcode module, consider a flag of
whether to "upgrade" it into the new debug-info mode, and if we're
lazily materializing functions then do that lazily too. Doing this
exposes an error in the IRLinker/materializer handling of DPValues,
where we need to transfer the debug-info format flag correctly, and in
ValueMapper we need to remap the Values that DPValues point at.
I've added some test coverage in the modified tests; these will be
exercised by our llvm-new-debug-iterators buildbot.
This upgrading of debug-info won't be happening for the llvm18 release,
instead we'll turn it on after the branch date, then push the boundary
of where "new" debug-info starts and ends down into the existing
debug-info upgrade path over the course of the next release.
This fixes some cases of missing debuginfo caused by an interaction
between:
f0d66559ea,
which drops the identifier from a DICompositeType in the module
containing its vtable.
and
a61f5e3796,
which causes ThinLTO to import composite types as declarations when they
have an identifier.
If a virtual class's DICompositeType has no identifier due to the first
change, and contains a nested anonymous type which does have an
identifier, then the second change can cause ThinLTO to output the
class's DICompositeType as a type definition that links to a
non-defining declaration for the nested type.
Since the nested anonymous type does not have a name, debuggers are
unable to find the definition for the declaration.
Repro case:
```
cat > a.h <<EOF
class A {
public:
  A();
  virtual ~A();
private:
  union {
    int val;
  };
};
EOF
cat > a.cc <<EOF
#include "a.h"
A::A() { asm(""); }
A::~A() {}
EOF
cat > main.cc <<EOF
#include "a.h"
int main(int argc, char **argv) {
  A a;
  return 0;
}
EOF
clang++ -O2 -g -flto=thin -mllvm -force-import-all main.cc a.cc
gdb ./a.out -batch -ex 'pt /rmt A'
```
The gdb command outputs:
```
type = class A {
private:
union {
<incomplete type>
};
}
```
and dwarfdump -i a.out shows a DW_TAG_class_type for A with an
incomplete union type (note that there is also a duplicate entry with
the full union type that comes after).
```
< 1><0x0000001e> DW_TAG_class_type
DW_AT_containing_type <0x0000001e>
DW_AT_calling_convention DW_CC_pass_by_reference
DW_AT_name (indexed string: 0x00000007)A
DW_AT_byte_size 0x00000010
DW_AT_decl_file 0x00000001 /path/to/./a.h
DW_AT_decl_line 0x00000001
...
< 2><0x0000002f> DW_TAG_member
DW_AT_type <0x00000037>
DW_AT_decl_file 0x00000001 /path/to/./a.h
DW_AT_decl_line 0x00000007
DW_AT_data_member_location 8
< 2><0x00000037> DW_TAG_union_type
DW_AT_export_symbols yes(1)
DW_AT_calling_convention DW_CC_pass_by_value
DW_AT_declaration yes(1)
```
This change works around this by making ThinLTO always import full
definitions for anonymous types.
- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions
This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported in Chromium (https://reviews.llvm.org/D144006#4651955).
The first commit is added for convenience, as it has already been
accepted.
If a DISubprogram was not cloned (e.g. we are cloning a function that has
other functions inlined into it, and subprograms of the inlined
functions are not supposed to be cloned), it doesn't make sense to clone
its DILocalVariables either.
Otherwise we get duplicated DILocalVariables that are not tracked in
their subprogram's retainedNodes, which crash LTO with Chromium.
This is meant to be committed along with
https://reviews.llvm.org/D144006.
If tail call optimization was not disabled for the profiled binary, the
call contexts will be missing frames for tail calls. Handle this by
performing a limited search through tail call edges for the profiled
callee when a discontinuity is detected. The search depth is adjustable
but defaults to 5.
If we are able to identify a short sequence of tail calls, update the
graph for those calls. In the case of ThinLTO, synthesize the necessary
CallsiteInfos for carrying the cloning information to the backends.
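The bounded search can be pictured as a depth-limited walk over tail call edges. The sketch below is a simplified model: the graph type and function name are invented for this example, and it returns the first chain it finds, whereas a real implementation may apply extra conditions before accepting a chain.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy graph: function name -> functions it tail-calls.
using TailCallGraph = std::map<std::string, std::vector<std::string>>;

// Depth-limited DFS from From towards Target along tail call edges.
// On success, Path holds the chain from From to Target inclusive; on
// failure, Path is left as it was on entry.
inline bool findTailCallChain(const TailCallGraph &G, const std::string &From,
                              const std::string &Target, unsigned DepthLeft,
                              std::vector<std::string> &Path) {
  Path.push_back(From);
  if (From == Target)
    return true;
  if (DepthLeft > 0) {
    auto It = G.find(From);
    if (It != G.end())
      for (const std::string &Callee : It->second)
        if (findTailCallChain(G, Callee, Target, DepthLeft - 1, Path))
          return true;
  }
  Path.pop_back(); // dead end within the depth budget
  return false;
}
```

The depth budget also bounds the walk on cyclic graphs, so no separate visited set is needed in this toy version.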
Add the `dead_on_unwind` attribute, which states that the caller will
not read from this argument if the call unwinds. This allows eliding
stores that could otherwise be visible on the unwind path, for example:
```
declare void @may_unwind()

define void @src(ptr noalias dead_on_unwind %out) {
  store i32 0, ptr %out
  call void @may_unwind()
  store i32 1, ptr %out
  ret void
}

define void @tgt(ptr noalias dead_on_unwind %out) {
  call void @may_unwind()
  store i32 1, ptr %out
  ret void
}
```
The optimization is not valid without `dead_on_unwind`, because the `i32
0` value might be read if `@may_unwind` unwinds.
This attribute is primarily intended to be used on sret arguments. In
fact, I previously wanted to change the semantics of sret to include
this "no read after unwind" property (see D116998), but based on the
feedback there it is better to keep these attributes orthogonal (sret is
an ABI attribute, dead_on_unwind is an optimization attribute). This is
a reboot of that change with a separate attribute.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This adds support for a HasTailCall flag on function call edges in the
ThinLTO summary. It is intended for use in aiding discovery of missing
frames from tail calls in profiled call stacks for MemProf of profiled
binaries that did not disable tail call elimination. A follow on change
will add the use of this new flag during MemProf context disambiguation.
The new flag is encoded in the bitcode along with either the hotness
flag from the profile, or the relative block frequency under the
-write-relbf-to-summary flag when there is no profile data.
Because we now will always have some additional call edge information, I
have removed the non-profile function summary record format, and we
simply encode the tail call flag along with a hotness type of none when
there is no profile information or relative block frequency. The change
of record format and name caused most of the test case changes.
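As an illustration of carrying the tail call flag alongside the hotness in a single encoded value, here is a minimal sketch. The enum values, the struct, and the bit positions are assumptions for this example and are not taken from the bitcode writer.

```cpp
#include <cassert>
#include <cstdint>

// Toy hotness levels; "None" is used when there is no profile information.
enum class ToyHotness : uint8_t {
  Unknown = 0,
  Cold = 1,
  None = 2,
  Hot = 3,
  Critical = 4
};

struct ToyCallEdgeInfo {
  ToyHotness Hotness;
  bool HasTailCall;
};

// Hypothetical packing: bit 0 = tail call flag, bits 1 and up = hotness.
inline uint64_t encodeCallEdgeInfo(ToyCallEdgeInfo Info) {
  return static_cast<uint64_t>(Info.HasTailCall) |
         (static_cast<uint64_t>(Info.Hotness) << 1);
}

inline ToyCallEdgeInfo decodeCallEdgeInfo(uint64_t Raw) {
  assert((Raw >> 1) <= 4 && "unknown hotness value");
  return {static_cast<ToyHotness>(Raw >> 1), (Raw & 1) != 0};
}
```

With a single record shape, an edge without profile data is simply encoded with a hotness of `None` plus its tail call bit, rather than needing a separate record format.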
I have added explicit testing of generation of the new tail call flag
into the bitcode and IR assembly format as part of the changes to
llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also
added round trip testing through assembly and bitcode to
llvm/test/Assembler/thinlto-summary.ll.
This flag indicates that every bit is known to be zero in at least one
of the inputs. This allows the Or to be treated as an Add since there is
no possibility of a carry from any bit.
If the flag is present and this property does not hold, the result is
poison.
This makes it easier to reverse the InstCombine transform that turns Add
into Or.
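The underlying arithmetic fact is easy to check: when no bit position is set in both operands, no carry can occur, so Or and Add produce the same result. The helper names below are made up for this sketch, which works on plain 32-bit integers rather than IR values.

```cpp
#include <cassert>
#include <cstdint>

// True when no bit position is set in both operands.
inline bool isDisjoint(uint32_t A, uint32_t B) { return (A & B) == 0; }

// With disjoint operands, A | B equals A + B because no carry can occur.
inline uint32_t orAsAdd(uint32_t A, uint32_t B) {
  assert(isDisjoint(A, B) && "operands share a set bit");
  return A + B;
}
```

When the operands do share a bit, the two operations differ (for example `3 | 1` is 3 while `3 + 1` is 4), which is why the flag must only be present when the property actually holds.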
This is inspired by a comment here
https://github.com/llvm/llvm-project/pull/71955#discussion_r1391614578
Discourse thread
https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036
This reverts commit 0fd5dc94380d5fe666dc6c603b4bb782cef743e7.
The original commit removed DIArgLists from being in an MDNode map, but did
not insert a new `delete` in the LLVMContextImpl destructor. This
reapply adds that call to delete, preventing a memory leak.