The motivating use case is to support importing function declarations
across modules to construct call graph edges for indirect calls [1]
when importing the function definition costs too much compile time
(e.g., the function is too large and has no `noinline` attribute).
1. Currently, when the compiled IR module doesn't have a function
definition but its postlink combined summary contains the function
summary or a global alias summary with this function as aliasee, the
function definition will be imported from the source module by IRMover. The
implementation is in FunctionImporter::importFunctions [2].
2. In order for FunctionImporter to import a declaration of a function,
both the function summary and the alias summary need to carry the def / decl
state. Specifically, no existing summary field differs across
import modules, but the def / decl state is decided per
`<ImportModule, Function>` pair.
This change encodes the def/decl state in `GlobalValueSummary::GVFlags`.
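As an illustration of what carrying the def/decl state in a flags word can look like, here is a minimal standalone sketch. The type names, field set, and bit positions are all hypothetical and chosen for this example only; they are not the actual `GVFlags` layout or bitcode encoding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the import state of one <ImportModule, Function> pair.
enum class ImportKind : uint8_t { Definition = 0, Declaration = 1 };

// Hypothetical, simplified flag set; the real GVFlags carries more fields.
struct SketchGVFlags {
  uint8_t Linkage;   // only the low 4 bits are used
  bool Live;
  ImportKind Import; // the newly recorded def/decl state
};

// Bit layout assumed for this sketch only:
//   [3:0] linkage, [4] live, [5] import kind.
constexpr unsigned LiveBit = 4;
constexpr unsigned ImportBit = 5;

// Pack the flags into one raw integer, as a summary writer might.
uint64_t encodeFlags(const SketchGVFlags &F) {
  assert(F.Linkage < 16 && "linkage must fit in 4 bits");
  uint64_t Raw = F.Linkage;
  Raw |= static_cast<uint64_t>(F.Live) << LiveBit;
  Raw |= static_cast<uint64_t>(F.Import) << ImportBit;
  return Raw;
}

// Unpack the raw integer, as a summary reader might.
SketchGVFlags decodeFlags(uint64_t Raw) {
  SketchGVFlags F;
  F.Linkage = static_cast<uint8_t>(Raw & 0xF);
  F.Live = ((Raw >> LiveBit) & 1) != 0;
  F.Import = ((Raw >> ImportBit) & 1) ? ImportKind::Declaration
                                      : ImportKind::Definition;
  return F;
}
```

Because the import kind occupies a single bit that defaults to zero, a word written without it decodes as a definition in this sketch, which matches the current import-the-definition behaviour described above.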
In the subsequent changes:
1. The indexing step `computeImportForModule` [3]
will compute the set of definitions and the set of declarations for each
module, and pass the information on to the bitcode writer.
2. Bitcode writer will look up the def/decl state and set the state
when it writes out the flag value. This is demonstrated in
https://github.com/llvm/llvm-project/pull/87600
3. Function importer will read the def/decl state when reading the
combined summary to figure out two sets of global values, and IRMover
will be updated to import the declaration (aka linkGlobalValuePrototype [4])
into the destination module.
- The next change is https://github.com/llvm/llvm-project/pull/87600
[1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5
[2] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)
[3] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)
[4] 3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)
As noted when #82404 was pushed (canonicalizing `sitofp` -> `uitofp`),
different signedness on fp casts can have dramatic performance
implications on different backends.
So, it makes sense to create a reliable means for the backend to pick its
cast signedness if either is correct.
Further, this allows us to start canonicalizing `sitofp` -> `uitofp`,
which may ease middle-end analysis.
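The "either is correct" case is the one where the integer operand is known to be non-negative. A small standalone check (plain C++ casts standing in for the IR casts; the function names are made up for this sketch) shows when the two conversions agree:

```cpp
#include <cassert>
#include <cstdint>

// Models `sitofp i32 %x to float`.
float signedToFloat(int32_t X) { return static_cast<float>(X); }

// Models `uitofp i32 %x to float`: the same 32 bits, read as unsigned.
float unsignedToFloat(int32_t X) {
  return static_cast<float>(static_cast<uint32_t>(X));
}

// True when both casts produce the same value, i.e. either one is correct.
bool castsAgree(int32_t X) { return signedToFloat(X) == unsignedToFloat(X); }
```

For any non-negative input the two functions return the same float, so a backend is free to pick whichever cast is cheaper; for a negative input such as -1 they differ.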
Closes #86141
Allow using atomicrmw fadd, fsub, fmin, and fmax with vectors of
floating-point type. AMDGPU supports atomic fadd for <2 x half> and <2 x
bfloat> on some targets and address spaces.
Note this only supports the proper floating-point operations; float
vector typed xchg is still not supported. cmpxchg still only supports
integers, so this inserts bitcasts for the loop expansion.
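The bitcast-around-cmpxchg idea can be sketched outside of LLVM as follows. This is only an analogy in portable C++: it uses two `float`s in place of `<2 x half>`, `std::atomic<uint64_t>` in place of the integer cmpxchg, and `memcpy` in place of the bitcasts; `Float2`, `toBits`, `fromBits`, and `atomicFAdd` are names invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Two floats standing in for a small floating-point vector.
struct Float2 {
  float X, Y;
};
static_assert(sizeof(Float2) == sizeof(uint64_t), "expects a packed pair");

// "Bitcast" vector -> integer.
static uint64_t toBits(Float2 V) {
  uint64_t Bits;
  std::memcpy(&Bits, &V, sizeof(Bits));
  return Bits;
}

// "Bitcast" integer -> vector.
static Float2 fromBits(uint64_t Bits) {
  Float2 V;
  std::memcpy(&V, &Bits, sizeof(V));
  return V;
}

// Emulates an atomic vector fadd with an integer compare-exchange loop.
// Returns the value that was in memory before the update.
Float2 atomicFAdd(std::atomic<uint64_t> &Mem, Float2 Operand) {
  uint64_t Old = Mem.load();
  for (;;) {
    Float2 Cur = fromBits(Old);
    Float2 New = {Cur.X + Operand.X, Cur.Y + Operand.Y};
    // On failure, compare_exchange_weak refreshes Old with the current value.
    if (Mem.compare_exchange_weak(Old, toBits(New)))
      return Cur;
  }
}
```

The floating-point math happens on the vector view, while the only atomic primitive touched is the integer compare-exchange, which is the shape the loop expansion needs when cmpxchg is integer-only.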
I have support for fp vector typed xchg, and vector of int/ptr
separately implemented but I don't have an immediate need for those
beyond feature consistency.
This patch adds a new flag: `--preserve-input-debuginfo-format`
This flag instructs the tool to not convert the debug info format
(intrinsics/records) of input IR, but to instead determine the format of
the input IR and overwrite the other format-determining flags so that we
process and output the file in the same format that we received it in.
This flag is turned off by llvm-link, llvm-lto, and llvm-lto2, and
should be turned off by any other tool that expects to parse multiple IR
modules and have their debug info formats match.
The motivation for this flag is to let tools avoid converting the debug
info format: verify-uselistorder, llvm-reduce, and any downstream
tools that seek to test or mutate IR as-is, without applying extraneous
modifications to the input. This is a necessary step to using debug
records by default in all (other) LLVM tools.
Depends on #87545
Emit `GNU_PROPERTY_AARCH64_FEATURE_PAUTH` property in
`.note.gnu.property` section depending on
`aarch64-elf-pauthabi-platform` and `aarch64-elf-pauthabi-version` llvm
module flags.
Intrinsics like @llvm.seh.scope.begin and @llvm.seh.scope.end which do
not throw do not need funclets in catchpads or cleanuppads.
Fixes #69428
Co-authored-by: Robert Cox <robert.cox@intel.com>
Fixes issue noted at: https://github.com/llvm/llvm-project/pull/86274
When loading bitcode lazily, we may request debug intrinsics be upgraded
to debug records during the module parsing phase; later on we perform
this upgrade when materializing the module functions. If we change the
module's debug info format between parsing and materializing however,
then the requested upgrade is no longer correct and leads to an
assertion. This patch fixes the issue by adding an extra check in the
autoupgrader to see if the upgrade is no longer suitable, and then either
exiting early or falling back to the correct intrinsic->intrinsic upgrade
if one is required.
The class `ScopedDbgInfoFormatSetter` was added as a convenient way to
temporarily change the debug info format of a function or module, as
part of IR printing; since this process is repeated in a number of other
places, this patch uses the format-setter class in those places as well.
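For readers unfamiliar with the pattern, a scoped format setter is an ordinary RAII guard. The sketch below is a standalone approximation: `FakeModule` is a hypothetical stand-in for a module or function, and `ScopedFormatSetter` is written for this example rather than copied from `ScopedDbgInfoFormatSetter`.

```cpp
#include <cassert>

// Hypothetical stand-in for a module or function that carries a format bit.
struct FakeModule {
  bool IsNewDbgInfoFormat = false;
  void setIsNewDbgInfoFormat(bool NewFormat) { IsNewDbgInfoFormat = NewFormat; }
};

// RAII guard: switch the format on construction, restore it on scope exit.
template <typename T> class ScopedFormatSetter {
  T &Target;
  bool OldState;

public:
  ScopedFormatSetter(T &Obj, bool NewState)
      : Target(Obj), OldState(Obj.IsNewDbgInfoFormat) {
    Target.setIsNewDbgInfoFormat(NewState);
  }
  ~ScopedFormatSetter() { Target.setIsNewDbgInfoFormat(OldState); }

  ScopedFormatSetter(const ScopedFormatSetter &) = delete;
  ScopedFormatSetter &operator=(const ScopedFormatSetter &) = delete;
};

// Pretend to print M in the requested format; returns the format that was
// active during "printing". M is back in its original format afterwards.
bool printedInNewFormat(FakeModule &M, bool WantNew) {
  ScopedFormatSetter<FakeModule> Guard(M, WantNew);
  return M.IsNewDbgInfoFormat;
}
```

The restore runs on every exit path from the scope, which is what makes the guard convenient for temporary changes such as IR printing.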
We currently just use the mangled name. This works fine, because the linker
should detect that and demangle it for the export table. However, on
MSVC, the compiler is more specific and passes the demangled name as well,
with EXPORTAS. This PR aims to match that. MSVC doesn't use quotes in
this case, so I added '#' to the list of characters that don't need quoting.
Follow on from #84915 which adds the DbgRecord function variants. The C API
changes were reviewed in #85657.
# C API
Update the LLVMDIBuilderInsert... functions to insert DbgRecords instead
of debug intrinsics.
LLVMDIBuilderInsertDeclareBefore
LLVMDIBuilderInsertDeclareAtEnd
LLVMDIBuilderInsertDbgValueBefore
LLVMDIBuilderInsertDbgValueAtEnd
Calling these functions will now cause an assertion if the module is in the
wrong debug info format. They should only be used when the module is in "new
debug format".
Use LLVMIsNewDbgInfoFormat to query and LLVMSetIsNewDbgInfoFormat to change the
debug info format of a module.
Please see https://llvm.org/docs/RemoveDIsDebugInfo.html#c-api-change
(RemoveDIsDebugInfo.md) for more info.
# OCaml bindings
Add set_is_new_dbg_info_format and is_new_dbg_info_format to the OCaml bindings.
These can be used to set and query the current debug info mode. These will
eventually be removed, but are useful while we're transitioning between old and
new debug info formats.
Add string_of_lldbgrecord, like string_of_llvalue but prints DbgRecords.
In test dbginfo.ml, unconditionally set the module debug info to the new mode
and update CHECK lines to check for DbgRecords. Without this change the test
crashes because it attempts to insert DbgRecords (new default behaviour of
llvm_dibuild_insert_declare_...) into a module that is in the old debug info
mode.
- Put the helper function in `ProfDataUtil.h/cpp`, which is already a
dependency of `Instructions.cpp`
- The helper function could be re-used to update profiles of
`InvokeInst` (in a follow-up pull request)
[RISCV] RISCV vector calling convention (1/2)
This is the vector calling convention based on
https://github.com/riscv-non-isa/riscv-elf-psabi-doc,
the idea is to split between "scalar" callee-saved registers
and "vector" callee-saved registers. "scalar" ones remain the
original strategy, however, "vector" ones are handled together
with RVV objects.
The stack layout would be:
|--------------------------| <-- FP
| callee-allocated save    |
| area for register varargs|
|--------------------------|
| callee-saved registers   | <-- scalar callee-saved
| (scalar)                 |
|--------------------------|
| RVV alignment padding    |
|--------------------------|
| callee-saved registers   | <-- vector callee-saved
| (vector)                 |
|--------------------------|
| RVV objects              |
|--------------------------|
| padding before RVV       |
|--------------------------|
| scalar local variables   |
|--------------------------| <-- BP
| variable size objects    |
|--------------------------| <-- SP
Note: This patch doesn't contain "tuple" type, e.g. vint32m1x2.
It will be handled in https://github.com/riscv-non-isa/riscv-elf-psabi-doc (2/2).
Differential Revision: https://reviews.llvm.org/D154576
Adds logic to the IR verifier that checks whether !tbaa.struct nodes are
well-formed. That is, it checks that the operands of !tbaa.struct nodes
are in groups of three, that each group of three operands consists of
two integers and a valid tbaa node, and that the regions described by
the offset and size operands are non-overlapping.
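The three checks can be sketched on a flattened operand list as below. This is a standalone model, not the verifier code: `Operand` and `verifyTBAAStruct` are hypothetical, a TBAA node is reduced to a tag, and sorting the regions before the overlap test is a choice made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flattened operand: an integer, a TBAA node, or something else.
struct Operand {
  enum Kind { Integer, TBAANode, Other } K;
  uint64_t Value; // meaningful only for Integer operands
};

// Returns an empty string if the operand list is well-formed, otherwise a
// short description of the first problem found.
std::string verifyTBAAStruct(const std::vector<Operand> &Ops) {
  if (Ops.size() % 3 != 0)
    return "operands must occur in groups of three";

  std::vector<std::pair<uint64_t, uint64_t>> Regions; // (offset, size)
  for (std::size_t I = 0; I < Ops.size(); I += 3) {
    if (Ops[I].K != Operand::Integer)
      return "offset must be an integer";
    if (Ops[I + 1].K != Operand::Integer)
      return "size must be an integer";
    if (Ops[I + 2].K != Operand::TBAANode)
      return "third operand must be a tbaa node";
    Regions.emplace_back(Ops[I].Value, Ops[I + 1].Value);
  }

  // Sort by offset so that only neighbouring regions need to be compared.
  std::sort(Regions.begin(), Regions.end());
  for (std::size_t I = 1; I < Regions.size(); ++I)
    if (Regions[I - 1].first + Regions[I - 1].second > Regions[I].first)
      return "regions must not overlap";

  return std::string();
}
```

For example, the groups (0, 4, node) and (4, 4, node) pass, while (0, 8, node) followed by (4, 4, node) is rejected because bytes 4 to 7 are described twice.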
PR: https://github.com/llvm/llvm-project/pull/86709
Currently patchpoints can only have two result types, `void` and `i64`.
This limits the result to general purpose registers.
This patch makes `patchpoint.i64` an overloadable intrinsic, allowing
result values that can fit in a single register (e.g. integers,
pointers, floats).
Follow on from #84915 which adds the DbgRecord function variants.
Update the LLVMDIBuilderInsert... functions to insert DbgRecords instead
of debug intrinsics.
LLVMDIBuilderInsertDeclareBefore
LLVMDIBuilderInsertDeclareAtEnd
LLVMDIBuilderInsertDbgValueBefore
LLVMDIBuilderInsertDbgValueAtEnd
Calling these functions will now cause an assertion if the module is in the
wrong debug info format. They should only be used when the module is in "new
debug format".
Use LLVMIsNewDbgInfoFormat to query and LLVMSetIsNewDbgInfoFormat to change the
debug info format of a module.
Please see https://llvm.org/docs/RemoveDIsDebugInfo.html#c-api-change
(RemoveDIsDebugInfo.md) for more info.
We were passing the min and max values of the range to the ConstantRange
constructor, but the constructor expects the upper bound to be 1 more than
the max value, so we need to add 1.
We also need to use getNonEmpty so that passing 0, 0 to the constructor
creates a full range rather than an empty range, and so that passing smin,
smax+1 doesn't cause an assertion.
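To make the off-by-one and the equal-bounds case concrete, here is a toy 8-bit model of a half-open, wrapping range. `Range8`, `makeNonEmpty`, and `fromInclusive` are invented for this sketch and only mimic the behaviour described above; they are not the ConstantRange API.

```cpp
#include <cassert>
#include <cstdint>

// Toy 8-bit model of a half-open, wrapping range [Lower, Upper).
// Lower == Upper is ambiguous, so Full records whether it means "everything"
// or "nothing".
struct Range8 {
  uint8_t Lower, Upper;
  bool Full; // only meaningful when Lower == Upper

  bool contains(uint8_t V) const {
    if (Lower == Upper)
      return Full;
    if (Lower < Upper)
      return Lower <= V && V < Upper;
    return V >= Lower || V < Upper; // the range wraps around
  }
};

// Mirrors the idea behind getNonEmpty: equal bounds mean the full set.
Range8 makeNonEmpty(uint8_t Lower, uint8_t Upper) {
  return Range8{Lower, Upper, Lower == Upper};
}

// Build a range from inclusive [Min, Max]: the exclusive upper bound is
// Max + 1, computed modulo 256.
Range8 fromInclusive(uint8_t Min, uint8_t Max) {
  return makeNonEmpty(Min, static_cast<uint8_t>(Max + 1));
}
```

With inclusive bounds 2 and 5 the sketch stores [2, 6), so 5 is contained and 6 is not. With smin = 0x80 and smax = 0x7f, adding 1 wraps the upper bound back to 0x80; treating the resulting equal bounds as the full set is what avoids the problematic case.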
I believe this fixes at least some of the reason #79158 was reverted.
Another trivial rename patch, the last big one for now, which renames
DPMarkers to DbgMarkers. This required the field `DbgMarker` in
`Instruction` to be renamed to `DebugMarker` to avoid a clash, but
otherwise was a simple string substitution of `s/DPMarker/DbgMarker/` and
a manual renaming of `DPM` to `DM` in the few places where that acronym
was used for debug markers.
This patch renames DPLabel to DbgLabelRecord, in accordance with the
ongoing DbgRecord rename. This rename was fairly trivial, since DPLabel
isn't as widely used as DPValue and has no real conflicts in either its
full or abbreviated name. As usual, the entire replacement was done
automatically, with `s/DPLabel/DbgLabelRecord/` and `s/DPL/DLR/`.
As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.
Currently, inrange is specified as follows:
```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```
The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:
```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```
This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:
```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```
The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.
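The numbers in the examples above can be checked with a little arithmetic. The sketch assumes 8-byte pointers (so each `[4 x ptr]` array is 32 bytes) and reads `inrange(-16, 16)` as a half-open byte interval relative to the GEP result; the helper names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumes 8-byte pointers, so { [4 x ptr], [4 x ptr] } is two 32-byte arrays.
constexpr int64_t PtrSize = 8;
constexpr int64_t ArrayLen = 4;

// Byte offset of `getelementptr ..., i64 0, i32 Field, i64 Index` from @vt.
constexpr int64_t gepOffset(int64_t Field, int64_t Index) {
  return Field * ArrayLen * PtrSize + Index * PtrSize;
}

// Half-open byte interval, measured from @vt.
struct ByteRange {
  int64_t Begin, End;
};

// Translate result-relative inrange(Lo, Hi) into offsets from the base pointer.
constexpr ByteRange absoluteRange(int64_t ResultOffset, int64_t Lo, int64_t Hi) {
  return ByteRange{ResultOffset + Lo, ResultOffset + Hi};
}
```

Under these assumptions `gepOffset(1, 2)` is 48, matching the `i64 48` in the ptradd-style form, and `absoluteRange(48, -16, 16)` is bytes 32 to 64 of `@vt`, which is exactly the second `[4 x ptr]` array that the old `inrange i32 1` index selected.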
This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
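The reason the substitutions are applied in that order is that each pattern is a prefix of the one before it. A small standalone sketch (the helpers `replaceAll` and `renameDPValue` are written for this example, not taken from any script used for the patch) shows the effect:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of From in Text with To, scanning left to right.
std::string replaceAll(std::string Text, const std::string &From,
                       const std::string &To) {
  assert(!From.empty() && "pattern must not be empty");
  std::size_t Pos = 0;
  while ((Pos = Text.find(From, Pos)) != std::string::npos) {
    Text.replace(Pos, From.size(), To);
    Pos += To.size(); // continue after the inserted text
  }
  return Text;
}

// Apply the three case-sensitive substitutions, longest pattern first.
std::string renameDPValue(std::string Text) {
  const std::vector<std::pair<std::string, std::string>> Rules = {
      {"DPValue", "DbgVariableRecord"},
      {"DPVal", "DbgVarRec"},
      {"DPV", "DVR"}};
  for (const auto &Rule : Rules)
    Text = replaceAll(std::move(Text), Rule.first, Rule.second);
  return Text;
}
```

Running the rules longest-first turns `DPValue *DPV = getDPVal();` into `DbgVariableRecord *DVR = getDbgVarRec();`. Applying `DPV -> DVR` first would instead mangle `DPValue` into `DVRalue`.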
Because the RemoveDIs work is putting a debug-info bit into
BasicBlock::iterator and iterators are needed for insertion, the
getAsInstruction method declaration would need to use a fully defined
instruction-iterator, which leads to a complicated
header-inclusion-order problem. It is much simpler to instead just not insert,
and make it the caller's problem to insert.
This is proportionate because there are only four call-sites to
getAsInstruction -- it would suck if we did this everywhere.
---------
Merged by: Stephen Tozer <stephen.tozer@sony.com>
If --load-bitcode-into-experimental-debuginfo-iterators is true then debug
intrinsics are auto-upgraded to DbgRecords (the new debug info format).
The upgrade is trivial because the two representations are semantically
identical. llvm.dbg.value with 4 operands and llvm.dbg.addr intrinsics are
upgraded in the same way as usual, but converted directly into DbgRecords
instead of debug intrinsics.
This code was assuming that the LHS would always be one of
GlobalVariable, BlockAddress or ConstantExpr. However, it can
also be a special constant like dso_local_equivalent or no_cfi.
Make sure this is handled gracefully.
Reland #82363 after fixing build failure
https://lab.llvm.org/buildbot/#/builders/5/builds/41428.
Memory sanitizer detects usage of `RawData` union member which is not
filled directly. Instead, the code relies on filling `Data` union
member, which is a struct consisting of signing schema parameters.
According to https://en.cppreference.com/w/cpp/language/union, this is
UB:
"It is undefined behavior to read from the member of the union that
wasn't most recently written".
Instead of relying on compiler allowing us to do dirty things, do not
use union and only store `RawData`. Particular ptrauth parameters are
obtained on demand via bit operations.
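A union-free version of this idea can be sketched as follows. The bit layout (4-bit key, 1 address-discrimination bit, 16-bit extra discriminator, then two single-bit flags) and the class name `PtrAuthSketch` are assumptions made for this example, not necessarily the layout used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Stores only the packed value; every parameter is recovered with shifts and
// masks, so no inactive union member is ever read.
class PtrAuthSketch {
  uint32_t RawData;

public:
  // Layout assumed for this sketch:
  //   [3:0] key, [4] address-discriminated, [20:5] extra discriminator,
  //   [21] isa pointer, [22] authenticates null values.
  PtrAuthSketch(unsigned Key, bool IsAddrDiscriminated, unsigned ExtraDisc,
                bool IsaPointer, bool AuthenticatesNull) {
    assert(Key < 16 && "key must fit in 4 bits");
    assert(ExtraDisc < 65536 && "discriminator must fit in 16 bits");
    RawData = Key | (IsAddrDiscriminated ? (1u << 4) : 0u) | (ExtraDisc << 5) |
              (IsaPointer ? (1u << 21) : 0u) |
              (AuthenticatesNull ? (1u << 22) : 0u);
  }

  uint32_t raw() const { return RawData; }
  unsigned key() const { return RawData & 0xF; }
  bool isAddressDiscriminated() const { return ((RawData >> 4) & 1) != 0; }
  unsigned extraDiscriminator() const { return (RawData >> 5) & 0xFFFF; }
  bool isaPointer() const { return ((RawData >> 21) & 1) != 0; }
  bool authenticatesNullValues() const { return ((RawData >> 22) & 1) != 0; }
};
```

Since `RawData` is the only stored member, comparing or hashing two signing schemas is a single integer operation, and the accessors behave the same regardless of how the compiler treats unions.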
Original PR description below.
Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:
- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Follow on from #84739, which updates the DIBuilder class.
All the functions that have been added are temporary and will be
deprecated in the future. The intention is that they'll help downstream
projects adapt during the transition period.
```
New functions (all to be deprecated)
------------------------------------
LLVMIsNewDbgInfoFormat # Returns true if the module is in the new non-instruction mode.
LLVMSetIsNewDbgInfoFormat # Convert to the requested debug info format.
LLVMDIBuilderInsertDeclareIntrinsicBefore # Insert a debug intrinsic (old debug info format).
LLVMDIBuilderInsertDeclareIntrinsicAtEnd # Same as above.
LLVMDIBuilderInsertDbgValueIntrinsicBefore # Same as above.
LLVMDIBuilderInsertDbgValueIntrinsicAtEnd # Same as above.
LLVMDIBuilderInsertDeclareRecordBefore # Insert a debug record (new debug info format).
LLVMDIBuilderInsertDeclareRecordAtEnd # Same as above.
LLVMDIBuilderInsertDbgValueRecordBefore # Same as above.
LLVMDIBuilderInsertDbgValueRecordAtEnd # Same as above.
```
The existing `LLVMDIBuilderInsert...` functions call through to the
intrinsic versions (old debug info format) currently.
In the next patch, I'll swap them to call the debug records versions
(new debug info format). Downstream users of this API can query and
change the current format using the first two functions above, or can
instead opt to temporarily use intrinsics or records explicitly.
[GlobalISel] Implement convergence control tokens and intrinsics in GMIR
In the IR translator, convert the LLVM token type to LLT::token(), which is an
alias for the s0 type. These show up as implicit uses on convergent operations.
Differential Revision: https://reviews.llvm.org/D158147
Reapplying after revert in #85382 (861ebe6446296c96578807363aa292c69d827773).
Fixed intermittent test failure by avoiding piping output in some RUN lines.
If --write-experimental-debuginfo-iterators-to-bitcode is true (default false)
and --experimental-debuginfo-iterators is also true then the new debug info
format (non-instruction records) is written to bitcode directly.
Added the following records:
FUNC_CODE_DEBUG_RECORD_LABEL
FUNC_CODE_DEBUG_RECORD_VALUE
FUNC_CODE_DEBUG_RECORD_DECLARE
FUNC_CODE_DEBUG_RECORD_ASSIGN
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
The last one has an abbrev in FUNCTION_BLOCK BLOCK_INFO. Incidentally, this uses
the last value available without widening the code-length for FUNCTION_BLOCK
from 4 to 5 bits.
Records are formatted as follows:
  All DbgRecords start with:
    1. DILocation
  FUNC_CODE_DEBUG_RECORD_LABEL
    2. DILabel
  DPValues then share common fields:
    2. DILocalVariable
    3. DIExpression
  FUNC_CODE_DEBUG_RECORD_VALUE
    4. Location Metadata
  FUNC_CODE_DEBUG_RECORD_DECLARE
    4. Location Metadata
  FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
    4. Location Value (single)
  FUNC_CODE_DEBUG_RECORD_ASSIGN
    4. Location Metadata
    5. DIAssignID
    6. DIExpression (address)
    7. Location Metadata (address)
Encoding the DILocation metadata reference directly appeared to yield smaller
bitcode files than encoding the operands separately (as is done with instruction
DILocations).
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE is by far the most common DbgRecord record
in optimized code (order of 5x-10x over other kinds). Unoptimized code should
only contain FUNC_CODE_DEBUG_RECORD_DECLARE.
The appendToStack() function asserts that no DW_OP_stack_value or
DW_OP_LLVM_fragment operations are present in the operations to be
appended. The function did that by iterating over all elements in the
array rather than just the operations, leading it to falsely assert
on the following input produced by getExt(), since 159 (0x9f) is the
DWARF code for DW_OP_stack_value:
{dwarf::DW_OP_LLVM_convert, 159, dwarf::DW_ATE_signed}
Fix this by using expr_op iterators.
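The difference between walking elements and walking operations can be reproduced in a few lines. This is a standalone model: the constants for the two LLVM-specific opcodes and the operand-count table are assumptions of this sketch, and `hasForbiddenElement` / `hasForbiddenOp` are hypothetical names rather than the code in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 0x9f for DW_OP_stack_value is stated in the text above; the two LLVM
// opcode values below are assumed for this sketch.
constexpr uint64_t DW_OP_stack_value = 0x9f;
constexpr uint64_t DW_OP_LLVM_fragment = 0x1000;
constexpr uint64_t DW_OP_LLVM_convert = 0x1001;
constexpr uint64_t DW_ATE_signed = 0x05;

// Number of operand slots that follow an opcode in this toy encoding.
std::size_t numOperands(uint64_t Op) {
  if (Op == DW_OP_LLVM_fragment || Op == DW_OP_LLVM_convert)
    return 2;
  return 0;
}

bool isForbidden(uint64_t Op) {
  return Op == DW_OP_stack_value || Op == DW_OP_LLVM_fragment;
}

// Buggy variant: inspects every array element, operands included.
bool hasForbiddenElement(const std::vector<uint64_t> &Elts) {
  for (uint64_t E : Elts)
    if (isForbidden(E))
      return true;
  return false;
}

// Fixed variant: steps from opcode to opcode, skipping each op's operands.
bool hasForbiddenOp(const std::vector<uint64_t> &Elts) {
  for (std::size_t I = 0; I < Elts.size(); I += 1 + numOperands(Elts[I]))
    if (isForbidden(Elts[I]))
      return true;
  return false;
}
```

On `{DW_OP_LLVM_convert, 159, DW_ATE_signed}` the element-wise scan reports a forbidden operation because the operand 159 happens to equal 0x9f, while the operation-wise scan correctly reports none.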