LLVM currently stores heapallocsite information in CodeView debuginfo,
but not in DWARF debuginfo. Plumb it into DWARF as an LLVM-specific
extension.
heapallocsite debug information is useful when it is combined with
allocator instrumentation that stores caller addresses; I've used a
previous version of this patch for:
- analyzing memory usage by object type
- analyzing the distributions of values of class members
Other possible uses might be:
- attributing memory access profiles (for example, on Intel CPUs, from
PEBS records with Linear Data Address) to object types or specific
object members
- adding type information to crash/ASAN reports
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
RFC on discourse:
https://discourse.llvm.org/t/rfc-debug-info-for-coroutine-suspension-locations-take-2/86606
With this commit, we add `DILabel` debug infos to the resume points of a
coroutine. Those labels can be used by debugging scripts to figure out
the exact line and column at which a coroutine was suspended by looking
up current `__coro_index` value inside the coroutines frame, and then
searching for the corresponding label inside the coroutine's resume
function.
The DWARF information generated for such a label looks like:
```
0x00000f71: DW_TAG_label
DW_AT_name ("__coro_resume_1")
DW_AT_decl_file ("generator-example.cpp")
DW_AT_decl_line (5)
DW_AT_decl_column (3)
DW_AT_artificial (true)
DW_AT_LLVM_coro_suspend_idx (0x01)
DW_AT_low_pc (0x00000000000019be)
```
The labels can be mapped to their corresponding `__coro_idx` values
either via their naming convention `__coro_resume_<N>` or using the new
`DW_AT_LLVM_coro_suspend_idx` attribute. In gdb, those line numebrs can
be looked up using `info line -function my_coroutine -label
__coro_resume_1`. LLDB unfortunately does not understand DW_TAG_label
debug information, yet.
Given this is an artificial compiler-generated label, I did apply the
DW_AT_artificial tag to it. The DWARFv5 standard only allows that tag on
type and variable definitions, but this is a natural extension and was
also blessed in the RFC on discourse.
Also, this commit adds `DW_AT_decl_column` to labels, not only for
coroutines but also for normal C and C++ labels. While not strictly
necessary, I am doing so now because it would be harder to do so later
without breaking the binary LLVM-IR format
Drive-by fixes: While reading the existing test cases to understand how
to write my own test case, I did a couple of small typo fixes and
comment improvements
Reverts llvm/llvm-project#136205
Breaks buildbots, probably something about needing to restrict the test
to running on a specific target or the like - I haven't looked closely.
Co-authored-by: Vladislav Dzhidzhoev <dzhidzhoev@gmail.com>
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
**Summary**
This patch introduces a new compiler option `-mllvm
-emit-func-debug-line-table-offsets` that enables the emission of
per-function line table offsets and end sequences in DWARF debug
information. This enhancement allows tools and debuggers to accurately
attribute line number information to their corresponding functions, even
in scenarios where functions are merged or share the same address space
due to optimizations like Identical Code Folding (ICF) in the linker.
**Background**
RFC: [New DWARF Attribute for Symbolication of Merged
Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)
Previous similar PR:
[#93137](https://github.com/llvm/llvm-project/pull/93137) – This PR was
very similar to the current one but at the time, the assembler had no
support for emitting labels within the line table. That support was
added in PR [#99710](https://github.com/llvm/llvm-project/pull/99710) -
and in this PR we use some of the support added in the assembler PR.
In the current implementation, Clang generates line information in the
`debug_line` section without directly associating line entries with
their originating `DW_TAG_subprogram` DIEs. This can lead to issues when
post-compilation optimizations merge functions, resulting in overlapping
address ranges and ambiguous line information.
For example, when functions are merged by ICF in LLD, multiple functions
may end up sharing the same address range. Without explicit linkage
between functions and their line entries, tools cannot accurately
attribute line information to the correct function, adversely affecting
debugging and call stack resolution.
**Implementation Details**
To address the above issue, the patch makes the following key changes:
**`DW_AT_LLVM_stmt_sequence` Attribute**: Introduces a new LLVM-specific
attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE.
This attribute holds a label pointing to the offset in the line table
where the function's line entries begin.
**End-of-Sequence Markers**: Emits an explicit DW_LNE_end_sequence after
each function's line entries in the line table. This marks the end of
the line information for that function, ensuring that line entries are
correctly delimited.
**Assembler and Streamer Modifications**: Modifies the MCStreamer and
related classes to support emitting the necessary labels and tracking
the current function's line entries. A new flag
GenerateFuncLineTableOffsets is added to control this behavior.
**Compiler Option**: Introduces the `-mllvm
-emit-func-debug-line-table-offsets` option to enable this
functionality, allowing users to opt-in as needed.
This is the final piece to enable register debugging for variables in
registers that have single locations that last throughout their
enclosing scope.
The next step after this for supporting register debugging for NVPTX is
to support the .debug_loc section.
Stacked on top of: https://github.com/llvm/llvm-project/pull/109495
getSectionIDNum may return same value for two different MBBSectionID.
e.g. A Cold type MBBSectionID with number 0 and a Default type
MBBSectionID with number 2 get same value 2 from getSectionIDNum. This
may lead to overwrite of MBBSectionRanges. Using MBBSectionID itself
as DenseMap key is better choice.
When a type unit is emitted, the CU referencing the type unit ends up
with a little DW_TAG_*_type with the DW_AT_signature and
DW_AT_declaration sometimes referred to (by me? maybe other people?) as
a skeleton type.
We shouldn't produce .debug_names reference to these - only to the
actual type definition in the type unit. So this patch does that.
But, inversely, the .debug_gnu_pubtypes /does/ need to reference the
skeleton type (& gcc does this too, when it produces a skeleton type
(gcc doesn't always produce these - if the type is only referenced once
via DW_AT_type, gcc uses a direct DW_FORM_ref_sig8 on the DW_AT_type
without the intermediate skeleton type)) - so there's a little special
case added in to preserve that behavior which is covered by existing
tests.
If -mllvm -add-linkage-names-to-external-call-origins is true then add
DW_AT_linkage_name attributes to DW_TAG_subprogram DIEs referenced by
DW_AT_call_origin attributes that would otherwise be omitted.
A debugger may use DW_TAG_call_origin attributes to determine whether any
frames in a callstack are missing due to optimisations (e.g. tail calls).
For example, say a() calls b() tail-calls c(), and you stop in your debugger
in c():
The callstack looks like this:
c()
a()
Looking "up" from c(), call site information can be found in a(). This includes
a DW_AT_call_origin referencing b()'s subprogram DIE, which means the call at
this call site was to b(), not c() where we are currently stopped. This
indicates b()'s frame has been lost due to optimisation (or is misleading due
to ICF).
This patch makes it easier for a debugger to check whether the referenced
DIE describes the target function or not, for example by comparing the referenced
function name to the current frame.
There's already an option to apply DW_AT_linkage_name in a targeted manner:
-dwarf-linkage-names=Abstract, which limits adding DW_AT_linkage_names to
abstract subprogram DIEs (this is default for SCE tuning).
The new flag shouldn't affect non-SCE-tuned behaviour whether it is enabled
or not because the non-SCE-tuned behaviour is to always add linkage names to
subprogram DIEs.
- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions
This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported
in Chromium (https://reviews.llvm.org/D144006#4651955).
The first commit is added for convenience, as it has already been
accepted.
If DISubpogram was not cloned (e.g. we are cloning a function that has
other
functions inlined into it, and subprograms of the inlined functions are
not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.
This is meant to be committed along with
https://reviews.llvm.org/D144006.
Enable Type Units with DWARF5 accelerator tables for monolithic DWARF.
Implementation relies on linker to tombstone offset in LocalTU list to
-1 when
it deduplciates type units using COMDAT.
The DWARF 5 specification says that:
> The name index must contain an entry for each debugging information
entry that
> defines a named [...] label [...].
The verifier currently verifies this, but the AsmPrinter does not add
entries for TAG_labels in debug_names. This patch addresses the issue by
ensuring we add labels in the accelerator tables once we have a fully
completed DIE for the TAG_label entry.
We also respect the spec as follows:
> DW_TAG_label debugging information entries without an address
attribute
> (DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges, or DW_AT_entry_pc) are
excluded.
The effect of this on the size of accelerator tables is minimal, as
TAG_labels are usually created by C/C++ labels (see example in test),
which are typically paired with "goto" statements.
This caused asserts:
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331:
virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *):
Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms &&
"getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.
See comment on the code review for reproducer.
> RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
>
> Similar to imported declarations, the patch tracks function-local types in
> DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
> the aforementioned metadata change and provided a support of function-local
> types scoped within a lexical block.
>
> The patch assumes that DICompileUnit's 'enums field' no longer tracks local
> types and DwarfDebug would assert if any locally-scoped types get placed there.
>
> Reviewed By: jmmartinez
>
> Differential Revision: https://reviews.llvm.org/D144006
This reverts commit f8aab289b5549086062588fba627b0e4d3a5ab15.
Fold constructVariableDIEImpl into constructVariableDIE, simplify it and
group related functions.
Pull out the previously inline lambdas for visiting the active variant
of the DbgVariable to add location and related attributes as an overload
set for a private method
applyConcreteDbgVariableAttributes.
Rename applyVariableAttribute to reflect what kinds of attributes it
applies, and to contrast it with the new
applyConcreteDbgVariableAttributes.
Move constructLabelDIE down in the implementation file, so all of the
constructVariableDIE-related function impls are adjacent.
This potentially has a slightly positive performance impact, as
std::visit can be implemented as a `switch`-like jump rather than
a series of `if`s.
More importantly, the reader can be confident is no overlap between the
cases.
Differential Revision: https://reviews.llvm.org/D158678
Only a subset of the fields of DbgVariable are meaningful at any time,
and some fields are re-used for multiple purposes (for example
FrameIndexExprs is used with a throw-away frame-index of 0 to hold a
single DIExpression without needing to add another member). The exact
invariants must be reverse-engineered by inspecting the actual use of
the class, its imprecise/outdated doc-comment, and some asserts.
Refactor DbgVariable into a sum type by inheriting from std::variant.
This makes the active fields for any given state explicit and removes
the need to re-use fields in disparate contexts. As a bonus, it seems to
reduce the size on my x86_64 linux box from 144 bytes to 96 bytes.
There is some potential cost to `std::get` as it must check the active
alternative even when context or an assert obviates it. To try to help
ensure the compiler can optimize out the checks the patch also adds a
helper `get` method which uses the noexcept `std::get_if`.
Some of the extra cost would also be avoided more cleanly with a
refactor that exposes the alternative types in the public interface,
which will come in another patch.
Differential Revision: https://reviews.llvm.org/D158675
With D149881, we converted EntryValue MachineFunction table entries into
`DbgVariables` initialized by a "DbgValue" intrinsic, which can only handle a
single, non-fragment DIExpression. However, it is desirable to handle variables
with multiple fragments and DIExpressions.
To do this, we expand the `DbgVariable` class to handle the EntryValue case.
This class can already operate under three different "modes" (stack slot,
unchanging location described by a dbg value, changing location described by a
loc list). A fourth case is added as a separate class entirely, but a subsequent
patch should redesign `DbgVariable` with four subclasses in order to make the
code more readable.
This patch also exposed a bug in the `beginEntryValueExpression` function, which
was not initializing the `LocationFlags` properly. Note how the
`finalizeEntryValue` function resets that flag. We fix this bug here, as testing
this changing in isolation would be tricky.
Differential Revision: https://reviews.llvm.org/D158458
This reverts commit d20e4a1d68aa8e14c4e524e4d4eeb4445acac401.
After committing 2ee4d0386c783f58abe708298228de648239b435, We don't support subprogram definitions nested within `DICompositeType` when doing LTO builds.
For a detailed discussion, see https://reviews.llvm.org/D152095.
Test "local-type-as-template-parameter.ll" is now enabled only for
x86_64.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006
Depends on D144005
Test "local-type-as-template-parameter.ll" now requires linux-system.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006
Depends on D144005
This reverts commit d80fdc6fc1a6e717af1bcd7a7313e65de433ba85.
split-dwarf-local-impor3.ll fails because of an issue with
Dwo sections emission on Windows platform.
RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
Fixed PR51501 (tests from D112337).
1. Reuse of DISubprogram's 'retainedNodes' to track other function-local
entities together with local variables and labels (this patch cares about
function-local import while D144006 and D144008 use the same approach for
local types and static variables). So, effectively this patch moves ownership
of tracking local import from DICompileUnit's 'imports' field to DISubprogram's
'retainedNodes' and adjusts DWARF emitter for the new layout. The old layout
is considered unsupported (DwarfDebug would assert on such debug metadata).
DICompileUnit's 'imports' field is supposed to track global imported
declarations as it does before.
This addresses various FIXMEs and simplifies the next part of the patch.
2. Postpone emission of function-local imported entities from
`DwarfDebug::endFunctionImpl()` to `DwarfDebug::endModule()`.
While in `DwarfDebug::endFunctionImpl()` we do not have all the
information about a parent subprogram or a referring subprogram
(whether a subprogram inlined or not), so we can't guarantee we emit
an imported entity correctly and place it in a proper subprogram tree.
So now, we just gather needed details about the import itself and its
parent entity (either a Subprogram or a LexicalBlock) during
processing in `DwarfDebug::endFunctionImpl()`, but all the real work is
done in `DwarfDebug::endModule()` when we have all the required
information to make proper emission.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144004
This reverts commit ed578f02cf44a52adde16647150e7421f3ef70f3.
Tests llvm/test/DebugInfo/Generic/split-dwarf-local-import*.ll fail
when x86_64 target is not registered.
RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
Fixed PR51501 (tests from D112337).
1. Reuse of DISubprogram's 'retainedNodes' to track other function-local
entities together with local variables and labels (this patch cares about
function-local import while D144006 and D144008 use the same approach for
local types and static variables). So, effectively this patch moves ownership
of tracking local import from DICompileUnit's 'imports' field to DISubprogram's
'retainedNodes' and adjusts DWARF emitter for the new layout. The old layout
is considered unsupported (DwarfDebug would assert on such debug metadata).
DICompileUnit's 'imports' field is supposed to track global imported
declarations as it does before.
This addresses various FIXMEs and simplifies the next part of the patch.
2. Postpone emission of function-local imported entities from
`DwarfDebug::endFunctionImpl()` to `DwarfDebug::endModule()`.
While in `DwarfDebug::endFunctionImpl()` we do not have all the
information about a parent subprogram or a referring subprogram
(whether a subprogram inlined or not), so we can't guarantee we emit
an imported entity correctly and place it in a proper subprogram tree.
So now, we just gather needed details about the import itself and its
parent entity (either a Subprogram or a LexicalBlock) during
processing in `DwarfDebug::endFunctionImpl()`, but all the real work is
done in `DwarfDebug::endModule()` when we have all the required
information to make proper emission.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144004