DW_TAG_base_type DIEs are permitted to have both byte_size and bit_size
attributes: "If the value of an object of the given type does not fully
occupy the storage described by a byte size attribute".
* Add DataSizeInBits to DIBasicType (`DIBasicType(... dataSize: n ...)` in IR).
* Change Clang to add DataSizeInBits to _BitInt type metadata.
* Change LLVM to add DW_AT_bit_size to base_type DIEs that have non-zero
DataSizeInBits.
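A minimal sketch of the last bullet, assuming a simplified stand-in for the relevant DIBasicType fields (`BasicTypeInfo` and `sizeAttributes` are hypothetical names, not LLVM's API): the byte size always describes the storage, and a bit size is added only when the data size is non-zero.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the size-related fields of a basic type.
struct BasicTypeInfo {
  uint64_t SizeInBits;     // size of the storage
  uint32_t DataSizeInBits; // 0 means the value fills the storage
};

using Attr = std::pair<std::string, uint64_t>;

// Collect the size attributes a base_type DIE would carry in this sketch.
std::vector<Attr> sizeAttributes(const BasicTypeInfo &Ty) {
  std::vector<Attr> Attrs;
  Attrs.emplace_back("DW_AT_byte_size", Ty.SizeInBits / 8);
  if (Ty.DataSizeInBits != 0)
    Attrs.emplace_back("DW_AT_bit_size", Ty.DataSizeInBits);
  return Attrs;
}
```

For a hypothetical type with 32 bits of storage and 17 data bits, this yields a byte size of 4 and a bit size of 17; with a data size of 0, only the byte size is produced.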
TODO: Do we need to emit DW_AT_data_bit_offset for big endian targets?
See discussion on the PR.
Fixes [#61952](https://github.com/llvm/llvm-project/issues/61952)
---------
Co-authored-by: David Stenberg <david.stenberg@ericsson.com>
This is an attempt to reland
https://github.com/llvm/llvm-project/pull/159104 with the fix for
https://github.com/llvm/llvm-project/issues/160197.
The original patch had the following problem: when an abstract
subprogram DIE is constructed from within
`DwarfDebug::endFunctionImpl()`,
`DwarfDebug::constructAbstractSubprogramScopeDIE()` honors the `unit:`
field of the DISubprogram. But an abstract subprogram DIE constructed
from `DwarfDebug::beginModule()` was put in the same compile unit as the
global variable referencing the subprogram, regardless of the
subprogram's `unit:`.
This is fixed by adding `DwarfDebug::getOrCreateAbstractSubprogramCU()`,
which is used by both `DwarfDebug::constructAbstractSubprogramScopeDIE()`
and `DwarfCompileUnit::getOrCreateSubprogramDIE()` when an abstract
subprogram is queried during the creation of DIEs for globals in
`DwarfDebug::beginModule()`.
The fix and the already-reviewed code from
https://github.com/llvm/llvm-project/pull/159104 are two separate
commits in this PR.
=====
The original commit message follows:
With this change, construction of abstract subprogram DIEs is split in
two stages/functions: creation of DIE (in
DwarfCompileUnit::getOrCreateAbstractSubprogramDIE) and its population
with children (in
DwarfCompileUnit::constructAbstractSubprogramScopeDIE).
With that, abstract subprograms can be created/referenced from
DwarfDebug::beginModule, which should solve the issue with the creation
of DIEs for static local variables of inlined functions with
optimized-out definitions. It fixes https://github.com/llvm/llvm-project/issues/29985.
The LexicalScopes class now stores a mapping from DISubprograms to their
corresponding llvm::Functions. It is supposed to be built before any
function is processed (so the LexicalScopes class now has a method for
"module initialization" alongside the method for "function
initialization"). It is used by DwarfCompileUnit to determine whether a
DISubprogram needs an abstract DIE before DwarfDebug::beginFunction is
invoked.
A DwarfCompileUnit::getOrCreateSubprogramDIE method is added, which can
create an abstract or a concrete DIE for a subprogram. It accepts an
llvm::Function* argument to determine whether a concrete DIE must be
created.
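A rough sketch of that decision, under a deliberately simplified reading (a subprogram with a function gets a concrete DIE, one without gets an abstract DIE). All names here are hypothetical stand-ins; the real logic in DwarfCompileUnit considers more than this.

```cpp
#include <map>
#include <string>

struct Function {}; // stand-in for llvm::Function

enum class DIEKind { Abstract, Concrete };

// Stand-in for the DISubprogram -> llvm::Function mapping, keyed by name
// here purely for illustration.
using SubprogramToFunction = std::map<std::string, const Function *>;

// Returns the function for a subprogram, or nullptr when no definition
// survived (e.g. it was inlined everywhere and optimized out).
const Function *lookup(const SubprogramToFunction &Map,
                       const std::string &SP) {
  auto It = Map.find(SP);
  return It == Map.end() ? nullptr : It->second;
}

// Simplified assumption: a non-null function means a concrete DIE.
DIEKind kindFor(const Function *F) {
  return F ? DIEKind::Concrete : DIEKind::Abstract;
}
```

Because the mapping is built once per module, the query can be answered while DIEs for globals are created, before any function is processed.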
This is a temporary fix for
https://github.com/llvm/llvm-project/issues/29985. Ideally, it will be
fixed by moving global variables and types emission to
DwarfDebug::endModule (https://reviews.llvm.org/D144007,
https://reviews.llvm.org/D144005).
Some code proposed by Ellis Hoag <ellis.sparky.hoag@gmail.com> in
https://github.com/llvm/llvm-project/pull/90523 was taken for this
commit.
LLVM currently stores heapallocsite information in CodeView debuginfo,
but not in DWARF debuginfo. Plumb it into DWARF as an LLVM-specific
extension.
heapallocsite debug information is useful when it is combined with
allocator instrumentation that stores caller addresses; I've used a
previous version of this patch for:
- analyzing memory usage by object type
- analyzing the distributions of values of class members
Other possible uses might be:
- attributing memory access profiles (for example, on Intel CPUs, from
PEBS records with Linear Data Address) to object types or specific
object members
- adding type information to crash/ASAN reports
The object-file-format-specific derived classes are used in contexts
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
RFC on discourse:
https://discourse.llvm.org/t/rfc-debug-info-for-coroutine-suspension-locations-take-2/86606
With this commit, we add `DILabel` debug info to the resume points of a
coroutine. Those labels can be used by debugging scripts to figure out
the exact line and column at which a coroutine was suspended by looking
up the current `__coro_index` value inside the coroutine's frame, and then
searching for the corresponding label inside the coroutine's resume
function.
The DWARF information generated for such a label looks like:
```
0x00000f71: DW_TAG_label
              DW_AT_name ("__coro_resume_1")
              DW_AT_decl_file ("generator-example.cpp")
              DW_AT_decl_line (5)
              DW_AT_decl_column (3)
              DW_AT_artificial (true)
              DW_AT_LLVM_coro_suspend_idx (0x01)
              DW_AT_low_pc (0x00000000000019be)
```
The labels can be mapped to their corresponding `__coro_index` values
either via their naming convention `__coro_resume_<N>` or using the new
`DW_AT_LLVM_coro_suspend_idx` attribute. In gdb, those line numbers can
be looked up using `info line -function my_coroutine -label
__coro_resume_1`. LLDB unfortunately does not understand DW_TAG_label
debug information yet.
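A small sketch of the naming convention, as a debugging script might use it to go from a `__coro_index` value to a label name and back. The helper names are hypothetical; only the `__coro_resume_<N>` pattern comes from this commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Name of the label for a given suspension index, following the
// `__coro_resume_<N>` convention.
std::string resumeLabelName(uint64_t CoroIndex) {
  return "__coro_resume_" + std::to_string(CoroIndex);
}

// Recover the suspension index from a label name. Returns false when the
// name does not follow the convention.
bool suspendIndexFromLabel(const std::string &Name, uint64_t &Index) {
  const std::string Prefix = "__coro_resume_";
  if (Name.size() <= Prefix.size() ||
      Name.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  uint64_t Value = 0;
  for (std::size_t I = Prefix.size(); I < Name.size(); ++I) {
    char C = Name[I];
    if (C < '0' || C > '9')
      return false;
    Value = Value * 10 + static_cast<uint64_t>(C - '0');
  }
  Index = Value;
  return true;
}
```

A consumer that can read DWARF attributes would likely prefer `DW_AT_LLVM_coro_suspend_idx` over parsing names; the name-based route is the fallback for tools that only see label names.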
Given this is an artificial compiler-generated label, I applied the
DW_AT_artificial attribute to it. The DWARFv5 standard only allows that
attribute on type and variable definitions, but this is a natural
extension and was also blessed in the RFC on discourse.
Also, this commit adds `DW_AT_decl_column` to labels, not only for
coroutines but also for normal C and C++ labels. While not strictly
necessary, I am doing so now because it would be harder to do so later
without breaking the binary LLVM-IR format.
Drive-by fixes: While reading the existing test cases to understand how
to write my own test case, I made a couple of small typo fixes and
comment improvements.
Reverts llvm/llvm-project#136205
Breaks buildbots, probably something about needing to restrict the test
to running on a specific target or the like - I haven't looked closely.
Co-authored-by: Vladislav Dzhidzhoev <dzhidzhoev@gmail.com>
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
  Dest.insert(Src.begin(), Src.end());
with:
  Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
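For illustration, a sketch of what a range-taking insert can look like on a container, using a hypothetical `TinySet` wrapper around `std::set` rather than any of the LLVM set types named above:

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical minimal set wrapper; not LLVM's DenseSet or SmallPtrSet.
template <typename T> class TinySet {
  std::set<T> Storage;

public:
  // Classic iterator-pair form.
  template <typename It> void insert(It First, It Last) {
    Storage.insert(First, Last);
  }

  // Range form: forwards the whole range to the iterator-pair form.
  template <typename Range> void insert_range(const Range &R) {
    insert(R.begin(), R.end());
  }

  bool contains(const T &V) const { return Storage.count(V) != 0; }
  std::size_t size() const { return Storage.size(); }
};
```

With this, `Dest.insert_range(Src)` and `Dest.insert(Src.begin(), Src.end())` insert the same elements; the range form simply avoids naming the source twice.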
**Summary**
This patch introduces a new compiler option `-mllvm
-emit-func-debug-line-table-offsets` that enables the emission of
per-function line table offsets and end sequences in DWARF debug
information. This enhancement allows tools and debuggers to accurately
attribute line number information to their corresponding functions, even
in scenarios where functions are merged or share the same address space
due to optimizations like Identical Code Folding (ICF) in the linker.
**Background**
RFC: [New DWARF Attribute for Symbolication of Merged
Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)
Previous similar PR:
[#93137](https://github.com/llvm/llvm-project/pull/93137) – This PR was
very similar to the current one but at the time, the assembler had no
support for emitting labels within the line table. That support was
added in PR [#99710](https://github.com/llvm/llvm-project/pull/99710) -
and in this PR we use some of the support added in the assembler PR.
In the current implementation, Clang generates line information in the
`debug_line` section without directly associating line entries with
their originating `DW_TAG_subprogram` DIEs. This can lead to issues when
post-compilation optimizations merge functions, resulting in overlapping
address ranges and ambiguous line information.
For example, when functions are merged by ICF in LLD, multiple functions
may end up sharing the same address range. Without explicit linkage
between functions and their line entries, tools cannot accurately
attribute line information to the correct function, adversely affecting
debugging and call stack resolution.
**Implementation Details**
To address the above issue, the patch makes the following key changes:
**`DW_AT_LLVM_stmt_sequence` Attribute**: Introduces a new LLVM-specific
attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE.
This attribute holds a label pointing to the offset in the line table
where the function's line entries begin.
**End-of-Sequence Markers**: Emits an explicit DW_LNE_end_sequence after
each function's line entries in the line table. This marks the end of
the line information for that function, ensuring that line entries are
correctly delimited.
**Assembler and Streamer Modifications**: Modifies the MCStreamer and
related classes to support emitting the necessary labels and tracking
the current function's line entries. A new flag
GenerateFuncLineTableOffsets is added to control this behavior.
**Compiler Option**: Introduces the `-mllvm
-emit-func-debug-line-table-offsets` option to enable this
functionality, allowing users to opt-in as needed.
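To make the motivation concrete, here is a sketch of how a consumer might use the attribute. It assumes a toy model in which the line table is a map from sequence offset to rows; `LineTable`, `Subprogram`, and `lineFor` are hypothetical and do not correspond to any LLVM or debugger API.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct LineRow {
  uint64_t Address;
  unsigned Line;
};

// Toy model: each line-table sequence is keyed by its offset in the table.
using LineTable = std::map<uint64_t, std::vector<LineRow>>;

struct Subprogram {
  std::string Name;
  uint64_t StmtSequenceOffset; // value of DW_AT_LLVM_stmt_sequence
};

// Look up the line for Address using only the function's own sequence.
// Returns 0 when the sequence is missing or no row covers the address.
// Assumes rows within a sequence are sorted by address.
unsigned lineFor(const LineTable &Table, const Subprogram &SP,
                 uint64_t Address) {
  auto It = Table.find(SP.StmtSequenceOffset);
  if (It == Table.end())
    return 0;
  unsigned Line = 0;
  for (const LineRow &Row : It->second)
    if (Row.Address <= Address)
      Line = Row.Line;
  return Line;
}
```

When ICF folds two functions onto the same addresses, an address-only lookup cannot tell their rows apart, while a lookup that starts from each function's sequence offset still returns that function's own line.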
This is the final piece to enable register debugging for variables in
registers that have single locations that last throughout their
enclosing scope.
The next step after this for supporting register debugging for NVPTX is
to support the .debug_loc section.
Stacked on top of: https://github.com/llvm/llvm-project/pull/109495
getSectionIDNum may return the same value for two different MBBSectionIDs,
e.g. a Cold type MBBSectionID with number 0 and a Default type
MBBSectionID with number 2 both get the value 2 from getSectionIDNum. This
may lead to entries in MBBSectionRanges being overwritten. Using the
MBBSectionID itself as the DenseMap key is a better choice.
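The collision can be reproduced in a small sketch. `SectionID` is a hypothetical stand-in for MBBSectionID, `flatten` is an illustrative lossy mapping chosen to reproduce the example above (an assumption, not necessarily the formula getSectionIDNum uses), and `std::map` stands in for DenseMap.

```cpp
#include <map>
#include <tuple>

// Hypothetical stand-in for MBBSectionID.
struct SectionID {
  enum Type { Default = 0, Exception = 1, Cold = 2 } Kind;
  unsigned Number;

  // Ordering over both fields, so distinct IDs are distinct map keys.
  bool operator<(const SectionID &O) const {
    return std::tie(Kind, Number) < std::tie(O.Kind, O.Number);
  }
};

// Illustrative lossy flattening: different (Kind, Number) pairs can map to
// the same integer.
unsigned flatten(const SectionID &ID) { return ID.Kind + ID.Number; }
```

Keying a map by `flatten(ID)` lets the Cold/0 entry and the Default/2 entry overwrite each other, while keying by the `SectionID` itself keeps them apart.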
When a type unit is emitted, the CU referencing the type unit ends up
with a little DW_TAG_*_type with the DW_AT_signature and
DW_AT_declaration sometimes referred to (by me? maybe other people?) as
a skeleton type.
We shouldn't produce .debug_names references to these, only to the
actual type definition in the type unit. So this patch does that.
But, inversely, the .debug_gnu_pubtypes /does/ need to reference the
skeleton type (& gcc does this too, when it produces a skeleton type
(gcc doesn't always produce these - if the type is only referenced once
via DW_AT_type, gcc uses a direct DW_FORM_ref_sig8 on the DW_AT_type
without the intermediate skeleton type)) - so there's a little special
case added in to preserve that behavior which is covered by existing
tests.
If -mllvm -add-linkage-names-to-external-call-origins is true then add
DW_AT_linkage_name attributes to DW_TAG_subprogram DIEs referenced by
DW_AT_call_origin attributes that would otherwise be omitted.
A debugger may use DW_AT_call_origin attributes to determine whether any
frames in a callstack are missing due to optimisations (e.g. tail calls).
For example, say a() calls b(), which tail-calls c(), and you stop in your
debugger in c():
The callstack looks like this:
  c()
  a()
Looking "up" from c(), call site information can be found in a(). This includes
a DW_AT_call_origin referencing b()'s subprogram DIE, which means the call at
this call site was to b(), not c() where we are currently stopped. This
indicates b()'s frame has been lost due to optimisation (or is misleading due
to ICF).
This patch makes it easier for a debugger to check whether the referenced
DIE describes the target function or not, for example by comparing the referenced
function name to the current frame.
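A sketch of that comparison, assuming the debugger has already extracted the linkage name of the DIE referenced by DW_AT_call_origin and the linkage name of the function in the callee frame (`CallSite` and `frameMayBeMissing` are hypothetical names):

```cpp
#include <string>

// Hypothetical view of a call site found in the caller's frame.
struct CallSite {
  std::string OriginLinkageName; // name of the DIE DW_AT_call_origin refers to
};

// If the call site targets a different function than the one we are
// stopped in, a frame in between was probably lost (e.g. to a tail call).
bool frameMayBeMissing(const CallSite &CS,
                       const std::string &CalleeFrameLinkageName) {
  return CS.OriginLinkageName != CalleeFrameLinkageName;
}
```

In the a()/b()/c() example, the call site in a() names b() while the frame below it is c(), so the check reports a possibly missing frame. Without a linkage name on the referenced DIE, this comparison has nothing reliable to compare against.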
There's already an option to apply DW_AT_linkage_name in a targeted manner:
-dwarf-linkage-names=Abstract, which limits adding DW_AT_linkage_names to
abstract subprogram DIEs (this is default for SCE tuning).
The new flag shouldn't affect non-SCE-tuned behaviour whether it is enabled
or not because the non-SCE-tuned behaviour is to always add linkage names to
subprogram DIEs.
- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions
This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported in Chromium (https://reviews.llvm.org/D144006#4651955).
The first commit is added for convenience, as it has already been
accepted.
If a DISubprogram was not cloned (e.g. we are cloning a function that has
other functions inlined into it, and the subprograms of the inlined
functions are not supposed to be cloned), it doesn't make sense to clone
its DILocalVariables either.
Otherwise we get duplicated DILocalVariables that are not tracked in their
subprogram's retainedNodes, which crash LTO with Chromium.
This is meant to be committed along with
https://reviews.llvm.org/D144006.
Enable Type Units with DWARF5 accelerator tables for monolithic DWARF.
The implementation relies on the linker to tombstone the offset in the
LocalTU list to -1 when it deduplicates type units using COMDAT.
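A sketch of how a consumer might treat such tombstoned entries, assuming 32-bit DWARF offsets so that -1 is 0xFFFFFFFF. The function and constant names are hypothetical and the list is modeled as a plain vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// All-ones offset written by the linker for a discarded type unit;
// assumes 32-bit DWARF offsets in the local TU list.
const uint32_t TombstoneOffset = 0xFFFFFFFFu;

// Sets Offset and returns true when TUIndex refers to a type unit that is
// still present; returns false for out-of-range or tombstoned entries.
bool resolveLocalTU(const std::vector<uint32_t> &LocalTUs,
                    std::size_t TUIndex, uint32_t &Offset) {
  if (TUIndex >= LocalTUs.size() || LocalTUs[TUIndex] == TombstoneOffset)
    return false;
  Offset = LocalTUs[TUIndex];
  return true;
}
```

Entries are checked in place rather than filtered out, since removing them would shift the indices that the rest of the table uses to refer to type units.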
The DWARF 5 specification says that:
> The name index must contain an entry for each debugging information
> entry that defines a named [...] label [...].
The verifier currently verifies this, but the AsmPrinter does not add
entries for TAG_labels in debug_names. This patch addresses the issue by
ensuring we add labels in the accelerator tables once we have a fully
completed DIE for the TAG_label entry.
We also respect the spec as follows:
> DW_TAG_label debugging information entries without an address
> attribute (DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges, or
> DW_AT_entry_pc) are excluded.
The effect of this on the size of accelerator tables is minimal, as
TAG_labels are usually created by C/C++ labels (see example in test),
which are typically paired with "goto" statements.
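The two quoted rules amount to a simple filter, sketched here with a hypothetical `LabelDIE` summary of a completed DW_TAG_label DIE:

```cpp
#include <string>
#include <vector>

// Hypothetical summary of a completed DW_TAG_label DIE.
struct LabelDIE {
  std::string Name;
  // True if the DIE has any of DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges,
  // or DW_AT_entry_pc.
  bool HasAddressAttr;
};

// Names that belong in the name index: named labels that have an address.
std::vector<std::string> namesForIndex(const std::vector<LabelDIE> &Labels) {
  std::vector<std::string> Names;
  for (const LabelDIE &L : Labels)
    if (!L.Name.empty() && L.HasAddressAttr)
      Names.push_back(L.Name);
  return Names;
}
```

The check needs the finished DIE because whether a label ends up with an address is only known once its attributes have been added.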
This caused asserts:
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331:
virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *):
Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms &&
"getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.
See comment on the code review for reproducer.
> RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
>
> Similar to imported declarations, the patch tracks function-local types in
> DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
> the aforementioned metadata change and gains support for function-local
> types scoped within a lexical block.
>
> The patch assumes that DICompileUnit's 'enums' field no longer tracks local
> types and DwarfDebug would assert if any locally-scoped types get placed there.
>
> Reviewed By: jmmartinez
>
> Differential Revision: https://reviews.llvm.org/D144006
This reverts commit f8aab289b5549086062588fba627b0e4d3a5ab15.
Fold constructVariableDIEImpl into constructVariableDIE, simplify it and
group related functions.
Pull out the previously inline lambdas for visiting the active variant
of the DbgVariable to add location and related attributes as an overload
set for a private method applyConcreteDbgVariableAttributes.
Rename applyVariableAttribute to reflect what kinds of attributes it
applies, and to contrast it with the new
applyConcreteDbgVariableAttributes.
Move constructLabelDIE down in the implementation file, so all of the
constructVariableDIE-related function impls are adjacent.
This potentially has a slightly positive performance impact, as
std::visit can be implemented as a `switch`-like jump rather than
a series of `if`s.
More importantly, the reader can be confident there is no overlap between
the cases.
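As an illustration of the overload-set pattern, here is a sketch using a plain `std::variant` with hypothetical alternative types (the names are loosely modeled on the kinds of locations a variable can have and are not the actual types in DwarfDebug):

```cpp
#include <string>
#include <variant>

// Hypothetical alternatives for where a variable lives.
struct StackSlot { int FrameIndex; };
struct SingleValue { int Register; };
struct LocList { unsigned Index; };
using Location = std::variant<StackSlot, SingleValue, LocList>;

// One overload per alternative; std::visit calls exactly one of them.
struct DescribeLocation {
  std::string operator()(const StackSlot &S) const {
    return "frame index " + std::to_string(S.FrameIndex);
  }
  std::string operator()(const SingleValue &V) const {
    return "register " + std::to_string(V.Register);
  }
  std::string operator()(const LocList &L) const {
    return "location list " + std::to_string(L.Index);
  }
};

std::string describe(const Location &Loc) {
  return std::visit(DescribeLocation{}, Loc);
}
```

Leaving out an overload for one alternative is a compile error here, which is what gives the reader the guarantee that every case is handled exactly once.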
Differential Revision: https://reviews.llvm.org/D158678
Only a subset of the fields of DbgVariable are meaningful at any time,
and some fields are re-used for multiple purposes (for example
FrameIndexExprs is used with a throw-away frame-index of 0 to hold a
single DIExpression without needing to add another member). The exact
invariants must be reverse-engineered by inspecting the actual use of
the class, its imprecise/outdated doc-comment, and some asserts.
Refactor DbgVariable into a sum type by inheriting from std::variant.
This makes the active fields for any given state explicit and removes
the need to re-use fields in disparate contexts. As a bonus, it seems to
reduce the size on my x86_64 linux box from 144 bytes to 96 bytes.
There is some potential cost to `std::get` as it must check the active
alternative even when context or an assert obviates it. To try to help
ensure the compiler can optimize out the checks the patch also adds a
helper `get` method which uses the noexcept `std::get_if`.
Some of the extra cost would also be avoided more cleanly with a
refactor that exposes the alternative types in the public interface,
which will come in another patch.
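A reduced sketch of the shape described above: a class that inherits from `std::variant` and exposes a `get` helper built on `std::get_if`. The alternative types and the class name are hypothetical, and the real DbgVariable carries more state than this.

```cpp
#include <cassert>
#include <variant>

// Hypothetical alternatives; std::monostate models "no location yet".
struct StackSlot { int FrameIndex; };
struct EntryValue { int Register; };
using Loc = std::variant<std::monostate, StackSlot, EntryValue>;

class DbgVar : public Loc {
public:
  using Loc::Loc; // inherit the variant constructors

  template <typename T> bool holds() const {
    return std::holds_alternative<T>(asBase());
  }

  // std::get_if is noexcept and returns nullptr on mismatch, so there is
  // no throwing path for the compiler to keep around.
  template <typename T> const T &get() const {
    const T *P = std::get_if<T>(&asBase());
    assert(P && "wrong alternative requested");
    return *P;
  }

private:
  const Loc &asBase() const { return *this; }
};
```

Only the fields of the active alternative exist at any time, so a caller cannot accidentally read a member that is meaningless in the current state.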
Differential Revision: https://reviews.llvm.org/D158675
With D149881, we converted EntryValue MachineFunction table entries into
`DbgVariables` initialized by a "DbgValue" intrinsic, which can only handle a
single, non-fragment DIExpression. However, it is desirable to handle variables
with multiple fragments and DIExpressions.
To do this, we expand the `DbgVariable` class to handle the EntryValue case.
This class can already operate under three different "modes" (stack slot,
unchanging location described by a dbg value, changing location described by a
loc list). A fourth case is added as a separate class entirely, but a subsequent
patch should redesign `DbgVariable` with four subclasses in order to make the
code more readable.
This patch also exposed a bug in the `beginEntryValueExpression` function, which
was not initializing the `LocationFlags` properly. Note how the
`finalizeEntryValue` function resets that flag. We fix this bug here, as testing
this change in isolation would be tricky.
Differential Revision: https://reviews.llvm.org/D158458
This reverts commit d20e4a1d68aa8e14c4e524e4d4eeb4445acac401.
After committing 2ee4d0386c783f58abe708298228de648239b435, we don't support
subprogram definitions nested within `DICompositeType` when doing LTO builds.
For a detailed discussion, see https://reviews.llvm.org/D152095.
Test "local-type-as-template-parameter.ll" is now enabled only for
x86_64.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006
Depends on D144005
Test "local-type-as-template-parameter.ll" now requires linux-system.
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006
Depends on D144005
This reverts commit d80fdc6fc1a6e717af1bcd7a7313e65de433ba85.
split-dwarf-local-import3.ll fails because of an issue with
Dwo section emission on the Windows platform.