LLVM currently stores heapallocsite information in CodeView debuginfo,
but not in DWARF debuginfo. Plumb it into DWARF as an LLVM-specific
extension.
heapallocsite debug information is useful when it is combined with
allocator instrumentation that stores caller addresses; I've used a
previous version of this patch for:
- analyzing memory usage by object type
- analyzing the distributions of values of class members
Other possible uses might be:
- attributing memory access profiles (for example, on Intel CPUs, from
PEBS records with Linear Data Address) to object types or specific
object members
- adding type information to crash/ASAN reports
On X86-64, LLVM currently generates the same DWARF debug info for `call
rax` and `call [rax]`; in both cases, the generated DWARF claims that
the call goes to address RAX. This bug occurs because the X86 machine
instructions CALL64r and CALL64m both receive register operands, but
those register operands have different semantics.
To fix it, change DwarfDebug::constructCallSiteEntryDIEs() to validate
the callee operand's semantics (`OperandType`) and make sure it is not
semantically describing a memory location.
This fix will result in fewer DW_TAG_call_site and DW_AT_call_target
entries being generated.
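As a rough illustration of the check described above, here is a minimal
standalone sketch. The types `OperandKind` and `CalleeOperand` and the
function `callTargetRegister` are hypothetical stand-ins invented for
this example; they are not the LLVM API and not the code in this patch.

```cpp
#include <optional>

// Hypothetical model of what an instruction description says an operand means.
enum class OperandKind { Register, Memory, Immediate };

// Hypothetical callee operand: it may be encoded as a register even when the
// register is only part of a memory address (as in `call [rax]`).
struct CalleeOperand {
  bool IsReg;
  OperandKind Kind;
  unsigned RegNum;
};

// Return the register that holds the call target, or nothing when the
// operand is not a register or semantically describes a memory location.
std::optional<unsigned> callTargetRegister(const CalleeOperand &Op) {
  if (!Op.IsReg)
    return std::nullopt;
  if (Op.Kind == OperandKind::Memory)
    return std::nullopt; // `call [rax]`: RAX is an address, not the target
  return Op.RegNum;
}
```

In this model, only the `call rax` form yields a register that could be
described as the call target.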
There is an existing test in dwarf-callsite-related-attrs.ll that
asserts the broken behavior; remove the broken check, and instead add a
new test dwarf-callsite-related-attrs-indirect.ll that checks behavior
for indirect calls.
The existing test xray-custom-log.ll is validating something even more
broken: It checks the debug info generated by a PATCHABLE_EVENT_CALL.
`TII->getCalleeOperand()` assumes that the first argument of a call
instruction is always the destination, but the first argument of
PATCHABLE_EVENT_CALL is instead the event structure; and so we were
emitting debug info claiming the callee was stored in a register that
actually contains some kind of xray event descriptor, and the test
validates that this happens.
I am breaking and deleting this test.
I guess the intent there might have been to validate that we emit
debuginfo referencing the target of the direct call that LLVM emits
(which we don't do)? But I'm not sure.
Patch 3/4 adding bitcode support, though the final patch doesn't depend on this
one.
Prior to this patch, a Key Instructions function inlined into a
Not-Key-Instructions function fell back to Not-Key-Instructions handling.
In order to fully support inlining mixed modes we need to run
`computeKeyInstructions` (in case there's a Key Instructions scope) and
`findForceIsStmtInstrs` (in case there's a Not-Key-Instructions scope) on all
functions. This has a slight performance cost for all configurations - see PR
for details.
Patch 2/4 adding bitcode support.
A non-key-instructions function inlined into a key-instructions function uses
non-key-instructions is_stmt placement (without `findForceIsStmtInstrs`).
A key-instructions function inlined into a non-key-instructions function
currently results in falling back to non-key-instructions for the inlined scope
too.
Both of these concessions (not using `findForceIsStmtInstrs` in the 1st case,
and not using Key Instructions for the inlined scope in the 2nd) are for
performance reasons; to do the right thing we'd need to run both
`findForceIsStmtInstrs` and `computeKeyInstructions` - in case that's
controversial I've got a separate PR for that: PR 144103.
Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#…
(#145959)
This reapplies cbf781f0bdf2f680abbe784faedeefd6f84c246e, with fixes for
the shared-library build and the unconventional sanitizer-runtime build.
Original Description:
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
I am happy to put it in another location, or to structure it differently
if that makes sense. Some have suggested in BinaryFormat, but it is not
a great fit there. But if that makes more sense to the reviewers, I can
do that.
Another possibility would be to use pass-through headers to allow
clients who don't care to depend only on DebugInfo/DWARF. This would be
a much less invasive change, and perhaps easier for clients. But also a
system that hides details.
Either way, I'm open.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
- try_emplace(Key) is shorter than insert(std::make_pair(Key, 0)).
- try_emplace performs value initialization without value parameters.
- We overwrite values on successful insertion anyway.
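A small sketch of the pattern, using `std::map` as a stand-in for the
LLVM map types (an assumption made so the example is self-contained);
`getOrAssignID` is a hypothetical helper, not code from this patch.

```cpp
#include <map>
#include <string>

// Assign 1-based IDs to keys in first-seen order.
unsigned getOrAssignID(std::map<std::string, unsigned> &IDs,
                       const std::string &Key) {
  // try_emplace(Key) value-initializes the mapped value (0 here) if the key
  // is new, and leaves an existing entry untouched.
  auto [It, Inserted] = IDs.try_emplace(Key);
  if (Inserted)
    It->second = static_cast<unsigned>(IDs.size()); // overwrite on insertion
  return It->second;
}
```

Because the value is overwritten whenever insertion succeeds, the
placeholder passed to `insert(std::make_pair(Key, 0))` carried no
information.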
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
On some platforms (particularly macOS), a `\01` prefix gets added to the
name in an `asm` label. This gets stripped when we emit the
[`DW_AT_linkage_name`](2f877c2722/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp (L531)).
But we weren't stripping this prefix when inserting the linkage name
into accelerator tables.
This manifested in an issue where LLDB tried to look up a name in the
index by linkage name, but wasn't able to find it because we indexed it
with the `\01` unstripped.
This patch strips the prefix before indexing.
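For illustration only, stripping the prefix amounts to something like
the following standalone sketch. `stripManglingEscape` is a hypothetical
name chosen for this example; the patch itself may use an existing LLVM
helper instead.

```cpp
#include <string_view>

// Drop a single leading '\1' byte, which marks a name that must not be
// mangled any further; other names are returned unchanged.
std::string_view stripManglingEscape(std::string_view Name) {
  if (!Name.empty() && Name.front() == '\1')
    Name.remove_prefix(1);
  return Name;
}
```

Indexing the stripped name keeps the accelerator table key identical to
the string emitted in `DW_AT_linkage_name`.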
Reapplied after fixing the config problem that caused failures following
the previous merge.
This reverts commit fdbf073a86573c9ac4d595fac8e06d252ce1469f.
This reverts commit a9d93ecf1f8d2cfe3f77851e0df179b386cff353.
Reverted due to the commit including a config in LLVM headers that is not
available outside of the llvm source tree.
Reverts llvm/llvm-project#136205
Breaks buildbots, probably something about needing to restrict the test
to running on a specific target or the like - I haven't looked closely.
Co-authored-by: Vladislav Dzhidzhoev <dzhidzhoev@gmail.com>
This is part of a series of patches that tries to improve DILocation bug
detection in Debugify; see the review for more details. This patch adds
the main feature: a set of `DebugLoc::get<Kind>` functions that can be
used for instructions with intentionally empty DebugLocs to prevent
Debugify from treating them as bugs. This removes the currently-pervasive
false positives and allows us to use Debugify (in its original DI
preservation mode) to reliably detect existing bugs and regressions. This
patch does not add uses of these functions, except for one use in Clang
before optimizations and one in `Instruction::dropLocation()`, since that
is an obvious case that immediately removes a set of false positives.
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
Reland https://github.com/llvm/llvm-project/pull/106230
The original PR was reverted due to a compilation time regression.
This PR fixes that by adding the condition OutStreamer->isVerboseAsm() to
the generation of extra inlined-at debug info, so that it does not
affect normal compilation time.
Currently, MC prints the source location of instructions in comments in
assembly when debug info is available; however, it does not include
inlined-at locations when a function is inlined.
For example, function foo is defined in header file a.h and is called
multiple times in b.cpp. If foo is inlined, current assembly will only
show its instructions with their line numbers in a.h. With inlined-at
locations, the assembly will also show where foo is called in b.cpp.
This patch adds inlined-at locations to the comments by using
DebugLoc::print. It makes the printed source location info consistent
with those printed by machine passes.
Currently, MC prints the source location of instructions in comments in
assembly when debug info is available; however, it does not include
inlined-at locations when a function is inlined.
For example, function foo is defined in header file a.h and is called
multiple times in b.cpp. If foo is inlined, current assembly will only
show its instructions with their line numbers in a.h. With inlined-at
locations, the assembly will also show where foo is called in b.cpp.
This patch adds inlined-at locations to the comments by using
DebugLoc::print. It makes the printed source location info consistent
with those printed by machine passes.
Fixes the "use after poison" issue introduced by #121516 (see
<https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>).
The root cause of this issue is that #121516 introduced "Called Global"
information for call instructions, modeled on how "Call Site" info is
stored in the machine function; however, it didn't replicate the
copy/move/erase operations that exist for call site information.
The fix is to rename and update the existing copy/move/erase functions
so they also take care of Called Global info.
When creating a Type Unit (TU), LLVM attempts to do so optimistically.
However, if this fails, it discards the TU state and creates the TU
within the Compilation Unit (CU). In such cases, an entry for the
top-level DIE is not created in the debug names table.
This can cause issues when running llvm-dwarfdump --debug-names
--verify, as the missing entry will result in verification failure.
To address this issue, this patch adds a call to
updateAcceleratorTables when TU creation fails. This ensures that the
debug names table is updated correctly, even in cases where TU creation
fails.
The optimiser will produce empty blocks that are unconditionally
executed according to the CFG -- while it may not be meaningful code,
and won't get a prologue_end position, we need to not crash on this
input.
The fault comes from assuming that there's always a next block with some
instructions in it, that will eventually produce some meaningful control
flow to stop at -- in the given reproducer in issue #117206 this isn't
true, because the function terminates with `unreachable`. Thus, I've
refactored the "get next instruction logic" into a helper that'll step
through all blocks and terminate if there aren't any more.
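A minimal sketch of such a helper, under heavy simplification: blocks
are modeled as vectors of integers and are walked in layout order. The
names `Block`, `Position` and `firstInstrFrom` are hypothetical and do
not appear in the patch.

```cpp
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

using Block = std::vector<int>;                       // instructions as ints
using Position = std::pair<std::size_t, std::size_t>; // (block, index)

// Find the first instruction at or after block BlockIdx, skipping empty
// blocks. Returns nothing when the function runs out of blocks, instead of
// assuming that a non-empty block always follows.
std::optional<Position> firstInstrFrom(const std::vector<Block> &Blocks,
                                       std::size_t BlockIdx) {
  for (; BlockIdx < Blocks.size(); ++BlockIdx)
    if (!Blocks[BlockIdx].empty())
      return Position{BlockIdx, 0};
  return std::nullopt;
}
```

The caller then has to handle the empty result explicitly, which is the
case that previously crashed.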
Reproducer from aeubanks
Add a filter to avoid picking prologue_end when a function is empty (it may
have blocks but no instructions). This saves us from pushing more
validity-checking into findPrologueEndLoc.
In 39b2979a4 Pavel has kindly refined the implementation of a test in such
a way that it doesn't trip up over this patch -- the test wishes to
stimulate LLDB's presentation of line0 locations, rather than wanting to
always step on line-zero on entry to artificial_location.c. As that's what
was tripping up this change, reapply.
Original commit message follows.
[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)
prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.
To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.
This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
**Summary**
This patch introduces a new compiler option `-mllvm
-emit-func-debug-line-table-offsets` that enables the emission of
per-function line table offsets and end sequences in DWARF debug
information. This enhancement allows tools and debuggers to accurately
attribute line number information to their corresponding functions, even
in scenarios where functions are merged or share the same address space
due to optimizations like Identical Code Folding (ICF) in the linker.
**Background**
RFC: [New DWARF Attribute for Symbolication of Merged
Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)
Previous similar PR:
[#93137](https://github.com/llvm/llvm-project/pull/93137) – This PR was
very similar to the current one but at the time, the assembler had no
support for emitting labels within the line table. That support was
added in PR [#99710](https://github.com/llvm/llvm-project/pull/99710) -
and in this PR we use some of the support added in the assembler PR.
In the current implementation, Clang generates line information in the
`debug_line` section without directly associating line entries with
their originating `DW_TAG_subprogram` DIEs. This can lead to issues when
post-compilation optimizations merge functions, resulting in overlapping
address ranges and ambiguous line information.
For example, when functions are merged by ICF in LLD, multiple functions
may end up sharing the same address range. Without explicit linkage
between functions and their line entries, tools cannot accurately
attribute line information to the correct function, adversely affecting
debugging and call stack resolution.
**Implementation Details**
To address the above issue, the patch makes the following key changes:
**`DW_AT_LLVM_stmt_sequence` Attribute**: Introduces a new LLVM-specific
attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE.
This attribute holds a label pointing to the offset in the line table
where the function's line entries begin.
**End-of-Sequence Markers**: Emits an explicit DW_LNE_end_sequence after
each function's line entries in the line table. This marks the end of
the line information for that function, ensuring that line entries are
correctly delimited.
**Assembler and Streamer Modifications**: Modifies the MCStreamer and
related classes to support emitting the necessary labels and tracking
the current function's line entries. A new flag
GenerateFuncLineTableOffsets is added to control this behavior.
**Compiler Option**: Introduces the `-mllvm
-emit-func-debug-line-table-offsets` option to enable this
functionality, allowing users to opt-in as needed.
This reverts commit bf483ddb42065405e345393e022dc72357ec5a3a.
See PR, there's a test testing for this behaviour (possibly adaptable), and
a duplicate line entry too
This patch follows on from the changes made in #105524, by adding an
additional heuristic that prevents us from applying the start-of-MBB
is_stmt flag when we can see that, for all direct branches to the MBB,
the last line stepped on before the branch is the same as the first line
of the MBB. This is mainly to prevent certain pathological cases, such
as macros that expand to multiple basic blocks that all have the same
source location, from giving us repeated steps on the same line. This
approach is not comprehensive, since it relies on analyzeBranch to read
edges, but the default fallback of applying is_stmt may lead only to
useless steps in some cases, rather than skipping useful steps
altogether.
prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.
To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.
This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
In buildLocationList, with basic block sections, we iterate over
every basic block twice to detect section start and end. This is
suboptimal and takes a significant amount of time when compiling
large functions.
This patch uses the set of sections already stored in MBBSectionRanges
and iterates over sections rather than basic blocks.
When detecting if loclists can be merged, the end label of an entry is
matched with the beginning label of the next entry. For the section
corresponding to the entry basic block, this is skipped. This is
because the loc list uses the end label of the last basic block in that
section, whereas the MBBSectionRanges map uses the function end label.
For example:
.Lfunc_begin0:
.file
.loc 0 4 0 # ex2.cc:4:0
.cfi_startproc
.Ltmp0:
.loc 0 8 5 prologue_end # ex2.cc:8:5
....
.LBB_END0_0:
.cfi_endproc
.section .text._Z4testv,"ax",@progbits,unique,1
...
.Lfunc_end0:
.size _Z4testv, .Lfunc_end0-_Z4testv
The debug loc uses ".LBB_END0_0" for the end of the section whereas
MBBSectionRanges uses ".Lfunc_end0".
It is alright to skip this as we already check the section corresponding
to the debugloc entry.
Added a new test case to check that this works correctly when the
variable's value is mutated in the entry section.
Enable .debug_loc section for NVPTX backend.
This commit makes NVPTX omit DW_AT_low_pc (and DW_AT_high_pc) for
DW_TAG_compile_unit. This is because cuda-gdb uses the compile unit's
low_pc as a base address, and adds the addresses in the debug_loc
section to it. Removing low_pc is equivalent to setting that base
address to zero, so addition doesn't break the location ranges.
Additionally, this patch forces debug_loc label emission to emit single
labels with no subtraction or base. This would not be necessary if we
could emit `label1 - label2` expressions in PTX. The PTX documentation
at
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#debugging-directives-section
makes it seem like this is supported, but it doesn't actually work. I
believe when that documentation says that you can subtract “label
addresses between labels in the same dwarf section”, it doesn't merely
mean that the labels need to be in the same section as each other, but
in fact they need to be in the same section as the use. If label
subtraction is ever supported such that in the debug_loc section you
can subtract labels from the main code section, then we can remove the
workarounds added in this PR.
Also, since this now emits valid .debug_loc sections, it replaces the
empty .debug_loc to force existence of at least one debug section with
an empty .debug_macinfo section, which matches what nvcc does.
This replaces some of the most frequent offenders of using a DenseMap that
cause a malloc, where the typical element-count is small enough to fit in
an initial stack allocation.
Most of these are fairly obvious, one to highlight is the collectOffset
method of GEP instructions: if there's a GEP, of course it's going to have
at least one offset, but every time we've called collectOffset we end up
calling malloc as well for the DenseMap in the MapVector.
The register encoding used by NVPTX and cuda-gdb basically uses strings
encoded as numbers. They always fit within 64 bits, but typically
do not fit within 32 bits, since they often need at least 5 characters.
This patch changes the signature of `MCRegisterInfo::getDwarfRegNum` and
some related data structures to use 64-bit numbers to accommodate
encodings like this.
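To see why 32 bits is not enough, consider the sketch below. It packs
one character per byte with the first character most significant; that
byte order, and the name `packRegisterName`, are assumptions made for
illustration and may not match the exact encoding NVPTX uses.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Pack up to eight ASCII characters into a 64-bit integer, one byte each.
uint64_t packRegisterName(std::string_view Name) {
  assert(Name.size() <= 8 && "name must fit in 64 bits");
  uint64_t Encoded = 0;
  for (unsigned char C : Name)
    Encoded = (Encoded << 8) | C;
  return Encoded;
}
```

Under this scheme a three-character name such as `%r1` still fits in
32 bits, but any name of five or more characters needs more than 32.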
Additionally, `MCRegisterInfo::getDwarfRegNum` is marked as virtual, so
that targets with peculiar dwarf register mapping schemes (such as
NVPTX) can override its behavior.
I originally tried to do a broader switch to 64-bit types for registers,
but it caused many problems. There are various places in code generation
that not only treat registers as 32-bit numbers, but also treat
certain bit offsets as flags. So I limited the change as much as
possible to just the output of `getDwarfRegNum`. Keeping the types used
by `DwarfLLVMRegPair` as unsigned preserves the current behaviors. The
only way to give a 64-bit output from `getDwarfRegNum` that actually
needs more than 32-bits is to override `getDwarfRegNum` and provide an
implementation that sidesteps the use of the `DwarfLLVMRegPair` maps
defined in tablegen files.
First layer of stack supporting:
https://github.com/llvm/llvm-project/pull/109495
When emitting debug info for code alignment, it was possible to emit a
.loc directive with a file number of zero, which is invalid for DWARF 4
and earlier. This happened because getCurrentDwarfLoc() returned a
zero-initialised value when there hadn't been a previous .loc directive
emitted.
---------
Co-authored-by: Paul T Robinson <paul.robinson@sony.com>
Reverted due to large .debug_line size regressions for some
configurations; work is currently in progress to improve the output of
this behaviour in PR #108251.
This patch also modifies two tests that were created or modified after
the original commit landed and are affected by the revert:
llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
llvm/test/DebugInfo/X86/empty-line-info.ll
This reverts commit 5fef40c2c477e92187bd4e5c18091eca6b8465cc.
In degenerate but legal inputs, we can have functions that have no source
locations at all -- all the DebugLocs attached to instructions are empty.
LLVM didn't produce any source location for the function; with this patch
it will at least emit the function-scope source location. Demonstrated by
empty-line-info.ll
The XCOFF test modified has similar symptoms -- with this patch, the size
of the ".dwline" section grows a bit, thus shifting some of the file
internal offsets, which I've updated.
Currently, we identify the end of the prologue as being "the instruction
that first has *this* DebugLoc". It works well enough, but I feel
identifying a position in a function is best communicated by a
MachineInstr. Plus, I've got some patches coming that depend upon this.
Seemingly this goes back to fd07a2a in 2015 -- I anticipate that back
then the metadata layout was radically different. But nowadays at least, we
can just directly look up the subprogram.
Fixes the previous buildbot error by adding an explicit triple to the test,
ensuring that llc can produce a valid object file.
This reverts commit 926f0979af4f6172d4ed3dea5603aa97c800bef1.
Reverted (along with the NFC followup fix) due to buildbot failure:
https://lab.llvm.org/buildbot/#/builders/160/builds/4142
This reverts commit 3ef37e2f8f672393ee409fde8309198df0981735, and commit
616f7d3d4f6d9bea6f776e357c938847e522a681.
Fixes: https://github.com/llvm/llvm-project/issues/104695
This patch adds the is_stmt flag to line table entries for the first
instruction with a non-0 line location in each basic block, to ensure
that it will be used for stepping even if the last instruction in the
previous basic block had the same line number; this is important for
cases where the new BB is reachable from BBs other than the preceding
block.
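The rule can be sketched with a deliberately simplified model in which
every instruction with a non-zero line produces one row. `Instr`,
`LineRow` and `buildRows` are hypothetical names for this example and do
not reflect how the real line table emitter is structured.

```cpp
#include <vector>

struct Instr { unsigned Line; };               // 0 means "no line"
struct LineRow { unsigned Line; bool IsStmt; };

// Build line rows for a function given as a list of basic blocks.
std::vector<LineRow> buildRows(const std::vector<std::vector<Instr>> &Blocks) {
  std::vector<LineRow> Rows;
  unsigned PrevLine = 0;
  for (const auto &BB : Blocks) {
    bool FirstInBlock = true; // first non-zero line in a block gets is_stmt
    for (const Instr &I : BB) {
      if (I.Line == 0)
        continue;
      // Without FirstInBlock, a repeated line would never be marked is_stmt,
      // even when the block is reached from somewhere else.
      bool IsStmt = FirstInBlock || I.Line != PrevLine;
      Rows.push_back({I.Line, IsStmt});
      FirstInBlock = false;
      PrevLine = I.Line;
    }
  }
  return Rows;
}
```

In this model, a block that starts on the same line as the end of the
previous block still gets a steppable entry at its start.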
Since ce0c205813c74b4225180ac8a6e40fd52ea88229, we are doing that if a
single (LTO) compilation contains more than one compile unit, but the
same thing can happen in non-LTO and single-CU LTO compilations,
typically when the CU ends up (nearly) empty. In my case, this happened
when LTO emptied two compile units.
Note that the source file name is already a part of the hash, so this
can only happen when a single file is compiled and linked twice into the
same application (most likely with different preprocessor definitions).
While not exactly common, this pattern is used by some C code to
implement "templates".
The 2017 patch already hinted at the possibility of doing this
unconditionally, and this patch implements that. While the DWARF spec
hints at the option of using the type signature hashing algorithm for
the DWO_id purposes, AFAICT it does not actually require it, so I
believe this change is still conforming.
The relevant section of the spec is in Section 3.1.2 "Skeleton
Compilation Unit Entries" (in non-normative text):
```
The means of determining a compilation unit ID does not need to be
similar or related to the means of determining a type unit signature.
However, it should be suitable for detecting file version skew or other
kinds of mismatched files and for looking up a full split unit in a
DWARF package file (see Section 7.3.5 on page 190).
```