1879 Commits

Author SHA1 Message Date
Philip Reames
4d629f9744
[MIR] Remove std::variant from multiple save/restore point handling [nfc] (#153226)
In review of bbde6b, I had originally proposed that we support the
legacy text format. As review evolved, it bacame clear this had been a
bad idea (too much complexity), but in order to let that patch finally
move forward, I approved the change with the variant. This change undoes
the variant, and updates all the tests to just use the array form.
2025-08-12 11:23:05 -07:00
Jann Horn
3f0c180ca0
[DebugInfo][DWARF] Add heapallocsite information (#132073)
LLVM currently stores heapallocsite information in CodeView debuginfo,
but not in DWARF debuginfo. Plumb it into DWARF as an LLVM-specific
extension.

heapallocsite debug information is useful when it is combined with
allocator instrumentation that stores caller addresses; I've used a
previous version of this patch for:

 - analyzing memory usage by object type
 - analyzing the distributions of values of class members

Other possible uses might be:

- attributing memory access profiles (for example, on Intel CPUs, from
PEBS records with Linear Data Address) to object types or specific
object members
 - adding type information to crash/ASAN reports
2025-08-06 10:34:58 -07:00
Jann
da6424c9e3
[DebugInfo][DWARF] Don't emit bogus DW_AT_call_target for complex calls (#151378)
On X86-64, LLVM currently generates the same DWARF debug info for `call
rax` and `call [rax]`; in both cases, the generated DWARF claims that
the call goes to address RAX. This bug occurs because the X86 machine
instructions CALL64r and CALL64m both receive register operands, but
those register operands have different semantics.

To fix it, change DwarfDebug::constructCallSiteEntryDIEs() to validate
the callee operand's semantics (`OperandType`) and make sure it is not
semantically describing a memory location.

This fix will result in less DW_TAG_call_site and DW_AT_call_target
entries being generated.

There is an existing test in dwarf-callsite-related-attrs.ll that
asserts the broken behavior; remove the broken check, and instead add a
new test dwarf-callsite-related-attrs-indirect.ll that checks behavior
for indirect calls.

The existing test xray-custom-log.ll is validating something even more
broken: It checks the debug info generated by a PATCHABLE_EVENT_CALL.
`TII->getCalleeOperand()` assumes that the first argument of a call
instruction is always the destination, but the first argument of
PATCHABLE_EVENT_CALL is instead the event structure; and so we were
emitting debug info claiming the callee was stored in a register that
actually contains some kind of xray event descriptor, and the test
validates that this happens.
I am breaking and deleting this test.
I guess the intent there might have been to validate that we emit
debuginfo referencing the target of the direct call that LLVM emits
(which we don't do)? But I'm not sure.
2025-08-05 13:25:01 -07:00
Orlando Cazalet-Hyams
5dab1fa1fa [BranchFolding] Follow up #149999 crash fix
fbf6271c7da20356d7b34583b3711b4126ca1dbb introduced an assertion failure as
setDebugValueUndef was called on DBG_LABELs, which isn't allowed and doesn't
make sense. Fix by skipping the call for DBG_LABELs and hoisting, in line with
the original behaviour.
2025-07-29 09:09:58 +01:00
Orlando Cazalet-Hyams
fbf6271c7d Reapply (2) [BranchFolding] Kill common hoisted debug instructions (#149999)
Reapply #140091.

branch-folder hoists common instructions from TBB and FBB into their
pred. Without this patch it achieves this by splicing the instructions from TBB
and deleting the common ones in FBB. That moves the debug locations and debug
instructions from TBB into the pred without modification, which is not
ideal. Debug locations are handled in #140063.

This patch handles debug instructions - in the simplest way possible, which is
to just kill (undef) them. We kill and hoist the ones in FBB as well as TBB
because otherwise the fact there's an assignment on the code path is deleted
(which might lead to a prior location extending further than it should).

There's possibly something we could do to preserve some variable locations in
some cases, but this is the easiest not-incorrect thing to do.

Note I had to replace the constant DBG_VALUEs to use registers in the test- it
turns out setDebugValueUndef doesn't undef constant DBG_VALUEs... which feels
wrong to me, but isn't something I want to touch right now.

---

Fix end-iterator-dereference and add test.
2025-07-28 16:13:35 +01:00
Fangrui Song
dec978036e MachO,test: Test DWARF section's begin symbol
MCObjectFileInfo::initMachOMCObjectFileInfo creates DWARF sections with
a temporary label as the `Begin` symbol, different from other object
file formats' section symbol.

 #150574 caused a regression that removed the label for MCAsmStreamer,
which was caught by no test but a ystem-darwin/target-aarch64 specific diagnostics-dsym.test
2025-07-25 23:28:50 -07:00
Orlando Cazalet-Hyams
1bd7ccd4a5
Revert "[BranchFolding] Kill common hoisted debug instructions" (#150632)
Reverts llvm/llvm-project#149999

https://lab.llvm.org/buildbot/#/builders/139/builds/17622
2025-07-25 16:23:30 +01:00
Orlando Cazalet-Hyams
c1545b68bc
Reapply [BranchFolding] Kill common hoisted debug instructions (#149999)
Reapply #140091.

branch-folder hoists common instructions from TBB and FBB into their
pred. Without this patch it achieves this by splicing the instructions from TBB
and deleting the common ones in FBB. That moves the debug locations and debug
instructions from TBB into the pred without modification, which is not
ideal. Debug locations are handled in #140063.

This patch handles debug instructions - in the simplest way possible, which is
to just kill (undef) them. We kill and hoist the ones in FBB as well as TBB
because otherwise the fact there's an assignment on the code path is deleted
(which might lead to a prior location extending further than it should).

There's possibly something we could do to preserve some variable locations in
some cases, but this is the easiest not-incorrect thing to do.

Note I had to replace the constant DBG_VALUEs to use registers in the test- it
turns out setDebugValueUndef doesn't undef constant DBG_VALUEs... which feels
wrong to me, but isn't something I want to touch right now.
2025-07-25 15:18:12 +01:00
Orlando Cazalet-Hyams
29af8e59fc
Revert "[BranchFolding] Kill common hoisted debug instructions" (#149845)
Reverts llvm/llvm-project#140091 due to crash (see comments for reproducer)
2025-07-21 17:25:05 +01:00
Orlando Cazalet-Hyams
8ba341eec3
[BranchFolding] Kill common hoisted debug instructions (#140091)
branch-folder hoists common instructions from TBB and FBB into their
pred. Without this patch it achieves this by splicing the instructions from TBB
and deleting the common ones in FBB. That moves the debug locations and debug
instructions from TBB into the pred without modification, which is not
ideal. Debug locations are handled in pull request 140063.

This patch handles debug instructions - in the simplest way possible, which is
to just kill (undef) them. We kill and hoist the ones in FBB as well as TBB
because otherwise the fact there's an assignment on the code path is deleted
(which might lead to a prior location extending further than it should).

We might be able to do something smarter to preserve some variable locations in
some cases, but this is the easiest not-incorrect thing to do.
2025-07-21 14:19:33 +01:00
Tom Tromey
468275dc49
Fix AsmWriter to account for dynamic bit offsets (#146704)
PR #141106 changed the debug metadata to allow dynamic bit offsets and
sizes. In that patch, I forgot to update AsmWriter to handle this case.

This patch corrects the oversight.
2025-07-07 15:37:09 -07:00
Fangrui Song
279e808b75 MC: Make mc-dump output compact
Remove unneeded details like "<" and ">". Reduce indentation.
Omit `this` address to simplify output comparison.
Add a -debug-only=mc-dump test.

While here, add fixup printing for MCRelaxableFragment.
2025-06-28 22:31:38 -07:00
David Blaikie
96ed2abadf
[DebugInfo] Specify x86_64 triple for test (#145797)
Most DWARF tests aren't totally architecture portable anyway - so let's
just put this in x86.
2025-06-26 09:08:10 +02:00
Michael Buch
711f6a8603
[llvm][DebugInfo] Encode DW_AT_object_pointer on method declarations with DW_FORM_implicit_const (#124790)
We started attaching `DW_AT_object_pointer`s on method declarations in
https://github.com/llvm/llvm-project/pull/122742. However, that caused
the `.debug_info` section size to increase significantly (by around ~10%
on some projects). This was mainly due to the large number of new
`DW_FORM_ref4` values. This patch tries to address that regression by
changing the `DW_FORM_ref4` to a `DW_FORM_implicit_const` for
declarations. The value of `DW_FORM_implicit_const` will be the *index*
of the object parameter in the list of formal parameters of the
subprogram (i.e., if the first `DW_TAG_formal_parameter` is the object
pointer, the `DW_FORM_implicit_const` would be `0`). The DWARFv5 spec
only mentions the use of the `reference` attribute class to for
`DW_AT_object_pointer`. So using a `DW_FORM_impilicit_const` would be an
extension to (and not something mandated/specified by) the standard.
Though it'd make sense to extend the wording in the spec to allow for
this optimization.

That way we don't pay for the 4 byte references on every attribute
occurrence. In a local build of clang this barely affected the
`.debug_info` section size (but did increase `.debug_abbrev` by up to
10%, which doesn't impact the total debug-info size much however).

We guarded this on LLDB tuning (since using `DW_FORM_implicit_const` for
this purpose may surprise consumers) and DWARFv5 (since that's where
`DW_FORM_implicit_const` was first standardized).
2025-06-16 16:58:00 +01:00
Fangrui Song
28bda77843
Introduce MCAsmInfo::UsesSetToEquateSymbol and prefer = to .set
Introduce MCAsmInfo::UsesSetToEquateSymbol to control the preferred
syntax for symbol equating. We now favor the more readable and common
`symbol = expression` syntax over `.set`. This aligns with pre- https://reviews.llvm.org/D44256 behavior.

On Apple platforms, this resolves a clang -S vs -c behavior difference (resolves #104623).

For targets whose = support is unconfirmed, UsesSetToEquateSymbol is set to false.
This also minimizes test updates.

Pull Request: https://github.com/llvm/llvm-project/pull/142289
2025-06-11 22:19:31 -07:00
Jeremy Morse
df4199c3a4
[DebugInfo] Use correct unit when creating variable across CU boundary (#133282)
When creating a static member DIE, we place it in a potentially
pre-existing context DIE, and that DIE might be located in a different
CU if we're in an LTO context. When we then add the source-file-ID to
the static member DIE, use the correct Unit to do so -- the one that
owns the context DIE. Otherwise we might assign a file-ID from one CU to
another, and there isn't a guarantee that they'll be the same file, or
even exist.

Fixes #109227

(I'd normally remove my home directory from these tests, but in this
circumstances the same-file-but-with-a-different-name nature of the
DIFile is part of the test).
2025-06-05 10:32:17 +01:00
alx32
aab79c41b2
[DebugInfo] Fix issue with debug line table offsets for empty functions (#142253)
This patch addresses an issue where an anonymous DWARF line table symbol
could be inadvertently defined multiple times, leading to an "symbol ''
is already defined" error during assembly or object file emission. This
issue happens for empty functions when
`-emit-func-debug-line-table-offsets` is enabled.

The root cause is the creation of the "end sequence" entry for a DWARF
line table. This entry was sometimes created by copying the last
existing line table entry. If this last entry was a special one (created
for the purpose of marking the position in the line table stream and
having an anonymous symbol attached), the copied end-sequence entry
would also incorrectly reference this same anonymous symbol.
Consequently, when the line table was finally emitted, the DWARF
emission logic would attempt to emit a label for this anonymous symbol
twice, triggering the redefinition error.

The fix ensures that when an end-sequence line table entry is created,
it does not inherit any special stream label from the entry it might
have been based on, thereby preventing the duplicate label emission.
2025-06-03 07:33:51 -07:00
Jeremy Morse
26d9cb17a6
[MC][DebugInfo] Emit linetable entries with known offsets immediately (#134677)
DWARF linetable entries are usually emitted as a sequence of
MCDwarfLineAddrFragment fragments containing the line-number difference
and an MCExpr describing the instruction-range the linetable entry
covers. These then get relaxed during assembly emission.

However, a large number of these instruction-range expressions are
ranges within a fixed MCDataFragment, i.e. a range over fixed-size
instructions that are not subject to relaxation at a later stage. Thus,
we can compute the address-delta immediately, and not spend time and
memory describing that computation so it can be deferred.
2025-05-20 21:26:56 +01:00
Orlando Cazalet-Hyams
4060d38746
[BranchFolding] Merge debug locs on common hoisted code (#140063)
branch-folder hoists common instructions from TBB and FBB into their
pred. Without this patch it achieves this by splicing the instructions from TBB
and deleting the common ones in FBB. That moves the debug locations and debug
instructions from TBB into the pred without modification, which is not
ideal. The merged instructions should get merged debug locations for debugging
and PGO purposes, which is handled in this patch. Debug instructions also need
to be handled differently. That'll come in another patch. This issue was found
by @omern1.
2025-05-20 11:11:45 +01:00
Matthias Braun
675cb70641
Register assembly printer passes (#138348)
Register assembly printer passes in the pass registry.

This makes it possible to use `llc -start-before=<target>-asm-printer ...` in tests.

Adds a `char &ID` parameter to the AssemblyPrinter constructor to allow
targets to use the `INITIALIZE_PASS` macros and register the pass in the
pass registry. This currently has a default parameter so it won't break
any targets that have not been updated.
2025-05-06 18:01:17 -07:00
Vladislav Dzhidzhoev
bea3b9214e
Revert "Revert "[DebugInfo][DWARF] Emit DW_AT_abstract_origin for concrete/inlined DW_TAG_lexical_blocks"" (#137243)
Reverts llvm/llvm-project#137237, as the problem was fixed with
92dc18b6df043d788d77b4a98e5afa3954a44cb0.
2025-04-24 21:49:55 +02:00
David Blaikie
dd9f92c886
Revert "[DebugInfo][DWARF] Emit DW_AT_abstract_origin for concrete/inlined DW_TAG_lexical_blocks" (#137237)
Reverts llvm/llvm-project#136205

Breaks buildbots, probably something about needing to restrict the test
to running on a specific target or the like - I haven't looked closely.

Co-authored-by: Vladislav Dzhidzhoev <dzhidzhoev@gmail.com>
2025-04-24 12:14:51 -07:00
Vladislav Dzhidzhoev
1143a04f34
[DebugInfo][DWARF] Emit DW_AT_abstract_origin for concrete/inlined DW_TAG_lexical_blocks (#136205)
During the discussion under
https://github.com/llvm/llvm-project/pull/119001, it was noticed that
concrete DW_TAG_lexical_blocks should refer to corresponding abstract
DW_TAG_lexical_blocks by having DW_AT_abstract_origin, to avoid
ambiguity. This behavior is implemented in GCC
(https://godbolt.org/z/Khrzdq1Wx), but not in LLVM.

Fixes https://github.com/llvm/llvm-project/issues/49297.
2025-04-24 19:44:18 +02:00
Jeremy Morse
1ebc308bba
[DebugInfo][RemoveDIs] Remove debug-intrinsic printing cmdline options (#131855)
During the transition from debug intrinsics to debug records, we used
several different command line options to customise handling: the
printing of debug records to bitcode and textual could be independent of
how the debug-info was represented inside a module, whether the
autoupgrader ran could be customised. This was all valuable during
development, but now that totally removing debug intrinsics is coming
up, this patch removes those options in favour of a single flag
(experimental-debuginfo-iterators), which enables autoupgrade, in-memory
debug records, and debug record printing to bitcode and textual IR.

We need to do this ahead of removing the
experimental-debuginfo-iterators flag, to reduce the amount of
test-juggling that happens at that time.

There are quite a number of weird test behaviours related to this --
some of which I simply delete in this commit. Things like
print-non-instruction-debug-info.ll , the test suite now checks for
debug records in all tests, and we don't want to check we can print as
intrinsics. Or the update_test_checks tests -- these are duplicated with
write-experimental-debuginfo=false to ensure file writing for intrinsics
is correct, but that's something we're imminently going to delete.

A short survey of curious test changes:
* free-intrinsics.ll: we don't need to test that debug-info is a zero
cost intrinsic, because we won't be using intrinsics in the future.
* undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory
mode while we sorted something out; it works now either way.
* salvage-cast-debug-info.ll: was testing intrinsics-in-memory get
salvaged, isn't necessary now
* localize-constexpr-debuginfo.ll: was producing "dead metadata"
intrinsics for optimised-out variable values, dbg-records takes the
(correct) representation of poison/undef as an operand. Looks like we
didn't update this in the past to avoid spurious test differences.
* Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing
that debug-info affected codegen, and we deferred updating the tests
until now. This is just one of those silent gnochange issues that get
fixed by RemoveDIs.

Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc,
that checks we can autoupgrade debug intrinsics that are in bitcode into
the new debug records.
2025-04-01 14:27:11 +01:00
Jeremy Morse
792a6f8119
[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298)
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.

Nowadays, non-intrinsic format is the default and has been on for more
than a year, there's no need for this flag to exist.

(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
2025-03-14 15:50:49 +00:00
Yaxun (Sam) Liu
da0f9e75d8
Reland: [MC] output inlined-at debug info (#106230) (#130306)
Reland https://github.com/llvm/llvm-project/pull/106230

The original PR was reverted due to compilation time regression.

This PR fixed that by adding a condition OutStreamer->isVerboseAsm() to
the generation of extra inlined-at debug info, so that it does not
affect normal compilation time.

Currently MC print source location of instructions in comments in
assembly when debug info is available, however, it does not include
inlined-at locations when a function is inlined.

For example, function foo is defined in header file a.h and is called
multiple times in b.cpp. If foo is inlined, current assembly will only
show its instructions with their line numbers in a.h. With inlined-at
locations, the assembly will also show where foo is called in b.cpp.

This patch adds inlined-at locations to the comments by using
DebugLoc::print. It makes the printed source location info consistent
with those printed by machine passes.
2025-03-11 09:43:14 -04:00
Nikita Popov
aa1d2cc5d7 Revert "[MC] output inlined-at debug info (#106230)"
This reverts commit f3dc358953a13caf7521fc615a08f6317930351c.

This causes a large compile-time regression:
https://llvm-compile-time-tracker.com/compare.php?from=267403442264959f6b06e227ff450c385f4b3ef2&to=f3dc358953a13caf7521fc615a08f6317930351c&stat=instructions:u
2025-03-07 09:45:15 +01:00
Yaxun (Sam) Liu
f3dc358953
[MC] output inlined-at debug info (#106230)
Currently MC print source location of instructions in comments in
assembly when debug info is available, however, it does not include
inlined-at locations when a function is inlined.

For example, function foo is defined in header file a.h and is called
multiple times in b.cpp. If foo is inlined, current assembly will only
show its instructions with their line numbers in a.h. With inlined-at
locations, the assembly will also show where foo is called in b.cpp.

This patch adds inlined-at locations to the comments by using
DebugLoc::print. It makes the printed source location info consistent
with those printed by machine passes.
2025-03-06 22:47:11 -05:00
Daniel Paoliello
16e051f0b9
[win] NFC: Rename EHCatchret to EHCont to allow for EH Continuation targets that aren't catchret instructions (#129953)
This change splits out the renaming and comment updates from #129612 as a non-functional change.
2025-03-06 09:28:44 -08:00
Pedro Lobo
05589ee455
[Metadata] Replace undef VAMs with poison VAMs (#129450)
`undef` debug info can be replaced with `poison` debug info.
2025-03-03 10:55:41 +01:00
Michael Buch
41f96f91cd
[llvm][DebugInfo] Emit DW_AT_const_value for float non-type template parameters (#127045)
In C++20, non-type template parameters can be float/double. Clang didn't
emit those constants in DWARF. This patch emits floating point constants
the same way we do other integral template value parameters.
2025-02-13 23:08:44 +00:00
David Blaikie
ce96c26cd6
Revert "[llvm][DebugInfo] Attach object-pointer to DISubprogram declarations (#122742)" (#124853)
This introduces a substantial (5-10%) regression in .debug_info size, so
we're discussing alternatives in #122742 and #124790.

This reverts commit 7c729418d721147bf1f2b257afd30f84721888ad.
2025-01-29 15:11:33 +01:00
Stephen Tozer
822f74a911
[Clang] Cleanup docs and comments relating to -fextend-variable-liveness (#124767)
This patch contains a number of changes relating to the above flag;
primarily it updates comment references to the old flag names,
"-fextend-lifetimes" and "-fextend-this-ptr" to refer to the new names,
"-fextend-variable-liveness[={all,this}]". These changes are all NFC.

This patch also removes the explicit -fextend-this-ptr-liveness flag
alias, and shortens the help-text for the main flag; these are both
changes that were meant to be applied in the initial PR (#110000), but
due to some user-error on my part they were not included in the merged
commit.
2025-01-28 18:25:32 +00:00
David Blaikie
42043c423f Reapply "Verifier: Add check for DICompositeType elements being null"
This remove some erroneous debug info from tests that should address the
test failures that showed up when the this was previously committed.

This reverts commit 6716ce8b641f0e42e2343e1694ee578b027be0c4.
2025-01-23 22:29:30 +00:00
Michael Buch
7c729418d7
[llvm][DebugInfo] Attach object-pointer to DISubprogram declarations (#122742)
Currently Clang only attaches `DW_AT_object_pointer` to
`DW_TAG_subprogram` definitions. LLDB constructs C++ method types from
their `DW_TAG_subprogram` declaration, which is also the point at which
it needs to determine whether a method is static or not. LLDB's
heuristic for this could be very simple if we emitted
`DW_AT_object_pointer` on declarations. But since we don't, LLDB has to
guess whether an argument is an implicit object parameter based on the
DW_AT_name and DW_AT_type.

To simplify LLDB's job (and to eventually support C++23's explicit
object parameters), this patch adds the `DIFlagObjectPointer` to
`DISubprogram` declarations.

For reference, GCC attaches the object-pointer DIE to both the
definition and declaration: https://godbolt.org/z/3TWjTfWon

Fixes https://github.com/llvm/llvm-project/issues/120973
2025-01-17 15:27:48 +00:00
Jay Foad
e87f94a6a8
[llvm-project] Fix typos mutli and mutliple. NFC. (#122880) 2025-01-14 11:59:41 +00:00
Alexander Yermolovich
fce0314c38
[LLVM][DWARF] Create debug names entry for non-tu top level DIE (#121856)
When creating a Type Unit (TU), LLVM attempts to do so optimistically.
However, if this fails, it discards the TU state and creates the TU
within the Compilation Unit (CU). In such cases, an entry for the
top-level DIE is not created in the debug names table.
This can cause issues when running llvm-dwarfdump --debug-names
--verify, as the missing entry will result in verification failure.
To address this issue, this patch adds a call to the
updateAcceleratorTables when TU creation fails. This ensures that the
debug names table is updated correctly, even in cases where TU creation
fails.
2025-01-08 17:08:35 -08:00
Fangrui Song
7b23f413d1 MCAsmStreamer: Omit initial ".text"
llvm-mc --assemble prints an initial `.text` from `initSections`.
This is weird for quick assembly tasks that do not specify `.text`.

Omit the .text by moving section directive printing from `changeSection`
to `switchSection`. switchSectionNoPrint now correctly calls the
`changeSection` hook (needed by MachO).

The initial directives of clang -S are now reordered. On ELF targets, we
get `.file "a.c"; .text` instead of `.text; .file "a.c"`.
If there is no function, `.text` will be omitted.
2024-12-22 22:03:44 -08:00
Fangrui Song
133352feb3 [test] Remove redundant -march= when target triple is specified in IR 2024-12-15 12:42:17 -08:00
Bruno De Fraine
20d8f8ca1a
[GlobalOpt] Fix global SRA incorrect alignment on some elements (#115328)
The logic had a flaw where the alignment from the original aggregate is
unintentionally retained for elements when the calculated known
alignment is not higher than the element's ABI type alignment.

Fixes #115282.
2024-11-18 10:49:50 +01:00
Jeremy Morse
b468ed494a Reapply ccddb6ffad1, "Emit a worst-case prologue_end"
In 39b2979a4 Pavel has kindly refined the implementation of a test in such
a way that it doesn't trip up over this patch -- the test wishes to
stimulate LLDBs presentation of line0 locations, rather than wanting to
always step on line-zero on entry to artificial_location.c. As that's what
was tripping up this change, reapply.

Original commit message follows.

[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)

prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.

To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.

This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
2024-11-14 10:30:17 +00:00
alx32
f407dff50c
[DebugInfo][DWARF] Emit Per-Function Line Table Offsets and End Sequences (#110192)
**Summary**

This patch introduces a new compiler option `-mllvm
-emit-func-debug-line-table-offsets` that enables the emission of
per-function line table offsets and end sequences in DWARF debug
information. This enhancement allows tools and debuggers to accurately
attribute line number information to their corresponding functions, even
in scenarios where functions are merged or share the same address space
due to optimizations like Identical Code Folding (ICF) in the linker.

**Background**
RFC: [New DWARF Attribute for Symbolication of Merged
Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)

Previous similar PR:
[#93137](https://github.com/llvm/llvm-project/pull/93137) – This PR was
very similar to the current one but at the time, the assembler had no
support for emitting labels within the line table. That support was
added in PR [#99710](https://github.com/llvm/llvm-project/pull/99710) -
and in this PR we use some of the support added in the assembler PR.

In the current implementation, Clang generates line information in the
`debug_line` section without directly associating line entries with
their originating `DW_TAG_subprogram` DIEs. This can lead to issues when
post-compilation optimizations merge functions, resulting in overlapping
address ranges and ambiguous line information.

For example, when functions are merged by ICF in LLD, multiple functions
may end up sharing the same address range. Without explicit linkage
between functions and their line entries, tools cannot accurately
attribute line information to the correct function, adversely affecting
debugging and call stack resolution.


**Implementation Details**
To address the above issue, the patch makes the following key changes:

**`DW_AT_LLVM_stmt_sequence` Attribute**: Introduces a new LLVM-specific
attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE.
This attribute holds a label pointing to the offset in the line table
where the function's line entries begin.

**End-of-Sequence Markers**: Emits an explicit DW_LNE_end_sequence after
each function's line entries in the line table. This marks the end of
the line information for that function, ensuring that line entries are
correctly delimited.

**Assembler and Streamer Modifications**: Modifies the MCStreamer and
related classes to support emitting the necessary labels and tracking
the current function's line entries. A new flag
GenerateFuncLineTableOffsets is added to control this behavior.

**Compiler Option**: Introduces the `-mllvm
-emit-func-debug-line-table-offsets` option to enable this
functionality, allowing users to opt-in as needed.
2024-11-13 18:51:34 -08:00
Jeremy Morse
ccddb6ffad Revert "[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)"
This reverts commit bf483ddb42065405e345393e022dc72357ec5a3a.

See PR, there's a test testing for this behaviour (possibly adaptable), and
a duplicate line entry too
2024-11-12 17:07:56 +00:00
Stephen Tozer
fe18ab983d
[DebugInfo] Don't apply is_stmt on MBB branches that preserve lines (#108251)
This patch follows on from the changes made in #105524, by adding an
additional heuristic that prevents us from applying the start-of-MBB
is_stmt flag when we can see that, for all direct branches to the MBB,
the last line stepped on before the branch is the same as the first line
of the MBB. This is mainly to prevent certain pathological cases, such
as macros that expand to multiple basic blocks that all have the same
source location, from giving us repeated steps on the same line. This
approach is not comprehensive, since it relies on analyzeBranch to read
edges, but the default fallback of applying is_stmt may lead only to
useless steps in some cases, rather than skipping useful steps
altogether.
2024-11-12 16:23:35 +00:00
Jeremy Morse
bf483ddb42
[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)
prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.

To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.

This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
2024-11-12 15:09:40 +00:00
Lee Wei
1469d82e1c
Remove br i1 undef from some regression tests [NFC] (#115130)
As defined in LangRef, branching on `undef` is undefined behavior.
This PR aims to remove undefined behavior from tests. As UB tests break
Alive2 and may be the root cause of breaking future optimizations.

Here's an Alive2 proof for one of the examples:
https://alive2.llvm.org/ce/z/TncxhP
2024-11-07 08:11:15 +00:00
Sriraman Tallam
c7ef002bc6
Fix performance bug in buildLocationList (#109343)
In buildLocationList, with basic block sections, we iterate over
every basic block twice to detect section start and end. This is
sub-optimal and shows up as significantly time consuming when
compiling large functions.

This patch uses the set of sections already stored in MBBSectionRanges
and iterates over sections rather than basic blocks.

When detecting if loclists can be merged, the end label of an entry is
matched with the beginning label of the next entry. For the section
corresponding to the entry basic block, this is skipped. This is
because the loc list uses the end label corresponding to the function
whereas the MBBSectionRanges map uses the function end label.

For example:

.Lfunc_begin0:
.file
.loc 0 4 0 # ex2.cc:4:0
.cfi_startproc
.Ltmp0:
.loc 0 8 5 prologue_end # ex2.cc:8:5
....
.LBB_END0_0:
.cfi_endproc
.section .text._Z4testv,"ax",@progbits,unique,1
...
.Lfunc_end0:
.size _Z4testv, .Lfunc_end0-_Z4testv

The debug loc uses ".LBB_END0_0" for the end of the section whereas
MBBSectionRanges uses ".Lfunc_end0".

It is alright to skip this as we already check the section corresponding
to the debugloc entry.

Added a new test case to check that if this works correctly when the
variable's value is mutated in the entry section.
2024-10-31 09:00:25 -07:00
David Stenberg
97861981cc [LiveDebugVariables] Fix a DBG_VALUE reordering issue (#111124)
LDV could reorder reinserted fragment and non-fragment debug values for
the same variable (compared to the input order), potentially resulting
in stale values being presented.

For example, before:

  DBG_VALUE 1001, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 0, 16)
  DBG_VALUE 1002, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 16, 16)
  DBG_VALUE %0, $noreg, !13, !DIExpression()

After (without this patch):

  DBG_VALUE %stack.0, 0, !13, !DIExpression()
  DBG_VALUE 1002, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 16, 16)
  DBG_VALUE 1001, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 0, 16)

It would also reorder DBG_VALUEs for different variables. Although that
does not matter for the debug information output, it resulted in some
noise in before/after pass diffs.

This should hopefully align so that instruction referencing and
DBG_VALUE emit debug instructions in the same order (see the
sdag-salvage-add.ll change).
2024-10-15 11:36:24 +02:00
Jeremy Morse
e6bf48d110
[X86] Don't request 0x90 nop filling in p2align directives (#110134)
As of rev ea222be0d, LLVMs assembler will actually try to honour the
"fill value" part of p2align directives. X86 printed these as 0x90, which
isn't actually what it wanted: we want multi-byte nops for .text
padding. Compiling via a textual assembly file produces single-byte
nop padding since ea222be0d but the built-in assembler will produce
multi-byte nops. This divergent behaviour is undesirable.

To fix: don't set the byte padding field for x86, which allows the
assembler to pick multi-byte nops. Test that we get the same multi-byte
padding when compiled via textual assembly or directly to object file.
Added same-align-bytes-with-llasm-llobj.ll to that effect, updated
numerous other tests to not contain check-lines for the explicit padding.
2024-10-02 11:14:05 +01:00
Stephen Tozer
51a29b5f16 Revert2 "[DebugInfo][DWARF] Set is_stmt on first non-line-0 instruction in BB (#105524)"
Reverted due to large .debug_line size regressions for some
configurations; work currently in place to improve the output of this
behaviour in PR #108251.

This patch also modifies two tests that were created or modified after
the original commit landed and are affected by the revert:

  llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
  llvm/test/DebugInfo/X86/empty-line-info.ll

This reverts commit 5fef40c2c477e92187bd4e5c18091eca6b8465cc.
2024-09-17 18:29:20 +01:00