When creating a Type Unit (TU), LLVM attempts to do so optimistically.
However, if this fails, it discards the TU state and creates the TU
within the Compilation Unit (CU). In such cases, an entry for the
top-level DIE is not created in the debug names table.
This can cause issues when running llvm-dwarfdump --debug-names
--verify, as the missing entry will result in verification failure.
To address this issue, this patch adds a call to the
updateAcceleratorTables when TU creation fails. This ensures that the
debug names table is updated correctly, even in cases where TU creation
fails.
the `ptx_kernel` calling convention is a more idiomatic and standard way
of specifying a NVPTX kernel than using the metadata which is not
supposed to change the meaning of the program. Further, checking the
calling convention is significantly faster than traversing the metadata,
improving compile time.
This change updates the clang and mlir frontends as well as the
NVPTXCtorDtorLowering pass to emit kernels using the calling convention.
In addition, this updates all NVPTX unit tests to use the calling
convention as well.
llvm-mc --assemble prints an initial `.text` from `initSections`.
This is weird for quick assembly tasks that do not specify `.text`.
Omit the .text by moving section directive printing from `changeSection`
to `switchSection`. switchSectionNoPrint now correctly calls the
`changeSection` hook (needed by MachO).
The initial directives of clang -S are now reordered. On ELF targets, we
get `.file "a.c"; .text` instead of `.text; .file "a.c"`.
If there is no function, `.text` will be omitted.
Similar to 806761a7629df268c8aed49657aeccffa6bca449
-mtriple= specifies the full target triple while -march= merely sets the
architecture part of the default target triple (e.g. Windows, macOS).
Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as these MIR tests do not
utilize object file format specific detail, but it's good to change
these tests to neighbor files that use -mtriple=x86_64
The optimiser will produce empty blocks that are unconditionally
executed according to the CFG -- while it may not be meaningful code,
and won't get a prologue_end position, we need to not crash on this
input.
The fault comes from assuming that there's always a next block with some
instructions in it, that will eventually produce some meaningful control
flow to stop at -- in the given reproducer in issue #117206 this isn't
true, because the function terminates with `unreachable`. Thus, I've
refactored the "get next instruction logic" into a helper that'll step
through all blocks and terminate if there aren't any more.
Reproducer from aeubanks
Include a regular expression in the 'REQUIRES' clause, to run
the test on all matching targets (x86_64 *linux*).
The original patch restricted to test just to 'x86_64-linux'
https://github.com/llvm/llvm-project/pull/116327
The logic had a flaw where the alignment from the original aggregate is
unintentionally retained for elements when the calculated known
alignment is not higher than the element's ABI type alignment.
Fixes#115282.
Consider the case when the compiler generates a static member. Any
consumer of the debug info generated for that case, would benefit if
that member has the DW_AT_artificial flag.
Fix buildbot failure on: llvm-clang-aarch64-darwin
- Add specific configuration: x86_64-linux
Consider the case when the compiler generates a static member. Any
consumer of the debug info generated for that case, would benefit if
that member has the DW_AT_artificial flag.
In 39b2979a4 Pavel has kindly refined the implementation of a test in such
a way that it doesn't trip up over this patch -- the test wishes to
stimulate LLDBs presentation of line0 locations, rather than wanting to
always step on line-zero on entry to artificial_location.c. As that's what
was tripping up this change, reapply.
Original commit message follows.
[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)
prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.
To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.
This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
**Summary**
This patch introduces a new compiler option `-mllvm
-emit-func-debug-line-table-offsets` that enables the emission of
per-function line table offsets and end sequences in DWARF debug
information. This enhancement allows tools and debuggers to accurately
attribute line number information to their corresponding functions, even
in scenarios where functions are merged or share the same address space
due to optimizations like Identical Code Folding (ICF) in the linker.
**Background**
RFC: [New DWARF Attribute for Symbolication of Merged
Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)
Previous similar PR:
[#93137](https://github.com/llvm/llvm-project/pull/93137) – This PR was
very similar to the current one but at the time, the assembler had no
support for emitting labels within the line table. That support was
added in PR [#99710](https://github.com/llvm/llvm-project/pull/99710) -
and in this PR we use some of the support added in the assembler PR.
In the current implementation, Clang generates line information in the
`debug_line` section without directly associating line entries with
their originating `DW_TAG_subprogram` DIEs. This can lead to issues when
post-compilation optimizations merge functions, resulting in overlapping
address ranges and ambiguous line information.
For example, when functions are merged by ICF in LLD, multiple functions
may end up sharing the same address range. Without explicit linkage
between functions and their line entries, tools cannot accurately
attribute line information to the correct function, adversely affecting
debugging and call stack resolution.
**Implementation Details**
To address the above issue, the patch makes the following key changes:
**`DW_AT_LLVM_stmt_sequence` Attribute**: Introduces a new LLVM-specific
attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE.
This attribute holds a label pointing to the offset in the line table
where the function's line entries begin.
**End-of-Sequence Markers**: Emits an explicit DW_LNE_end_sequence after
each function's line entries in the line table. This marks the end of
the line information for that function, ensuring that line entries are
correctly delimited.
**Assembler and Streamer Modifications**: Modifies the MCStreamer and
related classes to support emitting the necessary labels and tracking
the current function's line entries. A new flag
GenerateFuncLineTableOffsets is added to control this behavior.
**Compiler Option**: Introduces the `-mllvm
-emit-func-debug-line-table-offsets` option to enable this
functionality, allowing users to opt-in as needed.
Add a specification attribute to LLVM DebugInfo, which is analogous
to DWARF's DW_AT_specification. According to the DWARF spec:
"A debugging information entry that represents a declaration that
completes another (earlier) non-defining declaration may have a
DW_AT_specification attribute whose value is a reference to the
debugging information entry representing the non-defining declaration."
This patch allows types to be specifications of other types. This is
used by Swift to represent generic types. For example, given this Swift
program:
```
struct MyStruct<T> {
let t: T
}
let variable = MyStruct<Int>(t: 43)
```
The Swift compiler emits (roughly) an unsubtituted type for MyStruct<T>:
```
DW_TAG_structure_type
DW_AT_name ("MyStruct")
// "$s1w8MyStructVyxGD" is a Swift mangled name roughly equivalent to
// MyStruct<T>
DW_AT_linkage_name ("$s1w8MyStructVyxGD")
// other attributes here
```
And a specification for MyStruct<Int>:
```
DW_TAG_structure_type
DW_AT_specification (<link to "MyStruct">)
// "$s1w8MyStructVySiGD" is a Swift mangled name equivalent to
// MyStruct<Int>
DW_AT_linkage_name ("$s1w8MyStructVySiGD")
DW_AT_byte_size (0x08)
// other attributes here
```
This reverts commit bf483ddb42065405e345393e022dc72357ec5a3a.
See PR, there's a test testing for this behaviour (possibly adaptable), and
a duplicate line entry too
This patch follows on from the changes made in #105524, by adding an
additional heuristic that prevents us from applying the start-of-MBB
is_stmt flag when we can see that, for all direct branches to the MBB,
the last line stepped on before the branch is the same as the first line
of the MBB. This is mainly to prevent certain pathological cases, such
as macros that expand to multiple basic blocks that all have the same
source location, from giving us repeated steps on the same line. This
approach is not comprehensive, since it relies on analyzeBranch to read
edges, but the default fallback of applying is_stmt may lead only to
useless steps in some cases, rather than skipping useful steps
altogether.
prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.
To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.
This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
As defined in LangRef, branching on `undef` is undefined behavior.
This PR aims to remove undefined behavior from tests. As UB tests break
Alive2 and may be the root cause of breaking future optimizations.
Here's an Alive2 proof for one of the examples:
https://alive2.llvm.org/ce/z/TncxhP
An extra inhabitant is a bit pattern that does not represent a valid
value for instances of a given type. The number of extra inhabitants is
the number of those bit configurations.
This is used by Swift to save space when composing types. For example,
because Bool only needs 2 bit patterns to represent all of its values
(true and false), an Optional<Bool> only occupies 1 byte in memory by
using a bit configuration that is unused by Bool. Which bit patterns are
unused are part of the ABI of the language.
Since Swift generics are not monomorphized, by using dynamic libraries
you can have generic types whose size, alignment, etc, are known only
at runtime (which is why this feature is needed).
This patch adds num_extra_inhabitants to LLVM-IR debug info and in DWARF
as an Apple extension.
Utilize common API in PPCTargetParser
(https://github.com/llvm/llvm-project/pull/97541) to set default CPU
with same interfaces for LLC.
This will update AIX default CPU to pwr7 and LoP powerppc64 default CPU
to ppc64.
In buildLocationList, with basic block sections, we iterate over
every basic block twice to detect section start and end. This is
sub-optimal and shows up as significantly time consuming when
compiling large functions.
This patch uses the set of sections already stored in MBBSectionRanges
and iterates over sections rather than basic blocks.
When detecting if loclists can be merged, the end label of an entry is
matched with the beginning label of the next entry. For the section
corresponding to the entry basic block, this is skipped. This is
because the loc list uses the end label corresponding to the function
whereas the MBBSectionRanges map uses the function end label.
For example:
.Lfunc_begin0:
.file
.loc 0 4 0 # ex2.cc:4:0
.cfi_startproc
.Ltmp0:
.loc 0 8 5 prologue_end # ex2.cc:8:5
....
.LBB_END0_0:
.cfi_endproc
.section .text._Z4testv,"ax",@progbits,unique,1
...
.Lfunc_end0:
.size _Z4testv, .Lfunc_end0-_Z4testv
The debug loc uses ".LBB_END0_0" for the end of the section whereas
MBBSectionRanges uses ".Lfunc_end0".
It is alright to skip this as we already check the section corresponding
to the debugloc entry.
Added a new test case to check that if this works correctly when the
variable's value is mutated in the entry section.
Enable specialization on literal constant arguments by default in
Function Specialization.
---------
Co-authored-by: Alexandros Lamprineas <alexandros.lamprineas@arm.com>
Both are based on MachineLICMBase, and the functionality there is
"switched" based on a PreRegAlloc flag. This commit is simply about
trusting the original value of that flag, defined by the `MachineLICM`
and `EarlyMachineLICM` classes.
The `PreRegAlloc` flag used to be overwritten it based on MRI.isSSA(),
which is un-reliable due to how it is inferred by the MIRParser. I see
that we can now define isSSA in MIR (thanks @gargaroff ), meaning the
fix isn’t really needed anymore, but redefining that flag still feels
wrong.
Note that I'm looking into upstreaming more changes to MachineLICM, see
[the discourse
thread](https://discourse.llvm.org/t/extending-post-regalloc-machinelicm/82725).
Store Swift mangled names in DW_AT_linkage_name. The Swift compiler
emits only the type mangled name in debug information, and LLDB uses
those mangled names as keys to look up size, alignment, fields, etc
from either reflection metadata or Swift modules.
Additionally, emit types linkage names for types into the accelerator
table if they exist and they're different from the display name.
Until now debug info was printing the symbols names as-is and that
resulted in invalid PTX when the symbols contained characters that are
invalid for PTX. E.g. `__PRETTY_FUNCTION.something`
Debug info is somewhat disconnected from the symbols themselves, so the
regular "NVPTXAssignValidGlobalNames" pass can't easily fix them.
As the "plan B" this patch catches printout of debug symbols and fixes
them, as needed. One gotcha is that the same code path is used to print
the names of debug info sections. Those section names do start with a
'.debug'. The dot in those names is nominally illegal in PTX, but the
debug section names with a dot are accepted as a special case. The
downside of this change is that if someone ever has a `.debug*` symbol
that needs to be referred to from the debug info, that label will be
passed through as-is, and will still produce broken PTX output. If/when
we run into a case where we need it to work, we could consider only
passing through specific debug section names, or add a mechanism
allowing us to tell section names apart from regular symbols.
Fixes#58491
LDV could reorder reinserted fragment and non-fragment debug values for
the same variable (compared to the input order), potentially resulting
in stale values being presented.
For example, before:
DBG_VALUE 1001, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 0, 16)
DBG_VALUE 1002, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 16, 16)
DBG_VALUE %0, $noreg, !13, !DIExpression()
After (without this patch):
DBG_VALUE %stack.0, 0, !13, !DIExpression()
DBG_VALUE 1002, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 16, 16)
DBG_VALUE 1001, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 0, 16)
It would also reorder DBG_VALUEs for different variables. Although that
does not matter for the debug information output, it resulted in some
noise in before/after pass diffs.
This should hopefully align so that instruction referencing and
DBG_VALUE emit debug instructions in the same order (see the
sdag-salvage-add.ll change).
Added .setMIFlag(MachineInstr::FrameSetup) to all BuildMI calls in
HexagonFrameLowering::insertAllocframe. This change ensures that the
test sugared-constants.ll passes upstream by correctly marking
instructions as part of the frame setup.
This patch handles virtual registers in the VarLocBasedImpl of the
LiveDebugVariables pass, allowing it to be used on architectures that
depend on virtual registers in debugging, like NVPTX. It enables the
pass for NVPTX.
Enable .debug_loc section for NVPTX backend.
This commit makes NVPTX omit DW_AT_low_pc (and DW_AT_high_pc) for
DW_TAG_compile_unit. This is because cuda-gdb uses the compile unit's
low_pc as a base address, and adds the addresses in the debug_loc
section to it. Removing low_pc is equivalent to setting that base
address to zero, so addition doesn't break the location ranges.
Additionally, this patch forces debug_loc label emission to emit single
labels with no subtraction or base. This would not be necessary if we
could emit `label1 - label2` expressions in PTX. The PTX documentation
at
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#debugging-directives-section
makes it seem like this is supported, but it doesn't actually work. I
believe when that documentation says that you can subtract “label
addresses between labels in the same dwarf section”, it doesn't merely
mean that the labels need to be in the same section as each other, but
in fact they need to be in the same section as the use. If support for
label subtraction is supported such that in the debug_loc section you
can subtract labels from the main code section, then we can remove the
workarounds added in this PR.
Also, since this now emits valid .debug_loc sections, it replaces the
empty .debug_loc to force existence of at least one debug section with
an empty .debug_macinfo section, which matches what nvcc does.
As of rev ea222be0d, LLVMs assembler will actually try to honour the
"fill value" part of p2align directives. X86 printed these as 0x90, which
isn't actually what it wanted: we want multi-byte nops for .text
padding. Compiling via a textual assembly file produces single-byte
nop padding since ea222be0d but the built-in assembler will produce
multi-byte nops. This divergent behaviour is undesirable.
To fix: don't set the byte padding field for x86, which allows the
assembler to pick multi-byte nops. Test that we get the same multi-byte
padding when compiled via textual assembly or directly to object file.
Added same-align-bytes-with-llasm-llobj.ll to that effect, updated
numerous other tests to not contain check-lines for the explicit padding.
Remove the following intrinsics which can be trivially replaced with an
`addrspacecast`
* llvm.nvvm.ptr.gen.to.global
* llvm.nvvm.ptr.gen.to.shared
* llvm.nvvm.ptr.gen.to.constant
* llvm.nvvm.ptr.gen.to.local
* llvm.nvvm.ptr.global.to.gen
* llvm.nvvm.ptr.shared.to.gen
* llvm.nvvm.ptr.constant.to.gen
* llvm.nvvm.ptr.local.to.gen
Also, cleanup the NVPTX lowering of `addrspacecast` making it more
concise.
This was reverted to avoid conflicts while reverting #107655. Re-landing
unchanged.
This is the final piece to enable register debugging for variables in
registers that have single locations that last throughout their
enclosing scope.
The next step after this for supporting register debugging for NVPTX is
to support the .debug_loc section.
Stacked on top of: https://github.com/llvm/llvm-project/pull/109495
This patch adds support for encoding PTX registers for DWARF, using the
encoding supported by nvcc and cuda-gcc.
There are some other features still needed for proper register debugging
that this patch does not address, such as DW_AT_address_class.
This PR is stacked on: https://github.com/llvm/llvm-project/pull/109494
When emitting debug info for code alignment, it was possible to emit a
.loc directive with a file number of zero, which is invalid for DWARF 4
and earlier. This happened because getCurrentDwarfLoc() returned a
zero-initialised value when there hadn't been a previous .loc directive
emitted.
---------
Co-authored-by: Paul T Robinson <paul.robinson@sony.com>
Remove the following intrinsics which can be trivially replaced with an
`addrspacecast`
* llvm.nvvm.ptr.gen.to.global
* llvm.nvvm.ptr.gen.to.shared
* llvm.nvvm.ptr.gen.to.constant
* llvm.nvvm.ptr.gen.to.local
* llvm.nvvm.ptr.global.to.gen
* llvm.nvvm.ptr.shared.to.gen
* llvm.nvvm.ptr.constant.to.gen
* llvm.nvvm.ptr.local.to.gen
Also, cleanup the NVPTX lowering of `addrspacecast` making it more
concise.
Same fix was provided for AIX in commit
704da919bafa5b088223f9d77424f24ae754539e.
The issue is unsupported DWARF 5 section with the following assertion:
`Assertion failed: Section && "Cannot switch to a null section!", file:
llvm/lib/MC/MCStreamer.cpp, line: 1266 `