5511 Commits

Author SHA1 Message Date
Mingming Liu
a6aa9365f7
[NFC][AsmPrinter] Pass MJTI by const reference instead of const pointer (#122365)
The caller `AsmPrinter::emitJumpTableInfo` checks [1] `MJTI` is not a
null pointer before calling `emitJumpTableEntry` or
`emitJumpTableSizesSection`.

This patch updates callee function's signature to accept const
reference, this way it's explicit `MJTI` won't be nullptr inside the
callee.

[1]
9d5299eb61/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (L2857)
2025-01-09 15:57:32 -08:00
Hubert Tong
e438513f2e
[AIX][AsmPrinter] Fix unsigned subtraction wrap-around (#122214)
Unsigned subtraction wrap-around occurs in `emitGlobalConstantImpl` on
an AIX-specific code path from 8e4423eb0888 when a structure type has
zero elements.

With assertions enabled, this manifests as:
```
TypeSize llvm::StructLayout::getElementOffset(unsigned int) const: Assertion `Idx < NumElements && "Invalid element idx!"' failed.
```
2025-01-09 00:07:57 -04:00
Alexander Yermolovich
fce0314c38
[LLVM][DWARF] Create debug names entry for non-tu top level DIE (#121856)
When creating a Type Unit (TU), LLVM attempts to do so optimistically.
However, if this fails, it discards the TU state and creates the TU
within the Compilation Unit (CU). In such cases, an entry for the
top-level DIE is not created in the debug names table.
This can cause issues when running llvm-dwarfdump --debug-names
--verify, as the missing entry will result in verification failure.
To address this issue, this patch adds a call to the
updateAcceleratorTables when TU creation fails. This ensures that the
debug names table is updated correctly, even in cases where TU creation
fails.
2025-01-08 17:08:35 -08:00
Fangrui Song
25bb6592c9 MCAsmInfo: replace AIX-specific variables with IsAIX
AIX assembly is very different from the gas syntax. We don't expect
other targets to share these differences. Unify the numerous,
essentially AIX-specific variables.
2024-12-24 22:46:13 -08:00
Fangrui Song
56600c11ad MCAsmInfo: replace HLASM-specific variables with IsHLASM
HLASM is very different from the gas syntax. We don't expect other
targets to customize the differences. Unify the numerous variables.
2024-12-24 18:37:46 -08:00
Fangrui Song
7b23f413d1 MCAsmStreamer: Omit initial ".text"
llvm-mc --assemble prints an initial `.text` from `initSections`.
This is weird for quick assembly tasks that do not specify `.text`.

Omit the .text by moving section directive printing from `changeSection`
to `switchSection`. switchSectionNoPrint now correctly calls the
`changeSection` hook (needed by MachO).

The initial directives of clang -S are now reordered. On ELF targets, we
get `.file "a.c"; .text` instead of `.text; .file "a.c"`.
If there is no function, `.text` will be omitted.
2024-12-22 22:03:44 -08:00
Paul Walker
3146911eb0
[LLVM][AsmPrinter] Add vector ConstantInt/FP support to emitGlobalConstantImpl. (#120077)
The fixes a failure path for fixed length vector globals when
ConstantInt/FP is used to represent splats instead of
ConstantDataVector.
2024-12-18 11:51:01 +00:00
Florian Mayer
d9703501b0
[MTE] [NFC] use vector to collect globals to tag (#120283)
The same pattern caused test failures in the HWASan pass, so is brittle.
Let's go for the easier approach.
2024-12-18 00:38:19 -08:00
Florian Mayer
514580b438
[MTE] Apply alignment / size in AsmPrinter rather than IR (#111918)
This makes sure no optimizations are applied that assume the
bigger alignment or size, which could be incorrect if we link
together with non-instrumented code.
2024-12-17 00:47:02 -08:00
Daniil Kovalev
f65a21a4ec
[PAC][ELF][AArch64] Support signed personality function pointer (#119361)
Re-apply #113148 after revert in #119331

If function pointer signing is enabled, sign personality function
pointer stored in `.DW.ref.__gxx_personality_v0` section with IA key,
0x7EAD = `ptrauth_string_discriminator("personality")` constant
discriminator and address diversity enabled.
2024-12-16 10:24:09 +03:00
paperchalice
1562b70eaf
Reapply "[DomTreeUpdater] Move critical edge splitting code to updater" (#119547)
This relands commit #115111.
Use traditional way to update post dominator tree, i.e. break critical
edge splitting into insert, insert, delete sequence.
When splitting critical edges, the post dominator tree may change its
root node, and `setNewRoot` only works in normal dominator tree...
See

6c7e5827ed/llvm/include/llvm/Support/GenericDomTree.h (L684-L687)
2024-12-13 11:43:09 +08:00
Matt Arsenault
ea632e1b34
Reapply "DiagnosticInfo: Clean up usage of DiagnosticInfoInlineAsm" (#119575) (#119634)
This reverts commit 40986feda8b1437ed475b144d5b9a208b008782a.

Reapply with fix to prevent temporary Twine from going out of scope.
2024-12-11 16:01:48 -08:00
Vitaly Buka
40986feda8
Revert "DiagnosticInfo: Clean up usage of DiagnosticInfoInlineAsm" (#119575)
Reverts llvm/llvm-project#119485

Breaks builders, details in llvm/llvm-project#119485
2024-12-11 07:51:36 -08:00
Matt Arsenault
884f2ad6f9
DiagnosticInfo: Clean up usage of DiagnosticInfoInlineAsm (#119485)
Currently LLVMContext::emitError emits any error as an "inline asm"
error which does not make any sense. InlineAsm appears to be special,
in that it uses a "LocCookie" from srcloc metadata, which looks like
a parallel mechanism to ordinary source line locations. This meant
that other types of failures had degraded source information reported
when available.

Introduce some new generic error types, and only use inline asm
in the appropriate contexts. The DiagnosticInfo types are still
a bit of a mess, and I'm not sure why DiagnosticInfoWithLocationBase
exists instead of just having an optional DiagnosticLocation in the
base class.

DK_Generic is for any error that derives from an IR level instruction,
and thus can pull debug locations directly from it. DK_GenericWithLoc
is functionally the generic codegen error, since it does not depend
on the IR and instead can construct a DiagnosticLocation from the
MI debug location.
2024-12-11 17:16:07 +09:00
paperchalice
553058f825
Revert "[DomTreeUpdater] Move critical edge splitting code to updater" (#119512)
Reverts llvm/llvm-project#115111 Causes #119511
2024-12-11 14:25:17 +08:00
paperchalice
79047fac65
[DomTreeUpdater] Move critical edge splitting code to updater (#115111)
Support critical edge splitting in dominator tree updater. Continue the
work in #100856.

Compile time check:
https://llvm-compile-time-tracker.com/compare.php?from=87c35d782795b54911b3e3a91a5b738d4d870e55&to=42b3e5623a9ab4c3648564dc0926b36f3b438a3a&stat=instructions%3Au
2024-12-11 11:31:42 +08:00
Daniil Kovalev
ef2e590e7b
Revert "[PAC][ELF][AArch64] Support signed personality function pointer" (#119331)
Reverts llvm/llvm-project#113148

See buildbot failure
https://lab.llvm.org/buildbot/#/builders/190/builds/11048
2024-12-10 09:12:25 +03:00
Daniil Kovalev
4fb1cda660
[PAC][ELF][AArch64] Support signed personality function pointer (#113148)
If function pointer signing is enabled, sign personality function
pointer stored in `.DW.ref.__gxx_personality_v0` section with IA key,
0x7EAD = `ptrauth_string_discriminator("personality")` constant
discriminator and address diversity enabled.
2024-12-10 08:48:09 +03:00
Jeremy Morse
624e52b1e3
[DebugInfo] Handle trailing empty blocks when seeking prologue_end spot (#117320)
The optimiser will produce empty blocks that are unconditionally
executed according to the CFG -- while it may not be meaningful code,
and won't get a prologue_end position, we need to not crash on this
input.

The fault comes from assuming that there's always a next block with some
instructions in it, that will eventually produce some meaningful control
flow to stop at -- in the given reproducer in issue #117206 this isn't
true, because the function terminates with `unreachable`. Thus, I've
refactored the "get next instruction logic" into a helper that'll step
through all blocks and terminate if there aren't any more.

Reproducer from aeubanks
2024-11-26 14:24:25 +00:00
Lei Wang
cf83a7fdc2
[SHT_LLVM_BB_ADDR_MAP] Add an option to skip emitting bb entries (#114447)
Sometimes we want to use a `PgoAnalysisMap` feature that doesn't require
the BB entries info, e.g. only the `FuncEntryCount`, but the BB entries
is emitted by default, so I'm adding an option to skip the info for this
case to save the binary size(can save ~90% size of the section). For
implementation, it extends a new field(`OmitBBEntries`) in
`BBAddrMap::Features` for this and it's controlled by a switch
`--basic-block-address-map-skip-bb-entries`.

Note that this naturally supports backwards compatibility as the field
is zero for the old version, matches the decoding in the new version
llvm.
2024-11-22 11:51:34 -08:00
Simon Pilgrim
e2368afbd0 Fix GCC Wparentheses warning in assert condition / message. NFC. 2024-11-20 17:02:23 +00:00
Zaara Syeda
8e4423eb08
[AsmPrinter] Fix handling in emitGlobalConstantImpl for AIX (#116255)
When GlobalMerge creates a MergedGlobal of statics all initialized to
zero, emitGlobalConstantImpl sees a ConstantAggregateZero. This results
in just emitting zeros followed by labels for the aliases. We need to
handle it more like how emitGlobalConstantStruct does by emitting each
global inside the aggregate.

---------

Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
2024-11-19 09:58:25 -05:00
Carlos Alberto Enciso
c2a9bba4a3
[DebugInfo] Add DW_AT_artificial for compiler generated static member. (#115851)
Consider the case when the compiler generates a static member. Any
consumer of the debug info generated for that case, would benefit if
that member has the DW_AT_artificial flag.
2024-11-15 05:16:12 +00:00
Jeremy Morse
251958f357 [DebugInfo] Don't pick prologue_end if there are no instructions
Add a filter to avoid picking prologue_end when a function is empty (it may
have blocks but no instructions). This saves us from pushing more
validity-checking into findPrologueEndLoc.
2024-11-14 13:41:50 +00:00
Jeremy Morse
b468ed494a Reapply ccddb6ffad1, "Emit a worst-case prologue_end"
In 39b2979a4 Pavel has kindly refined the implementation of a test in such
a way that it doesn't trip up over this patch -- the test wishes to
stimulate LLDBs presentation of line0 locations, rather than wanting to
always step on line-zero on entry to artificial_location.c. As that's what
was tripping up this change, reapply.

Original commit message follows.

[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)

prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.

To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.

This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
2024-11-14 10:30:17 +00:00
alx32
f407dff50c
[DebugInfo][DWARF] Emit Per-Function Line Table Offsets and End Sequences (#110192)
**Summary**

This patch introduces a new compiler option `-mllvm
-emit-func-debug-line-table-offsets` that enables the emission of
per-function line table offsets and end sequences in DWARF debug
information. This enhancement allows tools and debuggers to accurately
attribute line number information to their corresponding functions, even
in scenarios where functions are merged or share the same address space
due to optimizations like Identical Code Folding (ICF) in the linker.

**Background**
RFC: [New DWARF Attribute for Symbolication of Merged
Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)

Previous similar PR:
[#93137](https://github.com/llvm/llvm-project/pull/93137) – This PR was
very similar to the current one but at the time, the assembler had no
support for emitting labels within the line table. That support was
added in PR [#99710](https://github.com/llvm/llvm-project/pull/99710) -
and in this PR we use some of the support added in the assembler PR.

In the current implementation, Clang generates line information in the
`debug_line` section without directly associating line entries with
their originating `DW_TAG_subprogram` DIEs. This can lead to issues when
post-compilation optimizations merge functions, resulting in overlapping
address ranges and ambiguous line information.

For example, when functions are merged by ICF in LLD, multiple functions
may end up sharing the same address range. Without explicit linkage
between functions and their line entries, tools cannot accurately
attribute line information to the correct function, adversely affecting
debugging and call stack resolution.


**Implementation Details**
To address the above issue, the patch makes the following key changes:

**`DW_AT_LLVM_stmt_sequence` Attribute**: Introduces a new LLVM-specific
attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE.
This attribute holds a label pointing to the offset in the line table
where the function's line entries begin.

**End-of-Sequence Markers**: Emits an explicit DW_LNE_end_sequence after
each function's line entries in the line table. This marks the end of
the line information for that function, ensuring that line entries are
correctly delimited.

**Assembler and Streamer Modifications**: Modifies the MCStreamer and
related classes to support emitting the necessary labels and tracking
the current function's line entries. A new flag
GenerateFuncLineTableOffsets is added to control this behavior.

**Compiler Option**: Introduces the `-mllvm
-emit-func-debug-line-table-offsets` option to enable this
functionality, allowing users to opt-in as needed.
2024-11-13 18:51:34 -08:00
Augusto Noronha
67fb2686fb
[DebugInfo] Add a specification attribute to LLVM DebugInfo (#115362)
Add a specification attribute to LLVM DebugInfo, which is analogous
to DWARF's DW_AT_specification. According to the DWARF spec:
"A debugging information entry that represents a declaration that
completes another (earlier) non-defining declaration may have a
DW_AT_specification attribute whose value is a reference to the
debugging information entry representing the non-defining declaration."

This patch allows types to be specifications of other types. This is
used by Swift to represent generic types. For example, given this Swift
program:

```
struct MyStruct<T> {
    let t: T
}

let variable = MyStruct<Int>(t: 43)
```

The Swift compiler emits (roughly) an unsubtituted type for MyStruct<T>:
```
DW_TAG_structure_type
    DW_AT_name	("MyStruct")
    // "$s1w8MyStructVyxGD" is a Swift mangled name roughly equivalent to 
    // MyStruct<T>
    DW_AT_linkage_name	("$s1w8MyStructVyxGD")
    // other attributes here
```
And a specification for MyStruct<Int>:
```
DW_TAG_structure_type
    DW_AT_specification	(<link to "MyStruct">)
    // "$s1w8MyStructVySiGD" is a Swift mangled name equivalent to
    // MyStruct<Int>
    DW_AT_linkage_name	("$s1w8MyStructVySiGD")
    DW_AT_byte_size	(0x08)
    // other attributes here
```
2024-11-13 09:55:37 -08:00
Jeremy Morse
ccddb6ffad Revert "[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)"
This reverts commit bf483ddb42065405e345393e022dc72357ec5a3a.

See PR, there's a test testing for this behaviour (possibly adaptable), and
a duplicate line entry too
2024-11-12 17:07:56 +00:00
Stephen Tozer
fe18ab983d
[DebugInfo] Don't apply is_stmt on MBB branches that preserve lines (#108251)
This patch follows on from the changes made in #105524, by adding an
additional heuristic that prevents us from applying the start-of-MBB
is_stmt flag when we can see that, for all direct branches to the MBB,
the last line stepped on before the branch is the same as the first line
of the MBB. This is mainly to prevent certain pathological cases, such
as macros that expand to multiple basic blocks that all have the same
source location, from giving us repeated steps on the same line. This
approach is not comprehensive, since it relies on analyzeBranch to read
edges, but the default fallback of applying is_stmt may lead only to
useless steps in some cases, rather than skipping useful steps
altogether.
2024-11-12 16:23:35 +00:00
Jeremy Morse
bf483ddb42
[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)
prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.

To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.

This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
2024-11-12 15:09:40 +00:00
Daniel Sanders
74003f11b3
[mc] Add CFI directive to emit val_offset() rules (#113971)
These specify that the value of the given register in the previous frame
is the CFA plus some offset. This isn't very common but can be necessary
if the original value is normally reconstructed from the stack/frame
pointer instead of being saved on the stack and reloaded from there.
2024-11-11 11:38:36 -08:00
Augusto Noronha
f6617d65e4
[DebugInfo] Add num_extra_inhabitants to debug info (#112590)
An extra inhabitant is a bit pattern that does not represent a valid
value for instances of a given type. The number of extra inhabitants is
the number of those bit configurations.

This is used by Swift to save space when composing types. For example,
because Bool only needs 2 bit patterns to represent all of its values
(true and false), an Optional<Bool> only occupies 1 byte in memory by
using a bit configuration that is unused by Bool. Which bit patterns are
unused are part of the ABI of the language.

Since Swift generics are not monomorphized, by using dynamic libraries
you can have generic types whose size, alignment, etc, are known only
at runtime (which is why this feature is needed).

This patch adds num_extra_inhabitants to LLVM-IR debug info and in DWARF
as an Apple extension.
2024-11-06 15:48:04 -08:00
Jon Roelofs
4c3e1e3c4a
[llvm][AsmPrinter] Add an option to print instruction latencies (#113243)
... matching what we have in the disassembler. This isn't turned on by
default since several of the scheduling models are not completely
accurate, and we don't want to be misleading.
2024-11-05 17:28:52 -08:00
Kazu Hirata
2fccd5c576
[AsmPrinter] Remove unused includes (NFC) (#114698)
Identified with misc-include-cleaner.
2024-11-03 07:03:02 -08:00
Sriraman Tallam
c7ef002bc6
Fix performance bug in buildLocationList (#109343)
In buildLocationList, with basic block sections, we iterate over
every basic block twice to detect section start and end. This is
sub-optimal and shows up as significantly time consuming when
compiling large functions.

This patch uses the set of sections already stored in MBBSectionRanges
and iterates over sections rather than basic blocks.

When detecting if loclists can be merged, the end label of an entry is
matched with the beginning label of the next entry. For the section
corresponding to the entry basic block, this is skipped. This is
because the loc list uses the end label corresponding to the function
whereas the MBBSectionRanges map uses the function end label.

For example:

.Lfunc_begin0:
.file
.loc 0 4 0 # ex2.cc:4:0
.cfi_startproc
.Ltmp0:
.loc 0 8 5 prologue_end # ex2.cc:8:5
....
.LBB_END0_0:
.cfi_endproc
.section .text._Z4testv,"ax",@progbits,unique,1
...
.Lfunc_end0:
.size _Z4testv, .Lfunc_end0-_Z4testv

The debug loc uses ".LBB_END0_0" for the end of the section whereas
MBBSectionRanges uses ".Lfunc_end0".

It is alright to skip this as we already check the section corresponding
to the debugloc entry.

Added a new test case to check that if this works correctly when the
variable's value is mutated in the entry section.
2024-10-31 09:00:25 -07:00
Jack Styles
86f76c3b17
[AArch64][Libunwind] Add Support for FEAT_PAuthLR DWARF Instruction (#112171)
As part of FEAT_PAuthLR, a new DWARF Frame Instruction was introduced,
`DW_CFA_AARCH64_negate_ra_state_with_pc`. This instructs Libunwind that
the PC has been used with the signing instruction. This change includes
three commits
- Libunwind support for the newly introduced DWARF Instruction
- CodeGen Support for the DWARF Instructions
- Reversing the changes made in #96377. Due to
`DW_CFA_AARCH64_negate_ra_state_with_pc`'s requirements to be placed
immediately after the signing instruction, this would mean the CFI
Instruction location was not consistent with the generated location when
not using FEAT_PAuthLR. The commit reverses the changes and makes the
location consistent across the different branch protection options.
While this does have a code size effect, this is a negligible one.

For the ABI information, see here:
853286c7ab/aadwarf64/aadwarf64.rst (id23)
2024-10-28 08:22:38 +00:00
Aiden Grossman
7c9cf0c6f0 [SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Emit error on bad option combinatons
This patch makes it so that specifying all or none for -pgo-analysis-map
along with an explicit option causes an error as this set of options
does not really make sense.
2024-10-26 08:15:34 +00:00
Aiden Grossman
38caf282ab
[SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Add none and all options to PGO Map (#111221)
This patch adds none and all options to the -pgo-analysis-map flag,
which do basically what they say on the tin. The none option is added to
enable forcing the pgo-analysis-map by overriding an earlier invocation
of the flag. The all option is just added for convenience.
2024-10-25 15:39:52 -07:00
Augusto Noronha
8234f8ae26
[DebugInfo] Emit linkage name into DWARF for types for Swift (#112802)
Store Swift mangled names in DW_AT_linkage_name. The Swift compiler
emits only the type mangled name in debug information, and LLDB uses
those mangled names as keys to look up size, alignment, fields, etc
from either reflection metadata or Swift modules.

Additionally, emit types linkage names for types into the accelerator
table if they exist and they're different from the display name.
2024-10-22 16:47:58 -07:00
Frank Schlimbach
d5746d73ce
eliminating g++ warnings (#105520)
Eliminating g++ warnings. Mostly declaring "[[maybe_unused]]", adding
return statements where missing and fixing casts.

@rengolin

---------

Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>
Co-authored-by: Renato Golin <rengolin@systemcall.eu>
2024-10-18 21:20:47 +01:00
Jinsong Ji
6c60ead15a
[NFC] Fix Werror=extra warning related to mismatched enum type (#112808)
This is one of the many PRs to fix errors with LLVM_ENABLE_WERROR=on.
Built by GCC 11.

Fix warnings:

llvm-project/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp: In member
function ‘void llvm::AsmPrinter::emitJumpTableSizesSection(const
llvm::MachineJumpTableInfo*, const llvm::Function&) const’:
llvm-project/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:2852:31: error:
enumerated and non-enumerated type in conditional expression
[-Werror=extra]
 2852 |     int Flags = F.hasComdat() ? ELF::SHF_GROUP : 0;
      |                 ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
2024-10-18 13:07:37 -04:00
Kazu Hirata
a62768c427
[CodeGen] Simplify code with *Map::operator[] (NFC) (#112075) 2024-10-11 23:01:21 -07:00
William G Hatch
ae635d6f99
[NVPTX] add support for .debug_loc section (#110905)
Enable .debug_loc section for NVPTX backend.

This commit makes NVPTX omit DW_AT_low_pc (and DW_AT_high_pc) for
DW_TAG_compile_unit. This is because cuda-gdb uses the compile unit's
low_pc as a base address, and adds the addresses in the debug_loc
section to it. Removing low_pc is equivalent to setting that base
address to zero, so addition doesn't break the location ranges.
Additionally, this patch forces debug_loc label emission to emit single
labels with no subtraction or base. This would not be necessary if we
could emit `label1 - label2` expressions in PTX. The PTX documentation
at
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#debugging-directives-section
makes it seem like this is supported, but it doesn't actually work. I
believe when that documentation says that you can subtract “label
addresses between labels in the same dwarf section”, it doesn't merely
mean that the labels need to be in the same section as each other, but
in fact they need to be in the same section as the use. If support for
label subtraction is supported such that in the debug_loc section you
can subtract labels from the main code section, then we can remove the
workarounds added in this PR.

Also, since this now emits valid .debug_loc sections, it replaces the
empty .debug_loc to force existence of at least one debug section with
an empty .debug_macinfo section, which matches what nvcc does.
2024-10-03 16:31:49 -06:00
Francis Visoiu Mistrih
916e6ad7d0
[CodeGen] Fix InstructionCount remarks for MI bundles (#107621)
For MI bundles, the instruction count remark doesn't count the
instructions inside the bundle.
2024-10-02 13:37:25 -07:00
Jeremy Morse
96f37ae453
[NFC] Use initial-stack-allocations for more data structures (#110544)
This replaces some of the most frequent offenders of using a DenseMap that
cause a malloc, where the typical element-count is small enough to fit in
an initial stack allocation.

Most of these are fairly obvious, one to highlight is the collectOffset
method of GEP instructions: if there's a GEP, of course it's going to have
at least one offset, but every time we've called collectOffset we end up
calling malloc as well for the DenseMap in the MapVector.
2024-09-30 23:15:18 +01:00
Kazu Hirata
871e32bd2e
[AsmPrinter] Avoid repeated hash lookups (NFC) (#110376) 2024-09-28 13:08:54 -07:00
sinan
d435acb8eb
[DWARF] Don't emit DWARF5 symbols for DWARF2/3 + non-lldb (#110120)
Modify other legacy dwarf versions to align with the dwarf4 handling
approach when determining whether to generate DWARF5 or GNU extensions.
2024-09-27 10:27:04 +08:00
William G Hatch
f7dfaf3506
[NVPTX] add address class for variables with a single register location (#110030)
This is the final piece to enable register debugging for variables in
registers that have single locations that last throughout their
enclosing scope.

The next step after this for supporting register debugging for NVPTX is
to support the .debug_loc section.

Stacked on top of: https://github.com/llvm/llvm-project/pull/109495
2024-09-26 19:56:11 -06:00
William G Hatch
95eb3d45f6
[NVPTX] add support for encoding PTX registers for DWARF (#109495)
This patch adds support for encoding PTX registers for DWARF, using the
encoding supported by nvcc and cuda-gcc.

There are some other features still needed for proper register debugging
that this patch does not address, such as DW_AT_address_class.

This PR is stacked on: https://github.com/llvm/llvm-project/pull/109494
2024-09-26 12:32:43 -06:00
William G Hatch
4822e9dce3
[llvm] use 64-bit types for result of getDwarfRegNum (NFC) (#109494)
The register encoding used by NVPTX and cuda-gdb basically use strings
encoded as numbers. They are always within 64-bits, but typically
outside of 32-bits, since they often need at least 5 characters.

This patch changes the signature of `MCRegisterInfo::getDwarfRegNum` and
some related data structures to use 64-bit numbers to accommodate
encodings like this.

Additionally, `MCRegisterInfo::getDwarfRegNum` is marked as virtual, so
that targets with peculiar dwarf register mapping schemes (such as
NVPTX) can override its behavior.

I originally tried to do a broader switch to 64-bit types for registers,
but it caused many problems. There are various places in code generation
where registers are not just treated as 32-bit numbers, but also treat
certain bit offsets as flags. So I limited the change as much as
possible to just the output of `getDwarfRegNum`. Keeping the types used
by `DwarfLLVMRegPair` as unsigned preserves the current behaviors. The
only way to give a 64-bit output from `getDwarfRegNum` that actually
needs more than 32-bits is to override `getDwarfRegNum` and provide an
implementation that sidesteps the use of the `DwarfLLVMRegPair` maps
defined in tablegen files.

First layer of stack supporting:
https://github.com/llvm/llvm-project/pull/109495
2024-09-26 12:30:06 -06:00