63 Commits

Author SHA1 Message Date
Adam Yang
8710571aba
[AMDGPU] Fixed llvm-debuginfo-analyzer for AMDGPU. (#145125)
Constructing Target triple with `ObjectFile::makeTriple` instead of just
with `Arch` and leaving the rest unknown. Also creating the subtarget
with the `CPU`. AMDGPU needs the full triple and `CPU` to disassemble
correctly.

To run a full test, also fixed a failure in `SIPreAllocateWWMRegs` with
the `$noreg` operand in `DBG_VALUE`.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-08-12 22:04:52 +00:00
Sterling-Augustine
23f1ba3ee4
Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… (#145959) (#146112)
Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#…
(#145959)
    
This reapplies cbf781f0bdf2f680abbe784faedeefd6f84c246e, with fixes for
the shared-library build and the unconventional sanitizer-runtime build.

Original Description:

This is the culmination of a series of changes described in [1].
    
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
    
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
2025-06-27 11:05:49 -07:00
Sterling-Augustine
5d03e7a204
Revert "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… (#145959)
…145081)"

This reverts commit cbf781f0bdf2f680abbe784faedeefd6f84c246e.

Breaks a couple of buildbots.
2025-06-26 13:09:20 -07:00
Sterling-Augustine
cbf781f0bd
[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#145081)
This is the culmination of a series of changes described in [1].
    
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
    
I am happy to put it in another location, or to structure it differently
if that makes sense. Some have suggested in BinaryFormat, but it is not
a great fit there. But if that makes more sense to the reviewers, I can
do that.
 
Another possibility would be to use pass-through headers to allow
clients who don't care to depend only on DebugInfo/DWARF. This would be
a much less invasive change, and perhaps easier for clients. But also a
system that hides details.

Either way, I'm open.

1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
2025-06-26 11:23:46 -07:00
Carlos Alberto Enciso
8b7fc6487d
[llvm-debuginfo-analyzer] Fix crash with WebAssembly dead code (#141616)
https://github.com/llvm/llvm-project/issues/136772
Incorrect handling of 'tombstone' value for WebAssembly.

llvm-debuginfo-analyzer already uses the tombstone approach
to identify dead code. Currently, the tombstone value is
evaluated as std::numeric_limits<uint64_t>::max(). Which is
wrong as it does not take into account the 'Address Byte Size'
from the Compile Unit header.
2025-06-26 05:37:32 +01:00
Sterling-Augustine
474db6a852
[NFC] Separate high-level-dependent portions of DWARFExpression (revised) (#143170)
(Revised version of a previous, unreviewed, PR.)

Move all expression verification into its only client: DWARFVerifier.
Move all printing code (which was a mix of static and member functions)
into a separate class.

This is one in a series of refactoring PRs to separate dwarf
functionality into lower-level pieces usable without object files and
sections at build time. The code is already written this way via various
"if (section == nullptr)" and similar conditionals. So the functionality
itself is needed and exists, but only as a runtime feature. The goal of
these refactors is to remove the build-time dependencies, which will
allow the existing functionality to be used from lower-level parts of
the compiler. Particularly from lib/MC/.... More information at:


https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665
2025-06-09 16:32:40 -07:00
Javier Lopez-Gomez
0f38c54c6f
[llvm-debuginfo-analyzer] Add support for parsing DWARF / CodeView SourceLanguage (#137223)
This pull request adds support for parsing the source language in both
DWARF and CodeView. Specifically,

- The `LVSourceLanguage` class is introduced to represent any supported
language by any of the debug info representations.

- Update `LVDWARFReader.cpp` and `LVCodeViewVisitor.cpp` to parse the
source language where it applies. Added a new `=Language` attribute;
`getAttributeLanguage()` is internally used to control whether this
information is being printed.
2025-06-06 15:03:07 +01:00
Carlos Alberto Enciso
fef5096a8a
[llvm-debuginfo-analyzer][NFC] Move some functionality to LVReader. (#142740)
Hoist out from LVDWARFReader and LVBinaryReader some generic
code, so it can be available to other readers that do not share the
binary format.
2025-06-04 14:35:48 +01:00
Kazu Hirata
d77c995f14
[DebugInfo] Avoid creating a temporary instance of std::string (NFC) (#142523)
lookupTarget accepts StringRef.  We don't need to create a temporary
instance of std::string only to be converted back to StringRef.
2025-06-02 23:27:35 -07:00
Javier Lopez-Gomez
9cac4bf485
[llvm-debuginfo-analyzer] Add support for DWARF DW_AT_byte_size (#139110)
This PR was split from https://github.com/llvm/llvm-project/pull/137228
(which introduced support for `DW_TAG_module` and `DW_AT_byte_size`).

This PR improves `LVDWARFReader` by introducing handling of
`DW_AT_byte_size`. Most DWARF emitters include this attribute for types
to specify the size of an entity of the given type.
2025-05-22 14:20:40 +01:00
Javier Lopez-Gomez
cb575785b9
[llvm-debuginfo-analyzer] Support DW_TAG_module (#137228)
- Adds support for `DW_TAG_module` DIEs and recurse over their children.
Prior to this patch, entities hanging below `DW_TAG_module` were just
not visible. This DIE kind is commonly generated by Objective-C modules.

This patch will represent such entities, which will print as
```
[001]    {CompileUnit} '/llvm/tools/clang/test/modules/<stdin>'
[002]      {Producer} 'LLVM version 3.7.0'
           {Directory} '/llvm/tools/clang/test/modules'
           {File} '<stdin>'
[002]      {Module} 'DebugModule'
```
The minimal test case included is just the result of
```
$ llc llvm/test/DebugInfo/X86/DIModule.ll
      -accel-tables=Dwarf
      -o llvm/unittests/DebugInfo/LogicalView/Inputs/test-dwarf-clang-module.o
      -filetype=obj
```
2025-05-21 15:05:10 +01:00
Javier Lopez-Gomez
211ee04a61
[llvm-debuginfo-analyzer] Fix a couple of unhandled DWARF situations leading to a crash (#137221)
This pull request fixes a couple of unhandled situations in DWARF input
leading to a crash. Specifically,

- If the DWARF input contains a declaration of a C variadic function
(where `...` translates to `DW_TAG_unspecified_parameters`), which is
then followed by a definition, `llvm_unreachable()` is hit in
`LVScope::addMissingElements()`. This is only visible in Debug builds.

- Parsing of instructions in `LVBinaryReader::createInstructions()` does
not check whether `Offset` lies within the `Bytes` ArrayRef. A specially
crafted DWARF input can lead to this condition.
2025-05-21 05:29:41 +01:00
Kazu Hirata
ff78648b09
[llvm] Use llvm::find_if (NFC) (#140412) 2025-05-17 17:02:04 -07:00
Kazu Hirata
2422b1795f
[DebugInfo] Simplify a string comparison (NFC) (#139918)
Note that the lambda function we are in returns bool, so
FileZero.compare(FileOne) is equivalent to FileZero != FileOne in this
context.
2025-05-14 09:52:48 -07:00
Kazu Hirata
6257621f41
[llvm] Use llvm::append_range (NFC) (#133658) 2025-03-30 18:43:02 -07:00
Kazu Hirata
e49c8d5d3d
[DebugInfo] Avoid repeated map lookups (NFC) (#128826) 2025-02-26 00:56:03 -08:00
Kazu Hirata
38f8ca1d18
[DebugInfo] Avoid repeated hash lookups (NFC) (#128632) 2025-02-25 09:02:52 -08:00
Kazu Hirata
6ad55f1517
[DebugInfo] Avoid repeated hash lookups (NFC) (#128459) 2025-02-24 00:59:49 -08:00
Kazu Hirata
b0d1c51a17
[DebugInfo] Avoid repeated hash lookups (NFC) (#128395) 2025-02-23 01:03:22 -08:00
Kazu Hirata
94b810661a
[DebugInfo] Avoid repeated hash lookups (NFC) (#128301) 2025-02-22 02:13:02 -08:00
Kazu Hirata
f964377df7
[DebugInfo] Avoid repeated hash lookups (NFC) (#128127) 2025-02-21 11:07:51 -08:00
Kazu Hirata
fb14638817
[DebugInfo] Avoid repeated hash lookups (NFC) (#127446) 2025-02-17 01:32:25 -08:00
Carlos Alberto Enciso
c4fbb6500a
[llvm-debuginfo-analyzer] Add support for DW_AT_GNU_template_name. (#115724)
For the given C++ code:
```
  template <typename T> class Foo { T Member; };

  template <template <typename T> class TemplateType>
  class Bar {
    TemplateType<int> Int;
  };

  template <template <template <typename> class> class TemplateTemplateType>
  class Baz {
    TemplateTemplateType<Foo> Foo;
  };

  typedef Baz<Bar> Example;
  Example TT;
```
The '--attribute=encoded' option, will produce the logical view:
```
  {Class} 'Foo<int>'
    {Encoded} <int>
  {Class} 'Bar<Foo>'
    {Encoded} <>                 <-- Missing the template argument info (Foo)
  {Class} 'Baz<Bar>'
    {Encoded} <>                 <-- Missing the template argument info (Bar)
```
When the template argument is another template it is not included in the
{Encoded} field. The correct output should be:
```
  {Class} 'Foo<int>'
    {Encoded} <int>
  {Class} 'Bar<Foo>'
    {Encoded} <Foo>
  {Class} 'Baz<Bar>'
    {Encoded} <Bar>
```
2024-11-28 14:47:21 +00:00
Carlos Alberto Enciso
fb3765959f
[llvm-debuginfo-analyzer] Common handling of unsigned attribute values. (#116027)
- In the DWARF reader, for those attributes that can have an unsigned
value, allow for the following cases:
  * Is an implicit constant
  * Is an optional value
- The testing is done by creating a file with generated DWARF, using
`DwarfGenerator` (generate DWARF debug info for unit tests).
2024-11-28 05:21:47 +00:00
Kazu Hirata
0060c54e0d
[DebugInfo] Remove unused includes (NFC) (#116551)
Identified with misc-include-cleaner.
2024-11-17 10:37:33 -08:00
Carlos Alberto Enciso
5e7662efec
[llvm-debuginfo-analyzer] Incorrect DW_AT_call_line/DW_AT_call_file. (#115701)
The code dealing with DW_AT_call_line/DW_AT_call_file is in the wrong
place. The correct functions were call, but with incorrect values:
  DW_AT_call_line <-- Filename Index
  DW_AT_call_file <-- Line number
2024-11-11 13:00:24 +00:00
Kazu Hirata
5c9c281c25
[DebugInfo] Use heterogenous lookups with std::map (NFC) (#113118) 2024-10-21 06:50:03 -07:00
Kazu Hirata
8b6764fdc0
[DebugInfo] Simplify code with std::unordered_map::operator[] (NFC) (#112658) 2024-10-17 07:47:06 -07:00
Kazu Hirata
8a53dc69c2
[DebugInfo] Avoid repeated map lookups (NFC) (#111936) 2024-10-11 08:59:01 -07:00
Craig Topper
f2b71491d1
[MC] Make MCRegisterInfo::getLLVMRegNum return std::optional<MCRegister>. NFC (#107776) 2024-09-08 21:21:51 -07:00
Pavel Labath
d0d61a7e4c
Split DWARFFormValue::getReference into four functions (#98905)
The result of the function cannot be correctly interpreted without
knowing the precise form type (a type signature needs to be looked up
very differently from a supplementary debug info reference). The
function sort of worked because the two reference types (unit-relative
and section-relative) that can be handled uniformly are also the most
common types of references, but this setup made it easy to write code
which does not support other kinds of reference (and if one tried to
support them, the result didn't look pretty --
https://github.com/llvm/llvm-project/pull/97423/files#r1676217081).

The split is based on the reference type classification from DWARFv5
(Section 7.5.5 Classes and Forms), and it should enable uniform (if
slightly more verbose) hadling. Note that this only affects users which
want more control of how (or if) the references are resolved. Users
which just want to access the referenced DIE can use the higher level
API (DWARFDie::GetAttributeValueAsReferencedDie) which returns (or will
return after #97423 is merged) the correct die for all reference types
(except for supplementary references, which we don't support right now).
2024-07-16 12:55:38 +02:00
Kazu Hirata
026a29e8b3
[Analysis, CodeGen, DebugInfo] Use StringRef::operator== instead of StringRef::equals (NFC) (#91304)
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.

- StringRef::operator==/!= outnumber StringRef::equals by a factor of
  53 under llvm/ in terms of their usage.

- The elimination of StringRef::equals brings StringRef closer to
  std::string_view, which has operator== but not equals.

- S == "foo" is more readable than S.equals("foo"), especially for
  !Long.Expression.equals("str") vs Long.Expression != "str".
2024-05-07 10:20:10 -07:00
Carlos Alberto Enciso
9c0c98ed37
[llvm-debuginfo-analyzer][DOC] Convert 'README.txt' to markdown. (#86394)
As part of the WebAssembly support work
https://github.com/llvm/llvm-project/pull/85566

The README.txt is a bit odd since it only lists issues and problems
without talking about what works. It’s also hard to read on the GitHub
web view.

- Convert to Markdown and linking to the command docs
https://llvm.org/docs/CommandGuide/llvm-debuginfo-analyzer
- Rename some left 'elf reader' to 'DWARF reader'.
2024-03-27 05:27:44 +00:00
Carlos Alberto Enciso
c1ccf0781b
[llvm-debuginfo-analyzer][NFC] Rename LVElfReader.cpp[h] (#85530)
As part of the WebAssembly support work review
  https://github.com/llvm/llvm-project/pull/82588

It was decided to rename:

  Files: LVElfReader.cpp[h] -> LVDWARFReader.cpp[h]
         ELFReaderTest.cpp  -> DWARFReaderTest.cpp

  Class: LVELFReader        -> LVDWARFReader

The name LVDWARFReader would match the another reader LVCodeViewReader
as they will reflect the type of
debug information format that they are parsing.
2024-03-18 05:08:42 +00:00
Carlos Alberto Enciso
b19cfb9175
[llvm-debuginfo-analyzer] Add support for WebAssembly binary format. (#82588)
Add support for the WebAssembly binary format and be able to generate
logical views.

https://github.com/llvm/llvm-project/issues/69181

The README.txt includes information about how to build the test cases.
2024-03-14 10:03:18 +00:00
Kazu Hirata
c6cfd5350e [llvm] Use StringRef::contains (NFC) 2024-01-19 00:19:36 -08:00
Simon Pilgrim
1950190b61 [DebugInfo] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)
2023-11-06 15:20:20 +00:00
Kazu Hirata
9bcc094d37 [llvm] Use llvm::erase_if (NFC) 2023-10-12 22:59:25 -07:00
Kazu Hirata
4a0ccfa865 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-12 21:21:45 -07:00
Kazu Hirata
b8885926f8 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an enum.
This patch replaces llvm::support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-10 22:54:51 -07:00
Daniel Paoliello
050bb26174
[llvm] Implement S_INLINEES debug symbol (#67490)
The `S_INLINEES` debug symbol is used to record all the functions that
are directly inlined within the current function (nested inlining is
ignored).

This change implements support for emitting the `S_INLINEES` debug
symbol in LLVM, and cleans up how the `S_INLINEES` and `S_CALLEES` debug
symbols are dumped.
2023-09-27 14:06:22 -07:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Daniel Paoliello
0c5c7b52f0 Emit the CodeView S_ARMSWITCHTABLE debug symbol for jump tables
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information:

* The address of the branch instruction that uses the jump table.
* The address of the jump table.
* The "base" address that the values in the jump table are relative to.
* The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted).

Together this information can be used by debuggers and binary analysis tools to understand what an jump table indirect branch is doing and where it might jump to.

Documentation for the symbol can be found in the Microsoft PDB library dumper: 0fe89a942f/cvdump/dumpsym7.cpp (L5518)

This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes).

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D149367
2023-08-31 12:06:50 -07:00
Arthur Eubanks
0a4fc4ac1c Revert "Emit the CodeView S_ARMSWITCHTABLE debug symbol for jump tables"
This reverts commit 8d0c3db388143f4e058b5f513a70fd5d089d51c3.

Causes crashes, see comments in https://reviews.llvm.org/D149367.

Some follow-up fixes are also reverted:

This reverts commit 636269f4fca44693bfd787b0a37bb0328ffcc085.
This reverts commit 5966079cf4d4de0285004eef051784d0d9f7a3a6.
This reverts commit e7294dbc85d24a08c716d9babbe7f68390cf219b.
2023-08-25 18:34:15 -07:00
Daniel Paoliello
8d0c3db388 Emit the CodeView S_ARMSWITCHTABLE debug symbol for jump tables
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information:

* The address of the branch instruction that uses the jump table.
* The address of the jump table.
* The "base" address that the values in the jump table are relative to.
* The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted).

Together this information can be used by debuggers and binary analysis tools to understand what an jump table indirect branch is doing and where it might jump to.

Documentation for the symbol can be found in the Microsoft PDB library dumper: 0fe89a942f/cvdump/dumpsym7.cpp (L5518)

This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes).

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D149367
2023-08-25 10:19:17 -07:00
Scott Linder
2e6bb8c9b8 [DebugInfo] Support more than 2 operands in DWARF operations
Update DWARFExpression::Operation and LVOperation to support more than
2 operands.

Take the opportunity to use a SmallVector, which will handle at least 2
operands without allocation anyway, and removes the static limit
completely.

As there is no longer the concept of an "unused operand", remove
Operation::Encoding::SizeNA. Any use of it is now replaced with explicit
checks for how many operands an operation has.

There are still places where the limit remains 2, namely in the
DWARFLinker and in DIExpressions, but these can be updated in later
patches as-needed.

There are no explicit tests as this is nearly NFC: no new operation is
added which makes use of the additional operand capacity yet. A future
patch adding a new DWARF extension point will include operations which
require the support.

Reviewed By: Orlando, CarlosAlbertoEnciso

Differential Revision: https://reviews.llvm.org/D147270
2023-06-19 19:38:26 +00:00
Nick Desaulniers
8abbc17ff3 reland: [Demangle] make llvm::demangle take std::string_view rather than const std::string&
As suggested by @erichkeane in
https://reviews.llvm.org/D141451#inline-1429549

There's potential for a lot more cleanups around these APIs. This is
just a start.

Callers need to be more careful about sub-expressions producing strings
that don't outlast the expression using `llvm::demangle`. Add a
release note.

Differential Revision: https://reviews.llvm.org/D149104
2023-06-06 10:18:06 -07:00
Craig Topper
6006d43e2d LLVM_FALLTHROUGH => [[fallthrough]]. NFC
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D150996
2023-05-24 12:40:10 -07:00
Scott Linder
7f033d0f75 [llvm-debuginfo-analyzer] Support both Reference and Type attrs in single DIE
Relax the assumption that at most one Reference-or-Type-like attribute is
present on a DWARF DIE.

Add support for at most one Type attribute (i.e. DW_AT_import xor
DW_AT_type) and separately at most one Reference attribute (i.e.
DW_AT_specification xor DW_AT_abstract_origin xor ...).

Update comment describing old assumption and tag it as a "FIXME" to
reflect the fact that this is perhaps still not general enough.

Add a test based on the case which led me to encounter the bug in the
wild.

Reviewed By: CarlosAlbertoEnciso

Differential Revision: https://reviews.llvm.org/D150713
2023-05-23 20:52:45 +00:00
Nick Desaulniers
3e3c6f24ff Revert "[Demangle] make llvm::demangle take std::string_view rather than const std::string&"
This reverts commit c117c2c8ba4afd45a006043ec6dd858652b2ffcc.

itaniumDemangle calls std::strlen with the results of
std::string_view::data() which may not be NUL-terminated. This causes
lld/test/wasm/why-extract.s  to fail when "expensive checks" are enabled
via -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON. See D149675 for further
discussion. Back this out until the individual demanglers are converted
to use std::string_view.
2023-05-02 15:54:09 -07:00