2669 Commits

Author SHA1 Message Date
alx32
6f28b4b5e9
[GSYM] Add support for querying merged functions in llvm-gsymutil (#120991)
Adds the ability to lookup and display all merged functions for an
address in llvm-gsymutil.

Now, when `--merged-functions` is used in combination with
`--address/--addresses-from-stdin`, lookup results will contain
information about merged functions, if available.

To support printing merged function information when using the
`--verbose` option, the `LookupResult` data structure also had to be
extended with pointers to the raw function data and raw merged function
data. This is because merged functions share the same address range, so
it's not easy to look up the raw merged function data for a particular
`LookupResult` that is based on a merged function.
2025-01-06 11:55:27 -08:00
Zibi Sarbinowski
1d51546635
[SystemZ][z/OS] Open YAML files for read as text files (#121340)
This patch makes sure YAML files are opened for reading as text file to
trigger auto-conversion from EBCDIC encoding into expected ASCII
encoding on z/OS platform. This is required to fix the following lit
tests:

```
LLVM :: tools/llvm-gsymutil/ARM_AArch64/macho-gsym-callsite-info-exe.yaml
LLVM :: tools/llvm-gsymutil/ARM_AArch64/macho-gsym-callsite-info-obj.test
LLVM :: tools/llvm-gsymutil/ARM_AArch64/macho-gsym-callsite-info-dsym.yaml
LLVM :: Transforms/PGOProfile/memprof_undrift_missing_leaf.ll
```
2024-12-31 07:24:59 -05:00
alx32
96839a4720
[llvm-gsymutil] Ensure stable ordering of merged functions (#120796)
Previously, the test
`llvm/test/tools/llvm-gsymutil/ARM_AArch64/macho-gsym-merged-callsites-dsym.yaml`
[was disabled](https://github.com/llvm/llvm-project/pull/119957/files)
due to failing on Linux.

The issue is that the merged functions appear in a different order in
the final produced `gSYM`.
To ensure deterministic ordering of merged functions, we change the
sorting of functions to use the `stable_sort` algorithm. Before merged
functions, this was not an issue as the address is used as a comparator
in the sorting algorithm so there was no chance of two functions testing
as identical according to the comparator.

Confirmed that test now passes locally with
`-DLLVM_ENABLE_EXPENSIVE_CHECKS=ON`.
2024-12-20 16:00:52 -08:00
alx32
ad32576cff
[DWARFVerifier] Allow overlapping ranges for ICF-merged functions (#117952)
This patch modifies the DWARF verifier to handle a valid case where two
or more functions have identical address ranges due to being merged by
ICF (Identical Code Folding). Previously, the verifier would incorrectly
report these as errors, but functions merged via ICF (such as when using
LLD's --keep-icf-stabs option) can legitimately share the same address
range.

A new test case has been added to verify this behavior using YAML-based
DWARF data that simulates two DW_TAG_subprogram entries with identical
address ranges. The test ensures that the verifier correctly identifies
this as a valid case and doesn't emit any errors, while still
maintaining the existing verification for truly invalid overlapping
ranges in other scenarios. Before this change, the newly added test case
would have failed, with `llvm-dwarfdump` marking the overlapping address
ranges in the DWARF as an error.

We also modify the existing tests `llvm-dwarfutil/ELF/X86/verify.test` and 
`llvm/test/tools/llvm-dwarfdump/X86/verify_parent_zero_length.yaml`
which rely on the existence of the error that we're trying to
suppress. We slightly change one offset so that the ranges don't
perfectly overlap and an error is still generated.
2024-12-17 11:00:56 -08:00
alx32
558de0e1f9
[llvm-gsymutil] Add option to load callsites from DWARF (#119913)
This change adds support for loading gSYM callsite information from
DWARF. Previously the only support was for loading callsites info from
YAML.

For testing, we add a pass where `macho-gsym-merged-callsites-dsym`
loads callsite info from DWARF rather than YAML.
2024-12-17 08:51:30 -08:00
aurelien35
37e48e4a73
Fix crash due to un-checked error in LVReaderHandler::handleArchive method (#118951)
[llvm-debuginfo-analyzer] Fix crash due to un-checked error in LVReaderHandler::handleArchive
method.

- Added README describing how to generated the binary files used for the test.
- A follow up patch to add extra ASSERT_NE

Committed on behalf of @aurelien35
2024-12-17 11:19:48 +00:00
alx32
d015578961
[llvm-gsymutil] Fix dumping of call sites for merged functions (#119759)
Currently, when dumping the contents of a GSYM there are three issues:
- Callsite information is not displayed for merged functions - this is
because of a bug in `CallSiteInfoLoader::buildFunctionMap` where when
enumerating through `Func.MergedFunctions` - we enumerate by value
instead of by reference.
- There is no variable indent for printing callsite info - meaning that
when printing callsites for merged functions, the indent will be
different than the other info of the merged function. To address this we
add configurable indent for printing callsite info
- Callsite info is printed right after merged function info. Meaning
that if the merged function also has call site information, the parent's
callsite info will appear right after the merged function's callsite
info - leading to confusion. To address this we print the callsite info
first, then the merged functions info.

This change addresses all the above 3 issues. 
Example of old vs new:

<img width="1074" alt="image"
src="https://github.com/user-attachments/assets/d039ad69-fa79-4abb-9816-eda9cc2eda53"
/>
2024-12-13 15:59:57 -08:00
Zequan Wu
2e425bf629 Reapply "[lldb][dwarf] Compute fully qualified names on simplified template names with DWARFTypePrinter (#117071)"
9de73b20404f0b2db1cbf70d164cfe0789d5bb94 lands a fix to DWARFTypePrinter that is used by lldb in this change.
2024-12-04 13:05:36 -08:00
itrofimow
0d1d1b363d
[DebugInfo] Clean up LLVMSymbolizer::DemangleName API: const string& -> StringRef (#118056) 2024-11-29 11:25:14 +00:00
Carlos Alberto Enciso
c4fbb6500a
[llvm-debuginfo-analyzer] Add support for DW_AT_GNU_template_name. (#115724)
For the given C++ code:
```
  template <typename T> class Foo { T Member; };

  template <template <typename T> class TemplateType>
  class Bar {
    TemplateType<int> Int;
  };

  template <template <template <typename> class> class TemplateTemplateType>
  class Baz {
    TemplateTemplateType<Foo> Foo;
  };

  typedef Baz<Bar> Example;
  Example TT;
```
The '--attribute=encoded' option, will produce the logical view:
```
  {Class} 'Foo<int>'
    {Encoded} <int>
  {Class} 'Bar<Foo>'
    {Encoded} <>                 <-- Missing the template argument info (Foo)
  {Class} 'Baz<Bar>'
    {Encoded} <>                 <-- Missing the template argument info (Bar)
```
When the template argument is another template it is not included in the
{Encoded} field. The correct output should be:
```
  {Class} 'Foo<int>'
    {Encoded} <int>
  {Class} 'Bar<Foo>'
    {Encoded} <Foo>
  {Class} 'Baz<Bar>'
    {Encoded} <Bar>
```
2024-11-28 14:47:21 +00:00
Carlos Alberto Enciso
fb3765959f
[llvm-debuginfo-analyzer] Common handling of unsigned attribute values. (#116027)
- In the DWARF reader, for those attributes that can have an unsigned
value, allow for the following cases:
  * Is an implicit constant
  * Is an optional value
- The testing is done by creating a file with generated DWARF, using
`DwarfGenerator` (generate DWARF debug info for unit tests).
2024-11-28 05:21:47 +00:00
alx32
5147e5941d
[GSYM] Callsites: Add data format support and loading from YAML (#109781)
This PR adds support in the gSYM format for call site information and
adds support for loading call sites from a YAML file. The support for
YAML input is mostly for testing purposes - so we have a way to test the
functionality.

Note that this data is not currently used in the gSYM tooling - the
logic to use call sites will be added in a later PR.

The reason why we need call site information in gSYM files is so that we
can support better call stack function disambiguation in the case where
multiple functions have been merged due to optimization (linker ICF).
When resolving a merged function on the callstack, we can use the call
site information of the calling function to narrow down the actual
function that is being called, from the set of all merged functions.

See [this
RFC](https://discourse.llvm.org/t/rfc-extending-gsym-format-with-call-site-information-for-merged-function-disambiguation/80682)
for more details on this change.
2024-11-26 16:07:40 -08:00
Mikhail Goncharov
11ee21671f Revert " [lldb][dwarf] Compute fully qualified names on simplified template names with DWARFTypePrinter (#117071)"
This reverts commit f06c187799d910fd3ac3e9106397e5eecff9f265.

Temporary revert: there is https://github.com/llvm/llvm-project/pull/117239 that is suppose to fix the issue.
Reverting to keep things rolling.
2024-11-22 13:49:37 +01:00
Zequan Wu
f06c187799
[lldb][dwarf] Compute fully qualified names on simplified template names with DWARFTypePrinter (#117071)
This is a reland of https://github.com/llvm/llvm-project/pull/112811.
Fixed the bot breakage by running ld.lld explicitly.
2024-11-20 17:19:35 -05:00
Shubham Sandeep Rastogi
c51786b022 Revert "[lldb][dwarf] Compute fully qualified names on simplified template names with DWARFTypePrinter (#112811)"
This reverts commit 94d100f2ba81c2bf0ef495f68d66ba8c94c71d2a.

Reverted because of greendragon failure on the incremental arm64 bot

******************** TEST 'lldb-shell :: SymbolFile/DWARF/x86/simplified-template-names.cpp' FAILED ********************
Exit Code: 1

RUN: at line 7: /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang --driver-mode=g++ --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-pc-linux -g -gsimple-template-names /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/Shell/SymbolFile/DWARF/x86/simplified-template-names.cpp -o /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/test/Shell/SymbolFile/DWARF/x86/Output/simplified-template-names.cpp.tmp

/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang --driver-mode=g++ --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-pc-linux -g -gsimple-template-names /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/Shell/SymbolFile/DWARF/x86/simplified-template-names.cpp -o /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/test/Shell/SymbolFile/DWARF/x86/Output/simplified-template-names.cpp.tmp
ld: warning: -m is obsolete
ld: unknown option: --hash-style=gnu
clang: error: linker command failed with exit code 1 (use -v to see invocation)
2024-11-18 17:19:38 -08:00
Zequan Wu
94d100f2ba
[lldb][dwarf] Compute fully qualified names on simplified template names with DWARFTypePrinter (#112811)
This is the second half of
https://github.com/llvm/llvm-project/pull/90008.

Essentially, it replaces the work of resolving template types when we
just need the qualified names with walking the DIE tree using
`DWARFTypePrinter`.

### Result
For an internal target, the time spent on `expr *this` for the first
time reduced from 28 secs to 17 secs.
2024-11-18 17:45:54 -05:00
Romain Thomas
a52cb0a2b9
[PDB] Fix missing consumeError which raise error with asserts enabled (#116480)
As mentioned in the title, the missing `consumeError` triggers assertions.
2024-11-18 15:23:10 +01:00
Kazu Hirata
0060c54e0d
[DebugInfo] Remove unused includes (NFC) (#116551)
Identified with misc-include-cleaner.
2024-11-17 10:37:33 -08:00
Carlos Alberto Enciso
5e7662efec
[llvm-debuginfo-analyzer] Incorrect DW_AT_call_line/DW_AT_call_file. (#115701)
The code dealing with DW_AT_call_line/DW_AT_call_file is in the wrong
place. The correct functions were call, but with incorrect values:
  DW_AT_call_line <-- Filename Index
  DW_AT_call_file <-- Line number
2024-11-11 13:00:24 +00:00
Carlos Alberto Enciso
a912c81f65
[llvm-debuginfo-analyzer] Fix crash with thread local storage. (#113904)
The DW_OP_GNU_push_tls_address, DW_OP_form_tls_address DWARF
location forms generated for thread local storage variables, caused a
crash in the DWARFReader, due to incorrect number of operands.
2024-11-11 05:31:59 +00:00
Jack Styles
86f76c3b17
[AArch64][Libunwind] Add Support for FEAT_PAuthLR DWARF Instruction (#112171)
As part of FEAT_PAuthLR, a new DWARF Frame Instruction was introduced,
`DW_CFA_AARCH64_negate_ra_state_with_pc`. This instructs Libunwind that
the PC has been used with the signing instruction. This change includes
three commits
- Libunwind support for the newly introduced DWARF Instruction
- CodeGen Support for the DWARF Instructions
- Reversing the changes made in #96377. Due to
`DW_CFA_AARCH64_negate_ra_state_with_pc`'s requirements to be placed
immediately after the signing instruction, this would mean the CFI
Instruction location was not consistent with the generated location when
not using FEAT_PAuthLR. The commit reverses the changes and makes the
location consistent across the different branch protection options.
While this does have a code size effect, this is a negligible one.

For the ABI information, see here:
853286c7ab/aadwarf64/aadwarf64.rst (id23)
2024-10-28 08:22:38 +00:00
Greg Clayton
dc5c044193
Add verification support for .debug_names with foreign type units. (#109011)
This commit enables 'llvm-dwarfdump --veriy' to verify the DWARF in
foreign type units when using split DWARF for the .debug_names section.
2024-10-22 09:35:10 -07:00
Kazu Hirata
5c9c281c25
[DebugInfo] Use heterogenous lookups with std::map (NFC) (#113118) 2024-10-21 06:50:03 -07:00
Kazu Hirata
8b6764fdc0
[DebugInfo] Simplify code with std::unordered_map::operator[] (NFC) (#112658) 2024-10-17 07:47:06 -07:00
David Stenberg
97da5e6700
[GSYM] Remove redundant getInliningInfoForAddress call (#111136)
In DwarfTransformer::verify() line number information is retrieved for
each address using:

  auto DwarfInlineInfos =
      DICtx.getInliningInfoForAddress(SectAddr, DLIS);

Later down the loop, another such invocation was made before:

  Gsym->dump(Log, *FI);

There is a continue after that, DwarfInlineInfos do not affect the
dump() invocation, I am not aware of any other side effects that is
needed from the extra getInliningInfoForAddress() invocation, and tests
pass without it, so just remove it.
2024-10-15 13:34:27 -07:00
Kazu Hirata
c5c27d8025
[DebugInfo] Avoid repeated hash lookups (NFC) (#112298) 2024-10-15 07:34:50 -07:00
Kazu Hirata
8a53dc69c2
[DebugInfo] Avoid repeated map lookups (NFC) (#111936) 2024-10-11 08:59:01 -07:00
Zequan Wu
9b82e85d81
[DWARF] Generalize DWARFTypePrinter to a template class (#109459)
This generalizes DWARFTypePrinter class to a template class so that it
can be reused for lldb's DWARFDIE type.

This is a split of https://github.com/llvm/llvm-project/pull/90008. The
difference is that this doesn't have `Visitor` template parameter which
can be added later if necessary.
2024-10-08 17:20:42 -04:00
Zequan Wu
4206c37bd1
[lldb][DWARF] Replace lldb's DWARFDebugArangeSet with llvm's (#110058)
They are close enough to swap lldb's `DWARFDebugArangeSet` with the llvm
one.

The difference is that llvm's `DWARFDebugArangeSet` add empty ranges
when extracting. To accommodate this, `DWARFDebugAranges` in lldb
filters out empty ranges when extracting.
2024-10-01 13:52:51 -04:00
Kazu Hirata
f01d45cf97
[DebugInfo] Avoid repeated hash lookups (NFC) (#110620) 2024-10-01 07:48:09 -07:00
Kazu Hirata
30089b1590
[DWARF] Avoid repeated hash lookups (NFC) (#110202) 2024-09-28 10:04:03 -07:00
Youngsuk Kim
d31e314131 [llvm] Don't call raw_string_ostream::flush() (NFC)
Don't call raw_string_ostream::flush(), which is essentially a no-op.
As specified in the docs, raw_string_ostream is always unbuffered.
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
2024-09-20 12:19:59 -05:00
Kazu Hirata
157adcccc5
[GSYM] Avoid repeated hash lookups (NFC) (#109241) 2024-09-19 09:14:45 -07:00
Kazu Hirata
04f45aa7a7
[DebugInfo] Avoid repeated hash lookups (NFC) (#108486) 2024-09-12 22:40:46 -07:00
Craig Topper
f2b71491d1
[MC] Make MCRegisterInfo::getLLVMRegNum return std::optional<MCRegister>. NFC (#107776) 2024-09-08 21:21:51 -07:00
Pavel Labath
771b7af1db
Reapply "[llvm/DWARF] Recursively resolve DW_AT_signature references"… (#99495)
… (#99444)

The previous version introduced a bug (caught by cross-project tests).
Explicit signature resolution is still necessary when one wants to
access the children (not attributes) of a given DIE.

The new version keeps just the findRecursively extension, and reverts
all the DWARFTypePrinter modifications.
2024-09-04 10:13:47 +02:00
Rahul Joshi
b75fe11fd6
[NFC] Fix formatv() usage in preparation of validation (#106454)
Fix several uses of formatv() that would be flagged as invalid by an
upcoming change that will add additional validation to formatv().
2024-08-28 17:41:43 -07:00
itrofimow
bf88db78bd
[Symbolizer, DebugInfo] Clean up LLVMSymbolizer API: const string& -> StringRef (#104541)
Nothing in the affected code depends on the `ModuleName` being
null-terminated,
so take it by `StringRef` instead of `const std::string &`.

This change simplifies API consumption, since one doesn't always have a
`std::string` at the call site (might have `std::string_view` instead),
and also gives some minor performance improvements by removing
string-copies in the cache-hit path of `getOrCreateModuleInfo`.
2024-08-21 19:53:41 -07:00
Greg Clayton
13cc94e30e
Add support for verifying .debug_names in split DWARF for CUs and TUs. (#101775)
This patch fixes .debug_names verification for split DWARF with no type
units. It will print out an error for any name entries where we can't
locate the .dwo file. It finds the non skeleton unit and correctly
figures out the DIE offset in the .dwo file. If the non skeleton unit is
found and yet the skeleton unit has a DWO ID, an error will be emitted
showing we couldn't access the non-skeleton compile unit.
2024-08-13 22:17:49 -07:00
J. Ryan Stinnett
f807c5e492
[DebugInfo] Add expression decoding for DW_OP_implicit_pointer (#102923)
This allows `llvm-dwarfdump` to decode the DWARF 5 opcode
`DW_OP_implicit_pointer` (0xa0). GCC makes use of this opcode in recent
versions. LLVM contains some (unfinished) support as well. With existing
usage in the ecosystem, adding decoding support here seems reasonable.
2024-08-13 15:34:21 +01:00
J. Ryan Stinnett
7c4c72b520
[DebugInfo][NFC] Sort DWARF op descriptions, fix versions (#102773)
This sorts DWARF op descriptions in `DWARFExpression.cpp` by opcode and version, packing the standardised ops together. A few ops also had the wrong version listed, so this fixes those versions as well. (The version does not appear to actually be used currently.)
2024-08-12 16:51:56 +01:00
Haojian Wu
d09be9191b Fix a typecheck_arithmetic_incomplete_or_sizeless_type error in GSYM/MergedFunctionsInfo.h 2024-08-08 07:33:46 +02:00
alx32
cb5dc1faa8
[gSYM] Add support merged functions in gSYM format (#101604)
This patch introduces support for storing debug info for merged
functions in the GSYM debug info. It allows GSYM to represent multiple
functions that share the same address range, which occur when multiple
functions are merged during linker ICF.

The core of this functionality is the new `MergedFunctionsInfo` class,
which is integrated into the existing `FunctionInfo` structure. During
GSYM creation, functions with identical address ranges are now grouped
together, with one function serving as the "master" and the others
becoming "merged" functions. This organization is preserved in the GSYM
format and can be read back and displayed when dumping GSYM information.

Old readers will only see the master function, and ther "merged"
functions will not be processed.

Note: This patch just adds the functionality to the gSYM format -
additional changes to the gsym format and algorithmic changes to logic
existing tooling are needed to take advantage of this data.

Exact output of `llvm-gsymutil --verify --verbose` for the included
test:
[gist](https://gist.github.com/alx32/b9c104d7f87c0b3e7b4171399fc2dca3)
2024-08-07 14:34:20 -07:00
Simon Pilgrim
4859c46761 Fix gcc Wparentheses warning. NFC. 2024-08-07 16:28:06 +01:00
Pavel Labath
7b9fcf5c44
[DWARF] Teach getAttributeValueAsReferencedDie to resolve DW_FORM_ref… (#101197)
…_sig8

Splitting from #99495.

I've extended the type unit test case to feature more kinds of
references, including the gcc-style DW_AT_type[DW_FORM_ref_sig8]
reference, which this patch fixes.
2024-08-05 15:08:43 +02:00
Amit Kumar Pandey
0886440ef0
[Symbolizer] Support for Missing Line Numbers. (#82240)
LLVM Symbolizer attempt to symbolize addresses of optimized binaries
reports missing line numbers for some cases. It maybe due to compiler
which sometimes cannot map an instruction to line number due to
optimizations. Symbolizer should handle those cases gracefully.

Adding an option '--skip-line-zero' to symbolizer so as to report the
nearest non-zero line number.

---------

Co-authored-by: Amit Pandey <amit.pandey@amd.com>
2024-08-05 13:38:34 +05:30
Greg Clayton
b6a2eb0ecc
Add support for verifying local type units in .debug_names. (#101133)
This patch adds support for verifying local type units in .debug_names
section. It adds a test to test if the TU index is valid, and a test
that tests that an error is found inside the name entry for a type unit.
We don't need to test all other errors in the name entry because these
are essentially identical to compile unit entries, they just use a
different DWARF unit offset index.
2024-08-01 16:19:32 -07:00
Pavel Labath
26cb88e321
Revert "[llvm/DWARF] Recursively resolve DW_AT_signature references" (#99444)
Reverts llvm/llvm-project#97423 due to a failure in the
cross-project-tests.
2024-07-18 10:22:05 +02:00
Pavel Labath
e93df78bd4
[llvm/DWARF] Recursively resolve DW_AT_signature references (#97423)
findRecursively follows DW_AT_specification and DW_AT_abstract_origin
references, but not DW_AT_signature. As far as I can tell, there is no
fundamental difference between these attributes that would make this
behavior desirable, and this just seems like a consequence of the fact
that this attribute is newer. This patch aims to change that.

The motivation is some code in lldb, which assumes that it can construct
a qualified name of a type by just walking the parent chain and looking
at the name attribute. This works for "regular" debug info, even when
some of the DIEs are just forward declarations, but it breaks in the
presence of type units, because of the need to explicitly resolve the
signature reference.

While LLDB does not use the llvm's DWARFDie class (yet?), this seems
like a very important change in the overall API, and any divergence here
would complicate eventual reunification, which is why I am making the
change in the llvm API first. However, putting lldb aside, I think this
change is beneficial in llvm on its own, as it allows us to remove the
explicit DW_AT_signature resolution in the DWARFTypePrinter.
2024-07-18 09:44:06 +02:00
Pavel Labath
d0d61a7e4c
Split DWARFFormValue::getReference into four functions (#98905)
The result of the function cannot be correctly interpreted without
knowing the precise form type (a type signature needs to be looked up
very differently from a supplementary debug info reference). The
function sort of worked because the two reference types (unit-relative
and section-relative) that can be handled uniformly are also the most
common types of references, but this setup made it easy to write code
which does not support other kinds of reference (and if one tried to
support them, the result didn't look pretty --
https://github.com/llvm/llvm-project/pull/97423/files#r1676217081).

The split is based on the reference type classification from DWARFv5
(Section 7.5.5 Classes and Forms), and it should enable uniform (if
slightly more verbose) hadling. Note that this only affects users which
want more control of how (or if) the references are resolved. Users
which just want to access the referenced DIE can use the higher level
API (DWARFDie::GetAttributeValueAsReferencedDie) which returns (or will
return after #97423 is merged) the correct die for all reference types
(except for supplementary references, which we don't support right now).
2024-07-16 12:55:38 +02:00