2580 Commits

Author SHA1 Message Date
Felipe de Azevedo Piovezan
380ac53dfa
[DebugNames] Implement Entry::GetParentEntry query (#78760)
This commit introduces a helper function to DWARFAcceleratorTable::Entry
which follows DW_IDX_Parent attributes to returns the corresponding
parent Entry in the table.

It is tested by enhancing dwarfdump so that it now prints:

1. When data is corrupt.
2. When parent information is present, but the parent is not indexed.
3. The parent entry offset, when the parent is present and indexed. This
is printed in terms a real entry offset (the same that gets printed at
the start of each entry: "Entry @ 0x..."), instead of the encoded number
in the table (which is an offset from the start off the Entry list).
This makes it easy to visually inspect the dwarfdump and check what the
parent is.
2024-01-24 06:44:03 -08:00
Kazu Hirata
b0763a1ae9 [DebugInfo] Use std::size (NFC) 2024-01-24 00:27:38 -08:00
Kazu Hirata
28f9041879 [DebugInfo] Use DenseMap::lookup (NFC) 2024-01-22 21:19:11 -08:00
Kazu Hirata
b7a66d0fae [llvm] Use SmallString::operator std::string (NFC) 2024-01-19 18:54:11 -08:00
Felipe de Azevedo Piovezan
b6677835fe
[AsmPrinter][DebugNames] Implement DW_IDX_parent entries (#77457)
This implements the ideas discussed in [1].

To summarize, this commit changes AsmPrinter so that it outputs
DW_IDX_parent information for debug_name entries. It will enable
debuggers to speed up queries for fully qualified types (based on a
DWARFDeclContext) significantly, as debuggers will no longer need to
parse the entire CU in order to inspect the parent chain of a DIE.
Instead, a debugger can simply take the parent DIE offset from the
accelerator table and peek at its name in the debug_info/debug_str
sections.

The implementation uses two types of DW_FORM for the DW_IDX_parent
attribute:

1. DW_FORM_ref4, which points to the accelerator table entry for the
parent.
2. DW_FORM_flag_present, when the entry has a parent that is not in the
table (that is, the parent doesn't have a name, or isn't allowed to be
in the table as per the DWARF spec). This is space-efficient, since it
takes 0 bytes.

The implementation works by:

1. Changing how abbreviations are encoded (so that they encode which
form, if
any, was used to encode IDX_Parent)
2. Creating an MCLabel per accelerator table entry, so that they may be
referred by IDX_parent references.


When all patches related to this are merged, we are able to show that
evaluating an expression such as:

```
lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \
  clang++ -c -g test.cpp -o /dev/null
```

is far faster: from ~5000 ms to ~1500ms.

Building llvm-project + clang with and without this patch, and looking
at its impact on object file size:

```
ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,507,327,592

-la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,436,446,616
```

That is, an increase of 0.62% in total object file size.

Looking only at debug_names:

```
$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
440,772,348

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
369,867,920
```

That is an increase of 19%.

DWARF Linkers need to be changed in order to support this. This commit
already brings support to "base" linker, but it does not attempt to
modify the parallel linker. Accelerator entries refer to the
corresponding DIE offset, and this patch also requires the parent DIE
offset -- it's not clear how the parallel linker can access this. It may
be obvious to someone familiar with it, but it would be nice to get help
from its authors.

[1]:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/
2024-01-19 09:19:09 -08:00
Kazu Hirata
c6cfd5350e [llvm] Use StringRef::contains (NFC) 2024-01-19 00:19:36 -08:00
Kazu Hirata
ac6d2f1ba0 [DebugInfo] Use StringRef::consume_front (NFC) 2024-01-17 20:23:00 -08:00
Greg Clayton
4618ef8cf5
Allow the dumping of .dwo files contents to show up when dumping an executable with split DWARF. (#66726)
Allow the dumping of .dwo files contents to show up when dumping an
executable with split DWARF.

Currently if you run llvm-dwarfdump on a binary that has skeleton
compile units, you only see the skeleton compile units. Since the main
binary has the linked addresses it would be nice to be able to dump
DWARF from the .dwo files and how the resolved addresses instead of
showing the address index and "<unresolved>" in the output. This patch
adds an option that can be specified to dump the non skeleton DIEs named
--dwo.

Added the ability to use the following options with split dwarf as well:
  --name <name>
  --lookup <addr>
  --debug-info <die-offset>
2024-01-12 13:31:55 -08:00
Kazu Hirata
a5dc3f68a8 [llvm] Use SmallString::operator std::string() (NFC) 2024-01-11 23:32:44 -08:00
Kazu Hirata
5e9da33b87 [llvm] Use StringRef::consume_front_insensitive (NFC) 2024-01-11 22:48:20 -08:00
Shubham Sandeep Rastogi
f22dc88759
[NFC] Address review feedback from PR #77004 (#77134)
Accidentally didn't commit the review feedback before merging the PR
2024-01-05 12:13:36 -08:00
Shubham Sandeep Rastogi
93d2e49b6a
Fix file index verifier when there is no file name in the prologue. (#77004)
If there is no file name in the prologue of a line table, the verifier
will try to verify the file index, which will be set to 1 by default.
This will cause the DWARF verifier to throw an error even if there is no
error.

rdar://114476503
rdar://114343624
2024-01-05 11:15:18 -08:00
Serge Pavlov
cb1a7d28e6
[symbolizer] Support symbol+offset lookup (#75067)
GNU addr2line supports lookup by symbol name in addition to the existing
address lookup. llvm-symbolizer starting from
e144ae54dcb96838a6176fd9eef21028935ccd4f supports lookup by symbol name.
This change extends this lookup with possibility to specify optional
offset.

Now the address for which source information is searched for can be
specified with offset:

    llvm-symbolize --obj=abc.so "SYMBOL func_22+0x12"

It decreases the gap in features of llvm-symbolizer and GNU addr2line.
This lookup now is supported for code only.

Migrated from: https://reviews.llvm.org/D139859
Pull request: https://github.com/llvm/llvm-project/pull/75067
2023-12-15 17:35:33 +07:00
Vitaly Buka
b3e111431c
[DebugInfo] Pass string ownership to MarkupFilter (#75403)
Last `getline` call destroys `InputString`, and `finish` accesses dead
`StringRef`.

Detected with #72677.

Fixes
https://lab.llvm.org/buildbot/#/builders/sanitizer-x86_64-linux-fast
2023-12-14 00:20:55 -08:00
Kazu Hirata
ab20f23e7e [DebugInfo] Use llvm::to_underlying (NFC) 2023-12-08 22:07:29 -08:00
Adrian Prantl
c6805ea44a
[libDebugInfo] Prevent infinite recursion in DWARFDie::getTypeSize() (#74681)
when run on invalid input.
2023-12-07 14:39:45 -08:00
Adrian Prantl
87e22bdd2b Allow for mixing source/no-source DIFiles in one CU
The DWARF proposal that the DW_LNCT_LLVM_source extension is based on
(https://dwarfstd.org/issues/180201.1.html) allows to mix source and
non-source files in the same CU by storing an empty string as a
sentinel value.

This patch implements this feature.

Review in https://github.com/llvm/llvm-project/pull/73877
2023-11-30 15:09:24 -08:00
Greg Clayton
3661eb150e
Add support for parsing type unit entries in .debug_names. (#72952)
This is a follow up patch after .debug_names can now emit local type
unit entries when we compile with type units + DWARF5 + .debug_names.
The pull request that added this functionality was:

https://github.com/llvm/llvm-project/pull/70515

This patch makes sure that the DebugNamesDWARFIndex in LLDB will not
manually need to parse type units if they have a valid index. It also
fixes the index to be able to correctly extract name entries that
reference type unit DIEs. Added a test to verify things work as
expected.
2023-11-28 13:56:45 -08:00
Greg Clayton
18eefc186d
Modify llvm-gsymutil lookups to handle overlapping ranges correctly. (#72350)
llvm-gsymutil allows address ranges to overlap. There was a bug where if
we had debug info for a function with a range like [0x100-0x200) and a
symbol at the same start address yet with a larger range like
[0x100-0x300), we would randomly get either only information from the
first or second entry. This could cause lookups to fail due to the way
the binary search worked.

This patch makes sure that when lookups happen we find the first address
table entry that can match an address, and also ensures that we always
select the first FunctionInfo that could match. FunctionInfo entries are
sorted such that the most debug info rich entries come first. And if we
have two ranges that have the same start address, the smaller range
comes first and the larger one comes next. This patch also adds the
ability to iterate over all function infos with the same start address
to always find a range that contains the address.

Added a unit test to test this functionality that failed prior to this
fix and now succeeds.

Also fix an issue when dumping an entire GSYM file that has duplicate address entries where it used to always print out the binary search match for the FunctionInfo, not the actual data for the address index.
2023-11-17 10:31:12 -08:00
Kazu Hirata
055f377349 [llvm] Stop including llvm/ADT/SparseBitVector.h (NFC)
Identified with clangd.
2023-11-13 07:30:48 -08:00
Orlando Cazalet-Hyams
fbc652061e
[llvm-dwarfdump] pretty-print DW_AT_call_origin reference (#71693) 2023-11-09 09:04:11 +00:00
Greg Clayton
5aa934e2af
Make DWARFUnitVector threadsafe. (#71487)
The DWARFUnitVector class lives inside of the DWARFContextState. Prior
to this fix a non const reference was being handed out to clients. When
fetching the DWO units, there used to be a "bool Lazy" parameter that
could be passed that would allow the DWARFUnitVector to parse individual
units on the fly. There were two major issues with this approach:
- not thread safe and causes crashes
- the accessor would check if DWARFUnitVector was empty and if not empty
it would return a partially filled in DWARFUnitVector if it was
constructed with "Lazy = true"

This patch fixes the issues by always fully parsing the DWARFUnitVector
when it is requested and only hands out a "const DWARFUnitVector &".
This allows the thread safety mechanism built into the DWARFContext
class to work corrrectly, and avoids the issue where if someone
construct DWARFUnitVector with "Lazy = true", and then calls an API that
partially fills in the DWARFUnitVector with individual entries, and then
someone accesses the DWARFUnitVector, they would get a partial and
incomplete listing of the DWARF units for the DWOs.
2023-11-08 10:20:02 -08:00
Greg Clayton
0156b6ed65
Modify llvm-gsymutil to ignore invalid file indexes (#71431)
DWARF produced by LTO and BOLT can sometimes be broken where file
indexes are beyond the end of the line table's file list in the
prologue. This patch allows llvm-gsymutil to convert this DWARF without
crashing, and emits errors when:

line table contains entries with an invalid file index (line entry will
be removed) inline functions that have invalid DW_AT_call_file file
indexes when there are no line table entries for a function and we fall
back to making a single line table entry from the functions
DW_AT_decl_file/DW_AT_decl_line attributes, we make sure the
DW_AT_decl_file attribute is valid before emitting it.
2023-11-06 13:44:46 -08:00
Greg Clayton
27033cc665
Fix line table lookups in line tables with multiple lines with the sa… (#70879)
Fix line table lookups in line tables with multiple lines with the same
address.

Compilers emit line tables that have multiple line table entries with
the same address. When doing lookups, we always need to use the last
line entry if a lookup address matches the address of one or more line
entries. This is because the size of an address range for a line uses
the next line entry to figure out how big the current line entry is. If
the next line entry has the same address, that means the current line
entry has a size of zero, so no bytes correspond to the line entry.

This patch ensures that lookups will always pick the last matching line
entry when the lookup address matches more than one line entry.
2023-11-06 10:09:59 -08:00
Simon Pilgrim
1950190b61 [DebugInfo] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)
2023-11-06 15:20:20 +00:00
J. Ryan Stinnett
d5e33cc147
[DebugInfo] Use human-friendly printing for DWARF column attributes (#71062) 2023-11-04 17:08:42 +00:00
Kazu Hirata
5f5f82af96 [llvm] Adjust files including llvm/Support/Endian.h (NFC)
llvm/include/llvm/Support/VersionTuple.h doesn't need anything from
llvm/Support/Endian.h, but llvm/lib/DebugInfo/BTF/BTFParser.cpp relies
on a transitive inclusion of llvm/Support/Endian.h.
2023-11-02 23:09:04 -07:00
Kazu Hirata
bf383dca36 [llvm] Stop including llvm/Support/Endian.h (NFC)
Identified with misc-include-cleaner.
2023-11-01 23:16:33 -07:00
Serge Pavlov
e144ae54dc [symbolizer] Support symbol lookup
Recent versions of GNU binutils starting from 2.39 support symbol+offset
lookup in addition to the usual numeric address lookup. This change adds
symbol lookup to llvm-symbolize and llvm-addr2line.

Now llvm-symbolize behaves closer to GNU addr2line, - if the value specified
as address in command line or input stream is not a number, it is treated as
a symbol name. For example:

    llvm-symbolize --obj=abc.so func_22
    llvm-symbolize --obj=abc.so "CODE func_22"

This lookup is now supported only for functions. Specification with
offset is not supported yet.

This is a recommit of 2b27948783e4bbc1132d3220d8517ef62607b558, reverted
in 39fec5457c0925bd39f67f63fe17391584e08258 because the test
llvm/test/Support/interrupts.test started failing on Windows. The test was
changed in 18f036d0105589c3175bb51a518c5d272dae61e2 and is also updated in
this commit.

Differential Revision: https://reviews.llvm.org/D149759
2023-11-01 14:41:39 +07:00
Kazu Hirata
f9306f6de3
[ADT] Rename llvm::erase_value to llvm::erase (NFC) (#70156)
C++20 comes with std::erase to erase a value from std::vector.  This
patch renames llvm::erase_value to llvm::erase for consistency with
C++20.

We could make llvm::erase more similar to std::erase by having it
return the number of elements removed, but I'm not doing that for now
because nobody seems to care about that in our code base.

Since there are only 50 occurrences of erase_value in our code base,
this patch replaces all of them with llvm::erase and deprecates
llvm::erase_value.
2023-10-24 23:03:13 -07:00
Kazu Hirata
6e18003a23 [llvm] Use llvm::any_of (NFC) 2023-10-22 10:42:18 -07:00
Kazu Hirata
1f12223cf4 [llvm] Stop including Endian.h (NFC)
These files satisfy all of the following:

- misc-include-cleaner indicates that these files do not need
  Endian.h.
- They do not mention "endian" anywhere.
- They do not include any *.inc or *.def, which could need
  llvm::support::endian.
2023-10-21 22:23:02 -07:00
Kazu Hirata
e5ef5b9e32 [DebugInfo] Use llvm::erase_value (NFC) 2023-10-20 23:45:15 -07:00
Kazu Hirata
dc271b59e9 [DebugInfo] Use llvm::erase_if (NFC) 2023-10-20 00:15:00 -07:00
Alex Langford
0446c589af
[DebugInfo] Correctly report header parsing errors from DWARFContext::fixupIndex (#69505)
In ef762e5e7292, I shifted around where errors were reported when
failing to parse and/or validate DWARFUnitHeaders. When we are doing so
in DWARFContext::fixupIndex, the actual error message isn't prefixed
with `warning:` like it would be elsewhere (because of the way
`logAllUnhandledErrors` is implemented).
2023-10-19 09:48:15 -07:00
Alex Langford
814a79aea6
[DebugInfo] Separate error generation from reporting in DWARFHeaderUnit::extract (#68242)
Instead of reporting the error directly through the DWARFContext passed
in as an argument, it would be more flexible to have extract return the
error and allow the caller to react appropriately.

This will be useful for using llvm's DWARFHeaderUnit from lldb which may
report header extraction errors through a different mechanism.
2023-10-18 09:06:39 -07:00
Greg Clayton
78be6b2234
llvm-gsymutil now handles empty linkage names correctly. (#68931)
Previous to this fix, if we had a DW_TAG_subprogram that had a
DW_AT_linkage_name that was empty, it would attempt to use this name
which would cause an error to be emitted when saving the gsym file to
disk:

error: DWARF conversion failed: : attempted to encode invalid
FunctionInfo object

This patch fixes this issue and adds a unit test case.
2023-10-16 16:24:07 -07:00
Min-Yih Hsu
42b707e5b4 [DWARF][M68k] Add new DW_CC for the new M68kRTD calling convention
Add `DW_CC_M68kRTD` to model the new `llvm::CallingConv::M68kRTD`.

Differential Revision: https://reviews.llvm.org/D152587
2023-10-15 16:13:05 -07:00
Kazu Hirata
9bcc094d37 [llvm] Use llvm::erase_if (NFC) 2023-10-12 22:59:25 -07:00
Kazu Hirata
4a0ccfa865 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-12 21:21:45 -07:00
Kazu Hirata
b8885926f8 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an enum.
This patch replaces llvm::support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-10 22:54:51 -07:00
Kazu Hirata
b05dbc4d5f [llvm] Use llvm::endianness::{big,little,native} (NFC)
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form.  This patch replaces
support::endianness::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-10 20:14:20 -07:00
Alexander Yermolovich
2a9f02b709
[LLVM][DWARF] Create thread safe context for DWO/DWP DWARFContext (#68262)
Right now DWARFContext for DWO/DWP that is created is not thread safe. 
Changed it so that thread safety is inherited from the main
binary DWARFContext.
2023-10-10 06:53:44 -07:00
Chen Zheng
efb11c4022
Support big endian in llvm-symbolizer's data location dwarf info parser (#67284)
For now, data location expression is hard coded to little endian. We are
going to support sanitizers on AIX which is big endian. Support big
endian too in the data location expression parser of llvm-symbolizer.
2023-10-10 09:13:25 +08:00
Kazu Hirata
d7b18d5083 Use llvm::endianness{,::little,::native} (NFC)
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form.  This patch replaces
llvm::support::endianness with llvm::endianness.
2023-10-09 00:54:47 -07:00
Kazu Hirata
b8f70feaef [llvm] Remove "using support::endianness;" (NFC)
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can directly get endianness from the llvm
namespace.  We don't need to go through support.
2023-10-08 00:11:15 -07:00
Alexander Yermolovich
78502251f4
[DWARF] Change to consistently print out abbrev code in .debug_names (#68353)
Changed so that when Abbrev code is printed out for entry it is done in
the same
way as in Abbrev table.
Once letters are present in a hex number in abbrev table they will be
lower case,
and in the Entry upper case. Which makes FIleCheck Pattern recognition
fail.
Example in: llvm/test/tools/llvm-dwarfdump/X86/debug-names-misaligned.s
2023-10-06 16:19:43 -07:00
Alexandre Ganea
356139bd02
[LLD][COFF] Add support for --time-trace (#68236)
This adds support for generating Chrome-tracing .json profile traces in
the LLD COFF driver.

Also add the necessary time scopes, so that the profile trace shows in
great detail which tasks are executed.

As an example, this is what we see when linking a Unreal Engine
executable:

![image](https://github.com/llvm/llvm-project/assets/37383324/b2e26eb4-9d37-4cf9-b002-48b604e7dcb7)
2023-10-05 22:33:58 -04:00
Nico Weber
f320065aeb Revert "[LLVM][DWARF] Add support for monolithic types in .debug_names (#68131)"
This reverts commit 9bbd2bf654634cd95dd0be7948ec8402c3c76e1e.

Accidental commit: https://github.com/llvm/llvm-project/pull/68131#issuecomment-1749430207
2023-10-05 14:47:04 -04:00
Alexander Yermolovich
9bbd2bf654
[LLVM][DWARF] Add support for monolithic types in .debug_names (#68131)
Added support for Type Units in monolithic DWARF in .debug_names.
2023-10-05 11:14:18 -07:00