For high-frequency multithreaded progress reports, the contention of
taking the progress mutex and the overhead of generating event can
significantly slow down the operation whose progress we are reporting.
This patch adds an (optional) capability to rate-limit them. It's
optional because this approach has one drawback: if the progress
reporting has a pattern where it generates a burst of activity and then
blocks (without reporting anything) for a large amount of time, it can
appear as if less progress was made that it actually was (because we
only reported the first event from the burst and dropped the other
ones).
I've also made a small refactor of the Progress class to better
distinguish between const (don't need protection), atomic (are used on
the hot path) and other (need mutex protection) members.
Indexing a single DWARF unit is a fairly small task, which means the
overhead of enqueueing a task for each unit is not negligible (mainly
because introduces a lot of synchronization points for queue management,
memory allocation etc.). This is particularly true if the binary was
built with type units, as these are usually very small.
This essentially brings us back to the state before
https://reviews.llvm.org/D78337, but the new implementation is built on
the llvm ThreadPool, and I've added a small improvement -- we now
construct one "index set" per thread instead of one per unit, which
should lower the memory usage (fewer small allocations) and make the
subsequent merge step faster.
On its own this patch doesn't actually change the performance
characteristics because we still have one choke point -- progress
reporting. I'm leaving that for a separate patch, but I've tried that
simply removing the progress reporting gives us about a 30-60% speed
boost.
The ManualDWARFIndex class can create a index cache if the LLDB index
cache is enabled. This used to save the index to the same file,
regardless of wether the cache was a full index (no .debug_names) or a
partial index (have .debug_names, but not all .o files were had
.debug_names). So we could end up saving an index cache that was
partial, and then later load that partial index as if it were a full
index if the user set the 'settings set
plugin.symbol-file.dwarf.ignore-file-indexes true'. This would cause us
to ignore the .debug_names section, and if the index cache was enabled,
we could end up loading the partial index as if it were a full DWARF
index.
This patch detects when the ManualDWARFIndex is being used with
.debug_names, and saves out a cache file with a suffix of "-full" or
"-partial" to avoid this issue.
In DWARF 4 and earlier `static const` members of structs, classes and
unions have an entry tag `DW_TAG_member`, and are also tagged as
`DW_AT_declaration`, but otherwise follow the same rules as
`DW_TAG_variable`.
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Avoid unneeded calls to raw_string_ostream::str(), to avoid excess
indirection.
This patch adds support for the new foreign type unit support in
.debug_names. Features include:
- don't manually index foreign TUs if we have info for them
- only use the type unit entries that match the .dwo files when we have
a .dwp file
- fix type unit lookups for .dwo files
- fix crashers that happen due to PeekDIEName() using wrong offsets where an entry had DW_IDX_comp_unit and DW_IDX_type_unit entries and when we had no type unit support, it would cause us to think it was a normal DIE in .debug_info from the main executable.
---------
Co-authored-by: paperchalice <liujunchang97@outlook.com>
Per this RFC:
https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717
on improving progress reports, this commit separates the title field and
details field so that the title specifies the category that the progress
report falls under. The details field is added as a part of the
constructor for progress reports and by default is an empty string. In addition, changes the total amount of progress completed into a std::optional. Also
updates the test to check for details being correctly reported from the
event structured data dictionary.
As a followup of https://github.com/llvm/llvm-project/pull/67851, I'm
defining a new namespace `lldb_plugin::dwarf` for the classes in this
Plugins/SymbolFile/DWARF folder. This change is very NFC and helped me
with exporting the necessary symbols for my out-of-tree language plugin.
The only class that I didn't change is ClangDWARFASTParser, because that
shouldn't be in the same namespace as the generic language-agnostic
dwarf parser.
It would be a good idea if other plugins follow the same namespace
scheme.
The goal of this patch is to make it easier to reason about the state of
ObjCLanguage::MethodName. I do that in several ways:
- Instead of using the constructor directly, you go through a factory
method. It returns a std::optional<MethodName> so either you got an
ObjCLanguage::MethodName or you didn't. No more checking if it's valid
to know if you can use it or not.
- ObjCLanguage::MethodName is now immutable. You cannot change its
internals once it is created.
- ObjCLanguage::MethodName::GetFullNameWithoutCategory previously had a
parameter that let you get back an empty string if the method had no
category. Every caller of this method was enabling this behavior so I
dropped the parameter and made it the default behavior.
- No longer store all the various components of the method name as
ConstStrings. The relevant `Get` methods now return llvm::StringRefs
backed by the MethodName's internal storage. The lifetime of these
StringRefs are tied to the MethodName itself, so if you need to
persist these you need to create copies.
Differential Revision: https://reviews.llvm.org/D149914
The purpose of this method is to get the list of attributes of a
DebugInfoEntry. Prior to this change we were passing in a mutable
reference to a DWARFAttributes object and having the method fill it in
for us while returning the size of the filled out list. But
instead of doing that, we can just return a `DWARFAttributes` object
ourselves since every caller creates a new list before calling
GetAttributes.
Differential Revision: https://reviews.llvm.org/D150402
Similar to dw_form_t, dw_attr_t is typedef'd to be a uint16_t. LLVM
defines their type `llvm::dwarf::Attribute` as an enum backed by a
uint16_t. Switching to the LLVM type buys us type checking and the
requirement of explicit casts.
Differential Revision: https://reviews.llvm.org/D150299
This came out of from https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
With big binaries we can have .dwp files where .debug_info.dwo section can grow
beyond 4GB. We would like to support this in LLVM and in LLDB.
The plan is to enable manual parsing of cu/tu index in DWARF library
(https://reviews.llvm.org/D137882), and then
switch internal index data structure to 64 bit.
For the second part is to enable 64bit offset support in LLDB with
this patch.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D138618
This came out of from https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
With big binaries we can have .dwp files where .debug_info.dwo section can grow
beyond 4GB. We would like to support this in LLVM and in LLDB.
The plan is to enable manual parsing of cu/tu index in DWARF library
(https://reviews.llvm.org/D137882), and then
switch internal index data structure to 64 bit.
For the second part is to enable 64bit offset support in LLDB with
this patch.
Depends on D139955
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D138618
This came out of from https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
With big binaries we can have .dwp files where .debug_info.dwo section can grow
beyond 4GB. We would like to support this in LLVM and in LLDB.
The plan is to enable manual parsing of cu/tu index in DWARF library
(https://reviews.llvm.org/D137882), and then
switch internal index data structure to 64 bit.
For the second part is to enable 64bit offset support in LLDB with
this patch.
Depends on D139955
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D138618
This relands a patch previously reverted
in `181d6e24ca3c09bfd6ec7c3b20affde3e5ea9b40`.
This wasn't quite working on Linux because we
weren't populating the manual DWARF index with
`DW_TAG_imported_declaration`. The relanded patch
does this.
**Summary**
This patch makes the expression evaluator understand
namespace aliases.
This will become important once `std::ranges` become
more widespread since `std::views` is defined as:
```
namespace std {
namespace ranges::views {}
namespace views = ranges::views;
}
```
**Testing**
* Added API test
Differential Revision: https://reviews.llvm.org/D143398
This fixes a regression introduced by
https://reviews.llvm.org/D131437. The intention of the patch was to
avoid indexing DWO skeleton units, but it also skipped over full DWARF
compile units linked via a -gmodules DW_AT_dwo_name attribute. This
patch restores the functionality and adds a test for it.
Differential Revision: https://reviews.llvm.org/D142683
In preparation for eanbling 64bit support in LLDB switching to use llvm::formatv
instead of format MACROs.
Reviewed By: labath, JDevlieghere
Differential Revision: https://reviews.llvm.org/D139955
When fission is enabled, we were indexing the skeleton CU _and_ the .dwo CU. Issues arise when users enable compiler options that add extra data to the skeleton CU (like -fsplit-dwarf-inlining) and there can end up being types in the skeleton CU due to template parameters. We never want to index this information since the .dwo file has the real definition, and we really don't want function prototypes from this info since all parameters are removed. The index doesn't work correctly if it does index the skeleton CU as the DIE offset will assume it is from the .dwo file, so even if we do index the skeleton CU, the index entries will try and grab information from the .dwo file using the wrong DIE offset which can cause errors to be displayed or even worse, if the DIE offsets is valid in the .dwo CU, the wrong DIE will be used.
We also fix DWO ID detection to use llvm::Optional<uint64_t> to make sure we can load a .dwo file with a DWO ID of zero.
Differential Revision: https://reviews.llvm.org/D131437
This reverts commit 967df65a3610f98a3bc0ec0f2303641d7bad176c.
This fixes test/Shell/SymbolFile/NativePDB/find-functions.cpp. When
looking up functions with the PDB plugins, if we are looking for a
full function name, we should use `GetName` to populate the `name`
field instead of `GetLookupName` since `GetName` has the more
complete information.
This reverts commit befa77e59a7760d8c4fdd177b234e4a59500f61c.
Looks like this broke a SymbolFileNativePDB test. I'll investigate and
resubmit with a fix soon.
Context:
When setting a breakpoint by name, we invoke Module::FindFunctions to
find the function(s) in question. However, we use a Module::LookupInfo
to first process the user-provided name and figure out exactly what
we're looking for. When we actually perform the function lookup, we
search for the basename. After performing the search, we then filter out
the results using Module::LookupInfo::Prune. For example, given
a:🅱️:foo we would first search for all instances of foo and then filter
out the results to just names that have a:🅱️:foo in them. As one can
imagine, this involves a lot of debug info processing that we do not
necessarily need to be doing. Instead of doing one large post-processing
step after finding each instance of `foo`, we can filter them as we go
to save time.
Some numbers:
Debugging LLDB and placing a breakpoint on
llvm::itanium_demangle::StringView::begin without this change takes
approximately 70 seconds and resolves 31,920 DIEs. With this change,
placing the breakpoint takes around 30 seconds and resolves 8 DIEs.
Differential Revision: https://reviews.llvm.org/D129682
As a preparation for parallelizing loading of symbols (D122975),
it is necessary to use just one thread pool to avoid using
a thread pool from inside a task of another thread pool.
Differential Revision: https://reviews.llvm.org/D123226
We have using namespace llvm::dwarf in dwarf.h header globally. Replacing that
with a using namespace within lldb_private::dwarf and moving to a
using namespace lldb_private::dwarf in .cpp files and fully qualified names
in the few header files.
Differential Revision: https://reviews.llvm.org/D120836
This patch add the ability to cache the manual DWARF indexing results to disk for faster subsequent debug sessions. Manual DWARF indexing is time consuming and causes all DWARF to be fully parsed and indexed each time you debug a binary that doesn't have an acceptable accelerator table. Acceptable accelerator tables include .debug_names in DWARF5 or Apple accelerator tables.
This patch breaks up testing by testing all of the encoding and decoding of required C++ objects in a gtest unit test, and then has a test to verify the debug info cache is generated correctly.
This patch also adds the ability to track when a symbol table or DWARF index is loaded or saved to the cache in the "statistics dump" command. This is essential to know in statistics as it can help explain why a debug session was slower or faster than expected.
Reviewed By: labath, wallace
Differential Revision: https://reviews.llvm.org/D115951
The new key/value pairs that are added to each module's stats are:
"debugInfoByteSize": The size in bytes of debug info for each module.
"debugInfoIndexTime": The time in seconds that it took to index the debug info.
"debugInfoParseTime": The time in seconds that debug info had to be parsed.
At the top level we add up all of the debug info size, parse time and index time with the following keys:
"totalDebugInfoByteSize": The size in bytes of all debug info in all modules.
"totalDebugInfoIndexTime": The time in seconds that it took to index all debug info if it was indexed for all modules.
"totalDebugInfoParseTime": The time in seconds that debug info was parsed for all modules.
Differential Revision: https://reviews.llvm.org/D112501
Skeleton vs. DWO units mismatch has been fixed in D106270. As they both
have type DWARFUnit it is a bit difficult to debug. So it is better to
make it safe against future changes.
Reviewed By: kimanh, clayborg
Differential Revision: https://reviews.llvm.org/D107659
When going through the CU entries in the name index,
make sure to compare the name entry's CU
offset against the skeleton CU's offset.
Previously there would be a mismatch, since the
wrong offset was compared, and thus no suitable
entry was found.
Reviewed By: jankratochvil
Differential Revision: https://reviews.llvm.org/D106270
LLDB can often appear deadlocked to users that use IDEs when it is indexing DWARF, or parsing symbol tables. These long running operations can make a debug session appear to be doing nothing even though a lot of work is going on inside LLDB. This patch adds a public API to allow clients to listen to debugger events that report progress and will allow UI to create an activity window or display that can show users what is going on and keep them informed of expensive operations that are going on inside LLDB.
Differential Revision: https://reviews.llvm.org/D97739
This patch introduces a LLDB_SCOPED_TIMER macro to hide the needlessly
repetitive creation of scoped timers in LLDB. It's similar to the
LLDB_LOG(F) macro.
Differential revision: https://reviews.llvm.org/D93663
Add an optimal thread strategy to execute specified amount of tasks.
This strategy should prevent us from creating too many threads if we
occasionaly have an unexpectedly small amount of tasks.
Differential Revision: https://reviews.llvm.org/D87765
Pavel Labath wrote in D73206:
The internal representation of DebugNames and Apple indexes is fixed by
the relevant (pseudo-)standards, so we can't really change it. The
question is how to efficiently (and cleanly) convert from the internal
representation to some common thing. The conversion from AppleIndex to
DIERef is trivial (which is not surprising as it was the first and the
overall design was optimized for that). With debug_names, the situation
gets more tricky. The internal representation of debug_names uses
CU-relative DIE offsets, but DIERef wants an absolute offset. That means
the index has to do more work to produce the common representation. And
it needs to do that for all results, even though a lot of the index
users are really interested only in a single entry. With the switch to
user_id_t, _all_ indexes would have to do some extra work to encode it,
only for their users to have to immediately decode it back. Having
a iterator/callback based api would allow us to minimize the impact of
that, as it would only need to happen for the entries that are really
used. And /I think/ we could make it interface returns DWARFDies
directly, and each index converts to that using the most direct approach
available.
Jan Kratochvil:
It also makes all the callers shorter as they no longer need to fetch
DWARFDIE from DIERef (and handling if not found by ReportInvalidDIERef)
but the callers are already served DWARFDIE which they need.
In some cases the DWARFDIE had to be fetched both by callee (DWARFIndex
implementation) and caller.
Differential Revision: https://reviews.llvm.org/D77970
It is an obvious part of D77326.
It removes some needless deep indentation and some redundant statements.
It prepares the code for a more clean next patch - DWARF index callbacks
in D77327.
Summary:
When we added support for type units in dwo files, we changed the
"manual" dwarf index to index _all_ dwarf units in the dwo file instead
of just the split unit belonging to our skeleton unit. This was fine for
dwo files, as they contain only a single compile units and type units do
not have a split type unit which would point to them.
However, this does not work for dwp files because, these files do
contain multiple split compile units, and the current approach means
that each unit gets indexed multiple times (once for each split unit =>
n^2 complexity).
This patch teaches the manual dwarf index to treat dwp files specially.
Any type units in the dwp file added to the main list of compile units
and indexed with them in a single batch. Split compile units in dwp
files are still indexed as a part of their skeleton unit -- this is done
because we need the DW_AT_language attribute from the skeleton unit to
index them properly.
Handling of dwo files remains unchanged -- all units (type and skeleton)
are indexed when we reach the dwo file through the split unit.
Reviewers: clayborg, JDevlieghere, aprantl
Subscribers: arphaman, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D74964
Change the return value of SymbolFileDWARF::DebugInfo from a pointer to
a reference, and remove all null checks.
Previously, we were not constructing the DebugInfo object when the
debug_info section was empty. Now we always construct the object but
it will return an empty list of dwarf units (a thing which it already
supported).
Summary:
All of our lookup APIs either use `CompilerDeclContext &` or `CompilerDeclContext *` semi-randomly it seems.
This leads to us constantly converting between those two types (and doing nullptr checks when going from
pointer to reference). It also leads to the confusing situation where we have two possible ways to express
that we don't have a CompilerDeclContex: either a nullptr or an invalid CompilerDeclContext (aka a default
constructed CompilerDeclContext).
This moves all APIs to use references and gets rid of all the nullptr checks and conversions.
Reviewers: labath, mib, shafik
Reviewed By: labath, shafik
Subscribers: shafik, arphaman, abidh, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D74607
Summary:
This is a preparatory patch to re-enable DWP support in lldb (we already
have code claiming to do that, but it has been completely broken for a
while now).
The idea of the new approach is to make the SymbolFileDWARFDwo class
handle both dwo and dwo files, similar to how llvm uses one DWARFContext
to handle the two.
The first step is to remove the assumption that a SymbolFileDWARFDwo
holds just a single compile unit, i.e. the GetBaseCompileUnit method.
This requires changing the way how we reach the skeleton compile unit
(and the lldb_private::CompileUnit) from a dwo unit, which was
previously done via GetSymbolFile()->GetBaseCompileUnit() (and some
virtual dispatch).
The new approach reuses the "user data" mechanism of DWARFUnits, which
was used to link dwarf units (both skeleton and split) to their
lldb_private counterparts. Now, this is done only for non-dwo units, and
instead of that, the dwo units holds a pointer to the relevant skeleton
unit.
Reviewers: JDevlieghere, aprantl, clayborg
Reviewed By: JDevlieghere, clayborg
Subscribers: arphaman, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73781
This patchset is removing non-DWARF code from DWARFUnit for better
future merge with LLVM DWARF as discussed with @labath.
Differential revision: https://reviews.llvm.org/D70646