The previous commit was reverted because it broke a test on the bots.
Original commit message:
By profiling LLDB debugging a Swift application without a dSYM and a
large amount of .o files, I identified that querying shared modules was
the biggest bottleneck when running "frame variable", and Clang types
need to be searched.
One of the reasons for that slowness is that the shared module list can
can grow very large, and the search through it is O(n).
To solve this issue, this patch adds a new hashmap to the shared module
list whose key is the name of the module, and the value is all the
modules that share that name. This should speed up any search where the
query contains the module name.
rdar://156753350
By profiling LLDB debugging a Swift application without a dSYM and a
large amount of .o files, I identified that querying shared modules was
the biggest bottleneck when running "frame variable", and Clang types
need to be searched.
One of the reasons for that slowness is that the shared module list can
grow very large, and the search through it is O(n).
To solve this issue, this patch adds a new hashmap to the shared module
list whose key is the name of the module, and the value is all the
modules that share that name. This should speed up any search where the
query contains the module name.
rdar://156753350
Summary:
There was a deadlock was introduced by [PR
#146441](https://github.com/llvm/llvm-project/pull/146441) which changed
`CurrentThreadIsPrivateStateThread()` to
`CurrentThreadPosesAsPrivateStateThread()`. This change caused the
execution path in
[`ExecutionContextRef::SetTargetPtr()`](10b5558b61/lldb/source/Target/ExecutionContext.cpp (L513))
to now enter a code block that was previously skipped, triggering
[`GetSelectedFrame()`](10b5558b61/lldb/source/Target/ExecutionContext.cpp (L522))
which leads to a deadlock.
Thread 1 gets m_modules_mutex in
[`ModuleList::AppendImpl`](96148f9214/lldb/source/Core/ModuleList.cpp (L218)),
Thread 3 gets m_language_runtimes_mutex in
[`GetLanguageRuntime`](96148f9214/lldb/source/Target/Process.cpp (L1501)),
but then Thread 1 waits for m_language_runtimes_mutex in
[`GetLanguageRuntime`](96148f9214/lldb/source/Target/Process.cpp (L1501))
while Thread 3 waits for m_modules_mutex in
[`ScanForGNUstepObjCLibraryCandidate`](96148f9214/lldb/source/Plugins/LanguageRuntime/ObjC/GNUstepObjCRuntime/GNUstepObjCRuntime.cpp (L57)).
This fixes the deadlock by adding a scoped block around the mutex lock
before the call to the notifier, and moved the notifier call outside of
the mutex-guarded section. The notifier call
[`NotifyModuleAdded`](96148f9214/lldb/source/Target/Target.cpp (L1810))
should be thread-safe, since the module should be added to the
`ModuleList` before the mutex is released, and the notifier doesn't
modify the module list further, and the call is operates on local state
and the `Target` instance.
### Deadlocked Thread backtraces:
```
* thread #3, name = 'dbg.evt-handler', stop reason = signal SIGSTOP
* frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563786bd5f40) at futex-internal.h:146:13
/*... a bunch of mutex related bt ... */
liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007f2f0f1927b0, __m=0x0000563786bd5f40) at std_mutex.h:229:19
frame #8: 0x00007f2f27946eb7 liblldb.so.21.0git`ScanForGNUstepObjCLibraryCandidate(modules=0x0000563786bd5f28, TT=0x0000563786bd5eb8) at GNUstepObjCRuntime.cpp:60:41
frame #9: 0x00007f2f27946c80 liblldb.so.21.0git`lldb_private::GNUstepObjCRuntime::CreateInstance(process=0x0000563785e1d360, language=eLanguageTypeObjC) at GNUstepObjCRuntime.cpp:87:8
frame #10: 0x00007f2f2746fca5 liblldb.so.21.0git`lldb_private::LanguageRuntime::FindPlugin(process=0x0000563785e1d360, language=eLanguageTypeObjC) at LanguageRuntime.cpp:210:36
frame #11: 0x00007f2f2742c9e3 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeObjC) at Process.cpp:1516:9
...
frame #21: 0x00007f2f2750b5cc liblldb.so.21.0git`lldb_private::Thread::GetSelectedFrame(this=0x0000563785e064d0, select_most_relevant=DoNoSelectMostRelevantFrame) at Thread.cpp:274:48
frame #22: 0x00007f2f273f9957 liblldb.so.21.0git`lldb_private::ExecutionContextRef::SetTargetPtr(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:525:32
frame #23: 0x00007f2f273f9714 liblldb.so.21.0git`lldb_private::ExecutionContextRef::ExecutionContextRef(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:413:3
frame #24: 0x00007f2f270e80af liblldb.so.21.0git`lldb_private::Debugger::GetSelectedExecutionContext(this=0x0000563785d83bc0) at Debugger.cpp:1225:23
frame #25: 0x00007f2f271bb7fd liblldb.so.21.0git`lldb_private::Statusline::Redraw(this=0x0000563785d83f30, update=true) at Statusline.cpp:136:41
...
* thread #1, name = 'lldb', stop reason = signal SIGSTOP
* frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563785e1dd98) at futex-internal.h:146:13
/*... a bunch of mutex related bt ... */
liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007ffe62be0488, __m=0x0000563785e1dd98) at std_mutex.h:229:19
frame #8: 0x00007f2f2742c8d1 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeC_plus_plus) at Process.cpp:1510:41
frame #9: 0x00007f2f2743c46f liblldb.so.21.0git`lldb_private::Process::ModulesDidLoad(this=0x0000563785e1d360, module_list=0x00007ffe62be06a0) at Process.cpp:6082:36
...
frame #13: 0x00007f2f2715cf03 liblldb.so.21.0git`lldb_private::ModuleList::AppendImpl(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, use_notifier=true) at ModuleList.cpp:246:19
frame #14: 0x00007f2f2715cf4c liblldb.so.21.0git`lldb_private::ModuleList::Append(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, notify=true) at ModuleList.cpp:251:3
...
frame #19: 0x00007f2f274349b3 liblldb.so.21.0git`lldb_private::Process::ConnectRemote(this=0x0000563785e1d360, remote_url=(Data = "connect://localhost:1234", Length = 24)) at Process.cpp:3250:9
frame #20: 0x00007f2f27411e0e liblldb.so.21.0git`lldb_private::Platform::DoConnectProcess(this=0x0000563785c59990, connect_url=(Data = "connect://localhost:1234", Length = 24), plugin_name=(Data = "gdb-remote", Length = 10), debugger=0x0000563785d83bc0, stream=0x00007ffe62be3128, target=0x0000563786bd5be0, error=0x00007ffe62be1ca0) at Platform.cpp:1926:23
```
## Test Plan:
Built a hello world a.out
Run server in one terminal:
```
~/llvm/build/Debug/bin/lldb-server g :1234 a.out
```
Run client in another terminal
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
```
Before:
Client hangs indefinitely
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b main"
(lldb) gdb-remote 1234
^C^C
```
After:
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
(lldb) gdb-remote 1234
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = signal SIGSTOP
frame #0: 0x00007ffff7fe4a60
ld-linux-x86-64.so.2`_start:
-> 0x7ffff7fe4a60 <+0>: movq %rsp, %rdi
0x7ffff7fe4a63 <+3>: callq 0x7ffff7fe5780 ; _dl_start at rtld.c:522:1
ld-linux-x86-64.so.2`_dl_start_user:
0x7ffff7fe4a68 <+0>: movq %rax, %r12
0x7ffff7fe4a6b <+3>: movl 0x18067(%rip), %eax ; _dl_skip_args
(lldb) b hello.cc:3
Breakpoint 1: where = a.out`main + 15 at hello.cc:4:13, address = 0x00005555555551bf
(lldb) c
Process 837068 resuming
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
frame #0: 0x00005555555551bf a.out`main at hello.cc:4:13
1 #include <iostream>
2
3 int main() {
-> 4 std::cout << "Hello World" << std::endl;
5 return 0;
6 }
```
LLDB currently attaches `AsmLabel`s to `FunctionDecl`s such that that
the `IRExecutionUnit` can determine which mangled name to call (we can't
rely on Clang deriving the correct mangled name to call because the
debug-info AST doesn't contain all the info that would be encoded in the
DWARF linkage names). However, we don't attach `AsmLabel`s for structors
because they have multiple variants and thus it's not clear which
mangled name to use. In the [RFC on fixing expression evaluation of
abi-tagged
structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
we discussed encoding the structor variant into the `AsmLabel`s.
Specifically in [this
thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
we discussed that the contents of the `AsmLabel` are completely under
LLDB's control and we could make use of it to uniquely identify a
function by encoding the exact module and DIE that the function is
associated with (mangled names need not be enough since two identical
mangled symbols may live in different modules). So if we already have a
custom `AsmLabel` format, we can encode the structor variant in a
follow-up (the current idea is to append the structor variant as a
suffix to our custom `AsmLabel` when Clang emits the mangled name into
the JITted IR). Then we would just have to teach the `IRExecutionUnit`
to pick the correct structor variant DIE during symbol resolution. The
draft of this is available
[here](https://github.com/llvm/llvm-project/pull/149827)
This patch sets up the infrastructure for the custom `AsmLabel` format
by encoding the module id, DIE id and mangled name in it.
**Implementation**
The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is:
`$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this
label and then do a lookup by DIE ID (or mangled name into the module if
the encoded DIE is a declaration).
Depends on https://github.com/llvm/llvm-project/pull/151355
That calls an unknown amount of Python code, and can do quite a bit of
work - especially if people do things like launch scripted processes in
this script affordance. Doing that while holding a major lock like the
ModuleList lock is asking for trouble.
I tried to make a test that would actually stall without this, but I
couldn't come up with anything that reliably failed. You always have to
get pretty unlucky.
Identical PR to: https://github.com/llvm/llvm-project/pull/134563
Previous PR was approved and landed but broke the build due to bad
merge.
Manually resolve the merge conflict and try to land again.
Co-authored-by: George Hu <georgehuyubo@gmail.com>
This reverts commit 070a4ae2f9bcf6967a7147ed2972f409eaa7d3a6.
Multiple buildbot failures have been reported:
https://github.com/llvm/llvm-project/pull/134563
The build fails with:
lldb/source/Target/Statistics.cpp:75:39: error: use of undeclared
identifier 'num_symbols_loaded'
This patch removes all of the Set.* methods from Status.
This cleanup is part of a series of patches that make it harder use the
anti-pattern of keeping a long-lives Status object around and updating
it while dropping any errors it contains on the floor.
This patch is largely NFC, the more interesting next steps this enables
is to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()
Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form
` ResultTy DoFoo(Status& error)
`
to
` llvm::Expected<ResultTy> DoFoo()
`
How to read this patch?
The interesting changes are in Status.h and Status.cpp, all other
changes are mostly
` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git
grep -l SetErrorString lldb/source)
`
plus the occasional manual cleanup.
LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.
Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.
This patch renames the setting to have a more generic name
(symbols.auto-download) and changes its values from a boolean to an
enum. Users can now specify "off", "background" and "foreground". The
default remains "off" although I'll probably change that in the near
future.
LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.
Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.
This patch renames the setting to have a more generic name
(symbols.auto-download) and changes its values from a boolean to an
enum. Users can now specify "off", "background" and "foreground". The
default remains "off" although I'll probably change that in the near
future.
We claim in a couple places that the zeroth element of the module list
for a target is the main executable, but we don't actually enforce that
in the ModuleList class. As we saw, for instance, in
32dd5b20973bde1ef77fa3b84b9f85788a1a303a
it's not all that hard to get this to be off. This patch ensures that
the first object file of type Executable added to it is moved to the
front of the ModuleList. I also added a test for this.
In the normal course of operation, where the executable is added first,
this only adds a check for whether the first element in the module list
is an executable. If that's true, we just append as normal.
Note, the code in Target::GetExecutableModule doesn't actually agree
that the zeroth element must be the executable, it instead returns the
first Module of type Executable. But I can't tell whether that was a
change in intention or just working around the bug that we don't always
maintain this ordering. But given we've said this in scripting as well
as internally, I think we shouldn't change our minds about this.
LLVM supports DWARF 5 linetable extension to store source files inline
in DWARF. This is particularly useful for compiler-generated source
code. This implementation tries to materialize them as temporary files
lazily, so SBAPI clients don't need to be aware of them.
rdar://110926168
This patch revives the effort to get this Phabricator patch into
upstream:
https://reviews.llvm.org/D137900
This patch was accepted before in Phabricator but I found some
-gsimple-template-names issues that are fixed in this patch.
A fixed up version of the description from the original patch starts
now.
This patch started off trying to fix Module::FindFirstType() as it
sometimes didn't work. The issue was the SymbolFile plug-ins didn't do
any filtering of the matching types they produced, and they only looked
up types using the type basename. This means if you have two types with
the same basename, your type lookup can fail when only looking up a
single type. We would ask the Module::FindFirstType to lookup "Foo::Bar"
and it would ask the symbol file to find only 1 type matching the
basename "Bar", and then we would filter out any matches that didn't
match "Foo::Bar". So if the SymbolFile found "Foo::Bar" first, then it
would work, but if it found "Baz::Bar" first, it would return only that
type and it would be filtered out.
Discovering this issue lead me to think of the patch Alex Langford did a
few months ago that was done for finding functions, where he allowed
SymbolFile objects to make sure something fully matched before parsing
the debug information into an AST type and other LLDB types. So this
patch aimed to allow type lookups to also be much more efficient.
As LLDB has been developed over the years, we added more ways to to type
lookups. These functions have lots of arguments. This patch aims to make
one API that needs to be implemented that serves all previous lookups:
- Find a single type
- Find all types
- Find types in a namespace
This patch introduces a `TypeQuery` class that contains all of the state
needed to perform the lookup which is powerful enough to perform all of
the type searches that used to be in our API. It contain a vector of
CompilerContext objects that can fully or partially specify the lookup
that needs to take place.
If you just want to lookup all types with a matching basename,
regardless of the containing context, you can specify just a single
CompilerContext entry that has a name and a CompilerContextKind mask of
CompilerContextKind::AnyType.
Or you can fully specify the exact context to use when doing lookups
like: CompilerContextKind::Namespace "std"
CompilerContextKind::Class "foo"
CompilerContextKind::Typedef "size_type"
This change expands on the clang modules code that already used a
vector<CompilerContext> items, but it modifies it to work with
expression type lookups which have contexts, or user lookups where users
query for types. The clang modules type lookup is still an option that
can be enabled on the `TypeQuery` objects.
This mirrors the most recent addition of type lookups that took a
vector<CompilerContext> that allowed lookups to happen for the
expression parser in certain places.
Prior to this we had the following APIs in Module:
```
void
Module::FindTypes(ConstString type_name, bool exact_match, size_t max_matches,
llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
TypeList &types);
void
Module::FindTypes(llvm::ArrayRef<CompilerContext> pattern, LanguageSet languages,
llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
TypeMap &types);
void Module::FindTypesInNamespace(ConstString type_name,
const CompilerDeclContext &parent_decl_ctx,
size_t max_matches, TypeList &type_list);
```
The new Module API is much simpler. It gets rid of all three above
functions and replaces them with:
```
void FindTypes(const TypeQuery &query, TypeResults &results);
```
The `TypeQuery` class contains all of the needed settings:
- The vector<CompilerContext> that allow efficient lookups in the symbol
file classes since they can look at basename matches only realize fully
matching types. Before this any basename that matched was fully realized
only to be removed later by code outside of the SymbolFile layer which
could cause many types to be realized when they didn't need to.
- If the lookup is exact or not. If not exact, then the compiler context
must match the bottom most items that match the compiler context,
otherwise it must match exactly
- If the compiler context match is for clang modules or not. Clang
modules matches include a Module compiler context kind that allows types
to be matched only from certain modules and these matches are not needed
when d oing user type lookups.
- An optional list of languages to use to limit the search to only
certain languages
The `TypeResults` object contains all state required to do the lookup
and store the results:
- The max number of matches
- The set of SymbolFile objects that have already been searched
- The matching type list for any matches that are found
The benefits of this approach are:
- Simpler API, and only one API to implement in SymbolFile classes
- Replaces the FindTypesInNamespace that used a CompilerDeclContext as a
way to limit the search, but this only worked if the TypeSystem matched
the current symbol file's type system, so you couldn't use it to lookup
a type in another module
- Fixes a serious bug in our FindFirstType functions where if we were
searching for "foo::bar", and we found a "baz::bar" first, the basename
would match and we would only fetch 1 type using the basename, only to
drop it from the matching list and returning no results
This completes the conversion of LocateSymbolFile into a SymbolLocator
plugin. The only remaining function is DownloadSymbolFileAsync which
doesn't really fit into the plugin model, and therefore moves into the
SymbolLocator class, while still relying on the plugins to do the
underlying work.
This commit contains the initial scaffolding to convert the
functionality currently implemented in LocateSymbolFile to a plugin
architecture. The plugin approach allows us to easily add new ways to
find symbols and fixes some issues with the current implementation.
For instance, currently we (ab)use the host OS to include support for
querying the DebugSymbols framework on macOS. The plugin approach
retains all the benefits (including the ability to compile this out on
other platforms) while maintaining a higher level of separation with the
platform independent code.
To limit the scope of this patch, I've only converted a single function:
LocateExecutableObjectFile. Future commits will convert the remaining
LocateSymbolFile functions and eventually remove LocateSymbolFile. To
make reviewing easier, that will done as follow-ups.
ConstString can be implicitly converted into a llvm::StringRef. This is
very useful in many places, but it also hides places where we are
creating a ConstString only to use it as a StringRef for the entire
lifespan of the ConstString object.
I locally removed the implicit conversion and found some of the places we
were doing this.
Differential Revision: https://reviews.llvm.org/D159237
This patch fixes warnings like:
lldb/source/Core/ModuleList.cpp:1086:3: error: 'scoped_lock' may not
intend to support class template argument deduction
[-Werror,-Wctad-maybe-unsupported]
These methods all take a `Stream *` to get feedback about what's going
on. By default, it's a nullptr, but we always feed it with a valid
pointer. It would therefore make more sense to have this take a
reference.
Differential Revision: https://reviews.llvm.org/D154883
Use templates to simplify {Get,Set}PropertyAtIndex. It has always
bothered me how cumbersome those calls are when adding new properties.
After this patch, SetPropertyAtIndex infers the type from its arguments
and GetPropertyAtIndex required a single template argument for the
return value. As an added benefit, this enables us to remove a bunch of
wrappers from UserSettingsController and OptionValueProperties.
Differential revision: https://reviews.llvm.org/D149774
The majority of call sites are nullptr as the execution context.
Refactor OptionValueProperties to make the argument optional and
simplify all the callers.
Various OptionValue related classes are passing around will_modify but
the value is never used. This patch simplifies the interfaces by
removing the redundant argument.
Similar to fdbe7c7faa54, refactor OptionValueProperties to return a
std::optional instead of taking a fail value. This allows the caller to
handle situations where there's no value, instead of being unable to
distinguish between the absence of a value and the value happening the
match the fail value. When a fail value is required,
std::optional::value_or() provides the same functionality.
Currently all callsites already assume the pointer is non-null.
This patch just asserts this assumption.
This is practically enforced by `ModuleList::Append`
which won't add `nullptr`s to `m_modules`.
Differential Revision: https://reviews.llvm.org/D139082
On macOS, LLDB uses the DebugSymbols.framework to locate symbol rich
dSYM bundles. [1] The framework uses a variety of methods, one of them
calling into a binary or shell script to locate (and download) dSYMs.
Internally at Apple, that tool is called dsymForUUID and for simplicity
I'm just going to refer to it that way here too, even though it can be
be an arbitrary executable.
The most common use case for dsymForUUID is to fetch symbols from the
network. This can take a long time, and because the calls to the
DebugSymbols.framework are blocking, it takes a while to launch the
process. This is expected and therefore many people don't use this
functionality, but instead use add-dsym when they want symbols for a
given frame, backtrace or module. This is a little faster because you're
only fetching symbols for the module you care about, but it's still a
slow, blocking operation.
This patch introduces a hybrid approach between the two. When
symbols.enable-background-lookup is enabled, lldb will do the equivalent
of add-dsym in the background for every module that shows up in the
backtrace but doesn't have symbols for. From the user's perspective
there is no slowdown, because the process launches immediately, with
whatever symbols are available. Meanwhile, more symbol information is
added over time as the background fetching completes.
[1] https://lldb.llvm.org/use/symbols.html
rdar://76241471
Differential revision: https://reviews.llvm.org/D131328
This reverts commit 967df65a3610f98a3bc0ec0f2303641d7bad176c.
This fixes test/Shell/SymbolFile/NativePDB/find-functions.cpp. When
looking up functions with the PDB plugins, if we are looking for a
full function name, we should use `GetName` to populate the `name`
field instead of `GetLookupName` since `GetName` has the more
complete information.
This reverts commit befa77e59a7760d8c4fdd177b234e4a59500f61c.
Looks like this broke a SymbolFileNativePDB test. I'll investigate and
resubmit with a fix soon.
Context:
When setting a breakpoint by name, we invoke Module::FindFunctions to
find the function(s) in question. However, we use a Module::LookupInfo
to first process the user-provided name and figure out exactly what
we're looking for. When we actually perform the function lookup, we
search for the basename. After performing the search, we then filter out
the results using Module::LookupInfo::Prune. For example, given
a:🅱️:foo we would first search for all instances of foo and then filter
out the results to just names that have a:🅱️:foo in them. As one can
imagine, this involves a lot of debug info processing that we do not
necessarily need to be doing. Instead of doing one large post-processing
step after finding each instance of `foo`, we can filter them as we go
to save time.
Some numbers:
Debugging LLDB and placing a breakpoint on
llvm::itanium_demangle::StringView::begin without this change takes
approximately 70 seconds and resolves 31,920 DIEs. With this change,
placing the breakpoint takes around 30 seconds and resolves 8 DIEs.
Differential Revision: https://reviews.llvm.org/D129682
This diff introduces a new symbol on-demand which skips
loading a module's debug info unless explicitly asked on
demand. This provides significant performance improvement
for application with dynamic linking mode which has large
number of modules.
The feature can be turned on with:
"settings set symbols.load-on-demand true"
The feature works by creating a new SymbolFileOnDemand class for
each module which wraps the actual SymbolFIle subclass as member
variable. By default, most virtual methods on SymbolFileOnDemand are
skipped so that it looks like there is no debug info for that module.
But once the module's debug info is explicitly requested to
be enabled (in the conditions mentioned below) SymbolFileOnDemand
will allow all methods to pass through and forward to the actual SymbolFile
which would hydrate module's debug info on-demand.
In an internal benchmark, we are seeing more than 95% improvement
for a 3000 modules application.
Currently we are providing several ways to on demand hydrate
a module's debug info:
* Source line breakpoint: matching in supported files
* Stack trace: resolving symbol context for an address
* Symbolic breakpoint: symbol table match guided promotion
* Global variable: symbol table match guided promotion
In all above situations the module's debug info will be on-demand
parsed and indexed.
Some follow-ups for this feature:
* Add a command that allows users to load debug info explicitly while using a
new or existing command when this feature is enabled
* Add settings for "never load any of these executables in Symbols On Demand"
that takes a list of globs
* Add settings for "always load the the debug info for executables in Symbols
On Demand" that takes a list of globs
* Add a new column in "image list" that shows up by default when Symbols On
Demand is enable to show the status for each shlib like "not enabled for
this", "debug info off" and "debug info on" (with a single character to
short string, not the ones I just typed)
Differential Revision: https://reviews.llvm.org/D121631
Applied modernize-use-default-member-init clang-tidy check over LLDB.
It appears in many files we had already switched to in class member init but
never updated the constructors to reflect that. This check is already present in
the lldb/.clang-tidy config.
Differential Revision: https://reviews.llvm.org/D121481
Most of our code was including Log.h even though that is not where the
"lldb" log channel is defined (Log.h defines the generic logging
infrastructure). This worked because Log.h included Logging.h, even
though it should.
After the recent refactor, it became impossible the two files include
each other in this direction (the opposite inclusion is needed), so this
patch removes the workaround that was put in place and cleans up all
files to include the right thing. It also renames the file to LLDBLog to
better reflect its purpose.
This is an updated version of the https://reviews.llvm.org/D113789 patch with the following changes:
- We no longer modify modification times of the cache files
- Use LLVM caching and cache pruning instead of making a new cache mechanism (See DataFileCache.h/.cpp)
- Add signature to start of each file since we are not using modification times so we can tell when caches are stale and remove and re-create the cache file as files are changed
- Add settings to control the cache size, disk percentage and expiration in days to keep cache size under control
This patch enables symbol tables to be cached in the LLDB index cache directory. All cache files are in a single directory and the files use unique names to ensure that files from the same path will re-use the same file as files get modified. This means as files change, their cache files will be deleted and updated. The modification time of each of the cache files is not modified so that access based pruning of the cache can be implemented.
The symbol table cache files start with a signature that uniquely identifies a file on disk and contains one or more of the following items:
- object file UUID if available
- object file mod time if available
- object name for BSD archive .o files that are in .a files if available
If none of these signature items are available, then the file will not be cached. This keeps temporary object files from expressions from being cached.
When the cache files are loaded on subsequent debug sessions, the signature is compare and if the file has been modified (uuid changes, mod time changes, or object file mod time changes) then the cache file is deleted and re-created.
Module caching must be enabled by the user before this can be used:
symbols.enable-lldb-index-cache (boolean) = false
(lldb) settings set symbols.enable-lldb-index-cache true
There is also a setting that allows the user to specify a module cache directory that defaults to a directory that defaults to being next to the symbols.clang-modules-cache-path directory in a temp directory:
(lldb) settings show symbols.lldb-index-cache-path
/var/folders/9p/472sr0c55l9b20x2zg36b91h0000gn/C/lldb/IndexCache
If this setting is enabled, the finalized symbol tables will be serialized and saved to disc so they can be quickly loaded next time you debug.
Each module can cache one or more files in the index cache directory. The cache file names must be unique to a file on disk and its architecture and object name for .o files in BSD archives. This allows universal mach-o files to support caching multuple architectures in the same module cache directory. Making the file based on the this info allows this cache file to be deleted and replaced when the file gets updated on disk. This keeps the cache from growing over time during the compile/edit/debug cycle and prevents out of space issues.
If the cache is enabled, the symbol table will be loaded from the cache the next time you debug if the module has not changed.
The cache also has settings to control the size of the cache on disk. Each time LLDB starts up with the index cache enable, the cache will be pruned to ensure it stays within the user defined settings:
(lldb) settings set symbols.lldb-index-cache-expiration-days <days>
A value of zero will disable cache files from expiring when the cache is pruned. The default value is 7 currently.
(lldb) settings set symbols.lldb-index-cache-max-byte-size <size>
A value of zero will disable pruning based on a total byte size. The default value is zero currently.
(lldb) settings set symbols.lldb-index-cache-max-percent <percentage-of-disk-space>
A value of 100 will allow the disc to be filled to the max, a value of zero will disable percentage pruning. The default value is zero.
Reviewed By: labath, wallace
Differential Revision: https://reviews.llvm.org/D115324
The amount of roundtrips between StringRefs, ConstStrings and std::strings is
getting a bit out of hand, this patch avoid the unnecessary roundtrips.
Reviewed By: wallace, aprantl
Differential Revision: https://reviews.llvm.org/D112863
Rather than passing two booleans around, which is especially error prone
with them being next to each other, use a struct with named fields
instead.
Differential revision: https://reviews.llvm.org/D107295
This converts a default constructor's member initializers into C++11
default member initializers. This patch was automatically generated with
clang-tidy and the modernize-use-default-member-init check.
$ run-clang-tidy.py -header-filter='lldb' -checks='-*,modernize-use-default-member-init' -fix
This is a mass-refactoring patch and this commit will be added to
.git-blame-ignore-revs.
Differential revision: https://reviews.llvm.org/D103483
Replace uses of GetModuleAtIndexUnlocked and
GetModulePointerAtIndexUnlocked with the ModuleIterable and
ModuleIterableNoLocking where applicable.
Differential revision: https://reviews.llvm.org/D94271
LLDB is supposed to ask the Clang Driver what the default module cache path is
and then use that value as the default for the
`symbols.clang-modules-cache-path` setting. However, we use the property type
`String` to change `symbols.clang-modules-cache-path` even though the type of
that setting is `FileSpec`, so the setter will simply do nothing and return
`false`. We also don't check the return value of the setter, so this whole code
ends up not doing anything at all.
This changes the setter to use the correct property type and adds an assert that
we actually successfully set the default path. Also adds a test that checks that
the default value for this setting is never unset/empty path as this would
effectively disable the import-std-module feature from working by default.
Reviewed By: JDevlieghere, shafik
Differential Revision: https://reviews.llvm.org/D92772