649 Commits

Author SHA1 Message Date
Jason Molenda
481f248e08
[lldb] Get shared cache path from inferior, open (#180323)
Get the shared cache filepath and uuid that the inferior process is
using from debugserver, try to open that shared cache on the lldb host
mac and if the UUID matches, index all of the binaries in that shared
cache. When looking for binaries loaded in the process, get them from
the already-indexed shared cache.

Every time a binary is loaded, PlatformMacOSX may query the shared cache
filepath and uuid from the Process, and pass that to
HostInfo::GetSharedCacheImageInfo() if available (else fall back to the
old HostInfo::GetSharedCacheImageInfo method which only looks at lldb's
own shared cache), to get the file being requested.

ProcessGDBRemote caches the shared cache filepath and uuid from the
inferior, once it has a non-zero UUID. I added a lock for this ivar
specifically, so I don't have 20 threads all asking for the shared cache
information from debugserver and updating the cached answer. If we never
get back a non-zero UUID shared cache reply, we will re-query at every
library loaded notification. debugserver has been providing the shared
cache UUID since 2013, although I only added the shared cache filepath
field last November.
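
As a rough illustration of the caching described above (this is not the
actual ProcessGDBRemote code; `SharedCacheInfo`, `SharedCacheInfoCache`
and the `Query` callback are hypothetical names invented for the sketch),
the "cache only once we have a non-zero UUID, behind a dedicated lock"
behavior could look something like this:

```cpp
#include <array>
#include <cstdint>
#include <functional>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the (filepath, UUID) pair the inferior reports.
struct SharedCacheInfo {
  std::string path;
  std::array<uint8_t, 16> uuid{}; // all zeros means "not known yet"

  bool HasValidUUID() const {
    for (uint8_t byte : uuid)
      if (byte != 0)
        return true;
    return false;
  }
};

// Caches the first reply that carries a non-zero UUID. Until such a reply
// arrives, every call re-runs the query.
class SharedCacheInfoCache {
public:
  using Query = std::function<SharedCacheInfo()>;

  explicit SharedCacheInfoCache(Query query) : m_query(std::move(query)) {}

  std::optional<SharedCacheInfo> Get() {
    // One dedicated lock, so concurrent callers do not all issue the query
    // and race to update the cached answer.
    std::lock_guard<std::mutex> guard(m_mutex);
    if (m_cached)
      return m_cached;
    SharedCacheInfo info = m_query();
    if (info.HasValidUUID())
      m_cached = std::move(info);
    return m_cached;
  }

private:
  std::mutex m_mutex;
  Query m_query;
  std::optional<SharedCacheInfo> m_cached;
};
```

Holding the lock across the query is a deliberate simplification in this
sketch: it serializes callers, which is the point, at the cost of making
them wait on a slow reply.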

Note that a process will not report its shared cache filepath or uuid at
initial launch. As dyld gets a chance to execute a bit, it will start
returning binaries -- it will be available at the point when libraries
start loading. (it won't be available yet when the binary & dyld are the
only two binaries loaded in the process)

I tested this by disabling lldb's scan of its own shared cache
pre-execution -- only loading the system shared cache when the inferior
process reports that it is using that. I got 6-7 additional testsuite
failures running lldb like that, because no system binaries were loaded
before execution start, and the tests assumed they would be.

rdar://148939795

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
2026-02-09 16:22:25 -08:00
Jason Molenda
e3c72cf008
[lldb] Add a new way of loading files from a shared cache (#179881)
Taking advantage of a few new SPI in macOS 26.4 libdyld, it is possible
for lldb to load binaries out of a shared cache binary blob, instead of
needing discrete files on disk. lldb has had one special case where it
has done this for years -- if the debuggee process and lldb itself are
using the same shared cache, it could create ObjectFiles based on its
own memory contents. This new method requires only the shared cache on
disk, not depending on it being mapped into lldb's address space
already.

In HostInfoMacOSX.mm, we create an array of binaries in lldb's shared
cache, by one of two methods depending on the availability of SPI/SDKs.
This PR adds a new third method for loading lldb's shared cache off disk
as a proof of concept. It will prefer this new method when the needed
SPI are available at runtime. There is also a user setting to disable
this new method in case we uncover a problem as it is deployed.

I did change the internal store of the shared cache files from a single
array, to being organized by shared cache UUIDs, so we can have multiple
shared caches indexed in the future.

In HostInfoBase.h's SharedCacheImageInfo class, you can now create an
ImageInfo with a DataExtractorSP or a void* baton. I added GetUUID and
GetExtractor methods, and the latter will use the libdyld SPI to map the
segments for a specific binary into lldb's memory and return a
DataExtractorSP.

The setting is currently called symbols.shared-cache-binary-loading.

In DynamicLoaderDarwin::FindTargetModuleForImageInfo there was an
ordering mistake where we would always consult the HostInfoMacOSX.mm
shared cache provider, instead of checking lldb's own global module
cache first when looking for a binary, resulting in creating a new
Module repeatedly for shared cache binaries with the new method, parsing
the symbol table repeatedly. I fixed the ordering so we look at existing
Modules before we check the shared cache for one.
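
A minimal sketch of the corrected lookup order, assuming a simplified
model (the `ModuleFinder` class, its `Provider` callback and the
string-keyed map are hypothetical and are not lldb's real types): consult
the already-known modules first, and only fall back to the shared cache
provider on a miss.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Module {
  std::string uuid;
};
using ModuleSP = std::shared_ptr<Module>;

class ModuleFinder {
public:
  // Hypothetical stand-in for the shared cache provider; creating a module
  // through it is assumed to be expensive (symbol table parsing, etc).
  using Provider = std::function<ModuleSP(const std::string &uuid)>;

  explicit ModuleFinder(Provider shared_cache_provider)
      : m_shared_cache_provider(std::move(shared_cache_provider)) {}

  ModuleSP Find(const std::string &uuid) {
    // 1. Existing modules first, so a known binary is never re-created.
    auto it = m_known_modules.find(uuid);
    if (it != m_known_modules.end())
      return it->second;
    // 2. Only then ask the shared cache provider, and remember the result.
    ModuleSP module_sp = m_shared_cache_provider(uuid);
    if (module_sp)
      m_known_modules.emplace(uuid, module_sp);
    return module_sp;
  }

private:
  Provider m_shared_cache_provider;
  std::map<std::string, ModuleSP> m_known_modules;
};
```

With the two steps swapped, every `Find` call would invoke the provider
and build a fresh module, which is the repeated-parsing symptom described
above.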

In ObjectFileMachOTest, it tests a TEXT and a DATA symbol, checking that
the contents of the function/data object match the bytes we got from the
shared cache. The test was using a DATA_DIRTY symbol, which was fine
when using lldb's own shared cache memory, but when we worked on the
shared cache binary on-disk directly, we were seeing different values
for the bytes because of relocations in there. I changed this to a
constant DATA symbol.

rdar://148939795

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Alex Langford <nirvashtzero@gmail.com>
2026-02-05 18:38:20 -08:00
Alex Langford
9ebdeb2e8c
[lldb] Return Expected<ModuleSP> from Process::ReadModuleFromMemory (#179583)
I noticed that Module::GetMemoryObjectFile populates a Status object
upon error but it's effectively dropped on the floor. Instead, the
clients can report the error as desired.

At the moment, all clients either (1) consume the error because they are
only trying to find a module, or (2) log the error and bail out early. I
tried to preserve existing behavior as faithfully as possible.
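
To make the two client styles concrete, here is a small sketch. It does
not use `llvm::Expected`; a `std::variant` holding either a module or an
error string is a stand-in for it, and the free functions below
(`ReadModuleFromMemory`, `FindModuleQuietly`, `LoadModuleOrLog`) are
hypothetical helpers for illustration only.

```cpp
#include <memory>
#include <string>
#include <variant>

struct Module {
  std::string name;
};
using ModuleSP = std::shared_ptr<Module>;

// Stand-in for llvm::Expected<ModuleSP>: a module on success, otherwise an
// error message that the caller decides what to do with.
using ExpectedModule = std::variant<ModuleSP, std::string>;

// Hypothetical reader; `readable` simulates whether memory could be read.
ExpectedModule ReadModuleFromMemory(const std::string &name, bool readable) {
  if (!readable)
    return std::string("failed to read memory for module '" + name + "'");
  return std::make_shared<Module>(Module{name});
}

// Client style (1): only probing for a module, so the error is consumed.
ModuleSP FindModuleQuietly(const std::string &name, bool readable) {
  ExpectedModule result = ReadModuleFromMemory(name, readable);
  if (auto *module_sp = std::get_if<ModuleSP>(&result))
    return *module_sp;
  return nullptr;
}

// Client style (2): log the error and bail out early.
bool LoadModuleOrLog(const std::string &name, bool readable,
                     std::string &log) {
  ExpectedModule result = ReadModuleFromMemory(name, readable);
  if (auto *error = std::get_if<std::string>(&result)) {
    log += *error + "\n";
    return false;
  }
  return true;
}
```

Unlike this stand-in, the real `llvm::Expected` also insists that an
error be checked or consumed before it is destroyed.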
2026-02-05 11:49:17 -08:00
Jason Molenda
2aa020f49b
[lldb][NFC] Module, ModuleSpec, GetSectionData use DataExtractorSP (#178347)
In a PR last month I changed the ObjectFile CreateInstance etc methods
to accept an optional DataExtractorSP instead of a DataBufferSP, and
retain the extractor in a shared pointer internally in all of the
ObjectFile subclasses. This is laying the groundwork for using a
VirtualDataExtractor for some Mach-O binaries on macOS, where the
segments of the binary are out-of-order in actual memory, and we add a
lookup table to make it appear that the TEXT segment is at offset 0 in
the Extractor, etc. Working on the actual implementation, I realized we
were still using DataBufferSP's in ModuleSpec and Module, as well as in
ObjectFile::GetModuleSpecifications.

I originally was making a much larger NFC change where I had all
ObjectFile subclasses operating on DataExtractors throughout their
implementation, as well as in the DWARF parser. It was a very large
patchset. Many subclasses start with their DataExtractor, then create
smaller DataExtractors for parts of the binary image - the string table,
the symbol table, etc., for processing.

After consideration and discussion with Jonas, we agreed that a
segment/section of a binary will never require a lookup table to access
the bytes within it, so I changed
VirtualDataExtractor::GetSubsetExtractorSP to (1) require that the
Subset be contained within a single lookup table entry, and (2) return a
simple DataExtractor bounded on that byte range. By doing this, I was
able to remove all of my very-invasive changes to the ObjectFile
subclass internals; it's only when they are operating on the entire
binary image that care is needed.
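
The subset rule can be sketched as follows. This is a simplified model,
not the real VirtualDataExtractor: `LookupEntry`, `ByteRange` and
`VirtualBytes` are hypothetical names, and the table entries are assumed
to lie within the backing buffer.

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// One contiguous run of bytes: `size` bytes that appear at `virtual_offset`
// to the reader but live at `physical_offset` in the backing buffer.
struct LookupEntry {
  uint64_t virtual_offset;
  uint64_t physical_offset;
  uint64_t size;
};

// A plain bounded view of bytes; no lookup table is needed to read it.
struct ByteRange {
  const uint8_t *data;
  uint64_t size;
};

class VirtualBytes {
public:
  VirtualBytes(std::vector<uint8_t> buffer, std::vector<LookupEntry> table)
      : m_buffer(std::move(buffer)), m_table(std::move(table)) {}

  // Returns a simple view only when the requested range is fully contained
  // in a single lookup table entry; otherwise returns nothing.
  std::optional<ByteRange> GetSubset(uint64_t virtual_offset,
                                     uint64_t length) const {
    for (const LookupEntry &entry : m_table) {
      if (virtual_offset < entry.virtual_offset)
        continue;
      uint64_t delta = virtual_offset - entry.virtual_offset;
      if (delta >= entry.size)
        continue; // the start is not inside this entry
      if (length > entry.size - delta)
        return std::nullopt; // the range would run past the entry's end
      return ByteRange{m_buffer.data() + entry.physical_offset + delta,
                       length};
    }
    return std::nullopt; // the start is not mapped by any entry
  }

private:
  std::vector<uint8_t> m_buffer;
  std::vector<LookupEntry> m_table;
};
```

Because the returned view never spans two entries, code that works on a
single segment or section can treat it as ordinary contiguous bytes.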

One pattern that subclasses like ObjectFileBreakpad use is to take an
ArrayRef of the DataBuffer for a binary, then create a StringRef of
that, then look for strings in it. With a VirtualDataExtractor and
out-of-order binary segments, with gaps between them, this allows us to
search the entire buffer looking for a string, and segfault when it gets
to an unmapped region of the buffer. I added a
VirtualDataExtractor::GetSubsetExtractorSP(0) which gets the largest
contiguous memory region starting at offset 0 for this use case, and I
added a comment about what was being done there because I know it is not
obvious, and people not working on macOS wouldn't be familiar with the
requirement. (when we have a ModuleSpec with a DataExtractor, any of the
ObjectFile subclasses get a shot at Creating, so they all have to be
able to iterate on these)

rdar://148939795
2026-01-29 15:36:40 -08:00
Greg Clayton
5968e29dad
[lldb] Add the ability to load ELF core file executables and shared libraries from memory (#177289)
This patch enables ELF core files to be loaded and still show
executables and shared libraries. Functionality includes:
- Load executable and shared libraries from memory if ELF headers are
available
- Create placeholder for missing shared libraries and executable.
Previously you just wouldn't get anything in the "image list" if no
executable was provided.
2026-01-28 17:49:04 -08:00
Alex Langford
9ca02a13a4
[lldb][NFC] Mark Symbol pointers as const where easily possible (#177472)
These are the places that required no modifications to surrounding code.
2026-01-27 15:23:49 -08:00
Tom Yang
66d5f6a605
[lldb] fix parallel module loading deadlock for Linux DYLD (#166480)
Another attempt at resolving the deadlock issue @GeorgeHuyubo discovered
(his previous
[attempt](https://github.com/llvm/llvm-project/pull/160225)).

This change can be summarized as the following:
* Plumb through a boolean flag to force no preload in
`GetOrCreateModules` all the way through to `LoadModuleAtAddress`.
* Parallelize `Module::PreloadSymbols` separately from
`Target::GetOrCreateModule` and its caller `LoadModuleAtAddress` (this
is what avoids the deadlock).

These changes roughly maintain the performance characteristics of the
previous implementation of parallel module loading. Testing on targets
with between 5000 and 14000 modules, I saw similar numbers as before,
often more than 10% faster in the new implementation across multiple
trials for these massive targets. I think it's because we have less lock
contention with this approach.

# The deadlock

See [bt.txt](https://github.com/user-attachments/files/22524471/bt.txt)
for a sample backtrace of LLDB when the deadlock occurs.

As @GeorgeHuyubo explains in his
[PR](https://github.com/llvm/llvm-project/pull/160225), this is an ABBA
deadlock that happens when a thread context-switches out of
`Module::PreloadSymbols` and goes into `Target::GetOrCreateModule` for
another module, possibly entering this block:
```
      if (!module_sp) {
        // The platform is responsible for finding and caching an appropriate
        // module in the shared module cache.
        if (m_platform_sp) {
          error = m_platform_sp->GetSharedModule(
              module_spec, m_process_sp.get(), module_sp, &search_paths,
              &old_modules, &did_create_module);
        } else {
          error = Status::FromErrorString("no platform is currently set");
        }
      }
```
`Module::PreloadSymbols` holds a module-level mutex, and then
`GetSharedModule` *attempts* to hold the mutex of the global shared
`ModuleList`. So, this thread holds the module mutex, and waits on the
global shared `ModuleList` mutex.

A competing thread may execute `Target::GetOrCreateModule`, enter the
same block as above, grabbing the global shared `ModuleList` mutex.
Then, in `ModuleList::GetSharedModule`, we eventually call
`ModuleList::FindModules` which eventually waits for the `Module` mutex
held by the first thread (via `Module::GetUUID`). Thus, we deadlock.

## Reproducing the deadlock

It might be worth noting that I've never been able to observe this
deadlock issue during live debugging (e.g. launching or attaching to
processes), however we were able to consistently reproduce this issue
with coredumps when using the following settings:
```
(lldb) settings set target.parallel-module-load true
(lldb) settings set target.preload-symbols true
(lldb) settings set symbols.load-on-demand false
(lldb) target create --core /some/core/file/here
# deadlock happens
```

## How this change avoids this deadlock

This change avoids concurrent executions of `Module::PreloadSymbols`
with `Target::GetOrCreateModule` by waiting until after the
`Target::GetOrCreateModule` executions to run `Module::PreloadSymbols`
in parallel. This avoids the ordering of holding a Module lock *then*
the ModuleList lock, as `Target::GetOrCreateModule` executions maintain
the ordering of the shared ModuleList lock first (from what I've read
and tested).
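
The two-phase ordering can be sketched in isolation like this. It is a
toy model, not the patch itself: `Module`, `ModuleList` and `LoadModules`
are simplified stand-ins, and `std::async` is used in place of lldb's
thread pool.

```cpp
#include <future>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct Module {
  std::string name;
  std::mutex mutex;
  bool symbols_preloaded = false;

  explicit Module(std::string n) : name(std::move(n)) {}

  void PreloadSymbols() {
    // Takes only the module lock; never touches the list lock.
    std::lock_guard<std::mutex> guard(mutex);
    symbols_preloaded = true; // stand-in for building the name indexes
  }
};
using ModuleSP = std::shared_ptr<Module>;

class ModuleList {
public:
  ModuleSP GetOrCreate(const std::string &name) {
    // Takes the list lock first; no module lock is held by the caller.
    std::lock_guard<std::mutex> guard(m_mutex);
    for (const ModuleSP &module_sp : m_modules)
      if (module_sp->name == name)
        return module_sp;
    m_modules.push_back(std::make_shared<Module>(name));
    return m_modules.back();
  }

private:
  std::mutex m_mutex;
  std::vector<ModuleSP> m_modules;
};

std::vector<ModuleSP> LoadModules(ModuleList &list,
                                  const std::vector<std::string> &names) {
  // Phase 1: get or create every module, with preloading forced off.
  std::vector<std::future<ModuleSP>> created;
  for (const std::string &name : names)
    created.push_back(std::async(
        std::launch::async, [&list, name] { return list.GetOrCreate(name); }));
  std::vector<ModuleSP> modules;
  for (std::future<ModuleSP> &result : created)
    modules.push_back(result.get());

  // Phase 2: only after phase 1 has finished, preload symbols in parallel.
  std::vector<std::future<void>> preloads;
  for (const ModuleSP &module_sp : modules)
    preloads.push_back(std::async(
        std::launch::async, [module_sp] { module_sp->PreloadSymbols(); }));
  for (std::future<void> &done : preloads)
    done.get();
  return modules;
}
```

Since no thread is inside `PreloadSymbols` while another is inside
`GetOrCreate`, the "module lock, then list lock" order never occurs in
this model.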

## Why not read-write lock?

Some feedback in https://github.com/llvm/llvm-project/pull/160225 was to
modify mutexes used in these components with read-write locks. This
might be a good idea overall, but I don't think it would *easily*
resolve this specific deadlock. `Module::PreloadSymbols` would probably
need a write lock to Module, so even if we had a read lock in
`Module::GetUUID` we would still contend. Maybe the `ModuleList` lock
could be a read lock that converts to a write lock if it chooses to
update the module, but it seems likely that some thread would try to
update the shared module list and then the write lock would contend
again.

Perhaps with deeper architectural changes, we could fix this issue?

# Other attempts

One downside of this approach (and the former approach of parallel
module loading) is that each DYLD would need to implement this pattern
themselves. With @clayborg's help, I looked at a few other approaches:
* In `Target::GetOrCreateModule`, backgrounding the
`Module::PreloadSymbols` call by adding it directly to the thread pool
via `Debugger::GetThreadPool().async()`. This required adding a lock to
`Module::SetLoadAddress` (probably should be one there already) since
`ObjectFileELF::SetLoadAddress` is not thread-safe (updates sections).
Unfortunately, during execution, this causes the preload symbols to run
synchronously with `Target::GetOrCreateModule`, preventing us from truly
parallelizing the execution.
* In `Module::PreloadSymbols`, backgrounding the `symtab` and `sym_file`
`PreloadSymbols` calls individually, but this ran into similar issues as
the above.
* Passing a callback function like
https://github.com/swiftlang/llvm-project/pull/10746 instead of the
boolean I use in this change. It's functionally the same change IMO,
with some design tradeoffs:
  * Pro: the caller doesn't need to explicitly call
  `Module::PreloadSymbols` itself, and can instead call whatever function
  is passed into the callback.
  * Con: the caller needs to delay the execution of the callback such that
  it occurs after the `GetOrCreateModule` logic, otherwise we run into the
  same issue. I thought this would be trickier for the caller, requiring
  some kinda condition variable or otherwise storing the calls to execute
  afterwards.

# Test Plan:
```
ninja check-lldb
```

---------

Co-authored-by: Tom Yang <toyang@fb.com>
2025-11-14 15:58:43 -08:00
GeorgeHuyubo
fce58897ce
[lldb] Enable locate module callback for all module loading (#160199)
Main executables were bypassing the locate module callback that shared 
libraries use, preventing custom symbol file location logic from working
consistently. 

This PR fixes this by:
* Adding target context to ModuleSpec
* Leveraging that context to use target search path and platform's
locate module callback in ModuleList::GetSharedModule

This ensures both main executables and shared libraries get the same 
callback treatment for symbol file resolution.

---------

Co-authored-by: George Hu <hyubo@meta.com>
Co-authored-by: George Hu <georgehuyubo@gmail.com>
2025-11-06 12:48:21 -08:00
Andrew Savonichev
371d1a8e3e
[lldb] Use weak pointers instead of shared pointers in DynamicLoader (#156446)
DynamicLoaderWindowsDYLD uses pointers to Modules to maintain a map
from modules to their addresses, but it does not need to keep "strong"
references to them. Weak pointers should be enough, and would allow
modules to be released elsewhere.

Other DynamicLoader classes do not use shared pointers either. For
example, DynamicLoaderPOSIXDYLD has a similar map with weak pointers.
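
For illustration, a map keyed on weak pointers might look like the sketch
below. `LoadedModuleMap` and its methods are hypothetical, not the actual
DynamicLoader members; the standard-library parts (`std::weak_ptr`,
`std::owner_less`) are real.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <optional>
#include <string>

struct Module {
  std::string name;
};
using ModuleSP = std::shared_ptr<Module>;
using ModuleWP = std::weak_ptr<Module>;

class LoadedModuleMap {
public:
  void Add(const ModuleSP &module_sp, uint64_t load_address) {
    // Storing a weak pointer does not bump the strong reference count.
    m_addresses[ModuleWP(module_sp)] = load_address;
  }

  std::optional<uint64_t> Find(const ModuleSP &module_sp) const {
    auto it = m_addresses.find(ModuleWP(module_sp));
    if (it == m_addresses.end())
      return std::nullopt;
    return it->second;
  }

  // Drops entries whose module has been released elsewhere.
  std::size_t PruneExpired() {
    std::size_t removed = 0;
    for (auto it = m_addresses.begin(); it != m_addresses.end();) {
      if (it->first.expired()) {
        it = m_addresses.erase(it);
        ++removed;
      } else {
        ++it;
      }
    }
    return removed;
  }

private:
  // owner_less orders weak pointers by control block, which stays stable
  // even after the module itself is destroyed.
  std::map<ModuleWP, uint64_t, std::owner_less<ModuleWP>> m_addresses;
};
```

The map can still answer "where was this module loaded?" while the module
is alive, but it no longer keeps the module (and its mapped file) alive.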

Actually testing for modules being completely released can be tricky.
The test here is just to illustrate the case where shared pointers kept
modules in DynamicLoaderWindowsDYLD and prevented them from being
released. The test executes the following sequence:

  1. Create a target, load an executable and run it.

  2. Remove one module from the target. The target should be the last
     actual use of the module, but we have another reference to it in the
     shared module cache.

  3. Call MemoryPressureDetected to remove this last reference from the
     cache.

  4. Replace the corresponding DLL file.

LLDB memory maps DLLs, and this makes files read-only on Windows. Unless
the modules are completely released (and therefore unmapped), (4) is
going to fail with "access denied".

However, the test does not trigger the bug completely - it passes with
and without the change.
2025-09-04 20:36:14 +09:00
Ely Ronnen
4d3feaea66
[lldb-dap] persistent assembly breakpoints (#148061)
Resolves #141955

- Adds data to the breakpoint's `Source` object so that assembly
breakpoints, which rely on a temporary `sourceReference` value, can be
resolved in future sessions like normal path+line breakpoints
- Adds an optional `instructions_offset` parameter to `BreakpointResolver`
2025-08-08 22:29:47 +02:00
Alex Langford
a27d34b3f2
[lldb] Fix TLS support on Darwin platforms (#151601)
When I wrote this previously, I was unaware that the TLS function
already adds the offset. The test was working previously because the
offset was 0 in this case (only 1 thread-local variable). I added
another thread-local variable to the test to make sure the offset is
indeed handled correctly.

rdar://156547548
2025-08-04 11:44:17 -07:00
Pavel Labath
46e1e9f104
Reapply "[lldb/cmake] Plugin layering enforcement mechanism (#144543)" (#145305)
The only difference from the original PR is the added BRIEF and
FULL_DOCS arguments to define_property, which are required for
cmake<3.23.
2025-06-24 11:10:35 +02:00
Pavel Labath
18f667d804 Revert "[lldb/cmake] Plugin layering enforcement mechanism (#144543)"
Causes failures on several bots.

This reverts commits 714b2fdf3a385e5b9a95c435f56b1696ec3ec9e8 and
e7c1da7c8ef31c258619c1668062985e7ae83b70.
2025-06-23 12:07:10 +02:00
Pavel Labath
e7c1da7c8e
[lldb/cmake] Plugin layering enforcement mechanism (#144543)
Some inter-plugin dependencies are okay, others are not. Yet others are not,
but we're sort of stuck with them. The idea is to be able to prevent
backsliding while making sure that acceptable dependencies are..
accepted. For context, see
https://github.com/llvm/llvm-project/pull/139170 and the attached
changes to the documentation.
2025-06-23 11:31:26 +02:00
Pavel Labath
2c4f67794b
[lldb/cmake] Implicitly pass arguments to llvm_add_library (#142583)
If we're not touching them, we don't need to do anything special to pass
them along -- with one important caveat: due to how cmake arguments
work, the implicitly passed arguments need to be specified before
arguments that we handle.

This isn't particularly nice, but the alternative is enumerating all
arguments that can be used by llvm_add_library and the macros it calls
(it also relies on implicit passing of some arguments to
llvm_process_sources).
2025-06-04 11:33:37 +02:00
Akash Agrawal
e4ed71818e
[LLDB] [NFC] - Remove duplicate #include headers from the files of lldb dir & few other files (#141478)
A few files of lldb dir & few other files had duplicate headers
included. This patch removes those redundancies.

---------

Co-authored-by: Akash Agrawal <akashag@qti.qualcomm.com>
2025-05-29 23:13:30 -07:00
Pavel Labath
53a5bea0ad
[lldb] Call Target::ClearAllLoadedSections even earlier (#140228)
This reapplies https://github.com/llvm/llvm-project/pull/138892, which
was reverted in 5fb9dca14a due to failures on windows.

Windows loads modules from the Process class, and it does that quite
early, and it kinda makes sense which is why I'm moving the clearing
code even earlier.

The original commit message was:

Minidump files contain explicit information about load addresses of
modules, so it can load them itself. This works on other platforms, but
fails on darwin because DynamicLoaderDarwin nukes the loaded module list
on initialization (which happens after the core file plugin has done its
work).

This used to work until
https://github.com/llvm/llvm-project/pull/109477, which enabled the
dynamic loader
plugins for minidump files in order to get them to provide access to
TLS.

Clearing the load list makes sense, but I think we could do it earlier
in the process, so that both Process and DynamicLoader plugins get a
chance to load modules. This patch does that by calling the function
early in the launch/attach/load core flows.

This fixes TestDynamicValue.py:test_from_core_file on darwin.
2025-05-22 08:32:11 +02:00
Pavel Labath
5fb9dca14a Revert "[lldb] Call Target::ClearAllLoadedSections earlier (#138892)"
This reverts commit 97aa01bef770ec651c86978d137933e09221dd00 and
7e7871d3f58b9da72ca180fcd7f0d2da3f92ec4a due to failures on windows.
2025-05-14 18:22:02 +02:00
Pavel Labath
97aa01bef7
[lldb] Call Target::ClearAllLoadedSections earlier (#138892)
Minidump files contain explicit information about load addresses of
modules, so it can load them itself. This works on other platforms, but
fails on darwin because DynamicLoaderDarwin nukes the loaded module list
on initialization (which happens after the core file plugin has done its
work).

This used to work until #109477, which enabled the dynamic loader
plugins for minidump files in order to get them to provide access to
TLS.

Clearing the load list makes sense, but I think we could do it earlier
in the process, so that both Process and DynamicLoader plugins get a
chance to load modules. This patch does that by calling the function
early in the launch/attach/load core flows.

This fixes TestDynamicValue.py:test_from_core_file on darwin.
2025-05-14 11:16:55 +02:00
jimingham
952b680fd1
Support stepping through Darwin "branch islands" (#139301)
When an intra-module jump doesn't fit in the immediate branch slot, the
Darwin linker inserts "branch island" symbols, and emits code to jump
from branch island to branch island till it makes it to the actual
function.

The previous submissions failed because in that environment the linker
was putting the `foo.island` symbol at the same address as the `padding`
symbol we were emitting to make our faked-up large binary. This submission
jams a byte after the padding symbol so that the other symbols can't
overlap it.
2025-05-13 13:32:53 -07:00
jimingham
74120d0a38
Revert branch island experiments (#139192)
This test is failing because when we step to what is the branch island
address and ask for its symbol, we can't resolve the symbol, and just
call it the last padding symbol plus a bajillion.

That has nothing to do with the changes in this patch, but I'll revert
this and keep trying to figure out why symbol reading on this bot is
wrong.
2025-05-08 18:37:43 -07:00
jimingham
b6922b7170
Add more logging so I can figure out why TestBranchIslands.py is (#139178)
failing but only on the bot.
2025-05-08 17:03:21 -07:00
jimingham
6bb3019691
Branch island debug (#139166)
This patch allows lldb to step in across "branch islands" which is the
Darwin linker's way of dealing with immediate branches to targets that
are too far away for the immediate slot to make the jump.

I submitted this a couple days ago and it failed on the arm64 bot. I was
able to match the bot OS and Tool versions (they are a bit old at this
point) and ran the test there but sadly it succeeded. The x86_64 bot
also failed but that was my bad, I did @skipUnlessDarwin when I should
have done @skipUnlessAppleSilicon.

So this resubmission is with the proper decoration for the test, and
with a bunch of debug output printed in case of failure. With any luck,
if this resubmission fails again I'll be able to see what's going on.
2025-05-08 16:22:39 -07:00
Felipe de Azevedo Piovezan
a1238911f4 Revert "Branch island with numbers (#138781)"
This reverts commit 11f33ab3850886510a831122078a155be7dc1167.

This is failing on CI.
2025-05-06 18:20:25 -07:00
jimingham
11f33ab385
Branch island with numbers (#138781)
Reapply the support for stepping through branch islands, add support for
a branch that takes multiple hops to get to the target.
2025-05-06 16:58:01 -07:00
jimingham
1ff2953f5e
Revert "Handle step-in over a Darwin "branch island". (#138330)" (#138569)
This reverts commit 1ba89ad2c6e405bd5ac0c44e2ee5aa5504c7aba1.

This was failing on the Green Dragon bot, which has an older OS than I
have on hand, so I'll have to dig up one and see why it's failing there.
2025-05-05 12:45:17 -07:00
jimingham
1ba89ad2c6
Handle step-in over a Darwin "branch island". (#138330) 2025-05-05 09:55:32 -07:00
Tom Yang
65813e0e94
Control Darwin parallel image loading with target.parallel-module-load (#134437)
A requested follow-up from
https://github.com/llvm/llvm-project/pull/130912 by @JDevlieghere to
control Darwin parallel image loading with the same
`target.parallel-module-load` that controls the POSIX dyld parallel
image loading. Darwin parallel image loading was introduced by
https://github.com/llvm/llvm-project/pull/110646.

This small change:
* removes
`plugin.dynamic-loader.darwin.experimental.enable-parallel-image-load`
and associated code.
* changes setting call site in
`DynamicLoaderDarwin::PreloadModulesFromImageInfos` to use the new
setting.

Tested by running `ninja check-lldb` and loading some targets.

Co-authored-by: Tom Yang <toyang@fb.com>
2025-04-07 16:33:48 -07:00
Tom Yang
a8d2d169c7
Parallelize module loading in POSIX dyld code (#130912)
This patch improves LLDB launch time on Linux machines for **preload
scenarios**, particularly for executables with a lot of shared library
dependencies (or modules). Specifically:
* Launching a binary with `target.preload-symbols = true` 
* Attaching to a process with `target.preload-symbols = true`.
It's completely controlled by a new flag added in the first commit
`plugin.dynamic-loader.posix-dyld.parallel-module-load`, which *defaults
to false*. This was inspired by similar work on Darwin #110646.

Some rough numbers to showcase perf improvement, run on a very beefy
machine:
* Executable with ~5600 modules: baseline 45s, improvement 15s
* Executable with ~3800 modules: baseline 25s,  improvement 10s
* Executable with ~6650 modules: baseline 67s, improvement 20s
* Executable with ~12500 modules: baseline 185s, improvement 85s
* Executable with ~14700 modules: baseline 235s, improvement 120s
A lot of targets we deal with have a *ton* of modules, and unfortunately
we're unable to convince other folks to reduce the number of modules, so
performance improvements like this can be very impactful for user
experience.

This patch achieves the performance improvement by parallelizing
`DynamicLoaderPOSIXDYLD::RefreshModules` for the launch scenario, and
`DynamicLoaderPOSIXDYLD::LoadAllCurrentModules` for the attach scenario.
The commits have some context on their specific changes as well --
hopefully this helps the review.

# More context on implementation

We discovered the bottlenecks via `perf record -g -p <lldb's pid>` on
a Linux machine. With an executable known to have 1000s of shared
library dependencies, I ran
```
(lldb) b main
(lldb) r
# taking a while
```
and showed the resulting perf trace (snippet shown)
```
Samples: 85K of event 'cycles:P', Event count (approx.): 54615855812
  Children      Self  Command          Shared Object              Symbol
-   93.54%     0.00%  intern-state     libc.so.6                  [.] clone3
     clone3
     start_thread
     lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*)
     std::_Function_handler<void* (), lldb_private::Process::StartPrivateStateThread(bool)::$_0>::_M_invoke(std::_Any_data const&)
     lldb_private::Process::RunPrivateStateThread(bool)
   - lldb_private::Process::HandlePrivateEvent(std::shared_ptr<lldb_private::Event>&)
      - 93.54% lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event*)
         - 93.54% lldb_private::ThreadList::ShouldStop(lldb_private::Event*)
            - lldb_private::Thread::ShouldStop(lldb_private::Event*)
               - 93.53% lldb_private::StopInfoBreakpoint::ShouldStopSynchronous(lldb_private::Event*)
                  - 93.52% lldb_private::BreakpointSite::ShouldStop(lldb_private::StoppointCallbackContext*)
                       lldb_private::BreakpointLocationCollection::ShouldStop(lldb_private::StoppointCallbackContext*)
                       lldb_private::BreakpointLocation::ShouldStop(lldb_private::StoppointCallbackContext*)
                       lldb_private::BreakpointOptions::InvokeCallback(lldb_private::StoppointCallbackContext*, unsigned long, unsigned long)
                       DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long, unsigned lo
                     - DynamicLoaderPOSIXDYLD::RefreshModules()
                        - 93.42% DynamicLoaderPOSIXDYLD::RefreshModules()::$_0::operator()(DYLDRendezvous::SOEntry const&) const
                           - 93.40% DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, bools
                              - lldb_private::DynamicLoader::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, boos
                                 - 83.90% lldb_private::DynamicLoader::FindModuleViaTarget(lldb_private::FileSpec const&)
                                    - 83.01% lldb_private::Target::GetOrCreateModule(lldb_private::ModuleSpec const&, bool, lldb_private::Status*
                                       - 77.89% lldb_private::Module::PreloadSymbols()
                                          - 44.06% lldb_private::Symtab::PreloadSymbols()
                                             - 43.66% lldb_private::Symtab::InitNameIndexes()
...
```
We saw that the majority of the time was spent in `RefreshModules`, with
the main culprit within it being `LoadModuleAtAddress`, which eventually
calls `PreloadSymbols`.

At first, `DynamicLoaderPOSIXDYLD::LoadModuleAtAddress` appears fairly
independent -- most of it deals with different files and then getting or
creating Modules from these files. The portions that aren't independent
seem to deal with ModuleLists, which appear concurrency safe. There were
members of `DynamicLoaderPOSIXDYLD` I had to synchronize though: namely
`m_loaded_modules` which `DynamicLoaderPOSIXDYLD` maintains to map its
loaded modules to their link addresses. Without synchronizing this, I
ran into SEGFAULTS and other issues when running `check-lldb`. I also
locked the assignment and comparison of `m_interpreter_module`, which
may be unnecessary.

# Alternate implementations

When creating this patch, another implementation I considered was
directly background-ing the call to `Module::PreloadSymbols` in
`Target::GetOrCreateModule`. It would have the added benefit of working
across platforms generically, and appeared to be concurrency safe. It
was done via `Debugger::GetThreadPool().async` directly. However, there
were a ton of concurrency issues, so I abandoned that approach for now.

# Testing

With the feature active, I tested via `ninja check-lldb` on both Debug
and Release builds several times (~5 or 6 altogether?), and didn't spot
additional failing or flaky tests.

I also tested manually on several different binaries, some with around
14000 modules, but just basic operations: launching, reaching main,
setting breakpoint, stepping, showing some backtraces.

I've also tested with the flag off just to make sure things behave
properly synchronously.
2025-03-31 13:29:31 -07:00
Jason Molenda
09a36c8279 [lldb][NFC] Correct whitespace in SearchForKernelWithDebugHints 2025-03-13 10:08:09 -07:00
Jonas Devlieghere
78d82d3ae7
[lldb] Store StreamAsynchronousIO in a unique_ptr (NFC) (#127961)
Make StreamAsynchronousIO a unique_ptr instead of a shared_ptr. I tried
passing the class by value, but the llvm::raw_ostream forwarder stored
in the Stream parent class isn't movable and I don't think it's worth
changing that. Additionally, there are a few places that expect a
StreamSP, which are easily created from a StreamUP.
2025-02-20 11:13:46 -08:00
Jonas Devlieghere
65998ab2cb
[lldb] Make GetOutputStreamSP and GetErrorStreamSP protected (#127682)
This makes GetOutputStreamSP and GetErrorStreamSP protected members of
Debugger. Users who want to print to the debugger's stream should use
GetAsyncOutputStreamSP and GetAsyncErrorStreamSP instead and the few
remaining stragglers have been migrated.
2025-02-19 08:31:40 -08:00
Jonas Devlieghere
eff3c343b0
[lldb] Remove Debugger::Get{Output,Error}Stream (NFC) (#126821)
Remove Debugger::GetOutputStream and Debugger::GetErrorStream in
preparation for replacing both with a new variant that needs to be
locked and hence can't be handed out like we do right now.

The patch replaces most uses with GetAsyncOutputStream and
GetAsyncErrorStream respectively. These methods return new StreamSP
objects that automatically get flushed on destruction.

See #126630 for more details.
2025-02-12 08:29:06 -08:00
Jason Molenda
d90399603c
[lldb] [darwin] Upstream a few DriverKit cases (#126604)
A DriverKit process is a kernel extension that runs in userland instead
of in the kernel address space and privilege levels; they've been around
for a couple of years. From lldb's perspective a DriverKit process is no
different from any other userland process, but it has a different
Triple, so we need to handle those cases in the lldb codebase. Some of
the DriverKit triple handling had been upstreamed to llvm-project, but I
noticed a few cases that had not been yet. Cleaning that up.
2025-02-10 14:49:53 -08:00
Jason Molenda
fec6d168bb
[lldb] Upstream a few remaining Triple::XROS patches (#126335)
Recognize the visionOS Triple::OSType::XROS os type. Some of these have
already been landed on main, but I reviewed the downstream sources and
there were a few that still needed to be landed upstream.
2025-02-08 15:50:52 -08:00
Jonas Devlieghere
99099cd635
[lldb] Use Lambda to simplify repetitive code in DynamicLoaderDarwin (NFC) (#126175)
I suggested using a lambda in #126171 but @jasonmolenda missed it.
2025-02-06 20:39:30 -08:00
Jason Molenda
003a2bf954
[lldb][Darwin] Change DynamicLoaderDarwin to default to new SPI (#126171)
In Sep 2016 and newer Darwin releases, debugserver uses libdyld SPI to
gather information about the binaries loaded in a process. Before Sep
2016, lldb would inspect the dyld internal data structures directly
itself to find this information.

DynamicLoaderDarwin::UseDYLDSPI currently defaults to the old
inspect-dyld-internal-structures method for binaries
(DynamicLoaderMacOSXDYLD). If it detects that the Process' host OS
version is new enough, it enables the newer libdyld SPI methods in
debugserver (DynamicLoaderMacOS).

This patch changes the default to use the new libdyld SPI interfaces. If
the Process has a HostOS and it is one of the four specific OSes that
existed in 2015 (Mac OS X, iOS, tvOS, watchOS) with an old version
number, then we will enable the old DynamicLoader plugin.

If this debug session is a corefile, we will always use the old
DynamicLoader plugin -- the libdyld SPI cannot run against a corefile,
lldb must read metadata or the dyld internal data structures in the
corefile to find the loaded binaries.
2025-02-06 19:11:23 -08:00
Pavel Labath
feb5a77d70
[lldb] Add SymbolContext::GetFunctionOrSymbolAddress (#123340)
Many uses of SC::GetAddressRange were not interested in the range, but
in the address of the function/symbol contained inside the symbol
context. They were getting that by calling the GetBaseAddress on the
returned range, which worked well enough so far, but isn't compatible
with discontinuous functions, whose address (entry point) may not be the
lowest address in the range.

To resolve this problem, this PR creates a new function whose purpose is
to return the address of the function or symbol inside the symbol
context. It also changes all of the callers of GetAddressRange which do
not actually care about the range to call this function instead.
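
To make the problem concrete, here is a small sketch of a function split across two address ranges whose entry point is not the lowest address. The `Range` and `DiscontinuousFunction` types and both helpers are hypothetical simplifications for illustration, not LLDB's `SymbolContext` or `Function` API:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a function made of several address
// ranges; this is not LLDB's Function or SymbolContext class.
struct Range {
  std::uint64_t base;
  std::uint64_t size;
};

struct DiscontinuousFunction {
  std::uint64_t entry_point; // where execution of the function begins
  std::vector<Range> ranges; // every block of code belonging to the function
};

// Old pattern: take the lowest base address and treat it as "the" address.
std::uint64_t LowestBaseAddress(const DiscontinuousFunction &fn) {
  assert(!fn.ranges.empty() && "a function needs at least one range");
  auto lowest = std::min_element(
      fn.ranges.begin(), fn.ranges.end(),
      [](const Range &a, const Range &b) { return a.base < b.base; });
  return lowest->base;
}

// New pattern: ask for the function's address (its entry point) directly.
std::uint64_t FunctionAddress(const DiscontinuousFunction &fn) {
  return fn.entry_point;
}
```

For a function whose entry block sits at `0x2000` while another block sits at `0x1000`, the two helpers disagree, which is exactly the case the range-based lookup gets wrong.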
2025-02-06 09:12:44 +01:00
Greg Clayton
c4fb7180cb
[lldb][NFC] Make the target's SectionLoadList private. (#113278)
Lots of code around LLDB was directly accessing the target's section
load list. This NFC patch makes the section load list private so the
Target class can access it, but everyone else now uses accessor
functions. This allows us to control the resolving of addresses and will
allow for functionality in LLDB which can lazily resolve addresses in
JIT plug-ins with a future patch.
2025-01-14 20:12:46 -08:00
Pavel Labath
66a88f62cd
[lldb] Add Function::GetAddress and redirect some uses (#115836)
Many calls to Function::GetAddressRange() were not interested in the
range itself. Instead they wanted to find the address of the function
(its entry point) or the base address for relocation of function-scoped
entities (technically, the two don't need to be the same, but there
isn't a good reason for them not to be). This PR creates a separate
function for retrieving this, and changes the existing
(non-controversial) uses to call that instead.
2025-01-10 09:56:55 +01:00
Jonas Devlieghere
f109517d15
[lldb] Support overriding the disassembly CPU & features (#115382)
Add the ability to override the disassembly CPU and CPU features through
a target setting (`target.disassembly-cpu` and
`target.disassembly-features`) and a `disassemble` command option
(`--cpu` and `--features`).

This is especially relevant for architectures like RISC-V which rely
heavily on CPU extensions.

The majority of this patch is plumbing the options through. I recommend
looking at DisassemblerLLVMC and the test for the observable change in
behavior.
2024-11-11 16:27:15 -08:00
Kazu Hirata
5dbfb49490
[lldb] Avoid repeated hash lookups (NFC) (#113248) 2024-10-22 07:59:41 -07:00
Dmitrii Galimzianov
5f2cf99e14
DynamicLoaderDarwin load images in parallel with preload (#110646)
This change enables `DynamicLoaderDarwin` to load modules in parallel
using the thread pool. This new behavior is controlled by a new setting
`plugin.dynamic-loader.darwin.experimental.enable-parallel-image-load`,
which is enabled by default. When disabled, DynamicLoaderDarwin will
load modules sequentially as before.
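
A rough sketch of the parallel-versus-sequential switch, assuming one asynchronous task per image. It uses `std::async` as a stand-in for LLDB's thread pool, and `LoadImage` and `LoadImages` are hypothetical names rather than functions from `DynamicLoaderDarwin`:

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-image work; a real loader does far more.
std::size_t LoadImage(const std::string &path) { return path.size(); }

// Load every image either sequentially or with one asynchronous task per
// image. Results are returned in input order in both modes.
std::vector<std::size_t> LoadImages(const std::vector<std::string> &paths,
                                    bool parallel) {
  std::vector<std::size_t> results;
  results.reserve(paths.size());

  if (!parallel) {
    for (const std::string &path : paths)
      results.push_back(LoadImage(path));
  } else {
    // Start all tasks first, then collect them in the original order.
    std::vector<std::future<std::size_t>> pending;
    pending.reserve(paths.size());
    for (const std::string &path : paths)
      pending.push_back(
          std::async(std::launch::async, [&path] { return LoadImage(path); }));
    for (std::future<std::size_t> &task : pending)
      results.push_back(task.get());
  }

  assert(results.size() == paths.size());
  return results;
}
```

Collecting the futures in input order keeps the result identical to the sequential path, so the flag only changes how the work is scheduled.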
2024-10-15 13:25:01 -07:00
Jacob Lalonde
e9c8f75d45
[LLDB][Minidump] Have Minidumps save off and properly read TLS data (#109477)
This patch adds the support to `Process.cpp` to automatically save off
TLS sections, either via loading the memory region for the module, or
via reading `fs_base` via generic register. Then when Minidumps are
loaded, we now specify we want the dynamic loader to be the `POSIXDYLD`
so we can leverage the same TLS accessor code as `ProcessELFCore`. Being
able to access TLS Data is an important step for LLDB generated
minidumps to have feature parity with ELF Core dumps.
2024-10-10 15:59:51 -07:00
Jacob Lalonde
5d372ea6a1
[LLDB][DYLD] Remove logic around not rebasing when main executable has a load address (#110885)
This is a part of #109477 that I'm making into its own patch. Here we
remove logic from the DYLD that prevents its logic from running if the
main executable already has a load address. Instead we let the DYLD
fully determine what should be loaded and what shouldn't.
2024-10-07 09:45:56 -07:00
Jason Molenda
0f98497689 [lldb] [Mach-O corefiles] Sanity check malformed dyld
lldb scans the corefile for dyld, the dynamic loader, and when it
finds a mach-o header that looks like dyld, it tries to read all
of the load commands and symbol table out of the corefile memory.
If the load commands and symbol table are absent or malformed,
it doesn't handle this case and can crash.  Back out when we
fail to create a Module from the dyld binary.

rdar://136659551
2024-09-25 21:51:38 -07:00
Adrian Prantl
0642cd768b
[lldb] Turn lldb_private::Status into a value type. (#106163)
This patch removes all of the Set.* methods from Status.

This cleanup is part of a series of patches that make it harder to use
the anti-pattern of keeping a long-lived Status object around and
updating it while dropping any errors it contains on the floor.

This patch is largely NFC, the more interesting next steps this enables
are to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()

Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form

`ResultTy DoFoo(Status& error)`

to

`llvm::Expected<ResultTy> DoFoo()`

How to read this patch?

The interesting changes are in Status.h and Status.cpp, all other
changes are mostly

`perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git grep -l SetErrorString lldb/source)`

plus the occasional manual cleanup.
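
The shape of that API conversion can be sketched without LLVM headers. `StatusLike`, `ExpectedLike`, `ParsePort` and `ParsePortOld` below are hypothetical stand-ins that only mimic the shape of `Status` and `llvm::Expected`; they are not the real classes:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Minimal stand-ins so the sketch compiles without LLVM. They only mimic the
// shape of lldb_private::Status and llvm::Expected.
struct StatusLike {
  std::string message; // empty means success
  bool Fail() const { return !message.empty(); }
};

template <typename T> class ExpectedLike {
public:
  ExpectedLike(T value) : m_storage(std::move(value)) {}
  ExpectedLike(StatusLike error) : m_storage(std::move(error)) {}

  // True when a value is held, false when an error is held.
  explicit operator bool() const { return std::holds_alternative<T>(m_storage); }
  const T &get() const {
    assert(*this && "no value present");
    return std::get<T>(m_storage);
  }
  const StatusLike &error() const {
    assert(!*this && "no error present");
    return std::get<StatusLike>(m_storage);
  }

private:
  std::variant<T, StatusLike> m_storage;
};

// New shape: the result and the error travel together in the return value.
ExpectedLike<int> ParsePort(const std::string &text) {
  if (text.empty() || text.size() > 5)
    return StatusLike{"invalid port: '" + text + "'"};
  int value = 0;
  for (char c : text) {
    if (c < '0' || c > '9')
      return StatusLike{"invalid port: '" + text + "'"};
    value = value * 10 + (c - '0');
  }
  if (value > 65535)
    return StatusLike{"port out of range: " + text};
  return value;
}

// Old shape: result by return value, error through an out-parameter that the
// caller is free to ignore.
int ParsePortOld(const std::string &text, StatusLike &error) {
  ExpectedLike<int> port = ParsePort(text);
  if (!port) {
    error = port.error();
    return 0;
  }
  return port.get();
}
```

With the new shape a caller has to test the returned object before reading the value, whereas the old shape lets a stale or ignored `error` slip through.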
2024-08-27 10:59:31 -07:00
Dhruv Srivastava
b804516dc5
[lldb][AIX] 1. Avoid namespace collision on other platforms (#104679)
This PR is in reference to porting LLDB on AIX.

Link to discussions on llvm discourse and github:
1.  https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640
2.  #101657 

The complete changes for porting are present in this draft PR:
https://github.com/llvm/llvm-project/pull/102601

The changes on this PR are intended to avoid namespace collision for
certain typedefs between lldb and other platforms:
1. tid_t --> lldb::tid_t
2. offset_t --> lldb::offset_t
2024-08-20 10:19:32 +01:00
Med Ismail Bennani
bb8a74075b
[lldb] Change GetStartSymbol to GetStartAddress in DynamicLoader (#99909)
On Linux, the start address doesn't necessarily have a symbol attached
to it.

This is why this patch replaces `DynamicLoader::GetStartSymbol` with
`DynamicLoader::GetStartAddress` instead to make it more generic.

Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
2024-07-22 11:43:32 -07:00
Med Ismail Bennani
a96c906102
[lldb/Target] Add GetStartSymbol method to DynamicLoader plugins (#99673)
This patch introduces a new method to the dynamic loader plugin, to
fetch its `start` symbol.

This can be useful to resolve the `start` symbol address for instance.

Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
2024-07-19 11:39:56 -07:00