17 Commits

Author SHA1 Message Date
Jason Molenda
481f248e08
[lldb] Get shared cache path from inferior, open (#180323)
Get the shared cache filepath and uuid that the inferior process is
using from debugserver, try to open that shared cache on the lldb host
mac and if the UUID matches, index all of the binaries in that shared
cache. When looking for binaries loaded in the process, get them from
the already-indexed shared cache.

Every time a binary is loaded, PlatformMacOSX may query the shared cache
filepath and uuid from the Process, and pass that to
HostInfo::GetSharedCacheImageInfo() if available (else fall back to the
old HostInfo::GetSharedCacheImageInfo method which only looks at lldb's
own shared cache), to get the file being requested.

ProcessGDBRemote caches the shared cache filepath and uuid from the
inferior, once it has a non-zero UUID. I added a lock for this ivar
specifically, so I don't have 20 threads all asking for the shared cache
information from debugserver and updating the cached answer. If we never
get back a non-zero UUID shared cache reply, we will re-query at every
library loaded notification. debugserver has been providing the shared
cache UUID since 2013, although I only added the shared cache filepath
field last November.

Note that a process will not report its shared cache filepath or uuid at
initial launch. As dyld gets a chance to execute a bit, it will start
returning binaries -- it will be available at the point when libraries
start loading. (it won't be available yet when the binary & dyld are the
only two binaries loaded in the process)

I tested this by disabling lldb's scan of its own shared cache
pre-execution -- only loading the system shared cache when the inferior
process reports that it is using that. I got 6-7 additional testsuite
failures running lldb like that, because no system binaries were loaded
before exeuction start, and the tests assumed they would be.

rdar://148939795

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
2026-02-09 16:22:25 -08:00
Jason Molenda
e3c72cf008
[lldb] Add a new way of loading files from a shared cache (#179881)
Taking advantage of a few new SPI in macOS 26.4 libdyld, it is possible
for lldb to load binaries out of a shared cache binary blob, instead of
needing discrete files on disk. lldb has had one special case where it
has done this for years -- if the debugee process and lldb itself are
using the same shared cache, it could create ObjectFiles based on its
own memory contents. This new method requires only the shared cache on
disk, not depending on it being mapped into lldb's address space
already.

In HostInfoMacOSX.mm, we create an array of binaries in lldb's shared
cache, by one of two methods depending on the availability of SPI/SDKs.
This PR adds a new third method for loading lldb's shared cache off disk
as a proof of concept. It will prefer this new method when the needed
SPI are available at runtime. There is also a user setting to disable
this new method in case we uncover a problem as it is deployed.

I did change the internal store of the shared cache files from a single
array, to being organized by shared cache UUIDs, so we can have multiple
shared caches indexed in the future.

In HostInfoBase.h's SharedCacheImageInfo class, you can now create an
ImageInfo with a DataExtractorSP or a void* baton. I added GetUUID and
GetExtractor methods, and the latter will use the libdyld SPI to map the
segments for a specific binary into lldb's memory and return a
DataExtractorSP.

The setting is currently called symbols.shared-cache-binary-loading.

In DynamicLoaderDarwin::FindTargetModuleForImageInfo there was an
ordering mistake where we would always consult the HostInfoMacOSX.mm
shared cache provider, instead of checking lldb's own global module
cache first when looking for a binary, resulting in creating a new
Module repeatedly for shared cache binaries with the new method, parsing
the symbol table repeatedly. I fixed the ordering so we look at existing
Modules before we check the shared cache for one.

In ObjectFileMachOTest, it tests a TEXT and a DATA symbol, checking that
the contents of the function/data object match the bytes we got from the
shared cache. The test was using a DATA_DIRTY symbol, which was fine
when using lldb's own shared cache memory, but when we worked on the
shared cache binary on-disk directly, we were seeing different values
for the bytes because of relocations in there. I changed this to a
constant DATA symbol.

rdar://148939795

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Alex Langford <nirvashtzero@gmail.com>
2026-02-05 18:38:20 -08:00
Jason Molenda
2aa020f49b
[lldb][NFC] Module, ModuleSpec, GetSectionData use DataExtractorSP (#178347)
In a PR last month I changed the ObjectFile CreateInstance etc methods
to accept an optional DataExtractorSP instead of a DataBufferSP, and
retain the extractor in a shared pointer internally in all of the
ObjectFile subclasses. This is laying the groundwork for using a
VirtualDataExtractor for some Mach-O binaries on macOS, where the
segments of the binary are out-of-order in actual memory, and we add a
lookup table to make it appear that the TEXT segment is at offset 0 in
the Extractor, etc. Working on the actual implementation, I realized we
were still using DataBufferSP's in ModuleSpec and Module, as well as in
ObjectFile::GetModuleSpecifications.

I originally was making a much larger NFC change where I had all
ObjectFile subclasses operating on DataExtractors throughout their
implementation, as well as in the DWARF parser. It was a very large
patchset. Many subclasses start with their DataExtractor, then create
smaller DataExtractors for parts of the binary image - the string table,
the symbol table, etc., for processing.

After consideration and discussion with Jonas, we agreed that a
segment/section of a binary will never require a lookup table to access
the bytes within it, so I changed
VirtualDataExtractor::GetSubsetExtractorSP to (1) require that the
Subset be contained within a single lookup table entry, and (2) return a
simple DataExtractor bounded on that byte range. By doing this, I was
able to remove all of my very-invasive changes to the ObjectFile
subclass internals; it's only when they are operating on the entire
binary image that care is needed.

One pattern that subclasses like ObjectFileBreakpad use is to take an
ArrayRef of the DataBuffer for a binary, then create a StringRef of
that, then look for strings in it. With a VirtualDataExtractor and
out-of-order binary segments, with gaps between them, this allows us to
search the entire buffer looking for a string, and segfault when it gets
to an unmapped region of the buffer. I added a
VirtualDataExtractor::GetSubsetExtractorSP(0) which gets the largest
contiguous memory region starting at offset 0 for this use case, and I
added a comment about what was being done there because I know it is not
obvious, and people not working on macOS wouldn't be familiar with the
requirement. (when we have a ModuleSpec with a DataExtractor, any of the
ObjectFile subclasses get a shot at Creating, so they all have to be
able to iterate on these)

rdar://148939795
2026-01-29 15:36:40 -08:00
GeorgeHuyubo
fce58897ce
[lldb] Enable locate module callback for all module loading (#160199)
Main executables were bypassing the locate module callback that shared 
libraries use, preventing custom symbol file location logic from working
consistently. 

This PR fix this by
*   Adding target context to ModuleSpec
* Leveraging that context to use target search path and platform's
locate module callback in ModuleList::GetSharedModule

This ensures both main executables and shared libraries get the same 
callback treatment for symbol file resolution.

---------

Co-authored-by: George Hu <hyubo@meta.com>
Co-authored-by: George Hu <georgehuyubo@gmail.com>
2025-11-06 12:48:21 -08:00
Adrian Prantl
0642cd768b
[lldb] Turn lldb_private::Status into a value type. (#106163)
This patch removes all of the Set.* methods from Status.

This cleanup is part of a series of patches that make it harder use the
anti-pattern of keeping a long-lives Status object around and updating
it while dropping any errors it contains on the floor.

This patch is largely NFC, the more interesting next steps this enables
is to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()

Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form

`    ResultTy DoFoo(Status& error)
`
to

`    llvm::Expected<ResultTy> DoFoo()
`
How to read this patch?

The interesting changes are in Status.h and Status.cpp, all other
changes are mostly

` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git
grep -l SetErrorString lldb/source)
`
plus the occasional manual cleanup.
2024-08-27 10:59:31 -07:00
Anthony Ha
95f208f97e
[lldb] Unify CalculateMD5 return types (#91029)
This is a retake of https://github.com/llvm/llvm-project/pull/90921
which got reverted because I forgot to modify the CalculateMD5 unit test
I had added in https://github.com/llvm/llvm-project/pull/88812

The prior failing build is here:
https://lab.llvm.org/buildbot/#/builders/68/builds/73622
To make sure this error doesn't happen, I ran `ninja
ProcessGdbRemoteTests` and then executed the resulting test binary and
observed the `CalculateMD5` test passed.

# Overview
In my previous PR: https://github.com/llvm/llvm-project/pull/88812,
@JDevlieghere suggested to match return types of the various calculate
md5 functions.

This PR achieves that by changing the various calculate md5 functions to
return `llvm::ErrorOr<llvm::MD5::MD5Result>`.
 
The suggestion was to go for `std::optional<>` but I opted for
`llvm::ErrorOr<>` because local calculate md5 was already possibly
returning `ErrorOr`.

To make sure I didn't break the md5 calculation functionality, I ran
some tests for the gdb remote client, and things seem to work.

# Testing
1. Remote file doesn't exist

![image](https://github.com/llvm/llvm-project/assets/1326275/b26859e2-18c3-4685-be8f-c6b6a5a4bc77)

1. Remote file differs

![image](https://github.com/llvm/llvm-project/assets/1326275/cbdb3c58-555a-401b-9444-c5ff4c04c491)

1. Remote file matches

![image](https://github.com/llvm/llvm-project/assets/1326275/07561572-22d1-4e0a-988f-bc91b5c2ffce)

## Test gaps
Unfortunately, I had to modify
`lldb/source/Plugins/Platform/MacOSX/PlatformDarwinDevice.cpp` and I
can't test the changes there. Hopefully, the existing test suite / code
review from whomever is reading this will catch any issues.
2024-05-09 15:57:46 -07:00
Jonas Devlieghere
ca8b064973
Revert "[lldb] Unify CalculateMD5 return types" (#90998)
Reverts llvm/llvm-project#90921
2024-05-03 12:14:45 -07:00
Anthony Ha
2f58b9aae2
[lldb] Unify CalculateMD5 return types (#90921)
# Overview
In my previous PR: https://github.com/llvm/llvm-project/pull/88812,
@JDevlieghere suggested to match return types of the various calculate
md5 functions.

This PR achieves that by changing the various calculate md5 functions to
return `llvm::ErrorOr<llvm::MD5::MD5Result>`.
 
The suggestion was to go for `std::optional<>` but I opted for
`llvm::ErrorOr<>` because local calculate md5 was already possibly
returning `ErrorOr`.

To make sure I didn't break the md5 calculation functionality, I ran
some tests for the gdb remote client, and things seem to work.

# Testing
1. Remote file doesn't exist

![image](https://github.com/llvm/llvm-project/assets/1326275/b26859e2-18c3-4685-be8f-c6b6a5a4bc77)

1. Remote file differs

![image](https://github.com/llvm/llvm-project/assets/1326275/cbdb3c58-555a-401b-9444-c5ff4c04c491)

1. Remote file matches

![image](https://github.com/llvm/llvm-project/assets/1326275/07561572-22d1-4e0a-988f-bc91b5c2ffce)

## Test gaps
Unfortunately, I had to modify
`lldb/source/Plugins/Platform/MacOSX/PlatformDarwinDevice.cpp` and I
can't test the changes there. Hopefully, the existing test suite / code
review from whomever is reading this will catch any issues.

Co-authored-by: Anthony Ha <antha@microsoft.com>
2024-05-03 11:51:25 -07:00
Alex Langford
f4be9ff645 [lldb][NFCI] Platforms should own their SDKBuild and SDKRootDirectory strings
These don't need to be ConstStrings. They don't really benefit much from
deduplication and comparing them isn't on a hot path, so they don't
really benefit much from quick comparisons.

Differential Revision: https://reviews.llvm.org/D152331
2023-06-14 09:59:58 -07:00
Kazu Hirata
2fe8327406 [lldb] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<.  I'll post
a separate patch to clean up the "using" declarations, #include
"llvm/ADT/Optional.h", etc.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-07 14:18:35 -08:00
Kazu Hirata
f190ce625a [lldb] Add #include <optional> (NFC)
This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.

I'll post a separate patch to actually replace llvm::Optional with
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-07 13:43:00 -08:00
Muhammad Omair Javaid
58e9cc13e2 Revert "[lldb] Remove redundant .c_str() and .get() calls"
This reverts commit fbaf48be0ff6fb24b9aa8fe9c2284fe88a8798dd.

This has broken all LLDB buildbots:
https://lab.llvm.org/buildbot/#/builders/68/builds/44990
https://lab.llvm.org/buildbot/#/builders/96/builds/33160
2022-12-19 13:52:10 +05:00
Fangrui Song
fbaf48be0f [lldb] Remove redundant .c_str() and .get() calls
Removing .c_str() has a semantics difference, but the use scenarios
likely do not matter as we don't have NUL in the strings.
2022-12-18 01:15:25 +00:00
Jonas Devlieghere
ecda408178
[lldb] Read from the Rosetta shared cache with Xcode 14
Xcode 14 no longer puts the Rosetta expanded shared cache in a directory
named "16.0". Instead, it includes the real version number (e.g. 13.0),
the build string and the architecture, similar to the device support
directory names for iOS, tvOS and watchOS.

Currently, when there are multiple directories, we might end up picking
the wrong one in GetSDKDirectoryForCurrentOSVersion. The problem is that
without the build string we have no way to differentiate between
multiple directories with the same version number. This patch fixes the
problem by using GetOSBuildString which, as the name implies, returns
the build string if known.

This also adds a test for Rosetta debugging on Apple Silicon. Depending
on whether the Rosetta expanded shared cache is present, the test
ensures that there is or isn't a diagnostic about reading out of memory.

rdar://97576121

Differential revision: https://reviews.llvm.org/D130540
2022-07-27 15:26:46 -07:00
Jonas Devlieghere
e53019a8ff
[lldb] Make GetSharedModuleWithLocalCache consider the device support directory
Make GetSharedModuleWithLocalCache consider the device support
directory. In the past we only needed the device support directory to
debug remote processes. Since the introduction of Apple Silicon and
Rosetta this stopped being true.

When debugging a Rosetta process on macOS we need to consider the
Rosetta expanded shared cache. This patch and it dependencies move that
logic out of PlatfromRemoteDarwinDevice into a new abstract class called
PlatfromDarwinDevice. The new platform sit in between PlatformDarwin and
PlatformMacOSX and PlatformRemoteDarwinDevice and has all the necessary
logic to deal with the device support directory.

Technically I could have moved everything in PlatfromDarwinDevice into
PlatfromDarwin but decided that this logic is sufficiently self
contained that it warrants its own abstraction.

rdar://91966349

Differential revision: https://reviews.llvm.org/D124801
2022-05-02 21:07:11 -07:00
Jonas Devlieghere
322b413041
[lldb] Move GetSharedModuleWithLocalCache to PlatformDarwinDevice (NFC)
Refactoring in preparation of D124801.

Differential revision: https://reviews.llvm.org/D124800
2022-05-02 17:34:40 -07:00
Jonas Devlieghere
41c0ff1e74
[lldb] Hoist device support out of PlatformRemoteDarwinDevice (NFC)
Refactoring in preparation of D124801.

Differential revision: https://reviews.llvm.org/D124799
2022-05-02 17:34:40 -07:00