11 Commits

Author SHA1 Message Date
Jason Molenda
af3cc148c3
[lldb][NFC] Add some override methods to VirtualDataExtractor (#179858)
This changes VirtualDataExtractor's GetByteSize to return the virtual
byte size of the buffer (external users only understand the data
contents in terms of the virtual sizes & offsets). There are check
methods in DataExtractor that check they are not going off the end of a
buffer, they usually use the BytesLeft() method. There are a couple of
callers of BytesLeft() externally, but it is predominantly an internal
use API. I have BytesLeft() use the physical size of the buffer, not the
virtual size, for the benefit of the DataExtractor methods. (and to
avoid duplicating all of them down in VirtualDataExtractor)

Another problem is the we call SetData on DataExtractorSP's (e.g. see
the ObjectFile ctor) with the DataBuffer it already has, an offset of 0,
and the GetByteSize. A no-op for a DataExtractor that is already using
that DataBuffer. But SetData would try to use that length as a physical
size, and truncate the buffer that the DataExtractor would accept.

I added VirtualDataExtractor subclass methods for the SetData's, detect
(1) data being added to an uninitialized DataExtractor, (2) the same
data / offset / length as currently being used is added to the
DataExtractor (a no-op), or (3) we're genuinely changing the data source
or setting an offset / length that is different. This final case we're
not ready to handle today, I added asserts for them so we can catch it
in debug builds, and then I clear the LookupTable and add a no-op entry
so this extractor will behave like a plain DataExtractor -- because I
don't know better to do. If we genuinely need to handle this case, and
I'm pretty sure we don't need to, I'd have to assume that we're taking a
subset of the original data source (an offset & length), so we'd need to
update all of the LookupTable entries to reflect the new offsets, and
remove entries that are no longer referring to the subsetted range. I'll
leave that until there's any evidence it's actually needed.

rdar://148939795
2026-02-05 15:10:54 -08:00
Jason Molenda
2aa020f49b
[lldb][NFC] Module, ModuleSpec, GetSectionData use DataExtractorSP (#178347)
In a PR last month I changed the ObjectFile CreateInstance etc methods
to accept an optional DataExtractorSP instead of a DataBufferSP, and
retain the extractor in a shared pointer internally in all of the
ObjectFile subclasses. This is laying the groundwork for using a
VirtualDataExtractor for some Mach-O binaries on macOS, where the
segments of the binary are out-of-order in actual memory, and we add a
lookup table to make it appear that the TEXT segment is at offset 0 in
the Extractor, etc. Working on the actual implementation, I realized we
were still using DataBufferSP's in ModuleSpec and Module, as well as in
ObjectFile::GetModuleSpecifications.

I originally was making a much larger NFC change where I had all
ObjectFile subclasses operating on DataExtractors throughout their
implementation, as well as in the DWARF parser. It was a very large
patchset. Many subclasses start with their DataExtractor, then create
smaller DataExtractors for parts of the binary image - the string table,
the symbol table, etc., for processing.

After consideration and discussion with Jonas, we agreed that a
segment/section of a binary will never require a lookup table to access
the bytes within it, so I changed
VirtualDataExtractor::GetSubsetExtractorSP to (1) require that the
Subset be contained within a single lookup table entry, and (2) return a
simple DataExtractor bounded on that byte range. By doing this, I was
able to remove all of my very-invasive changes to the ObjectFile
subclass internals; it's only when they are operating on the entire
binary image that care is needed.

One pattern that subclasses like ObjectFileBreakpad use is to take an
ArrayRef of the DataBuffer for a binary, then create a StringRef of
that, then look for strings in it. With a VirtualDataExtractor and
out-of-order binary segments, with gaps between them, this allows us to
search the entire buffer looking for a string, and segfault when it gets
to an unmapped region of the buffer. I added a
VirtualDataExtractor::GetSubsetExtractorSP(0) which gets the largest
contiguous memory region starting at offset 0 for this use case, and I
added a comment about what was being done there because I know it is not
obvious, and people not working on macOS wouldn't be familiar with the
requirement. (when we have a ModuleSpec with a DataExtractor, any of the
ObjectFile subclasses get a shot at Creating, so they all have to be
able to iterate on these)

rdar://148939795
2026-01-29 15:36:40 -08:00
Jason Molenda
e4c83b7b11
[lldb][NFC] Change ObjectFile argument type (#171574)
The ObjectFile plugin interface accepts an optional DataBufferSP
argument. If the caller has the contents of the binary, it can provide
this in that DataBufferSP. The ObjectFile subclasses in their
CreateInstance methods will fill in the DataBufferSP with the actual
binary contents if it is not set.
ObjectFile base class creates an ivar DataExtractor from the
DataBufferSP passed in.

My next patch will be a caller that creates a VirtualDataExtractor with
the binary data, and needs to pass that in to the ObjectFile plugin,
instead of the bag-of-bytes DataBufferSP. It builds on the previous
patch changing ObjectFile's ivar from DataExtractor to DataExtractorSP
so I could pass in a subclass in the shared ptr. And it will be using
the VirtualDataExtractor that Jonas added in
https://github.com/llvm/llvm-project/pull/168802

No behavior is changed by the patch; we're simply moving the creation of
the DataExtractor to the caller, instead of a DataBuffer that is
immediately used to set up the ObjectFile DataExtractor. The patch is a
bit complicated because all of the ObjectFile subclasses have to
initialize their DataExtractor to pass in to the base class.

I ran the testsuite on macOS and on AArch64 Ubutnu. (btw David, I ran it
under qemu on my M4 mac with SME-no-SVE again, Ubuntu 25.10, checked
lshw(1) cpu capabilities, and qemu doesn't seem to be virtualizing the
SME, that explains why the testsuite passes)

rdar://148939795

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
2025-12-11 10:08:56 -08:00
Jonas Devlieghere
919caa89ee
[lldb] Don't crash on malformed filesets (#98388)
The memory read can fail for a malformed fileset. Handle it gracefully
instead of passing a nullptr to the std::string constructor.

rdar://131477833
2024-07-10 14:16:56 -07:00
Antonio Frighetto
be0e42c16b [lldb][Plugins] Reflect structure changes in ObjectContainer
Reflect recent changes made to MachO
`fileset_entry_command` struct.
2023-09-05 19:13:43 +02:00
Kazu Hirata
2fe8327406 [lldb] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<.  I'll post
a separate patch to clean up the "using" declarations, #include
"llvm/ADT/Optional.h", etc.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-07 14:18:35 -08:00
Kazu Hirata
f190ce625a [lldb] Add #include <optional> (NFC)
This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.

I'll post a separate patch to actually replace llvm::Optional with
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-07 13:43:00 -08:00
Kazu Hirata
343523d040 [lldb] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 16:51:25 -08:00
Jonas Devlieghere
0044cb4b20
[lldb] Fix two bugs in ObjectContainerMachOFileset
Fix two small issues in the live-memory variant of ObjectContainerMachOFileset.

Differential revision: https://reviews.llvm.org/D132973
2022-08-30 14:07:20 -07:00
Jonas Devlieghere
e360281fa7
[lldb] Computer the slide when and apply it to each fileset's vm addr
Computer the slide when and apply it to each entry's vm addr when
reading from memory.

Differential revision: https://reviews.llvm.org/D132710
2022-08-25 16:38:01 -07:00
Jonas Devlieghere
48506fbbbf
[lldb] Teach LLDB about Mach-O filesets
This patch teaches LLDB about Mach-O filesets. Filsets are Mach-O files
that contain a bunch of other Mach-O files. Unlike universal binaries,
which have a different header, Filesets use load commands to describe
the different entries it contains.

Differential revision: https://reviews.llvm.org/D132433
2022-08-25 15:24:51 -07:00