Jason Molenda 2aa020f49b
[lldb][NFC] Module, ModuleSpec, GetSectionData use DataExtractorSP (#178347)
In a PR last month I changed the ObjectFile CreateInstance methods, and
similar entry points, to accept an optional DataExtractorSP instead of a
DataBufferSP, and to retain the extractor in a shared pointer internally
in all of the ObjectFile subclasses. This lays the groundwork for using
a VirtualDataExtractor for some Mach-O binaries on macOS, where the
segments of the binary are out of order in actual memory, and we add a
lookup table to make it appear that the TEXT segment is at offset 0 in
the extractor, etc. While working on the actual implementation, I
realized we were still using DataBufferSPs in ModuleSpec and Module, as
well as in ObjectFile::GetModuleSpecifications.
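
To illustrate the lookup-table idea described above, here is a minimal
sketch. It is not the VirtualDataExtractor implementation: LookupEntry,
OffsetTable, and ToPhysical are hypothetical names chosen for this
example, and the real class may store and search its table differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical table entry: `size` bytes that appear at `virtual_offset`
// in the extractor actually live at `physical_offset` in the buffer.
struct LookupEntry {
  uint64_t virtual_offset;
  uint64_t physical_offset;
  uint64_t size;
};

// Hypothetical helper that translates virtual offsets to physical ones.
class OffsetTable {
public:
  explicit OffsetTable(std::vector<LookupEntry> entries)
      : m_entries(std::move(entries)) {
    // Assumption for this sketch: entries are sorted by virtual offset
    // and do not overlap.
    for (std::size_t i = 1; i < m_entries.size(); ++i)
      assert(m_entries[i - 1].virtual_offset + m_entries[i - 1].size <=
             m_entries[i].virtual_offset);
  }

  // Returns the physical offset for `virt`, or std::nullopt when `virt`
  // falls outside every entry (an unmapped gap).
  std::optional<uint64_t> ToPhysical(uint64_t virt) const {
    for (const LookupEntry &e : m_entries)
      if (virt >= e.virtual_offset && virt - e.virtual_offset < e.size)
        return e.physical_offset + (virt - e.virtual_offset);
    return std::nullopt;
  }

private:
  std::vector<LookupEntry> m_entries;
};
```

With a table such as {0, 0x8000, 0x1000}, a segment stored at physical
offset 0x8000 appears to start at offset 0, which is the effect the
commit describes for the TEXT segment.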

I was originally making a much larger NFC change in which all ObjectFile
subclasses operated on DataExtractors throughout their implementation,
as did the DWARF parser. It was a very large patchset. Many subclasses
start with their DataExtractor, then create smaller DataExtractors for
parts of the binary image (the string table, the symbol table, etc.)
for processing.

After consideration and discussion with Jonas, we agreed that a
segment/section of a binary will never require a lookup table to access
the bytes within it, so I changed
VirtualDataExtractor::GetSubsetExtractorSP to (1) require that the
subset be contained within a single lookup table entry, and (2) return a
simple DataExtractor bounded to that byte range. By doing this, I was
able to remove all of my very invasive changes to the ObjectFile
subclass internals; care is needed only when they are operating on the
entire binary image.
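
The two rules for subsets can be sketched as follows. This is an
illustration under stated assumptions, not the actual
GetSubsetExtractorSP code: LookupEntry, ByteRange, and GetSubset are
hypothetical names, and a plain pointer-plus-size pair stands in for the
simple DataExtractor that the real method returns.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical table entry mapping a virtual range to a physical one.
struct LookupEntry {
  uint64_t virtual_offset;
  uint64_t physical_offset;
  uint64_t size;
};

// Hypothetical stand-in for a simple, non-virtual extractor.
struct ByteRange {
  const uint8_t *data;
  uint64_t size;
};

// Returns a plain byte range for [virt, virt + length) only when the
// whole range sits inside one table entry; otherwise std::nullopt.
std::optional<ByteRange> GetSubset(const uint8_t *buffer,
                                   const std::vector<LookupEntry> &table,
                                   uint64_t virt, uint64_t length) {
  assert(buffer != nullptr);
  for (const LookupEntry &e : table) {
    if (virt < e.virtual_offset || virt - e.virtual_offset >= e.size)
      continue; // `virt` is not inside this entry
    const uint64_t delta = virt - e.virtual_offset;
    if (length > e.size - delta)
      return std::nullopt; // rule (1): the subset would leave the entry
    // rule (2): no lookup table is needed past this point
    return ByteRange{buffer + e.physical_offset + delta, length};
  }
  return std::nullopt; // `virt` is in an unmapped gap
}
```

Because the returned range never crosses an entry boundary, code that
works on a single segment or section can read it without knowing that
the parent extractor was virtual.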

One pattern that subclasses like ObjectFileBreakpad use is to take an
ArrayRef of the DataBuffer for a binary, create a StringRef of that,
and then look for strings in it. With a VirtualDataExtractor and
out-of-order binary segments with gaps between them, this pattern would
search the entire buffer for a string and segfault when it reaches an
unmapped region of the buffer. I added a
VirtualDataExtractor::GetSubsetExtractorSP(0), which gets the largest
contiguous memory region starting at offset 0, for this use case, and I
added a comment about what is being done there because I know it is not
obvious, and people not working on macOS wouldn't be familiar with the
requirement. (When we have a ModuleSpec with a DataExtractor, any of the
ObjectFile subclasses gets a shot at creating an instance, so they all
have to be able to iterate over these.)
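
The string-search concern can be shown with a small sketch. It assumes
a hypothetical LookupEntry table and uses std::string_view in place of
llvm::StringRef so that it builds without LLVM; ContiguousPrefix and
PrefixContains are names invented for this example, not LLDB APIs. The
search is limited to the entry that starts at virtual offset 0, so it
never walks into a gap between segments.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical table entry mapping a virtual range to a physical one.
struct LookupEntry {
  uint64_t virtual_offset;
  uint64_t physical_offset;
  uint64_t size;
};

// Returns the bytes of the entry that begins at virtual offset 0, or an
// empty view if there is no such entry.
std::string_view ContiguousPrefix(std::string_view buffer,
                                  const std::vector<LookupEntry> &table) {
  for (const LookupEntry &e : table) {
    if (e.virtual_offset != 0 || e.size == 0)
      continue;
    // The entry must lie inside the buffer handed to this sketch.
    assert(e.physical_offset <= buffer.size() &&
           e.size <= buffer.size() - e.physical_offset);
    return buffer.substr(e.physical_offset, e.size);
  }
  return {};
}

// Searches only the contiguous region at offset 0 instead of the whole
// buffer, so the scan stops before any gap.
bool PrefixContains(std::string_view buffer,
                    const std::vector<LookupEntry> &table,
                    std::string_view needle) {
  return ContiguousPrefix(buffer, table).find(needle) !=
         std::string_view::npos;
}
```

In this sketch the whole buffer is ordinary memory; in the situation
the commit describes, the bytes between entries may be unmapped, which
is why an unbounded search is unsafe.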

rdar://148939795
2026-01-29 15:36:40 -08:00


//===-- ObjectContainerMachOFileset.h ---------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#ifndef LLDB_SOURCE_PLUGINS_OBJECTCONTAINER_MACH_O_FILESET_OBJECTCONTAINERMACHOFILESET_H
#define LLDB_SOURCE_PLUGINS_OBJECTCONTAINER_MACH_O_FILESET_OBJECTCONTAINERMACHOFILESET_H

#include "lldb/Host/SafeMachO.h"
#include "lldb/Symbol/ObjectContainer.h"
#include "lldb/Utility/FileSpec.h"

#include <string>
#include <vector>

namespace lldb_private {

class ObjectContainerMachOFileset : public lldb_private::ObjectContainer {
public:
  ObjectContainerMachOFileset(const lldb::ModuleSP &module_sp,
                              lldb::DataBufferSP &data_sp,
                              lldb::offset_t data_offset,
                              const lldb_private::FileSpec *file,
                              lldb::offset_t offset, lldb::offset_t length);

  ObjectContainerMachOFileset(const lldb::ModuleSP &module_sp,
                              lldb::WritableDataBufferSP data_sp,
                              const lldb::ProcessSP &process_sp,
                              lldb::addr_t header_addr);

  ~ObjectContainerMachOFileset() override;

  static void Initialize();
  static void Terminate();

  static llvm::StringRef GetPluginNameStatic() { return "mach-o-fileset"; }

  static llvm::StringRef GetPluginDescriptionStatic() {
    return "Mach-O Fileset container reader.";
  }

  static lldb_private::ObjectContainer *
  CreateInstance(const lldb::ModuleSP &module_sp, lldb::DataBufferSP &data_sp,
                 lldb::offset_t data_offset, const lldb_private::FileSpec *file,
                 lldb::offset_t offset, lldb::offset_t length);

  static lldb_private::ObjectContainer *CreateMemoryInstance(
      const lldb::ModuleSP &module_sp, lldb::WritableDataBufferSP data_sp,
      const lldb::ProcessSP &process_sp, lldb::addr_t header_addr);

  static size_t GetModuleSpecifications(const lldb_private::FileSpec &file,
                                        lldb::DataExtractorSP &extractor_sp,
                                        lldb::offset_t data_offset,
                                        lldb::offset_t file_offset,
                                        lldb::offset_t length,
                                        lldb_private::ModuleSpecList &specs);

  static bool MagicBytesMatch(const lldb_private::DataExtractor &data);
  static bool MagicBytesMatch(lldb::DataBufferSP data_sp,
                              lldb::addr_t data_offset,
                              lldb::addr_t data_length);

  bool ParseHeader() override;

  size_t GetNumObjects() const override { return m_entries.size(); }

  lldb::ObjectFileSP GetObjectFile(const lldb_private::FileSpec *file) override;

  llvm::StringRef GetPluginName() override { return GetPluginNameStatic(); }

  struct Entry {
    Entry(uint64_t vmaddr, uint64_t fileoff, std::string id)
        : vmaddr(vmaddr), fileoff(fileoff), id(id) {}
    uint64_t vmaddr;
    uint64_t fileoff;
    std::string id;
  };

  Entry *FindEntry(llvm::StringRef id);

private:
  static bool ParseHeader(lldb_private::DataExtractor &data,
                          const lldb_private::FileSpec &file,
                          lldb::offset_t file_offset,
                          std::vector<Entry> &entries);

  std::vector<Entry> m_entries;
  lldb::ProcessWP m_process_wp;
  const lldb::addr_t m_memory_addr;
};

} // namespace lldb_private

#endif // LLDB_SOURCE_PLUGINS_OBJECTCONTAINER_MACH_O_FILESET_OBJECTCONTAINERMACHOFILESET_H