llvm-project

Author	SHA1	Message	Date
Pavel Labath	17491d9130	[lldb] Remove data_offset arg from GetModuleSpecifications (#188978 ) - it is always passed as zero - a lot of plugins aren't using it correctly - the data extractor class already has the capability to look at a subset of bytes	2026-03-30 08:39:52 +02:00
Pavel Labath	5ed2f6f46b	[lldb] Make ObjectFile::GetModuleSpecifications return module specifications (#188509 ) For consistency with #188276 (and better readability?).	2026-03-27 12:20:44 +01:00
Pavel Labath	08c94c0ac3	[lldb] Clear up GetModuleSpecifications return value confusion (#188276 ) Some plugins were returning the number of specifications they had added, while others were returning the total final number. Particularly devious plugins (Minidump) were clearing the specification list altogether. This resulted in nondeterministic failures (depending on plugin initialization order) in TestSBModule. This PR defines the problem away by having each plugin only return the specifications it is responsible for. If the caller wants to merge them, it is free to do so. This might be slightly less efficient, but this is hardly hot code. I'm not touching the ObjectFile::GetModuleSpecifications function (the caller of all these functions) as the PR is big enough, although the same approach might be warranted there as well. Fixes https://github.com/llvm/llvm-project/issues/178625.	2026-03-25 15:55:04 +01:00
Jason Molenda	4398b5fa8c	[lldb] Have ObjectFile::FindPlugin send a copy of the DE (#185727 ) ObjectFile::FindPlugin iterates over plugins to find one that can handle the binary provided. It is currently sending the one DataExtractorSP to each subclass, but some subclasses may modify this DataExtractor during their processing, e.g. calling DataExtractor::SetData on it, and I think it is safer to isolate these with a copy of the DataExtractor so the order the plugins are tried cannot possibly change behavior.	2026-03-10 12:27:24 -07:00
Jonas Devlieghere	81a537e708	[lldb] Use range-based for loops over plugins (#184837 ) This PR replaces the Get*CallbackAtIndex pattern in the PluginManager with returning a snapshot of callbacks that the caller can iterate over using a range-based for loop. This is a continuation of #184452 which added thread safety by using snapshots. However, that introduced a bunch of unnecessary copies which are largely eliminated again by getting the snapshot once when gathering all the callbacks, rather than doing that on each iteration when querying a plugin for a given index. It also eliminates the possibility of the snapshot changing underneath you when iterating over the plugins. This change was largely mechanical and I used Claude to do the menial work of updating the signatures and call sites.	2026-03-06 22:48:33 +00:00
Sergei Barannikov	b881949ee4	[lldb] Drop incomplete non-8-bit bytes support (#182025 ) This was originally introduced to support kalimba DSPs featuring 24-bit bytes by f03e6d84 and also c928de3e, but the kalimba support was mostly removed by f8819bd5. This change removes the rest of the support, which was far from complete.	2026-02-19 13:01:02 +03:00
Jason Molenda	2aa020f49b	[lldb][NFC] Module, ModuleSpec, GetSectionData use DataExtractorSP (#178347 ) In a PR last month I changed the ObjectFile CreateInstance etc methods to accept an optional DataExtractorSP instead of a DataBufferSP, and retain the extractor in a shared pointer internally in all of the ObjectFile subclasses. This is laying the groundwork for using a VirtualDataExtractor for some Mach-O binaries on macOS, where the segments of the binary are out-of-order in actual memory, and we add a lookup table to make it appear that the TEXT segment is at offset 0 in the Extractor, etc. Working on the actual implementation, I realized we were still using DataBufferSP's in ModuleSpec and Module, as well as in ObjectFile::GetModuleSpecifications. I originally was making a much larger NFC change where I had all ObjectFile subclasses operating on DataExtractors throughout their implementation, as well as in the DWARF parser. It was a very large patchset. Many subclasses start with their DataExtractor, then create smaller DataExtractors for parts of the binary image - the string table, the symbol table, etc., for processing. After consideration and discussion with Jonas, we agreed that a segment/section of a binary will never require a lookup table to access the bytes within it, so I changed VirtualDataExtractor::GetSubsetExtractorSP to (1) require that the Subset be contained within a single lookup table entry, and (2) return a simple DataExtractor bounded on that byte range. By doing this, I was able to remove all of my very-invasive changes to the ObjectFile subclass internals; it's only when they are operating on the entire binary image that care is needed. One pattern that subclasses like ObjectFileBreakpad use is to take an ArrayRef of the DataBuffer for a binary, then create a StringRef of that, then look for strings in it. 
With a VirtualDataExtractor and out-of-order binary segments, with gaps between them, this allows us to search the entire buffer looking for a string, and segfault when it gets to an unmapped region of the buffer. I added a VirtualDataExtractor::GetSubsetExtractorSP(0) which gets the largest contiguous memory region starting at offset 0 for this use case, and I added a comment about what was being done there because I know it is not obvious, and people not working on macOS wouldn't be familiar with the requirement. (when we have a ModuleSpec with a DataExtractor, any of the ObjectFile subclasses get a shot at Creating, so they all have to be able to iterate on these) rdar://148939795	2026-01-29 15:36:40 -08:00
Alex Langford	9ca02a13a4	[lldb][NFC] Mark Symbol pointers as const where easily possible (#177472 ) These are the places that required no modifications to surrounding code.	2026-01-27 15:23:49 -08:00
Jonas Devlieghere	d6652c189d	[lldb] Fix data buffer regression in ObjectFile (#177724 ) This fixes a regression in `ObjectFile` and `ObjectFileELF` introduced by #171574. The original code created a `DataBuffer` using `MapFileDataWritable`. ``` data_sp = MapFileDataWritable(file, length, file_offset); if (!data_sp) return nullptr; data_offset = 0; ``` The new code requires converting the `DataBuffer` to a `DataExtractor`: ``` DataBufferSP buffer_sp = MapFileDataWritable(file, length, file_offset); if (!buffer_sp) return nullptr; extractor_sp = std::make_shared<DataExtractor>(); extractor_sp->SetData(buffer_sp, data_offset, buffer_sp->GetByteSize()); data_offset = 0; ``` The issue is that once we get a data buffer back from MapFileDataWritable, we don't have to adjust for the `data_offset` again when calling `SetData` as the `DataBuffer` is already normalized to have a zero start offset. A similar issue exists in `ObjectFile`. rdar://168317174	2026-01-23 19:58:53 -08:00
Jonas Devlieghere	cb651a2eaa	[lldb] Avoid redundant calls to `std::shared_ptr::get` (NFC) (#177720 ) Avoid redundant calls to `std::shared_ptr::get()`. The class provides a dereference operator and using that is the standard, idiomatic way to access the underlying object.	2026-01-24 01:14:25 +00:00
Jonas Devlieghere	17e226f71e	[lldb] Fix crash when passing a folder in as the executable (#175181 ) This is another instance where we weren't checking the result of FileSystem::CreateDataBuffer and were unconditionally accessing it, similar to the bug in SourceManager last week. In this particular case, ObjectFile was assuming that we can read the contents non-zero, which isn't true for directory nodes. Jim figured this one out yesterday. I'm just putting up the patch and adding a test. rdar://167796036	2026-01-09 11:29:31 -06:00
Jason Molenda	e4c83b7b11	[lldb][NFC] Change ObjectFile argument type (#171574 ) The ObjectFile plugin interface accepts an optional DataBufferSP argument. If the caller has the contents of the binary, it can provide this in that DataBufferSP. The ObjectFile subclasses in their CreateInstance methods will fill in the DataBufferSP with the actual binary contents if it is not set. ObjectFile base class creates an ivar DataExtractor from the DataBufferSP passed in. My next patch will be a caller that creates a VirtualDataExtractor with the binary data, and needs to pass that in to the ObjectFile plugin, instead of the bag-of-bytes DataBufferSP. It builds on the previous patch changing ObjectFile's ivar from DataExtractor to DataExtractorSP so I could pass in a subclass in the shared ptr. And it will be using the VirtualDataExtractor that Jonas added in https://github.com/llvm/llvm-project/pull/168802 No behavior is changed by the patch; we're simply moving the creation of the DataExtractor to the caller, instead of a DataBuffer that is immediately used to set up the ObjectFile DataExtractor. The patch is a bit complicated because all of the ObjectFile subclasses have to initialize their DataExtractor to pass in to the base class. I ran the testsuite on macOS and on AArch64 Ubuntu. (btw David, I ran it under qemu on my M4 mac with SME-no-SVE again, Ubuntu 25.10, checked lshw(1) cpu capabilities, and qemu doesn't seem to be virtualizing the SME, that explains why the testsuite passes) rdar://148939795 --------- Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>	2025-12-11 10:08:56 -08:00
Jason Molenda	ae68377c69	[lldb][NFC] Change ObjectFile's DataExtractor to a shared ptr (#170066 ) ObjectFile has an m_data DataExtractor ivar which may be default constructed initially, or initialized with a DataBuffer passed in to its ctor. If the DataExtractor does not get a DataBuffer source passed in, the subclass will initialize it with access to the object file's data. When a DataBuffer is passed in to the base class ctor, the DataExtractor only has its buffer initialized; ObjectFile doesn't yet know the address size and endianness to fully initialize the DataExtractor. This patch changes ObjectFile to instead have a DataExtractorSP ivar which is always initialized with at least a default-constructed DataExtractor object in the base class ctor. The next patch I will be writing is to change the ObjectFile ctor to take an optional DataExtractorSP, so the caller can pass a DataExtractor subclass -- the VirtualDataExtractor being added via https://github.com/llvm/llvm-project/pull/168802 instead of a DataBuffer which is trivially saved into the DataExtractor. The change is otherwise mechanical; all `m_data.` changed to `m_data_sp->` and all the places where `m_data` was passed in for a by-ref call were changed to `*m_data_sp.get()`. The shared pointer is always initialized to contain an object. I built & ran the testsuite on macOS and on aarch64-Ubuntu (thanks for getting the Linux testsuite to run on SME-only systems David). All of the ObjectFile subclasses I modified compile cleanly, but I haven't tested them beyond any unit tests they may have (prob breakpad). rdar://148939795	2025-12-01 14:37:55 -08:00
Jakub Kuderski	4c21d0cb14	[ADT] Prepare to deprecate variadic `StringSwitch::Cases`. NFC. (#166020 ) Update all uses of variadic `.Cases` to use the initializer list overload instead. I plan to mark variadic `.Cases` as deprecated in a followup PR. For more context, see https://github.com/llvm/llvm-project/pull/163117.	2025-11-02 00:12:33 +00:00
Adrian Prantl	7c5b535d8c	[lldb] Replace IRExecutionUnit::GetSectionTypeFromSectionName with Ob… (#157192 ) …jectFile API This avoids code duplication.	2025-09-05 15:47:04 -07:00
Jonas Devlieghere	5be2063e10	[lldb] Support parsing the Wasm symbol table (#153093 ) This PR adds support for parsing the WebAssembly symbol table. The symbol table is encoded in the "names" section and contains names and indexes into other sections. For now we only support parsing function (code) symbols. The result is that you can set breakpoints by symbol name, while previously breakpoints by name required debug info (DWARF). This is also necessary for Swift, which checks for the presence of `swift_release` as a heuristic to determine if there's a static Swift stdlib.	2025-08-12 15:12:30 -05:00
nerix	c6670fa20d	[LLDB] Unify DWARF section name matching (#141344 ) Different object file formats support DWARF sections (COFF, ELF, MachO, PE/COFF, WASM). COFF and PE/COFF only matched a subset. This caused some GCC executables produced on MinGW to have issues later on when debugging. One example is that `.debug_rnglists` was not matched, which caused range-extraction to fail when printing a backtrace. This unifies the parsing of section names in `ObjectFile::GetDWARFSectionTypeFromName`, so all file formats can use the same naming convention. Since the prefixes are different, `GetDWARFSectionTypeFromName` only matches the suffixes (i.e. `.debug_` needs to be stripped before). I added two tests to ensure the sections are correctly identified on Windows executables.	2025-06-09 09:46:50 +01:00
royitaqi	967434aa32	[lldb] Remerge #136236 (Avoid force loading symbols in statistics collection (#136795 ) Fix a [test failure](https://github.com/llvm/llvm-project/pull/136236#issuecomment-2819772879) in #136236, apply a minor renaming of statistics, and remerge. See details below. # Changes in #136236 Currently, `DebuggerStats::ReportStatistics()` calls `Module::GetSymtab(/*can_create=*/false)`, but then the latter calls `SymbolFile::GetSymtab()`. This will load symbols if they haven't been loaded yet. See stacktrace below. The problem is that `DebuggerStats::ReportStatistics` should be read-only. This is especially important because it reports stats for symtab parsing/indexing time, which could be affected by the reporting itself if it's not read-only. This patch fixes this problem by adding an optional parameter `SymbolFile::GetSymtab(bool can_create = true)` and receiving the `false` value passed down from `Module::GetSymtab(/*can_create=*/false)` when the call is initiated from `DebuggerStats::ReportStatistics()`. --- Notes about the following stacktrace: 1. This can be reproduced. Create a helloworld program on macOS with dSYM, add `settings set target.preload-symbols false` to `~/.lldbinit`, do `lldb a.out`, then `statistics dump`. 2. `ObjectFile::GetSymtab` has `llvm::call_once`. So the fact that it called into `ObjectFileMachO::ParseSymtab` means that the symbol table is actually being parsed. 
``` (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = step over frame #0: 0x0000000124c4d5a0 LLDB`ObjectFileMachO::ParseSymtab(this=0x0000000111504e40, symtab=0x0000600000a05e00) at ObjectFileMachO.cpp:2259:44 * frame #1: 0x0000000124fc50a0 LLDB`lldb_private::ObjectFile::GetSymtab()::$_0::operator()(this=0x000000016d35c858) const at ObjectFile.cpp:761:9 frame #5: 0x0000000124fc4e68 LLDB`void std::__1::__call_once_proxy[abi:v160006]<std::__1::tuple<lldb_private::ObjectFile::GetSymtab()::$_0&&>>(__vp=0x000000016d35c7f0) at mutex:652:5 frame #6: 0x0000000198afb99c libc++.1.dylib`std::__1::__call_once(unsigned long volatile&, void, void ()(void*)) + 196 frame #7: 0x0000000124fc4dd0 LLDB`void std::__1::call_once[abi:v160006]<lldb_private::ObjectFile::GetSymtab()::$_0>(__flag=0x0000600003920080, __func=0x000000016d35c858) at mutex:670:9 frame #8: 0x0000000124fc3cb0 LLDB`void llvm::call_once<lldb_private::ObjectFile::GetSymtab()::$_0>(flag=0x0000600003920080, F=0x000000016d35c858) at Threading.h:88:5 frame #9: 0x0000000124fc2bc4 LLDB`lldb_private::ObjectFile::GetSymtab(this=0x0000000111504e40) at ObjectFile.cpp:755:5 frame #10: 0x0000000124fe0a28 LLDB`lldb_private::SymbolFileCommon::GetSymtab(this=0x0000000104865200) at SymbolFile.cpp:158:39 frame #11: 0x0000000124d8fedc LLDB`lldb_private::Module::GetSymtab(this=0x00000001113041a8, can_create=false) at Module.cpp:1027:21 frame #12: 0x0000000125125bdc LLDB`lldb_private::DebuggerStats::ReportStatistics(debugger=0x000000014284d400, target=0x0000000115808200, options=0x000000014195d6d1) at Statistics.cpp:329:30 frame #13: 0x0000000125672978 LLDB`CommandObjectStatsDump::DoExecute(this=0x000000014195d540, command=0x000000016d35d820, result=0x000000016d35e150) at CommandObjectStats.cpp:144:18 frame #14: 0x0000000124f29b40 LLDB`lldb_private::CommandObjectParsed::Execute(this=0x000000014195d540, args_string="", result=0x000000016d35e150) at CommandObject.cpp:832:9 frame #15: 0x0000000124efbd70 
LLDB`lldb_private::CommandInterpreter::HandleCommand(this=0x0000000141b22f30, command_line="statistics dump", lazy_add_to_history=eLazyBoolCalculate, result=0x000000016d35e150, force_repeat_command=false) at CommandInterpreter.cpp:2134:14 frame #16: 0x0000000124f007f4 LLDB`lldb_private::CommandInterpreter::IOHandlerInputComplete(this=0x0000000141b22f30, io_handler=0x00000001419b2aa8, line="statistics dump") at CommandInterpreter.cpp:3251:3 frame #17: 0x0000000124d7b5ec LLDB`lldb_private::IOHandlerEditline::Run(this=0x00000001419b2aa8) at IOHandler.cpp:588:22 frame #18: 0x0000000124d1e8fc LLDB`lldb_private::Debugger::RunIOHandlers(this=0x000000014284d400) at Debugger.cpp:1225:16 frame #19: 0x0000000124f01f74 LLDB`lldb_private::CommandInterpreter::RunCommandInterpreter(this=0x0000000141b22f30, options=0x000000016d35e63c) at CommandInterpreter.cpp:3543:16 frame #20: 0x0000000122840294 LLDB`lldb::SBDebugger::RunCommandInterpreter(this=0x000000016d35ebd8, auto_handle_events=true, spawn_thread=false) at SBDebugger.cpp:1212:42 frame #21: 0x0000000102aa6d28 lldb`Driver::MainLoop(this=0x000000016d35ebb8) at Driver.cpp:621:18 frame #22: 0x0000000102aa75b0 lldb`main(argc=1, argv=0x000000016d35f548) at Driver.cpp:829:26 frame #23: 0x0000000198858274 dyld`start + 2840 ``` # Changes in this PR top of the above Fix a [test failure](https://github.com/llvm/llvm-project/pull/136236#issuecomment-2819772879) in `TestStats.py`. The original version of the added test checks that all modules have symbol count zero when `target.preload-symbols == false`. The test failed on macOS. Due to various reasons, on macOS, symbols can be loaded for dylibs even with that setting, but not for the main module. For now, the fix of the test is to limit the assertion to only the main module. The test now passes on macOS. 
In the future, when we have a way to control a specific list of plug-ins to be loaded, there may be a configuration that this test can use to assert that all modules have symbol count zero. Apply a minor renaming of statistics, per the [suggestion](https://github.com/llvm/llvm-project/pull/136226#issuecomment-2825080275) in #136226 after merge.	2025-04-24 17:23:41 -07:00
Shubham Sandeep Rastogi	08b4c52540	Revert "[lldb] Avoid force loading symbols in statistics collection (#136236 )" This reverts commit d5b40c71f6be972f677de5d9886f91866df007b5. This change broke greendragon lldb test: lldb-api :: commands/statistics/basic/TestStats.py And is therefore being reverted.	2025-04-21 17:19:54 -07:00
royitaqi	d5b40c71f6	[lldb] Avoid force loading symbols in statistics collection (#136236 ) Currently, `DebuggerStats::ReportStatistics()` calls `Module::GetSymtab(/*can_create=*/false)`, but then the latter calls `SymbolFile::GetSymtab()`. This will load symbols if they haven't been loaded yet. See stacktrace below. The problem is that `DebuggerStats::ReportStatistics` should be read-only. This is especially important because it reports stats for symtab parsing/indexing time, which could be affected by the reporting itself if it's not read-only. This patch fixes this problem by adding an optional parameter `SymbolFile::GetSymtab(bool can_create = true)` and receiving the `false` value passed down from `Module::GetSymtab(/*can_create=*/false)` when the call is initiated from `DebuggerStats::ReportStatistics()`.	2025-04-21 16:53:14 -07:00
Jonas Devlieghere	5271dead61	[lldb] Add a {ObjectFile,SymbolFile}::GetObjectName method (#133370 ) Add ObjectFile::GetObjectName and SymbolFile::GetObjectName to retrieve the name of the object file, including the `.a` for static libraries. We currently do something similar in CommandObjectTarget, but the code for dumping this is a lot more involved than what's being offered by the new method. We have options to print the full path, the base name, and the directory of the path and trim it to a specific width. This is motivated by #133211, where Greg pointed out that the old code would print the static archive (the .a file) rather than the actual object file inside of it.	2025-04-04 16:33:40 -07:00
Greg Clayton	c4fb7180cb	[lldb][NFC] Make the target's SectionLoadList private. (#113278 ) Lots of code around LLDB was directly accessing the target's section load list. This NFC patch makes the section load list private so the Target class can access it, but everyone else now uses accessor functions. This allows us to control the resolving of addresses and will allow for functionality in LLDB which can lazily resolve addresses in JIT plug-ins with a future patch.	2025-01-14 20:12:46 -08:00
Adrian Prantl	87659a17d0	Reland: [lldb] Implement a formatter bytecode interpreter in C++ Compared to the python version, this also does type checking and error handling, so it's slightly longer, however, it's still comfortably under 500 lines. Relanding with more explicit type conversions.	2024-12-10 16:37:53 -08:00
Sylvestre Ledru	a2fb70523a	Revert "[lldb] Add cast to fix compile error on 32-bit platforms" This reverts commit f6012a209dca6b1866d00e6b4f96279469884320. Revert "[lldb] Add cast to fix compile error on 32-but platforms" This reverts commit d300337e93da4ed96b044557e4b0a30001967cf0. Revert "[lldb] Improve log message to include missing strings" This reverts commit 0be33484853557bc0fd9dfb94e0b6c15dda136ce. Revert "[lldb] Add comment" This reverts commit e2bb47443d2e5c022c7851dd6029e3869fc8835c. Revert "[lldb] Implement a formatter bytecode interpreter in C++" This reverts commit 9a9c1d4a6155a96ce9be494cec7e25731d36b33e.	2024-12-11 00:00:44 +01:00
Adrian Prantl	9a9c1d4a61	[lldb] Implement a formatter bytecode interpreter in C++ Compared to the python version, this also does type checking and error handling, so it's slightly longer, however, it's still comfortably under 500 lines.	2024-12-10 09:36:38 -08:00
Dave Lee	1a650fde4a	[lldb] Load embedded type summary section (#7859 ) (#8040 ) Add support for type summaries embedded into the binary. These embedded summaries will typically be generated by Swift macros, but can also be generated by any other means. rdar://115184658	2024-12-10 09:36:38 -08:00
Greg Clayton	f1e2886261	[LLDB] Impove ObjectFileELF's .dynamic parsing and usage. (#102570 ) This patch improves the ability of an ObjectFileELF instance to read the .dynamic section. It adds the ability to read the .dynamic section from the PT_DYNAMIC program header which is useful for ELF files that have no section headers and for ELF files that are read from memory. It cleans up the usage of the .dynamic entries so that ObjectFileELF::ParseDynamicSymbols() is the only code that parses .dynamic entries, and teaches that function to read and store the string values for each .dynamic entry. We now dump the .dynamic entries in the output of "image dump objfile". It also cleans up the code that gets the dynamic string table so that it can grab it from the DT_STRTAB and DT_STRSZ .dynamic entries for when we have an ELF file with no section headers or we are reading it from memory.	2024-08-12 10:57:04 -07:00
Leonard Chan	1d9e1c6644	Revert "[LLDB] Impove ObjectFileELF's .dynamic parsing and usage. (#101237 )" This reverts commit 28ba8a56b6fb9ec61897fa84369f46e43be94c03. Reverting since this broke the buildbot at https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/9352/.	2024-08-08 23:05:23 +00:00
Greg Clayton	28ba8a56b6	[LLDB] Impove ObjectFileELF's .dynamic parsing and usage. (#101237 ) This patch improves the ability of an ObjectFileELF instance to read the .dynamic section. It adds the ability to read the .dynamic section from the PT_DYNAMIC program header which is useful for ELF files that have no section headers and for ELF files that are read from memory. It cleans up the usage of the .dynamic entries so that ObjectFileELF::ParseDynamicSymbols() is the only code that parses .dynamic entries, and teaches that function to read and store the string values for each .dynamic entry. We now dump the .dynamic entries in the output of "image dump objfile". It also cleans up the code that gets the dynamic string table so that it can grab it from the DT_STRTAB and DT_STRSZ .dynamic entries for when we have an ELF file with no section headers or we are reading it from memory.	2024-08-08 11:04:52 -07:00
Jonas Devlieghere	ed7e46877d	[lldb] Improve error message for unrecognized executables (#97490 ) Currently, LLDB prints out a rather unhelpful error message when passed a file that it doesn't recognize as an executable. > error: '/path/to/file' doesn't contain any 'host' platform > architectures: arm64, armv7, armv7f, armv7k, armv7s, armv7m, armv7em, > armv6m, armv6, armv5, armv4, arm, thumbv7, thumbv7k, thumbv7s, > thumbv7f, thumbv7m, thumbv7em, thumbv6m, thumbv6, thumbv5, thumbv4t, > thumb, x86_64, x86_64, arm64, arm64e I did a quick search internally and found at least 24 instances of users being confused by this. This patch improves the error message when it doesn't recognize the file as an executable, but keeps the existing error message otherwise, i.e. when it's an object file we understand, but the current platform doesn't support.	2024-07-08 09:29:01 -07:00
Kazu Hirata	744f38913f	[lldb] Use StringRef::{starts,ends}_with (NFC) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-16 14:39:37 -08:00
Alex Langford	764287f1ad	[lldb] Add support for recognizing swift ast sections in object files In Apple's downstream fork, there is support for understanding the swift AST sections in various binaries. Even though the lldb on llvm.org does not have support for debugging swift, I think it makes sense to move support for recognizing swift ast sections upstream. Differential Revision: https://reviews.llvm.org/D159142	2023-08-31 15:16:12 -07:00
Hiroshi Yamauchi	e9040e875d	[lldb][PECOFF] Exclude alignment padding when reading section data There can be zero padding bytes at a section end for file alignment in PECOFF. Exclude those padding bytes when reading the section data. Differential Revision: https://reviews.llvm.org/D157059	2023-08-04 13:38:30 -07:00
Jonas Devlieghere	e5aa4cff43	[lldb] Support Compact C Type Format (CTF) section Teach LLDB about the ctf (Compact C Type Format) section. Differential revision: https://reviews.llvm.org/D154668	2023-07-13 11:30:35 -07:00
Jonas Devlieghere	0e5cdbf07e	[lldb] Make ObjectFileJSON loadable as a module This patch adds support for creating modules from JSON object files. This is necessary for the crashlog use case where we don't have either a module or a symbol file. In that case the ObjectFileJSON serves as both. The patch adds support for an object file type (i.e. executable, shared library, etc). It also adds the ability to specify sections, which is necessary in order to specify symbols by address. Finally, this patch improves error handling and fixes a bug where we wouldn't read more than the initial 512 bytes in GetModuleSpecifications. Differential revision: https://reviews.llvm.org/D148062	2023-04-13 14:08:19 -07:00
Jonas Devlieghere	f2ea125ea0	[lldb] Change CreateMemoryInstance to take a WritableDataBuffer Change the CreateMemoryInstance interface to take a WritableDataBuffer. Differential revision: https://reviews.llvm.org/D123073	2022-04-05 13:46:41 -07:00
Jonas Devlieghere	fc54427e76	[lldb] Refactor DataBuffer so we can map files as read-only Currently, all data buffers are assumed to be writable. This is a problem on macOS where it's not allowed to load unsigned binaries in memory as writable. To be more precise, MAP_RESILIENT_CODESIGN and MAP_RESILIENT_MEDIA need to be set for mapped (unsigned) binaries on our platform. Binaries are mapped through FileSystem::CreateDataBuffer which returns a DataBufferLLVM. The latter is backed by a llvm::WritableMemoryBuffer because every DataBuffer in LLDB is considered to be writable. In order to use a read-only llvm::MemoryBuffer I had to split our abstraction around it. This patch distinguishes between a DataBuffer (read-only) and WritableDataBuffer (read-write) and updates LLDB to use the appropriate one. rdar://74890607 Differential revision: https://reviews.llvm.org/D122856	2022-04-05 13:46:37 -07:00
Pavel Labath	c34698a811	[lldb] Rename Logging.h to LLDBLog.h and clean up includes Most of our code was including Log.h even though that is not where the "lldb" log channel is defined (Log.h defines the generic logging infrastructure). This worked because Log.h included Logging.h, even though it shouldn't. After the recent refactor, it became impossible for the two files to include each other in this direction (the opposite inclusion is needed), so this patch removes the workaround that was put in place and cleans up all files to include the right thing. It also renames the file to LLDBLog to better reflect its purpose.	2022-02-03 14:47:01 +01:00
Pavel Labath	a007a6d844	[lldb] Convert "LLDB" log channel to the new API	2022-02-02 14:13:08 +01:00
Greg Clayton	da816ca0cb	Added the ability to cache the finalized symbol tables subsequent debug sessions to start faster. This is an updated version of the https://reviews.llvm.org/D113789 patch with the following changes: - We no longer modify modification times of the cache files - Use LLVM caching and cache pruning instead of making a new cache mechanism (See DataFileCache.h/.cpp) - Add signature to start of each file since we are not using modification times so we can tell when caches are stale and remove and re-create the cache file as files are changed - Add settings to control the cache size, disk percentage and expiration in days to keep cache size under control This patch enables symbol tables to be cached in the LLDB index cache directory. All cache files are in a single directory and the files use unique names to ensure that files from the same path will re-use the same file as files get modified. This means as files change, their cache files will be deleted and updated. The modification time of each of the cache files is not modified so that access based pruning of the cache can be implemented. The symbol table cache files start with a signature that uniquely identifies a file on disk and contains one or more of the following items: - object file UUID if available - object file mod time if available - object name for BSD archive .o files that are in .a files if available If none of these signature items are available, then the file will not be cached. This keeps temporary object files from expressions from being cached. When the cache files are loaded on subsequent debug sessions, the signature is compared and if the file has been modified (uuid changes, mod time changes, or object file mod time changes) then the cache file is deleted and re-created. 
Module caching must be enabled by the user before this can be used: symbols.enable-lldb-index-cache (boolean) = false (lldb) settings set symbols.enable-lldb-index-cache true There is also a setting that allows the user to specify a module cache directory, which defaults to a directory next to the symbols.clang-modules-cache-path directory in a temp directory: (lldb) settings show symbols.lldb-index-cache-path /var/folders/9p/472sr0c55l9b20x2zg36b91h0000gn/C/lldb/IndexCache If this setting is enabled, the finalized symbol tables will be serialized and saved to disk so they can be quickly loaded next time you debug. Each module can cache one or more files in the index cache directory. The cache file names must be unique to a file on disk and its architecture and object name for .o files in BSD archives. This allows universal mach-o files to support caching multiple architectures in the same module cache directory. Basing the file name on this info allows this cache file to be deleted and replaced when the file gets updated on disk. This keeps the cache from growing over time during the compile/edit/debug cycle and prevents out of space issues. If the cache is enabled, the symbol table will be loaded from the cache the next time you debug if the module has not changed. The cache also has settings to control the size of the cache on disk. Each time LLDB starts up with the index cache enabled, the cache will be pruned to ensure it stays within the user defined settings: (lldb) settings set symbols.lldb-index-cache-expiration-days <days> A value of zero will disable cache files from expiring when the cache is pruned. The default value is 7 currently. (lldb) settings set symbols.lldb-index-cache-max-byte-size <size> A value of zero will disable pruning based on a total byte size. The default value is zero currently. 
(lldb) settings set symbols.lldb-index-cache-max-percent <percentage-of-disk-space> A value of 100 will allow the disc to be filled to the max, a value of zero will disable percentage pruning. The default value is zero. Reviewed By: labath, wallace Differential Revision: https://reviews.llvm.org/D115324	2021-12-16 09:59:55 -08:00
Greg Clayton	7e6df41f65	[NFC] Refactor symbol table parsing. Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once. This is a retry of the original patch https://reviews.llvm.org/D113965 which was reverted. There was a deadlock in the Manual DWARF indexing code during symbol preloading where the module was asked on the main thread to preload its symbols, and this would in turn cause the DWARF manual indexing to use a thread pool to index all of the compile units, and if there were relocations on the debug information sections, these threads could ask the ObjectFile to load section contents, which could cause a call to ObjectFileELF::RelocateSection() which would ask for the symbol table from the module and it would deadlock. We can't lock the module in ObjectFile::GetSymtab(), so the solution I am using is to use an llvm::once_flag to create the symbol table object once and then lock the Symtab object. Since all APIs on the symbol table use this lock, this will prevent anyone from using the symbol table before it is parsed and finalized and will avoid the deadlock I mentioned. ObjectFileELF::GetSymtab() was never locking the module lock before and would put off creating the symbol table until somewhere inside ObjectFileELF::GetSymtab(). Now we create it one time inside of the ObjectFile::GetSymtab() and immediately lock it which should be safe enough. 
This avoids the deadlocks and still provides safety. Differential Revision: https://reviews.llvm.org/D114288	2021-11-30 13:54:32 -08:00
Greg Clayton	a68ccda203	Revert "[NFC] Refactor symbol table parsing." This reverts commit 951b107eedab1829f18049443f03339dbb0db165. Buildbots were failing, there is a deadlock in /Users/gclayton/Documents/src/llvm/clean/llvm-project/lldb/test/Shell/SymbolFile/DWARF/DW_AT_range-DW_FORM_sec_offset.s when ELF files try to relocate things.	2021-11-17 18:07:28 -08:00
Greg Clayton	951b107eed	[NFC] Refactor symbol table parsing. Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once. Differential Revision: https://reviews.llvm.org/D113965	2021-11-17 15:14:01 -08:00
Greg Clayton	ec1a491701	Create synthetic symbol names on demand to improve memory consumption and startup times. This is a resubmission of https://reviews.llvm.org/D105160 after fixing testing issues. This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool. Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index. This patch fixes the issue by: - not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed - doesn't add synthetic symbol names to the name indexes, but catches this special case at name lookup time. Users won't typically set breakpoints or look up these synthetic names, but support was added to do the lookup in case it does happen - removes the object file basename from the generated names to allow the names to be shared in the constant string pool Prior to this fix the startup times for a large application were: 12.5 seconds (cold file caches) 8.5 seconds (warm file caches) After this fix: 9.7 seconds (cold file caches) 5.7 seconds (warm file caches) The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name. Differential Revision: https://reviews.llvm.org/D106837	2021-07-27 16:51:12 -07:00
Jonas Devlieghere	6b0d266036	Revert "Create synthetic symbol names on demand to improve memory consumption and startup times." This reverts commit c8164d0276b97679e80db01adc860271ab4a5d11 and 43f6dad2344247976d5777f56a1fc29e39c6c717 because it breaks TestDyldTrieSymbols.py on GreenDragon.	2021-07-02 16:21:47 -07:00
Greg Clayton	c8164d0276	Create synthetic symbol names on demand to improve memory consumption and startup times. This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool. Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index. This patch fixes the issue by: - not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed - doesn't add synthetic symbol names to the name indexes, but catches this special case at name lookup time. Users won't typically set breakpoints or look up these synthetic names, but support was added to do the lookup in case it does happen - removes the object file basename from the generated names to allow the names to be shared in the constant string pool Prior to this fix the startup times for a large application were: 12.5 seconds (cold file caches) 8.5 seconds (warm file caches) After this fix: 9.7 seconds (cold file caches) 5.7 seconds (warm file caches) The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name. Differential Revision: https://reviews.llvm.org/D105160	2021-06-29 17:44:33 -07:00
Stella Stamenova	bb2cfca2f3	Revert D104488 and friends since it broke the windows bot Reverts commits: "Fix failing tests after https://reviews.llvm.org/D104488." "Fix buildbot failure after https://reviews.llvm.org/D104488." "Create synthetic symbol names on demand to improve memory consumption and startup times." This series of commits broke the windows lldb bot and then failed to fix all of the failing tests.	2021-06-29 12:58:55 -07:00
Greg Clayton	d77ccfdc72	Create synthetic symbol names on demand to improve memory consumption and startup times. This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool. Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index. This patch fixes the issue by: - not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed - doesn't add synthetic symbol names to the name indexes, but catches this special case at name lookup time. Users won't typically set breakpoints or look up these synthetic names, but support was added to do the lookup in case it does happen - removes the object file basename from the generated names to allow the names to be shared in the constant string pool Prior to this fix the startup times for a large application were: 12.5 seconds (cold file caches) 8.5 seconds (warm file caches) After this fix: 9.7 seconds (cold file caches) 5.7 seconds (warm file caches) The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name. Differential Revision: https://reviews.llvm.org/D104488	2021-06-28 18:04:51 -07:00
Greg Clayton	b0572abf72	Improve performance when parsing symbol tables in mach-o files. Some larger projects were loading quite slowly with the current LLDB on macOS and macOS simulator builds. I did some instrument traces and found 3 main culprits: - an LLDB timer that was put into a function that was called too often - a std::set that was keeping track of the address of symbols that were already added - an unnamed function generator in ObjectFile that was going slow due to allocations In order to see this in action I ran the latest LLDB on a large application with many frameworks using the following method: (lldb) script import time; start_time = time.perf_counter() (lldb) file Large.app (lldb) script print(time.perf_counter() - start_time) I first ran "sudo purge" to clear the system file caches to simulate a cold startup of the debugger, followed by two iterations with warm file caches. Prior to this fix I was seeing the following timings: 17.68 (cold) 14.56 (warm 1) 14.52 (warm 2) After this fix I was seeing: 11.32 (cold) 8.43 (warm 1) 8.49 (warm 2) Differential Revision: https://reviews.llvm.org/D103504	2021-06-02 10:31:37 -07:00
Jonas Devlieghere	d97e9f1a3d	[lldb] Simplify ObjectFile::FindPlugin (NFC) Use early return to reduce the levels of indentation. Extract logic to find object in container into helper function.	2020-12-23 14:06:40 -08:00


183 Commits