llvm-project

Author	SHA1	Message	Date
Pavel Labath	17491d9130	[lldb] Remove data_offset arg from GetModuleSpecifications (#188978 ) - it is always passed as zero - a lot of plugins aren't using it correctly - the data extractor class already has the capability to look at a subset of bytes	2026-03-30 08:39:52 +02:00
Pavel Labath	08c94c0ac3	[lldb] Clear up GetModuleSpecifications return value confusion (#188276 ) Some plugins were returning the number of specifications they have added, while others were returning the total final number. Particularly devious plugins (Minidump) were clearing the specification list altogether. This resulted in nondeterministic failures (depending on plugin ininitialization order) in TestSBModule. This PR defines the problem away by having each plugin only return the specifications it is responsible for. If the caller wants to merge them, it is free to do so. This might be slighly less efficient, but this is hardly hot code. I'm not touching the ObjectFile::GetModuleSpecifications function (the caller of all these functions) as the PR is big enough, although the same approach might be warranted there as well. Fixes https://github.com/llvm/llvm-project/issues/178625.	2026-03-25 15:55:04 +01:00
Jason Molenda	2aa020f49b	[lldb][NFC] Module, ModuleSpec, GetSectionData use DataExtractorSP (#178347 ) In a PR last month I changed the ObjectFile CreateInstance etc methods to accept an optional DataExtractorSP instead of a DataBufferSP, and retain the extractor in a shared pointer internally in all of the ObjectFile subclasses. This is laying the groundwork for using a VirtualDataExtractor for some Mach-O binaries on macOS, where the segments of the binary are out-of-order in actual memory, and we add a lookup table to make it appear that the TEXT segment is at offset 0 in the Extractor, etc. Working on the actual implementation, I realized we were still using DataBufferSP's in ModuleSpec and Module, as well as in ObjectFile::GetModuleSpecifications. I originally was making a much larger NFC change where I had all ObjectFile subclasses operating on DataExtractors throughout their implementation, as well as in the DWARF parser. It was a very large patchset. Many subclasses start with their DataExtractor, then create smaller DataExtractors for parts of the binary image - the string table, the symbol table, etc., for processing. After consideration and discussion with Jonas, we agreed that a segment/section of a binary will never require a lookup table to access the bytes within it, so I changed VirtualDataExtractor::GetSubsetExtractorSP to (1) require that the Subset be contained within a single lookup table entry, and (2) return a simple DataExtractor bounded on that byte range. By doing this, I was able to remove all of my very-invasive changes to the ObjectFile subclass internals; it's only when they are operating on the entire binary image that care is needed. One pattern that subclasses like ObjectFileBreakpad use is to take an ArrayRef of the DataBuffer for a binary, then create a StringRef of that, then look for strings in it. With a VirtualDataExtractor and out-of-order binary segments, with gaps between them, this allows us to search the entire buffer looking for a string, and segfault when it gets to an unmapped region of the buffer. I added a VirtualDataExtractor::GetSubsetExtractorSP(0) which gets the largest contiguous memory region starting at offset 0 for this use case, and I added a comment about what was being done there because I know it is not obvious, and people not working on macOS wouldn't be familiar with the requirement. (when we have a ModuleSpec with a DataExtractor, any of the ObjectFile subclasses get a shot at Creating, so they all have to be able to iterate on these) rdar://148939795	2026-01-29 15:36:40 -08:00
Jason Molenda	e4c83b7b11	[lldb][NFC] Change ObjectFile argument type (#171574 ) The ObjectFile plugin interface accepts an optional DataBufferSP argument. If the caller has the contents of the binary, it can provide this in that DataBufferSP. The ObjectFile subclasses in their CreateInstance methods will fill in the DataBufferSP with the actual binary contents if it is not set. ObjectFile base class creates an ivar DataExtractor from the DataBufferSP passed in. My next patch will be a caller that creates a VirtualDataExtractor with the binary data, and needs to pass that in to the ObjectFile plugin, instead of the bag-of-bytes DataBufferSP. It builds on the previous patch changing ObjectFile's ivar from DataExtractor to DataExtractorSP so I could pass in a subclass in the shared ptr. And it will be using the VirtualDataExtractor that Jonas added in https://github.com/llvm/llvm-project/pull/168802 No behavior is changed by the patch; we're simply moving the creation of the DataExtractor to the caller, instead of a DataBuffer that is immediately used to set up the ObjectFile DataExtractor. The patch is a bit complicated because all of the ObjectFile subclasses have to initialize their DataExtractor to pass in to the base class. I ran the testsuite on macOS and on AArch64 Ubutnu. (btw David, I ran it under qemu on my M4 mac with SME-no-SVE again, Ubuntu 25.10, checked lshw(1) cpu capabilities, and qemu doesn't seem to be virtualizing the SME, that explains why the testsuite passes) rdar://148939795 --------- Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>	2025-12-11 10:08:56 -08:00
Jason Molenda	69511ae804	[lldb] Unwind through ARM Cortex-M exceptions automatically (#153922 ) When a processor faults/is interrupted/gets an exception, it will stop running code and jump to an exception catcher routine. Most processors will store the pc that was executing in a system register, and the catcher functions have special instructions to retrieve that & possibly other registers. It may then save those values to stack, and the author can add .cfi directives to tell lldb's unwinder where to find those saved values. ARM Cortex-M (microcontroller) processors have a simpler mechanism where a fixed set of registers are saved to the stack on an exception, and a unique value is put in the link register to indicate to the caller that this has taken place. No special handling needs to be written into the exception catcher, unless it wants to inspect these preserved values. And it is possible for a general stack walker to walk the stack with no special knowledge about what the catch function does. This patch adds an Architecture plugin method to allow an Architecture to override/augment the UnwindPlan that lldb would use for a stack frame, given the contents of the return address register. It resembles a feature where the LanguageRuntime can replace/augment the unwind plan for a function, but it is doing it at offset by one level. The LanguageRuntime is looking at the local register context and/or symbol name to decide if it will override the unwind rules. For the Cortex-M exception unwinds, we need to modify THIS frame's unwind plan if the CALLER's LR had a specific value. RegisterContextUnwind has to retrieve the caller's LR value before it has completely decided on the UnwindPlan it will use for THIS stack frame. This does mean that we will need one additional read of stack memory than we currently do when unwinding, on Armv7 Cortex-M targets. The unwinder walks the stack lazily, as stack frames are requested, and so now if you ask for 2 stack frames, we will read enough stack to walk 2 frames, plus we will read one extra word of memory, the spilled LR value from the stack. In practice, with 512-byte memory cache reads, this is unlikely to be a real performance hit. This PR includes a test with a yaml corefile description and a JSON ObjectFile, incorporating all of the necessary stack memory and symbol names from a real debug session I worked on. The architectural default unwind plans are used for all stack frames except the 0th because there's no instructions for the functions, and no unwind info. I may need to add an encoding of unwind fules to ObjectFileJSON in the future as we create more test cases like this. This PR depends on the yaml2macho-core utility from https://github.com/llvm/llvm-project/pull/153911 to run its API test. rdar://110663219	2025-09-09 14:11:39 -07:00
Pavel Labath	2c4f67794b	[lldb/cmake] Implicitly pass arguments to llvm_add_library (#142583 ) If we're not touching them, we don't need to do anything special to pass them along -- with one important caveat: due to how cmake arguments work, the implicitly passed arguments need to be specified before arguments that we handle. This isn't particularly nice, but the alternative is enumerating all arguments that can be used by llvm_add_library and the macros it calls (it also relies on implicit passing of some arguments to llvm_process_sources).	2025-06-04 11:33:37 +02:00
Greg Clayton	8ac359ba0d	Add complete ObjectFileJSON support for sections. (#129916 ) Sections now support specifying: - user IDs - file offset/size - alignment - flags - bool values for fake, encrypted and thread specific sections	2025-03-07 15:34:27 -08:00
Greg Clayton	27901cec0e	Add subsection and permissions support to ObjectFileJSON. (#129801 ) This patch adds the ability to create subsections in a section and allows permissions to be specified.	2025-03-04 16:19:20 -08:00
Greg Clayton	7b596ce362	[lldb] Fix ObjectFileJSON to section addresses. (#129648 ) ObjectFileJSON sections didn't work, they were set to zero all of the time. Fixed the bug and fixed the test to ensure it was testing real values.	2025-03-04 14:35:42 -08:00
Jonas Devlieghere	0e5cdbf07e	[lldb] Make ObjectFileJSON loadable as a module This patch adds support for creating modules from JSON object files. This is necessary for the crashlog use case where we don't have either a module or a symbol file. In that case the ObjectFileJSON serves as both. The patch adds support for an object file type (i.e. executable, shared library, etc). It also adds the ability to specify sections, which is necessary in order specify symbols by address. Finally, this patch improves error handling and fixes a bug where we wouldn't read more than the initial 512 bytes in GetModuleSpecifications. Differential revision: https://reviews.llvm.org/D148062	2023-04-13 14:08:19 -07:00
Jonas Devlieghere	cf3524a574	[lldb] Introduce new SymbolFileJSON and ObjectFileJSON Introduce a new object and symbol file format with the goal of mapping addresses to symbol names. I'd like to think of is as an extremely simple textual symtab. The file format consists of a triple, a UUID and a list of symbols. JSON is used for the encoding, but that's mostly an implementation detail. The goal of the format was to be simple and human readable. The new file format is motivated by two use cases: - Stripped binaries: when a binary is stripped, you lose the ability to do thing like setting symbolic breakpoints. You can keep the unstripped binary around, but if all you need is the stripped symbols then that's a lot of overhead. Instead, we could save the stripped symbols to a file and load them in the debugger when needed. I want to extend llvm-strip to have a mode where it emits this new file format. - Interactive crashlogs: with interactive crashlogs, if we don't have the binary or the dSYM for a particular module, we currently show an unnamed symbol for those frames. This is a regression compared to the textual format, that has these frames pre-symbolicated. Given that this information is available in the JSON crashlog, we need a way to tell LLDB about it. With the new symbol file format, we can easily synthesize a symbol file for each of those modules and load them to symbolicate those frames. Here's an example of the file format: { "triple": "arm64-apple-macosx13.0.0", "uuid": "36D0CCE7-8ED2-3CA3-96B0-48C1764DA908", "symbols": [ { "name": "main", "type": "code", "size": 32, "address": 4294983568 }, { "name": "foo", "type": "code", "size": 8, "address": 4294983560 } ] } Differential revision: https://reviews.llvm.org/D145180	2023-03-08 20:56:11 -08:00

11 Commits