351 Commits

Author SHA1 Message Date
Steven Wu
1b8db068ed
[PrefixMap] Teach lldb to auto-load compilation-prefix-map.json (#187145)
Add a LoadCompilationPrefixMap() helper in SymbolFile::FindPlugin that
walks up from the symbol file's directory looking for a
compilation-prefix-map.json file. When found, each key→value entry is
applied to the module's source path mapping list, allowing LLDB to
resolve source file paths that were rewritten by -fdebug-prefix-map at
build time without requiring manual `settings set target.source-map`.

The JSON file format maps fake paths (as written into debug info) back
to their real on-disk counterparts:
  { "/fake/srcdir": "/real/srcdir" }

Directory results are cached so the filesystem is walked at most once
per unique directory across all modules loaded in a session.

Also apply the module's source path remappings in
SymbolFileDWARFDebugMap::ParseCompileUnitAtIndex when constructing
compile units from N_SO stabs. This mirrors what MakeAbsoluteAndRemap
does for the dSYM case so that fake paths baked into the debug map are
transparently resolved to real paths.

rdar://84824567

Assisted-By: Claude
2026-03-18 09:11:28 -07:00
Michael Buch
d9eba8b355
[lldb][Module] Don't try to locate scripting resources without a ScriptInterpreter (#187091)
I'm in the process of moving `SanitizedScriptingModuleName` into
`ScriptInterpreter` as a `virtual` API. The nullptr check inside the
constructor made that more difficult because it implied we may not have
a `ScriptInterpreter` available to call the sanitization API on. Really
the `nullptr` check is redundant because even if we succesfully sanitize
and then locate some scripts, `Module::LoadScriptingResourceInTarget`
bails out if we don't have a `ScriptInterpreter`.

This patch moves the early exit in `LoadScriptingResourceInTarget` to
before we make the call to `LocateExecutableScriptingResources`. That
way we ensure we never get to `SanitizedScriptingModuleName` without a
valid `ScriptInterpreter`.
2026-03-18 00:09:15 +00:00
Michael Buch
361987c01a
[lldb][Module] Remove feedback_stream parameter from LoadScriptingResources (#186787)
I'm in the process of making `LoadScriptingResources` interactively ask
a user whether to load a script. I'd like to turn the existing warning
into the prompt. The simplest way to achieve this is to not print into a
`feedback_stream` parameter, and instead create a prompt right there.
This patch removes the `feedback_stream` parameter and emits a
`ReportWarning` instead. If we get around to adding the prompt instead
of the warning, those changes will be simpler to review. But even if we
don't end up replacing the warning with a prompt, moving away from
output parameters and towards more structured error reporting is a
nice-to-have (e.g., the `warning` prefix is now colored, IDEs have more
flexibility on how to present the warning, etc.).

For a command-line user nothing should change with this patch (apart
from `warning:` being highlighted).
2026-03-16 15:18:23 +00:00
Michael Buch
adb04f86f8
[lldb][Module][NFC] Use raw string literal and formatv-style format in LoadScriptingResourceInTarget (#186411)
Makes it obvious what the warning will look like (with the indenentation
etc.). Also adds a test since we had no coverage for the warning before
(as far as I'm aware).
2026-03-13 15:13:35 +00:00
Michael Buch
b3bc1f543f
[lldb][Module][NFC] Use early-return style in LoadScriptingResourceInTarget (#186392)
Planning on adding more to this function/loop soon. Making it
early-return style (as suggested by the LLVM style guide) makes those
changes easier to reason about.

Drive-by:
* Reduced the indentation of the loop by doing an early-continue if the
`FileSpec` is invalid or doesn't exist
2026-03-13 14:39:13 +00:00
Jason Molenda
3f024d0835
[lldb] A few small code modernizations and cleanups [NFC] (#182656)
I was reading through ObjectContainerBSDArchive and came across some
dead method decls, a less-than-completely-clear `shared_ptr` typedef in
`ObjectContainerBSDArchive::Archive` for a shared_ptr<Archive> which was
a little unclear when reading a decl like `shared_ptr archive_sp;` for a
local variable.
2026-02-23 22:03:40 -08:00
Augusto Noronha
b3f3b57aed
[lldb] Speed up SymbolContextList::AppendIfUnique (#181952)
d7fb086668dff68 changed some calls from SymbolContextList::Append to
SymbolContextList::AppendIfUnique. This has unfortunately caused a huge
slow down in operations involving a large amount of symbol contexts (for
example, trying to autocomplete from an empty input "b <TAB>" will add
every function to the list), since AppendIfUnique scans the entire
symbol context list. Speed this up by adding a hash set to quickly
answer whether a symbol context is on the list or not.

This takes the time from running "b <TAB>" when debugging yaml2obj on my
machine from 600 seconds down to 13, which is about the same as before
d7fb086668dff68.

Note that AppendIfUnique has a logic error, which has been present since
its introduction. This has to do with the behavior controlled by
"merge_symbol_into_function", which will try to merge symbols with
symbol context containing the equivalent function to that symbol.

The previous patch tried to correct this by adding
CompareConsideringPossiblyNullSymbol(), which is not quite correct.

With CompareConsideringPossiblyNullSymbol(), if symbols are added in
this order:

- Symbol context without symbol.
- Equivalent symbol context with symbol.

The list will have only one symbol context WITHOUT the symbol.

If we stop using CompareConsideringPossiblyNullSymbol() and instead go
back to the == operator which d7fb086668dff68 introduced, with symbols
added in this order, the following will happen:

- Symbol context without symbol.
- Equivalent symbol context with symbol.
- The bare symbol, with "merge_symbol_into_function = true", the list
will have the same symbol twice.

This patch does not attempt to solve this, and instead focuses on the
performance issue d7fb086668dff68 introduced.

rdar://170477680
2026-02-23 11:41:13 -08:00
Jonas Devlieghere
091296f3e3
[lldb] Revert scripted symbol locator (#181945)
This revert #181334 and its follow-up PRs (including #181488, #181492,
#181493, #181494 and #181498) as well as Ismail's documentation changes
(#181594, #181717). The original commit causes a test failure in CI
(https://github.com/llvm/llvm-project/issues/181938) but the more I look
at the patch, the more I'm convinced it was not ready to land. It will
be easier to iterate on the feedback by re-landing this than by using
post-commit review.
2026-02-17 16:52:21 -08:00
rchamala
1ee03d1e09
[lldb] Add ScriptedSymbolLocator plugin for source file resolution (#181334)
## Summary                                                        
                                                                    
Based on discussion from
[RFC](https://discourse.llvm.org/t/rfc-python-callback-for-source-file-resolution/83545),
this PR adds a new `SymbolLocatorScripted` plugin that allows Python
scripts to implement custom symbol and source file resolution logic.
This enables downstream users to build custom symbol servers, source
file remapping, and build artifact resolution entirely in Python.
                                                                    
  ### Changes

- Adds `LocateSourceFile()` to the SymbolLocator plugin interface,
called during source path resolution with a fully loaded `ModuleSP`, so
the plugin has access to the module's UUID, file paths, and symbols.
- Adds `SymbolLocatorScripted` plugin that delegates all four
SymbolLocator methods (`LocateExecutableObjectFile`,
`LocateExecutableSymbolFile`, `DownloadObjectAndSymbolFile`,
`LocateSourceFile`) to a user-provided Python class.
- Adds `ScriptedSymbolLocatorPythonInterface` to bridge C++ calls to
Python, with proper GIL management and error handling.
- Results for `LocateSourceFile` are cached per (module UUID, source
file) pair.
- The Python class is configured via: `settings set
plugin.symbol-locator.scripted.script-class module.ClassName`

  ### Python class interface

  ```python
  class MyLocator:
      def __init__(self, exe_ctx, args): ...
      def locate_source_file(self, module, original_source_file):
  ...
      def locate_executable_object_file(self, module_spec): ...
      def locate_executable_symbol_file(self, module_spec,
  default_search_paths): ...
      def download_object_and_symbol_file(self, module_spec,
  force_lookup, copy_executable): ...
```

  ### Test plan
```
  Added TestScriptedSymbolLocator.py with 3 test cases:
  - test_locate_source_file — verifies the locator resolves source
  files, receives a valid SBModule with UUID, and remaps paths correctly
  - test_locate_source_file_none_fallthrough — verifies returning
None falls through to default LLDB resolution, and that having no script
  class set works normally
  - test_invalid_script_class — verifies graceful handling of
  invalid class names without crashing
```

Co-authored-by: Rahul Reddy Chamala <rachamal@fb.com>
2026-02-14 07:39:00 -08:00
Alex Langford
85c96ff3a0
[lldb][NFC] Remove unused method Module::SetUUID (#178803) 2026-02-02 10:22:20 -08:00
Jason Molenda
2aa020f49b
[lldb][NFC] Module, ModuleSpec, GetSectionData use DataExtractorSP (#178347)
In a PR last month I changed the ObjectFile CreateInstance etc methods
to accept an optional DataExtractorSP instead of a DataBufferSP, and
retain the extractor in a shared pointer internally in all of the
ObjectFile subclasses. This is laying the groundwork for using a
VirtualDataExtractor for some Mach-O binaries on macOS, where the
segments of the binary are out-of-order in actual memory, and we add a
lookup table to make it appear that the TEXT segment is at offset 0 in
the Extractor, etc. Working on the actual implementation, I realized we
were still using DataBufferSP's in ModuleSpec and Module, as well as in
ObjectFile::GetModuleSpecifications.

I originally was making a much larger NFC change where I had all
ObjectFile subclasses operating on DataExtractors throughout their
implementation, as well as in the DWARF parser. It was a very large
patchset. Many subclasses start with their DataExtractor, then create
smaller DataExtractors for parts of the binary image - the string table,
the symbol table, etc., for processing.

After consideration and discussion with Jonas, we agreed that a
segment/section of a binary will never require a lookup table to access
the bytes within it, so I changed
VirtualDataExtractor::GetSubsetExtractorSP to (1) require that the
Subset be contained within a single lookup table entry, and (2) return a
simple DataExtractor bounded on that byte range. By doing this, I was
able to remove all of my very-invasive changes to the ObjectFile
subclass internals; it's only when they are operating on the entire
binary image that care is needed.

One pattern that subclasses like ObjectFileBreakpad use is to take an
ArrayRef of the DataBuffer for a binary, then create a StringRef of
that, then look for strings in it. With a VirtualDataExtractor and
out-of-order binary segments, with gaps between them, this allows us to
search the entire buffer looking for a string, and segfault when it gets
to an unmapped region of the buffer. I added a
VirtualDataExtractor::GetSubsetExtractorSP(0) which gets the largest
contiguous memory region starting at offset 0 for this use case, and I
added a comment about what was being done there because I know it is not
obvious, and people not working on macOS wouldn't be familiar with the
requirement. (when we have a ModuleSpec with a DataExtractor, any of the
ObjectFile subclasses get a shot at Creating, so they all have to be
able to iterate on these)

rdar://148939795
2026-01-29 15:36:40 -08:00
Alex Langford
956485d39a
[lldb][NFC] Remove ObjectFile::ResolveSymbolForAddress (#177479)
Nothing overrides this method and the base class's implementation
returns nullptr.
2026-01-27 15:23:09 -08:00
Augusto Noronha
208553460a
[lldb] Prefer exact address match when looking up symbol by address (#172055)
The current behavior will pick the first symbol that contains the
address, this causes LLDB to pick the wrong symbol when looking for
swift reflection metadata on Linux, as in that case it is valid for a
symbol to completely encompass another one.

Instead, this function should prefer the symbol which is an exact, if it
exists. As a bonus, this should also be faster in the vast majority of
the cases, as we probably query symbols by their exact address most of
the time.

rdar://166344740
2025-12-17 10:51:14 -08:00
Alex Langford
34f6303293
[lldb][NFCI] Make LookupInfo const (#171901)
Instead of changing an existing LookupInfo after creation, let's make
them constant.
2025-12-15 10:51:48 -08:00
Jason Molenda
e4c83b7b11
[lldb][NFC] Change ObjectFile argument type (#171574)
The ObjectFile plugin interface accepts an optional DataBufferSP
argument. If the caller has the contents of the binary, it can provide
this in that DataBufferSP. The ObjectFile subclasses in their
CreateInstance methods will fill in the DataBufferSP with the actual
binary contents if it is not set.
ObjectFile base class creates an ivar DataExtractor from the
DataBufferSP passed in.

My next patch will be a caller that creates a VirtualDataExtractor with
the binary data, and needs to pass that in to the ObjectFile plugin,
instead of the bag-of-bytes DataBufferSP. It builds on the previous
patch changing ObjectFile's ivar from DataExtractor to DataExtractorSP
so I could pass in a subclass in the shared ptr. And it will be using
the VirtualDataExtractor that Jonas added in
https://github.com/llvm/llvm-project/pull/168802

No behavior is changed by the patch; we're simply moving the creation of
the DataExtractor to the caller, instead of a DataBuffer that is
immediately used to set up the ObjectFile DataExtractor. The patch is a
bit complicated because all of the ObjectFile subclasses have to
initialize their DataExtractor to pass in to the base class.

I ran the testsuite on macOS and on AArch64 Ubutnu. (btw David, I ran it
under qemu on my M4 mac with SME-no-SVE again, Ubuntu 25.10, checked
lshw(1) cpu capabilities, and qemu doesn't seem to be virtualizing the
SME, that explains why the testsuite passes)

rdar://148939795

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
2025-12-11 10:08:56 -08:00
Michael Buch
1b7f272906
[lldb][Module] Only log SDK search error once per debugger session (#171820)
Currently if we are debugging an app that was compiled against an SDK
that we don't know about on the host, then every time we evaluate an
expression we get following spam on the console:
```
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
error: Error while searching for Xcode SDK: Unrecognized SDK type: <some SDK>
```

It is not a fatal error but the way we spam it pretty much ruins the
debugging experience.

This patch makes it so we only log this error once per debugger session.

Not sure how to best test it since we'd need to build a program with a
particular SDK and then make it unrecognized by LLDB. Confirmed manually
that the error only gets reported once after this patch.
2025-12-11 17:25:45 +00:00
Felipe de Azevedo Piovezan
f27fbca37c
[lldb][NFC] Replace const std::vector& with ArrayRef in APIs (#170834)
Inside the LLVM codebase, const vector& should just be ArrayRef, as this
more general API works both with vectors, SmallVectors and
SmallVectorImpl, as well as with single elements.

This commit replaces two uses introduced in
https://github.com/llvm/llvm-project/pull/168797 .
2025-12-08 16:59:32 +00:00
Augusto Noronha
d7fb086668
[lldb] Refactor LookupInfo object to be per-language (#168797)
Some months ago, the LookupInfo constructor logic was refactored to not
depend on language specific logic, and use languages plugins instead. In
this refactor, when the language type is unknown, a single LookupInfo
object will handle multiple languages. This doesn't work well, as
multiple languages might want to configure the LookupInfo object in
different ways. For example, different languages might want to set the
m_lookup_name differently from each other, but the previous
implementation would pick the first name a language provided, and
effectively ignored every other language. Other fields of the LookupInfo
object are also configured in incompatible ways.

This approach doesn't seem to be a problem upstream, since only the
C++/Objective-C language plugins are available, but it broke downstream
on the Swift fork, as adding Swift to the list of default languages when
the language type is unknown breaks C++ tests.

This patch makes it so instead of building a single LookupInfo object
for multiple languages, one LookupInfo object is built per language
instead.

rdar://159531216
2025-12-03 16:15:36 -08:00
Alex Langford
897cc3ee42
[lldb][NFC] Remove plugin headers from Module (#167789)
As of e4a672bc17a2a, lldbCore is free of plugins. These headers are no
longer needed.
2025-11-12 16:19:17 -08:00
Michael Buch
f89059140b
[lldb][Expression] Encode Module and DIE UIDs into function AsmLabels (#148877)
LLDB currently attaches `AsmLabel`s to `FunctionDecl`s such that that
the `IRExecutionUnit` can determine which mangled name to call (we can't
rely on Clang deriving the correct mangled name to call because the
debug-info AST doesn't contain all the info that would be encoded in the
DWARF linkage names). However, we don't attach `AsmLabel`s for structors
because they have multiple variants and thus it's not clear which
mangled name to use. In the [RFC on fixing expression evaluation of
abi-tagged
structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
we discussed encoding the structor variant into the `AsmLabel`s.
Specifically in [this
thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
we discussed that the contents of the `AsmLabel` are completely under
LLDB's control and we could make use of it to uniquely identify a
function by encoding the exact module and DIE that the function is
associated with (mangled names need not be enough since two identical
mangled symbols may live in different modules). So if we already have a
custom `AsmLabel` format, we can encode the structor variant in a
follow-up (the current idea is to append the structor variant as a
suffix to our custom `AsmLabel` when Clang emits the mangled name into
the JITted IR). Then we would just have to teach the `IRExecutionUnit`
to pick the correct structor variant DIE during symbol resolution. The
draft of this is available
[here](https://github.com/llvm/llvm-project/pull/149827)

This patch sets up the infrastructure for the custom `AsmLabel` format
by encoding the module id, DIE id and mangled name in it.

**Implementation**

The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is:
`$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this
label and then do a lookup by DIE ID (or mangled name into the module if
the encoded DIE is a declaration).

Depends on https://github.com/llvm/llvm-project/pull/151355
2025-08-01 07:21:41 +01:00
Ely Ronnen
22dfe9cb58
[lldb-dap] Reuse source object logics (#141426)
Refactor code revolving source objects such that most logics will be
reused.

The main change is to expose a single `CreateSource(addr, target)` that
can return either a normal or an assembly source object, and call
`ShouldDisplayAssemblySource()` only from this function instead of
multiple places across the code.

Other functions can use `source.IsAssemblySource()` in order to check
which type the source is.
2025-05-31 08:47:18 +02:00
royitaqi
967434aa32
[lldb] Remerge #136236 (Avoid force loading symbols in statistics collection (#136795)
Fix a [test
failure](https://github.com/llvm/llvm-project/pull/136236#issuecomment-2819772879)
in #136236, apply a minor renaming of statistics, and remerge. See
details below.

# Changes in #136236

Currently, `DebuggerStats::ReportStatistics()` calls
`Module::GetSymtab(/*can_create=*/false)`, but then the latter calls
`SymbolFile::GetSymtab()`. This will load symbols if haven't yet. See
stacktrace below.

The problem is that `DebuggerStats::ReportStatistics` should be
read-only. This is especially important because it reports stats for
symtab parsing/indexing time, which could be affected by the reporting
itself if it's not read-only.

This patch fixes this problem by adding an optional parameter
`SymbolFile::GetSymtab(bool can_create = true)` and receiving the
`false` value passed down from `Module::GetSymtab(/*can_create=*/false)`
when the call is initiated from `DebuggerStats::ReportStatistics()`.

---

Notes about the following stacktrace:
1. This can be reproduced. Create a helloworld program on **macOS** with
dSYM, add `settings set target.preload-symbols false` to `~/.lldbinit`,
do `lldb a.out`, then `statistics dump`.
2. `ObjectFile::GetSymtab` has `llvm::call_once`. So the fact that it
called into `ObjectFileMachO::ParseSymtab` means that the symbol table
is actually being parsed.

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x0000000124c4d5a0 LLDB`ObjectFileMachO::ParseSymtab(this=0x0000000111504e40, symtab=0x0000600000a05e00) at ObjectFileMachO.cpp:2259:44
  * frame #1: 0x0000000124fc50a0 LLDB`lldb_private::ObjectFile::GetSymtab()::$_0::operator()(this=0x000000016d35c858) const at ObjectFile.cpp:761:9
    frame #5: 0x0000000124fc4e68 LLDB`void std::__1::__call_once_proxy[abi:v160006]<std::__1::tuple<lldb_private::ObjectFile::GetSymtab()::$_0&&>>(__vp=0x000000016d35c7f0) at mutex:652:5
    frame #6: 0x0000000198afb99c libc++.1.dylib`std::__1::__call_once(unsigned long volatile&, void*, void (*)(void*)) + 196
    frame #7: 0x0000000124fc4dd0 LLDB`void std::__1::call_once[abi:v160006]<lldb_private::ObjectFile::GetSymtab()::$_0>(__flag=0x0000600003920080, __func=0x000000016d35c858) at mutex:670:9
    frame #8: 0x0000000124fc3cb0 LLDB`void llvm::call_once<lldb_private::ObjectFile::GetSymtab()::$_0>(flag=0x0000600003920080, F=0x000000016d35c858) at Threading.h:88:5
    frame #9: 0x0000000124fc2bc4 LLDB`lldb_private::ObjectFile::GetSymtab(this=0x0000000111504e40) at ObjectFile.cpp:755:5
    frame #10: 0x0000000124fe0a28 LLDB`lldb_private::SymbolFileCommon::GetSymtab(this=0x0000000104865200) at SymbolFile.cpp:158:39
    frame #11: 0x0000000124d8fedc LLDB`lldb_private::Module::GetSymtab(this=0x00000001113041a8, can_create=false) at Module.cpp:1027:21
    frame #12: 0x0000000125125bdc LLDB`lldb_private::DebuggerStats::ReportStatistics(debugger=0x000000014284d400, target=0x0000000115808200, options=0x000000014195d6d1) at Statistics.cpp:329:30
    frame #13: 0x0000000125672978 LLDB`CommandObjectStatsDump::DoExecute(this=0x000000014195d540, command=0x000000016d35d820, result=0x000000016d35e150) at CommandObjectStats.cpp:144:18
    frame #14: 0x0000000124f29b40 LLDB`lldb_private::CommandObjectParsed::Execute(this=0x000000014195d540, args_string="", result=0x000000016d35e150) at CommandObject.cpp:832:9
    frame #15: 0x0000000124efbd70 LLDB`lldb_private::CommandInterpreter::HandleCommand(this=0x0000000141b22f30, command_line="statistics dump", lazy_add_to_history=eLazyBoolCalculate, result=0x000000016d35e150, force_repeat_command=false) at CommandInterpreter.cpp:2134:14
    frame #16: 0x0000000124f007f4 LLDB`lldb_private::CommandInterpreter::IOHandlerInputComplete(this=0x0000000141b22f30, io_handler=0x00000001419b2aa8, line="statistics dump") at CommandInterpreter.cpp:3251:3
    frame #17: 0x0000000124d7b5ec LLDB`lldb_private::IOHandlerEditline::Run(this=0x00000001419b2aa8) at IOHandler.cpp:588:22
    frame #18: 0x0000000124d1e8fc LLDB`lldb_private::Debugger::RunIOHandlers(this=0x000000014284d400) at Debugger.cpp:1225:16
    frame #19: 0x0000000124f01f74 LLDB`lldb_private::CommandInterpreter::RunCommandInterpreter(this=0x0000000141b22f30, options=0x000000016d35e63c) at CommandInterpreter.cpp:3543:16
    frame #20: 0x0000000122840294 LLDB`lldb::SBDebugger::RunCommandInterpreter(this=0x000000016d35ebd8, auto_handle_events=true, spawn_thread=false) at SBDebugger.cpp:1212:42
    frame #21: 0x0000000102aa6d28 lldb`Driver::MainLoop(this=0x000000016d35ebb8) at Driver.cpp:621:18
    frame #22: 0x0000000102aa75b0 lldb`main(argc=1, argv=0x000000016d35f548) at Driver.cpp:829:26
    frame #23: 0x0000000198858274 dyld`start + 2840
```

# Changes in this PR top of the above

Fix a [test
failure](https://github.com/llvm/llvm-project/pull/136236#issuecomment-2819772879)
in `TestStats.py`. The original version of the added test checks that
all modules have symbol count zero when `target.preload-symbols ==
false`. The test failed on macOS. Due to various reasons, on macOS,
symbols can be loaded for dylibs even with that setting, but not for the
main module. For now, the fix of the test is to limit the assertion to
only the main module. The test now passes on macOS. In the future, when
we have a way to control a specific list of plug-ins to be loaded, there
may be a configuration that this test can use to assert that all modules
have symbol count zero.

Apply a minor renaming of statistics, per the
[suggestion](https://github.com/llvm/llvm-project/pull/136226#issuecomment-2825080275)
in #136226 after merge.
2025-04-24 17:23:41 -07:00
Shubham Sandeep Rastogi
08b4c52540 Revert "[lldb] Avoid force loading symbols in statistics collection (#136236)"
This reverts commit d5b40c71f6be972f677de5d9886f91866df007b5.

This change broke greendragon lldb test:

lldb-api :: commands/statistics/basic/TestStats.py

And is therefore being reverted.
2025-04-21 17:19:54 -07:00
royitaqi
d5b40c71f6
[lldb] Avoid force loading symbols in statistics collection (#136236)
Currently, `DebuggerStats::ReportStatistics()` calls
`Module::GetSymtab(/*can_create=*/false)`, but then the latter calls
`SymbolFile::GetSymtab()`. This will load symbols if haven't yet. See
stacktrace below.

The problem is that `DebuggerStats::ReportStatistics` should be
read-only. This is especially important because it reports stats for
symtab parsing/indexing time, which could be affected by the reporting
itself if it's not read-only.

This patch fixes this problem by adding an optional parameter
`SymbolFile::GetSymtab(bool can_create = true)` and receive the `false`
value passed down from `Module::GetSymtab(/*can_create=*/false)` when
the call was initiated from `DebuggerStats::ReportStatistics()`.
2025-04-21 16:53:14 -07:00
Dmitry Vasilyev
e4a672bc17
[LLDB] Reapply refactored CPlusPlusLanguage::MethodName to break lldb-server dependencies (#135033)
The original PR is #132274.

Co-authored-by: @bulbazord Alex Langford
2025-04-14 14:30:09 +04:00
David Spickett
a29be9f28e
Revert "[LLDB] Refactored CPlusPlusLanguage::MethodName to break lldb-server dependencies" (#134995)
Reverts llvm/llvm-project#132274

Broke a test on LLDB Widows on Arm:
https://lab.llvm.org/buildbot/#/builders/141/builds/7726
```
FAIL: test_dwarf (lldbsuite.test.lldbtest.TestExternCSymbols.test_dwarf)
<...>
    self.assertTrue(self.res.Succeeded(), msg + output)

AssertionError: False is not true : Command 'expression -- foo()' did not return successfully

Error output:

error: Couldn't look up symbols:

  int foo(void)

Hint: The expression tried to call a function that is not present in the target, perhaps because it was optimized out by the compiler.
```
2025-04-09 13:16:23 +01:00
Dmitry Vasilyev
fbc6241d3a
[LLDB] Refactored CPlusPlusLanguage::MethodName to break lldb-server dependencies (#132274)
This patch addresses the issue #129543.
After this patch the size of lldb-server is reduced by 9MB.

Co-authored-by: @bulbazord Alex Langford
2025-04-09 09:11:56 +04:00
jimingham
347c5a7af5
Add a new affordance that the Python module in a dSYM (#133290)
So the dSYM can be told what target it has been loaded into.

When lldb is loading modules, while creating a target, it will run
"command script import" on any Python modules in Resources/Python in the
dSYM. However, this happens WHILE the target is being created, so it is
not yet in the target list. That means that these scripts can't act on
the target that they a part of when they get loaded.

This patch adds a new python API that lldb will call:

__lldb_module_added_to_target

if it is defined in the module, passing in the Target the module was
being added to, so that code in these dSYM's don't have to guess.
2025-04-01 09:54:06 -07:00
David Peixotto
1d1b20a19e
[lldb] Avoid force loading symbol files in statistics collection (#129593)
This commit modifies the `DebuggerStats::ReportStatistics`
implementation to avoid loading symbol files for unloaded symbols. We
collect stats on debugger shutdown and without this change it can cause
the debugger to hang for a long while on shutdown if they symbols were
not previously loaded (e.g. `settings set target.preload-symbols
false`).

The implementation is done by adding an optional parameter to
`Module::GetSymtab` to control if the corresponding symbol file will be
loaded in the same way that can control it for `Module::GetSymbolFile`.
2025-03-10 10:54:11 -07:00
Pavel Labath
3736de2e3c
[lldb] Use Function::GetAddress in Module::FindFunctions (#124938)
The original code resulted in a misfire in the symtab vs. debug info
deduplication code, which caused us to return the same function twice
when searching via a regex (for functions whose entry point is also not
the lowest address).
2025-01-31 09:12:56 +01:00
jeffreytan81
24feaab838
Fix statistics dump to report per-target (#113723)
"statistics dump" currently report the statistics of all targets in
debugger instead of current target. This is wrong because there is a
"statistics dump --all-targets" option that supposed to include
everything.

This PR fixes the issue by only report statistics for current target
instead of all. It also includes the change to reset statistics debug
info/symbol table parsing/indexing time during debugger destroy. This is
required so that we report current statistics if we plan to reuse
lldb/lldb-dap across debug sessions

---------

Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
2024-11-17 20:36:54 -08:00
Adrian Prantl
697a455e6f
More aggressively deduplicate global warnings based on contents. (#112801)
I've been getting complaints from users being spammed by -gmodules
missing file warnings going out of control because each object file
depends on an entire DAG of PCM files that usually are all missing at
once. To reduce this problem, this patch does two things:

1. Module now maintains a DenseMap<hash, once> that is used to display
each warning only once, based on its actual text.

2. The PCM warning itself is reworded to include less details, such as
the DIE offset, which is only useful to LLDB developers, who can get
this from the dwarf log if they need it. Because the detail is omitted
the hashing from (1) deduplicates the warnings.

rdar://138144624
2024-10-19 09:38:25 -07:00
Youngsuk Kim
d7796855b8
[lldb] Nits on uses of llvm::raw_string_ostream (NFC) (#108745)
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly

( 65b13610a5226b84889b923bae884ba395ad084d for further reference )

* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Avoid unneeded calls to raw_string_ostream::str(), to avoid excess
indirection.
2024-09-16 00:26:51 -04:00
Adrian Prantl
0642cd768b
[lldb] Turn lldb_private::Status into a value type. (#106163)
This patch removes all of the Set.* methods from Status.

This cleanup is part of a series of patches that make it harder use the
anti-pattern of keeping a long-lives Status object around and updating
it while dropping any errors it contains on the floor.

This patch is largely NFC, the more interesting next steps this enables
is to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()

Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form

`    ResultTy DoFoo(Status& error)
`
to

`    llvm::Expected<ResultTy> DoFoo()
`
How to read this patch?

The interesting changes are in Status.h and Status.cpp, all other
changes are mostly

` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git
grep -l SetErrorString lldb/source)
`
plus the occasional manual cleanup.
2024-08-27 10:59:31 -07:00
Jason Molenda
7ad073a45b
[lldb] Change Module to have a concrete UnwindTable, update (#101130)
Currently a Module has a std::optional<UnwindTable> which is created
when the UnwindTable is requested from outside the Module. The idea is
to delay its creation until the Module has an ObjectFile initialized,
which will have been done by the time we're doing an unwind.

However, Module::GetUnwindTable wasn't doing any locking, so it was
possible for two threads to ask for the UnwindTable for the first time,
one would be created and returned while another thread would create one,
destroy the first in the process of emplacing it. It was an uncommon
crash, but it was possible.

Grabbing the Module's mutex would be one way to address it, but when
loading ELF binaries, we start creating the SymbolTable on one thread
(ObjectFileELF) grabbing the Module's mutex, and then spin up worker
threads to parse the individual DWARF compilation units, which then try
to also get the UnwindTable and deadlock if they try to get the Module's
mutex.

This changes Module to have a concrete UnwindTable as an ivar, and when
it adds an ObjectFile or SymbolFileVendor, it will call the Update
method on it, which will re-evaluate which sections exist in the
ObjectFile/SymbolFile. UnwindTable used to have an Initialize method
which set all the sections, and an Update method which would set some of
them if they weren't set. I unified these with the Initialize method
taking a `force` option to re-initialize the section pointers even if
they had been done already before.

This is addressing a rare crash report we've received, and also a
failure Adrian spotted on the -fsanitize=address CI bot last week, it's
still uncommon with ASAN but it can happen with the standard testsuite.

rdar://128876433
2024-08-01 17:43:25 -07:00
Jason Molenda
6a0ec8e25c
[lldb] Revive shell test after updating UnwindTable (#86770)
In
     commit 2f63718f8567413a1c596bda803663eb58d6da5a
     Author: Jason Molenda <jmolenda@apple.com>
     Date:   Tue Mar 26 09:07:15 2024 -0700

[lldb] Don't clear a Module's UnwindTable when adding a SymbolFile
(#86603)

I stopped clearing a Module's UnwindTable when we add a SymbolFile to
avoid the memory management problems with adding a symbol file
asynchronously while the UnwindTable is being accessed on another
thread. This broke the target-symbols-add-unwind.test shell test on
Linux which removes the DWARF debub_frame section from a binary, loads
it, then loads the unstripped binary with the DWARF debug_frame section
and checks that the UnwindPlans for a function include debug_frame.

I originally decided that I was willing to sacrifice the possiblity of
additional unwind sources from a symbol file because we rely on assembly
emulation so heavily, they're rarely critical. But there are targets
where we we don't have emluation and rely on things like DWARF
debug_frame a lot more, so this probably wasn't a good choice.

This patch adds a new UnwindTable::Update method which looks for any new
sources of unwind information and adds it to the UnwindTable, and calls
that after a new SymbolFile has been added to a Module.
2024-03-27 09:25:46 -07:00
Jason Molenda
2f63718f85
[lldb] Don't clear a Module's UnwindTable when adding a SymbolFile (#86603)
Fixing a crash in lldb when `symbols.auto-download` setting is enabled.
When doing a backtrace, this feature has lldb search for a SymbolFile
for stack frames when we are backtracing, and add them either
synchoronously or asynchronously, depending on the specific setting
used.

Module::SetSymbolFileFileSpec clears the Module's UnwindTable, once we
find a new SymbolFile. We may be adding a source of unwind information
that we did not have when lldb was working only with the executable
binary.

What happens in practice is that we're using a reference to the Module's
UnwindTable, and then the other thread getting the SymbolFile clears it
and now the first thread is referring to freed memory and we can crash.
When built with address sanitizer, it crashes much more reliably.

Given that unwind information used for exception handling -- eh_frame,
compact unwind -- is present in executable binaries, the only thing
we're likely to *add* would be DWARF's `debug_frame` if that was also
available. The actual value of re-creating the UnwindTable when we have
added a SymbolFile is not large.

I also tried fixing this by changing the Module to have a shared_ptr to
the UnwindTable, so we could have two different UnwindTable's in use
simultaneously for a brief period. This would be fine TODAY, but it
introduces a very subtle bug that someone will have a heck of a time
figuring out in the future.

In the end, I believe the safest approach is to sacrifice the possible
marginal gain of reconstructing the UnwindTable once a SymbolFile has
been added, to sidestep this whole problem area.

Also, in `Module::GetUnwindTable()`, call `DownloadSymbolFileAsync`
before we create the UnwindTable for the first time, in case the symbol
file is fetched synchronously, we will have it for that possible
marginal gain.
2024-03-26 09:07:15 -07:00
Jonas Devlieghere
5aea6ba8f5
[lldb] Fix trailing whitespace & formatting in Core/Module.cpp (NFC)
I have my editor configured to remove trailing whitespace and every time
I touch this file I end up with a bunch of clang-format changes to lines
that were modified because of it. Nobody likes trailing whitespace so
this cleans up the file.
2024-01-16 21:30:02 -08:00
Adrian Prantl
fa9284589f
[lldb] DWARFDIE: Follow DW_AT_specification when computing CompilerCo… (#77157)
…ntext

Following the specification chain seems to be clearly the expected
behavior of GetDeclContext(). Otherwise C++ methods have an empty
CompilerContext instead of being nested in their struct/class.

Theprimary motivation for this functionality is the Swift plugin. In
order to test the change I added a proof-of-concept implementation of a
Module::FindFunction() variant that takes a CompilerContext, expesed via
lldb-test.

rdar://120553412
2024-01-09 10:45:30 -08:00
Kazu Hirata
744f38913f [lldb] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-16 14:39:37 -08:00
Greg Clayton
dd95877958
[lldb] Make only one function that needs to be implemented when searching for types (#74786)
This patch revives the effort to get this Phabricator patch into
upstream:

https://reviews.llvm.org/D137900

This patch was accepted before in Phabricator but I found some
-gsimple-template-names issues that are fixed in this patch.

A fixed up version of the description from the original patch starts
now.

This patch started off trying to fix Module::FindFirstType() as it
sometimes didn't work. The issue was the SymbolFile plug-ins didn't do
any filtering of the matching types they produced, and they only looked
up types using the type basename. This means if you have two types with
the same basename, your type lookup can fail when only looking up a
single type. We would ask the Module::FindFirstType to lookup "Foo::Bar"
and it would ask the symbol file to find only 1 type matching the
basename "Bar", and then we would filter out any matches that didn't
match "Foo::Bar". So if the SymbolFile found "Foo::Bar" first, then it
would work, but if it found "Baz::Bar" first, it would return only that
type and it would be filtered out.

Discovering this issue lead me to think of the patch Alex Langford did a
few months ago that was done for finding functions, where he allowed
SymbolFile objects to make sure something fully matched before parsing
the debug information into an AST type and other LLDB types. So this
patch aimed to allow type lookups to also be much more efficient.

As LLDB has been developed over the years, we added more ways to to type
lookups. These functions have lots of arguments. This patch aims to make
one API that needs to be implemented that serves all previous lookups:

- Find a single type
- Find all types
- Find types in a namespace

This patch introduces a `TypeQuery` class that contains all of the state
needed to perform the lookup which is powerful enough to perform all of
the type searches that used to be in our API. It contain a vector of
CompilerContext objects that can fully or partially specify the lookup
that needs to take place.

If you just want to lookup all types with a matching basename,
regardless of the containing context, you can specify just a single
CompilerContext entry that has a name and a CompilerContextKind mask of
CompilerContextKind::AnyType.

Or you can fully specify the exact context to use when doing lookups
like: CompilerContextKind::Namespace "std"
CompilerContextKind::Class "foo"
CompilerContextKind::Typedef "size_type"

This change expands on the clang modules code that already used a
vector<CompilerContext> items, but it modifies it to work with
expression type lookups which have contexts, or user lookups where users
query for types. The clang modules type lookup is still an option that
can be enabled on the `TypeQuery` objects.

This mirrors the most recent addition of type lookups that took a
vector<CompilerContext> that allowed lookups to happen for the
expression parser in certain places.

Prior to this we had the following APIs in Module:

```
void
Module::FindTypes(ConstString type_name, bool exact_match, size_t max_matches,
                  llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
                  TypeList &types);

void
Module::FindTypes(llvm::ArrayRef<CompilerContext> pattern, LanguageSet languages,
                  llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
                  TypeMap &types);

void Module::FindTypesInNamespace(ConstString type_name,
                                  const CompilerDeclContext &parent_decl_ctx,
                                  size_t max_matches, TypeList &type_list);
```

The new Module API is much simpler. It gets rid of all three above
functions and replaces them with:

```
void FindTypes(const TypeQuery &query, TypeResults &results);
```
The `TypeQuery` class contains all of the needed settings:

- The vector<CompilerContext> that allow efficient lookups in the symbol
file classes since they can look at basename matches only realize fully
matching types. Before this any basename that matched was fully realized
only to be removed later by code outside of the SymbolFile layer which
could cause many types to be realized when they didn't need to.
- If the lookup is exact or not. If not exact, then the compiler context
must match the bottom most items that match the compiler context,
otherwise it must match exactly
- If the compiler context match is for clang modules or not. Clang
modules matches include a Module compiler context kind that allows types
to be matched only from certain modules and these matches are not needed
when d oing user type lookups.
- An optional list of languages to use to limit the search to only
certain languages

The `TypeResults` object contains all state required to do the lookup
and store the results:
- The max number of matches
- The set of SymbolFile objects that have already been searched
- The matching type list for any matches that are found

The benefits of this approach are:
- Simpler API, and only one API to implement in SymbolFile classes
- Replaces the FindTypesInNamespace that used a CompilerDeclContext as a
way to limit the search, but this only worked if the TypeSystem matched
the current symbol file's type system, so you couldn't use it to lookup
a type in another module
- Fixes a serious bug in our FindFirstType functions where if we were
searching for "foo::bar", and we found a "baz::bar" first, the basename
would match and we would only fetch 1 type using the basename, only to
drop it from the matching list and returning no results
2023-12-12 16:51:49 -08:00
Jonas Devlieghere
745e8bfd1a
[lldb] Remove LocateSymbolFile (#71301)
This completes the conversion of LocateSymbolFile into a SymbolLocator
plugin. The only remaining function is DownloadSymbolFileAsync which
doesn't really fit into the plugin model, and therefore moves into the
SymbolLocator class, while still relying on the plugins to do the
underlying work.
2023-11-05 08:26:42 -08:00
Alex Langford
9e6d48ef60 [lldb][NFCI] Module constructor should take ConstString by value
ConstStrings are super cheap to copy around. It is often more expensive
to pass a pointer and potentially dereference it than just to always copy it.

Differential Revision: https://reviews.llvm.org/D158043
2023-08-17 10:34:57 -07:00
Wanyi Ye
4b9eed9c64 [BSDArchive] NULL check the child object file ptr before accessing its member
Recently we've observed lldb crashes caused by missing object file linked to a thin archive (.a) files. The crash is due to a missing NULL check in the code when looking for child object file referred by the thin archive. Malformed archive file should not crash LLDB. Instead, it should report the error and continue.

New error message will look like the following

```
error: libfoo.a(__objects__/foo/barAppDelegate.mm.o) failed to load objfile for path/to/libfoo.a.
Debugging will be degraded for this module.
```

Test Plan:

llvm-lit test
```
./bin/llvm-lit -sv ../llvm-project/lldb/test/API/functionalities/archives/TestBSDArchives.py
```

Test without code change will error out with LLDB crash
```
--
Command Output (stderr):
--
PASS: LLDB (~/llvm-upstream/Debug/bin/clang-arm64) :: test (TestBSDArchives.BSDArchivesTestCase)
PASS: LLDB (~/llvm-upstream/Debug/bin/clang-arm64) :: test_frame_var_errors_when_archive_missing (TestBSDArchives.BSDArchivesTestCase)
FAIL: LLDB (~/llvm-upstream/Debug/bin/clang-arm64) :: test_frame_var_errors_when_mtime_mistmatch_for_object_in_archive (TestBSDArchives.BSDArchivesTestCase)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      HandleCommand(command = "b a")
1.      HandleCommand(command = "breakpoint set --name 'a'")
Fatal Python error: Segmentation fault

Current thread 0x00000001f7b99e00 (most recent call first):
  File "~/llvm-upstream/Debug/bin/LLDB.framework/Resources/Python/lldb/__init__.py", line 3270 in HandleCommand
  File "~/llvm-upstream/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2070 in runCmd
  File "~/llvm-upstream/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2421 in expect
  File "~/llvm-upstream/llvm-project/lldb/test/API/functionalities/archives/TestBSDArchives.py", line 156 in test_frame_var_errors_when_thin_archive_malformed
...
```

Differential Revision: https://reviews.llvm.org/D156367
2023-07-27 13:21:31 -04:00
Alex Langford
1d796b48e4 [lldb][NFCI] Methods to load scripting resources should take a Stream by reference
These methods all take a `Stream *` to get feedback about what's going
on. By default, it's a nullptr, but we always feed it with a valid
pointer. It would therefore make more sense to have this take a
reference.

Differential Revision: https://reviews.llvm.org/D154883
2023-07-11 10:36:11 -07:00
Jim Ingham
2b0c886542 Refine the reporting mechanism for interruption.
Also, make it possible for new Targets which haven't been added to
the TargetList yet to check for interruption, and add a few more
places in building modules where we can check for interruption.

Differential Revision: https://reviews.llvm.org/D154542
2023-07-06 16:19:19 -07:00
Saleem Abdulrasool
cd21c0d30c Revert "Revert "Host: generalise GetXcodeSDKPath""
This reverts commit c46d9af26cefb0b24646d3235b75ae7a1b8548d4.

Rename the variable to avoid `-Wchanges-meaning` warning.  Although, it
might be better to squelch the warning as it is of low value IMO.
2023-05-29 10:16:41 -07:00
Jonas Devlieghere
917b3a7e62
[lldb] Move Core/FileSpecList -> Utility/FileSpecList (NFC)
There's no reason for FileSpecList to live in lldb/Core while FileSpec
lives in lldb/Utility. Move FileSpecList next to FileSpec.
2023-05-04 22:00:17 -07:00
Douglas Yung
c46d9af26c Revert "Host: generalise GetXcodeSDKPath"
This reverts commit ade3c6a6a88ed3a9b06c076406f196da9d3cc1b9.

This breaks the build with GCC and affects at least 2 build bots:
https://lab.llvm.org/buildbot/#/builders/217/builds/20568
https://lab.llvm.org/buildbot/#/builders/243/builds/5576
2023-05-01 10:22:53 -07:00
Saleem Abdulrasool
ade3c6a6a8 Host: generalise GetXcodeSDKPath
This generalises the GetXcodeSDKPath hook to a GetSDKRoot path which
will be re-used for the Windows support to compute a language specific
SDK path on the platform. Because there may be other options that we
wish to use to compute the SDK path, sink the XcodeSDK parameter into
a structure which can pass a disaggregated set of options. Furthermore,
optionalise the parameter as Xcode is not available for all platforms.

Differential Revision: https://reviews.llvm.org/D149397
Reviewed By: JDevlieghere
2023-04-28 09:30:59 -07:00