## Summary
Based on discussion from
[RFC](https://discourse.llvm.org/t/rfc-python-callback-for-source-file-resolution/83545),
this PR adds a new `SymbolLocatorScripted` plugin that allows Python
scripts to implement custom symbol and source file resolution logic.
This enables downstream users to build custom symbol servers, source
file remapping, and build artifact resolution entirely in Python.
### Changes
- Adds `LocateSourceFile()` to the SymbolLocator plugin interface,
called during source path resolution with a fully loaded `ModuleSP`, so
the plugin has access to the module's UUID, file paths, and symbols.
- Adds `SymbolLocatorScripted` plugin that delegates all four
SymbolLocator methods (`LocateExecutableObjectFile`,
`LocateExecutableSymbolFile`, `DownloadObjectAndSymbolFile`,
`LocateSourceFile`) to a user-provided Python class.
- Adds `ScriptedSymbolLocatorPythonInterface` to bridge C++ calls to
Python, with proper GIL management and error handling.
- Results for `LocateSourceFile` are cached per (module UUID, source
file) pair.
- The Python class is configured via: `settings set
plugin.symbol-locator.scripted.script-class module.ClassName`
### Python class interface
```python
class MyLocator:
def __init__(self, exe_ctx, args): ...
def locate_source_file(self, module, original_source_file):
...
def locate_executable_object_file(self, module_spec): ...
def locate_executable_symbol_file(self, module_spec,
default_search_paths): ...
def download_object_and_symbol_file(self, module_spec,
force_lookup, copy_executable): ...
```
### Test plan
```
Added TestScriptedSymbolLocator.py with 3 test cases:
- test_locate_source_file — verifies the locator resolves source
files, receives a valid SBModule with UUID, and remaps paths correctly
- test_locate_source_file_none_fallthrough — verifies returning
None falls through to default LLDB resolution, and that having no script
class set works normally
- test_invalid_script_class — verifies graceful handling of
invalid class names without crashing
```
Co-authored-by: Rahul Reddy Chamala <rachamal@fb.com>
This patch adds support for:
- PyObject -> SBValueList (which was surprisingly not there before!)
- PyObject -> SBValue
- SBValue -> ValueObjectSP using the ScriptInterpreter
These three are the main remaining plumbing changes necessary before we can get to the meat of actually using ScriptedFrame to provide values to the printer/etc. Future patches build off this change in order to allow ScriptedFrames to provide variables and get values for variable expressions.
We missed a handful of uses of the Python private API in the SWIG
typemaps because they are included before we include the Python header
that defines Py_LIMITED_API.
This fixes that and guards the last private use on whether or not you're
targeting the limited API. Unfortunately there doesn't appear to be an
alternative, so we have to resort to being slightly less defensive.
This patch extends ScriptedFrame to work with real (non-scripted)
threads,
enabling frame providers to synthesize frames for native processes.
Previously, ScriptedFrame only worked within
ScriptedProcess/ScriptedThread
contexts. This patch decouples ScriptedFrame from ScriptedThread,
allowing
users to augment or replace stack frames in real debugging sessions for
use
cases like custom calling conventions, reconstructing corrupted frames
from
core files, or adding diagnostic frames.
Key changes:
- ScriptedFrame::Create() now accepts ThreadSP instead of requiring
ScriptedThread, extracting architecture from the target triple rather
than ScriptedProcess.arch
- Added SBTarget::RegisterScriptedFrameProvider() and
ClearScriptedFrameProvider() APIs, with Target storing a
SyntheticFrameProviderDescriptor template for new threads
- Added "target frame-provider register/clear" commands for CLI access
- Thread class gains LoadScriptedFrameProvider(),
ClearScriptedFrameProvider(),
and GetFrameProvider() methods for per-thread frame provider management
- New SyntheticStackFrameList overrides FetchFramesUpTo() to lazily
provide
frames from either the frame provider or the real stack
This enables practical use of the SyntheticFrameProvider infrastructure
in
real debugging workflows.
rdar://161834688
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch implements the base and python interface for the
ScriptedFrameProvider class.
This is necessary to call python APIs from the ScriptedFrameProvider
that will come in a follow-up.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Modify the python wrapper to return uint32_t,
which prevents incorrect child name-to-index mapping and avoids
performing redundant operations on non-existent SBValues.
This patch adds the notion of "Facade" locations which can be reported
from a ScriptedResolver instead of the actual underlying breakpoint
location for the breakpoint. Also add a "was_hit" method to the scripted
resolver that allows the breakpoint to say which of these "Facade"
locations was hit, and "get_location_description" to provide a
description for the facade locations.
I apologize in advance for the size of the patch. Almost all of what's
here was necessary to (a) make the feature testable and (b) not break
any of the current behavior.
The motivation for this feature is given in the "Providing Facade
Locations" section that I added to the python-reference.rst so I won't
repeat it here.
rdar://152112327
So the dSYM can be told what target it has been loaded into.
When lldb is loading modules, while creating a target, it will run
"command script import" on any Python modules in Resources/Python in the
dSYM. However, this happens WHILE the target is being created, so it is
not yet in the target list. That means that these scripts can't act on
the target that they a part of when they get loaded.
This patch adds a new python API that lldb will call:
__lldb_module_added_to_target
if it is defined in the module, passing in the Target the module was
being added to, so that code in these dSYM's don't have to guess.
Xcode uses a pseudoterminal for the debugger console.
- The upside of this apporach is that it means that it can rely on
LLDB's IOHandlers for multiline and script input.
- The downside of this approach is that the command output is printed to
the PTY and you don't get a SBCommandReturnObject. Adrian added support
for inline diagnostics (#110901) and we'd like to access those from the
IDE.
This patch adds support for registering a callback in the command
interpreter that gives access to the `(SB)CommandReturnObject` right
before it will be printed. The callback implementation can choose
whether it likes to handle printing the result or defer to lldb. If the
callback indicated it handled the result, the command interpreter will
skip printing the result.
We considered a few other alternatives to solve this problem:
- The most obvious one is using `HandleCommand`, which returns a
`SBCommandReturnObject`. The problem with this approach is the multiline
input mentioned above. We would need a way to tell the IDE that it
should expect multiline input, which isn't known until LLDB starts
handling the command.
- To address the multiline issue,we considered exposing (some of the)
IOHandler machinery through the SB API. To solve this particular issue,
that would require reimplementing a ton of logic that already exists
today in the CommandInterpeter. Furthermore that seems like overkill
compared to the proposed solution.
rdar://141254310
If your arguments or option values are of a type that naturally uses one
of our common completion mechanisms, you will get completion for free.
But if you have your own custom values or if you want to do fancy things
like have `break set -s foo.dylib -n ba<TAB>` only complete on symbols
in foo.dylib, you can use this new mechanism to achieve that.
...and "[lldb/Interpreter] Introduce `ScriptedStopHook{,Python}Interface` & make use of it (#105449)"
This reverts commit 76b827bb4d5b4cc4d3229c4c6de2529e8b156810, and commit 1e131ddfa8f1d7b18c85c6e4079458be8b419421
because the first commit caused the test command-stop-hook-output.test to fail.
This patch introduces new `ScriptedStopHook{,Python}Interface` classes
that make use of the Scripted Interface infrastructure and makes use of
it in `StopHookScripted`.
It also relax the requirement on the number of argument for initializing
scripting extension if the size of the interface parameter pack contains
1 less element than the extension maximum number of positional arguments
for this initializer.
This addresses the cases where the embedded interpreter session
dictionary is passed to the extension initializer which is not used most
of the time.
---------
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch removes all of the Set.* methods from Status.
This cleanup is part of a series of patches that make it harder use the
anti-pattern of keeping a long-lives Status object around and updating
it while dropping any errors it contains on the floor.
This patch is largely NFC, the more interesting next steps this enables
is to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()
Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form
` ResultTy DoFoo(Status& error)
`
to
` llvm::Expected<ResultTy> DoFoo()
`
How to read this patch?
The interesting changes are in Status.h and Status.cpp, all other
changes are mostly
` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git
grep -l SetErrorString lldb/source)
`
plus the occasional manual cleanup.
The issue was introduced in
https://github.com/llvm/llvm-project/pull/104523.
The code introduces the `ret_val` variable but does not use it. Instead
it returns a pointer, which gets implicitly converted to bool.
Compilers and language runtimes often use helper functions that are
fundamentally uninteresting when debugging anything but the
compiler/runtime itself. This patch introduces a user-extensible
mechanism that allows for these frames to be hidden from backtraces and
automatically skipped over when navigating the stack with `up` and
`down`.
This does not affect the numbering of frames, so `f <N>` will still
provide access to the hidden frames. The `bt` output will also print a
hint that frames have been hidden.
My primary motivation for this feature is to hide thunks in the Swift
programming language, but I'm including an example recognizer for
`std::function::operator()` that I wished for myself many times while
debugging LLDB.
rdar://126629381
Example output. (Yes, my proof-of-concept recognizer could hide even
more frames if we had a method that returned the function name without
the return type or I used something that isn't based off regex, but it's
really only meant as an example).
before:
```
(lldb) thread backtrace --filtered=false
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12
frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10
frame #5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12
frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
frame #8: 0x0000000183cdf154 dyld`start + 2476
(lldb)
```
after
```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
frame #8: 0x0000000183cdf154 dyld`start + 2476
Note: Some frames were hidden by frame recognizers
```
Among other things, returning an empty string as the repeat command
disables auto-repeat, which can be useful for state-changing commands.
There's one remaining refinement to this setup, which is that for parsed
script commands, it should be possible to change an option value, or add
a new option value that wasn't originally specified, then ask lldb "make
this back into a command string". That would make doing fancy things
with repeat commands easier.
That capability isn't present in the lldb_private side either, however.
So that's for a next iteration.
I haven't added this to the docs on adding commands yet. I wanted to
make sure this was an acceptable approach before I spend the time to do
that.
This patch makes ScriptedThreadPlan conforming to the ScriptedInterface
& ScriptedPythonInterface facilities by introducing 2
ScriptedThreadPlanInterface & ScriptedThreadPlanPythonInterface classes.
This allows us to get rid of every ScriptedThreadPlan-specific SWIG
method and re-use the same affordances as other scripting offordances,
like Scripted{Process,Thread,Platform} & OperatingSystem.
To do so, this adds new transformer methods for `ThreadPlan`, `Stream` &
`Event`, to allow the bijection between C++ objects and their python
counterparts.
This just re-lands #70392 after fixing test failures.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
…andOverrideCallback (#94518)"
This reverts commit 7cff05ada05e87408966d56b4c1675033187ff5c. The API
test that was added erroneously imports a module that isn't needed and
wouldn't be found which causes a test failures. This reversion removes
that import.
`SBCommandInterpreter::CommandOverrideCallback` was not being exposed to
the Python API and has no coverage in the
API test suite, so this commits exposes and adds a test for it. Doing
this involves also adding a typemap for the callback used for this
function so that it matches the functionality of other callback
functions that are exposed to Python.
This patch makes ScriptedThreadPlan conforming to the ScriptedInterface
& ScriptedPythonInterface facilities by introducing 2
ScriptedThreadPlanInterface & ScriptedThreadPlanPythonInterface classes.
This allows us to get rid of every ScriptedThreadPlan-specific SWIG
method and re-use the same affordances as other scripting offordances,
like Scripted{Process,Thread,Platform} & OperatingSystem.
To do so, this adds new transformer methods for `ThreadPlan`, `Stream` &
`Event`, to allow the bijection between C++ objects and their python
counterparts.
This just re-lands #70392 after fixing test failures.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This allows you to specify options and arguments and their definitions
and then have lldb handle the completions, help, etc. in the same way
that lldb does for its parsed commands internally.
This feature has some design considerations as well as the code, so I've
also set up an RFC, but I did this one first and will put the RFC
address in here once I've pushed it...
Note, the lldb "ParsedCommand interface" doesn't actually do all the
work that it should. For instance, saying the type of an option that has
a completer doesn't automatically hook up the completer, and ditto for
argument values. We also do almost no work to verify that the arguments
match their definition, or do auto-completion for them. This patch
allows you to make a command that's bug-for-bug compatible with built-in
ones, but I didn't want to stall it on getting the auto-command checking
to work all the way correctly.
As an overall design note, my primary goal here was to make an interface
that worked well in the script language. For that I needed, for
instance, to have a property-based way to get all the option values that
were specified. It was much more convenient to do that by making a
fairly bare-bones C interface to define the options and arguments of a
command, and set their values, and then wrap that in a Python class
(installed along with the other bits of the lldb python module) which
you can then derive from to make your new command. This approach will
also make it easier to experiment.
See the file test_commands.py in the test case for examples of how this
works.
Temporarily revert to unblock the CI bots, this is breaking the -DLLVM_ENABLE_MODULES=On
modules style build. I've notified Ismail.
This reverts commit 888501bc631c4f6d373b4081ff6c504a1ce4a682.
This patch makes ScriptedThreadPlan conforming to the ScriptedInterface
& ScriptedPythonInterface facilities by introducing 2
ScriptedThreadPlanInterface & ScriptedThreadPlanPythonInterface classes.
This allows us to get rid of every ScriptedThreadPlan-specific SWIG
method and re-use the same affordances as other scripting offordances,
like Scripted{Process,Thread,Platform} & OperatingSystem.
To do so, this adds new transformer methods for `ThreadPlan`, `Stream` &
`Event`, to allow the bijection between C++ objects and their python
counterparts.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch makes ScriptedThreadPlan conforming to the ScriptedInterface
& ScriptedPythonInterface facilities by introducing 2
ScriptedThreadPlanInterface & ScriptedThreadPlanPythonInterface classes.
This allows us to get rid of every ScriptedThreadPlan-specific SWIG
method and re-use the same affordances as other scripting offordances,
like Scripted{Process,Thread,Platform} & OperatingSystem.
To do so, this adds new transformer methods for `ThreadPlan`, `Stream` &
`Event`, to allow the bijection between C++ objects and their python
counterparts.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch changes the way plugin objects used with Scripted Interfaces
are created.
Instead of implementing a different SWIG method to create the object for
every scripted interface, this patch makes the creation more generic by
re-using some of the ScriptedPythonInterface templated Dispatch code.
This patch also improves error handling of the object creation by
returning an `llvm::Expected`.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Many SB classes have public constructors or methods involving types that
are private. Some are more obvious (e.g. containing lldb_private in the
name) than others (lldb::FooSP is usually std::shared_pointer<lldb_private::Foo>).
This commit explicitly does not address FileSP, so I'm leaving that one
alone for now.
Some of these were for other SB classes to use and should have been made
protected/private with a friend class entry added. Some of these were
public for some of the swig python helpers to use. I put all of those
functions into a class and made them static methods. The relevant SB
classes mark that class as a friend so they can access those
private/protected members.
I've also removed an outdated SBStructuredData test (can you guess which
constructor it was using?) and updated the other relevant tests.
Differential Revision: https://reviews.llvm.org/D150157
This patch improves breakpoint management when doing interactive
scripted process debugging.
In other to know which process set a breakpoint, we need to do some book
keeping on the multiplexer scripted process. When initializing the
multiplexer, we will first copy breakpoints that are already set on the
driving target.
Everytime we launch or resume, we should copy breakpoints from the
multiplexer to the driving process.
When creating a breakpoint from a child process, it needs to be set both
on the multiplexer and on the driving process. We also tag the created
breakpoint with the name and pid of the originator process.
This patch also implements all the requirement to achieve proper
breakpoint management. That involves:
- Adding python interator for breakpoints and watchpoints in SBTarget
- Add a new `ScriptedProcess.create_breakpoint` python method
Differential Revision: https://reviews.llvm.org/D148548
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Adding a new SBDebugger::SetDestroyCallback() API.
This API can be used by any client to query for statistics/metrics before
exiting debug sessions.
Differential Revision: https://reviews.llvm.org/D143520
This patch adds process attach capabilities to the ScriptedProcess
plugin. This doesn't really expects a PID or process name, since the
process state is already script, however, this allows to create a
scripted process without requiring to have an executuble in the target.
In order to do so, this patch also turns the scripted process related
getters and setters from the `ProcessLaunchInfo` and
`ProcessAttachInfo` classes to a `ScriptedMetadata` instance and moves
it in the `ProcessInfo` class, so it can be accessed interchangeably.
This also adds the necessary SWIG wrappers to convert the internal
`Process{Attach,Launch}InfoSP` into a `SB{Attach,Launch}Info` to pass it
as argument the scripted process python implementation and convert it
back to the internal representation.
rdar://104577406
Differential Revision: https://reviews.llvm.org/D143104
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch introduces both the Scripted Platform python base
implementation and an example for it.
The base implementation is embedded in lldb python module under
`lldb.plugins.scripted_platform`.
This patch also refactor the various SWIG methods to create scripted
objects into a single method, that is now shared between the Scripted
Platform, Process and Thread. It also replaces the target argument by a
execution context object.
Differential Revision: https://reviews.llvm.org/D139250
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch adds a new matching method for data formatters, in addition
to the existing exact typename and regex-based matching. The new method
allows users to specify the name of a Python callback function that
takes a `SBType` object and decides whether the type is a match or not.
Here is an overview of the changes performed:
- Add a new `eFormatterMatchCallback` matching type, and logic to handle
it in `TypeMatcher` and `SBTypeNameSpecifier`.
- Extend `FormattersMatchCandidate` instances with a pointer to the
current `ScriptInterpreter` and the `TypeImpl` corresponding to the
candidate type, so we can run registered callbacks and pass the type
to them. All matcher search functions now receive a
`FormattersMatchCandidate` instead of a type name.
- Add some glue code to ScriptInterpreterPython and the SWIG bindings to
allow calling a formatter matching callback. Most of this code is
modeled after the equivalent code for watchpoint callback functions.
- Add an API test for the new callback-based matching feature.
For more context, please check the RFC thread where this feature was
originally discussed:
https://discourse.llvm.org/t/rfc-python-callback-for-data-formatters-type-matching/64204/11
Differential Revision: https://reviews.llvm.org/D135648
Return our PythonObject wrappers instead of raw PyObjects (obfuscated as
void *). This ensures that ownership (reference counts) of python
objects is automatically tracked.
Differential Revision: https://reviews.llvm.org/D117462
Unlike the rest of our SB objects, SBEvent and SBCommandReturnObject
have the ability to hold non-owning pointers to their non-SB
counterparts. This makes it hard to ensure the SB objects do not become
dangling once their backing object goes away.
While we could make these two objects behave like others, that would
require plubming even more shared pointers through our internal code
(Event objects are mostly prepared for it, CommandReturnObject are not).
Doing so seems unnecessarily disruptive, given that (unlike for some of
the other objects) I don't see any good reason why would someone want to
hold onto these objects after the function terminates.
For that reason, this patch implements a different approach -- the SB
objects will still hold non-owning pointers, but they will be reset to
the empty/default state as soon as the function terminates. This python
code will not crash if the user decides to store these objects -- but
the objects themselves will be useless/empty.
Differential Revision: https://reviews.llvm.org/D116162