Originally commited in 362b9d78b4ee9107da2b5e90b3764b0f0fa610fe and then
reverted in cb63b75e32a415c9bfc298ed7fdcd67e8d9de54c.
This re-lands a subset of the changes to
dap_server.py/DebugCommunication and addresses the python3.10
compatibility issue.
This includes less type annotations since those were the reason for the
failures on that specific version of python.
I've done additional testing on python3.8, python3.10 and python3.13 to
further validate these changes.
Resolves#141955
- Adds data to breakpoints `Source` object, in order for assembly
breakpoints, which rely on a temporary `sourceReference` value, to be
able to resolve in future sessions like normal path+line breakpoints
- Adds optional `instructions_offset` parameter to `BreakpointResolver`
This updates all the existing memory reference fields to use
`lldb::addr_t` directly.
A few places were using `std::string` and decoding the value in the
request handler and other places had unique ways of parsing addresses.
This unifies all of these references with the `DecodeMemoryReference`
helper in JSONUtils.h.
Additionally, for the types I updated, I tried to simplify the POD types
some and moved default values out of RequestHandlers and into the
protocol POD types.
Replace `isTestSupported` function with `skipIfBuildType` annotation.
Test that uses the `IsTestSupported` function are no longer run, as the
size of lldb-dap binary is now more than `1mb`.
Update the broken test.
Fixes#108621
We could probably check if the test now passes on `linux arm` since it
was disabled because it timed out. I experienced the timeout after
replacing the `IsTestSupported` with `skipIfBuildType`.
In DebugCommunication, we currently are using 2 thread to drive
lldb-dap. At the moment, they make an attempt at only synchronizing the
`recv_packets` between the reader thread and the main test thread. Other
stateful properties of the debug session are not guarded by a
locks/mutex.
To mitigate this, I am moving any state updates to the main thread
inside the `_recv_packet` method to ensure that between calls to
`_recv_packet` the state does not change out from under us in a test.
This does mean the precise timing of events has changed slightly as a
result and I've updated the existing tests that fail for me locally with
this new behavior.
I think this should result in overall more predictable behavior, even if
the test is slow due to the host workload or architecture differences.
---------
Co-authored-by: Ebuka Ezike <yerimyah1@gmail.com>
This adds new types for setExceptionBreakpoints and adds support for
`supportsExceptionFilterOptions`, which allows exception breakpoints to
set a condition.
While testing this, I noticed that obj-c exception catch breakpoints may
not be working correctly in lldb-dap.
This adds a new 'CapabilitiesEventBody' type for having a well
structured type for the event and updates the 'restart' request
to dynamically set their capabilities.
Moving `threads` request to structured types. Adding helper types for
this and moving helpers from JSONUtils to ProtocolUtils.
---------
Co-authored-by: Ebuka Ezike <yerimyah1@gmail.com>
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
This reverts commit 4b6c608615a285d81132acf8e33b81b2ec2c9bf9 and
follow-up commits 159de3633640a5cb2d322ebe8cc4ec0c1c9a896d and
c9e1c52e2e75a91a44a98df818cc9bd11655e51d because this breaks
TestDAP_stepInTargets.py on Darwin.
https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake
Attempt to improve tests by synchronously waiting for breakpoints to
resolve. Not sure if it will fix all the tests but I think it should
make the tests more stable
This reverts commit 8a49db35f45e56c92522c6079e51553e80c07aec.
Due to failures on Arm and AArch64 Linux:
https://lab.llvm.org/buildbot/#/builders/59/builds/18540https://lab.llvm.org/buildbot/#/builders/18/builds/16759
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/evaluate/TestDAP_evaluate.py", line 22, in assertEvaluateFailure
self.assertNotIn(
AssertionError: 'result' unexpectedly found in {'memoryReference': '0xFFFFF7CB3060', 'result': '0x0000000000000000', 'type': 'int *', 'variablesReference': 7}
FAIL: test_generic_evaluate_expressions (TestDAP_evaluate.TestDAP_evaluate)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/evaluate/TestDAP_evaluate.py", line 228, in test_generic_evaluate_expressions
self.run_test_evaluate_expressions(enableAutoVariableSummaries=False)
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/evaluate/TestDAP_evaluate.py", line 117, in run_test_evaluate_expressions
self.assertEvaluateFailure("list") # local variable of a_function
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/evaluate/TestDAP_evaluate.py", line 22, in assertEvaluateFailure
self.assertNotIn(
AssertionError: 'result' unexpectedly found in {'memoryReference': '0xFFFFF7CB3060', 'result': '0x0000000000000000', 'type': 'int *', 'variablesReference': 7}
Config=aarch64-/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang
The second one is because our bots have the libc debug info package installed,
the first, no idea.
Improving the readability and correctness of DebugCommunication by
adding type annotations to many parts of the library and trying to
improve the implementation of a few key areas of the code to better
handle correctness.
Specifically, this refactored the
`DebugCommunication._handle_recv_packet` function to ensure consistency
with the reader thread when handling state changes and improved the
`DebugCommunication._recv_packet` helper to make it easier to follow by
adding some additional helpers.
#140107 changed the default argument of `disableASLR` of related
functions from `False` to `True`. libc++ CI has been stably failing for
`TestDAP_subtleFrames.py` (in bootstrapping-build) since then. The error
message "personality set failed: Operation not permitted" seems related
to ASLR.
This PR attempts to fix the CI failure by changing the default value of
`disableASLR` in `dap_server.py` to `False`.
Fix the handling of the `instructionOffset` parameter, which resulted in
always returning the wrong disassembly because VSCode always uses
`instructionOffset = -50` and expects 50 instructions before the given
address, instead of 50 bytes before
This is more straight forward refactor of the startup sequence that
reverts parts of ba29e60f9a2222bd5e883579bb78db13fc5a7588. Unlike my
previous attempt, I ended up removing the pending request queue and not
including an `AsyncReqeustHandler` because I don't think we actually
need that at the moment.
The key is that during the startup flow there are 2 parallel operations
happening in the DAP that have different triggers.
* The `initialize` request is sent and once the response is received the
`launch` or `attach` is sent.
* When the `initialized` event is recieved the `setBreakpionts` and
other config requests are made followed by the `configurationDone`
event.
I moved the `initialized` event back to happen in the `PostRun` of the
`launch` or `attach` request handlers. This ensures that we have a valid
target by the time the configuration calls are made. I added also added
a few extra validations that to the `configurationeDone` handler to
ensure we're in an expected state.
I've also fixed up the tests to match the new flow. With the other
additional test fixes in 087a5d2ec7897cd99d3787820711fec76a8e1792 I
think we've narrowed down the main source of test instability that
motivated the startup sequence change.
Adding an assert that the 'continue' request succeeds caused a number of
tests to fail. This showed a number of tests that were not specifying if
they should be stopped or not at key points in the test. This is likely
contributing to these tests being flaky since the debugger is not in the
expected state.
Additionally, I spent a little time trying to improve the readability of
the dap_server.py and lldbdap_testcase.py.
To improve logging this adjusts two properties of the existing tests:
* Forwards stderr from lldb-dap to the process in case errors are
reported to stderr.
* Adjusts `DebugAdapterServer.terminate` to close stdin and wait for the
process to exit instead of sending SIGTERM. Additionally, if we end up
with a non-zero exit status we now raise an error to note the unexpected
exit status.
With these changes, I did find one test case in
`TestDAP_console.test_diagnositcs` that was not waiting to ensure the
expected event had arrived by the time it performed an assert.
This PR changes how we treat the launch sequence in lldb-dap.
- Send the initialized event after we finish handling the initialize
request, rather than after we finish attaching or launching.
- Delay handling the launch and attach request until we have handled
the configurationDone request. The latter is now largely a NO-OP and
only exists to signal lldb-dap that it can handle the launch and
attach requests.
- Delay handling the initial threads requests until we have handled
the launch or attach request.
- Make all attaching and launching synchronous, including when we have
attach or launch commands. This removes the need to synchronize
between the request and event thread.
Background:
https://discourse.llvm.org/t/reliability-of-the-lldb-dap-tests/86125
Make stopOnAttach=False the default again and explicitly pass
stopOnAttach=True where the tests relies on that. I changed the default
in the launch sequence PR (#138219) because that was implicitly the
assumption (the tests never send the configurationDone request).
This PR changes how we treat the launch sequence in lldb-dap.
- Send the initialized event after we finish handling the initialize
request, rather than after we finish attaching or launching.
- Delay handling the launch and attach request until we have handled
the configurationDone request. The latter is now largely a NO-OP and
only exists to signal lldb-dap that it can handle the launch and
attach requests.
- Delay handling the initial threads requests until we have handled
the launch or attach request.
- Make all attaching and launching synchronous, including when we have
attach or launch commands. This removes the need to synchronize
between the request and event thread.
Background:
https://discourse.llvm.org/t/reliability-of-the-lldb-dap-tests/86125
# Summary
This patch makes the `request_attach` wait for events `process` and
`initialized` just like `request_launch`. This ensure the DAP session
can move forward somewhat correctly.
Recently `TestDap_attach.test_terminate_commands` became flaky.
It's hitting:
```
lldbsuite/test/tools/lldb-dap/dap_server.py", line 350, in send_recv
raise ValueError(desc)
ValueError: no response for "disconnect"
```
I took a look at the DAP msg from that test case and noticed:
- It's not using the regular attaching, instead it's using the
`attachCommands` to launch debug the binary and it will stop at entry.
- The `initialized` event returned after the `disconnect` request. Which
means lldb-dap didn't really get ready yet.
### NOTE
The `dap_server.py` is doing things to mimic the VSCode (or other dap
clients) but it had some assumptions. For example, it's still missing
the `configurationDone` request and response because it relies on a
continue action to trigger the `configurationDone` request.
# Test Plan
```
./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py
./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py
```
To test the `wait_for_events` timeout case
```
events = self.wait_for_events(["process", "initialized", "fake", "event"], 1)
if events:
raise ValueError(f'no events {",".join(events)} found for within timeout 1')
```
Observed
<img width="696" alt="image"
src="https://github.com/user-attachments/assets/bc97c0ef-d91f-4561-8272-4d36f5f5d4e6"
/>
### Also
Looks like some test cases should be re-enabled in
0b8dfb5762
But only comments was removed. The skip statements survived the change.
The module event indicates that some information about a module has
changed. The event is supported by the Emacs and Visual Studio DAP
clients. This PR adds support for emitting the event from lldb-dap.
Fixes#137058
This converts a number of json::Value's into well defined types that are
used throughout lldb-dap and updates the 'launch' command to use the new
well defined types.
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
The debug adapter protocol supports an option to provide formatting
information for a stack frames as part of the StackTrace request.
lldb-dap incorrectly advertises it supports this, but until this PR that
support wasn't actually implemented.
Fixes#137057
# Summary
This PR updates `SBProcess::GetNumThreads()` and
`SBProcess::GetThreadAtIndex()` to listen to the stop locker.
`SBProcess::GetNumThreads()` will return 0 if the process is running.
## Problem Description
Recently upon debugging a program with thousands of threads in VS Code,
lldb-dap would hang at a `threads` request sent right after receiving
the `configurationDone` response. Soon after it will end the debug
session with the following error
```
Process <pid> exited with status = -1 (0xffffffff) lost connection
```
This is because LLDB is still in the middle of resuming all the threads.
And requesting threads will end up interrupt the process on Linux. From
the gdb-remote log it ended up getting `lldb::StateType::eStateInvalid`
and just exit with status -1.
I don't think it's reasonable to allow getting threads from a running
process. There are a few approaches to fix this:
1) Send the stopped event to IDE after `configurationDone`. This aligns
with the CLI behavior.
2) However, the above approach will break the existing user facing
behavior. The alternative will be reject the `threads` request if the
process is not stopped.
3) Improve the run lock. This is a synchronize issue where process was
in the middle of resuming while lldb-dap attempts to interrupt it.
**This PR implements the option 3**
## HOWEVER
This fixed the "lost connection" issue below but new issue has surfaced.
From testing, and also from checking the [VSCode source
code](174af221c9/src/vs/workbench/contrib/debug/browser/debugSession.ts (L791)),
it expects having threadID to perform `pause`. So after attaching,
without any threads reported to the client, the user will not be able to
pause the attached process. `setBreakpoint` will still work and once we
make a stop at the bp (or any stop that will report threads, client can
perform pause again.
## NEXT
1) Made an attempt to return initial thread list so that VSCode can
pause (second commit in the PR)
2) Investigate why threads will trigger unwinding the second frame of a
thread, which leads to sending the interrupt
3) Decided if we want to support `stopOnEntry` for attaching, given
i. This is not an official specification
ii. If enable stopOnEntry, we need to fix attaching on Linux, to send
only one stopped event. Currently, all threads upon attaching will have
stop reason `SIGSTOP` and lldb-dap will send `stopped` event for each
one of them. Every `stopped` will trigger the client request for
threads.
iii. Alternatively, we can support auto continue correspond to `(lldb)
process attach --continue`. This require the ii above.
### Additionally
lldb-dap will not send a `continued` event after `configurationDone`
because it checks `dap.focus_tid == LLDB_INVALID_THREAD_ID` (so that we
don't send it for `launch` request). Notice `dap.focus_tid` will only
get assigned when handling stop or stepping.
According to DAP
> Please note: a debug adapter is not expected to send this event in
response to a request that implies that execution continues, e.g. launch
or continue.
It is only necessary to send a continued event if there was no previous
request that implied this.
So I guess we are not violating DAP if we don't send `continued` event.
But I'd like to get some sense about this.
## Test Plan
Used following program for testing:
https://gist.github.com/kusmour/1729d2e07b7b1063897db77de194e47d
**NOTE: Utilize stdin to get pid and attach AFTER hitting enter. Attach
should happen when all the threads start running.**
DAP messages before the change
<img width="1165" alt="image"
src="https://github.com/user-attachments/assets/a9ad85fb-81ce-419c-95e5-612639905c66"
/>
DAP message after the change - report zero threads after attaching
<img width="1165" alt="image"
src="https://github.com/user-attachments/assets/a1179e18-6844-437a-938c-0383702294cd"
/>
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Before #134048, TestDAP_Progress relied on wait_for_event to block until
the progressEnd came in. However, progress events were not added to the
packet list, so this call would always time out. This PR makes it so
that packets are added to the packet list, and you can block on them.
This adds new types and helpers to support the 'initialize' request with
the new typed RequestHandler. While working on this I found there were a
few cases where we incorrectly treated initialize arguments as
capabilities. The new `lldb_dap::protocol::InitializeRequestArguments`
and `lldb_dap::protocol::Capabilities` uncovered the inconsistencies.
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
This should ensure we have the full logs prior to dumping the logs.
Additionally, printing log dumps to stderr so they are adjacent to
assertion failures.
Instead of having two discrete InputStream and OutputStream helpers,
this merges the two into a unifed 'Transport' handler.
This handler is responsible for reading the DAP message headers, parsing
the resulting JSON and converting the messages into
`lldb_dap::protocol::Message`s for both input and output.
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Both spellings are considered correct and acceptable, with adapter being
more common in American English. Given that DAP stands for Debug Adapter
Protocol (with an e) let's go with that as the canonical spelling.
This adjusts the lldb-dap listening mode to accept multiple clients.
Each client initializes a new instance of DAP and an associated
`lldb::SBDebugger` instance.
The listening mode is configured with the `--connection` option and
supports listening on a port or a unix socket on supported platforms.
When running in server mode launch and attach performance should
be improved by lldb sharing symbols for core libraries between debug
sessions.
This commit adds support for column breakpoints to lldb-dap
To do so, support for the `breakpointLocations` request was
added. To find all available breakpoint positions, we iterate over
the line table.
The `setBreakpoints` request already forwarded the column correctly to
`SBTarget::BreakpointCreateByLocation`. However, `SourceBreakpointMap`
did not keep track of multiple breakpoints in the same line. To do so,
the `SourceBreakpointMap` is now indexed by line+column instead of by
line only.
This was previously submitted as #113787, but got reverted due to
failures on ARM and macOS. This second attempt has less strict test
case expectations. Also, I added a release note.
This commit adds support for column breakpoints to lldb-dap.
To do so, support for the `breakpointLocations` request was
added. To find all available breakpoint positions, we iterate over
the line table.
The `setBreakpoints` request already forwarded the column correctly to
`SBTarget::BreakpointCreateByLocation`. However, `SourceBreakpointMap`
did not keep track of multiple breakpoints in the same line. To do so,
the `SourceBreakpointMap` is now indexed by line+column instead of by
line only.
See http://jonasdevlieghere.com/post/lldb-column-breakpoints/ for a
high-level introduction to column breakpoints.
More context can be found in
https://github.com/llvm/llvm-project/pull/110303
For DAP tests running in constrained environments (e.g., Docker
containers), disabling ASLR isn't allowed. So we set `disableASLR=False`
(since https://github.com/llvm/llvm-project/pull/113593).
However, the `dap_server.py` will currently only forward the value
of `disableASLR` to the DAP executable if it's set to `True`. If the
DAP executable wasn't provided a `disableASLR` field it defaults to
`true` too:
f147437945/lldb/tools/lldb-dap/lldb-dap.cpp (L2103-L2104)
This means that passing `disableASLR=False` from the tests is currently
not possible.
This is also true for many of the other boolean arguments of
`request_launch`. But this patch only addresses `disableASLR` for now
since it's blocking a libc++ patch.