This patch uses the "setting changed" callback to clear previously
cached stack frames when
`target.process.disable-language-runtime-unwindplans` is changed. This
is necessary so that changing the setting followed by a `backtrace`
command produces an accurate backtrace.
With this, a user can create a custom command like below in order to
quickly inspect a backtrace created without language plugins.
```
def bt_without_runtime_unwindplans(debugger, command, result, internal_dict):
    # Standard LLDB command-function signature; register it with
    # "command script add".
    debugger.HandleCommand("settings set target.process.disable-language-runtime-unwindplans true")
    debugger.HandleCommand("bt " + command)
    debugger.HandleCommand("settings set target.process.disable-language-runtime-unwindplans false")
```
In the future, we may consider implementing this as an option to
`backtrace`. Currently, this process setting is the only way of
affecting the unwinder, and changing the process setting as part of the
backtrace implementation doesn't feel right.
There are no upstream users of this flag, so we cannot write a test for
it here.
This is an odd corner case of the use of scripts loaded from dSYMs - a
macOS-only feature, which can load OS plugins that re-present the thread
state of the program we attach to. If we find out about and load the
dSYM scripts when we discover a target in the course of attaching to it,
we can end up running the OS plugin before we've started up the private
state thread. However, the OS plugin in that case will be running before
we broadcast the stop event to the public event listener, so it should
formally use the private state, not the public state, for the Python
code environment.
This patch says that if we have not yet started up the private state
thread, then any thread that is servicing events is doing so on behalf
of the private state machinery, and should see the private state, not
the public state.
Most of the patch is getting a test that will actually reproduce the
error. Only `test_python_os_plugin_remote` actually reproduces it; in
`test_python_os_plugin` we start up the private state thread before
handling the event. `test_python_os_plugin` is there for completeness'
sake.
It's not necessary on POSIX platforms as of #126935, and it's ignored on
Windows as of #138896. For both platforms, we have a better way of
inheriting FDs/HANDLEs.
This patch should be mostly obvious, but in one place it changes
`const auto &it = std::find(...)` to `auto it = llvm::find(...)`.
We do not need to bind to a temporary with a const reference.
This macro does not do what is described. The only thing it actually
controls is whether or not the process's memory cache gets flushed
when writing to an address. It does not override the setting
`target.process.disable-memory-cache`.
Instead of making it work as intended, I chose to remove the macro. I
don't see much value in being able to override the setting with a
preprocessor macro.
This reapplies https://github.com/llvm/llvm-project/pull/138892, which
was reverted in
5fb9dca14a
due to failures on Windows.
Windows loads modules from the Process class, and it does that quite
early. That makes some sense, which is why I'm moving the clearing code
even earlier.
The original commit message was:
Minidump files contain explicit information about load addresses of
modules, so the core file plugin can load them itself. This works on
other platforms, but
fails on darwin because DynamicLoaderDarwin nukes the loaded module list
on initialization (which happens after the core file plugin has done its
work).
This used to work until
https://github.com/llvm/llvm-project/pull/109477, which enabled the
dynamic loader
plugins for minidump files in order to get them to provide access to
TLS.
Clearing the load list makes sense, but I think we could do it earlier
in the process, so that both Process and DynamicLoader plugins get a
chance to load modules. This patch does that by calling the function
early in the launch/attach/load core flows.
This fixes TestDynamicValue.py:test_from_core_file on darwin.
Follow-up to #138892 fixing breakage on windows. Calling
ClearAllLoadedSections earlier is necessary to avoid throwing out the
work done by the windows process plugin.
Minidump files contain explicit information about load addresses of
modules, so the core file plugin can load them itself. This works on
other platforms, but
fails on darwin because DynamicLoaderDarwin nukes the loaded module list
on initialization (which happens after the core file plugin has done its
work).
This used to work until #109477, which enabled the dynamic loader
plugins for minidump files in order to get them to provide access to
TLS.
Clearing the load list makes sense, but I think we could do it earlier
in the process, so that both Process and DynamicLoader plugins get a
chance to load modules. This patch does that by calling the function
early in the launch/attach/load core flows.
This fixes TestDynamicValue.py:test_from_core_file on darwin.
Custom regions in Process::GetUserSpecifiedCoreFileSaveRanges originally
used `FindEntryThatContains`. This made sense on my first attempt, but
what we really want are *intersecting* regions. This is so the user can
specify arbitrary memory, and if it's available we output it to the core
(Minidump or MachO).
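To make the distinction concrete, here is a small illustrative sketch of
the two predicates for half-open ranges (not the actual LLDB range API):
```
def contains(r_start, r_end, q_start, q_end):
    # The query [q_start, q_end) lies entirely inside the region.
    return r_start <= q_start and q_end <= r_end

def intersects(r_start, r_end, q_start, q_end):
    # The query overlaps the region at all; this is what we want for
    # user-specified ranges when saving a core.
    return q_start < r_end and r_start < q_end
```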
This reverts commit daa4061d61216456baa83ab404e096200e327fb4.
Original PR https://github.com/llvm/llvm-project/pull/129092.
I have restricted the test to X86 Windows because it turns out the only
reason that `expr x.get()` would change m_memory_id is that on x86 we
have to write the return address to the stack in ABIWindows_X86_64::PrepareTrivialCall:
```
// Save return address onto the stack
if (!process_sp->WritePointerToMemory(sp, return_addr, error))
return false;
```
This is not required on AArch64 so m_memory_id was not changed:
```
(lldb) expr x.get()
(int) $0 = 0
(lldb) process status -d
Process 15316 stopped
* thread #1, stop reason = Exception 0x80000003 encountered at address 0x7ff764a31034
frame #0: 0x00007ff764a31038 TestProcessModificationIdOnExpr.cpp.tmp`main at TestProcessModificationIdOnExpr.cpp:35
32 __builtin_debugtrap();
33 __builtin_debugtrap();
34 return 0;
-> 35 }
36
37 // CHECK-LABEL: process status -d
38 // CHECK: m_stop_id: 2
ProcessModID:
m_stop_id: 3
m_last_natural_stop_id: 0
m_resume_id: 0
m_memory_id: 0
```
Really we should find a better way to force a memory write here, but
I can't think of one right now.
And a follow-up warning fix.
This reverts commit 6aa963f780d63d4c8fa80de97dd79c932bc35f4e
and 2bff80f25d51e24d3c552e033a2863dd36ef648b.
This is failing on Windows on Arm: https://lab.llvm.org/buildbot/#/builders/141/builds/8375
Seems to produce the line the test wants but not in the right place.
Reverting while I investigate.
stop-hooks are supposed to trigger every time the process stops, but as
initially implemented they would only fire when control was returned to
the user. So for instance when a process was launched the stop hook
would only trigger when the process hit a breakpoint or crashed.
However, it would be really useful to be able to trigger a stop hook
when lldb first gains control over the process. One way to do that would
be to implement general "target lifecycle events" and then send process
created events that users could bind actions to.
OTOH, extending the stop hooks to fire when lldb first gains control
over the process is a pretty natural extension to the notion of a stop
hook. So this patch takes the shorter route to that ability by making
stop-hooks fire when lldb first gains control over the process.
I also added the ability to specify whether to trigger the stop hook "on
gaining control". I'm on the fence about whether to set the default to
be "trigger on gaining control" or "don't trigger on gaining control".
Since I think it's a generally useful feature, I've set the default to
"trigger on gaining control".
This change adds a setting `target.process.track-memory-cache-changes`.
Disabling this setting prevents invalidating and updating values in
`ValueObject::UpdateValueIfNeeded` when only "internal" debugger memory
is updated. Writing to "internal" debugger memory happens when, for
instance, expressions are evaluated by visualizers (pretty printers).
One example where cache invalidation has a particularly heavy impact is
the visualization of certain collections: for some of them, getting the
collection size is an expensive operation (it requires traversing the
collection).
At the same time, evaluating a user expression with side effects
(visible to the target, not only to the debugger) will still bump the
memory ID because:
- If the expression is evaluated via the interpreter, it causes a write
to "non-internal" memory
- If the expression is JIT-compiled, LLDB writes to "non-internal" stack
memory in order to call the function
The downside of disabling `target.process.track-memory-cache-changes` is
that convenience variables won't re-evaluate synthetic children
automatically.
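A minimal sketch of opting out, using the setting name introduced above:
```
import lldb

def disable_cache_tracking(debugger: lldb.SBDebugger) -> None:
    # Trade-off: cheaper repeated pretty-printer evaluation, at the cost
    # of convenience variables not re-evaluating synthetic children.
    debugger.HandleCommand(
        "settings set target.process.track-memory-cache-changes false")
```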
---------
Co-authored-by: Mikhail Zakharov <mikhail.zakharov@jetbrains.com>
I traced the issue reported by Caroline and Pavel in #134757 back to the
call to ProcessRunLock::TrySetRunning. When that fails, we get a
somewhat misleading error message:
> process resume at entry point failed: Resume request failed - process
still running.
This is incorrect: the problem was not that the process was in a running
state, but rather that the RunLock was being held by another thread
(i.e. the Statusline). TrySetRunning would return false in both cases
and the call site only accounted for the former.
Besides the odd semantics, the current implementation is inherently
racy and I believe incorrect. If someone is holding the RunLock, the
resume call should block, rather than give up, and with the lock held,
switch the running state and report the old running state.
This patch removes ProcessRunLock::TrySetRunning and updates all callers
to use ProcessRunLock::SetRunning instead. To support that,
ProcessRunLock::SetRunning (and ProcessRunLock::SetStopped, for
consistency) now report whether the process was stopped or running
respectively. Previously, both methods returned true unconditionally.
The old code has been around pretty much forever; there's nothing in the
git history to indicate that this was done purposely to
solve a particular issue. I've tested this on both Linux and macOS and
confirmed that this solves the statusline issue.
A big thank you to Jim for reviewing my proposed solution offline and
trying to poke holes in it.
I recently received an internal error report that LLDB was OOM'ing when
creating a Minidump. In my 64b refactor we made a decision to acquire
buffers the size of the largest memory region so we could read all of
the contents in one call. This made error handling very simple (and the
coding simpler for me!) but had the trade-off of large allocations if
huge pages were enabled.
This patch is one I've had on the back burner for a while: we can read
and write the Minidump memory sections in discrete chunks, which is what
we already do when writing to disk.
I had to refactor the error handling a bit, but the behavior remains the same. We
make a best effort attempt to read as much of the memory region as
possible, but fail immediately if we receive an error writing to disk. I
did not add new tests for this because our existing test suite is quite
good, but I did manually verify a few Minidumps couldn't read beyond the
red_zone.
```
(lldb) reg read $sp
rsp = 0x00007fffffffc3b0
(lldb) p/x 0x00007fffffffc3b0 - 128
(long) 0x00007fffffffc330
(lldb) memory read 0x00007fffffffc330
0x7fffffffc330: 60 c3 ff ff ff 7f 00 00 60 cd ff ff ff 7f 00 00 `.......`.......
0x7fffffffc340: 60 c3 ff ff ff 7f 00 00 65 e6 26 00 00 00 00 00 `.......e.&.....
(lldb) memory read 0x00007fffffffc329
error: could not parse memory info (Success!)
```
I'm not sure how to quantify the memory improvement beyond noting that
we previously allocated a buffer of the largest region's size regardless
of how much we were reading. So a 2 GB unreadable region would cause a
2 GB allocation even if we were only reading 4096 KB. Now we allocate at
most the range size or the 128 MB max chunk size, whichever is smaller.
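As a rough illustration (hypothetical helper names, not the actual C++
code), the chunked, best-effort read looks like this:
```
MAX_CHUNK_SIZE = 128 * 1024 * 1024  # the 128 MB cap described above

def read_region_chunked(read_memory, base, size):
    """read_memory(addr, length) -> bytes; may return fewer bytes."""
    data = bytearray()
    offset = 0
    while offset < size:
        want = min(MAX_CHUNK_SIZE, size - offset)
        chunk = read_memory(base + offset, want)
        if not chunk:
            break  # unreadable tail: keep what we have (best effort)
        data += chunk
        offset += len(chunk)
    return bytes(data)
```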
This NFC patch simplifies the main loop in HandleProcessStateChangedEvent
by moving duplicated code into the StopInfo class, also allowing
StopInfo subclasses to override behavior.
More specifically, two functions are created:
* ShouldShow: should a Thread with such a StopInfo be printed when
the debugger stops? Currently, no StopInfo subclasses override this, but
a subsequent patch will fix a bug by making StopInfoBreakpoint check
whether the breakpoint is internal.
* ShouldSelect: should a Thread with such a StopInfo be selected? This
is currently overridden by StopInfoUnixSignal but will, in the future,
be overridden by StopInfoBreakpoint.
When using `SBFrame::EvaluateExpression` on a frame that's not the
currently selected frame, we would sometimes run into errors such as:
```
error: error: The context has changed before we could JIT the expression!
error: errored out in DoExecute, couldn't PrepareToExecuteJITExpression
```
During expression parsing, we call `RunStaticInitializers`. On our
internal fork this happens quite frequently because any usage of, e.g.,
function pointers, will inject ptrauth fixup code into the expression.
The static initializers are run using `RunThreadPlan`. The
`ExecutionContext::m_frame_sp` going into the `RunThreadPlan` is the
`SBFrame` that we called `EvaluateExpression` on. LLDB then tries to
save this frame to restore it after the thread-plan ran (the restore
occurs by unconditionally overwriting whatever is in
`ExecutionContext::m_frame_sp`). However, if the `selected_frame_sp` is
not the same as the `SBFrame`, then `RunThreadPlan` would set the
`ExecutionContext`'s frame to a different frame than what we started
with. When we `PrepareToExecuteJITExpression`, LLDB checks whether the
`ExecutionContext` frame changed from when we initially called
`EvaluateExpression`, and if it did, bails out with the error above.
One such test-case is attached. This currently passes regardless of the
fix because our ptrauth static initializers code isn't upstream yet. But
the plan is to upstream it soon.
This patch addresses the issue by saving/restoring the frame of the
incoming `ExecutionContext`, if such a frame exists. Otherwise, we fall
back to using the selected frame.
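A minimal sketch of the triggering scenario (the expression text is
hypothetical; it stands in for code that injects static initializers,
e.g. ptrauth fixups):
```
import lldb

def evaluate_on_parent_frame(thread: lldb.SBThread) -> lldb.SBValue:
    frame = thread.GetFrameAtIndex(1)  # not the selected frame
    thread.SetSelectedFrame(0)
    # Before this fix, static initializers run through RunThreadPlan could
    # replace the ExecutionContext's frame, making the JIT bail out with
    # the "context has changed" error quoted above.
    return frame.EvaluateExpression("take_fn_ptr(&some_function)")
```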
rdar://147456589
SetExitStatus can be called a second time when we reap the debug
server process. This shouldn't be interesting because, at that point,
we've already told everyone that the process has exited.
I believe/hope this will also help with sporadic shutdown crashes that
have cropped up recently. They happen because the debug server is
monitored from a detached thread, so this code can be called after main
returns (and starts destroying everything). This isn't a real fix for
that though, as the situation can still happen (it's just that it
usually happens after the exit status has already been set). I think the
real fix for that is to make sure these threads terminate before we
start shutting everything down.
This reverts commit
87b7f63a11,
reapplying
7e66cf74fb
with a small (and probably temporary)
change to generate more debug info to help with diagnosing buildbot
issues.
The main motivation for this was the inconsistency in handling of
partial reads/writes between the Windows and POSIX implementations
(Windows was returning partial reads, POSIX was trying to fill the
buffer completely). I settled on the Windows behavior, as that's
the more common one, and the "eager" version can be implemented on
top of it (in most cases, it isn't necessary, since we're writing just
a single byte).
Since this also required auditing the callers to make sure they're
handling partial reads/writes correctly, I used the opportunity to
modernize the function signatures as a forcing function. They now use
the `Timeout` class (basically an `optional<duration>`) to support both
polls (timeout=0) and blocking (timeout=nullopt) operations in a single
function, and use an `Expected` instead of a by-ref result to return the
number of bytes read/written.
As a drive-by, I also fix a problem with the Windows implementation
where we were rounding the timeout value down, which meant that calls
could time out slightly sooner than expected.
Make StreamAsynchronousIO a unique_ptr instead of a shared_ptr. I tried
passing the class by value, but the llvm::raw_ostream forwarder stored
in the Stream parent class isn't movable, and I don't think it's worth
changing that. Additionally, there are a few places that expect a
StreamSP, which is easily created from a StreamUP.
Remove Debugger::GetOutputStream and Debugger::GetErrorStream in
preparation for replacing both with a new variant that needs to be
locked and hence can't be handed out like we do right now.
The patch replaces most uses with GetAsyncOutputStream and
GetAsyncErrorStream respectively. These methods return new StreamSP
objects that automatically get flushed on destruction.
See #126630 for more details.
I encountered a `qMemoryRegionInfo not supported` error when capturing a
Minidump. This was surprising, so I started looking around. I found
@jasonmolenda's fix in #115963 and then realized I was not validating
anything from the custom ranges.
This reverts commit a774de807e56c1147d4630bfec3110c11d41776e.
These are the same changes as last time, plus:
* We load the binary into the target object so that on Windows, we can
resolve the locations of the functions.
* We now assert that each required breakpoint has at least 1 location,
to prevent an issue like that in the future.
* We are less strict about the unsupported error message, because it
prints "error: windows" on Windows instead of "error: gdb-remote".
Reverts llvm/llvm-project#123945
Has failed on the Windows on Arm buildbot:
https://lab.llvm.org/buildbot/#/builders/141/builds/5865
```
********************
Unresolved Tests (2):
lldb-api :: functionalities/reverse-execution/TestReverseContinueBreakpoints.py
lldb-api :: functionalities/reverse-execution/TestReverseContinueWatchpoints.py
********************
Failed Tests (1):
lldb-api :: functionalities/reverse-execution/TestReverseContinueNotSupported.py
```
Reverting while I reproduce locally.
This reverts commit 22561cfb443267905d4190f0e2a738e6b412457f and fixes
b7b9ccf44988edf49886743ae5c3cf4184db211f (#112079).
The problem is that x86_64 and Arm 32-bit have memory regions above the
stack that are readable but not writeable. First Arm:
```
(lldb) memory region --all
<...>
[0x00000000fffcf000-0x00000000ffff0000) rw- [stack]
[0x00000000ffff0000-0x00000000ffff1000) r-x [vectors]
[0x00000000ffff1000-0xffffffffffffffff) ---
```
Then x86_64:
```
$ cat /proc/self/maps
<...>
7ffdcd148000-7ffdcd16a000 rw-p 00000000 00:00 0 [stack]
7ffdcd193000-7ffdcd196000 r--p 00000000 00:00 0 [vvar]
7ffdcd196000-7ffdcd197000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
```
Compare this to AArch64 where the test did pass:
```
$ cat /proc/self/maps
<...>
ffffb87dc000-ffffb87dd000 r--p 00000000 00:00 0 [vvar]
ffffb87dd000-ffffb87de000 r-xp 00000000 00:00 0 [vdso]
ffffb87de000-ffffb87e0000 r--p 0002a000 00:3c 76927217 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffffb87e0000-ffffb87e2000 rw-p 0002c000 00:3c 76927217 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
fffff4216000-fffff4237000 rw-p 00000000 00:00 0 [stack]
```
To solve this, look up the memory region of the stack pointer (using
https://lldb.llvm.org/resources/lldbgdbremote.html#qmemoryregioninfo-addr)
and constrain the read to within that region, since we know the stack is
all readable and writeable.
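In SB API terms, the constraint looks roughly like this sketch (assuming
a valid `process` and an address inside the stack region):
```
import lldb

def clamp_read_length(process: lldb.SBProcess, addr: int, length: int) -> int:
    region = lldb.SBMemoryRegionInfo()
    error = process.GetMemoryRegionInfo(addr, region)
    if error.Fail() or not region.IsReadable():
        return 0
    # Never read past the end of the region containing addr.
    return min(length, region.GetRegionEnd() - addr)
```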
I have also added skipIfRemote to the tests, since getting them working
in that context is too complex to be worth it.
Memory write failures now display the range they tried to write, and
register write errors will show the name of the register where possible.
The patch also includes a workaround for an issue where the test code
could mistake an `x` response that happens to begin with an `O` for an
output packet (stdout). This workaround will not be necessary once we
start using the [new
implementation](https://discourse.llvm.org/t/rfc-fixing-incompatibilties-of-the-x-packet-w-r-t-gdb/84288)
of the `x` packet.
---------
Co-authored-by: Pavel Labath <pavel@labath.sk>
This commit adds support for an
`SBProcess::ContinueInDirection()` API. A user-accessible command for
this will follow in a later commit.
This feature depends on a gdbserver implementation (e.g. `rr`) providing
support for the `bc` and `bs` packets. `lldb-server` does not support
those packets, and there is no plan to change that. For testing
purposes, this commit adds a Python implementation of *very limited*
record-and-reverse-execute functionality, implemented as a proxy between
lldb and lldb-server in `lldbreverse.py`. This should not (and in
practice cannot) be used for anything except testing.
The tests here are quite minimal but we test that simple breakpoints and
watchpoints work as expected during reverse execution, and that
conditional breakpoints and watchpoints work when the condition calls a
function that must be executed in the forward direction.
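For illustration, a hedged sketch of driving the API from Python; the
direction enumerator name (`lldb.eRunReverse`) is an assumption here, so
check the SBProcess headers for the exact spelling:
```
import lldb

def continue_reverse(process: lldb.SBProcess) -> lldb.SBError:
    # Requires a backend (e.g. rr, or the lldbreverse.py test proxy)
    # that implements the "bc"/"bs" packets.
    return process.ContinueInDirection(lldb.eRunReverse)
```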
Currently, an LLDB target option controls whether plugins report all
threads. However, it seems natural for this knowledge to come from
the plugin itself. To support this, this commit adds a virtual method
to the plugin base class, making the Python OS plugin query the target
option to preserve existing behavior.
The ELF core debugging fix #117070 broke the TestLoadUnload.py tests:
due to the GetModuleSpec call, ProcessGDBRemote fetches modules from the
remote. This revises the original PR and renames FindBuildId to
FindModuleUUID.
A scripted process implementation might return an SBMemoryRegionInfo
object in its implementation of `get_memory_region_containing_address`
which will have an address 0 and size 0, without realizing the problems
this can cause. Several algorithms in lldb will try to iterate over the
MemoryRegions of the process, starting at address 0 and expecting to
iterate up to the highest vm address, stepping by the size of each
region, so a 0-length region will result in an infinite loop. Add a
check to Process::GetMemoryRegionInfo that rejects a MemoryRegion which
does not contain the requested address; a 0-length memory region will
therefore always be rejected.
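A tiny illustrative check mirroring the new guard (not the LLDB source):
```
def region_contains(base: int, end: int, addr: int) -> bool:
    # Half-open region [base, end) must contain the requested address.
    return base <= addr < end

assert not region_contains(0, 0, 0)  # a 0-length region is always rejected
```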
rdar://139678032
Add the ability to override the disassembly CPU and CPU features through
a target setting (`target.disassembly-cpu` and
`target.disassembly-features`) and a `disassemble` command option
(`--cpu` and `--features`).
This is especially relevant for architectures like RISC-V, which rely
heavily on CPU extensions.
The majority of this patch is plumbing the options through. I recommend
looking at DisassemblerLLVMC and the test for the observable change in
behavior.
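A hedged sketch of both ways to apply the overrides (the CPU and feature
names below are hypothetical values):
```
import lldb

def apply_disassembly_overrides(debugger: lldb.SBDebugger) -> None:
    # Via the new target settings:
    debugger.HandleCommand("settings set target.disassembly-cpu sifive-u74")
    debugger.HandleCommand("settings set target.disassembly-features +zba,+zbb")
    # Or per invocation, via the new disassemble options:
    debugger.HandleCommand("disassemble --cpu sifive-u74 --features +zba,+zbb")
```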
Reverting this again; I added a commit which added @skipIfDarwin
markers to the TestReverseContinueBreakpoints.py and
TestReverseContinueNotSupported.py API tests, which use lldb-server
in gdbserver mode, which does not work on Darwin. But the aarch64 ubuntu
bot reported a failure on TestReverseContinueBreakpoints.py,
https://lab.llvm.org/buildbot/#/builders/59/builds/6397
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/reverse-execution/TestReverseContinueBreakpoints.py", line 63, in test_reverse_continue_skip_breakpoint
self.reverse_continue_skip_breakpoint_internal(async_mode=False)
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/reverse-execution/TestReverseContinueBreakpoints.py", line 81, in reverse_continue_skip_breakpoint_internal
self.expect(
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2372, in expect
self.runCmd(
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1002, in runCmd
self.assertTrue(self.res.Succeeded(), msg + output)
AssertionError: False is not true : Process should be stopped due to history boundary
Error output:
error: Process must be launched.
This reverts commit 4f297566b3150097de26c6a23a987d2bd5fc19c5.
This commit only adds support for the
`SBProcess::ReverseContinue()` API. A user-accessible command for this
will follow in a later commit.
This feature depends on a gdbserver implementation (e.g. `rr`) providing
support for the `bc` and `bs` packets. `lldb-server` does not support
those packets, and there is no plan to change that. So, for testing
purposes, `lldbreverse.py` wraps `lldb-server` with a Python
implementation of *very limited* record-and-replay functionality for use
by *tests only*.
The majority of this PR is test infrastructure (about 700 of the 950
lines added).
This patch adds support to `Process.cpp` to automatically save off
TLS sections, either by loading the memory region for the module or
by reading `fs_base` via the generic register. Then when Minidumps are
loaded, we now specify we want the dynamic loader to be the `POSIXDYLD`
so we can leverage the same TLS accessor code as `ProcessELFCore`. Being
able to access TLS data is an important step for LLDB-generated
minidumps to have feature parity with ELF core dumps.
This commit only adds support for the
`SBProcess::ReverseContinue()` API. A user-accessible command for this
will follow in a later commit.
This feature depends on a gdbserver implementation (e.g. `rr`) providing
support for the `bc` and `bs` packets. `lldb-server` does not support
those packets, and there is no plan to change that. So, for testing
purposes, `lldbreverse.py` wraps `lldb-server` with a Python
implementation of *very limited* record-and-replay functionality for use
by *tests only*.
The majority of this PR is test infrastructure (about 700 of the 950
lines added).
Recently in #107731 this change was reverted due to excess memory size
in `TestSkinnyCore`. This was due to a bug where a range's end was being
passed as the size, creating massive memory ranges.
Additionally, and requiring additional review, I added more unit tests
and more verbose logic to the merging of save core memory regions.
@jasonmolenda as an FYI.
Reapplies #106293; testing identified an issue in the merging code. I
used this opportunity to strip CoreFileMemoryRanges into its own file
and then add unit tests on its behavior.