A CMake change included in CMake 4.0 makes `AIX` into a variable
(similar to `APPLE`, etc.)
ff03db6657
However, `${CMAKE_SYSTEM_NAME}` unfortunately also expands exactly to
`AIX` and `if` auto-expands variable names in CMake. That means you get
a double expansion if you write:
`if (${CMAKE_SYSTEM_NAME} MATCHES "AIX")`
which becomes:
`if (AIX MATCHES "AIX")`
which is as if you wrote:
`if (ON MATCHES "AIX")`
You can prevent this by quoting the expansion of "${CMAKE_SYSTEM_NAME}",
due to policy
[CMP0054](https://cmake.org/cmake/help/latest/policy/CMP0054.html#policy:CMP0054)
which is on by default in 4.0+. Most of the LLVM CMake already does
this, but this PR fixes the remaining cases where we do not.
`std::wstring AnsiToUtf16(const std::string &ansi)` is a
reimplementation of `llvm::sys::windows::UTF8ToUTF16`. This patch
removes `AnsiToUtf16` and its usages entirely.
This abstracts the base Transport handler to have a MessageHandler
component and allows us to generalize both JSON-RPC 2.0 for MCP (or an
LSP) and DAP format.
This should allow us to create clearly defined clients and servers for
protocols, both for testing and for RPC between the lldb instances and
an lldb-mcp multiplexer.
This basic model is inspiried by the clangd/Transport.h file and the
mlir/lsp-server-support/Transport.h that are both used for LSP servers
within the llvm project.
Additionally, this helps with testing by subclassing `Transport` to
allow us to simplify sending/receiving messages without needing to use a
toJSON/fromJSON and a pair of pipes, see `TestTransport` in
DAP/TestBase.h.
Reapply "[lldb] Update JSONTransport to use MainLoop for reading."
(#152155)
This reverts commit cd40281685f642ad879e33f3fda8d1faa136ebf4.
This also includes some updates to try to address the platforms with
failing tests.
I updated the JSONTransport and tests to use std::function instead of
llvm:unique_function. I think the tests were failing due to the
unique_function not being moved correctly in the loop on some platforms.
This updates JSONTransport to use a MainLoop for reading messages.
This also allows us to read in larger chunks than we did previously.
With the event driven reading operations we can read in chunks and store
the contents in an internal buffer. Separately we can parse the buffer
and split the contents up into messages.
Our previous version approach would read a byte at a time, which is less
efficient.
This patch makes LLDB use the Event Viewer on Windows (equivalent of
system logging on Darwin) rather than piping to the standard output
(which was deactivated in ca0a5247004b6d692978d10bdbf86e338133e60c.
This fixes the following build warnings in a mingw environment:
../../lldb/source/Host/windows/MainLoopWindows.cpp:226:50: warning:
format specifies type 'int' but the argument has type
'IOObject::WaitableHandle' (aka 'void *') [-Wformat]
226 | "File descriptor %d already monitored.", waitable_handle);
| ~~ ^~~~~~~~~~~~~~~
../../lldb/source/Host/windows/MainLoopWindows.cpp:239:49: warning:
format specifies type 'int' but the argument has type 'DWORD' (aka
'unsigned long') [-Wformat]
238 | error = Status::FromErrorStringWithFormat("Unsupported file type
%d",
| ~~
| %lu
239 | file_type);
| ^~~~~~~~~
2 warnings generated.
This is a continuation of 68fd102, which did the same thing but only for
StopInfo. Using make_shared is both safer and more efficient:
- With make_shared, the object and the control block are allocated
together, which is more efficient.
- With make_shared, the enable_shared_from_this base class is properly
linked to the control block before the constructor finishes, so
shared_from_this() will be safe to use (though still not recommended
during construction).
I wasn't able to build lldb using Mingw-w64 on Windows without changing
these 3 lines. It seems like `std::atomic<bool>` wasn't being found
without `#include <atomic>` and `ceil` was defaulting to `std::ceil`
instead of `std::chrono::ceil`, but I'm not smart enough to know the
root cause. I'm sure I'm not the first people to try and compile lldb
(and clang and lld) with Mingw-w64 and I don't know if something is
wrong with my Mingw-w64, but my changes shouldn't have any affect if
they aren't needed.
# Lldb-server crash
We have seen stacks like the following in lldb-server core dumps:
```
[
"__GI___pthread_kill at pthread_kill.c:46",
"__GI_raise at raise.c:26",
"__GI_abort at abort.c:100",
"__assert_fail_base at assert.c:92",
"__GI___assert_fail at assert.c:101",
"lldb_private::NativeProcessProtocol::RemoveSoftwareBreakpoint(unsigned long) at /redacted/lldb-server:0"
]
```
# Hypothesis of root cause
In `NativeProcessProtocol::RemoveSoftwareBreakpoint()`
([code](19b2dd9d79/lldb/source/Host/common/NativeProcessProtocol.cpp (L359-L423))),
a `ref_count` is asserted and reduced. If it becomes zero, the code
first go through a series of memory reads and writes to remove the
breakpoint trap opcode and to restore the original process code, then,
if everything goes fine, removes the entry from the map
`m_software_breakpoints` at the end of the function.
However, if any of the validations for the above reads and writes goes
wrong, the code returns an error early, skipping the removal of the
entry. This leaves the entry behind with a `ref_count` of zero.
The next call to `NativeProcessProtocol::RemoveSoftwareBreakpoint()` for
the same breakpoint[*] would violate the assertion about `ref_count > 0`
([here](19b2dd9d79/lldb/source/Host/common/NativeProcessProtocol.cpp (L365))),
which would cause a crash.
[*] We haven't found a *regular* way to repro such a next call in lldb
or lldb-dap. This is because both of them remove the breakpoint from
their internal list when they get any response from the lldb-server (OK
or error). Asking the client to delete the breakpoint a second time
doesn't trigger the client to send the `$z` gdb packet to lldb-server.
We are able to trigger the crash by sending the `$z` packet directly,
see "Manual test" below.
# Fix
Lift the removal of the map entry to be immediately after the decrement
of `ref_count`, before the early returns. This ensures that the asserted
case will never happen. The validation errors can still happen, and
whether they happen or not, the breakpoint has been removed from the
perspective of the lldb-server (same as that of lldb and lldb-dap).
# Manual test & unit test
See PR.
This is an odd corner case of the use of scripts loaded from dSYM's - a
macOS only feature, which can load OS Plugins that re-present the thread
state of the program we attach to. If we find out about and load the
dSYM scripts when we discover a target in the course of attaching to it,
we can end up running the OS plugin before we've started up the private
state thread. However, the os_plugin in that case will be running before
we broadcast the stop event to the public event listener. So it should
formally use the private state and not the public state for the Python
code environment.
This patch says that if we have not yet started up the private state
thread, then any thread that is servicing events is doing so on behalf
of the private state machinery, and should see the private state, not
the public state.
Most of the patch is getting a test that will actually reproduce the
error. Only the test `test_python_os_plugin_remote` actually reproduced
the error. In `test_python_os_plugin` we actually do start up the
private state thread before handling the event. `test_python_os_plugin`
is there for completeness sake.
This should improve synchronizing the MainLoopWindows monitor thread
with the main loop state.
This uses the `m_ready` and `m_event` event handles to manage when the
Monitor thread continues and adds new tests to cover additional use
cases.
I believe this should fix#147291 but it is hard to ensure a race
condition is fixed without running the CI on multiple
machines/configurations.
---------
Co-authored-by: Pavel Labath <pavel@labath.sk>
LLDB has the concept of an "SDK root", which expresses where the
system libraries and so which are necessary for debugging can be
found. On POSIX platforms, the SDK root is just the root of the
file system, or /.
We need to implement the appropriate method on HostInfoPosix to
prevent the use of the default implementation for which just returns
an error.
This change is needed to support the Swift REPL but may be useful
for other use cases as well.
As seen on Linaro's Windows on Arm bot.
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Host\windows\MainLoopWindows.cpp(80,25): warning: missing field 'InternalHigh' initializer [-Wmissing-field-initializers]
80 | OVERLAPPED ov = {0};
| ^
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Host\windows\MainLoopWindows.cpp(135,8): warning: 'WillPoll' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
135 | void WillPoll() {
| ^
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\include\lldb/Host/windows/MainLoopWindows.h(40,18): note: overridden virtual function is here
40 | virtual void WillPoll() {}
| ^
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Host\windows\MainLoopWindows.cpp(142,8): warning: 'DidPoll' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
142 | void DidPoll() {
| ^
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\include\lldb/Host/windows/MainLoopWindows.h(41,18): note: overridden virtual function is here
41 | virtual void DidPoll() {}
| ^
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Host\windows\MainLoopWindows.cpp(80,25): warning: missing field 'InternalHigh' initializer [-Wmissing-field-initializers]
80 | OVERLAPPED ov = {0};
| ^
Commit 1a7b7e24bcc1041ae0fb90abcfb73d36d76f4a07 introduced a few casting
warnings and a build issue in Win32 platforms.
Trying to correct the casts to c++ style casts instead of C style casts.
Letting Editline refresh itself is more robust and ensures that the
current text is redraw if it was accidentally cleared. In that scenario
MoveCursor would only fix up the cursor position.
This patch fixes:
lldb/source/Host/posix/ConnectionFileDescriptorPosix.cpp:279:15:
error: format specifies type 'unsigned long' but the argument has
type 'file_t' (aka 'int') [-Werror,-Wformat]
lldb/source/Host/posix/ConnectionFileDescriptorPosix.cpp:383:15:
error: format specifies type 'unsigned long' but the argument has
type 'file_t' (aka 'int') [-Werror,-Wformat]
This updates MainLoopWindows to support events for reading from a pipe
(both anonymous and named pipes) as well as sockets.
This unifies both handle types using `WSAWaitForMultipleEvents` which
can listen to both sockets and handles for change events.
This should allow us to unify how we handle watching pipes/sockets on
Windows and Posix systems.
We can extend this in the future if we want to support watching other
types, like files or even other events like a process life time.
---------
Co-authored-by: Pavel Labath <pavel@labath.sk>
This PR ensures we correctly restore the cursor column after resizing
the statusline. To ensure we have space for the statusline, we have to
emit a newline to move up everything on screen. The newline causes the
cursor to move to the start of the next line, which needs to be undone.
Normally, we would use escape codes to save & restore the cursor
position, but that doesn't work here, as the cursor position may have
(purposely) changed. Instead, we move the cursor up one line using an
escape code, but we weren't restoring the column.
Interestingly, Editline was able to recover from this issue through the
LineInfo struct which contains the buffer and the cursor location, which
allows us to compute the column. This PR addresses the bug by having
Editline "refresh" the cursor position.
Fixes#134064
It's not necessary on posix platforms as of #126935 and it's ignored on
windows as of #138896. For both platforms, we have a better way of
inheriting FDs/HANDLEs.
# Benefit
This patch fixes:
1. After `platform select ios-simulator`, `platform process list` will
now print processes which are running in the iOS simulator. Previously,
no process will be listed.
2. After `platform select ios-simulator`, `platform attach --name
<name>` will succeed. Previously, it will error out saying no process is
found.
# Several bugs that is being fixed
1. During the process listing, add `aarch64` to the list of CPU types
for which iOS simulators are checked for.
2. Given a candidate process, when checking for simulators, the original
code will find the desired environment variable (`SIMULATOR_UDID`) and
set the OS to iOS, but then the immediate next environment variable will
set it back to macOS.
3. For processes running on simulator, set the triple's `Environment` to
`Simulator`, so that such processes can pass the filtering [in this
line](https://fburl.com/8nivnrjx). The original code leave it as the
default `UnknownEnvironment`.
# Manual test
**With this patch:**
```
royshi-mac-home ~/public_llvm/build % bin/lldb
(lldb) platform select ios-simulator
(lldb) platform process list
240 matching processes were found on "ios-simulator"
PID PARENT USER TRIPLE NAME
====== ====== ========== ============================== ============================
40511 28844 royshi arm64-apple-ios-simulator FocusPlayground // my toy iOS app running on simulator
... // omit
28844 1 royshi arm64-apple-ios-simulator launchd_sim
(lldb) process attach --name FocusPlayground
Process 40511 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
frame #0: 0x0000000104e3cb70 libsystem_kernel.dylib`mach_msg2_trap + 8
libsystem_kernel.dylib`mach_msg2_trap:
-> 0x104e3cb70 <+8>: ret
... // omit
```
**Without this patch:**
```
$ bin/lldb
(lldb) platform select ios-simulator
(lldb) platform process list
error: no processes were found on the "ios-simulator" platform
(lldb) process attach --name FocusPlayground
error: attach failed: could not find a process named FocusPlayground
```
# Unittest
See PR.
Originally added for reproducers, it is now only used for test code.
While we could make it a test helper, I think that after #145015 it is
simple enough to not be needed.
Also squeeze in a change to make ConnectionFileDescriptor accept a
unique_ptr<Socket>.
It creates a pair of connected sockets using the simplest mechanism for
the given platform (TCP on windows, socketpair(2) elsewhere).
Main motivation is to remove the ugly platform-specific code in
ProcessGDBRemote::LaunchAndConnectToDebugserver, but it can also be used
in other places where we need to create a pair of connected sockets.
This PR implements JSON RPC-style (i.e. newline delimited) JSON
transport. I moved the existing transport tests from DAP to Host and
moved the PipeTest base class into TestingSupport so it can be shared by
both.
If we're not touching them, we don't need to do anything special to pass
them along -- with one important caveat: due to how cmake arguments
work, the implicitly passed arguments need to be specified before
arguments that we handle.
This isn't particularly nice, but the alternative is enumerating all
arguments that can be used by llvm_add_library and the macros it calls
(it also relies on implicit passing of some arguments to
llvm_process_sources).
Before:
```
(lldb) r
error: connect remote failed (Failed to connect port)
error: Failed to connect port
```
After the patch:
```
(lldb) r
error: connect remote failed (Failed to connect localhost:47140)
error: Failed to connect localhost:47140
```
This reverts commit
a0260a95ec,
reapplying
7c5f5f3ef8,
with a fix that makes *both*
pipe handles inheritable.
The original commit description was:
This is a follow-up to https://github.com/llvm/llvm-project/pull/126935,
which enables passing handles to a child
process on windows systems. Unlike on unix-like systems, the handles
need to be created with the "inheritable" flag because there's to way to
change the flag value after it has been created. This is why I don't
respect the child_process_inherit flag but rather always set the flag to
true. (My next step is to delete the flag entirely.)
This does mean that pipe may be created as inheritable even if its not
necessary, but I think this is offset by the fact that windows (unlike
unixes, which pass all ~O_CLOEXEC descriptors through execve and *all*
descriptors through fork) has a way to specify the precise set of
handles to pass to a specific child process.
If this turns out to be insufficient, instead of a constructor flag, I'd
rather go with creating a separate api to create an inheritable copy of
a handle (as typically, you only want to inherit one end of the pipe).
This is the first useful patch in the series related to enabling
`PTRACE_SEIZE` for processes Coredumping. In order to make the decision
if we want to seize or attach, we need to expose that in processinfo.
Which we acquire by reading it from `/proc/pid/status`
Note that in status it is `CoreDumping` not `Coredumping`, so I kept
with that, even if I prefer `Coredumping`
This is a follow-up to https://github.com/llvm/llvm-project/pull/126935,
which enables passing handles to a child
process on windows systems. Unlike on unix-like systems, the handles
need to be created with the "inheritable" flag because there's to way to
change the flag value after it has been created. This is why I don't
respect the child_process_inherit flag but rather always set the flag to
true. (My next step is to delete the flag entirely.)
This does mean that pipe may be created as inheritable even if its not
necessary, but I think this is offset by the fact that windows (unlike
unixes, which pass all ~O_CLOEXEC descriptors through execve and *all*
descriptors through fork) has a way to specify the precise set of
handles to pass to a specific child process.
If this turns out to be insufficient, instead of a constructor flag, I'd
rather go with creating a separate api to create an inheritable copy of
a handle (as typically, you only want to inherit one end of the pipe).
Currently we're creating inheritable (`~FD_CLOEXEC`) file descriptors in
the (few) cases where we need to pass an FD to a subprocess. The problem
with these is that, in a multithreaded application such as lldb, there's
essentially no way to prevent them from being leaked into processes
other than the intended one.
A safer (though still not completely safe) approach is to mark the
descriptors as FD_CLOEXEC and only clear this flag in the subprocess. We
currently have something that almost does that, which is the ability to
add a `DuplicateFileAction` to our `ProcessLaunchInfo` struct (the
duplicated file descriptor will be created with the flag cleared). The
problem with *that* is that this approach is completely incompatible
with Windows.
Windows equivalents of file descriptors are `HANDLE`s, but these do not
have user controlled values -- applications are expected to work with
whatever HANDLE values are assigned by the OS. In unix terms, there is
no equivalent to the `dup2` syscall (only `dup`).
To find a way out of this conundrum, and create a miniscule API surface
that works uniformly across platforms, this PR proposes to extend the
`DuplicateFileAction` API to support duplicating a file descriptor onto
itself. Currently, this operation does nothing (it leaves the FD_CLOEXEC
flag set), because that's how `dup2(fd, fd)` behaves, but I think it's
not completely unreasonable to say that this operation should clear the
FD_CLOEXEC flag, just like it would do if one was using different fd
values. This would enable us to pass a windows HANDLE as itself through
the ProcessLaunchInfo API.
This PR implements the unix portion of this idea. Macos and non-macos
launchers are updated to clear FD_CLOEXEC flag when duplicating a file
descriptor onto itself, and I've created a test which enables passing a
FD_CLOEXEC file descritor to the subprocess. For the windows portion,
please see the follow-up PR.
This PR is in reference to porting LLDB on AIX.
Link to discussions on llvm discourse and github:
1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640
2. https://github.com/llvm/llvm-project/issues/101657
The complete changes for porting are present in this draft PR:
https://github.com/llvm/llvm-project/pull/102601
- Added changes to make the common host support functions under
`Host/posix` for unix-like system.
Also, created the `unittests/Host/posix/` to test the hostInfo & support
functions for unix-like system.
- Added changes to get the host information for AIX. (GetProcessInfo())
(Information like : executable path, arch, process status etc.)
Fixes https://github.com/llvm/llvm-project/issues/22981
If `settings set use-color` is changed when lldb is running it does not take effect.
This is fixes that.
---------
Signed-off-by: Ebuka Ezike <yerimyah1@gmail.com>
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Fix a deadlock between the statusline mutex (in Debugger) and the output
file mutex (in LockedStreamFile). The deadlock occurs when the main
thread is calling the statusline callback while holding the output mutex
in Editline, while the default event thread is trying to update the
statusline.
Extend the uncritical section so we can redraw the statusline there.
The loop in Editline::GetCharacter should be unnecessary. It would only
loop if we had a successful read with length zero, which shouldn't be
possible or when we can't convert a partial UTF-8 character, in which
case we bail out.
rdar://149251156
I traced the issue reported by Caroline and Pavel in #134757 back to the
call to ProcessRunLock::TrySetRunning. When that fails, we get a
somewhat misleading error message:
> process resume at entry point failed: Resume request failed - process
still running.
This is incorrect: the problem was not that the process was in a running
state, but rather that the RunLock was being held by another thread
(i.e. the Statusline). TrySetRunning would return false in both cases
and the call site only accounted for the former.
Besides the odd semantics, the current implementation is inherently
race-y and I believe incorrect. If someone is holding the RunLock, the
resume call should block, rather than give up, and with the lock held,
switch the running state and report the old running state.
This patch removes ProcessRunLock::TrySetRunning and updates all callers
to use ProcessRunLock::SetRunning instead. To support that,
ProcessRunLock::SetRunning (and ProcessRunLock::SetStopped, for
consistency) now report whether the process was stopped or running
respectively. Previously, both methods returned true unconditionally.
The old code has been around pretty much pretty much forever, there's
nothing in the git history to indicate that this was done purposely to
solve a particular issue. I've tested this on both Linux and macOS and
confirmed that this solves the statusline issue.
A big thank you to Jim for reviewing my proposed solution offline and
trying to poke holes in it.