jeffreytan81 f838fa820f
New ThreadPlanSingleThreadTimeout to resolve potential deadlock in single thread stepping (#90930)
This PR introduces a new `ThreadPlanSingleThreadTimeout` that will be
used to address potential deadlock during single-thread stepping.

While debugging a target with a non-trivial number of threads (around
5000 threads in one example target), we noticed that a simple step over
can take as long as 10 seconds. Enabling single-thread stepping mode
reduces the stepping time to around 3 seconds. However, single-thread
stepping can deadlock if we step over a function that depends on
another thread to release a lock, because the other threads are not
running while the step is in progress.

To address this issue, we introduce a new
`ThreadPlanSingleThreadTimeout` that can be controlled by the
`target.process.thread.single-thread-plan-timeout` setting during
single-thread stepping mode. The plan measures the time elapsed since
the last internal stop to judge whether stepping is making progress.
Once the timeout expires, we assume the target is stuck on a potential
deadlock, as described above. We then send a new async interrupt and
resume all threads, at which point `ThreadPlanSingleThreadTimeout` has
completed its task.
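
The progress check described above can be sketched in isolation. This is a minimal illustration, not the code in this PR: `ProgressTimeout`, `NoteInternalStop`, and `Expired` are hypothetical names, timestamps are passed in explicitly instead of being read from a clock, and treating a timeout of 0 as "disabled" is an assumption of the sketch.

```cpp
#include <cassert>
#include <chrono>

using Millis = std::chrono::milliseconds;

// Hypothetical helper for illustration only; this is not LLDB's
// ThreadPlanSingleThreadTimeout. Times are passed in explicitly so the
// logic is easy to follow and to test.
class ProgressTimeout {
public:
  // Assumption of this sketch: a timeout of 0 disables the check.
  explicit ProgressTimeout(Millis timeout) : m_timeout(timeout) {
    assert(timeout.count() >= 0 && "timeout must not be negative");
  }

  // Every internal stop counts as stepping progress, so restart the count.
  void NoteInternalStop(Millis now) { m_last_stop = now; }

  // True once at least `timeout` has elapsed since the last internal stop.
  // A caller would then interrupt the stepping thread and resume all threads.
  bool Expired(Millis now) const {
    return m_timeout.count() > 0 && now - m_last_stop >= m_timeout;
  }

private:
  Millis m_timeout;
  Millis m_last_stop{0};
};
```

Because the count restarts at every internal stop, a long step that keeps hitting internal stops never trips the timeout; only a stretch with no stop at all does.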

To support this design, the major changes made in this PR are:
1. `ThreadPlanSingleThreadTimeout` is popped during every internal stop
and reset (re-pushed) to the top of the stack (as a leaf node) during
resume. This is achieved by always returning `true` from
`ThreadPlanSingleThreadTimeout::DoPlanExplainsStop()` and
`ThreadPlanSingleThreadTimeout::MischiefManaged()`.
2. A new thread-specific async interrupt stop is introduced, which can
be detected/consumed by `ThreadPlanSingleThreadTimeout`.
3. The clearing of branch breakpoints in the range thread plan has been
moved from `DoPlanExplainsStop()` to `ShouldStop()`, because
`DoPlanExplainsStop()` is not guaranteed to be called.
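
The pop-on-stop, re-push-on-resume behavior from item 1 can be modeled with a toy plan stack. This is a sketch under stated assumptions: `PlanStack`, `OnResume`, `OnInternalStop`, and `kTimeoutPlan` are hypothetical names, plans are represented as plain strings, and none of this mirrors LLDB's actual classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical label for the timeout plan; not an LLDB identifier.
const std::string kTimeoutPlan = "SingleThreadTimeout";

// Toy model of a thread plan stack; the last element is the leaf plan.
struct PlanStack {
  std::vector<std::string> plans;

  // On resume: make sure the timeout plan sits on top as the leaf plan.
  void OnResume() {
    if (plans.empty() || plans.back() != kTimeoutPlan)
      plans.push_back(kTimeoutPlan);
    assert(plans.back() == kTimeoutPlan);
  }

  // On an internal stop: the timeout plan always claims the stop and always
  // reports itself as finished, so it is popped off the stack.
  void OnInternalStop() {
    if (!plans.empty() && plans.back() == kTimeoutPlan)
      plans.pop_back();
  }
};
```

In this model the timeout plan never outlives a stop, so each resume starts with a fresh plan, which matches the "reset" wording in item 1.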

The detailed design is discussed in the RFC below:

[https://discourse.llvm.org/t/improve-single-thread-stepping/74599](https://discourse.llvm.org/t/improve-single-thread-stepping/74599)

---------

Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
2024-08-05 17:26:39 -07:00


#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

std::mutex mtx;
std::condition_variable cv;
int ready_thread_id = 0;
int signal_main_thread = 0;

void worker(int id) {
  std::cout << "Worker " << id << " executing..." << std::endl;

  // lldb test should change signal_main_thread to true to break the loop.
  while (!signal_main_thread) {
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }

  // Signal the main thread to continue
  {
    std::lock_guard<std::mutex> lock(mtx);
    ready_thread_id = id; // break worker thread here
  }
  cv.notify_one();

  std::this_thread::sleep_for(std::chrono::seconds(1));
  std::cout << "Worker " << id << " finished." << std::endl;
}

void deadlock_func(std::unique_lock<std::mutex> &lock) {
  int i = 10;
  ++i;             // Set interrupt breakpoint here
  printf("%d", i); // Finish step-over from inner breakpoint
  auto func = [] { return ready_thread_id == 1; };
  cv.wait(lock, func);
}

int simulate_thread() {
  std::thread t1(worker, 1);

  std::unique_lock<std::mutex> lock(mtx);
  deadlock_func(lock); // Set breakpoint1 here

  std::thread t2(worker, 2); // Finish step-over from breakpoint1

  cv.wait(lock, [] { return ready_thread_id == 2; });

  t1.join();
  t2.join();

  std::cout << "Main thread continues..." << std::endl;

  return 0;
}

int bar() { return 54; }

int foo(const std::string p1, int extra) { return p1.size() + extra; }

int main(int argc, char *argv[]) {
  std::string ss = "this is a string for testing",
              ls = "this is a long string for testing";
  foo(ss.size() % 2 == 0 ? ss : ls, bar()); // Set breakpoint2 here
  simulate_thread();                        // Finish step-over from breakpoint2

  return 0;
}