When multiple threads are stopped at the same breakpoint, LLDB currently
steps each thread over the breakpoint one at a time. Each step requires
disabling the breakpoint, single-stepping one thread, and re-enabling
it, resulting in N disable/enable cycles and N individual vCont packets
for N threads. This is a common scenario for hot breakpoints in
multithreaded programs and scales poorly.
This patch batches the step-over so that all threads at the same
breakpoint site are stepped together in a single vCont packet, with the
breakpoint disabled once at the start and re-enabled once after the last
thread finishes.
At the top of WillResume, any leftover StepOverBreakpoint plans from a
previous cycle are popped with their re-enable side effect suppressed
via SetReenabledBreakpointSite, giving a clean slate.
SetupToStepOverBreakpointIfNeeded then creates fresh plans for all
threads that still need to step over a breakpoint, and these are grouped
by breakpoint address.
For groups with multiple threads, each plan is set to defer its
re-enable through SetDeferReenableBreakpointSite. Instead of re-enabling
the breakpoint directly when a plan completes, it calls
ThreadFinishedSteppingOverBreakpoint, which decrements a per-address
tracking count. The breakpoint is only re-enabled when the count reaches
zero.
All threads in the largest group are resumed together in a single
batched vCont packet. If some threads don't complete their step in one
cycle, the pop-and-recreate logic naturally re-batches the remaining
threads on the next WillResume call.
For 10 threads at the same breakpoint, this reduces the operation from
10 z0/Z0 pairs and 10 vCont packets to 1 z0 + 1 Z0 and a few
progressively smaller batched vCont packets.
EDIT:
Tried to merge this PR twice, the first time the test was flaky so we
had to revert. The second time, we broke 2 tests on windows machine:
https://lab.llvm.org/buildbot/#/builders/141/builds/15798
The tests that were failing were failing because the cleanup code in
`WillResume` was popping **ALL** `StepOverBreakpoint` plans, including
non-deferred ones from incomplete single-steps.
The issue was:
1) Multiple threads hit the same breakpoint. One thread's breakpoint
condition evaluates to false, so it needs to auto-continue.
2) A `StepOverBreakpoint` plan is created for that thread
(non-deferred).
3) On the next WillResume, the cleanup pops that non-deferred plan.
4) Now the `StopOthers` scan finds no thread with a StopOthers() plan,
so thread_to_run stays null.
5) The else branch runs, calling `SetupToStepOverBreakpointIfNeeded` on
**ALL** threads, including the thread that legitimately hit the
breakpoint with a true condition.
6) That thread gets a new `StepOverBreakpoint` plan pushed, which
overwrites its breakpoint stop reason with trace when the step
completes.
The error `trace (2) != breakpoint (3)` confirms this, the thread that
should have reported breakpoint as its stop reason instead reports
trace, because an unwanted `StepOverBreakpoint` plan was pushed on it
and completed.
The newly added code fixes it by only popping plans that have
`GetDeferReenableBreakpointSite() == true`
Co-authored-by: Bar Soloveychik <barsolo@fb.com>
Re-land https://github.com/llvm/llvm-project/pull/180101 since it was
reverted here https://github.com/llvm/llvm-project/pull/182431 because
of a flaky test. This PR include the modified test that should pass from
https://github.com/llvm/llvm-project/pull/182415 :
When multiple threads are stopped at the same breakpoint, LLDB currently
steps each thread over the breakpoint one at a time. Each step requires
disabling the breakpoint, single-stepping one thread, and re-enabling
it, resulting in N disable/enable cycles and N individual vCont packets
for N threads. This is a common scenario for hot breakpoints in
multithreaded programs and scales poorly.
This patch batches the step-over so that all threads at the same
breakpoint site are stepped together in a single vCont packet, with the
breakpoint disabled once at the start and re-enabled once after the last
thread finishes.
At the top of WillResume, any leftover StepOverBreakpoint plans from a
previous cycle are popped with their re-enable side effect suppressed
via SetReenabledBreakpointSite, giving a clean slate.
SetupToStepOverBreakpointIfNeeded then creates fresh plans for all
threads that still need to step over a breakpoint, and these are grouped
by breakpoint address.
For groups with multiple threads, each plan is set to defer its
re-enable through SetDeferReenableBreakpointSite. Instead of re-enabling
the breakpoint directly when a plan completes, it calls
ThreadFinishedSteppingOverBreakpoint, which decrements a per-address
tracking count. The breakpoint is only re-enabled when the count reaches
zero.
All threads in the largest group are resumed together in a single
batched vCont packet. If some threads don't complete their step in one
cycle, the pop-and-recreate logic naturally re-batches the remaining
threads on the next WillResume call.
For 10 threads at the same breakpoint, this reduces the operation from
10 z0/Z0 pairs and 10 vCont packets to 1 z0 + 1 Z0 and a few
progressively smaller batched vCont packets.
Co-authored-by: Bar Soloveychik <barsolo@fb.com>
PR to fix failing test from
https://github.com/llvm/llvm-project/pull/180101 .
Fix the integration test to be resilient to non-deterministic thread
timing. Instead of requiring exact z0/Z0 counts, verify that batching
reduced toggles compared to one at a time stepping.
Also added: skip on `aarch64` where thread scheduling makes batching
unreliable.
Ran the test 20 times, passed all 20.
Co-authored-by: Bar Soloveychik <barsolo@fb.com>
Following up from
https://discourse.llvm.org/t/improving-performance-of-multiple-threads-stepping-over-the-same-breakpoint/89637
When multiple threads are stopped at the same breakpoint, LLDB currently
steps each thread over the breakpoint one at a time. Each step requires
disabling the breakpoint, single-stepping one thread, and re-enabling
it, resulting in N disable/enable cycles and N individual vCont packets
for N threads.
Now we batch the step-over so that all threads at the same breakpoint
site are stepped together in a single vCont packet, with the breakpoint
disabled once at the start and re-enabled once after the last thread
finishes.
When we hit `WillResume` any leftover `StepOverBreakpoint` plans from a
previous cycle are popped with their re-enable side effect suppressed
via `SetReenabledBreakpointSite`, giving a clean slate.
`SetupToStepOverBreakpointIfNeeded` then creates fresh plans for all
threads that still need to step over a breakpoint, and these are grouped
by breakpoint address.
For groups with multiple threads, each plan is set to defer its
re-enable through `SetDeferReenableBreakpointSite`. Instead of
re-enabling the breakpoint directly when a plan completes, it calls
`ThreadFinishedSteppingOverBreakpoint`, which decrements a tracking
count per address. The breakpoint is only re-enabled when the count
reaches zero.
All threads in the largest group are resumed together in a single
batched vCont packet. If some threads don't complete their step in one
cycle, the pop-and-recreate logic naturally re-batches the remaining
threads on the next WillResume call.
Claude AI assisted in syntex fixes and making the comments more
understandable.
---------
Co-authored-by: Bar Soloveychik <barsolo@fb.com>