The test did not work as intended when the empty function `done()`
contained prologue/epilogue code, because a breakpoint was set before
the last instruction of the function, which caused the test to pass even
with the fix from #126838 having been reverted.
The test is intended to check a case when a breakpoint is set on a
return instruction, which is the very last instruction of a function.
When stepping out from this breakpoint, there is interaction between
`ThreadPlanStepOut` and `ThreadPlanStepOverBreakpoint` that could lead
to missing the stop location in the outer frame; the detailed
explanation can be found in #126838.
On `Linux/AArch64`, the source is compiled into:
```
> objdump -d main.o
0000000000000000 <done>:
0: d65f03c0 ret
```
So, when the command `bt set -n done` from the original test sets a
breakpoint to the first instruction of `done()`, this instruction
luckily also happens to be the last one.
On `Linux/x86_64`, it compiles into:
```
> objdump -d main.o
0000000000000000 <done>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 5d pop %rbp
5: c3 ret
```
In this case, setting a breakpoint by function name means setting it
several instructions before `ret`, which does not provoke the
interaction between `ThreadPlanStepOut` and
`ThreadPlanStepOverBreakpoint`.
I will be changing breakpoint hitting behavior soon, where currently
lldb reports a breakpoint as being hit when a thread is *at* a
BreakpointSite, but possibly has not executed the breakpoint instruction
and trapped yet, to having lldb only report a breakpoint hit when the
breakpoint instruction has actually been executed.
One corner case bug with this change is that when you are stopped at a
breakpoint (that has been hit) on the last instruction of a function,
and you do `finish`, a ThreadPlanStepOut is pushed to the thread's plan
stack to put a breakpoint on the return address and resume execution.
And when the thread is asked to resume, it sees that it is at a
BreakpointSite that has been hit, and pushes a
ThreadPlanStepOverBreakpoint on the thread. The StepOverBreakpoint
plan sees that the thread's state is eStateRunning (not eStateStepping),
so it marks itself as "auto continue" -- so once the breakpoint has
been stepped over, we will execution on the thread.
With current lldb stepping behavior ("a thread *at* a BreakpointSite is
said to have stopped with a breakpoint-hit stop reason, even if the
breakpoint hasn't been executed yet"),
`ThreadPlanStepOverBreakpoint::DoPlanExplainsStop` has a special bit of
code which detects when the thread stops with a eStopReasonBreakpoint.
It first checks if the pc is the same as when we started -- did our
"step instruction" not actually step? -- says the stop reason is
explained. Otherwise it sets auto-continue to false (because we've hit
an *unexpected* breakpoint, and we have advanced past our original pc,
and returns false - the stop reason is not explained.
So we do the "finish", lldb instruction steps, we stop *at* the
return-address breakpoint and lldb sets the thread's stop reason to
breakpoint-hit. ThreadPlanStepOverBreakpoint sees an
eStopReasonBreakpoint, sets its auto-continue to false, and says we
stopped for osme reason other than this plan. (and it will also report
`IsPlanStale()==true` so it will remove itself) Meanwhile the
ThreadPlanStepOut sees that it has stopped in the StackID it wanted to
run to, and return success.
This all changes when stopping at a breakpoint site doesn't report
breakpoint-hit until we actually execute the instruction. Now the
ThraedPlanStepOverBreakpoint looks at the thread's stop reason, it's
eStopReasonTrace (we've instruction stepped), and so it leaves its
auto-continue to `true`. ThreadPlanStepOut sees that it has reached its
goal StackID, removes its breakpoint, and says it is done.
Thread::ShouldStop thinks the auto-continue == yes vote from
ThreadPlanStepOverBreakpoint wins, and we lose control of the process.
This patch changes ThreadPlanStepOut to require that *both* (1) we are
at the StackID of the caller function, where we wanted to end up, and
(2) we have actually hit the breakpoint that we inserted.
This in effect means that now lldb instruction-steps over the breakpoint
in the callee function, stops at the return address of the caller
function. StepOverBreakpoint has completed. StepOut is still running,
and we continue the thread again. We immediatley hit the breakpoint
(that we're sitting at), and now ThreadPlanStepOut marks itself as
completed, and we return control to the user.
Jim suggests that ThreadPlanStepOverBreakpoint is a bit unusual because
it's not something pushed on the stack by a higher-order thread plan
that "owns" it, it is inserted by the Thread as it is about to resume,
if we're at a BreakpointSite. It has no connection to the thread plans
above it, but tries to set the auto-continue mode based on the state of
the thread when it is inserted (and tries to detect an unexpected
breakpoint and unset that auto-continue it previously decided on,
because it now realizes it should not influence execution control any
more). Instead maybe the
ThreadPlanStepOverBreakpoint should be inserted as a child plan of
whatever the lowest plan is on the stack at the point it is added.
I added an API test that will catch this bug in the new thread
breakpoint algorithm.