The x86 assembly instruction scanner creates incorrect UnwindPlans when
a mid-function epilogue has a non-epilogue instruction in it.
The x86 instruction analysis which creates an UnwindPlan handles
mid-function epilogues by tracking "epilogue instructions" (register
loads from stack, stack pointer increasing, etc) and any UnwindPlan
updates which are NOT epilogue instructions update the "prologue
UnwindPlan" saved row. It detects a LEAVE/RET/unconditional JMP out of
the function and after that instruction, re-instates the "prologue Row".
There's a parallel piece of data tracked across the duration of the
function, current_sp_bytes_offset_from_fa, and we reflect the "value
after prologue instructions" in
prologue_completed_sp_bytes_offset_from_cfa. When the CFA is calculated
in terms of the frame pointer ($ebp/$rbp), we don't add changes to the
stack pointer to the UnwindPlan, so this separate mechanism is used for
the "current value" and the "last value at prologue setup".
(the x86 UnwindPlan generated writes "sp=CFA+0" as a register rule which
is formally correct, but it could also track the stack pointer value as
sp=$rsp+<x> and update this register rule every time $rsp is modified.)
This leads to a bug when there is an instruction in an epilogue which
isn't recognzied as an epilogue instruction.
prologue_completed_sp_bytes_offset_from_cfa is always set to the value
of current_sp_bytes_offset_from_fa unless the current instruction is an
epilogue instruction. With a non-epilogue instruction in the middle of
the epilogue, we suddenly copy a current_sp_bytes_offset_from_fa value
from the middle of the epilogue into this
prologue_completed_sp_bytes_offset_from_cfa. Once the epilogue is
finished, we restore the "prologue Row" and
prologue_completed_sp_bytes_offset_from_cfa. But now $rsp has a very
incorrect value in it.
This patch tracks when we've updated current_sp_bytes_offset_from_fa in
the current instruction analysis. If it was updated looking at an
epilogue instruction, `is_epilogue` will be set correctly. Otherwise
it's a "prologue" instruction and we should update
prologue_completed_sp_bytes_offset_from_cfa. Any instruction that is
unrecognized will leave prologue_completed_sp_bytes_offset_from_cfa
unmodified.
The actual instruction we hit this with was a BTRQ but I added a NOP to
the unit test which is only 1 byte and made the update to the unit test
a little simpler. This bug is hit with a NOP just as well.
UnwindAssemblyInstEmulation has a much better algorithm for handling
mid-function epilogues, which "forward" the current unwind state Row
when it sees branches within the function, to the target instruction
offset. This avoids detecting prologue/epilogue instructions altogether.
rdar://137153323
lldb has two RegisterLocation classes that do slightly different things.
UnwindPlan::Row::RegisterLocation (new: AbstractRegisterLocation) has a
description of how to find a register's value or location, not specific
to a particular stopping point. It may say that at a given offset into a
function, the caller's register has been spilled to stack memory at CFA
minus an offset. Or it may say that the caller's register is at a DWARF
exprssion.
UnwindLLDB::RegisterLocation (new: ConcreteRegisterLocation) is a
specific address where the register is currently stored, or the register
it has been copied into, or its value at this point in the current
function execution.
When lldb stops in a function, it interprets the
AbstractRegisterLocation's instructions using the register context and
stack memory, to create the ConcreteRegisterLocation at this point in
time for this stack frame.
I'm not thrilled with AbstractRegisterLocation and
ConcreteRegisterLocation, but it's better than the same name and it will
be easier to update them if someone suggests a better pair.
…unction (#91321)"
This reapplies fd1bd53ba5a06f344698a55578f6a5d79c457e30, which was
reverted due to a test failure on aarch64/windows. The failure was
caused by a combination of several factors:
- clang targeting aarch64-windows (unlike msvc, and unlike clang
targeting other aarch64 platforms) defaults to -fomit-frame-pointers
- lldb's code for looking up register values for `<same>` unwind rules
is recursive
- the test binary creates a very long chain of fp-less function frames
(it manages to fit about 22k frames before it blows its stack)
Together, these things have caused lldb to recreate the same deep
recursion when unwinding through this, and blow its own stack as well.
Since lldb frames are larger, about 4k frames like this was sufficient
to trigger the stack overflow.
This version of the patch works around this problem by increasing the
frame size of the test binary, thereby causing it to blow its stack
sooner. This doesn't fix the issue -- the same problem can occur with a
real binary -- but it's not very likely, as it requires an infinite
recursion in a simple (so it doesn't use the frame pointer) function
with a very small frame (so you can fit a lot of them on the stack).
A more principled fix would be to make lldb's lookup code non-recursive,
but I believe that's out of scope for this patch.
The original patch description follows:
A leaf function may not store the link register to stack, but we it can
still end up being a non-zero frame if it gets interrupted by a signal.
Currently, we were unable to unwind past this function because we could
not read the link register value.
To make this work, this patch:
- changes the function-entry unwind plan to include the `fp|lr = <same>`
rules. This in turn necessitated an adjustment in the generic
instruction emulation logic to ensure that `lr=[sp-X]` can override the
`<same>` rule.
- allows the `<same>` rule for pc and lr in all
`m_all_registers_available` frames (and not just frame zero).
The test verifies that we can unwind in a situation like this, and that
the backtrace matches the one we computed before getting a signal.
This reverts commit fd1bd53ba5a06f344698a55578f6a5d79c457e30.
TestInterruptBacktrace was broken on AArch64/Windows as a result of this change.
See lldb-aarch64-windows buildbot here:
https://lab.llvm.org/buildbot/#/builders/219/builds/11261
A leaf function may not store the link register to stack, but we it can
still end up being a non-zero frame if it gets interrupted by a signal.
Currently, we were unable to unwind past this function because we could
not read the link register value.
To make this work, this patch:
- changes the function-entry unwind plan to include the `fp|lr = <same>`
rules. This in turn necessitated an adjustment in the generic
instruction emulation logic to ensure that `lr=[sp-X]` can override the
`<same>` rule.
- allows the `<same>` rule for pc and lr in all
`m_all_registers_available` frames (and not just frame zero).
The test verifies that we can unwind in a situation like this, and that
the backtrace matches the one we computed before getting a signal.
If `LLVM_TARGETS_TO_BUILD` does not contain `X86` and we try to run an
x86 binary in lldb, we get a `nullptr` dereference in
`LLVMDisasmInstruction(...)`. We try to call `getDisAsm()` method on a
`LLVMDisasmContext *DC` which is null. The pointer is passed from
`x86AssemblyInspectionEngine::instruction_length(...)` and is originally
`m_disasm_context` member of `x86AssemblyInspectionEngine`. This should
be filled by `LLVMCreateDisasm(...)` in the class constructor, but not
having X86 target enabled in llvm makes
`TargetRegistry::lookupTarget(...)` call return `nullptr`, which results
in `m_disasm_context` initialized with `nullptr` as well.
This patch adds if statements against `m_disasm_context` in
`x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(...)`
and `x86AssemblyInspectionEngine::FindFirstNonPrologueInstruction(...)`
so subsequent calls to
`x86AssemblyInspectionEngine::instruction_length(...)` do not cause a
null pointer dereference.
Walter Erquinigo added optional instruction annotations for x86
instructions in 2022 for the `thread trace dump instruction` command,
and code to DisassemblerLLVMC to add annotations for instructions that
change flow control, v. https://reviews.llvm.org/D128477
This was added as an option to `disassemble`, and the trace dump command
enables it by default, but several other instruction dumpers were
changed to display them by default as well. These are only implemented
for Intel instructions, so our disassembly on other targets ends up
looking like
```
(lldb) x/5i 0x1000086e4
0x1000086e4: 0xa9be6ffc unknown stp x28, x27, [sp, #-0x20]!
0x1000086e8: 0xa9017bfd unknown stp x29, x30, [sp, #0x10]
0x1000086ec: 0x910043fd unknown add x29, sp, #0x10
0x1000086f0: 0xd11843ff unknown sub sp, sp, #0x610
0x1000086f4: 0x910c63e8 unknown add x8, sp, #0x318
```
instead of `disassemble`'s output style of
```
lldb`main:
lldb[0x1000086e4] <+0>: stp x28, x27, [sp, #-0x20]!
lldb[0x1000086e8] <+4>: stp x29, x30, [sp, #0x10]
lldb[0x1000086ec] <+8>: add x29, sp, #0x10
lldb[0x1000086f0] <+12>: sub sp, sp, #0x610
lldb[0x1000086f4] <+16>: add x8, sp, #0x318
```
Adding symbolic annotations for assembly instructions is something I'm
interested in too, because we may have users investigating a crash or
apparent-incorrect behavior who must debug optimized assembly and they
may not be familiar with the ISA they're using, so short of flipping
through a many-thousand-page PDF to understand each instruction, they're
lost. They don't write assembly or work at that level, but to understand
a bug, they have to understand what the instructions are actually doing.
But the annotations that exist today don't move us forward much on that
front - I'd argue that the flow control instructions on Intel are not
hard to understand from their names, but that might just be my personal
bias. Much trickier instructions exist in any event.
Displaying this information by default for all targets when we only have
one class of instructions on one target is not a good default.
Also, in 2011 when Greg implemented the `memory read -f i` (aka `x/i`)
command
```
commit 5009f9d5010a7e34ae15f962dac8505ea11a8716
Author: Greg Clayton <gclayton@apple.com>
Date: Thu Oct 27 17:55:14 2011 +0000
[...]
eFormatInstruction will print out disassembly with bytes and it will use the
current target's architecture. The format character for this is "i" (which
used to be being used for the integer format, but the integer format also has
"d", so we gave the "i" format to disassembly), the long format is
"instruction".
```
he had DumpDataExtractor's DumpInstructions print the bytes of the
instruction -- that's the first field we see above for the `x/5i` after
the address -- and this is only useful for people who are debugging the
disassembler itself, I would argue. I don't want this displayed by
default either.
tl;dr this patch removes both fields from `memory read -f -i` and I
think this is the right call today. While I'm really interested in
instruction annotation, I don't think `x/i` is the right place to have
it enabled by default unless it's really compelling on at least some of
our major targets.
This is the first step to being able to handle non
trivial types in the union.
info_type effects the lifetime of the objects in the union,
so making it private means we know you have to call one of the
Set<...> functions to change it.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D134039
Slava Gurevich noticed this dead code I wrote in jmp_to_reg_p
is never executed, is duplicated, and has comments that seem to
describe the opposite behavior. Remove the dead code that cannot
be executed.
Differential Revision: https://reviews.llvm.org/D131029
Fixing potential int overflow and uninitialized variables.
These were found by Coverity static code inspection.
Differential Revision: https://reviews.llvm.org/D130795
In UnwindAssemblyInstEmulation we correctly recognize when a LDP
restores the fp & lr in an epilogue, and mark them as having the
caller's contents now, but we don't update the CFA register rule
at that point to indicate that the CFA is now calculated in terms
of $sp. This doesn't impact the backtrace because the register
contents are all <same> now, but it can confuse the stepper when
the StackID changes mid-epilogue.
Differential Revision: https://reviews.llvm.org/D124492
rdar://92064415
Most of our code was including Log.h even though that is not where the
"lldb" log channel is defined (Log.h defines the generic logging
infrastructure). This worked because Log.h included Logging.h, even
though it should.
After the recent refactor, it became impossible the two files include
each other in this direction (the opposite inclusion is needed), so this
patch removes the workaround that was put in place and cleans up all
files to include the right thing. It also renames the file to LLDBLog to
better reflect its purpose.
There is no reason why this function should be returning a ConstString.
While modifying these files, I also fixed several instances where
GetPluginName and GetPluginNameStatic were returning different strings.
I am not changing the return type of GetPluginNameStatic in this patch, as that
would necessitate additional changes, and this patch is big enough as it is.
Differential Revision: https://reviews.llvm.org/D111877
In all these years, we haven't found a use for this function (it has
zero callers). Lets just remove the boilerplate.
Differential Revision: https://reviews.llvm.org/D109600
This converts a default constructor's member initializers into C++11
default member initializers. This patch was automatically generated with
clang-tidy and the modernize-use-default-member-init check.
$ run-clang-tidy.py -header-filter='lldb' -checks='-*,modernize-use-default-member-init' -fix
This is a mass-refactoring patch and this commit will be added to
.git-blame-ignore-revs.
Differential revision: https://reviews.llvm.org/D103483
Commiting this patch for Augusto Noronha who is getting set
up still.
This patch changes Target::ReadMemory so the default behavior
when a read is in a Section that is read-only is to fetch the
data from the local binary image, instead of reading it from
memory. Update all callers to use their old preferences
(the old prefer_file_cache bool) using the new API; we should
revisit these calls and see if they really intend to read
live memory, or if reading from a read-only Section would be
equivalent and important for performance-sensitive cases.
rdar://30634422
Differential revision: https://reviews.llvm.org/D100338
Summary:
This fixes a bug in the logic for choosing the unwind plan. Based on the
comment in UnwindAssembly-x86, the intention was that a plan which
describes the function epilogue correctly does not need to be augmented
(and it should be used directly). However, the way this was implemented
(by returning false) meant that the higher level code
(FuncUnwinders::GetEHFrameAugmentedUnwindPlan) interpreted this as a
failure to produce _any_ plan and proceeded with other fallback options.
The fallback usually chosed for "asynchronous" plans was the
"instruction emulation" plan, which tended to fall over on certain
functions with multiple epilogues (that's a separate bug).
This patch simply changes the function to return true, which signals the
caller that the unmodified plan is ready to be used.
The attached test case demonstrates the case where we would previously
fall back to the instruction emulation plan, and unwind incorrectly --
the test asserts that the "augmented" eh_frame plan is used, and that
the unwind is correct.
Reviewers: jasonmolenda, jankratochvil
Subscribers: davide, echristo, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D82378
Fix a bug where UnwindAssemblyInstEmulation would confuse which
register is used to compute the Canonical Frame Address after it
had branched over a mid-function epilogue (where the CFA reg changes
from $fp to $sp in the process of epiloguing). Reinstate the
correct CFA register after we forward the unwind rule for branch
targets. The failure mode was that UnwindAssemblyInstEmulation
would think CFA was set in terms of $sp after one of these epilogues,
and if it sees modifications to $sp after the branch target, it would
change the CFA offset in the unwind rule -- even though the CFA is
defined in terms of $fp and the $sp changes are irrelevant to correct
calculation.
<rdar://problem/60300528>
Differential Revision: https://reviews.llvm.org/D78077
LLDB has a few different styles of header guards and they're not very
consistent because things get moved around or copy/pasted. This patch
unifies the header guards across LLDB and converts everything to match
LLVM's style.
Differential revision: https://reviews.llvm.org/D74743
Use LLDB_PLUGIN_DEFINE_ADV to make the name of the generated initializer
match the name of the plugin. This is a step towards generating the
initializers with a def file. I'm landing this change in pieces so I can
narrow down what exactly breaks the Windows bot.
This patch changes the way we initialize and terminate the plugins in
the system initializer. It uses an approach similar to LLVM's
TARGETS_TO_BUILD with a def file that enumerates the plugins.
The previously landed patch got reverted because it was lacking:
(1) A plugin definition for the Objective-C language runtime,
(2) The dependency between the Static and WASM dynamic loader,
(3) Explicit initialization of ScriptInterpreterNone for lldb-test.
All issues have been addressed in this patch.
Differential revision: https://reviews.llvm.org/D73067
This patch changes the way we initialize and terminate the plugins in
the system initializer. It uses an approach similar to LLVM's
TARGETS_TO_BUILD with a def file that enumerates the plugins.
Differential revision: https://reviews.llvm.org/D73067
This is a step towards making the initialize and terminate calls be
generated by CMake, which in turn is towards making it possible to
disable plugins at configuration time.
Differential revision: https://reviews.llvm.org/D74245
Summary:
A *.cpp file header in LLDB (and in LLDB) should like this:
```
//===-- TestUtilities.cpp -------------------------------------------------===//
```
However in LLDB most of our source files have arbitrary changes to this format and
these changes are spreading through LLDB as folks usually just use the existing
source files as templates for their new files (most notably the unnecessary
editor language indicator `-*- C++ -*-` is spreading and in every review
someone is pointing out that this is wrong, resulting in people pointing out that this
is done in the same way in other files).
This patch removes most of these inconsistencies including the editor language indicators,
all the different missing/additional '-' characters, files that center the file name, missing
trailing `===//` (mostly caused by clang-format breaking the line).
Reviewers: aprantl, espindola, jfb, shafik, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: dexonsmith, wuzish, emaste, sdardis, nemanjai, kbarton, MaskRay, atanasyan, arphaman, jfb, abidh, jsji, JDevlieghere, usaxena95, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73258
Unwind plan augmentation should compute the plan row at offset x from
the instruction before offset x, but currently we compute it from the
instruction at offset x. Note that this behavior is a regression
introduced when moving the x86 assembly inspection engine to its own
file
(1c9858b298 (diff-375a2be066db6f34bb9a71442c9b71fcL913));
the original version handled this properly by copying the previous
instruction out before advancing the instruction pointer.
The relevant bug with more info is here: https://bugs.llvm.org/show_bug.cgi?id=43561
Differential Revision: https://reviews.llvm.org/D68454
Patch by Jaroslav Sevcik <jarin@google.com>.
llvm-svn: 374342
Summary:
Add __kernel_rt_sigreturn to the list of trap handlers for Linux (it's
used as such on aarch64 at least), and __restore_rt as well (used on
x86_64).
Skip decrement-and-recompute for trap handlers in
InitializeNonZerothFrame, as signal dispatch may point the child frame's
return address to the start of the return trampoline.
Parse the 'S' flag for signal handlers from eh_frame augmentation, and
propagate it to the unwind plan.
Reviewers: labath, jankratochvil, compnerd, jfb, jasonmolenda
Reviewed By: jasonmolenda
Subscribers: clayborg, MaskRay, wuzish, nemanjai, kbarton, jrtc27, atanasyan, jsji, javed.absar, kristof.beyls, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D63667
llvm-svn: 366580
The x86 assembly inspection engine has code to support detecting a
mid-function epilogue that ends in a RET instruction; add support for
recognizing an epilogue that ends in a JMP, and add a check that the
unwind state has been restored to the original stack setup; reinstate
the post-prologue unwind state after this JMP instruction.
The assembly inspection engine used for other architectures,
UnwindAssemblyInstEmulation, detects mid-function epilogues by
tracking branch instructions within the function and "forwards"
the current unwind state to the targets of the branches. If
an epilogue unwinds the stack and exits, followed by a branch
target, we get back to the correct unwind state. The x86
unwinder should move to this same algorithm, or possibly even
look at implementing an x86 instruction emulation plugin and
get UnwindAssemblyInstEmulation to work for x86 too. I added
a branch instruction recognizier method that will be necessary
if we want to switch the algorithm.
Differential Revision: https://reviews.llvm.org/D62764
<rdar://problem/51074422>
llvm-svn: 362456
Summary:
NFC = [[ https://llvm.org/docs/Lexicon.html#nfc | Non functional change ]]
This commit is the result of modernizing the LLDB codebase by using
`nullptr` instread of `0` or `NULL`. See
https://clang.llvm.org/extra/clang-tidy/checks/modernize-use-nullptr.html
for more information.
This is the command I ran and I to fix and format the code base:
```
run-clang-tidy.py \
-header-filter='.*' \
-checks='-*,modernize-use-nullptr' \
-fix ~/dev/llvm-project/lldb/.* \
-format \
-style LLVM \
-p ~/llvm-builds/debug-ninja-gcc
```
NOTE: There were also changes to `llvm/utils/unittest` but I did not
include them because I felt that maybe this library shall be updated in
isolation somehow.
NOTE: I know this is a rather large commit but it is a nobrainer in most
parts.
Reviewers: martong, espindola, shafik, #lldb, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: arsenm, jvesely, nhaehnle, hiraditya, JDevlieghere, teemperor, rnkovacs, emaste, kubamracek, nemanjai, ki.stfu, javed.absar, arichardson, kbarton, jrtc27, MaskRay, atanasyan, dexonsmith, arphaman, jfb, jsji, jdoerfert, lldb-commits, llvm-commits
Tags: #lldb, #llvm
Differential Revision: https://reviews.llvm.org/D61847
llvm-svn: 361484