Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#…
(#145959)
This reapplies cbf781f0bdf2f680abbe784faedeefd6f84c246e, with fixes for
the shared-library build and the unconventional sanitizer-runtime build.
Original Description:
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
I am happy to put it in another location, or to structure it differently
if that makes sense. Some have suggested in BinaryFormat, but it is not
a great fit there. But if that makes more sense to the reviewers, I can
do that.
Another possibility would be to use pass-through headers to allow
clients who don't care to depend only on DebugInfo/DWARF. This would be
a much less invasive change, and perhaps easier for clients. But also a
system that hides details.
Either way, I'm open.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
(Revised version of a previous, unreviewed, PR.)
Move all expression verification into its only client: DWARFVerifier.
Move all printing code (which was a mix of static and member functions)
into a separate class.
This is one in a series of refactoring PRs to separate dwarf
functionality into lower-level pieces usable without object files and
sections at build time. The code is already written this way via various
"if (section == nullptr)" and similar conditionals. So the functionality
itself is needed and exists, but only as a runtime feature. The goal of
these refactors is to remove the build-time dependencies, which will
allow the existing functionality to be used from lower-level parts of
the compiler. Particularly from lib/MC/.... More information at:
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665
This is mainly useful for discontinuous functions because individual
parts of the function will have separate FDE entries, which can begin
many megabytes from the start of the function. However, I'm separating
it out, because it turns out we already have a test case for the
situation where the FDE does not begin exactly at the function boundary.
The test works mostly by accident because the FDE starts only one byte
after the beginning of the function so it doesn't really matter whether
one looks up the unwind row using the function or fde offset. In this
patch, I beef up the test to catch this problem more reliably.
To make this work I've also needed to change a couple of places which
that an unwind plan always has a row at offset zero.
`GetLSDAAddress` and `GetPersonalityRoutinePtrAddress` are unused and
they create a bit of a problem for discontinuous functions, because the
unwind plan for these consists of multiple eh_frame descriptors and (at
least in theory) each of them could have a different value for these
entities.
We could say we only support functions for which these are always the
same, or create some sort of a Address2LSDA lookup map, but I think it's
better to leave this question to someone who actually needs this.
This reverts commit
48864a52ef,
reapplying d7cea2b187.
It also fixes the dangling
pointers caused by the previous version by creating copies of the Rows
in x86AssemblyInspectionEngine.
This reverts commit 094904303d50e0ab14bc5f2586a602f79af95953, reapplying
d7afafdbc464e65c56a0a1d77bad426aa7538306 (#133247).
The failure ought to be fixed by
0509932bb6a291ba11253f30c465ab3ad164ae08.
The function had special handling for -1, but that is incompatible with
functions whose entry point is not the first address. Use std::nullopt
instead.
The surrounding code doesn't use them anymore. This removes the internal
usages.
This patch makes the Rows actual values. An alternative would be to make
them unique_ptrs. That would make vector resizes faster at the cost of
more pointer chasing and heap fragmentation. I don't know which one is
better so I picked the simpler option.
To be able to describe discontinuous functions, this patch changes the
UnwindPlan to accept more than one address range.
I've also squeezed in a couple improvements/modernizations, for example
using the lower_bound function instead of a linear scan.
The whole unwind plan is already stored in a shared pointer, and there's
no need to persist Rows individually. If there's ever a need to do that,
there are at least two options:
- copy the row (they're not that big, and they're being copied left and
right during construction already)
- use the shared_ptr subobject constructor to create a shared_ptr which
points to a Row but holds the entire unwind plan alive
This also changes all of the getter functions to return const Row
pointers, which is important for safety because all of these objects are
cached and potentially accessed from multiple threads. (Technically one
could hand out `shared_ptr<const Row>`s, but we don't have a habit of
doing that.)
As a next step, I'd like to remove the internal UnwindPlan usages of the
shared pointer, but I'm doing this separately to gauge feedback, and
also because the patch got rather big.
This is similar to 9fe455fd0c7d, but for FA locations instead of
register locations.
This is useful for unwind plans that cannot create abstract unwind
rules, but instead must inspect the state of the program to determine
the current CFA.
In 206408732bca2ef464732a39c8319d47c8a1dbea I updated the
UnwindPlan header to include a new method, but didn't add
the actual implementation. Fix that.
lldb has two RegisterLocation classes that do slightly different things.
UnwindPlan::Row::RegisterLocation (new: AbstractRegisterLocation) has a
description of how to find a register's value or location, not specific
to a particular stopping point. It may say that at a given offset into a
function, the caller's register has been spilled to stack memory at CFA
minus an offset. Or it may say that the caller's register is at a DWARF
exprssion.
UnwindLLDB::RegisterLocation (new: ConcreteRegisterLocation) is a
specific address where the register is currently stored, or the register
it has been copied into, or its value at this point in the current
function execution.
When lldb stops in a function, it interprets the
AbstractRegisterLocation's instructions using the register context and
stack memory, to create the ConcreteRegisterLocation at this point in
time for this stack frame.
I'm not thrilled with AbstractRegisterLocation and
ConcreteRegisterLocation, but it's better than the same name and it will
be easier to update them if someone suggests a better pair.
This is useful for language runtimes that compute register values by
inspecting the state of the currently running process. Currently, there
are no mechanisms enabling these runtimes to set register values to
arbitrary values.
The alternative considered would involve creating a dwarf expression
that produces an arbitrary integer (e.g. using OP_constu). However, the
current data structure for Rows is such that they do not own any memory
associated with dwarf expressions, which implies any such expression
would need to have static storage and therefore could not contain a
runtime value.
Adding a new rule for constants leads to a simpler implementation. It's
also worth noting that this does not make the "Location" union any
bigger, since it already contains a pointer+size pair.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This patch had to be reverted because on gcc 7.5.0 we see an error converting from std::unique_ptr<MCRegisterInfo> to Expected<std::unique_ptr<MCRegisterInfo>> as the return type for the function createRegInfo. This has now been fixed.
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:
llvm/DebugInfo/DWARF/DWARFContext.h no longer includes:
- "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
- "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAranges.h"
- "llvm/DebugInfo/DWARF/DWARFDebugFrame.h"
- "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
- "llvm/DebugInfo/DWARF/DWARFDebugMacro.h"
- "llvm/DebugInfo/DWARF/DWARFGdbIndex.h"
- "llvm/DebugInfo/DWARF/DWARFSection.h"
- "llvm/DebugInfo/DWARF/DWARFTypeUnit.h"
- "llvm/DebugInfo/DWARF/DWARFUnitIndex.h"
Plus llvm/Support/Errc.h not included by a bunch of llvm/DebugInfo/DWARF/DWARF*.h files
Preprocessed lines to build llvm on my setup:
after: 1065629059
before: 1066621848
Which is a great diff!
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119723
Most of our code was including Log.h even though that is not where the
"lldb" log channel is defined (Log.h defines the generic logging
infrastructure). This worked because Log.h included Logging.h, even
though it should.
After the recent refactor, it became impossible the two files include
each other in this direction (the opposite inclusion is needed), so this
patch removes the workaround that was put in place and cleans up all
files to include the right thing. It also renames the file to LLDBLog to
better reflect its purpose.
This converts a default constructor's member initializers into C++11
default member initializers. This patch was automatically generated with
clang-tidy and the modernize-use-default-member-init check.
$ run-clang-tidy.py -header-filter='lldb' -checks='-*,modernize-use-default-member-init' -fix
This is a mass-refactoring patch and this commit will be added to
.git-blame-ignore-revs.
Differential revision: https://reviews.llvm.org/D103483
Add a new state for UnwindPlan::Row which indicates that any
register not listed is not defined, and should not be found in
stack frames newer than this one and passed up the stack. Mostly
intended for use with architectural default unwind plans that are
used for jitted stack frames, where we have no unwind information
or start address. lldb has no way to tell if registers were
spilled in the jitted frame & overwritten, so passing register
values up the stack is not safe to show the user.
Architectural default unwind plans are also used as a fast unwind
plan on x86_64 in particular, and are used as the fallback unwind
plans when lldb thinks it may be able to work around a problem
which causes the unwinder to stop walking the stack early.
For fast unwind plans, when we don't find a register location in
the arch default unwind plan, we fall back to computing & using
the full unwind plan. One small part of this patch is to know that
a register marked as Undefined in the fast unwind plan is a special
case, and we should continue on to the full unwind plan to find what
the real unwind rule is for this register.
Differential Revision: https://reviews.llvm.org/D96829
<rdar://problem/70398009>
Update the "image show-unwind" command output to show if the function
being shown is listed as a user-setting or platform trap handler.
Update the individual UnwindPlan dumps to show whether the unwind plan
is registered as a trap handler.
The llvm DWARFExpression dump is nearly identical, but better -- for
example it does print a spurious space after zero-argument expressions.
Some parts of our code (variable locations) have been already switched
to llvm-based expression dumping. This switches the remainder: unwind
plans and some unit tests.
Summary:
A *.cpp file header in LLDB (and in LLDB) should like this:
```
//===-- TestUtilities.cpp -------------------------------------------------===//
```
However in LLDB most of our source files have arbitrary changes to this format and
these changes are spreading through LLDB as folks usually just use the existing
source files as templates for their new files (most notably the unnecessary
editor language indicator `-*- C++ -*-` is spreading and in every review
someone is pointing out that this is wrong, resulting in people pointing out that this
is done in the same way in other files).
This patch removes most of these inconsistencies including the editor language indicators,
all the different missing/additional '-' characters, files that center the file name, missing
trailing `===//` (mostly caused by clang-format breaking the line).
Reviewers: aprantl, espindola, jfb, shafik, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: dexonsmith, wuzish, emaste, sdardis, nemanjai, kbarton, MaskRay, atanasyan, arphaman, jfb, abidh, jsji, JDevlieghere, usaxena95, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73258
Summary:
Windows unwinding is weird. The unwind rules do not (always) describe
the precise layout of the stack, but rather expect the debugger to scan
the stack for something which looks like a plausible return address, and
the unwind based on that. The reason this works somewhat reliably is
because the the unwinder also has access to the frame sizes of the
functions on the stack. This allows it (in most cases) to skip function
pointers in local variables or function arguments, which could otherwise
be mistaken for return addresses.
Implementing this kind of unwind mechanism in lldb was a bit challenging
because we expect to be able to statically describe (in the UnwindPlan)
structure, the layout of the stack for any given instruction. Giving a
precise desription of this is not possible, because it requires
correlating information from two functions -- the pushed arguments to a
function are considered a part of the callers stack frame, and their
size needs to be considered when unwinding the caller, but they are only
present in the unwind entry of the callee. The callee may end up being
in a completely different module, or it may not even be possible to
determine it statically (indirect calls).
This patch implements this functionality by introducing a couple of new
APIs:
SymbolFile::GetParameterStackSize - return the amount of stack space
taken up by parameters of this function.
SymbolFile::GetOwnFrameSize - the size of this function's frame. This
excludes the parameters, but includes stuff like local variables and
spilled registers.
These functions are then used by the unwinder to compute the estimated
location of the return address. This address is not always exact,
because the stack may contain some additional values -- for instance, if
we're getting ready to call a function then the stack will also contain
partially set up arguments, but we will not know their size because we
haven't called the function yet. For this reason the unwinder will crawl
up the stack from the return address position, and look for something
that looks like a possible return address. Currently, we assume that
something is a valid return address if it ends up pointing to an
executable section.
All of this logic kicks in when the UnwindPlan sets the value of CFA as
"isHeuristicallyDetected", which is also the final new API here. Right
now, only SymbolFileBreakpad implements these APIs, but in the future
SymbolFilePDB will use them too.
Differential Revision: https://reviews.llvm.org/D66638
llvm-svn: 373072
This patch replaces explicit calls to log::Printf with the new LLDB_LOGF
macro. The macro is similar to LLDB_LOG but supports printf-style format
strings, instead of formatv-style format strings.
So instead of writing:
if (log)
log->Printf("%s\n", str);
You'd write:
LLDB_LOG(log, "%s\n", str);
This change was done mechanically with the command below. I replaced the
spurious if-checks with vim, since I know how to do multi-line
replacements with it.
find . -type f -name '*.cpp' -exec \
sed -i '' -E 's/log->Printf\(/LLDB_LOGF\(log, /g' "{}" +
Differential revision: https://reviews.llvm.org/D65128
llvm-svn: 366936
Summary:
Previously we were printing the dwarf expressions in unwind rules simply
as "dwarf-expr". This patch uses the existing dwarf-printing
capabilities in lldb to enhance this dump output, and print the full
decoded dwarf expression.
Reviewers: jasonmolenda, clayborg
Subscribers: aprantl, lldb-commits
Differential Revision: https://reviews.llvm.org/D60949
llvm-svn: 358959
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
This patch fixes issues with a stack realignment.
MSVC maintains two frame pointers (`ebx` and `ebp`) for a realigned stack - one
is used for access to function parameters, while another is used for access to
locals. To support this the patch:
- adds an alternative frame pointer (`ebx`);
- considers stack realignment instructions (e.g. `and esp, -32`);
- along with CFA (Canonical Frame Address) which point to the position next to
the saved return address (or to the first parameter on the stack) introduces
AFA (Aligned Frame Address) which points to the position of the stack pointer
right after realignment. AFA is used for access to registers saved after the
realignment (see the test);
Here is an example of the code with the realignment:
```
struct __declspec(align(256)) OverAligned {
char c;
};
void foo(int foo_arg) {
OverAligned oa_foo = { 1 };
auto aaa_foo = 1234;
}
void bar(int bar_arg) {
OverAligned oa_bar = { 2 };
auto aaa_bar = 5678;
foo(1111);
}
int main() {
bar(2222);
return 0;
}
```
and here is the `bar` disassembly:
```
push ebx
mov ebx, esp
sub esp, 8
and esp, -100h
add esp, 4
push ebp
mov ebp, [ebx+4]
mov [esp+4], ebp
mov ebp, esp
sub esp, 200h
mov byte ptr [ebp-200h], 2
mov dword ptr [ebp-4], 5678
push 1111 ; foo_arg
call j_?foo@@YAXH@Z ; foo(int)
add esp, 4
mov esp, ebp
pop ebp
mov esp, ebx
pop ebx
retn
```
Reviewers: labath, zturner, jasonmolenda, stella.stamenova
Reviewed By: jasonmolenda
Subscribers: abidh, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D53435
llvm-svn: 345577
This is intended as a clean up after the big clang-format commit
(r280751), which unfortunately resulted in many of the comment
paragraphs in LLDB being very hard to read.
FYI, the script I used was:
import textwrap
import commands
import os
import sys
import re
tmp = "%s.tmp"%sys.argv[1]
out = open(tmp, "w+")
with open(sys.argv[1], "r") as f:
header = ""
text = ""
comment = re.compile(r'^( *//) ([^ ].*)$')
special = re.compile(r'^((([A-Z]+[: ])|([0-9]+ )).*)|(.*;)$')
for line in f:
match = comment.match(line)
if match and not special.match(match.group(2)):
# skip intentionally short comments.
if not text and len(match.group(2)) < 40:
out.write(line)
continue
if text:
text += " " + match.group(2)
else:
header = match.group(1)
text = match.group(2)
continue
if text:
filled = textwrap.wrap(text, width=(78-len(header)),
break_long_words=False)
for l in filled:
out.write(header+" "+l+'\n')
text = ""
out.write(line)
os.rename(tmp, sys.argv[1])
Differential Revision: https://reviews.llvm.org/D46144
llvm-svn: 331197
All references to Host and Core have been removed, so this
class can now safely be lowered into Utility.
Differential Revision: https://reviews.llvm.org/D30559
llvm-svn: 296909
This moves the following classes from Core -> Utility.
ConstString
Error
RegularExpression
Stream
StreamString
The goal here is to get lldbUtility into a state where it has
no dependendencies except on itself and LLVM, so it can be the
starting point at which to start untangling LLDB's dependencies.
These are all low level and very widely used classes, and
previously lldbUtility had dependencies up to lldbCore in order
to use these classes. So moving then down to lldbUtility makes
sense from both the short term and long term perspective in
solving this problem.
Differential Revision: https://reviews.llvm.org/D29427
llvm-svn: 293941