The "process metadata" LC_NOTE allows for thread IDs to be specified in
a Mach-O corefile. This extends the JSON recognzied in that LC_NOTE to
allow for additional registers to be supplied on a per-thread basis.
The registers included in a Mach-O corefile LC_THREAD load command can
only be one of the register flavors that the kernel (xnu) defines in
<mach/arm/thread_status.h> for arm64 -- the general purpose registers,
floating point registers, exception registers.
JTAG style corefile producers may have access to many additional
registers beyond these that EL0 programs typically use, for instance
TCR_EL1 on AArch64, and people developing low level code need access to
these registers. This patch defines a format for including these
registers for any thread.
The JSON in "process metadata" is a dictionary that must have a
`threads` key. The value is an array of entries, one per LC_THREAD in
the Mach-O corefile. The number of entries must match the LC_THREADs so
they can be correctly associated.
Each thread's dictionary must have two keys, `sets`, and `registers`.
`sets` is an array of register set names. If a register set name matches
one from the LC_THREAD core registers, any registers that are defined
will be added to that register set. e.g. metadata can add a register to
the "General Purpose Registers" set that lldb shows users.
`registers` is an array of dictionaries, one per register. Each register
must have the keys `name`, `value`, `bitsize`, and `set`. It may provide
additional keys like `alt-name`, that
`DynamicRegisterInfo::SetRegisterInfo` recognizes.
This `sets` + `registers` formatting is the same that is used by the
`target.process.python-os-plugin-path` script interface uses, both are
parsed by `DynamicRegisterInfo`. The one addition is that in this
LC_NOTE metadata, each register must also have a `value` field, with the
value provided in big-endian base 10, as usual with JSON.
In RegisterContextUnifiedCore, I combine the register sets & registers
from the LC_THREAD for a specific thread, and the metadata sets &
registers for that thread from the LC_NOTE. Even if no LC_NOTE is
present, this class ingests the LC_THREAD register contexts and
reformats it to its internal stores before returning itself as the
RegisterContex, instead of shortcutting and returning the core's native
RegisterContext. I could have gone either way with that, but in the end
I decided if the code is correct, we should live on it always.
I added a test where we process save-core to create a userland corefile,
then use a utility "add-lcnote" to strip the existing "process metadata"
LC_NOTE that lldb put in it, and adds a new one from a JSON string.
rdar://74358787
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
TestFirmwareCorefiles.py has a helper utility,
create-empty-corefile.cpp, which creates corefiles with different
metadata to specify the binary that should be loaded. It normally uses
an actual binary's UUID for the metadata, and it uses the binary's
cputype/cpusubtype for the corefile's mach header.
There is one test where it creates a corefile with metadata for a UUID
that cannot be found -- it is given no binary -- and in that case, the
cputype/cpusubtype it sets in the core file mach header was
uninitialized data. Through luck, on Darwin systems, the uninitialized
data typically matched a CPU_TYPE from machine.h and the test would
work. But when the value doens't match one of thoes defines, lldb would
reject the corefile entirely, and the test would fail. This has been an
infrequent failure on the CI bots for a while and I couldn't ever repo
it. There's a recent configuration where it was happening every time and
I was able to track it down.
rdar://141727563
Any time we see the pattern `assertEqual(value, bool)`, we can replace
that with `assert<bool>(value)`. Likewise for `assertNotEqual`.
Technically this relaxes the test a bit, as we may want to make sure
`value` is either `True` or `False`, and not something that implicitly
converts to a bool. For example, `assertEqual("foo", True)` will fail,
but `assertTrue("foo")` will not. In most cases, this distinction is not
important.
There are two such places that this patch does **not** transform, since
it seems intentional that we want the result to be a bool:
*
5daf2001a1/lldb/test/API/python_api/sbstructureddata/TestStructuredDataAPI.py (L90)
*
5daf2001a1/lldb/test/API/commands/settings/TestSettings.py (L940)
Followup to 9c2468821ec51defd09c246fea4a47886fff8c01. I patched `teyit`
with a `visit_assertEqual` node handler to generate this.
The "kern ver str" LC_NOTE gives lldb a kernel version string -- with a
UUID and/or a load address (stext) to load it at. The LC_NOTE specifies
a size of the identifier string in bytes. In
ObjectFileMachO::GetIdentifierString, I copy that number of bytes into a
std::string, and in case there were additional nul characters at the end
of the sting for padding reasons, I tried to shrink the std::string to
not include these extra nul's.
However, I did this resizing without handling the case of an empty
identifier string. I don't know why any corefile creator would do that,
but of course at least one does. This patch removes the resizing
altogether; I was solving something that hasn't ever shown to be a
problem. I also added a test case for this, to check that lldb doesn't
crash when given one of these corefiles.
rdar://120390199
DynamicLoader::LoadBinaryWithUUIDAndAddress can create a Module based
on the binary image in memory, which in some cases contains symbol
names and can be genuinely useful. If we don't have a filename, it
creates a name in the form `memory-image-0x...` with the header address.
In practice, this is most useful with Darwin userland corefiles
where the binary was stored in the corefile in whole, and we can't
find a binary with the matching UUID. Using the binary out of
the corefile memory in this case works well.
But in other cases, akin to firmware debugging, we merely end up
with an oddly named binary image and no symbols.
Add a flag to control whether we will create these memory images
and add them to the Target or not; only set it to true when working
with a userland Mach-O image with the "all image infos" LC_NOTE for
a userland corefile.
Differential Revision: https://reviews.llvm.org/D157167
This is an ongoing series of commits that are reformatting our Python
code. Reformatting is done with `black` (23.1.0).
If you end up having problems merging this commit because you have made
changes to a python file, the best way to handle that is to run `git
checkout --ours <yourfile>` and then reformat it with black.
RFC: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Differential revision: https://reviews.llvm.org/D151460
Add support to Mach-O corefiles and to live gdb remote serial protocol
connections for the corefile/remote stub to provide a list of load
addresses of binaries that should be found & loaded by lldb, and nothing
else. lldb will try to parse the binary out of memory, and if it can
find a UUID, try to find a binary & its debug information based on the
UUID, falling back to using the memory image if it must.
A bit of code unification from three parts of lldb that were loading
individual binaries already, so there is a shared method in
DynamicLoader to handle all of the variations they were doing.
Re-landing this with a uuid_is_null() implementation added to
Utility/UuidCompatibility.h for non-Darwin systems.
Differential Revision: https://reviews.llvm.org/D130813
rdar://94249937
rdar://94249384
This reverts commit d8879fba8825b9799166ba0ea552d4027bfb8ad1.
Debian bot failure; I included <uuid/uuid.h> to get uuid_is_null() but
don't get it there. Will memcmp or whatever & recommit.
Add support to Mach-O corefiles and to live gdb remote serial protocol
connections for the corefile/remote stub to provide a list of load
addresses of binaries that should be found & loaded by lldb, and nothing
else. lldb will try to parse the binary out of memory, and if it can
find a UUID, try to find a binary & its debug information based on the
UUID, falling back to using the memory image if it must.
A bit of code unification from three parts of lldb that were loading
individual binaries already, so there is a shared method in
DynamicLoader to handle all of the variations they were doing.
Differential Revision: https://reviews.llvm.org/D130813
rdar://94249937
rdar://94249384
Eliminate boilerplate of having each test manually assign to `mydir` by calling
`compute_mydir` in lldbtest.py.
Differential Revision: https://reviews.llvm.org/D128077
Version 2 of 'main bin spec' LC_NOTE allows for the specification
of a slide of where the binary is loaded in the corefile virtual
address space. It also adds a (currently unused) platform field
for the main binary.
Some corefile creators will only have a UUID and an offset to be
applied to the binary.
Changed TestFirmwareCorefiles.py to test this new form of
'main bin spec' with a slide, and also to run on both x86_64
and arm64 macOS systems.
Differential Revision: https://reviews.llvm.org/D116094
rdar://85938455
This patch adds code to process save-core for Mach-O files which
embeds an "addrable bits" LC_NOTE when the process is using a
code address mask (e.g. AArch64 v8.3 with ptrauth aka arm64e).
Add code to ObjectFileMachO to read that LC_NOTE from corefiles,
and ProcessMachCore to set the process masks based on it when reading
a corefile back in.
Also have "process status --verbose" print the current address masks
that lldb is using internally to strip ptrauth bits off of addresses.
Differential Revision: https://reviews.llvm.org/D106348
rdar://68630113
A little cleanup to how these firmware corefile tests are done; add
a test that loads a dSYM that loads an OS plugin, and confirm that
the OS plugin's threads are created.
Fill out ProcessMachCore::DoLoadCore to handle LC_NOTE hints with
a UUID or with a UUID+address, and load the binary at the specified
offset correctly. Add tests for all four combinations. Change
DynamicLoaderStatic to not re-set a Section's load address in the
Target if it's already been specified.
Differential Revision: https://reviews.llvm.org/D99571
rdar://51490545
When the architecture from the returned plist differs from the
architecture lldb will pick when loading the binary file, lldb will
reject the binary as not matching. We are working with UUID's in
this case, so an architecture is not disambiguating anything; it
just opens this possibility for failing to load the specified binary.
Stop reading the architecture from the plist.
<rdar://problem/71612561>
Differential revision: https://reviews.llvm.org/D92692
When a Mach-O corefile has an LC_NOTE "main bin spec" for a
standalone binary / firmware, with only a UUID and no load
address, try to locate the binary and dSYM by UUID and if
found, load it at offset 0 for the user.
Add a test case that tests a firmware/standalone corefile
with both the "kern ver str" and "main bin spec" LC_NOTEs.
<rdar://problem/68193804>
Differential Revision: https://reviews.llvm.org/D88282
Summary: Moves lldbsuite tests to lldb/test/API.
This is a largely mechanical change, moved with the following steps:
```
rm lldb/test/API/testcases
mkdir -p lldb/test/API/{test_runner/test,tools/lldb-{server,vscode}}
mv lldb/packages/Python/lldbsuite/test/test_runner/test lldb/test/API/test_runner
for d in $(find lldb/packages/Python/lldbsuite/test/* -maxdepth 0 -type d | egrep -v "make|plugins|test_runner|tools"); do mv $d lldb/test/API; done
for d in $(find lldb/packages/Python/lldbsuite/test/tools/lldb-vscode -maxdepth 1 -mindepth 1 | grep -v ".py"); do mv $d lldb/test/API/tools/lldb-vscode; done
for d in $(find lldb/packages/Python/lldbsuite/test/tools/lldb-server -maxdepth 1 -mindepth 1 | egrep -v "gdbremote_testcase.py|lldbgdbserverutils.py|socket_packet_pump.py"); do mv $d lldb/test/API/tools/lldb-server; done
```
lldb/packages/Python/lldbsuite/__init__.py and lldb/test/API/lit.cfg.py were also updated with the new directory structure.
Reviewers: labath, JDevlieghere
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71151