PlatformDarwinKernel::GetSharedModule, which can find a kernel or kext
from a local filesystem scan, needed a little cleanup. The method which
finds kernels was (1) not looking for the SymbolFileSpec when creating a
Module, and (2) adding that newly created Module to a Target, which
GetSharedModule should not be doing - after auditing many other subclass
implementations of this method, I haven't found any others doing it.
Platform::GetSharedModule didn't have a headerdoc so it took a little
work to piece together the intended behaviors.
This is addressing a bug where
PlatformDarwinKernel::GetSharedModuleKernel would find the ObjectFile
for a kernel, create a Module, and add it to the Target. Then up in
DynamicLoaderDarwinKernel, it would check if the Module had a SymbolFile
FileSpec, and because it did not, it would do its own search for a
binary & dSYM, find them, and then add that to the Target. Now we have
two copies of the Module in the Target, one with a dSYM and the other
without, and only one of them has its load addresses set.
GetSharedModule should not be adding binaries to the Target, and it
should set the SymbolFile FileSpec when it is creating the Module.
rdar://120895951
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This patch is rearranging code a bit to add WatchpointResources to
Process. A WatchpointResource is meant to represent a hardware
watchpoint register in the inferior process. It has an address, a size,
a type, and a list of Watchpoints that are using this
WatchpointResource.
This current patch doesn't add any of the features of
WatchpointResources that make them interesting -- a user asking to watch
a 24 byte object could watch this with three 8 byte WatchpointResources.
Or a Watchpoint on 1 byte at 0x1002 and a second watchpoint on 1 byte at
0x1003, these must both be served by a single WatchpointResource on that
doubleword at 0x1000 on a 64-bit target, if two hardware watchpoint
registers were used to track these separately, one of them may not be
hit. Or if you have one Watchpoint on a variable with a condition set,
and another Watchpoint on that same variable with a command defined or
different condition, or ignorecount, both of those Watchpoints need to
evaluate their criteria/commands when their WatchpointResource has been
hit.
There's a bit of code movement to rearrange things in the direction I'll
need for implementing this feature, so I want to start with reviewing &
landing this mostly NFC patch and we can focus on the algorithmic
choices about how WatchpointResources are shared and handled as they're
triggeed, separately.
This patch also stops printing "Watchpoint <n> hit: old value: <x>, new
vlaue: <y>" for Read watchpoints. I could make an argument for print
"Watchpoint <n> hit: current value <x>" but the current output doesn't
make any sense, and the user can print the value if they are
particularly interested. Read watchpoints are used primarily to
understand what code is reading a variable.
This patch adds more fallbacks for how to print the objects being
watched if we have types, instead of assuming they are all integral
values, so a struct will print its elements. As large watchpoints are
added, we'll be doing a lot more of those.
To track the WatchpointSP in the WatchpointResources, I changed the
internal API which took a WatchpointSP and devolved it to a Watchpoint*,
which meant touching several different Process files. I removed the
watchpoint code in ProcessKDP which only reported that watchpoints
aren't supported, the base class does that already.
I haven't yet changed how we receive a watchpoint to identify the
WatchpointResource responsible for the trigger, and identify all
Watchpoints that are using this Resource to evaluate their conditions
etc. This is the same work that a BreakpointSite needs to do when it has
been tiggered, where multiple Breakpoints may be at the same address.
There is not yet any printing of the Resources that a Watchpoint is
implemented in terms of ("watchpoint list", or
SBWatchpoint::GetDescription).
"watchpoint set var" and "watchpoint set expression" take a size
argument which was previously 1, 2, 4, or 8 (an enum). I've changed this
to an unsigned int. Most hardware implementations can only watch 1, 2,
4, 8 byte ranges, but with Resources we'll allow a user to ask for
different sized watchpoints and set them in hardware-expressble terms
soon.
I've annotated areas where I know there is work still needed with
LWP_TODO that I'll be working on once this is landed.
I've tested this on aarch64 macOS, aarch64 Linux, and Intel macOS.
https://discourse.llvm.org/t/rfc-large-watchpoint-support-in-lldb/72116
(cherry picked from commit fc6b72523f3d73b921690a713e97a433c96066c6)
...and follow ups.
As it has caused test failures on Linux Arm and AArch64:
https://lab.llvm.org/buildbot/#/builders/96/builds/49126https://lab.llvm.org/buildbot/#/builders/17/builds/45824
```
lldb-shell :: Subprocess/clone-follow-child-wp.test
lldb-shell :: Subprocess/fork-follow-child-wp.test
lldb-shell :: Subprocess/vfork-follow-child-wp.test
```
This reverts commit a6c62bf1a4717accc852463b664cd1012237d334,
commit a0a1ff3ab40e347589b4e27d8fd350c600526735 and commit
fc6b72523f3d73b921690a713e97a433c96066c6.
This patch is rearranging code a bit to add WatchpointResources to
Process. A WatchpointResource is meant to represent a hardware
watchpoint register in the inferior process. It has an address, a size,
a type, and a list of Watchpoints that are using this
WatchpointResource.
This current patch doesn't add any of the features of
WatchpointResources that make them interesting -- a user asking to watch
a 24 byte object could watch this with three 8 byte WatchpointResources.
Or a Watchpoint on 1 byte at 0x1002 and a second watchpoint on 1 byte at
0x1003, these must both be served by a single WatchpointResource on that
doubleword at 0x1000 on a 64-bit target, if two hardware watchpoint
registers were used to track these separately, one of them may not be
hit. Or if you have one Watchpoint on a variable with a condition set,
and another Watchpoint on that same variable with a command defined or
different condition, or ignorecount, both of those Watchpoints need to
evaluate their criteria/commands when their WatchpointResource has been
hit.
There's a bit of code movement to rearrange things in the direction I'll
need for implementing this feature, so I want to start with reviewing &
landing this mostly NFC patch and we can focus on the algorithmic
choices about how WatchpointResources are shared and handled as they're
triggeed, separately.
This patch also stops printing "Watchpoint <n> hit: old value: <x>, new
vlaue: <y>" for Read watchpoints. I could make an argument for print
"Watchpoint <n> hit: current value <x>" but the current output doesn't
make any sense, and the user can print the value if they are
particularly interested. Read watchpoints are used primarily to
understand what code is reading a variable.
This patch adds more fallbacks for how to print the objects being
watched if we have types, instead of assuming they are all integral
values, so a struct will print its elements. As large watchpoints are
added, we'll be doing a lot more of those.
To track the WatchpointSP in the WatchpointResources, I changed the
internal API which took a WatchpointSP and devolved it to a Watchpoint*,
which meant touching several different Process files. I removed the
watchpoint code in ProcessKDP which only reported that watchpoints
aren't supported, the base class does that already.
I haven't yet changed how we receive a watchpoint to identify the
WatchpointResource responsible for the trigger, and identify all
Watchpoints that are using this Resource to evaluate their conditions
etc. This is the same work that a BreakpointSite needs to do when it has
been tiggered, where multiple Breakpoints may be at the same address.
There is not yet any printing of the Resources that a Watchpoint is
implemented in terms of ("watchpoint list", or
SBWatchpoint::GetDescription).
"watchpoint set var" and "watchpoint set expression" take a size
argument which was previously 1, 2, 4, or 8 (an enum). I've changed this
to an unsigned int. Most hardware implementations can only watch 1, 2,
4, 8 byte ranges, but with Resources we'll allow a user to ask for
different sized watchpoints and set them in hardware-expressble terms
soon.
I've annotated areas where I know there is work still needed with
LWP_TODO that I'll be working on once this is landed.
I've tested this on aarch64 macOS, aarch64 Linux, and Intel macOS.
https://discourse.llvm.org/t/rfc-large-watchpoint-support-in-lldb/72116
This completes the conversion of LocateSymbolFile into a SymbolLocator
plugin. The only remaining function is DownloadSymbolFileAsync which
doesn't really fit into the plugin model, and therefore moves into the
SymbolLocator class, while still relying on the plugins to do the
underlying work.
The remaining use of ConstString in StructuredData is the Dictionary
class. Internally it's backed by a `std::map<ConstString, ObjectSP>`.
I propose that we replace it with a `llvm::StringMap<ObjectSP>`.
Many StructuredData::Dictionary objects are ephemeral and only exist for
a short amount of time. Many of these Dictionaries are only produced
once and are never used again. That leaves us with a lot of string data
in the ConstString StringPool that is sitting there never to be used
again. Even if the same string is used many times for keys of different
Dictionary objects, that is something we can measure and adjust for
instead of assuming that every key may be reused at some point in the
future.
Quick comparisons of key data is likely not a concern with Dictionary,
but the use of `llvm::StringMap` means that lookups should be fast with
its hashing strategy.
Switching to a llvm::StringMap meant that the iteration order may be
different. To account for this when serializing/dumping the dictionary,
I added some code to sort the output by key before emitting anything.
Differential Revision: https://reviews.llvm.org/D159313
ConstString can be implicitly converted into a llvm::StringRef. This is
very useful in many places, but it also hides places where we are
creating a ConstString only to use it as a StringRef for the entire
lifespan of the ConstString object.
I locally removed the implicit conversion and found some of the places we
were doing this.
Differential Revision: https://reviews.llvm.org/D159237
The existing comment already explains what the problem is. The radar
tracks caching negative lookups in xcrun. Having a backlink is handy,
but it's not necessary as the radar references the LLDB workaround.
Furthermore, we have other places in LLDB that work around xcrun not
caching negative that should potentially be reconsidered at that time.
When we build the Clang module compilation command (e.g., when
a user requests import of a module via `expression @import Foundation`),
LLDB will try to determine which SDK directory to use as the `sysroot`.
However, it currently does so by simply enumerating the `SDKs` directory
and picking the last one that's appropriate for module compilation
(see `PlatformDarwin::GetSDKDirectoryForModules`). That means if we have
multiple platform SDKs installed (e.g., a public and internal one), we
may pick the wrong one by chance.
On Darwin platforms we emit the SDK path that a object
file was compiled against into DWARF (using `DW_AT_LLVM_sysroot`
and `DW_AT_APPLE_sdk`). For Swift debugging, we already parse the SDK
path from debug-info if we can.
This patch mimicks the Swift behaviour for non-Swift languages. I.e.,
if we can get the SDK path from debug-info, do so. Otherwise, fall back
to the old heuristic.
rdar://110407148
Differential Revision: https://reviews.llvm.org/D156020
These methods all take a `Stream *` to get feedback about what's going
on. By default, it's a nullptr, but we always feed it with a valid
pointer. It would therefore make more sense to have this take a
reference.
Differential Revision: https://reviews.llvm.org/D154883
While working on the broken-symlinks issue, I noticed this one place
where I was setting `find_other` to false unintentionally in the
darwin kernel platform kext/kernel scan. Small fix while I'm here.
These don't need to be ConstStrings. They don't really benefit much from
deduplication and comparing them isn't on a hot path, so they don't
really benefit much from quick comparisons.
Differential Revision: https://reviews.llvm.org/D152331
In ProcessMachCore::LoadBinariesViaMetadata(), if we did
load some binaries via metadata in the core file, don't
then search for a userland dyld in the corefile / kernel
and throw away that binary list. Also fix a little bug
with correctly recognizing corefiles using a `main bin spec`
LC_NOTE that explicitly declare that this is a userland
corefile.
LocateSymbolFileMacOSX.cpp's Symbols::DownloadObjectAndSymbolFile
clarify the comments on how the force_lookup and how the
dbgshell_command local both have the same effect.
In PlatformDarwinKernel::LoadPlatformBinaryAndSetup, don't
log a message unless we actually found a kernel fileset.
Reorganize ObjectFileMachO::LoadCoreFileImages so that it delegates
binary searching to DynamicLoader::LoadBinaryWithUUIDAndAddress and
doesn't duplicate those searches. For searches that fail, we would
perform them multiple times in both methods. When we have the
mach-o segment vmaddrs for a binary, don't let LoadBinaryWithUUIDAndAddress
load the binary first at its mach-o header address in the Target;
we'll load the segments at the correct addresses individually later
in this method.
DynamicLoaderDarwin::ImageInfo::PutToLog fix a LLDB_LOG logging
formatter.
In DynamicLoader::LoadBinaryWithUUIDAndAddress, instead of using
Target::GetOrCreateModule as a way to find a binary already registered
in lldb's global module cache (and implicitly add it to the Target
image list), use ModuleList::GetSharedModule() which only searches
the global module cache, don't add it to the Target. We may not
want to add an unstripped binary to the Target.
Add a call to Symbols::DownloadObjectAndSymbolFile() even if
"force_symbol_search" isn't set -- this will turn into a
DebugSymbols call / Spotlight search on a macOS system, which
we want.
Only set the Module's LoadAddress if the caller asked us to do that.
Differential Revision: https://reviews.llvm.org/D150928
rdar://109186357
This patch introduces FileSpec::GetComponents, a method that splits a
FileSpec's path into its individual components. For example, given
/foo/bar/baz, you'll get back a vector of strings {"foo", "bar", baz"}.
The motivation here is to reduce the use of
`FileSpec::RemoveLastPathComponent`. Mutating a FileSpec is expensive,
so providing a way of doing this without mutation is useful.
Differential Revision: https://reviews.llvm.org/D151399
This reverts commit c46d9af26cefb0b24646d3235b75ae7a1b8548d4.
Rename the variable to avoid `-Wchanges-meaning` warning. Although, it
might be better to squelch the warning as it is of low value IMO.
Instead of doing the local filesystem scan for kexts and kernels
when the PlatformDarwinKernel is constructed, delay doing it until
the scan is needed.
Differential Revision: https://reviews.llvm.org/D150621
rdar://109186357
As far as I can tell, this just computes the filename of the FileSpec,
which is already conveniently stored in m_filename. We can use
FileSpec::GetFilename() instead.
Differential Revision: https://reviews.llvm.org/D149663
FileSpec::GetFileNameExtension returns a StringRef. In some cases we
are calling it and then storing the result in a local. To prevent
cases where we store the StringRef, mutate the Filespec, and then try to
use the stored StringRef afterwards, I've audited the callsites and made
adjustments to mitigate: Either marking the FileSpec it comes from as
const (to avoid mutations) or by not storing the StringRef in a local if
it makes sense not to.
Differential Revision: https://reviews.llvm.org/D149671
The majority of call sites are nullptr as the execution context.
Refactor OptionValueProperties to make the argument optional and
simplify all the callers.
Various OptionValue related classes are passing around will_modify but
the value is never used. This patch simplifies the interfaces by
removing the redundant argument.
This generalises the GetXcodeSDKPath hook to a GetSDKRoot path which
will be re-used for the Windows support to compute a language specific
SDK path on the platform. Because there may be other options that we
wish to use to compute the SDK path, sink the XcodeSDK parameter into
a structure which can pass a disaggregated set of options. Furthermore,
optionalise the parameter as Xcode is not available for all platforms.
Differential Revision: https://reviews.llvm.org/D149397
Reviewed By: JDevlieghere
These don't really need to be in ConstStrings. It's nice that comparing
ConstStrings is fast (just a pointer comparison) but the cost of
creating the ConstString usually already includes the cost of doing a
StringRef comparison anyway, so this is just extra work and extra memory
consumption for basically no benefit.
Differential Revision: https://reviews.llvm.org/D149300
Jason isn't sure what this is used for and isn't aware of a .Bundle
suffix related to kernel debugging. Let's remove it.
Differential Revision: https://reviews.llvm.org/D149284
These probably do not need to be in the ConstString StringPool as they
don't really need any of the advantages that ConstStrings offer.
Lifetime for these things is always static and we never need to perform
comparisons for setting descriptions.
Differential Revision: https://reviews.llvm.org/D148679
In https://reviews.llvm.org/D122373 I delayed the search for
the SDK filepath until the simulator platform is Created.
In the qProcessInfo binary-addresses key, I have to force-Create
every platform to find one that can handle a kernel fileset;
this forced all of the simulator platforms to create, taking the
SDK filepath discovery perf hit.
This patch delays that path search further until the Apple
Simulator platform calls a method that actually needs the full
filepath; it saves the SDK name ("WatchSimulator.sdk" etc) until
it needs to expand it.
Differential Revision: https://reviews.llvm.org/D143932
rdar://103380717
This patch is preparatory work for Scripted Platform support and does
multiple things:
First, it introduces new options for the `platform select` command and
`SBPlatform::Create` API, to hold a reference to the debugger object,
the name of the python script managing the Scripted Platform and a
structured data dictionary that the user can use to pass arbitrary data.
Then, it updates the various `Create` and `GetOrCreate` methods for
the `Platform` and `PlatformList` classes to pass down the new parameter
to the `Platform::CreateInstance` callbacks.
Finally, it updates every callback to reflect these changes.
Differential Revision: https://reviews.llvm.org/D139249
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
component into a new LLVM Component called "TargetParser". This
potentially enables using tablegen to maintain this information, as
is shown in https://reviews.llvm.org/D137517. This cannot currently
be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
information in the TargetParser:
- `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
the current Host machine for info about it, primarily to support
getting the host triple, but also for `-mcpu=native` support in e.g.
Clang. This is fairly tightly intertwined with the information in
`X86TargetParser.h`, so keeping them in the same component makes
sense.
- `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
the target triple parser and representation. This is very intertwined
with the Arm target parser, because the arm architecture version
appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838