The LLVM Coding Standards [1] specify that:
> [T]o match error message styles commonly produced by other tools,
> start the first sentence with a lowercase letter, and finish the last
> sentence without a period, if it would end in one otherwise.
Historically, that hasn't been something we've enforced in LLDB, but in
the past year or so I've started to pay more attention to this in code
reviews. This PR brings more error messages in compliance, further
increasing consistency.
I also adopted `createStringErrorV` where it improved the code as a
drive-by for lines I was already touching.
[1] https://llvm.org/docs/CodingStandards.html#error-and-warning-messages
Assisted-by: Claude Code
This is to highlight places where we (probably unintentionally)
construct an `Address` object from an already resolved address, making
it unresolved again.
See the changes in `DynamicLoaderDarwin.cpp` for a quick example.
Also, use this constructor instead of `Address(lldb::addr_t file_addr,
const SectionList *section_list)` when `section_list` is `nullptr`.
This changes the inputs to `update`. It's now the data stack that was
the result of `@init`. This makes `update` more predictable, as its
inputs are the same between the first call and the Nth call.
lldb had three preprocessor defines for logging,
LLDB_LOG - formatv style argument
LLDB_LOGF - printf style argument
LLDB_LOGV - formatv style argument, only when verbose enabled
If you weren't looking at Log.h and the definition of these three, and
wanted to log something with formatv, it was easy to use LLDB_LOGV by
accident. We just had a situation where an important log statement
wasn't logging and it turned out to be this. This is fragile if you
aren't looking at the header directly, so I'd like to make this more
explicit. My proposal:
LLDB_LOG - formatv style argument
LLDB_LOG_VERBOSE - formatv style argument, only when verbose enabled
LLDB_LOGF - printf style argument
LLDB_LOGF_VERBOSE - printf style argument, only when verbose enabled
The new fouth one is to remove several places where we do `if (log &&
log->GetVerbose()) LLDB_LOGF (...)` in the sources today, and make both
styles consistent.
This PR implements that change, mechanically changing all LLDB_LOGV's to
LLDB_LOG_VERBOSE.
It also updates many of the `if (log && log->GetVerbose()) LLDB_LOGF`'s.
Some uses of this conditional expression do extra calculations in
addition to logging, and so those were left as-is so we're not doing
throwaway work when running without verbose logging.
There were many instances throughout lldb where callers are still doing
`if (log) LLDB_LOG*(...)`, a remnant of when all calls were to the `Log`
object's `Printf()` method, and you had to check if your local Log*
pointer was non-nullptr before calling the method. I removed those,
again keeping ones where work for logging is done in the block of code.
The code changes are all mechanical and uninteresting, but the question
of whether this naming change is widely agreed on is maybe worth
discussing.
In #186001, I said the last large chunk of downstream PtrAuth code in
LLDB was the expression evaluator support. However, that wasn't
accurate, as we also have changes to thread this through ValueObject.
This is driven by the Swift language. In Swift many data types such as
Int, and String are structs, and LLDB provides summary formatters and
synthetic child providers for them. For String, for example, a summary
formatter pulls out the string data from the implementation, while a
synthetic child provider hides the implementation details from users, so
strings don't expand their children.
rdar://171646109
Synthetic providers for collection types use a child name format of
"[N]".
This `ValueObjectSynthetic` to automatically convert child names in this
convention to the index embedded in the subscript string. With this
change, synthetic formatters for collections will only need to implement
`GetIndexOfChildWithName` or `get_child_index` for non-indexed
collection children. Some examples of non-indexed children are
`$$dereference$$` support, or "hidden" children.
The automatic conversion applies to N values that are less than the
number of children reported by the synthetic provider.
The formatter extraction would look at too much data for one type -
possibly reading data outside the section.
This PR limits the size of the `DataExtractor` to the one specified in
the record size before - previously, the whole section was looked at.
Similarly, `ForEachFormatterInModule` skipped zero-bytes but didn't stop
when reaching the end of the extractor.
I added a test for both cases.
`GetSyntheticValue` in synthetic providers which need to operate on raw
root values, but will often want to use the synthetic value of children,
or nested children.
When an object description expression fails, instead of emitting an error, mark it as a
warning instead. Additionally, send the more low level details of the failure to the
`expr` log, and show a more user friendly message:
> `po` was unsuccessful, running `p` instead
rdar://165190497
This patch creates a new `FormatEntity::Formatter` class and moves
`FormatEntity::Format` (and related APIs) into it. Most of the
parameters to `Format` are immutable across all recursive calls, so I
made them `const` member variables of `Formatter`. The main changes are
just mechanical renaming of:
```
FormatEntity::Format(...)
```
to
```
FormatEntity::Formatter(...).Format(stream, entry, valobj)
```
and making use of the member variables from inside `Format`.
We can probably make most of the parameters to the `Formatter`
constructor defaulted, but I chose not to in this patch to keep the diff
smaller.
The motivation for this is that I'm planning on adding logic to detect
recursive format entities (which would crash LLDB). That requires some
state, which in my opinion is best kept inside the `Formatter` class
instead of another parameter to `Format`.
The patch should be entirely NFC.
Depends on:
* https://github.com/llvm/llvm-project/pull/174385
(only last commit is relevant for this review)
The `${var%s}` format isn't capable of formatting references to
C-strings. So the summary for those becomes `<no value available>`. This
patch prevents the system C-string formatter from applying to
references, which means the summary for such types will be empty. This
prompts LLDB to instead print the child, which is the referenced
C-string.
Before:
```
(lldb) v ref
(const char *&) ref = 0x000000016fdfe960 <no value available>
```
After:
```
(lldb) v ref
(const char *&) ref = 0x000000016fdfec40 (&ref = "hi")
```
An alternative would be to support references in the `ValueObject` dump
methods. We assume C-string are pointers/arrays in a lot of places, so
such a fix would be a more intrusive undertaking, and I'm not sure we
would want to support references there in the first place. So for now I
went with the fallback logic in this PR.
This fixes a few bugs, effectively through a fallback to `p` when `po` fails.
The motivating bug this fixes is when an error within the compiler causes `po` to fail.
Previously when that happened, only its value (typically an object's address) was
printed – and problematically, no compiler diagnostics were shown. With this change,
compiler diagnostics are shown, _and_ the object is fully printed (ie `p`).
Another bug this fixes is when `po` is used on a type that doesn't provide an object
description (such as a struct). Again, the normal `ValueObject` printing is used.
Additionally, this also improves how lldb handles an object description method that
fails in some way. Now an error will be shown (it wasn't before), and the value will be
printed normally.
This is a continuation of 68fd102, which did the same thing but only for
StopInfo. Using make_shared is both safer and more efficient:
- With make_shared, the object and the control block are allocated
together, which is more efficient.
- With make_shared, the enable_shared_from_this base class is properly
linked to the control block before the constructor finishes, so
shared_from_this() will be safe to use (though still not recommended
during construction).
This re-uses the `LibcxxContainerSummaryProvider` for the libstdc++
formatters. There's a couple of containers that aren't making use of it
for libstdc++. This patch will make it easier to review when adding
those in the future.
The only difference is that with libc++ the summary string contains the
derefernced pointer value. With libstdc++ we currently display the
pointer itself, which seems redundant. E.g.,
```
(std::unique_ptr<int>) iup = 0x55555556d2b0 {
pointer = 0x000055555556d2b0
}
(std::unique_ptr<std::basic_string<char> >) sup = 0x55555556d2d0 {
pointer = "foobar"
}
```
This patch moves the logic into a common helper that's shared between
the libc++ and libstdc++ formatters.
After this patch we can combine the libc++ and libstdc++ API tests (see
https://github.com/llvm/llvm-project/pull/146740).
This commit adjusts the pretty printer for `std::coroutine_handle` based
on recent personal experiences with debugging C++20 coroutines:
1. It adds the `coro_frame` member. This member exposes the complete
coroutine frame contents, including the suspension point id and all
internal variables which the compiler decided to persist into the
coroutine frame. While this data is highly compiler-specific, inspecting
it can help identify the internal state of suspended coroutines.
2. It includes the `promise` and `coro_frame` members, even if
devirtualization failed and we could not infer the promise type / the
coro_frame type. Having them available as `void*` pointers can still be
useful to identify, e.g., which two coroutine handles have the same
frame / promise pointers.
If we're not touching them, we don't need to do anything special to pass
them along -- with one important caveat: due to how cmake arguments
work, the implicitly passed arguments need to be specified before
arguments that we handle.
This isn't particularly nice, but the alternative is enumerating all
arguments that can be used by llvm_add_library and the macros it calls
(it also relies on implicit passing of some arguments to
llvm_process_sources).
The problem was in calling GetLoadAddress on a value in the error state,
where `ValueObject::GetLoadAddress` could end up accessing the
uninitialized "address type" by-ref return value from `GetAddressOf`.
This probably happened because each function expected the other to
initialize it.
We can guarantee initialization by turning this into a proper return
value.
I've added a test, but it only (reliably) crashes if lldb is built with
ubsan.
Currently, the type `T`'s summary formatter will be matched for `T`,
`T*`, `T**` and so on. This is unexpected in many data formatters. Such
unhandled cases could cause the data formatter to crash. An example
would be the lldb's built-in data formatter for `std::optional`:
```
$ cat main.cpp
#include <optional>
int main() {
std::optional<int> o_null;
auto po_null = &o_null;
auto ppo_null = &po_null;
auto pppo_null = &ppo_null;
return 0;
}
$ clang++ -g main.cpp && lldb -o "b 8" -o "r" -o "v pppo_null"
[lldb crash]
```
This change adds an options `--pointer-match-depth` to `type summary
add` command to allow users to specify how many layer of pointers can be
dereferenced at most when matching a summary formatter of type `T`, as
Jim suggested
[here](https://github.com/llvm/llvm-project/pull/124048/#issuecomment-2611164133).
By default, this option has value 1 which means summary formatter for
`T` could also be used for `T*` but not `T**` nor beyond. This option is
no-op when `--skip-pointers` is set as well.
I didn't add such option for `type synthetic add`, `type format add`,
`type filter add`, because it useful for those command. Instead, they
all have the pointer match depth of 1. When printing a type `T*`, lldb
never print the children of `T` even if there is a synthetic formatter
registered for `T`.
When printing an ObjC object, which is a pointer, lldb has handled it
the same way it treats any other pointer – printing only class name and
pointer address. The object is not expanded, its children are not shown.
This change updates `dwim-print` to print objc pointers by expanding (ie
dereferencing), with the assumption that it's what the user wants.
Note that this is currently possible using the `--ptr-depth`/`-P` flag.
With this change, when `dwim-print` prints root level objc objects, it's
the same effect as using `--ptr-depth 1`.
This patch pushes the error handling boundary for the GetBitSize()
methods from Runtime into the Type and CompilerType APIs. This makes it
easier to diagnose problems thanks to more meaningful error messages
being available. GetBitSize() is often the first thing LLDB asks about a
type, so this method is particularly important for a better user
experience.
rdar://145667239
This patch fixes `-Wreturn-type` warnings which happens if LLVM is built
with GCC compiler (14.1 is used for detecting)
Warnings:
```
llvm-project/lldb/source/ValueObject/DILLexer.cpp: In static member function ‘static llvm::StringRef lldb_private::dil::Token::GetTokenName(Kind)’:
llvm-project/lldb/source/ValueObject/DILLexer.cpp:33:1: warning: control reaches end of non-void function [-Wreturn-type]
33 | }
| ^
```
and:
```
llvm-project/lldb/source/DataFormatters/TypeSummary.cpp: In member function ‘virtual std::string lldb_private::TypeSummaryImpl::GetSummaryKindName()’:
llvm-project/lldb/source/DataFormatters/TypeSummary.cpp:62:1: warning: control reaches end of non-void function [-Wreturn-type]
62 | }
| ^
```
Technically, it is a bug in Clang (see #115345), however, UBSan with
Clang should detect these places, therefore it would be nice to provide
a return statement for all possible inputs (even invalid).
The vast majority of `SyntheticChildrenFrontEnd` subclasses provide
children, and as such implement `MightHaveChildren` with a constant
value of `true`. This change makes `true` the default value. With this
change, `MightHaveChildren` only needs to be implemented by synthetic
providers that can return `false`, which is only 3 subclasses.
Lots of code around LLDB was directly accessing the target's section
load list. This NFC patch makes the section load list private so the
Target class can access it, but everyone else now uses accessor
functions. This allows us to control the resolving of addresses and will
allow for functionality in LLDB which can lazily resolve addresses in
JIT plug-ins with a future patch.
Compared to the python version, this also does type checking and error
handling, so it's slightly longer, however, it's still comfortably
under 500 lines.
Relanding with more explicit type conversions.
This reverts commit f6012a209dca6b1866d00e6b4f96279469884320.
Revert "[lldb] Add cast to fix compile error on 32-but platforms"
This reverts commit d300337e93da4ed96b044557e4b0a30001967cf0.
Revert "[lldb] Improve log message to include missing strings"
This reverts commit 0be33484853557bc0fd9dfb94e0b6c15dda136ce.
Revert "[lldb] Add comment"
This reverts commit e2bb47443d2e5c022c7851dd6029e3869fc8835c.
Revert "[lldb] Implement a formatter bytecode interpreter in C++"
This reverts commit 9a9c1d4a6155a96ce9be494cec7e25731d36b33e.