This fixes a few bugs, effectively through a fallback to `p` when `po` fails.
The motivating bug this fixes is when an error within the compiler causes `po` to fail.
Previously when that happened, only its value (typically an object's address) was
printed – and problematically, no compiler diagnostics were shown. With this change,
compiler diagnostics are shown, _and_ the object is fully printed (ie `p`).
Another bug this fixes is when `po` is used on a type that doesn't provide an object
description (such as a struct). Again, the normal `ValueObject` printing is used.
Additionally, this also improves how lldb handles an object description method that
fails in some way. Now an error will be shown (it wasn't before), and the value will be
printed normally.
Relates to https://github.com/llvm/llvm-project/issues/135707
Where it was reported that reading the PC using "register read" had
different results to an expression "$pc".
This was happening because registers are treated in lldb as pure
"values" that don't really have an endian. We have to store them
somewhere on the host of course, so the endian becomes host endian.
When you want to use a register as a value in an expression you're
pretending that it's a variable in memory. In target memory. Therefore
we must convert the register value to that endian before use.
The test I have added is based on the one used for XML register flags.
Where I fake an AArch64 little endian and an s390x big endian target. I
set up the data in such a way the pc value should print the same for
both, either with register read or an expression.
I considered just adding a live process test that checks the two are the
same but with on one doing cross endian testing, I doubt it would have
ever caught this bug.
Simulating this means most of the time, little endian hosts will test
little to little and little to big. In the minority of cases with a big
endian host, they'll check the reverse. Covering all the combinations.
In architectures where pointers may contain metadata, such as arm64e,
the metadata may need to be cleaned prior to sending this pointer to be
used in expression evaluation generated code.
This patch is a step towards allowing consumers of pointers to decide
whether they want to keep or remove metadata, as opposed to discarding
metadata at the moment pointers are created. See
https://github.com/llvm/llvm-project/pull/150537.
This was tested running the LLDB test suite on arm64e.
Factors out code to locate a definition DIE from a `FunctionCallLabel`
into a helper. This will be useful for an upcoming refactor that extends
`FindFunctionDefinition`.
Drive-by:
* adjusts some error messages in the `FunctionCallLabel` support code.
I discovered this building the lldb test suite with `-mthumb` set, so
that all test programs are purely Arm Thumb code.
When C++ expressions did a function lookup, they took a different path
from C programs. That path happened to land on the line that I've
changed. Where we try to look something up as a function, but don't then
resolve the address as if it's a callable.
With Thumb, if you do the non-callable lookup, the bottom bit won't be
set. This means when lldb's expression wrapper function branches into
the found function, it'll be in Arm mode trying to execute Thumb code.
Thumb is the only instance where you'd notice this. Aside from maybe
MicroMIPS or MIPS16 perhaps but I expect that there are 0 users of that
and lldb.
I have added a new test case that will simulate this situation in
"normal" Arm builds so that it will get run on Linaro's buildbot.
This change also fixes the following existing tests when `-mthumb` is
used:
```
lldb-api :: commands/expression/anonymous-struct/TestCallUserAnonTypedef.py
lldb-api :: commands/expression/argument_passing_restrictions/TestArgumentPassingRestrictions.py
lldb-api :: commands/expression/call-function/TestCallStopAndContinue.py
lldb-api :: commands/expression/call-function/TestCallUserDefinedFunction.py
lldb-api :: commands/expression/char/TestExprsChar.py
lldb-api :: commands/expression/class_template_specialization_empty_pack/TestClassTemplateSpecializationParametersHandling.py
lldb-api :: commands/expression/context-object/TestContextObject.py
lldb-api :: commands/expression/formatters/TestFormatters.py
lldb-api :: commands/expression/import_base_class_when_class_has_derived_member/TestImportBaseClassWhenClassHasDerivedMember.py
lldb-api :: commands/expression/inline-namespace/TestInlineNamespace.py
lldb-api :: commands/expression/namespace-alias/TestInlineNamespaceAlias.py
lldb-api :: commands/expression/no-deadlock/TestExprDoesntBlock.py
lldb-api :: commands/expression/pr35310/TestExprsBug35310.py
lldb-api :: commands/expression/static-initializers/TestStaticInitializers.py
lldb-api :: commands/expression/test/TestExprs.py
lldb-api :: commands/expression/timeout/TestCallWithTimeout.py
lldb-api :: commands/expression/top-level/TestTopLevelExprs.py
lldb-api :: commands/expression/unwind_expression/TestUnwindExpression.py
lldb-api :: commands/expression/xvalue/TestXValuePrinting.py
lldb-api :: functionalities/thread/main_thread_exit/TestMainThreadExit.py
lldb-api :: lang/cpp/call-function/TestCallCPPFunction.py
lldb-api :: lang/cpp/chained-calls/TestCppChainedCalls.py
lldb-api :: lang/cpp/class-template-parameter-pack/TestClassTemplateParameterPack.py
lldb-api :: lang/cpp/constructors/TestCppConstructors.py
lldb-api :: lang/cpp/function-qualifiers/TestCppFunctionQualifiers.py
lldb-api :: lang/cpp/function-ref-qualifiers/TestCppFunctionRefQualifiers.py
lldb-api :: lang/cpp/global_operators/TestCppGlobalOperators.py
lldb-api :: lang/cpp/llvm-style/TestLLVMStyle.py
lldb-api :: lang/cpp/multiple-inheritance/TestCppMultipleInheritance.py
lldb-api :: lang/cpp/namespace/TestNamespace.py
lldb-api :: lang/cpp/namespace/TestNamespaceLookup.py
lldb-api :: lang/cpp/namespace_conflicts/TestNamespaceConflicts.py
lldb-api :: lang/cpp/operators/TestCppOperators.py
lldb-api :: lang/cpp/overloaded-functions/TestOverloadedFunctions.py
lldb-api :: lang/cpp/rvalue-references/TestRvalueReferences.py
lldb-api :: lang/cpp/static_methods/TestCPPStaticMethods.py
lldb-api :: lang/cpp/template/TestTemplateArgs.py
lldb-api :: python_api/thread/TestThreadAPI.py
```
There are other failures that are due to different problems, and this
change does not make those worse.
(I have no plans to run the test suite with `-mthumb` regularly, I just
did it to test some other refactoring)
### Summary
We have internally seen cases like this
`DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28`
where we have longer op pieces than what Scalar supports (32, 64 or 128
bits). In these cases LLDB is currently hitting the assertion
`assert(ap_int.getBitWidth() >= bit_size);`
We are extending the generated APInt to the piece size by filling zeros.
### Test plan
Added a unit to cover this case.
### Reviewers
@clayborg , @jeffreytan81 , @Jlalond
LLDB currently attaches `AsmLabel`s to `FunctionDecl`s such that that
the `IRExecutionUnit` can determine which mangled name to call (we can't
rely on Clang deriving the correct mangled name to call because the
debug-info AST doesn't contain all the info that would be encoded in the
DWARF linkage names). However, we don't attach `AsmLabel`s for structors
because they have multiple variants and thus it's not clear which
mangled name to use. In the [RFC on fixing expression evaluation of
abi-tagged
structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
we discussed encoding the structor variant into the `AsmLabel`s.
Specifically in [this
thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
we discussed that the contents of the `AsmLabel` are completely under
LLDB's control and we could make use of it to uniquely identify a
function by encoding the exact module and DIE that the function is
associated with (mangled names need not be enough since two identical
mangled symbols may live in different modules). So if we already have a
custom `AsmLabel` format, we can encode the structor variant in a
follow-up (the current idea is to append the structor variant as a
suffix to our custom `AsmLabel` when Clang emits the mangled name into
the JITted IR). Then we would just have to teach the `IRExecutionUnit`
to pick the correct structor variant DIE during symbol resolution. The
draft of this is available
[here](https://github.com/llvm/llvm-project/pull/149827)
This patch sets up the infrastructure for the custom `AsmLabel` format
by encoding the module id, DIE id and mangled name in it.
**Implementation**
The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is:
`$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this
label and then do a lookup by DIE ID (or mangled name into the module if
the encoded DIE is a declaration).
Depends on https://github.com/llvm/llvm-project/pull/151355
This patch fixes updating persistent variables when memory cannot be
allocated in an inferior process:
```
> lldb -c test.core
(lldb) expr int $i = 5
(lldb) expr $i = 55
(int) $0 = 55
(lldb) expr $i
(int) $i = 5
```
With this patch, the last command prints:
```
(int) $i = 55
```
The issue was introduced in #145599.
Add support for DW_OP_WASM_location in DWARFExpression. This PR rebases
#78977 and cleans up the unit test. The DWARF extensions are documented
at https://yurydelendik.github.io/webassembly-dwarf/ and supported by
LLVM-based toolchains such as Clang, Swift, Emscripten, and Rust.
Eliminate the `lldb_private::dwarf` namespace, in favor of using
`llvm::dwarf` directly. The latter is shorter, and this avoids ambiguity
in the ABI plugins that define a `dwarf` namespace inside an anonymous
namespace.
The only place that passes a target to `LoadAddressResolver` already
checks for pointer validity. And inside of the resolver we have been
dereferencing the target anyway without nullptr checks. So codify the
non-nullness of `m_target` by making it a reference.
This patch introduces a new struct and helper API in
`DWARFExpressionList` to expose
variable location metadata (base address, end address, and
DWARFExpression pointer) for
a given PC address. It will be used in later patches to annotate
disassembly instructions
with source-level variable locations.
## New struct
```
/// Represents one entry in a DWARFExpressionList, with its range and expr.
struct DWARFExpressionEntry {
lldb::addr_t base; // file‐address start of this location range
lldb::addr_t end; // file‐address end of this range (exclusive)
const DWARFExpression *expr; // the DWARF expression for this range
};
```
## New API
```
/// Retrieve the DWARFExpressionEntry covering a particular instruction.
///
/// \param func_load_addr
/// The load address of the start of the function containing this location list;
/// used to translate between file offsets and load addresses. If this is
/// LLDB_INVALID_ADDRESS, the stored CU base (m_func_file_addr) is used.
///
/// \param load_addr
/// The load address of the *current* PC (i.e., the instruction for which
/// we want its variable‐location entry). We first convert this back into
/// the function’s file‐address space to find the correct DWARF range.
///
/// \returns
/// On success, an entry whose `[base,end)` covers this PC; else an Error.
llvm::Expected<DWARFExpressionEntry>
GetExpressionEntryAtAddress(lldb::addr_t func_load_addr,
lldb::addr_t load_addr) const;
```
## Rationale
LLDB already provides:
```
const DWARFExpression *
GetExpressionAtAddress(lldb::addr_t func_load_addr,
lldb::addr_t load_addr) const;
```
However, this only returns the DWARF expression itself, without the
file‐address start (base) and end (end) of the location range. Those
bounds are crucial for:
1) Detecting range beginnings: render a var = <location> annotation
exactly when a variable’s live‐range starts.
2) Detecting range continuation: optionally display a “|” on subsequent
instructions in the same range.
3) Detecting state changes: know when a variable moves (e.g. from one
register to another), becomes a constant, or goes out of scope.
These primitives form the foundation for the Rich Disassembler feature
proposed for GSoC 25.
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Adrian Prantl <adrian.prantl@gmail.com>
The DWARF expression evaluator is essentially a large state machine
implemented as a switch statement. For anything but trivial operations,
having the evaluation logic implemented inline is difficult to follow,
especially when there are nested switches. This commit moves evaluation
of `DW_OP_deref` out-of-line, similar to `Evaluate_DW_OP_entry_value`.
If a server does not support allocating memory in an inferior process or
when debugging a core file, evaluating an expression in the context of a
value object results in an error:
```
error: <lldb wrapper prefix>:43:1: use of undeclared identifier '$__lldb_class'
43 | $__lldb_class::$__lldb_expr(void *$__lldb_arg)
| ^
```
Such expressions require a live address to be stored in the value
object. However, `EntityResultVariable::Dematerialize()` only sets
`ret->m_live_sp` if JIT is available, even if the address points to the
process memory and no custom allocations were made. Similarly,
`EntityPersistentVariable::Dematerialize()` tries to deallocate memory
based on the same check, resulting in an error if the memory was not
previously allocated in `EntityPersistentVariable::Materialize()`.
As an unintended bonus, the patch also fixes a FIXME case in
`TestCxxChar8_t.py`.
Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#…
(#145959)
This reapplies cbf781f0bdf2f680abbe784faedeefd6f84c246e, with fixes for
the shared-library build and the unconventional sanitizer-runtime build.
Original Description:
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
I am happy to put it in another location, or to structure it differently
if that makes sense. Some have suggested in BinaryFormat, but it is not
a great fit there. But if that makes more sense to the reviewers, I can
do that.
Another possibility would be to use pass-through headers to allow
clients who don't care to depend only on DebugInfo/DWARF. This would be
a much less invasive change, and perhaps easier for clients. But also a
system that hides details.
Either way, I'm open.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
(Revised version of a previous, unreviewed, PR.)
Move all expression verification into its only client: DWARFVerifier.
Move all printing code (which was a mix of static and member functions)
into a separate class.
This is one in a series of refactoring PRs to separate dwarf
functionality into lower-level pieces usable without object files and
sections at build time. The code is already written this way via various
"if (section == nullptr)" and similar conditionals. So the functionality
itself is needed and exists, but only as a runtime feature. The goal of
these refactors is to remove the build-time dependencies, which will
allow the existing functionality to be used from lower-level parts of
the compiler. Particularly from lib/MC/.... More information at:
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665
This patch should be mostly obvious, but in one place, this patch
changes:
const auto &it = std::find(...)
to:
auto it = llvm::find(...)
We do not need to bind to a temporary with const ref.
If we're not touching them, we don't need to do anything special to pass
them along -- with one important caveat: due to how cmake arguments
work, the implicitly passed arguments need to be specified before
arguments that we handle.
This isn't particularly nice, but the alternative is enumerating all
arguments that can be used by llvm_add_library and the macros it calls
(it also relies on implicit passing of some arguments to
llvm_process_sources).
The problem was in calling GetLoadAddress on a value in the error state,
where `ValueObject::GetLoadAddress` could end up accessing the
uninitialized "address type" by-ref return value from `GetAddressOf`.
This probably happened because each function expected the other to
initialize it.
We can guarantee initialization by turning this into a proper return
value.
I've added a test, but it only (reliably) crashes if lldb is built with
ubsan.
This patch addresses the issue #129543.
After this patch DWARFExpression does not call DWARFUnit directly and does not depend on
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp and a lot of clang code.
After this patch the size of lldb-server binary (Linux Aarch64) is reduced from 47MB to 17MB.
The `TestMemoryHistory.py`/`TestReportData.py` are currently failing on
the x86 macOS CI (started after we upgraded the Xcode SDK on that
machien). The LLDB ASAN utility expression is failing to run with
following error:
```
(lldb) image lookup -n __asan_get_alloc_stack
1 match found in /usr/lib/system/libsystem_sanitizers.dylib:
Address: libsystem_sanitizers.dylib[0x00007ffd11e673f7] (libsystem_sanitizers.dylib.__TEXT.__text + 11287)
Summary: libsystem_sanitizers.dylib`__asan_get_alloc_stack
1 match found in /Users/michaelbuch/Git/lldb-build-main-no-modules/lib/clang/21/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:
Address: libclang_rt.asan_osx_dynamic.dylib[0x0000000000009ec0] (libclang_rt.asan_osx_dynamic.dylib.__TEXT.__text + 34352)
Summary: libclang_rt.asan_osx_dynamic.dylib`::__asan_get_alloc_stack(__sanitizer::uptr, __sanitizer::uptr *, __sanitizer::uptr, __sanitizer::u32 *) at asan_debugging.cpp:132
(lldb) memory history 'pointer'
Assertion failed: ((uintptr_t)addr == report.access.address), function __asan_get_alloc_stack, file debugger_abi.cpp, line 62.
warning: cannot evaluate AddressSanitizer expression:
error: Expression execution was interrupted: signal SIGABRT.
The process has been returned to the state before expression evaluation.
```
The reason for this is that the system sanitizer dylib and the locally
built libclang_rt contain the same symbol `__asan_get_alloc_stack`, and
depending on the order in which they're loaded, we may pick the one from
the wrong dylib (this probably changed during the buildbot upgrade and
is why it only now started failing). Based on discussion with @wrotki we
always want to pick the one that's in the libclang_rt dylib if it was
loaded, and libsystem_sanitizers otherwise.
This patch addresses this by adding a "preferred lookup context list" to
the expression evaluator. Currently this is only exposed in the
`EvaluateExpressionOptions`. We make it a `SymbolContextList` in case we
want the lookup contexts to be contexts other than modules (e.g., source
files, etc.). In `IRExecutionUnit` we make it a `ModuleList` because it
makes the symbol lookup implementation simpler and we only do module
lookups here anyway. If we ever need it to be a `SymbolContext`, that
transformation shouldn't be too difficult.
This patch pushes the error handling boundary for the GetBitSize()
methods from Runtime into the Type and CompilerType APIs. This makes it
easier to diagnose problems thanks to more meaningful error messages
being available. GetBitSize() is often the first thing LLDB asks about a
type, so this method is particularly important for a better user
experience.
rdar://145667239
This patch improves the synchronization of the debugger's output and error
streams using two new abstractions: `LockableStreamFile` and
`LockedStreamFile`.
- `LockableStreamFile` is a wrapper around a `StreamFile` and a mutex. Client
cannot use the `StreamFile` without calling `Lock`, which returns a
`LockedStreamFile`.
- `LockedStreamFile` is an RAII object that locks the stream for the duration
of its existence. As long as you hold on to the returned object you are
permitted to write to the stream. The destruction of the object
automatically flush the output stream.
The improved error reporting in #120162 revealed that we were missing
opcodes in GetOpcodeDataSize. I changed the function to remove the
default case and switch over the enum type which will cause the compiler
to emit a warning if there are unhandled operations in the future.
rdar://139705570
This patch rewords some of the user expression diagnostics.
- Differentiate between being interrupted and hitting a breakpoint.
- Use "expression execution" to make it more obvious that the diagnostic
is associated with the user expression.
- Consistently use a colon instead of semicolons and commas.
rdar://143059974
Lots of code around LLDB was directly accessing the target's section
load list. This NFC patch makes the section load list private so the
Target class can access it, but everyone else now uses accessor
functions. This allows us to control the resolving of addresses and will
allow for functionality in LLDB which can lazily resolve addresses in
JIT plug-ins with a future patch.
Many calls to Function::GetAddressRange() were not interested in the
range itself. Instead they wanted to find the address of the function
(its entry point) or the base address for relocation of function-scoped
entities (technically, the two don't need to be the same, but there's
isn't good reason for them not to be). This PR creates a separate
function for retrieving this, and changes the existing
(non-controversial) uses to call that instead.
Expressions can take arbitrary amounts of time to run, so IDEs might
want to be informed about the fact that an expression is currently being
executed.
rdar://141253078
Prior to this patch, the function returned an exit status, sometimes a
ValueObject with an error and a Status object. This patch removes the
Status object and ensures the error is consistently returned as the
error of the ValueObject.
When parsing an optimized value and reading a piece from a file address,
LLDB tries to read the data from memory using that address.
This patch converts file address to load address before reading the
memory.
Fixes#111313Fixes#97484
Add the ability to override the disassembly CPU and CPU features through
a target setting (`target.disassembly-cpu` and
`target.disassembly-features`) and a `disassemble` command option
(`--cpu` and `--features`).
This is especially relevant for architectures like RISC-V which relies
heavily on CPU extensions.
The majority of this patch is plumbing the options through. I recommend
looking at DisassemblerLLVMC and the test for the observable change in
behavior.
ValueObject is part of lldbCore for historical reasons, but conceptually
it deserves to be its own library. This does introduce a (link-time) circular
dependency between lldbCore and lldbValueObject, which is unfortunate
but probably unavoidable because so many things in LLDB rely on
ValueObject. We already have cycles and these libraries are never built
as dylibs so while this doesn't improve the situation, it also doesn't
make things worse.
The header includes were updated with the following command:
```
find . -type f -exec sed -i.bak "s%include \"lldb/Core/ValueObject%include \"lldb/ValueObject/ValueObject%" '{}' \;
```
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.
The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.
Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
This fixes the following assertion: "Cannot create Expected<T> from
Error success value." The problem was that GetFrameBaseValue return
false without updating the Status argument. This patch eliminates the
opportunity for mistakes by returning an llvm:Error.
When we search for a symbol, we first check if it is in the module_sp of
the current SymbolContext, and if not, we check in the target's modules.
However, the target's ModuleList also includes the already checked
module, which leads to a redundant search in it.
[lldb][RISCV] add jitted function calls to ABI
Function calls support in LLDB expressions for RISCV: 1 of 4
Augments corresponding functionality to RISCV ABI, which allows to jit
lldb expressions and thus make function calls inside them. Only function
calls with integer and void function arguments and return value are
supported.
[lldb][RISCV] add JIT relocations resolver
Function calls support in LLDB expressions for RISCV: 2 of 4
Adds required RISCV relocations resolving functionality in lldb
ExecutionEngine.
[lldb][RISCV] RISC-V large code model in lldb expressions
Function calls support in LLDB expressions for RISCV: 3 of 4
This patch sets large code model in MCJIT settings for RISC-V 64-bit targets
that allows to make assembly jumps at any 64bit address. This is needed,
because resulted jitted code may contain more that +-2GB jumps, that are
not available in RISC-V with medium code model.
[lldb][RISCV] doubles support in lldb expressions
Function calls support in LLDB expressions for RISCV: 4 of 4
This patch adds desired feature flags in MCJIT compiler to enable
hard-float instructions if target supports them and allows to use floats
and doubles in lldb expressions.