1507 Commits

Author SHA1 Message Date
Dave Lee
6cd651ae21
Revert "Make result variables obey their dynamic values in subsequent expressions (#168611)" (#172780)
[Green Dragon's lldb incremental tests
(x86_64)](https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake/)
are failing beginning with
https://github.com/llvm/llvm-project/pull/168611. This commit reverts
that change. If the job continues to fail after committing this revert,
then I will recommit the original.

rdar://166741668

This reverts commit 6344e3aa8106dfdfb30cac36c8ca02bc4c52ce24.
2025-12-17 19:39:25 -08:00
jimingham
6344e3aa81
Make result variables obey their dynamic values in subsequent expressions (#168611)
When you run an expression and the result has a dynamic type that is
different from the expression's static result type, we print the result
variable using the dynamic type, but at present when you use the result
variable in an expression later on, we only give you access to the
static type. For instance:

```
(lldb) expr MakeADerivedReportABase()
(Derived *) $0 = 0x00000001007e93e0
(lldb) expr $0->method_from_derived()
                ^
                error: no member named 'method_from_derived' in 'Base'
(lldb) 

```

The static return type of that function is `Base *`, but we printed that
the result was a `Derived *` and then only used the `Base *` part of it
in subsequent expressions. That's not very helpful, and forces you to
guess and then cast the result types to their dynamic type in order to
be able to access the full type you were returned, which is
inconvenient.

This patch makes lldb retain the dynamic type of the result variable
(and ditto for persistent result variables).

It also adds more testing of expression result variables with various
types of dynamic values, to ensure we can access both the ivars and
methods of the type we print the result as.
2025-12-11 14:28:17 -08:00
Jason Molenda
e4c83b7b11
[lldb][NFC] Change ObjectFile argument type (#171574)
The ObjectFile plugin interface accepts an optional DataBufferSP
argument. If the caller has the contents of the binary, it can provide
this in that DataBufferSP. The ObjectFile subclasses in their
CreateInstance methods will fill in the DataBufferSP with the actual
binary contents if it is not set.
ObjectFile base class creates an ivar DataExtractor from the
DataBufferSP passed in.

My next patch will be a caller that creates a VirtualDataExtractor with
the binary data, and needs to pass that in to the ObjectFile plugin,
instead of the bag-of-bytes DataBufferSP. It builds on the previous
patch changing ObjectFile's ivar from DataExtractor to DataExtractorSP
so I could pass in a subclass in the shared ptr. And it will be using
the VirtualDataExtractor that Jonas added in
https://github.com/llvm/llvm-project/pull/168802

No behavior is changed by the patch; we're simply moving the creation of
the DataExtractor to the caller, instead of a DataBuffer that is
immediately used to set up the ObjectFile DataExtractor. The patch is a
bit complicated because all of the ObjectFile subclasses have to
initialize their DataExtractor to pass in to the base class.

I ran the testsuite on macOS and on AArch64 Ubutnu. (btw David, I ran it
under qemu on my M4 mac with SME-no-SVE again, Ubuntu 25.10, checked
lshw(1) cpu capabilities, and qemu doesn't seem to be virtualizing the
SME, that explains why the testsuite passes)

rdar://148939795

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
2025-12-11 10:08:56 -08:00
David Peixotto
fae64adaa6
[lldb] Handle deref of register and implicit locations (#169419)
This commit modifies the dwarf expression evaluator in how we handle the
deref operation for register and implicit locations on the stack. For a
typical memory location a deref operation will read the value from
memory. For register and implicit locations the deref operation will
read the value from the register or its implicit location. In lldb we
eagerly read register and implicit values and push them on the stack so
the deref operation for these becomes a "no-op" that leaves the value on
the stack and updates the tracked location kind.

The motivation for this change is to handle `DW_OP_deref*` operations on
location descriptions as described by the heterogenious debugging
[extensions](https://rocm.docs.amd.com/projects/llvm-project/en/latest/LLVM/llvm/html/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#a-2-5-4-4-4-register-location-description-operations).

Specifically, for register locations it states

> These operations obtain a register location. To fetch the contents of
> a register, it is necessary to use DW_OP_regval_type, use one of the
> DW_OP_breg* register-based addressing operations, or use DW_OP_deref*
on
> a register location description.

My understanding is that this is the intended behavior from dwarf5 as
well and is not a change in behavior.
2025-12-02 11:13:48 -08:00
Jason Molenda
ae68377c69
[lldb][NFC] Change ObjectFile's DataExtractor to a shared ptr (#170066)
ObjectFile has an m_data DataExtractor ivar which may be default
constructed initially, or initialized with a DataBuffer passed in to its
ctor. If the DataExtractor does not get a DataBuffer source passed in,
the subclass will initialize it with access to the object file's data.
When a DataBuffer is passed in to the base class ctor, the DataExtractor
only has its buffer initialized; ObjectFile doesn't yet know the address
size and endianness to fully initialize the DataExtractor.

This patch changes ObjectFile to instead have a DataExtractorSP ivar
which is always initialized with at least a default-constructed
DataExtractor object in the base class ctor. The next patch I will be
writing is to change the ObjectFile ctor to take an optional
DataExtractorSP, so the caller can pass a DataExtractor subclass -- the
VirtualizeDataExtractor being added via
https://github.com/llvm/llvm-project/pull/168802
instead of a DataBuffer which is trivially saved into the DataExtractor.

The change is otherwise mechanical; all `m_data.` changed to
`m_data_sp->` and all the places where `m_data` was passed in for a
by-ref call were changed to `*m_data_sp.get()`. The shared pointer is
always initialized to contain an object.

I built & ran the testsuite on macOS and on aarch64-Ubuntu (thanks for
getting the Linux testsuite to run on SME-only systems David). All of
the ObjectFile subclasses I modifed compile cleanly, but I haven't
tested them beyond any unit tests they may have (prob breakpad).

rdar://148939795
2025-12-01 14:37:55 -08:00
David Peixotto
da9e8d5c57
[lldb] Unify DW_OP_deref and DW_OP_deref_size implementations (#169587)
This commit unifies the code in the dwarf expression evaluator that
handles these two deref operations. Previously we had similar, but not
identical code for handling them.

The implementation was taken from the DW_OP_deref_size code path since
that handles the general case. We put that code into an
Evaluate_DW_OP_deref_size function and call it with the address size
when evaluating DW_OP_deref.

There were initially two test failures when I made the change. The
`DW_op_deref_no_ptr_fixing` unittest failed because we were not
correctly setting the address size when we created the DataExtractor.
The `DW_OP_deref test` failed because previously the expression
`DW_OP_lit4, DW_OP_deref` would evaluate to a LoadAddress, but the code
for deref_size was evaluating it to a scalar.

The difference was in how it handled a deref of a scalar value type. In
the deref code path we convert a scalar to a load address, but this was
not done in the deref_size code path.

```
  case Value::ValueType::Scalar:
      stack.back().SetValueType(Value::ValueType::LoadAddress);
```

I decided to modify the deref_size code to also convert the value to a
load address to keep the test passing.

There is no functional change intended here. The motivation is to reduce
the number of code paths that implement the deref operation.
2025-12-01 13:05:27 -08:00
Jonas Devlieghere
06eac9feb9
[lldb] Eliminate SupportFileSP nullptr derefs (#168624)
This patch fixes and eliminates the possibility of SupportFileSP ever
being nullptr. The support file was originally treated like a value
type, but became a polymorphic type and therefore has to be stored and
passed around as a pointer.

To avoid having all the callers check the validity of the pointer, I
introduced the invariant that SupportFileSP is never null and always
default constructed. However, without enforcement at the type level,
that's fragile and indeed, we already identified two crashes where
someone accidentally broke that invariant.

This PR introduces a NonNullSharedPtr to prevent that. NonNullSharedPtr
is a smart pointer wrapper around std::shared_ptr that guarantees the
pointer is never null. If default-constructed, it creates a
default-constructed instance of the contained type. Note that I'm using
private inheritance because you shouldn't inherit from standard library
classes due to the lack of virtual destructor. So while the new
abstraction looks like a `std::shared_ptr`, it is in fact **not** a
shared pointer. Given that our destructor is trivial, we could use
public inheritance, but currently there's no need for it.

rdar://164989579
2025-11-20 16:45:11 -08:00
Alex Langford
3a08e423f1
Re-land [lldb][NFC] Mark ValueObject library with NO_PLUGIN_DEPENDENCIES (#167933)
This is a fixed version of #167886.

The build previously failed with `BUILD_SHARED_LIBS=ON`. After trying
that locally, I uncovered a few other instances of lldb non-plugin
libraries depending on clang transitively through lldbValueObject, so I
added the correct clang libraries to their dependencies.
2025-11-14 11:17:04 -08:00
Michael Buch
9b1719efa0
[lldb] Mark single-argument SourceLanguage constructors explicit (#166527)
This avoids unintentional comparisons between `SourceLanguage` and
`LanguageType`.

Also marks `operator bool` explicit so we don't implicitly convert to
bool.
2025-11-05 17:04:57 +00:00
Michael Buch
332b4deb0d
[lldb][IRExecutionUnit] Return error on failure to resolve function address (#161363)
Starting with https://github.com/llvm/llvm-project/pull/148877 we
started encoding the module ID of the function DIE we are currently
parsing into its `AsmLabel` in the AST. When the JIT asks LLDB to
resolve our special mangled name, we would locate the module and resolve
the function/symbol we found in it.

If we are debugging with a `SymbolFileDWARFDebugMap`, the module ID we
encode is that of the `.o` file that is tracked by the debug-map. To
resolve the address of the DIE in that `.o` file, we have to ask
`SymbolFileDWARFDebugMap::LinkOSOAddress` to turn the address of the
`.o` DIE into a real address in the linked executable. This will only
work if the `.o` address was actually tracked by the debug-map. However,
if the function definition appears in multiple `.o` files (which is the
case for functions defined in headers), the linker will most likely
de-deuplicate that definition. So most `.o`'s definition DIEs for that
function won't have a contribution in the debug-map, and thus we fail to
resolve the address.

When debugging Clang on Darwin, e.g., you'd see:
```
(lldb) expr CXXDecl->getName()

error: Couldn't look up symbols:
  $__lldb_func::0x1:0x4000d000002359da:_ZNK5clang9NamedDecl7getNameEv
Hint: The expression tried to call a function that is not present in the target, perhaps because it was optimized out by the compiler.
```
unless you were stopped in the `.o` file whose definition of `getName`
made it into the final executable.

The fix here is to error out if we fail to resolve the address, causing
us to fall back on the old flow which did a lookup by mangled name,
which the `SymbolFileDWARFDebugMap` will handle correctly.

An alternative fix to this would be to encode the
`SymbolFileDWARFDebugMap`'s module-id. And implement
`SymbolFileDWARFDebugMap::ResolveFunctionCallLabel` by doing a mangled
name lookup. The proposed approach doesn't stop us from implementing
that, so we could choose to do it in a follow-up.

rdar://161393045
2025-10-01 08:37:15 +01:00
Felipe de Azevedo Piovezan
bb013a4a22
Reland "Revert "[lldb] Fix OP_deref evaluation for large integer resu… (#159482)
…lts (#159460)""

The original had an issue on "AArch-less" bots.
Fixed it with some ifdefs around the presence of the AArch ABI plugin.

This reverts commit 1a4685df13282ae5c1d7dce055a71a7130bfab3c.
2025-09-17 17:02:24 -07:00
Felipe de Azevedo Piovezan
1a4685df13 Revert "[lldb] Fix OP_deref evaluation for large integer results (#159460)"
This reverts commit 1d2007ba6f7bacda8848e35298a1833e79f4abd5.
2025-09-17 15:49:09 -07:00
Felipe de Azevedo Piovezan
1d2007ba6f
[lldb] Fix OP_deref evaluation for large integer results (#159460)
When evaluating any DWARF expression that ended in OP_deref and whose
previous value on the dwarf stack -- the pointer address for the deref
-- was a load address, we were treating the result itself as a pointer,
calling Process:FixCodeAddress(result). This is wrong: there's no
guarantee that the result is a pointer itself.
2025-09-17 15:46:24 -07:00
Felipe de Azevedo Piovezan
5d088ba304
[lldb] Track CFA pointer metadata in StackID (#157498)
[lldb] Track CFA pointer metadata in StackID

    In this commit:

9c8e71644227 [lldb] Make StackID call Fix{Code,Data} pointers (#152796)

We made StackID keep track of the CFA without any pointer metadata in
it. This is necessary when comparing two StackIDs to determine which one
    is "younger".

However, the CFA inside StackIDs is also used in other contexts through
    the method StackID::GetCallFrameAddress. One notable case is
DWARFExpression: the computation of `DW_OP_call_frame_address` is done
    using StackID. This feeds into many other places, e.g. expression
evaluation may require the address of a variable that is computed from
    the CFA; to access the variable without faulting, we may need to
preserve the pointer metadata. As such, StackID must be able to provide
    both versions of the CFA.

    In the spirit of allowing consumers of pointers to decide what to do
with pointer metadata, this patch changes StackID to store both versions
of the cfa pointer. Two getter methods are provided, and all call sites
    except DWARFExpression preserve their existing behavior (stripped
    pointer). Other alternatives were considered:

    * Just store the raw pointer. This would require changing the
comparisong operator `<` to also receive a Process, as the comparison
requires stripped pointers. It wasn't clear if all call-sites had a
non-null process, whereas we know we have a process when creating a
      StackID.

* Store a weak pointer to the process inside the class, and then strip
      metadata as needed. This would require a `weak_ptr::lock` in many
operations of LLDB, and it felt wasteful. It also prevents stripping
      of the pointer if the process has gone away.

This patch also changes RegisterContextUnwind::ReadFrameAddress, which
is the method computing the CFA fed into StackID, to also preserve the
    signature pointers.
2025-09-12 09:17:48 -07:00
Felipe de Azevedo Piovezan
d367c7d016
[lldb][nfc] Rename WritePointerToMemory argument's name (#157566)
One of those arguments should be called `pointer` to correlate it to the
name of the function and to distinguish it from the address where it
will be written.
2025-09-09 07:27:52 -07:00
Michael Buch
57a7907179
[lldb][Expression] Add structor variant to LLDB's function call labels (#149827)
Depends on
* https://github.com/llvm/llvm-project/pull/148877
* https://github.com/llvm/llvm-project/pull/155483
* https://github.com/llvm/llvm-project/pull/155485
* https://github.com/llvm/llvm-project/pull/154137
* https://github.com/llvm/llvm-project/pull/154142

This patch is an implementation of [this
discussion](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
about handling ABI-tagged structors during expression evaluation.

**Motivation**

LLDB encodes the mangled name of a `DW_TAG_subprogram` into `AsmLabel`s
on function and method Clang AST nodes. This means that when calls to
these functions get lowered into IR (when running JITted expressions),
the address resolver can locate the appropriate symbol by mangled name
(and it is guaranteed to find the symbol because we got the mangled name
from debug-info, instead of letting Clang mangle it based on AST
structure). However, we don't do this for
`CXXConstructorDecl`s/`CXXDestructorDecl`s because these structor
declarations in DWARF don't have a linkage name. This is because there
can be multiple variants of a structor, each with a distinct mangling in
the Itanium ABI. Each structor variant has its own definition
`DW_TAG_subprogram`. So LLDB doesn't know which mangled name to put into
the `AsmLabel`.

Currently this means using ABI-tagged structors in LLDB expressions
won't work (see [this
RFC](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
for concrete examples).

**Proposed Solution**

The `FunctionCallLabel` encoding that we put into `AsmLabel`s already
supports stuffing more info about a DIE into it. So this patch extends
the `FunctionCallLabel` to contain an optional discriminator (a sequence
of bytes) which the `SymbolFileDWARF` plugin interprets as the
constructor/destructor variant of that DIE. So when searching for the
definition DIE, LLDB will include the structor variant in its heuristic
for determining a match.

There's a few subtleties here:
1. At the point at which LLDB first constructs the label, it has no way
of knowing (just by looking at the debug-info declaration), which
structor variant the expression evaluator is supposed to call. That's
something that gets decided when compiling the expression. So we let the
Clang mangler inject the correct structor variant into the `AsmLabel`
during JITing. I adjusted the `AsmLabelAttr` mangling for this in
https://github.com/llvm/llvm-project/pull/155485. An option would've
been to create a new Clang attribute which behaved like an `AsmLabel`
but with these special semantics for LLDB. My main concern there is that
we'd have to adjust all the `AsmLabelAttr` checks around Clang to also
now account for this new attribute.
2. The compiler is free to omit the `C1` variant of a constructor if the
`C2` variant is sufficient. In that case it may alias `C1` to `C2`,
leaving us with only the `C2` `DW_TAG_subprogram` in the object file.
Linux is one of the platforms where this occurs. For those cases I added
a heuristic in `SymbolFileDWARF` where we pick `C2` if we asked for `C1`
but it doesn't exist. This may not always be correct (e.g., if the
compiler decided to drop `C1` for other reasons).
3. In https://github.com/llvm/llvm-project/pull/154142 Clang will emit
`C4`/`D4` variants of ctors/dtors on declarations. When resolving the
`FunctionCallLabel` we will now substitute the actual variant that Clang
told us we need to call into the mangled name. We do this using LLDB's
`ManglingSubstitutor`. That way we find the definition DIE exactly the
same way we do for regular function calls.
4. In cases where declarations and definitions live in separate modules,
the DIE ID encoded in the function call label may not be enough to find
the definition DIE in the encoded module ID. For those cases we fall
back to how LLDB used to work: look up in all images of the target. To
make sure we don't use the unified mangled name for the fallback lookup,
we change the lookup name to whatever mangled name the FunctionCallLabel
resolved to.

rdar://104968288
2025-09-09 09:02:00 +00:00
Adrian Prantl
7c5b535d8c
[lldb] Replace IRExecutionUnit::GetSectionTypeFromSectionName with Ob… (#157192)
…jectFile API

This avoids code duplication.
2025-09-05 15:47:04 -07:00
Jonas Devlieghere
820f440274
[lldb] Correct style of error messages (#156774)
The LLVM Style Guide says the following about error and warning messages
[1]:

> [T]o match error message styles commonly produced by other tools,
> start the first sentence with a lowercase letter, and finish the last
> sentence without a period, if it would end in one otherwise.

I often provide this feedback during code review, but we still have a
bunch of places where we have inconsistent error message, which bothers
me as a user. This PR identifies a handful of those places and updates
the messages to be consistent.

[1] https://llvm.org/docs/CodingStandards.html#error-and-warning-messages
2025-09-04 16:37:41 -07:00
Felipe de Azevedo Piovezan
f88eadda35
[lldb] Call FixUpPointer in WritePointerToMemory (try 2) (#153585)
In architectures where pointers may contain metadata, such as arm64e,
the metadata may need to be cleaned prior to sending this pointer to be
used in expression evaluation generated code.

This patch is a step towards allowing consumers of pointers to decide
whether they want to keep or remove metadata, as opposed to discarding
metadata at the moment pointers are created. See #150537.

This was tested running the LLDB test suite on arm64e.

(The first attempt at this patch caused a failure in
TestScriptedProcessEmptyMemoryRegion.py. This test exercises a case
where IRMemoryMap uses host memory in its allocations; pointers to such
allocations should not be fixed, which is what the original patch failed
to account for).
2025-09-04 13:05:10 -07:00
Abdullah Mohammad Amin
8ec4db5e17
Stateful variable-location annotations in Disassembler::PrintInstructions() (follow-up to #147460) (#152887)
**Context**  
Follow-up to
[#147460](https://github.com/llvm/llvm-project/pull/147460), which added
the ability to surface register-resident variable locations.
This PR moves the annotation logic out of `Instruction::Dump()` and into
`Disassembler::PrintInstructions()`, and adds lightweight state tracking
so we only print changes at range starts and when variables go out of
scope.

---

## What this does

While iterating the instructions for a function, we maintain a “live
variable map” keyed by `lldb::user_id_t` (the `Variable`’s ID) to
remember each variable’s last emitted location string. For each
instruction:

- **New (or newly visible) variable** → print `name = <location>` once
at the start of its DWARF location range, cache it.
- **Location changed** (e.g., DWARF range switched to a different
register/const) → print the updated mapping.
- **Out of scope** (was tracked previously but not found for the current
PC) → print `name = <undef>` and drop it.

This produces **concise, stateful annotations** that highlight variable
lifetime transitions without spamming every line.

---

## Why in `PrintInstructions()`?

- Keeps `Instruction` stateless and avoids changing the
`Instruction::Dump()` virtual API.
- Makes it straightforward to diff state across instructions (`prev →
current`) inside the single driver loop.

---

## How it works (high-level)

1. For the current PC, get in-scope variables via
`StackFrame::GetInScopeVariableList(/*get_parent=*/true)`.
2. For each `Variable`, query
`DWARFExpressionList::GetExpressionEntryAtAddress(func_load_addr,
current_pc)` (added in #144238).
3. If the entry exists, call `DumpLocation(..., eDescriptionLevelBrief,
abi)` to get a short, ABI-aware location string (e.g., `DW_OP_reg3 RBX →
RBX`).
4. Compare against the last emitted location in the live map:
   - If not present → emit `name = <location>` and record it.
   - If different → emit updated mapping and record it.
5. After processing current in-scope variables, compute the set
difference vs. the previous map and emit `name = <undef>` for any that
disappeared.

Internally:
- We respect file↔load address translation already provided by
`DWARFExpressionList`.
- We reuse the ABI to map LLVM register numbers to arch register names.

---

## Example output (x86_64, simplified)

```
->  0x55c6f5f6a140 <+0>:  cmpl   $0x2, %edi                                                         ; argc = RDI, argv = RSI
    0x55c6f5f6a143 <+3>:  jl     0x55c6f5f6a176            ; <+54> at d_original_example.c:6:3
    0x55c6f5f6a145 <+5>:  pushq  %r15
    0x55c6f5f6a147 <+7>:  pushq  %r14
    0x55c6f5f6a149 <+9>:  pushq  %rbx
    0x55c6f5f6a14a <+10>: movq   %rsi, %rbx
    0x55c6f5f6a14d <+13>: movl   %edi, %r14d
    0x55c6f5f6a150 <+16>: movl   $0x1, %r15d                                                        ; argc = R14
    0x55c6f5f6a156 <+22>: nopw   %cs:(%rax,%rax)                                                    ; i = R15, argv = RBX
    0x55c6f5f6a160 <+32>: movq   (%rbx,%r15,8), %rdi
    0x55c6f5f6a164 <+36>: callq  0x55c6f5f6a030            ; symbol stub for: puts
    0x55c6f5f6a169 <+41>: incq   %r15
    0x55c6f5f6a16c <+44>: cmpq   %r15, %r14
    0x55c6f5f6a16f <+47>: jne    0x55c6f5f6a160            ; <+32> at d_original_example.c:5:10
    0x55c6f5f6a171 <+49>: popq   %rbx                                                               ; i = <undef>
    0x55c6f5f6a172 <+50>: popq   %r14                                                               ; argv = RSI
    0x55c6f5f6a174 <+52>: popq   %r15                                                               ; argc = RDI
    0x55c6f5f6a176 <+54>: xorl   %eax, %eax
    0x55c6f5f6a178 <+56>: retq  
```

Only transitions are shown: the start of a location, changes, and
end-of-lifetime.

---

## Scope & limitations (by design)

- Handles **simple locations** first (registers, const-in-register cases
surfaced by `DumpLocation`).
- **Memory/composite locations** are out of scope for this PR.
- Annotations appear **only at range boundaries** (start/change/end) to
minimize noise.
- Output is **target-independent**; register names come from the target
ABI.

## Implementation notes

- All annotation printing now happens in
`Disassembler::PrintInstructions()`.
- Uses `std::unordered_map<lldb::user_id_t, std::string>` as the live
map.
- No persistent state across calls; the map is rebuilt while walking
instruction by instruction.
- **No changes** to the `Instruction` interface.

---

## Requested feedback

- Placement and wording of the `<undef>` marker.
- Whether we should optionally gate this behind a setting (currently
always on when disassembling with an `ExecutionContext`).
- Preference for immediate inclusion of tests vs. follow-up patch.

---

Thanks for reviewing! Happy to adjust behavior/format based on feedback.

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Adrian Prantl <adrian.prantl@gmail.com>
2025-08-28 10:41:38 -07:00
Dave Lee
ae7e1b82fe
[lldb] Print ValueObject when GetObjectDescription fails (#152417)
This fixes a few bugs, effectively through a fallback to `p` when `po` fails.

The motivating bug this fixes is when an error within the compiler causes `po` to fail.
Previously when that happened, only its value (typically an object's address) was
printed – and problematically, no compiler diagnostics were shown. With this change,
compiler diagnostics are shown, _and_ the object is fully printed (ie `p`).

Another bug this fixes is when `po` is used on a type that doesn't provide an object
description (such as a struct). Again, the normal `ValueObject` printing is used.

Additionally, this also improves how lldb handles an object description method that
fails in some way. Now an error will be shown (it wasn't before), and the value will be
printed normally.
2025-08-15 08:37:26 -07:00
David Spickett
b563b274b8
[lldb] Convert registers values into target endian for expressions (#148836)
Relates to https://github.com/llvm/llvm-project/issues/135707

Where it was reported that reading the PC using "register read" had
different results to an expression "$pc".

This was happening because registers are treated in lldb as pure
"values" that don't really have an endian. We have to store them
somewhere on the host of course, so the endian becomes host endian.

When you want to use a register as a value in an expression you're
pretending that it's a variable in memory. In target memory. Therefore
we must convert the register value to that endian before use.

The test I have added is based on the one used for XML register flags.
Where I fake an AArch64 little endian and an s390x big endian target. I
set up the data in such a way the pc value should print the same for
both, either with register read or an expression.

I considered just adding a live process test that checks the two are the
same but with on one doing cross endian testing, I doubt it would have
ever caught this bug.

Simulating this means most of the time, little endian hosts will test
little to little and little to big. In the minority of cases with a big
endian host, they'll check the reverse. Covering all the combinations.
2025-08-13 09:48:29 +01:00
Felipe de Azevedo Piovezan
a203546496 Revert "[lldb] Call FixUpPointer in WritePointerToMemory"
This reverts commit 085a53cb89c4021da2e32d1757a1ee44668e8596.

This patch is hitting a corner case tested by
`TestScriptedProcessEmptyMemoryRegion.py`.
2025-08-12 18:51:00 -07:00
Felipe de Azevedo Piovezan
18ef8d7fae
[lldb] Call FixUpPointer in WritePointerToMemory (#152798)
In architectures where pointers may contain metadata, such as arm64e,
the metadata may need to be cleaned prior to sending this pointer to be
used in expression evaluation generated code.

This patch is a step towards allowing consumers of pointers to decide
whether they want to keep or remove metadata, as opposed to discarding
metadata at the moment pointers are created. See
https://github.com/llvm/llvm-project/pull/150537.

This was tested running the LLDB test suite on arm64e.
2025-08-11 09:39:12 -07:00
Michael Buch
ef30dd34e9
[lldb][SymbolFileDWARF][NFC] Add FindFunctionDefinition helper (#152733)
Factors out code to locate a definition DIE from a `FunctionCallLabel`
into a helper. This will be useful for an upcoming refactor that extends
`FindFunctionDefinition`.

Drive-by:
* adjusts some error messages in the `FunctionCallLabel` support code.
2025-08-08 18:23:58 +01:00
David Spickett
4077e66432
[lldb] Treat address found via function name as a callable address (#151973)
I discovered this building the lldb test suite with `-mthumb` set, so
that all test programs are purely Arm Thumb code.

When C++ expressions did a function lookup, they took a different path
from C programs. That path happened to land on the line that I've
changed. Where we try to look something up as a function, but don't then
resolve the address as if it's a callable.

With Thumb, if you do the non-callable lookup, the bottom bit won't be
set. This means when lldb's expression wrapper function branches into
the found function, it'll be in Arm mode trying to execute Thumb code.

Thumb is the only instance where you'd notice this. Aside from maybe
MicroMIPS or MIPS16 perhaps but I expect that there are 0 users of that
and lldb.

I have added a new test case that will simulate this situation in
"normal" Arm builds so that it will get run on Linaro's buildbot.

This change also fixes the following existing tests when `-mthumb` is
used:
```
  lldb-api :: commands/expression/anonymous-struct/TestCallUserAnonTypedef.py
  lldb-api :: commands/expression/argument_passing_restrictions/TestArgumentPassingRestrictions.py
  lldb-api :: commands/expression/call-function/TestCallStopAndContinue.py
  lldb-api :: commands/expression/call-function/TestCallUserDefinedFunction.py
  lldb-api :: commands/expression/char/TestExprsChar.py
  lldb-api :: commands/expression/class_template_specialization_empty_pack/TestClassTemplateSpecializationParametersHandling.py
  lldb-api :: commands/expression/context-object/TestContextObject.py
  lldb-api :: commands/expression/formatters/TestFormatters.py
  lldb-api :: commands/expression/import_base_class_when_class_has_derived_member/TestImportBaseClassWhenClassHasDerivedMember.py
  lldb-api :: commands/expression/inline-namespace/TestInlineNamespace.py
  lldb-api :: commands/expression/namespace-alias/TestInlineNamespaceAlias.py
  lldb-api :: commands/expression/no-deadlock/TestExprDoesntBlock.py
  lldb-api :: commands/expression/pr35310/TestExprsBug35310.py
  lldb-api :: commands/expression/static-initializers/TestStaticInitializers.py
  lldb-api :: commands/expression/test/TestExprs.py
  lldb-api :: commands/expression/timeout/TestCallWithTimeout.py
  lldb-api :: commands/expression/top-level/TestTopLevelExprs.py
  lldb-api :: commands/expression/unwind_expression/TestUnwindExpression.py
  lldb-api :: commands/expression/xvalue/TestXValuePrinting.py
  lldb-api :: functionalities/thread/main_thread_exit/TestMainThreadExit.py
  lldb-api :: lang/cpp/call-function/TestCallCPPFunction.py
  lldb-api :: lang/cpp/chained-calls/TestCppChainedCalls.py
  lldb-api :: lang/cpp/class-template-parameter-pack/TestClassTemplateParameterPack.py
  lldb-api :: lang/cpp/constructors/TestCppConstructors.py
  lldb-api :: lang/cpp/function-qualifiers/TestCppFunctionQualifiers.py
  lldb-api :: lang/cpp/function-ref-qualifiers/TestCppFunctionRefQualifiers.py
  lldb-api :: lang/cpp/global_operators/TestCppGlobalOperators.py
  lldb-api :: lang/cpp/llvm-style/TestLLVMStyle.py
  lldb-api :: lang/cpp/multiple-inheritance/TestCppMultipleInheritance.py
  lldb-api :: lang/cpp/namespace/TestNamespace.py
  lldb-api :: lang/cpp/namespace/TestNamespaceLookup.py
  lldb-api :: lang/cpp/namespace_conflicts/TestNamespaceConflicts.py
  lldb-api :: lang/cpp/operators/TestCppOperators.py
  lldb-api :: lang/cpp/overloaded-functions/TestOverloadedFunctions.py
  lldb-api :: lang/cpp/rvalue-references/TestRvalueReferences.py
  lldb-api :: lang/cpp/static_methods/TestCPPStaticMethods.py
  lldb-api :: lang/cpp/template/TestTemplateArgs.py
  lldb-api :: python_api/thread/TestThreadAPI.py
```

There are other failures that are due to different problems, and this
change does not make those worse.

(I have no plans to run the test suite with `-mthumb` regularly, I just
did it to test some other refactoring)
2025-08-06 09:31:09 +01:00
satyanarayana reddy janga
a0db29d647
[lldb] Zero extend APInt when piece size is bigger than the bitwidth (#150149)
### Summary
We have internally seen cases like this 
`DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28`
where we have longer op pieces than what Scalar supports (32, 64 or 128
bits). In these cases LLDB is currently hitting the assertion
`assert(ap_int.getBitWidth() >= bit_size);`

We are extending the generated APInt to the piece size by filling zeros.


### Test plan

Added a unit to cover this case. 

### Reviewers
@clayborg , @jeffreytan81 , @Jlalond
2025-08-04 09:42:42 -07:00
Michael Buch
f89059140b
[lldb][Expression] Encode Module and DIE UIDs into function AsmLabels (#148877)
LLDB currently attaches `AsmLabel`s to `FunctionDecl`s such that that
the `IRExecutionUnit` can determine which mangled name to call (we can't
rely on Clang deriving the correct mangled name to call because the
debug-info AST doesn't contain all the info that would be encoded in the
DWARF linkage names). However, we don't attach `AsmLabel`s for structors
because they have multiple variants and thus it's not clear which
mangled name to use. In the [RFC on fixing expression evaluation of
abi-tagged
structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
we discussed encoding the structor variant into the `AsmLabel`s.
Specifically in [this
thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
we discussed that the contents of the `AsmLabel` are completely under
LLDB's control and we could make use of it to uniquely identify a
function by encoding the exact module and DIE that the function is
associated with (mangled names need not be enough since two identical
mangled symbols may live in different modules). So if we already have a
custom `AsmLabel` format, we can encode the structor variant in a
follow-up (the current idea is to append the structor variant as a
suffix to our custom `AsmLabel` when Clang emits the mangled name into
the JITted IR). Then we would just have to teach the `IRExecutionUnit`
to pick the correct structor variant DIE during symbol resolution. The
draft of this is available
[here](https://github.com/llvm/llvm-project/pull/149827)

This patch sets up the infrastructure for the custom `AsmLabel` format
by encoding the module id, DIE id and mangled name in it.

**Implementation**

The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is:
`$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this
label and then do a lookup by DIE ID (or mangled name into the module if
the encoded DIE is a declaration).

Depends on https://github.com/llvm/llvm-project/pull/151355
2025-08-01 07:21:41 +01:00
Igor Kudrin
e4c0f30030
[lldb] Fix updating persistent variables without JIT (#149642)
This patch fixes updating persistent variables when memory cannot be
allocated in an inferior process:
```
> lldb -c test.core
(lldb) expr int $i = 5
(lldb) expr $i = 55
(int) $0 = 55
(lldb) expr $i
(int) $i = 5
```

With this patch, the last command prints:
```
(int) $i = 55
```

The issue was introduced in #145599.
2025-07-30 12:54:15 -07:00
Jonas Devlieghere
c2548a8c4c
[lldb] Support DW_OP_WASM_location in DWARFExpression (#151010)
Add support for DW_OP_WASM_location in DWARFExpression. This PR rebases
#78977 and cleans up the unit test. The DWARF extensions are documented
at https://yurydelendik.github.io/webassembly-dwarf/ and supported by
LLVM-based toolchains such as Clang, Swift, Emscripten, and Rust.
2025-07-30 09:20:37 -07:00
Jonas Devlieghere
2cb1ecb94b
[lldb] Eliminate namespace lldb_private::dwarf (NFC) (#150073)
Eliminate the `lldb_private::dwarf` namespace, in favor of using
`llvm::dwarf` directly. The latter is shorter, and this avoids ambiguity
in the ABI plugins that define a `dwarf` namespace inside an anonymous
namespace.
2025-07-22 15:51:00 -07:00
Michael Buch
03b7766dba
[lldb][Expression][NFC] Make LoadAddressResolver::m_target a reference (#149490)
The only place that passes a target to `LoadAddressResolver` already
checks for pointer validity. And inside of the resolver we have been
dereferencing the target anyway without nullptr checks. So codify the
non-nullness of `m_target` by making it a reference.
2025-07-18 14:31:12 +01:00
Abdullah Mohammad Amin
7cb84392fd
[lldb] Add DWARFExpressionEntry and GetExpressionEntryAtAddress() to … (#144238)
This patch introduces a new struct and helper API in
`DWARFExpressionList` to expose
variable location metadata (base address, end address, and
DWARFExpression pointer) for
a given PC address. It will be used in later patches to annotate
disassembly instructions
with source-level variable locations.

## New struct
```
/// Represents one entry in a DWARFExpressionList, with its range and expr.
struct DWARFExpressionEntry {
  lldb::addr_t base;           // file‐address start of this location range
  lldb::addr_t end;            // file‐address end of this range (exclusive)
  const DWARFExpression *expr; // the DWARF expression for this range
};
```

## New API
```
/// Retrieve the DWARFExpressionEntry covering a particular instruction.
///
/// \param func_load_addr
///     The load address of the start of the function containing this location list;
///     used to translate between file offsets and load addresses. If this is
///     LLDB_INVALID_ADDRESS, the stored CU base (m_func_file_addr) is used.
///
/// \param load_addr
///     The load address of the *current* PC (i.e., the instruction for which
///     we want its variable‐location entry). We first convert this back into
///     the function’s file‐address space to find the correct DWARF range.
///
/// \returns
///     On success, an entry whose `[base,end)` covers this PC; else an Error.
llvm::Expected<DWARFExpressionEntry>
GetExpressionEntryAtAddress(lldb::addr_t func_load_addr,
                            lldb::addr_t load_addr) const;
```

## Rationale
LLDB already provides:
```
const DWARFExpression *
GetExpressionAtAddress(lldb::addr_t func_load_addr,
                       lldb::addr_t load_addr) const;

```
However, this only returns the DWARF expression itself, without the
file‐address start (base) and end (end) of the location range. Those
bounds are crucial for:

1) Detecting range beginnings: render a var = <location> annotation
exactly when a variable’s live‐range starts.

2) Detecting range continuation: optionally display a “|” on subsequent
instructions in the same range.

3) Detecting state changes: know when a variable moves (e.g. from one
register to another), becomes a constant, or goes out of scope.

These primitives form the foundation for the Rich Disassembler feature
proposed for GSoC 25.

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Adrian Prantl <adrian.prantl@gmail.com>
2025-07-12 10:03:54 -07:00
Jonas Devlieghere
1995fd9ac6
[lldb] Extract DW_OP_deref evaluation code (NFC) (#146801)
The DWARF expression evaluator is essentially a large state machine
implemented as a switch statement. For anything but trivial operations,
having the evaluation logic implemented inline is difficult to follow,
especially when there are nested switches. This commit moves evaluation
of `DW_OP_deref` out-of-line, similar to `Evaluate_DW_OP_entry_value`.
2025-07-03 13:46:47 -07:00
Igor Kudrin
442f99d769
[lldb] Fix evaluating expressions without JIT in an object context (#145599)
If a server does not support allocating memory in an inferior process or
when debugging a core file, evaluating an expression in the context of a
value object results in an error:

```
error: <lldb wrapper prefix>:43:1: use of undeclared identifier '$__lldb_class'
   43 | $__lldb_class::$__lldb_expr(void *$__lldb_arg)
      | ^
```

Such expressions require a live address to be stored in the value
object. However, `EntityResultVariable::Dematerialize()` only sets
`ret->m_live_sp` if JIT is available, even if the address points to the
process memory and no custom allocations were made. Similarly,
`EntityPersistentVariable::Dematerialize()` tries to deallocate memory
based on the same check, resulting in an error if the memory was not
previously allocated in `EntityPersistentVariable::Materialize()`.

As an unintended bonus, the patch also fixes a FIXME case in
`TestCxxChar8_t.py`.
2025-06-27 14:30:24 -07:00
Igor Kudrin
21993f0a47
[lldb][NFC] Switch IRMemoryMap::Malloc to return llvm::Expected (#146016)
This will make changes in #145599 a bit nicer.
2025-06-27 13:04:25 -07:00
Sterling-Augustine
23f1ba3ee4
Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… (#145959) (#146112)
Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#…
(#145959)
    
This reapplies cbf781f0bdf2f680abbe784faedeefd6f84c246e, with fixes for
the shared-library build and the unconventional sanitizer-runtime build.

Original Description:

This is the culmination of a series of changes described in [1].
    
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
    
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
2025-06-27 11:05:49 -07:00
Sterling-Augustine
5d03e7a204
Revert "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… (#145959)
…145081)"

This reverts commit cbf781f0bdf2f680abbe784faedeefd6f84c246e.

Breaks a couple of buildbots.
2025-06-26 13:09:20 -07:00
Sterling-Augustine
cbf781f0bd
[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#145081)
This is the culmination of a series of changes described in [1].
    
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
    
I am happy to put it in another location, or to structure it differently
if that makes sense. Some have suggested in BinaryFormat, but it is not
a great fit there. But if that makes more sense to the reviewers, I can
do that.
 
Another possibility would be to use pass-through headers to allow
clients who don't care to depend only on DebugInfo/DWARF. This would be
a much less invasive change, and perhaps easier for clients. But also a
system that hides details.

Either way, I'm open.

1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
2025-06-26 11:23:46 -07:00
Pavel Labath
7e471c1fd0
[lldb/cmake] Use ADDITIONAL_HEADER(_DIR)?S (#142587)
Replace (questionable) header globs with an explicit argument supported
by llvm_add_library.
2025-06-10 11:58:39 +02:00
Sterling-Augustine
474db6a852
[NFC] Separate high-level-dependent portions of DWARFExpression (revised) (#143170)
(Revised version of a previous, unreviewed, PR.)

Move all expression verification into its only client: DWARFVerifier.
Move all printing code (which was a mix of static and member functions)
into a separate class.

This is one in a series of refactoring PRs to separate dwarf
functionality into lower-level pieces usable without object files and
sections at build time. The code is already written this way via various
"if (section == nullptr)" and similar conditionals. So the functionality
itself is needed and exists, but only as a runtime feature. The goal of
these refactors is to remove the build-time dependencies, which will
allow the existing functionality to be used from lower-level parts of
the compiler. Particularly from lib/MC/.... More information at:


https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665
2025-06-09 16:32:40 -07:00
Kazu Hirata
df4b453516
[lldb] Use llvm::find (NFC) (#143338)
This patch should be mostly obvious, but in one place, this patch
changes:

  const auto &it = std::find(...)

to:

  auto it = llvm::find(...)

We do not need to bind to a temporary with const ref.
2025-06-09 09:56:27 +01:00
David Blaikie
bc7f1eadbf Fix forward for new DWARF DW_OP enum to address warning in lldb 2025-06-06 19:43:49 +00:00
Pavel Labath
2c4f67794b
[lldb/cmake] Implicitly pass arguments to llvm_add_library (#142583)
If we're not touching them, we don't need to do anything special to pass
them along -- with one important caveat: due to how cmake arguments
work, the implicitly passed arguments need to be specified before
arguments that we handle.

This isn't particularly nice, but the alternative is enumerating all
arguments that can be used by llvm_add_library and the macros it calls
(it also relies on implicit passing of some arguments to
llvm_process_sources).
2025-06-04 11:33:37 +02:00
Pavel Labath
e9fad0e91c
[lldb] Refactor away UB in SBValue::GetLoadAddress (#141799)
The problem was in calling GetLoadAddress on a value in the error state,
where `ValueObject::GetLoadAddress` could end up accessing the
uninitialized "address type" by-ref return value from `GetAddressOf`.
This probably happened because each function expected the other to
initialize it.

We can guarantee initialization by turning this into a proper return
value.

I've added a test, but it only (reliably) crashes if lldb is built with
ubsan.
2025-06-02 09:39:56 +02:00
Kazu Hirata
25da043c55
[lldb] Use llvm::is_contained (NFC) (#140466) 2025-05-19 06:18:37 -07:00
Kazu Hirata
61714c16be
[lldb] Remove unused local variables (NFC) (#138457) 2025-05-04 11:56:22 -07:00
Dmitry Vasilyev
59c5d53199
[LLDB][NFC] Replace DWARFUnit with DWARFExpression::Delegate in DWARFExpressionList too (#133049)
This is an update for #131645.
2025-03-26 16:28:39 +04:00
Dmitry Vasilyev
2edf534f55
[LLDB][NFC] Added the interface DWARFExpression::Delegate to break dependencies and reduce lldb-server size (#131645)
This patch addresses the issue #129543.
After this patch DWARFExpression does not call DWARFUnit directly and does not depend on
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp and a lot of clang code.
After this patch the size of lldb-server binary (Linux Aarch64) is reduced from 47MB to 17MB.
2025-03-24 18:40:49 +04:00
Michael Buch
542d52b1e8
[lldb][Expression] Allow specifying a preferred ModuleList for lookup during expression evaluation (#129733)
The `TestMemoryHistory.py`/`TestReportData.py` are currently failing on
the x86 macOS CI (started after we upgraded the Xcode SDK on that
machien). The LLDB ASAN utility expression is failing to run with
following error:
```
(lldb) image lookup -n __asan_get_alloc_stack
1 match found in /usr/lib/system/libsystem_sanitizers.dylib:
        Address: libsystem_sanitizers.dylib[0x00007ffd11e673f7] (libsystem_sanitizers.dylib.__TEXT.__text + 11287)
        Summary: libsystem_sanitizers.dylib`__asan_get_alloc_stack
1 match found in /Users/michaelbuch/Git/lldb-build-main-no-modules/lib/clang/21/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:
        Address: libclang_rt.asan_osx_dynamic.dylib[0x0000000000009ec0] (libclang_rt.asan_osx_dynamic.dylib.__TEXT.__text + 34352)
        Summary: libclang_rt.asan_osx_dynamic.dylib`::__asan_get_alloc_stack(__sanitizer::uptr, __sanitizer::uptr *, __sanitizer::uptr, __sanitizer::u32 *) at asan_debugging.cpp:132
(lldb) memory history 'pointer'
Assertion failed: ((uintptr_t)addr == report.access.address), function __asan_get_alloc_stack, file debugger_abi.cpp, line 62.
warning: cannot evaluate AddressSanitizer expression:
error: Expression execution was interrupted: signal SIGABRT.
The process has been returned to the state before expression evaluation.
```

The reason for this is that the system sanitizer dylib and the locally
built libclang_rt contain the same symbol `__asan_get_alloc_stack`, and
depending on the order in which they're loaded, we may pick the one from
the wrong dylib (this probably changed during the buildbot upgrade and
is why it only now started failing). Based on discussion with @wrotki we
always want to pick the one that's in the libclang_rt dylib if it was
loaded, and libsystem_sanitizers otherwise.

This patch addresses this by adding a "preferred lookup context list" to
the expression evaluator. Currently this is only exposed in the
`EvaluateExpressionOptions`. We make it a `SymbolContextList` in case we
want the lookup contexts to be contexts other than modules (e.g., source
files, etc.). In `IRExecutionUnit` we make it a `ModuleList` because it
makes the symbol lookup implementation simpler and we only do module
lookups here anyway. If we ever need it to be a `SymbolContext`, that
transformation shouldn't be too difficult.
2025-03-06 21:07:22 +00:00