**Context**

Follow-up to [#147460](https://github.com/llvm/llvm-project/pull/147460), which added the ability to surface register-resident variable locations. This PR moves the annotation logic out of `Instruction::Dump()` and into `Disassembler::PrintInstructions()`, and adds lightweight state tracking so we only print changes at range starts and when variables go out of scope.

---

## What this does

While iterating the instructions for a function, we maintain a “live variable map” keyed by `lldb::user_id_t` (the `Variable`’s ID) to remember each variable’s last emitted location string. For each instruction:

- **New (or newly visible) variable** → print `name = <location>` once at the start of its DWARF location range, cache it.
- **Location changed** (e.g., DWARF range switched to a different register/const) → print the updated mapping.
- **Out of scope** (was tracked previously but not found for the current PC) → print `name = <undef>` and drop it.

This produces **concise, stateful annotations** that highlight variable lifetime transitions without spamming every line.

---

## Why in `PrintInstructions()`?

- Keeps `Instruction` stateless and avoids changing the `Instruction::Dump()` virtual API.
- Makes it straightforward to diff state across instructions (`prev → current`) inside the single driver loop.

---

## How it works (high-level)

1. For the current PC, get in-scope variables via `StackFrame::GetInScopeVariableList(/*get_parent=*/true)`.
2. For each `Variable`, query `DWARFExpressionList::GetExpressionEntryAtAddress(func_load_addr, current_pc)` (added in #144238).
3. If the entry exists, call `DumpLocation(..., eDescriptionLevelBrief, abi)` to get a short, ABI-aware location string (e.g., `DW_OP_reg3 RBX → RBX`).
4. Compare against the last emitted location in the live map:
   - If not present → emit `name = <location>` and record it.
   - If different → emit updated mapping and record it.
5. After processing current in-scope variables, compute the set difference vs. the previous map and emit `name = <undef>` for any that disappeared.

Internally:

- We respect file↔load address translation already provided by `DWARFExpressionList`.
- We reuse the ABI to map LLVM register numbers to arch register names.

---

## Example output (x86_64, simplified)

```
->  0x55c6f5f6a140 <+0>:  cmpl   $0x2, %edi            ; argc = RDI, argv = RSI
    0x55c6f5f6a143 <+3>:  jl     0x55c6f5f6a176        ; <+54> at d_original_example.c:6:3
    0x55c6f5f6a145 <+5>:  pushq  %r15
    0x55c6f5f6a147 <+7>:  pushq  %r14
    0x55c6f5f6a149 <+9>:  pushq  %rbx
    0x55c6f5f6a14a <+10>: movq   %rsi, %rbx
    0x55c6f5f6a14d <+13>: movl   %edi, %r14d
    0x55c6f5f6a150 <+16>: movl   $0x1, %r15d           ; argc = R14
    0x55c6f5f6a156 <+22>: nopw   %cs:(%rax,%rax)       ; i = R15, argv = RBX
    0x55c6f5f6a160 <+32>: movq   (%rbx,%r15,8), %rdi
    0x55c6f5f6a164 <+36>: callq  0x55c6f5f6a030        ; symbol stub for: puts
    0x55c6f5f6a169 <+41>: incq   %r15
    0x55c6f5f6a16c <+44>: cmpq   %r15, %r14
    0x55c6f5f6a16f <+47>: jne    0x55c6f5f6a160        ; <+32> at d_original_example.c:5:10
    0x55c6f5f6a171 <+49>: popq   %rbx                  ; i = <undef>
    0x55c6f5f6a172 <+50>: popq   %r14                  ; argv = RSI
    0x55c6f5f6a174 <+52>: popq   %r15                  ; argc = RDI
    0x55c6f5f6a176 <+54>: xorl   %eax, %eax
    0x55c6f5f6a178 <+56>: retq
```

Only transitions are shown: the start of a location, changes, and end-of-lifetime.

---

## Scope & limitations (by design)

- Handles **simple locations** first (registers, const-in-register cases surfaced by `DumpLocation`).
- **Memory/composite locations** are out of scope for this PR.
- Annotations appear **only at range boundaries** (start/change/end) to minimize noise.
- Output is **target-independent**; register names come from the target ABI.

## Implementation notes

- All annotation printing now happens in `Disassembler::PrintInstructions()`.
- Uses `std::unordered_map<lldb::user_id_t, std::string>` as the live map.
- No persistent state across calls; the map is rebuilt while walking instruction by instruction.
- **No changes** to the `Instruction` interface.

---

## Requested feedback

- Placement and wording of the `<undef>` marker.
- Whether we should optionally gate this behind a setting (currently always on when disassembling with an `ExecutionContext`).
- Preference for immediate inclusion of tests vs. follow-up patch.

---

Thanks for reviewing! Happy to adjust behavior/format based on feedback.

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Adrian Prantl <adrian.prantl@gmail.com>
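
For readers who want to see the state tracking from "What this does" in isolation, here is a minimal Python model of the per-instruction diff. It is only a sketch: the real logic lives in C++ inside `Disassembler::PrintInstructions()`, and the function name `annotate_step`, the `names` lookup table, and the plain dictionaries standing in for LLDB's variable list and location strings are all hypothetical.

```python
from typing import Dict, List

UNDEF = "<undef>"


def annotate_step(
    live: Dict[int, str], names: Dict[int, str], in_scope: Dict[int, str]
) -> List[str]:
    """Return the annotations for one instruction and update `live` in place.

    live:     variable ID -> last emitted location string (the "live map")
    names:    variable ID -> variable name (hypothetical lookup table)
    in_scope: variable ID -> location string valid at the current PC
    """
    events = []
    # New variables and changed locations are both "value differs from cache".
    for var_id, location in in_scope.items():
        if live.get(var_id) != location:
            events.append(f"{names[var_id]} = {location}")
            live[var_id] = location
    # Anything tracked earlier but absent at this PC has gone out of scope.
    for var_id in [v for v in live if v not in in_scope]:
        events.append(f"{names[var_id]} = {UNDEF}")
        del live[var_id]
    return events


if __name__ == "__main__":
    names = {1: "argc", 2: "argv", 3: "i"}
    live: Dict[int, str] = {}
    # One dict per instruction; the values are made up for illustration.
    for in_scope in (
        {1: "RDI", 2: "RSI"},
        {1: "RDI", 2: "RSI"},
        {1: "R14", 2: "RBX", 3: "R15"},
        {1: "R14", 2: "RBX"},
    ):
        print(", ".join(annotate_step(live, names, in_scope)) or "(no change)")
```

Keying the map by variable ID rather than by name mirrors the `lldb::user_id_t` key described above, so two variables that share a name in different scopes would not collide in this model either.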
109 lines
4.6 KiB
Python
from lldbsuite.test.lldbtest import *
from lldbsuite.test.decorators import *
import lldb
import os
import re


class TestVariableAnnotationsDisassembler(TestBase):
    def _build_obj(self, obj_name: str) -> str:
        """Build the test inputs and return the path to one object file."""
        # Let the Makefile build all .o files (pattern rule), then grab the one we need.
        self.build()
        obj = self.getBuildArtifact(obj_name)
        self.assertTrue(os.path.exists(obj), f"missing object: {obj}")
        return obj

    def _create_target(self, path):
        """Create a target for the object file at `path`."""
        target = self.dbg.CreateTarget(path)
        self.assertTrue(target, f"failed to create target for {path}")
        return target

    def _disassemble_verbose_symbol(self, symname):
        """Run `disassemble -n <symname> -v` and return its output."""
        self.runCmd(f"disassemble -n {symname} -v", check=True)
        return self.res.GetOutput()

    def test_d_original_example_O1(self):
        obj = self._build_obj("d_original_example.o")
        target = self._create_target(obj)
        out = self._disassemble_verbose_symbol("main")
        print(out)
        # Each variable of main should be annotated at least once.
        self.assertIn("argc = ", out)
        self.assertIn("argv = ", out)
        self.assertIn("i = ", out)
        self.assertNotIn("<decoding error>", out)

    @no_debug_info_test
    def test_regs_int_params(self):
        obj = self._build_obj("regs_int_params.o")
        target = self._create_target(obj)
        out = self._disassemble_verbose_symbol("regs_int_params")
        print(out)
        # x86-64 SysV integer arguments: RDI, RSI, RDX, RCX, R8, R9.
        # Accept either the raw DWARF op or the ABI register name.
        self.assertRegex(out, r"\ba\s*=\s*(DW_OP_reg5\b|RDI\b)")
        self.assertRegex(out, r"\bb\s*=\s*(DW_OP_reg4\b|RSI\b)")
        self.assertRegex(out, r"\bc\s*=\s*(DW_OP_reg1\b|RDX\b)")
        self.assertRegex(out, r"\bd\s*=\s*(DW_OP_reg2\b|RCX\b)")
        self.assertRegex(out, r"\be\s*=\s*(DW_OP_reg8\b|R8\b)")
        self.assertRegex(out, r"\bf\s*=\s*(DW_OP_reg9\b|R9\b)")
        self.assertNotIn("<decoding error>", out)

    @no_debug_info_test
    def test_regs_fp_params(self):
        obj = self._build_obj("regs_fp_params.o")
        target = self._create_target(obj)
        out = self._disassemble_verbose_symbol("regs_fp_params")
        print(out)
        # x86-64 SysV floating-point arguments: XMM0 through XMM5.
        self.assertRegex(out, r"\ba\s*=\s*(DW_OP_reg17\b|XMM0\b)")
        self.assertRegex(out, r"\bb\s*=\s*(DW_OP_reg18\b|XMM1\b)")
        self.assertRegex(out, r"\bc\s*=\s*(DW_OP_reg19\b|XMM2\b)")
        self.assertRegex(out, r"\bd\s*=\s*(DW_OP_reg20\b|XMM3\b)")
        self.assertRegex(out, r"\be\s*=\s*(DW_OP_reg21\b|XMM4\b)")
        self.assertRegex(out, r"\bf\s*=\s*(DW_OP_reg22\b|XMM5\b)")
        self.assertNotIn("<decoding error>", out)

    @no_debug_info_test
    def test_regs_mixed_params(self):
        obj = self._build_obj("regs_mixed_params.o")
        target = self._create_target(obj)
        out = self._disassemble_verbose_symbol("regs_mixed_params")
        print(out)
        # Integer and floating-point arguments are numbered independently.
        self.assertRegex(out, r"\ba\s*=\s*(DW_OP_reg5\b|RDI\b)")
        self.assertRegex(out, r"\bb\s*=\s*(DW_OP_reg4\b|RSI\b)")
        self.assertRegex(out, r"\bx\s*=\s*(DW_OP_reg17\b|XMM0\b|DW_OP_reg\d+\b)")
        self.assertRegex(out, r"\by\s*=\s*(DW_OP_reg18\b|XMM1\b|DW_OP_reg\d+\b)")
        self.assertRegex(out, r"\bc\s*=\s*(DW_OP_reg1\b|RDX\b)")
        self.assertRegex(out, r"\bz\s*=\s*(DW_OP_reg19\b|XMM2\b|DW_OP_reg\d+\b)")
        self.assertNotIn("<decoding error>", out)

    @no_debug_info_test
    def test_live_across_call(self):
        obj = self._build_obj("live_across_call.o")
        target = self._create_target(obj)
        out = self._disassemble_verbose_symbol("live_across_call")
        print(out)
        self.assertRegex(out, r"\bx\s*=\s*(DW_OP_reg5\b|RDI\b)")
        self.assertIn("call", out)
        self.assertRegex(out, r"\br\s*=\s*(DW_OP_reg0\b|RAX\b|DW_OP_reg\d+\b)")
        self.assertNotIn("<decoding error>", out)

    @no_debug_info_test
    def test_loop_reg_rotate(self):
        obj = self._build_obj("loop_reg_rotate.o")
        target = self._create_target(obj)
        out = self._disassemble_verbose_symbol("loop_reg_rotate")
        print(out)
        # Register allocation is up to the compiler, so accept any register.
        self.assertRegex(out, r"\bn\s*=\s*(DW_OP_reg\d+\b|R[A-Z0-9]+)")
        self.assertRegex(out, r"\bseed\s*=\s*(DW_OP_reg\d+\b|R[A-Z0-9]+)")
        self.assertRegex(out, r"\bk\s*=\s*(DW_OP_reg\d+\b|R[A-Z0-9]+)")
        self.assertRegex(out, r"\bj\s*=\s*(DW_OP_reg\d+\b|R[A-Z0-9]+)")
        self.assertRegex(out, r"\bi\s*=\s*(DW_OP_reg\d+\b|R[A-Z0-9]+)")
        self.assertNotIn("<decoding error>", out)

    @no_debug_info_test
    def test_seed_reg_const_undef(self):
        obj = self._build_obj("seed_reg_const_undef.o")
        target = self._create_target(obj)
        out = self._disassemble_verbose_symbol("main")
        print(out)
        self.assertRegex(out, r"\b(i|argc)\s*=\s*(DW_OP_reg\d+\b|R[A-Z0-9]+)")
        self.assertNotIn("<decoding error>", out)
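
The assertions above only check that an annotation appears somewhere in the output. As a hedged sketch that is not part of the test file, a stricter check could first parse the annotations per instruction. The helper name `parse_annotations` and both regular expressions are hypothetical; they assume the line shape shown in the example output of the PR description (`0xADDR <+OFF>: insn ; name = LOC, ...`) and a location that is either `<undef>` or a single register name.

```python
import re
from typing import Dict, List, Tuple

# Assumed instruction line shape: optional "->", address, "<+offset>:", text.
INSN_RE = re.compile(r"^\s*(?:->\s*)?0x[0-9a-fA-F]+\s+<\+(\d+)>:\s*(.*)$")
# Assumed annotation shape: "name = REG" or "name = <undef>".
ANNOT_RE = re.compile(r"(\w+) = (<undef>|\w+)")


def parse_annotations(disassembly: str) -> List[Tuple[int, Dict[str, str]]]:
    """Return (function offset, {variable: location}) for annotated lines."""
    result = []
    for line in disassembly.splitlines():
        match = INSN_RE.match(line)
        if not match:
            continue  # Skip headers, source lines, and blank lines.
        offset, rest = int(match.group(1)), match.group(2)
        # Everything after the first ';' is the comment column.
        _, _, comment = rest.partition(";")
        annotations = dict(ANNOT_RE.findall(comment))
        if annotations:
            result.append((offset, annotations))
    return result
```

With the parsed list, a test could assert that a variable such as `i` first receives a register and only later becomes `<undef>`, instead of matching each pattern independently.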