Resolves #141955
- Adds data to the breakpoint's `Source` object so that assembly
breakpoints, which rely on a temporary `sourceReference` value, can
resolve in future sessions like normal path+line breakpoints
- Adds optional `instructions_offset` parameter to `BreakpointResolver`
LLDB uses the LLVM disassembler to determine the size of instructions and
to do the actual disassembly. Currently, if the LLVM disassembler can't
disassemble an instruction, LLDB will ignore the instruction size, assume
the instruction size is the minimum size for that device, print no useful
opcode, and print nothing for the instruction.
This patch changes this behavior to separate the instruction size and
"can't disassemble". If the LLVM disassembler knows the size, but can't
disassemble the instruction, LLDB will use that size. It will print out
the opcode, and will print "<unknown>" for the instruction. This is much
more useful to both a user and a script.
The impetus behind this change is to clean up RISC-V disassembly when
the LLVM disassembler doesn't understand all of the instructions.
RISC-V supports proprietary extensions, where the TD files don't know
about certain instructions, and the disassembler can't disassemble them.
Internal users want to be able to disassemble these instructions.
With llvm-objdump, the solution is to pipe the output of the disassembly
through a filter program. This patch modifies LLDB's disassembly to look
more like llvm-objdump's, and includes an example python script that adds
a command "fdis" that will disassemble, then pipe the output through a
specified filter program. This has been tested with crustfilt, a sample
filter located at https://github.com/quic/crustfilt .
Changes in this PR:
- Decouple "can't disassemble" from "instruction size".
DisassemblerLLVMC::MCDisasmInstance::GetMCInst now returns a bool for
valid disassembly, and has the size as an out parameter.
Use the size even if the disassembly is invalid.
Disassemble if disassembly is valid.
- Always print out the opcode when -b is specified.
Previously it wouldn't print out the opcode if it couldn't disassemble.
- Print out RISC-V opcodes the way llvm-objdump does.
Code for the new Opcode Type eType16_32Tuples by Jason Molenda.
- Print <unknown> for instructions that can't be disassembled, matching
llvm-objdump, instead of printing nothing.
- Update max riscv32 and riscv64 instruction size to 8.
- Add example "fdis" command script.
- Added disassembly byte test for x86 with known and unknown instructions.
- Added disassembly byte test for riscv32 with known and unknown instructions,
with and without filtering.
- Added test from Jason Molenda to RISC-V disassembly unit tests.
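The decoupling described above can be illustrated with a minimal sketch. `DecodeOne` and `DisassembleAll` are hypothetical names, and the toy encoding (low two bits of the first byte select a 2- or 4-byte instruction, only first byte 0x01 is "known") is an assumption for illustration, not the RISC-V encoding or the actual GetMCInst signature.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical decoder: validity comes back as the return value, the
// instruction size as an out parameter. The toy encoding is an assumption.
bool DecodeOne(const std::vector<uint8_t> &bytes, size_t offset,
               size_t &size, std::string &mnemonic) {
  size = (bytes[offset] & 0x3) == 0x3 ? 4 : 2;
  if (bytes[offset] == 0x01) {
    mnemonic = "nop";
    return true;
  }
  return false; // The size is still meaningful even though decoding failed.
}

// Walk the buffer, advancing by the reported size whether or not the
// instruction could be disassembled.
std::vector<std::string> DisassembleAll(const std::vector<uint8_t> &bytes) {
  std::vector<std::string> out;
  size_t offset = 0;
  while (offset < bytes.size()) {
    size_t size = 0;
    std::string mnemonic;
    bool valid = DecodeOne(bytes, offset, size, mnemonic);
    if (size == 0 || offset + size > bytes.size())
      break; // Truncated input: stop rather than read out of bounds.
    out.push_back(valid ? mnemonic : "<unknown>");
    offset += size;
  }
  return out;
}
```

Because the size is used even for an unknown instruction, the instructions that follow it stay correctly aligned.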
This fixes a data race between the main thread and the default event
handler thread. The statusline format option value was protected by a
mutex, but it was returned as a pointer, allowing one thread to access
it while another was modifying it.
Avoid the data race by returning format values by value instead of by
pointer.
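A minimal sketch of why returning a pointer defeats the mutex, assuming a hypothetical `Settings` class (the names are illustrative, not LLDB's):

```cpp
#include <mutex>
#include <string>
#include <utility>

// Hypothetical settings holder; not the real LLDB interface.
class Settings {
public:
  void SetFormat(std::string format) {
    std::lock_guard<std::mutex> guard(m_mutex);
    m_format = std::move(format);
  }

  // Racy variant: the lock is released on return, so the caller reads
  // m_format through the pointer with no protection.
  const std::string *GetFormatUnsafe() const {
    std::lock_guard<std::mutex> guard(m_mutex);
    return &m_format;
  }

  // Safe variant: the copy is made while the lock is held.
  std::string GetFormat() const {
    std::lock_guard<std::mutex> guard(m_mutex);
    return m_format;
  }

private:
  mutable std::mutex m_mutex;
  std::string m_format;
};
```

The value returned by `GetFormat` is a snapshot, so later calls to `SetFormat` on another thread cannot affect it.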
This overload is taking a StackFrame, so we just need to change how we
obtain the ranges out of it. A slightly fiddly aspect is the code which
tries to provide a default disassembly range for the case where we
don't have a real one. I believe this case is only relevant for
symbol-based stack frames as debug info always has size/range for the
functions (if it didn't we wouldn't even resolve the stack frame to a
function), which is why I've split the handling of the two cases.
We already have a test case for disassembly of discontinuous functions
(in test/Shell/Commands/command-disassemble.s), so I'm not creating
another one as this is just a slightly different entry point into the
same code.
The main change is to permit the disassembler class to process/store
multiple (discontinuous) ranges of addresses. The result is not
ambiguous because each instruction knows its size (in addition to its
address), so we can check for discontinuity by looking at whether the
next instruction begins where the previous ends.
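The discontinuity check can be sketched as follows. `Insn` and `CountRanges` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal instruction record: an address and a byte size.
struct Insn {
  uint64_t address;
  uint32_t size;
};

// Count contiguous runs: a new range starts whenever an instruction does
// not begin where the previous one ended.
size_t CountRanges(const std::vector<Insn> &insns) {
  size_t ranges = 0;
  for (size_t i = 0; i < insns.size(); ++i) {
    if (i == 0 ||
        insns[i].address != insns[i - 1].address + insns[i - 1].size)
      ++ranges;
  }
  return ranges;
}
```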
This patch doesn't handle the "disassemble" CLI command, which uses a
more elaborate mechanism for disassembling and printing instructions.
Lots of code around LLDB was directly accessing the target's section
load list. This NFC patch makes the section load list private so the
Target class can access it, but everyone else now uses accessor
functions. This allows us to control the resolving of addresses and will
allow for functionality in LLDB which can lazily resolve addresses in
JIT plug-ins with a future patch.
When retrieving the location of the function declaration, we were
dropping the file component on the floor, which resulted in an amusingly
confusing situation where we displayed the file containing the
implementation of the function, but used the line number of the
declaration. This patch fixes that.
It required a small refactor of Function::GetStartLineSourceLineInfo to
return a SupportFile (instead of just the file spec), which in turn
necessitated changes in a couple of other places as well.
Add the ability to override the disassembly CPU and CPU features through
a target setting (`target.disassembly-cpu` and
`target.disassembly-features`) and a `disassemble` command option
(`--cpu` and `--features`).
This is especially relevant for architectures like RISC-V which rely
heavily on CPU extensions.
The majority of this patch is plumbing the options through. I recommend
looking at DisassemblerLLVMC and the test for the observable change in
behavior.
This member variable is completely unused. I also don't think it makes a
ton of sense since (1) The "base address" can be obtained from the first
Instruction in its InstructionList, and (2) InstructionLists may not be
a series of contiguous instructions (even though they are most of the
time).
This is another step towards supporting DWARF5 checksums and inline
source code in LLDB. This is a reland of #85468 but without the
functional change of storing the support file from the line table (yet).
Store a SupportFile, rather than a FileSpec, in LineEntry. This commit
works towards having the SourceManager operate on SupportFiles so that it
can (1) validate the Checksum and (2) materialize the content of inline
source information.
Add support for syntax color highlighting disassembly in LLDB. This
patch relies on 77d1032516e7, which introduces support for syntax
highlighting in MC.
Currently only AArch64 and X86 have color support, but other interested
backends can adopt WithColor in their respective MCInstPrinter.
Differential revision: https://reviews.llvm.org/D159164
`Instruction::TestEmulation` takes a `Stream *` and checks it for validity.
However, this is unnecessary as we can always ensure that we never pass
`nullptr` for the `Stream` argument. The only use of
`Instruction::TestEmulation` currently is `SBInstruction::TestEmulation`
which gets the `Stream` from an `SBStream`, and `SBStream::ref` can
return a `Stream &` guaranteed.
Differential Revision: https://reviews.llvm.org/D154757
DisassemblerCreateInstance is a function pointer whose return type is
`Disassembler *`. But Disassembler::FindPlugin always returns a
DisassemblerSP, so there's no reason why we can't just create a
DisassemblerSP in the first place.
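A minimal sketch of the before and after shapes of such a factory callback. `Plugin`, `ToyPlugin`, and the `FindPlugin*` helpers are hypothetical stand-ins, not the real plugin interfaces.

```cpp
#include <memory>
#include <string>

// Hypothetical plugin interface; names are illustrative only.
struct Plugin {
  virtual ~Plugin() = default;
  virtual std::string Name() const = 0;
};
struct ToyPlugin : Plugin {
  std::string Name() const override { return "toy"; }
};

using PluginSP = std::shared_ptr<Plugin>;

// Before: the callback returns a raw pointer that the caller must wrap.
using CreateRaw = Plugin *(*)();
// After: the callback returns the shared pointer directly.
using CreateSP = PluginSP (*)();

Plugin *CreateToyRaw() { return new ToyPlugin(); }
PluginSP CreateToySP() { return std::make_shared<ToyPlugin>(); }

PluginSP FindPluginOld(CreateRaw create) { return PluginSP(create()); }
PluginSP FindPluginNew(CreateSP create) { return create(); }
```

With the second form there is no window in which the object is held only by a raw pointer.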
Differential Revision: https://reviews.llvm.org/D150235
Various OptionValue related classes are passing around will_modify but
the value is never used. This patch simplifies the interfaces by
removing the redundant argument.
Refactor OptionValue to return a std::optional instead of taking a fail
value. This allows the caller to handle situations where there's no
value, instead of being unable to distinguish between the absence of a
value and the value happening to match the fail value. When a fail
value is required, std::optional::value_or() provides the same
functionality.
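The ambiguity can be shown with a small sketch. `ToyOption` is a hypothetical type for illustration, not the real OptionValue interface.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical option holder; not the real OptionValue interface.
struct ToyOption {
  std::optional<uint64_t> stored;

  // Old style: the caller cannot tell "unset" from "set to fail_value".
  uint64_t GetWithFail(uint64_t fail_value) const {
    return stored ? *stored : fail_value;
  }

  // New style: absence is explicit in the return type.
  std::optional<uint64_t> Get() const { return stored; }
};
```

A caller that still wants a fallback can write `option.Get().value_or(fallback)`.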
llvm has a structure for maps where the key's type is a string. Using
that also means that the keys for OptionValueDictionary don't stick
around forever in ConstString's StringPool (even after they are gone).
The only thing we lose here is ordering: iterating over the map where the keys
are ConstStrings guarantees that we iterate in alphabetical order.
StringMap makes no guarantees about the ordering when you iterate over
the entire map.
Differential Revision: https://reviews.llvm.org/D149482
Refactor the string conversion of the `lldb::InstructionControlFlowKind` enum out
of `Instruction::Dump` to enable reuse of this logic by the
JSON TraceDumper (to be implemented in separate diff).
Will coordinate the landing of this change with D130320 since there will be a minor merge conflict between
these changes.
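A sketch of what a standalone conversion might look like. The enumerator names are illustrative and may not match `lldb::InstructionControlFlowKind`. The strings "other", "return", and "cond jump" appear in the dump output in this message, while "unknown", "call", and "jump" are assumptions.

```cpp
#include <string>

// Hypothetical subset of control-flow kinds; names are assumptions.
enum class ControlFlowKind { Unknown, Other, Call, Return, Jump, CondJump };

// Standalone conversion, callable from any dumper rather than being
// buried inside one Dump function.
const char *ControlFlowKindToString(ControlFlowKind kind) {
  switch (kind) {
  case ControlFlowKind::Unknown:
    return "unknown";
  case ControlFlowKind::Other:
    return "other";
  case ControlFlowKind::Call:
    return "call";
  case ControlFlowKind::Return:
    return "return";
  case ControlFlowKind::Jump:
    return "jump";
  case ControlFlowKind::CondJump:
    return "cond jump";
  }
  return "unknown"; // Unreachable for valid enumerators.
}
```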
Test Plan:
Run unittests
```
> ninja check-lldb
[4/5] Running lldb unit test suite
Testing Time: 10.13s
Passed: 1084
```
Verify '-k' flag's output
```
(lldb) thread trace dump instructions -k
thread #1: tid = 1375377
libstdc++.so.6`std::ostream::flush() + 43
7048: 0x00007ffff7b54dab return retq
7047: 0x00007ffff7b54daa other popq %rbx
7046: 0x00007ffff7b54da7 other movq %rbx, %rax
7045: 0x00007ffff7b54da5 cond jump je 0x11adb0 ; <+48>
7044: 0x00007ffff7b54da2 other cmpl $-0x1, %eax
libc.so.6`_IO_fflush + 249
7043: 0x00007ffff7161729 return retq
7042: 0x00007ffff7161728 other popq %rbp
7041: 0x00007ffff7161727 other popq %rbx
7040: 0x00007ffff7161725 other movl %edx, %eax
7039: 0x00007ffff7161721 other addq $0x8, %rsp
7038: 0x00007ffff7161709 cond jump je 0x87721 ; <+241>
7037: 0x00007ffff7161707 other decl (%rsi)
7036: 0x00007ffff71616fe cond jump je 0x87707 ; <+215>
7035: 0x00007ffff71616f7 other cmpl $0x0, 0x33de92(%rip) ; __libc_multiple_threads
7034: 0x00007ffff71616ef other movq $0x0, 0x8(%rsi)
7033: 0x00007ffff71616ed cond jump jne 0x87721 ; <+241>
7032: 0x00007ffff71616e9 other subl $0x1, 0x4(%rsi)
7031: 0x00007ffff71616e2 other movq 0x88(%rbx), %rsi
7030: 0x00007ffff71616e0 cond jump jne 0x87721 ; <+241>
7029: 0x00007ffff71616da other testl $0x8000, (%rbx) ; imm = 0x8000
```
Differential Revision: https://reviews.llvm.org/D130580
This diff moves the logic of `GetControlFlowKind()` from Disassembler.cpp to DisassemblerLLVMC.cpp.
Details:
- The actual logic of `GetControlFlowKind()` moves to `DisassemblerLLVMC.cpp`, where we can check the underlying architecture using `DisassemblerScope`.
- With this change, passing 'triple' to `GetControlFlowKind()` is no longer required.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D130320
The C headers are deprecated so, as requested in D102845, this replaces them
all with their (not deprecated) C++ equivalents.
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D103084
Committing this patch for Augusto Noronha who is getting set
up still.
This patch changes Target::ReadMemory so the default behavior
when a read is in a Section that is read-only is to fetch the
data from the local binary image, instead of reading it from
memory. Update all callers to use their old preferences
(the old prefer_file_cache bool) using the new API; we should
revisit these calls and see if they really intend to read
live memory, or if reading from a read-only Section would be
equivalent and important for performance-sensitive cases.
rdar://30634422
Differential revision: https://reviews.llvm.org/D100338
This patch introduces a LLDB_SCOPED_TIMER macro to hide the needlessly
repetitive creation of scoped timers in LLDB. It's similar to the
LLDB_LOG(F) macro.
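As a rough illustration of the idea, here is a sketch of an RAII timer hidden behind a macro. `TOY_SCOPED_TIMER`, `ScopedTimer`, and `AllTimerStats` are hypothetical. The real LLDB_SCOPED_TIMER and Timer class differ, and this sketch is not thread-safe.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical per-label statistics; LLDB's real Timer class differs.
struct TimerStats {
  int calls = 0;
  std::chrono::nanoseconds total{0};
};

std::map<std::string, TimerStats> &AllTimerStats() {
  static std::map<std::string, TimerStats> stats;
  return stats;
}

// RAII timer: starts on construction, records on destruction.
class ScopedTimer {
public:
  explicit ScopedTimer(std::string label)
      : m_label(std::move(label)),
        m_start(std::chrono::steady_clock::now()) {}
  ~ScopedTimer() {
    TimerStats &entry = AllTimerStats()[m_label];
    ++entry.calls;
    entry.total += std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - m_start);
  }

private:
  std::string m_label;
  std::chrono::steady_clock::time_point m_start;
};

// The macro hides the boilerplate and labels the timer with the
// enclosing function's name.
#define TOY_SCOPED_TIMER() ScopedTimer toy_scoped_timer(__func__)

void ParseSomething() {
  TOY_SCOPED_TIMER();
  // ... work to be timed ...
}
```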
Differential revision: https://reviews.llvm.org/D93663
Depends on D89408.
This diff finally implements trace decoding!
The current interface is
$ trace load /path/to/trace/session/file.json
$ thread trace dump instructions
thread #1: tid = 3842849, total instructions = 22
[ 0] 0x40052d
[ 1] 0x40052d
...
[19] 0x400521
$ # simply enter, which is a repeat command
[20] 0x40052d
[21] 0x400529
...
This doesn't do any disassembly, which will be done in the next diff.
Changes:
- Added an IntelPTDecoder class, that is a wrapper for libipt, which is the actual library that performs the decoding.
- Added TraceThreadDecoder class that decodes traces and memoizes the result to avoid repeating the decoding step.
- Added a DecodedThread class, which represents the output from decoding and that for the time being only stores the list of reconstructed instructions. Later it'll contain the function call hierarchy, which will enable reconstructing backtraces.
- Added basic APIs for accessing the trace in Trace.h:
- GetInstructionCount, which counts the number of instructions traced for a given thread
- IsTraceFailed, which returns an Error if decoding a thread failed
- ForEachInstruction, which iterates on the instructions traced for a given thread, concealing the internal storage of threads, as plug-ins can decide to generate the instructions on the fly or to store them all in a vector, like I do.
- DumpTraceInstructions was updated to print the instructions or show an error message if decoding was impossible.
- Tests included
Differential Revision: https://reviews.llvm.org/D89283
An earlier change introduced a new Range class to simplify the
Disassembler API and reduce duplication. It unintentionally broke the
SBFrame::Disassemble functionality because it unconditionally converts
the number of instructions to a Range{Limit::Instructions,
num_instructions}. This is subtly different from the previous behavior:
now we pass a Range and assume it's valid in the callee, whereas the
original code would propagate num_instructions and the callee would
compare the value and decide between disassembling instructions or
bytes.
Unfortunately the existing test was not particularly strict:
disassembly = frame.Disassemble()
self.assertNotEqual(len(disassembly), 0, "Disassembly was empty.")
This would pass because without this patch we'd disassemble zero
instructions, resulting in an error:
(lldb) script print(lldb.frame.Disassemble())
error: error reading data from section __text
Differential revision: https://reviews.llvm.org/D89925
On Hexagon, breakpoints need to be on the first instruction of a packet.
When the LLVM disassembler for Hexagon returned 32 bit instructions, we
needed code to find the start of the current packet. Now that the LLVM
disassembler for Hexagon returns packets instead of instructions, we always
have the first instruction of the packet. Remove the packet traversal code
because it can cause problems when the next packet has more than one
instruction.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D84966
Summary:
The class has two pairs of functions whose functionalities differ
only in how one specifies how much to disassemble. One limits the
process by the size of the input memory region. The other limits it by
the total number of instructions disassembled. They also differ in
various features (like error reporting) that were only added to one of
the versions.
There are various ways in which this could be addressed. This patch
does it by introducing a helper struct called "Limit", which is
effectively a pair specifying the value that you want to limit, and the
actual limit itself.
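A minimal sketch of such a pair and of one routine serving both kinds of limit. The field names and `CountWithinLimit` are assumptions for illustration, not the actual Disassembler API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of a "what to limit, and by how much" pair.
struct Limit {
  enum Kind { Bytes, Instructions } kind;
  size_t value;
};

// Given the byte size of each instruction in order, return how many
// instructions fit within the limit. One routine handles both kinds.
size_t CountWithinLimit(const std::vector<size_t> &sizes, Limit limit) {
  size_t count = 0;
  size_t bytes = 0;
  for (size_t size : sizes) {
    if (limit.kind == Limit::Instructions && count >= limit.value)
      break;
    if (limit.kind == Limit::Bytes && bytes + size > limit.value)
      break;
    bytes += size;
    ++count;
  }
  return count;
}
```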
Reviewers: JDevlieghere
Subscribers: sdardis, jrtc27, atanasyan, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D75730
The static Disassembler can be thought of as shorthands for three
operations:
- fetch an appropriate disassembler instance (FindPluginForTarget)
- ask it to disassemble some bytes (ParseInstructions)
- ask it to dump the disassembled instructions (PrintInstructions)
The only thing that's standing in the way of this interpretation is that
the Disassemble function also does some address resolution before
calling ParseInstructions. This patch moves this functionality into
ParseInstructions so that it is available to users who call
ParseInstructions directly.
Some functions in this file only use the "target" component of an
execution context. Adjust the argument lists to reflect that.
This avoids some defensive null checks and simplifies most of the
callers.
the previously static member function took a Disassembler* argument
anyway. This renames the argument to "this". The function also always
succeeds (returns true), so I change the return type to void.
by "inlining" them into their single caller (CommandObjectDisassemble).
The functions mainly consist of long argument lists and defensive
checks. These become unnecessary after inlining, so the end result is
less code. Additionally, this makes the implementation of
CommandObjectDisassemble more uniform (first figure out what you're
going to disassemble, then actually do it), which enables further
cleanups.
Instead of a ExecutionContext*. All it needs is the target so it can
read the memory.
This removes some defensive checks from the function. I've added
equivalent checks to the callers in cases where a non-null target
pointer was not guaranteed to be available.
Summary:
All of our lookup APIs either use `CompilerDeclContext &` or `CompilerDeclContext *` semi-randomly it seems.
This leads to us constantly converting between those two types (and doing nullptr checks when going from
pointer to reference). It also leads to the confusing situation where we have two possible ways to express
that we don't have a CompilerDeclContext: either a nullptr or an invalid CompilerDeclContext (aka a default
constructed CompilerDeclContext).
This moves all APIs to use references and gets rid of all the nullptr checks and conversions.
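The two ways of expressing "no context" can be sketched with a hypothetical `ToyDeclContext` (not the real CompilerDeclContext class):

```cpp
// Hypothetical stand-in for a context type whose default-constructed
// state means "no context".
class ToyDeclContext {
public:
  ToyDeclContext() = default;
  explicit ToyDeclContext(int id) : m_id(id) {}
  bool IsValid() const { return m_id != 0; }

private:
  int m_id = 0;
};

// Pointer-based API: "no context" is either nullptr or an invalid object,
// so every callee has to check both.
bool HasNoContextOld(const ToyDeclContext *ctx) {
  return ctx == nullptr || !ctx->IsValid();
}

// Reference-based API: only the invalid object remains.
bool HasNoContextNew(const ToyDeclContext &ctx) { return !ctx.IsValid(); }
```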
Reviewers: labath, mib, shafik
Reviewed By: labath, shafik
Subscribers: shafik, arphaman, abidh, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D74607
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Summary:
A *.cpp file header in LLVM (and in LLDB) should look like this:
```
//===-- TestUtilities.cpp -------------------------------------------------===//
```
However in LLDB most of our source files have arbitrary changes to this format and
these changes are spreading through LLDB as folks usually just use the existing
source files as templates for their new files (most notably the unnecessary
editor language indicator `-*- C++ -*-` is spreading and in every review
someone is pointing out that this is wrong, resulting in people pointing out that this
is done in the same way in other files).
This patch removes most of these inconsistencies including the editor language indicators,
all the different missing/additional '-' characters, files that center the file name, missing
trailing `===//` (mostly caused by clang-format breaking the line).
Reviewers: aprantl, espindola, jfb, shafik, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: dexonsmith, wuzish, emaste, sdardis, nemanjai, kbarton, MaskRay, atanasyan, arphaman, jfb, abidh, jsji, JDevlieghere, usaxena95, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73258
If you don't do this you end up running arbitrary code with
only one thread allowed to run, which can cause deadlocks.
<rdar://problem/56422478>
Differential Revision: https://reviews.llvm.org/D71440
This patch removes the size_t return value and the append parameter
from the remainder of the Find.* functions in LLDB's internal API. As
in the previous patches, this is motivated by the fact that these
parameters aren't really used, and in the case of the append parameter
were frequently implemented incorrectly.
Differential Revision: https://reviews.llvm.org/D69119
llvm-svn: 375160