Lots of code around LLDB was directly accessing the target's section
load list. This NFC patch makes the section load list private so the
Target class can access it, but everyone else now uses accessor
functions. This allows us to control the resolving of addresses and will
allow for functionality in LLDB which can lazily resolve addresses in
JIT plug-ins with a future patch.
Add the ability to override the disassembly CPU and CPU features through
a target setting (`target.disassembly-cpu` and
`target.disassembly-features`) and a `disassemble` command option
(`--cpu` and `--features`).
This is especially relevant for architectures like RISC-V which relies
heavily on CPU extensions.
The majority of this patch is plumbing the options through. I recommend
looking at DisassemblerLLVMC and the test for the observable change in
behavior.
Walter Erquinigo added optional instruction annotations for x86
instructions in 2022 for the `thread trace dump instruction` command,
and code to DisassemblerLLVMC to add annotations for instructions that
change flow control, v. https://reviews.llvm.org/D128477
This was added as an option to `disassemble`, and the trace dump command
enables it by default, but several other instruction dumpers were
changed to display them by default as well. These are only implemented
for Intel instructions, so our disassembly on other targets ends up
looking like
```
(lldb) x/5i 0x1000086e4
0x1000086e4: 0xa9be6ffc unknown stp x28, x27, [sp, #-0x20]!
0x1000086e8: 0xa9017bfd unknown stp x29, x30, [sp, #0x10]
0x1000086ec: 0x910043fd unknown add x29, sp, #0x10
0x1000086f0: 0xd11843ff unknown sub sp, sp, #0x610
0x1000086f4: 0x910c63e8 unknown add x8, sp, #0x318
```
instead of `disassemble`'s output style of
```
lldb`main:
lldb[0x1000086e4] <+0>: stp x28, x27, [sp, #-0x20]!
lldb[0x1000086e8] <+4>: stp x29, x30, [sp, #0x10]
lldb[0x1000086ec] <+8>: add x29, sp, #0x10
lldb[0x1000086f0] <+12>: sub sp, sp, #0x610
lldb[0x1000086f4] <+16>: add x8, sp, #0x318
```
Adding symbolic annotations for assembly instructions is something I'm
interested in too, because we may have users investigating a crash or
apparent-incorrect behavior who must debug optimized assembly and they
may not be familiar with the ISA they're using, so short of flipping
through a many-thousand-page PDF to understand each instruction, they're
lost. They don't write assembly or work at that level, but to understand
a bug, they have to understand what the instructions are actually doing.
But the annotations that exist today don't move us forward much on that
front - I'd argue that the flow control instructions on Intel are not
hard to understand from their names, but that might just be my personal
bias. Much trickier instructions exist in any event.
Displaying this information by default for all targets when we only have
one class of instructions on one target is not a good default.
Also, in 2011 when Greg implemented the `memory read -f i` (aka `x/i`)
command
```
commit 5009f9d5010a7e34ae15f962dac8505ea11a8716
Author: Greg Clayton <gclayton@apple.com>
Date: Thu Oct 27 17:55:14 2011 +0000
[...]
eFormatInstruction will print out disassembly with bytes and it will use the
current target's architecture. The format character for this is "i" (which
used to be being used for the integer format, but the integer format also has
"d", so we gave the "i" format to disassembly), the long format is
"instruction".
```
he had DumpDataExtractor's DumpInstructions print the bytes of the
instruction -- that's the first field we see above for the `x/5i` after
the address -- and this is only useful for people who are debugging the
disassembler itself, I would argue. I don't want this displayed by
default either.
tl;dr this patch removes both fields from `memory read -f -i` and I
think this is the right call today. While I'm really interested in
instruction annotation, I don't think `x/i` is the right place to have
it enabled by default unless it's really compelling on at least some of
our major targets.
As suggested by Greg in https://github.com/llvm/llvm-project/pull/66534,
I'm adding a setting at the Target level that controls whether to show
leading zeroes in hex ValueObject values.
This has the benefit of reducing the amount of characters displayed in
certain interfaces, like VSCode.
When specifying the C-string format for dumping memory, we treat
unprintable characters as signed. Whether a character is signed or not
is implementation defined, but all printable characters are signed.
Therefore it's fair to assume that unprintable characters are unsigned.
Before this patch, "\xcf\xfa\xed\xfe\f" would be printed as
"\xffffffcf\xfffffffa\xffffffed\xfffffffe\f". Now we correctly print the
original string.
rdar://111126134
Differential revision: https://reviews.llvm.org/D153644
std::optional::value() has undesired exception checking semantics and is
unavailable in some older Xcode. The call sites block std::optional migration.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
When a process gets restarted TypeSystem objects associated with it
may get deleted, and any CompilerType objects holding on to a
reference to that type system are a use-after-free in waiting. Because
of the SBAPI, we don't have tight control over where CompilerTypes go
and when they are used. This is particularly a problem in the Swift
plugin, where the scratch TypeSystem can be restarted while the
process is still running. The Swift plugin has a lock to prevent
abuse, but where there's a lock there can be bugs.
This patch changes CompilerType to store a std::weak_ptr<TypeSystem>.
Most of the std::weak_ptr<TypeSystem>* uglyness is hidden by
introducing a wrapper class CompilerType::WrappedTypeSystem that has a
dyn_cast_or_null() method. The only sites that need to know about the
weak pointer implementation detail are the ones that deal with
creating TypeSystems.
rdar://101505232
Differential Revision: https://reviews.llvm.org/D136650
The DumpDataExtractor function had two branches for printing floating
point values. One branch (APFloat) was used if we had a Target object
around and could query it for the appropriate semantics. If we didn't
have a Target, we used host operations to read and format the value.
This patch changes second path to use APFloat as well. To make it work,
I pick reasonable defaults for different byte size. Notably, I did not
include x87 long double in that list (as it is ambibuous and
architecture-specific). This exposed a bug where we were printing
register values using the target-less branch, even though the registers
definitely belong to a target, and we had it available. Fixing this
prompted the update of several tests for register values due to slightly
different floating point outputs.
The most dubious aspect of this patch is the change in
TypeSystemClang::GetFloatTypeSemantics to recognize `10` as a valid size
for x87 long double. This was necessary because because sizeof(long
double) on x86_64 is 16 even though it only holds 10 bytes of useful
data. This generalizes the hackaround present in the target-free branch
of the dumping function.
Differential Revision: https://reviews.llvm.org/D129750
This adds an option --show-tags to "memory read".
(lldb) memory read mte_buf mte_buf+32 -f "x" -s8 --show-tags
0x900fffff7ff8000: 0x0000000000000000 0x0000000000000000 (tag: 0x0)
0x900fffff7ff8010: 0x0000000000000000 0x0000000000000000 (tag: 0x1)
Tags are printed on the end of each line, if that
line has any tags associated with it. Meaning that
untagged memory output is unchanged.
Tags are printed based on the granule(s) of memory that
a line covers. So you may have lines with 1 tag, with many
tags, no tags or partially tagged lines.
In the case of partially tagged lines, untagged granules
will show "<no tag>" so that the ordering is obvious.
For example, a line that covers 2 granules where the first
is not tagged:
(lldb) memory read mte_buf-16 mte_buf+16 -l32 -f"x" --show-tags
0x900fffff7ff7ff0: 0x00000000 <...> (tags: <no tag> 0x0)
Untagged lines will just not have the "(tags: ..." at all.
Though they may be part of a larger output that does have
some tagged lines.
To do this I've extended DumpDataExtractor to also print
memory tags where it has a valid execution context and
is asked to print them.
There are no special alignment requirements, simply
use "memory read" as usual. All alignment is handled
in DumpDataExtractor.
We use MakeTaggedRanges to find all the tagged memory
in the current dump, then read all that into a MemoryTagMap.
The tag map is populated once in DumpDataExtractor and re-used
for each subsequently printed line (or recursive call of
DumpDataExtractor, which some formats do).
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D107140
The C headers are deprecated so as requested in D102845, this is replacing them
all with their (not deprecated) C++ equivalent.
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D103084
It seems std::ostringstream ignores NaN signs on Darwin while it prints them on
Linux. This causes that LLDB behaves differently on those platforms which is
both confusing for users and it also means we have to deal with that in our
tests.
This patch manually implements the NaN/Inf printing (which are apparently
implementation defined) to make LLDB print the same thing on all platforms. The
only output difference in practice seems to be that we now print negative NaNs
as `-nan`, but this potentially also changes the output on other systems I
haven't tested this on.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D102845
This relands part of the UB fix in 4b074b49be206306330076b9fa40632ef1960823.
The original commit also added some additional tests that uncovered some
other issues (see D102845). I landed all the passing tests in
48780527dd6820698f3537f5ebf76499030ee349 and this patch is now just fixing
the UB in half2float. See D102846 for a proposed rewrite of the function.
Original commit message:
The added DumpDataExtractorTest uncovered that this is lshifting a negative
integer which upsets ubsan and breaks the sanitizer bot. This patch just
changes the variable we shift to be unsigned.
The added DumpDataExtractorTest uncovered that this is lshifting a negative
integer which upsets ubsan and breaks the sanitizer bot. This patch just
changes the variable we shift to be unsigned and adds a bunch of tests to make
sure this function does what it promises.
Summary: Just unifying all that copy-pasted code.
Reviewers: JDevlieghere
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D83662
Summary:
LLVM is using its own isPrint/isSpace implementation that doesn't change depending on the current locale. LLDB should do the same
to prevent that internal logic changes depending on the set locale.
Reviewers: JDevlieghere, labath, mib, totally_not_teemperor
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D82175
Summary:
A *.cpp file header in LLDB (and in LLDB) should like this:
```
//===-- TestUtilities.cpp -------------------------------------------------===//
```
However in LLDB most of our source files have arbitrary changes to this format and
these changes are spreading through LLDB as folks usually just use the existing
source files as templates for their new files (most notably the unnecessary
editor language indicator `-*- C++ -*-` is spreading and in every review
someone is pointing out that this is wrong, resulting in people pointing out that this
is done in the same way in other files).
This patch removes most of these inconsistencies including the editor language indicators,
all the different missing/additional '-' characters, files that center the file name, missing
trailing `===//` (mostly caused by clang-format breaking the line).
Reviewers: aprantl, espindola, jfb, shafik, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: dexonsmith, wuzish, emaste, sdardis, nemanjai, kbarton, MaskRay, atanasyan, arphaman, jfb, abidh, jsji, JDevlieghere, usaxena95, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73258
Summary:
Yet another step on the long road towards getting rid of lldb's Stream class.
We probably should just make this some kind of member of Address/AddressRange, but it seems quite often we just push
in random integers in there and this is just about getting rid of Stream and not improving arbitrary APIs.
I had to rename another `DumpAddress` function in FormatEntity that is dumping the content of an address to make Clang happy.
Reviewers: labath
Reviewed By: labath
Subscribers: JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71052
Summary:
DumpDataExtractor uses ClangASTContext in order to get the proper llvm
fltSemantics for the type it needs so that it can dump floats in a more
precise way. However, there's no reason that this behavior needs to be
specific ClangASTContext. Instead, I think it makes sense to ask
TypeSystems for the float semantics for a type of a given size.
Differential Revision: https://reviews.llvm.org/D67239
llvm-svn: 371258
Summary:
We got a radar that printing small floats is not very user-friendly in LLDB as we print them with up to
100 leading zeroes before starting to use scientific notation. This patch changes this by already using
scientific notation when we hit 6 padding zeroes by default and moves this value into a target setting
so that users can just set this number back to 100 if they for some reason preferred the old behaviour.
This new setting is influencing how we format data, so that's why we have to reset the data visualisation
cache when it is changed.
Note that we have always been using scientific notation for large numbers because it seems that
the LLVM implementation doesn't support printing out the padding zeroes for them. I would have fixed
that if it was trivial, but looking at the LLVM implementation for this it seems that this is not as trivial
as it sounds. I would say we look into this if we ever get a bug report about someone wanting to have
a large amount of trailing zeroes in their numbers instead of using scientific notation.
Fixes rdar://39744137
Reviewers: #lldb, clayborg
Reviewed By: clayborg
Subscribers: JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D67001
llvm-svn: 370880
The current implementation returns a bool for indicating success and
whether or not the APInt passed by reference was populated. Instead of
doing that, I think it makes more sense to return an Optional<APInt>.
llvm-svn: 369970
The `ap` suffix is a remnant of lldb's former use of auto pointers,
before they got deprecated. Although all their uses were replaced by
unique pointers, some variables still carried the suffix.
In r353795 I removed another auto_ptr remnant, namely redundant calls to
::get for unique_pointers. Jim justly noted that this is a good
opportunity to clean up the variable names as well.
I went over all the changes to ensure my find-and-replace didn't have
any undesired side-effects. I hope I didn't miss any, but if you end up
at this commit doing a git blame on a weirdly named variable, please
know that the change was unintentional.
llvm-svn: 353912
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This patch removes the comments following the header includes. They were
added after running IWYU over the LLDB codebase. However they add little
value, are often outdates and burdensome to maintain.
Differential revision: https://reviews.llvm.org/D54385
llvm-svn: 346625
This is intended as a clean up after the big clang-format commit
(r280751), which unfortunately resulted in many of the comment
paragraphs in LLDB being very hard to read.
FYI, the script I used was:
import textwrap
import commands
import os
import sys
import re
tmp = "%s.tmp"%sys.argv[1]
out = open(tmp, "w+")
with open(sys.argv[1], "r") as f:
header = ""
text = ""
comment = re.compile(r'^( *//) ([^ ].*)$')
special = re.compile(r'^((([A-Z]+[: ])|([0-9]+ )).*)|(.*;)$')
for line in f:
match = comment.match(line)
if match and not special.match(match.group(2)):
# skip intentionally short comments.
if not text and len(match.group(2)) < 40:
out.write(line)
continue
if text:
text += " " + match.group(2)
else:
header = match.group(1)
text = match.group(2)
continue
if text:
filled = textwrap.wrap(text, width=(78-len(header)),
break_long_words=False)
for l in filled:
out.write(header+" "+l+'\n')
text = ""
out.write(line)
os.rename(tmp, sys.argv[1])
Differential Revision: https://reviews.llvm.org/D46144
llvm-svn: 331197
Summary:
LLDB's DumpDataExtractor was not prepared to handle PowerPC's long double type: PPCDoubleDouble.
As it is somewhat special, treating it as other regular float types resulted in getting wrong information about it.
In this particular case, llvm::APFloat::getSizeInBits(PPCDoubleDouble) was returning 0.
This caused the TestSetValues.py test to fail, because lldb would abort on an assertion failure on APInt(), because of the invalid size.
Since in the PPC case the value of item_byte_size was correct and the
getSizeInBits call was only added to support x87DoubleExtended
semantics, this restricts the usage of getSizeInBits to the x87
semantics.
Reviewers: labath, clayborg
Reviewed By: labath
Subscribers: llvm-commits, anajuliapc, alexandreyy, lbianc, lldb-commits
Differential Revision: https://reviews.llvm.org/D42083
Author: Leandro Lupori <leandro.lupori@gmail.com>
llvm-svn: 322666
* Prevent dumping of characters in DumpDataExtractor() with
item_byte_size bigger than 8 bytes. This case is not supported by the
code and results in a crash because the code calls
DataExtractor::GetMaxU64Bitfield() -> GetMaxU64() that asserts for
byte size > 8 bytes.
* Teach DataExtractor::GetMaxU64(), GetMaxU32(), GetMaxS64() and
GetMaxU64_unchecked() how to handle byte sizes that are not a multiple
of 2. This allows DumpDataExtractor() to dump characters and booleans
with item_byte_size in the interval of [1, 8] bytes. Values that are
not a multiple of 2 would previously result in a crash because they
were not handled by GetMaxU64().
llvm-svn: 315444
This adjusts header file includes for headers and source files
in Core. In doing so, one dependency cycle is eliminated
because all the includes from Core to that project were dead
includes anyway. In places where some files in other projects
were only compiling due to a transitive include from another
header, fixups have been made so that those files also include
the header they need. Tested on Windows and Linux, and plan
to address failures on OSX and FreeBSD after watching the
bots.
llvm-svn: 299714
In an effort to move the various DataBuffer / DataExtractor
classes from Core -> Utility, we have to separate the low-level
functionality from the higher level functionality. Only a
few functions required anything other than reading/writing
raw bytes, so those functions are separated out into a
more appropriate area. Specifically, Dump() and DumpHexBytes()
are moved into free functions in Core/DumpDataExtractor.cpp,
and GetGNUEHPointer is moved into a static function in the
only file that it's referenced from.
Differential Revision: https://reviews.llvm.org/D30560
llvm-svn: 296910