There were two bugs here.
eMatchTypeStartsWith searched for "symbol_name" by adding ".*" to the
end of the symbol name and treating that as a regex, which isn't
actually a regex for "starts with". The ".*" is in fact a no-op. When
we finally get to comparing the name, we compare against whatever form
of the name was in the accelerator table. But for C++ that might be
the mangled name. We should also try demangled names here, since most
users are going the see demangled not mangled names. I fixed these
two bugs and added a bunch of tests for FindGlobalVariables.
This change is in the DWARF parser code, so there may be a similar bug
in PDB, but the test for this was already skipped for Windows, so I
don't know about this.
You might theoretically need to do this Mangled comparison in
DWARFMappedHash::MemoryTable::FindByName
except when we have names we always chop them before looking them up
so I couldn't see any code paths that fail without that change. So I
didn't add that to this patch.
Differential Revision: https://reviews.llvm.org/D151940
Currently, the method `GetAttributeAddressRanges` takes a DWARFRangeList as a
parameter, just to immediately clear it. The method also returns the size of
this list. Such an API was obfuscating the intent of the call sites (it's not
clear from the method name what it returns) and it was obfuscating redundant
checks on the size of the list.
This commit refactors the method to return the list and to also make the call
sites use the more explicit `IsEmpty` method.
Differential Revision: https://reviews.llvm.org/D151451
In an effort to unify the different dwarf parsers available in the codebase,
this commit removes LLDB's custom parsing for the `.debug_ranges` DWARF section,
instead calling into LLVM's parser.
Subsequent work should look into unifying `llvm::DWARDebugRangeList` (whose
entries are pairs of (start, end) addresses) with `lldb::DWARFRangeList` (whose
entries are pairs of (start, length)). The lists themselves are also different
data structures, but functionally equivalent.
Depends on D150363
Differential Revision: https://reviews.llvm.org/D150366
The goal of this patch is to make it easier to reason about the state of
ObjCLanguage::MethodName. I do that in several ways:
- Instead of using the constructor directly, you go through a factory
method. It returns a std::optional<MethodName> so either you got an
ObjCLanguage::MethodName or you didn't. No more checking if it's valid
to know if you can use it or not.
- ObjCLanguage::MethodName is now immutable. You cannot change its
internals once it is created.
- ObjCLanguage::MethodName::GetFullNameWithoutCategory previously had a
parameter that let you get back an empty string if the method had no
category. Every caller of this method was enabling this behavior so I
dropped the parameter and made it the default behavior.
- No longer store all the various components of the method name as
ConstStrings. The relevant `Get` methods now return llvm::StringRefs
backed by the MethodName's internal storage. The lifetime of these
StringRefs are tied to the MethodName itself, so if you need to
persist these you need to create copies.
Differential Revision: https://reviews.llvm.org/D149914
Both LLVM and LLDB implement DWARFAbbreviationDeclaration. As of
631ff46cbf51, llvm's implementation of
DWARFAbbreviationDeclaration::extract behaves the same as LLDB's
implementation, making it easier to merge the implementations.
The only major difference between LLDB's implementation and LLVM's
implementation is that LLVM's DWARFAbbreviationDeclaration is slightly
larger. Specifically, it has some metadata that keeps track of the size
of a declaration (if it has a fixed size) so that it can potentially
optimize extraction in some scenarios. I think this increase in size
should be acceptable and possibly useful on the LLDB side.
Differential Revision: https://reviews.llvm.org/D150716
**Summary**
When filling out the LayoutInfo for a structure with the offsets
from DWARF, LLDB fills gaps in the layout by creating unnamed
bitfields and adding them to the AST. If we don't do this correctly
and our layout has overlapping fields, we will hat an assertion
in `clang::CGRecordLowering::lower()`. Specifically, if we have
a derived class with a VTable and a bitfield immediately following
the vtable pointer, we create a layout with overlapping fields.
This is an oversight in some of the previous cleanups done around this
area.
In `D76808`, we prevented LLDB from creating unnamed bitfields if there
was a gap between the last field of a base class and the start of a bitfield
in the derived class.
In `D112697`, we started accounting for the vtable pointer. The intention
there was to make sure the offset bookkeeping accounted for the
existence of a vtable pointer (but we didn't actually want to create
any AST nodes for it). Now that `last_field_info.bit_size` was being
set even for artifical fields, the previous fix `D76808` broke
specifically for cases where the bitfield was the first member of a
derived class with a vtable (this scenario wasn't tested so we didn't
notice it). I.e., we started creating redundant unnamed bitfields for
where the vtable pointer usually sits. This confused the lowering logic
in clang.
This patch adds a condition to `ShouldCreateUnnamedBitfield` which
checks whether the first field in the derived class is a vtable ptr.
**Testing**
* Added API test case
Differential Revision: https://reviews.llvm.org/D150591
This patch adds a new private helper
`DWARFASTParserClang::ShouldCreateUnnamedBitfield` which
`ParseSingleMember` whether we should fill the current gap
in a structure layout with unnamed bitfields.
Extracting this logic will allow us to add additional
conditions in upcoming patches without jeoperdizing readability
of `ParseSingleMember`.
We also store some of the boolean conditions in local variables
to make the intent more obvious.
Differential Revision: https://reviews.llvm.org/D150590
DWARFAttribute is used in 2 classes: DWARFAbbreviationDecl and
DWARFAttributes. The former stores a std::vector of them and the latter
has a small structure called AttributeValue that contains a
DWARFAttribute. DWARFAttributes maintains a llvm::SmallVector of
AttributeValues.
My end goal is to have `DWARFAttributes` have a llvm::SmallVector
specialized on DWARFAttribute. In order to do that, we'll have to move
the other elements of AttributeValue into DWARFAttribute itself. But we
don't want to do this while DWARFAbbreviationDecl is using
DWARFAttribute because it will needlessly increase the size of
DWARFAbbreviationDecl. So instead I will create a small type containing
only what DWARFAbbreviationDecl needs and call it `AttributeSpec`. This
is the exact same thing that LLVM does today.
I've elected to swap std::vector for llvm::SmallVector here with a pre-allocated
size of 8. I've collected time and memory measurements before this change and
after it as well. Using a c++ project with 10,000 object files and no dSYM, I
place a breakpoint by file + lineno and see how long it takes to resolve.
Before this patch:
Time (mean ± σ): 13.577 s ± 0.024 s [User: 12.418 s, System: 1.247 s]
Total number of bytes allocated: 1.38 GiB
Total number of allocations: 6.47 million allocations
After this patch:
Time (mean ± σ): 13.287 s ± 0.020 s [User: 12.128 s, System: 1.250 s]
Total number of bytes allocated: 1.59 GiB
Total number of allocations: 4.61 million allocations
So we consume more memory than before, but we actually make less allocations on
average.
I also measured with an llvm::SmallVector with a pre-allocated size of 4 instead
of 8 to measure how well it performs:
Time (mean ± σ): 13.246 s ± 0.048 s [User: 12.074 s, System: 1.268 s]
Total memory consumption: 1.50 GiB
Total number of allocations: 5.74 million
Of course this data may look very different depending on the actual program
being debugged, but each of the object files had 100+ AbbreviationDeclarations
each with between 0 and 10 Attributes, so I feel this was a fair example to
consider.
Differential Revision: https://reviews.llvm.org/D150418
The purpose of this method is to get the list of attributes of a
DebugInfoEntry. Prior to this change we were passing in a mutable
reference to a DWARFAttributes object and having the method fill it in
for us while returning the size of the filled out list. But
instead of doing that, we can just return a `DWARFAttributes` object
ourselves since every caller creates a new list before calling
GetAttributes.
Differential Revision: https://reviews.llvm.org/D150402
Similar to dw_form_t, dw_attr_t is typedef'd to be a uint16_t. LLVM
defines their type `llvm::dwarf::Attribute` as an enum backed by a
uint16_t. Switching to the LLVM type buys us type checking and the
requirement of explicit casts.
Differential Revision: https://reviews.llvm.org/D150299
Most of the code changed here dates back to 2010, when LLDB was first
introduced upstream, as such it benefits from a slight cleanup.
The method "dump" is not used anywhere nor is it tested, so this commit removes
it.
The "findRanges" method returns a boolean which is never checked and indicates
whether the method found anything/assigned a range map to the out parameter.
This commit folds the out parameter into the return type of the method.
A handful of typedefs were also never used and therefore removed.
Differential Revision: https://reviews.llvm.org/D150363
LLDB currently defines `dw_form_t` as a `uint16_t` which makes sense.
However, LLVM also defines a similar type `llvm::dwarf::Form` which is
an enum backed by a `uint16_t`. Switching to the llvm implementation
means that we can more easily interoperate with the LLVM DWARF code.
Additionally, we get some type checking out of this: I found that
DWARFAttribute had a method called `FormAtIndex` that returned a
`dw_attr_t`. Although `dw_attr_t` is also a `uint16_t` under the hood,
the type checking benefits here are undeniable: If this had returned a
something of different signedness/width, we could have had some bad
bugs.
Differential Revision: https://reviews.llvm.org/D150228
The LEB128 type defined by the DWARF standard is explicitly a variable-length
encoding of an integer. LLDB had defined `uleb128` and `sleb128` types
to be 32-bit but in many places in both LLVM and LLDB we treat the maximum
width of LEB128 types to be 64, so let's remove these types and be
consistent.
Differential Revision: https://reviews.llvm.org/D150222
Use templates to simplify {Get,Set}PropertyAtIndex. It has always
bothered me how cumbersome those calls are when adding new properties.
After this patch, SetPropertyAtIndex infers the type from its arguments
and GetPropertyAtIndex required a single template argument for the
return value. As an added benefit, this enables us to remove a bunch of
wrappers from UserSettingsController and OptionValueProperties.
Differential revision: https://reviews.llvm.org/D149774
The majority of call sites are nullptr as the execution context.
Refactor OptionValueProperties to make the argument optional and
simplify all the callers.
Similar to fdbe7c7faa54, refactor OptionValueProperties to return a
std::optional instead of taking a fail value. This allows the caller to
handle situations where there's no value, instead of being unable to
distinguish between the absence of a value and the value happening the
match the fail value. When a fail value is required,
std::optional::value_or() provides the same functionality.
We have a handful of places in LLDB where we try to outsmart the logic
in Mangled to determine whether a string is mangled or not. There's at
least one place (*) where we are getting this wrong and causes a subtle
bug. The `cstring_is_mangled` is cheap enough that we should always rely
on it to determine whether a string is mangled or not.
(*) `ObjectFileMachO` assumes that a symbol that starts with a double
underscore (such as `__pthread_kill`) is mangled. That's mostly
harmless, until you use `function.name-without-args` in the frame
format. The formatter calls `Symbol::GetNameNoArguments()` which is a
wrapper around `Mangled::GetName(ePreferDemangledWithoutArguments)`. The
latter will first try using the appropriate language plugin to get the
demangled name without arguments, and if that fails, falls back to
returning the demangled name. Because we forced Mangled to treat the
symbol as a mangled name (even though it's not) there's no demangled
name. The result is that frames don't show any symbol at all.
Differential revision: https://reviews.llvm.org/D148846
This reverts commit d81cdb49d74064e88843733e7da92db865943509.
This refactoring was waiting on converting LLVM to C++17.
Leave StringView.h and cleanup around for subsequent cleanup.
Additional fixes for missing std::string_view conversions for MSVC.
Reviewed By: MaskRay, DavidSpickett, ayzhao
Differential Revision: https://reviews.llvm.org/D148546
These probably do not need to be in the ConstString StringPool as they
don't really need any of the advantages that ConstStrings offer.
Lifetime for these things is always static and we never need to perform
comparisons for setting descriptions.
Differential Revision: https://reviews.llvm.org/D148679
Add MSP430 to the list of available targets, implement MSP430 ABI, add support for debugging targets with 16-bit address size.
The update is intended for use with MSPDebug, a GDB server implementation for MSP430.
Reviewed By: bulbazord, DavidSpickett
Differential Revision: https://reviews.llvm.org/D146965
Add MSP430 to the list of available targets, implement MSP430 ABI, add support for debugging targets with 16-bit address size.
The update is intended for use with MSPDebug, a GDB server implementation for MSP430.
Reviewed By: bulbazord, DavidSpickett
Differential Revision: https://reviews.llvm.org/D146965
This reverts commit 3e559509b426b6aae735a7f57dbdaed1041d2622 and e0c4ffa796b553fa78c638a9584c05ac21fe07d5.
This still breaks Windows builds.
In addition, `#include <llvm/ADT/StringViewExtras.h>` in
llvm/include/llvm/Demangle/ItaniumDemangle.h is a library layering violation
(LLVMDemangle is the lowest LLVM library and cannot depend on LLVMSupport).
This refactoring was waiting on converting LLVM to C++17.
Leave StringView.h and cleanup around for subsequent cleanup.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D148384
**Summary**
In a program such as:
```
namespace A {
namespace B {
struct Bar {};
}
}
namespace B {
struct Foo {};
}
```
...LLDB would run into issues such as:
```
(lldb) expr ::B::Foo f
error: expression failed to parse:
error: <user expression 0>:1:6: no type named 'Foo' in namespace 'A::B'
::B::Foo f
~~~~~^
```
This is because the `SymbolFileDWARF::FindNamespace` implementation
will return *any* namespace it finds if the `parent_decl_ctx` provided
is empty. In `FindExternalVisibleDecls` we use this API to find the
namespace that symbol `B` refers to. If `A::B` happened to be the one
that `SymbolFileDWARF::FindNamespace` looked at first, we would try
to find `struct Foo` in `A::B`. Hence the error.
This patch proposes a new `SymbolFileDWARF::FindNamespace` API that
will only find a match for top-level namespaces, which is what
`FindExternalVisibleDecls` is attempting anyway; it just never
accounted for multiple namespaces of the same name.
**Testing**
* Added API test-case
Differential Revision: https://reviews.llvm.org/D147436
The function DWARFASTParserClang::ParsePointerToMemberType attempts to make
two pointers and then immediately tries to dereference them, without
verifying that the pointesr were successfully created. Sometimes the pointer
creation fails, and the dereference then causes a segfault. This add a check
that the pointers are non-null before attempting to dereference them.
Implement SymbolFile::GetCompileOptions, which returns a map from
compilation units to compilation arguments associated with that unit.
Differential Revision: https://reviews.llvm.org/D147748
Ensure that the variant returned by `member->getValue()` has a value and
is not `Empty`. Failure to do so will trigger an assertion failure in
`llvm::pdb::Variant::getBitWidth()`. This can occur when the `static`
member is a forward declaration.
Differential Revision: https://reviews.llvm.org/D146536
Reviewed By: sgraenitz
SymbolFile::ParseAllLanguages allows collecting the languages of the
extra compile units a SymbolFileDWARFDebugMap may have, which can't
be accessed otherwise. For every other symbol file type, it should
behave exactly the same as ParseLanguage.
rdar://97610458
Differential Revision: https://reviews.llvm.org/D146265
This patch adds llvm::codeview::SourceLanguage entries, DWARF translations, and PDB source file extensions in LLVM and allow LLDB's PDB parsers to recognize them correctly.
The CV_CFL_LANG enum in the Visual Studio 2022 documentation https://learn.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/cv-cfl-lang defines:
```
CV_CFL_OBJC = 0x11,
CV_CFL_OBJCXX = 0x12,
```
Since the initial commit in D24317, ObjC was emitted as C language and ObjC++ as Masm.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D146221
The old name of this function was confusing for me, when I started working on the PDB parser. The new name clearifies that the function adds file, line and column (typically referred as source info) and indicates that the information is stored in the provided decl parameter.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D146286
Introduce a new object and symbol file format with the goal of mapping
addresses to symbol names. I'd like to think of is as an extremely
simple textual symtab. The file format consists of a triple, a UUID and
a list of symbols. JSON is used for the encoding, but that's mostly an
implementation detail. The goal of the format was to be simple and human
readable.
The new file format is motivated by two use cases:
- Stripped binaries: when a binary is stripped, you lose the ability to
do thing like setting symbolic breakpoints. You can keep the
unstripped binary around, but if all you need is the stripped
symbols then that's a lot of overhead. Instead, we could save the
stripped symbols to a file and load them in the debugger when
needed. I want to extend llvm-strip to have a mode where it emits
this new file format.
- Interactive crashlogs: with interactive crashlogs, if we don't have
the binary or the dSYM for a particular module, we currently show an
unnamed symbol for those frames. This is a regression compared to the
textual format, that has these frames pre-symbolicated. Given that
this information is available in the JSON crashlog, we need a way to
tell LLDB about it. With the new symbol file format, we can easily
synthesize a symbol file for each of those modules and load them to
symbolicate those frames.
Here's an example of the file format:
{
"triple": "arm64-apple-macosx13.0.0",
"uuid": "36D0CCE7-8ED2-3CA3-96B0-48C1764DA908",
"symbols": [
{
"name": "main",
"type": "code",
"size": 32,
"address": 4294983568
},
{
"name": "foo",
"type": "code",
"size": 8,
"address": 4294983560
}
]
}
Differential revision: https://reviews.llvm.org/D145180
Usually PDB files have a string table (aka: Named Stream "/names" ). PDB for
some windows system libraries might not have that. This adds the check for it to
avoid crash in the absence of string table.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D145115
In an upcoming patch, D142556, Clang is proposed to be changed to emit
line locations that are inlined at line 0. This clashed with the behavior of
GetDIENamesAndRanges() which used 0 as a default value to determine if
file, line or column numbers had been set. Users of that function then
checked for any non-0 values when setting up the call site:
if (call_file != 0 || call_line != 0 || call_column != 0)
[...]
which did not work with the Clang change since all three values then
could be 0.
This changes the function to use std::optional to catch non-set values
instead.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D142552
This came out of from https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
With big binaries we can have .dwp files where .debug_info.dwo section can grow
beyond 4GB. We would like to support this in LLVM and in LLDB.
The plan is to enable manual parsing of cu/tu index in DWARF library
(https://reviews.llvm.org/D137882), and then
switch internal index data structure to 64 bit.
For the second part is to enable 64bit offset support in LLDB with
this patch.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D138618
Note that those functions on the left hand side are soft-deprecated in
favor of those on the right hand side:
getMinSignedBits -> getSignificantBits
getNullValue -> getZero
isNullValue -> isZero
isOneValue -> isOne
This came out of from https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
With big binaries we can have .dwp files where .debug_info.dwo section can grow
beyond 4GB. We would like to support this in LLVM and in LLDB.
The plan is to enable manual parsing of cu/tu index in DWARF library
(https://reviews.llvm.org/D137882), and then
switch internal index data structure to 64 bit.
For the second part is to enable 64bit offset support in LLDB with
this patch.
Depends on D139955
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D138618
This reverts commit be7d7ca1101840fc8e19e0e48f9dc395da569d23.
This commit was made to fix be7d7ca1101840fc8e19e0e48f9dc395da569d23
which got reverted in 620b3d9ba3343d7bc5bab2340174a20952fcd00f. We need
to revert this commit as well because types in log statements are 32 bit again.