This came out of from https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
With big binaries we can have .dwp files where .debug_info.dwo section can grow
beyond 4GB. We would like to support this in LLVM and in LLDB.
The plan is to enable manual parsing of cu/tu index in DWARF library
(https://reviews.llvm.org/D137882), and then
switch internal index data structure to 64 bit.
For the second part is to enable 64bit offset support in LLDB with
this patch.
Depends on D139955
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D138618
This came out of from https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
With big binaries we can have .dwp files where .debug_info.dwo section can grow
beyond 4GB. We would like to support this in LLVM and in LLDB.
The plan is to enable manual parsing of cu/tu index in DWARF library
(https://reviews.llvm.org/D137882), and then
switch internal index data structure to 64 bit.
For the second part is to enable 64bit offset support in LLDB with
this patch.
Depends on D139955
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D138618
This reverts commit 19128792e2aa320c1a149f7f93638cbd7f3c83c6.
As pointed out in https://reviews.llvm.org/D143652 this implementation
doesn't quite work for subobject constructors/destructors because DWARF
can map multiple definitions of a ctor/dtor to the same specification DIE.
With the current implementation we would pick the first definition we
find and use that linkage name which means we can sometimes pick the
wrong dtor/ctor and fail to execute a valid expression.
Differential Revision: https://reviews.llvm.org/D143652
This relands a patch previously reverted
in `181d6e24ca3c09bfd6ec7c3b20affde3e5ea9b40`.
This wasn't quite working on Linux because we
weren't populating the manual DWARF index with
`DW_TAG_imported_declaration`. The relanded patch
does this.
**Summary**
This patch makes the expression evaluator understand
namespace aliases.
This will become important once `std::ranges` become
more widespread since `std::views` is defined as:
```
namespace std {
namespace ranges::views {}
namespace views = ranges::views;
}
```
**Testing**
* Added API test
Differential Revision: https://reviews.llvm.org/D143398
This relands the commit previously reverted in
`d2cc2c5610ffa78736aa99512bc85a85417efb0a` due to failures on Linux
when debugging split-debug-info enabled executables.
The problem was we called `SymbolFileDWARF::FindFunctions` directly
instead of `Module::FindFunctions` which resulted in a nullptr
dereference because the backing `SymbolFileDWARFDwo` didn't have
an index attached to it. The relanded version calls `Module::FindFunctions`
instead.
Differential Revision: https://reviews.llvm.org/D143652
**Summary**
This patch addresses the case where we have a `DW_AT_external`
subprogram for a constructor (and/or destructor) that doesn't carry
a `DW_AT_linkage_name` attribute. The corresponding DIE(s) that
represent the definition will have a linkage name, but if the name
contains constructs that LLDBs fallback mechanism for guessing mangled
names to resolve external symbols doesn't support (e.g., abi-tags)
then we end up failing to resolve the function call.
We address this by trying to find the linkage name before we create
the constructor/destructor decl, which will get attached using
an `AsmLabelAttr` to make symbol resolution easier.
**Testing**
* Added API test
Differential Revision: https://reviews.llvm.org/D143652
**Summary**
This patch makes the expression evaluator understand
namespace aliases.
This will become important once `std::ranges` become
more widespread since `std::views` is defined as:
```
namespace std {
namespace ranges::views {}
namespace views = ranges::views;
}
```
**Testing**
* Added API test
Differential Revision: https://reviews.llvm.org/D143398
This patch makes the members of `TemplateParameterInfos` only accessible
via public APIs. The motivation for this is that
`TemplateParameterInfos` attempts to maintain two vectors in tandem
(`args` for the template arguments and `names` for the corresponding
name). Working with this structure as it's currently designed makes
it easy to run into out-of-bounds accesses later down the line.
This patch proposes to introduce a new
`TemplateParameterInfos::InsertArg` which is the only way to
set the `TemplateArgument` and name of an entry and since we
require both to be specified we maintain the vectors in sync
out-of-the-box.
To avoid adding non-const getters just for unit-tests a new
`TemplateParameterInfosManipulatorForTests` is introduced
that can be used to control internal state from tests.
Otherwise we may be inserting a decl into a DeclContext that's not fully defined yet.
This simplifies/removes some clang AST node creation code. Instead, use
clang::printTemplateArgumentList().
Reviewed By: Michael137
Differential Revision: https://reviews.llvm.org/D142413
SymbolFiles should own Types by keeping them in their TypeList. This
patch privates the Type constructor to guarantee that every created Type
is kept in the SymbolFile's type list.
This regressed with `e262b8f48af9fdca8380f2f079e50291956aad71`.
Two issues here:
1. `:16x` is not a valid format specifier and
we would crash when we encountered this log
(which was the case in `TestCPPAccelerator.py`)
2. The third argument was missing curly braces so
the log message itself was malformed.
In preparation for eanbling 64bit support in LLDB switching to use llvm::formatv
instead of format MACROs.
Reviewed By: labath, JDevlieghere
Differential Revision: https://reviews.llvm.org/D139955
Without checking template parameters, we would sometimes lookup the
wrong type definition for a type declaration because different
instantiations of the same template class had the same debug info name.
The added GetForwardDeclarationDIETemplateParams() shouldn't need a
cache because we'll cache the results of the declaration -> definition
lookup anyway. (DWARFASTParserClang::ParseStructureLikeDIE()
is_forward_declaration branch)
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D138834
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
A previous patch added the ability for us to tell if types were forcefully completed. This patch adds the ability to see which modules have forcefully completed types and aggregates the number of modules with forcefully completed types at the root level.
We add a module specific setting named "debugInfoHadIncompleteTypes" that is a boolean value. We also aggregate the number of modules at the root level that had incomplete debug info with a key named "totalModuleCountWithIncompleteTypes" that is a count of number of modules that had incomplete types.
Differential Revision: https://reviews.llvm.org/D138638
When a process gets restarted TypeSystem objects associated with it
may get deleted, and any CompilerType objects holding on to a
reference to that type system are a use-after-free in waiting. Because
of the SBAPI, we don't have tight control over where CompilerTypes go
and when they are used. This is particularly a problem in the Swift
plugin, where the scratch TypeSystem can be restarted while the
process is still running. The Swift plugin has a lock to prevent
abuse, but where there's a lock there can be bugs.
This patch changes CompilerType to store a std::weak_ptr<TypeSystem>.
Most of the std::weak_ptr<TypeSystem>* uglyness is hidden by
introducing a wrapper class CompilerType::WrappedTypeSystem that has a
dyn_cast_or_null() method. The only sites that need to know about the
weak pointer implementation detail are the ones that deal with
creating TypeSystems.
rdar://101505232
Differential Revision: https://reviews.llvm.org/D136650
It's required in following situations:
1. As a base class.
2. As a data member.
3. As an array element type.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D134066
Undoes a lot of the code added in D135169 to piggyback off of the enum logic in `TypeSystemClang::SetIntegerInitializerForVariable()`.
Fixes#58383.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D137045
See https://discourse.llvm.org/t/dwarf-using-simplified-template-names/58417 for background on simplified template names.
lldb doesn't work with simplified template names because it uses DW_AT_name which doesn't contain template parameters under simplified template names.
Two major changes are required to make lldb work with simplified template names.
1) When building clang ASTs for struct-like dies, we use the name as a cache key. To distinguish between different instantiations of a template class, we need to add in the template parameters.
2) When looking up types, if the requested type name contains '<' and we didn't initially find any types from the index searching the name, strip the template parameters and search the index, then filter out results with non-matching template parameters. This takes advantage of the clang AST's ability to print full names rather than doing it by ourself.
An alternative is to fix up the names in the index to contain the fully qualified name, but that doesn't respect .debug_names.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D134378
In D134378, we'll need the clang AST to be able to construct the qualified in some cases.
This makes logging in one place slightly less informative.
Reviewed By: dblaikie, Michael137
Differential Revision: https://reviews.llvm.org/D135979
Fixes#58135
Somehow lldb was able to print the member on its own but when we try
to print the whole type found by "image lookup -t" lldb would crash.
This is because we'd encoded the initial value of the member as an integer.
Which isn't the end of the world because bool is integral for C++.
However, clang has a special AST node to handle literal bool and it
expected us to use that instead.
This adds a new codepath to handle static bool which uses cxxBoolLiteralExpr
and we get the member printed as you'd expect.
For testing I added a struct with just the bool because trying to print
all of "A" crashes as well. Presumably because one of the other member's
types isn't handled properly either.
So for now I just added the bool case, we can merge it with A later.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D135169
Currently funciton lookup in the expression evaluator
fails to disambiguate member functions the are overloaded
on lvalue/rvalue reference-qualifiers. This happens because
we unconditionally set a `FunctionPrototype`s
`ExtProtoInfo::RefQualifier` to `RQ_None`. We lose
the ref-qualifiers in the synthesized AST and `clang::Sema`
fails to pick a correct overload candidate.
DWARF emits information about a function's ref-qualifiers
in the form of a boolean `DW_AT_rvalue_reference` (for rvalues)
and `DW_AT_reference` (for lvalues).
This patch sets the `FunctionPrototype::ExtProtoInfo::RefQualifier`
based on the DWARF attributes above.
**Testing**
* Added API test
llvm/llvm-project issue #57866
Differential Revision: https://reviews.llvm.org/D134661
This check was put in place to prevent static functions
from translation units outside the one that the current
expression is evaluated from taking precedence over functions
in the global namespace. However, this is really a different
bug. LLDB lumps functions from all CUs into a single AST and
ends up picking the file-static even when C++ context rules
wouldn't allow that to happen.
This patch removes the check so we apply the AsmLabel to all
FunctionDecls we create from DWARF if we have a linkage name
available. This makes the code-path easier to reason about and
allows calling static functions in contexts where we previously
would've chosen the wrong function.
We also flip the XFAILs in the API test to reflect what effect
this change has.
**Testing**
* Fixed API tests and added XFAIL
Differential Revision: https://reviews.llvm.org/D132231
When resolving symbols during IR execution, lldb makes a last effort attempt
to resolve external symbols from object files by approximate name matching.
It currently uses `CPlusPlusNameParser` to parse the demangled function name
and arguments for the unresolved symbol and its candidates. However, this
hand-rolled C++ parser doesn’t support ABI tags which, depending on the demangler,
get demangled into `[abi:tag]`. This lack of parsing support causes lldb to never
consider a candidate mangled function name that has ABI tags.
The issue reproduces by calling an ABI-tagged template function from the
expression evaluator. This is particularly problematic with the recent
addition of ABI tags to numerous libcxx APIs.
The issue stems from the fact that `clang::CodeGen` emits function
function calls using the mangled name inferred from the `FunctionDecl`
LLDB constructs from DWARF. Debug info often lacks information for
us to construct a perfect FunctionDecl resulting in subtle mangled
name inaccuracies.
This patch side-steps the problem of inaccurate `FunctionDecl`s by
attaching an `asm()` label to each `FunctionDecl` LLDB creates from DWARF.
`clang::CodeGen` consults this label to get the mangled name as one of
the first courses of action when emitting a function call.
LLDB already does this for C++ member functions as of
[675767a5910d2ec77ef8b51c78fe312cf9022896](https://reviews.llvm.org/D40283)
**Testing**
* Added API tests
Differential Revision: https://reviews.llvm.org/D131974
This reverts commit 967df65a3610f98a3bc0ec0f2303641d7bad176c.
This fixes test/Shell/SymbolFile/NativePDB/find-functions.cpp. When
looking up functions with the PDB plugins, if we are looking for a
full function name, we should use `GetName` to populate the `name`
field instead of `GetLookupName` since `GetName` has the more
complete information.
This reverts commit befa77e59a7760d8c4fdd177b234e4a59500f61c.
Looks like this broke a SymbolFileNativePDB test. I'll investigate and
resubmit with a fix soon.
Context:
When setting a breakpoint by name, we invoke Module::FindFunctions to
find the function(s) in question. However, we use a Module::LookupInfo
to first process the user-provided name and figure out exactly what
we're looking for. When we actually perform the function lookup, we
search for the basename. After performing the search, we then filter out
the results using Module::LookupInfo::Prune. For example, given
a:🅱️:foo we would first search for all instances of foo and then filter
out the results to just names that have a:🅱️:foo in them. As one can
imagine, this involves a lot of debug info processing that we do not
necessarily need to be doing. Instead of doing one large post-processing
step after finding each instance of `foo`, we can filter them as we go
to save time.
Some numbers:
Debugging LLDB and placing a breakpoint on
llvm::itanium_demangle::StringView::begin without this change takes
approximately 70 seconds and resolves 31,920 DIEs. With this change,
placing the breakpoint takes around 30 seconds and resolves 8 DIEs.
Differential Revision: https://reviews.llvm.org/D129682
Fixing potential int overflow and uninitialized variables.
These were found by Coverity static code inspection.
Differential Revision: https://reviews.llvm.org/D130795
Reland 486787210d which broke tests on Arm and Windows.
* Windows -- on Windows const static data members with no out-of-class
definition do have valid addresses, in constract to other platforms
(Linux, macos) where they don't. Adjusted the test to expect success
on Windows and failure on other platforms.
* Arm -- `int128` is not available on 32-bit ARM, so disable the test
for this architecture.
This adds support for using const static integral data members as described by C++11 [class.static.data]p3
to LLDB's expression evaluator.
So far LLDB treated these data members are normal static variables. They already work as intended when they are declared in the class definition and then defined in a namespace scope. However, if they are declared and initialised in the class definition but never defined in a namespace scope, all LLDB expressions that use them will fail to link when LLDB can't find the respective symbol for the variable.
The reason for this is that the data members which are only declared in the class are not emitted into any object file so LLDB can never resolve them. Expressions that use these variables are expected to directly use their constant value if possible. Clang can do this for us during codegen, but it requires that we add the constant value to the VarDecl we generate for these data members.
This patch implements this by:
* parsing the constant values from the debug info and adding it to variable declarations we encounter.
* ensuring that LLDB doesn't implicitly try to take the address of expressions that might be an lvalue that points to such a special data member.
The second change is caused by LLDB's way of storing lvalues in the expression parser. When LLDB parses an expression, it tries to keep the result around via two mechanisms:
1. For lvalues, LLDB generates a static pointer variable and stores the address of the last expression in it: `T *$__lldb_expr_result_ptr = &LastExpression`
2. For everything else, LLDB generates a static variable of the same type as the last expression and then direct initialises that variable: `T $__lldb_expr_result(LastExpression)`
If we try to print a special const static data member via something like `expr Class::Member`, then LLDB will try to take the address of this expression as it's an lvalue. This means LLDB will try to take the address of the variable which causes that Clang can't replace the use with the constant value. There isn't any good way to detect this case (as there a lot of different expressions that could yield an lvalue that points to such a data member), so this patch also changes that we only use the first way of capturing the result if the last expression does not have a type that could potentially indicate it's coming from such a special data member.
This change shouldn't break most workflows for users. The only observable side effect I could find is that the implicit persistent result variables for const int's now have their own memory address:
Before this change:
```
(lldb) p i
(const int) $0 = 123
(lldb) p &$0
(const int *) $1 = 0x00007ffeefbff8e8
(lldb) p &i
(const int *) $2 = 0x00007ffeefbff8e8
```
After this change we capture `i` by value so it has its own value.
```
(lldb) p i
(const int) $0 = 123
(lldb) p &$0
(const int *) $1 = 0x0000000100155320
(lldb) p &i
(const int *) $2 = 0x00007ffeefbff8e8
```
Reviewed By: Michael137
Differential Revision: https://reviews.llvm.org/D81471
This reland 227dffd0b6d78154516ace45f6ed28259c7baa48 and
562c3467a6738aa89203f72fc1d1343e5baadf3c with failed api tests fixed by keeping
function base file addres in DWARFExpressionList.