Factors out code to locate a definition DIE from a `FunctionCallLabel`
into a helper. This will be useful for an upcoming refactor that extends
`FindFunctionDefinition`.
Drive-by:
* adjusts some error messages in the `FunctionCallLabel` support code.
LLDB currently attaches `AsmLabel`s to `FunctionDecl`s such that that
the `IRExecutionUnit` can determine which mangled name to call (we can't
rely on Clang deriving the correct mangled name to call because the
debug-info AST doesn't contain all the info that would be encoded in the
DWARF linkage names). However, we don't attach `AsmLabel`s for structors
because they have multiple variants and thus it's not clear which
mangled name to use. In the [RFC on fixing expression evaluation of
abi-tagged
structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
we discussed encoding the structor variant into the `AsmLabel`s.
Specifically in [this
thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
we discussed that the contents of the `AsmLabel` are completely under
LLDB's control and we could make use of it to uniquely identify a
function by encoding the exact module and DIE that the function is
associated with (mangled names need not be enough since two identical
mangled symbols may live in different modules). So if we already have a
custom `AsmLabel` format, we can encode the structor variant in a
follow-up (the current idea is to append the structor variant as a
suffix to our custom `AsmLabel` when Clang emits the mangled name into
the JITted IR). Then we would just have to teach the `IRExecutionUnit`
to pick the correct structor variant DIE during symbol resolution. The
draft of this is available
[here](https://github.com/llvm/llvm-project/pull/149827)
This patch sets up the infrastructure for the custom `AsmLabel` format
by encoding the module id, DIE id and mangled name in it.
**Implementation**
The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is:
`$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this
label and then do a lookup by DIE ID (or mangled name into the module if
the encoded DIE is a declaration).
Depends on https://github.com/llvm/llvm-project/pull/151355
The ultimate goal is to replace the return types of all the `DWARFIndex`
callbacks to `IterationAction`. To reduce the blast radius and do this
incrementally I'm doing this for `GetFunctions` only here. I added a
`IterationActionAdaptor` helper (that will ultimately get removed once
all APIs have been migrated) to avoid having to change too many other
APIs that `GetFunctions` interacts with.
Add support for DW_OP_WASM_location in DWARFExpression. This PR rebases
#78977 and cleans up the unit test. The DWARF extensions are documented
at https://yurydelendik.github.io/webassembly-dwarf/ and supported by
LLVM-based toolchains such as Clang, Swift, Emscripten, and Rust.
LanguageType has two kinds of enumerators in it. The first is
DWARF-assigned enumerators which must be consecutive and match DW_LANG
values. The second is the vendor-assigned enumerators which must be
unique and must follow on from the DWARF-assigned values (i.e. the first
one is currently eLanguageTypeMojo + 1) even if that collides with
DWARF-assigned values that lldb is not yet aware of
Only the DWARF-assigned enumerators may be static_cast from DW_LANG
since their values match. The vendor-assigned enumerators must be
explicitly converted since their values do not match. This needs to
handle new languages added to DWARF and not yet implemented in lldb.
This fixes a crash when encountering a DW_LANG value >=
eNumLanguageTypes and wrong behaviour when encountering DW_LANG values
that have not yet been added to LanguageType but happen to coincide with
a vendor-assigned enumerator due to the consecutive values requirement
described above.
Another way to fix the crash is to add the language to LanguageType (and
fill any preceeding gaps in the number space) so that the DW_LANG being
encountered is correctly handled but this just moves the problem to a
new subset of DW_LANG values.
Also fix an unnecessary static-cast from LanguageType to LanguageType.
Eliminate the `lldb_private::dwarf` namespace, in favor of using
`llvm::dwarf` directly. The latter is shorter, and this avoids ambiguity
in the ABI plugins that define a `dwarf` namespace inside an anonymous
namespace.
### Summary
Currently `target modules dump separate separate-debug-info`
automatically loads up all DWO files, even if deferred loading is
enabled through debug_names. Then, as expected all DWO files (assuming
there is no error loading it), get marked as "loaded".
This change adds the option `--force-load-all-debug-info` or `-f` for
short to force loading all debug_info up, if it hasn't been loaded yet.
Otherwise, it will change default behavior to not load all debug info so
that the correct DWO files will show up for each modules as "loaded" or
not "loaded", which could be helpful in cases where we want to know
which particular DWO files were loaded.
### Testing
#### Unit Tests
Added additional unit tests
`test_dwos_load_json_with_debug_names_default` and
`test_dwos_load_json_with_debug_names_force_load_all` to test both
default behavior and loading with the new flag
`--force-load-all-debug-info`, and changed expected behavior in
`test_dwos_loaded_symbols_on_demand`.
```
bin/lldb-dotest -p TestDumpDwo ~/llvm-project/lldb/test/API/commands/target/dump-separate-debug-info/dwo
```
#### Manual Testing
Compiled a simple binary w/ `--gsplit-dwarf --gpubnames` and loaded it
up:
```
(lldb) target create "./a.out"
Current executable set to '/home/qxy11/hello-world/a.out' (x86_64).
(lldb) help target modules dump separate-debug-info
List the separate debug info symbol files for one or more target modules.
Syntax: target modules dump separate-debug-info <cmd-options> [<filename> [<filename> [...]]]
Command Options Usage:
target modules dump separate-debug-info [-efj] [<filename> [<filename> [...]]]
-e ( --errors-only )
Filter to show only debug info files with errors.
-f ( --force-load-all-debug-info )
Load all debug info files.
-j ( --json )
Output the details in JSON format.
This command takes options and free-form arguments. If your arguments resemble option specifiers (i.e., they start with a - or --), you must use ' -- ' between the end of the
command options and the beginning of the arguments.
(lldb) target modules dump separate-debug-info --j
[
{
"separate-debug-info-files": [
{ ...
"dwo_name": "main.dwo",
"loaded": false
},
{ ...
"dwo_name": "foo.dwo",
"loaded": false
},
{ ...
"dwo_name": "bar.dwo",
"loaded": false
}
],
}
]
(lldb) b main
Breakpoint 1: where = a.out`main + 15 at main.cc:3:12, address = 0x00000000000011ff
(lldb) target modules dump separate-debug-info --j
[
{
"separate-debug-info-files": [
{ ...
"dwo_name": "main.dwo",
"loaded": true,
"resolved_dwo_path": "/home/qxy11/hello-world/main.dwo"
},
{ ...
"dwo_name": "foo.dwo",
"loaded": false
},
{ ...
"dwo_name": "bar.dwo",
"loaded": false
}
],
}
]
(lldb) b foo
Breakpoint 2: where = a.out`foo(int) + 11 at foo.cc:12:11, address = 0x000000000000121b
(lldb) target modules dump separate-debug-info --j
[
{
"separate-debug-info-files": [
{ ...
"dwo_name": "main.dwo",
"loaded": true,
"resolved_dwo_path": "/home/qxy11/hello-world/main.dwo"
},
{ ...
"dwo_name": "foo.dwo",
"loaded": true,
"resolved_dwo_path": "/home/qxy11/hello-world/foo.dwo"
},
{ ...
"dwo_name": "bar.dwo",
"loaded": false
}
],
}
]
(lldb) b bar
Breakpoint 3: where = a.out`bar(int) + 11 at bar.cc:10:9, address = 0x000000000000126b
(lldb) target modules dump separate-debug-info --j
[
{
"separate-debug-info-files": [
{ ...
"dwo_name": "main.dwo",
"loaded": true,
"resolved_dwo_path": "/home/qxy11/hello-world/main.dwo"
},
{ ...
"dwo_name": "foo.dwo",
"loaded": true,
"resolved_dwo_path": "/home/qxy11/hello-world/foo.dwo"
},
{ ...
"dwo_name": "bar.dwo",
"loaded": true,
"resolved_dwo_path": "/home/qxy11/hello-world/bar.dwo"
}
],
}
]
```
This relands changes in #144424 for adding a count of DWO files
parsed/loaded and the total number of DWO files.
The previous PR was reverted in #145494 due to the newly added unit
tests failing on Windows and MacOS CIs since these platforms don't
support DWO. This change add an additional
`@add_test_categories(["dwo"])` to the new tests to
[skip](cd46354dbd/lldb/packages/Python/lldbsuite/test/test_categories.py (L56))
these tests on Windows/MacOS.
Original PR: #144424
### Testing
Ran unit tests
```
$ bin/lldb-dotest -p TestStats.py llvm-project/lldb/test/API/commands/statistics/basic/
----------------------------------------------------------------------
Ran 24 tests in 211.391s
OK (skipped=3)
```
## Summary
A new `totalLoadedDwoFileCount` and `totalDwoFileCount` counters to
available statisctics when calling "statistics dump".
1. `GetDwoFileCounts ` is created, and returns a pair of ints
representing the number of loaded DWO files and the total number of DWO
files, respectively. An override is implemented for `SymbolFileDWARF`
that loops through each compile unit, and adds to a counter if it's a
DWO unit, and then uses `GetDwoSymbolFile(false)` to check whether the
DWO file was already loaded/parsed.
3. In `Statistics`, use `GetSeparateDebugInfo` to sum up the total
number of loaded/parsed DWO files along with the total number of DWO
files. This is done by checking whether the DWO file was already
successfully `loaded` in the collected DWO data, anding adding to the
`totalLoadedDwoFileCount`, and adding to `totalDwoFileCount` for all CU
units.
## Expected Behavior
- When binaries are compiled with split-dwarf and separate DWO files,
`totalLoadedDwoFileCount` would be the number of loaded DWO files and
`totalDwoFileCount` would be the total count of DWO files.
- When using a DWP file instead of separate DWO files,
`totalLoadedDwoFileCount` would be the number of parsed compile units,
while `totalDwoFileCount` would be the total number of CUs in the DWP
file. This should be similar to the counts we get from loading separate
DWO files rather than only counting whether a single DWP file was
loaded.
- When not using split-dwarf, we expect both `totalDwoFileCount` and
`totalLoadedDwoFileCount` to be 0 since no separate debug info is
loaded.
## Testing
**Manual Testing**
On an internal script that has many DWO files, `statistics dump` was
called before and after a `type lookup` command. The
`totalLoadedDwoFileCount` increased as expected after the `type lookup`.
```
(lldb) statistics dump
{
...
"totalLoadedDwoFileCount": 29,
}
(lldb) type lookup folly::Optional<unsigned int>::Storage
typedef std::conditional<true, folly::Optional<unsigned int>::StorageTriviallyDestructible, folly::Optional<unsigned int>::StorageNonTriviallyDestructible>::type
typedef std::conditional<true, folly::Optional<unsigned int>::StorageTriviallyDestructible, folly::Optional<unsigned int>::StorageNonTriviallyDestructible>::type
...
(lldb) statistics dump
{
...
"totalLoadedDwoFileCount": 2160,
}
```
**Unit test**
Added three unit tests that build with new "third.cpp" and "baz.cpp"
files. For tests with w/ flags `-gsplit-dwarf -gpubnames`, this
generates 2 DWO files. Then, the test incrementally adds breakpoints,
and does a type lookup, and the count should increase for each of these
as new DWO files get loaded to support these.
```
$ bin/lldb-dotest -p TestStats.py ~/llvm-sand/external/llvm-project/lldb/test/API/commands/statistics/basic/
----------------------------------------------------------------------
Ran 20 tests in 211.738s
OK (skipped=3)
```
Depends on https://github.com/llvm/llvm-project/pull/142163
This patch makes the `-ast-dump-filter` Clang option available to the
`target modules dump ast` command. This allows us to selectively dump
parts of the AST by name.
The AST can quickly grow way too large to skim on the console. This will
aid in debugging AST related issues.
Example:
```
(lldb) target modules dump ast --filter func
Dumping clang ast for 48 modules.
Dumping func:
FunctionDecl 0xc4b785008 <<invalid sloc>> <invalid sloc> func 'void (int)' extern
|-ParmVarDecl 0xc4b7853d8 <<invalid sloc>> <invalid sloc> x 'int'
`-AsmLabelAttr 0xc4b785358 <<invalid sloc>> Implicit "_Z4funcIiEvT_"
Dumping func<int>:
FunctionDecl 0xc4b7850b8 <<invalid sloc>> <invalid sloc> func<int> 'void (int)' implicit_instantiation extern
|-TemplateArgument type 'int'
| `-BuiltinType 0xc4b85b110 'int'
`-ParmVarDecl 0xc4b7853d8 <<invalid sloc>> <invalid sloc> x 'int'
```
The majority of this patch is adjust the `Dump` API. The main change in
behaviour is in `TypeSystemClang::Dump`, where we now use the
`ASTPrinter` for dumping the `TranslationUnitDecl`. This is where the
`-ast-dump-filter` functionality lives in Clang.
Inferred submodule declarations are emitted in DWARF as `DW_TAG_module`s
without `DW_AT_LLVM_include_path`s. Instead the parent DIE will have the
include path. This patch adds support for such setups. Without this, the
`ClangModulesDeclVendor` would fail to `AddModule` the submodules (i.e.,
compile and load the submodules). This would cause errors such as:
```
note: error: Header search couldn't locate module 'TopLevel'
```
The test added here also tests
https://github.com/llvm/llvm-project/pull/141220. Without that patch
we'd fail with:
```
note: error: No module map file in /Users/jonas/Git/llvm-worktrees/llvm-project/lldb/test/API/lang/cpp/decl-from-submodule
```
Unfortunately the embedded clang instance doesn't allow us to use the
decls we find in the modules. But we'll try to fix that in a separate
patch.
rdar://151022173
Add an overloaded `GetTypeSystem` to specify the expected type system subclass. Changes code from `GetTypeSystem().dyn_cast_or_null<TypeSystemClang>()` to `GetTypeSystem<TypeSystemClang>()`.
Identical PR to: https://github.com/llvm/llvm-project/pull/134563
Previous PR was approved and landed but broke the build due to bad
merge.
Manually resolve the merge conflict and try to land again.
Co-authored-by: George Hu <georgehuyubo@gmail.com>
This reverts commit 070a4ae2f9bcf6967a7147ed2972f409eaa7d3a6.
Multiple buildbot failures have been reported:
https://github.com/llvm/llvm-project/pull/134563
The build fails with:
lldb/source/Target/Statistics.cpp:75:39: error: use of undeclared
identifier 'num_symbols_loaded'
The check is not correct for discontinuous functions, as one of the
blocks could very well begin before the function entry point. To catch
dead-stripped ranges, I check whether the functions is after the first
known code address. I don't print any error in this case as that is a
common/expected situation.
This avoids many errors like:
```
error: ld-linux-x86-64.so.2 0x00085f3b: adding range [0x0000000000001ae8-0x0000000000001b07) which has a
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
the start of this error message
```
when debugging binaries on debian trixie because the dynamic linker
(ld-linux) contains discontinuous functions.
If the block ranges is not a subrange of the enclosing block then this
will range will currently be added to the outer block as well (i.e., we
get the same behavior that's currently possible for non-subrange blocks
larger than function_low_pc). However, this code path is buggy and I'd
like to change that (#117725).
This patch addresses the issue #129543.
After this patch DWARFExpression does not call DWARFUnit directly and does not depend on
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp and a lot of clang code.
After this patch the size of lldb-server binary (Linux Aarch64) is reduced from 47MB to 17MB.
This patch pushes the error handling boundary for the GetBitSize()
methods from Runtime into the Type and CompilerType APIs. This makes it
easier to diagnose problems thanks to more meaningful error messages
being available. GetBitSize() is often the first thing LLDB asks about a
type, so this method is particularly important for a better user
experience.
rdar://145667239
This reverts commit 6041c745f32e8fd60ed24e29e7d919d8d1c87ca6.
Relands the original patch with the test-case data fixed. Weirldy the PR CI
didn't seem to run the unit-tests? In any case, the problem was an
incorrect expectation in the test-case data. Since we have both public
and internal SDK in that test-case, we should `expect_mismatch` to be
`true`.
Reverts llvm/llvm-project#128712
```
******************** TEST 'lldb-unit :: SymbolFile/DWARF/./SymbolFileDWARFTests/10/14' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/unittests/SymbolFile/DWARF/./SymbolFileDWARFTests-lldb-unit-1021-10-14.json GTEST_SHUFFLE=1 GTEST_TOTAL_SHARDS=14 GTEST_SHARD_INDEX=10 GTEST_RANDOM_SEED=62233 /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/unittests/SymbolFile/DWARF/./SymbolFileDWARFTests
--
Script:
--
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/unittests/SymbolFile/DWARF/./SymbolFileDWARFTests --gtest_filter=SDKPathParsingTests/SDKPathParsingMultiparamTests.TestSDKPathFromDebugInfo/6
--
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/unittests/SymbolFile/DWARF/XcodeSDKModuleTests.cpp:265: Failure
Expected equality of these values:
found_mismatch
Which is: true
expect_mismatch
Which is: false
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/unittests/SymbolFile/DWARF/XcodeSDKModuleTests.cpp:265
Expected equality of these values:
found_mismatch
Which is: true
expect_mismatch
Which is: false
```
`GetSDKRoot` uses `xcrun` to find an SDK root path for a given SDK
version string. But if the SDK doesn't exist in the Xcode installations,
but instead lives in the `CommandLineTools`, `xcrun` will fail to find
it. Negative searches for an SDK path cost a lot (a few seconds) each
time `xcrun` is invoked. We do cache negative results in
`find_cached_path` inside LLDB, but we would still pay the price on
every new debug session the first time we evaluate an expression. This
doesn't only cause a noticable delay in running the expression, but also
generates following error:
```
error: Error while searching for Xcode SDK: timed out waiting for shell command to complete
(int) $0 = 42
```
In this patch we avoid these possibly expensive calls to `xcrun` by
checking the `DW_AT_LLVM_sysroot`, and if it exists, using that as the
SDK path. We need an explicit check for the `CommandLineTools` path
before we call `RegisterXcodeSDK`, because that will try to call
`xcrun`. This won't prevent other uses of `GetSDKRoot` popping up that
cause us to make expensive `xcrun` calls, but for now this addresses the
regression in the expression evaluator. We also had to adjust the
`XcodeSDK::Merge` logic to update the sysroot. There is one case for
which this wouldn't make sense: if a CU was compiled with
`CommandLineTools` and a different one with an older internal SDK, in
that case we would update the `CommandLineTools` sysroot with a
`.Internal.sdk` prefix, which won't possibly exist for
`CommandLineTools`. I added a unit-test for this. Not sure if we want to
explicitly detect and disallow this, given it's quite a niche scenario.
rdar://113619904
rdar://113619723
The purpose of this originally was to check for DWARF which refers to
garbage-collected functions (by checking whether we're able to get a
good address out of the function). The address check has been removed in
https://reviews.llvm.org/D112310, so the code computing it is not doing
anything.
This is the behavior expected by DWARF. It also requires some fixups to
algorithms which were storing the addresses of some objects (Blocks and
Variables) relative to the beginning of the function.
There are plenty of things that still don't work in this setups, but
this change is sufficient for the expression evaluator to correctly
recognize the entry point of a function in this case.
`GetAttributes` returns all attributes on a given DIE, including any
attributes that the DIE references via `DW_AT_abstract_origin` and
`DW_AT_specification`. However, if an attribute exists on both the
referring DIE and the referenced DIE, the first one encountered will be
the one that takes precendence when querying the returned
`DWARFAttributes`. But there was no guarantee in which order those
attributes get visited. That means there's no convenient way of ensuring
that an attribute of a definition doesn't get shadowed by one found on
the declaration. One use-case where we don't want this to happen is for
`DW_AT_object_pointer` (which can exist on both definitions and
declarations, see https://github.com/llvm/llvm-project/pull/123089).
This patch makes sure we visit the current DIE's attributes before
following DIE references. I tried keeping as much of the original
`GetAttributes` unchanged and just add an outer `GetAttributes` that
keeps track of the DIEs we need to visit next.
There's precendent for this iteration order in
`llvm::DWARFDie::findRecursively` and also
`lldb_private::ElaboratingDIEIterator`. We could use the latter to
implement `GetAttributes`, though it also follows `DW_AT_signature` so I
decided to leave it for follow-up.
Anonymous namespaces are supposed to be optional when looking up types.
This was not working in combination with -gsimple-template-names,
because the way it was constructing the complete (with template args)
name scope (i.e., by generating thescope as a string and then reparsing
it) did not preserve the information about the scope kinds.
Essentially what the code wants here is to call `GetTypeLookupContext`
(that's the function used to get the context in the "regular" code
path), but to embelish each name with the template arguments (if they
don't have them already). This PR implements exactly that by adding an
argument to control which kind of names are we interested in. This
should also make the lookup faster as it avoids parsing of the long
string, but I haven't attempted to benchmark that.
I believe this function can also be used in some other places where
we're manually appending template names, but I'm leaving that for
another patch.
The problem here manifests as follows:
1. We are stopped in main.o, so the first `ParseTypeFromDWARF` on
`FooImpl<char>` gets called on `main.o`'s SymbolFile. This adds a
mapping from *declaration die* -> `TypeSP` into `main.o`'s
`GetDIEToType` map.
2. We then `CompleteType(FooImpl<char>)`. Depending on the order of
entries in the debug-map, this might call `CompleteType` on `lib.o`'s
SymbolFile. In which case, `GetDIEToType().lookup(decl_die)` will return
a `nullptr`. This is already a bit iffy because some of the surrounding
code assumes we don't call `CompleteTypeFromDWARF` with a `nullptr`
`Type*`. E.g., `CompleteEnumType` blindly dereferences it (though enums
will never encounter this because their definition is fetched in
ParseEnum, unlike for structures).
3. While in `CompleteTypeFromDWARF`, we call `ParseTypeFromDWARF` again.
This will parse the member function `FooImpl::Create` and its return
type which is a typedef to `FooImpl*`. But now we're inside `lib.o`'s
SymbolFile, so we call it on the definition DIE. In step (2) we just
inserted a `nullptr` into `GetDIEToType` for the definition DIE, so we
trivially return a `nullptr` from `ParseTypeFromDWARF`. Instead of
reporting back this parse failure to the user LLDB trucks on and marks
`FooImpl::Ref` to be `void*`.
This test-case will trigger an assert in `TypeSystemClang::VerifyDecl`
even if we just `frame var` (but only in debug-builds). In release
builds where this function is a no-op, we'll create an incorrect Clang
AST node for the `Ref` typedef.
The proposed fix here is to share the `GetDIEToType` map between
SymbolFiles if a debug-map exists.
**Alternatives considered**
* Check the `GetDIEToType` map of the `SymbolFile` that the declaration
DIE belongs to. The assumption here being that if we called
`ParseTypeFromDWARF` on a declaration, the `GetDIEToType` map that the
result was inserted into was the one on that DIE's SymbolFile. This was
the first version of this patch, but that felt like a weaker version
sharing the map. It complicates the code in `CompleteType` and is less
consistent with the other bookkeeping structures we already share
between SymbolFiles
* Return from `SymbolFileDWARF::CompleteType` if there is no type in the
current `GetDIEToType`. Then `SymbolFileDWARFDebugMap::CompleteType`
could continue to the next `SymbolFile` which does own the type. But
that didn't quite work because we remove the
`GetForwardCompilerTypeToDie` entry in `SymbolFile::CompleteType`, which
`SymbolFileDWARFDebugMap::CompleteType` relies upon for iterating
The main difference is that the llvm class (just a std::vector in
disguise) is not sorted. It turns out this isn't an issue because the
callers either:
- ignore the range list;
- convert it to a different format (which is then sorted);
- or query the minimum value (which is faster than sorting)
The last case is something I want to get rid of in a followup as a part
of removing the assumption that function's entry point is also its
lowest address.
This reverts commit 2526d5b1689389da9b194b5ec2878cfb2f4aca93, reapplying
ba14dac481564000339ba22ab867617590184f4c after fixing the conflict with
#117532. The change is that Function::GetAddressRanges now recomputes
the returned value instead of returning the member. This means it now
returns a value instead of a reference type.
This is a follow-up/reimplementation of #115730. While working on that
patch, I did not realize that the correct (discontinuous) set of ranges
is already stored in the block representing the whole function. The
catch -- ranges for this block are only set later, when parsing all of
the blocks of the function.
This patch changes that by populating the function block ranges eagerly
-- from within the Function constructor. This also necessitates a
corresponding change in all of the symbol files -- so that they stop
populating the ranges of that block. This allows us to avoid some
unnecessary work (not parsing the function DW_AT_ranges twice) and also
results in some simplification of the parsing code.
It's basically true already (except for a brief time during
construction). This patch makes sure the objects are constructed with a
valid parent and enforces this in the type system, which allows us to
get rid of some nullptr checks.
In #108907, the index classes started filtering the DIEs according to
the full type query (instead of just the base name). This means that the
checks in SymbolFileDWARF are now redundant.
I've also moved the non-redundant checks so that now all checking is
done in the DWARFIndex class and the caller can expect to get the final
filtered list of types.
This reverts commit f06c187799d910fd3ac3e9106397e5eecff9f265.
Temporary revert: there is https://github.com/llvm/llvm-project/pull/117239 that is suppose to fix the issue.
Reverting to keep things rolling.
This is the second half of
https://github.com/llvm/llvm-project/pull/90008.
Essentially, it replaces the work of resolving template types when we
just need the qualified names with walking the DIE tree using
`DWARFTypePrinter`.
### Result
For an internal target, the time spent on `expr *this` for the first
time reduced from 28 secs to 17 secs.
The class is only used from one place, which is trivial to implement
using the llvm class.
The main difference is that in the new implementation, the ranges are
parsed each time anew (instead of being parsed at startup and cached). I
believe this is fine because:
- this is already how things work with DWARF v5 debug_rnglists
- parsing debug_ranges is fairly fast (definitely faster than rnglists)
- generally, this result will be cached at a higher level anyway.
Browsing the code I did find one instance where that is not the case --
SymbolFileDWARF::ResolveFunctionAndBlock -- which is called each time we
resolve an address (to the block level). However, this function is
already pretty suboptimal: it first traverses the DIE tree (which
involves parsing all the DIE attributes) to find the correct block, then
it parses them again to construct the `lldb_private::Block`
representation, and *then* it uses the ID of the block DIE it found in
the first step to look up the `Block` object. If this turns out to be a
bottleneck, I think there are better ways to optimize it than caching
the debug_ranges parse.
The motiviation for this is that DWARFDebugRanges sorts the block
ranges, even though the order of the ranges is load-bearing (in the
absence of DW_AT_low_pc, the "base address" of a scope is determined by
the first range entry). Delaying the parsing (and sorting) step makes it
easier to access the first entry.
"statistics dump" currently report the statistics of all targets in
debugger instead of current target. This is wrong because there is a
"statistics dump --all-targets" option that supposed to include
everything.
This PR fixes the issue by only report statistics for current target
instead of all. It also includes the change to reset statistics debug
info/symbol table parsing/indexing time during debugger destroy. This is
required so that we report current statistics if we plan to reuse
lldb/lldb-dap across debug sessions
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
This is the beginning of a different, more fundamental approach to
handling. This PR tries to tries to minimize functional changes. It only
makes sure that we store the true set of ranges inside the function
object, so that subsequent patches can make use of it.
In DWARF 4 and earlier `static const` members of structs, classes and
unions have an entry tag `DW_TAG_member`, and are also tagged as
`DW_AT_declaration`, but otherwise follow the same rules as
`DW_TAG_variable`.