LLDB's DWARF parser didn't support parsing DW_FORM_GNU_ref_alt and
DW_FORM_GNU_strp_alt forms which would cause any file loaded by LLDB to
fail to parse any DWARF. Added support for parsing this information
only, not for actually finding the debug info reference to an alternate
file or a string in an alternate file. These extensions are used by DWZ
files which are present in some linux distros, so it will be good for
LLDB to just be able to parse these without emitting an error like:
(lldb) b bar
warning: (arm64) /tmp/a.out unsupported DW_FORM values: 0x1f20 0x1f21
Corrected various spelling mistakes such as 'occurred', 'receiver',
'initialized', 'length', and others in comments, variable names,
function names, and documentation throughout the project. These
changes improve code readability and maintain consistency in naming
and documentation.
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
Implementation files using the Intel syntax typically explicitly specify it.
Do the same for the few files using AT&T syntax.
This enables building LLVM with `-mllvm -x86-asm-syntax=intel` in one's Clang config files
(i.e. a global preference for Intel syntax).
This patch is updating the reading capabilities of the LLDB DWARF parser
for a llvm-dwp patch https://github.com/llvm/llvm-project/pull/167457
that will emit .dwp files where the compile units are DWARF32 and the
.debug_str_offsets tables will be emitted as DWARF64 to allow .debug_str
sections that exceed 4GB in size.
Most of the cases were where a C++ file was being compiled with the C substitution.
There were a few cases of the opposite though.
LLDB seems to be the only real culprit in the LLVM codebase for these mismatches.
Rest of the LLVM presumably sticks at least language-specific options in the common substitutions
making the mistakes immediately apparent.
I found these by using Clang frontend configuration files containing language-specific options for
both C and C++ (e.g. `-std=c2y` and `-std=c++26`).
First time check was introduced in
`fa3ab4599d717feedbb83e08e7f654913942520b` to work around a debug-info
generation bug in Clang. This bug was fixed in Clang-4. The check has
since been adjusted (first in
`808ff186f6a6ba1fd38cc7e00697cd82f4afe540`, and then most recently in
`370db9c62910195e664e82dde6f0adb3e255a4fd`).
This check is getting quite convoluted, and all it does is turn an
`array[1]` into an `array[0]` type when it is deemed correct. At this
point the workaround probably never fires, apart from actually valid
codegen. This patch removes the special conditions and emits the error
specifically in those cases where we know the DWARF is malformed.
Added some shell tests for the error case.
Fixes#159401
On Windows there is no hex prefix, I suspect because somewhere we
print a pointer. I'd prefer to fix that itself but can't get to
a Windows machine at the moment. It's not important to the purpose
of the test anyway.
GCC doesn't add DW_AT_data_member_location attributes to the
DW_TAG_member children of DW_TAG_union_type types. An error was being
emitted incorrectly for these cases fr om the DWARFASTParserClang. This
fixes that issue and adds a test.
This test was brokem by migrating to the lit internal shell due to a
lack of env prefix for setting environment variables. This was fixed in
prevented the breakpoint in the test from mapping to anything, causing
the test to file. This patch restores the original line numbering.
These tests were failing on darwin, because the internal shell needs
environment var definitions to start with 'env'. This PR (hopefully)
fixes that problem.
This patch updates the lld lit test config to use the internal shell by
default. This has some performance advantages (~10-15%) and also
produces nicer failure output. It also updates the two LLDB tests to not
require shell (so that they run under the internal shell), after first
verifying that they run and pass using the internal shell; and it fixes
one test that was not passing under the internal shell.
This upstreams https://github.com/swiftlang/llvm-project/pull/10313.
If we detect a situation where a forward declaration is C++ and the
definition DIE is Objective-C, then just don't try to complete the type
(it would crash otherwise). In the long term we might want to add
support for completing such types.
We've seen real world crashes when debugging WebKit and wxWidgets
because of this. Both projects forward declare ObjC++ decls in the way
shown in the test.
rdar://145959981
This uses split DWARF and from looking at other tests, it should not be
running on Darwin or Windows.
It does pass using the DIA PDB plugin but I think this is misleading
because it's not actually testing the intended feature.
When the native PDB plugin is used it fails because it cannot set a
breakpoint. I don't see a point to running this test on Windows at all.
Native PDB plugin test failures are being tracked in #114906.
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
Fixes a bug that surfaces in frame recognizers.
Details about the bug:
A new frame recognizer is configured to match a specific symbol
(`swift_willThrow`). This is an `extern "C"` symbol defined in a C++
source file. When Swift is built with debug info, the function
`ParseFunctionFromDWARF` will use the debug info to construct a function
name that looks like a C++ declaration (`::swift_willThrow(void *,
SwiftError**)`). The `Mangled` instance will have this string as its
`m_demangled` field, and have _no_ string for its `m_mangled` field.
The result is the frame recognizer would not match the symbol to the
name (`swift_willThrow` != `::swift_willThrow(void *, SwiftError**)`.
By changing `ParseFunctionFromDWARF` to assign both a demangled name and
a mangled, frame recognizers can successfully match symbols in this
configuration.
%T has been deprecated for about seven years, mostly because it is not
unique to each test which can lead to races. This patch updates the few
remaining tests in lldb that use %T to not use it (either directly using
files or creating their own temp dir). The eventual goal is to remove
support for %T from llvm-lit given few tests use it and it still has
racey behavior.
This patch errors on the side of creating new temp dirs even when not
strictly necessary to avoid needing to update filenames inside filecheck
matchers.
The check is not correct for discontinuous functions, as one of the
blocks could very well begin before the function entry point. To catch
dead-stripped ranges, I check whether the functions is after the first
known code address. I don't print any error in this case as that is a
common/expected situation.
This avoids many errors like:
```
error: ld-linux-x86-64.so.2 0x00085f3b: adding range [0x0000000000001ae8-0x0000000000001b07) which has a
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
the start of this error message
```
when debugging binaries on debian trixie because the dynamic linker
(ld-linux) contains discontinuous functions.
If the block ranges is not a subrange of the enclosing block then this
will range will currently be added to the outer block as well (i.e., we
get the same behavior that's currently possible for non-subrange blocks
larger than function_low_pc). However, this code path is buggy and I'd
like to change that (#117725).
When searching for the end of prologue, I'm only iterating through the
address range (~basic block) which contains the function entry point.
The reason for that is that even if some other range somehow contained
the end-of-prologue marker, the fact that it's in a different range
would imply it's reachable through some form of control flow, and that's
usually not a good place to set an function entry breakpoint.
This patch pushes the error handling boundary for the GetBitSize()
methods from Runtime into the Type and CompilerType APIs. This makes it
easier to diagnose problems thanks to more meaningful error messages
being available. GetBitSize() is often the first thing LLDB asks about a
type, so this method is particularly important for a better user
experience.
rdar://145667239
The llvm versions of these functions do that, so we must to so as well.
Practically this meant that were were unable to correctly un-simplify
the names of some types when using type units, which resulted in type
lookup errors.
The original code resulted in a misfire in the symtab vs. debug info
deduplication code, which caused us to return the same function twice
when searching via a regex (for functions whose entry point is also not
the lowest address).
This is XFAILed for now until we find a good way to locate the
DW_AT_object_pointer of function declarations (a possible solution being
https://github.com/llvm/llvm-project/pull/124790).
Made it a shell test because I couldn't find any SBAPIs that i could
query to find the CV-qualifiers/etc. of member functions.
This is the behavior expected by DWARF. It also requires some fixups to
algorithms which were storing the addresses of some objects (Blocks and
Variables) relative to the beginning of the function.
There are plenty of things that still don't work in this setups, but
this change is sufficient for the expression evaluator to correctly
recognize the entry point of a function in this case.
In Objective-C, forward declarations are currently represented as:
```
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_declaration (true)
DW_AT_APPLE_runtime_class (DW_LANG_ObjC)
```
However, when compiling with `-gmodules`, when a class definition is
turned into a forward declaration within a `DW_TAG_module`, the DIE for
the forward declaration looks as follows:
```
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_declaration (true)
```
Note the absence of `DW_AT_APPLE_runtime_class`. With recent changes in
LLDB, not being able to differentiate between C++ and Objective-C
forward declarations has become problematic (see attached test-case and
explanation in https://github.com/llvm/llvm-project/pull/119860).
The ManualDWARFIndex class can create a index cache if the LLDB index
cache is enabled. This used to save the index to the same file,
regardless of wether the cache was a full index (no .debug_names) or a
partial index (have .debug_names, but not all .o files were had
.debug_names). So we could end up saving an index cache that was
partial, and then later load that partial index as if it were a full
index if the user set the 'settings set
plugin.symbol-file.dwarf.ignore-file-indexes true'. This would cause us
to ignore the .debug_names section, and if the index cache was enabled,
we could end up loading the partial index as if it were a full DWARF
index.
This patch detects when the ManualDWARFIndex is being used with
.debug_names, and saves out a cache file with a suffix of "-full" or
"-partial" to avoid this issue.
.. in the global namespace
The problem was the interaction of #116989 with an optimization in
GetTypesWithQuery. The optimization was only correct for non-exact
matches, but that didn't matter before this PR due to the "second layer
of defense". After that was removed, the query started returning more
types than it should.
This reverts commit 2526d5b1689389da9b194b5ec2878cfb2f4aca93, reapplying
ba14dac481564000339ba22ab867617590184f4c after fixing the conflict with
#117532. The change is that Function::GetAddressRanges now recomputes
the returned value instead of returning the member. This means it now
returns a value instead of a reference type.
This is a follow-up/reimplementation of #115730. While working on that
patch, I did not realize that the correct (discontinuous) set of ranges
is already stored in the block representing the whole function. The
catch -- ranges for this block are only set later, when parsing all of
the blocks of the function.
This patch changes that by populating the function block ranges eagerly
-- from within the Function constructor. This also necessitates a
corresponding change in all of the symbol files -- so that they stop
populating the ranges of that block. This allows us to avoid some
unnecessary work (not parsing the function DW_AT_ranges twice) and also
results in some simplification of the parsing code.
Making a breakpoint on a line causes an error on aarch64-pc-windows.
This patch changes the test so that a breakpoint can be made on a
function name.
#117168
This reverts commit f06c187799d910fd3ac3e9106397e5eecff9f265.
Temporary revert: there is https://github.com/llvm/llvm-project/pull/117239 that is suppose to fix the issue.
Reverting to keep things rolling.
The problem here is the assumption that the entire function will be
placed in a single section. This will ~never be the case for a
discontinuous function, as the point of splitting the function is to let
the linker group parts of the function according to their "hotness".
The fix is to change the offset computation to use file addresses
instead.
When parsing an optimized value and reading a piece from a file address,
LLDB tries to read the data from memory using that address.
This patch converts file address to load address before reading the
memory.
Fixes#111313Fixes#97484
This is the second half of
https://github.com/llvm/llvm-project/pull/90008.
Essentially, it replaces the work of resolving template types when we
just need the qualified names with walking the DIE tree using
`DWARFTypePrinter`.
### Result
For an internal target, the time spent on `expr *this` for the first
time reduced from 28 secs to 17 secs.
Following up from https://github.com/llvm/llvm-project/pull/112928, we
can reuse the approach from Clang Sema to infer the MSInheritanceModel
and add the necessary attribute manually. This allows the inspection of
member function pointers with DWARF on Windows.