Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#…
(#145959)
This reapplies cbf781f0bdf2f680abbe784faedeefd6f84c246e, with fixes for
the shared-library build and the unconventional sanitizer-runtime build.
Original Description:
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
I am happy to put it in another location, or to structure it differently
if that makes sense. Some have suggested in BinaryFormat, but it is not
a great fit there. But if that makes more sense to the reviewers, I can
do that.
Another possibility would be to use pass-through headers to allow
clients who don't care to depend only on DebugInfo/DWARF. This would be
a much less invasive change, and perhaps easier for clients. But also a
system that hides details.
Either way, I'm open.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
https://github.com/llvm/llvm-project/issues/136772
Incorrect handling of 'tombstone' value for WebAssembly.
llvm-debuginfo-analyzer already uses the tombstone approach
to identify dead code. Currently, the tombstone value is
evaluated as std::numeric_limits<uint64_t>::max(). Which is
wrong as it does not take into account the 'Address Byte Size'
from the Compile Unit header.
Some data members are only part of a class definition in a Debug build,
e.g. `LVObject::ID`. If `debuginfologicalview` is used as a library,
`NDEBUG` cannot be used for this purpose, as this PP macro may have a
different definition in a downstream project, which in turn triggers an
ODR violation. Fix it by
- Making `LVObject::ID` an unconditional data member.
- Making `LVObject::dump()` non-virtual. Rationale: `virtual` is not
needed (and it calls `print()`, which is virtual anyway).
Fixes#139098.
(Revised version of a previous, unreviewed, PR.)
Move all expression verification into its only client: DWARFVerifier.
Move all printing code (which was a mix of static and member functions)
into a separate class.
This is one in a series of refactoring PRs to separate dwarf
functionality into lower-level pieces usable without object files and
sections at build time. The code is already written this way via various
"if (section == nullptr)" and similar conditionals. So the functionality
itself is needed and exists, but only as a runtime feature. The goal of
these refactors is to remove the build-time dependencies, which will
allow the existing functionality to be used from lower-level parts of
the compiler. Particularly from lib/MC/.... More information at:
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665
This pull request adds support for parsing the source language in both
DWARF and CodeView. Specifically,
- The `LVSourceLanguage` class is introduced to represent any supported
language by any of the debug info representations.
- Update `LVDWARFReader.cpp` and `LVCodeViewVisitor.cpp` to parse the
source language where it applies. Added a new `=Language` attribute;
`getAttributeLanguage()` is internally used to control whether this
information is being printed.
This PR was split from https://github.com/llvm/llvm-project/pull/137228
(which introduced support for `DW_TAG_module` and `DW_AT_byte_size`).
This PR improves `LVDWARFReader` by introducing handling of
`DW_AT_byte_size`. Most DWARF emitters include this attribute for types
to specify the size of an entity of the given type.
- Adds support for `DW_TAG_module` DIEs and recurse over their children.
Prior to this patch, entities hanging below `DW_TAG_module` were just
not visible. This DIE kind is commonly generated by Objective-C modules.
This patch will represent such entities, which will print as
```
[001] {CompileUnit} '/llvm/tools/clang/test/modules/<stdin>'
[002] {Producer} 'LLVM version 3.7.0'
{Directory} '/llvm/tools/clang/test/modules'
{File} '<stdin>'
[002] {Module} 'DebugModule'
```
The minimal test case included is just the result of
```
$ llc llvm/test/DebugInfo/X86/DIModule.ll
-accel-tables=Dwarf
-o llvm/unittests/DebugInfo/LogicalView/Inputs/test-dwarf-clang-module.o
-filetype=obj
```
This pull request fixes a couple of unhandled situations in DWARF input
leading to a crash. Specifically,
- If the DWARF input contains a declaration of a C variadic function
(where `...` translates to `DW_TAG_unspecified_parameters`), which is
then followed by a definition, `llvm_unreachable()` is hit in
`LVScope::addMissingElements()`. This is only visible in Debug builds.
- Parsing of instructions in `LVBinaryReader::createInstructions()` does
not check whether `Offset` lies within the `Bytes` ArrayRef. A specially
crafted DWARF input can lead to this condition.
[llvm-debuginfo-analyzer] Fix crash due to un-checked error in LVReaderHandler::handleArchive
method.
- Added README describing how to generated the binary files used for the test.
- A follow up patch to add extra ASSERT_NE
Committed on behalf of @aurelien35
For the given C++ code:
```
template <typename T> class Foo { T Member; };
template <template <typename T> class TemplateType>
class Bar {
TemplateType<int> Int;
};
template <template <template <typename> class> class TemplateTemplateType>
class Baz {
TemplateTemplateType<Foo> Foo;
};
typedef Baz<Bar> Example;
Example TT;
```
The '--attribute=encoded' option, will produce the logical view:
```
{Class} 'Foo<int>'
{Encoded} <int>
{Class} 'Bar<Foo>'
{Encoded} <> <-- Missing the template argument info (Foo)
{Class} 'Baz<Bar>'
{Encoded} <> <-- Missing the template argument info (Bar)
```
When the template argument is another template it is not included in the
{Encoded} field. The correct output should be:
```
{Class} 'Foo<int>'
{Encoded} <int>
{Class} 'Bar<Foo>'
{Encoded} <Foo>
{Class} 'Baz<Bar>'
{Encoded} <Bar>
```
- In the DWARF reader, for those attributes that can have an unsigned
value, allow for the following cases:
* Is an implicit constant
* Is an optional value
- The testing is done by creating a file with generated DWARF, using
`DwarfGenerator` (generate DWARF debug info for unit tests).
The code dealing with DW_AT_call_line/DW_AT_call_file is in the wrong
place. The correct functions were call, but with incorrect values:
DW_AT_call_line <-- Filename Index
DW_AT_call_file <-- Line number
The DW_OP_GNU_push_tls_address, DW_OP_form_tls_address DWARF
location forms generated for thread local storage variables, caused a
crash in the DWARFReader, due to incorrect number of operands.
The result of the function cannot be correctly interpreted without
knowing the precise form type (a type signature needs to be looked up
very differently from a supplementary debug info reference). The
function sort of worked because the two reference types (unit-relative
and section-relative) that can be handled uniformly are also the most
common types of references, but this setup made it easy to write code
which does not support other kinds of reference (and if one tried to
support them, the result didn't look pretty --
https://github.com/llvm/llvm-project/pull/97423/files#r1676217081).
The split is based on the reference type classification from DWARFv5
(Section 7.5.5 Classes and Forms), and it should enable uniform (if
slightly more verbose) hadling. Note that this only affects users which
want more control of how (or if) the references are resolved. Users
which just want to access the referenced DIE can use the higher level
API (DWARFDie::GetAttributeValueAsReferencedDie) which returns (or will
return after #97423 is merged) the correct die for all reference types
(except for supplementary references, which we don't support right now).
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of
53 under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
As part of the WebAssembly support work review
https://github.com/llvm/llvm-project/pull/82588
It was decided to rename:
Files: LVElfReader.cpp[h] -> LVDWARFReader.cpp[h]
ELFReaderTest.cpp -> DWARFReaderTest.cpp
Class: LVELFReader -> LVDWARFReader
The name LVDWARFReader would match the another reader LVCodeViewReader
as they will reflect the type of
debug information format that they are parsing.
Add support for the WebAssembly binary format and be able to generate
logical views.
https://github.com/llvm/llvm-project/issues/69181
The README.txt includes information about how to build the test cases.
C++20 comes with std::erase to erase a value from std::vector. This
patch renames llvm::erase_value to llvm::erase for consistency with
C++20.
We could make llvm::erase more similar to std::erase by having it
return the number of elements removed, but I'm not doing that for now
because nobody seems to care about that in our code base.
Since there are only 50 occurrences of erase_value in our code base,
this patch replaces all of them with llvm::erase and deprecates
llvm::erase_value.