This is a follow-up patch now that .debug_names can emit local type
unit entries when we compile with type units + DWARF5 + .debug_names.
The pull request that added this functionality was:
https://github.com/llvm/llvm-project/pull/70515
This patch makes sure that the DebugNamesDWARFIndex in LLDB will not
need to manually parse type units if they have a valid index. It also
fixes the index to be able to correctly extract name entries that
reference type unit DIEs. Added a test to verify things work as
expected.
When the Abbrev code is printed for an Entry, it is now formatted the
same way as in the Abbrev table.
Previously, any letters in a hex number were lower case in the Abbrev
table but upper case in the Entry, which made FileCheck pattern matching
fail.
Example in: llvm/test/tools/llvm-dwarfdump/X86/debug-names-misaligned.s
The DWARFLinker library has code to identify ObjC selector names, which is used
by the debug linker to generate accelerator table entries. In the future, we
would like the DWARF verifier to also have access to such code, so that it can
identify these names when verifying accelerator tables (e.g. debug_names).
This patch follows the same intent as D155723, where we also moved code
generating simplified template names.
Since this is moving code around and changing the log, we also replace raw
pointer manipulation with the more expressive
StringRef::{drop_front,take_front,...} methods.
We also change a test so that it verifies its output, and that requires having
dsymutil not write to stdout.
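As a rough illustration of the slicing style described above, here is a minimal sketch that splits an ObjC method name into class name and selector. It uses std::string_view as a stand-in for StringRef (remove_prefix/substr instead of drop_front/take_front), ignores categories, and the names SelectorParts and splitObjCName are hypothetical, not the functions moved by this patch.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical result type for this sketch.
struct SelectorParts {
  std::string_view ClassName;
  std::string_view Selector;
};

// Splits "-[Class selector:]" or "+[Class selector]" into its parts.
// Returns std::nullopt if Name does not look like an ObjC method name.
inline std::optional<SelectorParts> splitObjCName(std::string_view Name) {
  if (Name.size() < 3 || (Name[0] != '-' && Name[0] != '+') ||
      Name[1] != '[' || Name.back() != ']')
    return std::nullopt;
  Name.remove_prefix(2); // drop "-[" or "+["
  Name.remove_suffix(1); // drop "]"
  const std::size_t Space = Name.find(' ');
  if (Space == std::string_view::npos || Space + 1 == Name.size())
    return std::nullopt; // no selector after the class name
  return SelectorParts{Name.substr(0, Space), Name.substr(Space + 1)};
}
```

No raw pointer arithmetic is needed: every step either shrinks the view or takes a sub-view of it.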
Differential Revision: https://reviews.llvm.org/D158980
LLDB can benefit from having the base name of functions (i.e. without any
template parameters) as an entry in accelerator tables pointing back to the
DIE for the corresponding function specialization. In fact, some LLDB
functionality is only possible when those entries are present.
The DWARFLinker has been adding such entries for a while now, both with
apple_names and with debug_names. However, this has two side effects:
1. Some LLDB functionality is only possible when dsym bundles are present (i.e.
the linker touched the debug info).
2. The DWARFVerifier doesn't accept debug_name sections created by the linker,
as such names are (usually) neither the AT_name nor the AT_linkage_name of the
DIE.
Based on recent discussion [1], and because the DWARF 5 spec says that:
> A producer may choose to implement additional rules for what names are placed
> in the index
This patch relaxes the checks on the verifier to allow for simplified template
names in the accelerator table. To do so, we move some helper functions from
DWARFLinker into the core DebugInfo library. This addresses point 2) above.
This patch also enables addressing point 1) in the future, since the helper
function is now visible to other parts of LLVM.
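To make "simplified template name" concrete, here is a deliberately simplified sketch of stripping a trailing template argument list. The name baseNameOf is hypothetical, std::string_view stands in for StringRef, and names involving operator<, operator<< or operator<=> are not handled, so this is not the helper that the patch moves.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Returns Name without its trailing "<...>" argument list, or std::nullopt
// when Name has no such list. Simplified: operator<, operator<< and
// operator<=> are NOT handled correctly by this sketch.
inline std::optional<std::string_view> baseNameOf(std::string_view Name) {
  if (Name.empty() || Name.back() != '>')
    return std::nullopt;
  const std::size_t Lt = Name.find('<');
  if (Lt == std::string_view::npos || Lt == 0)
    return std::nullopt; // e.g. "operator>>" has no '<' at all
  return Name.substr(0, Lt);
}
```

With this sketch, "foo<int>" yields "foo", which is the kind of extra index entry the relaxed verifier check is meant to accept.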
[1]: https://github.com/llvm/llvm-project/issues/58362
Differential Revision: https://reviews.llvm.org/D155723
This commit adds functionality to the Apple Accelerator table allowing iteration
over all elements in the table.
Our iterators look like streaming iterators: when we increment the iterator we
check if there is still enough data in the "stream" (in our case, the blob of
data of the accelerator table) and extract the next entry. If any failures
occur, we immediately set the iterator to be the end iterator.
Since the ultimate user of this functionality is LLDB, there are roughly two
iteration methods we want to support: one that also loads the name of each
entry, and one which does not. Loading names is measurably slower (one order
of magnitude) than only loading DIEs, so we used some template metaprogramming
to implement both iteration methods.
Depends on D153066
Differential Revision: https://reviews.llvm.org/D153066
We will soon need different kinds of iterators.
We also use one of LLVM's iterator classes to implement some basic iterator
operations.
Differential Revision: https://reviews.llvm.org/D153065
The current implementation of the AppleAcceleratorTable::Entry is problematic
for a few reasons:
1. It is heavyweight. Iterators should be cheap, but the current implementation
tracks 3 different integer values, one "Entry" object, and one pointer to the
actual accelerator table. Most of this information is redundant and can be
removed.
2. It performs "memory reads" outside of the dereference operation. This
violates the usual expectations of iterators, whereby we don't access anything
so long as we don't dereference the iterator.
3. It doesn't commit to tracking _one_ thing only. It tries to track both an
"index" into a list of HashData entries and a pointer in a blob of data. For
this reason, it allows for multiple "invalid" states, keeps redundant
information around and is difficult to understand.
4. It couples the interpretation of the data with the iterator increment. As
such, if the *interpretation* fails, the iterator will keep on producing garbage
values without ever indicating so to consumers.
The problem this iterator is trying to solve is simple: we have a blob of data
containing many "HashData" entries and we want to iterate over them. As such,
this commit makes the iterator only track a pointer over that data, and it
decouples the iterator increments from the interpretation of this blob of data.
We maintain the already existing assumption that failures never happen, but now
make it explicit with an assert.
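The shape of such an iterator can be sketched as follows. This is an illustration only: BlobIterator is a hypothetical name, entries are assumed to be 4-byte values in host byte order, and a vector plus an offset stands in for the raw pointer used by the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical iterator over a blob of fixed-size (4-byte) entries.
class BlobIterator {
  const std::vector<std::uint8_t> *Blob;
  std::size_t Offset; // the only state that changes while iterating

public:
  BlobIterator(const std::vector<std::uint8_t> &B, std::size_t Off)
      : Blob(&B), Offset(Off) {}

  // Interpretation of the data happens here, and only here.
  std::uint32_t operator*() const {
    assert(Offset + sizeof(std::uint32_t) <= Blob->size() && "read past end");
    std::uint32_t Value;
    std::memcpy(&Value, Blob->data() + Offset, sizeof(Value));
    return Value;
  }

  // Incrementing never touches the underlying data.
  BlobIterator &operator++() {
    Offset += sizeof(std::uint32_t);
    return *this;
  }

  bool operator==(const BlobIterator &O) const {
    return Blob == O.Blob && Offset == O.Offset;
  }
  bool operator!=(const BlobIterator &O) const { return !(*this == O); }
};
```

Because the position is the only state, there is exactly one "end" state, and a failed read shows up at the dereference rather than silently during the increment.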
Depends on D152158
Differential Revision: https://reviews.llvm.org/D152159
This commit does a few minor NFC cleanups:
* A variable was called "Atom", probably trying to claim it was an AtomType.
This was incorrect: it is actually a FormValue.
* LLVM provides a `zip_equal` to express the intent of asserting ranges with the
same size. We change the lookup method to use that.
* The use of tuples made the code slightly difficult to follow, so we
unpack the tuple with structured bindings to improve readability.
Depends on D152157
Differential Revision: https://reviews.llvm.org/D152158
In a future patch, it will be desirable to skip over all hash data entries for a
particular string. In order to do so, we must know how many bytes each of those
entries occupies.
In its full specification, Apple tables allow for variable length entries, which
would make the above impossible without reading the data of each entry. However,
this is largely unsupported today (as a FIXME in the code indicates, there is a
bug with hash collisions exactly because we don't know how to skip over data),
and the documentation[1] states that:
> For the current implementations of the “.apple_names” (all functions +
> globals), the “.apple_types” (names of all types that are defined), and the
> “.apple_namespaces” (all namespaces), we currently set the Atom array to be:
> [...]
> This defines the contents to be the DIE offset (eAtomTypeDIEOffset) that is
> encoded as a 32 bit value (DW_FORM_data4).
In other words, we only produce fixed sized entries.
A few tests containing invalid dwarf had to be updated, as the error is now
detected earlier (when the accelerator table is being parsed, instead of inside
the explicit call to "verify").
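The arithmetic behind this can be sketched as below. The function names are hypothetical, and the per-atom sizes are passed in directly (std::nullopt marks a variable-length form) instead of being derived from real DWARF forms.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Size in bytes of one hash data entry, or std::nullopt if any atom uses a
// variable-length form (in which case entries cannot be skipped blindly).
inline std::optional<std::uint32_t>
fixedEntrySize(const std::vector<std::optional<std::uint32_t>> &AtomSizes) {
  std::uint32_t Total = 0;
  for (const auto &Size : AtomSizes) {
    if (!Size)
      return std::nullopt;
    Total += *Size;
  }
  return Total;
}

// Number of bytes to skip to get past all entries of one string.
inline std::uint64_t bytesToSkip(std::uint32_t NumEntries,
                                 std::uint32_t EntrySize) {
  return std::uint64_t{NumEntries} * EntrySize;
}
```

With a single DIE offset atom encoded as DW_FORM_data4, each entry is 4 bytes, so skipping three entries means advancing 12 bytes without reading any of them.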
[1]: https://llvm.org/docs/SourceLevelDebugging.html#fixed-lookup
Depends on D152156
Differential Revision: https://reviews.llvm.org/D152157
These are used throughout the class and are recreated every time they are used.
To prevent the risk of them being created incorrectly in different places, we
create them once and at the earliest moment possible: when the table is
extracted.
Depends on D151989
Differential Revision: https://reviews.llvm.org/D152156
The current implementation of AppleAcceleratorTable::equal_range has a couple of
drawbacks:
1. Assumptions about how the hash table is structured are hard-coded throughout
the code. Unless you are familiar with the data layout of the table, it becomes
tricky to follow the code.
2. We currently load strings from the string table even for hashes that don't
match our current search hash. This is wasteful.
3. There is no error checking in most DataExtractor calls that can fail.
This patch cleans up (1) by making helper methods that hide the details of the
data layout from the algorithms relying on them. This should also help us in
future patches, where we want to expand the interface to allow iterating over
_all_ entries in the table, and potentially clean up the existing Iterator
class.
The changes above also fix (2), as the problem "just vanishes" when you have a
function called "idxOfHashInBucket(SearchHash)".
The changes above also fix (3), as having individual functions allows us to
expose the points in which reading data can fail. This is particularly important
as we would like to share this implementation with LLDB, which requires robust
error handling.
The changes above are also a step towards addressing a comment left by the
original author:
```
/// TODO: Generalize the rest of the AppleAcceleratorTable interface and move it
/// to this class.
```
I suspect a lot of these helper functions created also apply to DWARF 5's
accelerator table, so they could be moved to the base class.
The changes above also expose a bug in this implementation: the previous
algorithm looks at _one_ string inside the bucket, but never bothers checking
for collisions. When the main search loop is written as it is with this patch,
the problem becomes evident. We do not fix the issue in this patch, as it is
intended to be NFC.
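To illustrate why (2) disappears, here is a sketch of a bucket lookup over in-memory arrays. Only the function name idxOfHashInBucket comes from the text above. The signature, the HashTable struct and the use of vectors instead of a DataExtractor are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical in-memory view of the table: Buckets[i] is the index into
// Hashes of the first hash in bucket i; hashes of one bucket are contiguous.
struct HashTable {
  std::vector<std::uint32_t> Buckets;
  std::vector<std::uint32_t> Hashes;
};

constexpr std::uint32_t EmptyBucket = UINT32_MAX;

// Returns the index of SearchHash in the Hashes array, if present.
// Only hashes are compared here; no string is loaded.
inline std::optional<std::uint32_t>
idxOfHashInBucket(const HashTable &T, std::uint32_t SearchHash) {
  if (T.Buckets.empty())
    return std::nullopt;
  const std::size_t BucketCount = T.Buckets.size();
  const std::size_t Bucket = SearchHash % BucketCount;
  std::size_t Idx = T.Buckets[Bucket];
  if (Idx == EmptyBucket)
    return std::nullopt;
  for (; Idx < T.Hashes.size(); ++Idx) {
    const std::uint32_t Hash = T.Hashes[Idx];
    if (Hash % BucketCount != Bucket)
      break; // walked into the next bucket
    if (Hash == SearchHash)
      return static_cast<std::uint32_t>(Idx);
  }
  return std::nullopt;
}
```

A caller would load the string for the returned index only after a hash match, and would still have to compare that string to rule out a collision.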
Differential Revision: https://reviews.llvm.org/D151989
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:
llvm/DebugInfo/DWARF/DWARFContext.h no longer includes:
- "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
- "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAranges.h"
- "llvm/DebugInfo/DWARF/DWARFDebugFrame.h"
- "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
- "llvm/DebugInfo/DWARF/DWARFDebugMacro.h"
- "llvm/DebugInfo/DWARF/DWARFGdbIndex.h"
- "llvm/DebugInfo/DWARF/DWARFSection.h"
- "llvm/DebugInfo/DWARF/DWARFTypeUnit.h"
- "llvm/DebugInfo/DWARF/DWARFUnitIndex.h"
Plus llvm/Support/Errc.h not included by a bunch of llvm/DebugInfo/DWARF/DWARF*.h files
Preprocessed lines to build llvm on my setup:
after: 1065629059
before: 1066621848
Which is a great diff!
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119723
Summary:
In this patch I've done a slightly bigger rewrite to also remove the
hardcoded header lengths.
Reviewers: jhenderson, dblaikie, ikudrin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75119
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
The padding field is reserved for DWARF and does not contain any useful
information. No need to read, store and report it.
Differential Revision: https://reviews.llvm.org/D73042
This structure was used to get the size of the fixed-size part of a Name
Index header for 32-bit DWARF. It is unsuitable for 64-bit DWARF because
the size of the unit length field is different.
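The size difference can be worked out directly from the DWARF v5 Name Index header layout: unit_length, a 2-byte version, 2 bytes of padding, and seven 4-byte fields up to and including augmentation_string_size. The sketch below uses hypothetical names and is not the code added by this patch.

```cpp
#include <cstdint>

enum class DwarfFormat { DWARF32, DWARF64 };

// unit_length is 4 bytes in 32-bit DWARF; in 64-bit DWARF it is the 4-byte
// escape value 0xffffffff followed by an 8-byte length.
constexpr std::uint64_t unitLengthFieldSize(DwarfFormat F) {
  return F == DwarfFormat::DWARF64 ? 12 : 4;
}

// Fixed-size part of the Name Index header, up to and including
// augmentation_string_size.
constexpr std::uint64_t nameIndexFixedHeaderSize(DwarfFormat F) {
  return unitLengthFieldSize(F) // unit_length
         + 2                    // version
         + 2                    // padding
         + 7 * 4;               // the seven 4-byte count and size fields
}

static_assert(nameIndexFixedHeaderSize(DwarfFormat::DWARF32) == 36, "");
static_assert(nameIndexFixedHeaderSize(DwarfFormat::DWARF64) == 44, "");
```

A single struct whose sizeof is 36 can therefore only describe the 32-bit case.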
Differential Revision: https://reviews.llvm.org/D73040
This updates all libraries and tools in LLVM Core to use 64-bit offsets
which directly or indirectly come to DataExtractor.
Differential Revision: https://reviews.llvm.org/D65638
llvm-svn: 368014
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
DWARF-related classes in lib/DebugInfo/DWARF contained
duplicated code for creating StringError instances, like:
    template <typename... Ts>
    static Error createError(char const *Fmt, const Ts &... Vals) {
      std::string Buffer;
      raw_string_ostream Stream(Buffer);
      Stream << format(Fmt, Vals...);
      return make_error<StringError>(Stream.str(), inconvertibleErrorCode());
    }
A similar function was added to the Support library in
https://reviews.llvm.org/D49824. This revision makes the DWARF classes
use that function instead of their local implementations.
Reviewers: aprantl, dblaikie, probinson, wolfgangp, JDevlieghere, jhenderson
Reviewed By: JDevlieghere, jhenderson
Differential Revision: https://reviews.llvm.org/D49964
llvm-svn: 340163
For instance, when dumping .apple_types, the second atom represents the
DW_TAG. In addition to printing the raw value, we now also pretty print
the value if the ATOM tells us how.
llvm-svn: 337026
Summary:
This method was not correct for entries in DWO files as it assumed it
could just add up the CU and DIE offsets to get the absolute DIE offset.
This is not correct for the DWO files, as here the CU offset will
reference the skeleton unit, whereas the DIE offset will be the offset
in the full unit in the DWO file.
Unfortunately, this means that we are not able to determine the absolute
DIE offset using the information in the .debug_names section alone,
which means we have to offload some of this work to the users of this
class.
To demonstrate how this can be done, I've added/fixed the ability to
lookup entries using accelerator tables in DWO files in llvm-dwarfdump.
To make this happen, I've needed to make two extra changes in other
classes:
- made the DWARFContext method to lookup a CU based on the section
offset public. I've needed this functionality to lookup a CU, and this
seems like a useful thing in general.
- made DWARFUnit::getDWOId call extractDIEsIfNeeded. Before this, the
DWOId was filled in only if the root DIE happened to be parsed
before we called the accessor. Since the lazy parsing is supposed to
happen under the hood, calling extractDIEsIfNeeded seems appropriate.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D48009
llvm-svn: 334578
Summary:
Back when we were introducing the DWARF v5 name index, there was a
short discussion whether we shouldn't have a nicer API for iterating
over the index. At that time, I did not find it necessary since the
iteration over names was done only from within the index itself (and I
figured the internal implementation can deal with a slightly rough
interface).
However, now I ran into a use for this kind of API in LLDB (for finding
all names matching a regular expression), so it looked like a nice
opportunity to introduce one. To make the API more useful, I've made the
NameTableEntry class a bit smarter: it now stores the string section
reference (so it can return its name) and its position in the name index
(mainly useful for dumping/logging).
I also convert the internal users to use the new API, which also gives
test coverage for the added code.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47590
llvm-svn: 333738
Summary:
Both (Apple and DWARF5) implementations of the iterators had bugs which
resulted in crashes if one attempted to iterate through the accelerator
tables all the way.
For the Apple tables, the issue was that we did not clear the DataOffset
field when we reached the end, which made our iterator compare unequal
to the "end" iterator. For the Dwarf5 tables, the problem was that we
incremented the CurrentIndex pointer and then used the incremented
(possibly invalid) pointer to check whether we have reached the end of
the index list.
The reason these bugs went undetected is that their only user
(dwarfdump) only ever searched for the first match. Besides allowing us
to test this fix, changing llvm-dwarfdump --find to display all matches
seems like a good improvement (it makes the behavior consistent with the
--name option), so I change llvm-dwarfdump to do that.
The existing tests would be sufficient to test this fix with the new
llvm-dwarfdump behavior, but I add a special test that demonstrates that
the tool indeed displays multiple results. The find.test test needed to
be tweaked a bit as the tool now does not print the ".debug_info
contents" header (also consistent with how --name works).
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D47543
llvm-svn: 333635
This is a resubmit of r331868 (D46583), which was reverted due to
failures on the PS4 bot.
These have been resolved with r332246/D46748.
llvm-svn: 332349
The new verifier check has found an error in the
debug-names-name-collisions.ll test on the PS4 bot:
error: Name Index @ 0x0: Entry @ 0xdc: mismatched Name of DIE @ 0x23: index - _ZN3foo3fooE; debug_info - foo.
Reverting while I investigate whether this is a bug in the verifier or
the generator.
This reverts commit r331868.
llvm-svn: 331869
Summary:
This patch implements a check which makes sure all entries required by
the DWARF v5 specification are present in the Name Index. The algorithm
tries to follow the wording of Section 6.1.1.1 of the spec as closely as
possible.
The main deviation from it is that instead of a whitelist-based approach
in the spec "The name index must contain an entry for each debugging
information entry that defines a named subprogram, label, variable,
type, or namespace" I chose a blacklist-based one, where I consider
everything to be "in" and then remove the entries that don't make sense.
I did this because it has more potential for catching interesting cases
and the above is a bit vague (it uses plain words like "variable" and
"subprogram", but the rest of the section speaks about specific TAGs).
This approach has raised some interesting questions, the main one being
whether enumerator values should be indexed. The consensus seems to be
that they should, although it does not follow from section 6.1.1.1.
For the time being I made the verifier ignore these, as LLVM does not do
this yet, and I wanted to get a clean run when verifying generated debug
info.
Another interesting case was the DW_TAG_imported_declaration. It was not
immediately clear to me whether this should go in or not, but currently
it is not indexed, and (unlike the enumerators) it does not seem to cause
problems for LLDB, so I've also ignored it.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46583
llvm-svn: 331868
Summary:
This patch adds checks to verify that the information in the name index
entries is consistent with the debug_info section. Specifically, we
check that entries point to valid DIEs, and their names, tags, and
compile units match the information in the debug_info sections.
These checks are only run if the previous checks did not find any errors
in the name index headers. Attempting to proceed with the checks anyway
would likely produce a lot of spurious errors and the verification code
would need to be very careful to avoid crashing.
I also add a couple more checks to the abbreviation-validation code
to verify that some attributes are always present (an index without a
DW_IDX_die_offset attribute is fairly useless).
The entry verification works only on indexes without any type units - I
haven't attempted to extend it to type units, as we don't even have a
DWARF v5-compatible type unit generator at the moment.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45323
llvm-svn: 329392
We should align the value of the field, not the overall section offset.
This distinction matters if one of the debug_names contributions does not
have a size that is a multiple of four. The dwarf producers may choose to
emit rounded contributions, but they are not required to do so. In the
latter case, without this patch we would corrupt the parsing state, as
we would adjust the offset even if subsequent contributions contained
correctly rounded augmentation strings.
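A small worked example of the difference, with hypothetical helper names (the patch itself is not reproduced here): the size field is rounded up to a multiple of four and then added to the offset, instead of rounding the resulting section offset.

```cpp
#include <cstdint>

// Rounds Value up to the next multiple of Align (Align must be non-zero).
constexpr std::uint64_t roundUpTo(std::uint64_t Value, std::uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Correct: align the value of the size field, then advance by it.
constexpr std::uint64_t endOfAugmentation(std::uint64_t StringStart,
                                          std::uint32_t SizeField) {
  return StringStart + roundUpTo(SizeField, 4);
}

// Incorrect (shown for contrast): align the overall section offset.
constexpr std::uint64_t endOfAugmentationWrong(std::uint64_t StringStart,
                                               std::uint32_t SizeField) {
  return roundUpTo(StringStart + SizeField, 4);
}

// A string starting at offset 38 with a correctly rounded size of 8 ends at
// 46; aligning the section offset instead would skip two extra bytes.
static_assert(endOfAugmentation(38, 8) == 46, "");
static_assert(endOfAugmentationWrong(38, 8) == 48, "");
```

The two computations agree whenever the string starts at a multiple of four, which is why the problem only shows up after an unrounded contribution.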
llvm-svn: 328796
Before this patch we were parsing the attributes as section offsets, as
that is what apple_names is doing. However, this is not correct as DWARF
v5 specifies that this attribute should use the Reference form class.
This also updates all the testcases (except the ones that deliberately
pass a different form) to use the correct form class.
llvm-svn: 328773
Summary:
We have had at least three pieces of code (in DWARFAbbreviationDeclaration,
DWARFAcceleratorTable and DWARFDie) that have hand-rolled support for
dumping unknown dwarf enum values. While not terrible, they are a bit
distracting and enable small differences to creep in (Unknown_ffff vs.
Unknown_0xffff). I ended up needing to add a fourth place
(DWARFVerifier), so it seems it would be a good time to centralize.
This patch creates an alternative to the XXXString dumping functions in
the BinaryFormat library, which formats an unknown value as
DW_TYPE_unknown_1234, instead of just an empty string. It is based on
the formatv function, as that allows us to avoid materializing the
string for unknown values (and because this way I don't have to invent a
name for the new functions :P).
In this patch I add formatters for dwarf attributes, forms, tags, and
index attributes as these are the ones in use currently, but adding
other enums is straight-forward.
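The intended output can be sketched without formatv as follows. The function name tagName is hypothetical, only two tags are listed, and the exact spelling of the fallback (lower-case hex, no 0x prefix) is an assumption for the example. Unlike the formatv-based approach described above, this sketch always materializes a string.

```cpp
#include <cstdio>
#include <string>

// Returns the symbolic name of a DWARF tag, or a placeholder that still
// carries the raw value when the tag is not known.
inline std::string tagName(unsigned Tag) {
  switch (Tag) {
  case 0x11:
    return "DW_TAG_compile_unit";
  case 0x2e:
    return "DW_TAG_subprogram";
  default:
    break;
  }
  char Buffer[32];
  std::snprintf(Buffer, sizeof(Buffer), "DW_TAG_unknown_%x", Tag);
  return Buffer;
}
```

Having one such function per enum is what removes the Unknown_ffff versus Unknown_0xffff inconsistency between call sites.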
Reviewers: dblaikie, JDevlieghere, aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44570
llvm-svn: 328090
Summary:
This patch replaces the two switches which are deducing the size of
various forms with a single implementation. I have put the new
implementation into BinaryFormat, to avoid introducing dependencies
between the two independent libraries (DebugInfo and CodeGen) that need
this functionality.
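A single form-size function could look roughly like the sketch below. The name fixedFormSize is hypothetical, only a handful of forms are covered, and the numeric form codes are written out so the example does not depend on LLVM headers.

```cpp
#include <cstdint>
#include <optional>

// Returns the encoded size in bytes of a fixed-size form, or std::nullopt
// for variable-length forms and for forms not covered by this sketch.
inline std::optional<unsigned> fixedFormSize(std::uint16_t Form,
                                             unsigned AddrSize) {
  switch (Form) {
  case 0x01: // DW_FORM_addr
    return AddrSize;
  case 0x0b: // DW_FORM_data1
  case 0x0c: // DW_FORM_flag
  case 0x11: // DW_FORM_ref1
    return 1;
  case 0x05: // DW_FORM_data2
  case 0x12: // DW_FORM_ref2
    return 2;
  case 0x06: // DW_FORM_data4
  case 0x13: // DW_FORM_ref4
    return 4;
  case 0x07: // DW_FORM_data8
  case 0x14: // DW_FORM_ref8
    return 8;
  default: // e.g. DW_FORM_udata (0x0f) is LEB128-encoded
    return std::nullopt;
  }
}
```

Both a reader and a writer can call the same function, so the two tables of sizes can no longer drift apart.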
Reviewers: aprantl, JDevlieghere, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44418
llvm-svn: 327486
Summary:
Even though the getDIEOffset function was common for the two
accelerator table implementations, it was doing two different things:
for the Apple tables, it was returning the die offset relative to the
start of the section, whereas for DWARF v5 tables, it was relative to
the start of the CU.
I resolve this by renaming the function to getDIESectionOffset to make
it obvious what the function returns, and change the DWARF
implementation to return the section offset. I also keep the CU-relative
accessor, but only in the DWARF implementation (there is no way to get
this information for the Apple tables). This was not caught by existing
tests because the hand-written inputs also erroneously used section
offsets instead of CU-relative ones.
While looking at this, I noticed that the Apple implementation was not
fully correct either -- the header contains a DIEOffsetBase field, which
should be added to offsets encoded with the DW_FORM_ref*** family, but
this was not being used. This went unnoticed because all current writers
set this field to zero anyway. I fix this as well and add a hand-written
test which demonstrates the issue.
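The two offset computations described above amount to the following arithmetic. The function names are hypothetical stand-ins, and the first one only applies when the DIE lives in the same section as its unit header.

```cpp
#include <cstdint>

// DWARF v5 name index: the entry stores a CU-relative DIE offset, so the
// section offset is the unit's offset plus that value. Assumes the DIE is
// in the same section as the unit header.
constexpr std::uint64_t dieSectionOffset(std::uint64_t CUOffset,
                                         std::uint64_t DieUnitOffset) {
  return CUOffset + DieUnitOffset;
}

// Apple tables: offsets encoded with a DW_FORM_ref* form are relative to
// the DIEOffsetBase field of the table header.
constexpr std::uint64_t appleDieSectionOffset(std::uint32_t DIEOffsetBase,
                                              std::uint64_t RefValue) {
  return DIEOffsetBase + RefValue;
}

static_assert(dieSectionOffset(0x100, 0x23) == 0x123, "");
static_assert(appleDieSectionOffset(0, 0x23) == 0x23, "");
```

With DIEOffsetBase set to zero, as current writers do, the second function is the identity, which explains why ignoring the field went unnoticed.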
Reviewers: JDevlieghere, dblaikie
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D44202
llvm-svn: 327116