According to the standard `DW_LNCT_directory_index` can be `data1`,
`data2`, or `udata` (see 6.2.4.1). The code was using `data1`, but this
limits the number of directories to 256, even if the variable holding
the directory index is a `uint64_t`. `dsymutil` was hitting an assertion
when trying to write directory indices higher than 255.
Modify the classic and the parallel DWARF linkers to use `udata` and
encode the directory indices as ULEB128 and provide a test that has more
than 256 directories to check the changes are working as expected.
For people that were using `dsymutil` with CUs that had between 128-256
directories, this will mean that for those indices 2 bytes will be used
now, instead of just one.
Without this patch, we first default-construct a value with
try_emplace and then immediately override it with a new value.
This patch inserts the final value with try_emplace and simplies the
code around it.
The result of the function cannot be correctly interpreted without
knowing the precise form type (a type signature needs to be looked up
very differently from a supplementary debug info reference). The
function sort of worked because the two reference types (unit-relative
and section-relative) that can be handled uniformly are also the most
common types of references, but this setup made it easy to write code
which does not support other kinds of reference (and if one tried to
support them, the result didn't look pretty --
https://github.com/llvm/llvm-project/pull/97423/files#r1676217081).
The split is based on the reference type classification from DWARFv5
(Section 7.5.5 Classes and Forms), and it should enable uniform (if
slightly more verbose) hadling. Note that this only affects users which
want more control of how (or if) the references are resolved. Users
which just want to access the referenced DIE can use the higher level
API (DWARFDie::GetAttributeValueAsReferencedDie) which returns (or will
return after #97423 is merged) the correct die for all reference types
(except for supplementary references, which we don't support right now).
DWARF-4 introduced DWARF discriminators but both, the classic and the
parallel DWARF linkers discarded them so far.
After applying this patch dsymutil, which uses the DWARF linkers,
correctly preserves discriminator information.
This fixes https://github.com/llvm/llvm-project/issues/93886. The UnitID
is not
unique between CUs and TUs. This led to DW_IDX_parent to point ot an
entry for a
DIE in CU if it had the same relative offset as TU die.
Added a IsTU to the hash for parent chain.
This fixes https://github.com/llvm/llvm-project/issues/93886. The UnitID
is not
unique between CUs and TUs. This led to DW_IDX_parent to point ot an
entry for a
DIE in a CU if it had the same relative offset as a TU die.
Added a IsTU to the hash for parent chain.
The function was meant to find the Developer/ dir, but it found a
Developer directory nested deep inside the top-level Developer dir.
The new implementation rejects everything in Xcode.app/Developer in
broad strokes.
rdar://128571037
Remove support for obfuscated bitcode in dsymutil and the DWARF linker.
We no longer support bitcode submissions and the obfuscation support has
been removed from the rest of the compiler.
rdar://123863918
Remove support for obfuscated bitcode in dsymutil and the DWARF linker.
We no longer support bitcode submissions and the obfuscation support has
been removed from the rest of the compiler.
rdar://123863918
The base class llvm::ThreadPoolInterface will be renamed
llvm::ThreadPool in a subsequent commit.
This is a breaking change: clients who use to create a ThreadPool must
now create a DefaultThreadPool instead.
DWARFLinkerImpl::DWARFLinkerImpl initializes
DebugStrStrings/DebugLineStrStrings/CommonSections using GlobalData
but GlobalData is initialized after the three members.
Move GlobalData before.
Fix#81110
This implements the ideas discussed in [1].
To summarize, this commit changes AsmPrinter so that it outputs
DW_IDX_parent information for debug_name entries. It will enable
debuggers to speed up queries for fully qualified types (based on a
DWARFDeclContext) significantly, as debuggers will no longer need to
parse the entire CU in order to inspect the parent chain of a DIE.
Instead, a debugger can simply take the parent DIE offset from the
accelerator table and peek at its name in the debug_info/debug_str
sections.
The implementation uses two types of DW_FORM for the DW_IDX_parent
attribute:
1. DW_FORM_ref4, which points to the accelerator table entry for the
parent.
2. DW_FORM_flag_present, when the entry has a parent that is not in the
table (that is, the parent doesn't have a name, or isn't allowed to be
in the table as per the DWARF spec). This is space-efficient, since it
takes 0 bytes.
The implementation works by:
1. Changing how abbreviations are encoded (so that they encode which
form, if
any, was used to encode IDX_Parent)
2. Creating an MCLabel per accelerator table entry, so that they may be
referred by IDX_parent references.
When all patches related to this are merged, we are able to show that
evaluating an expression such as:
```
lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \
clang++ -c -g test.cpp -o /dev/null
```
is far faster: from ~5000 ms to ~1500ms.
Building llvm-project + clang with and without this patch, and looking
at its impact on object file size:
```
ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5} END {printf "%\047d\n", s}'
11,507,327,592
-la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5} END {printf "%\047d\n", s}'
11,436,446,616
```
That is, an increase of 0.62% in total object file size.
Looking only at debug_names:
```
$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3} END {printf "%\047d\n", s}'
440,772,348
$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3} END {printf "%\047d\n", s}'
369,867,920
```
That is an increase of 19%.
DWARF Linkers need to be changed in order to support this. This commit
already brings support to "base" linker, but it does not attempt to
modify the parallel linker. Accelerator entries refer to the
corresponding DIE offset, and this patch also requires the parent DIE
offset -- it's not clear how the parallel linker can access this. It may
be obvious to someone familiar with it, but it would be nice to get help
from its authors.
[1]:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/
This patch is extracted from #74725.
The DwarfStreamer interface looks overcomplicated and has unnecessary
dependencies. This patch avoids creation of DwarfStreamer by DWARFLinker and
simplifies interface.
It was noted that new DWARFLinker libraries do not follow naming
agreement -
https://github.com/llvm/llvm-project/pull/75925#issuecomment-1883301659
This patch rename libraries to match with the agreement.
Rename LLVMDWARFLinkerBase library into the LLVMDWARFLinker. Rename
LLVMDWARFLinker library into the LLVMDWARFLinkerClassic. Correct include
path according to the new directory structure.
This patch creates DWARFLinkerBase library, places DWARFLinker code into
DWARFLinker\Classic, places DWARFLinkerParallel into DWARFLinker\Parallel.
updates BOLT to use new library. This patch is NFC.