26 Commits

Author SHA1 Message Date
Sayhaan Siddiqui
9a3e66e314
[BOLT][DWARF][NFC] Fix DebugStrOffsetsWriter (#100672)
Fix DebugStrOffsetsWriter so updateAddressMap can't be called after it
is finalized.
2024-07-26 18:58:25 -07:00
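The guard described in #100672 can be pictured with a small, self-contained sketch. The class below is a hypothetical stand-in (OffsetsWriterSketch, its members and its parameters are assumptions, not BOLT's DebugStrOffsetsWriter). It rejects updates after finalize() by returning false so that the behaviour is easy to check; the real change may enforce this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical writer with a one-way "finalized" state (not BOLT's class).
class OffsetsWriterSketch {
public:
  // Records a mapping; refuses to do so once the writer is finalized.
  bool updateAddressMap(std::uint32_t Index, std::uint32_t Address) {
    if (Finalized)
      return false;
    IndexToAddress[Index] = Address;
    return true;
  }

  // Freezes the writer; finalizing twice is treated as a programming error.
  void finalize() {
    assert(!Finalized && "finalize called twice");
    Finalized = true;
  }

  std::size_t size() const { return IndexToAddress.size(); }

private:
  std::map<std::uint32_t, std::uint32_t> IndexToAddress;
  bool Finalized = false;
};
```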
Pavel Labath
09cbb45edd
[BOLT][DWARF][NFC] A better DIEBuilder for the llvm API change in #98905 (#99324)
The caller (cloneAttribute) already switches on the reference type. By
aligning the cases with the retrieval functions, we can avoid branching
twice.
2024-07-18 09:46:29 +02:00
Pavel Labath
9dab91247d Fix bolt for #98905
2024-07-16 13:29:00 +02:00
Alexander Yermolovich
61589b8599
[BOLT][DWARF] Fix parent chain in debug_names entries with forward declaration. (#93865)
Previously, when an entry was skipped in the parent chain, a child would
point to the next valid entry in the chain. After discussion in
https://github.com/llvm/llvm-project/pull/91808 this was judged not very
useful. Changed the implementation so that none of the children of a
skipped entry have DW_IDX_parent.
2024-06-05 09:57:11 -07:00
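The difference between the old and new parent-chain behaviour in #93865 can be sketched as follows. This is a simplified model under stated assumptions: DieSketch, parentOld and parentNew are hypothetical names, DIEs are identified by vector index, and an empty optional stands for "no DW_IDX_parent is emitted".

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

struct DieSketch {
  int Parent;   // index of the parent DIE, or -1 for a top-level DIE
  bool Skipped; // true when no index entry is emitted for this DIE
};

// Old behaviour: climb the chain until an ancestor that has an entry.
std::optional<int> parentOld(const std::vector<DieSketch> &Dies,
                             std::size_t I) {
  assert(I < Dies.size() && "DIE index out of range");
  int P = Dies[I].Parent;
  while (P != -1 && Dies[P].Skipped)
    P = Dies[P].Parent;
  if (P == -1)
    return std::nullopt;
  return P;
}

// New behaviour: if the direct parent was skipped, emit no parent at all.
std::optional<int> parentNew(const std::vector<DieSketch> &Dies,
                             std::size_t I) {
  assert(I < Dies.size() && "DIE index out of range");
  int P = Dies[I].Parent;
  if (P == -1 || Dies[P].Skipped)
    return std::nullopt;
  return P;
}
```

With a chain 0 -> 1 -> 2 in which DIE 1 is skipped, parentOld reports DIE 0 as the parent of DIE 2, while parentNew reports no parent.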
Alexander Yermolovich
99fad7ebd8
[BOLT][DWARF] Update DW_AT_comp_dir/DW_AT_dwo_name for DWO TUs (#91486)
Type unit DIEs generated by clang contain DW_AT_comp_dir/DW_AT_dwo_name.
This was added to clang to help LLDB figure out where a type unit comes
from when accessing an entry in a .debug_names accelerator table and
type units in a .dwp file.

When BOLT writes out .dwo files it changes their names. The user can
also specify the directory they are written to. Added support
to BOLT to update those attributes.
2024-05-14 15:08:45 -07:00
Amir Ayupov
fd38366e45
[BOLT][NFC] Clean includes, add license headers (#87200)
2024-03-31 19:29:45 -07:00
Alexander Yermolovich
f3cfe016c5
[BOLT][DWARF] Add support for cross-cu references for debug-names (#86015)
DW_AT_abstract_origin can be a cross-CU reference as a by-product of
LTO. At the IR level an address is stored for absolute references,
whereas a DIE is stored for relative references. Added a map to keep
track of cross-CU referenced DIEs to use when we add an Entry.
2024-03-22 13:48:49 -07:00
Alexander Yermolovich
a4610c7182
[BOLT][DWARF] Add support for DW_IDX_parent (#85285)
This adds support for DW_IDX_parent. If a DIE has a parent, then
DW_IDX_parent in the Entry will point to the Entry for that parent DIE.
Otherwise it will have DW_FORM_flag_present in the abbrev, which takes
zero space in the Entry.

This came from

https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151
2024-03-15 13:52:45 -07:00
Alexander Yermolovich
6de5fcc746
[BOLT][DWARF] Add support for .debug_names (#81062)
The DWARF5 spec supports the .debug_names acceleration table. This is the
formalized version of the combination of gdb-index/pubnames/types. Added
an implementation of it to BOLT. It supports both monolithic and split
DWARF, with and without Type Units. It does not include parent indices;
these will be added in a follow-up PR. Unlike LLVM output, this will put
all the CUs and TUs into one Module.
2024-02-26 14:00:31 -08:00
Alexander Yermolovich
640e781dc8
[BOLT][DWARF][NFC] Use SkeletonCU in place of IsDWO check (#82540)
Changed isDWO to a function that checks the Skeleton CU that is passed in.
This is in preparation for
https://github.com/llvm/llvm-project/pull/81062.
2024-02-21 16:18:18 -08:00
Amir Ayupov
52cf07116b
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are printed to the screen once the `--log-file` option is
used.

In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.

Because this is a significant change by itself, not all code has
been ported yet. Code from Profiler libs (DataAggregator and friends)
still prints errors directly to the screen.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:53:53 -08:00
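The idea behind #81524, routing log output through a context object rather than a global stream, can be illustrated with a minimal sketch. ContextSketch, reportProgress and captureExample are hypothetical names built on std::ostream; they are not BOLT's BinaryContext or its JournalingStreams, and the message text is only illustrative.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical context that decides where log output goes.
class ContextSketch {
public:
  // Defaults to the screen; a library user may pass any other stream.
  explicit ContextSketch(std::ostream &Stream = std::cout) : Out(&Stream) {}
  std::ostream &outs() { return *Out; }

private:
  std::ostream *Out;
};

// Library code logs through the context, never directly to std::cout.
void reportProgress(ContextSketch &Ctx, const std::string &Msg) {
  assert(!Msg.empty() && "log message must not be empty");
  Ctx.outs() << "BOLT-INFO: " << Msg << '\n';
}

// Usage: capture the log in memory instead of printing it.
std::string captureExample() {
  std::ostringstream Buffer;
  ContextSketch Ctx(Buffer);
  reportProgress(Ctx, "processing done");
  return Buffer.str();
}
```

Because the destination is chosen once, when the context is built, redirecting to a file or discarding output needs no changes in the code that logs.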
Alexander Yermolovich
ad4cead67c
[BOLT][DWARF][NFC] Initialize CloneUnitCtxMap with current partition size (#75876)
We would always allocate the maximum amount for the vector containing
DWARFUnitInfo. In real use cases what ends up happening is that we
allocate a giant vector when processing one CU, or multiple CUs in the
thin-lto case. This led to a lot of memory overhead, and a 2x BOLT
processing slowdown for at least one service built with monolithic DWARF.

For binaries built with LTO with clang, all of the CUs that have cross
references will share an abbrev table and will be processed in one
batch. The rest of the CUs are processed in batches of
--cu-processing-batch-size, which defaults to 1.

For theoretical cases where cross-cu references are present but the CUs
do not share an abbrev table, the size of CloneUnitCtxMap will increase
as each CU is processed.
2023-12-20 16:12:52 -08:00
Alexander Yermolovich
bf2b035e58
[BOLT][DWARF] Fix handling .debug_str_offsets for type units (#75522)
There was an assumption that TUs and CUs share a .debug_str_offsets
contribution. For ThinLTO builds this is not the case. Changed so that we
also parse contributions for TUs, and did some refactoring so that we
don't re-parse contributions that were not modified.
2023-12-14 17:27:21 -08:00
Alexander Yermolovich
00dbea7c73
[BOLT][DWARF][NFC] Added const to variable (#73731)
Nit followup to 72729.
2023-11-28 17:30:28 -08:00
Alexander Yermolovich
b47b3bee7b
[BOLT][DWARF] Fix handling of DWARF5 DWP (#72729)
Fixed handling of DWP as input. Before, BOLT crashed. Now it will write
out the correct CU and all the TUs. A potential future improvement is to
scan all the TUs used in this CU, and only include those.
2023-11-28 15:54:14 -08:00
Alexander Yermolovich
2c784f7d26 [BOLT][DWARF] Fix handling of invalid DIE references
The compiler can generate DIE references that are invalid. Previously BOLT could
assert when writing out IR to .debug_info. Changed where DIE offsets are updated
so that it's always done, making sure that the assert is not triggered.

Added more specific warnings, and the ability to print out the invalid referenced
DIE offset when verbosity >=1.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D157746
2023-08-14 17:28:24 -07:00
Alexander Yermolovich
f52e61f3d3 [BOLT][DWARF] Replace MD5 with hash_combine
Slight performance improvement, based on perf.

Collected on clang-17 built with DWARF4 + split dwarf.
MD5
8:46.50 real,   713.38 user,    64.19 sys,      0 amem, 41933136 mmem
8:27.44 real,   708.55 user,    63.83 sys,      0 amem, 41906576 mmem
8:40.37 real,   724.63 user,    62.56 sys,      0 amem, 42319572 mmem

hash_combine

8:03.99 real,   681.92 user,    60.04 sys,      0 amem, 42459204 mmem
8:02.92 real,   685.20 user,    62.56 sys,      0 amem, 41879164 mmem
7:57.85 real,   690.27 user,    60.12 sys,      0 amem, 41806240 mmem

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D155764
2023-07-20 11:51:17 -07:00
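For readers unfamiliar with the technique, a combine-style hash folds each field into a running seed instead of serializing the fields and running a cryptographic digest such as MD5 over them. The sketch below uses a Boost-style mixing step on top of std::hash. It is a stand-in for illustration only, not the implementation of llvm::hash_combine, and hashCombineSketch and hashFields are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <initializer_list>

// Boost-style mixing step: folds one value into the running seed.
inline void hashCombineSketch(std::size_t &Seed, std::uint64_t Value) {
  Seed ^= std::hash<std::uint64_t>{}(Value) + 0x9e3779b9u + (Seed << 6) +
          (Seed >> 2);
}

// Hashes a short list of integer fields of some record.
std::size_t hashFields(std::initializer_list<std::uint64_t> Fields) {
  assert(Fields.size() > 0 && "nothing to hash");
  std::size_t Seed = 0;
  for (std::uint64_t F : Fields)
    hashCombineSketch(Seed, F);
  return Seed;
}
```

A hash of this kind is not collision-resistant in the cryptographic sense, which is an acceptable trade-off when it is only used for in-process lookup.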
Alexander Yermolovich
41afc42673 [BOLT][DWARF][NFC] Set initial offset of DIE
Set the initial offset of a DIE to the offset of the input DIE. This is to
make "printf" debugging easier.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D155031
2023-07-13 10:44:44 -07:00
Kazu Hirata
e71f9d264e [BOLT] Fix an unused-variable warning
This patch fixes:

  bolt/lib/Core/DIEBuilder.cpp:468:18: error: unused variable 'Ref'
  [-Werror,-Wunused-variable]
2023-07-10 15:51:58 -07:00
Alexander Yermolovich
dcfa2ab534 [BOLT][DWARF] Change to process and write out TUs first then CUs in batches
To reduce the memory footprint, changed so that we process and write out TUs first,
then reset DIEBuilder and process CUs. CUs are processed in buckets. The first bucket
contains all the CUs with cross-CU references. The rest are processed one at a time.

clang-17 built in debug mode by clang-17.
before
8:25.81 real, 834.37 user, 86.03 sys, 0 amem, 79525064 mmem
8:02.20 real, 820.46 user, 81.81 sys, 0 amem, 79501616 mmem
7:52.69 real, 802.01 user, 83.99 sys, 0 amem, 79534392 mmem

after
7:49.35 real, 822.04 user, 66.19 sys, 0 amem, 34934260 mmem
7:42.16 real, 825.46 user, 63.52 sys, 0 amem, 34951660 mmem
7:46.71 real, 821.11 user, 63.14 sys, 0 amem, 34981164 mmem

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D151909
2023-07-10 14:42:04 -07:00
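The bucketing scheme described above can be sketched as a simple partition. CuSketch and makeBuckets are hypothetical names, CUs are identified by integer ID, and the batch size for the remaining CUs is fixed at one, as in the commit message.

```cpp
#include <cassert>
#include <vector>

struct CuSketch {
  int Id;
  bool HasCrossCuRef; // true if this CU takes part in cross-CU references
};

// Bucket 0 holds every CU with cross-CU references, so they are processed
// together; each remaining CU gets a bucket of its own.
std::vector<std::vector<int>> makeBuckets(const std::vector<CuSketch> &CUs) {
  std::vector<std::vector<int>> Buckets(1);
  for (const CuSketch &CU : CUs) {
    if (CU.HasCrossCuRef)
      Buckets[0].push_back(CU.Id);
    else
      Buckets.push_back(std::vector<int>{CU.Id});
  }
  return Buckets;
}
```

In this sketch bucket 0 is simply empty when no CU has cross-CU references. Only one bucket needs to be resident at a time, which is where the memory saving comes from.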
Alexander Yermolovich
c33536e9c3 [BOLT][DWARF] Numerous fixes for a new DWARFRewriter
* Some cleanup and minor fixes for the new debug information re-writer before moving on
to productization.

* The new rewriter wasn't handling binaries with DWARF5, or with DWARF4 and
-fdebug-types-sections.

* Removed dead cross cu reference code.

* Added support for DW_AT_sibling.

* With the new re-writer the abbrev number can change, which can lead to the offsets of
Type Units changing. Before, we would just copy raw data. Changed to write out the Type
Unit List. This is generated by gdb-add-index.

* Fixed how bolt handles gdb-index generated by gdb-11 with types sections.
Simplified logic that handles variations of gdb-index.

* Clang can generate two type units with the same hash, but different content. LLD
does not de-duplicate when ThinLTO is involved. Changed so that TU hash and
offset are used to make TUs unique.

* It is possible to have references within location expression to another DIE.
Fixed it so that relative offset is updated correctly.

* Removed all the code related to patching.

* Removed dead code. Changed how we handle writing out TUs and the TU Index. It now
  should fully work for DWARF4 and DWARF5.

* Removed unused arguments from some APIs, changed return type to void, and other
small cleanups.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D151906
2023-07-10 14:42:03 -07:00
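The type-unit uniqueness fix listed above (keying on hash and offset rather than hash alone) can be sketched with a set of pairs. TuKeySketch and TuRegistrySketch are hypothetical names; how BOLT actually stores these keys is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>

// Key: (type hash, unit offset). The hash alone is not enough when two
// type units share a hash but differ in content.
using TuKeySketch = std::pair<std::uint64_t, std::uint64_t>;

class TuRegistrySketch {
public:
  // Returns true if this (hash, offset) pair has not been seen before.
  bool insert(std::uint64_t Hash, std::uint64_t Offset) {
    return Seen.emplace(Hash, Offset).second;
  }
  std::size_t size() const { return Seen.size(); }

private:
  std::set<TuKeySketch> Seen;
};
```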
Rui Zhong
87fb0ea27e [BOLT][DWARF] Implement new mechanism for DWARFRewriter
This revision implements a new mechanism for DWARFRewriter.
The new mechanism takes the same approach as DWARFLinker.
By parsing debug information into IR, we can handle debug information more flexibly.
The debug information update process now operates on the IR, and the IR is written out to the binary once updating has finished.

A new class was added: DIEBuilder. This class is responsible for parsing debug information and raising it to the IR level.
This class is also used to write out the .debug_info and .debug_abbrev sections.
Since we output a brand new Abbrev section we don't always need to convert low_pc/high_pc into ranges.
When conversion does happen we can also remove the low_pc entry.

Reviewed By: maksfb, ayermolo

Differential Revision: https://reviews.llvm.org/D130315
2023-07-10 14:42:03 -07:00
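The remark about low_pc/high_pc versus ranges can be made concrete with a small decision function. The criterion used here is an assumption made for illustration (a single contiguous output range keeps low_pc/high_pc, anything else switches to a range list); the commit message does not state the exact condition, and RangeSketch, PcEncodingSketch and chooseEncoding are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct RangeSketch {
  std::uint64_t Low;  // inclusive start address
  std::uint64_t High; // exclusive end address
};

enum class PcEncodingSketch { LowHighPc, Ranges };

// Assumed rule: one contiguous output range keeps low_pc/high_pc; several
// disjoint ranges need a range list, and low_pc can then be dropped.
PcEncodingSketch chooseEncoding(const std::vector<RangeSketch> &Output) {
  assert(!Output.empty() && "a DIE with code needs at least one range");
  return Output.size() == 1 ? PcEncodingSketch::LowHighPc
                            : PcEncodingSketch::Ranges;
}
```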
Nico Weber
de7781ea42 Revert "[DWARF][BOLT] Implement new mechanism for DWARFRewriter"
This reverts commit 460a2244430fae192298a5fd9fa2a269e540e8c1.
It breaks building on macOS, and it was landed with a review URL
pointing to some Facebook-internal service.

Also reverts a bunch of follow-ups:

Revert "[BOLT][DWARF] Don't check string offsets"
This reverts commit f9d6f48c8bf5acaac07502403c41cf0b0d89c8d2.

Revert "[BOLT][DWARF] Change to process and write out TUs first then CUs in batches"
This reverts commit 88e95c1e4bb6e2ad3bfd185b96341ad5c09eff6b.

Revert "[BOLT][DWARF] Output DWO files as they are being processed"
This reverts commit 46ca2e3fcd419b1246357ed3b9cd36630f16e64d.

Revert "[BOLT][DWARF] Don't check string offsets"
This reverts commit cfe4a4b04f219a9dbb4e3fc01883437b6ff0e702.

Revert "[BOLT][DWARF] Numerous fixes for a new DWARFRewriter"
This reverts commit 2701a661daa393ad5901ac88d420d7aa931eda0d.
2023-07-07 08:07:01 -04:00
Alexander Yermolovich
88e95c1e4b [BOLT][DWARF] Change to process and write out TUs first then CUs in batches
Summary:
To reduce the memory footprint, changed so that we process and write out TUs first,
then reset DIEBuilder and process CUs. CUs are processed in buckets. The first bucket
contains all the CUs with cross-CU references. The rest are processed one at a time.

clang-17 built in debug mode by clang-17.
before
8:25.81 real, 834.37 user, 86.03 sys, 0 amem, 79525064 mmem
8:02.20 real, 820.46 user, 81.81 sys, 0 amem, 79501616 mmem
7:52.69 real, 802.01 user, 83.99 sys, 0 amem, 79534392 mmem

after
7:49.35 real, 822.04 user, 66.19 sys, 0 amem, 34934260 mmem
7:42.16 real, 825.46 user, 63.52 sys, 0 amem, 34951660 mmem
7:46.71 real, 821.11 user, 63.14 sys, 0 amem, 34981164 mmem

Differential Revision: https://phabricator.intern.facebook.com/D45883198
2023-07-06 14:21:26 -07:00
Alexander Yermolovich
2701a661da [BOLT][DWARF] Numerous fixes for a new DWARFRewriter
Summary:

* Some cleanup and minor fixes for the new debug information re-writer before moving on
to productization.

* The new rewriter wasn't handling binaries with DWARF5, or with DWARF4 and
-fdebug-types-sections.

* Removed dead cross cu reference code.

* Added support for DW_AT_sibling.

* With the new re-writer the abbrev number can change, which can lead to the offsets of
Type Units changing. Before, we would just copy raw data. Changed to write out the Type
Unit List. This is generated by gdb-add-index.

* Fixed how bolt handles gdb-index generated by gdb-11 with types sections.
Simplified logic that handles variations of gdb-index.

* Clang can generate two type units with the same hash, but different content. LLD
does not de-duplicate when ThinLTO is involved. Changed so that TU hash and
offset are used to make TUs unique.

* It is possible to have references within location expression to another DIE.
Fixed it so that relative offset is updated correctly.

* Removed all the code related to patching.

* Removed dead code. Changed how we handle writing out TUs and the TU Index. It now
  should fully work for DWARF4 and DWARF5.

* Removed unused arguments from some APIs, changed return type to void, and other
small cleanups.


Differential Revision: https://phabricator.intern.facebook.com/D46168257
2023-07-06 14:21:26 -07:00
Alexander Yermolovich
460a224443 [DWARF][BOLT] Implement new mechanism for DWARFRewriter
Summary:
This revision implements a new mechanism for DWARFRewriter.
The new mechanism takes the same approach as DWARFLinker.
By parsing debug information into IR, we can handle debug information more flexibly.
The debug information update process now operates on the IR, and the IR is written out to the binary once updating has finished.

A new class was added: DIEBuilder. This class is responsible for parsing debug information and raising it to the IR level.
This class is also used to write out the .debug_info and .debug_abbrev sections.
Since we output a brand new Abbrev section we don't always need to convert low_pc/high_pc into ranges.
When conversion does happen we can also remove the low_pc entry.

Differential Revision: https://phabricator.intern.facebook.com/D39484421

Tasks: T117448832
2023-07-06 14:21:26 -07:00