19 Commits

Author SHA1 Message Date
Ádám Kallai
f75973949b
[BOLT][AArch64] Add support for SPE brstack format (#129231)
Since Linux 6.14, Perf gained the ability to report SPE branch events
using the `brstack` format, which matches the layout of LBR/BRBE.

This patch reuses the existing LBR parsing logic to support SPE.

Example SPE brstack format:
```bash
perf script -i perf.data -F pid,brstack --itrace=bl
```
```
  PID       FROM / TO / PREDICTED

16984  0x72e342e5f4/0x72e36192d0/M/-/-/11/RET/-
16984  0x72e7b8b3b4/0x72e7b8b3b8/PN/-/-/11/COND/-
16984  0x72e7b92b48/0x72e7b92b4c/PN/-/-/8/COND/-
16984  0x72eacc6b7c/0x760cc94b00/P/-/-/9/RET/-
16984  0x72e3f210fc/0x72e3f21068/P/-/-/4//-
16984  0x72e39b8c5c/0x72e3627b24/P/-/-/4//-
16984  0x72e7b89d20/0x72e7b92bbc/P/-/-/4/RET/-
```

SPE brstack flags can be two characters long: `PN` or `MN`:
- `P` = predicted branch
- `M` = mispredicted branch
- `N` = optionally appears when the branch is NOT-TAKEN
    - flag is relevant only to  conditional branches


Example of usage with BOLT:

1. Capture SPE branch events:
```bash
perf record -e 'arm_spe_0/branch_filter=1/u' -- binary
```

2. Convert profile for BOLT:
```bash
perf2bolt -p perf.data -o perf.fdata --spe binary
```

3. Run BOLT Optimization:
```bash
llvm-bolt binary -o binary.bolted --data   perf.fdata ...
```

A unit test verifies the parsing of the 'SPE brstack format'.

---------

Co-authored-by: Paschalis Mpeis <paschalis.mpeis@arm.com>
2025-06-20 10:40:35 +01:00
chrisPyr
038fff3f24
[NFC][BOLT] Make file-local cl::opt global variables static (#126472)
#125983
2025-03-05 22:11:05 -08:00
Nikita Popov
e235fcb582
[BOLT] Only link and initialize supported targets (#127509)
Bolt currently links and initializes all LLVM targets. This
substantially increases the binary size, and link time if LTO is used.

Instead, only link the targets specified by BOLT_TARGETS_TO_BUILD. We
also have to only initialize those targets, so generate a
TargetConfig.def file with the necessary information. The way the
initialization is done mirrors what llvm-exegesis does.

This reduces llvm-bolt size from 137MB to 78MB for me.
2025-02-18 09:17:51 +01:00
Nikita Popov
0abe058d7f
[BOLT] Use getMainExecutable() (#126698)
Use LLVM's getMainExecutable() helper instead of rolling our own. This
will result in standard behavior across platforms, such as making sure
that symlinks are always resolved.
2025-02-12 09:44:26 +01:00
Amir Ayupov
6ee5ff95ab
[BOLT] Add profile density computation
Reuse the definition of profile density from llvm-profgen (#92144):
- the density is computed in perf2bolt using raw samples (perf.data or
  pre-aggregated data),
- function density is the ratio of dynamically executed function bytes
  to the static function size in bytes,
- profile density:
  - functions are sorted by density in decreasing order, accumulating
    their respective sample counts,
  - profile density is the smallest density covering 99% of total sample
    count.

In other words, BOLT binary profile density is the minimum amount of
profile information per function (excluding functions in tail 1% sample
count) which is sufficient to optimize the binary well.

The density threshold of 60 was determined through experiments with
large binaries by reducing the sample count and checking resulting
profile density and performance. The threshold is conservative.

perf2bolt would print the warning if the density is below the threshold
and suggest to increase the sampling duration and/or frequency to reach
a given density, e.g.:
```
BOLT-WARNING: BOLT is estimated to optimize better with 2.8x more samples.
```

Test Plan: updated pre-aggregated-perf.test

Reviewers: maksfb, wlei-llvm, rafaelauler, ayermolo, dcci, WenleiHe

Reviewed By: WenleiHe, wlei-llvm

Pull Request: https://github.com/llvm/llvm-project/pull/101094
2024-10-24 18:30:59 -07:00
Amir Ayupov
3c4f00905e
[BOLT] Support perf2bolt-N in the driver
Check invoked tool with `starts_with`.

Addresses the issue where `perf2bolt` invoked using a distro symlink
`perf2bolt-16` fails to run in perf2bolt mode and runs in llvm-bolt mode
instead.

The issue is mentioned in https://vondra.me/posts/playing-with-bolt-and-postgres/

Test Plan:
```
ln -sf perf2bolt perf2bolt-20
perf2bolt-20 clang -p perf.data -o fdata.clang -w yaml.clang
...
PERF2BOLT: wrote 188593 objects and 0 memory objects to fdata.clang
```

Reviewers: ayermolo, rafaelauler, dcci, maksfb

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/111072
2024-10-14 10:17:31 -07:00
Amir Ayupov
52cf07116b
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.

In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.

Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:53:53 -08:00
Kazu Hirata
6da4a7a8e2 [BOLT] Use SmallString::operator std::string (NFC) 2024-01-15 21:59:06 -08:00
Amir Ayupov
16492a6143 [BOLT][NFC] Rename {MachO,}RewriteInstance::create methods
Follow the code style of fallible constructors in [LLVM Programmer's Manual]
(https://llvm.org/docs/ProgrammersManual.html#fallible-constructors)
and rename `RewriteInstance::createRewriteInstance` to `RewriteInstance::create`

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D143119
2023-02-02 12:30:45 -08:00
serge-sans-paille
984b800a03
Move from llvm::makeArrayRef to ArrayRef deduction guides - last part
This is a follow-up to https://reviews.llvm.org/D140896, split into
several parts as it touches a lot of files.

Differential Revision: https://reviews.llvm.org/D141298
2023-01-10 11:47:43 +01:00
Nicolai Hähnle
f7872cdce1 CommandLine: add and use cl::SubCommand::get{All,TopLevel}
Prefer using these accessors to access the special sub-commands
corresponding to the top-level (no subcommand) and all sub-commands.

This is a preparatory step towards removing the use of ManagedStatic:
with a subsequent change, these global instances will be moved to
be regular function-scope statics.

It is split up to give downstream projects a (albeit short) window in
which they can switch to using the accessors in a forward-compatible
way.

Differential Revision: https://reviews.llvm.org/D129118
2022-08-02 23:49:16 +02:00
Amir Ayupov
af6e66f44c [BOLT][NFC] Report errors from RewriteInstance discoverStorage and run
Further improve error handling in BOLT by reporting `RewriteInstance` errors in
a library and fuzzer-friendly way instead of exiting.

Follow-up to D119658

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120224
2022-02-23 20:42:39 -08:00
Amir Ayupov
32d2473a5d [BOLT][NFC] Report errors from createBinaryContext and RewriteInstance ctor
Refactor createBinaryContext and RewriteInstance/MachORewriteInstance
constructors to report an error in a library and fuzzer-friendly way instead of
returning a nullptr or exiting.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D119658
2022-02-17 00:50:52 -08:00
serge-sans-paille
290e482342 Cleanup LLVMDWARFDebugInfo
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:

llvm/DebugInfo/DWARF/DWARFContext.h no longer includes:
- "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
- "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAranges.h"
- "llvm/DebugInfo/DWARF/DWARFDebugFrame.h"
- "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
- "llvm/DebugInfo/DWARF/DWARFDebugMacro.h"
- "llvm/DebugInfo/DWARF/DWARFGdbIndex.h"
- "llvm/DebugInfo/DWARF/DWARFSection.h"
- "llvm/DebugInfo/DWARF/DWARFTypeUnit.h"
- "llvm/DebugInfo/DWARF/DWARFUnitIndex.h"

Plus llvm/Support/Errc.h not included by a bunch of llvm/DebugInfo/DWARF/DWARF*.h files

Preprocessed lines to build llvm on my setup:
after: 1065629059
before: 1066621848

Which is a great diff!

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119723
2022-02-15 09:16:03 +01:00
Vladislav Khmelevsky
5c2ae5f454 [BOLT] Refactor heatmap to be standalone tool
Separate heatmap from bolt and build it as standalone tool.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D118946
2022-02-07 22:00:44 +03:00
Vladislav Khmelevsky
20e9d4caf0 [BOLT] Prepare BOLT for unit-testing
This patch adds unit testing support for BOLT. In order to do this we will need at least do this changes on the code level:
* Make createMCPlusBuilder accessible externally
* Remove positional InputFilename argument to bolt utlity sources
And prepare the cmake and lit for the new tests.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Reviewed By: maksfb, Amir

Differential Revision: https://reviews.llvm.org/D118271
2022-01-27 00:22:13 +03:00
Maksim Panchenko
2f09f445b2 [BOLT][NFC] Fix file-description comments
Summary: Fix comments at the start of source files.

(cherry picked from FBD33274597)
2021-12-21 10:21:41 -08:00
Maksim Panchenko
40c2e0fafe [BOLT][NFC] Reformat with clang-format
Summary: Selectively apply clang-format to BOLT code base.

(cherry picked from FBD33119052)
2021-12-14 16:52:51 -08:00
Rafael Auler
a34c753fe7 Rebase: [NFC] Refactor sources to be buildable in shared mode
Summary:
Moves source files into separate components, and make explicit
component dependency on each other, so LLVM build system knows how to
build BOLT in BUILD_SHARED_LIBS=ON.

Please use the -c merge.renamelimit=230 git option when rebasing your
work on top of this change.

To achieve this, we create a new library to hold core IR files (most
classes beginning with Binary in their names), a new library to hold
Utils, some command line options shared across both RewriteInstance
and core IR files, a new library called Rewrite to hold most classes
concerned with running top-level functions coordinating the binary
rewriting process, and a new library called Profile to hold classes
dealing with profile reading and writing.

To remove the dependency from BinaryContext into X86-specific classes,
we do some refactoring on the BinaryContext constructor to receive a
reference to the specific backend directly from RewriteInstance. Then,
the dependency on X86 or AArch64-specific classes is transfered to the
Rewrite library. We can't have the Core library depend on targets
because targets depend on Core (which would create a cycle).

Files implementing the entry point of a tool are transferred to the
tools/ folder. All header files are transferred to the include/
folder. The src/ folder was renamed to lib/.

(cherry picked from FBD32746834)
2021-10-08 11:47:10 -07:00