33 Commits

Author SHA1 Message Date
wangjue
dbb79c30c9
[BOLT][Instrumentation] Initial instrumentation support for RISCV64 (#133882)
This patch adds code generation for RISCV64 instrumentation.The work
    involved includes the following three points:

a) Implements support for instrumenting direct function call and jump
    on RISC-V which relies on , Atomic instructions
    (used to increment counters) are only available on RISC-V when the A
    extension is used.

b) Implements support for instrumenting direct function inderect call
    by implementing the createInstrumentedIndCallHandlerEntryBB and
createInstrumentedIndCallHandlerExitBB interfaces. In this process, we
    need to accurately record the target address and IndCallID to ensure
    the correct recording of the indirect call counters.

c)Implemented the RISCV64 Bolt runtime library, implemented some system
call interfaces through embedded assembly. Get the difference between
runtime addrress of .text section andstatic address in section header
table, which in turn can be used to search for indirect call
description.

However, the community code currently has problems with relocation in
    some scenarios, but this has nothing to do with instrumentation. We
    may continue to submit patches to fix the related bugs.
2025-04-16 23:01:00 -07:00
alekuz01
38faf32d23
[BOLT] Enable hugify for AArch64 (#117158)
Add required hugify instrumentation and runtime libraries support for AArch64.
Fixes #58226
Unblocks #62695
2025-04-15 12:59:05 +01:00
Elvina Yakubova
87e9c42495 [BOLT][Instrumentation] AArch64 instrumentation support in runtime
This commit adds support for AArch64 in instrumentation runtime library,
including AArch64 system calls.
Also this commit divides syscalls into target-specific files.

Reviewed By: rafauler, yota9

Differential Revision: https://reviews.llvm.org/D151942
2023-08-24 19:34:57 +03:00
Denis Revunov
a86dd9ae60 [BOLT][Instrumentation] Fix indirect call profile in PIE
Because indirect call tables use static addresses for call sites, but pc
values recorded by runtime may be subject to ASLR in PIE, we couldn't
find indirect call descriptions by their runtime address in PIE. It
resulted in [unknown] entries in profile for all indirect calls. We need
to substract base address of .text from runtime addresses to get the
corresponding static addresses. Here we create a getter for base address
of .text and substract it's return value from recorded PC values. It
converts them to static addresses, which then may be used to find the
corresponding indirect call descriptions.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D154121
2023-08-23 23:50:31 +03:00
Denis Revunov
a799298152 [BOLT][Instrumentation] Keep profile open in WatchProcess
When a binary is instrumented with --instrumentation-sleep-time and
instrumentation-wait-forks options and lauched, the profile is
periodically written until all the forks die. The problem is that we
cannot wait for the whole process tree, and we have no way to tell when
it's safe to read the profile. Hovewer, if we keep profile open
throughout the life of the process tree, we can use fuser to determine
when writing is finished.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D154436
2023-08-23 23:50:31 +03:00
Denis Revunov
60bbddf3c1 [BOLT][Instrumentation][NFC] Define and use more syscall constants
Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D154419
2023-07-10 21:36:04 +03:00
Denis Revunov
8b23a853b9 Reland [BOLT][Instrumentation][NFC] define and use mmap flags
Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D154056
2023-06-30 11:10:08 +03:00
Amir Aupov
3e13b299f9 Revert "[BOLT][Instrumentation][NFC] define and use mmap flags"
This reverts commit f0b45fba4b64ab0b5d6c50d978e02f0d12d4d070.

The stack broke https://lab.llvm.org/buildbot/#/builders/252.
2023-06-29 22:11:17 -07:00
Denis Revunov
47934c119e [BOLT][Instrumentation] Add dumping function to instrumentation hash tables
Since there are no other means to debug the instrumentation library
other than using stdout, having a function to print hash table entries
is very useful.

Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D153771
2023-06-30 01:03:53 +03:00
Denis Revunov
f0b45fba4b [BOLT][Instrumentation][NFC] define and use mmap flags
Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D154056
2023-06-30 01:03:52 +03:00
Nathan Sidwell
1c3653df08 [BOLT] Robustify compile-time config check
The BOLT runtime is specifically hard coded for x86_64 linux or x86_64
darwin. (Using x86_64 syscalls, hardcoding syscall numbers.)

Make it very clear this is for those specific pair of systems.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D148825
2023-04-21 12:37:54 -04:00
Alexey Moksyakov
1fb186198a adds huge pages support of PIE/no-PIE binaries
This patch adds the huge pages support (-hugify) for PIE/no-PIE
binaries. Also returned functionality to support the kernels < 5.10
where there is a problem in a dynamic loader with the alignment of
pages addresses.

Differential Revision: https://reviews.llvm.org/D129107
2022-11-04 15:14:21 +03:00
Vladislav Khmelevsky
e10e120cea [BOLT][Runtime] Fix memset definition
Differential Revision: https://reviews.llvm.org/D129321
2022-07-09 01:17:08 +03:00
Maksim Panchenko
ea2182fedd [BOLT] Add runtime functions required by freestanding environment
Compiler can generate calls to some functions implicitly, even under
constraints of freestanding environment. Make sure these functions are
available in our runtime objects.

Fixes test failures on some systems after https://reviews.llvm.org/D128960.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D129168
2022-07-06 11:22:22 -07:00
Elvina Yakubova
35155a0716 [BOLT] Change mutex implementation
Changed acquire implemetaion to __atomic_test_and_set() and release
to __atomic_clear() so it eliminates inline asm usage and is arch
independent.

Elvina Yakubova,
Advanced Software Technology Lab, Huawei

Reviewers: yota9, maksfb, rafauler

Differential Revision: https://reviews.llvm.org/D129162
2022-07-06 08:19:54 +03:00
Vladislav Khmelevsky
823ebcc7a8 [BOLT] Fix runtime osx cross-compile build
Place include elf.h under !apple condition

Differential Revision: https://reviews.llvm.org/D119038
2022-02-08 03:42:47 +03:00
Amir Ayupov
883bf0e83d [BOLT][NFC] Fix braces usage in the rest of the codebase
Summary:
Refactor remaining bolt sources to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).

(cherry picked from FBD33345885)
2021-12-28 18:43:53 -08:00
Maksim Panchenko
2f09f445b2 [BOLT][NFC] Fix file-description comments
Summary: Fix comments at the start of source files.

(cherry picked from FBD33274597)
2021-12-21 10:21:41 -08:00
Vladislav Khmelevsky
dcdd37fdc2 [PR] Instrumentation: Sync file on dump
Summary:
Sync the file with storage device on data dump to stabilize
instrumentation testing

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

(cherry picked from FBD31738021)
2021-10-15 20:46:09 +03:00
Vasily Leonenko
9aa134dc2d [PR] Instrumentation: use TryLock for SimpleHashTable getter
Summary:
This commit introduces TryLock usage for SimpleHashTable getter to
avoid deadlock and relax syscalls usage which causes significant
overhead in runtime.
The old behavior left under -conservative-instrumentation option passed
to instrumentation library.
Also, this commit includes a corresponding test case: instrumentation of
executable which performs indirect calls from common code and signal
handler.

Note: in case if TryLock was failed to acquire the lock - this indirect
call will not be accounted in the resulting profile.

Vasily Leonenko,
Advanced Software Technology Lab, Huawei

(cherry picked from FBD30821949)
2021-08-08 04:50:06 +08:00
Elvina Yakubova
2ffd6e2b43 [PR] Instrumentation: Add support for opening libs based on links /proc/self/map_files
Summary:
This commit adds support for opening libs based on links
/proc/self/map_files.  For this we're getting current virtual address
and searching the lib in the directory with such address range. After
that, we're getting full path to the binary by using readlink
function. Direct read from link in /proc/self/map_files entries is not
possible because of lack of permissions.

Elvina Yakubova,
Advanced Software Technology Lab, Huawei

(cherry picked from FBD30092422)
2021-01-19 02:08:55 +08:00
Elvina Yakubova
6665c628ea [PR] Instrumentation: Add readlink and getdents support
Summary:
This commit adds support for getting directory entries and
reading value of a symbolic link in instrumentation runtime library

Elvina Yakubova,
Advanced Software Technology Lab, Huawei

(cherry picked from FBD30092362)
2021-01-18 22:08:10 +08:00
Vladislav Khmelevsky
2cf9008a60 [PR] Instrumentation: Disable signals on mutex lock
Summary:
When indirect call is instrmented it locks SimpleHashTable's mutex on get() call.
If while locked we we receive a signal and signal handler also will call
indirect function we will end up with deadlock.

PR facebookincubator/BOLT#167

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

(cherry picked from FBD28909921)
2021-06-04 19:51:06 +03:00
Amir Ayupov
2da5b12a3d [BOLT] Hugify: check for THP support via sysfs
Summary:
Remove dependence on kernel version check, query sysfs directly
instead.

(cherry picked from FBD28858208)
2021-06-02 19:11:52 -07:00
Vladislav Khmelevsky
76d346ca14 [BOLT][PR] Instrumentation: Introduce -no-counters-clear and -wait-forks options
Summary:
This PR introduces 2 new instrumentation options:
1. instrumentation-no-counters-clear: Discussed at https://github.com/facebookincubator/BOLT/issues/121
2. instrumentation-wait-forks: Since the instrumentation counters are mapped as MAP_SHARED it will be nice to add ability to wait until all forks of the parent process will die using tracking of process group.
The last patch is just emitBinary code refactor.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Pull Request resolved: https://github.com/facebookincubator/BOLT/pull/125
GitHub Author: Vladislav Khmelevskyi <Vladislav.Khmelevskyi@huawei.com>

(cherry picked from FBD26919011)
2021-03-09 16:18:11 -08:00
Alexander Shaposhnikov
a0dd5b05dc [BOLT] Add support for dumping profile on MacOS
Summary: Add support for dumping profile on MacOS.

(cherry picked from FBD25751363)
2021-01-28 12:44:14 -08:00
Alexander Shaposhnikov
3b876cc3e7 [BOLT] Add support for dumping counters on MacOS
Summary: Add support for dumping counters on MacOS

(cherry picked from FBD25750516)
2021-01-28 12:32:03 -08:00
Alexander Shaposhnikov
d6e60c5bec [BOLT] Enable intToStr for MacOS
Summary: Enable intToStr et al. in the runtime library for MacOS.

(cherry picked from FBD25745358)
2021-01-20 16:40:17 -08:00
Alexander Shaposhnikov
1b258b8908 Refactor syscall wrappers for OSX
Summary: Refactor syscall wrappers for OSX.

(cherry picked from FBD25084642)
2020-11-19 14:56:45 -08:00
Alexander Shaposhnikov
1cf23e5ee8 Link the instrumentation runtime on OSX
Summary: Link the instrumentation runtime on OSX.

(cherry picked from FBD24390019)
2020-11-17 13:57:29 -08:00
Alexander Shaposhnikov
bbd9d610fe Add first bits to cross-compile the runtime for OSX
Summary: Add first bits to cross-compile the runtime for OSX.

(cherry picked from FBD24330977)
2020-10-15 03:51:56 -07:00
Rafael Auler
c6799a689d [BOLT] Fix stack alignment for runtime lib
Summary:
Right now, the SAVE_ALL sequence executed upon entry of both
of our runtime libs (hugify and instrumentation) will cause the stack to
not be aligned at a 16B boundary because it saves 15 8-byte regs. Change
the code sequence to adjust for that. The compiler may generate code
that assumes the stack is aligned by using movaps instructions, which
will crash.

(cherry picked from FBD22744307)
2020-07-27 16:52:51 -07:00
Xun Li
9bd7161529 Adding automatic huge page support
Summary:
This patch enables automated hugify for Bolt.
When running Bolt against a binary with -hugify specified, Bolt will inject a call to a runtime library function at the entry of the binary. The runtime library calls madvise to map the hot code region into a 2M huge page. We support both new kernel with THP support and old kernels. For kernels with THP support we simply make a madvise call, while for old kernels, we first copy the code out, remap the memory with huge page, and then copy the code back.
With this change, we no longer need to manually call into hugify_self and precompile it with --hot-text. Instead, we could simply combine --hugify option with existing optimizations, and at runtime it will automatically move hot code into 2M pages.

Some details around the changes made:
1. Add an command line option to support --hugify. --hugify will automatically turn on --hot-text to get the proper hot code symbols. However, running with both --hugify and --hot-text is not allowed, since --hot-text is used on binaries that has precompiled call to hugify_self, which contradicts with the purpose of --hugify.
2. Moved the common utility functions out of instr.cpp to common.h, which will also be used by hugify.cpp. Added a few new system calls definitions.
3. Added a new class that inherits RuntimeLibrary, and implemented the necessary emit and link logic for hugify.
4. Added a simple test for hugify.

(cherry picked from FBD21384529)
2020-05-02 11:14:38 -07:00