Distributions are making the choice to turn frame pointers on by
default. Nixpkgs recently turned them on, and the method they use to do
so implies that everything is built with them on by default.
https://github.com/NixOS/nixpkgs/pull/399014
Assuming that a well behaved distribution doing this puts
`-fno-omit-frame-pointer` at the beginning of the compiler invocation,
we can still re-enable omission by supplying `-fomit-frame-pointer`
during compilation.
This fixes some segfaults from stack corruption in binaries rewritten by
bolt with `llvm-bolt -instrument`.
See also: #147569Fixes: #148595
This patch adds code generation for RISCV64 instrumentation.The work
involved includes the following three points:
a) Implements support for instrumenting direct function call and jump
on RISC-V which relies on , Atomic instructions
(used to increment counters) are only available on RISC-V when the A
extension is used.
b) Implements support for instrumenting direct function inderect call
by implementing the createInstrumentedIndCallHandlerEntryBB and
createInstrumentedIndCallHandlerExitBB interfaces. In this process, we
need to accurately record the target address and IndCallID to ensure
the correct recording of the indirect call counters.
c)Implemented the RISCV64 Bolt runtime library, implemented some system
call interfaces through embedded assembly. Get the difference between
runtime addrress of .text section andstatic address in section header
table, which in turn can be used to search for indirect call
description.
However, the community code currently has problems with relocation in
some scenarios, but this has nothing to do with instrumentation. We
may continue to submit patches to fix the related bugs.
This fixes a number of issues introduced in #97130 when
LLVM_LIBDIR_SUFFIX is a non-empty string. Make sure that the libdir is
always referenced as `lib${LLVM_LIBDIR_SUFFIX}`, not as just `lib` or
`${CMAKE_INSTALL_LIBDIR}${LLVM_LIBDIR_SUFFIX}`.
This is the standard libdir convention for all LLVM subprojects. Using
`${CMAKE_INSTALL_LIBDIR}${LLVM_LIBDIR_SUFFIX}` would result in a
duplicate suffix.
Continue from #87196 as author did not have much time, I have taken over
working on this PR. We would like to have this so it'll be easier to
package for Nix.
Can be tested by copying cmake, bolt, third-party, and llvm directories
out into their own directory with this PR applied and then build bolt.
---------
Co-authored-by: pca006132 <john.lck40@gmail.com>
I'm using clang-10 to build bolt which doesn't have moutline-atomics
option and though it doesn't do it. So test compiler for supporting it
before appending to the list of cxxflags.
Differential Revision: https://reviews.llvm.org/D159521
Fix tests that are failing in cross-compilation after D151920
(https://lab.llvm.org/buildbot/#/builders/221/builds/17715):
- instrumentation-ind-call, basic-instrumentation: add -mno-outline-atomics flag to runtime lib
- bolt-address-translation-internal-call, internal-call-instrument: add %cflags
- meta-merge-fdata: restrict to x86_64
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D159094
This commit adds support for AArch64 in instrumentation runtime library,
including AArch64 system calls.
Also this commit divides syscalls into target-specific files.
Reviewed By: rafauler, yota9
Differential Revision: https://reviews.llvm.org/D151942
This reverts commit d763c6e5e2d0a6b34097aa7dabca31e9aff9b0b6.
Adds the patch by @hans from
https://github.com/llvm/llvm-project/issues/62719
This patch fixes the Windows build.
d763c6e5e2d0a6b34097aa7dabca31e9aff9b0b6 reverted the reviews
D144509 [CMake] Bumps minimum version to 3.20.0.
This partly undoes D137724.
This change has been discussed on discourse
https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-cmake-version/66193
Note this does not remove work-arounds for older CMake versions, that
will be done in followup patches.
D150532 [OpenMP] Compile assembly files as ASM, not C
Since CMake 3.20, CMake explicitly passes "-x c" (or equivalent)
when compiling a file which has been set as having the language
C. This behaviour change only takes place if "cmake_minimum_required"
is set to 3.20 or newer, or if the policy CMP0119 is set to new.
Attempting to compile assembly files with "-x c" fails, however
this is workarounded in many cases, as OpenMP overrides this with
"-x assembler-with-cpp", however this is only added for non-Windows
targets.
Thus, after increasing cmake_minimum_required to 3.20, this breaks
compiling the GNU assembly for Windows targets; the GNU assembly is
used for ARM and AArch64 Windows targets when building with Clang.
This patch unbreaks that.
D150688 [cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump
The build uses other mechanism to select the runtime.
Fixes#62719
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D151344
This reverts commit 65429b9af6a2c99d340ab2dcddd41dab201f399c.
Broke several projects, see https://reviews.llvm.org/D144509#4347562 onwards.
Also reverts follow-up commit "[OpenMP] Compile assembly files as ASM, not C"
This reverts commit 4072c8aee4c89c4457f4f30d01dc9bb4dfa52559.
Also reverts fix attempt "[cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump"
This reverts commit 7d47dac5f828efd1d378ba44a97559114f00fb64.
Some build bots have not been updated to the new minimal CMake version.
Reverting for now and ping the buildbot owners.
This reverts commit 44c6b905f8527635e49bb3ea97dea315f92d38ec.
This partly undoes D137724.
This change has been discussed on discourse
https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-cmake-version/66193
Note this does not remove work-arounds for older CMake versions, that
will be done in followup patches.
Reviewed By: mehdi_amini, MaskRay, ChuanqiXu, to268, thieta, tschuett, phosek, #libunwind, #libc_vendors, #libc, #libc_abi, sivachandra, philnik, zibi
Differential Revision: https://reviews.llvm.org/D144509
This patch adds the huge pages support (-hugify) for PIE/no-PIE
binaries. Also returned functionality to support the kernels < 5.10
where there is a problem in a dynamic loader with the alignment of
pages addresses.
Differential Revision: https://reviews.llvm.org/D129107
Some distribution install libraries under lib64. LLVM supports this
through LLVM_LIBDIR_SUFFIX, have bolt do the same.
Differential Revision: https://reviews.llvm.org/D137039
Working back towards D130586.
Bolt didn't use `LLVM_LIBDIR_SUFFIX` before, and has no in-tree reverse
dependencies, it seems easier to add.
The change in LLVM itself is to prevent some unexpected `lib64` from
cropping up due to the `CMAKE_INSTALL_LIBDIR` defaulting logic.
Differential Revision: https://reviews.llvm.org/D132297
We held off on this before as `LLVM_LIBDIR_SUFFIX` conflicted with it.
Now we return this.
`LLVM_LIBDIR_SUFFIX` is kept as a deprecated way to set
`CMAKE_INSTALL_LIBDIR`. The other `*_LIBDIR_SUFFIX` are just removed
entirely.
I imagine this is too potentially-breaking to make LLVM 15. That's fine.
I have a more minimal version of this in the disto (NixOS) patches for
LLVM 15 (like previous versions). This more expansive version I will
test harder after the release is cut.
Reviewed By: sebastian-ne, ldionne, #libc, #libc_abi
Differential Revision: https://reviews.llvm.org/D130586
Also make the soft toolchain requirements hard. This allows
us to use C++17 features in LLVM now.
If we find patterns with C++17 that improve readability
it should be recommended in the coding standards.
Reviewed By: jhenderson, cor3ntin, MaskRay
Differential Revision: https://reviews.llvm.org/D130689
If BOLT instrumentation runtime uses XMM registers, it can interfere
with the user program causing crashes and unexpected behavior. This
happens as the instrumentation code preserves general purpose registers
only.
Build BOLT instrumentation runtime with "-mno-sse".
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128960
Summary:
In some of the system stack protection is enabled by default, which will
lead in extra symbols dependencies, which we want to avoid.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD32478426)
Summary:
This commit implements new method for _start & _fini functions hooking
which allows to use relative jumps for future PIE & .so library support.
Instead of using absolute address of _start & _fini functions known on
linking stage - we'll use dynamically created trampoline functions and
use corresponding symbols in instrumentation runtime library.
As we would like to use instrumentation for dynamically loaded binaries
(with PIE & .so), thus we need to compile instrumentation library with
"-fPIC" flag to support relative address resolution for functions and
data.
For shared libraries we need to handle initialization of instrumentation
library case by using DT_INIT section entry point.
Also this commit adds detection if the binary is executable or shared
library based on existence of PT_INTERP header. In case of shared
library we save information about real library init function address
for further usage for instrumentation library init trampoline function
creation and also update DT_INIT to point instrumentation library init
function.
Functions called from init/fini functions should be called with forced
stack alignment to avoid issues with instructions which relies on it.
E.g. optimized string operations.
Vasily Leonenko,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30092316)
Summary:
This patch enables automated hugify for Bolt.
When running Bolt against a binary with -hugify specified, Bolt will inject a call to a runtime library function at the entry of the binary. The runtime library calls madvise to map the hot code region into a 2M huge page. We support both new kernel with THP support and old kernels. For kernels with THP support we simply make a madvise call, while for old kernels, we first copy the code out, remap the memory with huge page, and then copy the code back.
With this change, we no longer need to manually call into hugify_self and precompile it with --hot-text. Instead, we could simply combine --hugify option with existing optimizations, and at runtime it will automatically move hot code into 2M pages.
Some details around the changes made:
1. Add an command line option to support --hugify. --hugify will automatically turn on --hot-text to get the proper hot code symbols. However, running with both --hugify and --hot-text is not allowed, since --hot-text is used on binaries that has precompiled call to hugify_self, which contradicts with the purpose of --hugify.
2. Moved the common utility functions out of instr.cpp to common.h, which will also be used by hugify.cpp. Added a few new system calls definitions.
3. Added a new class that inherits RuntimeLibrary, and implemented the necessary emit and link logic for hugify.
4. Added a simple test for hugify.
(cherry picked from FBD21384529)
Summary:
Add full instrumentation support (branches, direct and
indirect calls). Add output statistics to show how many hot bytes
were split from cold ones in functions. Add -cold-threshold option
to allow splitting warm code (non-zero count). Add option in
bolt-diff to report missing functions in profile 2.
In instrumentation, fini hooks are fixed to run proper finalization
code after program finishes. Hooks for startup are added to setup
the runtime structures that needs initilization, such as indirect call
hash tables.
Add support for automatically dumping profile data every N seconds by
forking a watcher process during runtime.
(cherry picked from FBD17644396)
Summary:
Change our CMake config for the standalone runtime instrumentation
library to check for the elf.h header before using it, so the build
doesn't break on systems lacking it. Also fix a SmallPtrSet usage where
its elements are not really pointers, but uint64_t, breaking the build
in Apple's Clang.
(cherry picked from FBD17505759)
Summary:
Change our edge profiling technique when using instrumentation
to do not instrument every edge. Instead, build the spanning tree
for the CFG and omit instrumentation for edges in the spanning tree.
Infer the edge count for these edges when writing the profile during
run time. The inference works with a bottom-up traversal of the spanning
tree and establishes the value of the edge connecting to the parent based
on a simple flow equation involving output and input edges, where the
only unknown variable is the parent edge.
This requires some engineering in the runtime lib to support dynamic
allocation for building these graphs at runtime.
(cherry picked from FBD17062773)
Summary:
To allow the development of future instrumentation work, this
patch adds support in BOLT for linking arbitrary libraries into the
binary processed by BOLT. We use orc relocation handling mechanism for
that. With this support, this patch also moves code programatically
generated in X86 assembly language by X86MCPlusBuilder to C code written
in a new library called bolt_rt. Change CMake to support this library as
an external project in the same way as clang does with compiler_rt. This
library is installed in the lib/ folder relative to BOLT root
installation and by default instrumentation will look for the library
at that location to finish processing the binary with instrumentation.
(cherry picked from FBD16572013)