438 Commits

Author SHA1 Message Date
Snehasish Kumar
08e40c12fa
Reapply "[MemProf] Change histogram storage from uint64_t to uint16_t… (#151431)
Reapply #147854 after fixes merged in #151398.

Change memory access histogram storage from uint64_t to uint16_t to
reduce profile size on disk. This change updates the raw profile format
to v5. Also add a histogram test in compiler-rt since we didn't have one
before. With this change the histogram memprof raw for the basic test
reduces from 75KB -> 20KB.
2025-07-30 18:28:53 -07:00
Snehasish Kumar
24f9482abd
Revert "[MemProf] Change histogram storage from uint64_t to uint16_t" (#151382)
Reverts llvm/llvm-project#147854

Test failure when building with gcc.
https://lab.llvm.org/buildbot/#/builders/174/builds/21989
2025-07-30 12:36:45 -07:00
Snehasish Kumar
1bf89e90a8
[MemProf] Change histogram storage from uint64_t to uint16_t (#147854)
Change memory access histogram storage from uint64_t to uint16_t to
reduce profile size on disk. This change updates the raw profile format
to v5. Also add a histogram test in compiler-rt since we didn't have one
before. With this change the histogram memprof raw for the basic test
reduces from 75KB -> 20KB.
2025-07-30 11:52:31 -07:00
Andrew Rogers
c30952592a
[compiler-rt] replicate changes from llvm/ProfileData/InstrProfData.inc (#143574)
## Purpose
The compiler-rt project `check-same-common-code.test` test case started
failing after #142861 was merged. This change addresses the failure.

## Overview
This patch replicates the changes made by #142861 to
llvm/include/llvm/ProfileData/InstrProfData.inc in the duplicated file
compiler-rt/include/profile/InstrProfData.inc. These files otherwise
match.

## Validation
Locally built `check-profile` target and verified
`check-same-common-code.test` now passes.
2025-06-10 10:45:21 -07:00
Jan Patrick Lehr
bf6cd24aaa
Revert "[compiler-rt][XRay] Make xray_interface.h C compliant" (#141570)
Reverts llvm/llvm-project#140068

Failures on PPC buildbots.
2025-05-27 11:15:47 +02:00
Jan André Reuter
80da58da34
[compiler-rt][XRay] Make xray_interface.h C compliant (#140068)
The XRay interface header uses no C++ specific features aside from using
the `std` namespace and including the C++ variant of C headers. Yet,
these changes prevent using `xray_interface.h` in external tools relying
on C for different reasons. Make this header C compliant by using C
headers, removing the `std` namespace from `std::size_t` and guard
`extern "C"`.

To make sure that further changes to not break the interface
accidentally, port one test from C++ to C. This requires the C23
standard to officially support the attribute syntax used in this test
case.

Note that this only resolves this issue for `xray_interface.h`.
`xray_records.h` is also not C compliant, but requires more work to
port.

Fixes #139902

Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
2025-05-27 09:59:14 +02:00
kevkevin
8cd628f472
doc: get rid of redundant TODO tag in FuzzedDataProvider.h (#137395)
'list.size()' is determined at runtime, so using static_assert on it as
suggested by the TODO comment is not feasible and produces the following
error when done:

error: static assertion expression is not an integral constant
expression

initially referenced in https://github.com/bitcoin/bitcoin/pull/32024

Co-authored-by: Chand-ra <chandrapratap376@gmail.com>
2025-04-25 14:51:10 -07:00
Qinkun Bao
45b9e24b1e
Fix some small typos in compiler-rt. NFC (#133388) 2025-03-28 15:51:13 -04:00
Ellis Hoag
2044dd07da
[InstrProf] Remove -forder-file-instrumentation (#130192) 2025-03-13 08:28:16 -07:00
Lang Hames
2c8b2dc3f4 [ORC-RT] Rename 'orc_rt_*CWrapper*' types and functions to 'orc_rt_*Wrapper*'.
The orc_rt_ prefix implies C API anyway (the C++ API should use the orc_rc::
namespace), so the 'C' is redundant here.
2025-03-07 14:31:42 +11:00
Lang Hames
a22881c9db [ORC-RT] Fix type name in comment. NFC. 2025-03-06 16:13:10 +11:00
l0rinc
aad74dc971
[compiler-rt] FuzzedDataProvider: modernize outdated trait patterns (#127811) 2025-02-21 10:17:26 +01:00
Ellis Hoag
2e33ed9ecc
[memprof] Use -memprof-runtime-default-options to set options during compile time (#118874)
Add the `__memprof_default_options_str` variable, initialized via the
`-memprof-runtime-default-options` LLVM flag, to hold the default options string
for memprof. This allows us to set these options during compile time in
the clang invocation.

Also update the docs to describe the various ways to set these options.
2024-12-06 09:22:16 -08:00
ronryvchin
ff281f7d37
[PGO] Add option to always instrumenting loop entries (#116789)
This patch extends the PGO infrastructure with an option to prefer the
instrumentation of loop entry blocks.
This option is a generalization of
19fb5b467b,
and helps to cover cases where the loop exit is never executed.
An example where this can occur are event handling loops.

Note that change does NOT change the default behavior.
2024-12-04 07:56:46 +01:00
Vitaly Buka
17d956588a
Reapply "[tsan] Don't use enum __tsan_memory_order in tsan interface"" (#115034)
In C++ it's UB to use undeclared values as enum.
And there is support __ATOMIC_HLE_ACQUIRE and
__ATOMIC_HLE_RELEASE need such values.

So use `int` in TSAN interface, and mask out
irrelevant bits and cast to enum ASAP.

`ThreadSanitizer.cpp` already declare morder parameterd
in these functions as `i32`.

This may looks like a slight change, as we
previously didn't mask out additional bits for `fmo`,
and `NoTsanAtomic` call. But from implementation
it's clear that they are expecting exact enum.


Reverts llvm/llvm-project#115032
Reapply llvm/llvm-project#114724
2024-11-05 11:23:52 -08:00
Vitaly Buka
b14c436311
Revert "[tsan] Don't use enum __tsan_memory_order in tsan interface" (#115032)
Reverts llvm/llvm-project#114724

Breaks OSX builds
2024-11-05 09:43:29 -08:00
Vitaly Buka
1e50958399
[tsan] Don't use enum __tsan_memory_order in tsan interface (#114724)
In C++ it's UB to use undeclared values as enum.
And there is support `__ATOMIC_HLE_ACQUIRE` and
`__ATOMIC_HLE_RELEASE` need such values.

Internal implementation was switched to `class enum`,
where that behavior is defined. But interface is C, so
we just switch to `int`.
2024-11-05 09:21:19 -08:00
Prabhuk
9f69da35e2
[NFC][compiler-rt] Add missing header include (#113951)
Include `cstdlib` which was originally included transitively but the
changes to `vector` in libcpp breaks new builds due to missing cstdlib
header for `abort()` function call.
2024-10-28 15:24:47 -07:00
Vassil Vassilev
f6b513a785 Revert "Add explicit symbol visibility macros to InstrProfData.inc (#110732)"
This reverts commit d7ca703eab7997814de425eaa4fd888563d78831 in llvm/llvm-project#110732
2024-10-28 08:57:57 +00:00
Thomas Fransham
d7ca703eab
Add explicit symbol visibility macros to InstrProfData.inc (#110732)
Add explicit symbol visibility macros to InstrProfData.inc

Annotating these symbols will fix missing symbols for InstrProfTest when
using shared library builds on windows with explicit visibility macros
enabled.
Add a empty fallback definition for LLVM_ABI macro so the code works in
compiler-rt.

This is part of the work to enable LLVM_BUILD_LLVM_DYLIB and plugins on
window.

```
llvm\lld-link : error : undefined symbol: public: void ValueProfData::deserializeTo(InstrProfRecord&,  InstrProfSymtab*)
>>> referenced by unittests\ProfileData\InstrProfTest.cpp:1372 void ValueProfileReadWriteTest_value_prof_data_read_write_Test::TestBody()
```
2024-10-28 10:47:40 +02:00
Sebastian Kreutzer
e738a5d8e3
Reapply " [XRay] Add support for instrumentation of DSOs on x86_64 (#90959)" (#113548)
This fixes remaining issues in my previous PR #90959.

Changes:
- Removed dependency on LLVM header in `xray_interface.cpp`
- Fixed XRay patching for some targets due to missing changes in
architecture-specific patching functions
- Addressed some remaining compiler warnings that I missed in the
previous patch
- Formatting

I have tested these changes on `x86_64` (natively), as well as
`ppc64le`, `aarch64` and `arm32` (cross-compiled and emulated using
qemu).

**Original description:**

This PR introduces shared library (DSO) support for XRay based on a
revised version of the implementation outlined in [this
RFC](https://discourse.llvm.org/t/rfc-upstreaming-dso-instrumentation-support-for-xray/73000).
The feature enables the patching and handling of events from DSOs,
supporting both libraries linked at startup or explicitly loaded, e.g.
via `dlopen`.
This patch adds the following:
- The `-fxray-shared` flag to enable the feature (turned off by default)
- A small runtime library that is linked into every instrumented DSO,
providing position-independent trampolines and code to register with the
main XRay runtime
- Changes to the XRay runtime to support management and patching of
multiple objects

These changes are fully backward compatible, i.e. running without
instrumented DSOs will produce identical traces (in terms of recorded
function IDs) to the previous implementation.

Due to my limited ability to test on other architectures, this feature
is only implemented and tested with x86_64. Extending support to other
architectures is fairly straightforward, requiring only a
position-independent implementation of the architecture-specific
trampoline implementation (see
`compiler-rt/lib/xray/xray_trampoline_x86_64.S` for reference).

This patch does not include any functionality to resolve function IDs
from DSOs for the provided logging/tracing modes. These modes still work
and will record calls from DSOs, but symbol resolution for these
functions in not available. Getting this to work properly requires
recording information about the loaded DSOs and should IMO be discussed
in a separate RFC, as there are mulitple feasible approaches.

---------

Co-authored-by: Sebastian Kreutzer <sebastian.kreutzer@tu-darmstadt.de>
2024-10-25 10:15:25 +02:00
Qiongsi Wu
f9d0789064
[PGO] Initialize GCOV Writeout and Reset Functions in the Runtime on AIX (#108570)
This PR registers the writeout and reset functions for `gcov` for all
modules in the PGO runtime, instead of registering them
using global constructors in each module. The change is made for AIX
only, but the same mechanism works on Linux on Power.

When registering such functions using global constructors in each module
without `-ffunction-sections`, the AIX linker cannot garbage collect
unused undefined symbols, because such symbols are grouped in the same
section as the `__sinit` symbol. Keeping such undefined symbols causes
link errors (see test case
https://github.com/llvm/llvm-project/pull/108570/files#diff-500a7e1ba871e1b6b61b523700d5e30987900002add306e1b5e4972cf6d5a4f1R1
for this scenario). This PR implements the initialization in the
runtime, hence avoiding introducing `__sinit` into each module.

The implementation adds a new global variable `__llvm_covinit_functions`
to each module. This new global variable contains the function pointers
to the `Writeout` and `Reset` functions. `__llvm_covinit_functions`'s
section is the named section `__llvm_covinit`. The linker will aggregate
all the `__llvm_covinit` sections from each module
to form one single named section in the final binary. The pair of
functions
```
const __llvm_gcov_init_func_struct *__llvm_profile_begin_covinit();
const __llvm_gcov_init_func_struct *__llvm_profile_end_covinit();
```
are implemented to return the start and end address of this named
section in the final binary, and they are used in function
```
__llvm_profile_gcov_initialize()
```
(which is a constructor function in the runtime) so the runtime knows
the addresses of all the `Writeout` and `Reset` functions from all the
modules.

One noticeable implementation detail relevant to AIX is that to preserve
the `__llvm_covinit` from the linker's garbage collection, a `.ref`
pseudo instruction is inserted into them, referring to the section that
contains the `__llvm_gcov_ctr` variables, which are used in the
instrumented code. The `__llvm_gcov_ctr` variables did not belong to
named sections before, but this PR added them to the
`__llvm_gcov_ctr_section` named section, so we can add a `.ref` pseudo
instruction that refers to them in the `__llvm_covinit` section.
2024-10-17 09:32:10 -04:00
Tacet
c76045d9bf
[compiler-rt][ASan] Add function copying annotations (#91702)
This PR adds a `__sanitizer_copy_contiguous_container_annotations`
function, which copies annotations from one memory area to another. New
area is annotated in the same way as the old region at the beginning
(within limitations of ASan).

Overlapping case: The function supports overlapping containers, however
no assumptions should be made outside of no false positives in new
buffer area. (It doesn't modify old container annotations where it's not
necessary, false negatives may happen in edge granules of the new
container area.) I don't expect this function to be used with
overlapping buffers, but it's designed to work with them and not result
in incorrect ASan errors (false positives).

If buffers have granularity-aligned distance between them (`old_beg %
granularity == new_beg % granularity`), copying algorithm works faster.
If the distance is not granularity-aligned, annotations are copied byte
after byte.

```cpp
void __sanitizer_copy_contiguous_container_annotations(
    const void *old_storage_beg_p, const void *old_storage_end_p,
    const void *new_storage_beg_p, const void *new_storage_end_p) {
```

This function aims to help with short string annotations and similar
container annotations. Right now we change trait types of
`std::basic_string` when compiling with ASan and this function purpose
is reverting that change as soon as possible.


87f3407856/libcxx/include/string (L738-L751)

The goal is to not change `__trivially_relocatable` when compiling with
ASan. If this function is accepted and upstreamed, the next step is
creating a function like `__memcpy_with_asan` moving memory with ASan.
And then using this function instead of `__builtin__memcpy` while moving
trivially relocatable objects.


11a6799740/libcxx/include/__memory/uninitialized_algorithms.h (L644-L646)

---

I'm thinking if there is a good way to address fact that in a container
the new buffer is usually bigger than the previous one. We may add two
more arguments to the functions to address it (the beginning and the end
of the whole buffer.

Another potential change is removing `new_storage_end_p` as it's
redundant, because we require the same size.

Potential future work is creating a function `__asan_unsafe_memmove`,
which will be basically memmove, but with turned off instrumentation
(therefore it will allow copy data from poisoned area).

---------

Co-authored-by: Vitaly Buka <vitalybuka@google.com>
2024-10-15 13:26:39 +02:00
Mikhail Goncharov
90627a5a19 Revert "[XRay] Add support for instrumentation of DSOs on x86_64 (#90959)"
This reverts commit a4402039bffd788b9af82435fd5a2fb311fdc6e8 and 4451f9f812d458f6b53785b27869674caf01e67b
2024-10-11 14:01:58 +02:00
Sebastian Kreutzer
4451f9f812
[XRay] Fix LLVM include in xray_interface.cpp (#111978)
Removes a dependency on LLVM in `xray_interface.cpp` by replacing
`llvm_unreachable` with compiler-rt's `UNREACHABLE`.
Applies clang-format to some unformatted changes. 

Original PR: #90959
2024-10-11 13:11:03 +02:00
Sebastian Kreutzer
a4402039bf
[XRay] Add support for instrumentation of DSOs on x86_64 (#90959)
This PR introduces shared library (DSO) support for XRay based on a
revised version of the implementation outlined in [this
RFC](https://discourse.llvm.org/t/rfc-upstreaming-dso-instrumentation-support-for-xray/73000).
The feature enables the patching and handling of events from DSOs,
supporting both libraries linked at startup or explicitly loaded, e.g.
via `dlopen`.
This patch adds the following:
- The `-fxray-shared` flag to enable the feature (turned off by default)
- A small runtime library that is linked into every instrumented DSO,
providing position-independent trampolines and code to register with the
main XRay runtime
- Changes to the XRay runtime to support management and patching of
multiple objects

These changes are fully backward compatible, i.e. running without
instrumented DSOs will produce identical traces (in terms of recorded
function IDs) to the previous implementation.

Due to my limited ability to test on other architectures, this feature
is only implemented and tested with x86_64. Extending support to other
architectures is fairly straightforward, requiring only a
position-independent implementation of the architecture-specific
trampoline implementation (see
`compiler-rt/lib/xray/xray_trampoline_x86_64.S` for reference).

This patch does not include any functionality to resolve function IDs
from DSOs for the provided logging/tracing modes. These modes still work
and will record calls from DSOs, but symbol resolution for these
functions in not available. Getting this to work properly requires
recording information about the loaded DSOs and should IMO be discussed
in a separate RFC, as there are mulitple feasible approaches.

@petrhosek @jplehr
2024-10-11 11:23:34 +02:00
gbMattN
75103aae4a
Added include of common interfaces (#111374)
Pull request for issue #110823 
Including the file which defines the macros we use here. This would let
user code only include this interface, rather than having to include two
files.
2024-10-07 14:28:19 -07:00
Chris Apple
5d146c689e
[compiler-rt][rtsan] Introduce rtsan_interface.h and ScopedDisabler (#106736) 2024-09-06 11:47:10 -07:00
NAKAMURA Takumi
d2f77eb8ec
[MC/DC][Coverage] Introduce "Bitmap Bias" for continuous mode (#96126)
`counter_bias` is incompatible to Bitmap. The distance between Counters
and Bitmap is different between on-memory sections and profraw image.

Reference to `__llvm_profile_bitmap_bias` is generated only if
`-fcoverge-mcdc` `-runtime-counter-relocation` are specified. The
current implementation rejected their options.

```
Runtime counter relocation is presently not supported for MC/DC bitmaps
```
2024-07-31 10:14:12 +09:00
xur-llvm
25897ba420
[PGO] Sync InstrProfData.inc from llvm to compiler-rt (#99930)
Sync InstrProfData.inc from llvm to compiler-rt. The difference was
introduced from https://github.com/llvm/llvm-project/pull/69535.
2024-07-22 14:05:34 -07:00
Mitch Phillips
8681202dd6
[ASan] [HWASan] Add __sanitizer_ignore_free_hook() (#96749)
This change adds a new weak API function which makes the sanitizer
ignore the call to free(), and implements the
functionality in ASan and HWAsan. The runtime that implements this hook
can then call free() at a later point again on the same pointer (and
making sure the hook returns zero so that the memory will actually be
freed) when it's actually ready for the memory to be cleaned up.

This is needed in order to implement an sanitizer-compatible version
of Chrome's BackupRefPtr algorithm, since process-wide double-shimming
of malloc/free does not work on some platforms.

Requested and designed by @c01db33f (Mark) from Project Zero.

---------

Co-authored-by: Mark Brand <markbrand@google.com>
2024-07-12 13:41:01 +02:00
Lang Hames
73bfb65c57 [ORC-RT] Remove unused typedef from the C API. 2024-07-08 12:56:38 +10:00
Aiden Grossman
839ed1ba55 [MemProf][compiler-rt] Update MemProfData preprocessor directives
This Memprof preprocessor directives have diverged at some point. This
patch fixes that by copying an additional include from the LLVM side to
the compiler-rt side so the two files are identical again.
2024-07-04 23:45:33 +00:00
Matthew Weingarten
30b93db547
[Memprof] Adds the option to collect AccessCountHistograms for memprof. (#94264)
Adds compile time flag -mllvm -memprof-histogram and runtime flag
histogram=true|false to turn Histogram collection on and off. The
-memprof-histogram flag relies on -memprof-use-callbacks=true to work.

Updates shadow mapping logic in histogram mode from having one 8 byte
counter for 64 bytes, to 1 byte for 8 bytes, capped at 255. Only
supports this granularity as of now.

Updates the RawMemprofReader and serializing MemoryInfoBlocks to binary
format, including changing to a new version of the raw binary format
from version 3 to version 4.

Updates creating MemoryInfoBlocks with and without Histograms. When two
MemoryInfoBlocks are merged, AccessCounts are summed up and the shorter
Histogram is removed.

Adds a memprof_histogram test case.

Initial commit for adding AccessCountHistograms up until RawProfile for
memprof
2024-06-26 08:37:22 -07:00
Alexander Shaposhnikov
cae6d458a0
[CompilerRT] Add support for numerical sanitizer (#94322)
This diff contains the compiler-rt changes / preparations for nsan.

Test plan:

1. cd build/runtimes/runtimes-bins && ninja check-nsan
2. ninja check-all
2024-06-19 15:20:36 -07:00
Michael Kruse
a35ac42fac
[compiler-rt] Revise IDE folder structure (#89753)
Update the folder titles for targets in the monorepository that have not
seen taken care of for some time. These are the folders that targets are
organized in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake's IDE generator.

 * Ensure that every target is in a folder
 * Use a folder hierarchy with each LLVM subproject as a top-level folder
 * Use consistent folder names between subprojects
 * When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property`, but are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
2024-06-04 09:26:45 +02:00
Nazım Can Altınova
fe97a6148e
[tsan] Add callbacks for futex syscalls and mark them as blocking on tsan (#86537)
Fixes #83844.

This PR adds callbacks to mark futex syscalls as blocking. Unfortunately
we didn't have a mechanism before to mark syscalls as a blocking call,
so I had to implement it, but it mostly reuses the `BlockingCall`
implementation
[here](96819daa3d/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp (L362-L380)).

The issue includes some information but this issue was discovered
because Rust uses futexes directly. So most likely we need to update
Rust as well to use these callbacks.

Also see the latest comments in #85188 for some context.
I also sent another PR #84162 to mark `pthread_*_lock` calls as
blocking.
2024-03-26 12:33:51 +01:00
Mingming Liu
16e74fd489
Reland "[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling." (#82711)
New change on top of [reviewed
patch](https://github.com/llvm/llvm-project/pull/81691) are [in commits
after this
one](d0757f46b3).
Previous commits are restored from the remote branch with timestamps.

1. Fix build breakage for non-ELF platforms, by defining the missing
functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`,
`__llvm_profile_begin_vtabnames `, `__llvm_profile_end_vtabnames`}
everywhere.
* Tested on mac laptop (for darwins) and Windows. Specifically,
functions in `InstrProfilingPlatformWindows.c` returns `NULL` to make it
more explicit that type prof isn't supported; see comments for the
reason.
* For the rest (AIX, other), mostly follow existing examples (like this
[one](f95b2f1acf))
   
2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section
name, and make returned pointers
[const](a825d2a4ec (diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121))

**Original Description**

* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
  
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/write change and llvm-profdata change.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
  
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2024-02-27 11:07:40 -08:00
Mingming Liu
a8c3b3e20d
[nfc][compiler-rt]Replace Type::getInt8PtrTy with PointerType::getUnqual as a clean-up (#82434)
This is a follow up of
7b9d73c2f9
and
5ef9ba7412
* The definition of `Type::getInt8PtrTy` is deleted. This doesn't cause
a compile error because the `Initializer` part of the macro doesn't run.
2024-02-24 22:30:31 -08:00
Mingming Liu
0e8d1877cd
Revert type profiling change as compiler-rt test break on Windows. (#82583)
Examples
https://lab.llvm.org/buildbot/#/builders/127/builds/62532/steps/8/logs/stdio
2024-02-21 21:41:33 -08:00
Mingming Liu
db7e9e6841
[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling. (#81691)
* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
  
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/write change and llvm-profdata change.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
  
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2024-02-21 20:59:42 -08:00
Mingming Liu
4247175d45
[nfc]For InstrProfData.inc, clang-format functions and opt-out of formatting on the rest (#82057)
Without this, each time `InstrProfData.inc` is modified (like in
https://github.com/llvm/llvm-project/pull/81691), pre-commit CI
clang-format aggressively formats many lines in an unreadable way. Pull
request with red pre-commit checks are usually frowned upon.

* Use `// clang-format:<reason>` instead of `/* clang-format */`. The
former
[allows](563ef30601/clang/lib/Format/Format.cpp (L4108-L4113))
specifying a reason but the latter is
[not](563ef30601/clang/lib/Format/Format.cpp (L4105-L4106)).
- Filed https://github.com/llvm/llvm-project/issues/82426 to track the
issue in clang-format.
2024-02-21 10:55:59 -08:00
Enna1
ac6f9a785c Reland "[MemProf] Add missing header to list of installed headers. (#79413)"
This reland commit 0a3b5ece0e51b85e4674388f46a0205c4361f76e with fix.
2024-01-26 14:53:41 +08:00
Enna1
f174648b6c Revert "[MemProf] Add missing header to list of installed headers. (#79413)"
This reverts commit 0a3b5ece0e51b85e4674388f46a0205c4361f76e.
2024-01-26 14:50:49 +08:00
Enna1
0a3b5ece0e
[MemProf] Add missing header to list of installed headers. (#79413)
There were buildbot failures when running memprof tests: 
Failed Tests (12):
  MemProfiler-x86_64-linux :: TestCases/interface_test.cpp
  MemProfiler-x86_64-linux :: TestCases/log_path_test.cpp
  MemProfiler-x86_64-linux :: TestCases/memprof_merge_mib.cpp
  MemProfiler-x86_64-linux :: TestCases/memprof_profile_dump.cpp
  MemProfiler-x86_64-linux :: TestCases/profile_reset.cpp
  MemProfiler-x86_64-linux :: TestCases/unaligned_loads_and_stores.cpp
  MemProfiler-x86_64-linux-dynamic :: TestCases/interface_test.cpp
  MemProfiler-x86_64-linux-dynamic :: TestCases/log_path_test.cpp
  MemProfiler-x86_64-linux-dynamic :: TestCases/memprof_merge_mib.cpp
  MemProfiler-x86_64-linux-dynamic :: TestCases/memprof_profile_dump.cpp
  MemProfiler-x86_64-linux-dynamic :: TestCases/profile_reset.cpp
MemProfiler-x86_64-linux-dynamic ::
TestCases/unaligned_loads_and_stores.cpp

See
- https://lab.llvm.org/buildbot/#/builders/258/builds/8852
- https://lab.llvm.org/buildbot/#/builders/258/builds/12876

I suspect the failure is because when build with
-DLLVM_ENABLE_RUNTIMES=compiler-rt -DCOMPILER_RT_BUILD_SANITIZERS=OFF,
the headers sanitizer/allocator_interface.h and
sanitizer/common_interface_defs.h
are not copied to the build tree, and not installed.
But in the failed memprof tests,
sanitizer/allocator_interface.h or sanitizer/memprof_interface.h is
included.

This patch adds sanitizer/allocator_interface.h and
sanitizer/memprof_interface.h to memprof headers if
COMPILER_RT_BUILD_SANITIZERS is false.
2024-01-26 09:18:19 +08:00
Qiongsi Wu
4f21fb8447
[PGO] Reland PGO's Counter Reset and File Dumping APIs #76471 (#78285)
https://github.com/llvm/llvm-project/pull/76471 caused buildbot failures
on Windows. For more details, see
https://github.com/llvm/llvm-project/issues/77546.

This PR revises the test and relands
https://github.com/llvm/llvm-project/pull/76471.
2024-01-22 14:54:58 -05:00
Mingming Liu
66981f9c61
[docs][IRPGO]Document two binary formats for instrumentation-based profiles, with a focus on IRPGO. (#76105) 2024-01-10 16:17:15 -08:00
Vitaly Buka
a828cda9c8 Revert "[PGO] Exposing PGO's Counter Reset and File Dumping APIs (#76471)"
Issue #77546

This reverts commit 07c9189fcc063bdf6219d2733843c89cde3991e1.
2024-01-09 18:37:04 -08:00
Qiongsi Wu
07c9189fcc
[PGO] Exposing PGO's Counter Reset and File Dumping APIs (#76471)
This PR exposes four PGO functions 

- `__llvm_profile_set_filename`
- `__llvm_profile_reset_counters`, 
- `__llvm_profile_dump` 
- `__llvm_orderfile_dump` 

to user programs through the new header `instr_prof_interface.h` under
`compiler-rt/include/profile`. This way, the user can include the header
`profile/instr_prof_interface.h` to introduce these four names to their
programs.

Additionally, this PR defines macro `__LLVM_INSTR_PROFILE_GENERATE` when
the program is compiled with profile generation, and defines macro
`__LLVM_INSTR_PROFILE_USE` when the program is compiled with profile
use. `__LLVM_INSTR_PROFILE_GENERATE` together with
`instr_prof_interface.h` define the PGO functions only when the program
is compiled with profile generation. When profile generation is off,
these PGO functions are defined away and leave no trace in the user's
program.

Background:
https://discourse.llvm.org/t/pgo-are-the-llvm-profile-functions-stable-c-apis-across-llvm-releases/75832
2024-01-09 10:38:17 -05:00
Zequan Wu
ab3430f891
[Profile] Add binary profile correlation for code coverage. (#69493)
## Motivation
Since we don't need the metadata sections at runtime, we can somehow
offload them from memory at runtime. Initially, I explored [debug info
correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113),
which is used for PGO with value profiling disabled. However, it
currently only works with DWARF and it's be hard to add such artificial
debug info for every function in to CodeView which is used on Windows.
So, offloading profile metadata sections at runtime seems to be a
platform independent option.

## Design
The idea is to use new section names for profile name and data sections
and mark them as metadata sections. Under this mode, the new sections
are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime
and can be stripped away as a post-linking step. After the process
exits, the generated raw profiles will contains only headers + counters.
llvm-profdata can be used correlate raw profiles with the unstripped
binary to generate indexed profile.

## Data
For chromium base_unittests with code coverage on linux, the binary size
overhead due to instrumentation reduced from 64M to 38.8M (39.4%) and
the raw profile files size reduce from 128M to 68M (46.9%)
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +14.6Mi  [NEW] +14.6Mi    __llvm_prf_data
  [NEW] +10.6Mi  [NEW] +10.6Mi    __llvm_prf_names
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
  [ = ]       0   +65% +1.23Ki    .relro_padding
   +62% +1.20Ki  [ = ]       0    [Unmapped]
   +13%    +448   +19%    +448    .init_array
  +8.8%    +192  [ = ]       0    [ELF Section Headers]
  +0.0%    +136  +0.0%     +80    [7 Others]
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.5%     +80  +1.2%     +64    .plt
  [ = ]       0 -99.2% -3.68Ki    [LOAD #5 [RW]]
  +195% +64.0Mi  +194% +64.0Mi    TOTAL
$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
   +13%    +448   +19%    +448    .init_array
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.2%     +64  +1.2%     +64    .plt
  +2.9%     +64  [ = ]       0    [ELF Section Headers]
  +0.0%     +40  +0.0%     +40    .data
  +1.2%     +32  +1.2%     +32    .got.plt
  +0.0%     +24  +0.0%      +8    [5 Others]
  [ = ]       0 -22.9%    -872    [LOAD #5 [RW]]
 -74.5% -1.44Ki  [ = ]       0    [Unmapped]
  [ = ]       0 -76.5% -1.45Ki    .relro_padding
  +118% +38.8Mi  +117% +38.8Mi    TOTAL
```

A few things to note:
1. llvm-profdata doesn't support filter raw profiles by binary id yet,
so when a raw profile doesn't belongs to the binary being digested by
llvm-profdata, merging will fail. Once this is implemented,
llvm-profdata should be able to only merge raw profiles with the same
binary id as the binary and discard the rest (with mismatched/missing
binary id). The workflow I have in mind is to have scripts invoke
llvm-profdata to get all binary ids for all raw profiles, and
selectively choose the raw pnrofiles with matching binary id and the
binary to llvm-profdata for merging.
2. Note: In COFF, currently they are still loaded into memory but not
used. I didn't do it in this patch because I noticed that `.lcovmap` and
`.lcovfunc` are loaded into memory. A separate patch will address it.
3. This should works with PGO when value profiling is disabled as debug
info correlation currently doing, though I haven't tested this yet.
2023-12-14 14:16:38 -05:00