213 Commits

Author SHA1 Message Date
Kazu Hirata
352602010f Repply [memprof] Introduce FrameIdConverter and CallStackIdConverter (#90307)
Currently, we convert FrameId to Frame and CallStackId to a call stack
at several places.  This patch unifies those into function objects --
FrameIdConverter and CallStackIdConverter.

The existing implementation of CallStackIdConverter, being removed in
this patch, handles both FrameId and CallStackId conversions.  This
patch splits it into two phases for flexibility (but make them
composable) because some places only require the FrameId conversion.

This iteration fixes a problem uncovered with ubsan, where we were
dereferencing an uninitialized std::unique_ptr.
2024-04-28 11:44:45 -07:00
Vitaly Buka
7aa6896dd7
Revert "[memprof] Introduce FrameIdConverter and CallStackIdConverter" (#90318)
Reverts llvm/llvm-project#90307

Breaks bots https://lab.llvm.org/buildbot/#/builders/5/builds/42943
2024-04-27 00:15:08 -07:00
Kazu Hirata
e04df693bf
[memprof] Introduce FrameIdConverter and CallStackIdConverter (#90307)
Currently, we convert FrameId to Frame and CallStackId to a call stack
at several places.  This patch unifies those into function objects --
FrameIdConverter and CallStackIdConverter.

The existing implementation of CallStackIdConverter, being removed in
this patch, handles both FrameId and CallStackId conversions.  This
patch splits it into two phases for flexibility (but make them
composable) because some places only require the FrameId conversion.
2024-04-26 19:22:17 -07:00
Kazu Hirata
777d2e54a9
[memprof] Drop the trait parameter (NFC) (#89461)
OnDiskIterableChainedHashTable::Create can default-contruct a trait
object for us.  We don't need to construct one on our own unless we
need to customize something (like a version number).
2024-04-19 17:00:57 -07:00
Kazu Hirata
b64c69d5b1
[memprof] Introduce IndexedMemProfReader (NFC) (#89331)
Without this patch, a lot of details about the deserilization of the
MemProf data are exposed in IndexedInstrProfReader -- four
MemProf-related variables in IndexedInstrProfReader plus 90+ lines of
deserilization code within IndexedInstrProfReader::readHeader.

This patch encapsulates them into a separate class, exposing only
three methods, namely the default constructor, deserialize, and
getMemProfRecord.
2024-04-18 21:16:46 -07:00
Kazu Hirata
172f6ddfa7
[memprof] Add Version2 of the indexed MemProf format (#89100)
This patch adds Version2 of the indexed MemProf format.  The new
format comes with a hash table from CallStackId to actual call stacks
llvm::SmallVector<FrameId>.  The rest of the format refers to call
stacks with CallStackId.  This "values + references" model effectively
deduplicates call stacks.  Without this patch, a large indexed memprof
file of mine shrinks from 4.4GB to 1.6GB, a 64% reduction.

This patch does not make Version2 generally available yet as I am
planning to make a few more changes to the format.
2024-04-18 14:12:58 -07:00
Kazu Hirata
f430e37446
[llvm] Drop unaligned from calls to readNext (NFC) (#88841)
Now readNext defaults to unaligned accesses.  This patch drops
unaligned to improve readability.
2024-04-16 12:47:02 -07:00
Kazu Hirata
db9a17a407
[memprof] Use std::optional (NFC) (#88366) 2024-04-11 09:56:01 -07:00
Kazu Hirata
d89914f30b
[memprof] Add Version2 of IndexedMemProfRecord serialization (#87455)
I'm currently developing a new version of the indexed memprof format
where we deduplicate call stacks in IndexedAllocationInfo::CallStack
and IndexedMemProfRecord::CallSites.  We refer to call stacks with
integer IDs, namely CallStackId, just as we refer to Frame with
FrameId.  The deduplication will cut down the profile file size by 80%
in a large memprof file of mine.

As a step toward the goal, this patch teaches
IndexedMemProfRecord::{serialize,deserialize} to speak Version2.  A
subsequent patch will add Version2 support to llvm-profdata.

The essense of the patch is to replace the serialization of a call
stack, a vector of FrameIDs, with that of a CallStackId.  That is:

  const IndexedAllocationInfo &N = ...;
  ...
  LE.write<uint64_t>(N.CallStack.size());
  for (const FrameId &Id : N.CallStack)
    LE.write<FrameId>(Id);

becomes:

  LE.write<CallStackId>(N.CSId);
2024-04-03 21:48:38 -07:00
Mingming Liu
1351d17826
[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (#66825)
(The profile format change is split into a standalone change into https://github.com/llvm/llvm-project/pull/81691)

* For InstrFDO value profiling, implement instrumentation and lowering for virtual table address.
* This is controlled by `-enable-vtable-value-profiling` and off by default.
* When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads.
 
* Implement profile reader and writer support 
  * Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols.
  * Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, reader decompress the string to show symbols of a profiled site. When used in compiler, string decompression doesn't
happen since IR is used to construct InstrProfSymtab.
  * Indexed profile writer collects the list of vtable names, and stores that to index profiles.
  * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type.
* `llvm-profdata show -show-vtables <args> <profile>` is implemented.

rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
2024-04-01 08:52:35 -07:00
Kazu Hirata
44253a9ce6
[memprof] Add MemProf version (#86414)
This patch adds a version field to the MemProf section of the indexed
profile format, calling the new version "version 1".  The existing
version is called "version 0".

The writer supports both versions via a command-line option:

  llvm-profdata merge --memprof-version=1 ...

The reader supports both versions by automatically detecting the
version from the header.
2024-03-28 14:29:34 -07:00
Kazu Hirata
9855134d07
[memprof] Use #ifdef EXPENSIVE_CHECKS (#86585)
This patch replaces:

  #if EXPENSIVE_CHECKS

with:

  #ifdef EXPENSIVE_CHECKS

to follow the existing conventions.
2024-03-25 14:36:03 -07:00
Kazu Hirata
6646fe884c
[memprof] Compute CallStackId when deserializing IndexedAllocationInfo (#86421)
There are two ways to create in-memory instances of
IndexedAllocationInfo -- deserialization of the raw MemProf data and
that of the indexed MemProf data.

With:

  commit 74799f424063a2d751e0f9ea698db1f4efd0d8b2
  Author: Kazu Hirata <kazu@google.com>
  Date:   Sat Mar 23 19:50:15 2024 -0700

we compute CallStackId for each call stack in IndexedAllocationInfo
while deserializing the raw MemProf data.

This patch does the same while deserilizing the indexed MemProf data.

As with the patch above, this patch does not add any use of
CallStackId yet.
2024-03-25 14:21:49 -07:00
Mingming Liu
16e74fd489
Reland "[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling." (#82711)
New change on top of [reviewed
patch](https://github.com/llvm/llvm-project/pull/81691) are [in commits
after this
one](d0757f46b3).
Previous commits are restored from the remote branch with timestamps.

1. Fix build breakage for non-ELF platforms, by defining the missing
functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`,
`__llvm_profile_begin_vtabnames `, `__llvm_profile_end_vtabnames`}
everywhere.
* Tested on mac laptop (for darwins) and Windows. Specifically,
functions in `InstrProfilingPlatformWindows.c` returns `NULL` to make it
more explicit that type prof isn't supported; see comments for the
reason.
* For the rest (AIX, other), mostly follow existing examples (like this
[one](f95b2f1acf))
   
2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section
name, and make returned pointers
[const](a825d2a4ec (diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121))

**Original Description**

* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
  
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/write change and llvm-profdata change.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
  
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2024-02-27 11:07:40 -08:00
Mingming Liu
0e8d1877cd
Revert type profiling change as compiler-rt test break on Windows. (#82583)
Examples
https://lab.llvm.org/buildbot/#/builders/127/builds/62532/steps/8/logs/stdio
2024-02-21 21:41:33 -08:00
Mingming Liu
db7e9e6841
[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling. (#81691)
* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
  
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/write change and llvm-profdata change.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
  
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2024-02-21 20:59:42 -08:00
NAKAMURA Takumi
f035c018a6 InstrProf::getFunctionBitmap: Fix BE hosts (#80608) 2024-02-05 15:04:57 +09:00
NAKAMURA Takumi
4926f12ff5
[Coverage] ProfileData: Handle MC/DC Bitmap as BitVector. NFC. (#80608)
* `getFunctionBitmap()` stores not `std::vector<uint8_t>` but
`BitVector`.
* `CounterMappingContext` holds `Bitmap` (instead of the ref of bytes)
* `Bitmap` and `BitmapIdx` are used instead of `evaluateBitmap()`.

FIXME: `InstrProfRecord` itself should handle `Bitmap` as `BitVector`.
2024-02-05 12:42:08 +09:00
Mingming Liu
eba2b789d3
[RawProfReader]When constructing symbol table, read the MD5 of function name in the proper byte order (#76312)
Before this patch, when the field `NameRef` is generated in little-endian systems and read back in big-endian systems, the information gets dropped.
- The bug gets caught by a buildbot
https://lab.llvm.org/buildbot/#/builders/94/builds/17931. In the error message (pasted below),
two indirect call targets are not imported.

```
   ; IMPORTS-DAG: Import _Z7callee1v
               ^
<stdin>:1:1: note: scanning from here
main.ll: Import _Z11global_funcv from lib.cc
^
<stdin>:1:10: note: possible intended match here
main.ll: Import _Z11global_funcv from lib.cc
         ^
Input file: <stdin>
Check file:
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
-dump-input=help explains the following input dump.
Input was:
<<<<<<
          1: main.ll: Import _Z11global_funcv from lib.cc 
dag:34'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match
found
dag:34'1 ? possible intended match
```

[This commit](b3999246b1 (diff-b196b796c5a396c7cdf93b347fe47e2b29b72d0b7dd0e2b88abb964d376ee50e)) gates the fix by flag and provide test data by creating big-endian profiles (rather than reading the little-endian data on a big-endian system that might require a VM).

- [This](b3999246b1 (diff-643176077ddbe537bd0a05d2a8a53bdff6339420a30e8511710bf232afdda8b9)) is a hexdump of little-endian profile data, and [this](b3999246b1 (diff-1736a3ee25dde02bba55d670df78988fdb227e5a85b94b8707cf182cf70b28f0)) is the big-endian version of it.
- The [README.md](b3999246b1 (diff-6717b6a385de3ae60ab3aec9638af2a43b55adaf6784b6f0393ebe1a6639438b)) shows the result of `llvm-profdata show -ic-targets` before and after the fix when the profile is in big-endian.
2024-01-02 10:23:29 -08:00
Mingming Liu
78a195e100
Reland the reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75954)
Simplify the compiler-rt test to make it more general for different
platforms, and use `*DAG` matchers for lines that may be emitted
out-of-order.
- The compiler-rt test passed on a Windows machine. Previously name
matchers don't work for MSVC mangling
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907)
- `*DAG` matchers fixed the error in
https://lab.llvm.org/buildbot/#/builders/94/builds/17924

This is the second reland and fixed errors caught in first reland
(https://github.com/llvm/llvm-project/pull/75860)

**Original commit message**
Commit fe05193 (phab D156569), IRPGO names uses format
`[<filepath>;]<linkage-name>` while prior format is
`[<filepath>:<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValues::getGlobalIdentifer` to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

*
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-19 12:25:56 -08:00
Mingming Liu
6ce23ea0ab
Revert "Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. "" (#75888)
Reverts llvm/llvm-project#75860
- Mangled name mismatch on Windows
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907/steps/8/logs/stdio)
2023-12-18 19:31:18 -08:00
Mingming Liu
c5871712ae
Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75860)
Fixed build-bot failures caught by post-submit tests
1) Add the list of command line tools needed by new compiler-rt test into dependency.
2) Use `starts_with` to replace deprecated `startswith`.

**Original commit message**
Commit fe05193 (phab D156569), IRPGO names uses format
`[<filepath>;]<linkage-name>` while prior format is
`[<filepath>:<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValues::getGlobalIdentifer` to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

*
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 17:43:40 -08:00
Mingming Liu
3aa5d71127
Revert "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles." (#75835)
Reverts llvm/llvm-project#74008

The compiler-rt test failed due to `llvm-dis` not found
(https://lab.llvm.org/buildbot/#/builders/127/builds/59884)
Will revert and investigate how to require the proper dependency.
2023-12-18 09:39:55 -08:00
Mingming Liu
245cddae70
[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. (#74008)
Commit fe05193 (phab D156569), IRPGO names uses format
`[<filepath>;]<linkage-name>` while prior format is
`[<filepath>:<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValues::getGlobalIdentifer` to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

*`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 09:10:39 -08:00
Zequan Wu
ab3430f891
[Profile] Add binary profile correlation for code coverage. (#69493)
## Motivation
Since we don't need the metadata sections at runtime, we can somehow
offload them from memory at runtime. Initially, I explored [debug info
correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113),
which is used for PGO with value profiling disabled. However, it
currently only works with DWARF and it's be hard to add such artificial
debug info for every function in to CodeView which is used on Windows.
So, offloading profile metadata sections at runtime seems to be a
platform independent option.

## Design
The idea is to use new section names for profile name and data sections
and mark them as metadata sections. Under this mode, the new sections
are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime
and can be stripped away as a post-linking step. After the process
exits, the generated raw profiles will contains only headers + counters.
llvm-profdata can be used correlate raw profiles with the unstripped
binary to generate indexed profile.

## Data
For chromium base_unittests with code coverage on linux, the binary size
overhead due to instrumentation reduced from 64M to 38.8M (39.4%) and
the raw profile files size reduce from 128M to 68M (46.9%)
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +14.6Mi  [NEW] +14.6Mi    __llvm_prf_data
  [NEW] +10.6Mi  [NEW] +10.6Mi    __llvm_prf_names
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
  [ = ]       0   +65% +1.23Ki    .relro_padding
   +62% +1.20Ki  [ = ]       0    [Unmapped]
   +13%    +448   +19%    +448    .init_array
  +8.8%    +192  [ = ]       0    [ELF Section Headers]
  +0.0%    +136  +0.0%     +80    [7 Others]
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.5%     +80  +1.2%     +64    .plt
  [ = ]       0 -99.2% -3.68Ki    [LOAD #5 [RW]]
  +195% +64.0Mi  +194% +64.0Mi    TOTAL
$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
   +13%    +448   +19%    +448    .init_array
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.2%     +64  +1.2%     +64    .plt
  +2.9%     +64  [ = ]       0    [ELF Section Headers]
  +0.0%     +40  +0.0%     +40    .data
  +1.2%     +32  +1.2%     +32    .got.plt
  +0.0%     +24  +0.0%      +8    [5 Others]
  [ = ]       0 -22.9%    -872    [LOAD #5 [RW]]
 -74.5% -1.44Ki  [ = ]       0    [Unmapped]
  [ = ]       0 -76.5% -1.45Ki    .relro_padding
  +118% +38.8Mi  +117% +38.8Mi    TOTAL
```

A few things to note:
1. llvm-profdata doesn't support filter raw profiles by binary id yet,
so when a raw profile doesn't belongs to the binary being digested by
llvm-profdata, merging will fail. Once this is implemented,
llvm-profdata should be able to only merge raw profiles with the same
binary id as the binary and discard the rest (with mismatched/missing
binary id). The workflow I have in mind is to have scripts invoke
llvm-profdata to get all binary ids for all raw profiles, and
selectively choose the raw pnrofiles with matching binary id and the
binary to llvm-profdata for merging.
2. Note: In COFF, currently they are still loaded into memory but not
used. I didn't do it in this patch because I noticed that `.lcovmap` and
`.lcovfunc` are loaded into memory. A separate patch will address it.
3. This should works with PGO when value profiling is disabled as debug
info correlation currently doing, though I haven't tested this yet.
2023-12-14 14:16:38 -05:00
Kazu Hirata
586ecdf205
[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-11 21:01:36 -08:00
Zequan Wu
b9951b3fe6
[llvm-profdata] Fix binary ids with multiple raw profiles in a single… (#72740)
Save binary ids when iterating through `RawInstrProfReader`.

Fixes #72699.
2023-11-20 14:25:24 -05:00
Ellis Hoag
b0154c36d6
[InstrProf] Add pgo use block coverage test (#72443)
Back in https://reviews.llvm.org/D124490 we added a block coverage mode
that instruments a subset of basic blocks using single byte counters to
get coverage for the whole function.

This commit adds a test to make sure that we correctly assign branch
weights based on the coverage profile.

I noticed this test was missing after seeing that we had no coverage on
`PGOUseFunc::populateCoverage()`

https://lab.llvm.org/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp.html#L1383
2023-11-20 09:25:33 -06:00
Zequan Wu
3c97c8b6fc
[Profile] Refactor profile correlation. (#70856)
Refactor some code from https://github.com/llvm/llvm-project/pull/69493.

#70712 was reverted due to linking failures. So, `-debug-info-correlate` remains unchanged and no new flag added.
2023-11-01 14:16:43 -04:00
Zequan Wu
89a2e70159
[llvm-profdata] Emit warning when counter value is greater than 2^56. (#69513)
Fixes #65416
2023-10-31 16:40:51 -04:00
Zequan Wu
db7a1ed9a2 Revert "[Profile] Refactor profile correlation. (#70712)"
This reverts commit 4b383d0af93136b80841fc140da0823dfc441dd4.
2023-10-31 10:53:45 -04:00
Zequan Wu
4b383d0af9
[Profile] Refactor profile correlation. (#70712)
Refactor some code from https://github.com/llvm/llvm-project/pull/69493.

Rebase of https://github.com/llvm/llvm-project/pull/69656 on top of main
as it was messed up.
2023-10-31 10:41:01 -04:00
Alan Phipps
f95b2f1acf Reland "[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)"
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.

Differential Revision: https://reviews.llvm.org/D138846
2023-10-30 11:15:02 -05:00
Kazu Hirata
02f67c097d Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class. This patch replaces
{big,little,native} with llvm::endianness::{big,little,native}.

This patch completes the migration to llvm::endianness and
llvm::endianness::{big,little,native}.  I'll post a separate patch to
remove the migration helpers in llvm/Support/Endian.h:

  using endianness = llvm::endianness;
  constexpr llvm::endianness big = llvm::endianness::big;
  constexpr llvm::endianness little = llvm::endianness::little;
  constexpr llvm::endianness native = llvm::endianness::native;
2023-10-13 23:16:25 -07:00
Kazu Hirata
b8885926f8 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an enum.
This patch replaces llvm::support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-10 22:54:51 -07:00
Kazu Hirata
d7b18d5083 Use llvm::endianness{,::little,::native} (NFC)
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form.  This patch replaces
llvm::support::endianness with llvm::endianness.
2023-10-09 00:54:47 -07:00
Zequan Wu
3c34245c47
[Profile] Use upper 32 bits of profile version for profile variants. (#67695)
Currently all upper 8 bits are reserved for different profile variants.
We need more bits for new mods in the future.
Context:
https://discourse.llvm.org/t/how-to-add-a-new-mode-to-llvm-raw-profile-version/73688
2023-10-03 10:15:22 -04:00
Kazu Hirata
7df88212d4 [ProfileData] Use llvm::byteswap instead of sys::getSwappedBytes (NFC) 2023-09-23 13:00:47 -07:00
Hans Wennborg
53a2923bf6 Revert "[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)"
This seems to cause Clang to crash, see comments on the code review. Reverting
until the problem can be investigated.

> Part 1 of 3. This includes the LLVM back-end processing and profile
> reading/writing components. compiler-rt changes are included.
>
> Differential Revision: https://reviews.llvm.org/D138846

This reverts commit a50486fd736ab2fe03fcacaf8b98876db77217a7.
2023-09-21 12:20:24 +02:00
Alan Phipps
a50486fd73 [InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.

Differential Revision: https://reviews.llvm.org/D138846
2023-09-19 17:07:23 -05:00
Arthur Eubanks
a6f33ad447 [NFC][Profile] Rename Counters/DataSize to NumCounters/Data
Fixes some FIXMEs.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158466
2023-08-22 09:03:59 -07:00
Ellis Hoag
fe051934cb [InstrProf] Encode linkage names in IRPGO counter names
Prior to this diff, names in the `__llvm_prf_names` section had the format `[<filepath>:]<function-name>`, e.g., `main.cpp:foo`, `bar`. `<filepath>` is used to discriminate between possibly identical function names when linkage is local and `<function-name>` simply comes from `F.getName()`. This has two problems:
  * `:` is commonly found in Objective-C functions so that names like `main.mm:-[C foo::]` and `-[C bar::]` are difficult to parse
  * `<function-name>` might be different from the linkage name, so it cannot be used to pass a function order to the linker via `-symbol-ordering-file` or `-order_file` (see https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068)

Instead, this diff changes the format to `[<filepath>;]<linkage-name>`, e.g., `main.cpp;_foo`, `_bar`. The hope is that `;` won't realistically be found in either `<filepath>` or `<linkage-name>`.

To prevent invalidating all prior IRPGO profiles, we also lookup the prior name format when a record is not found (see `InstrProfSymtab::create()`, `readMemprof()`, and `getInstrProfRecord()`). It seems that Swift and Clang FE-PGO rely on the original `getPGOFuncName()`, so we cannot simply replace it.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D156569
2023-08-07 10:15:08 -07:00
Jessica Paquette
17cfd2e025 [profiling] Improve error message for raw profile header mismatches
When a user uses a mismatched clang + llvm-profdata, they didn't get a very
informative error message. It would just say "unsupported version".

As a result, users are often confused as to what they are supposed to do and
tend to assume that it's a bug in the profiling runtime.

This patch improves the error message by:

- Adding a new class of error (`raw_profile_version_mismatch`) to make it clear
  that, specifically, the *raw profile* version is unsupported because of a
  tool mismatch.

- Adding an error message that tells the user which raw profile version was
  encountered, which version was expected, and instructs them to align their
  tool versions.

To support this, this patch also updates `InstrProfError::take` to also
propagate the optional error message.

Differential Revision: https://reviews.llvm.org/D149361
2023-04-27 14:51:38 -07:00
Ellis Hoag
4bddef4117 [InstrProf][Temporal] Add weight field to traces
As discussed in [0], add a `weight` field to temporal profiling traces found in profiles. This allows users to use the `--weighted-input=` flag in the `llvm-profdata merge` command to weight traces from different scenarios differently.

Note that this is a breaking change, but since [1] landed very recently and there is no way to "use" this trace data, there should be no users of this feature. We believe it is acceptable to land this change without bumping the profile format version.

[0] https://reviews.llvm.org/D147812#4259507
[1] https://reviews.llvm.org/D147287

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D148150
2023-04-13 10:37:05 -07:00
Ellis Hoag
244be0b0de [InstrProf] Temporal Profiling
As described in [0], this extends IRPGO to support //Temporal Profiling//.

When `-pgo-temporal-instrumentation` is used we add the `llvm.instrprof.timestamp()` intrinsic to the entry of functions which in turn gets lowered to a call to the compiler-rt function `INSTR_PROF_PROFILE_SET_TIMESTAMP()`. A new field in the `llvm_prf_cnts` section stores each function's timestamp. Then in `llvm-profdata merge` we convert these function timestamps into a //trace// and add it to the indexed profile.

Since these traces could significantly increase the profile size, we've added `-max-temporal-profile-trace-length` and `-temporal-profile-trace-reservoir-size` to limit the length of a trace and the number of traces in a profile, respectively.

In a future diff we plan to use these traces to construct an optimized function order to reduce the number of page faults during startup.

Special thanks to Julian Mestre for helping with reservoir sampling.

[0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D147287
2023-04-11 08:30:52 -07:00
Simon Pilgrim
6c8fe96582 [Support] Move ItaniumManglingCanonicalizer and SymbolRemappingReader from Support to ProfileData
As mentioned on https://discourse.llvm.org/t/issues-in-llvm-tblgen-high-parallelized-build/68037, ItaniumManglingCanonicalizer is often slow to build, resulting in a bottleneck for distributed builds while waiting for LLVMSupport to complete.

SymbolRemappingReader is the only current user of ItaniumManglingCanonicalizer, and this is only used by ProfileData and llvm-cxxmap - so I propose we move both files into the ProfileData library.

Differential Revision: https://reviews.llvm.org/D143318
2023-02-06 20:55:48 +00:00
Mitch Phillips
7e566a3d11 Remove another unnecessary integer-check.
Same as b3b940d1501e39563ac549c3a5a89b25ae8ab7b8
2023-02-01 15:08:32 -08:00
Mitch Phillips
b3b940d150 Remove unnecessary comparison.
Popped up after https://reviews.llvm.org/D142826 added extra flags to
-Wextra, which is used by our sanitizer buildbots
(https://lab.llvm.org/buildbot/#/builders/37/builds/19910).

This check seems unnecessary, it's a bad cargo-cult after the buffer
size was expanded to allow >= 4GiB after
https://reviews.llvm.org/rGd6c15b661ab0aabb00f1219ce4af7136938e67e2.
2023-02-01 15:04:06 -08:00
Steven Wu
516e301752 [NFC][Profile] Access profile through VirtualFileSystem
Make the access to profile data going through virtual file system so the
inputs can be remapped. In the context of the caching, it can make sure
we capture the inputs and provided an immutable input as profile data.

Reviewed By: akyrtzi, benlangmuir

Differential Revision: https://reviews.llvm.org/D139052
2023-02-01 09:25:02 -08:00
Gulfem Savrun Yeniceri
1ec021467d [instrprof] Fix issue in binary-ids-padding.test
https://reviews.llvm.org/D135929 caused a failure in
binary-ids-padding.test in big endian configurations:
https://lab.llvm.org/buildbot/#/builders/231/builds/6709

binary-ids-padding.test writes the profile in little-endian format.
This patch changes the raw profile reader to use getDataEndianness()
instead of llvm::support::endian::system_endianness() to fix the issue.
2022-12-29 21:48:26 +00:00