161 Commits

Author SHA1 Message Date
Kazu Hirata
bba5ee47e6
[memprof] Introduce memprof::LinearFrameId (NFC) (#94057)
This patch introduces memprof::LinearFrameId, which is a frame version
of memprof::LinearCallStackId.
2024-05-31 15:29:44 -07:00
Kazu Hirata
9a8b73c741
[memprof] Replace uint32_t with LinearCallStackId where appropriate (NFC) (#94023)
This patch replaces uint32_t with LinearCallStackId where appropriate.
I'm replacing uint64_t with LinearCallStackId in
writeMemProfCallStackArray, but that's OK because it's a value to be
used as LinearCallStackId anyway.
2024-05-31 14:41:05 -07:00
Kazu Hirata
90acfbf90d
[memprof] Use linear IDs for Frames and call stacks (#93740)
With this patch, we stop using on-disk hash tables for Frames and call
stacks.  Instead, we'll write out all the Frames as a flat array while
maintaining mappings from FrameIds to the indexes into the array.
Then we serialize call stacks in terms of those indexes.

Likewise, we'll write out all the call stacks as another flat array
while maintaining mappings from CallStackIds to the indexes into the
call stack array.  One minor difference from Frames is that the
indexes into the call stack array are not contiguous because call
stacks are variable-length objects.

Then we serialize IndexedMemProfRecords in terms of the indexes
into the call stack array.

Now, we describe each call stack with 32-bit indexes into the Frame
array (as opposed to the 64-bit FrameIds in Version 2).  The use of
the smaller type cuts down the profile file size by about 40% relative
to Version 2.  The departure from the on-disk hash tables contributes
a little bit to the savings, too.

For now, IndexedMemProfRecords refer to call stacks with 64-bit
indexes into the call stack array.  As a follow-up, I'll change that
to uint32_t, including necessary updates to RecordWriterTrait.
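
A minimal sketch of the layout described above, not the actual LLVM code. The struct and function names are hypothetical, the Frame payload is reduced to its FrameId, and the length prefix in front of each call stack is an assumption made only to keep the array self-describing:

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for illustration; not the LLVM definitions.
using FrameId = uint64_t;      // 64-bit ID, as used by Version 2
using CallStackId = uint64_t;

struct LinearizedProfile {
  std::vector<FrameId> FrameArray;       // flat array of Frames (payload elided)
  std::vector<uint32_t> CallStackArray;  // repeated [length, index, index, ...]
  std::unordered_map<FrameId, uint32_t> FrameIndexes;          // FrameId -> index
  std::unordered_map<CallStackId, uint64_t> CallStackOffsets;  // CallStackId -> offset
};

// Append a Frame if it has not been seen yet and return its 32-bit index.
uint32_t addFrame(LinearizedProfile &P, FrameId Id) {
  auto [It, Inserted] = P.FrameIndexes.try_emplace(
      Id, static_cast<uint32_t>(P.FrameArray.size()));
  if (Inserted)
    P.FrameArray.push_back(Id);
  return It->second;
}

// Append a call stack as 32-bit Frame indexes and return its starting offset.
uint64_t addCallStack(LinearizedProfile &P, CallStackId CSId,
                      const std::vector<FrameId> &Frames) {
  uint64_t Offset = P.CallStackArray.size();
  P.CallStackArray.push_back(static_cast<uint32_t>(Frames.size()));
  for (FrameId F : Frames)
    P.CallStackArray.push_back(addFrame(P, F));
  P.CallStackOffsets[CSId] = Offset;
  return Offset;
}
```

Adding a two-frame stack and then a three-frame stack yields offsets 0 and 3, which illustrates why the call stack indexes are not contiguous.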
2024-05-30 14:28:22 -07:00
Kazu Hirata
99b9ab45cd
[memprof] Reorder MemProf sections in profile (#93640)
This patch teaches the V3 format to serialize Frames, call stacks, and
IndexedMemProfRecords, in that order.

I'm planning to use linear IDs for Frames.  That is, Frames will be
numbered 0, 1, 2, and so on in the order we serialize them.  In turn,
we will serialize the call stacks in terms of those linear IDs.

Likewise, I'm planning to use linear IDs for call stacks and then
serialize IndexedMemProfRecords in terms of those linear IDs for call
stacks.

With the new order, we can successively free data structures as we
serialize them.  That is, once we serialize Frames, we can free the
Frames' data proper and just retain mappings from FrameIds to linear
IDs.  A similar story applies to call stacks.
2024-05-29 12:18:24 -07:00
Mingming Liu
c54657887b
[nfc][InstrProfWriter]Store header fields in a vector and back patch once (#93594)
This is a split of https://github.com/llvm/llvm-project/pull/93346 as
discussed.
2024-05-29 10:50:44 -07:00
Kazu Hirata
9e89d107a6
[memprof] Add MemProf format Version 3 (#93608)
This patch adds Version 3 for development purposes.  For now, this
patch adds V3 as a copy of V2.

For the most part, this patch adds "case Version3:" wherever "case
Version2:" appears.  One exception is writeMemProfV3, which is copied
from writeMemProfV2 but updated to write out memprof::Version3 to the
MemProf header.  We'll incrementally modify writeMemProfV3 in
subsequent patches.
2024-05-28 13:30:00 -07:00
Mingming Liu
beac910c3b
[nfc][InstrProfWriter]Wrap vtable writes in a method. (#93081)
- This way `InstrProfWriter::writeImpl` itself is simpler.
2024-05-22 21:10:09 -07:00
Kazu Hirata
2375921d67
[ProfileData] Use default member initializations (NFC) (#93120)
This patch uses default member initializations for all the fields in
Header.  The intent is to prevent accidental uninitialized fields and
reduce the number of times we need to mention each member variable.
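
As an illustration of the technique only: the struct below is hypothetical and its fields do not reproduce the real Header layout.

```cpp
#include <cstdint>

// Hypothetical header for illustration; not the real field list.
struct ExampleHeader {
  uint64_t Magic = 0;       // default member initializers: a field can no
  uint64_t Version = 0;     // longer be left uninitialized by accident
  uint64_t HashType = 0;
  uint64_t HashOffset = 0;
};

// Only the field that differs from its default has to be mentioned.
ExampleHeader makeHeader(uint64_t Version) {
  ExampleHeader H;
  H.Version = Version;
  return H;
}
```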
2024-05-22 20:38:57 -07:00
Mingming Liu
7b977e0f64
[nfc][InstrFDO]Encapsulate header writes in a class member function (#90142)
The smaller class members are more focused and easier to maintain. This
also paves the way for partial header forward compatibility in
https://github.com/llvm/llvm-project/pull/88212

---------

Co-authored-by: Kazu Hirata <kazu@google.com>
2024-05-18 21:51:14 -07:00
Kazu Hirata
479f4a7b68
[memprof] Update comments for writeMemProf and its helpers (#92446)
This patch adds comments for writeMemProf{V0,V1,V2} in a
version-specific manner.  The mostly repetitive nature of the comments
is somewhat unfortunate but intentional to make it easy to retire
older versions.

Without this patch, the comment just before writeMemProf documents the
Version1 format, which is very confusing.
2024-05-16 13:26:13 -07:00
Kazu Hirata
0dc80e4b26
[memprof] Group MemProf data structures into a struct (NFC) (#92360)
This patch groups the three Memprof data structures into a struct
named IndexedMemProfData and teaches InstrProfWriter to use it.  This
way, we can pass IndexedMemProfData to writeMemProf and its helpers
instead of individual data structures.

As a follow-up, we can use the new struct in MemProfReader also.  That
in turn allows loadInput in llvm-profdata to move the MemProf data
into the writer context, saving a few seconds for a large MemProf
profile.
2024-05-16 10:35:45 -07:00
Ellis Hoag
c87b1ca4ed
[InstrProf] Fix bug when clearing traces with samples (#92310)
The `--temporal-profile-max-trace-length=0` flag in the `llvm-profdata
merge` command is used to remove traces from a profile. There was a bug
where traces would not be cleared if the profile was already sampled.
This patch fixes that.
2024-05-15 18:41:25 -05:00
Kazu Hirata
dc7834b76c [ProfileData] Use static_assert instead of assert (NFC)
Identified with misc-static-assert.
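
A small sketch of the kind of change this check suggests, using a hypothetical struct rather than the code touched by the commit: a condition that is known at compile time is verified with static_assert, so a violation becomes a build error instead of a runtime failure.

```cpp
#include <cstdint>

// Hypothetical type for illustration.
struct PatchEntry {
  uint64_t Pos;
  uint64_t Value;
};

// Checked by the compiler; no code is generated for it.
static_assert(sizeof(PatchEntry) == 2 * sizeof(uint64_t),
              "PatchEntry must hold exactly two 64-bit words");

constexpr bool isPowerOfTwo(uint64_t X) { return X != 0 && (X & (X - 1)) == 0; }

// A constexpr function can feed static_assert as well.
static_assert(isPowerOfTwo(sizeof(uint64_t)), "word size must be a power of two");
```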
2024-04-28 23:13:18 -07:00
Kazu Hirata
cb9589b227
[memprof] Move getFullSchema and getHotColdSchema outside PortableMemInfoBlock (#90103)
These functions do not operate on PortableMemInfoBlock.  This patch
moves them outside the class.
2024-04-25 12:12:28 -07:00
Kazu Hirata
4c8ec8f8bc
[memprof] Reduce schema for Version2 (#89876)
Currently, the compiler only uses several fields of MemoryInfoBlock.
Serializing all fields into the indexed MemProf file simply wastes
storage.

This patch limits the schema down to four fields for Version2 by
default.  It retains the old behavior of serializing all fields via:

  llvm-profdata merge --memprof-version=2 --memprof-full-schema

This patch reduces the size of the indexed MemProf profile I have by
40% (1.6GB down to 1.0GB).
2024-04-24 16:25:35 -07:00
Kazu Hirata
34dffc5e00
[memprof] Accept Schema in the constructor of RecordWriterTrait (NFC) (#89486)
The comment being deleted in this patch is not correct.  We already
construct an instance of RecordWriterTrait with Version.

This patch teaches the constructor of RecordWriterTrait to accept
Schema.  While I am at it, this patch makes Version a private
variable.
2024-04-20 10:55:12 -07:00
Kazu Hirata
8b24028a7e
[memprof] Use structured binding (NFC) (#89315)
2024-04-18 14:52:50 -07:00
Kazu Hirata
172f6ddfa7
[memprof] Add Version2 of the indexed MemProf format (#89100)
This patch adds Version2 of the indexed MemProf format.  The new
format comes with a hash table from CallStackId to actual call stacks
llvm::SmallVector<FrameId>.  The rest of the format refers to call
stacks with CallStackId.  This "values + references" model effectively
deduplicates call stacks.  With this patch, a large indexed memprof
file of mine shrinks from 4.4GB to 1.6GB, a 64% reduction.

This patch does not make Version2 generally available yet as I am
planning to make a few more changes to the format.
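
A rough sketch of the "values + references" idea, assuming a hypothetical FNV-1a hash in place of the real CallStackId computation (which this message does not describe) and an in-memory std::map in place of the on-disk hash table:

```cpp
#include <cstdint>
#include <map>
#include <vector>

using FrameId = uint64_t;
using CallStackId = uint64_t;

// Hypothetical ID function (64-bit FNV-1a over the frame bytes).
CallStackId hashCallStack(const std::vector<FrameId> &CS) {
  uint64_t H = 1469598103934665603ULL;
  for (FrameId F : CS)
    for (int I = 0; I < 8; ++I) {
      H ^= (F >> (8 * I)) & 0xff;
      H *= 1099511628211ULL;
    }
  return H;
}

struct CallStackTable {
  // "Values": each distinct call stack is stored once.
  std::map<CallStackId, std::vector<FrameId>> CallStacks;

  // "References": callers keep only the returned ID.
  CallStackId intern(const std::vector<FrameId> &CS) {
    CallStackId Id = hashCallStack(CS);
    CallStacks.try_emplace(Id, CS);  // no-op if the stack is already present
    return Id;
  }
};
```

Records that share a call stack then carry the same 64-bit ID instead of repeating the full vector of FrameIds.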
2024-04-18 14:12:58 -07:00
Kazu Hirata
83dc41992d
[memprof] Clean up writer traits (NFC) (#88549)
RecordWriter does not live past the end of writeMemProfRecords, so it
can be safely on stack.

The constructor of FrameWriter does not take any parameter, so we can
let OnDiskChainedHashTableGenerator::Emit (with a single parameter)
default-construct an instance of the writer trait inside Emit.
2024-04-12 11:14:20 -07:00
Kazu Hirata
568ec1340c
[memprof] Use structured binding (NFC) (#88096)
2024-04-09 08:25:41 -07:00
Kazu Hirata
4d1bb7699b
[memprof] Fix a typo in writeMemProfV1 (#87890)
This patch borrows memprof-merge.test to test --memprof-version.
2024-04-07 15:06:13 -07:00
Kazu Hirata
fd2a5c46d8
[memprof] Introduce writeMemProf (NFC) (#87698)
This patch refactors the serialization of MemProf data to a switch
statement style:

  switch (Version) {
  case Version0:
    return ...;
  case Version1:
    return ...;
  }

just like IndexedMemProfRecord::serialize.

A reasonable amount of code is shared and factored out to helper
functions between writeMemProfV0 and writeMemProfV1 to the extent that
doesn't hamper readability.
2024-04-04 13:36:56 -07:00
Kazu Hirata
f2d22b5944
[memprof] Make RecordWriterTrait a non-template class (#87604)
commit d89914f30bc7c180fe349a5aa0f03438ae6c20a4
  Author: Kazu Hirata <kazu@google.com>
  Date:   Wed Apr 3 21:48:38 2024 -0700

changed RecordWriterTrait to a template class with IndexedVersion as a
template parameter.  This patch changes the class back to a
non-template one while retaining the ability to serialize multiple
versions.

The reason I changed RecordWriterTrait to a template class was
because, even if RecordWriterTrait had IndexedVersion as a member
variable, RecordWriterTrait::EmitKeyDataLength, being a static
function, would not have access to the variable.

Since OnDiskChainedHashTableGenerator calls EmitKeyDataLength as:

  const std::pair<offset_type, offset_type> &Len =
      InfoObj.EmitKeyDataLength(Out, I->Key, I->Data);

we can make EmitKeyDataLength a member function, but we have one
problem.  InstrProfWriter::writeImpl calls:

  void insert(typename Info::key_type_ref Key,
              typename Info::data_type_ref Data) {
    Info InfoObj;
    insert(Key, Data, InfoObj);
  }

which default-constructs RecordWriterTrait without a specific version
number.  This patch fixes the problem by adjusting
InstrProfWriter::writeImpl to call the other form of insert instead:

  void insert(typename Info::key_type_ref Key,
              typename Info::data_type_ref Data, Info &InfoObj)

To prevent an accidental invocation of the default constructor of
RecordWriterTrait, this patch deletes the default constructor.
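
The interaction can be sketched with simplified, hypothetical classes. ExampleWriterTrait and ExampleTableGenerator below only mimic the shape of RecordWriterTrait and OnDiskChainedHashTableGenerator, and the length computation is made up:

```cpp
#include <cstdint>
#include <vector>

enum IndexedVersion { Version0, Version1 };  // hypothetical subset

class ExampleWriterTrait {
  IndexedVersion Version;

public:
  ExampleWriterTrait() = delete;  // a version must always be supplied
  explicit ExampleWriterTrait(IndexedVersion V) : Version(V) {}

  // A non-static member function, so it can consult Version.
  uint64_t dataLength(uint64_t NumFrames) const {
    return Version == Version0 ? 8 * NumFrames : 8;
  }
};

template <typename Info> class ExampleTableGenerator {
  std::vector<uint64_t> Lengths;

public:
  // Default-constructs Info; this overload cannot be instantiated for
  // ExampleWriterTrait because its default constructor is deleted.
  void insert(uint64_t NumFrames) {
    Info InfoObj;
    insert(NumFrames, InfoObj);
  }

  // The form the writer has to call: the caller supplies the trait instance.
  void insert(uint64_t NumFrames, Info &InfoObj) {
    Lengths.push_back(InfoObj.dataLength(NumFrames));
  }

  const std::vector<uint64_t> &lengths() const { return Lengths; }
};
```

Because member functions of a class template are instantiated only when used, the one-argument insert compiles as long as nothing calls it; a stray call turns into a compile error rather than a trait with an unspecified version.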
2024-04-04 10:09:43 -07:00
Kazu Hirata
d89914f30b
[memprof] Add Version2 of IndexedMemProfRecord serialization (#87455)
I'm currently developing a new version of the indexed memprof format
where we deduplicate call stacks in IndexedAllocationInfo::CallStack
and IndexedMemProfRecord::CallSites.  We refer to call stacks with
integer IDs, namely CallStackId, just as we refer to Frame with
FrameId.  The deduplication will cut down the profile file size by 80%
in a large memprof file of mine.

As a step toward the goal, this patch teaches
IndexedMemProfRecord::{serialize,deserialize} to speak Version2.  A
subsequent patch will add Version2 support to llvm-profdata.

The essence of the patch is to replace the serialization of a call
stack, a vector of FrameIDs, with that of a CallStackId.  That is:

  const IndexedAllocationInfo &N = ...;
  ...
  LE.write<uint64_t>(N.CallStack.size());
  for (const FrameId &Id : N.CallStack)
    LE.write<FrameId>(Id);

becomes:

  LE.write<CallStackId>(N.CSId);
2024-04-03 21:48:38 -07:00
Mingming Liu
1351d17826
[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (#66825)
(The profile format change is split into a standalone change into https://github.com/llvm/llvm-project/pull/81691)

* For InstrFDO value profiling, implement instrumentation and lowering for virtual table address.
* This is controlled by `-enable-vtable-value-profiling` and off by default.
* When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads.
 
* Implement profile reader and writer support 
  * Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols.
  * Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, the reader decompresses the string to show symbols of a profiled site. When used in compiler, string decompression doesn't happen since IR is used to construct InstrProfSymtab.
  * Indexed profile writer collects the list of vtable names, and stores that to indexed profiles.
  * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type.
* `llvm-profdata show -show-vtables <args> <profile>` is implemented.

rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
2024-04-01 08:52:35 -07:00
Muhammad Omair Javaid
80aa52d8c5 Revert "[ProfileData] Use size_t in PatchItem (NFC) (#87014)"
This reverts commit c64a328cb4a32e81f8b694162750ec1b8823994c.
This broke the 32-bit Arm build on various LLVM buildbots.
For example:
https://lab.llvm.org/buildbot/#/builders/17/builds/51129
2024-03-29 14:48:46 +05:00
Kazu Hirata
c64a328cb4
[ProfileData] Use size_t in PatchItem (NFC) (#87014)
size_t in PatchItem eliminates the need for casts.
2024-03-28 20:31:47 -07:00
Kazu Hirata
44253a9ce6
[memprof] Add MemProf version (#86414)
This patch adds a version field to the MemProf section of the indexed
profile format, calling the new version "version 1".  The existing
version is called "version 0".

The writer supports both versions via a command-line option:

  llvm-profdata merge --memprof-version=1 ...

The reader supports both versions by automatically detecting the
version from the header.
2024-03-28 14:29:34 -07:00
Kazu Hirata
4292086ed0
[ProfileData] Use ArrayRef in ProfOStream::patch (NFC) (#85317)
We always apply all of the items in PatchItems.  This patch simplifies
the interface of ProfOStream::patch by switching to ArrayRef.
2024-03-14 17:49:59 -07:00
Teresa Johnson
08ddd2ce40
[PGO] Add support for writing previous indexed format (#84505)
Enable temporary support to ease use of new llvm-profdata with slightly
older indexed profiles after 16e74fd48988ac95551d0f64e1b36f78a82a89a2,
which bumped the indexed format for type profiling.
2024-03-08 12:27:46 -08:00
Mingming Liu
16e74fd489
Reland "[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling." (#82711)
New change on top of [reviewed
patch](https://github.com/llvm/llvm-project/pull/81691) are [in commits
after this
one](d0757f46b3).
Previous commits are restored from the remote branch with timestamps.

1. Fix build breakage for non-ELF platforms, by defining the missing
functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`,
`__llvm_profile_begin_vtabnames`, `__llvm_profile_end_vtabnames`}
everywhere.
* Tested on mac laptop (for darwins) and Windows. Specifically,
functions in `InstrProfilingPlatformWindows.c` return `NULL` to make it
more explicit that type prof isn't supported; see comments for the
reason.
* For the rest (AIX, other), mostly follow existing examples (like this
[one](f95b2f1acf))
   
2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section
name, and make returned pointers
[const](a825d2a4ec (diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121))

**Original Description**

* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
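
The padding rule for the raw profile sections ("a multiple of 8") can be illustrated with a small helper. The function names are hypothetical and this is not the runtime's actual code:

```cpp
#include <cstdint>

// Number of padding bytes needed to round SizeInBytes up to a multiple of 8.
constexpr uint64_t paddingTo8(uint64_t SizeInBytes) {
  return (8 - SizeInBytes % 8) % 8;
}

// Section size after padding.
constexpr uint64_t paddedSize(uint64_t SizeInBytes) {
  return SizeInBytes + paddingTo8(SizeInBytes);
}

static_assert(paddedSize(13) == 16, "13 bytes need 3 bytes of padding");
static_assert(paddedSize(16) == 16, "already aligned sizes stay unchanged");
```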
  
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/writer changes and llvm-profdata changes.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
  
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2024-02-27 11:07:40 -08:00
Mingming Liu
0e8d1877cd
Revert type profiling change as compiler-rt test break on Windows. (#82583)
Examples
https://lab.llvm.org/buildbot/#/builders/127/builds/62532/steps/8/logs/stdio
2024-02-21 21:41:33 -08:00
Mingming Liu
4d73cbe863
[nfc]remove unused variable after pr/81691 (#82578)
* `N` became unused after [pull request 81691](https://github.com/llvm/llvm-project/pull/81691)
* This should fix the build bot failure of `unused variable`
https://lab.llvm.org/buildbot/#/builders/77/builds/34840
2024-02-21 21:10:47 -08:00
Mingming Liu
db7e9e6841
[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling. (#81691)
* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
  
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/writer changes and llvm-profdata changes.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
  
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2024-02-21 20:59:42 -08:00
Teresa Johnson
35a003c2b2
[MemProf][NFC] Clear each IndexedMemProfRecord after it is written (#75205)
The on-disk hash table for the memprof writer holds copies of all the
memprof records to be written. These hold a lot of memory in aggregate,
due to the lists of alloc sites (which each have a list of context
frames) and call sites. Clear each one after emitting it.

This drops the peak memory when writing a very large indexed memprof
profile by about 2.5G.
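
A simplified sketch of the pattern, with a hypothetical record type and byte encoding: each record releases its storage as soon as its bytes have been emitted, so peak memory no longer includes every record at once.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record; the real IndexedMemProfRecord holds richer data.
struct ExampleRecord {
  std::vector<uint64_t> AllocSites;
  std::vector<uint64_t> CallSites;

  // Swap with empty temporaries so the heap storage is actually released.
  void clear() {
    std::vector<uint64_t>().swap(AllocSites);
    std::vector<uint64_t>().swap(CallSites);
  }
};

// Append V to Out as 8 little-endian bytes.
void writeLE64(std::string &Out, uint64_t V) {
  for (int I = 0; I < 8; ++I)
    Out.push_back(static_cast<char>((V >> (8 * I)) & 0xff));
}

// Emit every record, freeing each one right after it is written.
std::size_t emitAll(std::vector<ExampleRecord> &Records, std::string &Out) {
  std::size_t Emitted = 0;
  for (ExampleRecord &R : Records) {
    for (uint64_t V : R.AllocSites)
      writeLE64(Out, V);
    for (uint64_t V : R.CallSites)
      writeLE64(Out, V);
    R.clear();
    ++Emitted;
  }
  return Emitted;
}
```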
2023-12-15 11:38:33 -08:00
Teresa Johnson
1a5299491a
[MemProf][NFC] Free large data structures after last use (#75120)
The MemProf InstrProfWriter uses a couple of MapVector for building the
lists of records it needs to write. Once its entries are all added to
the associated OnDiskChainedHashTableGenerator, it is no longer used.

Clearing these MapVectors, which grow quite large for large profiles,
saved 4G for a large memory profile.
2023-12-15 11:38:21 -08:00
Ellis Hoag
b0154c36d6
[InstrProf] Add pgo use block coverage test (#72443)
Back in https://reviews.llvm.org/D124490 we added a block coverage mode
that instruments a subset of basic blocks using single byte counters to
get coverage for the whole function.

This commit adds a test to make sure that we correctly assign branch
weights based on the coverage profile.

I noticed this test was missing after seeing that we had no coverage on
`PGOUseFunc::populateCoverage()`

https://lab.llvm.org/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp.html#L1383
2023-11-20 09:25:33 -06:00
Alan Phipps
f95b2f1acf Reland "[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)"
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.

Differential Revision: https://reviews.llvm.org/D138846
2023-10-30 11:15:02 -05:00
Kazu Hirata
02f67c097d Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class. This patch replaces
{big,little,native} with llvm::endianness::{big,little,native}.

This patch completes the migration to llvm::endianness and
llvm::endianness::{big,little,native}.  I'll post a separate patch to
remove the migration helpers in llvm/Support/Endian.h:

  using endianness = llvm::endianness;
  constexpr llvm::endianness big = llvm::endianness::big;
  constexpr llvm::endianness little = llvm::endianness::little;
  constexpr llvm::endianness native = llvm::endianness::native;
2023-10-13 23:16:25 -07:00
Kazu Hirata
4a0ccfa865 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-12 21:21:45 -07:00
Kazu Hirata
a9d5056862 Use llvm::endianness (NFC)
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form.  This patch replaces
support::endianness with llvm::endianness.
2023-10-10 21:54:15 -07:00
Mingming Liu
1c2634e316
[NFC]Rename InstrProf::getFuncName{,orExternalSymbol} to getFuncOrValName{,IfDefined} (#68240)
- This function looks up MD5ToNameMap to return a name for a given MD5.
https://github.com/llvm/llvm-project/pull/66825 adds MD5 of global
variable names into this map. So rename methods and update comments
2023-10-04 11:56:28 -07:00
Hans Wennborg
53a2923bf6 Revert "[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)"
This seems to cause Clang to crash, see comments on the code review. Reverting
until the problem can be investigated.

> Part 1 of 3. This includes the LLVM back-end processing and profile
> reading/writing components. compiler-rt changes are included.
>
> Differential Revision: https://reviews.llvm.org/D138846

This reverts commit a50486fd736ab2fe03fcacaf8b98876db77217a7.
2023-09-21 12:20:24 +02:00
Alan Phipps
a50486fd73 [InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.

Differential Revision: https://reviews.llvm.org/D138846
2023-09-19 17:07:23 -05:00
Fangrui Song
4c2980c1a3 [llvm-profdata] Stabilize iteration order for InstrProfWriter
If two functions are inserted to the same bucket, their order in the
serialized profile is dependent on StringMap iteration order, which is
not guaranteed to be deterministic.
(https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h).
Use a sort like we do in writeText.
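
A minimal sketch of the idea, assuming a std::unordered_map as a stand-in for StringMap and a hypothetical helper name: entries are copied out of the hash container and sorted by key before they are emitted, so the output no longer depends on iteration order.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, uint64_t>;

// Return the map's entries in a deterministic, name-sorted order.
std::vector<Entry>
sortedEntries(const std::unordered_map<std::string, uint64_t> &Funcs) {
  std::vector<Entry> Entries(Funcs.begin(), Funcs.end());
  std::sort(Entries.begin(), Entries.end(),
            [](const Entry &A, const Entry &B) { return A.first < B.first; });
  return Entries;
}
```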
2023-07-20 18:31:41 -07:00
Snehasish Kumar
4aabd19c06 [instrprof] Add an overload to accept raw_string_ostream.
Add an overload for InstrProfWriter::write so that users can emit the
buffer to a string. Also use this new overload for existing unit test
usecases.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D153904
2023-06-28 16:37:15 +00:00
Ellis Hoag
4bddef4117 [InstrProf][Temporal] Add weight field to traces
As discussed in [0], add a `weight` field to temporal profiling traces found in profiles. This allows users to use the `--weighted-input=` flag in the `llvm-profdata merge` command to weight traces from different scenarios differently.

Note that this is a breaking change, but since [1] landed very recently and there is no way to "use" this trace data, there should be no users of this feature. We believe it is acceptable to land this change without bumping the profile format version.

[0] https://reviews.llvm.org/D147812#4259507
[1] https://reviews.llvm.org/D147287

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D148150
2023-04-13 10:37:05 -07:00
Ellis Hoag
244be0b0de [InstrProf] Temporal Profiling
As described in [0], this extends IRPGO to support //Temporal Profiling//.

When `-pgo-temporal-instrumentation` is used we add the `llvm.instrprof.timestamp()` intrinsic to the entry of functions which in turn gets lowered to a call to the compiler-rt function `INSTR_PROF_PROFILE_SET_TIMESTAMP()`. A new field in the `llvm_prf_cnts` section stores each function's timestamp. Then in `llvm-profdata merge` we convert these function timestamps into a //trace// and add it to the indexed profile.

Since these traces could significantly increase the profile size, we've added `-max-temporal-profile-trace-length` and `-temporal-profile-trace-reservoir-size` to limit the length of a trace and the number of traces in a profile, respectively.
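
The two limits can be sketched with textbook reservoir sampling (Algorithm R). The class below is hypothetical and is not taken from the patch; in particular, truncating over-long traces and seeding the generator explicitly are assumptions made for the example:

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

using Trace = std::vector<uint64_t>;  // e.g. a sequence of function IDs

class TraceReservoir {
  std::size_t Capacity;   // maximum number of traces kept
  std::size_t MaxLength;  // maximum length of a single trace
  uint64_t Seen = 0;      // number of traces offered so far
  std::mt19937_64 RNG;
  std::vector<Trace> Traces;

public:
  TraceReservoir(std::size_t Capacity, std::size_t MaxLength, uint64_t Seed)
      : Capacity(Capacity), MaxLength(MaxLength), RNG(Seed) {}

  void add(Trace T) {
    if (T.size() > MaxLength)
      T.resize(MaxLength);  // assumption: long traces are truncated
    ++Seen;
    if (Traces.size() < Capacity) {
      Traces.push_back(std::move(T));
      return;
    }
    // Keep the new trace with probability Capacity / Seen by drawing a slot
    // in [0, Seen) and replacing only if the slot lies inside the reservoir.
    std::uniform_int_distribution<uint64_t> Dist(0, Seen - 1);
    uint64_t Slot = Dist(RNG);
    if (Slot < Capacity)
      Traces[Slot] = std::move(T);
  }

  const std::vector<Trace> &traces() const { return Traces; }
  uint64_t seen() const { return Seen; }
};
```

With this scheme every trace offered so far has the same chance of being in the reservoir, while memory stays bounded by Capacity times MaxLength entries.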

In a future diff we plan to use these traces to construct an optimized function order to reduce the number of page faults during startup.

Special thanks to Julian Mestre for helping with reservoir sampling.

[0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D147287
2023-04-11 08:30:52 -07:00
Kazu Hirata
b595eb83e5 [llvm] Use *{Set,Map}::contains (NFC)
2023-03-14 18:56:07 -07:00
Gulfem Savrun Yeniceri
1ae7d83803 [profile] Add binary ids into indexed profiles
This patch adds support for including binary ids in an indexed profile.
It adds a new field into the header that points to the offset of the
binary id section. The binary id section consists of a size of the
section, and a list of binary ids (if they are present) that consist
of two parts: length and data.
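
As a sketch of the layout just described: the field widths are an assumption (64-bit little-endian values, no padding), since this message does not specify them, and the function names are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

using BinaryId = std::vector<uint8_t>;

// Append V to Out as 8 little-endian bytes (assumed field width).
void appendLE64(std::string &Out, uint64_t V) {
  for (int I = 0; I < 8; ++I)
    Out.push_back(static_cast<char>((V >> (8 * I)) & 0xff));
}

// Section = [size of what follows][length, data][length, data]...
std::string encodeBinaryIds(const std::vector<BinaryId> &Ids) {
  std::string Body;
  for (const BinaryId &Id : Ids) {
    appendLE64(Body, Id.size());         // part 1: length
    Body.append(Id.begin(), Id.end());   // part 2: data
  }
  std::string Section;
  appendLE64(Section, Body.size());      // size of the section
  Section += Body;
  return Section;
}
```

With no binary ids present, the section consists of the size field alone, holding zero.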

This patch guarantees that the indexed profile is backwards compatible
after adding binary ids.

Differential Revision: https://reviews.llvm.org/D135929
2022-12-29 18:46:56 +00:00