182 Commits

Author SHA1 Message Date
Kazu Hirata
bda02096d3
[ProfileData] Add InstrProfWriter::writeBinaryIds (NFC) (#118754)
The patch makes InstrProfWriter::writeImpl less monolithic by adding
InstrProfWriter::writeBinaryIds to serialize binary IDs.  This way,
InstrProfWriter::writeImpl can simply call the new function instead of
handling all the details within writeImpl.
2024-12-05 08:39:27 -08:00
ronryvchin
ff281f7d37
[PGO] Add option to always instrumenting loop entries (#116789)
This patch extends the PGO infrastructure with an option to prefer the
instrumentation of loop entry blocks.
This option is a generalization of
19fb5b467b,
and helps to cover cases where the loop exit is never executed.
One example where this can occur is an event-handling loop.

Note that this change does NOT change the default behavior.
2024-12-04 07:56:46 +01:00
Kazu Hirata
ff7b42c194
[memprof] Speed up llvm-profdata (#117446)
CallStackRadixTreeBuilder::build takes the parameter
MemProfFrameIndexes by value, involving copies:

  std::optional<const llvm::DenseMap<FrameIdTy, LinearFrameId>>
    MemProfFrameIndexes

Then "build" makes another copy of MemProfFrameIndexes and passes it to
encodeCallStack for every call stack, which is painfully slow.

This patch changes the type to a pointer so that we don't have to make
a copy every time we pass the argument.

Without this patch, it takes 553 seconds to run "llvm-profdata merge"
on a large MemProf raw profile.  This patch shortens that down to 67
seconds.
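
The cost difference between the two parameter types can be sketched as
follows.  This is an illustration only, not the code from the patch:
std::unordered_map stands in for llvm::DenseMap, and CountingFrameMap,
encodeByValue, encodeByPointer, and countMapCopies are hypothetical
names introduced here to make the copies observable.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>

using FrameId = std::uint64_t;
using LinearFrameId = std::uint32_t;

// Global counter so the sketch can observe how many map copies were made.
static int NumMapCopies = 0;

// Hypothetical stand-in for llvm::DenseMap<FrameId, LinearFrameId> that
// counts how often it is copied.
struct CountingFrameMap {
  std::unordered_map<FrameId, LinearFrameId> Map;
  CountingFrameMap() = default;
  CountingFrameMap(const CountingFrameMap &Other) : Map(Other.Map) {
    ++NumMapCopies;
  }
};

// Before: the by-value optional owns a fresh copy of the map on every call.
LinearFrameId encodeByValue(std::optional<const CountingFrameMap> Indexes,
                            FrameId Id) {
  return Indexes ? Indexes->Map.at(Id) : static_cast<LinearFrameId>(Id);
}

// After: a nullable pointer refers to the caller's map, so nothing is copied.
LinearFrameId encodeByPointer(const CountingFrameMap *Indexes, FrameId Id) {
  return Indexes ? Indexes->Map.at(Id) : static_cast<LinearFrameId>(Id);
}

// Encode the same frame once per call stack and report the number of copies.
int countMapCopies(bool UsePointer, int NumCallStacks) {
  CountingFrameMap Indexes;
  Indexes.Map[100] = 0;
  NumMapCopies = 0;
  for (int I = 0; I < NumCallStacks; ++I) {
    if (UsePointer)
      encodeByPointer(&Indexes, 100);
    else
      encodeByValue(Indexes, 100);
  }
  return NumMapCopies;
}
```

In this sketch the by-value variant copies the whole map once per call
stack, while the pointer variant copies nothing.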
2024-11-24 21:08:54 -08:00
Kazu Hirata
9e3215ac16
[memprof] Add an assert to InstrProfWriter::addMemProfData (#117426)
This patch adds a quick validity check to
InstrProfWriter::addMemProfData.  Specifically, we check to see if we
have all (or none) of the MemProf profile components (frames, call
stacks, records).
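
The "all or none" condition can be sketched as a small predicate.  The
function name hasAllOrNone and the use of element counts are
assumptions made for this illustration; the actual assert in
InstrProfWriter::addMemProfData may be written differently.

```cpp
#include <cassert>
#include <cstddef>

// True when the three components are either all empty or all non-empty.
bool hasAllOrNone(std::size_t NumFrames, std::size_t NumCallStacks,
                  std::size_t NumRecords) {
  const bool AllEmpty =
      NumFrames == 0 && NumCallStacks == 0 && NumRecords == 0;
  const bool NoneEmpty =
      NumFrames != 0 && NumCallStacks != 0 && NumRecords != 0;
  return AllEmpty || NoneEmpty;
}
```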

The credit goes to Teresa Johnson for suggesting this assert.
2024-11-24 21:07:59 -08:00
Kazu Hirata
ad2bdd8fab
[memprof] Remove MemProf format Version 1 (#117357)
This patch removes MemProf format Version 1 now that Version 2 and 3
are working well.
2024-11-22 11:53:31 -08:00
Teresa Johnson
e14827f082
[MemProf] Templatize CallStackRadixTreeBuilder (NFC) (#117014)
Prepare for usage in the bitcode reader/writer where we already have a
LinearFrameId:
- templatize input frame id type in CallStackRadixTreeBuilder
- templatize input frame id type in computeFrameHistogram
- make the map from FrameId to LinearFrameId optional

We plan to use the same radix format in the ThinLTO summary records,
where we already have a LinearFrameId.
2024-11-20 10:08:58 -08:00
Kazu Hirata
4f1b20f023
[ProfileData] Remove unused includes (NFC) (#116751)
Identified with misc-include-cleaner.
2024-11-19 19:42:20 -08:00
Kazu Hirata
f97c610d1f
[memprof] Add MemProfReader::takeMemProfData (#116769)
This patch adds MemProfReader::takeMemProfData, a function to return
the complete MemProf profile from the reader.  We can directly pass
its return value to InstrProfWriter::addMemProfData without having to
deal with the individual components of the MemProf profile.  The new
function is named "take", but it doesn't do std::move yet because of
type differences (DenseMap vs. MapVector).

The end state I'm trying to get to is roughly as follows:

- MemProfReader accepts IndexedMemProfData as a parameter as opposed
  to the three individual components (frames, call stacks, and
  records).

- MemProfReader keeps IndexedMemProfData as a class member without
  decomposing it into its individual components.

- MemProfReader returns IndexedMemProfData like:

  IndexedMemProfData takeMemProfData() {
    return std::move(MemProfData);
  }
2024-11-19 19:33:26 -08:00
Kazu Hirata
6bf8f08989
[memprof] Add InstrProfWriter::addMemProfData (#116528)
This patch adds InstrProfWriter::addMemProfData, which adds the
complete MemProf profile (frames, call stacks, and records) to the
writer context.

Without this function, functions like loadInput in llvm-profdata.cpp
and InstrProfWriter::mergeRecordsFromWriter must add one item (frame,
call stack, or record) at a time.  The new function std::moves the
entire MemProf profile to the writer context if the destination is
empty, which is the common use case.  Otherwise, we fall back to
adding one item at a time behind the scenes.

Here are a couple of reasons why we should add this function:

- We've had a bug where we forgot to add one of the three data
  structures (frames, call stacks, and records) to the writer context,
  resulting in a nearly empty indexed profile.  We should always
  package the three data structures together, especially on API
  boundaries.

- We expose a little too much of the MemProf detail to
  InstrProfWriter.  I'd like to gradually transform
  InstrProfReader/Writer to entities managing buffers (sequences of
  bytes), with actual serialization/deserialization left to external
  classes.  We already do some of this in InstrProfReader, where
  InstrProfReader "contracts out" to IndexedMemProfReader to handle
  MemProf details.

I am not changing loadInput or InstrProfWriter::mergeRecordsFromWriter
for now because MemProfReader uses DenseMap for frames and call
stacks, whereas MemProfData uses MapVector.  I'll resolve these
mismatches in subsequent patches.
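
The "move if the destination is empty, otherwise add one item at a
time" strategy can be sketched as follows.  ToyMemProfData and
addToyMemProfData are hypothetical simplifications (std::map and
strings instead of the real MemProf types), not the signature of
InstrProfWriter::addMemProfData.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the three MemProf components, kept as one package.
struct ToyMemProfData {
  std::map<std::uint64_t, std::string> Frames;
  std::map<std::uint64_t, std::vector<std::uint64_t>> CallStacks;
  std::map<std::uint64_t, std::string> Records;

  bool empty() const {
    return Frames.empty() && CallStacks.empty() && Records.empty();
  }
};

// Returns true if the fast path (a single move) was taken.
bool addToyMemProfData(ToyMemProfData &Dest, ToyMemProfData Incoming) {
  if (Dest.empty()) {
    // Common case: take ownership of the whole package at once.
    Dest = std::move(Incoming);
    return true;
  }
  // Fallback: add items one at a time; entries already present are kept.
  for (const auto &KV : Incoming.Frames)
    Dest.Frames.insert(KV);
  for (const auto &KV : Incoming.CallStacks)
    Dest.CallStacks.insert(KV);
  for (const auto &KV : Incoming.Records)
    Dest.Records.insert(KV);
  return false;
}
```

Passing the three components as one struct also makes it impossible to
forget one of them at the call site, which is the first motivation
listed above.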
2024-11-18 08:56:25 -08:00
Kazu Hirata
0d38f64e7d
[memprof] Remove MemProf format Version 0 (#116442)
This patch removes MemProf format Version 0 now that version 2 and 3
seem to be working well.

I'm not touching version 1 for now because some tests still rely on
version 1.

Note that Version 0 is identical to Version 1 except that the MemProf
section of the indexed format has a MemProf version field.
2024-11-15 15:37:00 -08:00
Kazu Hirata
59da1afd2a
[memprof] Speed up caller-callee pair extraction (#116184)
We know that the MemProf profile has a lot of duplicate call stacks.
Extracting caller-callee pairs from a call stack we've seen before is
a wasteful effort.

This patch makes the extraction more efficient by first coming up with
a work list of linear call stack IDs -- the set of starting positions
in the radix tree array -- and then extracting caller-callee pairs from
each call stack in the work list.

We implement the work list as a bit vector because we expect the work
list to be dense in the range [0, RadixTreeSize).  Also, we want the
set insertion to be cheap.

Without this patch, it takes 25 seconds to extract caller-callee pairs
from a large MemProf profile.  This patch shortens that down to 4
seconds.
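
The bit-vector work list can be sketched as follows.  This is a
minimal illustration under assumptions: std::vector<bool> stands in
for whatever bit vector the patch uses, and buildWorkList is a
hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using LinearCallStackId = std::uint32_t;

// Collect each distinct starting position exactly once.  Duplicate
// references to the same call stack collapse into a single set bit.
std::vector<LinearCallStackId>
buildWorkList(const std::vector<LinearCallStackId> &References,
              std::size_t RadixTreeSize) {
  std::vector<bool> Seen(RadixTreeSize, false);
  for (LinearCallStackId Id : References) {
    assert(Id < RadixTreeSize && "starting position out of range");
    Seen[Id] = true; // Cheap set insertion: one bit write.
  }

  std::vector<LinearCallStackId> WorkList;
  for (std::size_t I = 0; I < RadixTreeSize; ++I)
    if (Seen[I])
      WorkList.push_back(static_cast<LinearCallStackId>(I));
  return WorkList;
}
```

Caller-callee pairs would then be extracted once per entry of the work
list rather than once per reference.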
2024-11-14 15:54:55 -08:00
Teresa Johnson
475e736bb5
[MemProf] Include <ctime> to avoid MSVC failure (#114246)
My change in bb3915149a7c9b1660db9caebfc96343352e8454 added a call to
std::time which worked generally as there must be some transitive
include of <ctime>. However, I saw one MSVC bot failure:

InstrProfWriter.cpp(202): error C2039: 'time': is not a member of 'std'

from https://lab.llvm.org/buildbot/#/builders/63/builds/2325.

Presumably explicitly including <ctime> should fix this.
2024-10-30 08:28:22 -07:00
Teresa Johnson
bb3915149a
[MemProf] Support for random hotness when writing profile (#113998)
Add support for generating random hotness in the memprof profile writer,
to be used for testing. The random seed is printed to stderr, and an
additional option enables providing a specific seed in order to
reproduce a particular random profile.
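
The seed handling can be sketched as follows.  This is an assumption-
laden illustration: pickSeed and randomHotness are hypothetical names,
std::mt19937 is swapped in as the generator, and the message printed
to stderr is made up; none of these are taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <iostream>
#include <optional>
#include <random>
#include <vector>

// Use the requested seed if one was given; otherwise derive one from the
// current time.  Either way, print it so the run can be reproduced.
unsigned pickSeed(std::optional<unsigned> RequestedSeed) {
  const unsigned Seed = RequestedSeed
                            ? *RequestedSeed
                            : static_cast<unsigned>(std::time(nullptr));
  std::cerr << "random hotness seed: " << Seed << "\n";
  return Seed;
}

// Assign a pseudo-random hot/cold flag to each allocation.
std::vector<bool> randomHotness(std::size_t NumAllocations, unsigned Seed) {
  std::mt19937 Gen(Seed);
  std::vector<bool> IsHot;
  IsHot.reserve(NumAllocations);
  for (std::size_t I = 0; I < NumAllocations; ++I)
    IsHot.push_back((Gen() & 1u) != 0);
  return IsHot;
}
```

Re-running with the printed seed yields the same sequence of flags,
which is what makes a particular random profile reproducible.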
2024-10-29 22:10:33 -07:00
Kazu Hirata
0cfd03ac0d
[ProfileData] Use ArrayRef in PatchItem (NFC) (#97379)
Packaging an array and its size as ArrayRef in PatchItem allows us to
get rid of things like std::size(Header) and HeaderOffsets.size().
2024-07-02 22:58:26 -07:00
Kazu Hirata
773ee62e16
[memprof] Rename the members of IndexedMemProfData (NFC) (#94873)
I'm planning to use IndexedMemProfData in MemProfReader and beyond.
Before I do so, this patch renames the members of IndexedMemProfData
as MemProfData.FrameData is a bit of a mouthful with "Data" repeated twice.

Note that MemProfReader currently has a trio -- IdToFrame,
CSIdToCallStack, and FunctionProfileData.  Replacing them with an
instance of IndexedMemProfData allows us to use the move semantics
from the reader to the writer context.  More importantly, treating the
profile data as one package makes the maintenance easier.  In the
past, forgetting to update a place dealing with the trio has resulted
in a bug where we totally forgot to emit call stacks into the indexed
profile.
2024-06-18 14:23:59 -07:00
Kazu Hirata
3d2bbea370
[ProfileData] Clean up validateRecord (#95488)
validateRecord ensures that all the values are unique except for
IPVK_IndirectCallTarget and IPVK_VTableTarget.  The problem is that we
exclude them in the innermost loop.

This patch pulls the loop invariant out of the loop.  While I am at
it, this patch migrates a use of getValueForSite to
getValueArrayForSite.
2024-06-18 13:06:43 -07:00
Kazu Hirata
7c6d0d26b1
[llvm] Use llvm::unique (NFC) (#95628)
2024-06-14 22:49:36 -07:00
Kazu Hirata
9ad102f03b
[ProfileData] Migrate to getValueArrayForSite (#95493)
This patch migrates uses of getValueForSite to getValueArrayForSite.
Each hunk is self-contained, meaning that each one can be applied
independently of the others.

In the unit test, there are cases where the array length check is
performed a lot earlier than the array content check.  For now, I'm
leaving the length checks where they are.  I'll consider moving them
when I migrate uses of getNumValueDataForSite to getValueArrayForSite
in a follow-up patch.
2024-06-14 06:38:48 -07:00
Kazu Hirata
dc3f8c2f58
[memprof] Improve deserialization performance in V3 (#94787)
We call llvm::sort in a couple of places in the V3 encoding:

- We sort Frames by FrameIds for stability of the output.

- We sort call stacks in the dictionary order to maximize the length
  of the common prefix between adjacent call stacks.

It turns out that we can improve the deserialization performance by
modifying the comparison functions -- without changing the format at
all.  Both places take advantage of the histogram of Frames -- how
many times each Frame occurs in the call stacks.

- Frames: We serialize popular Frames in the descending order of
  popularity for improved cache locality.  For two equally popular
  Frames, we break a tie by serializing one that tends to appear
  earlier in call stacks.  Here, "earlier" means a smaller index
  within llvm::SmallVector<FrameId>.

- Call Stacks: We sort the call stacks to reduce the number of times
  we follow pointers to parents during deserialization.  Specifically,
  instead of comparing two call stacks in the strcmp style -- integer
  comparisons of FrameIds, we compare two FrameIds F1 and F2 with
  Histogram[F1] < Histogram[F2] at respective indexes.  Since we
  encode from the end of the sorted list of call stacks, we tend to
  encode popular call stacks first.

Since the two places use the same histogram, we compute it once and
share it in the two places.

Sorting the call stacks reduces the number of "jumps" by 74% when we
deserialize all MemProfRecords.  The cycle and instruction counts go
down by 10% and 1.5%, respectively.

If we sort the Frames in addition to the call stacks, then the cycle
and instruction counts go down by 14% and 1.6%, respectively, relative
to the same baseline (that is, without this patch).
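
The histogram-driven ordering of Frames can be sketched as follows.
This is a simplified illustration, not the comparison function from
the patch: countFrameUses and orderFramesByPopularity are hypothetical
names, and ties are broken by the smaller FrameId here rather than by
the position at which a Frame tends to appear in call stacks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using FrameId = std::uint64_t;
using CallStack = std::vector<FrameId>;

// Count how many times each Frame occurs across all call stacks.
std::map<FrameId, unsigned>
countFrameUses(const std::vector<CallStack> &CallStacks) {
  std::map<FrameId, unsigned> Histogram;
  for (const CallStack &CS : CallStacks)
    for (FrameId F : CS)
      ++Histogram[F];
  return Histogram;
}

// Order Frames by descending popularity.  The map iterates in FrameId
// order and stable_sort preserves it, so ties go to the smaller FrameId
// (a simplification of the tie-break described above).
std::vector<FrameId>
orderFramesByPopularity(const std::vector<CallStack> &CallStacks) {
  const std::map<FrameId, unsigned> Histogram = countFrameUses(CallStacks);
  std::vector<FrameId> Frames;
  for (const auto &KV : Histogram)
    Frames.push_back(KV.first);
  std::stable_sort(Frames.begin(), Frames.end(),
                   [&Histogram](FrameId A, FrameId B) {
                     return Histogram.at(A) > Histogram.at(B);
                   });
  return Frames;
}
```

The same histogram can be reused by the call stack comparison, which
is why the patch computes it once and shares it.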
2024-06-07 17:25:57 -07:00
Kazu Hirata
bfa937a487
[ProfileData] Add const to a few places (NFC) (#94803)
2024-06-07 15:06:04 -07:00
Kazu Hirata
c348e265bd
[memprof] Use CallStackRadixTreeBuilder in the V3 format (#94708)
This patch integrates CallStackRadixTreeBuilder into the V3 format,
reducing the profile size to about 27% of the V2 profile size.

- Serialization: writeMemProfCallStackArray just needs to write out
  the radix tree array prepared by CallStackRadixTreeBuilder.
  Mappings from CallStackIds to LinearCallStackIds are moved by new
  function CallStackRadixTreeBuilder::takeCallStackPos.

- Deserialization: Deserializing a call stack is the same as
  deserializing an array encoded in the obvious manner -- the length
  followed by the payload, except that we need to follow a pointer to
  the parent to take advantage of common prefixes once in a while.
  This patch teaches LinearCallStackIdConverter how to handle those
  pointers.
2024-06-07 07:19:36 -07:00
Kazu Hirata
bba5ee47e6
[memprof] Introduce memprof::LinearFrameId (NFC) (#94057)
This patch introduces memprof::LinearFrameId, which is a frame version
of memprof::LinearCallStackId.
2024-05-31 15:29:44 -07:00
Kazu Hirata
9a8b73c741
[memprof] Replace uint32_t with LinearCallStackId where appropriate (NFC) (#94023)
This patch replaces uint32_t with LinearCallStackId where appropriate.
I'm replacing uint64_t with LinearCallStackId in
writeMemProfCallStackArray, but that's OK because it's a value to be
used as LinearCallStackId anyway.
2024-05-31 14:41:05 -07:00
Kazu Hirata
90acfbf90d
[memprof] Use linear IDs for Frames and call stacks (#93740)
With this patch, we stop using on-disk hash tables for Frames and call
stacks.  Instead, we'll write out all the Frames as a flat array while
maintaining mappings from FrameIds to the indexes into the array.
Then we serialize call stacks in terms of those indexes.

Likewise, we'll write out all the call stacks as another flat array
while maintaining mappings from CallStackIds to the indexes into the
call stack array.  One minor difference from Frames is that the
indexes into the call stack array are not contiguous because call
stacks are variable-length objects.

Then we serialize IndexedMemProfRecords in terms of the indexes
into the call stack array.

Now, we describe each call stack with 32-bit indexes into the Frame
array (as opposed to the 64-bit FrameIds in Version 2).  The use of
the smaller type cuts down the profile file size by about 40% relative
to Version 2.  The departure from the on-disk hash tables contributes
a little bit to the savings, too.

For now, IndexedMemProfRecords refer to call stacks with 64-bit
indexes into the call stack array.  As a follow-up, I'll change that
to uint32_t, including necessary updates to RecordWriterTrait.
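
The flat-array layout can be sketched as follows.  This is an
illustration under assumptions: FlatArrays and flatten are
hypothetical names, Frames are reduced to their FrameIds, linear IDs
are assigned on first use, and each call stack is written as a length
followed by its Frame indexes.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using FrameId = std::uint64_t;
using LinearFrameId = std::uint32_t;
using LinearCallStackId = std::uint32_t;

struct FlatArrays {
  std::vector<FrameId> FrameArray;                // Index is the LinearFrameId.
  std::vector<std::uint32_t> CallStackArray;      // [length, index...] repeated.
  std::vector<LinearCallStackId> CallStackStarts; // Start of each call stack.
};

FlatArrays flatten(const std::vector<std::vector<FrameId>> &CallStacks) {
  FlatArrays Out;
  std::unordered_map<FrameId, LinearFrameId> FrameToIndex;
  for (const auto &CS : CallStacks) {
    // Call stacks are variable-length, so their starting positions in the
    // array are not contiguous.
    Out.CallStackStarts.push_back(
        static_cast<LinearCallStackId>(Out.CallStackArray.size()));
    Out.CallStackArray.push_back(static_cast<std::uint32_t>(CS.size()));
    for (FrameId F : CS) {
      // Assign the next 32-bit index the first time a Frame is seen.
      auto Inserted = FrameToIndex.emplace(
          F, static_cast<LinearFrameId>(Out.FrameArray.size()));
      if (Inserted.second)
        Out.FrameArray.push_back(F);
      Out.CallStackArray.push_back(Inserted.first->second);
    }
  }
  return Out;
}
```

Each Frame reference costs four bytes in CallStackArray instead of the
eight bytes of a FrameId, which is where the size reduction described
above comes from.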
2024-05-30 14:28:22 -07:00
Kazu Hirata
99b9ab45cd
[memprof] Reorder MemProf sections in profile (#93640)
This patch teaches the V3 format to serialize Frames, call stacks, and
IndexedMemProfRecords, in that order.

I'm planning to use linear IDs for Frames.  That is, Frames will be
numbered 0, 1, 2, and so on in the order we serialize them.  In turn,
we will serialize the call stacks in terms of those linear IDs.

Likewise, I'm planning to use linear IDs for call stacks and then
serialize IndexedMemProfRecords in terms of those linear IDs for call
stacks.

With the new order, we can successively free data structures as we
serialize them.  That is, once we serialize Frames, we can free the
Frames' data proper and just retain mappings from FrameIds to linear
IDs.  A similar story applies to call stacks.
2024-05-29 12:18:24 -07:00
Mingming Liu
c54657887b
[nfc][InstrProfWriter]Store header fields in a vector and back patch once (#93594)
This is a split of https://github.com/llvm/llvm-project/pull/93346 as
discussed.
2024-05-29 10:50:44 -07:00
Kazu Hirata
9e89d107a6
[memprof] Add MemProf format Version 3 (#93608)
This patch adds Version 3 for development purposes.  For now, this
patch adds V3 as a copy of V2.

For the most part, this patch adds "case Version3:" wherever "case
Version2:" appears.  One exception is writeMemProfV3, which is copied
from writeMemProfV2 but updated to write out memprof::Version3 to the
MemProf header.  We'll incrementally modify writeMemProfV3 in
subsequent patches.
2024-05-28 13:30:00 -07:00
Mingming Liu
beac910c3b
[nfc][InstrProfWriter]Wrap vtable writes in a method. (#93081)
- This way `InstrProfWriter::writeImpl` itself is simpler.
2024-05-22 21:10:09 -07:00
Kazu Hirata
2375921d67
[ProfileData] Use default member initializations (NFC) (#93120)
This patch uses default member initializations for all the fields in
Header.  The intent is to prevent accidental uninitialized fields and
reduce the number of times we need to mention each member variable.
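
As a minimal sketch of the idiom (ToyHeader and its three fields are
hypothetical and do not reproduce the real Header):

```cpp
#include <cassert>
#include <cstdint>

// Every field carries its own initializer, so a default-constructed
// header never contains indeterminate values.
struct ToyHeader {
  std::uint64_t Magic = 0;
  std::uint64_t Version = 0;
  std::uint64_t HashOffset = 0;
};
```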
2024-05-22 20:38:57 -07:00
Mingming Liu
7b977e0f64
[nfc][InstrFDO]Encapsulate header writes in a class member function (#90142)
The smaller class member functions are more focused and easier to maintain. This
also paves the way for partial header forward compatibility in
https://github.com/llvm/llvm-project/pull/88212

---------

Co-authored-by: Kazu Hirata <kazu@google.com>
2024-05-18 21:51:14 -07:00
Kazu Hirata
479f4a7b68
[memprof] Update comments for writeMemProf and its helpers (#92446)
This patch adds comments for writeMemProf{V0,V1,V2} in a
version-specific manner.  The mostly repetitive nature of the comments
is somewhat unfortunate but intentional to make it easy to retire
older versions.

Without this patch, the comment just before writeMemProf documents the
Version1 format, which is very confusing.
2024-05-16 13:26:13 -07:00
Kazu Hirata
0dc80e4b26
[memprof] Group MemProf data structures into a struct (NFC) (#92360)
This patch groups the three Memprof data structures into a struct
named IndexedMemProfData and teaches InstrProfWriter to use it.  This
way, we can pass IndexedMemProfData to writeMemProf and its helpers
instead of individual data structures.

As a follow-up, we can use the new struct in MemProfReader also.  That
in turn allows loadInput in llvm-profdata to move the MemProf data
into the writer context, saving a few seconds for a large MemProf
profile.
2024-05-16 10:35:45 -07:00
Ellis Hoag
c87b1ca4ed
[InstrProf] Fix bug when clearing traces with samples (#92310)
The `--temporal-profile-max-trace-length=0` flag in the `llvm-profdata
merge` command is used to remove traces from a profile. There was a bug
where traces would not be cleared if the profile was already sampled.
This patch fixes that.
2024-05-15 18:41:25 -05:00
Kazu Hirata
dc7834b76c
[ProfileData] Use static_assert instead of assert (NFC)
Identified with misc-static-assert.
2024-04-28 23:13:18 -07:00
Kazu Hirata
cb9589b227
[memprof] Move getFullSchema and getHotColdSchema outside PortableMemInfoBlock (#90103)
These functions do not operate on PortableMemInfoBlock.  This patch
moves them outside the class.
2024-04-25 12:12:28 -07:00
Kazu Hirata
4c8ec8f8bc
[memprof] Reduce schema for Version2 (#89876)
Currently, the compiler only uses several fields of MemoryInfoBlock.
Serializing all fields into the indexed MemProf file simply wastes
storage.

This patch limits the schema down to four fields for Version2 by
default.  It retains the old behavior of serializing all fields via:

  llvm-profdata merge --memprof-version=2 --memprof-full-schema

This patch reduces the size of the indexed MemProf profile I have by
40% (1.6GB down to 1.0GB).
2024-04-24 16:25:35 -07:00
Kazu Hirata
34dffc5e00
[memprof] Accept Schema in the constructor of RecordWriterTrait (NFC) (#89486)
The comment being deleted in this patch is not correct.  We already
construct an instance of RecordWriterTrait with Version.

This patch teaches the constructor of RecordWriterTrait to accept
Schema.  While I am at it, this patch makes Version a private
variable.
2024-04-20 10:55:12 -07:00
Kazu Hirata
8b24028a7e
[memprof] Use structured binding (NFC) (#89315)
2024-04-18 14:52:50 -07:00
Kazu Hirata
172f6ddfa7
[memprof] Add Version2 of the indexed MemProf format (#89100)
This patch adds Version2 of the indexed MemProf format.  The new
format comes with a hash table from CallStackId to actual call stacks
llvm::SmallVector<FrameId>.  The rest of the format refers to call
stacks with CallStackId.  This "values + references" model effectively
deduplicates call stacks.  Without this patch, a large indexed memprof
file of mine shrinks from 4.4GB to 1.6GB with this patch, a 64% reduction.

This patch does not make Version2 generally available yet as I am
planning to make a few more changes to the format.
2024-04-18 14:12:58 -07:00
Kazu Hirata
83dc41992d
[memprof] Clean up writer traits (NFC) (#88549)
RecordWriter does not live past the end of writeMemProfRecords, so it
can be safely on stack.

The constructor of FrameWriter does not take any parameter, so we can
let OnDiskChainedHashTableGenerator::Emit (with a single parameter)
default-construct an instance of the writer trait inside Emit.
2024-04-12 11:14:20 -07:00
Kazu Hirata
568ec1340c
[memprof] Use structured binding (NFC) (#88096)
2024-04-09 08:25:41 -07:00
Kazu Hirata
4d1bb7699b
[memprof] Fix a typo in writeMemProfV1 (#87890)
This patch borrows memprof-merge.test to test --memprof-version.
2024-04-07 15:06:13 -07:00
Kazu Hirata
fd2a5c46d8
[memprof] Introduce writeMemProf (NFC) (#87698)
This patch refactors the serialization of MemProf data to a switch
statement style:

  switch (Version) {
  case Version0:
    return ...;
  case Version1:
    return ...;
  }

just like IndexedMemProfRecord::serialize.

A reasonable amount of code is shared and factored out to helper
functions between writeMemProfV0 and writeMemProfV1 to the extent that
doesn't hamper readability.
2024-04-04 13:36:56 -07:00
Kazu Hirata
f2d22b5944
[memprof] Make RecordWriterTrait a non-template class (#87604)
commit d89914f30bc7c180fe349a5aa0f03438ae6c20a4
  Author: Kazu Hirata <kazu@google.com>
  Date:   Wed Apr 3 21:48:38 2024 -0700

changed RecordWriterTrait to a template class with IndexedVersion as a
template parameter.  This patch changes the class back to a
non-template one while retaining the ability to serialize multiple
versions.

The reason I changed RecordWriterTrait to a template class was
because, even if RecordWriterTrait had IndexedVersion as a member
variable, RecordWriterTrait::EmitKeyDataLength, being a static
function, would not have access to the variable.

Since OnDiskChainedHashTableGenerator calls EmitKeyDataLength as:

  const std::pair<offset_type, offset_type> &Len =
      InfoObj.EmitKeyDataLength(Out, I->Key, I->Data);

we can make EmitKeyDataLength a member function, but we have one
problem.  InstrProfWriter::writeImpl calls:

  void insert(typename Info::key_type_ref Key,
              typename Info::data_type_ref Data) {
    Info InfoObj;
    insert(Key, Data, InfoObj);
  }

which default-constructs RecordWriterTrait without a specific version
number.  This patch fixes the problem by adjusting
InstrProfWriter::writeImpl to call the other form of insert instead:

  void insert(typename Info::key_type_ref Key,
              typename Info::data_type_ref Data, Info &InfoObj)

To prevent an accidental invocation of the default constructor of
RecordWriterTrait, this patch deletes the default constructor.
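
The design can be sketched as follows.  ToyVersion, ToyWriterTrait,
and dataLength are hypothetical stand-ins, and the byte counts are
made up for illustration; this is not the real RecordWriterTrait
interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

enum class ToyVersion { V0, V1 };

class ToyWriterTrait {
public:
  // Deleted so that no caller can construct a trait without a version.
  ToyWriterTrait() = delete;
  explicit ToyWriterTrait(ToyVersion V) : Version(V) {}

  // A non-static member function, so it can consult Version.  A static
  // function would have no access to the member variable.
  std::size_t dataLength(std::size_t NumItems) const {
    switch (Version) {
    case ToyVersion::V0:
      return sizeof(std::uint64_t) + NumItems * sizeof(std::uint64_t);
    case ToyVersion::V1:
      return sizeof(std::uint64_t);
    }
    return 0;
  }

private:
  ToyVersion Version;
};
```

A table generator that takes the trait object by reference, like the
three-argument insert above, can then be handed an instance that
already knows its version.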
2024-04-04 10:09:43 -07:00
Kazu Hirata
d89914f30b
[memprof] Add Version2 of IndexedMemProfRecord serialization (#87455)
I'm currently developing a new version of the indexed memprof format
where we deduplicate call stacks in IndexedAllocationInfo::CallStack
and IndexedMemProfRecord::CallSites.  We refer to call stacks with
integer IDs, namely CallStackId, just as we refer to Frame with
FrameId.  The deduplication will cut down the profile file size by 80%
in a large memprof file of mine.

As a step toward the goal, this patch teaches
IndexedMemProfRecord::{serialize,deserialize} to speak Version2.  A
subsequent patch will add Version2 support to llvm-profdata.

The essence of the patch is to replace the serialization of a call
stack, a vector of FrameIds, with that of a CallStackId.  That is:

  const IndexedAllocationInfo &N = ...;
  ...
  LE.write<uint64_t>(N.CallStack.size());
  for (const FrameId &Id : N.CallStack)
    LE.write<FrameId>(Id);

becomes:

  LE.write<CallStackId>(N.CSId);
2024-04-03 21:48:38 -07:00
Mingming Liu
1351d17826
[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (#66825)
(The profile format change is split into a standalone change into https://github.com/llvm/llvm-project/pull/81691)

* For InstrFDO value profiling, implement instrumentation and lowering for virtual table address.
* This is controlled by `-enable-vtable-value-profiling` and off by default.
* When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads.
 
* Implement profile reader and writer support 
  * Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols.
  * Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, the reader decompresses the string to show symbols of a profiled site. When used in the compiler, string decompression doesn't happen since IR is used to construct InstrProfSymtab.
  * Indexed profile writer collects the list of vtable names, and stores that to index profiles.
  * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type.
* `llvm-profdata show -show-vtables <args> <profile>` is implemented.

rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
2024-04-01 08:52:35 -07:00
Muhammad Omair Javaid
80aa52d8c5
Revert "[ProfileData] Use size_t in PatchItem (NFC) (#87014)"
This reverts commit c64a328cb4a32e81f8b694162750ec1b8823994c.
This broke the Arm 32-bit build on various LLVM buildbots.
For example:
https://lab.llvm.org/buildbot/#/builders/17/builds/51129
2024-03-29 14:48:46 +05:00
Kazu Hirata
c64a328cb4
[ProfileData] Use size_t in PatchItem (NFC) (#87014)
size_t in PatchItem eliminates the need for casts.
2024-03-28 20:31:47 -07:00
Kazu Hirata
44253a9ce6
[memprof] Add MemProf version (#86414)
This patch adds a version field to the MemProf section of the indexed
profile format, calling the new version "version 1".  The existing
version is called "version 0".

The writer supports both versions via a command-line option:

  llvm-profdata merge --memprof-version=1 ...

The reader supports both versions by automatically detecting the
version from the header.
2024-03-28 14:29:34 -07:00
Kazu Hirata
4292086ed0
[ProfileData] Use ArrayRef in ProfOStream::patch (NFC) (#85317)
We always apply all of the items in PatchItems.  This patch simplifies
the interface of ProfOStream::patch by switching to ArrayRef.
2024-03-14 17:49:59 -07:00