1088 Commits

Author SHA1 Message Date
Teresa Johnson
475e736bb5
[MemProf] Include <ctime> to avoid MSVC failure (#114246)
My change in bb3915149a7c9b1660db9caebfc96343352e8454 added a call to
std::time which worked generally as there must be some transitive
include of <ctime>. However, I saw one MSVC bot failure:

InstrProfWriter.cpp(202): error C2039: 'time': is not a member of 'std'

from https://lab.llvm.org/buildbot/#/builders/63/builds/2325.

Presumably explictly including <ctime> should fix this.
2024-10-30 08:28:22 -07:00
Teresa Johnson
bb3915149a
[MemProf] Support for random hotness when writing profile (#113998)
Add support for generating random hotness in the memprof profile writer,
to be used for testing. The random seed is printed to stderr, and an
additional option enables providing a specific seed in order to
reproduce a particular random profile.
2024-10-29 22:10:33 -07:00
Yuta Saito
bf8f5cc9b5
Reland: [llvm-cov][WebAssembly] Read __llvm_prf_names from data segments (#112569)
On WebAssembly, most coverage metadata contents read by llvm-cov (like
`__llvm_covmap` and `__llvm_covfun`) are stored in custom sections
because they are not referenced at runtime. However, `__llvm_prf_names`
is referenced at runtime by the profile runtime library and is read by
llvm-cov post-processing tools, so it needs to be stored in a data
segment, which is allocatable at runtime and accessible by tools as long
as "name" section is present in the binary.

This patch changes the way llvm-cov reads `__llvm_prf_names` on
WebAssembly. Instead of looking for a section, it looks for a data
segment with the same name.

This reverts commit 157f10ddf2d851125a85a71e530dc9d50cb032a2 and fixes
PE/COFF `.lprfn$A` section handling.
2024-10-24 12:52:50 +09:00
NAKAMURA Takumi
4a011ac84f
[Coverage] Introduce "partial fold" on BranchRegion (#112694)
Currently both True/False counts were folded. It lost the information,
"It is True or False before folding." It prevented recalling branch
counts in merging template instantiations.

In `llvm-cov`, a folded branch is shown as:

- `[True: n, Folded]`
- `[Folded, False n]`

In the case If `n` is zero, a branch is reported as "uncovered". This is
distinguished from "folded" branch. When folded branches are merged,
`Folded` may be dissolved.

In the coverage map, either `Counter` is `Zero`. Currently both were
`Zero`.

Since "partial fold" has been introduced, either case in `switch` is
omitted as `Folded`.

Each `case:` in `switch` is reported as `[True: n, Folded]`, since
`False` count doesn't show meaningful value.

When `switch` doesn't have `default:`, `switch (Cond)` is reported as
`[Folded, False: n]`, since `True` count was just the sum of `case`(s).
`switch` with `default` can be considered as "the statement that doesn't
have any `False`(s)".
2024-10-20 12:30:35 +09:00
Yuta Saito
157f10ddf2
Revert "[llvm-cov][WebAssembly] Read __llvm_prf_names from data segments" (#112520)
This reverts commit efc9dd4118a7ada7d8c898582f16db64827f7ce0 in order to
fix Windows test failure:
https://github.com/llvm/llvm-project/pull/111332#issuecomment-2416462512
2024-10-16 21:42:54 +09:00
Yuta Saito
d4efc3e097
[Coverage][WebAssembly] Add initial support for WebAssembly/WASI (#111332)
Currently, WebAssembly/WASI target does not provide direct support for
code coverage.
This patch set fixes several issues to unlock the feature. The main
changes are:

1. Port `compiler-rt/lib/profile` to WebAssembly/WASI.
2. Adjust profile metadata sections for Wasm object file format.
- [CodeGen] Emit `__llvm_covmap` and `__llvm_covfun` as custom sections
instead of data segments.
    - [lld] Align the interval space of custom sections at link time.
- [llvm-cov] Copy misaligned custom section data if the start address is
not aligned.
    - [llvm-cov] Read `__llvm_prf_names` from data segments
3. [clang] Link with profile runtime libraries if requested

See each commit message for more details and rationale.
This is part of the effort to add code coverage support in Wasm target
of Swift toolchain.
2024-10-15 02:41:43 +09:00
Kazu Hirata
02b9c97b75
[memprof] Simplify code with MapVector::operator[] (NFC) (#111335)
Note that the following are all equivalent to each other:

  Map.insert({Key, Value()}).first->second
  Map.try_emplace(Key).first->second
  Map[Key]
2024-10-07 09:00:05 -07:00
Kazu Hirata
abaa8247e8
[memprof] Avoid repeated hash lookups (NFC) (#110789) 2024-10-02 06:53:11 -07:00
Kazu Hirata
0089f39e0f
[ProfileData] Avoid repeated hash lookups (NFC) (#110619) 2024-10-01 00:30:04 -07:00
Kazu Hirata
e4e3ff5adc
[llvm] Use std::optional::value_or (NFC) (#109568) 2024-09-22 01:00:24 -07:00
gulfemsavrun
536bdc99e6
[Coverage] Skip empty profile name section (#108480)
llvm-cov reads __llvm_prf_names section in an object file to find the
profile names, and it instead reads __llvm_covnames section in binary
profile correlation mode when __llvm_prf_names section is omitted. This
patch ensures that it still reads __llvm_covnames section when there is
an empty __llvm_prf_names section.
2024-09-13 16:41:07 -07:00
Mircea Trofin
885ac29910 [nfc][ctx_prof] Change some internal "set" types
- the set used for targets under a callsite is simpler to use if iterators
  are stable (it gets manipulated during updates)
- the set used to fetch the transitive closure of GUIDs under a node can
  be left as a choice to the user.
2024-09-12 10:34:53 -07:00
Zequan Wu
a7c26aaf2e
Revert "[Coverage] Ignore unused functions if the count is 0." (#107901)
Reverts llvm/llvm-project#107661

Breaks llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp
2024-09-09 14:34:13 -04:00
Zequan Wu
6850410562
[Coverage] Ignore unused functions if the count is 0. (#107661)
Relax the condition to ignore the case when count is 0. 

This fixes a bug on
381e9d2386.
This was reported at
https://discourse.llvm.org/t/coverage-from-multiple-test-executables/81024/.
2024-09-09 14:14:21 -04:00
gulfemsavrun
787cd8f0fe
[InstrProf] Add debuginfod correlation support (#106606)
This patch adds debuginfod support into llvm-profdata to
find the assosicated executable by a build id in a raw
profile to correlate a profile with a provided correlation
kind (debug-info or binary).
2024-09-06 13:28:23 -07:00
William Junda Huang
75e9d191f5
[llvm-profdata] Enabled functionality to write split-layout profile (#101795)
Using the flag `-split_layout` in llvm-profdata merge, the output
profile can write profiles with and without inlined function into two
different extbinary sections (and their FuncOffsetTable too). The
section without inlined functions are marked with `SecFlagFlat` and is
skipped by ThinLTO because it provides no useful info.

The split layout feature was already implemented in SampleProfWriter but
previously there is no way to use it from llvm-profdata.
2024-08-28 20:33:54 -04:00
Mircea Trofin
1022323c9b
[ctx_prof] Move the "from json" logic more centrally to reuse it from test. (#106129)
Making the synthesis of a contextual profile file from a JSON descriptor more reusable, for unittest authoring purposes.

The functionality round-trips through the binary format - no reason, currently, to support other ways of loading contextual profiles.
2024-08-27 15:43:05 -07:00
Lei Wang
23144e87d2
[SampleFDO][NFC] Refactoring sample reader to support on-demand read profiles for given functions (#104654)
Currently in extended binary format, sample reader only read the
profiles when the function are in the current module at initialization
time, this extends the support to read the arbitrary profiles for given
input functions in later stage. It's used for
https://github.com/llvm/llvm-project/pull/101053.
2024-08-27 11:56:24 -07:00
Ethan Luis McDonough
fde2d23ee2
[PGO][OpenMP] Instrumentation for GPU devices (Revision of #76587) (#102691)
This pull request is a revised version of #76587. This pull request
fixes some build issues that were present in the previous version of
this change.

> This pull request is the first part of an ongoing effort to extends
PGO instrumentation to GPU device code. This PR makes the following
changes:
>
> - Adds blank registration functions to device RTL
> - Gives PGO globals protected visibility when targeting a supported
GPU
> - Handles any addrspace casts for PGO calls
> - Implements PGO global extraction in GPU plugins (currently only
dumps info)
>
> These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.
2024-08-22 01:10:54 -05:00
Kazu Hirata
dca820951c
[llvm] Use llvm::any_of (NFC) (#104443) 2024-08-15 17:59:10 -07:00
Mircea Trofin
6b47772a4b
[nfc][ctx_prof] Rename PGOContextualProfile to PGOCtxProfContext (#102209) 2024-08-06 17:41:38 -04:00
Mircea Trofin
cc7308a156
[ctx_prof] Make the profile output analyzable by llvm-bcanalyzer (#99563)
This requires output-ing a "Magic" 4-byte header. We also emit a block info block, to describe our blocks and records. The output of `llvm-bcanalyzer` would look like:

```
<BLOCKINFO_BLOCK/>
<Metadata NumWords=17 BlockCodeSize=2>
  <Version op0=1/>
  <Context NumWords=13 BlockCodeSize=2>
    <GUID op0=2/>
    <Counters op0=1 op1=2 op2=3/>
```

Instead of having `Unknown` for block and record IDs.
2024-07-23 08:59:07 -04:00
Lei Wang
18cdfa72e0
[SampleFDO] Stale profile call-graph matching (#95135)
Profile staleness could be due to function renaming. Given that sample
profile loader relies on exact string matching, a trivial change in the
function signature( such as `int foo()` --> `long foo()` ) can make the
mangled name different, the function profile(including all nested
children profile) becomes unavailable.

This patch introduces stale profile call-graph level matching, targeting
at identifying the trivial function renaming and reusing the old
function profile.

Some noteworthy details:

1. Extend the LCS based CFG level matching to identify new function. 
- Extend to match function and profile have different name instead of
the exact function name matching. This leverages LCS, i.e during the
finding of callsite anchor matching, when two function name are
different, try matching the functions instead of return.
- In LCS, the equal function check is replaced by
`functionMatchesProfile`.
- Only try matching functions that are new functions(neither appears on
each side). This reduces the matching scope as we don't need to match
the originally matched function.
2.  Determine the matching by call-site anchor similarity check.
- A new function `functionMatchesProfile(IRFunc, ProfFunc)` is used to
check the renaming for the possible <IRFunc, ProfFunc> pair, use the
LCS(diff) matching to compute the equal set and we define: `Similarity =
|equalSet * 2| / (|A| + |B|)`. The profile name is marked as renamed if
the similarity is above a
threshold(`-func-profile-similarity-threshold`)

3.  Process the matching in top-down function order 
- when a caller's is done matching, the new function names are saved for
later use, using top-down order will maximize the reused results.
- `ProfileNameToFuncMap` is used to save or cache the matching result.
4. Update the original profile at the end using `ProfileNameToFuncMap`.

5. Added a new switch --salvage-unused-profile to control this, default
is false.

Verified on one Meta's internal big service, confirmed 90%+ of the found
renaming pair is good. (There could be incorrect renaming pair if the
num of the anchor is small, but checked that those functions are simple
cold function)
2024-07-17 10:33:00 -07:00
Kazu Hirata
6c8ff4cbb8
[ProfileData] Take ArrayRef<InstrProfValueData> in addValueData (NFC) (#97363)
This patch fixes another place in ProfileData where we have a pointer
to an array of InstrProfValueData and its length separately.

addValueData is a bit unique in that it remaps incoming values in
place before adding them to ValueSites.  AFAICT, no caller of
addValueData uses updated incoming values.  With this patch, we add
value data to ValueSites first and then remaps values there.  This
way, we can take ArrayRef<InstrProfValueData> as a parameter.
2024-07-11 16:38:44 -07:00
Mircea Trofin
afbd7d1e7c
[NFC] Coding style: drop k in kGlobalIdentifierDelimiter (#98230) 2024-07-09 15:44:55 -07:00
Mircea Trofin
ce03155a1b
[NFC] Coding style fixes: SampleProf (#98208)
Also some control flow simplifications.

Notably, this doesn't address `sampleprof_error`. I *think* the style
there tries to match `std::error_category`.

Also left `hash_value` as-is, because it matches what we do in Hashing.h
2024-07-09 14:35:49 -07:00
Mircea Trofin
e291f31f89
[NFC] Coding style fixes in InstrProf.cpp (#98211) 2024-07-09 13:28:35 -07:00
Kazu Hirata
0cfd03ac0d
[ProfileData] Use ArrayRef in PatchItem (NFC) (#97379)
Packaging an array and its size as ArrayRef in PatchItem allows us to
get rid of things like std::size(Header) and HeaderOffsets.size().
2024-07-02 22:58:26 -07:00
Kazu Hirata
b8eaa5bb10
[ProfileData] Remove the old version of getValueProfDataFromInst (#97374)
I've migrated uses of the old version of getValueProfDataFromInst to
the one that returns SmallVector<InstrProfValueData, 4>.  This patch
removes the old version.
2024-07-02 11:46:31 -07:00
Mingming Liu
1518b260ce
[TypeProf][InstrFDO]Implement more efficient comparison sequence for indirect-call-promotion with vtable profiles. (#81442)
Clang's `-fwhole-program-vtables` is required for this optimization to
take place. If `-fwhole-program-vtables` is not enabled, this change is
no-op.
    
* Function-comparison (before):

```
%vtable = load ptr, ptr %obj
%vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
%func = load ptr, ptr %vfn
%cond = icmp eq ptr %func, @callee
br i1 %cond, label bb1, label bb2:

bb1:
   call @callee

bb2:
   call %func
```

* VTable-comparison (after):

```
%vtable = load ptr, ptr %obj
%cond = icmp eq ptr %vtable, @vtable-address-point
br i1 %cond, label bb1, label bb2:

bb1:
   call @callee

bb2:
  %vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
  %func = load ptr, ptr %vfn
  call %func
```
    
Key changes:
1. Find out virtual calls and the vtables they come from.
- The ICP relies on type intrinsic `llvm.type.test` to find out virtual
calls and the
compatible vtables, and relies on type metadata to find the address
point for comparison.
2. ICP pass does cost-benefit analysis and compares vtable only when the
number of vtables for a function candidate is within (option specified)
threshold.
3. Sink the function addressing and vtable load instruction to indirect
fallback.
- The sink helper functions are simplified versions of
`InstCombinerImpl::tryToSinkInstruction`. Currently debug intrinsics are
not handled. Ideally `InstCombinerImpl::tryToSinkInstructionDbgValues`
and `InstCombinerImpl::tryToSinkInstructionDbgVariableRecords` could be
moved into Transforms/Utils/Local.cpp (or another util cpp file) to
handle debug intrinsics when moving instructions across basic blocks.
4. Keep value profiles updated
     1) Update vtable value profiles after inline
     2) For either function-based comparison or vtable-based comparison,
          update both vtable and indirect call value profiles.
2024-06-29 23:21:33 -07:00
Ethan Luis McDonough
2c8b912f63
Revert "[PGO][OpenMP] Instrumentation for GPU devices (#76587)"
This reverts commit 5fd2af38e461445c583d7ffc2fe23858966eee76. It caused build issues and broke the buildbot.
2024-06-28 12:30:45 -05:00
Ethan Luis McDonough
5fd2af38e4
[PGO][OpenMP] Instrumentation for GPU devices (#76587)
This pull request is the first part of an ongoing effort to extends PGO
instrumentation to GPU device code. This PR makes the following changes:

- Adds blank registration functions to device RTL
- Gives PGO globals protected visibility when targeting a supported GPU
- Handles any addrspace casts for PGO calls
- Implements PGO global extraction in GPU plugins (currently only dumps
info)

These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.
2024-06-28 10:42:19 -05:00
Matthew Weingarten
ca4e5a8d6e
[Memprof] Fixes memory leak in MemInfoBlock histogram. (#96834)
MemInfoBlocks (MIB) with empty callstacks are erased prematurely from
the CallStackProfileData. This patch frees allocated histogram buffers
when the MIB is associated with an empty callstack.
2024-06-26 17:11:21 -07:00
Kazu Hirata
22b36bfa3f [Memprof] Fix a warning
This patch fixes:

  llvm/lib/ProfileData/MemProfReader.cpp:685:1: error: non-void
  function does not return a value in all con trol paths
  [-Werror,-Wreturn-type]

While I am at it, this patch removes an else-after-return.
2024-06-26 12:05:58 -07:00
Matthew Weingarten
30b93db547
[Memprof] Adds the option to collect AccessCountHistograms for memprof. (#94264)
Adds compile time flag -mllvm -memprof-histogram and runtime flag
histogram=true|false to turn Histogram collection on and off. The
-memprof-histogram flag relies on -memprof-use-callbacks=true to work.

Updates shadow mapping logic in histogram mode from having one 8 byte
counter for 64 bytes, to 1 byte for 8 bytes, capped at 255. Only
supports this granularity as of now.

Updates the RawMemprofReader and serializing MemoryInfoBlocks to binary
format, including changing to a new version of the raw binary format
from version 3 to version 4.

Updates creating MemoryInfoBlocks with and without Histograms. When two
MemoryInfoBlocks are merged, AccessCounts are summed up and the shorter
Histogram is removed.

Adds a memprof_histogram test case.

Initial commit for adding AccessCountHistograms up until RawProfile for
memprof
2024-06-26 08:37:22 -07:00
Kazu Hirata
fef144cebb Revert "[llvm] Use llvm::sort (NFC) (#96434)"
This reverts commit 05d167fc201b4f2e96108be0d682f6800a70c23d.

Reverting the patch fixes the following under EXPENSIVE_CHECKS:

  LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
  LLVM :: CodeGen/AMDGPU/sched-group-barrier-pre-RA.mir
  LLVM :: CodeGen/PowerPC/aix-xcoff-used-with-stringpool.ll
  LLVM :: CodeGen/PowerPC/merge-string-used-by-metadata.mir
  LLVM :: CodeGen/PowerPC/mergeable-string-pool-large.ll
  LLVM :: CodeGen/PowerPC/mergeable-string-pool-pass-only.mir
  LLVM :: CodeGen/PowerPC/mergeable-string-pool.ll
2024-06-25 11:18:40 -07:00
Kazu Hirata
05d167fc20
[llvm] Use llvm::sort (NFC) (#96434) 2024-06-23 10:38:51 -07:00
Kazu Hirata
b0ae923ada
[ProfileData] Add a variant of getValueProfDataFromInst (#95993)
This patch adds a variant of getValueProfDataFromInst that returns
std::vector<InstrProfValueData> instead of
std::unique<InstrProfValueData[]>.  The new return type carries the
length with it, so we can drop out parameter ActualNumValueData.
Also, the caller can directly feed the return value into a range-based
for loop as shown in the patch.

I'm planning to migrate other callers of getValueProfDataFromInst to
the new variant in follow-up patches.
2024-06-22 00:40:36 -07:00
Nikita Popov
48ef912e2b [VFS] Avoid <stack> include (NFC)
Directly use a vector instead of wrapping it in a stack, like we
do in most places.
2024-06-21 15:17:41 +02:00
Kazu Hirata
beba2e7385
[ProfileData] Teach addValueData to honor parameter Site (#96233)
This patch teaches addValueData to honor Site for verification
purposes.  It does not affect the profile data in any manner.
2024-06-20 22:25:19 -07:00
Kazu Hirata
773ee62e16
[memprof] Rename the members of IndexedMemProfData (NFC) (#94873)
I'm planning to use IndexedMemProfData in MemProfReader and beyond.
Before I do so, this patch renames the members of IndexedMemProfData
as MemProfData.FrameData is a bit mouthful with "Data" repeated twice.

Note that MemProfReader currently has a trio -- IdToFrame,
CSIdToCallStack, and FunctionProfileData.  Replacing them with an
instance of IndexedMemProfData allows us to use the move semantics
from the reader to the writer context.  More importantly, treating the
profile data as one package makes the maintenance easier.  In the
past, forgetting to update a place dealing with the trio has resulted
in a bug where we totally forgot to emit call stacks into the indexed
profile.
2024-06-18 14:23:59 -07:00
Kazu Hirata
3d2bbea370
[ProfileData] Clean up validateRecord (#95488)
validateRecord ensures that all the values are unique except for
IPVK_IndirectCallTarget and IPVK_VTableTarget.  The problem is that we
exclude them in the innermost loop.

This patch pulls the loop invariant out of the loop.  While I am at
it, this patch migrates a use of getValueForSite to
getValueArrayForSite.
2024-06-18 13:06:43 -07:00
Kazu Hirata
d6b0b7acf3
[ProfileData] Remove getValueProfDataFromInst (#95617)
I've migrated all uses to the new version of getValueProfDataFromInst
that returns std::unique_ptr<InstrProfValueData[]>.
2024-06-17 18:50:08 -07:00
Kazu Hirata
7c6d0d26b1
[llvm] Use llvm::unique (NFC) (#95628) 2024-06-14 22:49:36 -07:00
Kazu Hirata
9ad102f03b
[ProfileData] Migrate to getValueArrayForSite (#95493)
This patch migrates uses of getValueForSite to getValueArrayForSite.
Each hunk is self-contained, meaning that each one can be applied
independently of the others.

In the unit test, there are cases where the array length check is
performed a lot earlier than the array content check.  For now, I'm
leaving the length checks where they are.  I'll consider moving them
when I migrate uses of getNumValueDataForSite to getValueArrayForSite
in a follow-up patch.
2024-06-14 06:38:48 -07:00
NAKAMURA Takumi
71f8b441ed Reapply: [MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448)
By storing possible test vectors instead of combinations of conditions,
the restriction is dramatically relaxed.

This introduces two options to `cc1`:

* `-fmcdc-max-conditions=32767`
* `-fmcdc-max-test-vectors=2147483646`

This change makes coverage mapping, profraw, and profdata incompatible
with Clang-18.

- Bitmap semantics changed. It is incompatible with previous format.
- `BitmapIdx` in `Decision` points to the end of the bitmap.
- Bitmap is packed per function.
- `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`.

RFC:
https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798

--
Change(s) since llvmorg-19-init-14288-g7ead2d8c7e91

- Update compiler-rt/test/profile/ContinuousSyncMode/image-with-mcdc.c
2024-06-14 19:31:56 +09:00
Hans Wennborg
b422fa6b62 Revert "[MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448)"
This broke the lit tests on Mac:
https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-RA/1096/

> By storing possible test vectors instead of combinations of conditions,
> the restriction is dramatically relaxed.
>
> This introduces two options to `cc1`:
>
> * `-fmcdc-max-conditions=32767`
> * `-fmcdc-max-test-vectors=2147483646`
>
> This change makes coverage mapping, profraw, and profdata incompatible
> with Clang-18.
>
> - Bitmap semantics changed. It is incompatible with previous format.
> - `BitmapIdx` in `Decision` points to the end of the bitmap.
> - Bitmap is packed per function.
> - `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`.
>
> RFC:
> https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798

This reverts commit 7ead2d8c7e9114b3f23666209a1654939987cb30.
2024-06-14 10:47:41 +02:00
NAKAMURA Takumi
7ead2d8c7e
[MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448)
By storing possible test vectors instead of combinations of conditions,
the restriction is dramatically relaxed.

This introduces two options to `cc1`:

* `-fmcdc-max-conditions=32767`
* `-fmcdc-max-test-vectors=2147483646`

This change makes coverage mapping, profraw, and profdata incompatible
with Clang-18.

- Bitmap semantics changed. It is incompatible with previous format.
- `BitmapIdx` in `Decision` points to the end of the bitmap.
- Bitmap is packed per function.
- `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`.

RFC:
https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
2024-06-13 20:09:02 +09:00
Kazu Hirata
31440738bd
[ProfileData] Use std::vector for ValueData (NFC) (#95194)
This patch changes the type of ValueData to
std::vector<InstrProfValueData> so that, in a follow-up patch, we can
teach getValueForSite to return ArrayRef<InstrProfValueData>.

Currently, a typical traversal over the value data looks like:

  uint32_t NV = Func.getNumValueDataForSite(VK, I);
std::unique_ptr<InstrProfValueData[]> VD = Func.getValueForSite(VK, I);
  for (uint32_t V = 0; V < NV; V++)
    Do something with VD[V].Value and/or VD[V].Count;

Note that we need to call getNumValueDataForSite and getValueForSite
separately.  If getValueForSite returns ArrayRef<InstrProfValueData>
in the future, then we'll be able to do something like:

  for (const auto &V : Func.getValueForSite(VK, I))
    Do something with V.Value and/or V.Count;

If ArrayRef<InstrProfValueData> directly points to ValueData, then
getValueForSite won't need to allocate memory with std::make_unique.

Now, switching to std::vector requires us to update several places:

- sortByTargetValues switches to llvm::sort because we don't need to
  worry about sort stability.

- sortByCount retains sort stability because std::list::sort also
  performs stable sort.

- merge builds another array and move it back to ValueData to avoid a
  potential quadratic behavior with std::vector::insert into the
  middle of a vector.
2024-06-12 11:22:49 -07:00
Kazu Hirata
00fa3fbfb8
[ProfileData] Compute sum in annotateValueSite (NFC) (#95199)
getValueForSite computes the total count -- the total number of times
a given value site is visited.  The problem is that, excluding tests,
annotateValueSite is the only place that needs the total count.

This patch moves the total count computation to annotateValueSite.
2024-06-12 10:14:33 -07:00