947 Commits

Author SHA1 Message Date
Mingming Liu
16e74fd489
Reland "[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling." (#82711)
New changes on top of the [reviewed
patch](https://github.com/llvm/llvm-project/pull/81691) are [in commits
after this
one](d0757f46b3).
Previous commits are restored from the remote branch with timestamps.

1. Fix build breakage for non-ELF platforms by defining the missing
functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`,
`__llvm_profile_begin_vtabnames`, `__llvm_profile_end_vtabnames`}
everywhere.
* Tested on a Mac laptop (for Darwin) and on Windows. Specifically,
functions in `InstrProfilingPlatformWindows.c` return `NULL` to make it
more explicit that type profiling isn't supported; see comments for the
reason.
* For the rest (AIX, other), mostly follow existing examples (like this
[one](f95b2f1acf))
   
2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section
name, and make returned pointers
[const](a825d2a4ec (diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121))

**Original Description**

* Raw profile format
  - Header: records the byte size of compressed vtable names, and the
    number of profiled vtable entries (call it `VTableProfData`). Header
    also records padded bytes of each section.
  - Payload: adds a section for compressed vtable names, and a section to
    store `VTableProfData`. Both sections are padded so the size is a
    multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
  - Payload: adds a section to store compressed vtable names. This section
    is used by `llvm-profdata` to show the list of vtables profiled for an
    instrumented site.
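
The padding rule for the raw profile sections (each size rounded up to a multiple of 8) can be sketched as below. The helper names `paddingBytesTo8` and `paddedSize` are hypothetical and do not come from the patch; this only illustrates the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): bytes of padding needed to round
// SizeInBytes up to the next multiple of 8.
inline std::uint64_t paddingBytesTo8(std::uint64_t SizeInBytes) {
  return (8 - SizeInBytes % 8) % 8;
}

// Hypothetical helper: size of a section once its padding is included.
inline std::uint64_t paddedSize(std::uint64_t SizeInBytes) {
  const std::uint64_t Padded = SizeInBytes + paddingBytesTo8(SizeInBytes);
  assert(Padded % 8 == 0 && "padded size must be a multiple of 8");
  return Padded;
}
```

For example, a 13-byte blob of compressed names would need 3 padding bytes under this rule, while a 16-byte blob would need none.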
  
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/writer changes and llvm-profdata changes.
- To ensure this PR has all the necessary profile format changes along
with the profile version bump, a copy of the originally reviewed
patch was created in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have the profile format change, but it has the set of tests which
cover type profile generation, profile read and profile merge. Tests
pass there.

RFC in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2024-02-27 11:07:40 -08:00
NAKAMURA Takumi
512a8a78a7
[MC/DC] Introduce class TestVector with a pair of BitVector (#82174)
This replaces `SmallVector<CondState>` and emulates it.

| | True | False | DontCare |
|---|---|---|---|
| Values: | True | False | False |
| Visited: | True | True | False |

`findIndependencePairs()` can be optimized with logical ops.

FIXME: Specialize `findIndependencePairs()` for the single word.
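
The encoding in the table above can be sketched with two bit sets. This is only an illustration: `TestVectorSketch` and `countDifferences` are hypothetical names, and `std::bitset<8>` stands in for `BitVector` with an arbitrary fixed size.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Tri-state condition value, as in the table above.
enum class CondState { DontCare, False, True };

// Sketch: a pair of bit sets emulates a vector of CondState.
class TestVectorSketch {
  std::bitset<8> Values;  // set only when the condition evaluated to True
  std::bitset<8> Visited; // set when the condition was evaluated at all

public:
  void set(std::size_t Idx, CondState S) {
    assert(Idx < 8 && "index out of range for this sketch");
    Visited[Idx] = (S != CondState::DontCare);
    Values[Idx] = (S == CondState::True);
  }

  CondState get(std::size_t Idx) const {
    assert(Idx < 8 && "index out of range for this sketch");
    if (!Visited[Idx])
      return CondState::DontCare;
    return Values[Idx] ? CondState::True : CondState::False;
  }

  // Number of conditions visited by both vectors whose values differ,
  // computed with bitwise operations instead of a per-element loop.
  std::size_t countDifferences(const TestVectorSketch &Other) const {
    return ((Values ^ Other.Values) & Visited & Other.Visited).count();
  }
};
```

Comparing two vectors then reduces to an XOR and two ANDs, which is the kind of logical-op shortcut the message alludes to.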
2024-02-27 16:46:49 +09:00
NAKAMURA Takumi
0ed61db6fd
[MC/DC] Refactor: Isolate the final result out of TestVector (#82282)
To reduce conditional judgements in the loop in `findIndependencePairs()`, I
have tried a couple of tweaks.

* Isolate the final result in `TestVectors`

`using TestVectors = llvm::SmallVector<std::pair<TestVector,
CondState>>;`
The final result was just piggybacked on `TestVector`, so it has been
isolated.

* Filter out and sort `ExecVectors` by the final result

It will cost more in constructing `ExecVectors`, but it can reduce at
least one conditional judgement in the loop.
2024-02-27 15:31:04 +09:00
gulfemsavrun
23f895f656
[InstrProf] Single byte counters in coverage (#75425)
This patch inserts 1-byte counters instead of 8-byte counters into
llvm profiles for source-based code coverage. The original idea was
proposed as block-cov for PGO, and this patch repurposes that idea for
coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

The current 8-byte counter mechanism adds counters to minimal regions,
and infers the counters in the remaining regions by adding or
subtracting counters. For example, it infers the counter in the if.else
region by subtracting the counters between if.entry and if.then regions
in an if statement. Whenever there is a control-flow merge, it adds the
counters from all the incoming regions. However, we are not going to be
able to infer counters by subtracting two execution counts when using
single-byte counters. Therefore, this patch conservatively inserts
additional counters for the cases where we need to add or subtract
counters.
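
A small sketch of why subtraction stops working: once counts collapse to "ran at least once", two executions with different else-region coverage can look identical. All names below are hypothetical and only illustrate the reasoning above.

```cpp
#include <cassert>
#include <cstdint>

// With full execution counts, the else count follows from entry and then.
inline std::uint64_t inferElseCount(std::uint64_t IfEntry,
                                    std::uint64_t IfThen) {
  assert(IfThen <= IfEntry && "then cannot run more often than entry");
  return IfEntry - IfThen;
}

// In this sketch a single-byte counter only records "ran at least once".
inline bool toCoverageFlag(std::uint64_t Count) { return Count != 0; }

// True when two executions have identical entry/then flags although their
// else-region coverage differs, so the flags alone cannot recover it.
inline bool flagsAreAmbiguous(std::uint64_t Entry1, std::uint64_t Then1,
                              std::uint64_t Entry2, std::uint64_t Then2) {
  const bool SameFlags =
      toCoverageFlag(Entry1) == toCoverageFlag(Entry2) &&
      toCoverageFlag(Then1) == toCoverageFlag(Then2);
  const bool ElseDiffers =
      toCoverageFlag(inferElseCount(Entry1, Then1)) !=
      toCoverageFlag(inferElseCount(Entry2, Then2));
  return SameFlags && ElseDiffers;
}
```

With counts (entry=5, then=5) and (entry=5, then=2), both executions set the entry and then flags, yet only the second one ever reaches the else region. This is why an explicit counter is needed there.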

RFC:
https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
2024-02-26 14:44:55 -08:00
NAKAMURA Takumi
c087bebb02
Introduce mcdc::TVIdxBuilder (LLVM side, NFC) (#80676)
This is a preparation of incoming Clang changes (#82448) and just checks
`TVIdx` is calculated correctly. NFC.

`TVIdxBuilder` calculates deterministic Indices for each Condition Node.
It is used for `clang` to emit `TestVector` indices (aka ID) and for
`llvm-cov` to reconstruct `TestVectors`.

This includes the unittest `CoverageMappingTest.TVIdxBuilder`.

See also
https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
2024-02-26 13:23:43 +09:00
NAKAMURA Takumi
1f6a347c8a Refactor: Let MCDC::State have DecisionByStmt and BranchByStmt
- Prune `RegionMCDCBitmapMap` and `RegionCondIDMap`. They are handled
  by `MCDCState`.
- Rename `s/BitmapMap/DecisionByStmt/`. It can handle Decision stuff.
- Rename `s/CondIDMap/BranchByStmt/`. It can handle Branch stuff.
- `MCDCRecordProcessor`: Use `DecisionParams.BitmapIdx` directly.
2024-02-25 18:33:53 +09:00
Mingming Liu
0e8d1877cd
Revert type profiling change as compiler-rt test break on Windows. (#82583)
Examples
https://lab.llvm.org/buildbot/#/builders/127/builds/62532/steps/8/logs/stdio
2024-02-21 21:41:33 -08:00
Mingming Liu
4d73cbe863
[nfc]remove unused variable after pr/81691 (#82578)
* `N` became unused after [pull request 81691](https://github.com/llvm/llvm-project/pull/81691)
* This should fix the build bot failure of `unused variable`
https://lab.llvm.org/buildbot/#/builders/77/builds/34840
2024-02-21 21:10:47 -08:00
Mingming Liu
db7e9e6841
[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling. (#81691)
* Raw profile format
  - Header: records the byte size of compressed vtable names, and the
    number of profiled vtable entries (call it `VTableProfData`). Header
    also records padded bytes of each section.
  - Payload: adds a section for compressed vtable names, and a section to
    store `VTableProfData`. Both sections are padded so the size is a
    multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
  - Payload: adds a section to store compressed vtable names. This section
    is used by `llvm-profdata` to show the list of vtables profiled for an
    instrumented site.
  
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/writer changes and llvm-profdata changes.
- To ensure this PR has all the necessary profile format changes along
with the profile version bump, a copy of the originally reviewed
patch was created in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have the profile format change, but it has the set of tests which
cover type profile generation, profile read and profile merge. Tests
pass there.

RFC in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2024-02-21 20:59:42 -08:00
NAKAMURA Takumi
ab76e48ac2
[MC/DC] Refactor: Let MCDCConditionID int16_t with zero-origin (#81257)
Also, Let `NumConditions` `uint16_t`.

It is smarter to handle the ID as signed.
Narrowing to `int16_t` will reduce the cost of passing it by value. (See also
#81221 and #81227)

External behavior doesn't change. The following handle values as internal
values plus 1.
* `-dump-coverage-mapping`
* `CoverageMappingReader.cpp`
* `CoverageMappingWriter.cpp`
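
The "internal value plus 1" convention can be sketched as a pair of conversions. The function names are hypothetical, and treating `-1` as "no condition" is an assumption made for this illustration rather than something stated in the message.

```cpp
#include <cassert>
#include <cstdint>

using ConditionID = std::int16_t; // zero-origin internal ID

// Assumption for this sketch: -1 is the only negative value in use and
// means "no condition", so it maps to the external value 0.
inline unsigned toExternalID(ConditionID ID) {
  assert(ID >= -1 && "unexpected negative ID");
  return static_cast<unsigned>(ID + 1);
}

// Inverse mapping: external value minus 1.
inline ConditionID toInternalID(unsigned External) {
  assert(External <= 0x8000u && "external ID does not fit in int16_t");
  return static_cast<ConditionID>(static_cast<int>(External) - 1);
}
```

Under these assumptions the first condition is `0` internally and `1` externally, and the round trip `toInternalID(toExternalID(ID))` returns the original ID.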
2024-02-15 16:24:37 +09:00
NAKAMURA Takumi
1a1fcacbce
[MC/DC] Refactor: Introduce ConditionIDs as std::array<2> (#81221)
Its 0th element corresponds to `FalseID` and 1st to `TrueID`.

CoverageMappingGen.cpp: `DecisionIDPair` is replaced with `ConditionIDs`
2024-02-14 23:17:00 +09:00
Mingming Liu
2422e969bf
[NFC][InstrProf]Factor out getCanonicalName to compute the canonical name given a pgo name. (#81547)
- Also update the `InstrProf::addFuncWithName` to call the newly added
`getCanonicalName`.
2024-02-13 10:49:35 -08:00
NAKAMURA Takumi
a17a3e9d9a
[MC/DC] Refactor: Make MCDCParams as std::variant (#81227)
Introduce `mcdc::DecisionParameters` and `mcdc::BranchParameters` and make
sure they are not initialized as zero.

FIXME: Could we make `CoverageMappingRegion` as a smart tagged union?
2024-02-13 22:43:46 +09:00
NAKAMURA Takumi
4588525d7e CoverageMappingReader/Writer: MCDCConditionID shouldn't be zero 2024-02-13 17:54:51 +09:00
NAKAMURA Takumi
05ad0d4632
CoverageMapping.cpp: Apply std::move to MCDCRecord (#81220) 2024-02-13 17:45:28 +09:00
NAKAMURA Takumi
f0db35b93f
[MC/DC] Refactor: Introduce MCDCTypes.h for coverage::mcdc (#81459)
They can be also used in `clang`.
Introduce the lightweight header instead of `CoverageMapping.h`.

This includes for now:
* `mcdc::ConditionID`
* `mcdc::Parameters`
2024-02-13 17:40:51 +09:00
NAKAMURA Takumi
3f9d8d892e
[Coverage] MCDCRecordProcessor: Find ExecVectors directly (#80816)
Deprecate `TestVectors`, since no one uses it.

This affects the output order of ExecVectors.
The current impl emits sorted by binary value of ExecVector. This impl
emits along the traversal of `buildTestVector()`.
2024-02-09 06:11:20 +09:00
Mingming Liu
05091aa3ac
[NFC][InstrProf]Generalize getParsedIRPGOFuncName to getParsedIRPGOName (#81054)
- Function getParsedIRPGOFuncName splits name by delimiter. The `[filename;]mangled-name` format could be generalized for non-function global values (e.g., vtables for type profiling). So rename the
function.
- Use kGlobalIdentifierDelimiter rather than semicolon directly for defragmentation.
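
A minimal sketch of splitting a `[filename;]mangled-name` string at the delimiter, using only the standard library. `splitIRPGOName` and `kDelimiterSketch` are hypothetical stand-ins; the real function and `kGlobalIdentifierDelimiter` may differ in types and edge-case handling.

```cpp
#include <cassert>
#include <string_view>
#include <utility>

// Stand-in for kGlobalIdentifierDelimiter in this sketch.
constexpr char kDelimiterSketch = ';';

// Returns {filename, name}. The filename is empty when the input carries no
// delimiter, i.e. when the optional "[filename;]" prefix is absent.
inline std::pair<std::string_view, std::string_view>
splitIRPGOName(std::string_view IRPGOName) {
  assert(!IRPGOName.empty() && "an empty name is not meaningful here");
  const auto Pos = IRPGOName.find(kDelimiterSketch);
  if (Pos == std::string_view::npos)
    return {std::string_view(), IRPGOName};
  return {IRPGOName.substr(0, Pos), IRPGOName.substr(Pos + 1)};
}
```

Because nothing in the split is function-specific, the same helper would work for other global values such as vtables, which is the motivation given for the rename.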
2024-02-07 20:03:44 -08:00
NAKAMURA Takumi
0b62218110 Anonymize MCDCRecordProcessor 2024-02-06 17:18:19 +09:00
NAKAMURA Takumi
47a12cca44 CoverageMapping.cpp: s/MaxBitmapID/MaxBitmapIdx/ in getMaxBitmapSize() 2024-02-06 16:13:02 +09:00
NAKAMURA Takumi
f035c018a6 InstrProf::getFunctionBitmap: Fix BE hosts (#80608) 2024-02-05 15:04:57 +09:00
NAKAMURA Takumi
4926f12ff5
[Coverage] ProfileData: Handle MC/DC Bitmap as BitVector. NFC. (#80608)
* `getFunctionBitmap()` stores not `std::vector<uint8_t>` but
`BitVector`.
* `CounterMappingContext` holds `Bitmap` (instead of the ref of bytes)
* `Bitmap` and `BitmapIdx` are used instead of `evaluateBitmap()`.

FIXME: `InstrProfRecord` itself should handle `Bitmap` as `BitVector`.
2024-02-05 12:42:08 +09:00
NAKAMURA Takumi
d912f1f0cb
[Coverage] Let Decision take account of expansions (#78969)
The current implementation (D138849) assumes `Branch`(es) would follow
after the corresponding `Decision`. It is not true if `Branch`(es) are
forwarded to expanded file ID. As a result, consecutive `Decision`(s)
would be confused with insufficient number of `Branch`(es).

`Expansion` will point `Branch`(es) in other file IDs if `Expansion` is
included in the range of `Decision`.

Fixes #77871

---------

Co-authored-by: Alan Phipps <a-phipps@ti.com>
2024-02-02 20:34:12 +09:00
NAKAMURA Takumi
438fe1db09
CoverageMappingWriter: Emit Decision before Expansion (#78966)
To relax scanning record, tweak order by `Decision < Expansion`, or
`Expansion` could not be distinguished whether it belonged to `Decision`
or not.

Relevant to #77871
2024-02-02 18:37:10 +09:00
Fangrui Song
3d0a689eb7
[llvm-cov] Simplify and optimize MC/DC computation (#79727)
Update code from https://reviews.llvm.org/D138847

`buildTestVector` is a standard DFS (walking a reduced ordered binary
decision diagram). Avoid shouldCopyOffTestVectorFor{True,False}Path
complexity and redundant `Map[ID]` lookups.

`findIndependencePairs` unnecessarily uses four nested loops (n<=6) to
find independence pairs. Instead, enumerate the two execution vectors
and find the number of mismatches. This algorithm can be optimized using
the marking function technique described in _Efficient Test Coverage
Measurement for MC/DC,  2013_, but this may be overkill.
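
The "enumerate two execution vectors and count mismatches" idea can be sketched as follows. This is an illustration under assumed types (`ExecVector`, `Cond`, `independentCondition` are hypothetical), not the code in llvm-cov.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

enum class Cond { DontCare, False, True };

struct ExecVector {
  std::vector<Cond> Conds; // one entry per condition
  bool Result;             // final outcome of the decision
};

// If A and B have different outcomes and differ in exactly one condition
// (ignoring DontCare entries), returns that condition's index.
inline std::optional<std::size_t>
independentCondition(const ExecVector &A, const ExecVector &B) {
  assert(A.Conds.size() == B.Conds.size() && "vectors must match in size");
  if (A.Result == B.Result)
    return std::nullopt;
  std::optional<std::size_t> Flip;
  for (std::size_t I = 0; I < A.Conds.size(); ++I) {
    if (A.Conds[I] == B.Conds[I] || A.Conds[I] == Cond::DontCare ||
        B.Conds[I] == Cond::DontCare)
      continue;
    if (Flip)
      return std::nullopt; // second mismatch: not an independence pair
    Flip = I;
  }
  return Flip;
}
```

Calling this for every pair of executed vectors takes two nested loops over the vectors plus one over the conditions, in place of the four nested loops mentioned above.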
2024-01-29 12:07:13 -08:00
NAKAMURA Takumi
c193bb7e9e
[Coverage] getMaxBitmapSize: Scan max(BitmapIdx) instead of the last Decision (#78963)
`CoverageMapping.cpp:getMaxBitmapSize()` assumed that the last `Decision`
has the maximum `BitmapIdx`.

Let it scan `max(BitmapIdx)`.

Note that `<=` is used instead of `<`, because `BitmapIdx == 0` is valid
and `MaxBitmapID` is `unsigned`. `BitmapIdx` is unique in the record.

Fixes #78922
2024-01-23 17:59:44 +09:00
NAKAMURA Takumi
fe0ec2c91c
[Coverage] Const-ize MCDCRecordProcessor stuff (#78918)
The life of `MCDCRecordProcessor`'s instance is short. It may accept
`const` objects to process.

On the other hand, the life of `MCDCBranches` is shorter than `Record`.
It may be rewritten with reference, rather than copying.
2024-01-23 07:28:10 +09:00
Hana Dusíková
865e4a1f33
[coverage] skipping code coverage for 'if constexpr' and 'if consteval' (#78033)
Code coverage for `if constexpr` and `if consteval` conditional statements
should behave more like preprocessor `#if`s than a normal
ConditionalStmt. This PR should fix that.

---------

Co-authored-by: cor3ntin <corentinjabot@gmail.com>
2024-01-22 12:50:20 +01:00
Kazu Hirata
b7a66d0fae [llvm] Use SmallString::operator std::string (NFC) 2024-01-19 18:54:11 -08:00
spupyrev
30aa9fb4c1 Revert "[InstrProf] Adding utility weights to BalancedPartitioning (#72717)"
This reverts commit 5954b9dca21bb0c69b9e991b2ddb84c8b05ecba3
due to broken Windows build
2024-01-19 15:13:47 -08:00
spupyrev
5954b9dca2
[InstrProf] Adding utility weights to BalancedPartitioning (#72717)
Adding weights to utility nodes in BP so that we can give more
importance to certain utilities. This is useful when we optimize several
objectives jointly.
2024-01-19 13:36:59 -08:00
Fangrui Song
0c6dc80531
BalancedPartitioning: minor updates (#77568)
When LargestTraceSize is a power of two, createBPFunctionNodes does not
allocate a group ID for Trace[LargestTraceSize-1] (as N is off by 1).
Fix this and change floor+log2 to Log2_64.

BalancedPartitioning::bisect can use unstable sort because `Nodes`
contains distinct `InputOrderIndex`s.

BalancedPartitioning::runIterations: use one DenseMap and simplify the
node renumbering code.
2024-01-17 10:46:34 -08:00
Ellis Hoag
9a2df55f47
[InstrProf] No linkage prefixes in IRPGO names (#76994)
Change the format of IRPGO counter names to
`[<filepath>;]<mangled-name>` which is computed by
`GlobalValue::getGlobalIdentifier()` to fix #74565.

In fe051934cbb0aaf25d960d7d45305135635d650b
(https://reviews.llvm.org/D156569) the format of IRPGO counter names was
changed to be `[<filepath>;]<linkage-name>` where `<linkage-name>` is
basically `F.getName()` with some prefix, e.g., `_` or `l_` on Mach-O
(yes, it is confusing that `<linkage-name>` is computed with
`Mangler().getNameWithPrefix()` while `<mangled-name>` is just
`F.getName()`). We discovered in #74565 that this causes some missed
import issues on some targets and #74008 is a partial fix.

Since `<mangled-name>` may not match the `<linkage-name>` on some
targets like Mach-O, we will need to post-process the output of
`llvm-profdata order` before passing to the linker via `-order_file`.

Profiles generated after fe051934cbb0aaf25d960d7d45305135635d650b will
become stale after this diff, but I think this is acceptable since that
patch landed after the LLVM 18 cut which hasn't been released yet.
2024-01-04 16:13:57 -08:00
Mingming Liu
eba2b789d3
[RawProfReader]When constructing symbol table, read the MD5 of function name in the proper byte order (#76312)
Before this patch, when the field `NameRef` is generated in little-endian systems and read back in big-endian systems, the information gets dropped.
- The bug gets caught by a buildbot
https://lab.llvm.org/buildbot/#/builders/94/builds/17931. In the error message (pasted below),
two indirect call targets are not imported.

```
   ; IMPORTS-DAG: Import _Z7callee1v
               ^
<stdin>:1:1: note: scanning from here
main.ll: Import _Z11global_funcv from lib.cc
^
<stdin>:1:10: note: possible intended match here
main.ll: Import _Z11global_funcv from lib.cc
         ^
Input file: <stdin>
Check file:
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
-dump-input=help explains the following input dump.
Input was:
<<<<<<
          1: main.ll: Import _Z11global_funcv from lib.cc 
dag:34'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match
found
dag:34'1 ? possible intended match
```

[This commit](b3999246b1 (diff-b196b796c5a396c7cdf93b347fe47e2b29b72d0b7dd0e2b88abb964d376ee50e)) gates the fix behind a flag and provides test data by creating big-endian profiles (rather than reading the little-endian data on a big-endian system, which might require a VM).

- [This](b3999246b1 (diff-643176077ddbe537bd0a05d2a8a53bdff6339420a30e8511710bf232afdda8b9)) is a hexdump of little-endian profile data, and [this](b3999246b1 (diff-1736a3ee25dde02bba55d670df78988fdb227e5a85b94b8707cf182cf70b28f0)) is the big-endian version of it.
- The [README.md](b3999246b1 (diff-6717b6a385de3ae60ab3aec9638af2a43b55adaf6784b6f0393ebe1a6639438b)) shows the result of `llvm-profdata show -ic-targets` before and after the fix when the profile is in big-endian.
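
The byte-order issue described above can be sketched with a host-independent reader: the 8 bytes of a field such as `NameRef` must be decoded according to the byte order of the system that wrote the profile, not the one reading it. `readU64` is a hypothetical helper for illustration, not the reader's actual API.

```cpp
#include <cassert>
#include <cstdint>

// Decode 8 raw bytes into a 64-bit value, given the byte order used by the
// system that produced them. The result is the same on any host.
inline std::uint64_t readU64(const unsigned char *Bytes,
                             bool WrittenLittleEndian) {
  assert(Bytes != nullptr && "need 8 readable bytes");
  std::uint64_t Value = 0;
  for (int I = 0; I < 8; ++I) {
    // Consume bytes from most significant to least significant.
    const unsigned char B = WrittenLittleEndian ? Bytes[7 - I] : Bytes[I];
    Value = (Value << 8) | B;
  }
  return Value;
}
```

Decoding little-endian bytes as if they were big-endian yields a different 64-bit value, so a hash lookup keyed on it would miss. That is consistent with the missing imports in the log above.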
2024-01-02 10:23:29 -08:00
Mingming Liu
78a195e100
Reland the reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75954)
Simplify the compiler-rt test to make it more general for different
platforms, and use `*DAG` matchers for lines that may be emitted
out-of-order.
- The compiler-rt test passed on a Windows machine. Previously, name
matchers didn't work for MSVC mangling
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907)
- `*DAG` matchers fixed the error in
https://lab.llvm.org/buildbot/#/builders/94/builds/17924

This is the second reland; it fixes errors caught in the first reland
(https://github.com/llvm/llvm-project/pull/75860)

**Original commit message**
Since commit fe05193 (phab D156569), IRPGO names use the format
`[<filepath>;]<linkage-name>` while the prior format is
`[<filepath>:]<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValue::getGlobalIdentifier` to use the
semicolon.

To elaborate on how things break without this PR:
1. IRPGO raw profiles store (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

* `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-19 12:25:56 -08:00
Teresa Johnson
6a7bbf712d
[memprof][NFC] Free symbolizer memory eagerly (#75849)
Move the ownership of the symbolizer into symbolizeAndFilterStackFrames
so that it is freed on exit, when we are done with it, to reduce peak
memory in the reader. This reduces about 9G from the peak for one large
profile.
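
The ownership change can be sketched with `std::unique_ptr` passed by value: the callee owns the object and frees it on return. `SymbolizerSketch` and the function names are hypothetical stand-ins for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for an expensive symbolizer; counts live instances.
struct SymbolizerSketch {
  inline static int LiveInstances = 0;
  SymbolizerSketch() { ++LiveInstances; }
  ~SymbolizerSketch() { --LiveInstances; }
  SymbolizerSketch(const SymbolizerSketch &) = delete;
  SymbolizerSketch &operator=(const SymbolizerSketch &) = delete;
};

// Takes ownership: the symbolizer is destroyed when this function returns,
// rather than living as long as the enclosing reader.
inline void symbolizeAndFilterSketch(
    std::unique_ptr<SymbolizerSketch> Symbolizer) {
  assert(Symbolizer && "expects an owned symbolizer");
  // ... symbolize and filter stack frames using *Symbolizer ...
}

// Creates a symbolizer, hands it over, and reports how many are still alive.
inline int liveAfterSymbolizing() {
  auto Symbolizer = std::make_unique<SymbolizerSketch>();
  symbolizeAndFilterSketch(std::move(Symbolizer));
  return SymbolizerSketch::LiveInstances;
}
```

In this sketch no symbolizer instance remains alive once the call returns.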
2023-12-18 20:50:08 -08:00
Mingming Liu
6ce23ea0ab
Revert "Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. "" (#75888)
Reverts llvm/llvm-project#75860
- Mangled name mismatch on Windows
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907/steps/8/logs/stdio)
2023-12-18 19:31:18 -08:00
Mingming Liu
c5871712ae
Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75860)
Fixed build-bot failures caught by post-submit tests
1) Add the list of command line tools needed by new compiler-rt test into dependency.
2) Use `starts_with` to replace deprecated `startswith`.

**Original commit message**
Since commit fe05193 (phab D156569), IRPGO names use the format
`[<filepath>;]<linkage-name>` while the prior format is
`[<filepath>:]<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValue::getGlobalIdentifier` to use the
semicolon.

To elaborate on how things break without this PR:
1. IRPGO raw profiles store (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

* `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 17:43:40 -08:00
Mingming Liu
3aa5d71127
Revert "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles." (#75835)
Reverts llvm/llvm-project#74008

The compiler-rt test failed due to `llvm-dis` not found
(https://lab.llvm.org/buildbot/#/builders/127/builds/59884)
Will revert and investigate how to require the proper dependency.
2023-12-18 09:39:55 -08:00
Mingming Liu
245cddae70
[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. (#74008)
Since commit fe05193 (phab D156569), IRPGO names use the format
`[<filepath>;]<linkage-name>` while the prior format is
`[<filepath>:]<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValue::getGlobalIdentifier` to use the
semicolon.

To elaborate on how things break without this PR:
1. IRPGO raw profiles store (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

* `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 09:10:39 -08:00
Teresa Johnson
35a003c2b2
[MemProf][NFC] Clear each IndexedMemProfRecord after it is written (#75205)
The on-disk hash table for the memprof writer holds copies of all the
memprof records to be written. These hold a lot of memory in aggregate,
due to the lists of alloc sites (which each have a list of context
frames) and call sites. Clear each one after emitting it.

This drops the peak memory when writing a very large indexed memprof
profile by about 2.5G.
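
The "clear each record after emitting it" pattern can be sketched with standard containers. `RecordSketch` and `emitAndRelease` are hypothetical; the actual writer works on an on-disk hash table generator rather than a plain vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for a record that owns a potentially large list of frames.
struct RecordSketch {
  std::vector<std::string> Frames;
};

// Emits every record into Out and releases each record's storage as soon as
// it has been written, so peak memory does not include already-written data.
inline std::size_t emitAndRelease(std::vector<RecordSketch> &Records,
                                  std::string &Out) {
  std::size_t Emitted = 0;
  for (RecordSketch &R : Records) {
    for (const std::string &Frame : R.Frames) {
      Out += Frame;
      Out += '\n';
    }
    ++Emitted;
    // Swapping with an empty vector drops the allocation; clear() alone
    // would keep the capacity around.
    std::vector<std::string>().swap(R.Frames);
    assert(R.Frames.empty());
  }
  return Emitted;
}
```

The design choice is to free per record inside the loop instead of freeing the whole container after the loop, which is what lowers the peak.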
2023-12-15 11:38:33 -08:00
Teresa Johnson
1a5299491a
[MemProf][NFC] Free large data structures after last use (#75120)
The MemProf InstrProfWriter uses a couple of MapVector for building the
lists of records it needs to write. Once its entries are all added to
the associated OnDiskChainedHashTableGenerator, it is no longer used.

Clearing these MapVectors, which grow quite large for large profiles,
saved 4G for a large memory profile.
2023-12-15 11:38:21 -08:00
Alan Phipps
47b0052f31
[CoverageMapping] Avoid use of pow() resulting in solaris build fail (#75559)
Fixes a build failure introduced by
commit 8ecbb0404d74 ("Reland [Coverage][llvm-cov]
Enable MC/DC Support in LLVM Source-based Code Coverage (2/3)")

Use of pow() is not necessary.
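
One common way to compute a power of two without `pow()` is an integer shift, which also avoids floating-point overload issues. Whether the patch uses exactly this form is an assumption; `numCombinations` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Number of value combinations for NumConditions boolean conditions, i.e.
// 2^NumConditions, computed with a shift instead of pow().
inline std::uint64_t numCombinations(unsigned NumConditions) {
  assert(NumConditions < 64 && "shift would overflow a 64-bit value");
  return std::uint64_t{1} << NumConditions;
}
```

For six conditions this gives 64, with no floating-point conversion involved.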
2023-12-14 23:49:35 -06:00
Zequan Wu
ab3430f891
[Profile] Add binary profile correlation for code coverage. (#69493)
## Motivation
Since we don't need the metadata sections at runtime, we can somehow
offload them from memory at runtime. Initially, I explored [debug info
correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113),
which is used for PGO with value profiling disabled. However, it
currently only works with DWARF and it would be hard to add such artificial
debug info for every function into CodeView, which is used on Windows.
So, offloading profile metadata sections at runtime seems to be a
platform-independent option.

## Design
The idea is to use new section names for profile name and data sections
and mark them as metadata sections. Under this mode, the new sections
are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime
and can be stripped away as a post-linking step. After the process
exits, the generated raw profiles will contain only headers + counters.
llvm-profdata can be used to correlate raw profiles with the unstripped
binary to generate an indexed profile.

## Data
For chromium base_unittests with code coverage on linux, the binary size
overhead due to instrumentation is reduced from 64M to 38.8M (39.4%) and
the raw profile file size is reduced from 128M to 68M (46.9%)
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +14.6Mi  [NEW] +14.6Mi    __llvm_prf_data
  [NEW] +10.6Mi  [NEW] +10.6Mi    __llvm_prf_names
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
  [ = ]       0   +65% +1.23Ki    .relro_padding
   +62% +1.20Ki  [ = ]       0    [Unmapped]
   +13%    +448   +19%    +448    .init_array
  +8.8%    +192  [ = ]       0    [ELF Section Headers]
  +0.0%    +136  +0.0%     +80    [7 Others]
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.5%     +80  +1.2%     +64    .plt
  [ = ]       0 -99.2% -3.68Ki    [LOAD #5 [RW]]
  +195% +64.0Mi  +194% +64.0Mi    TOTAL
$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
   +13%    +448   +19%    +448    .init_array
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.2%     +64  +1.2%     +64    .plt
  +2.9%     +64  [ = ]       0    [ELF Section Headers]
  +0.0%     +40  +0.0%     +40    .data
  +1.2%     +32  +1.2%     +32    .got.plt
  +0.0%     +24  +0.0%      +8    [5 Others]
  [ = ]       0 -22.9%    -872    [LOAD #5 [RW]]
 -74.5% -1.44Ki  [ = ]       0    [Unmapped]
  [ = ]       0 -76.5% -1.45Ki    .relro_padding
  +118% +38.8Mi  +117% +38.8Mi    TOTAL
```

A few things to note:
1. llvm-profdata doesn't support filtering raw profiles by binary id yet,
so when a raw profile doesn't belong to the binary being digested by
llvm-profdata, merging will fail. Once this is implemented,
llvm-profdata should be able to merge only raw profiles with the same
binary id as the binary and discard the rest (with mismatched/missing
binary id). The workflow I have in mind is to have scripts invoke
llvm-profdata to get all binary ids for all raw profiles, and
selectively pass the raw profiles with matching binary id and the
binary to llvm-profdata for merging.
2. Note: In COFF, currently they are still loaded into memory but not
used. I didn't do it in this patch because I noticed that `.lcovmap` and
`.lcovfunc` are loaded into memory. A separate patch will address it.
3. This should work with PGO when value profiling is disabled, as debug
info correlation currently does, though I haven't tested this yet.
2023-12-14 14:16:38 -05:00
Alan Phipps
8ecbb0404d Reland "[Coverage][llvm-cov] Enable MC/DC Support in LLVM Source-based Code Coverage (2/3)"
Part 2 of 3. This includes the Visualization and Evaluation components.

Differential Revision: https://reviews.llvm.org/D138847
2023-12-13 15:10:05 -06:00
Kazu Hirata
586ecdf205
[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-11 21:01:36 -08:00
Teresa Johnson
749d595de9
[MemProf][NFC] Correct comment about stripping of suffixes in profile (#73840)
The comment about the stripping of suffixes when creating the indexed
MemProf profile was partially incorrect, as we do not strip ".__uniq."
suffixes by default (by design). Update the comment accordingly.
2023-11-29 10:34:21 -08:00
Zequan Wu
b9951b3fe6
[llvm-profdata] Fix binary ids with multiple raw profiles in a single… (#72740)
Save binary ids when iterating through `RawInstrProfReader`.

Fixes #72699.
2023-11-20 14:25:24 -05:00
Ellis Hoag
b0154c36d6
[InstrProf] Add pgo use block coverage test (#72443)
Back in https://reviews.llvm.org/D124490 we added a block coverage mode
that instruments a subset of basic blocks using single byte counters to
get coverage for the whole function.

This commit adds a test to make sure that we correctly assign branch
weights based on the coverage profile.

I noticed this test was missing after seeing that we had no coverage on
`PGOUseFunc::populateCoverage()`

https://lab.llvm.org/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp.html#L1383
2023-11-20 09:25:33 -06:00
Kazu Hirata
0d55ea25a6 [llvm] Stop including llvm/ADT/DenseMapInfo.h (NFC)
Identified with clangd.
2023-11-11 00:13:29 -08:00