On WebAssembly, most coverage metadata contents read by llvm-cov (like
`__llvm_covmap` and `__llvm_covfun`) are stored in custom sections
because they are not referenced at runtime. However, `__llvm_prf_names`
is referenced at runtime by the profile runtime library and is read by
llvm-cov post-processing tools, so it needs to be stored in a data
segment, which is allocatable at runtime and accessible by tools as long
as "name" section is present in the binary.
This patch changes the way llvm-cov reads `__llvm_prf_names` on
WebAssembly. Instead of looking for a section, it looks for a data
segment with the same name.
This reverts commit 157f10ddf2d851125a85a71e530dc9d50cb032a2 and fixes
PE/COFF `.lprfn$A` section handling.
Currently both True/False counts were folded. It lost the information,
"It is True or False before folding." It prevented recalling branch
counts in merging template instantiations.
In `llvm-cov`, a folded branch is shown as:
- `[True: n, Folded]`
- `[Folded, False n]`
In the case If `n` is zero, a branch is reported as "uncovered". This is
distinguished from "folded" branch. When folded branches are merged,
`Folded` may be dissolved.
In the coverage map, either `Counter` is `Zero`. Currently both were
`Zero`.
Since "partial fold" has been introduced, either case in `switch` is
omitted as `Folded`.
Each `case:` in `switch` is reported as `[True: n, Folded]`, since
`False` count doesn't show meaningful value.
When `switch` doesn't have `default:`, `switch (Cond)` is reported as
`[Folded, False: n]`, since `True` count was just the sum of `case`(s).
`switch` with `default` can be considered as "the statement that doesn't
have any `False`(s)".
Currently, WebAssembly/WASI target does not provide direct support for
code coverage.
This patch set fixes several issues to unlock the feature. The main
changes are:
1. Port `compiler-rt/lib/profile` to WebAssembly/WASI.
2. Adjust profile metadata sections for Wasm object file format.
- [CodeGen] Emit `__llvm_covmap` and `__llvm_covfun` as custom sections
instead of data segments.
- [lld] Align the interval space of custom sections at link time.
- [llvm-cov] Copy misaligned custom section data if the start address is
not aligned.
- [llvm-cov] Read `__llvm_prf_names` from data segments
3. [clang] Link with profile runtime libraries if requested
See each commit message for more details and rationale.
This is part of the effort to add code coverage support in Wasm target
of Swift toolchain.
llvm-cov reads __llvm_prf_names section in an object file to find the
profile names, and it instead reads __llvm_covnames section in binary
profile correlation mode when __llvm_prf_names section is omitted. This
patch ensures that it still reads __llvm_covnames section when there is
an empty __llvm_prf_names section.
By storing possible test vectors instead of combinations of conditions,
the restriction is dramatically relaxed.
This introduces two options to `cc1`:
* `-fmcdc-max-conditions=32767`
* `-fmcdc-max-test-vectors=2147483646`
This change makes coverage mapping, profraw, and profdata incompatible
with Clang-18.
- Bitmap semantics changed. It is incompatible with previous format.
- `BitmapIdx` in `Decision` points to the end of the bitmap.
- Bitmap is packed per function.
- `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`.
RFC:
https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
--
Change(s) since llvmorg-19-init-14288-g7ead2d8c7e91
- Update compiler-rt/test/profile/ContinuousSyncMode/image-with-mcdc.c
This broke the lit tests on Mac:
https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-RA/1096/
> By storing possible test vectors instead of combinations of conditions,
> the restriction is dramatically relaxed.
>
> This introduces two options to `cc1`:
>
> * `-fmcdc-max-conditions=32767`
> * `-fmcdc-max-test-vectors=2147483646`
>
> This change makes coverage mapping, profraw, and profdata incompatible
> with Clang-18.
>
> - Bitmap semantics changed. It is incompatible with previous format.
> - `BitmapIdx` in `Decision` points to the end of the bitmap.
> - Bitmap is packed per function.
> - `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`.
>
> RFC:
> https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
This reverts commit 7ead2d8c7e9114b3f23666209a1654939987cb30.
By storing possible test vectors instead of combinations of conditions,
the restriction is dramatically relaxed.
This introduces two options to `cc1`:
* `-fmcdc-max-conditions=32767`
* `-fmcdc-max-test-vectors=2147483646`
This change makes coverage mapping, profraw, and profdata incompatible
with Clang-18.
- Bitmap semantics changed. It is incompatible with previous format.
- `BitmapIdx` in `Decision` points to the end of the bitmap.
- Bitmap is packed per function.
- `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`.
RFC:
https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
- The direct use case (in [1]) is to add `llvm::IntervalMap` [2] and the allocator required by IntervalMap ctor [3]
to class `InstrProfSymtab` as owned members. The allocator class doesn't have a move-assignment operator;
and it's going to take much effort to implement move-assignment operator for the allocator class such that the
enclosing class is movable.
- There is only one use of compiler-generated move-assignment operator in the repo, which is in
CoverageMappingReader.cpp. Luckily it's possible to use std::unique_ptr<InstrProfSymtab> instead, so did the change.
[1] https://github.com/llvm/llvm-project/pull/66825
[2] 4c2f68840e/llvm/include/llvm/ADT/IntervalMap.h (L936)
[3] 4c2f68840e/llvm/include/llvm/ADT/IntervalMap.h (L1041)
This replaces `SmallVector<CondState>` and emulates it.
- -------- True False DontCare
- Values: True False False
- Visited: True True False
`findIndependencePairs()` can be optimized with logical ops.
FIXME: Specialize `findIndependencePairs()` for the single word.
To reduce conditional judges in the loop in `findIndependencePairs()`, I
have tried a couple of tweaks.
* Isolate the final result in `TestVectors`
`using TestVectors = llvm::SmallVector<std::pair<TestVector,
CondState>>;`
The final result was just piggybacked on `TestVector`, so it has been
isolated.
* Filter out and sort `ExecVectors` by the final result
It will cost more in constructing `ExecVectors`, but it can reduce at
least one conditional judgement in the loop.
This patch inserts 1-byte counters instead of an 8-byte counters into
llvm profiles for source-based code coverage. The origial idea was
proposed as block-cov for PGO, and this patch repurposes that idea for
coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4
The current 8-byte counters mechanism add counters to minimal regions,
and infer the counters in the remaining regions via adding or
subtracting counters. For example, it infers the counter in the if.else
region by subtracting the counters between if.entry and if.then regions
in an if statement. Whenever there is a control-flow merge, it adds the
counters from all the incoming regions. However, we are not going to be
able to infer counters by subtracting two execution counts when using
single-byte counters. Therefore, this patch conservatively inserts
additional counters for the cases where we need to add or subtract
counters.
RFC:
https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
This is a preparation of incoming Clang changes (#82448) and just checks
`TVIdx` is calculated correctly. NFC.
`TVIdxBuilder` calculates deterministic Indices for each Condition Node.
It is used for `clang` to emit `TestVector` indices (aka ID) and for
`llvm-cov` to reconstruct `TestVectors`.
This includes the unittest `CoverageMappingTest.TVIdxBuilder`.
See also
https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
- Prune `RegionMCDCBitmapMap` and `RegionCondIDMap`. They are handled
by `MCDCState`.
- Rename `s/BitmapMap/DecisionByStmt/`. It can handle Decision stuff.
- Rename `s/CondIDMap/BranchByStmt/`. It can be handle Branch stuff.
- `MCDCRecordProcessor`: Use `DecisionParams.BitmapIdx` directly.
Also, Let `NumConditions` `uint16_t`.
It is smarter to handle the ID as signed.
Narrowing to `int16_t` will reduce costs of handling byvalue. (See also
#81221 and #81227)
External behavior doesn't change. They below handle values as internal
values plus 1.
* `-dump-coverage-mapping`
* `CoverageMappingReader.cpp`
* `CoverageMappingWriter.cpp`
Introduce `mcdc::DecisionParameters` and `mcdc::BranchParameters` and make
sure them not initialized as zero.
FIXME: Could we make `CoverageMappingRegion` as a smart tagged union?
They can be also used in `clang`.
Introduce the lightweight header instead of `CoverageMapping.h`.
This includes for now:
* `mcdc::ConditionID`
* `mcdc::Parameters`
Deprecate `TestVectors`, since no one uses it.
This affects the output order of ExecVectors.
The current impl emits sorted by binary value of ExecVector. This impl
emits along the traversal of `buildTestVector()`.
* `getFunctionBitmap()` stores not `std::vector<uint8_t>` but
`BitVector`.
* `CounterMappingContext` holds `Bitmap` (instead of the ref of bytes)
* `Bitmap` and `BitmapIdx` are used instead of `evaluateBitmap()`.
FIXME: `InstrProfRecord` itself should handle `Bitmap` as `BitVector`.
The current implementation (D138849) assumes `Branch`(es) would follow
after the corresponding `Decision`. It is not true if `Branch`(es) are
forwarded to expanded file ID. As a result, consecutive `Decision`(s)
would be confused with insufficient number of `Branch`(es).
`Expansion` will point `Branch`(es) in other file IDs if `Expansion` is
included in the range of `Decision`.
Fixes#77871
---------
Co-authored-by: Alan Phipps <a-phipps@ti.com>
To relax scanning record, tweak order by `Decision < Expansion`, or
`Expansion` could not be distinguished whether it belonged to `Decision`
or not.
Relevant to #77871
Update code from https://reviews.llvm.org/D138847
`buildTestVector` is a standard DFS (walking a reduced ordered binary
decision diagram). Avoid shouldCopyOffTestVectorFor{True,False}Path
complexity and redundant `Map[ID]` lookups.
`findIndependencePairs` unnecessarily uses four nested loops (n<=6) to
find independence pairs. Instead, enumerate the two execution vectors
and find the number of mismatches. This algorithm can be optimized using
the marking function technique described in _Efficient Test Coverage
Measurement for MC/DC, 2013_, but this may be overkill.
In `CoverageMapping.cpp:getMaxBitmapSize()`,
this assumed that the last `Decision` has the maxmum `BitmapIdx`.
Let it scan `max(BitmapIdx)`.
Note that `<=` is used insted of `<`, because `BitmapIdx == 0` is valid
and `MaxBitmapID` is `unsigned`. `BitmapIdx` is unique in the record.
Fixes#78922
The life of `MCDCRecordProcessor`'s instance is short. It may accept
`const` objects to process.
On the other hand, the life of `MCDCBranches` is shorter than `Record`.
It may be rewritten with reference, rather than copying.
`if constexpr` and `if consteval` conditional statements code coverage
should behave more like a preprocesor `#if`-s than normal
ConditionalStmt. This PR should fix that.
---------
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Fixes a build failure introduced by
commit 8ecbb0404d74 ("Reland [Coverage][llvm-cov]
Enable MC/DC Support in LLVM Source-based Code Coverage (2/3)")
Use of pow() is not necessary.
## Motivation
Since we don't need the metadata sections at runtime, we can somehow
offload them from memory at runtime. Initially, I explored [debug info
correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113),
which is used for PGO with value profiling disabled. However, it
currently only works with DWARF and it's be hard to add such artificial
debug info for every function in to CodeView which is used on Windows.
So, offloading profile metadata sections at runtime seems to be a
platform independent option.
## Design
The idea is to use new section names for profile name and data sections
and mark them as metadata sections. Under this mode, the new sections
are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime
and can be stripped away as a post-linking step. After the process
exits, the generated raw profiles will contains only headers + counters.
llvm-profdata can be used correlate raw profiles with the unstripped
binary to generate indexed profile.
## Data
For chromium base_unittests with code coverage on linux, the binary size
overhead due to instrumentation reduced from 64M to 38.8M (39.4%) and
the raw profile files size reduce from 128M to 68M (46.9%)
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
FILE SIZE VM SIZE
-------------- --------------
+121% +30.4Mi +121% +30.4Mi .text
[NEW] +14.6Mi [NEW] +14.6Mi __llvm_prf_data
[NEW] +10.6Mi [NEW] +10.6Mi __llvm_prf_names
[NEW] +5.86Mi [NEW] +5.86Mi __llvm_prf_cnts
+95% +1.75Mi +95% +1.75Mi .eh_frame
+108% +400Ki +108% +400Ki .eh_frame_hdr
+9.5% +211Ki +9.5% +211Ki .rela.dyn
+9.2% +95.0Ki +9.2% +95.0Ki .data.rel.ro
+5.0% +87.3Ki +5.0% +87.3Ki .rodata
[ = ] 0 +13% +47.0Ki .bss
+40% +1.78Ki +40% +1.78Ki .got
+12% +1.49Ki +12% +1.49Ki .gcc_except_table
[ = ] 0 +65% +1.23Ki .relro_padding
+62% +1.20Ki [ = ] 0 [Unmapped]
+13% +448 +19% +448 .init_array
+8.8% +192 [ = ] 0 [ELF Section Headers]
+0.0% +136 +0.0% +80 [7 Others]
+0.1% +96 +0.1% +96 .dynsym
+1.2% +96 +1.2% +96 .rela.plt
+1.5% +80 +1.2% +64 .plt
[ = ] 0 -99.2% -3.68Ki [LOAD #5 [RW]]
+195% +64.0Mi +194% +64.0Mi TOTAL
$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
FILE SIZE VM SIZE
-------------- --------------
+121% +30.4Mi +121% +30.4Mi .text
[NEW] +5.86Mi [NEW] +5.86Mi __llvm_prf_cnts
+95% +1.75Mi +95% +1.75Mi .eh_frame
+108% +400Ki +108% +400Ki .eh_frame_hdr
+9.5% +211Ki +9.5% +211Ki .rela.dyn
+9.2% +95.0Ki +9.2% +95.0Ki .data.rel.ro
+5.0% +87.3Ki +5.0% +87.3Ki .rodata
[ = ] 0 +13% +47.0Ki .bss
+40% +1.78Ki +40% +1.78Ki .got
+12% +1.49Ki +12% +1.49Ki .gcc_except_table
+13% +448 +19% +448 .init_array
+0.1% +96 +0.1% +96 .dynsym
+1.2% +96 +1.2% +96 .rela.plt
+1.2% +64 +1.2% +64 .plt
+2.9% +64 [ = ] 0 [ELF Section Headers]
+0.0% +40 +0.0% +40 .data
+1.2% +32 +1.2% +32 .got.plt
+0.0% +24 +0.0% +8 [5 Others]
[ = ] 0 -22.9% -872 [LOAD #5 [RW]]
-74.5% -1.44Ki [ = ] 0 [Unmapped]
[ = ] 0 -76.5% -1.45Ki .relro_padding
+118% +38.8Mi +117% +38.8Mi TOTAL
```
A few things to note:
1. llvm-profdata doesn't support filter raw profiles by binary id yet,
so when a raw profile doesn't belongs to the binary being digested by
llvm-profdata, merging will fail. Once this is implemented,
llvm-profdata should be able to only merge raw profiles with the same
binary id as the binary and discard the rest (with mismatched/missing
binary id). The workflow I have in mind is to have scripts invoke
llvm-profdata to get all binary ids for all raw profiles, and
selectively choose the raw pnrofiles with matching binary id and the
binary to llvm-profdata for merging.
2. Note: In COFF, currently they are still loaded into memory but not
used. I didn't do it in this patch because I noticed that `.lcovmap` and
`.lcovfunc` are loaded into memory. A separate patch will address it.
3. This should works with PGO when value profiling is disabled as debug
info correlation currently doing, though I haven't tested this yet.
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.
Differential Revision: https://reviews.llvm.org/D138846
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form. This patch replaces
support::endianness with llvm::endianness.
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form. This patch replaces
support::endianness::{big,little,native} with
llvm::endianness::{big,little,native}.
This causes stack overflows for real-world coverage reports.
Tested with build/bin/llvm-lit -a llvm/test/tools/llvm-cov
Co-authored-by: Qing Shen <qingshen@google.com>
This reverts commit 32db121b29f78e4c41116b2a8f1c730f9522b202 and subsequent commits.
This causes time regression on llvm-cov even with debug info correlation off.
This patch fixes:
llvm/lib/ProfileData/Coverage/CoverageMapping.cpp:219:5: error:
default label in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
This causes stack overflows for real-world coverage reports.
Ran $ build/bin/llvm-lit -a llvm/test/tools/llvm-cov locally and passed.
Co-authored-by: Qing Shen <qingshen@google.com>