- The direct use case (in [1]) is to add `llvm::IntervalMap` [2] and the allocator required by IntervalMap ctor [3]
to class `InstrProfSymtab` as owned members. The allocator class doesn't have a move-assignment operator;
and it's going to take much effort to implement move-assignment operator for the allocator class such that the
enclosing class is movable.
- There is only one use of compiler-generated move-assignment operator in the repo, which is in
CoverageMappingReader.cpp. Luckily it's possible to use std::unique_ptr<InstrProfSymtab> instead, so did the change.
[1] https://github.com/llvm/llvm-project/pull/66825
[2] 4c2f68840e/llvm/include/llvm/ADT/IntervalMap.h (L936)
[3] 4c2f68840e/llvm/include/llvm/ADT/IntervalMap.h (L1041)
This replaces `SmallVector<CondState>` and emulates it.
- -------- True False DontCare
- Values: True False False
- Visited: True True False
`findIndependencePairs()` can be optimized with logical ops.
FIXME: Specialize `findIndependencePairs()` for the single word.
To reduce conditional judges in the loop in `findIndependencePairs()`, I
have tried a couple of tweaks.
* Isolate the final result in `TestVectors`
`using TestVectors = llvm::SmallVector<std::pair<TestVector,
CondState>>;`
The final result was just piggybacked on `TestVector`, so it has been
isolated.
* Filter out and sort `ExecVectors` by the final result
It will cost more in constructing `ExecVectors`, but it can reduce at
least one conditional judgement in the loop.
This patch inserts 1-byte counters instead of an 8-byte counters into
llvm profiles for source-based code coverage. The origial idea was
proposed as block-cov for PGO, and this patch repurposes that idea for
coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4
The current 8-byte counters mechanism add counters to minimal regions,
and infer the counters in the remaining regions via adding or
subtracting counters. For example, it infers the counter in the if.else
region by subtracting the counters between if.entry and if.then regions
in an if statement. Whenever there is a control-flow merge, it adds the
counters from all the incoming regions. However, we are not going to be
able to infer counters by subtracting two execution counts when using
single-byte counters. Therefore, this patch conservatively inserts
additional counters for the cases where we need to add or subtract
counters.
RFC:
https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
This is a preparation of incoming Clang changes (#82448) and just checks
`TVIdx` is calculated correctly. NFC.
`TVIdxBuilder` calculates deterministic Indices for each Condition Node.
It is used for `clang` to emit `TestVector` indices (aka ID) and for
`llvm-cov` to reconstruct `TestVectors`.
This includes the unittest `CoverageMappingTest.TVIdxBuilder`.
See also
https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
- Prune `RegionMCDCBitmapMap` and `RegionCondIDMap`. They are handled
by `MCDCState`.
- Rename `s/BitmapMap/DecisionByStmt/`. It can handle Decision stuff.
- Rename `s/CondIDMap/BranchByStmt/`. It can be handle Branch stuff.
- `MCDCRecordProcessor`: Use `DecisionParams.BitmapIdx` directly.
Also, Let `NumConditions` `uint16_t`.
It is smarter to handle the ID as signed.
Narrowing to `int16_t` will reduce costs of handling byvalue. (See also
#81221 and #81227)
External behavior doesn't change. They below handle values as internal
values plus 1.
* `-dump-coverage-mapping`
* `CoverageMappingReader.cpp`
* `CoverageMappingWriter.cpp`
Introduce `mcdc::DecisionParameters` and `mcdc::BranchParameters` and make
sure them not initialized as zero.
FIXME: Could we make `CoverageMappingRegion` as a smart tagged union?
They can be also used in `clang`.
Introduce the lightweight header instead of `CoverageMapping.h`.
This includes for now:
* `mcdc::ConditionID`
* `mcdc::Parameters`
Deprecate `TestVectors`, since no one uses it.
This affects the output order of ExecVectors.
The current impl emits sorted by binary value of ExecVector. This impl
emits along the traversal of `buildTestVector()`.
* `getFunctionBitmap()` stores not `std::vector<uint8_t>` but
`BitVector`.
* `CounterMappingContext` holds `Bitmap` (instead of the ref of bytes)
* `Bitmap` and `BitmapIdx` are used instead of `evaluateBitmap()`.
FIXME: `InstrProfRecord` itself should handle `Bitmap` as `BitVector`.
The current implementation (D138849) assumes `Branch`(es) would follow
after the corresponding `Decision`. It is not true if `Branch`(es) are
forwarded to expanded file ID. As a result, consecutive `Decision`(s)
would be confused with insufficient number of `Branch`(es).
`Expansion` will point `Branch`(es) in other file IDs if `Expansion` is
included in the range of `Decision`.
Fixes#77871
---------
Co-authored-by: Alan Phipps <a-phipps@ti.com>
To relax scanning record, tweak order by `Decision < Expansion`, or
`Expansion` could not be distinguished whether it belonged to `Decision`
or not.
Relevant to #77871
Update code from https://reviews.llvm.org/D138847
`buildTestVector` is a standard DFS (walking a reduced ordered binary
decision diagram). Avoid shouldCopyOffTestVectorFor{True,False}Path
complexity and redundant `Map[ID]` lookups.
`findIndependencePairs` unnecessarily uses four nested loops (n<=6) to
find independence pairs. Instead, enumerate the two execution vectors
and find the number of mismatches. This algorithm can be optimized using
the marking function technique described in _Efficient Test Coverage
Measurement for MC/DC, 2013_, but this may be overkill.
In `CoverageMapping.cpp:getMaxBitmapSize()`,
this assumed that the last `Decision` has the maxmum `BitmapIdx`.
Let it scan `max(BitmapIdx)`.
Note that `<=` is used insted of `<`, because `BitmapIdx == 0` is valid
and `MaxBitmapID` is `unsigned`. `BitmapIdx` is unique in the record.
Fixes#78922
The life of `MCDCRecordProcessor`'s instance is short. It may accept
`const` objects to process.
On the other hand, the life of `MCDCBranches` is shorter than `Record`.
It may be rewritten with reference, rather than copying.
`if constexpr` and `if consteval` conditional statements code coverage
should behave more like a preprocesor `#if`-s than normal
ConditionalStmt. This PR should fix that.
---------
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Fixes a build failure introduced by
commit 8ecbb0404d74 ("Reland [Coverage][llvm-cov]
Enable MC/DC Support in LLVM Source-based Code Coverage (2/3)")
Use of pow() is not necessary.
## Motivation
Since we don't need the metadata sections at runtime, we can somehow
offload them from memory at runtime. Initially, I explored [debug info
correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113),
which is used for PGO with value profiling disabled. However, it
currently only works with DWARF and it's be hard to add such artificial
debug info for every function in to CodeView which is used on Windows.
So, offloading profile metadata sections at runtime seems to be a
platform independent option.
## Design
The idea is to use new section names for profile name and data sections
and mark them as metadata sections. Under this mode, the new sections
are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime
and can be stripped away as a post-linking step. After the process
exits, the generated raw profiles will contains only headers + counters.
llvm-profdata can be used correlate raw profiles with the unstripped
binary to generate indexed profile.
## Data
For chromium base_unittests with code coverage on linux, the binary size
overhead due to instrumentation reduced from 64M to 38.8M (39.4%) and
the raw profile files size reduce from 128M to 68M (46.9%)
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
FILE SIZE VM SIZE
-------------- --------------
+121% +30.4Mi +121% +30.4Mi .text
[NEW] +14.6Mi [NEW] +14.6Mi __llvm_prf_data
[NEW] +10.6Mi [NEW] +10.6Mi __llvm_prf_names
[NEW] +5.86Mi [NEW] +5.86Mi __llvm_prf_cnts
+95% +1.75Mi +95% +1.75Mi .eh_frame
+108% +400Ki +108% +400Ki .eh_frame_hdr
+9.5% +211Ki +9.5% +211Ki .rela.dyn
+9.2% +95.0Ki +9.2% +95.0Ki .data.rel.ro
+5.0% +87.3Ki +5.0% +87.3Ki .rodata
[ = ] 0 +13% +47.0Ki .bss
+40% +1.78Ki +40% +1.78Ki .got
+12% +1.49Ki +12% +1.49Ki .gcc_except_table
[ = ] 0 +65% +1.23Ki .relro_padding
+62% +1.20Ki [ = ] 0 [Unmapped]
+13% +448 +19% +448 .init_array
+8.8% +192 [ = ] 0 [ELF Section Headers]
+0.0% +136 +0.0% +80 [7 Others]
+0.1% +96 +0.1% +96 .dynsym
+1.2% +96 +1.2% +96 .rela.plt
+1.5% +80 +1.2% +64 .plt
[ = ] 0 -99.2% -3.68Ki [LOAD #5 [RW]]
+195% +64.0Mi +194% +64.0Mi TOTAL
$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
FILE SIZE VM SIZE
-------------- --------------
+121% +30.4Mi +121% +30.4Mi .text
[NEW] +5.86Mi [NEW] +5.86Mi __llvm_prf_cnts
+95% +1.75Mi +95% +1.75Mi .eh_frame
+108% +400Ki +108% +400Ki .eh_frame_hdr
+9.5% +211Ki +9.5% +211Ki .rela.dyn
+9.2% +95.0Ki +9.2% +95.0Ki .data.rel.ro
+5.0% +87.3Ki +5.0% +87.3Ki .rodata
[ = ] 0 +13% +47.0Ki .bss
+40% +1.78Ki +40% +1.78Ki .got
+12% +1.49Ki +12% +1.49Ki .gcc_except_table
+13% +448 +19% +448 .init_array
+0.1% +96 +0.1% +96 .dynsym
+1.2% +96 +1.2% +96 .rela.plt
+1.2% +64 +1.2% +64 .plt
+2.9% +64 [ = ] 0 [ELF Section Headers]
+0.0% +40 +0.0% +40 .data
+1.2% +32 +1.2% +32 .got.plt
+0.0% +24 +0.0% +8 [5 Others]
[ = ] 0 -22.9% -872 [LOAD #5 [RW]]
-74.5% -1.44Ki [ = ] 0 [Unmapped]
[ = ] 0 -76.5% -1.45Ki .relro_padding
+118% +38.8Mi +117% +38.8Mi TOTAL
```
A few things to note:
1. llvm-profdata doesn't support filter raw profiles by binary id yet,
so when a raw profile doesn't belongs to the binary being digested by
llvm-profdata, merging will fail. Once this is implemented,
llvm-profdata should be able to only merge raw profiles with the same
binary id as the binary and discard the rest (with mismatched/missing
binary id). The workflow I have in mind is to have scripts invoke
llvm-profdata to get all binary ids for all raw profiles, and
selectively choose the raw pnrofiles with matching binary id and the
binary to llvm-profdata for merging.
2. Note: In COFF, currently they are still loaded into memory but not
used. I didn't do it in this patch because I noticed that `.lcovmap` and
`.lcovfunc` are loaded into memory. A separate patch will address it.
3. This should works with PGO when value profiling is disabled as debug
info correlation currently doing, though I haven't tested this yet.
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.
Differential Revision: https://reviews.llvm.org/D138846
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form. This patch replaces
support::endianness with llvm::endianness.
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form. This patch replaces
support::endianness::{big,little,native} with
llvm::endianness::{big,little,native}.
This causes stack overflows for real-world coverage reports.
Tested with build/bin/llvm-lit -a llvm/test/tools/llvm-cov
Co-authored-by: Qing Shen <qingshen@google.com>
This reverts commit 32db121b29f78e4c41116b2a8f1c730f9522b202 and subsequent commits.
This causes time regression on llvm-cov even with debug info correlation off.
This patch fixes:
llvm/lib/ProfileData/Coverage/CoverageMapping.cpp:219:5: error:
default label in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
This causes stack overflows for real-world coverage reports.
Ran $ build/bin/llvm-lit -a llvm/test/tools/llvm-cov locally and passed.
Co-authored-by: Qing Shen <qingshen@google.com>
This seems to cause Clang to crash, see comments on the code review. Reverting
until the problem can be investigated.
> Part 1 of 3. This includes the LLVM back-end processing and profile
> reading/writing components. compiler-rt changes are included.
>
> Differential Revision: https://reviews.llvm.org/D138846
This reverts commit a50486fd736ab2fe03fcacaf8b98876db77217a7.
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.
Differential Revision: https://reviews.llvm.org/D138846
Debug info correlation is an option in InstrProfiling pass, which is used by
both IR instrumentation and front-end instrumentation. So, Clang coverage can
also benefits the binary size saving from it.
Reviewed By: ellis
Differential Revision: https://reviews.llvm.org/D157913
The current llvm-cov error message for kind `coveragemap_error::malforme`, just gives the issue kind without any reason for what caused the issue. This patch is aimed at improving the llvm-cov error message to help identify what caused the issue.
Reviewed By: MaskRay
Close: https://github.com/llvm/llvm-project/pull/65264
`llvm-cov convert-for-testing` only functions properly when the input binary contains a single source file. When the binary has multiple source files, a `Malformed coverage data` error will occur when the generated .covmapping is read back. This is because the testing format lacks support for indicating the size of its file records, and current implementation just assumes there's only one record in it. This patch fixes this problem by introducing a new testing format version.
Changes to the code:
- Add a new format version. The version number is stored in the the last 8 bytes of the orignial magic number field to be backward-compatible.
- Output a LEB128 number before the file records section to indicate its size in the new version.
- Change the format parsing code correspondingly.
- Update the document to formalize the testing format.
- Additionally, fix the bug when converting COFF binaries.
Reviewed By: phosek, gulfem
Differential Revision: https://reviews.llvm.org/D156611
`llvm-cov convert-for-testing` is used to build the .covmapping files used in its regression tests. However the current implementation only works when there's only one source file in the mapping information data. If there are more than 1 source files, `llvm-cov convert-for-testing` can still produce a .covmapping file, but when read it back, `llvm-cov` will report:
```
error: Failed to load coverage: 'main.covmapping': Malformed coverage data
```
This is because the output .covmapping file doesn't have any mark to indicate the boundary between file records and function records, and current implementation jsut assume there's only one file record in the .covmapping file.
Changes to the code:
- Make `llvm-cov convert-for-testing` output a LEB128 number before file records to indicate its size.
- Change the testing format parsing code correspondingly.
- Update existing .covmapping files.
Reviewed By: gulfem
Differential Revision: https://reviews.llvm.org/D156611
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
When a user uses a mismatched clang + llvm-profdata, they didn't get a very
informative error message. It would just say "unsupported version".
As a result, users are often confused as to what they are supposed to do and
tend to assume that it's a bug in the profiling runtime.
This patch improves the error message by:
- Adding a new class of error (`raw_profile_version_mismatch`) to make it clear
that, specifically, the *raw profile* version is unsupported because of a
tool mismatch.
- Adding an error message that tells the user which raw profile version was
encountered, which version was expected, and instructs them to align their
tool versions.
To support this, this patch also updates `InstrProfError::take` to also
propagate the optional error message.
Differential Revision: https://reviews.llvm.org/D149361
This makes parsing for build IDs in the markup filter slightly more
permissive, in line with fromHex.
It also removes the distinction between missing build ID and empty build
ID; empty build IDs aren't a useful concept, since their purpose is to
uniquely identify a binary. This removes a layer of indirection wherever
build IDs are obtained.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D147485
This adds the --check-binary-id flag that makes sure that an object file
is available for every binary ID mentioned in the given profile. This
should help make the tool more robust in CI environments where it's
expected that coverage mappings should be available for every object
contributing to the profile.
Reviewed By: gulfem
Differential Revision: https://reviews.llvm.org/D144308