The existing ELF/COFF correlation code mostly "just works" on Mach-O
files, with one gotcha: on disk, the pointers in `__llvm_covdata` are
stored in an encoded format due to dyld fixup chains. (In memory, they
would normally be fixed up at load time in a running binary, but the
correlator only looks at the on-disk values.)
LLVM's Mach-O reader knows how to decode the format, so this patch walks
the fixup table to create a set of mappings that the correlator can use
to resolve the values.
rdar://168259786
CoverageMappingWriter reorders `Region`s by `endLoc DESC` to prioritize
wider `Decision` with the same `startLoc`.
In `llvm-cov`, tweak seeking Decisions by reversal order to find smaller
Decision first.
Fix an issue where coroutine function and its await suspend wrappers are
all canonicalized to the same name. This creates duplicate entries in
`MD5FuncMap` (a sorted vector) and may return an incorrect GUID that
mismatches the one from prof metadata and miss ICP. For example, coro
function `foo` and its wrappers `foo.__await_suspend_wrapper__init`,
`foo.__await_suspend_wrapper__final` are all canonicalized to `foo`.
During GUID lookup, any of them can be returned due to unstable sort.
This is more of the reliability issue (the indeterminism) than a
performance issue because hot indirect calls should've been promoted in
sample loader pass.
This also fixes the same naming issue in `CGProfile` where symtab is
created. By the time the pass is run, wrapper functions should already
be inlined but naming collision can happen to the coro function and its
post-split clones (`foo.resume`, `foo.cleanup`).
This change
* Unifies `InstrProfSymtab::getCanonicalName` with
`FunctionSamples::getCanonicalFnName` by providing a customized list of
suffixes to remove instead of removing everything after the first dot
(.) except ".__uniq.". This also addresses the "FIXME" comment.
* Uses stable sort when storing `MD5FuncMap` which avoids indeterminism
during GUID lookup. `MD5FuncMap` can still contain duplicate entries
even with this change, because no check is done before insertion. Making
entries unique needs some additional work.
* Checks for same PGO names before calling the second `addFuncWithName`,
which does nothing in case of name match.
Since #119952, `getCoverageForFile` and `getCoverageForFunction` have
similar structure each other. Ther merged method `addFunctionRegions`
has two lambda subfunctions.
* `getCoverageForFile`
- `MainFileID` may be `nullopt`.
- `shouldProcess` picks up relevant records along `FileIDs` that is
scanned based on `MainFileID`. They may have expanded source files.
- `shouldExpand` takes the presense of `MainFileID` into account.
* `getCoverageForFunction`
- This assumes the presense of `MainFileID`.
- `shouldProcess` picks up records that belong only to `MainFileID`.
- `shouldExpand` assumes the presense of `MainFileID`.
This change introduces a wrapper class `MergeableCoverageData` for
further merging instances. At the moment, this returns `CoverageData`
including `buildSegments()`.
This change itself is NFC.
`getBranchCounterPair()` allocates an additional Counter to SkipPath in
`SingleByteCoverage`.
`IsCounterEqual()` calculates the comparison with rewinding counter
replacements.
`NumRegionCounters` is updated to take additional counters in account.
`incrementProfileCounter()` has a few additiona arguments.
- `UseSkipPath=true`, to specify setting counters for SkipPath. It
assumes `UseSkipPath=false` is used together.
- `UseBoth` may be specified for marking another path. It introduces the
same effect as issueing `markStmtAsUsed(!SkipPath, S)`.
`llvm-cov` discovers counters in `FalseCount` to allocate `MaxCounterID`
for empty profile data.
https://discourse.llvm.org/t/rfc-integrating-singlebytecoverage-with-branch-coverage/82492
Resumes: #112730
Depends on: #112698#112702#112724
This gives some aggregated information about the data access profiles.
The summaries are computed on the fly to save a profile version update.
Implementation-wise
* `MemProfSummary::printSummaryYaml` is updated to print data access
summaries for v4, the profile version that started to support data
access profiles.
* MemProfSummary.cpp has a FIXME comment to serialize the summary into
profile data, ideally batching with more substantial profile format
change.
* MemProfSummaryBuilder is not updated for now. This class is used to
serialize memprof summaries for v4 and above by memprof writer, and to
construct memprof summaries for v3 and prior versions by llvm-profdata.
Pre-reserve space in the map before inserting. In release builds, 9.4%
of all CPU time is spent in llvm::sampleprof::ProfileSymbolList::add. Of
that 9.4%, roughly half is in llvm::DenseMapBase::grow.
---------
Co-authored-by: mxms <mxms@google.com>
Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
This patch replaces LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]]. Note
that this patch adjusts the placement of [[maybe_unused]] to comply
with the C++17 language.
There's a pattern throughout LLVM of cl::opts being exported. That in
itself is probably a bit unfortunate, but what's especially bad about it
is that a lot of those symbols are in the global namespace. Move them
into the llvm namespace.
While doing this, I noticed some other variables in the global namespace
and moved them as well.
This is a follow-up to #156140 and #160979, which deprecated one form of
write and read, respectively.
We have two forms of byte_swap:
template <typename value_type>
[[nodiscard]] inline value_type byte_swap(value_type value, endianness
endian)
template <typename value_type, endianness endian>
[[nodiscard]] inline value_type byte_swap(value_type value)
The difference is that endian is a function parameter in the former
but a template parameter in the latter.
This patch streamlines the code by migrating the use of the latter to
the former while deprecating the latter because the latter is just
forwarded to the former.
This is a follow-up to #156140, which deprecated one form of write.
We have two forms of read:
template <typename value_type, std::size_t alignment>
[[nodiscard]] inline value_type read(const void *memory, endianness
endian)
template <typename value_type, endianness endian, std::size_t alignment>
[[nodiscard]] inline value_type read(const void *memory)
The difference is that endian is a function parameter in the former
but a template parameter in the latter.
This patch streamlines the code by migrating the use of the latter to
the former while deprecating the latter.
This change extends SampleFDO ext-binary and text format to record the
vtable symbols and their counts for virtual calls inside a function. The
vtable profiles will allow the compiler to annotate vtable types on IR
instructions and perform vtable-based indirect call promotion. An RFC is
in
https://discourse.llvm.org/t/rfc-vtable-type-profiling-for-samplefdo/87283
Given a function below, the before vs after of a function's profile is
illustrated in text format in the table:
```
__attribute__((noinline)) int loop_func(int i, int a, int b) {
Base *ptr = createType(i);
int sum = ptr->func(a, b);
delete ptr;
return sum;
}
```
| before | after |
| --- | --- |
| Samples collected in the function's body { <br> 0: 636241 <br> 1:
681458, calls: _Z10createTypei:681458 <br> 3: 543499, calls:
_ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878
<br> 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635
_ZN8Derived1D0Ev:147566 <br> 7: 511057 <br> } | Samples collected in the
function's body { <br> 0: 636241 <br> 1: 681458, calls:
_Z10createTypei:681458 <br> 3: 543499, calls:
_ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878
<br> 3: vtables: _ZTV8Derived1:1377 _ZTVN12_GLOBAL__N_18Derived2E:4250
<br> 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635
_ZN8Derived1D0Ev:147566 <br> 5.1: vtables: _ZTV8Derived1:227
_ZTVN12_GLOBAL__N_18Derived2E:765 <br> 7: 511057 <br> } |
Key points for this change:
1. In-memory representation of vtable profiles
* A field of type `map<LineLocation, map<FunctionId, uint64_t>>` is
introduced in a function's in-memory representation
[FunctionSamples](ccc416312e/llvm/include/llvm/ProfileData/SampleProf.h (L749-L754)).
2. The vtable counters for one LineLocation represents the relative
frequency among vtables for this LineLocation. They are not required to
be comparable across LineLocations.
3. For backward compatibility of ext-binary format, we take one bit from
ProfSummaryFlag as illustrated in the enum class `SecProfSummaryFlags`.
The ext-binary profile reader parses the integer type flag and reads
this bit. If it's set, the profile reader will parse vtable profiles.
4. The vtable profiles are optional in ext-binary format, and not
serialized out by default, we introduce an LLVM boolean option (named
`-extbinary-write-vtable-type-prof`). The ext-binary profile writer
reads the boolean option and decide whether to set the section flag bit
and serialize the in-memory class members corresponding to vtables.
5. This change doesn't implement `llvm-profdata overlap --sample` for
the vtable profiles. A subsequent change will do it to keep this one
focused on the profile format change.
We don't plan to add the vtable support to non-extensible format mainly
because of the maintenance cost to keep backward compatibility for prior
versions of profile data.
* Currently, the [non-extensible binary
format](5c28af4099/llvm/lib/ProfileData/SampleProfWriter.cpp (L899-L900))
does not have feature parity with extensible binary format today, for
instance, the former doesn't support [profile symbol
list](41e22aa31b/llvm/include/llvm/ProfileData/SampleProf.h (L1518-L1522))
or context-sensitive PGO, both of which give measurable performance
boost. Presumably the non-extensible format is not in wide use.
---------
Co-authored-by: Paschalis Mpeis <paschalis.mpeis@arm.com>
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
`InstrProfWriter::addTemporalProfileTraces()` did not correctly account
for when the sources traces are sampled, but the reservoir size is
larger than what it was before, meaning there is room for more traces.
Also, if the reservoir size decreased, meaning traces should be
truncated.
Depends on https://github.com/llvm/llvm-project/pull/152550 for the test
refactor
Instead of writing out in native endian, write out the raw profile bytes
in little endian. Also update the MIB data in little endian. Also clean
up some lint and unused includes in rawprofile.cpp.
Reapply #147854 after fixes merged in #151398.
Change memory access histogram storage from uint64_t to uint16_t to
reduce profile size on disk. This change updates the raw profile format
to v5. Also add a histogram test in compiler-rt since we didn't have one
before. With this change the histogram memprof raw for the basic test
reduces from 75KB -> 20KB.
Change memory access histogram storage from uint64_t to uint16_t to
reduce profile size on disk. This change updates the raw profile format
to v5. Also add a histogram test in compiler-rt since we didn't have one
before. With this change the histogram memprof raw for the basic test
reduces from 75KB -> 20KB.
* Introduce an error code for illegal_line_offset in sampleprof_error
namespace, and use it for line offset parsing error.
* Add `const` for `LineLocation::serialize`.
* Use structured binding, make_first/second_range in loops.
I'm working on a [sample-profile format
change](https://github.com/llvm/llvm-project/compare/users/mingmingl-llvm/samplefdo-profile-format)
to extend SampleFDO profile with vtable profiles. And this change splits
the non-functional changes.
Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#…
(#145959)
This reapplies cbf781f0bdf2f680abbe784faedeefd6f84c246e, with fixes for
the shared-library build and the unconventional sanitizer-runtime build.
Original Description:
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
I am happy to put it in another location, or to structure it differently
if that makes sense. Some have suggested in BinaryFormat, but it is not
a great fit there. But if that makes more sense to the reviewers, I can
do that.
Another possibility would be to use pass-through headers to allow
clients who don't care to depend only on DebugInfo/DWARF. This would be
a much less invasive change, and perhaps easier for clients. But also a
system that hides details.
Either way, I'm open.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
Factor out code in `populateCounters()` and `populateCoverage()` used to
grab the record into `PGOUseFunc::getRecord()` to reduce code
duplication. And return `NamedInstrProfRecord` in `getInstrProfRecord()`
to avoid an unnecessary cast. No functional change is intented.
In the current gcov implementation, all lines within a basic block are
attributed to the source file of the block's containing function. This
is inaccurate when a block contains lines from other files (e.g., via
#include "foo.inc").
Commit
[406e81b](406e81b79d)
attempted to address this by filtering lines based on debug info types,
but this approach has two limitations:
* **Over-filtering**: Some valid lines belonging to the function are
incorrectly excluded.
* **Under-counting**: Lines not belonging to the function are filtered
out and omitted from coverage statistics.
**GCC Reference Behavior**
GCC's gcov implementation handles this case correctly.This change aligns
the LLVM behavior with GCC.
**Proposed Solution**
1. **GCNO Generation**:
* **Current**: Each block stores a single GCOVLines record (filename +
lines).
* **New**: Dynamically create new GCOVLines records whenever consecutive
lines in a block originate from different source files. Group subsequent
lines from the same file under one record.
2. **GCNO Parsing**:
* **Current**: Lines are directly attributed to the function's source
file.
* **New**: Introduce a GCOVLocation type to track filename/line mappings
within blocks. Statistics will reflect the actual source file for each
line.
When no profile is provided, but the new --empty-profile option is
specified, the export/report/show commands now emit coverage data
equivalent to that obtained from a profile with all zero counters
("baseline coverage").
This is useful for build systems (e.g. Bazel) that can track coverage
information for each build target, even those that are never linked into
tests and thus don't have runtime coverage data recorded. By merging in
baseline coverage, lines in files that aren't linked into tests are
correctly reported as uncovered.
Reland with fixes to `CoverageMappingTest.cpp`.
Reverts llvm/llvm-project#144121
Reverts llvm/llvm-project#117910
```
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp:281:28: error: 'std::reference_wrapper' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported]
281 | std::make_optional(std::reference_wrapper(*ProfileReader));
| ^
/usr/lib/gcc/ppc64le-redhat-linux/8/../../../../include/c++/8/bits/refwrap.h:289:11: note: add a deduction guide to suppress this warning
289 | class reference_wrapper
| ^
```
When no profile is provided, but the new --empty-profile option is
specifed, the export/report/show commands now emit coverage data
equivalent to that obtained from a profile with all zero counters
("baseline coverage").
This is useful for build systems (e.g. Bazel) that can track coverage
information for each build target, even those that are never linked into
tests and thus don't have runtime coverage data recorded. By merging in
baseline coverage, lines in files that aren't linked into tests are
correctly reported as uncovered.
The previous placement resulted in this warning when using g++-13:
/home/david.spickett/llvm-project/llvm/include/llvm/Support/Compiler.h:120:43: warning: attribute ignored [-Wattributes]
120 | #define LLVM_ATTRIBUTE_VISIBILITY_DEFAULT [[gnu::visibility("default")]]
| ^
/home/david.spickett/llvm-project/llvm/include/llvm/Support/Compiler.h:213:18: note: in expansion of macro ‘LLVM_ATTRIBUTE_VISIBILITY_DEFAULT’
213 | #define LLVM_ABI LLVM_ATTRIBUTE_VISIBILITY_DEFAULT
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/david.spickett/llvm-project/llvm/lib/ProfileData/MemProfRadixTree.cpp:245:5: note: in expansion of macro ‘LLVM_ABI’
245 | LLVM_ABI computeFrameHistogram<FrameId>(
| ^~~~~~~~
/home/david.spickett/llvm-project/llvm/include/llvm/Support/Compiler.h:120:43: note: an attribute that appertains to a type-specifier is ignored
120 | #define LLVM_ATTRIBUTE_VISIBILITY_DEFAULT [[gnu::visibility("default")]]
| ^
According to the interface guide, that macro should go before the return
type to be effective.
https://llvm.org/docs/InterfaceExportAnnotations.html#specialized-template-functions
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/ProfileData`
library. These annotations currently have no meaningful impact on the
LLVM build; however, they are a prerequisite to support an LLVM Windows
DLL (shared library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
The bulk of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.
The following manual adjustments were also applied after running IDS on
Linux:
- Manually annotate the file
`llvm/include/llvm/ProfileData/InstrProfData.inc` because it is skipped
by IDS
- Add `LLVM_TEMPLATE_ABI` and `LLVM_EXPORT_TEMPLATE` to exported
instantiated templates.
- Add `LLVM_ABI_FRIEND` to friend member functions declared with
`LLVM_ABI`
- Add `LLVM_ABI` to a small number of symbols that require export but
are not declared in headers
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Address post-commit review comments from PR141805. Misc cleanup but the
biggest changes are moving some common utilities to new MemProfCommon
files to reduce unnecessary includes.
This patch adds support for a basic MemProf summary section, which is
built along with the indexed MemProf profile (e.g. when reading the raw
or YAML profiles), and serialized through the indexed profile just after
the header.
Currently only 6 fields are written, specifically the number of contexts
(total, cold, hot), and the max context size (cold, warm, hot).
To support forwards and backwards compatibility for added fields in the
indexed profile, the number of fields serialized first. The code is
written to support forwards compatibility (reading newer profiles with
additional summary fields), and comments indicate how to implement
backwards compatibility (reading older profiles with fewer summary
fields) as needed.
Support is added to print the summary as YAML comments when displaying
both the raw and indexed profiles via `llvm-profdata show`. Because they
are YAML comments, the YAML reader ignores these (the summary is always
recomputed when building the indexed profile as described above).
This necessitated moving some options and a couple of interfaces out of
Analysis/MemoryProfileInfo.cpp and into the new
ProfileData/MemProfSummary.cpp file, as we need to classify context
hotness earlier and also compute context ids to build the summary from
older indexed profiles.