Mingming Liu f3f28323ad
[StaticDataLayout][PGO] Add profile format for static data layout, and the classes to operate on the profiles. (#138170)
Context: As described in
https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744#p-336543-background-3,
we propose to profile memory loads and stores via hardware events,
symbolize the addresses of binary static data sections, and feed the
profile back into the compiler for data partitioning.

This change adds the profile format for static data layout, and the
classes to operate on it.

The profile and its format
1. Conceptually, a piece of data (call it a symbol) is represented by
its symbol name or its content hash. The former applies to the majority
of data, whose mangled names remain relatively stable across binary
releases, and the latter applies to string literals (with name patterns
like `.str.<N>[.llvm.<hash>]`).
- The symbols with samples are hot data. The number of hot symbols is
small relative to all symbols. The profile tracks their sampled counts
and locations. Sampled counts come from hardware events, and locations
come from debug information in the profiled binary. The symbols without
samples are cold data. The number of such cold symbols is large. The
profile tracks only their representation (the name or content hash).
- Based on a preliminary study, debug information coverage for data
symbols is partial and best-effort. In the LLVM IR, global variables
with source code correspondence may or may not have debug information.
Therefore the location information is optional in the profiles.
2. The profile-and-compile cycle is similar to SamplePGO. Profiles are
sampled from production binaries and used in subsequent binary releases.
Known cold symbols and new hot symbols can both have zero sampled
counts, so the profile records known cold symbols to tell the two apart
in the next compile.
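
The model in items 1 and 2 can be sketched as below. This is only an illustration under stated assumptions: `SymbolKey`, `SourceLocation`, `HotRecord`, `Hotness`, and `DataProfile` are hypothetical names, not the classes added by this change. It shows symbols keyed by name or content hash, hot records carrying counts and optional locations, and an explicit known-cold set so that a symbol absent from the profile is not mistaken for cold data.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical types for illustration; not the classes added by this change.

// A symbol is keyed by its name, or by a content hash for string literals.
using SymbolKey = std::variant<std::string, uint64_t>;

struct SourceLocation {
  std::string File;
  uint32_t Line;
};

struct HotRecord {
  uint64_t SampledCount = 0;
  // May be empty: debug info coverage for data symbols is best-effort.
  std::vector<SourceLocation> Locations;
};

enum class Hotness { Hot, KnownCold, Unseen };

class DataProfile {
  std::map<SymbolKey, HotRecord> Hot; // small: symbols with samples
  std::set<SymbolKey> KnownCold;      // large: keys only, no counts

public:
  void addHot(const SymbolKey &Key, HotRecord Rec) {
    assert(!KnownCold.count(Key) && "symbol already recorded as cold");
    Hot[Key] = std::move(Rec);
  }

  void addKnownCold(const SymbolKey &Key) {
    assert(!Hot.count(Key) && "symbol already recorded as hot");
    KnownCold.insert(Key);
  }

  // A symbol missing from both sets was not in the profiled binary, so it
  // cannot be assumed cold.
  Hotness classify(const SymbolKey &Key) const {
    if (Hot.count(Key))
      return Hotness::Hot;
    if (KnownCold.count(Key))
      return Hotness::KnownCold;
    return Hotness::Unseen;
  }
};
```

How a consumer treats `Unseen` symbols is a policy choice that this sketch leaves open.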

In the profile's serialization format, strings are concatenated together
and compressed. Individual records store the index of each string
instead of the string itself.
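
A minimal sketch of the string-table idea follows. `StringTable` is a hypothetical helper, the NUL separator is an assumption made for the example, and the compression step is left out to keep the code self-contained; none of this is the on-disk layout defined by this change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper for illustration; not the format added by this change.
class StringTable {
  std::vector<std::string> Strings;
  std::unordered_map<std::string, uint32_t> IndexOf;

public:
  // Returns the index of S, adding it to the table on first use.
  uint32_t intern(const std::string &S) {
    auto It = IndexOf.find(S);
    if (It != IndexOf.end())
      return It->second;
    uint32_t Idx = static_cast<uint32_t>(Strings.size());
    Strings.push_back(S);
    IndexOf.emplace(S, Idx);
    return Idx;
  }

  const std::string &get(uint32_t Idx) const {
    assert(Idx < Strings.size() && "index out of range");
    return Strings[Idx];
  }

  // Concatenates all strings, each followed by a NUL byte (an assumed
  // separator). A real writer would compress the resulting blob.
  std::string serialize() const {
    std::string Blob;
    for (const std::string &S : Strings) {
      Blob += S;
      Blob.push_back('\0');
    }
    return Blob;
  }

  // Rebuilds the table; strings get the same indices they had when written.
  static StringTable deserialize(const std::string &Blob) {
    StringTable T;
    std::size_t Begin = 0;
    while (Begin < Blob.size()) {
      std::size_t End = Blob.find('\0', Begin);
      if (End == std::string::npos)
        End = Blob.size();
      T.intern(Blob.substr(Begin, End - Begin));
      Begin = End + 1;
    }
    return T;
  }
};
```

A record then needs only a fixed-width integer per string, and a name referenced by many records is stored once.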

A separate PR will connect this class to InstrProfReader/Writer via
MemProfReader/Writer.

---------

Co-authored-by: Kazu Hirata <kazu@google.com>
2025-05-15 18:31:50 -07:00


set(LLVM_LINK_COMPONENTS
  BitReader
  Core
  Coverage
  ProfileData
  Support
  Object
  )

add_llvm_unittest(ProfileDataTests
  BPFunctionNodeTest.cpp
  CoverageMappingTest.cpp
  DataAccessProfTest.cpp
  InstrProfDataTest.cpp
  InstrProfTest.cpp
  ItaniumManglingCanonicalizerTest.cpp
  MemProfTest.cpp
  PGOCtxProfReaderWriterTest.cpp
  SampleProfTest.cpp
  SymbolRemappingReaderTest.cpp
  )

target_link_libraries(ProfileDataTests PRIVATE LLVMTestingSupport)