The indexed MemProf file has a huge amount of redundancy. In a large
internal application, 82% of call stacks, stored in
IndexedAllocationInfo::CallStack, are duplicates.
We should work toward deduplicating call stacks by referring to them
with unique IDs with actual call stacks stored in a separate data
structure, much like we refer to memprof::Frame with memprof::FrameId.
At the same time, we need to facilitate a graceful transition from the
current version of the MemProf format to the next. We should be able
to read (but not write) the current version of the MemProf file even
after we move onto the next one.
With those goals in mind, I propose to have an integer ID next to
CallStack in IndexedAllocationInfo to refer to a call stack in a
succinct manner. We'll gradually increase the areas of the compiler
where IDs and call stacks have one-to-one correspondence and
eventually remove the existing CallStack field.
This patch adds call stack ID, named CSId, to IndexedAllocationInfo
and teaches the raw profile reader to compute unique call stack IDs
and store them in the new field. It does not introduce any user of
the call stack IDs yet, except in verifyFunctionProfileData.
With one raw memprof file I have, NumPCs averages about 41. Given the
default number of inline elements being 8 for SmallVector<uint64_t>,
we should reserve the storage in advance.
Move the ownership of the symbolizer into symbolizeAndFilterStackFrames
so that it is freed on exit, when we are done with it, to reduce peak
memory in the reader. This reduces about 9G from the peak for one large
profile.
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class. This patch replaces
{big,little,native} with llvm::endianness::{big,little,native}.
This patch completes the migration to llvm::endianness and
llvm::endianness::{big,little,native}. I'll post a separate patch to
remove the migration helpers in llvm/Support/Endian.h:
using endianness = llvm::endianness;
constexpr llvm::endianness big = llvm::endianness::big;
constexpr llvm::endianness little = llvm::endianness::little;
constexpr llvm::endianness native = llvm::endianness::native;
Add a MemProfReader base class which can be used directly where
symbolization and processing a raw profile is unnecessary.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D159141
Canonicalize the function name (strip suffixes etc) to ensure that
function name suffixes added by late stage passes do not cause
mismatches when memprof profile data is consumed.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D159132
To check the uniqueness of buildids, we held on to a StringRef of the build id string pushed into the vector. If the number of build ids were large enough to trigger a realloc in the vector then these references where invalidated resulting in a use-after free. This was exposed in downstream usage.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D155110
This allows us to pass in a memory buffer directly simplifying usage
downstream.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D155021
Symbolization of position independent code was added in D146181. Update
the comment to note that the checks below are to sanity check the
assumptions we make to simplify symbolization.
Differential Revision: https://reviews.llvm.org/D149370
When no --profiled-binary flag is provided we can print out the build
ids of the modules in the profile. This can help the user fetch the
correct binary from e.g. remote object store.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D148301
This makes parsing for build IDs in the markup filter slightly more
permissive, in line with fromHex.
It also removes the distinction between missing build ID and empty build
ID; empty build IDs aren't a useful concept, since their purpose is to
uniquely identify a binary. This removes a layer of indirection wherever
build IDs are obtained.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D147485
Support symolization of PIE binaries in memprof. We assume that the
profiled binary has one executable text segment for simplicity. Update
the memprof-pic test to now expect the same output as the memprof-basic test.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D146181
This patch adds support for recording BuildIds usng the sanitizer
ListOfModules API. We add another entry to the SegmentEntry struct and
change the memprof raw version.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D145190
This patch adds support for recording BuildIds usng the sanitizer
ListOfModules API. We add another entry to the SegmentEntry struct and
change the memprof raw version.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D145190
Update the isRuntime check to only match against known memprof filenames
where interceptors are defined. This avoid issues where the path does
not include the directory based on how the runtime was compiled. Also
update the unittest.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D145521
Add a check to detect that the profiled binary was build with position
independent code. Add a test with a pie binary to which can be reused
later when support is added. Also clean up the error messages with
trailing colons.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D128564
This change prints out the segment information in the raw profile in
YAML format for testing. Since we don't capture build ids yet, we print
out <None> for now.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D126840
Update the YAML format print out of the profile to include a summary
instead of displaying the headers in the raw file buffer. This allows us
to release the raw buffer early saving memory.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D126834
Extend the Frame struct to hold the symbol name if requested
when a RawMemProfReader object is constructed. This change updates the
tests and removes the need to pass --debug to obtain the mapping from
GUID to symbol names.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D126344
The current implementation of memprof information in the indexed profile
format stores the representation of each calling context fram inline.
This patch uses an interned representation where the frame contents are
stored in a separate on-disk hash table. The table is indexed via a hash
of the contents of the frame. With this patch, the compressed size of a
large memprof profile reduces by ~22%.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D123094
This reverts commit 0d362c90d335509c57c0fbd01ae1829e2b9c3765.
Reason: Causes the MSan buildbot to fail (see comments on
https://reviews.llvm.org/D121179 for more information
To ease profile annotation, each of the callsites in a function can be
annotated with profile data - "IR metadata format for MemProf" [1]. This
patch extends the on-disk serialized record format to store the debug
information for allocation callsites incl inline frames. This change is
incompatible with the existing format i.e. indexed profiles must be
regenerated, raw profiles are unaffected.
[1] https://groups.google.com/g/llvm-dev/c/aWHsdMxKAfE/m/WtEmRqyhAgAJ
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D121179
Since DI frames are enumerated with the leaf function at index 0, this
patch fixes the logic when IsInlineFrame is set. Also update the
unittests to check that only the last frame is marked as non-inline from
a set of DI Frames for a PC address.
Differential Revision: https://reviews.llvm.org/D121830
We add to ensure that we are observing the correct callstack order in memprof
during symbolization. There was some confusion whether the order of
DIFrame objects were reversed but in reality the leaf function is at
index 0 so no code changes are required.
Differential Revision: https://reviews.llvm.org/D121759
This patch filters out callstack frames which can't be symbolized or if
the frames belong to the runtime. Symbolization may not be possible if
debug information is unavailable or if the addresses are from a shared
library. For now we only support optimization of the main binary which
is statically linked to the compiler runtime.
Differential Revision: https://reviews.llvm.org/D120860
Currently, symbolization of stack frames occurs on demand when the instrprof writer
iterates over all the records in the raw memprof reader. With this
change we symbolize and cache the frames immediately after reading the
raw profiles. For a large internal binary this results in a runtime
reduction of ~50% (2m -> 48s) when merging a memprof raw profile with a
raw instr profile to generate an indexed profile. This change also makes
it simpler in the future to generate additional calling context
metadata to attach to each memprof record.
Differential Revision: https://reviews.llvm.org/D120430
This patch adds support for optional memory profile information to be
included with and indexed profile. The indexed profile header adds a new
field which points to the offset of the memory profile section (if
present) in the indexed profile. For users who do not utilize this
feature the only overhead is a 64-bit offset in the header.
The memory profile section contains (1) profile metadata describing the
information recorded for each entry (2) an on-disk hashtable containing
the profile records indexed via llvm::md5(function_name). We chose to
introduce a separate hash table instead of the existing one since the
indexing for the instrumented fdo hash table is based on a CFG hash
which itself is perturbed by memprof instrumentation.
This commit also includes the changes reviewed separately in D120093.
Differential Revision: https://reviews.llvm.org/D120103
This reverts commit 85355a560a33897453df2ef959e255ee725eebce.
This patch adds support for optional memory profile information to be
included with and indexed profile. The indexed profile header adds a new
field which points to the offset of the memory profile section (if
present) in the indexed profile. For users who do not utilize this
feature the only overhead is a 64-bit offset in the header.
The memory profile section contains (1) profile metadata describing the
information recorded for each entry (2) an on-disk hashtable containing
the profile records indexed via llvm::md5(function_name). We chose to
introduce a separate hash table instead of the existing one since the
indexing for the instrumented fdo hash table is based on a CFG hash
which itself is perturbed by memprof instrumentation.
Differential Revision: https://reviews.llvm.org/D118653
This reverts commit e6999040f5758f89a64b6232119b775b7bd1c85b.
Update test to fix signed int comparison warning, fix whitespace in
compiler-rt MIBEntryDef.inc file.
Differential Revision: https://reviews.llvm.org/D117256
This reverts commit 0f73fb18ca333e38cdb9ffa701a8db026c56041d.
Use llvm/Profile/MIBEntryDef.inc instead of relative path.
Generated the raw profile data with `-mllvm
-enable-name-compression=false` so that builbots where the reader is
built without zlib do not fail.
Also updated the test build instructions.
This reverts commit 43c2348c5b926df6bdbc5b70efaa35ecdefe12d5.
Buildbots are failing with an error on reading memprof testdata.
"Inputs/basic.profraw: profile uses zlib
compression but the profile reader was built without zlib support"
https://lab.llvm.org/buildbot/#/builders/16/builds/24490
This patch adds support for optional memory profile information to be
included with and indexed profile. The indexed profile header adds a new
field which points to the offset of the memory profile section (if
present) in the indexed profile. For users who do not utilize this
feature the only overhead is a 64-bit offset in the header.
The memory profile section contains (1) profile metadata describing the
information recorded for each entry (2) an on-disk hashtable containing
the profile records indexed via llvm::md5(function_name). We chose to
introduce a separate hash table instead of the existing one since the
indexing for the instrumented fdo hash table is based on a CFG hash
which itself is perturbed by memprof instrumentation.
Differential Revision: https://reviews.llvm.org/D118653
Use the macro based format to add a wrapper around the MemInfoBlock
when stored in the MemProfRecord. This wrapped block can then be
serialized/deserialized based on a schema specified by a list of enums.
Differential Revision: https://reviews.llvm.org/D117256
The ContainingSegment variable is only used to check whether we found
the address at this time. When building without Asserts this emits a
warning. So for now wrap this code in LLVM_DEBUG to avoid the warning.
This reverts commit dbf47d227d080e4eb7239b589660f51d7b08afa9.
Reapply https://reviews.llvm.org/D116784 now that
https://reviews.llvm.org/D118413 has landed with a couple of fixes:
* fix raw profile reader unaligned access identified by ubsan
* fix windows build by using MOCK_CONST_METHOD3 instead of MOCK_METHOD.