85 Commits

Author SHA1 Message Date
Kazu Hirata
9b94869942
[memprof] Use front instead of begin in a unit test (NFC) (#119501)
"front" allows us to drop a dereference.
2024-12-11 09:16:44 -08:00
Kazu Hirata
76b493128c
[memprof] Accept a function name in YAML (#119453)
This patch does two things:

- During deserialization, we accept a function name as an alternative
  to the usual GUID represented as a hexadecimal number.

- During serialization, we print a GUID as a 16-digit hexadecimal
  number prefixed with 0x in the usual way.  (Without this patch, we
  print a decimal number, which is not customary.)

In YAML, the MemProf profile is a vector of pairs of GUID and
MemProfRecord.  This patch accepts a function name for the GUID, but
it does not accept a function name for the GUID used in Frames yet.
That will be addressed in a subsequent patch.
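
Illustration (not part of the patch): a minimal sketch of the two behaviors described above, assuming hypothetical helpers named formatGuid and parseGuidOrName.  The real patch works through LLVM's YAML traits and derives the GUID from the function name with LLVM's own hashing; std::hash below is only a stand-in for that step.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>

// Hypothetical helper: print a GUID as "0x" followed by 16 hex digits.
std::string formatGuid(uint64_t Guid) {
  char Buf[19]; // "0x" + 16 digits + terminating NUL
  std::snprintf(Buf, sizeof(Buf), "0x%016llx",
                static_cast<unsigned long long>(Guid));
  return Buf;
}

// Hypothetical helper: accept either a hexadecimal GUID or a function name.
// std::hash is a stand-in for the real name-to-GUID hashing (an assumption).
uint64_t parseGuidOrName(const std::string &Text) {
  bool IsHex = Text.size() > 2 && Text.size() <= 18 && Text[0] == '0' &&
               (Text[1] == 'x' || Text[1] == 'X');
  for (std::size_t I = 2; IsHex && I < Text.size(); ++I)
    IsHex = std::isxdigit(static_cast<unsigned char>(Text[I])) != 0;
  if (IsHex)
    return std::stoull(Text.substr(2), nullptr, 16);
  return std::hash<std::string>{}(Text);
}
```

With these helpers, a value written by formatGuid round-trips through parseGuidOrName, while any text that is not a 0x-prefixed hexadecimal number is treated as a function name.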
2024-12-10 21:29:57 -08:00
Kazu Hirata
1708519fe9
[memprof] Use std::make_unique in unit tests (NFC) (#119175)
2024-12-09 09:38:46 -08:00
Kazu Hirata
6a137fbe64
[memprof] Use namespaces in a unit test (#119144)
MemProfTest.cpp is about MemProf, so mentioning llvm::memprof
everywhere is quite verbose.
2024-12-08 22:44:43 -08:00
Kazu Hirata
b6dfdd2b1e
[memprof] Drop memprof:: in unit tests (NFC) (#119113)
This patch replaces memprof::Foo with Foo if we have corresponding:

  using llvm::memprof::Foo;
2024-12-08 11:34:29 -08:00
Kazu Hirata
6c062afc2e
[memprof] Compare Frames instead of FrameIds in a unit test (#119111)
When we call IndexedMemProfRecord::toMemProfRecord, we care about
getting the original (that is, non-indexed) MemProfRecord back, so we
should just verify that, not the hash values, which are
intermediaries.

There is a remote possibility of hash collisions where call stack
{F1, F2} might come back as {F1, F1} if F1.hash() == F2.hash() for
example.  However, since FrameId uses BLAKE, the hash values should be
consistent across architectures.  That is, if this test case works on
one architecture, it should work on others as well.
2024-12-08 08:33:05 -08:00
Kazu Hirata
8eb5baf5ea
[memprof] Use IndexedMemProfData in a unit test (NFC) (#119062)
IndexedMemProfData eliminates the need for the "using" directives.
Also, we do not need to declare maps for individual components of the
MemProf profile.
2024-12-07 10:41:55 -08:00
Kazu Hirata
32f7f0010b
[memprof] Use gtest matchers at more places (#119050)
These gtest matchers reduce the number of times we mention the
variables under examination.
2024-12-06 22:46:54 -08:00
Kazu Hirata
00090ac0b9
[memprof] Use IndexedMemProfData in tests (NFC) (#119049)
This patch replaces FrameIdMap and CallStackIdMap with
IndexedMemProfData, which comes with recently introduced methods like
addFrame and addCallStack.
2024-12-06 22:46:20 -08:00
Kazu Hirata
c5e4e8f87d
[memprof] Add IndexedMemProfData::addCallStack (#118920)
This patch adds a helper function to replace an idiom like:

  CallStackId CSId = hashCallStack(CallStack);
  MemProfData.CallStacks.try_emplace(CSId, CallStack);
  // Do something with CSId.
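
Illustration (not part of the patch): a self-contained sketch of how such a helper can fold the idiom into one call.  ProfileDataSketch and hashCallStackSketch are hypothetical names, std::unordered_map stands in for the real container, and the FNV-1a hash is a placeholder for the real hashCallStack.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

using FrameId = uint64_t;
using CallStackId = uint64_t;

// Placeholder hash (FNV-1a over the bytes of each FrameId); the real
// hashCallStack is not reproduced here.
CallStackId hashCallStackSketch(const std::vector<FrameId> &CS) {
  uint64_t H = 0xcbf29ce484222325ULL;
  for (FrameId F : CS)
    for (int I = 0; I < 8; ++I) {
      H ^= (F >> (8 * I)) & 0xff;
      H *= 0x100000001b3ULL;
    }
  return H;
}

struct ProfileDataSketch {
  std::unordered_map<CallStackId, std::vector<FrameId>> CallStacks;

  // Hash the call stack, insert it if absent, and return its ID.
  CallStackId addCallStack(const std::vector<FrameId> &CS) {
    CallStackId CSId = hashCallStackSketch(CS);
    CallStacks.try_emplace(CSId, CS);
    return CSId;
  }
};
```

Adding the same call stack twice returns the same ID and leaves a single entry in the map, which is the property the idiom relies on.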
2024-12-06 12:10:11 -08:00
Kazu Hirata
d88a0c7322
[memprof] Rename Inline to IsInlineFrame in YAML (#118901)
This patch makes the YAML field name match the struct field name.
2024-12-05 18:39:03 -08:00
Kazu Hirata
dbd920b290
Reapply [memprof] Update YAML traits for writer purposes (#118720)
For Frames, we prefer the inline notation for brevity.

For PortableMemInfoBlock, we go through all member fields and print
out those that are populated.

This iteration works around the unavailability of
ScalarTraits<uintptr_t> on macOS.
2024-12-05 13:19:19 -08:00
Florian Hahn
0772a0bd29
Revert "[memprof] Update YAML traits for writer purposes (#118720)"
This reverts commit 7b8cf147addf7d3fb4630475c40153226f5fdbd0.

Breaks building on macOS
    https://lab.llvm.org/buildbot/#/builders/190/builds/10737
    https://lab.llvm.org/buildbot/#/builders/23/builds/5491
    https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-cmake-RA-incremental/6076/
2024-12-05 09:54:07 +00:00
Kazu Hirata
50f8580e2c
[memprof] Add IndexedMemProfData::addFrame (#118724)
This patch adds a helper function to replace an idiom like:

  FrameId Id = F.hash();
  MemProfData.Frames.try_emplace(Id, F);
  // Do something with Id.
2024-12-04 20:33:35 -08:00
Kazu Hirata
7b8cf147ad
[memprof] Update YAML traits for writer purposes (#118720)
For Frames, we prefer the inline notation for brevity.

For PortableMemInfoBlock, we go through all member fields and print
out those that are populated.
2024-12-04 19:23:27 -08:00
Kazu Hirata
82b437944e
[memprof] Use "using" directives in unit tests (NFC) (#117852)
This patch uses existing "using" directives to shorten unit tests.

- llvm::memprof::hashCallStack -> hashCallStack
- testing::Pair -> Pair
- testing::ElementsAreArray -> ElementsAre
- testing::Contains -> UnorderedElementsAre
2024-11-27 11:11:45 -08:00
Kazu Hirata
e98396f484
Reapply [memprof] Add YAML-based deserialization for MemProf profile (#117829)
This patch adds YAML-based deserialization for MemProf profile.

It's been painful to write tests for MemProf passes because we do not
have a text format for the MemProf profile.  We would write a test
case in C++, run it for a binary MemProf profile, and then finally run
a test written in LLVM IR with the binary profile.

This patch paves the way toward a YAML-based MemProf profile.
Specifically, it adds a new class, YAMLMemProfReader, derived from
MemProfReader.  For now, it only adds a function to parse a StringRef
pointing to YAML data.  Subsequent patches will wire it to
llvm-profdata and read from a file.

The field names are based on various printYAML functions in MemProf.h.
I'm not aiming for compatibility with the format used in printYAML,
but I don't see a point in changing the field names.

This iteration works around the unavailability of
ScalarTraits<uintptr_t> on macOS.
2024-11-27 08:19:07 -08:00
Florian Hahn
7e312c3b90
Revert "[memprof] Add YAML-based deserialization for MemProf profile (#117829)"
This reverts commit c00e53208db638c35499fc80b555f8e14baa35f0.

It looks like this breaks building LLVM on macOS and some other
platform/compiler combos

https://lab.llvm.org/buildbot/#/builders/23/builds/5252
https://green.lab.llvm.org/job/llvm.org/job/clang-san-iossim/5356/console

In file included from /Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/lib/ProfileData/MemProfReader.cpp:34:
In file included from /Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/include/llvm/ProfileData/MemProfReader.h:24:
In file included from /Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/include/llvm/ProfileData/InstrProfReader.h:22:
In file included from /Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/include/llvm/ProfileData/InstrProfCorrelator.h:21:
/Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/include/llvm/Support/YAMLTraits.h:1173:36: error: implicit instantiation of undefined template 'llvm::yaml::MissingTrait<unsigned long>'
  char missing_yaml_trait_for_type[sizeof(MissingTrait<T>)];
                                   ^
/Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/include/llvm/Support/YAMLTraits.h:961:7: note: in instantiation of function template specialization 'llvm::yaml::yamlize<unsigned long>' requested here
      yamlize(*this, Val, Required, Ctx);
      ^
/Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/include/llvm/Support/YAMLTraits.h:883:11: note: in instantiation of function template specialization 'llvm::yaml::IO::processKey<unsigned long, llvm::yaml::EmptyContext>' requested here
    this->processKey(Key, Val, true, Ctx);
          ^
/Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/include/llvm/ProfileData/MIBEntryDef.inc:55:1: note: in instantiation of function template specialization 'llvm::yaml::IO::mapRequired<unsigned long>' requested here
MIBEntryDef(AccessHistogram = 27, AccessHistogram, uintptr_t)
^
/Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/lib/ProfileData/MemProfReader.cpp:77:8: note: expanded from macro 'MIBEntryDef'
    Io.mapRequired(KeyStr.str().c_str(), MIB.Name);                            \
       ^
/Users/ec2-user/jenkins/workspace/llvm.org/clang-san-iossim/llvm-project/llvm/include/llvm/Support/YAMLTraits.h:310:8: note: template is declared here
struct MissingTrait;
       ^
1 error generated.
2024-11-27 09:04:16 +00:00
Kazu Hirata
c00e53208d
[memprof] Add YAML-based deserialization for MemProf profile (#117829)
This patch adds YAML-based deserialization for MemProf profile.

It's been painful to write tests for MemProf passes because we do not
have a text format for the MemProf profile.  We would write a test
case in C++, run it for a binary MemProf profile, and then finally run
a test written in LLVM IR with the binary profile.

This patch paves the way toward a YAML-based MemProf profile.
Specifically, it adds a new class, YAMLMemProfReader, derived from
MemProfReader.  For now, it only adds a function to parse a StringRef
pointing to YAML data.  Subsequent patches will wire it to
llvm-profdata and read from a file.

The field names are based on various printYAML functions in MemProf.h.
I'm not aiming for compatibility with the format used in printYAML,
but I don't see a point in changing the field names.
2024-11-26 23:48:03 -08:00
Kazu Hirata
5add295fd7
[memprof] Use IndexedMemProfData in MemProfReader (NFC) (#117613)
IndexedMemProfData contains a complete package of the MemProf
profile, including frames, call stacks, and records.  This patch
replaces the three member variables of MemProfReader with
IndexedMemProfData.

This transition significantly simplifies both the constructor and the
final "take" method:

  MemProfReader(IndexedMemProfData MemProfData)
      : MemProfData(std::move(MemProfData)) {}

  IndexedMemProfData takeMemProfData() { return std::move(MemProfData); }
2024-11-26 14:33:45 -08:00
Kazu Hirata
ff7b42c194
[memprof] Speed up llvm-profdata (#117446)
CallStackRadixTreeBuilder::build takes the parameter
MemProfFrameIndexes by value, involving copies:

  std::optional<const llvm::DenseMap<FrameIdTy, LinearFrameId>>
    MemProfFrameIndexes

Then "build" makes another copy of MemProfFrameIndexes and passes it to
encodeCallStack for every call stack, which is painfully slow.

This patch changes the type to a pointer so that we don't have to make
a copy every time we pass the argument.

Without this patch, it takes 553 seconds to run "llvm-profdata merge"
on a large MemProf raw profile.  This patch shortens that to 67
seconds.
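
Illustration (not part of the patch): a small sketch of why the by-value parameter is costly.  CountingMap, encodeByValue, encodeByPointer, and countCopies are hypothetical names; CountingMap is a stand-in for the DenseMap and simply counts how often it is copied.

```cpp
#include <optional>
#include <unordered_map>

int CopyCount = 0; // number of CountingMap copies made so far

// Hypothetical stand-in for the frame-index map; counts its own copies.
struct CountingMap {
  std::unordered_map<unsigned, unsigned> Data;
  CountingMap() = default;
  CountingMap(const CountingMap &Other) : Data(Other.Data) { ++CopyCount; }
};

// Before: the optional map is copied on every call.
unsigned encodeByValue(std::optional<const CountingMap> Indexes,
                       unsigned Frame) {
  return Indexes ? Indexes->Data.at(Frame) : Frame;
}

// After: a pointer (nullptr meaning "no map") is passed without copying.
unsigned encodeByPointer(const CountingMap *Indexes, unsigned Frame) {
  return Indexes ? Indexes->Data.at(Frame) : Frame;
}

// Encodes NumCallStacks call stacks and reports how many map copies it cost.
int countCopies(bool UsePointer, int NumCallStacks) {
  CountingMap Map;
  Map.Data[7] = 0;
  std::optional<const CountingMap> Opt(Map);
  CopyCount = 0;
  for (int I = 0; I < NumCallStacks; ++I) {
    if (UsePointer)
      encodeByPointer(&*Opt, 7);
    else
      encodeByValue(Opt, 7);
  }
  return CopyCount;
}
```

In this sketch the by-value variant copies the whole map once per call stack, while the pointer variant makes no copies at all.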
2024-11-24 21:08:54 -08:00
Kazu Hirata
ad2bdd8fab
[memprof] Remove MemProf format Version 1 (#117357)
This patch removes MemProf format Version 1 now that Version 2 and 3
are working well.
2024-11-22 11:53:31 -08:00
Kazu Hirata
b170ab21c3
[memprof] Construct MemProfReader with IndexedMemProfData (#117022)
This patch updates a unit test to construct MemProfReader with
IndexedMemProfData, a complete package of MemProf profile.

With this change, nobody in the LLVM codebase is using the
MemProfReader constructor that takes individual components of the
MemProf profile, so this patch deprecates the constructor.
2024-11-20 10:52:17 -08:00
Teresa Johnson
e14827f082
[MemProf] Templatize CallStackRadixTreeBuilder (NFC) (#117014)
Prepare for usage in the bitcode reader/writer where we already have a
LinearFrameId:
- templatize input frame id type in CallStackRadixTreeBuilder
- templatize input frame id type in computeFrameHistogram
- make the map from FrameId to LinearFrameId optional

We plan to use the same radix format in the ThinLTO summary records,
where we already have a LinearFrameId.
2024-11-20 10:08:58 -08:00
Kazu Hirata
f88c913f8a
[memprof] Add a new constructor to MemProfReader (NFC) (#116918)
This patch adds a new constructor to MemProfReader that takes
IndexedMemProfData, a complete package of MemProf profile.  To
showcase its usage, I'm updating one of the unit tests to use the new
constructor.

Because of type mismatches between DenseMap and MapVector, I'm copying
Frames and CallStacks for now.  Once we remove the methods and old
constructors that take or return individual components (frames, call
stacks, and records), we will drop the copying, and the new
constructor will collapse down to:

  MemProfReader(IndexedMemProfData MemProfData)
    : MemProfData(std::move(MemProfData)) {}

Since nobody in the LLVM codebase uses the constructor that takes the
three individual components, I'm deprecating the old constructor.
2024-11-20 08:35:54 -08:00
Kazu Hirata
a4e1a3dc8b
[memprof] Add another constructor to IndexedAllocationInfo (NFC) (#116684)
This patch adds another constructor to IndexedAllocationInfo that is
identical to the existing constructor except that the new one leaves
the CallStack field empty.

I'm planning to remove MemProf format Version 1.  Then we will migrate
the users of the existing constructor to the new one as nobody will be
using the CallStack field anymore.

Adding the new constructor now allows us to migrate a few existing
users of the old constructor even before we remove the CallStack
field.  In turn, that simplifies the patch to actually remove the
field.
2024-11-18 14:09:21 -08:00
Kazu Hirata
0d38f64e7d
[memprof] Remove MemProf format Version 0 (#116442)
This patch removes MemProf format Version 0 now that version 2 and 3
seem to be working well.

I'm not touching version 1 for now because some tests still rely on
version 1.

Note that Version 0 is identical to Version 1 except that the MemProf
section of the indexed format has a MemProf version field.
2024-11-15 15:37:00 -08:00
JOE1994
459a82e689
[llvm][unittests] Don't call raw_string_ostream::flush() (NFC)
raw_string_ostream::flush() is essentially a no-op (also specified in docs).
Don't call it in tests that aren't meant to test 'raw_string_ostream' itself.

p.s. remove a few redundant calls to raw_string_ostream::str()
2024-09-13 19:55:44 -04:00
Matthew Weingarten
30b93db547
[Memprof] Adds the option to collect AccessCountHistograms for memprof. (#94264)
Adds compile time flag -mllvm -memprof-histogram and runtime flag
histogram=true|false to turn Histogram collection on and off. The
-memprof-histogram flag relies on -memprof-use-callbacks=true to work.

Updates shadow mapping logic in histogram mode from having one 8 byte
counter for 64 bytes, to 1 byte for 8 bytes, capped at 255. Only
supports this granularity as of now.

Updates the RawMemprofReader and serializing MemoryInfoBlocks to binary
format, including changing to a new version of the raw binary format
from version 3 to version 4.

Updates creating MemoryInfoBlocks with and without Histograms. When two
MemoryInfoBlocks are merged, AccessCounts are summed up and the shorter
Histogram is removed.

Adds a memprof_histogram test case.

Initial commit for adding AccessCountHistograms up until RawProfile for
memprof
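
Illustration (not part of the patch): a simplified sketch of the histogram-mode granularity described above, with one 8-bit counter per 8 bytes that saturates at 255.  HistogramShadowSketch is a hypothetical class indexed by offset within one allocation; the real runtime maps application addresses to shadow memory, which is not modeled here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: one saturating 8-bit counter per 8-byte granule.
class HistogramShadowSketch {
  std::vector<uint8_t> Counters;

public:
  explicit HistogramShadowSketch(std::size_t AllocBytes)
      : Counters((AllocBytes + 7) / 8) {}

  // Record one access at the given byte offset into the allocation.
  void recordAccess(std::size_t Offset) {
    uint8_t &C = Counters[Offset / 8];
    if (C < 255) // cap at 255 instead of wrapping around
      ++C;
  }

  uint8_t count(std::size_t Granule) const { return Counters[Granule]; }
  std::size_t size() const { return Counters.size(); }
};
```

A 64-byte allocation gets eight counters under this scheme, and a granule that is accessed more than 255 times keeps reporting 255.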
2024-06-26 08:37:22 -07:00
Kazu Hirata
dc3f8c2f58
[memprof] Improve deserialization performance in V3 (#94787)
We call llvm::sort in a couple of places in the V3 encoding:

- We sort Frames by FrameIds for stability of the output.

- We sort call stacks in the dictionary order to maximize the length
  of the common prefix between adjacent call stacks.

It turns out that we can improve the deserialization performance by
modifying the comparison functions -- without changing the format at
all.  Both places take advantage of the histogram of Frames -- how
many times each Frame occurs in the call stacks.

- Frames: We serialize popular Frames in the descending order of
  popularity for improved cache locality.  For two equally popular
  Frames, we break a tie by serializing one that tends to appear
  earlier in call stacks.  Here, "earlier" means a smaller index
  within llvm::SmallVector<FrameId>.

- Call Stacks: We sort the call stacks to reduce the number of times
  we follow pointers to parents during deserialization.  Specifically,
  instead of comparing two call stacks in the strcmp style -- integer
  comparisons of FrameIds, we compare two FrameIds F1 and F2 with
  Histogram[F1] < Histogram[F2] at respective indexes.  Since we
  encode from the end of the sorted list of call stacks, we tend to
  encode popular call stacks first.

Since the two places use the same histogram, we compute it once and
share it in the two places.

Sorting the call stacks reduces the number of "jumps" by 74% when we
deserialize all MemProfRecords.  The cycle and instruction counts go
down by 10% and 1.5%, respectively.

If we sort the Frames in addition to the call stacks, then the cycle
and instruction counts go down by 14% and 1.6%, respectively, relative
to the same baseline (that is, without this patch).
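
Illustration (not part of the patch): a sketch of the Frame ordering described above.  computeHistogramSketch and orderFramesSketch are hypothetical names, and using the sum of positions as the "appears earlier" tie-breaker is an assumption made for this example; the FrameId comparison at the end only keeps the order deterministic.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using FrameId = uint64_t;

struct FrameStat {
  uint64_t Count = 0;       // how many times the Frame occurs
  uint64_t PositionSum = 0; // sum of its indexes within call stacks
};

// Build the histogram of Frames across all call stacks.
std::map<FrameId, FrameStat>
computeHistogramSketch(const std::vector<std::vector<FrameId>> &CallStacks) {
  std::map<FrameId, FrameStat> H;
  for (const auto &CS : CallStacks)
    for (std::size_t I = 0; I < CS.size(); ++I) {
      ++H[CS[I]].Count;
      H[CS[I]].PositionSum += I;
    }
  return H;
}

// Popular Frames first; ties go to the Frame that tends to appear earlier.
std::vector<FrameId> orderFramesSketch(const std::map<FrameId, FrameStat> &H) {
  std::vector<FrameId> Order;
  for (const auto &KV : H)
    Order.push_back(KV.first);
  std::sort(Order.begin(), Order.end(), [&](FrameId A, FrameId B) {
    const FrameStat &SA = H.at(A);
    const FrameStat &SB = H.at(B);
    if (SA.Count != SB.Count)
      return SA.Count > SB.Count;
    if (SA.PositionSum != SB.PositionSum)
      return SA.PositionSum < SB.PositionSum;
    return A < B;
  });
  return Order;
}
```

The same histogram can be computed once and reused for both the Frame order and the call stack comparison, as the commit message notes.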
2024-06-07 17:25:57 -07:00
Kazu Hirata
c348e265bd
[memprof] Use CallStackRadixTreeBuilder in the V3 format (#94708)
This patch integrates CallStackRadixTreeBuilder into the V3 format,
reducing the profile size to about 27% of the V2 profile size.

- Serialization: writeMemProfCallStackArray just needs to write out
  the radix tree array prepared by CallStackRadixTreeBuilder.
  Mappings from CallStackIds to LinearCallStackIds are moved by new
  function CallStackRadixTreeBuilder::takeCallStackPos.

- Deserialization: Deserializing a call stack is the same as
  deserializing an array encoded in the obvious manner -- the length
  followed by the payload, except that we need to follow a pointer to
  the parent to take advantage of common prefixes once in a while.
  This patch teaches LinearCallStackIdConverter how to handle those
  pointers.
2024-06-07 07:19:36 -07:00
Kazu Hirata
5c0df5fe22
[memprof] Add CallStackRadixTreeBuilder (#93784)
Call stacks are a huge portion of the MemProf profile, taking up 70+%
of the profile file size.

This patch implements a radix tree to compress call stacks, which are
known to have long common prefixes.  Specifically,
CallStackRadixTreeBuilder, introduced in this patch, takes call stacks
in the MemProf profile, sorts them in the dictionary order to maximize
the common prefix between adjacent call stacks, and then encodes a
radix tree into a single array that is ready for serialization.

The resulting radix array is essentially a concatenation of call stack
arrays, each encoded with its length followed by the payload, except
that these arrays contain "instructions" like "skip 7 elements
forward" to borrow common prefixes from other call stacks.

This patch does not integrate with the MemProf
serialization/deserialization infrastructure yet.  Once integrated,
the radix tree is expected to roughly halve the file size of the
MemProf profile.
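
Illustration (not part of the patch): a sketch of the first step only, sorting call stacks in dictionary order and measuring how many frames each call stack shares with its predecessor.  commonPrefixLength and sharedFramesAfterSort are hypothetical names, and the actual radix array encoding is not reproduced here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using FrameId = uint64_t;
using CallStack = std::vector<FrameId>;

// Length of the common prefix of two call stacks.
std::size_t commonPrefixLength(const CallStack &A, const CallStack &B) {
  auto P = std::mismatch(A.begin(), A.end(), B.begin(), B.end());
  return static_cast<std::size_t>(P.first - A.begin());
}

// Sort in dictionary order, then count the frames that each call stack
// could borrow from the one just before it.
std::size_t sharedFramesAfterSort(std::vector<CallStack> Stacks) {
  std::sort(Stacks.begin(), Stacks.end());
  std::size_t Shared = 0;
  for (std::size_t I = 1; I < Stacks.size(); ++I)
    Shared += commonPrefixLength(Stacks[I - 1], Stacks[I]);
  return Shared;
}
```

The larger the shared count relative to the total number of frames, the more an encoding that borrows prefixes from neighboring call stacks can save.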
2024-06-06 15:52:45 -07:00
Kazu Hirata
26fabdded3
[memprof] Pass FrameIdConverter and CallStackIdConverter by reference (#92327)
CallStackIdConverter sets LastUnmappedId when a mapping failure
occurs.  Now, since toMemProfRecord takes an instance of
CallStackIdConverter by value, namely std::function, the caller of
toMemProfRecord never receives the mapping failure that occurs inside
toMemProfRecord.  The same problem applies to FrameIdConverter.

The patch fixes the problem by passing FrameIdConverter and
CallStackIdConverter by reference, namely llvm::function_ref.

While I am at it, this patch deletes the copy constructor and copy
assignment operator to avoid accidental copies.
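
Illustration (not part of the patch): a sketch of the underlying problem using only the standard library.  IdConverterSketch is a hypothetical converter, and a templated reference parameter stands in for llvm::function_ref, which is not available outside LLVM.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <optional>

// Hypothetical converter that remembers the last ID it failed to map.
struct IdConverterSketch {
  const std::map<uint64_t, int> &Map;
  std::optional<uint64_t> LastUnmappedId;

  int operator()(uint64_t Id) {
    auto It = Map.find(Id);
    if (It == Map.end()) {
      LastUnmappedId = Id;
      return 0;
    }
    return It->second;
  }
};

// By value: std::function stores its own copy of the converter.
void convertByValue(std::function<int(uint64_t)> Conv, uint64_t Id) {
  Conv(Id);
}

// By reference: the caller's converter is the one that gets updated.
template <typename Callable> void convertByRef(Callable &Conv, uint64_t Id) {
  Conv(Id);
}

// Returns true if the caller can observe a mapping failure afterwards.
bool failureVisible(bool ByReference) {
  std::map<uint64_t, int> Map{{1, 10}};
  IdConverterSketch Conv{Map, std::nullopt};
  if (ByReference)
    convertByRef(Conv, 2);
  else
    convertByValue(Conv, 2);
  return Conv.LastUnmappedId.has_value();
}
```

In the by-value case the failure is recorded in the copy held by std::function and is lost when the call returns; in the by-reference case the caller sees it.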
2024-05-15 17:53:28 -07:00
Mircea Trofin
181e2e8fb9
[nfc][memprof] Add missing license to MemProfTest (#91695)
2024-05-09 20:47:10 -07:00
Kazu Hirata
c9dae43438
[memprof] Add access checks to PortableMemInfoBlock::get* (#90121)
commit 4c8ec8f8bc3fb4dda4fd36c3b2ad745bd3451970
  Author: Kazu Hirata <kazu@google.com>
  Date:   Wed Apr 24 16:25:35 2024 -0700

introduced the idea of serializing/deserializing a subset of the
fields in PortableMemInfoBlock.  While it reduces the size of the
indexed MemProf profile file, we now could inadvertently access
unavailable fields and go without noticing.

To protect ourselves from the risk, this patch adds access checks to
PortableMemInfoBlock::get* methods by embedding a bit set representing
available fields into PortableMemInfoBlock.
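
Illustration (not part of the patch): a sketch of access checks backed by a bit set.  MemInfoBlockSketch and the three-field Field enum are hypothetical and much smaller than the real PortableMemInfoBlock; the field names are chosen for illustration only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical subset of fields, for illustration only.
enum class Field : std::size_t { AllocCount, TotalSize, TotalLifetime, NumFields };
constexpr std::size_t FieldCount = static_cast<std::size_t>(Field::NumFields);

class MemInfoBlockSketch {
  std::bitset<FieldCount> Available; // one bit per populated field
  uint64_t Values[FieldCount] = {};

public:
  void set(Field F, uint64_t V) {
    Available.set(static_cast<std::size_t>(F));
    Values[static_cast<std::size_t>(F)] = V;
  }

  bool has(Field F) const {
    return Available.test(static_cast<std::size_t>(F));
  }

  // The getter asserts that the field was actually deserialized.
  uint64_t get(Field F) const {
    assert(has(F) && "field is not available in this schema");
    return Values[static_cast<std::size_t>(F)];
  }
};
```

Reading a field that was never populated now trips the assertion in builds with assertions enabled instead of silently returning a default value.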
2024-04-28 12:49:08 -07:00
Kazu Hirata
352602010f
Reapply [memprof] Introduce FrameIdConverter and CallStackIdConverter (#90307)
Currently, we convert FrameId to Frame and CallStackId to a call stack
at several places.  This patch unifies those into function objects --
FrameIdConverter and CallStackIdConverter.

The existing implementation of CallStackIdConverter, being removed in
this patch, handles both FrameId and CallStackId conversions.  This
patch splits it into two phases for flexibility (but makes them
composable) because some places only require the FrameId conversion.

This iteration fixes a problem uncovered with ubsan, where we were
dereferencing an uninitialized std::unique_ptr.
2024-04-28 11:44:45 -07:00
Vitaly Buka
7aa6896dd7
Revert "[memprof] Introduce FrameIdConverter and CallStackIdConverter" (#90318)
Reverts llvm/llvm-project#90307

Breaks bots https://lab.llvm.org/buildbot/#/builders/5/builds/42943
2024-04-27 00:15:08 -07:00
Kazu Hirata
e04df693bf
[memprof] Introduce FrameIdConverter and CallStackIdConverter (#90307)
Currently, we convert FrameId to Frame and CallStackId to a call stack
at several places.  This patch unifies those into function objects --
FrameIdConverter and CallStackIdConverter.

The existing implementation of CallStackIdConverter, being removed in
this patch, handles both FrameId and CallStackId conversions.  This
patch splits it into two phases for flexibility (but makes them
composable) because some places only require the FrameId conversion.
2024-04-26 19:22:17 -07:00
Kazu Hirata
1f38b8a281
[memprof] Use DenseMap::contains (NFC) (#90124)
This patch replaces count with contains, following the spirit of
clang-tidy's readability-container-contains.
2024-04-25 15:39:42 -07:00
Kazu Hirata
cb9589b227
[memprof] Move getFullSchema and getHotColdSchema outside PortableMemInfoBlock (#90103)
These functions do not operate on PortableMemInfoBlock.  This patch
moves them outside the class.
2024-04-25 12:12:28 -07:00
Kazu Hirata
f9a0b467dd
[memprof] Remove getFullSchema in MemProfTest.cpp (#90072)
This patch removes getFullSchema in MemProfTest.cpp in favor of
llvm::memprof::PortableMemInfoBlock::getFullSchema as they do exactly
the same thing.
2024-04-25 10:24:19 -07:00
Kazu Hirata
3074060d6a
[memprof] Use SizeIs (NFC) (#88984)
2024-04-16 14:28:45 -07:00
Kazu Hirata
5422eb0b84
[memprof] Add another constructor to MemProfReader (#88952)
This patch enables users of MemProfReader to directly supply mappings
from CallStackId to actual call stacks.

Once the users of the current constructor without CSIdMap switch to
the new constructor, we'll have fewer users of:

- IndexedAllocationInfo::CallStack
- IndexedMemProfRecord::CallSites

bringing us one step closer to the removal of these fields in favor
of:

- IndexedAllocationInfo::CSId
- IndexedMemProfRecord::CallSiteIds
2024-04-16 11:50:49 -07:00
Kazu Hirata
8137bd9e03
[memprof] Use CSId to construct MemProfRecord (#88362)
We are in the process of referring to call stacks with CallStackId in
IndexedMemProfRecord and IndexedAllocationInfo instead of holding call
stacks inline (both in memory and the serialized format).  Doing so
deduplicates call stacks and reduces the MemProf profile file size.

Before we can eliminate the two fields holding call stacks inline:

- IndexedAllocationInfo::CallStack
- IndexedMemProfRecord::CallSites

we need to eliminate all the read operations on them.

This patch is a step toward that direction.  Specifically, we
eliminate the read operations in the context of MemProfReader and
RawMemProfReader.  A subsequent patch will eliminate the read
operations during the serialization.
2024-04-16 10:16:48 -07:00
Kazu Hirata
2bede6873d
[memprof] Rename RawMemProfReader.{cpp,h} to MemProfReader.{cpp,h} (NFC) (#88200)
This patch renames RawMemProfReader.{cpp,h} to MemProfReader.{cpp,h},
respectively.  Also, it re-creates RawMemProfReader.h just to include
MemProfReader.h for compatibility with out-of-tree users.
2024-04-10 22:03:20 -07:00
Kazu Hirata
d89914f30b
[memprof] Add Version2 of IndexedMemProfRecord serialization (#87455)
I'm currently developing a new version of the indexed memprof format
where we deduplicate call stacks in IndexedAllocationInfo::CallStack
and IndexedMemProfRecord::CallSites.  We refer to call stacks with
integer IDs, namely CallStackId, just as we refer to Frame with
FrameId.  The deduplication will cut down the profile file size by 80%
in a large memprof file of mine.

As a step toward the goal, this patch teaches
IndexedMemProfRecord::{serialize,deserialize} to speak Version2.  A
subsequent patch will add Version2 support to llvm-profdata.

The essence of the patch is to replace the serialization of a call
stack, a vector of FrameIDs, with that of a CallStackId.  That is:

  const IndexedAllocationInfo &N = ...;
  ...
  LE.write<uint64_t>(N.CallStack.size());
  for (const FrameId &Id : N.CallStack)
    LE.write<FrameId>(Id);

becomes:

  LE.write<CallStackId>(N.CSId);
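
Illustration (not part of the patch): a self-contained sketch that compares the two encodings by size.  writeLE64, serializeInline, and serializeById are hypothetical names, and a plain byte vector stands in for the real little-endian writer.

```cpp
#include <cstdint>
#include <vector>

using FrameId = uint64_t;
using CallStackId = uint64_t;

// Append a 64-bit value in little-endian byte order.
void writeLE64(std::vector<uint8_t> &Out, uint64_t V) {
  for (int I = 0; I < 8; ++I)
    Out.push_back(static_cast<uint8_t>(V >> (8 * I)));
}

// Old style: the length followed by every FrameId in the call stack.
std::vector<uint8_t> serializeInline(const std::vector<FrameId> &CallStack) {
  std::vector<uint8_t> Out;
  writeLE64(Out, CallStack.size());
  for (FrameId Id : CallStack)
    writeLE64(Out, Id);
  return Out;
}

// New style: a single CallStackId that refers to the call stack.
std::vector<uint8_t> serializeById(CallStackId CSId) {
  std::vector<uint8_t> Out;
  writeLE64(Out, CSId);
  return Out;
}
```

For a call stack of three frames, the inline form takes 32 bytes in this sketch while the ID form takes 8, independent of the call stack's depth.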
2024-04-03 21:48:38 -07:00
Kazu Hirata
74799f4240
[memprof] Add call stack IDs to IndexedAllocationInfo (#85888)
The indexed MemProf file has a huge amount of redundancy.  In a large
internal application, 82% of call stacks, stored in
IndexedAllocationInfo::CallStack, are duplicates.

We should work toward deduplicating call stacks by referring to them
with unique IDs with actual call stacks stored in a separate data
structure, much like we refer to memprof::Frame with memprof::FrameId.

At the same time, we need to facilitate a graceful transition from the
current version of the MemProf format to the next.  We should be able
to read (but not write) the current version of the MemProf file even
after we move on to the next one.

With those goals in mind, I propose to have an integer ID next to
CallStack in IndexedAllocationInfo to refer to a call stack in a
succinct manner.  We'll gradually increase the areas of the compiler
where IDs and call stacks have one-to-one correspondence and
eventually remove the existing CallStack field.

This patch adds call stack ID, named CSId, to IndexedAllocationInfo
and teaches the raw profile reader to compute unique call stack IDs
and store them in the new field.  It does not introduce any user of
the call stack IDs yet, except in verifyFunctionProfileData.
2024-03-23 19:50:15 -07:00
Serge Pavlov
cb1a7d28e6
[symbolizer] Support symbol+offset lookup (#75067)
GNU addr2line supports lookup by symbol name in addition to the existing
address lookup. llvm-symbolizer starting from
e144ae54dcb96838a6176fd9eef21028935ccd4f supports lookup by symbol name.
This change extends this lookup with the possibility of specifying an
optional offset.

Now the address for which source information is looked up can be
specified with an offset:

    llvm-symbolizer --obj=abc.so "SYMBOL func_22+0x12"

This narrows the feature gap between llvm-symbolizer and GNU addr2line.
This lookup is currently supported for code only.

Migrated from: https://reviews.llvm.org/D139859
Pull request: https://github.com/llvm/llvm-project/pull/75067
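
Illustration (not part of the patch): a sketch of splitting an input such as func_22+0x12 into a symbol name and an offset.  SymbolRef and parseSymbolWithOffset are hypothetical names, and the accepted syntax here is an assumption; llvm-symbolizer's own parser may differ.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

struct SymbolRef {
  std::string Name;
  uint64_t Offset = 0;
};

// Parse "name" or "name+offset"; the offset may be decimal or 0x-prefixed.
std::optional<SymbolRef> parseSymbolWithOffset(const std::string &Text) {
  std::size_t Plus = Text.rfind('+');
  if (Plus == std::string::npos) {
    if (Text.empty())
      return std::nullopt;
    return SymbolRef{Text, 0};
  }
  std::string Name = Text.substr(0, Plus);
  std::string Num = Text.substr(Plus + 1);
  if (Name.empty() || Num.empty() ||
      !std::isdigit(static_cast<unsigned char>(Num[0])))
    return std::nullopt;
  try {
    std::size_t Pos = 0;
    uint64_t Offset = std::stoull(Num, &Pos, 0); // base 0 detects "0x"
    if (Pos != Num.size())
      return std::nullopt;
    return SymbolRef{Name, Offset};
  } catch (const std::exception &) {
    return std::nullopt; // offset does not fit in 64 bits
  }
}
```

An input without a plus sign is treated as a bare symbol name with offset zero, and a malformed offset makes the whole input invalid rather than being silently dropped.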
2023-12-15 17:35:33 +07:00
Serge Pavlov
e144ae54dc
[symbolizer] Support symbol lookup
Recent versions of GNU binutils starting from 2.39 support symbol+offset
lookup in addition to the usual numeric address lookup. This change adds
symbol lookup to llvm-symbolizer and llvm-addr2line.

Now llvm-symbolizer behaves closer to GNU addr2line: if the value specified
as an address on the command line or in the input stream is not a number, it
is treated as a symbol name. For example:

    llvm-symbolizer --obj=abc.so func_22
    llvm-symbolizer --obj=abc.so "CODE func_22"

This lookup is currently supported only for functions. Specification with
an offset is not supported yet.

This is a recommit of 2b27948783e4bbc1132d3220d8517ef62607b558, reverted
in 39fec5457c0925bd39f67f63fe17391584e08258 because the test
llvm/test/Support/interrupts.test started failing on Windows. The test was
changed in 18f036d0105589c3175bb51a518c5d272dae61e2 and is also updated in
this commit.

Differential Revision: https://reviews.llvm.org/D149759
2023-11-01 14:41:39 +07:00
Serge Pavlov
39fec5457c
Revert "[symbolizer] Support symbol lookup"
This reverts commit 2b27948783e4bbc1132d3220d8517ef62607b558.
On some buildbots the test LLVM::interrupts.test started failing.
2023-10-02 22:20:35 +07:00