llvm-project

Author	SHA1	Message	Date
JOE1994	459a82e689	[llvm][unittests] Don't call raw_string_ostream::flush() (NFC) raw_string_ostream::flush() is essentially a no-op (also specified in docs). Don't call it in tests that aren't meant to test 'raw_string_ostream' itself. p.s. remove a few redundant calls to raw_string_ostream::str()	2024-09-13 19:55:44 -04:00
Matthew Weingarten	30b93db547	[Memprof] Adds the option to collect AccessCountHistograms for memprof. (#94264) Adds compile time flag -mllvm -memprof-histogram and runtime flag histogram=true|false to turn Histogram collection on and off. The -memprof-histogram flag relies on -memprof-use-callbacks=true to work. Updates shadow mapping logic in histogram mode from having one 8 byte counter for 64 bytes, to 1 byte for 8 bytes, capped at 255. Only supports this granularity as of now. Updates the RawMemprofReader and serializing MemoryInfoBlocks to binary format, including changing to a new version of the raw binary format from version 3 to version 4. Updates creating MemoryInfoBlocks with and without Histograms. When two MemoryInfoBlocks are merged, AccessCounts are summed up and the shorter Histogram is removed. Adds a memprof_histogram test case. Initial commit for adding AccessCountHistograms up until RawProfile for memprof	2024-06-26 08:37:22 -07:00
Kazu Hirata	dc3f8c2f58	[memprof] Improve deserialization performance in V3 (#94787 ) We call llvm::sort in a couple of places in the V3 encoding: - We sort Frames by FrameIds for stability of the output. - We sort call stacks in the dictionary order to maximize the length of the common prefix between adjacent call stacks. It turns out that we can improve the deserialization performance by modifying the comparison functions -- without changing the format at all. Both places take advantage of the histogram of Frames -- how many times each Frame occurs in the call stacks. - Frames: We serialize popular Frames in the descending order of popularity for improved cache locality. For two equally popular Frames, we break a tie by serializing one that tends to appear earlier in call stacks. Here, "earlier" means a smaller index within llvm::SmallVector<FrameId>. - Call Stacks: We sort the call stacks to reduce the number of times we follow pointers to parents during deserialization. Specifically, instead of comparing two call stacks in the strcmp style -- integer comparisons of FrameIds, we compare two FrameIds F1 and F2 with Histogram[F1] < Histogram[F2] at respective indexes. Since we encode from the end of the sorted list of call stacks, we tend to encode popular call stacks first. Since the two places use the same histogram, we compute it once and share it in the two places. Sorting the call stacks reduces the number of "jumps" by 74% when we deserialize all MemProfRecords. The cycle and instruction counts go down by 10% and 1.5%, respectively. If we sort the Frames in addition to the call stacks, then the cycle and instruction counts go down by 14% and 1.6%, respectively, relative to the same baseline (that is, without this patch).	2024-06-07 17:25:57 -07:00
Kazu Hirata	c348e265bd	[memprof] Use CallStackRadixTreeBuilder in the V3 format (#94708) This patch integrates CallStackRadixTreeBuilder into the V3 format, reducing the profile size to about 27% of the V2 profile size. - Serialization: writeMemProfCallStackArray just needs to write out the radix tree array prepared by CallStackRadixTreeBuilder. Mappings from CallStackIds to LinearCallStackIds are moved by new function CallStackRadixTreeBuilder::takeCallStackPos. - Deserialization: Deserializing a call stack is the same as deserializing an array encoded in the obvious manner -- the length followed by the payload, except that we need to follow a pointer to the parent to take advantage of common prefixes once in a while. This patch teaches LinearCallStackIdConverter how to handle those pointers.	2024-06-07 07:19:36 -07:00
Kazu Hirata	5c0df5fe22	[memprof] Add CallStackRadixTreeBuilder (#93784 ) Call stacks are a huge portion of the MemProf profile, taking up 70+% of the profile file size. This patch implements a radix tree to compress call stacks, which are known to have long common prefixes. Specifically, CallStackRadixTreeBuilder, introduced in this patch, takes call stacks in the MemProf profile, sorts them in the dictionary order to maximize the common prefix between adjacent call stacks, and then encodes a radix tree into a single array that is ready for serialization. The resulting radix array is essentially a concatenation of call stack arrays, each encoded with its length followed by the payload, except that these arrays contain "instructions" like "skip 7 elements forward" to borrow common prefixes from other call stacks. This patch does not integrate with the MemProf serialization/deserialization infrastructure yet. Once integrated, the radix tree is expected to roughly halve the file size of the MemProf profile.	2024-06-06 15:52:45 -07:00
Kazu Hirata	26fabdded3	[memprof] Pass FrameIdConverter and CallStackIdConverter by reference (#92327) CallStackIdConverter sets LastUnmappedId when a mapping failure occurs. Now, since toMemProfRecord takes an instance of CallStackIdConverter by value, namely std::function, the caller of toMemProfRecord never receives the mapping failure that occurs inside toMemProfRecord. The same problem applies to FrameIdConverter. The patch fixes the problem by passing FrameIdConverter and CallStackIdConverter by reference, namely llvm::function_ref. While I am at it, this patch deletes the copy constructor and copy assignment operator to avoid accidental copies.	2024-05-15 17:53:28 -07:00
Mircea Trofin	181e2e8fb9	[nfc][memprof] Add missing license to `MemProfTest` (#91695 )	2024-05-09 20:47:10 -07:00
Kazu Hirata	c9dae43438	[memprof] Add access checks to PortableMemInfoBlock::get* (#90121 ) commit 4c8ec8f8bc3fb4dda4fd36c3b2ad745bd3451970 Author: Kazu Hirata <kazu@google.com> Date: Wed Apr 24 16:25:35 2024 -0700 introduced the idea of serializing/deserializing a subset of the fields in PortableMemInfoBlock. While it reduces the size of the indexed MemProf profile file, we now could inadvertently access unavailable fields and go without noticing. To protect ourselves from the risk, this patch adds access checks to PortableMemInfoBlock::get* methods by embedding a bit set representing available fields into PortableMemInfoBlock.	2024-04-28 12:49:08 -07:00
Kazu Hirata	352602010f	Reapply [memprof] Introduce FrameIdConverter and CallStackIdConverter (#90307) Currently, we convert FrameId to Frame and CallStackId to a call stack at several places. This patch unifies those into function objects -- FrameIdConverter and CallStackIdConverter. The existing implementation of CallStackIdConverter, being removed in this patch, handles both FrameId and CallStackId conversions. This patch splits it into two phases for flexibility (but makes them composable) because some places only require the FrameId conversion. This iteration fixes a problem uncovered with ubsan, where we were dereferencing an uninitialized std::unique_ptr.	2024-04-28 11:44:45 -07:00
Vitaly Buka	7aa6896dd7	Revert "[memprof] Introduce FrameIdConverter and CallStackIdConverter" (#90318 ) Reverts llvm/llvm-project#90307 Breaks bots https://lab.llvm.org/buildbot/#/builders/5/builds/42943	2024-04-27 00:15:08 -07:00
Kazu Hirata	e04df693bf	[memprof] Introduce FrameIdConverter and CallStackIdConverter (#90307) Currently, we convert FrameId to Frame and CallStackId to a call stack at several places. This patch unifies those into function objects -- FrameIdConverter and CallStackIdConverter. The existing implementation of CallStackIdConverter, being removed in this patch, handles both FrameId and CallStackId conversions. This patch splits it into two phases for flexibility (but makes them composable) because some places only require the FrameId conversion.	2024-04-26 19:22:17 -07:00
Kazu Hirata	1f38b8a281	[memprof] Use DenseMap::contains (NFC) (#90124 ) This patch replaces count with contains, following the spirit of clang-tidy's readability-container-contains.	2024-04-25 15:39:42 -07:00
Kazu Hirata	cb9589b227	[memprof] Move getFullSchema and getHotColdSchema outside PortableMemInfoBlock (#90103 ) These functions do not operate on PortableMemInfoBlock. This patch moves them outside the class.	2024-04-25 12:12:28 -07:00
Kazu Hirata	f9a0b467dd	[memprof] Remove getFullSchema in MemProfTest.cpp (#90072 ) This patch removes getFullSchema in MemProfTest.cpp in favor of llvm::memprof::PortableMemInfoBlock::getFullSchema as they do exactly the same thing.	2024-04-25 10:24:19 -07:00
Kazu Hirata	3074060d6a	[memprof] Use SizeIs (NFC) (#88984 )	2024-04-16 14:28:45 -07:00
Kazu Hirata	5422eb0b84	[memprof] Add another constructor to MemProfReader (#88952 ) This patch enables users of MemProfReader to directly supply mappings from CallStackId to actual call stacks. Once the users of the current constructor without CSIdMap switch to the new constructor, we'll have fewer users of: - IndexedAllocationInfo::CallStack - IndexedMemProfRecord::CallSites bringing us one step closer to the removal of these fields in favor of: - IndexedAllocationInfo::CSId - IndexedMemProfRecord::CallSiteIds	2024-04-16 11:50:49 -07:00
Kazu Hirata	8137bd9e03	[memprof] Use CSId to construct MemProfRecord (#88362 ) We are in the process of referring to call stacks with CallStackId in IndexedMemProfRecord and IndexedAllocationInfo instead of holding call stacks inline (both in memory and the serialized format). Doing so deduplicates call stacks and reduces the MemProf profile file size. Before we can eliminate the two fields holding call stacks inline: - IndexedAllocationInfo::CallStack - IndexedMemProfRecord::CallSites we need to eliminate all the read operations on them. This patch is a step toward that direction. Specifically, we eliminate the read operations in the context of MemProfReader and RawMemProfReader. A subsequent patch will eliminate the read operations during the serialization.	2024-04-16 10:16:48 -07:00
Kazu Hirata	2bede6873d	[memprof] Rename RawMemProfReader.{cpp,h} to MemProfReader.{cpp,h} (NFC) (#88200 ) This patch renames RawMemProfReader.{cpp,h} to MemProfReader.{cpp,h}, respectively. Also, it re-creates RawMemProfReader.h just to include MemProfReader.h for compatibility with out-of-tree users.	2024-04-10 22:03:20 -07:00
Kazu Hirata	d89914f30b	[memprof] Add Version2 of IndexedMemProfRecord serialization (#87455) I'm currently developing a new version of the indexed memprof format where we deduplicate call stacks in IndexedAllocationInfo::CallStack and IndexedMemProfRecord::CallSites. We refer to call stacks with integer IDs, namely CallStackId, just as we refer to Frame with FrameId. The deduplication will cut down the profile file size by 80% in a large memprof file of mine. As a step toward the goal, this patch teaches IndexedMemProfRecord::{serialize,deserialize} to speak Version2. A subsequent patch will add Version2 support to llvm-profdata. The essence of the patch is to replace the serialization of a call stack, a vector of FrameIDs, with that of a CallStackId. That is: const IndexedAllocationInfo &N = ...; ... LE.write<uint64_t>(N.CallStack.size()); for (const FrameId &Id : N.CallStack) LE.write<FrameId>(Id); becomes: LE.write<CallStackId>(N.CSId);	2024-04-03 21:48:38 -07:00
Kazu Hirata	74799f4240	[memprof] Add call stack IDs to IndexedAllocationInfo (#85888 ) The indexed MemProf file has a huge amount of redundancy. In a large internal application, 82% of call stacks, stored in IndexedAllocationInfo::CallStack, are duplicates. We should work toward deduplicating call stacks by referring to them with unique IDs with actual call stacks stored in a separate data structure, much like we refer to memprof::Frame with memprof::FrameId. At the same time, we need to facilitate a graceful transition from the current version of the MemProf format to the next. We should be able to read (but not write) the current version of the MemProf file even after we move onto the next one. With those goals in mind, I propose to have an integer ID next to CallStack in IndexedAllocationInfo to refer to a call stack in a succinct manner. We'll gradually increase the areas of the compiler where IDs and call stacks have one-to-one correspondence and eventually remove the existing CallStack field. This patch adds call stack ID, named CSId, to IndexedAllocationInfo and teaches the raw profile reader to compute unique call stack IDs and store them in the new field. It does not introduce any user of the call stack IDs yet, except in verifyFunctionProfileData.	2024-03-23 19:50:15 -07:00
Serge Pavlov	cb1a7d28e6	[symbolizer] Support symbol+offset lookup (#75067 ) GNU addr2line supports lookup by symbol name in addition to the existing address lookup. llvm-symbolizer starting from e144ae54dcb96838a6176fd9eef21028935ccd4f supports lookup by symbol name. This change extends this lookup with possibility to specify optional offset. Now the address for which source information is searched for can be specified with offset: llvm-symbolize --obj=abc.so "SYMBOL func_22+0x12" It decreases the gap in features of llvm-symbolizer and GNU addr2line. This lookup now is supported for code only. Migrated from: https://reviews.llvm.org/D139859 Pull request: https://github.com/llvm/llvm-project/pull/75067	2023-12-15 17:35:33 +07:00
Serge Pavlov	e144ae54dc	[symbolizer] Support symbol lookup Recent versions of GNU binutils starting from 2.39 support symbol+offset lookup in addition to the usual numeric address lookup. This change adds symbol lookup to llvm-symbolize and llvm-addr2line. Now llvm-symbolize behaves closer to GNU addr2line, - if the value specified as address in command line or input stream is not a number, it is treated as a symbol name. For example: llvm-symbolize --obj=abc.so func_22 llvm-symbolize --obj=abc.so "CODE func_22" This lookup is now supported only for functions. Specification with offset is not supported yet. This is a recommit of 2b27948783e4bbc1132d3220d8517ef62607b558, reverted in 39fec5457c0925bd39f67f63fe17391584e08258 because the test llvm/test/Support/interrupts.test started failing on Windows. The test was changed in 18f036d0105589c3175bb51a518c5d272dae61e2 and is also updated in this commit. Differential Revision: https://reviews.llvm.org/D149759	2023-11-01 14:41:39 +07:00
Serge Pavlov	39fec5457c	Revert "[symbolizer] Support symbol lookup" This reverts commit 2b27948783e4bbc1132d3220d8517ef62607b558. On some buildbots the test LLVM::interrupts.test started failing.	2023-10-02 22:20:35 +07:00
Serge Pavlov	2b27948783	[symbolizer] Support symbol lookup Recent versions of GNU binutils starting from 2.39 support symbol+offset lookup in addition to the usual numeric address lookup. This change adds symbol lookup to llvm-symbolize and llvm-addr2line. Now llvm-symbolize behaves closer to GNU addr2line, - if the value specified as address in command line or input stream is not a number, it is treated as a symbol name. For example: llvm-symbolize --obj=abc.so func_22 llvm-symbolize --obj=abc.so "CODE func_22" This lookup is now supported only for functions. Specification with offset is not supported yet. Differential Revision: https://reviews.llvm.org/D149759	2023-10-02 21:38:15 +07:00
Snehasish Kumar	37fd3c96b9	[memprof] Add a MemProfReader base class. Add a MemProfReader base class which can be used directly where symbolization and processing a raw profile is unnecessary. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D159141	2023-08-30 20:20:55 +00:00
Snehasish Kumar	0edc32fda5	[memprof] Canonicalize the function name prior to hashing. Canonicalize the function name (strip suffixes etc) to ensure that function name suffixes added by late stage passes do not cause mismatches when memprof profile data is consumed. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D159132	2023-08-29 20:45:39 +00:00
Jay Foad	fdbc944385	Fix typos in comments	2023-08-15 13:57:21 +01:00
Snehasish Kumar	bcebadeba7	[memprof] Update the isRuntime symbolization check. Update the isRuntime check to only match against known memprof filenames where interceptors are defined. This avoids issues where the path does not include the directory based on how the runtime was compiled. Also update the unittest. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D145521	2023-03-07 20:16:15 +00:00
Fangrui Song	67ba5c507a	std::optional::value => operator*/operator-> value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). This fixes check-llvm.	2022-12-17 01:42:39 +00:00
Kazu Hirata	611ffcf4e4	[llvm] Use value instead of getValue (NFC)	2022-07-13 23:11:56 -07:00
Kazu Hirata	a7938c74f1	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-25 21:42:52 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Snehasish Kumar	ec51971eae	[memprof] Keep and display symbol names in the RawMemProfReader. Extend the Frame struct to hold the symbol name if requested when a RawMemProfReader object is constructed. This change updates the tests and removes the need to pass --debug to obtain the mapping from GUID to symbol names. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D126344	2022-05-25 21:17:44 +00:00
Snehasish Kumar	6dd6a6161f	[memprof] Deduplicate and outline frame storage in the memprof profile. The current implementation of memprof information in the indexed profile format stores the representation of each calling context frame inline. This patch uses an interned representation where the frame contents are stored in a separate on-disk hash table. The table is indexed via a hash of the contents of the frame. With this patch, the compressed size of a large memprof profile reduces by ~22%. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D123094	2022-04-08 09:15:20 -07:00
Snehasish Kumar	27a4f2545f	Reland "[memprof] Store callsite metadata with memprof records." This reverts commit f4b794427e8037a4e952cacdfe7201e961f31a6f. Reland with underlying msan issue fixed in D122260.	2022-03-22 14:40:02 -07:00
Mitch Phillips	f4b794427e	Revert "[memprof] Store callsite metadata with memprof records." This reverts commit 0d362c90d335509c57c0fbd01ae1829e2b9c3765. Reason: Causes the MSan buildbot to fail (see comments on https://reviews.llvm.org/D121179 for more information).	2022-03-21 15:59:13 -07:00
Snehasish Kumar	0d362c90d3	[memprof] Store callsite metadata with memprof records. To ease profile annotation, each of the callsites in a function can be annotated with profile data - "IR metadata format for MemProf" [1]. This patch extends the on-disk serialized record format to store the debug information for allocation callsites incl inline frames. This change is incompatible with the existing format i.e. indexed profiles must be regenerated, raw profiles are unaffected. [1] https://groups.google.com/g/llvm-dev/c/aWHsdMxKAfE/m/WtEmRqyhAgAJ Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D121179	2022-03-21 13:58:29 -07:00
Snehasish Kumar	c9a3d29613	[memprof] Update the frame is inline logic and unittests. Since DI frames are enumerated with the leaf function at index 0, this patch fixes the logic when IsInlineFrame is set. Also update the unittests to check that only the last frame is marked as non-inline from a set of DI Frames for a PC address. Differential Revision: https://reviews.llvm.org/D121830	2022-03-21 10:41:05 -07:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Snehasish Kumar	11314f4059	[memprof] Filter out callstack frames which cannot be symbolized. This patch filters out callstack frames which can't be symbolized or if the frames belong to the runtime. Symbolization may not be possible if debug information is unavailable or if the addresses are from a shared library. For now we only support optimization of the main binary which is statically linked to the compiler runtime. Differential Revision: https://reviews.llvm.org/D120860	2022-03-04 11:10:08 -08:00
Snehasish Kumar	dda7b74967	[memprof] Symbolize and cache stack frames. Currently, symbolization of stack frames occurs on demand when the instrprof writer iterates over all the records in the raw memprof reader. With this change we symbolize and cache the frames immediately after reading the raw profiles. For a large internal binary this results in a runtime reduction of ~50% (2m -> 48s) when merging a memprof raw profile with a raw instr profile to generate an indexed profile. This change also makes it simpler in the future to generate additional calling context metadata to attach to each memprof record. Differential Revision: https://reviews.llvm.org/D120430	2022-03-03 11:00:37 -08:00
Snehasish Kumar	0a4184909a	Reland "[memprof] Extend the index prof format to include memory profiles." This patch adds support for optional memory profile information to be included with an indexed profile. The indexed profile header adds a new field which points to the offset of the memory profile section (if present) in the indexed profile. For users who do not utilize this feature the only overhead is a 64-bit offset in the header. The memory profile section contains (1) profile metadata describing the information recorded for each entry (2) an on-disk hashtable containing the profile records indexed via llvm::md5(function_name). We chose to introduce a separate hash table instead of the existing one since the indexing for the instrumented fdo hash table is based on a CFG hash which itself is perturbed by memprof instrumentation. This commit also includes the changes reviewed separately in D120093. Differential Revision: https://reviews.llvm.org/D120103	2022-02-17 22:09:52 -08:00
Snehasish Kumar	19bdf44d85	Revert "Reland "[memprof] Extend the index prof format to include memory profiles."" This reverts commit 807ba7aace188ada83ddb4477265728e97346af1.	2022-02-17 15:51:04 -08:00
Snehasish Kumar	807ba7aace	Reland "[memprof] Extend the index prof format to include memory profiles." This reverts commit 85355a560a33897453df2ef959e255ee725eebce. This patch adds support for optional memory profile information to be included with an indexed profile. The indexed profile header adds a new field which points to the offset of the memory profile section (if present) in the indexed profile. For users who do not utilize this feature the only overhead is a 64-bit offset in the header. The memory profile section contains (1) profile metadata describing the information recorded for each entry (2) an on-disk hashtable containing the profile records indexed via llvm::md5(function_name). We chose to introduce a separate hash table instead of the existing one since the indexing for the instrumented fdo hash table is based on a CFG hash which itself is perturbed by memprof instrumentation. Differential Revision: https://reviews.llvm.org/D118653	2022-02-17 13:14:17 -08:00
Snehasish Kumar	50713461d4	Reland "[memprof] Introduce a wrapper around MemInfoBlock." This reverts commit e6999040f5758f89a64b6232119b775b7bd1c85b. Update test to fix signed int comparison warning, fix whitespace in compiler-rt MIBEntryDef.inc file. Differential Revision: https://reviews.llvm.org/D117256	2022-02-14 19:04:36 -08:00
Snehasish Kumar	f89319b841	Reland "[memprof] Refactor out the MemInfoBlock into a macro based def." This reverts commit 857ec0d01f8021ff0d9540fcbf6ff24e29868ba4. Fixes -DLLVM_ENABLE_MODULES=On build by adding the new textual header to the modulemap file. Reviewed in https://reviews.llvm.org/D117722	2022-02-14 16:05:05 -08:00
Snehasish Kumar	857ec0d01f	Revert "[memprof] Refactor out the MemInfoBlock into a macro based def." This reverts commit 9def83c6d02944b2931efd50cd2491953a772aab. [4/4]	2022-02-14 11:42:58 -08:00
Snehasish Kumar	e6999040f5	Revert "[memprof] Introduce a wrapper around MemInfoBlock." This reverts commit 9b67165285c5e752fce3b554769f5a22e7b38da8. [3/4]	2022-02-14 11:42:58 -08:00
Snehasish Kumar	85355a560a	Revert "Reland "[memprof] Extend the index prof format to include memory profiles."" This reverts commit de54e4ab78ef09b60f870e8df6f8a87e56d6bd94 [1/4]	2022-02-14 11:42:58 -08:00
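
The CallStackRadixTreeBuilder entry (5c0df5fe22) says call stacks are sorted in dictionary order to maximize the common prefix between adjacent call stacks. The following is a minimal sketch of that one idea, not the LLVM implementation: FrameId and CallStack are plain stand-in aliases, commonPrefixLength and sharedFramesAfterSort are hypothetical helper names, and the radix array encoding itself is not reproduced.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in types (assumptions); the real ones live in llvm::memprof.
using FrameId = std::uint64_t;
using CallStack = std::vector<FrameId>;

// Number of leading frames that two call stacks have in common.
std::size_t commonPrefixLength(const CallStack &A, const CallStack &B) {
  auto Ends = std::mismatch(A.begin(), A.end(), B.begin(), B.end());
  return static_cast<std::size_t>(Ends.first - A.begin());
}

// Sorts the stacks in dictionary order, then sums the prefix lengths shared
// by each adjacent pair. A larger sum means more frames that a
// prefix-sharing encoding could avoid writing out again.
std::size_t sharedFramesAfterSort(std::vector<CallStack> Stacks) {
  std::sort(Stacks.begin(), Stacks.end()); // lexicographic on FrameId
  std::size_t Shared = 0;
  for (std::size_t I = 1; I < Stacks.size(); ++I) {
    std::size_t Len = commonPrefixLength(Stacks[I - 1], Stacks[I]);
    assert(Len <= Stacks[I].size() && "prefix cannot exceed the stack length");
    Shared += Len;
  }
  return Shared;
}
```

For the four stacks {1,2,3}, {9}, {1,2,4}, {1,5}, adjacent pairs in the given order share one frame in total, while the sorted order {1,2,3}, {1,2,4}, {1,5}, {9} shares three.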

58 Commits
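
The 26fabdded3 entry above describes a converter whose LastUnmappedId update was lost because the converter was handed over by value inside a std::function. The sketch below reproduces that effect with the standard library only. IdConverter, convertByValue, convertByRef, and demo are hypothetical names, the lookup rule is invented for illustration, and a template reference stands in for llvm::function_ref, which the actual patch uses.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical converter that records the last ID it failed to map.
struct IdConverter {
  std::optional<std::uint64_t> LastUnmappedId;

  std::uint64_t operator()(std::uint64_t Id) {
    if (Id >= 100) { // invented rule: IDs of 100 and above are "unmapped"
      LastUnmappedId = Id;
      return 0;
    }
    return Id * 2;
  }
};

// std::function stores its own copy of the callable, so the failure is
// recorded in that copy and never reaches the caller's object.
void convertByValue(std::function<std::uint64_t(std::uint64_t)> Callback) {
  Callback(500);
}

// Calls through a reference to the caller's object, so the recorded
// failure stays visible after the call returns.
template <typename Fn> void convertByRef(Fn &Callback) { Callback(500); }

// Usage sketch: only the by-reference call lets the caller see the failure.
inline void demo() {
  IdConverter A;
  convertByValue(A);
  assert(!A.LastUnmappedId.has_value());

  IdConverter B;
  convertByRef(B);
  assert(B.LastUnmappedId == 500u);
}
```

Deleting the copy constructor and copy assignment operator, as the commit message mentions, would turn the convertByValue(A) call into a compile error instead of a silent loss of state.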