llvm-project

Author	SHA1	Message	Date
Teresa Johnson	e4c308424f	[MemProf] Fix the propagation of context/size info after inlining (#164872 ) In certain cases the context/size info we use for reporting of hinted bytes in the LTO link was being dropped when we re-constructed context tries and memprof metadata after inlining. This only affected cases where we were using the -memprof-min-percent-max-cold-size option to only keep that information for the largest cold contexts, and where the pre-LTO compile did not specify -memprof-report-hinted-sizes. The issue is that we don't have a MaxSize, which is only available during the profile matching step. Use an existing bool indicating that we are redoing this from existing metadata to always propagate any context size metadata in that case.	2025-10-23 21:28:49 -07:00
Teresa Johnson	c40ee0f322	Reapply "[MemProf] Add ambigous memprof attribute" (#161717 ) (#161918 ) Reapply llvm/llvm-project#157204 with fix and a new test for the issue it caused (the test change provoked the assert that was converted to an if condition). Also, make the application of this new attribute under an (on by default) flag, so that it can be more easily disabled if needed. Add test for the new flag.	2025-10-05 15:57:08 -07:00
Teresa Johnson	49603bd9b8	Revert "[MemProf] Add ambigous memprof attribute" (#161717 ) Reverts llvm/llvm-project#157204 This caused issues in ThinLTO binaries because of the checking here, that didn't expect allocations needing cloning to have memprof metadata: `9133fc8cb0/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp (L5572-L5582)` I need to move the assert into the if check and guard by that condition. And add a more thorough test.	2025-10-02 12:17:38 -07:00
Nicolai Hähnle	11a4b2d950	Cleanup the LLVM exported symbols namespace (#161240 ) There's a pattern throughout LLVM of cl::opts being exported. That in itself is probably a bit unfortunate, but what's especially bad about it is that a lot of those symbols are in the global namespace. Move them into the llvm namespace. While doing this, I noticed some other variables in the global namespace and moved them as well.	2025-10-01 15:32:07 -07:00
Teresa Johnson	cf44f19e57	[MemProf] Add ambigous memprof attribute (#157204 ) To help track allocations that we matched with memprof profiles but for which we weren't able to disambiguate the different hotness contexts, apply an "ambiguous" memprof attribute to all allocations with matched profiles. These will be replaced if we can identify a single hotness type, possibly after cloning. Eventually we plan to translate this to a special hotness hint on the allocation call.	2025-09-05 18:32:10 -07:00
Teresa Johnson	e57315e6ca	[MemProf] Fix discarding of noncold contexts after inlining (#149599 ) When we rebuild the call site tries after inlining of an allocation with MD_memprof metadata, we don't want to reapply the discarding of small non-cold contexts (under -memprof-callsite-cold-threshold=) because we have either no context size info (without -memprof-report-hinted-sizes or another option that causes us to keep that as metadata), and even with that information in the metadata, we have imperfect information at that point as we have already discarded some contexts during matching. The first case was even worse because we didn't guard our check by whether the number of cold bytes was 0, leading to very aggressive pruning during post-inline metadata rebuilding without the context size information.	2025-07-18 21:11:37 -07:00
Teresa Johnson	3ec2de2753	[MemProf] Optionally save context size info on largest cold allocations (#142837 ) Reapply PR142507 with fix for test: add in the same x86_64-linux requirement as other tests as the stack ids are currently computed differently on big endian systems. This will be investigated separately. In order to allow selective reporting of context hinting during the LTO link, and in the future to allow selective more aggressive cloning, add an option to specify a minimum percent of the max cold size in the profile summary. Contexts that meet that threshold will get context size info metadata (and ThinLTO summary information) on the associated allocations. Specifying -memprof-report-hinted-sizes during the pre-LTO compile step will continue to cause all contexts to receive this metadata. But specifying -memprof-report-hinted-sizes only during the LTO link will cause only those that meet the new threshold and have the metadata to get reported. To support this, because the alloc info summary and associated bitcode requires the context size information to be in the same order as the other context information, 0s are inserted for contexts without this metadata. The bitcode writer uses a more compact format for the context ids to allow better compression of the 0s. As part of this change several helper methods are added to query whether metadata contains context size info on any or all contexts.	2025-06-04 13:08:56 -07:00
Teresa Johnson	6c1091ea3f	Revert "[MemProf] Optionally save context size info on largest cold allocations" (#142688 ) Reverts llvm/llvm-project#142507 due to buildbot failures that I will look into tomorrow.	2025-06-03 16:05:16 -07:00
Teresa Johnson	f2adae5780	[MemProf] Optionally save context size info on largest cold allocations (#142507 ) In order to allow selective reporting of context hinting during the LTO link, and in the future to allow selective more aggressive cloning, add an option to specify a minimum percent of the max cold size in the profile summary. Contexts that meet that threshold will get context size info metadata (and ThinLTO summary information) on the associated allocations. Specifying -memprof-report-hinted-sizes during the pre-LTO compile step will continue to cause all contexts to receive this metadata. But specifying -memprof-report-hinted-sizes only during the LTO link will cause only those that meet the new threshold and have the metadata to get reported. To support this, because the alloc info summary and associated bitcode requires the context size information to be in the same order as the other context information, 0s are inserted for contexts without this metadata. The bitcode writer uses a more compact format for the context ids to allow better compression of the 0s. As part of this change several helper methods are added to query whether metadata contains context size info on any or all contexts.	2025-06-03 14:20:38 -07:00
Teresa Johnson	49d48c32e0	[MemProf] Emit remarks when hinting allocations not needing cloning (#141859 ) The context disambiguation code already emits remarks when hinting allocations (by adding hotness attributes) during cloning. However, we did not yet emit hints when applying the hotness attributes during building of the metadata (during matching and again after inlining). Add remarks when we apply the hint attributes for these non-context-sensitive allocations.	2025-05-28 16:44:44 -07:00
Teresa Johnson	cc6f446d38	[MemProf] Add basic summary section support (#141805) This patch adds support for a basic MemProf summary section, which is built along with the indexed MemProf profile (e.g. when reading the raw or YAML profiles), and serialized through the indexed profile just after the header. Currently only 6 fields are written, specifically the number of contexts (total, cold, hot), and the max context size (cold, warm, hot). To support forwards and backwards compatibility for added fields in the indexed profile, the number of fields is serialized first. The code is written to support forwards compatibility (reading newer profiles with additional summary fields), and comments indicate how to implement backwards compatibility (reading older profiles with fewer summary fields) as needed. Support is added to print the summary as YAML comments when displaying both the raw and indexed profiles via `llvm-profdata show`. Because they are YAML comments, the YAML reader ignores these (the summary is always recomputed when building the indexed profile as described above). This necessitated moving some options and a couple of interfaces out of Analysis/MemoryProfileInfo.cpp and into the new ProfileData/MemProfSummary.cpp file, as we need to classify context hotness earlier and also compute context ids to build the summary from older indexed profiles.	2025-05-28 13:12:41 -07:00
Andrew Rogers	d4d4a04771	[llvm] annotate interfaces in llvm/Analysis for DLL export (#136623) ## Purpose This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates the `llvm/Analysis` library. These annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build. ## Background This effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst). The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`. The following manual adjustments were also applied after running IDS on Linux: - Add `#include "llvm/Support/Compiler.h"` to files where it was not auto-added by IDS due to no pre-existing block of include statements. - Add `LLVM_TEMPLATE_ABI` and `LLVM_EXPORT_TEMPLATE` to exported instantiated templates - Add `LLVM_ABI` to a subset of private class methods and fields that require export - Add `LLVM_ABI` to a small number of symbols that require export but are not declared in headers ## Validation Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations: - Windows with MSVC - Windows with Clang - Linux with GCC - Linux with Clang - Darwin with Clang	2025-05-27 15:14:20 -07:00
Teresa Johnson	8836d68a0d	[MemProf] Optionally discard small non-cold contexts (#139113 ) Adds a new option -memprof-callsite-cold-threshold that allows specifying a percent that will cause non-cold contexts to be discarded if the percent cold bytes at a callsite including that context exceeds the given threshold. Default is 100% (no discarding). This reduces the amount of cloning needed to expose cold allocation contexts when parts of the context are dominantly cold. This motivated the change in PR138792, since discarding a context might require a different decision about which not-cold contexts must be kept to expose cloning requirements, so we need to determine that on the fly. Additionally, this required a change to include the context size information in the alloc trie in more cases, so we now guard the inclusion of this information in the generated metadata on the option values.	2025-05-09 15:56:54 -07:00
Teresa Johnson	9c4c2426d5	[MemProf] Fix bug introduced by restructuring in optional handling (#139092 ) The restructuring of the context pruning patch in PR138792 (764614e6355e214c6b64c715d105007b1a4b97fd) introduced a bug under the non-default -memprof-keep-all-not-cold-contexts handling. Added more testing of this mode which would have caught the issue. While here, fix the newly added function name to match code style.	2025-05-08 08:37:45 -07:00
Teresa Johnson	764614e635	[MemProf] Restructure the pruning of unneeded NotCold contexts (#138792 ) This change is mostly NFC, other than the addition of a new message printed when contexts are pruned when -memprof-report-hinted-sizes is enabled. To prepare for a follow on change, adjust the way we determine which NotCold contexts can be pruned (because they overlap with longer NotCold contexts), and change the way we perform this pruning. Instead of determining the points at which we need to keep NotCold contexts during the building of the trie, we now determine this on the fly as the MIB metadata nodes are recursively built. This simplifies a follow on change that performs additional pruning of some NotCold contexts, and which can affect which others need to be kept as the longest overlapping NotCold contexts.	2025-05-07 17:34:44 -07:00
Kazu Hirata	c4e9901b5b	[llvm] Use llvm::append_range (NFC) (#135931 )	2025-04-16 12:28:47 -07:00
chrisPyr	71f4c7dabe	[NFC]Make file-local cl::opt global variables static (#126486 ) #125983	2025-03-03 13:46:33 +07:00
Kazu Hirata	9a59145d8e	[memprof] Avoid repeated map lookups (NFC) (#127027 )	2025-02-13 09:12:04 -08:00
Teresa Johnson	ae6d5dd58b	[MemProf] Prune unneeded non-cold contexts (#124823 ) We can take advantage of the fact that we subsequently only clone cold allocation contexts, since not cold behavior is the default, and significantly reduce the amount of metadata (and later ThinLTO summary and MemProfContextDisambiguation graph nodes) by pruning unnecessary not cold contexts when building metadata from the trie. Specifically, we only need to keep notcold contexts that overlap the longest with cold allocations, to know how deeply to clone those contexts to expose the cold allocation behavior. For a large target this reduced ThinLTO bitcode object sizes by about 35%. It reduced the ThinLTO indexing time by about half and the peak ThinLTO indexing memory by about 20%.	2025-01-29 10:38:31 -08:00
Teresa Johnson	c725a95e08	[MemProf] Convert Hot contexts to NotCold early (#124219 ) While we convert hot contexts to notcold contexts during the cloning step, their existence was greatly limiting the context trimming performed when we add the MemProf profile to the IR. To address this, any hot contexts are converted to notcold contexts immediately after first checking for unambiguous allocation types, and before checking it again and before adding metadata while performing context trimming. Note that hot hints are now disabled by default, however, this avoids adding unnecessary overhead if they are re-enabled.	2025-01-24 15:58:13 -08:00
Teresa Johnson	ae8b560899	[MemProf] Disable hot hints by default (#124338) By default we were marking some contexts as hot, and adding hot hints to unambiguously hot allocations. However, there is not yet support for cloning to expose hot allocation contexts, and none is planned for the foreseeable future. While we convert hot contexts to notcold contexts during the cloning step, their existence was greatly limiting the context trimming performed when we add the MemProf profile to the IR. This change simply disables the generation of hot contexts / hints by default, as few allocations were unambiguously hot. A subsequent change will address the issue when hot hints are optionally enabled. See PR124219 for details. This change resulted in significant overhead reductions for a large target: ~48% reduction in the per-module ThinLTO bitcode summary sizes; ~72% reduction in the distributed ThinLTO bitcode combined summary sizes; ~68% reduction in thin link time; ~34% reduction in thin link peak memory.	2025-01-24 13:06:11 -08:00
Teresa Johnson	3a423a10ff	[MemProf][PGO] Prevent dropping of profile metadata during optimization (#121359 ) This patch fixes a couple of places where memprof-related metadata (!memprof and !callsite) were being dropped, and one place where PGO metadata (!prof) was being dropped. All were due to instances of combineMetadata() being invoked. That function drops all metadata not in the list provided by the client, and also drops any not in its switch statement. Memprof metadata needed a case in the combineMetadata switch statement. For now we simply keep the metadata of the instruction being kept, which doesn't retain all the profile information when two calls with memprof metadata are being combined, but at least retains some. For the memprof metadata being dropped during call CSE, add memprof and callsite metadata to the list of known ids in combineMetadataForCSE. Neither memprof nor regular prof metadata were in the list of known ids for the callsite in MemCpyOptimizer, which was added to combine AA metadata after optimization of byval arguments fed by memcpy instructions, and similar types of optimizations of memcpy uses. There is one other callsite of combineMetadata, but it is only invoked on load instructions, which do not carry these types of metadata.	2025-01-02 12:11:59 -08:00
Teresa Johnson	d7d0e740cc	[MemProf] Refactor single alloc type handling and use in more cases (#120290 ) Emit message when we have aliased contexts that are conservatively hinted not cold. This is not a change in behavior, just in message when the -memprof-report-hinted-sizes flag is enabled.	2024-12-17 12:50:49 -08:00
Teresa Johnson	bf700c39d1	[MemProf] Remove dead code (NFC) (#120156 ) Remove unused collection of context size information that was likely leftover from debugging / testing.	2024-12-16 17:15:25 -08:00
Teresa Johnson	9513f2fdf2	[MemProf] Print full context hash when reporting hinted bytes (#114465 ) Improve the information printed when -memprof-report-hinted-sizes is enabled. Now print the full context hash computed from the original profile, similar to what we do when reporting matching statistics. This will make it easier to correlate with the profile. Note that the full context hash must be computed at profile match time and saved in the metadata and summary, because we may trim the context during matching when it isn't needed for distinguishing hotness. Similarly, due to the context trimming, we may have more than one full context id and total size pair per MIB in the metadata and summary, which now get a list of these pairs. Remove the old aggregate size from the metadata and summary support. One other change from the prior support is that we no longer write the size information into the combined index for the LTO backends, which don't use this information, which reduces unnecessary bloat in distributed index files.	2024-11-15 08:24:44 -08:00
Kazu Hirata	890c4bece2	[memprof] Use SmallVector for InlinedCallStack (NFC) (#114599 ) We can stay within 8 inlined elements more than 99% of the time while building a large application.	2024-11-01 19:52:11 -07:00
Kazu Hirata	b2f3ac836a	[memprof] Teach createMIBNode to take ArrayRef (NFC) (#111195 ) createMIBNode does not modify MIBCallStack, so we can take ArrayRef instead. While I am at it, this patch changes the type of MIBPayload to SmallVector. We put at most three elements, so we can avoid a heap allocation.	2024-10-04 13:56:42 -07:00
Daniil Fukalov	0da2ba811a	[NFC] Cleanup in ADT and Analysis headers. (#104484 ) Remove unused directly includes and forward declarations in ADT and Analysis headers.	2024-08-17 13:11:18 +02:00
Teresa Johnson	8c1bd67dee	[MemProf] Optionally print or record the profiled sizes of allocations (#98248 ) This is the first step in being able to track the total profiled sizes of allocations successfully marked as cold. Under a new option -memprof-report-hinted-sizes: - For unambiguous (non-context-sensitive) allocations, print the profiled size and the allocation coldness, along with a hash of the allocation's location (to allow for deduplication across modules or inline instances). - For context sensitive allocations, add the size as a 3rd operand on the MIB metadata. A follow on patch will propagate this through to the thin link where the sizes will be reported for each context after cloning.	2024-07-10 09:41:36 -07:00
Kazu Hirata	026a29e8b3	[Analysis, CodeGen, DebugInfo] Use StringRef::operator== instead of StringRef::equals (NFC) (#91304 ) I'm planning to remove StringRef::equals in favor of StringRef::operator==. - StringRef::operator==/!= outnumber StringRef::equals by a factor of 53 under llvm/ in terms of their usage. - The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals. - S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".	2024-05-07 10:20:10 -07:00
lifengxiang1025	1e7d5871ee	[MemProf] Fix when CallStackTrie has a single chain to leaf with multi alloc type (#79433) Fix a corner case where `CallStackTrie` has a single chain to the leaf with multiple alloc types. This caused the stackIds in the function summary to be empty.	2024-02-02 22:12:41 +08:00
Kan Wu	b8d2f7177c	[MemProf] Add hot allocation type Add "Hot" AllocationType (in addition to existing cold, notcold). Use lifetime access density as metric to identify hot allocations. Treat hot as notcold for MemProfContextDisambiguation for now before the disambiguation for "hot" is done. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D149932	2023-05-08 10:34:53 -07:00
Teresa Johnson	a28261c711	[MemProf] Create single version of helper function (NFC) Small clean up to keep a single version of getAllocTypeAttributeString which was duplicated locally.	2023-05-05 18:31:35 -07:00
Teresa Johnson	a4bdb27538	[MemProf] Use profiled lifetime access density directly Now that the runtime tracks the lifetime access density directly, we can use that directly in the threshold checks instead of less accurately computing from other statistics. Differential Revision: https://reviews.llvm.org/D149684	2023-05-02 15:19:34 -07:00
Teresa Johnson	5fd82ca05b	[MemProf] Make hasSingleAllocType helper non-static As suggested in D140908, make the hasSingleAllocType helper non-static so that it can be used in other files. Add unit testing. Differential Revision: https://reviews.llvm.org/D144318	2023-02-21 12:00:03 -08:00
Teresa Johnson	6827c4f0de	[MemProf] Add helper to access the back (last) call stack id This is split out of D140908 as suggested. Differential Revision: https://reviews.llvm.org/D143184	2023-02-03 07:51:32 -08:00
Kazu Hirata	caa99a01f5	Use llvm::popcount instead of llvm::countPopulation(NFC)	2023-01-22 12:48:51 -08:00
Teresa Johnson	9eacbba290	Restore "[MemProf] ThinLTO summary support" with more fixes This restores commit 98ed423361de2f9dc0113a31be2aa04524489ca9 and follow on fix 00c22351ba697dbddb4b5bf0ad94e4bcea4b316b, which were reverted in 5d938eb6f79b16f55266dd23d5df831f552ea082 due to an MSVC bot failure. I've included a fix for that failure. Differential Revision: https://reviews.llvm.org/D135714	2022-11-16 09:42:41 -08:00
Jeremy Morse	5d938eb6f7	Revert "Restore "[MemProf] ThinLTO summary support" with fixes" This reverts commit 00c22351ba697dbddb4b5bf0ad94e4bcea4b316b. This reverts commit 98ed423361de2f9dc0113a31be2aa04524489ca9. Seemingly MSVC has some kind of issue with this patch, in terms of linking: https://lab.llvm.org/buildbot/#/builders/123/builds/14137 I'll post more detail on D135714 momentarily.	2022-11-16 11:21:02 +00:00
Teresa Johnson	98ed423361	Restore "[MemProf] ThinLTO summary support" with fixes This restores 47459455009db4790ffc3765a2ec0f8b4934c2a4, which was reverted in commit 452a14efc84edf808d1e2953dad2c694972b312f, along with fixes for a couple of bot failures.	2022-11-15 08:55:17 -08:00
Teresa Johnson	452a14efc8	Revert "[MemProf] ThinLTO summary support" This reverts commit 47459455009db4790ffc3765a2ec0f8b4934c2a4. Revert while I try to fix a couple of non-Linux build failures.	2022-11-15 07:39:40 -08:00
Teresa Johnson	4745945500	[MemProf] ThinLTO summary support Implements the ThinLTO summary support for memprof related metadata. This includes support for the assembly format, and for building the summary from IR during ModuleSummaryAnalysis. To reduce space in both the bitcode format and the in memory index, we do 2 things: 1. We keep a single vector of all unique stack id hashes, and record the index into this vector in the callsite and allocation memprof summaries. 2. When building the combined index during the LTO link, the callsite and allocation memprof summaries are only kept on the FunctionSummary of the prevailing copy. Differential Revision: https://reviews.llvm.org/D135714	2022-11-15 06:45:12 -08:00
Kazu Hirata	e20d210eef	[llvm] Qualify auto (NFC) Identified with readability-qualified-auto.	2022-08-07 23:55:27 -07:00
Teresa Johnson	1dad6247d2	[MemProf] Add memprof metadata related analysis utilities Adds a number of utilities that are used to help create and update memprof related metadata. These will be used during profile matching and annotation, as well as by the inliner when updating the metadata. Also adds unit tests for the utilities. See also related RFCs: RFC: Sanitizer-based Heap Profiler [1] RFC: A binary serialization format for MemProf [2] RFC: IR metadata format for MemProf [3] (Note that the IR metadata format has changed from the RFC during implementation, as described in the preceding patch adding the basic metadata and verification support.) Depends on D128141. Differential Revision: https://reviews.llvm.org/D128854	2022-07-21 13:46:01 -07:00

44 Commits