12 Commits

Author SHA1 Message Date
Teresa Johnson
c8e2f43379
[MemProf] Select largest of matching contexts from profile (#165338)
We aren't currently deduplicating contexts that are identical or nearly
identical (differing inline frame information) when generating the
profile. When we have multiple identical contexts we end up
conservatively marking the allocation as non-cold, even if the non-cold
contexts are much smaller in terms of bytes allocated.

This was causing us to lose sight of a very large cold context, because
we had a small non-cold one that only differed in the inlining (which we
don't consider when matching as the inlining could change or be
incomplete at that point in compilation). Likely the smaller one was
from a binary with much smaller usage and therefore not yet detected as
cold.

Deduplicate the alloc contexts for a function before applying the
profile, selecting the largest one, or conservatively selecting the
non-cold one if they are the same size.
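
The selection rule above can be sketched roughly as follows. This is a
minimal illustration, not the code in the patch: `AllocContext`,
`AllocKind`, and `dedupContexts` are hypothetical stand-ins, and the
stack ids are assumed to already exclude inline frame information.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in types; these are not the LLVM data structures.
enum class AllocKind { NotCold, Cold };

struct AllocContext {
  std::vector<uint64_t> StackIds; // call stack, inline frames ignored
  uint64_t TotalSize;             // bytes allocated through this context
  AllocKind Kind;
};

// Keep one context per distinct call stack. The largest context wins;
// on a size tie the non-cold one wins, which is the conservative choice.
std::vector<AllocContext> dedupContexts(const std::vector<AllocContext> &In) {
  std::map<std::vector<uint64_t>, AllocContext> Best;
  for (const AllocContext &C : In) {
    assert(!C.StackIds.empty() && "a context needs at least one frame");
    auto [It, Inserted] = Best.try_emplace(C.StackIds, C);
    if (Inserted)
      continue;
    AllocContext &Cur = It->second;
    bool Larger = C.TotalSize > Cur.TotalSize;
    bool TieNonCold =
        C.TotalSize == Cur.TotalSize && C.Kind == AllocKind::NotCold;
    if (Larger || TieNonCold)
      Cur = C;
  }
  std::vector<AllocContext> Out;
  for (const auto &KV : Best)
    Out.push_back(KV.second);
  return Out;
}
```

With this rule, a 5000-byte cold context and a 100-byte non-cold context
that share a call stack collapse into the cold one, so the large cold
context is no longer hidden.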

This caused a minor difference to an existing test
(memprof_loop_unroll.ll), which now only gets one message for the
duplicate context instead of 2. While here, convert to the text version
of the profile.
2025-10-31 08:48:02 -05:00
Kazu Hirata
a4ffa1f3d5
[Instrumentation] Remove a redundant control flow statement (NFC) (#165510) 2025-10-29 07:36:05 -07:00
Mingming Liu
e313bc834e
[StaticDataLayout] Factor out a helper function for section prefix eligibility and use it in both optimizer and codegen (#162348)
This change introduces new helper functions to check if a global
variable is eligible for section prefix annotation.

This shared logic is used by both MemProfUse and StaticDataSplitter to
avoid annotating ineligible variables.

This is the 2nd patch as a split of
https://github.com/llvm/llvm-project/pull/155337
2025-10-13 18:26:16 +00:00
Mingming Liu
a6fdbcbb2c
[StaticDataLayout][MemProf] Record whether the IR is compiled with data access profiles by module flag. (#162333)
The codegen pass in the pipeline can read the module flag to tell
whether the IR is compiled with data access profile, to support two use
cases when `memprof-annotate-static-data-prefix=true` is enabled:

1. The binary is compiled with data access profiles.
- The module flag will have value 1, and the codegen pass should regard an
empty section prefix as 'unknown' and conservatively not place the
data into `.unlikely` data sections.

2. The binary is compiled without data access profiles (e.g., during
incremental rollout).
- The module flag will have value 0, and the codegen pass can override an
empty section prefix based on PGO counters.
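
As a rough sketch of the two cases, assuming a hypothetical helper
`resolveEmptyPrefix` (not a function in the patch) that is only consulted
when a variable's section prefix is empty, and a `PGOPrefix` argument
standing in for whatever the PGO counters would suggest:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: pick the effective prefix for a variable whose
// section prefix is empty. HasDataAccessProf models the module flag
// value (0 or 1); PGOPrefix models the prefix PGO counters would suggest.
std::string resolveEmptyPrefix(unsigned HasDataAccessProf,
                               const std::string &PGOPrefix) {
  assert(HasDataAccessProf <= 1 && "flag is expected to be 0 or 1");
  if (HasDataAccessProf == 1)
    return ""; // hotness is unknown: keep the data out of .unlikely
  return PGOPrefix; // no data access profile: PGO counters may decide
}
```

Here `resolveEmptyPrefix(1, "unlikely")` yields an empty prefix, while
`resolveEmptyPrefix(0, "unlikely")` lets the PGO-derived prefix through.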

https://github.com/llvm/llvm-project/pull/155337 shows the motivating
use case in function `StaticDataProfileInfo::getConstantSectionPrefix`
in `llvm/lib/Analysis/StaticDataProfileInfo.cpp`

This is the 1st patch as a split of
https://github.com/llvm/llvm-project/pull/155337
2025-10-13 10:29:23 -07:00
Mingming Liu
8b3c91c4fb
Re-apply "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159161)
This is a reland of https://github.com/llvm/llvm-project/pull/158460

Test failures are gone once I undo the changes in codegenprepare.
2025-09-16 20:33:29 +00:00
Mingming Liu
9277bcd1ab
Revert "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159159)
Reverts llvm/llvm-project#158460 due to buildbot failures
2025-09-16 12:51:54 -07:00
Mingming Liu
027bccc469
[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed. (#158460)
Before this change, `setSectionPrefix` overwrites existing section
prefix with new one unconditionally.

After this change, `setSectionPrefix` checks for equivalences, updates
conditionally and returns whether an update happens.
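
A minimal sketch of the new semantics, using a hypothetical `GlobalStub`
class rather than the real `GlobalObject`. Treating "no prefix" and the
empty string as equivalent is an assumption made for illustration:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a global object; this is not the LLVM class.
class GlobalStub {
  std::optional<std::string> Prefix;

public:
  std::string getSectionPrefix() const { return Prefix.value_or(""); }

  // Update only when the new prefix differs, and report whether it did.
  // Assumption: "no prefix" and the empty string count as equivalent.
  bool setSectionPrefix(const std::string &New) {
    if (getSectionPrefix() == New)
      return false;
    if (New.empty())
      Prefix.reset();
    else
      Prefix = New;
    return true;
  }
};

// A caller can fold the return values into a "changed" bit.
bool annotateAll(GlobalStub *Begin, GlobalStub *End, const std::string &P) {
  bool Changed = false;
  for (GlobalStub *G = Begin; G != End; ++G)
    Changed |= G->setSectionPrefix(P);
  return Changed;
}

// Small usage example: the second identical update reports no change.
inline void example() {
  GlobalStub G;
  assert(G.setSectionPrefix("hot"));
  assert(!G.setSectionPrefix("hot"));
}
```

Returning the "changed" bit lets a pass report accurately whether it
modified the module instead of assuming that every call did.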

Update the existing callers to make use of the return value. [PR
155337](https://github.com/llvm/llvm-project/pull/155337/files#diff-cc0c67ac89807f4453f0cfea9164944a4650cd6873a468a0f907e7158818eae9)
is a motivating use case where the 'update' semantics are needed.
2025-09-16 12:01:21 -07:00
Mingming Liu
c93b3a3454
[MemProf] Extend MemProfUse pass to make use of data access profiles to partition data (#151238)
f3f28323ad
introduces the data access profile format as a payload inside
[memprof](https://llvm.org/docs/InstrProfileFormat.html#memprof-profile-data),
and the MemProfUse pass reads the memprof payload.

This change extends the MemProfUse pass to read the data access profiles
to annotate global variables' section prefix.
1. If there are samples for a global variable, it's annotated as hot.
2. If a global variable is seen in the profiled binary file but doesn't
have access samples, it's annotated as unlikely.
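
The two rules can be sketched as below. `AccessProfile` and
`sectionPrefixFor` are hypothetical names, the map is a simplified view
of the profile, and leaving variables that were not seen in the profiled
binary unannotated is an assumption:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical profile view: symbol name -> number of access samples.
// A symbol seen in the profiled binary without samples is present with 0.
using AccessProfile = std::unordered_map<std::string, uint64_t>;

// Returns "hot" when the variable has samples, "unlikely" when it was
// seen without samples, and no value when the profile never saw it
// (assumption: such variables are left unannotated).
std::optional<std::string> sectionPrefixFor(const AccessProfile &Profile,
                                            const std::string &Name) {
  auto It = Profile.find(Name);
  if (It == Profile.end())
    return std::nullopt;
  return std::string(It->second > 0 ? "hot" : "unlikely");
}

// Small usage example.
inline void example() {
  AccessProfile P = {{"sampled_var", 12}, {"seen_var", 0}};
  assert(*sectionPrefixFor(P, "sampled_var") == "hot");
  assert(*sectionPrefixFor(P, "seen_var") == "unlikely");
}
```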

Introduce an option `annotate-static-data-prefix` to flag-gate the
global-variable annotation path, and make it false by default.
https://github.com/llvm/llvm-project/pull/155337 is the (WIP) draft
change to "reconcile" two sources of hotness.
2025-08-27 23:43:37 -04:00
Kazu Hirata
a270fdf3fe
[memprof] Simplify control flow in readMemProf (NFC) (#149764)
Now that readMemProf calls two helper functions handleAllocSite and
handleCallSite, we can simplify the control flow.  We don't need to
use "continue" anymore.
2025-07-21 09:11:08 -07:00
Kazu Hirata
507ff29c9b
[memprof] Introduce handleCallSite (NFC) (#149724)
Continuing the effort to refactor readMemProf, this patch introduces
handleCallSite to handle, well, call sites.

Moving the code requires taking CallSiteEntry and CallSiteEntryHash
out of readMemProf.

We could simplify some code, but I'm keeping this patch very simple to
facilitate the review process.  For example, we could simplify the
control flow near the end of readMemProf, but we can address that
later.
2025-07-20 20:42:17 -07:00
Kazu Hirata
04f2114ab2
[memprof] Refactor readMemProf (NFC) (#149663)
This patch creates a helper function named handleAllocSite to handle
the allocation site.  It makes readMemProf a little bit shorter.

I'm planning to move the code to handle call sites in a subsequent
patch.  Doing so in this patch would make this patch a lot longer
because we need to move other things like CallSiteEntry and
CallSiteEntryHash.
2025-07-20 08:27:24 -07:00
Snehasish Kumar
16c7b3c9f5
[MemProf] Split MemProfiler into Instrumentation and Use. (#142811)
Most of the recent development on the MemProfiler has been on the Use part. The instrumentation has been quite stable for a while. As the complexity of the use grows (with undrifting, diagnostics, etc.) I figured it would be good to separate these two implementations.
2025-06-05 07:36:50 -07:00