17 Commits

Author SHA1 Message Date
Mircea Trofin
c4952e513f
[nfc][ctx_prof] Efficient profile traversal and update (#110052)
This optimizes profile updates and visits, where we want to access contexts for a specific function. These are all the current update cases. We do so by maintaining a list of contexts for each function, preserving preorder traversal. The list is updated whenever contexts are `std::move`-d or deleted.
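As a rough illustration of the idea (not the code from this commit), the sketch below indexes contexts per function in preorder, so an update for one function touches only that function's contexts. The `Ctx`, `FunctionIndex`, `addCallee`, `indexPreorder` and `updateFunction` names are hypothetical, and the index is built in one pass rather than maintained incrementally on move or delete as the commit describes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified context node (not LLVM's actual context type).
struct Ctx {
  uint64_t Guid = 0;                          // function this context is for
  std::vector<uint64_t> Counters;             // per-BB counters
  std::vector<std::unique_ptr<Ctx>> Callees;  // sub-contexts, in callsite order
};

// For each function GUID, its contexts in preorder of the context tree.
using FunctionIndex = std::map<uint64_t, std::vector<Ctx *>>;

// Test/setup helper: append a new sub-context to Parent and return it.
Ctx &addCallee(Ctx &Parent, uint64_t Guid, std::vector<uint64_t> Counters) {
  auto Child = std::make_unique<Ctx>();
  Child->Guid = Guid;
  Child->Counters = std::move(Counters);
  Parent.Callees.push_back(std::move(Child));
  return *Parent.Callees.back();
}

// Visit the node first, then its callees: this yields preorder per function.
void indexPreorder(Ctx &C, FunctionIndex &Index) {
  Index[C.Guid].push_back(&C);
  for (auto &Callee : C.Callees)
    indexPreorder(*Callee, Index);
}

// Apply Update to every context of one function without walking the tree.
template <typename Fn>
void updateFunction(FunctionIndex &Index, uint64_t Guid, Fn Update) {
  auto It = Index.find(Guid);
  if (It == Index.end())
    return;
  for (Ctx *C : It->second)
    Update(*C);
}
```

Rebuilding the index after every change would defeat the purpose, which is presumably why the commit keeps the per-function list up to date as contexts are moved or deleted.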
2024-09-27 08:09:10 -07:00
Mircea Trofin
3d01af78a9 [nfc][ctx_prof] Remove unnecessary include
Removed dependency on `Transforms/Utils` from
`CtxProfAnalysis.cpp` - it was unnecessary to
begin with.
2024-09-25 18:42:47 -07:00
Mircea Trofin
c8365feed7
[ctx_prof] Simple ICP criteria during module inliner (#109881)
This is mostly for testing: under contextual profiling, we perform ICP for those indirect callsites which have targets marked as `alwaysinline`.

This helped uncover a bug with the way the profile was updated upon ICP, where we were skipping over the update if the target wasn't called in that context. That was resulting in incorrect counts for the indirect BB.

Also includes a drive-by fix to the total/direct count values: they should be 64-bit, as all counters in the contextual profile are.
2024-09-25 15:05:52 -07:00
Mircea Trofin
783bac7ffb
[ctx_prof] Handle select and its step instrumentation (#109185)
The `step` instrumentation shouldn't be treated, during use, like an `increment`. The latter is treated as a BB ID. The step isn't that; it's more of a type of value profiling. We need to distinguish between the two when really looking for BB IDs (i.e. `increment`s), and handle `step`s appropriately. In particular, we need to know when to elide them, because `select`s may get elided by function cloning if the condition of the select is statically known.
2024-09-23 15:21:25 -07:00
Mircea Trofin
ee5709b3b4
[nfc][ctx_prof] Don't try finding callsite annotation for un-instrumentable callsites (#109184)
Reinforcing properties ensured at instrumentation time.
2024-09-18 21:13:48 -07:00
Mircea Trofin
82266d3a2b
[nfc][ctx_prof] Factor the callsite instrumentation exclusion criteria (#108471)
Reusing this in the logic fetching the instrumentation in `CtxProfAnalysis`.
2024-09-13 21:25:47 -07:00
Mircea Trofin
6cb2d40387
[ctx_prof] Handle case when no root is in this Module. (#107463)
If none of the functions in this `Module` are roots in the contextual profile, we can't use it and should just return the `{}` case.
2024-09-06 13:44:05 -07:00
Mircea Trofin
3209766608
[ctx_prof] Add Inlining support (#106154)
Add an overload of `InlineFunction` that updates the contextual profile. If there is no contextual profile, this overload is equivalent to the non-contextual profile variant.

Post-inlining, the update mainly consists of:
- making the PGO instrumentation of the callee "the caller's": the owner function (the "name" parameter of the instrumentation instructions) becomes the caller, and new index values are allocated for each of the callee's indices (this happens for both increment and callsite instrumentation instructions)
- in the contextual profile:
   - each context corresponding to the caller has its counters updated to incorporate the counters inherited from the callee at the inlined callsite. Counter values are copied as-is because no scaling is required since the profile is contextual.
   - the contexts of the callee (at the inlined callsite) are moved to the caller.
   - the callee context at the inlined callsite is deleted.
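The profile-side steps above can be sketched for a single caller context as follows. This is a simplified model, not the `InlineFunction` overload itself: `Ctx`, `updateAfterInline`, `NumCalleeCounters` and `FirstNewCS` are hypothetical names, and each callsite is assumed to have at most one callee context.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified context: one callee context per callsite ID.
struct Ctx {
  std::vector<uint64_t> Counters;
  std::map<uint32_t, std::unique_ptr<Ctx>> Callsites;
};

// Updates one caller context after the callee at InlinedCS was inlined.
// NumCalleeCounters and FirstNewCS are per-function facts, so every context
// of the caller gets the same index mapping.
void updateAfterInline(Ctx &Caller, uint32_t InlinedCS,
                       uint32_t NumCalleeCounters, uint32_t FirstNewCS) {
  const size_t Base = Caller.Counters.size();
  // New counter indices are allocated at the end of the caller's index space.
  Caller.Counters.resize(Base + NumCalleeCounters, 0);

  auto It = Caller.Callsites.find(InlinedCS);
  if (It == Caller.Callsites.end())
    return; // The callee never ran in this context: new counters stay 0.

  // Take ownership of the callee context and drop it from the caller.
  std::unique_ptr<Ctx> Callee = std::move(It->second);
  Caller.Callsites.erase(It);
  assert(Callee->Counters.size() == NumCalleeCounters);

  // Copy counter values as-is; no scaling is needed in a contextual profile.
  for (size_t I = 0; I < Callee->Counters.size(); ++I)
    Caller.Counters[Base + I] = Callee->Counters[I];

  // Move the callee's own sub-contexts to the caller, at fresh callsite IDs.
  for (auto &Entry : Callee->Callsites)
    Caller.Callsites[FirstNewCS + Entry.first] = std::move(Entry.second);
}
```

The sketch assumes `FirstNewCS` is larger than any callsite ID already used by the caller, so the remapped IDs cannot collide with existing ones.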
2024-09-03 16:14:05 -07:00
Mircea Trofin
73c3b7337b
[ctx_prof] Add support for ICP (#105469)
An overload of `llvm::promoteCallWithIfThenElse` that updates the contextual profile.

High-level, this is very simple: after creating the `if... then (direct call) else (indirect call)` structure, we instrument the new callsites and BBs (the instrumentation will help with tracking for other IPO transformations, and, ultimately, to match counter values before flattening to `MD_prof`).

In more detail:

- move the callsite instrumentation of the indirect call to the `else` BB, before the indirect call
- create a new callsite instrumentation for the direct call
- create instrumentation for both the `then` and `else` BBs - we could instrument just one (MST-style), but we're not running the binary with this instrumentation, and at most this would save some space (fewer counters tracked). For simplicity, we instrument both at this point
- update each context belonging to the caller by updating the counters, and moving the indirect callee to the new, direct callsite ID
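The last step can be sketched for one caller context as below. This is a toy model under stated assumptions rather than the `promoteCallWithIfThenElse` overload: `Ctx`, `Targets`, `addTarget` and `promoteInContext` are hypothetical names, `Counters[0]` is assumed to be a context's entry count, and the `then`/`else` counter values are assumed to be derived from the entry counts of the targets observed at the indirect callsite.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

struct Ctx;
// Callee GUID -> context; an indirect callsite may have several targets.
using Targets = std::map<uint64_t, std::unique_ptr<Ctx>>;

// Hypothetical, simplified context. Counters[0] is the entry count.
struct Ctx {
  std::vector<uint64_t> Counters;
  std::map<uint32_t, Targets> Callsites;
};

// Test/setup helper: record a target with the given entry count.
void addTarget(Ctx &Caller, uint32_t CS, uint64_t Guid, uint64_t EntryCount) {
  auto Callee = std::make_unique<Ctx>();
  Callee->Counters = {EntryCount};
  Caller.Callsites[CS][Guid] = std::move(Callee);
}

// Updates one caller context after promoting TargetGuid at IndirectCS.
void promoteInContext(Ctx &Caller, uint32_t IndirectCS, uint64_t TargetGuid,
                      uint32_t DirectCS, uint32_t ThenCounter,
                      uint32_t ElseCounter) {
  // Make room for the counters of the new `then` and `else` BBs.
  const size_t Needed = std::max(ThenCounter, ElseCounter) + 1u;
  if (Caller.Counters.size() < Needed)
    Caller.Counters.resize(Needed, 0);

  auto CSIt = Caller.Callsites.find(IndirectCS);
  if (CSIt == Caller.Callsites.end())
    return; // The callsite never executed here: both counters stay 0.

  Targets &T = CSIt->second;
  uint64_t Total = 0, Direct = 0; // 64-bit, like all profile counters.
  for (const auto &Entry : T)
    Total += Entry.second->Counters[0];

  auto TIt = T.find(TargetGuid);
  if (TIt != T.end()) {
    Direct = TIt->second->Counters[0];
    // Move the promoted target's context to the new, direct callsite ID.
    Caller.Callsites[DirectCS][TargetGuid] = std::move(TIt->second);
    T.erase(TIt);
  }
  // The `else` counter is set even when the target was never called here.
  Caller.Counters[ThenCounter] = Direct;
  Caller.Counters[ElseCounter] = Total - Direct;
}
```

Inserting into `Caller.Callsites` while holding `T` and `TIt` is safe because `std::map` insertion does not invalidate references or iterators to existing elements.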

Issue #89287
2024-08-27 15:50:13 -07:00
Mircea Trofin
1e70122cbc
[ctx_prof] API to get the instrumentation of a BB (#105468)
Analogous to PR #104491 

Issue #89287
2024-08-21 17:17:46 -07:00
Mircea Trofin
22d3fb182c
[ctx_prof] Profile flattener (#104539)
Eventually we'll need to flatten the profile (at the end of all IPO) and lower to "vanilla" `MD_prof`. This is the first part of that.
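A minimal sketch of what flattening could look like, assuming it means summing, for each function, the counters of all its contexts. `Ctx`, `FlatProfile`, `addCallee` and `flatten` are hypothetical names, and the later lowering to `MD_prof` is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified context node.
struct Ctx {
  uint64_t Guid = 0;
  std::vector<uint64_t> Counters;
  std::vector<std::unique_ptr<Ctx>> Callees;
};

// Function GUID -> counters summed over every context of that function.
using FlatProfile = std::map<uint64_t, std::vector<uint64_t>>;

// Test/setup helper: append a new sub-context to Parent and return it.
Ctx &addCallee(Ctx &Parent, uint64_t Guid, std::vector<uint64_t> Counters) {
  auto Child = std::make_unique<Ctx>();
  Child->Guid = Guid;
  Child->Counters = std::move(Counters);
  Parent.Callees.push_back(std::move(Child));
  return *Parent.Callees.back();
}

// Accumulate this context into the flat profile, then recurse into callees.
void flatten(const Ctx &C, FlatProfile &Flat) {
  std::vector<uint64_t> &Sum = Flat[C.Guid];
  if (Sum.size() < C.Counters.size())
    Sum.resize(C.Counters.size(), 0);
  for (size_t I = 0; I < C.Counters.size(); ++I)
    Sum[I] += C.Counters[I];
  for (const auto &Callee : C.Callees)
    flatten(*Callee, Flat);
}
```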

Issue #89287
2024-08-21 10:52:10 -07:00
Mircea Trofin
c8a678b1e4
[ctx_prof] Add analysis utility to fetch ID of a callsite (#104491)
This will be needed when maintaining the contextual profile for ICP or inlining - we'll need to first fetch the ID of a callsite, which is in an instrumentation instruction (intrinsic) preceding the callsite.
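To illustrate the lookup, here is a toy sketch over a made-up instruction list; the real utility works on LLVM instructions and intrinsics. `Kind`, `Inst` and `getCallsiteInstr` are hypothetical names, and the sketch assumes the instrumentation sits immediately before the call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy IR: just enough to model a basic block.
enum class Kind { Increment, CallsiteInstr, Call, Other };

struct Inst {
  Kind K;
  uint32_t Id; // Meaningful only for instrumentation instructions.
};

// Returns the callsite instrumentation right before the call at CallIdx,
// or nullptr if that call is not instrumented.
const Inst *getCallsiteInstr(const std::vector<Inst> &BB, size_t CallIdx) {
  assert(CallIdx < BB.size() && "index must be inside the block");
  if (CallIdx == 0 || BB[CallIdx].K != Kind::Call)
    return nullptr;
  const Inst &Prev = BB[CallIdx - 1];
  return Prev.K == Kind::CallsiteInstr ? &Prev : nullptr;
}
```

A caller would read `Id` from the returned instruction to find the matching callsite entry in each context of the enclosing function.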
2024-08-20 10:49:42 -07:00
Mircea Trofin
50c876a486
[nfc][ctx_prof] Remove the need for PassBuilder to know about UseCtxProfile (#104492)
2024-08-15 13:16:55 -07:00
Haojian Wu
8d03710728 [ctx_prof] Remove an unneeded include in CtxProfAnalysis.cpp
2024-08-15 08:17:22 +02:00
Mircea Trofin
aca01bff07
[ctx_prof] CtxProfAnalysis: populate module data (#102930)
Continuing from #102084, which introduced the analysis, we now populate
it with info about functions contained in the module.

When we update the profile due to e.g. inlined callsites, we'll
ingest the callee's counters and callsites into the caller. We'll move
those to the caller's respective index spaces (counters and callsites),
so we need to know and maintain where those currently end.

We also don't need to keep profiles not pertinent to this module.

This patch also introduces an arguably much simpler way to track the
GUID of a function from the frontend compilation, through ThinLTO, and
into the post-thinlink compilation step, which doesn't rely on keeping
names around. A separate RFC and patches will discuss extending this to
the current PGO (instrumented and sampled) and other consumers as an
infrastructural component.
2024-08-14 18:46:25 -07:00
Mircea Trofin
4a2bf05980 Reapply "[ctx_prof] Fix the pre-thinlink "use" case (#102511)"
This reverts commit 967185eeb85abb77bd6b6cdd2b026d5c54b7d4f3.

The problem was link dependencies, moved `UseCtxProfile` to `Analysis`.
2024-08-08 17:04:00 -07:00
Mircea Trofin
dbbf0762b6
[ctx_prof] CtxProfAnalysis (#102084)
This is an immutable analysis that loads the contextual profile and makes it available to other passes. This patch introduces the analysis and an analysis printer pass. Subsequent patches will introduce the APIs that IPO passes will call to modify the profile as a result of their changes.
2024-08-07 14:39:48 -04:00