15 Commits

Author SHA1 Message Date
Mircea Trofin
c8365feed7
[ctx_prof] Simple ICP criteria during module inliner (#109881)
This is mostly for test: under contextual profiling, we perform ICP for those indirect callsites which have targets marked as `alwaysinline`.

This helped uncover a bug with the way the profile was updated upon ICP, where we were skipping over the update if the target wasn't called in that context. That was resulting in incorrect counts for the indirect BB.

Also flyby fix to the total/direct count values, they should be 64-bit (as all counters are in the contextual profile)
2024-09-25 15:05:52 -07:00
Mircea Trofin
783bac7ffb
[ctx_prof] Handle select and its step instrumentation (#109185)
The `step` instrumentation shouldn't be treated, during use, like an `increment`. The latter is treated as a BB ID. The step isn't that, it's more of a type of value profiling. We need to distinguish between the 2 when really looking for BB IDs (==increments), and handle appropriately `step`s. In particular, we need to know when to elide them because `select`s may get elided by function cloning, if the condition of the select is statically known.
2024-09-23 15:21:25 -07:00
Mircea Trofin
ce9209f50e
[ctx_prof] Fix ProfileAnnotator::allTakenPathsExit (#109183)
Added tests to the validator and fixed issues stemming from the previous skipping over BBs with single successors - which is incorrect. That would be now picked by added tests where the assertions are expected to be triggered.
2024-09-18 21:08:34 -07:00
Mircea Trofin
b2d3c315d5
[ctx_prof] Fix checks in PGOCtxprofFlattening (#108467)
The assertion that all out-edges of a BB can't be 0 is incorrect: they
can be, if that branch is on a cold subgraph.

Added validators and asserts about the expected proprerties of the
propagated counters.
2024-09-17 18:19:20 -07:00
Mircea Trofin
3b22618094
[ctx_prof] Insert the ctx prof flattener after the module inliner (#107499)
This patch enables experimenting with the contextual profile. ICP is currently disabled in this case - will reenable it subsequently. Also subsequently the inline cost model / decision making would be updated to be context-aware. Right now, this just achieves "complete use" of the profile, in that it's ingested, maintained, and sunk to a flat profile when not needed anymore.

Issue [#89287](https://github.com/llvm/llvm-project/issues/89287)
2024-09-09 18:16:24 -07:00
Mircea Trofin
fe6c025037 [nfc][ctx_prof] Fix the second source of nondeterminism in CtxProfAnalysisPrinterPass
Verified on a build with `LLVM_REVERSE_ITERATION=ON`

Issue #106855
2024-09-06 21:54:23 -07:00
Mircea Trofin
1d2da210a2 [nfc][ctx_prof] Test comments and more clarity in values. 2024-09-06 21:10:24 -07:00
Mircea Trofin
644c9020ff [ctx_prof] Fix nondeterminism in CtxProfAnalysisPrinterPass
Using a `std::map` instead of a `DenseMap` for the FuncInfo
member.

Addresses the ctx_prof part of Issue #106855
2024-09-06 19:22:21 -07:00
Mircea Trofin
775c50709c
[ctx_prof] Flattened profile lowering pass (#107329)
Pass to flatten and lower the contextual profile to profile (i.e. `MD_prof`) metadata. This is expected to be used after all IPO transformations have happened.

Prior to lowering, the instrumentation is maintained during IPO and the contextual profile is kept in sync (see PRs #105469, #106154). Flattening (#104539) sums up all the counters belonging to all a function's context nodes.

We first propagate counter values (from the flattened profile) using the same propagation algorithm as `PGOUseFunc::populateCounters`, then map the edge values to `branch_weights`. Functions. in the module that don't have an entry in the flattened profile are deemed cold, and any `MD_prof` metadata they may have is reset. The profile summary is also reset at this point.

Issue [#89287](https://github.com/llvm/llvm-project/issues/89287)
2024-09-06 13:47:08 -07:00
Mircea Trofin
6cb2d40387
[ctx_prof] Handle case when no root is in this Module. (#107463)
If none of the functions in this `Module` are roots in the contextual profile, we can't use it and should just return the `{}` case.
2024-09-06 13:44:05 -07:00
Mircea Trofin
3209766608
[ctx_prof] Add Inlining support (#106154)
Add an overload of `InlineFunction` that updates the contextual profile. If there is no contextual profile, this overload is equivalent to the non-contextual profile variant.

Post-inlining, the update mainly consists of:
- making the PGO instrumentation of the callee "the caller's": the owner function (the "name" parameter of the instrumentation instructions) becomes the caller, and new index values are allocated for each of the callee's indices (this happens for both increment and callsite instrumentation instructions)
- in the contextual profile:
   - each context corresponding to the caller has its counters updated to incorporate the counters inherited from the callee at the inlined callsite. Counter values are copied as-is because no scaling is required since the profile is contextual.
   - the contexts of the callee (at the inlined callsite) are moved to the caller.
   - the callee context at the inlined callsite is deleted.
2024-09-03 16:14:05 -07:00
Mircea Trofin
22d3fb182c
[ctx_prof] Profile flatterner (#104539)
Eventually we'll need to flatten the profile (at the end of all IPO) and lower to "vanilla" `MD_prof`. This is the first part of that.

Issue #89287
2024-08-21 10:52:10 -07:00
Mircea Trofin
aca01bff07
[ctx_prof] CtxProfAnalysis: populate module data (#102930)
Continuing from #102084, which introduced the analysis, we now populate
it with info about functions contained in the module.

When we will update the profile due to e.g. inlined callsites, we'll
ingest the callee's counters and callsites to the caller. We'll move
those to the caller's respective index space (counter and callers), so
we need to know and maintain where those currently end.

We also don't need to keep profiles not pertinent to this module.

This patch also introduces an arguably much simpler way to track the
GUID of a function from the frontend compilation, through ThinLTO, and
into the post-thinlink compilation step, which doesn't rely on keeping
names around. A separate RFC and patches will discuss extending this to
the current PGO (instrumented and sampled) and other consumers as an
infrastructural component.
2024-08-14 18:46:25 -07:00
Mircea Trofin
8ccd4208bb Fix post #102084: restrict to linux to avoid formatting diffs. 2024-08-07 11:53:46 -07:00
Mircea Trofin
dbbf0762b6
[ctx_prof] CtxProfAnalysis (#102084)
This is an immutable analysis that loads and makes the contextual profile available to other passes. This patch introduces the analysis and an analysis printer pass. Subsequent patches will introduce the APIs that IPO passes will call to modify the profile as result of their changes.
2024-08-07 14:39:48 -04:00