51 Commits

Author SHA1 Message Date
Mircea Trofin
f32e5bdcef
[NFC] Rename the Nr abbreviation to Num (#107151)
It's clearer. (This isn't exhaustive.)
2024-09-05 12:34:47 -07:00
Mircea Trofin
1991aa6b48
Reapply "[nfc][mlgo] Incrementally update DominatorTreeAnalysis in FunctionPropertiesAnalysis (#104867)" (#106309)
Reverts c992690179eb5de6efe47d5c8f3a23f2302723f2.

The problem is that if there is a sequence "{delete A->B} {delete A->B}
{insert A->B}" the net result is "{delete A->B}", which is not what we
want.

Duplicate successors may happen in cases like switch statements (as
shown in the unit test).

The second problem was that in `invoke` cases, some edges we speculated might get deleted are not actually deleted, yet they are also not reachable from the inlined call site's basic block. We just need to check which edges are actually no longer present.

The fix is to sanitize the list of deletes, just like we do for inserts.
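
A minimal sketch of the sanitizing idea, using only the standard library rather than LLVM's dominator tree update types. The `Edge` alias and the `sanitizeDeletes` helper are hypothetical names chosen for illustration, and the logic in the actual patch may differ: each speculated delete is kept at most once, and only if the edge is really absent from the CFG after inlining.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CFG edge: (from, to) block names.
using Edge = std::pair<std::string, std::string>;

// Keep each speculated delete at most once, and only when the edge is
// really absent from the CFG after inlining.
std::vector<Edge> sanitizeDeletes(const std::vector<Edge> &SpeculatedDeletes,
                                  const std::set<Edge> &EdgesInCFG) {
  std::set<Edge> Seen;
  std::vector<Edge> Result;
  for (const Edge &E : SpeculatedDeletes) {
    if (EdgesInCFG.count(E))
      continue; // The edge still exists, so it was not actually deleted.
    if (Seen.insert(E).second)
      Result.push_back(E); // First time we see this deleted edge.
  }
  return Result;
}
```

With duplicate successors, as in the switch case, the list `A->B, A->B` collapses to a single delete, so a later insert of `A->B` cannot be cancelled twice.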
2024-08-29 18:28:09 -07:00
Hans Wennborg
c992690179 Revert "[nfc][mlgo] Incrementally update DominatorTreeAnalysis in FunctionPropertiesAnalysis (#104867)"
This seems to cause asserts in our builds:

  llvm/include/llvm/Support/GenericDomTreeConstruction.h:927:
  static void llvm::DomTreeBuilder::SemiNCAInfo<llvm::DominatorTreeBase<BasicBlock, false>>::DeleteEdge(DomTreeT &, const BatchUpdatePtr, const NodePtr, const NodePtr) [DomTreeT = llvm::DominatorTreeBase<BasicBlock, false>]:
  Assertion `!IsSuccessor(To, From) && "Deleted edge still exists in the CFG!"' failed.

and

  llvm/lib/Analysis/FunctionPropertiesAnalysis.cpp:390:
  DominatorTree &llvm::FunctionPropertiesUpdater::getUpdatedDominatorTree(FunctionAnalysisManager &) const:
  Assertion `DT.getNode(BB)' failed.

See comment on the PR.

> We need the dominator tree analysis for loop info analysis, which we need to get features like most nested loop and number of top level loops. Invalidating and recomputing these from scratch after each successful inlining can sometimes lead to lengthy compile times. We don't need to recompute from scratch, though, since we have some boundary information about where the changes to the CFG happen; moreover, for dom tree, the API supports incrementally updating the analysis result.
>
> This change addresses the dom tree part. The loop info is still recomputed from scratch. This does reduce the compile time quite significantly already, though (~5x in a specific case)
>
> The loop info change might be more involved and would follow in a subsequent PR.

This reverts commit a2a5508bdae7d115b6c3ace461beb7a987a44407 and the
follow-up commit cdd11d694a406a98a16d6265168ee2fbe1b6a87c.
2024-08-27 16:04:26 +02:00
Mircea Trofin
a2a5508bda
[nfc][mlgo] Incrementally update DominatorTreeAnalysis in FunctionPropertiesAnalysis (#104867)
We need the dominator tree analysis for loop info analysis, which we need to get features like most nested loop and number of top level loops. Invalidating and recomputing these from scratch after each successful inlining can sometimes lead to lengthy compile times. We don't need to recompute from scratch, though, since we have some boundary information about where the changes to the CFG happen; moreover, for dom tree, the API supports incrementally updating the analysis result.

This change addresses the dom tree part. The loop info is still recomputed from scratch. This does reduce the compile time quite significantly already, though (~5x in a specific case)

The loop info change might be more involved and would follow in a subsequent PR.
2024-08-23 13:13:41 -07:00
Arthur Eubanks
94471e6d23
[MLInliner] Handle CGSCC changes from #94815 (#96274)
With #94815, the nodes belonging to dead functions are no longer
invalidated, but kept around to batch delete at the end of the call
graph walk.

The ML inliner needs to be updated to handle this. This fixes some
asserts getting hit, e.g. https://crbug.com/348376263.
2024-07-03 10:14:49 -07:00
Arthur Eubanks
ebdb6f4ef4
[MLInliner] Keep track of deleted functions (#97348)
As opposed to using Node::isDead(), which is no longer accurate after
#94815.

This is only used in diagnostics.
2024-07-02 10:41:26 -07:00
Arthur Eubanks
81f4fb65d8
[MLInliner] Simplify NodeCount bookkeeping (#96576)
Rather than doing delta counting of the total number of functions, just
increment it when we see a new function.
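
As an illustrative sketch only (the `NodeCounter` class and its members are hypothetical, not the MLInlineAdvisor code), counting on first sight could look like this, with opaque pointers standing in for call graph nodes:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical helper: counts each node the first time it is observed.
class NodeCounter {
  std::unordered_set<const void *> Seen; // Nodes already counted.
  std::int64_t NodeCount = 0;

public:
  // Increment only for a node we have not seen before; no deltas needed.
  void observe(const void *Node) {
    assert(Node && "expected a valid node");
    if (Seen.insert(Node).second)
      ++NodeCount;
  }

  std::int64_t count() const { return NodeCount; }
};
```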
2024-07-01 13:12:10 -07:00
Nikita Popov
4169338e75
[IR] Don't include Module.h in Analysis.h (NFC) (#97023)
Replace it with a forward declaration instead. Analysis.h is pulled in
by all passes, but not all passes need to access the module.
2024-06-28 14:30:47 +02:00
Mircea Trofin
600ff28772
[mlgo] add 2 new features whether caller/callee is available_externally (#96585)
AvailableExternally linkage is interesting because, in ThinLTO cases, it
means the function may get elided if it survives inlining - see
`elim-avail-extern` pass.
2024-06-25 12:36:40 -07:00
Mircea Trofin
313b1a8250
[mlgo] Support composite AOT-ed models (#96276)
This applies to the AOT case where we embed models in the compiler. The
change adds support for multiple models for the same agent, and allows
the user to select one via a command line flag. "agent" refers to e.g. the
inline advisor or the register allocator eviction advisor.

To avoid build setup complexity, the support is delegated to the saved
model. Since saved models define computational graphs, we can generate a
composite model (this happens prior to building and embedding it in LLVM
and is not shown in this change) that exposes an extra feature with a
predefined name: `_model_selector`. The model, then, delegates
internally to contained models based on that feature value.

Model selection is expected to happen at model instantiation; there is
no current scenario for switching them afterwards.

If the model doesn't expose such a feature but the user passes one, we
report an error.

If the model exposes such a feature but the user doesn't pass one, we
also report an error.
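
The two error cases above can be sketched as a small validity check. This is a hedged illustration, not the `ReleaseModeModelRunner` code: the function name `checkModelSelector`, its parameters, and the error strings are all assumptions.

```cpp
#include <cassert>
#include <string>

// Hypothetical check: returns an empty string when the combination is
// valid, otherwise a description of the error to report.
std::string checkModelSelector(bool ModelExposesSelector,
                               const std::string &UserSelector) {
  const bool UserPassedSelector = !UserSelector.empty();
  if (!ModelExposesSelector && UserPassedSelector)
    return "a model selector was given, but the model does not expose one";
  if (ModelExposesSelector && !UserPassedSelector)
    return "the model expects a selector, but none was given";
  return ""; // Both absent (non-composite model) or both present.
}
```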

Invalid model selector values are expected to be handled by the saved
model.

Internally, the model uses a pair of uint64 values - the high and low of
the MD5 hash of the name.

A tool composing models would, then, need to:
- expose the extra feature, `_model_selector`, shape (2,), uint64 data
type
- test its value (`tf.cond` or `tf.case` in Tensorflow) against the MD5
hash, in the [high, low] order, of contained models based on a
user-specified name (which the user will then use as flag value to the
compiler)

Agents just need to add a flag to capture the name of a model and pass
it to `ReleaseModeModelRunner` at construction. This can be passed in
all cases without checking - in the case where the model is not composite
and we pass an empty name, everything works as before.

This change also factors out the string flags we pass to the
`ReleaseModeModelRunner` for better maintainability (we risk confusing
parameters that are strings otherwise)
2024-06-24 13:35:47 -07:00
Arthur Eubanks
0555afd024
[NFC][MLInliner] Rename LastSCC -> CurSCC (#96546)
The passed SCC is the current SCC we're working on.
2024-06-24 13:25:06 -07:00
Mircea Trofin
6037a698b9
[mlgo] inline for size: add bypass mechanism for preserving performance (#95616)
This allows shrinking for size the cold part of the code, without sacrificing performance.
2024-06-17 14:18:55 -07:00
Mircea Trofin
1b3fc40586
[mlgo][coro] Assign coro split-ed functions a FunctionLevel (#68263) 2023-10-04 21:20:00 -07:00
Jacob Hegna
cc781a4e27 [MLGO] Fix build error concerning ScalarShape. 2023-04-27 23:52:36 +00:00
Jacob Hegna
f9b3e3411c Adjust macros which define the ML inlining features.
This aligns the inlining macros more closely with how the regalloc
macros are defined.

 - Explicitly specify the dtype/shape
 - Remove separate names for python/C++
 - Add docstring for inline cost features

Differential Revision: https://reviews.llvm.org/D149384
2023-04-27 22:47:12 +00:00
Mircea Trofin
f3b5fca12a [mlgo] Fix the help message for interactive mode default advice
This avoids the use-after-free introduced by D147794 and fixed
in 437dfa5b0365.
2023-04-11 13:04:11 -07:00
Mehdi Amini
437dfa5b03 Fix use-after-free in help message: this cl::opt was binding a StringRef to a temporary string
Caught by ASAN on a bot: https://lab.llvm.org/buildbot/#/builders/168/builds/12872/steps/14/logs/stdio
2023-04-11 00:26:15 -06:00
Mircea Trofin
ab2e7666c2 [mlgo][inl] Interactive mode: optionally tell the default decision
This helps training algorithms that may want to sometimes replicate the
default decision. The default decision is presented as an extra feature
called `inlining_default`. It's not normally exported to save
computation time.

This is only available in interactive mode.

Differential Revision: https://reviews.llvm.org/D147794
2023-04-10 12:20:09 -07:00
Mircea Trofin
5fd51fcba6 Reland "[mlgo] Hook up the interactive runner to the mlgo-ed passes"
This reverts commit a772f0bb920a4957fb94dd8dbe45943809fd0ec3.

The main problem was related to how we handled `dbgs()` from the hosted
compiler. Using explicit `subprocess.communicate`, and not relying on
dbgs() being flushed until the end appears to address the problem.

Also some fixes due to some bots running older pythons, so we can't have
nice things like `int | float` and such.
2023-02-03 17:54:42 -08:00
Mircea Trofin
a772f0bb92 Revert "[mlgo] Hook up the interactive runner to the mlgo-ed passes"
This reverts commit a7354899d1a235a796b3a2ccb45f6596983c8672.

The way stdout/stderr get routed seems to work differently locally and
on the bots. Investigating.
2023-02-03 16:34:31 -08:00
Mircea Trofin
a7354899d1 [mlgo] Hook up the interactive runner to the mlgo-ed passes
This hooks up the interactive model runner to the passes that support
ml-based decisions. Because the interface to this runner is the exact
same as the one used during inference, we just reuse the exact same
setup we have for "release mode". This makes "release mode" a misnomer -
and that's something we needed to resolve sooner or later (e.g.
supporting more than one embedded model for the same problem was another
reason to drop that nomenclature). That will happen in a subsequent
change.

To use this evaluator, just enable the pass in (currently) "release"
mode, but also pass the base name for the 2 channel files via the
pass-specific flag.

The 2 files are the responsibility of the hosting process. The added
tests use a minimal, toy such host, illustrating setup and
communication.

Differential Revision: https://reviews.llvm.org/D143218
2023-02-03 16:22:57 -08:00
Mircea Trofin
5617fb1411 [MLGO][NFC] Use std::map instead of DenseMap to avoid use after free
In `MLInlineAdvisor::getAdviceImpl`, we call `getCachedFPI` twice, once
for the caller, once for the callee, so the second may invalidate the
reference obtained by the first because the underlying implementation of
the cache is a `DenseMap`. `std::map` doesn't have that problem.
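
A small standard-library sketch of the property being relied on: `std::map` keeps references to existing elements valid when new elements are inserted. The `Props` struct and `PropsCache` class are hypothetical stand-ins for the cached function properties, not the LLVM types.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for cached per-function properties.
struct Props {
  int BlockCount = 0;
};

// Hypothetical cache keyed by function name. Because std::map is node
// based, a reference returned by get() stays valid across later inserts.
class PropsCache {
  std::map<std::string, Props> Cache;

public:
  // Returns the cached entry, creating a default one on first use.
  Props &get(const std::string &Name) { return Cache[Name]; }
};
```

Fetching the caller's entry, then the callee's, leaves the caller reference usable, which is the pattern in `getAdviceImpl` described above.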
2022-11-04 16:07:24 -07:00
Mircea Trofin
7ae92a69c2 [MLInliner] No need to invalidate everything post-inlining.
We really just need to invalidate loop info and the dominator tree, in
addition to the FunctionPropertiesInfo we were invalidating originally.
Doing more adds unnecessary compile time overhead.
2022-06-24 18:22:06 -07:00
Mircea Trofin
7f24e574d4 [MLInliner] Don't inline call sites in unreachable basic blocks
This requires DominatorTree be updated, which we do in the ml inliner
case, but not in the default case, and the cost of doing so is
noticeable to compile time for the latter[1]. So the patch only affects
the ML inliner.

[1] https://llvm-compile-time-tracker.com/compare.php?from=9fc0aa45e3312944431ba7e1ca0cec99c613992b&to=7af461b1ce0d9138211ef5f883f35d5b9ddf47be&stat=wall-time

Differential Revision: https://reviews.llvm.org/D127899
2022-06-16 09:14:22 -07:00
Jin Xin Ng
aaff3fb6d5 [mlgo] Fix accounting for SCC splits
Previously if the inliner split an SCC such that an empty one remained, the MLInlineAdvisor could potentially lose track of the EdgeCount if a subsequent CGSCC pass modified the calls of a function that was initially in the SCC pre-split. Saving the seen nodes in onPassEntry resolves this.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D127693
2022-06-15 10:53:23 -07:00
Mircea Trofin
22a1f998f7 FunctionPropertiesAnalysis: handle callsite BBs that lose edges
There could be successors that were reached before but now are only
reachable from elsewhere in the CFG.

Suppose the following diamond CFG (lines are arrows pointing down):
    A
  /   \
 B     C
  \   /
    D
There's a call site in C that is inlined. Upon doing that, it turns out
it expands to:
   call void @llvm.trap()
   unreachable
D isn't reachable from C anymore, but we did discount it when we set up
FunctionPropertiesUpdater, so we need to re-include it here.
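
The diamond example can be checked with a plain reachability walk. This is only a sketch with a hypothetical `CFG` alias and `reachableFrom` helper, using block names instead of LLVM basic blocks; it does not reproduce the updater's bookkeeping.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical CFG: block name -> successor block names.
using CFG = std::map<std::string, std::vector<std::string>>;

// Collect every block reachable from Start, including Start itself.
std::set<std::string> reachableFrom(const CFG &G, const std::string &Start) {
  std::set<std::string> Visited;
  std::vector<std::string> Worklist{Start};
  while (!Worklist.empty()) {
    std::string BB = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(BB).second)
      continue; // Already visited.
    auto It = G.find(BB);
    if (It == G.end())
      continue; // No successors recorded for this block.
    for (const std::string &Succ : It->second)
      Worklist.push_back(Succ);
  }
  return Visited;
}
```

After inlining, C has no successors: D is absent from `reachableFrom(G, "C")` but still present in `reachableFrom(G, "A")` through B, so it must keep contributing to the function's properties.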

The patch also updates loop accounting to use LoopInfo rather than
traverse BBs.

Differential Revision: https://reviews.llvm.org/D127353
2022-06-14 15:19:44 -07:00
Mircea Trofin
7e7021ca1a [mlgo] Update FunctionPropertyCache after invalidating analyses
The update depends on LoopInfo, so we need that refreshed first, not
after.

Differential Revision: https://reviews.llvm.org/D127467
2022-06-10 16:18:14 -07:00
Jin Xin Ng
a3a7826d82 [mlgo] Disable accounting upon ForceStop
Once ForceStop is set to true, we only return positive inlining advice when it is mandatory; there is no need for further node/edge accounting.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D127245
2022-06-08 14:26:06 -07:00
Mircea Trofin
f46dd19b48 [mlgo] Incrementally update FunctionPropertiesInfo during inlining
Re-computing FunctionPropertiesInfo after each inlining may be very time
consuming: in certain cases, e.g. large caller with lots of callsites,
and when the overall IR doesn't increase (thus not tripping a size bloat
threshold).

This patch addresses this by incrementally updating
FunctionPropertiesInfo.

Differential Revision: https://reviews.llvm.org/D125841
2022-05-31 17:27:32 -07:00
Mircea Trofin
c35ad9ee4f [mlgo] Support exposing more features than those supported by models
This allows the compiler to support more features than those supported by a
model. The only requirement (development mode only) is that the new
features must be appended at the end of the list of features requested
from the model. The support is transparent to compiler code: for
unsupported features, we provide a valid buffer to copy their values;
it's just that this buffer is disconnected from the model, so insofar
as the model is concerned (AOT or development mode), these features don't
exist. The buffers are allocated at setup - meaning, at steady state,
there is no extra allocation (maintaining the current invariant). These
buffers have 2 roles: first, keep the compiler code simple; second, allow
logging their values in development mode. The latter allows retraining
a model supporting the larger feature set starting from traces produced
with the old model.
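
A rough sketch of the buffer arrangement, assuming scalar int64 features for brevity. The `FeatureBuffers` class is hypothetical: in the real code the model-side buffers belong to the model runner, while here a plain vector stands in for them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical holder for scalar int64 features. The first
// NumModelFeatures entries stand in for buffers owned by the model; the
// rest are scratch buffers the model never reads.
class FeatureBuffers {
  std::vector<std::int64_t> ModelInputs;
  std::vector<std::int64_t> Scratch;

public:
  // All storage is allocated here, so steady-state use does not allocate.
  FeatureBuffers(std::size_t NumModelFeatures, std::size_t NumCompilerFeatures)
      : ModelInputs(NumModelFeatures),
        Scratch(NumCompilerFeatures > NumModelFeatures
                    ? NumCompilerFeatures - NumModelFeatures
                    : 0) {
    assert(NumCompilerFeatures >= NumModelFeatures &&
           "extra features must be appended after the model's features");
  }

  bool isSeenByModel(std::size_t Index) const {
    return Index < ModelInputs.size();
  }

  // The caller writes feature values the same way in both cases.
  std::int64_t *getBuffer(std::size_t Index) {
    assert(Index < ModelInputs.size() + Scratch.size());
    if (isSeenByModel(Index))
      return &ModelInputs[Index];
    return &Scratch[Index - ModelInputs.size()];
  }
};
```

Compiler code always calls `getBuffer` and writes the value; whether the model reads it depends only on the feature's position in the list.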

For release mode (AOT-ed models), this decouples compiler evolution from
model evolution, which we want in scenarios where the toolchain is
frequently rebuilt and redeployed: we can first deploy the new features,
and continue working with the older model, until a new model is made
available, which can then be picked up the next time the compiler is built.

Differential Revision: https://reviews.llvm.org/D124565
2022-05-09 18:01:21 -07:00
Mircea Trofin
261419273a Fix build breaks on ml-* bots introduced by include cleanups 2022-03-01 11:29:18 -08:00
serge-sans-paille
71c3a5519d Cleanup includes: LLVMAnalysis
Number of lines output by preprocessor:
before: 1065940348
after:  1065307662

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120659
2022-03-01 18:01:54 +01:00
Mircea Trofin
b1af01fe6a [NFC][MLGO] Simplify conditional compilation
Most of the code that's shared between 'release' and 'development'
modes doesn't depend on anything special.
2022-01-24 11:19:04 -08:00
Mircea Trofin
f29256a64a [MLGO] Improved support for AOT cross-targeting scenarios
The tensorflow AOT compiler can cross-target, but it can't run on (for
example) arm64. We added earlier support where the AOT-ed header and object
would be built on a separate builder and then passed at build time to
a build host where the AOT compiler can't run, but clang can be otherwise
built.

To simplify such scenarios given we now support more than one AOT-able
case (regalloc and inliner), we make the AOT scenario centered on whether
files are generated, case by case (this includes the "passed from a
different builder" scenario).
This means we shouldn't need an 'umbrella' LLVM_HAVE_TF_AOT, in favor of
case by case control. A builder can opt out of an AOT case by passing that case's
model path as `none`. Note that the overrides still take precedence.

This patch controls conditional compilation with case-specific flags,
which can be enabled locally, for the component where those are
available. We still keep an overall flag for some tests.

The 'development/training' mode is unchanged, because there the model is
passed from the command line and interpreted.

Differential Revision: https://reviews.llvm.org/D117752
2022-01-20 07:05:39 -08:00
Mircea Trofin
3e8553aab4 [mlgo][inline] Improve global state tracking
The global state refers to the number of the nodes currently in the
module, and the number of direct calls between nodes, across the
module.

Node counts are not a problem; edge counts are because we want strictly
the kind of edges that affect inlining (direct calls), and that is not
easily obtainable without iteration over the whole module.

This patch avoids relying on analysis invalidation because it turned out
to be too aggressive in some cases. It leverages the fact that Node
objects are stable - they do not get deleted while cgscc passes are
run over the module; and cgscc pass manager invariants.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D115847
2022-01-18 17:45:34 +00:00
Mircea Trofin
248d55af3e [NFC][MLGO] Use LazyCallGraph::Node to track functions.
This avoids the InlineAdvisor carrying the responsibility of deleting
Function objects. We use LazyCallGraph::Node objects instead, which are
stable in memory for the duration of the Module-wide performance of CGSCC
passes started under the same ModuleToPostOrderCGSCCPassAdaptor (which
is the case here)

Differential Revision: https://reviews.llvm.org/D116964
2022-01-11 19:23:47 -08:00
Mircea Trofin
db5aceb979 [NFC] Expose the ReleaseModeModelRunner
The type was pretty much generic, just needed a bit of parameterization.

Differential Revision: https://reviews.llvm.org/D115764
2021-12-15 23:21:58 -08:00
Mircea Trofin
059e03476c [NFC][mlgo] Generalize model runner interface
This prepares it for the regalloc work. Part of it is making model
evaluation across 'development' and 'release' scenarios more reusable.
This patch:
- extends support to tensors of any shape (not just scalars, like we had
in the inliner -Oz case). While the tensor shape can be anything, we
assume row-major layout and expose the tensor as a buffer.
- exposes the NoInferenceModelRunner, which we use in the 'development'
mode to keep the evaluation code path consistent and simplify logging,
as we'll want to reuse it in the regalloc case.
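
To illustrate the row-major assumption, here is a hedged sketch of how a multi-dimensional index maps to an offset in a flat buffer. The helper names `elementCount` and `rowMajorOffset` are hypothetical and not part of the model runner interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of elements in a tensor of the given shape.
std::size_t elementCount(const std::vector<std::size_t> &Shape) {
  std::size_t Count = 1;
  for (std::size_t Dim : Shape)
    Count *= Dim;
  return Count;
}

// Offset of Index in a flat row-major buffer: the last dimension varies
// fastest, so we fold the dimensions from first to last.
std::size_t rowMajorOffset(const std::vector<std::size_t> &Shape,
                           const std::vector<std::size_t> &Index) {
  assert(Shape.size() == Index.size() && "rank mismatch");
  std::size_t Offset = 0;
  for (std::size_t I = 0; I < Shape.size(); ++I) {
    assert(Index[I] < Shape[I] && "index out of bounds");
    Offset = Offset * Shape[I] + Index[I];
  }
  return Offset;
}
```

For a shape of (2, 3, 4), the element at index (1, 2, 3) lands at offset 1*12 + 2*4 + 3 = 23, the last slot of a 24-element buffer.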

Differential Revision: https://reviews.llvm.org/D115306
2021-12-08 20:10:58 -08:00
Mircea Trofin
f64eee1625 [NFC][InlineAdvisor] Inform advisor when the module is invalidated
This avoids unnecessary re-calculation of module-wide features in the
MLInlineAdvisor. In cases where function passes don't invalidate
functions (and, thus, don't invalidate the module), but we re-process a
CGSCC, we currently refreshed module features unnecessarily. The
overhead of fetching cached results (albeit they weren't themselves
invalidated) was noticeable in certain modules' compilations.

We don't want to just invalidate the advisor object, though, via the
analysis manager, because we'd then need to re-create expensive state
(like the model evaluator in the ML 'development' mode).

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D113644
2021-11-11 10:23:49 -08:00
Jacob Hegna
99f00635d7 Unpack the CostEstimate feature in ML inlining models.
This change yields an additional 2% size reduction on an internal search
binary, and an additional 0.5% size reduction on fuchsia.

Differential Revision: https://reviews.llvm.org/D104751
2021-07-02 16:57:16 +00:00
Mircea Trofin
0d06b14f59 [MLGO] Fix use of AM.invalidate post D100519
The ML inline advisors more aggressively invalidate certain analyses
after each call site inlining, to more accurately capture the problem
state.
2021-04-15 18:45:39 -07:00
Mircea Trofin
ccec2cf1d9 Reland "[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor"
This reverts commit d97f776be5f8cd3cd446fe73827cd355f6bab4e1.

The original problem was due to build failures in shared lib builds. D95079
moved ImportedFunctionsInliningStatistics under Analysis, unblocking
this.
2021-01-20 13:33:43 -08:00
Mircea Trofin
d97f776be5 Revert "[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor"
This reverts commit e8aec763a57e211420dfceb2a8dc6b88574924f3.
2021-01-20 11:19:34 -08:00
Mircea Trofin
e8aec763a5 [NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor
When using 2 InlinePass instances in the same CGSCC - one for other
mandatory inlinings, the other for the heuristic-driven ones - the order
in which the ImportedFunctionStats would be output-ed would depend on
the destruction order of the inline passes, which is not deterministic.

This patch moves the ImportedFunctionStats responsibility to the
InlineAdvisor to address this problem.

Differential Revision: https://reviews.llvm.org/D94982
2021-01-20 11:07:36 -08:00
Mircea Trofin
e8049dc3c8 [NewPM][Inliner] Move the 'always inliner' case in the same CGSCC pass as 'regular' inliner
Expanding from D94808 - we ensure the same InlineAdvisor is used by both
InlinerPass instances. The notion of mandatory inlining is moved into
the core InlineAdvisor: advisors anyway have to handle that case, so
this change also factors out that a bit better.

Differential Revision: https://reviews.llvm.org/D94825
2021-01-15 17:59:38 -08:00
Mircea Trofin
5fe10263ab [llvm][inliner] Reuse the inliner pass to implement 'always inliner'
Enable performing mandatory inlinings upfront, by reusing the same logic
as the full inliner, instead of the AlwaysInliner. This has the
following benefits:
- reduce code duplication - one inliner codebase
- open the opportunity to help the full inliner by performing additional
function passes after the mandatory inlinings, but before the full
inliner. Performing the mandatory inlinings first simplifies the problem
the full inliner needs to solve: fewer call sites, more contextualization, and,
depending on the additional function optimization passes run between the
2 inliners, higher accuracy of cost models / decision policies.

Note that this patch does not yet enable much in terms of post-always
inline function optimization.

Differential Revision: https://reviews.llvm.org/D91567
2020-11-30 12:03:39 -08:00
Tarindu Jayatilaka
418121c30a Reapply "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis"
(This reverts commit a5e0194709c40212694370e0ea789a1ca14548b5, and
corrects author).

Rename the pass to be able to extend it to function properties other than inliner features.

    Reviewed By: mtrofin

    Differential Revision: https://reviews.llvm.org/D82044
2020-07-22 10:07:35 -07:00
Mircea Trofin
a5e0194709 Revert "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis"
This reverts commit 44a6bda19b40f2dfcbe92fc3d58bb6276c71ef78. I forgot
to correctly attribute it to tarinduj. Fixing and resubmitting.
2020-07-22 09:42:17 -07:00
Mircea Trofin
44a6bda19b Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis
Rename the pass to be able to extend it to function properties other than inliner features.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D82044
2020-07-22 09:24:15 -07:00
Nico Weber
4fe912f186 Build: Move TF source file inclusion from build system to source files
Outside of compiler-rt (where it's arguably an anti-pattern too),
LLVM tries to keep its build files as simple as possible. See e.g.
llvm/docs/SupportLibrary.rst, "Code Organization".

Differential Revision: https://reviews.llvm.org/D84243
2020-07-21 13:02:34 -04:00