There's an early-exit case for regalloc when we don't even get a chance
to ask for an advisor (priority or eviction), and switch the context.
Then, when we want to log the reward for that function (==the one with
the early exit case), we hit the error case where the function's name
doesn't match the last-seen context.
There are a few possible fixes, one would be to just switch context when
output-ing the reward, which would be correct. This patch opts for the
alternative where we check any loging happened in the first place - just
to re-validate that no function would have been regaloc-ed without first
log-ing its reward.
Differential Revision: https://reviews.llvm.org/D143359
This reverts commit a772f0bb920a4957fb94dd8dbe45943809fd0ec3.
The main problem was related to how we handled `dbgs()` from the hosted
compiler. Using explicit `subprocess.communicate`, and not relying on
dbgs() being flushed until the end appears to address the problem.
Also some fixes due to some bots running older pythons, so we can't have
nice things like `int | float` and such.
This reverts commit a7354899d1a235a796b3a2ccb45f6596983c8672.
The way stdout/stderr get routed seems to work differently locally and
on the bots. Investigating.
This hooks up the interactive model runner to the passes that support
ml-based decisions. Because the interface to this runner is the exact
same as the one used during inference, we just reuse the exact same
setup we have for "release mode". This makes "release mode" a misnomer -
and that's something we needed to resolve sooner or later (e.g.
supporting more than one embedded model for the same problem was another
reason to drop that nomenclature). That will happen in a subsequent
change.
To use this evaluator, just enable the pass in (currently) "release"
mode, but also pass the base name for the 2 channel files via the
pass-specific flag.
The 2 files are the responsibilty of the hosting process. The added
tests use a minimal, toy such host, illustrating setup and
communication.
Differential Revision: https://reviews.llvm.org/D143218
This leverages the new logging format in that we don't need to buffer
the training data, we can just write it out.
Differential Revision: https://reviews.llvm.org/D142168
We use LLVM_HAVE_TFLITE as the key to enable the mlgo work these days,
and LLVM_HAVE_TF_API is defined whenever LLVM_HAVE_TF_API is defined.
I'm posting this patch because it's purely mechanical.
I'll post a follow-up patch to remove LLVM_HAVE_TF_API in non-C++
files, and that will not be as mechanical as this one.
Differential Revision: https://reviews.llvm.org/D139863
It's an artifact very specific to using TFAgents during training, so it
belongs with ModelUnderTrainingRunner.
Differential Revision: https://reviews.llvm.org/D139031
This commit adds in two new features to the ML regalloc eviction
analysis that can be used in ML models, a vector of MBB frequencies and
a vector of indicies mapping instructions to their corresponding basic
blocks. This will allow for further experimentation with per-instruction
features and give a lot more flexibility for future experimentation over
how we're extracting MBB frequency data currently.
Reviewed By: mtrofin, jacobhegna
Differential Revision: https://reviews.llvm.org/D134166
This patch adds in instruction based features to the regalloc advisor
gated behind a flag so a user can decide at runtime whether or not they
want to enable the feature. The features are only enabled when LLVM is
compiled in MLGO develpment mode (LLVM_HAVE_TF_API) is set to true.
To extract the instruction features, I'm taking a list of segments from
each LiveInterval and noting the start and end SlotIndices. This list is then
sorted based on the start SlotIndex and I iterate through each SlotIndex
to grab instructions, making sure to check for overlaps. This results in
a vector of opcodes and binary mapping matrix that maps live ranges to the
opcodes of the instructions within that LR.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D131930
Currently there is no way to add in development features to the ML
regalloc evict advisor which is useful to have when working on feature
engineering/improving the current model. This patch adds in the ability
to add in development features to the ML regalloc evict advisor which
are gated by a runtime flag and not added in at all if not compiled in
LLVM development mode. This sets the stage for future work where we are
planning on upstreaming some of the newer features that we are currently
experimenting with.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D131209
This just shuffles implementations and declarations around. Now the
logger and the TF C API-based model evaluator are separate.
Differential Revision: https://reviews.llvm.org/D131116
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.
Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.
Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
This allows the compiler to support more features than those supported by a
model. The only requirement (development mode only) is that the new
features must be appended at the end of the list of features requested
from the model. The support is transparent to compiler code: for
unsupported features, we provide a valid buffer to copy their values;
it's just that this buffer is disconnected from the model, so insofar
as the model is concerned (AOT or development mode), these features don't
exist. The buffers are allocated at setup - meaning, at steady state,
there is no extra allocation (maintaining the current invariant). These
buffers has 2 roles: one, keep the compiler code simple. Second, allow
logging their values in development mode. The latter allows retraining
a model supporting the larger feature set starting from traces produced
with the old model.
For release mode (AOT-ed models), this decouples compiler evolution from
model evolution, which we want in scenarios where the toolchain is
frequently rebuilt and redeployed: we can first deploy the new features,
and continue working with the older model, until a new model is made
available, which can then be picked up the next time the compiler is built.
Differential Revision: https://reviews.llvm.org/D124565
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
Added a `NoopSavedModelImpl` type which can be used as a mock AOT-ed
saved model, and further minimize conditional compilation cases. This
also removes unused function warnings on gcc.
This reverts commit bc3b372161716a4c4845d47a877e4892df0d08da.
The planned change that would have needed non-const MachineFunction refs
isn't needed after all.
Added a flag to make configurable the number of interferences after
which we 'bail out' and treat a set of intervals as un-evictable. Also
using it on the ML side, as it turns out to be a good control for
compile-time.
With this configurable, we can do a bit of trial and error and see if
bumping it has any effect on heuristic/policy quality.
Differential Revision: https://reviews.llvm.org/D118707
Factoring it out so we can subsequently cache it. This should be a NFC,
however, for the float quantities, we see small errors in the least
significant digits. This is because, before, we were summing up one by
one. Now, we sum up results of sums.
This shouldn't matter for ML, and will require rework when we do
quantization (avoiding floats altogether), but meanwhile, it did require
an update to the reference file used for testing.
The patch also bumps the precision of the variables involved in this, to
reduce the error (note they are casted back to float at the end by the
SET macro, since we only work with float and not double in TF)
Differential Revision: https://reviews.llvm.org/D118659
We plan to pass the MachineFunction& to APIs that expect it non-const
(for legitimate reasons). The advisor still holds the ref as a const
ref, though, so we keep most of the maintainability value of that.
If AllocationOrder has less than 32 elements, we were treating the extra
positions as if they were valid. This was detected by a subsequent
assert. The fix also tightens the asserts.
The tensorflow AOT compiler can cross-target, but it can't run on (for
example) arm64. We added earlier support where the AOT-ed header and object
would be built on a separate builder and then passed at build time to
a build host where the AOT compiler can't run, but clang can be otherwise
built.
To simplify such scenarios given we now support more than one AOT-able
case (regalloc and inliner), we make the AOT scenario centered on whether
files are generated, case by case (this includes the "passed from a
different builder" scenario).
This means we shouldn't need an 'umbrella' LLVM_HAVE_TF_AOT, in favor of
case by case control. A builder can opt out of an AOT case by passing that case's
model path as `none`. Note that the overrides still take precedence.
This patch controls conditional compilation with case-specific flags,
which can be enabled locally, for the component where those are
available. We still keep an overall flag for some tests.
The 'development/training' mode is unchanged, because there the model is
passed from the command line and interpreted.
Differential Revision: https://reviews.llvm.org/D117752
The bulk of the implementation is common between 'release' mode (==AOT-ed
model) and 'development' mode (for training), the main difference is
that in development mode, we may also log features (for training logs),
inject scoring information (currently after the Virtual Register
Rewriter) and then produce the log file.
This patch also introduces the score injection pass, 'Register
Allocation Pass Scoring', which is trivially just logging the score in
development mode.
Differential Revision: https://reviews.llvm.org/D117147