This introduces new `ModuleCache` interface for writing PCM files.
Together with #188876, this will enable adding a caching layer into the
`InProcessModuleCache` implementation, hopefully reducing IO cost.
Moreover, this makes it super explicit that the PCM is written before
its timestamp, which is an important invariant that we've broken before.
This PR introduces new `ModuleCache` API for reading PCM files. This
makes it so that we don't go through the `FileManager` and VFS, which is
problematic downstream. We interpose a VFS that unintentionally shuffles
implicitly-built modules in and out of the CAS database, leading to some
unnecessary storage and runtime overhead. Moreover, this (together with
a reading API) will enable adding a caching layer into the
`InProcessModuleCache` implementation, hopefully reducing IO cost.
The `ModuleCache` class is currently reference-counted intrusively. As
explained in https://github.com/llvm/llvm-project/pull/139584, this is
problematic. This PR uses `std::shared_ptr` to reference-count
`ModuleCache` instead, which clarifies what happens to its lifetime when
constructing `CompilerInstance`, for example. This also makes the
reference in `ModuleManager` non-owning, simplifying the ownership
relationship further. The
`ASTUnit::transferASTDataFromCompilerInstance()` function now accounts
for that by taking care to keep it alive.
This PR introduces a new mechanism for enforcing a sandbox around
filesystem reads coming from the compiler. A fatal error is raised
whenever the `llvm::sys::fs`, `llvm::MemoryBuffer::getFile*()` APIs get
used directly instead of going through the "blessed" virtual interface
of `llvm::vfs::FileSystem`.
#137363 was supposed to be NFC for the `CrossProcessModuleCache` (a.k.a
normal implicit module builds), but accidentally passed the wrong path
to `sys::fs::status`. Then, #141358 removed the correct path that
should've been passed instead. (The variable was flagged as unused.)
None of our existing tests caught this regression, we only found out due
to a SourceKit-LSP benchmark getting slower.
This PR re-implements the original behavior, adds new remark to Clang
for PCM input file validation, and uses it to create more reliable tests
of the `-fmodules-validate-once-per-build-session` flag.
This PR virtualizes module cache pruning via the new `ModuleCache`
interface. Currently this is an NFC, but I left a FIXME in
`InProcessModuleCache` to make this more efficient for the dependency
scanner.
This reverts commit 7a242387c950c7060143da6da0e6fb91f36bb458. Even after 175f8a44, the Modules/fmodules-validate-once-per-build-session.c test is not fixed on the clang-armv8-quick build bot. (Failure occurs on line 114.)
This reverts commit 18b885f66babff3a10451bc811ffc077d61ed8ee, effectively reapplying #139987. This commit fixes unit tests (for example ASTUnitTest.SaveLoadPreservesLangOptionsInPrintingPolicy) where the `ASTUnit::ModCache` pointer dereferenced within `ASTUnit::serialize()` was null. This commit makes sure each factory function does initialize `ASTUnit::ModCache`.
Timestamps are an implementation detail of the cross-process module
cache implementation. This PR hides it from the `ModuleCache` API, which
simplifies the in-process implementation.
In the past, timestamps used for
`-fmodules-validate-once-per-build-session` were found to be a source of
contention in the dependency scanner
([D149802](https://reviews.llvm.org/D149802),
https://github.com/llvm/llvm-project/pull/112452). This PR is yet
another attempt to optimize these. We now make use of the new
`ModuleCache` interface to implement the in-process version in terms of
atomic `std::time_t` variables rather the mtime attribute on
`.timestamp` files.
This PR adds new `ModuleCache` interface to Clang's implicitly-built
modules machinery. The main motivation for this change is to create a
second implementation that uses a more efficient kind of
`llvm::AdvisoryLock` during dependency scanning.
In addition to the lock abstraction, the `ModuleCache` interface also
manages the existing `InMemoryModuleCache` instance. I found that
compared to keeping these separate/independent, the code is a bit
simpler now, since these are two tightly coupled concepts. I can
envision a more efficient implementation of the `InMemoryModuleCache`
for the single-process case too, which will be much easier to implement
with the current setup.
This is not intended to be a functional change.