Handles clang::DiagnosticsEngine and clang::DiagnosticIDs.
For DiagnosticIDs, this mostly migrates from `new DiagnosticIDs` to
convenience method `DiagnosticIDs::create()`.
Part of cleanup https://github.com/llvm/llvm-project/issues/151026
This reverts commit e2a885537f11f8d9ced1c80c2c90069ab5adeb1d. Build failures were fixed right away and reverting the original commit without the fixes breaks the build again.
The `DiagnosticOptions` class is currently intrusively
reference-counted, which makes reasoning about its lifetime very
difficult in some cases. For example, `CompilerInvocation` owns the
`DiagnosticOptions` instance (wrapped in `llvm::IntrusiveRefCntPtr`) and
only exposes an accessor returning `DiagnosticOptions &`. One would
think this gives `CompilerInvocation` exclusive ownership of the object,
but that's not the case:
```c++
void shareOwnership(CompilerInvocation &CI) {
llvm::IntrusiveRefCntPtr<DiagnosticOptions> CoOwner = &CI.getDiagnosticOptions();
// ...
}
```
This is a perfectly valid pattern that is being actually used in the
codebase.
I would like to ensure the ownership of `DiagnosticOptions` by
`CompilerInvocation` is guaranteed to be exclusive. This can be
leveraged for a copy-on-write optimization later on. This PR changes
usages of `DiagnosticOptions` across `clang`, `clang-tools-extra` and
`lldb` to not be intrusively reference-counted.
This PR hides the reference-counted pointer that holds `TargetOptions`
from the public API of `CompilerInvocation`. This gives
`CompilerInvocation` an exclusive control over the lifetime of this
member, which will eventually be leveraged to implement a copy-on-write
behavior.
There are two clients that currently share ownership of that pointer:
* `TargetInfo` - This was refactored to hold a non-owning reference to
`TargetOptions`. The options object is typically owned by the
`CompilerInvocation` or by the new `CompilerInstance::AuxTargetOpts` for
the auxiliary target. This needed a bit of care in `ASTUnit::Parse()` to
keep the `CompilerInvocation` alive.
* `clangd::PreambleData` - This was refactored to exclusively own the
`TargetOptions` that get moved out of the `CompilerInvocation`.
WG14 added N3411 to the list of papers which apply to older versions of
C in C2y, and WG21 adopted CWG787 as a Defect Report in C++11. So we no
longer should be issuing a pedantic diagnostic about a file which does
not end with a newline character.
We do, however, continue to support -Wnewline-eof as an opt-in
diagnostic.
This PR hides the reference-counter pointer that holds
`DiagnosticOptions` from the public API of `CompilerInvocation`. This
gives `CompilerInvocation` an exclusive control over the lifetime of
this member, which will eventually be leveraged to implement a
copy-on-write behavior.
The only client that currently accesses that pointer is
`clangd::buildPreamble()` which takes care to reset it so that it's not
reset concurrently. This code is made redundant by making the reference
count of `DiagnosticOptions` atomic.
This is take two of #70976. This iteration of the patch makes sure that
custom
diagnostics without any warning group don't get promoted by `-Werror` or
`-Wfatal-errors`.
This implements parts of the extension proposed in
https://discourse.llvm.org/t/exposing-the-diagnostic-engine-to-c/73092/7.
Specifically, this makes it possible to specify a diagnostic group in an
optional third argument.
Starting with 41e3919ded78d8870f7c95e9181c7f7e29aa3cc4 DiagnosticsEngine
creation might perform IO. It was implicitly defaulting to
getRealFileSystem. This patch makes it explicit by pushing the decision
making to callers.
It uses ambient VFS if one is available, and keeps using
`getRealFileSystem` if there aren't any VFS.
This reverts commit e39205654dc11c50bd117e8ccac243a641ebd71f.
There are further discussions in
https://github.com/llvm/llvm-project/pull/70976, happening for past two
weeks. Since there were no responses for couple weeks now, reverting
until author is back.
Don't call raw_string_ostream::flush(), which is essentially a no-op.
As specified in the docs, raw_string_ostream is always unbuffered.
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
This patch adds an IsText parameter to the following functions
openFileForRead, getBufferForFile, getBufferForFileImpl and determines
whether a file is text by querying the file tag on z/OS. The default is
set to OF_Text instead of OF_None, this change in value does not affect
any other platforms other than z/OS.
This reverts commit e7f782e7481cea23ef452a75607d3d61f5bd0d22.
This had UBSan failures:
[----------] 1 test from ConfigCompileTests
[ RUN ] ConfigCompileTests.DiagnosticSuppression
Config fragment: compiling <unknown>:0 -> 0x00007B8366E2F7D8 (trusted=false)
/usr/local/google/home/fmayer/large/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:203:33: runtime error: reference binding to null pointer of type 'clang::DiagnosticIDs'
UndefinedBehaviorSanitizer: undefined-behavior /usr/local/google/home/fmayer/large/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:203:33
Pull Request: https://github.com/llvm/llvm-project/pull/108645
Alternatives to https://reviews.llvm.org/D153114.
Try to address https://github.com/clangd/clangd/issues/1293.
See the links for design ideas and the consensus so far. We want to have
some initial support in clang18.
This is the initial support for C++20 Modules in clangd.
As suggested by sammccall in https://reviews.llvm.org/D153114,
we should minimize the scope of the initial patch to make it easier
to review and understand so that every one are in the same page:
> Don't attempt any cross-file or cross-version coordination: i.e. don't
> try to reuse BMIs between different files, don't try to reuse BMIs
> between (preamble) reparses of the same file, don't try to persist the
> module graph. Instead, when building a preamble, synchronously scan
> for the module graph, build the required PCMs on the single preamble
> thread with filenames private to that preamble, and then proceed to
> build the preamble.
This patch reflects the above opinions.
# Testing in real-world project
I tested this with a modularized library:
https://github.com/alibaba/async_simple/tree/CXX20Modules. This library
has 3 modules (async_simple, std and asio) and 65 module units. (Note
that a module consists of multiple module units). Both `std` module and
`asio` module have 100k+ lines of code (maybe more, I didn't count). And
async_simple itself has 8k lines of code. This is the scale of the
project.
The result shows that it works pretty well, ..., well, except I need to
wait roughly 10s after opening/editing any file. And this falls in our
expectations. We know it is hard to make it perfect in the first move.
# What this patch does in detail
- Introduced an option `--experimental-modules-support` for the support
for C++20 Modules. So that no matter how bad this is, it wouldn't affect
current users. Following off the page, we'll assume the option is
enabled.
- Introduced two classes `ModuleFilesInfo` and
`ModuleDependencyScanner`. Now `ModuleDependencyScanner` is only used by
`ModuleFilesInfo`.
- The class `ModuleFilesInfo` records the built module files for
specific single source file. The module files can only be built by the
static member function `ModuleFilesInfo::buildModuleFilesInfoFor(PathRef
File, ...)`.
- The class `PreambleData` adds a new member variable with type
`ModuleFilesInfo`. This refers to the needed module files for the
current file. It means the module files info is part of the preamble,
which is suggested in the first patch too.
- In `isPreambleCompatible()`, we add a call to
`ModuleFilesInfo::CanReuse()` to check if the built module files are
still up to date.
- When we build the AST for a source file, we will load the built module
files from ModuleFilesInfo.
# What we need to do next
Let's split the TODOs into clang part and clangd part to make things
more clear.
The TODOs in the clangd part include:
1. Enable reusing module files across source files. The may require us
to bring a ModulesManager like thing which need to handle `scheduling`,
`the possibility of BMI version conflicts` and `various events that can
invalidate the module graph`.
2. Get a more efficient method to get the `<module-name> ->
<module-unit-source>` map. Currently we always scan the whole project
during `ModuleFilesInfo::buildModuleFilesInfoFor(PathRef File, ...)`.
This is clearly inefficient even if the scanning process is pretty fast.
I think the potential solutions include:
- Make a global scanner to monitor the state of every source file like I
did in the first patch. The pain point is that we need to take care of
the data races.
- Ask the build systems to provide the map just like we ask them to
provide the compilation database.
3. Persist the module files. So that we can reuse module files across
clangd invocations or even across clangd instances.
TODOs in the clang part include:
1. Clang should offer an option/mode to skip writing/reading the bodies
of the functions. Or even if we can requrie the parser to skip parsing
the function bodies.
And it looks like we can say the support for C++20 Modules is initially
workable after we made (1) and (2) (or even without (2)).
Building ASTs with compile flags that are incompatible to the ones used
for the Preamble are not really supported by clang and can trigger
crashes.
In an ideal world, we should be re-using not only TargetOpts, but the
full ParseInputs from the Preamble to prevent such failures.
Unfortunately current contracts of ThreadSafeFS makes this a non-safe
change for certain implementations. As there are no guarantees that the
same ThreadSafeFS is going to be valid in the Context::current() we're
building the AST in.
This is a follow-up patch for D148088. The dynamic symbol index (`FileIndex::updatePreamble`) may run in a separate thread, and the `DiagnosticConsumer` that is set up in `buildPreamble` might go out of scope before it is used. This could result in a SIGSEGV when attempting to call any method of the `DiagnosticConsumer` class.
The function `buildPreamble` sets up the `DiagnosticConsumer` as follows:
```
... buildPreamble(...) {
...
StoreDiags PreambleDiagnostics;
...
llvm::IntrusiveRefCntPtr<DiagnosticsEngine> PreambleDiagsEngine =
CompilerInstance::createDiagnostics(&CI.getDiagnosticOpts(),
&PreambleDiagnostics,
/*ShouldOwnClient=*/false);
...
// The call might use the diagnostic consumer in a separate thread
PreambleCallback(...)
...
}
```
`PreambleDiagnostics` might be out of scope for `buildPreamble` function when we call it inside `PreambleCallback` in a separate thread.
The Fix
The fix involves replacing the client (DiagnosticConsumer) with an `IgnoringDiagConsumer` instance, which will print messages to the clangd log.
Alternatively, we can replace `PreambleDiagnostics` with an object that is owned by `DiagnosticsEngine`.
Note
There is no corresponding LIT/GTest for this issue, since there is a specific race condition that is difficult to reproduce within a test framework.
Test Plan:
```
ninja check-clangd
```
Reviewed By: kadircet, sammccall
Differential Revision: https://reviews.llvm.org/D159363
This reverts commit 92c0546941190973e9201a08fa10edf27cb6992d.
Prevents tsan issues by resetting ref-counted-pointers eagerly, before
passing the copies into a new thread.
We've been running this internally for months now, without any
stability or correctness concerns. It has ~40% speed up on incremental
diagnostics latencies (as preamble can get invalidated through code completion
etc.).
Differential Revision: https://reviews.llvm.org/D153882
We would like to move the preamble index out of the critical path.
This patch is an RFC to get feedback on the correct implementation and potential pitfalls to keep into consideration.
I am not entirely sure if the lazy AST initialisation would create using Preamble AST in parallel. I tried with tsan enabled clangd but it seems to work OK (at least for the cases I tried)
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D148088
Since we redefine all macros in preamble-patch, and it's parsed after
consuming the preamble macros, we can get false missing-include diagnostics
while a fresh preamble is being rebuilt.
This patch makes sure preamble-patch is treated same as main file for
include-cleaner purposes.
Differential Revision: https://reviews.llvm.org/D148143
Extend the existing MainFileMacros structure:
- record more information (InConditionalDirective) in MacroOccurrence
- collect macro references inside macro body (fix a long-time FIXME)
So that the MainFileMacros preseve enough information, which allows a
just-in-time convertion to interop with include-cleaner::Macro for
include-cleaer features.
See the context in https://reviews.llvm.org/D146017.
Differential Revision: https://reviews.llvm.org/D146279
TempPCHFile::create() calls llvm::sys::fs::createTemporaryFile() to
create a file named preamble-*.pch in a system temporary directory. This
commit allows overriding the directory where these often many and large
preamble-*.pch files are stored.
The referenced bug report requests the ability to override the temporary
directory path used by libclang. However, overriding the return value of
llvm::sys::path::system_temp_directory() was rejected during code review
as improper and because it would negatively affect multithreading
performance. Finding all places where libclang uses the temporary
directory is very difficult. Therefore this commit is limited to
override libclang's single known use of the temporary directory.
This commit allows to override the preamble storage path only during
CXIndex construction to avoid multithreading issues and ensure that all
preambles are stored in the same directory. For the same multithreading
and consistency reasons, this commit deprecates
clang_CXIndex_setGlobalOptions() and
clang_CXIndex_setInvocationEmissionPathOption() in favor of specifying
these options during CXIndex construction.
Adding a new CXIndex constructor function each time a new initialization
argument is needed leads to either a large number of function parameters
unneeded by most libclang users or to an exponential number of overloads
that support different usage requirements. Therefore this commit
introduces a new extensible struct CXIndexOptions and a general function
clang_createIndexWithOptions().
A libclang user passes a desired preamble storage path to
clang_createIndexWithOptions(), which stores it in
CIndexer::PreambleStoragePath. Whenever
clang_parseTranslationUnit_Impl() is called, it passes
CIndexer::PreambleStoragePath to ASTUnit::LoadFromCommandLine(), which
stores this argument in ASTUnit::PreambleStoragePath. Whenever
ASTUnit::getMainBufferWithPrecompiledPreamble() is called, it passes
ASTUnit::PreambleStoragePath to PrecompiledPreamble::Build().
PrecompiledPreamble::Build() forwards the corresponding StoragePath
argument to TempPCHFile::create(). If StoragePath is not empty,
TempPCHFile::create() stores the preamble-*.pch file in the directory at
the specified path rather than in the system temporary directory.
The analysis below proves that this passing around of the
PreambleStoragePath string is sufficient to guarantee that the libclang
user override is used in TempPCHFile::create(). The analysis ignores API
uses in test code.
TempPCHFile::create() is called only in PrecompiledPreamble::Build().
PrecompiledPreamble::Build() is called only in two places: one in
clangd, which is not used by libclang, and one in
ASTUnit::getMainBufferWithPrecompiledPreamble().
ASTUnit::getMainBufferWithPrecompiledPreamble() is called in 3 places:
ASTUnit::LoadFromCompilerInvocation() [analyzed below].
ASTUnit::Reparse(), which in turn is called only from
clang_reparseTranslationUnit_Impl(), which in turn is called only from
clang_reparseTranslationUnit(). clang_reparseTranslationUnit() is never
called in LLVM code, but is part of public libclang API. This function's
documentation requires its translation unit argument to have been built
with clang_createTranslationUnitFromSourceFile().
clang_createTranslationUnitFromSourceFile() delegates its work to
clang_parseTranslationUnit(), which delegates to
clang_parseTranslationUnit2(), which delegates to
clang_parseTranslationUnit2FullArgv(), which delegates to
clang_parseTranslationUnit_Impl(), which passes
CIndexer::PreambleStoragePath to the ASTUnit it creates.
ASTUnit::CodeComplete() passes AllowRebuild = false to
ASTUnit::getMainBufferWithPrecompiledPreamble(), which makes it return
nullptr before calling PrecompiledPreamble::Build().
Both ASTUnit::LoadFromCompilerInvocation() overloads (one of which
delegates its work to another) call
ASTUnit::getMainBufferWithPrecompiledPreamble() only if their argument
PrecompilePreambleAfterNParses > 0. LoadFromCompilerInvocation() is
called in:
ASTBuilderAction::runInvocation() keeps the default parameter value
of PrecompilePreambleAfterNParses = 0, meaning that the preamble file is
never created from here.
ASTUnit::LoadFromCommandLine().
ASTUnit::LoadFromCommandLine() is called in two places:
CrossTranslationUnitContext::ASTLoader::loadFromSource() keeps the
default parameter value of PrecompilePreambleAfterNParses = 0, meaning
that the preamble file is never created from here.
clang_parseTranslationUnit_Impl(), which passes
CIndexer::PreambleStoragePath to the ASTUnit it creates.
Therefore, the overridden preamble storage path is always used in
TempPCHFile::create().
TempPCHFile::create() uses PreambleStoragePath in the same way as
LibclangInvocationReporter() uses InvocationEmissionPath. The existing
documentation for clang_CXIndex_setInvocationEmissionPathOption() does
not specify ownership, encoding, separator or relative vs absolute path
requirements. So the documentation for
CXIndexOptions::PreambleStoragePath doesn't either. The assumptions are:
no ownership transfer;
UTF-8 encoding;
native separators.
Both relative and absolute paths are supported.
The added API works as expected in KDevelop:
https://invent.kde.org/kdevelop/kdevelop/-/merge_requests/283
Fixes: https://github.com/llvm/llvm-project/issues/51847
Differential Revision: https://reviews.llvm.org/D143418
getMemBufferCopy triggers an UB when it receives a default constructed
StringRef. Make sure that we're always passing the null-terminated string
created in ParseInputs throughout the scanPreamble.
Differential Revision: https://reviews.llvm.org/D144708
Translates diagnostics from baseline preamble to relevant modified
contents.
Translation is done by looking for a set of lines that have the same
contents in diagnostic/note/fix ranges inside baseline and modified
contents.
A diagnostic is preserved if its main range is outside of main file or
there's a translation from baseline to modified contents. Later on fixes
and notes attached to that diagnostic with relevant ranges are also
translated and preserved.
Depends on D143095
Differential Revision: https://reviews.llvm.org/D143096