This patch is part of a series to support driver managed module builds
for C++ named modules and Clang modules.
This introduces a scanner that detects C++ named module usage early in
the driver with only negligible overhead.
For now, it is enabled only with the `-fmodules-driver` flag and serves
solely diagnostic purposes. In the future, the scanner will be enabled
for any (modules-driver compatible) compilation with two or more inputs,
and will help the driver determine whether to implicitly enable the
modules driver.
Since the scanner adds very little overhead, we are also exploring
enabling it for compilations with only a single input. This approach
could allow us to detect `import std` usage in a single-file
compilation, which would then activate the modules driver. For
performance measurements on this, see
https://github.com/naveen-seth/llvm-dev-cxx-modules-check-benchmark.
RFC for driver managed module builds:
https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system
This patch relands the reland (2d31fc8) for commit ded1426. The earlier
reland failed due to a missing link dependency on `clangLex`. This
reland fixes the issue by adding the link dependency after discussing it
in the following RFC:
https://discourse.llvm.org/t/rfc-driver-link-the-driver-against-clangdependencyscanning-clangast-clangfrontend-clangserialization-and-clanglex
This patch is part of a series to natively support C++20 module usage
from the Clang driver (without requiring an external build system). This
introduces a new scanner that detects C++20 module usage in source files
without using the preprocessor or lexer.
For now, it is enabled only with the `-fmodules-driver` flag and serves
solely diagnostic purposes. In the future, the scanner will be enabled
for any (modules-driver compatible) compilation with two or more inputs,
and will help the driver determine whether to implicitly enable the
modules driver.
Since the scanner adds very little overhead, we are also exploring
enabling it for compilations with only a single input. This approach
could allow us to detect `import std` usage in a single-file
compilation, which would then activate the modules driver. For
performance measurements on this, see
https://github.com/naveen-seth/llvm-dev-cxx-modules-check-benchmark.
RFC:
https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system
This patch relands commit ded1426. The CI failure is resolved by
removing the compatibility warning for using the `-fmodules-driver` flag
with pre-C++20 standards, which also better aligns its behavior with
other features/flags supported only in newer standards.
Previously, the newline after a module directive was not properly
captured and printed by `clang::printDependencyDirectivesAsSource`.
According to P1857R3, each directive must, after skipping horizontal
whitespace, appear at the start of a logical line. Because the newline
after module directives was missing, this invalidated the following
line.
This fixes tests that were previously in violation of P1857R3,
including for Objective-C directives, which should also comply with
P1857R3.
This also ensures that the global module fragment `module;` is captured
by the dependency directives scanner.
The dependency directive scanner was incorrectly classifying namespaces
such as `import::inner xi` as directives. According to P1857R3, `import` should
not be treated as a directive when followed by `::`.
This change fixes that behavior.
`clang-scan-deps` threw "unterminated conditional directive" error
falsely on the following example:
```
#ifndef __TEST
#define __TEST
#if defined(__TEST_DUMMY)
#if defined(__TEST_DUMMY2)
#pragma GCC warning \
"Hello!"
#else
#pragma GCC error \
"World!"
#endif // defined(__TEST_DUMMY2)
#endif // defined(__TEST_DUMMY)
#endif // #ifndef __TEST
```
The issue comes from PR #143950, where the flag `LastNonWhitespace` does
not correctly represent the state for the example above. The PR aimed to
support that a line-continuation can be followed by whitespaces.
This commit fixes the issue by moving the `LastNonWhitespace` variable
to the inner loop so that it will be correctly reset.
rdar://153742186
P2223R2 allows the line-continuation slash `\` to be followed by
additional whitespace. The Clang lexer already follows this behavior,
also for versions prior to C++23. The dependency directive scanner
however only implements it for `#define` directives (15d5f5d).
This fully implements P2223R2 for the dependency directive scanner (for
any C++ standard) and aligns the dependency directive scanner's splicing
behavior with that of the Clang lexer.
For example, the following code was previously not scanned correctly by
`clang-scan-deps` but now works as expected:
```cpp
import \<whitespace here>
A;
```
Sometimes, when a user writes invalid code, the minimization used for
scanning can create a stream of tokens that is invalid at lex time. This
patch protects against the case where there are valid (non-c++20) import
directives discovered in the middle of an invalid `import` declaration.
Mostly authored by: @akyrtzi
resolves: rdar://152335844
Follow-up to `34ab855826b8cb0c3b46c770b83390bd1fe95c64`:
* Don't fail the scan with an include with missing filename, it may be
inside a skipped preprocessor block. Let the compilation provide any
related error.
* Fix an issue where the lexer was skipping through the next directive,
after ignoring the include with missing filename.
Previously source input like `#import ` resulted in infinite calls
append the same token into `CurDirTokens`. This patch now ignores those
directive lines if they won't actually end up being compiled. (e.g.
macro guarded)
resolves: rdar://121247565
`import:(type)name` is a method argument decl in ObjC, but the C++20
preprocessing rules say this is a preprocessing line.
Because the dependency directive scanner is not language dependent, this
patch extends the C++20 rule to exclude `module :` (which is never a
valid module decl anyway), and `import :` that is not followed by an
identifier.
This is ok to do because in C++20 mode the compiler will later error on
lines like this anyway, and the dependencies the scanner returns are
still correct.
This enables raw R"" string literals in C in some language modes
and adds an option to disable or enable them explicitly as an
extension.
Background: GCC supports raw string literals in C in `-gnuXY` modes
starting with gnu99. This pr both enables raw string literals in gnu99
mode and later in C and adds an `-f[no-]raw-string-literals` flag to override
this behaviour. The decision not to enable raw string literals in gnu89
mode, according to the GCC devs, is intentional as that mode is supposed
to be used for ‘old code’ that they don’t want to break; we’ve decided to
match GCC’s behaviour here as well.
The `-fraw-string-literals` flag can additionally be used to enable raw string
literals in modes where they aren’t enabled by default (such as c99—as
opposed to gnu99—or even e.g. C++03); conversely, the negated flag can
be used to disable them in any gnuXY modes that *do* provide them by
default, or to override a previous flag. However, we do *not* support
disabling raw string literals (or indeed either of these two options) in
C++11 mode and later, because we don’t want to just start supporting
disabling features that are actually part of the language in the general case.
This fixes#85703.
This commit fixes https://github.com/llvm/llvm-project/issues/88896 by
passing LangOpts from the CompilerInstance to
DependencyScanningWorker so that the original LangOpts are
preserved/respected.
This makes for more accurate parsing/lexing when certain language
versions or features specific to versions are to be used.
Instead of passing the Size by reference, assuming it is initialized,
return it alongside the expected char result as a POD.
This makes the interface less error prone: previous interface expected
the Size reference to be initialized, and it was often forgotten,
leading to uninitialized variable usage. This patch fixes the issue.
This also generates faster code, as the returned POD (a char and an
unsigned) fits in 64 bits. The speedup according to compile time tracker
reach -O.7%, with a good number of -0.4%. Details are available on
https://llvm-compile-time-tracker.com/compare.php?from=3fe63f81fcb999681daa11b2890c82fda3aaeef5&to=fc76a9202f737472ecad4d6e0b0bf87a013866f3&stat=instructions:u
And icing on the cake, on my setup it also shaves 2kB out of
libclang-cpp :-)
This is a recommit of d8f5a18b6e587aeaa8b99707e87b652f49b160cd for
While we cannot handle `_Pragma` used inside macros, we can handle
this at the top level, and it some projects use the `_Pragma("once")`
spelling like that, which was causing spurious failures in the scanner.
Limitations
* Cannot handle #define ONCE _Pragma("once"), same issue as using
@import in a macro -- ideally we should diagnose this in obvious cases
* Our LangOpts are currently fixed, so we are not handling u"" strings
or R"()" strings that require C11/C++11.
rdar://108629982
Differential Revision: https://reviews.llvm.org/D149884
This ensures we get the correct FileCharacteristic during scanning. In a
yet-to-be-upstreamed branch this fixes observable failures, but it's
also good to handle this on principle: the FileCharacteristic is a
property of the file that is observable in the scanner, so there is
nothing preventing us from depending on it.
rdar://108627403
Differential Revision: https://reviews.llvm.org/D149777
std::optional::value() has undesired exception checking semantics and is
unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The
call sites block std::optional migration.
This makes `ninja clang` work in the absence of llvm::Optional::value.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Directive `dependency_directives_scan::tokens_present_before_eof` is introduced to indicate there were tokens present before
the last scanned dependency directive and EOF.
This is useful to ensure we correctly identify the macro guards when lexing using the dependency directives.
Differential Revision: https://reviews.llvm.org/D133357
This is a commit with the following changes:
* Remove `ExcludedPreprocessorDirectiveSkipMapping` and related functionality
Removes `ExcludedPreprocessorDirectiveSkipMapping`; its intended benefit for fast skipping of excluded directived blocks
will be superseded by a follow-up patch in the series that will use dependency scanning lexing for the same purpose.
* Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
Replaces the "source minimization" mechanism with a mechanism that produces lexed dependency directives tokens.
* Make the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`
This is bringing the following benefits:
* Full access to the preprocessor state during dependency scanning. E.g. a component can see what includes were taken and where they were located in the actual sources.
* Improved performance for dependency scanning. Measurements with a release+thin-LTO build shows ~ -11% reduction in wall time.
* Opportunity to use dependency scanning lexing to speed-up skipping of excluded conditional blocks during normal preprocessing (as follow-up, not part of this patch).
For normal preprocessing measurements show differences are below the noise level.
Since, after this change, we don't minimize sources and pass them in place of the real sources, `DependencyScanningFilesystem` is not technically necessary, but it has valuable performance benefits for caching file `stat`s along with the results of scanning the sources. So the setup of using the `DependencyScanningFilesystem` during a dependency scan remains.
Differential Revision: https://reviews.llvm.org/D125486
Differential Revision: https://reviews.llvm.org/D125487
Differential Revision: https://reviews.llvm.org/D125488
This is first of a series of patches for making the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`.
This patch only includes NFC renaming changes to make reviewing of the functionality changing parts easier.
Differential Revision: https://reviews.llvm.org/D125484