36 Commits

Author SHA1 Message Date
Naveen Seth Hanig
9403c2d64d
Reland [clang][modules-driver] Add scanner to detect C++20 module presence (#153497)
This patch is part of a series to support driver managed module builds
for C++ named modules and Clang modules.
This introduces a scanner that detects C++ named module usage early in
the driver with only negligible overhead.

For now, it is enabled only with the `-fmodules-driver` flag and serves
solely diagnostic purposes. In the future, the scanner will be enabled
for any (modules-driver compatible) compilation with two or more inputs,
and will help the driver determine whether to implicitly enable the
modules driver.

Since the scanner adds very little overhead, we are also exploring
enabling it for compilations with only a single input. This approach
could allow us to detect `import std` usage in a single-file
compilation, which would then activate the modules driver. For
performance measurements on this, see
https://github.com/naveen-seth/llvm-dev-cxx-modules-check-benchmark.

RFC for driver managed module builds:

https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system

This patch relands the reland (2d31fc8) for commit ded1426. The earlier
reland failed due to a missing link dependency on `clangLex`. This
reland fixes the issue by adding the link dependency after discussing it
in the following RFC:

https://discourse.llvm.org/t/rfc-driver-link-the-driver-against-clangdependencyscanning-clangast-clangfrontend-clangserialization-and-clanglex
2025-08-18 21:21:08 +02:00
Naveen Seth Hanig
cb6f132b1c
Revert "Reland [clang][modules-driver] Add scanner to detect C++20 module presence" (#149900)
Reverts llvm/llvm-project#147630.

This causes a linker error caused by linking the driver against the
lexer.
2025-07-21 23:09:49 +02:00
Naveen Seth Hanig
2d31fc85a8
Reland [clang][modules-driver] Add scanner to detect C++20 module presence (#147630)
This patch is part of a series to natively support C++20 module usage
from the Clang driver (without requiring an external build system). This
introduces a new scanner that detects C++20 module usage in source files
without using the preprocessor or lexer.

For now, it is enabled only with the `-fmodules-driver` flag and serves
solely diagnostic purposes. In the future, the scanner will be enabled
for any (modules-driver compatible) compilation with two or more inputs,
and will help the driver determine whether to implicitly enable the
modules driver.

Since the scanner adds very little overhead, we are also exploring
enabling it for compilations with only a single input. This approach
could allow us to detect `import std` usage in a single-file
compilation, which would then activate the modules driver. For
performance measurements on this, see
https://github.com/naveen-seth/llvm-dev-cxx-modules-check-benchmark.

RFC:

https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system

This patch relands commit ded1426. The CI failure is resolved by
removing the compatibility warning for using the `-fmodules-driver` flag
with pre-C++20 standards, which also better aligns its behavior with
other features/flags supported only in newer standards.
2025-07-21 22:34:14 +02:00
Naveen Seth Hanig
6855b9c598
[clang][deps] Properly capture the global module and '\n' for all module directives (#148685)
Previously, the newline after a module directive was not properly
captured and printed by `clang::printDependencyDirectivesAsSource`.

According to P1857R3, each directive must, after skipping horizontal
whitespace, appear at the start of a logical line. Because the newline
after module directives was missing, this invalidated the following
line.
This fixes tests that were previously in violation of P1857R3,
including for Objective-C directives, which should also comply with
P1857R3.

This also ensures that the global module fragment `module;` is captured
by the dependency directives scanner.
2025-07-19 09:47:37 +02:00
Naveen Seth Hanig
ce8c19ffc5
[clang][deps] Fix dependency scanner misidentifying 'import::' as module partition (#148674)
The dependency directive scanner was incorrectly classifying namespaces
such as `import::inner xi` as directives. According to P1857R3, `import` should
not be treated as a directive when followed by `::`.
This change fixes that behavior.
2025-07-15 00:07:30 +05:30
Ziqing Luo
afd20aaca5
[clang-scan-deps] Fix "unterminated conditional directive" bug (#146645)
`clang-scan-deps` threw "unterminated conditional directive" error
falsely on the following example:

```
#ifndef __TEST
#define __TEST

#if defined(__TEST_DUMMY)
 #if defined(__TEST_DUMMY2)
  #pragma GCC warning                                                          \
      "Hello!"
 #else
  #pragma GCC error                                                            \
      "World!"
 #endif // defined(__TEST_DUMMY2)
#endif  // defined(__TEST_DUMMY)

#endif // #ifndef __TEST
```

The issue comes from PR #143950, where the flag `LastNonWhitespace` does
not correctly represent the state for the example above. The PR aimed to
support that a line-continuation can be followed by whitespaces.  
This commit fixes the issue by moving the `LastNonWhitespace` variable
to the inner loop so that it will be correctly reset.

rdar://153742186
2025-07-04 12:53:23 +08:00
Tobias Hieta
1fb786ea93
[clang][scandeps] Improve handling of rawstrings. (#139504) 2025-06-27 12:21:21 +02:00
Naveen Seth Hanig
9d49b82de0
[clang-scan-deps] Implement P2223R2 for DependencyDirectiveScanner.cpp (#143950)
P2223R2 allows the line-continuation slash `\` to be followed by
additional whitespace. The Clang lexer already follows this behavior,
also for versions prior to C++23. The dependency directive scanner
however only implements it for `#define` directives (15d5f5d).

This fully implements P2223R2 for the dependency directive scanner (for
any C++ standard) and aligns the dependency directive scanner's splicing
behavior with that of the Clang lexer.

For example, the following code was previously not scanned correctly by
`clang-scan-deps` but now works as expected:

```cpp
import \<whitespace here>
A;
```
2025-06-13 10:48:05 -07:00
Cyndy Ishida
897b0301d2
[clang][dep-scan] Resolve lexer crash from a permutation of invalid tokens (#142452)
Sometimes, when a user writes invalid code, the minimization used for
scanning can create a stream of tokens that is invalid at lex time. This
patch protects against the case where there are valid (non-c++20) import
directives discovered in the middle of an invalid `import` declaration.

Mostly authored by: @akyrtzi 
resolves: rdar://152335844
2025-06-06 13:45:16 -07:00
Argyrios Kyrtzidis
d19e71db8a
[clang/Lex/DependencyDirectivesScanner] Ignore import/include directives with missing filenames without failing the scan (#100126)
Follow-up to `34ab855826b8cb0c3b46c770b83390bd1fe95c64`:

* Don't fail the scan with an include with missing filename, it may be
inside a skipped preprocessor block. Let the compilation provide any
related error.
* Fix an issue where the lexer was skipping through the next directive,
after ignoring the include with missing filename.
2024-07-23 12:59:59 -07:00
Cyndy Ishida
34ab855826
[clang][deps] Ignore import/include directives with missing filenames (#99520)
Previously source input like `#import ` resulted in infinite calls
append the same token into `CurDirTokens`. This patch now ignores those
directive lines if they won't actually end up being compiled. (e.g.
macro guarded)

resolves: rdar://121247565
2024-07-22 21:10:05 -07:00
Michael Spencer
4272847546
[clang][deps] Don't treat ObjC method args as module directives (#97654)
`import:(type)name` is a method argument decl in ObjC, but the C++20
preprocessing rules say this is a preprocessing line.

Because the dependency directive scanner is not language dependent, this
patch extends the C++20 rule to exclude `module :` (which is never a
valid module decl anyway), and `import :` that is not followed by an
identifier.

This is ok to do because in C++20 mode the compiler will later error on
lines like this anyway, and the dependencies the scanner returns are
still correct.
2024-07-18 13:37:01 -07:00
Sirraide
e46468407a
[Clang] Allow raw string literals in C as an extension (#88265)
This enables raw R"" string literals in C in some language modes
and adds an option to disable or enable them explicitly as an
extension.

Background: GCC supports raw string literals in C in `-gnuXY` modes
starting with gnu99. This pr both enables raw string literals in gnu99 
mode and later in C and adds an `-f[no-]raw-string-literals` flag to override 
this behaviour. The decision not to enable raw string literals in gnu89
mode, according to the GCC devs, is intentional as that mode is supposed
to be used for ‘old code’ that they don’t want to break; we’ve decided to
match GCC’s behaviour here as well.

The `-fraw-string-literals`  flag can additionally be used to enable raw string 
literals in modes where they aren’t enabled by default (such as c99—as 
opposed to gnu99—or even e.g. C++03); conversely, the negated flag can 
be used to disable them in any gnuXY modes that *do* provide them by 
default, or to override a previous flag. However, we do *not*  support 
disabling raw string literals (or indeed either of these two options) in 
C++11 mode and later, because we don’t want to just start supporting 
disabling features that are actually part of the language in the general case.

This fixes #85703.
2024-07-10 12:10:44 +02:00
Nishith Kumar M Shah
0559eaff5a
Revert "Pass LangOpts from CompilerInstance to DependencyScanningWorker (#93753)" (#94488)
This reverts commit 9862080b1cbf685c0d462b29596e3f7206d24aa2.
2024-06-05 11:42:13 -07:00
Nishith Kumar M Shah
9862080b1c
Pass LangOpts from CompilerInstance to DependencyScanningWorker (#93753)
This commit fixes https://github.com/llvm/llvm-project/issues/88896 by
passing LangOpts from the CompilerInstance to
DependencyScanningWorker so that the original LangOpts are
preserved/respected.
This makes for more accurate parsing/lexing when certain language
versions or features specific to versions are to be used.
2024-06-03 17:20:43 +02:00
Danny Mösch
00e80fbfb9
[NFC] Correct C++ standard names (#81421) 2024-02-11 19:43:34 +01:00
serge-sans-paille
8116b6dce7
[clang] Change GetCharAndSizeSlow interface to by-value style
Instead of passing the Size by reference, assuming it is initialized,
return it alongside the expected char result as a POD.

This makes the interface less error prone: previous interface expected
the Size reference to be initialized, and it was often forgotten,
leading to uninitialized variable usage. This patch fixes the issue.

This also generates faster code, as the returned POD (a char and an
unsigned) fits in 64 bits. The speedup according to compile time tracker
reach -O.7%, with a good number of -0.4%. Details are available on

        https://llvm-compile-time-tracker.com/compare.php?from=3fe63f81fcb999681daa11b2890c82fda3aaeef5&to=fc76a9202f737472ecad4d6e0b0bf87a013866f3&stat=instructions:u

And icing on the cake, on my setup it also shaves 2kB out of
libclang-cpp :-)

This is a recommit of d8f5a18b6e587aeaa8b99707e87b652f49b160cd for
2023-10-31 00:08:01 +01:00
Nico Weber
1c876ff515 Revert "Perf/lexer faster slow get char and size (#70543)"
This reverts commit d8f5a18b6e587aeaa8b99707e87b652f49b160cd.
Breaks build, see:
https://github.com/llvm/llvm-project/pull/70543#issuecomment-1784227421
2023-10-29 21:11:39 -04:00
serge-sans-paille
d8f5a18b6e
Perf/lexer faster slow get char and size (#70543)
Co-authored-by: serge-sans-paille <sguelton@mozilla.com>
2023-10-29 18:17:02 +00:00
Ben Langmuir
ee8ed0b309 [clang][deps] Teach dep directive scanner about _Pragma
While we cannot handle `_Pragma` used inside macros, we can handle
this at the top level, and it some projects use the `_Pragma("once")`
spelling like that, which was causing spurious failures in the scanner.

Limitations
* Cannot handle #define ONCE _Pragma("once"), same issue as using
  @import in a macro -- ideally we should diagnose this in obvious cases
* Our LangOpts are currently fixed, so we are not handling u"" strings
  or R"()" strings that require C11/C++11.

rdar://108629982

Differential Revision: https://reviews.llvm.org/D149884
2023-05-09 10:05:12 -07:00
Ben Langmuir
7b492d1be0 [clang][deps] Teach dep directive scanner about #pragma clang system_header
This ensures we get the correct FileCharacteristic during scanning. In a
yet-to-be-upstreamed branch this fixes observable failures, but it's
also good to handle this on principle: the FileCharacteristic is a
property of the file that is observable in the scanner, so there is
nothing preventing us from depending on it.

rdar://108627403

Differential Revision: https://reviews.llvm.org/D149777
2023-05-03 13:53:21 -07:00
Kazu Hirata
6ad0788c33 [clang] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<.  I'll post
a separate patch to remove #include "llvm/ADT/Optional.h".

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14 12:31:01 -08:00
Kazu Hirata
a1580d7b59 [clang] Add #include <optional> (NFC)
This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.

I'll post a separate patch to actually replace llvm::Optional with
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14 11:07:21 -08:00
Fangrui Song
53e5cd4d3e llvm::Optional::value => operator*/operator->
std::optional::value() has undesired exception checking semantics and is
unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The
call sites block std::optional migration.

This makes `ninja clang` work in the absence of llvm::Optional::value.
2022-12-17 06:37:59 +00:00
Kazu Hirata
37a3e98c84 [clang] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-09 18:39:01 -08:00
Kazu Hirata
5891420e68 [clang] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 11:54:46 -08:00
Argyrios Kyrtzidis
b340c5ae42 [Lex/DependencyDirectivesScanner] Handle the case where the source line starts with a tok::hashhash
Differential Revision: https://reviews.llvm.org/D133674
2022-09-13 15:48:50 -07:00
Argyrios Kyrtzidis
aa484c90cf [Lex/DependencyDirectivesScanner] Keep track of the presence of tokens between the last scanned directive and EOF
Directive `dependency_directives_scan::tokens_present_before_eof` is introduced to indicate there were tokens present before
the last scanned dependency directive and EOF.
This is useful to ensure we correctly identify the macro guards when lexing using the dependency directives.

Differential Revision: https://reviews.llvm.org/D133357
2022-09-07 10:31:29 -07:00
Fangrui Song
32197830ef [clang][clang-tools-extra] LLVM_NODISCARD => [[nodiscard]]. NFC 2022-08-09 07:11:18 +00:00
Kazu Hirata
cb2c8f694d [clang] Use value instead of getValue (NFC) 2022-07-13 23:39:33 -07:00
Kazu Hirata
97afce08cb [clang] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 22:26:24 -07:00
Kazu Hirata
3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25 11:56:50 -07:00
Kazu Hirata
aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Kazu Hirata
ca4af13e48 [clang] Don't use Optional::getValue (NFC) 2022-06-20 22:59:26 -07:00
Argyrios Kyrtzidis
b4c83a13f6 [Tooling/DependencyScanning & Preprocessor] Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
This is a commit with the following changes:

* Remove `ExcludedPreprocessorDirectiveSkipMapping` and related functionality

Removes `ExcludedPreprocessorDirectiveSkipMapping`; its intended benefit for fast skipping of excluded directived blocks
will be superseded by a follow-up patch in the series that will use dependency scanning lexing for the same purpose.

* Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources

Replaces the "source minimization" mechanism with a mechanism that produces lexed dependency directives tokens.

* Make the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`

This is bringing the following benefits:

    * Full access to the preprocessor state during dependency scanning. E.g. a component can see what includes were taken and where they were located in the actual sources.
    * Improved performance for dependency scanning. Measurements with a release+thin-LTO build shows ~ -11% reduction in wall time.
    * Opportunity to use dependency scanning lexing to speed-up skipping of excluded conditional blocks during normal preprocessing (as follow-up, not part of this patch).

For normal preprocessing measurements show differences are below the noise level.

Since, after this change, we don't minimize sources and pass them in place of the real sources, `DependencyScanningFilesystem` is not technically necessary, but it has valuable performance benefits for caching file `stat`s along with the results of scanning the sources. So the setup of using the `DependencyScanningFilesystem` during a dependency scan remains.

Differential Revision: https://reviews.llvm.org/D125486
Differential Revision: https://reviews.llvm.org/D125487
Differential Revision: https://reviews.llvm.org/D125488
2022-05-26 12:50:06 -07:00
Argyrios Kyrtzidis
b58a420ff4 [Tooling/DependencyScanning] Rename refactorings towards transitioning dependency scanning to use pre-lexed preprocessor directive tokens
This is first of a series of patches for making the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`.
This patch only includes NFC renaming changes to make reviewing of the functionality changing parts easier.

Differential Revision: https://reviews.llvm.org/D125484
2022-05-26 12:49:51 -07:00