7 Commits

Author SHA1 Message Date
yronglin
1da403937e
Reapply "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" (#173130)" (#173789)
This patch reapplies https://github.com/llvm/llvm-project/pull/173130.

This patch implements the following papers:
[P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3).
[P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1).
[CWG2947](https://cplusplus.github.io/CWG/issues/2947.html).

At the start of phase 4, an import or module token is treated as starting
a directive and is converted to its respective keyword iff:

- After skipping horizontal whitespace, it is
    - at the start of a logical line, or
    - preceded by an export at the start of the logical line.
- It is followed by an identifier pp token (before macro expansion), or
    - <, ", or : (but not ::) pp tokens for import, or
    - ; for module

Otherwise the token is treated as an identifier.
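
The conditions above can be sketched as a small standalone classifier. This is only an illustration of the rules as listed, not the code in the patch: `Introducer`, `classifyLogicalLine` and the helper functions are hypothetical names, the input is assumed to be a single logical line, and only ASCII identifiers are recognized.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical result type: which directive, if any, the line introduces.
enum class Introducer { None, Import, Module };

// Skips spaces and tabs (horizontal whitespace) starting at Pos.
static std::size_t skipHorizontalWhitespace(std::string_view Line,
                                            std::size_t Pos) {
  while (Pos < Line.size() && (Line[Pos] == ' ' || Line[Pos] == '\t'))
    ++Pos;
  return Pos;
}

static bool isIdentStart(char C) {
  return std::isalpha(static_cast<unsigned char>(C)) || C == '_';
}

static bool isIdentChar(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) || C == '_';
}

// Reads an identifier at Pos and advances Pos past it; empty if none.
static std::string_view readIdentifier(std::string_view Line,
                                       std::size_t &Pos) {
  if (Pos >= Line.size() || !isIdentStart(Line[Pos]))
    return {};
  std::size_t Begin = Pos;
  while (Pos < Line.size() && isIdentChar(Line[Pos]))
    ++Pos;
  return Line.substr(Begin, Pos - Begin);
}

// Applies the listed conditions to one logical line, before macro expansion.
Introducer classifyLogicalLine(std::string_view Line) {
  assert(Line.find('\n') == std::string_view::npos &&
         "expects a single logical line");
  std::size_t Pos = skipHorizontalWhitespace(Line, 0);
  std::string_view Word = readIdentifier(Line, Pos);
  if (Word == "export") { // an optional leading export is allowed
    Pos = skipHorizontalWhitespace(Line, Pos);
    Word = readIdentifier(Line, Pos);
  }
  if (Word != "import" && Word != "module")
    return Introducer::None;
  const bool IsImport = Word == "import";
  Pos = skipHorizontalWhitespace(Line, Pos);
  if (Pos >= Line.size())
    return Introducer::None;
  const char Next = Line[Pos];
  if (isIdentStart(Next)) // followed by an identifier pp token
    return IsImport ? Introducer::Import : Introducer::Module;
  if (!IsImport) // module may otherwise only be followed by ;
    return Next == ';' ? Introducer::Module : Introducer::None;
  if (Next == '<' || Next == '"')
    return Introducer::Import;
  if (Next == ':') { // a single : is accepted, :: is not
    const bool IsScope = Pos + 1 < Line.size() && Line[Pos + 1] == ':';
    return IsScope ? Introducer::None : Introducer::Import;
  }
  return Introducer::None;
}
```

With this sketch, `import foo;` and `export module m;` are classified as directives, while `struct import {};` and `EMPTY import foo;` are not, because `import` is not at the start of the logical line.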

Additionally:

- The entire import or module directive (including the closing ;) must
be on a single logical line and, for module, must not come from an
#include.
- The expansion of macros must not result in an import or module
directive introducer that was not there prior to macro expansion.
- A module directive may only appear as the first preprocessing tokens
in a file (excluding the global module fragment).
- Preprocessor conditionals shall not span a module declaration.

After this patch, we handle C++ module-import and module-declaration as
real pp-directives in the preprocessor. Additionally, we refactor module
name lexing, remove the complex state machine, and read the full module
name during module/import directive handling. We could possibly introduce
a tok::annot_module_name token in the future to avoid parsing the module
name twice, once in the preprocessor and once in the parser, but that
makes error recovery much more difficult (e.g. import a; import b; on the
same line).

This patch also introduces 2 new keywords, `__preprocessed_module` and
`__preprocessed_import`. These 2 keywords are generated in `-E` mode.
This is useful to avoid confusion with the `module` and `import` keywords
in preprocessed output:
```cpp
export module m;
struct import {};
#define EMPTY
EMPTY import foo;
```

Fixes https://github.com/llvm/llvm-project/issues/54047

The previous patch has a use-after-free issue in the
Lexer::LexTokenInternal function. Since C++20, the `export`, `import`
and `module` identifiers may introduce a C++ module declaration or
import directive, and the directive is handled in
`LexIdentifierContinue`. Unfortunately, EOF may be encountered in
`LexIdentifierContinue`, and `CurLexer` might be destroyed in
`HandleEndOfFile`. If the code after `LexIdentifierContinue` tries to
access `LangOpts` or other class members of this Lexer, it hits
undefined behavior.

This patch also fixes the use-after-free issue in the Lexer by
introducing a mechanism to delay the destruction of `CurLexer` in the
`Preprocessor` class.
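
As a rough illustration of the general idea of delaying destruction (not the mechanism this patch actually implements), the toy sketch below parks the current lexer in a pending list at end of file and destroys it only after the lexer's own member function has returned. `ToyLexer`, `ToyPreprocessor` and all member names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

class ToyPreprocessor;

// Hypothetical stand-in for a lexer whose members are read after EOF handling.
struct ToyLexer {
  explicit ToyLexer(bool &DestroyedFlag) : Destroyed(DestroyedFlag) {}
  ~ToyLexer() { Destroyed = true; }

  int lex(ToyPreprocessor &PP);

  int LangOpts = 20; // placeholder for per-lexer state
  bool &Destroyed;   // lets the caller observe when destruction happens
};

// Hypothetical stand-in for the owner of the current lexer.
class ToyPreprocessor {
public:
  void enterFile(bool &DestroyedFlag) {
    CurLexer = std::make_unique<ToyLexer>(DestroyedFlag);
  }

  // At end of file the lexer is parked instead of being destroyed at once.
  void handleEndOfFile() {
    if (CurLexer)
      PendingDestroy.push_back(std::move(CurLexer));
  }

  int lex() {
    assert(CurLexer && "no file entered");
    int Result = CurLexer->lex(*this);
    // No ToyLexer member function is on the call stack here, so the
    // parked lexers can be released safely.
    PendingDestroy.clear();
    return Result;
  }

private:
  std::unique_ptr<ToyLexer> CurLexer;
  std::vector<std::unique_ptr<ToyLexer>> PendingDestroy;
};

int ToyLexer::lex(ToyPreprocessor &PP) {
  PP.handleEndOfFile(); // simulates hitting EOF while lexing an identifier
  // Immediate destruction in handleEndOfFile would make the accesses below
  // a use-after-free; with the pending list the object is still alive.
  assert(!Destroyed && "lexer must outlive its own member call");
  return LangOpts;
}
```

In this sketch, a call to `ToyPreprocessor::lex()` returns the lexer's state safely, and the lexer is destroyed only once that call has finished with it.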

---------

Signed-off-by: yronglin <yronglin777@gmail.com>
2026-01-20 17:42:46 +08:00
Jan Svoboda
f94afdd0b7
[clang][modules] Unify "context hash" and "specific module cache path" (#176215)
This PR unifies the terminology for:
* "context hash" - previously ambiguously referred to as "module hash"
or as overly specific "module context hash"
* "specific module cache path" - previously referred to as just "module
cache path" - hard to distinguish from the command-line-provided module
cache path without the context hash
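
The relationship between the two terms can be sketched as follows, assuming the specific module cache path is the command-line-provided module cache path with the context hash appended as a subdirectory. The helper name `specificModuleCachePath` and the example values are hypothetical and not taken from this PR.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper: combines the user-provided module cache path with
// the context hash to form the "specific module cache path".
std::filesystem::path
specificModuleCachePath(const std::filesystem::path &ModuleCachePath,
                        const std::string &ContextHash) {
  assert(!ContextHash.empty() && "context hash must be known");
  return ModuleCachePath / ContextHash;
}
```

For example, under this assumption a module cache path of `/tmp/cache` and a context hash of `ABC123` give the specific module cache path `/tmp/cache/ABC123`.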

NFCI
2026-01-15 12:02:31 -08:00
yronglin
71bba12587
Revert "Reapply "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" (#173130)" (#173549)
This reverts commit 0d1c396ce8178baf05f277b16bf41b8a6b847d6d.

Co-authored-by: Yihan Wang <yihwang@nvidia.com>
2025-12-25 19:55:40 +08:00
yronglin
0d1c396ce8
Reapply "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" (#173130)
This PR reapplies https://github.com/llvm/llvm-project/pull/107168.

---------

Signed-off-by: Wang, Yihan <yronglin777@gmail.com>
Signed-off-by: yronglin <yronglin777@gmail.com>
2025-12-25 18:55:44 +08:00
Paul Kirth
2b8b305d46
Revert "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" (#173118)
Reverts llvm/llvm-project#107168

This patch broke the following bots:
- https://lab.llvm.org/buildbot/#/builders/190/builds/33105
- https://lab.llvm.org/buildbot/#/builders/94/builds/13727
- https://lab.llvm.org/buildbot/#/builders/169/builds/18192

and mac-aarch64 builds. See
https://github.com/llvm/llvm-project/pull/107168#issuecomment-3675990781
2025-12-19 15:17:31 -08:00
yronglin
d2e62d9024
[C++20][Modules] Implement P1857R3 Modules Dependency Discovery (#107168)
This PR implements the following papers:
[P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3).
[P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1).
[CWG2947](https://cplusplus.github.io/CWG/issues/2947.html).

At the start of phase 4, an import or module token is treated as starting
a directive and is converted to its respective keyword iff:

- After skipping horizontal whitespace, it is
    - at the start of a logical line, or
    - preceded by an export at the start of the logical line.
- It is followed by an identifier pp token (before macro expansion), or
    - <, ", or : (but not ::) pp tokens for import, or
    - ; for module

Otherwise the token is treated as an identifier.

Additionally:

- The entire import or module directive (including the closing ;) must
be on a single logical line and, for module, must not come from an
#include.
- The expansion of macros must not result in an import or module
directive introducer that was not there prior to macro expansion.
- A module directive may only appear as the first preprocessing tokens
in a file (excluding the global module fragment).
- Preprocessor conditionals shall not span a module declaration.

After this patch, we handle C++ module-import and module-declaration as
real pp-directives in the preprocessor. Additionally, we refactor module
name lexing, remove the complex state machine, and read the full module
name during module/import directive handling. We could possibly introduce
a tok::annot_module_name token in the future to avoid parsing the module
name twice, once in the preprocessor and once in the parser, but that
makes error recovery much more difficult (e.g. import a; import b; on the
same line).

This patch also introduces 2 new keywords, `__preprocessed_module` and
`__preprocessed_import`. These 2 keywords are generated in `-E` mode.
This is useful to avoid confusion with the `module` and `import` keywords
in preprocessed output:
```cpp
export module m;
struct import {};
#define EMPTY
EMPTY import foo;
```

Fixes https://github.com/llvm/llvm-project/issues/54047

---------

Signed-off-by: yronglin <yronglin777@gmail.com>
Signed-off-by: Wang, Yihan <yronglin777@gmail.com>
2025-12-19 23:29:17 +08:00
Naveen Seth Hanig
b70be3dc14
[clang][DependencyScanning] Separate clangDependencyScanning and DependencyScanningTool (NFC) (#169962)
This patch is the first of two in refactoring Clang's dependency
scanning tooling to remove its dependency on clangDriver.

It separates Tooling/DependencyScanningTool.cpp from the rest of
clangDependencyScanning and moves clangDependencyScanning out of
clangTooling into its own library. No functional changes are
introduced.

The follow-up patch (#169964) will restrict clangDependencyScanning to
handling only -cc1 command line inputs and will move all functionality
related to handling driver commands into clangTooling
(Tooling/DependencyScanningTool.cpp).

This is part of a broader effort to support driver-managed builds for
compilations using C++ named modules and/or Clang modules. It is
required for linking the dependency scanning tooling against the driver
without introducing cyclic dependencies, which would otherwise cause
build failures when dynamic linking is enabled.

The RFC for this change can be found here:

https://discourse.llvm.org/t/rfc-new-clangoptions-library-remove-dependency-on-clangdriver-from-clangfrontend-and-flangfrontend/88773?u=naveen-seth
2025-12-04 00:38:21 +01:00