yronglin 1da403937e
Reapply "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery (#173130)" (#173789)
This patch reapplies https://github.com/llvm/llvm-project/pull/173130.

This patch implements the following papers and core issue:

- [P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3).
- [P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1).
- [CWG2947](https://cplusplus.github.io/CWG/issues/2947.html).

At the start of phase 4, an `import` or `module` token is treated as starting
a directive and is converted to its respective keyword iff:

- After skipping horizontal whitespace, it is
    - at the start of a logical line, or
    - preceded by an `export` at the start of the logical line.
- It is followed by an identifier pp-token (before macro expansion), or
    - `<`, `"`, or `:` (but not `::`) pp-tokens for `import`, or
    - `;` for `module`.

Otherwise the token is treated as an identifier.
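
The conditions above can be sketched as a small standalone classifier over a single logical line. This is only an illustration under simplifying assumptions: it works on raw characters rather than pp-tokens, it handles only the cases listed above (other forms such as `module :private;` are out of scope), and every name in it (`sketch::classifyLine`, `LineKind`, ...) is hypothetical and unrelated to Clang's lexer.

```cpp
#include <cassert>
#include <cctype>
#include <string_view>

namespace sketch {

// Hypothetical result type; not part of Clang.
enum class LineKind { NotDirective, ModuleDirective, ImportDirective };

inline bool isIdentStart(char c) {
  return std::isalpha(static_cast<unsigned char>(c)) || c == '_';
}

inline bool isIdentChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Drop leading horizontal whitespace (space and tab only).
inline std::string_view skipHorizontalSpace(std::string_view s) {
  while (!s.empty() && (s.front() == ' ' || s.front() == '\t'))
    s.remove_prefix(1);
  return s;
}

// If `s` begins with the whole identifier `word`, strip it and return true.
inline bool consumeWord(std::string_view &s, std::string_view word) {
  if (s.substr(0, word.size()) != word)
    return false;
  if (s.size() > word.size() && isIdentChar(s[word.size()]))
    return false; // e.g. "imported" is not "import"
  s.remove_prefix(word.size());
  return true;
}

// Classify one logical line as written, i.e. before any macro expansion.
inline LineKind classifyLine(std::string_view line) {
  std::string_view s = skipHorizontalSpace(line);

  // An optional `export` at the start of the line may precede the introducer.
  std::string_view afterExport = s;
  if (consumeWord(afterExport, "export"))
    s = skipHorizontalSpace(afterExport);

  bool isImport = false;
  if (consumeWord(s, "import"))
    isImport = true;
  else if (!consumeWord(s, "module"))
    return LineKind::NotDirective;

  s = skipHorizontalSpace(s);
  if (s.empty())
    return LineKind::NotDirective;

  const LineKind kind =
      isImport ? LineKind::ImportDirective : LineKind::ModuleDirective;
  const char c = s.front();
  if (isIdentStart(c))
    return kind; // followed by an identifier
  if (isImport) {
    if (c == '<' || c == '"')
      return kind; // header-name style import
    if (c == ':' && !(s.size() > 1 && s[1] == ':'))
      return kind; // `:` but not `::`
    return LineKind::NotDirective;
  }
  return c == ';' ? kind : LineKind::NotDirective; // `module;`
}

// A few lines and the classification this sketch gives them.
inline void checkExamples() {
  assert(classifyLine("export module m;") == LineKind::ModuleDirective);
  assert(classifyLine("  import <vector>;") == LineKind::ImportDirective);
  assert(classifyLine("struct import {};") == LineKind::NotDirective);
  assert(classifyLine("EMPTY import foo;") == LineKind::NotDirective);
  assert(classifyLine("import ::ns::f();") == LineKind::NotDirective);
}

} // namespace sketch
```

Because the check runs on the line as written, a macro such as `EMPTY` in front of `import` keeps the line from being a directive even though it expands to nothing.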

Additionally:

- The entire `import` or `module` directive (including the closing `;`) must
be on a single logical line and, for `module`, must not come from an
`#include`.
- The expansion of macros must not result in an `import` or `module`
directive introducer that was not there prior to macro expansion.
- A `module` directive may only appear as the first preprocessing tokens
in a file (excluding the global module fragment).
- Preprocessor conditionals shall not span a module declaration.

After this patch, C++ module-import and module-declaration are handled as
real pp-directives in the preprocessor. Additionally, we refactor module
name lexing: the complex state machine is removed, and the full module name
is read during module/import directive handling. In the future we could
introduce a `tok::annot_module_name` token to avoid parsing the module name
twice, in both the preprocessor and the parser, but that would make error
recovery much more difficult (e.g. `import a; import b;` on the same line).
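
As a rough illustration of what reading a full module name in one step involves, the sketch below parses the C++20 shape `identifier ('.' identifier)*` with an optional `:partition` suffix. It is not the code from this patch: it works on characters instead of pp-tokens, ignores whitespace between tokens, and the names `sketch::ModuleName` and `sketch::parseModuleName` are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

namespace sketch {

// Hypothetical result type; not part of Clang.
struct ModuleName {
  std::string name;      // e.g. "a.b"; empty for a bare partition
  std::string partition; // e.g. "part"; empty if absent
};

inline bool isIdentStart(char c) {
  return std::isalpha(static_cast<unsigned char>(c)) || c == '_';
}

inline bool isIdentChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Read `identifier ('.' identifier)*` from the front of `s`.
inline std::optional<std::string> readDottedName(std::string_view &s) {
  std::string out;
  for (;;) {
    if (s.empty() || !isIdentStart(s.front()))
      return std::nullopt; // an identifier is required here
    while (!s.empty() && isIdentChar(s.front())) {
      out += s.front();
      s.remove_prefix(1);
    }
    assert(!out.empty());
    if (s.empty() || s.front() != '.')
      return out; // no further component
    out += '.';
    s.remove_prefix(1);
  }
}

// Parse "a.b", "a.b:part" or ":part"; anything else is rejected.
inline std::optional<ModuleName> parseModuleName(std::string_view s) {
  ModuleName result;
  if (!s.empty() && s.front() != ':') {
    auto name = readDottedName(s);
    if (!name)
      return std::nullopt;
    result.name = *name;
  }
  if (!s.empty() && s.front() == ':') {
    s.remove_prefix(1);
    auto partition = readDottedName(s);
    if (!partition)
      return std::nullopt; // also rejects "a::b"
    result.partition = *partition;
  }
  if (!s.empty())
    return std::nullopt; // trailing characters
  if (result.name.empty() && result.partition.empty())
    return std::nullopt; // empty input
  return result;
}

} // namespace sketch
```

For instance, `sketch::parseModuleName("a.b:part")` yields the name `a.b` and the partition `part`, while `a..b` and `a::b` are rejected.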

This patch also introduces two new keywords, `__preprocessed_module` and
`__preprocessed_import`, which are generated in `-E` mode. They avoid
confusion with the `module` and `import` keywords in preprocessed output,
for example with a source like:
```cpp
export module m;
struct import {};
#define EMPTY
EMPTY import foo;
```

Fixes https://github.com/llvm/llvm-project/issues/54047

The previous patch had a use-after-free issue in the
`Lexer::LexTokenInternal` function. Since C++20, the `export`, `import`
and `module` identifiers may introduce a C++ module declaration or import
directive, and the directive is handled in `LexIdentifierContinue`.
Unfortunately, EOF may be encountered in `LexIdentifierContinue`, and
`CurLexer` might be destroyed in `HandleEndOfFile`. If the code after
`LexIdentifierContinue` then tries to access `LangOpts` or other members
of this `Lexer`, it hits undefined behavior.

This patch also fixes the use-after-free issue in `Lexer` by introducing a
mechanism to delay the destruction of `CurLexer` in the `Preprocessor`
class.
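
The general idea of delaying destruction can be sketched independently of Clang. In the minimal example below, an owner parks the current worker in a pending list instead of destroying it while one of the worker's member functions is still running, and releases it only after that call has returned. All names (`sketch::Owner`, `sketch::Worker`, `handleEndOfFile`, ...) are hypothetical stand-ins, and the actual mechanism in `Preprocessor` may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

namespace sketch {

class Owner;

// Stand-in for a lexer that may hit end-of-file in the middle of a call.
class Worker {
public:
  Worker(Owner &owner, int option) : owner_(owner), option_(option) {}
  int step(bool hitsEof);

private:
  Owner &owner_;
  int option_; // stand-in for state read after the re-entrant call
};

// Stand-in for the object that owns the current worker.
class Owner {
public:
  void start(int option) {
    current_ = std::make_unique<Worker>(*this, option);
  }

  // Drive one step; the worker stays alive until step() has returned.
  int run(bool hitsEof) {
    assert(current_ && "no active worker");
    int result = current_->step(hitsEof);
    pendingDestruction_.clear(); // safe point: no worker code is running
    return result;
  }

  // Called re-entrantly from Worker::step. Destroying `current_` here
  // would leave the caller running on a dead object, so park it instead.
  void handleEndOfFile() {
    pendingDestruction_.push_back(std::move(current_));
  }

  bool hasCurrent() const { return current_ != nullptr; }
  std::size_t pendingCount() const { return pendingDestruction_.size(); }

private:
  std::unique_ptr<Worker> current_;
  std::vector<std::unique_ptr<Worker>> pendingDestruction_;
};

inline int Worker::step(bool hitsEof) {
  if (hitsEof)
    owner_.handleEndOfFile(); // ownership moves to the pending list
  return option_;             // still valid: destruction was deferred
}

} // namespace sketch
```

With eager destruction in `handleEndOfFile`, the final `return option_;` would read a member of a destroyed object, which is the same shape of bug as the one described above.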

---------

Signed-off-by: yronglin <yronglin777@gmail.com>
2026-01-20 17:42:46 +08:00


// RUN: rm -rf %t
// RUN: split-file %s %t
// RUN: %clang_cc1 -std=c++2a -I%t -emit-module-interface %t/interface.cppm -o %t.pcm
// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm %t/implA.cppm -verify -fno-modules-error-recovery
// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm %t/implB.cppm -verify -fno-modules-error-recovery
//--- foo.h
#ifndef FOO_H
#define FOO_H
extern int in_header;
#endif
//--- interface.cppm
module;
#include "foo.h"
// FIXME: The following need to be moved to a header file. The global module
// fragment is only permitted to contain preprocessor directives.
int global_module_fragment;
export module A;
export int exported;
int not_exported;
static int internal;
module :private;
int not_exported_private;
static int internal_private;
//--- implA.cppm
module;
void test_early() {
  in_header = 1; // expected-error {{use of undeclared identifier 'in_header'}}
  // expected-note@* {{not visible}}
  global_module_fragment = 1; // expected-error {{use of undeclared identifier 'global_module_fragment'}}
  exported = 1; // expected-error {{use of undeclared identifier 'exported'}}
  not_exported = 1; // expected-error {{use of undeclared identifier 'not_exported'}}
  // FIXME: We need better diagnostic message for static variable.
  internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
  not_exported_private = 1; // expected-error {{undeclared identifier}}
  internal_private = 1; // expected-error {{undeclared identifier}}
}
module A;
void test_late() {
  in_header = 1; // expected-error {{missing '#include "foo.h"'; 'in_header' must be declared before it is used}}
  // expected-note@* {{not visible}}
  global_module_fragment = 1; // expected-error {{missing '#include'; 'global_module_fragment' must be declared before it is used}}
  exported = 1;
  not_exported = 1;
  internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
  not_exported_private = 1;
  internal_private = 1; // expected-error {{use of undeclared identifier 'internal_private'}}
}
//--- implB.cppm
module;
void test_early() {
  in_header = 1; // expected-error {{use of undeclared identifier 'in_header'}}
  // expected-note@* {{not visible}}
  global_module_fragment = 1; // expected-error {{use of undeclared identifier 'global_module_fragment'}}
  exported = 1; // expected-error {{use of undeclared identifier 'exported'}}
  not_exported = 1; // expected-error {{use of undeclared identifier 'not_exported'}}
  // FIXME: We need better diagnostic message for static variable.
  internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
  not_exported_private = 1; // expected-error {{undeclared identifier}}
  internal_private = 1; // expected-error {{undeclared identifier}}
}
export module B;
import A;
void test_late() {
  in_header = 1; // expected-error {{missing '#include "foo.h"'; 'in_header' must be declared before it is used}}
  // expected-note@* {{not visible}}
  global_module_fragment = 1; // expected-error {{missing '#include'; 'global_module_fragment' must be declared before it is used}}
  exported = 1;
  not_exported = 1; // expected-error {{use of undeclared identifier 'not_exported'; did you mean 'exported'?}}
  // expected-note@* {{'exported' declared here}}
  internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
  not_exported_private = 1;
  // FIXME: should not be visible here
  // expected-error@-2 {{undeclared identifier}}
  internal_private = 1; // expected-error {{use of undeclared identifier 'internal_private'}}
}