Reapply "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" (#173130)" (#173789)

This patch reapplies https://github.com/llvm/llvm-project/pull/173130.

This patch implements the following papers and core issue:
- [P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3).
- [P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1).
- [CWG2947](https://cplusplus.github.io/CWG/issues/2947.html).

At the start of phase 4, an `import` or `module` token is treated as starting
a directive and is converted to its respective keyword iff:

- After skipping horizontal whitespace, it is
    - at the start of a logical line, or
    - preceded by an `export` at the start of the logical line.
- It is followed by an identifier pp-token (before macro expansion), or
    - `<`, `"`, or `:` (but not `::`) pp-tokens for `import`, or
    - `;` for `module`.

Otherwise the token is treated as an identifier.
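
As a rough illustration of the recognition rule above, the following sketch
classifies one logical line from its unexpanded pp-token spellings. It is not
Clang's implementation: `Introducer`, `isIdentifier` and `classifyLine` are
hypothetical names, tokens are plain strings, and a token that starts with `"`
stands in for the `"` case.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not a Clang type.
enum class Introducer { None, Module, Import };

// True if the pp-token spelling looks like an identifier.
inline bool isIdentifier(const std::string &Tok) {
  if (Tok.empty())
    return false;
  unsigned char First = static_cast<unsigned char>(Tok[0]);
  if (!(std::isalpha(First) || First == '_'))
    return false;
  for (unsigned char C : Tok)
    if (!(std::isalnum(C) || C == '_'))
      return false;
  return true;
}

// Classify one logical line, given as unexpanded pp-token spellings with
// horizontal whitespace already skipped.
inline Introducer classifyLine(const std::vector<std::string> &Toks) {
  std::size_t I = 0;
  // An optional 'export' may precede the introducer at the start of the line.
  if (I < Toks.size() && Toks[I] == "export")
    ++I;
  // The introducer must be followed by at least one more pp-token.
  if (I + 1 >= Toks.size())
    return Introducer::None;
  const std::string &Kw = Toks[I];
  const std::string &Next = Toks[I + 1];
  if (Kw == "import" &&
      (isIdentifier(Next) || Next == "<" || Next == ":" || Next[0] == '"'))
    return Introducer::Import;
  if (Kw == "module" && (isIdentifier(Next) || Next == ";"))
    return Introducer::Module;
  // Anything else: the token stays an ordinary identifier.
  return Introducer::None;
}
```

Under these assumptions, `classifyLine({"export", "module", "m", ";"})` yields
`Introducer::Module`, while `{"struct", "import", "{", "}", ";"}` and
`{"import", "::", "foo", ";"}` both yield `Introducer::None`.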

Additionally:

- The entire `import` or `module` directive (including the closing `;`) must
be on a single logical line and, for `module`, must not come from an
`#include`.
- The expansion of macros must not result in an `import` or `module`
directive introducer that was not there prior to macro expansion.
- A `module` directive may only appear as the first preprocessing tokens
in a file (excluding the global module fragment).
- Preprocessor conditionals shall not span a module declaration.
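
The single-logical-line rule can be sketched as follows. This is only an
illustration under simplifying assumptions: `logicalLineAt` and
`directiveEndsOnLogicalLine` are hypothetical helpers, the input is raw source
text, and `;` characters inside string literals or comments are not treated
specially.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Build the logical line starting at Pos: drop backslash-newline pairs
// (line splicing) and stop at the first newline that is not spliced away.
inline std::string logicalLineAt(const std::string &Src, std::size_t Pos) {
  std::string Line;
  while (Pos < Src.size()) {
    if (Src[Pos] == '\\' && Pos + 1 < Src.size() && Src[Pos + 1] == '\n') {
      Pos += 2; // Spliced: the logical line continues on the next line.
      continue;
    }
    if (Src[Pos] == '\n')
      break;
    Line += Src[Pos++];
  }
  return Line;
}

// True if a directive starting at Pos has its terminating ';' on the same
// logical line.
inline bool directiveEndsOnLogicalLine(const std::string &Src,
                                       std::size_t Pos) {
  return logicalLineAt(Src, Pos).find(';') != std::string::npos;
}
```

With this sketch, `"import foo;\n"` and `"import \\\nfoo;\n"` (a spliced line)
are accepted, while `"import foo\n;\n"` is rejected because the `;` sits on the
next logical line.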

After this patch, we handle C++ module-import and module-declaration as
real pp-directives in the preprocessor. Additionally, we refactor module
name lexing: the complex state machine is removed and the full module name
is read while handling the module/import directive. In the future we could
introduce a tok::annot_module_name token to avoid parsing the module name
twice, in both the preprocessor and the parser, but that makes error
recovery much more difficult (e.g. `import a; import b;` on the same line).
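
Reading the full module name in one step amounts to collecting its identifier
components and joining them. A minimal sketch, assuming the flat name is the
components joined with `.`, as the removed `handlePeriod` hook suggests;
`flatNameFromPath` is a hypothetical stand-in that takes plain strings rather
than Clang's `ModuleIdPath`:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Join module name components with '.', e.g. {"a", "b"} -> "a.b".
// Assumption: this mirrors how a dotted module name is spelled in source.
inline std::string flatNameFromPath(const std::vector<std::string> &Path) {
  std::string Name;
  for (std::size_t I = 0; I < Path.size(); ++I) {
    if (I != 0)
      Name += '.';
    Name += Path[I];
  }
  return Name;
}
```

For example, the components `{"std", "compat"}` give `"std.compat"`, and an
empty path gives an empty string.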

This patch also introduces two new keywords, `__preprocessed_module` and
`__preprocessed_import`. These two keywords are generated in `-E` mode.
This is useful to avoid confusion with the `module` and `import` keywords in
preprocessed output:
```cpp
export module m;
struct import {};
#define EMPTY
EMPTY import foo;
```
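
In the example above, `EMPTY import foo;` is not a directive, because `import`
is not at the start of the logical line before macro expansion. Once `EMPTY`
has been expanded away, though, the printed line would begin with `import`,
which is why directives get a distinct spelling. The sketch below illustrates
the idea only: `printLine` is a hypothetical helper, the caller supplies the
`IsDirective` decision, and the spacing is not meant to match Clang's real
`-E` output.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Print one line of tokens separated by single spaces. If the line was
// recognized as a module/import directive, respell its introducer so that a
// later pass cannot confuse it with an ordinary identifier.
inline std::string printLine(const std::vector<std::string> &Toks,
                             bool IsDirective) {
  std::string Out;
  bool Replaced = false;
  for (const std::string &Tok : Toks) {
    if (!Out.empty())
      Out += ' ';
    if (IsDirective && !Replaced && (Tok == "module" || Tok == "import")) {
      Out += "__preprocessed_" + Tok;
      Replaced = true;
    } else {
      Out += Tok;
    }
  }
  return Out;
}
```

Here `printLine({"export", "module", "m", ";"}, true)` spells the introducer as
`__preprocessed_module`, while `printLine({"import", "foo", ";"}, false)`
leaves `import` as a plain identifier.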

Fixes https://github.com/llvm/llvm-project/issues/54047

The previous patch had a use-after-free issue in the
`Lexer::LexTokenInternal` function. Since C++20, the `export`, `import`
and `module` identifiers may introduce a C++ module declaration or import
directive, and the directive is handled in `LexIdentifierContinue`.
Unfortunately, EOF may be encountered in `LexIdentifierContinue`, and
`CurLexer` might be destroyed in `HandleEndOfFile`. If the code after
`LexIdentifierContinue` then tries to access `LangOpts` or other members
of this Lexer, it hits undefined behavior.

This patch also fixes the use-after-free issue in Lexer by introducing a
mechanism to delay the destruction of `CurLexer` in the `Preprocessor`
class.
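
The deferred-destruction idea can be sketched with toy types. This is not the
patch's code: `ToyLexer`, `ToyPreprocessor`, `destroyedCount`, `beginLex` and
`endLex` are hypothetical names, and the nesting counter stands in for the
`LexLevel` that the new `PendingDestroyLexers` comment refers to.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Counts ToyLexer destructions so the effect is observable.
inline int &destroyedCount() {
  static int Count = 0;
  return Count;
}

// Toy stand-in for a lexer.
struct ToyLexer {
  int Id;
  explicit ToyLexer(int Id) : Id(Id) {}
  ~ToyLexer() { ++destroyedCount(); }
};

class ToyPreprocessor {
  std::unique_ptr<ToyLexer> Cur;
  std::vector<std::unique_ptr<ToyLexer>> Stack;
  std::vector<std::unique_ptr<ToyLexer>> Pending; // parked, not yet destroyed
  unsigned Level = 0;                             // nesting depth of lex calls

public:
  void enter(int Id) {
    if (Cur)
      Stack.push_back(std::move(Cur));
    Cur = std::make_unique<ToyLexer>(Id);
  }
  ToyLexer *current() const { return Cur.get(); }

  // End of file: park the finished lexer instead of destroying it, because a
  // caller further up the stack may still be running inside it.
  void pop() {
    if (Cur)
      Pending.push_back(std::move(Cur));
    if (!Stack.empty()) {
      Cur = std::move(Stack.back());
      Stack.pop_back();
    }
  }

  void beginLex() { ++Level; }
  // Destroy parked lexers only once the outermost lex call has returned.
  void endLex() {
    if (Level > 0 && --Level == 0)
      Pending.clear();
  }
};
```

In this sketch, a pointer obtained from `current()` before `pop()` stays valid
until the matching `endLex()` brings the nesting depth back to zero, which is
the property the fix relies on.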

---------

Signed-off-by: yronglin <yronglin777@gmail.com>
yronglin 2026-01-20 17:42:46 +08:00 committed by GitHub
parent 3ca2a5fc0b
commit 1da403937e
45 changed files with 1734 additions and 540 deletions


@ -81,6 +81,8 @@ C++23 Feature Support
C++20 Feature Support
^^^^^^^^^^^^^^^^^^^^^
- Clang now supports `P1857R3 <https://wg21.link/p1857r3>`_ Modules Dependency Discovery. (#GH54047)
C++17 Feature Support
^^^^^^^^^^^^^^^^^^^^^


@ -1384,33 +1384,6 @@ declarations which use it. Thus, the preferred name will not be displayed in
the debugger as expected. This is tracked by
`#56490 <https://github.com/llvm/llvm-project/issues/56490>`_.
Don't emit macros about module declaration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is covered by `P1857R3 <https://wg21.link/P1857R3>`_. It is mentioned here
because we want users to be aware that we don't yet implement it.
A direct approach to write code that can be compiled by both modules and
non-module builds may look like:
.. code-block:: c++
MODULE
IMPORT header_name
EXPORT_MODULE MODULE_NAME;
IMPORT header_name
EXPORT ...
The intent of this is that this file can be compiled like a module unit or a
non-module unit depending on the definition of some macros. However, this usage
is forbidden by P1857R3 which is not yet implemented in Clang. This means that
is possible to write invalid modules which will no longer be accepted once
P1857R3 is implemented. This is tracked by
`#54047 <https://github.com/llvm/llvm-project/issues/54047>`_.
Until then, it is recommended not to mix macros with module declarations.
Inconsistent filename suffix requirement for importable module units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


@ -503,8 +503,8 @@ def warn_cxx98_compat_variadic_macro : Warning<
InGroup<CXX98CompatPedantic>, DefaultIgnore;
def ext_named_variadic_macro : Extension<
"named variadic macros are a GNU extension">, InGroup<VariadicMacros>;
def err_embedded_directive : Error<
"embedding a #%0 directive within macro arguments is not supported">;
def err_embedded_directive : Error<"embedding a %select{#|C++ }0%1 directive "
"within macro arguments is not supported">;
def ext_embedded_directive : Extension<
"embedding a directive within macro arguments has undefined behavior">,
InGroup<DiagGroup<"embedded-directive">>;
@ -998,6 +998,21 @@ def warn_module_conflict : Warning<
InGroup<ModuleConflict>;
// C++20 modules
def err_pp_module_name_is_macro : Error<
"%select{module|partition}0 name component %1 cannot be a object-like macro">;
def err_pp_module_expected_ident : Error<
"expected %select{identifier after '.' in |}0module name">;
def err_pp_unexpected_tok_after_module_name : Error<
"unexpected preprocessing token '%0' after module name, "
"only ';' and '[' (start of attribute specifier sequence) are allowed">;
def warn_pp_extra_tokens_at_module_directive_eol
: Warning<"extra tokens after semicolon in '%0' directive">,
InGroup<ExtraTokens>;
def err_pp_module_decl_in_header
: Error<"module declaration must not come from an #include directive">;
def err_pp_cond_span_module_decl
: Error<"module directive lines are not allowed on lines controlled "
"by preprocessor conditionals">;
def err_header_import_semi_in_macro : Error<
"semicolon terminating header import declaration cannot be produced "
"by a macro">;


@ -1779,10 +1779,8 @@ def ext_bit_int : Extension<
} // end of Parse Issue category.
let CategoryName = "Modules Issue" in {
def err_unexpected_module_decl : Error<
"module declaration can only appear at the top level">;
def err_module_expected_ident : Error<
"expected a module name after '%select{module|import}0'">;
def err_unexpected_module_or_import_decl : Error<
"%select{module|import}0 declaration can only appear at the top level">;
def err_attribute_not_module_attr : Error<
"%0 attribute cannot be applied to a module">;
def err_keyword_not_module_attr : Error<
@ -1793,6 +1791,10 @@ def err_keyword_not_import_attr : Error<
"%0 cannot be applied to a module import">;
def err_module_expected_semi : Error<
"expected ';' after module name">;
def err_expected_semi_after_module_or_import
: Error<"%0 directive must end with a ';'">;
def note_module_declared_here : Note<
"%select{module|import}0 directive defined here">;
def err_global_module_introducer_not_at_start : Error<
"'module;' introducing a global module fragment can appear only "
"at the start of the translation unit">;


@ -231,6 +231,10 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
LLVM_PREFERRED_TYPE(bool)
unsigned IsModulesImport : 1;
// True if this is the 'module' contextual keyword.
LLVM_PREFERRED_TYPE(bool)
unsigned IsModulesDecl : 1;
// True if this is a mangled OpenMP variant name.
LLVM_PREFERRED_TYPE(bool)
unsigned IsMangledOpenMPVariantName : 1;
@ -267,8 +271,9 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
IsCPPOperatorKeyword(false), NeedsHandleIdentifier(false),
IsFromAST(false), ChangedAfterLoad(false), FEChangedAfterLoad(false),
RevertedTokenID(false), OutOfDate(false), IsModulesImport(false),
IsMangledOpenMPVariantName(false), IsDeprecatedMacro(false),
IsRestrictExpansion(false), IsFinal(false), IsKeywordInCpp(false) {}
IsModulesDecl(false), IsMangledOpenMPVariantName(false),
IsDeprecatedMacro(false), IsRestrictExpansion(false), IsFinal(false),
IsKeywordInCpp(false) {}
public:
IdentifierInfo(const IdentifierInfo &) = delete;
@ -569,12 +574,24 @@ public:
}
/// Determine whether this is the contextual keyword \c import.
bool isModulesImport() const { return IsModulesImport; }
bool isImportKeyword() const { return IsModulesImport; }
/// Set whether this identifier is the contextual keyword \c import.
void setModulesImport(bool I) {
IsModulesImport = I;
if (I)
void setKeywordImport(bool Val) {
IsModulesImport = Val;
if (Val)
NeedsHandleIdentifier = true;
else
RecomputeNeedsHandleIdentifier();
}
/// Determine whether this is the contextual keyword \c module.
bool isModuleKeyword() const { return IsModulesDecl; }
/// Set whether this identifier is the contextual keyword \c module.
void setModuleKeyword(bool Val) {
IsModulesDecl = Val;
if (Val)
NeedsHandleIdentifier = true;
else
RecomputeNeedsHandleIdentifier();
@ -629,7 +646,7 @@ private:
void RecomputeNeedsHandleIdentifier() {
NeedsHandleIdentifier = isPoisoned() || hasMacroDefinition() ||
isExtensionToken() || isFutureCompatKeyword() ||
isOutOfDate() || isModulesImport();
isOutOfDate() || isImportKeyword();
}
};
@ -797,10 +814,11 @@ public:
// contents.
II->Entry = &Entry;
// If this is the 'import' contextual keyword, mark it as such.
// If this is the 'import' or 'module' contextual keyword, mark it as such.
if (Name == "import")
II->setModulesImport(true);
II->setKeywordImport(true);
else if (Name == "module")
II->setModuleKeyword(true);
return *II;
}


@ -133,6 +133,11 @@ PPKEYWORD(pragma)
// C23 & C++26 #embed
PPKEYWORD(embed)
// C++20 Module Directive
PPKEYWORD(module)
PPKEYWORD(__preprocessed_module)
PPKEYWORD(__preprocessed_import)
// GNU Extensions.
PPKEYWORD(import)
PPKEYWORD(include_next)
@ -1030,6 +1035,9 @@ ANNOTATION(module_include)
ANNOTATION(module_begin)
ANNOTATION(module_end)
// Annotations for C++, Clang and Objective-C named modules.
ANNOTATION(module_name)
// Annotation for a header_name token that has been looked up and transformed
// into the name of a header unit.
ANNOTATION(header_unit)


@ -76,6 +76,10 @@ const char *getPunctuatorSpelling(TokenKind Kind) LLVM_READNONE;
/// tokens like 'int' and 'dynamic_cast'. Returns NULL for other token kinds.
const char *getKeywordSpelling(TokenKind Kind) LLVM_READNONE;
/// Determines the spelling of simple Objective-C keyword tokens like '@import'.
/// Returns NULL for other token kinds.
const char *getObjCKeywordSpelling(ObjCKeywordKind Kind) LLVM_READNONE;
/// Returns the spelling of preprocessor keywords, such as "else".
const char *getPPKeywordSpelling(PPKeywordKind Kind) LLVM_READNONE;


@ -905,7 +905,7 @@ private:
/// load it.
ModuleLoadResult findOrCompileModuleAndReadAST(StringRef ModuleName,
SourceLocation ImportLoc,
SourceLocation ModuleNameLoc,
SourceRange ModuleNameRange,
bool IsInclusionDirective);
/// Creates a \c CompilerInstance for compiling a module.


@ -13,12 +13,15 @@
#ifndef LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
#define LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
#include "clang/Basic/IdentifierTable.h"
#include "clang/Basic/SourceLocation.h"
#include "llvm/ADT/StringRef.h"
namespace clang {
class IdentifierInfo;
class MacroInfo;
using ModuleIdPath = ArrayRef<IdentifierLoc>;
/// Callback handler that receives notifications when performing code
/// completion within the preprocessor.
@ -70,6 +73,11 @@ public:
/// file where we expect natural language, e.g., a comment, string, or
/// \#error directive.
virtual void CodeCompleteNaturalLanguage() { }
/// Callback invoked when performing code completion inside the module name
/// part of an import directive.
virtual void CodeCompleteModuleImport(SourceLocation ImportLoc,
ModuleIdPath Path) {}
};
}


@ -135,6 +135,22 @@ void printDependencyDirectivesAsSource(
ArrayRef<dependency_directives_scan::Directive> Directives,
llvm::raw_ostream &OS);
/// Scan an input source buffer for C++20 named module usage.
///
/// \param Source The input source buffer.
///
/// \returns true if any C++20 named modules related directive was found.
bool scanInputForCXX20ModulesUsage(StringRef Source);
/// Scan an input source buffer, and check whether the input source is a
/// preprocessed output.
///
/// \param Source The input source buffer.
///
/// \returns true if any '__preprocessed_module' or '__preprocessed_import'
/// directive was found.
bool isPreprocessedModuleFile(StringRef Source);
/// Functor that returns the dependency directives for a given file.
class DependencyDirectivesGetter {
public:


@ -159,6 +159,7 @@ public:
/// \returns Returns true if any modules with that symbol found.
virtual bool lookupMissingImports(StringRef Name,
SourceLocation TriggerLoc) = 0;
static std::string getFlatNameFromPath(ModuleIdPath Path);
bool HadFatalFailure = false;
};


@ -48,6 +48,7 @@
#include "llvm/Support/Allocator.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/Registry.h"
#include "llvm/Support/TrailingObjects.h"
#include <cassert>
#include <cstddef>
#include <cstdint>
@ -136,6 +137,64 @@ struct CXXStandardLibraryVersionInfo {
std::uint64_t Version;
};
/// Record the previous 'export' keyword info.
///
/// Since P1857R3, the standard introduced several rules to determine whether
/// the 'module', 'export module', 'import', 'export import' is a valid
/// directive introducer. This class is used to record the previous 'export'
/// keyword token, and then handle 'export module' and 'export import'.
class ExportContextualKeywordInfo {
Token ExportTok;
bool AtPhysicalStartOfLine = false;
public:
ExportContextualKeywordInfo() = default;
ExportContextualKeywordInfo(const Token &Tok, bool AtPhysicalStartOfLine)
: ExportTok(Tok), AtPhysicalStartOfLine(AtPhysicalStartOfLine) {}
bool isValid() const { return ExportTok.is(tok::kw_export); }
bool isAtPhysicalStartOfLine() const { return AtPhysicalStartOfLine; }
Token getExportTok() const { return ExportTok; }
void reset() {
ExportTok.startToken();
AtPhysicalStartOfLine = false;
}
};
class ModuleNameLoc final
: llvm::TrailingObjects<ModuleNameLoc, IdentifierLoc> {
friend TrailingObjects;
unsigned NumIdentifierLocs;
unsigned numTrailingObjects(OverloadToken<IdentifierLoc>) const {
return getNumIdentifierLocs();
}
ModuleNameLoc(ModuleIdPath Path) : NumIdentifierLocs(Path.size()) {
(void)llvm::copy(Path, getTrailingObjectsNonStrict<IdentifierLoc>());
}
public:
static ModuleNameLoc *Create(Preprocessor &PP, ModuleIdPath Path);
unsigned getNumIdentifierLocs() const { return NumIdentifierLocs; }
ModuleIdPath getModuleIdPath() const {
return {getTrailingObjectsNonStrict<IdentifierLoc>(),
getNumIdentifierLocs()};
}
SourceLocation getBeginLoc() const {
return getModuleIdPath().front().getLoc();
}
SourceLocation getEndLoc() const {
auto &Last = getModuleIdPath().back();
return Last.getLoc().getLocWithOffset(
Last.getIdentifierInfo()->getLength());
}
SourceRange getRange() const { return {getBeginLoc(), getEndLoc()}; }
std::string str() const {
return ModuleLoader::getFlatNameFromPath(getModuleIdPath());
}
};
/// Engages in a tight little dance with the lexer to efficiently
/// preprocess tokens.
///
@ -339,8 +398,9 @@ private:
/// lexed, if any.
SourceLocation ModuleImportLoc;
/// The import path for named module that we're currently processing.
SmallVector<IdentifierLoc, 2> NamedModuleImportPath;
/// The source location of the \c module contextual keyword we just
/// lexed, if any.
SourceLocation ModuleDeclLoc;
llvm::DenseMap<FileID, SmallVector<const char *>> CheckPoints;
unsigned CheckPointCounter = 0;
@ -351,6 +411,12 @@ private:
/// Whether the last token we lexed was an '@'.
bool LastTokenWasAt = false;
/// Whether we're importing a standard C++20 named Modules.
bool ImportingCXXNamedModules = false;
/// Whether the last token we lexed was an 'export' keyword.
ExportContextualKeywordInfo LastTokenWasExportKeyword;
/// First pp-token source location in current translation unit.
SourceLocation FirstPPTokenLoc;
@ -562,9 +628,9 @@ private:
reset();
}
void handleIdentifier(IdentifierInfo *Identifier) {
if (isModuleCandidate() && Identifier)
Name += Identifier->getName().str();
void handleModuleName(ModuleNameLoc *NameLoc) {
if (isModuleCandidate() && NameLoc)
Name += NameLoc->str();
else if (!isNamedModule())
reset();
}
@ -576,13 +642,6 @@ private:
reset();
}
void handlePeriod() {
if (isModuleCandidate())
Name += ".";
else if (!isNamedModule())
reset();
}
void handleSemi() {
if (!Name.empty() && isModuleCandidate()) {
if (State == InterfaceCandidate)
@ -639,10 +698,6 @@ private:
ModuleDeclSeq ModuleDeclState;
/// Whether the module import expects an identifier next. Otherwise,
/// it expects a '.' or ';'.
bool ModuleImportExpectsIdentifier = false;
/// The identifier and source location of the currently-active
/// \#pragma clang arc_cf_code_audited begin.
IdentifierLoc PragmaARCCFCodeAuditedInfo;
@ -776,6 +831,12 @@ private:
/// Only one of CurLexer, or CurTokenLexer will be non-null.
std::unique_ptr<Lexer> CurLexer;
/// Lexers that are pending destruction, deferred until the current
/// Stack of Lexer unwinds completely (LexLevel returns to 0).
/// This avoids use-after-free when HandleEndOfFile is called from
/// within a Lexer method that still needs to access its members.
SmallVector<std::unique_ptr<Lexer>, 2> PendingDestroyLexers;
/// The current top of the stack that we're lexing from
/// if not expanding a macro.
///
@ -1125,6 +1186,9 @@ private:
/// Whether tokens are being skipped until the through header is seen.
bool SkippingUntilPCHThroughHeader = false;
/// Whether the main file is preprocessed module file.
bool MainFileIsPreprocessedModuleFile = false;
/// \{
/// Cache of macro expanders to reduce malloc traffic.
enum { TokenLexerCacheSize = 8 };
@ -1778,6 +1842,36 @@ public:
std::optional<LexEmbedParametersResult> LexEmbedParameters(Token &Current,
bool ForHasEmbed);
/// Whether the main file is preprocessed module file.
bool isPreprocessedModuleFile() const {
return MainFileIsPreprocessedModuleFile;
}
/// Mark the main file as a preprocessed module file, then the 'module' and
/// 'import' directive recognition will be suppressed. Only
/// '__preprocessed_module' and '__preprocessed_import' are allowed.
void markMainFileAsPreprocessedModuleFile() {
MainFileIsPreprocessedModuleFile = true;
}
bool LexModuleNameContinue(Token &Tok, SourceLocation UseLoc,
SmallVectorImpl<Token> &Suffix,
SmallVectorImpl<IdentifierLoc> &Path,
bool AllowMacroExpansion = true,
bool IsPartition = false);
void EnterModuleSuffixTokenStream(ArrayRef<Token> Toks);
void HandleCXXImportDirective(Token Import);
void HandleCXXModuleDirective(Token Module);
/// Callback invoked when the lexer sees one of export, import or module token
/// at the start of a line.
///
/// This consumes the import/module directive, modifies the
/// lexer/preprocessor state, and advances the lexer(s) so that the next token
/// read is the correct one.
bool HandleModuleContextualKeyword(Token &Result,
bool TokAtPhysicalStartOfLine);
/// Get the start location of the first pp-token in main file.
SourceLocation getMainFileFirstPPTokenLoc() const {
assert(FirstPPTokenLoc.isValid() &&
@ -1786,7 +1880,10 @@ public:
}
bool LexAfterModuleImport(Token &Result);
void CollectPpImportSuffix(SmallVectorImpl<Token> &Toks);
void CollectPPImportSuffix(SmallVectorImpl<Token> &Toks,
bool StopUntilEOD = false);
bool CollectPPImportSuffixAndEnterStream(SmallVectorImpl<Token> &Toks,
bool StopUntilEOD = false);
void makeModuleVisible(Module *M, SourceLocation Loc,
bool IncludeExports = true);
@ -2308,45 +2405,22 @@ public:
}
}
/// Check whether the next pp-token is one of the specificed token kind. this
/// method should have no observable side-effect on the lexed tokens.
template <typename... Ts> bool isNextPPTokenOneOf(Ts... Ks) {
/// isNextPPTokenOneOf - Check whether the next pp-token is one of the
/// specificed token kind. this method should have no observable side-effect
/// on the lexed tokens.
template <typename... Ts> bool isNextPPTokenOneOf(Ts... Ks) const {
static_assert(sizeof...(Ts) > 0,
"requires at least one tok::TokenKind specified");
// Do some quick tests for rejection cases.
std::optional<Token> Val;
if (CurLexer)
Val = CurLexer->peekNextPPToken();
else
Val = CurTokenLexer->peekNextPPToken();
if (!Val) {
// We have run off the end. If it's a source file we don't
// examine enclosing ones (C99 5.1.1.2p4). Otherwise walk up the
// macro stack.
if (CurPPLexer)
return false;
for (const IncludeStackInfo &Entry : llvm::reverse(IncludeMacroStack)) {
if (Entry.TheLexer)
Val = Entry.TheLexer->peekNextPPToken();
else
Val = Entry.TheTokenLexer->peekNextPPToken();
if (Val)
break;
// Ran off the end of a source file?
if (Entry.ThePPLexer)
return false;
}
}
// Okay, we found the token and return. Otherwise we found the end of the
// translation unit.
return Val->isOneOf(Ks...);
auto NextTokOpt = peekNextPPToken();
return NextTokOpt.has_value() ? NextTokOpt->is(Ks...) : false;
}
private:
/// peekNextPPToken - Return std::nullopt if there are no more tokens in the
/// buffer controlled by this lexer, otherwise return the next unexpanded
/// token.
std::optional<Token> peekNextPPToken() const;
/// Identifiers used for SEH handling in Borland. These are only
/// allowed in particular circumstances
// __except block
@ -2402,20 +2476,27 @@ public:
/// If \p EnableMacros is true, then we consider macros that expand to zero
/// tokens as being ok.
///
/// If \p ExtraToks not null, the extra tokens will be saved in this
/// container.
///
/// \return The location of the end of the directive (the terminating
/// newline).
SourceLocation CheckEndOfDirective(const char *DirType,
bool EnableMacros = false);
SourceLocation
CheckEndOfDirective(StringRef DirType, bool EnableMacros = false,
SmallVectorImpl<Token> *ExtraToks = nullptr);
/// Read and discard all tokens remaining on the current line until
/// the tok::eod token is found. Returns the range of the skipped tokens.
SourceRange DiscardUntilEndOfDirective() {
SourceRange
DiscardUntilEndOfDirective(SmallVectorImpl<Token> *DiscardedToks = nullptr) {
Token Tmp;
return DiscardUntilEndOfDirective(Tmp);
return DiscardUntilEndOfDirective(Tmp, DiscardedToks);
}
/// Same as above except retains the token that was found.
SourceRange DiscardUntilEndOfDirective(Token &Tok);
SourceRange
DiscardUntilEndOfDirective(Token &Tok,
SmallVectorImpl<Token> *DiscardedToks = nullptr);
/// Returns true if the preprocessor has seen a use of
/// __DATE__ or __TIME__ in the file so far.
@ -2486,11 +2567,10 @@ public:
}
/// If we're importing a standard C++20 Named Modules.
bool isInImportingCXXNamedModules() const {
// NamedModuleImportPath will be non-empty only if we're importing
// Standard C++ named modules.
return !NamedModuleImportPath.empty() && getLangOpts().CPlusPlusModules &&
!IsAtImport;
bool isImportingCXXNamedModules() const {
assert(getLangOpts().CPlusPlusModules &&
"Import C++ named modules are only valid for C++20 modules");
return ImportingCXXNamedModules;
}
/// Allocate a new MacroInfo object with the provided SourceLocation.
@ -2558,6 +2638,8 @@ private:
}
void PopIncludeMacroStack() {
if (CurLexer)
PendingDestroyLexers.push_back(std::move(CurLexer));
CurLexer = std::move(IncludeMacroStack.back().TheLexer);
CurPPLexer = IncludeMacroStack.back().ThePPLexer;
CurTokenLexer = std::move(IncludeMacroStack.back().TheTokenLexer);


@ -297,6 +297,10 @@ public:
/// Return the ObjC keyword kind.
tok::ObjCKeywordKind getObjCKeywordID() const;
/// Return true if we have a C++20 modules contextual keyword(export, import
/// or module).
bool isModuleContextualKeyword(bool AllowExport = true) const;
bool isSimpleTypeSpecifier(const LangOptions &LangOpts) const;
/// Return true if this token has trigraphs or escaped newlines in it.


@ -100,6 +100,10 @@ class TokenLexer {
/// See the flag documentation for details.
bool IsReinject : 1;
/// This is true if this TokenLexer is created when handling a C++ module
/// directive.
bool LexingCXXModuleDirective : 1;
public:
/// Create a TokenLexer for the specified macro with the specified actual
/// arguments. Note that this ctor takes ownership of the ActualArgs pointer.
@ -151,6 +155,14 @@ public:
/// preprocessor directive.
bool isParsingPreprocessorDirective() const;
/// setLexingCXXModuleDirective - This is set to true if this TokenLexer is
/// created when handling a C++ module directive.
void setLexingCXXModuleDirective(bool Val = true);
/// isLexingCXXModuleDirective - Return true if we are lexing a C++ module or
/// import directive.
bool isLexingCXXModuleDirective() const;
private:
void destroy();


@ -566,10 +566,6 @@ private:
/// Contextual keywords for Microsoft extensions.
IdentifierInfo *Ident__except;
// C++2a contextual keywords.
mutable IdentifierInfo *Ident_import;
mutable IdentifierInfo *Ident_module;
std::unique_ptr<CommentHandler> CommentSemaHandler;
/// Gets set to true after calling ProduceSignatureHelp, it is for a
@ -1081,6 +1077,9 @@ private:
bool ParseModuleName(SourceLocation UseLoc,
SmallVectorImpl<IdentifierLoc> &Path, bool IsImport);
void DiagnoseInvalidCXXModuleDecl(const Sema::ModuleImportState &ImportState);
void DiagnoseInvalidCXXModuleImport();
//===--------------------------------------------------------------------===//
// Preprocessor code-completion pass-through
void CodeCompleteDirective(bool InConditional) override;
@ -1091,6 +1090,8 @@ private:
unsigned ArgumentIndex) override;
void CodeCompleteIncludedFile(llvm::StringRef Dir, bool IsAngled) override;
void CodeCompleteNaturalLanguage() override;
void CodeCompleteModuleImport(SourceLocation ImportLoc,
ModuleIdPath Path) override;
///@}


@ -298,8 +298,11 @@ void IdentifierTable::AddKeywords(const LangOptions &LangOpts) {
if (LangOpts.IEEE128)
AddKeyword("__ieee128", tok::kw___float128, KEYALL, LangOpts, *this);
// Add the 'import' contextual keyword.
get("import").setModulesImport(true);
// Add the 'import' and 'module' contextual keywords.
get("import").setKeywordImport(true);
get("module").setModuleKeyword(true);
get("__preprocessed_import").setKeywordImport(true);
get("__preprocessed_module").setModuleKeyword(true);
}
/// Checks if the specified token kind represents a keyword in the
@ -413,6 +416,13 @@ tok::PPKeywordKind IdentifierInfo::getPPKeywordID() const {
unsigned Len = getLength();
if (Len < 2) return tok::pp_not_keyword;
const char *Name = getNameStart();
if (Name[0] == '_' && isImportKeyword())
return tok::pp___preprocessed_import;
if (Name[0] == '_' && isModuleKeyword())
return tok::pp___preprocessed_module;
// clang-format off
switch (HASH(Len, Name[0], Name[2])) {
default: return tok::pp_not_keyword;
CASE( 2, 'i', '\0', if);
@ -431,6 +441,7 @@ tok::PPKeywordKind IdentifierInfo::getPPKeywordID() const {
CASE( 6, 'd', 'f', define);
CASE( 6, 'i', 'n', ifndef);
CASE( 6, 'i', 'p', import);
CASE( 6, 'm', 'd', module);
CASE( 6, 'p', 'a', pragma);
CASE( 7, 'd', 'f', defined);
@ -450,6 +461,7 @@ tok::PPKeywordKind IdentifierInfo::getPPKeywordID() const {
#undef CASE
#undef HASH
}
// clang-format on
}
//===----------------------------------------------------------------------===//


@ -46,6 +46,18 @@ const char *tok::getKeywordSpelling(TokenKind Kind) {
return nullptr;
}
const char *tok::getObjCKeywordSpelling(ObjCKeywordKind Kind) {
switch (Kind) {
#define OBJC_AT_KEYWORD(X) \
case objc_##X: \
return "@" #X;
#include "clang/Basic/TokenKinds.def"
default:
break;
}
return nullptr;
}
const char *tok::getPPKeywordSpelling(tok::PPKeywordKind Kind) {
switch (Kind) {
#define PPKEYWORD(x) case tok::pp_##x: return #x;


@ -565,7 +565,8 @@ void ModuleDepCollectorPP::InclusionDirective(
void ModuleDepCollectorPP::moduleImport(SourceLocation ImportLoc,
ModuleIdPath Path,
const Module *Imported) {
if (MDC.ScanInstance.getPreprocessor().isInImportingCXXNamedModules()) {
auto &PP = MDC.ScanInstance.getPreprocessor();
if (PP.getLangOpts().CPlusPlusModules && PP.isImportingCXXNamedModules()) {
P1689ModuleInfo RequiredModule;
RequiredModule.ModuleName = Path[0].getIdentifierInfo()->getName().str();
RequiredModule.Type = P1689ModuleInfo::ModuleType::NamedCXXModule;


@ -1762,8 +1762,8 @@ static ModuleSource selectModuleSource(
}
ModuleLoadResult CompilerInstance::findOrCompileModuleAndReadAST(
StringRef ModuleName, SourceLocation ImportLoc,
SourceLocation ModuleNameLoc, bool IsInclusionDirective) {
StringRef ModuleName, SourceLocation ImportLoc, SourceRange ModuleNameRange,
bool IsInclusionDirective) {
// Search for a module with the given name.
HeaderSearch &HS = PP->getHeaderSearchInfo();
Module *M =
@ -1780,10 +1780,11 @@ ModuleLoadResult CompilerInstance::findOrCompileModuleAndReadAST(
std::string ModuleFilename;
ModuleSource Source =
selectModuleSource(M, ModuleName, ModuleFilename, BuiltModules, HS);
SourceLocation ModuleNameLoc = ModuleNameRange.getBegin();
if (Source == MS_ModuleNotFound) {
// We can't find a module, error out here.
getDiagnostics().Report(ModuleNameLoc, diag::err_module_not_found)
<< ModuleName << SourceRange(ImportLoc, ModuleNameLoc);
<< ModuleName << ModuleNameRange;
return nullptr;
}
if (ModuleFilename.empty()) {
@ -1969,8 +1970,11 @@ CompilerInstance::loadModule(SourceLocation ImportLoc,
MM.cacheModuleLoad(*Path[0].getIdentifierInfo(), Module);
} else {
SourceLocation ModuleNameEndLoc = Path.back().getLoc().getLocWithOffset(
Path.back().getIdentifierInfo()->getLength());
ModuleLoadResult Result = findOrCompileModuleAndReadAST(
ModuleName, ImportLoc, ModuleNameLoc, IsInclusionDirective);
ModuleName, ImportLoc, SourceRange{ModuleNameLoc, ModuleNameEndLoc},
IsInclusionDirective);
if (!Result.isNormal())
return Result;
if (!Result)


@ -1641,5 +1641,12 @@ void clang::InitializePreprocessor(Preprocessor &PP,
if (FEOpts.DashX.isPreprocessed()) {
PP.getDiagnostics().setSeverity(diag::ext_pp_gnu_line_directive,
diag::Severity::Ignored, SourceLocation());
// Compiling with -xc++-cpp-output should suppress module directive
// recognition. __preprocessed_module can either get the directive treatment
// or be accepted directly by phase 7 in a module declaration. In the latter
// case, __preprocessed_module will work even if there are preprocessing
// tokens on the same line that precede it.
PP.markMainFileAsPreprocessedModuleFile();
}
}


@ -245,6 +245,8 @@ public:
unsigned GetNumToksToSkip() const { return NumToksToSkip; }
void ResetSkipToks() { NumToksToSkip = 0; }
const Token &GetPrevToken() const { return PrevTok; }
};
} // end anonymous namespace
@ -758,7 +760,8 @@ void PrintPPOutputPPCallbacks::HandleWhitespaceBeforeTok(const Token &Tok,
if (Tok.is(tok::eof) ||
(Tok.isAnnotation() && !Tok.is(tok::annot_header_unit) &&
!Tok.is(tok::annot_module_begin) && !Tok.is(tok::annot_module_end) &&
!Tok.is(tok::annot_repl_input_end) && !Tok.is(tok::annot_embed)))
!Tok.is(tok::annot_repl_input_end) && !Tok.is(tok::annot_embed) &&
!Tok.is(tok::annot_module_name)))
return;
// EmittedDirectiveOnThisLine takes priority over RequireSameLine.
@ -893,6 +896,7 @@ static void PrintPreprocessedTokens(Preprocessor &PP, Token &Tok,
!PP.getCommentRetentionState();
bool IsStartOfLine = false;
bool IsCXXModuleDirective = false;
char Buffer[256];
while (true) {
// Two lines joined with line continuation ('\' as last character on the
@ -978,11 +982,38 @@ static void PrintPreprocessedTokens(Preprocessor &PP, Token &Tok,
*Callbacks->OS << static_cast<int>(Byte);
PrintComma = true;
}
} else if (Tok.is(tok::annot_module_name)) {
auto *NameLoc = static_cast<ModuleNameLoc *>(Tok.getAnnotationValue());
*Callbacks->OS << NameLoc->str();
} else if (Tok.isAnnotation()) {
// Ignore annotation tokens created by pragmas - the pragmas themselves
// will be reproduced in the preprocessed output.
PP.Lex(Tok);
continue;
} else if (PP.getLangOpts().CPlusPlusModules && Tok.is(tok::kw_import) &&
!Callbacks->GetPrevToken().is(tok::at)) {
assert(!IsCXXModuleDirective && "Is an import directive being printed?");
IsCXXModuleDirective = true;
IsStartOfLine = false;
*Callbacks->OS << tok::getPPKeywordSpelling(
tok::pp___preprocessed_import);
PP.Lex(Tok);
continue;
} else if (PP.getLangOpts().CPlusPlusModules && Tok.is(tok::kw_module)) {
assert(!IsCXXModuleDirective && "Is a module directive being printed?");
IsCXXModuleDirective = true;
IsStartOfLine = false;
*Callbacks->OS << tok::getPPKeywordSpelling(
tok::pp___preprocessed_module);
PP.Lex(Tok);
continue;
} else if (PP.getLangOpts().CPlusPlusModules && IsCXXModuleDirective &&
Tok.is(tok::semi)) {
IsCXXModuleDirective = false;
IsStartOfLine = true;
*Callbacks->OS << ';';
PP.Lex(Tok);
continue;
} else if (IdentifierInfo *II = Tok.getIdentifierInfo()) {
*Callbacks->OS << II->getName();
} else if (Tok.isLiteral() && !Tok.needsCleaning() &&

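The keyword rewriting performed by the `-E` printer in the hunk above can be illustrated with a small standalone sketch. It is not Clang code: `TokKind`, `Tok` and `printPreprocessed` are hypothetical names, and the sketch assumes the preprocessor has already decided which `import`/`module` tokens are directive keywords (an identifier that merely happens to be spelled `import` has kind `Other`).

```cpp
#include <string>
#include <vector>

// Hypothetical token model; Clang's real Token class is much richer.
enum class TokKind { Import, Module, Semi, Other };

struct Tok {
  std::string Text;
  TokKind Kind;
};

// Prints tokens separated by single spaces, replacing directive keywords
// with their __preprocessed_ spellings, as the -E printer does for
// kw_import and kw_module.
std::string printPreprocessed(const std::vector<Tok> &Toks) {
  std::string Out;
  bool InModuleDirective = false;
  for (const Tok &T : Toks) {
    // No space before ';' so the directive reads naturally.
    if (!Out.empty() && T.Kind != TokKind::Semi)
      Out += ' ';
    if (T.Kind == TokKind::Import) {
      Out += "__preprocessed_import";
      InModuleDirective = true;
    } else if (T.Kind == TokKind::Module) {
      Out += "__preprocessed_module";
      InModuleDirective = true;
    } else if (T.Kind == TokKind::Semi && InModuleDirective) {
      // The ';' closes the directive that is currently being printed.
      Out += ';';
      InModuleDirective = false;
    } else {
      Out += T.Text;
    }
  }
  return Out;
}
```

Under these assumptions, the tokens of `export module M;` print as `export __preprocessed_module M;`, while `int import;` (where `import` is an ordinary identifier) is left unchanged.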

@ -83,6 +83,9 @@ struct Scanner {
/// \returns True on error.
bool scan(SmallVectorImpl<Directive> &Directives);
friend bool clang::scanInputForCXX20ModulesUsage(StringRef Source);
friend bool clang::isPreprocessedModuleFile(StringRef Source);
private:
/// Lexes next token and advances \p First and the \p Lexer.
[[nodiscard]] dependency_directives_scan::Token &
@ -172,6 +175,7 @@ private:
/// true at the end.
bool reportError(const char *CurPtr, unsigned Err);
bool ScanningPreprocessedModuleFile = false;
StringMap<char> SplitIds;
StringRef Input;
SmallVectorImpl<dependency_directives_scan::Token> &Tokens;
@ -542,6 +546,12 @@ static void skipWhitespace(const char *&First, const char *const End) {
bool Scanner::lexModuleDirectiveBody(DirectiveKind Kind, const char *&First,
const char *const End) {
assert(Kind == DirectiveKind::cxx_export_import_decl ||
Kind == DirectiveKind::cxx_export_module_decl ||
Kind == DirectiveKind::cxx_import_decl ||
Kind == DirectiveKind::cxx_module_decl ||
Kind == DirectiveKind::decl_at_import);
const char *DirectiveLoc = Input.data() + CurDirToks.front().Offset;
for (;;) {
// Keep a copy of the First char in case it needs to be reset.
@ -553,7 +563,7 @@ bool Scanner::lexModuleDirectiveBody(DirectiveKind Kind, const char *&First,
First = Previous;
return false;
}
if (Tok.is(tok::eof))
if (Tok.isOneOf(tok::eof, tok::eod))
return reportError(
DirectiveLoc,
diag::err_dep_source_scanner_missing_semi_after_at_import);
@ -561,12 +571,25 @@ bool Scanner::lexModuleDirectiveBody(DirectiveKind Kind, const char *&First,
break;
}
const auto &Tok = lexToken(First, End);
pushDirective(Kind);
if (Tok.is(tok::eof) || Tok.is(tok::eod))
bool IsCXXModules = Kind == DirectiveKind::cxx_export_import_decl ||
Kind == DirectiveKind::cxx_export_module_decl ||
Kind == DirectiveKind::cxx_import_decl ||
Kind == DirectiveKind::cxx_module_decl;
if (IsCXXModules) {
lexPPDirectiveBody(First, End);
pushDirective(Kind);
return false;
return reportError(DirectiveLoc,
diag::err_dep_source_scanner_unexpected_tokens_at_import);
}
pushDirective(Kind);
skipWhitespace(First, End);
if (First == End)
return false;
if (!isVerticalWhitespace(*First))
return reportError(
DirectiveLoc, diag::err_dep_source_scanner_unexpected_tokens_at_import);
skipNewline(First, End);
return false;
}
dependency_directives_scan::Token &Scanner::lexToken(const char *&First,
@ -703,7 +726,12 @@ bool Scanner::lexModule(const char *&First, const char *const End) {
Id = *NextId;
}
if (Id != "module" && Id != "import") {
StringRef Module =
ScanningPreprocessedModuleFile ? "__preprocessed_module" : "module";
StringRef Import =
ScanningPreprocessedModuleFile ? "__preprocessed_import" : "import";
if (Id != Module && Id != Import) {
skipLine(First, End);
return false;
}
@ -716,7 +744,7 @@ bool Scanner::lexModule(const char *&First, const char *const End) {
switch (*First) {
case ':': {
// `module :` is never the start of a valid module declaration.
if (Id == "module") {
if (Id == Module) {
skipLine(First, End);
return false;
}
@ -735,7 +763,7 @@ bool Scanner::lexModule(const char *&First, const char *const End) {
}
case ';': {
// Handle the global module fragment `module;`.
if (Id == "module" && !Export)
if (Id == Module && !Export)
break;
skipLine(First, End);
return false;
@ -753,7 +781,7 @@ bool Scanner::lexModule(const char *&First, const char *const End) {
TheLexer.seek(getOffsetAt(First), /*IsAtStartOfLine*/ false);
DirectiveKind Kind;
if (Id == "module")
if (Id == Module)
Kind = Export ? cxx_export_module_decl : cxx_module_decl;
else
Kind = Export ? cxx_export_import_decl : cxx_import_decl;
@ -886,6 +914,19 @@ static bool isStartOfRelevantLine(char First) {
return false;
}
static inline bool isStartWithPreprocessedModuleDirective(const char *First,
const char *End) {
assert(First <= End);
if (*First == '_') {
StringRef Str(First, End - First);
return Str.starts_with(
tok::getPPKeywordSpelling(tok::pp___preprocessed_module)) ||
Str.starts_with(
tok::getPPKeywordSpelling(tok::pp___preprocessed_import));
}
return false;
}
bool Scanner::lexPPLine(const char *&First, const char *const End) {
assert(First != End);
@ -910,7 +951,13 @@ bool Scanner::lexPPLine(const char *&First, const char *const End) {
CurDirToks.clear();
});
if (*First == '_') {
// FIXME: Should we handle @import as a preprocessing directive?
if (*First == '@')
return lexAt(First, End);
bool IsPreprocessedModule =
isStartWithPreprocessedModuleDirective(First, End);
if (*First == '_' && !IsPreprocessedModule) {
if (isNextIdentifierOrSkipLine("_Pragma", First, End))
return lex_Pragma(First, End);
return false;
@ -922,12 +969,8 @@ bool Scanner::lexPPLine(const char *&First, const char *const End) {
llvm::scope_exit ScEx2(
[&]() { TheLexer.setParsingPreprocessorDirective(false); });
// Handle "@import".
if (*First == '@')
return lexAt(First, End);
// Handle module directives for C++20 modules.
if (*First == 'i' || *First == 'e' || *First == 'm')
if (*First == 'i' || *First == 'e' || *First == 'm' || IsPreprocessedModule)
return lexModule(First, End);
// Lex '#'.
@ -1009,6 +1052,7 @@ bool Scanner::scanImpl(const char *First, const char *const End) {
}
bool Scanner::scan(SmallVectorImpl<Directive> &Directives) {
ScanningPreprocessedModuleFile = clang::isPreprocessedModuleFile(Input);
bool Error = scanImpl(Input.begin(), Input.end());
if (!Error) {
@ -1075,3 +1119,93 @@ void clang::printDependencyDirectivesAsSource(
}
}
}
static void skipUntilMaybeCXX20ModuleDirective(const char *&First,
const char *const End) {
assert(First <= End);
while (First != End) {
if (*First == '#') {
++First;
skipToNewlineRaw(First, End);
}
skipWhitespace(First, End);
if (const auto Len = isEOL(First, End)) {
First += Len;
continue;
}
break;
}
}
bool clang::scanInputForCXX20ModulesUsage(StringRef Source) {
const char *First = Source.begin();
const char *const End = Source.end();
skipUntilMaybeCXX20ModuleDirective(First, End);
if (First == End)
return false;
// Check if the next token can even be a module directive before creating a
// full lexer.
if (!(*First == 'i' || *First == 'e' || *First == 'm'))
return false;
llvm::SmallVector<dependency_directives_scan::Token> Tokens;
Scanner S(StringRef(First, End - First), Tokens, nullptr, SourceLocation());
S.TheLexer.setParsingPreprocessorDirective(true);
if (S.lexModule(First, End))
return false;
auto IsCXXNamedModuleDirective = [](const DirectiveWithTokens &D) {
switch (D.Kind) {
case dependency_directives_scan::cxx_module_decl:
case dependency_directives_scan::cxx_import_decl:
case dependency_directives_scan::cxx_export_module_decl:
case dependency_directives_scan::cxx_export_import_decl:
return true;
default:
return false;
}
};
return llvm::any_of(S.DirsWithToks, IsCXXNamedModuleDirective);
}
bool clang::isPreprocessedModuleFile(StringRef Source) {
const char *First = Source.begin();
const char *const End = Source.end();
skipUntilMaybeCXX20ModuleDirective(First, End);
if (First == End)
return false;
llvm::SmallVector<dependency_directives_scan::Token> Tokens;
Scanner S(StringRef(First, End - First), Tokens, nullptr, SourceLocation());
while (First != End) {
if (*First == '#') {
++First;
skipToNewlineRaw(First, End);
} else if (*First == 'e') {
S.TheLexer.seek(S.getOffsetAt(First), /*IsAtStartOfLine=*/true);
StringRef Id = S.lexIdentifier(First, End);
if (Id == "export") {
std::optional<StringRef> NextId =
S.tryLexIdentifierOrSkipLine(First, End);
if (!NextId)
return false;
Id = *NextId;
}
if (Id == "__preprocessed_module" || Id == "__preprocessed_import")
return true;
skipToNewlineRaw(First, End);
} else if (isStartWithPreprocessedModuleDirective(First, End))
return true;
else
skipToNewlineRaw(First, End);
skipWhitespace(First, End);
if (const auto Len = isEOL(First, End)) {
First += Len;
continue;
}
break;
}
return false;
}
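As a rough illustration of what `isPreprocessedModuleFile` looks for, the following standalone sketch scans a buffer line by line. It is a simplification under stated assumptions, not the scanner's logic: it ignores comments and line continuations, and the helper names (`ltrim`, `startsWith`, `looksLikePreprocessedModuleFile`) are hypothetical. Like `isStartWithPreprocessedModuleDirective`, it only tests for a prefix.

```cpp
#include <cstddef>
#include <string_view>

// Strips leading horizontal whitespace (spaces and tabs).
std::string_view ltrim(std::string_view S) {
  std::size_t I = S.find_first_not_of(" \t");
  return I == std::string_view::npos ? std::string_view{} : S.substr(I);
}

bool startsWith(std::string_view S, std::string_view Prefix) {
  return S.substr(0, Prefix.size()) == Prefix;
}

// Returns true if some line starts with an optional `export` followed by
// __preprocessed_module or __preprocessed_import. Lines starting with '#'
// (for example line markers) are skipped.
bool looksLikePreprocessedModuleFile(std::string_view Source) {
  constexpr std::string_view Export = "export";
  while (!Source.empty()) {
    std::size_t NL = Source.find('\n');
    std::string_view Line = ltrim(Source.substr(0, NL));
    Source = NL == std::string_view::npos ? std::string_view{}
                                          : Source.substr(NL + 1);
    if (Line.empty() || Line.front() == '#')
      continue;
    // Accept `export` only as a whole word followed by whitespace.
    if (startsWith(Line, Export) && Line.size() > Export.size() &&
        (Line[Export.size()] == ' ' || Line[Export.size()] == '\t'))
      Line = ltrim(Line.substr(Export.size()));
    if (startsWith(Line, "__preprocessed_module") ||
        startsWith(Line, "__preprocessed_import"))
      return true;
  }
  return false;
}
```

A buffer containing a line marker followed by `export __preprocessed_module M;` is reported as preprocessed, whereas a plain `export module M;` is not.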


@ -72,6 +72,17 @@ tok::ObjCKeywordKind Token::getObjCKeywordID() const {
return specId ? specId->getObjCKeywordID() : tok::objc_not_keyword;
}
bool Token::isModuleContextualKeyword(bool AllowExport) const {
if (AllowExport && is(tok::kw_export))
return true;
if (isOneOf(tok::kw_import, tok::kw_module))
return true;
if (isNot(tok::identifier))
return false;
const auto *II = getIdentifierInfo();
return II->isImportKeyword() || II->isModuleKeyword();
}
/// Determine whether the token kind starts a simple-type-specifier.
bool Token::isSimpleTypeSpecifier(const LangOptions &LangOpts) const {
switch (getKind()) {
@ -4019,11 +4030,23 @@ LexStart:
case 'h': case 'i': case 'j': case 'k': case 'l': case 'm': case 'n':
case 'o': case 'p': case 'q': case 'r': case 's': case 't': /*'u'*/
case 'v': case 'w': case 'x': case 'y': case 'z':
case '_':
case '_': {
// Notify MIOpt that we read a non-whitespace/non-comment token.
MIOpt.ReadToken();
return LexIdentifierContinue(Result, CurPtr);
// LexIdentifierContinue may trigger HandleEndOfFile which would
// normally destroy this Lexer. However, the Preprocessor now defers
// lexer destruction until the stack of Lexer unwinds (LexLevel == 0),
// so it's safe to access member variables after this call returns.
bool returnedToken = LexIdentifierContinue(Result, CurPtr);
if (returnedToken && !LexingRawMode && !Is_PragmaLexer &&
!ParsingPreprocessorDirective && LangOpts.CPlusPlusModules &&
Result.isModuleContextualKeyword() &&
PP->HandleModuleContextualKeyword(Result, TokAtPhysicalStartOfLine))
goto HandleDirective;
return returnedToken;
}
case '$': // $ in identifiers.
if (LangOpts.DollarIdents) {
if (!isLexingRawMode())
@ -4226,8 +4249,12 @@ LexStart:
// it's actually the start of a preprocessing directive. Callback to
// the preprocessor to handle it.
// TODO: -fpreprocessed mode??
if (TokAtPhysicalStartOfLine && !LexingRawMode && !Is_PragmaLexer)
if (TokAtPhysicalStartOfLine && !LexingRawMode && !Is_PragmaLexer) {
// We parsed a # character and it's the start of a preprocessing
// directive.
FormTokenWithChars(Result, CurPtr, tok::hash);
goto HandleDirective;
}
Kind = tok::hash;
}
@ -4414,8 +4441,12 @@ LexStart:
// it's actually the start of a preprocessing directive. Callback to
// the preprocessor to handle it.
// TODO: -fpreprocessed mode??
if (TokAtPhysicalStartOfLine && !LexingRawMode && !Is_PragmaLexer)
if (TokAtPhysicalStartOfLine && !LexingRawMode && !Is_PragmaLexer) {
// We parsed a # character and it's the start of a preprocessing
// directive.
FormTokenWithChars(Result, CurPtr, tok::hash);
goto HandleDirective;
}
Kind = tok::hash;
}
@ -4505,9 +4536,6 @@ LexStart:
return true;
HandleDirective:
// We parsed a # character and it's the start of a preprocessing directive.
FormTokenWithChars(Result, CurPtr, tok::hash);
PP->HandleDirective(Result);
if (PP->hadModuleLoaderFatalFailure())
@ -4530,6 +4558,10 @@ const char *Lexer::convertDependencyDirectiveToken(
Result.setKind(DDTok.Kind);
Result.setFlag((Token::TokenFlags)DDTok.Flags);
Result.setLength(DDTok.Length);
if (Result.is(tok::raw_identifier))
Result.setRawIdentifierData(TokPtr);
else if (Result.isLiteral())
Result.setLiteralData(TokPtr);
BufferPtr = TokPtr + DDTok.Length;
return TokPtr;
}
@ -4587,15 +4619,18 @@ bool Lexer::LexDependencyDirectiveToken(Token &Result) {
Result.setRawIdentifierData(TokPtr);
if (!isLexingRawMode()) {
const IdentifierInfo *II = PP->LookUpIdentifierInfo(Result);
if (LangOpts.CPlusPlusModules && Result.isModuleContextualKeyword() &&
PP->HandleModuleContextualKeyword(Result, Result.isAtStartOfLine())) {
PP->HandleDirective(Result);
return false;
}
if (II->isHandleIdentifierCase())
return PP->HandleIdentifier(Result);
}
return true;
}
if (Result.isLiteral()) {
Result.setLiteralData(TokPtr);
if (Result.isLiteral())
return true;
}
if (Result.is(tok::colon)) {
// Convert consecutive colons to 'tok::coloncolon'.
if (*BufferPtr == ':') {


@ -48,6 +48,7 @@
#include "llvm/Support/SaveAndRestore.h"
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <optional>
#include <string>
@ -82,14 +83,19 @@ Preprocessor::AllocateVisibilityMacroDirective(SourceLocation Loc,
/// Read and discard all tokens remaining on the current line until
/// the tok::eod token is found.
SourceRange Preprocessor::DiscardUntilEndOfDirective(Token &Tmp) {
SourceRange Preprocessor::DiscardUntilEndOfDirective(
Token &Tmp, SmallVectorImpl<Token> *DiscardedToks) {
SourceRange Res;
LexUnexpandedToken(Tmp);
auto ReadNextTok = [&]() {
LexUnexpandedToken(Tmp);
if (DiscardedToks && Tmp.isNot(tok::eod))
DiscardedToks->push_back(Tmp);
};
ReadNextTok();
Res.setBegin(Tmp.getLocation());
while (Tmp.isNot(tok::eod)) {
assert(Tmp.isNot(tok::eof) && "EOF seen while discarding directive tokens");
LexUnexpandedToken(Tmp);
ReadNextTok();
}
Res.setEnd(Tmp.getLocation());
return Res;
@ -456,21 +462,27 @@ void Preprocessor::ReadMacroName(Token &MacroNameTok, MacroUse isDefineUndef,
/// true, then we consider macros that expand to zero tokens as being ok.
///
/// Returns the location of the end of the directive.
SourceLocation Preprocessor::CheckEndOfDirective(const char *DirType,
bool EnableMacros) {
SourceLocation
Preprocessor::CheckEndOfDirective(StringRef DirType, bool EnableMacros,
SmallVectorImpl<Token> *ExtraToks) {
Token Tmp;
auto ReadNextTok = [this, ExtraToks, &Tmp](auto &&LexFn) {
std::invoke(LexFn, this, Tmp);
if (ExtraToks && Tmp.isNot(tok::eod))
ExtraToks->push_back(Tmp);
};
// Lex unexpanded tokens for most directives: macros might expand to zero
// tokens, causing us to miss diagnosing invalid lines. Some directives (like
// #line) allow empty macros.
if (EnableMacros)
Lex(Tmp);
ReadNextTok(&Preprocessor::Lex);
else
LexUnexpandedToken(Tmp);
ReadNextTok(&Preprocessor::LexUnexpandedToken);
// There should be no tokens after the directive, but we allow them as an
// extension.
while (Tmp.is(tok::comment)) // Skip comments in -C mode.
LexUnexpandedToken(Tmp);
ReadNextTok(&Preprocessor::LexUnexpandedToken);
if (Tmp.is(tok::eod))
return Tmp.getLocation();
@ -483,8 +495,15 @@ SourceLocation Preprocessor::CheckEndOfDirective(const char *DirType,
if ((LangOpts.GNUMode || LangOpts.C99 || LangOpts.CPlusPlus) &&
!CurTokenLexer)
Hint = FixItHint::CreateInsertion(Tmp.getLocation(),"//");
Diag(Tmp, diag::ext_pp_extra_tokens_at_eol) << DirType << Hint;
return DiscardUntilEndOfDirective().getEnd();
unsigned DiagID = diag::ext_pp_extra_tokens_at_eol;
// C++20 import or module directive has no '#' prefix.
if (getLangOpts().CPlusPlusModules &&
(DirType == "import" || DirType == "module"))
DiagID = diag::warn_pp_extra_tokens_at_module_directive_eol;
Diag(Tmp, DiagID) << DirType << Hint;
return DiscardUntilEndOfDirective(ExtraToks).getEnd();
}
void Preprocessor::SuggestTypoedDirective(const Token &Tok,
@ -610,6 +629,57 @@ void Preprocessor::SkipExcludedConditionalBlock(SourceLocation HashTokenLoc,
continue;
}
// There is actually no "skipped block" in the above because the module
// directive is not a text-line (https://wg21.link/cpp.pre#2) nor
// anything else that is allowed in a group
// (https://eel.is/c++draft/cpp.pre#nt:group-part).
//
// A preprocessor diagnostic (effective with -E) that triggers whenever
// a module directive is encountered where a control-line or a text-line
// is required.
if (getLangOpts().CPlusPlusModules && Tok.isAtStartOfLine() &&
Tok.is(tok::raw_identifier) &&
(Tok.getRawIdentifier() == "export" ||
Tok.getRawIdentifier() == "module")) {
llvm::SaveAndRestore ModuleDirectiveSkipping(
LastTokenWasExportKeyword);
LastTokenWasExportKeyword.reset();
LookUpIdentifierInfo(Tok);
IdentifierInfo *II = Tok.getIdentifierInfo();
if (II->getName()[0] == 'e') { // export
HandleModuleContextualKeyword(Tok, Tok.isAtStartOfLine());
CurLexer->Lex(Tok);
if (Tok.is(tok::raw_identifier)) {
LookUpIdentifierInfo(Tok);
II = Tok.getIdentifierInfo();
}
}
if (II->getName()[0] == 'm') { // module
// HandleModuleContextualKeyword changes the lexer state, so we need
// to save and restore LexingRawMode.
llvm::SaveAndRestore RestoreLexingRawMode(CurPPLexer->LexingRawMode,
false);
if (HandleModuleContextualKeyword(Tok, Tok.isAtStartOfLine())) {
// We just recognized a module directive at the start of a line, so
// we're in directive mode. Tell the lexer this so any newlines we see
// will be converted into an EOD token (this terminates the
// directive).
CurPPLexer->ParsingPreprocessorDirective = true;
SourceLocation StartLoc = Tok.getLocation();
SourceLocation End = DiscardUntilEndOfDirective().getEnd();
Diag(StartLoc, diag::err_pp_cond_span_module_decl)
<< SourceRange(StartLoc, End);
CurPPLexer->ParsingPreprocessorDirective = false;
// Restore comment saving mode.
if (CurLexer)
CurLexer->resetExtendedTokenMode();
continue;
}
}
}
// If this is the end of the buffer, we have an error.
if (Tok.is(tok::eof)) {
// We don't emit errors for unterminated conditionals here,
@ -1259,12 +1329,14 @@ void Preprocessor::HandleDirective(Token &Result) {
// pp-directive.
bool ReadAnyTokensBeforeDirective = CurPPLexer->MIOpt.getHasReadAnyTokensVal();
// Save the '#' token in case we need to return it later.
Token SavedHash = Result;
// Save the directive-introducing token ('#', or import/module in C++20) in
// case we need to return it later.
Token Introducer = Result;
// Read the next token, the directive flavor. This isn't expanded due to
// C99 6.10.3p8.
LexUnexpandedToken(Result);
if (Introducer.is(tok::hash))
LexUnexpandedToken(Result);
// C99 6.10.3p11: Is this preprocessor directive in macro invocation? e.g.:
// #define A(x) #x
@ -1283,7 +1355,14 @@ void Preprocessor::HandleDirective(Token &Result) {
case tok::pp___include_macros:
case tok::pp_pragma:
case tok::pp_embed:
Diag(Result, diag::err_embedded_directive) << II->getName();
case tok::pp_module:
case tok::pp___preprocessed_module:
case tok::pp___preprocessed_import:
Diag(Result, diag::err_embedded_directive)
<< (getLangOpts().CPlusPlusModules &&
Introducer.isModuleContextualKeyword(
/*AllowExport=*/false))
<< II->getName();
Diag(*ArgMacro, diag::note_macro_expansion_here)
<< ArgMacro->getIdentifierInfo();
DiscardUntilEndOfDirective();
@ -1300,7 +1379,8 @@ void Preprocessor::HandleDirective(Token &Result) {
ResetMacroExpansionHelper helper(this);
if (SkippingUntilPCHThroughHeader || SkippingUntilPragmaHdrStop)
return HandleSkippedDirectiveWhileUsingPCH(Result, SavedHash.getLocation());
return HandleSkippedDirectiveWhileUsingPCH(Result,
Introducer.getLocation());
switch (Result.getKind()) {
case tok::eod:
@ -1320,7 +1400,7 @@ void Preprocessor::HandleDirective(Token &Result) {
// directive. However do permit it in the predefines file, as we use line
// markers to mark the builtin macros as being in a system header.
if (getLangOpts().AsmPreprocessor &&
SourceMgr.getFileID(SavedHash.getLocation()) != getPredefinesFileID())
SourceMgr.getFileID(Introducer.getLocation()) != getPredefinesFileID())
break;
return HandleDigitDirective(Result);
default:
@ -1332,30 +1412,32 @@ void Preprocessor::HandleDirective(Token &Result) {
default: break;
// C99 6.10.1 - Conditional Inclusion.
case tok::pp_if:
return HandleIfDirective(Result, SavedHash, ReadAnyTokensBeforeDirective);
return HandleIfDirective(Result, Introducer,
ReadAnyTokensBeforeDirective);
case tok::pp_ifdef:
return HandleIfdefDirective(Result, SavedHash, false,
return HandleIfdefDirective(Result, Introducer, false,
true /*not valid for miopt*/);
case tok::pp_ifndef:
return HandleIfdefDirective(Result, SavedHash, true,
return HandleIfdefDirective(Result, Introducer, true,
ReadAnyTokensBeforeDirective);
case tok::pp_elif:
case tok::pp_elifdef:
case tok::pp_elifndef:
return HandleElifFamilyDirective(Result, SavedHash, II->getPPKeywordID());
return HandleElifFamilyDirective(Result, Introducer,
II->getPPKeywordID());
case tok::pp_else:
return HandleElseDirective(Result, SavedHash);
return HandleElseDirective(Result, Introducer);
case tok::pp_endif:
return HandleEndifDirective(Result);
// C99 6.10.2 - Source File Inclusion.
case tok::pp_include:
// Handle #include.
return HandleIncludeDirective(SavedHash.getLocation(), Result);
return HandleIncludeDirective(Introducer.getLocation(), Result);
case tok::pp___include_macros:
// Handle -imacros.
return HandleIncludeMacrosDirective(SavedHash.getLocation(), Result);
return HandleIncludeMacrosDirective(Introducer.getLocation(), Result);
// C99 6.10.3 - Macro Replacement.
case tok::pp_define:
@ -1373,13 +1455,21 @@ void Preprocessor::HandleDirective(Token &Result) {
// C99 6.10.6 - Pragma Directive.
case tok::pp_pragma:
return HandlePragmaDirective({PIK_HashPragma, SavedHash.getLocation()});
return HandlePragmaDirective({PIK_HashPragma, Introducer.getLocation()});
case tok::pp_module:
case tok::pp___preprocessed_module:
return HandleCXXModuleDirective(Result);
case tok::pp___preprocessed_import:
return HandleCXXImportDirective(Result);
// GNU Extensions.
case tok::pp_import:
return HandleImportDirective(SavedHash.getLocation(), Result);
if (getLangOpts().CPlusPlusModules &&
Introducer.isModuleContextualKeyword(
/*AllowExport=*/false))
return HandleCXXImportDirective(Result);
return HandleImportDirective(Introducer.getLocation(), Result);
case tok::pp_include_next:
return HandleIncludeNextDirective(SavedHash.getLocation(), Result);
return HandleIncludeNextDirective(Introducer.getLocation(), Result);
case tok::pp_warning:
if (LangOpts.CPlusPlus)
@ -1400,8 +1490,8 @@ void Preprocessor::HandleDirective(Token &Result) {
case tok::pp_embed: {
if (PreprocessorLexer *CurrentFileLexer = getCurrentFileLexer())
if (OptionalFileEntryRef FERef = CurrentFileLexer->getFileEntry())
return HandleEmbedDirective(SavedHash.getLocation(), Result, *FERef);
return HandleEmbedDirective(SavedHash.getLocation(), Result, nullptr);
return HandleEmbedDirective(Introducer.getLocation(), Result, *FERef);
return HandleEmbedDirective(Introducer.getLocation(), Result, nullptr);
}
case tok::pp_assert:
//isExtension = true; // FIXME: implement #assert
@ -1430,7 +1520,7 @@ void Preprocessor::HandleDirective(Token &Result) {
if (getLangOpts().AsmPreprocessor) {
auto Toks = std::make_unique<Token[]>(2);
// Return the # and the token after it.
Toks[0] = SavedHash;
Toks[0] = Introducer;
Toks[1] = Result;
// If the second token is a hashhash token, then we need to translate it to
@ -4095,3 +4185,323 @@ void Preprocessor::HandleEmbedDirective(SourceLocation HashLoc, Token &EmbedTok,
StringRef(static_cast<char *>(Mem), OriginalFilename.size());
HandleEmbedDirectiveImpl(HashLoc, *Params, BinaryContents, FilenameToGo);
}
/// HandleCXXImportDirective - Handle the C++ modules import directives
///
/// pp-import:
/// export[opt] import header-name pp-tokens[opt] ; new-line
/// export[opt] import header-name-tokens pp-tokens[opt] ; new-line
/// export[opt] import pp-tokens ; new-line
///
/// An imported header name is replaced by an annot_header_unit token, and the
/// lexed module name is replaced by an annot_module_name token.
void Preprocessor::HandleCXXImportDirective(Token ImportTok) {
assert(getLangOpts().CPlusPlusModules && ImportTok.is(tok::kw_import));
llvm::SaveAndRestore<bool> SaveImportingCXXModules(
this->ImportingCXXNamedModules, true);
if (LastTokenWasExportKeyword.isValid())
LastTokenWasExportKeyword.reset();
Token Tok;
if (LexHeaderName(Tok)) {
if (Tok.isNot(tok::eod))
CheckEndOfDirective(ImportTok.getIdentifierInfo()->getName());
return;
}
SourceLocation UseLoc = ImportTok.getLocation();
SmallVector<Token, 4> DirToks{ImportTok};
SmallVector<IdentifierLoc, 2> Path;
bool ImportingHeader = false;
bool IsPartition = false;
std::string FlatName;
switch (Tok.getKind()) {
case tok::header_name:
ImportingHeader = true;
DirToks.push_back(Tok);
Lex(DirToks.emplace_back());
break;
case tok::colon:
IsPartition = true;
DirToks.push_back(Tok);
UseLoc = Tok.getLocation();
Lex(Tok);
[[fallthrough]];
case tok::identifier: {
bool LeadingSpace = Tok.hasLeadingSpace();
unsigned NumToksInDirective = DirToks.size();
if (LexModuleNameContinue(Tok, UseLoc, DirToks, Path)) {
if (Tok.isNot(tok::eod))
CheckEndOfDirective(ImportTok.getIdentifierInfo()->getName(),
/*EnableMacros=*/false, &DirToks);
EnterModuleSuffixTokenStream(DirToks);
return;
}
// Drop the module-name tokens and replace them with a single
// annot_module_name token.
DirToks.resize(NumToksInDirective);
ModuleNameLoc *NameLoc = ModuleNameLoc::Create(*this, Path);
DirToks.emplace_back();
DirToks.back().setKind(tok::annot_module_name);
DirToks.back().setAnnotationRange(NameLoc->getRange());
DirToks.back().setAnnotationValue(static_cast<void *>(NameLoc));
DirToks.back().setFlagValue(Token::LeadingSpace, LeadingSpace);
DirToks.push_back(Tok);
bool IsValid =
(IsPartition && ModuleDeclState.isNamedModule()) || !IsPartition;
if (Callbacks && IsValid) {
if (IsPartition && ModuleDeclState.isNamedModule()) {
FlatName += ModuleDeclState.getPrimaryName();
FlatName += ":";
}
FlatName += ModuleLoader::getFlatNameFromPath(Path);
SourceLocation StartLoc = IsPartition ? UseLoc : Path[0].getLoc();
IdentifierLoc FlatNameLoc(StartLoc, getIdentifierInfo(FlatName));
// We don't/shouldn't load the standard C++20 modules when preprocessing,
// so the imported module is nullptr.
Callbacks->moduleImport(ImportTok.getLocation(),
ModuleIdPath(FlatNameLoc),
/*Imported=*/nullptr);
}
break;
}
default:
DirToks.push_back(Tok);
break;
}
// Consume the pp-import-suffix and expand any macros in it now, if we're not
// at the semicolon already.
if (!DirToks.back().isOneOf(tok::semi, tok::eod))
CollectPPImportSuffix(DirToks);
if (DirToks.back().isNot(tok::eod))
CheckEndOfDirective(ImportTok.getIdentifierInfo()->getName());
else
DirToks.pop_back();
// This is not a pp-import after all.
if (DirToks.back().isNot(tok::semi)) {
EnterModuleSuffixTokenStream(DirToks);
return;
}
if (ImportingHeader) {
// C++2a [cpp.module]p1:
// The ';' preprocessing-token terminating a pp-import shall not have
// been produced by macro replacement.
SourceLocation SemiLoc = DirToks.back().getLocation();
if (SemiLoc.isMacroID())
Diag(SemiLoc, diag::err_header_import_semi_in_macro);
auto Action = HandleHeaderIncludeOrImport(
/*HashLoc*/ SourceLocation(), ImportTok, Tok, SemiLoc);
switch (Action.Kind) {
case ImportAction::None:
break;
case ImportAction::ModuleBegin:
// Let the parser know we're textually entering the module.
DirToks.emplace_back();
DirToks.back().startToken();
DirToks.back().setKind(tok::annot_module_begin);
DirToks.back().setLocation(SemiLoc);
DirToks.back().setAnnotationEndLoc(SemiLoc);
DirToks.back().setAnnotationValue(Action.ModuleForHeader);
[[fallthrough]];
case ImportAction::ModuleImport:
case ImportAction::HeaderUnitImport:
case ImportAction::SkippedModuleImport:
// We chose to import (or textually enter) the file. Convert the
// header-name token into a header unit annotation token.
DirToks[1].setKind(tok::annot_header_unit);
DirToks[1].setAnnotationEndLoc(DirToks[0].getLocation());
DirToks[1].setAnnotationValue(Action.ModuleForHeader);
// FIXME: Call the moduleImport callback?
break;
case ImportAction::Failure:
assert(TheModuleLoader.HadFatalFailure &&
"This should be an early exit only to a fatal error");
CurLexer->cutOffLexing();
return;
}
}
EnterModuleSuffixTokenStream(DirToks);
}
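The flat name passed to the `moduleImport` callback in `HandleCXXImportDirective` can be sketched in isolation as follows. This is an illustration under assumptions, not Clang's API: `flatNameFromPath` is a hypothetical stand-in for `ModuleLoader::getFlatNameFromPath` (assumed here to join the path components with `.`), and `importFlatName` is a hypothetical helper that mirrors the `IsValid` and `FlatName` logic above.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in: joins module path components with '.'.
std::string flatNameFromPath(const std::vector<std::string> &Path) {
  std::string Name;
  for (std::size_t I = 0; I < Path.size(); ++I) {
    if (I != 0)
      Name += '.';
    Name += Path[I];
  }
  return Name;
}

// Returns the flat name reported for an import, or std::nullopt when the
// import is a partition import outside of a named module (in that case the
// handler above does not invoke the callback).
std::optional<std::string>
importFlatName(const std::vector<std::string> &Path, bool IsPartition,
               const std::optional<std::string> &CurrentPrimaryName) {
  if (IsPartition && !CurrentPrimaryName)
    return std::nullopt;
  std::string FlatName;
  if (IsPartition) {
    // `import :part;` inside module M is reported as "M:part".
    FlatName += *CurrentPrimaryName;
    FlatName += ':';
  }
  FlatName += flatNameFromPath(Path);
  return FlatName;
}
```

With these definitions, `import a.b;` yields `a.b`, `import :part;` inside module `M` yields `M:part`, and `import :part;` outside any named module yields no name.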
/// HandleCXXModuleDirective - Handle C++ module declaration directives.
///
/// pp-module:
/// export[opt] module pp-tokens[opt] ; new-line
///
/// pp-module-name:
/// pp-module-name-qualifier[opt] identifier
/// pp-module-partition:
/// : pp-module-name-qualifier[opt] identifier
/// pp-module-name-qualifier:
/// identifier .
/// pp-module-name-qualifier identifier .
///
/// global-module-fragment:
/// module-keyword ; declaration-seq[opt]
///
/// private-module-fragment:
/// module-keyword : private ; declaration-seq[opt]
///
/// The lexed module name is replaced by an annot_module_name token.
void Preprocessor::HandleCXXModuleDirective(Token ModuleTok) {
assert(getLangOpts().CPlusPlusModules && ModuleTok.is(tok::kw_module));
Token Introducer = ModuleTok;
if (LastTokenWasExportKeyword.isValid()) {
Introducer = LastTokenWasExportKeyword.getExportTok();
LastTokenWasExportKeyword.reset();
}
SourceLocation StartLoc = Introducer.getLocation();
Token Tok;
SourceLocation UseLoc = ModuleTok.getLocation();
SmallVector<Token, 4> DirToks{ModuleTok};
SmallVector<IdentifierLoc, 2> Path, Partition;
LexUnexpandedToken(Tok);
switch (Tok.getKind()) {
// Global Module Fragment.
case tok::semi:
DirToks.push_back(Tok);
break;
case tok::colon:
DirToks.push_back(Tok);
LexUnexpandedToken(Tok);
if (Tok.isNot(tok::kw_private)) {
if (Tok.isNot(tok::eod))
CheckEndOfDirective(ModuleTok.getIdentifierInfo()->getName(),
/*EnableMacros=*/false, &DirToks);
EnterModuleSuffixTokenStream(DirToks);
return;
}
DirToks.push_back(Tok);
break;
case tok::identifier: {
bool LeadingSpace = Tok.hasLeadingSpace();
unsigned NumToksInDirective = DirToks.size();
// C++ [cpp.module]p3: Any preprocessing tokens after the module
// preprocessing token in the module directive are processed just as in
// normal text.
//
// P3034R1 Module Declarations Shouldn't be Macros.
if (LexModuleNameContinue(Tok, UseLoc, DirToks, Path,
/*AllowMacroExpansion=*/false)) {
if (Tok.isNot(tok::eod))
CheckEndOfDirective(ModuleTok.getIdentifierInfo()->getName(),
/*EnableMacros=*/false, &DirToks);
EnterModuleSuffixTokenStream(DirToks);
return;
}
ModuleNameLoc *NameLoc = ModuleNameLoc::Create(*this, Path);
DirToks.resize(NumToksInDirective);
DirToks.emplace_back();
DirToks.back().setKind(tok::annot_module_name);
DirToks.back().setAnnotationRange(NameLoc->getRange());
DirToks.back().setAnnotationValue(static_cast<void *>(NameLoc));
DirToks.back().setFlagValue(Token::LeadingSpace, LeadingSpace);
DirToks.push_back(Tok);
// C++20 [cpp.module]p
// The pp-tokens, if any, of a pp-module shall be of the form:
// pp-module-name pp-module-partition[opt] pp-tokens[opt]
if (Tok.is(tok::colon)) {
NumToksInDirective = DirToks.size();
LexUnexpandedToken(Tok);
LeadingSpace = Tok.hasLeadingSpace();
if (LexModuleNameContinue(Tok, UseLoc, DirToks, Partition,
/*AllowMacroExpansion=*/false,
/*IsPartition=*/true)) {
if (Tok.isNot(tok::eod))
CheckEndOfDirective(ModuleTok.getIdentifierInfo()->getName(),
/*EnableMacros=*/false, &DirToks);
EnterModuleSuffixTokenStream(DirToks);
return;
}
ModuleNameLoc *PartitionLoc = ModuleNameLoc::Create(*this, Partition);
DirToks.resize(NumToksInDirective);
DirToks.emplace_back();
DirToks.back().setKind(tok::annot_module_name);
DirToks.back().setAnnotationRange(PartitionLoc->getRange());
DirToks.back().setAnnotationValue(static_cast<void *>(PartitionLoc));
DirToks.back().setFlagValue(Token::LeadingSpace, LeadingSpace);
DirToks.push_back(Tok);
}
// If the current token names a macro, put it back into the token stream
// so that it is expanded later.
//
// export module M ATTR(some_attr); // -D'ATTR(x)=[[x]]'
//
// Current token is `ATTR`.
if (Tok.is(tok::identifier) &&
getMacroDefinition(Tok.getIdentifierInfo())) {
std::unique_ptr<Token[]> TokCopy = std::make_unique<Token[]>(1);
TokCopy[0] = Tok;
EnterTokenStream(std::move(TokCopy), /*NumToks=*/1,
/*DisableMacroExpansion=*/false, /*IsReinject=*/false);
Lex(Tok);
DirToks.back() = Tok;
}
break;
}
default:
DirToks.push_back(Tok);
break;
}
// Consume the pp-import-suffix and expand any macros in it now, if we're not
// at the semicolon already.
SourceLocation End = DirToks.back().getLocation();
std::optional<Token> NextPPTok = DirToks.back();
if (DirToks.back().is(tok::eod)) {
NextPPTok = peekNextPPToken();
if (NextPPTok && NextPPTok->is(tok::raw_identifier))
LookUpIdentifierInfo(*NextPPTok);
}
// Only ';' and '[' are allowed after a module name.
// We also allow 'private', because in that case the previous token is not a
// module name.
if (!NextPPTok->isOneOf(tok::semi, tok::eod, tok::l_square, tok::kw_private))
Diag(*NextPPTok, diag::err_pp_unexpected_tok_after_module_name)
<< getSpelling(*NextPPTok);
if (!DirToks.back().isOneOf(tok::semi, tok::eod)) {
// Consume the pp-import-suffix and expand any macros in it now. We'll add
// it back into the token stream later.
CollectPPImportSuffix(DirToks);
End = DirToks.back().getLocation();
}
if (DirToks.back().isNot(tok::eod))
End = CheckEndOfDirective(ModuleTok.getIdentifierInfo()->getName(),
/*EnableMacros=*/false, &DirToks);
else
End = DirToks.pop_back_val().getLocation();
if (!IncludeMacroStack.empty()) {
Diag(StartLoc, diag::err_pp_module_decl_in_header)
<< SourceRange(StartLoc, End);
}
if (CurPPLexer->getConditionalStackDepth() != 0) {
Diag(StartLoc, diag::err_pp_cond_span_module_decl)
<< SourceRange(StartLoc, End);
}
EnterModuleSuffixTokenStream(DirToks);
}

@ -441,7 +441,7 @@ bool Preprocessor::HandleEndOfFile(Token &Result, bool isEndOfMacro) {
assert(CurLexer && "Got EOF but no current lexer set!");
Result.startToken();
CurLexer->FormTokenWithChars(Result, CurLexer->BufferEnd, tok::eof);
CurLexer.reset();
PendingDestroyLexers.push_back(std::move(CurLexer));
CurPPLexer = nullptr;
recomputeCurLexerKind();
@ -558,9 +558,17 @@ bool Preprocessor::HandleEndOfFile(Token &Result, bool isEndOfMacro) {
<< PPOpts.PCHThroughHeader << 0;
}
if (!isIncrementalProcessingEnabled())
// We're done with lexing.
CurLexer.reset();
if (!isIncrementalProcessingEnabled()) {
// We're done with lexing. If we're inside a nested Lex call (LexLevel > 0),
// defer destruction of the lexer until Lex returns to avoid use-after-free
// when HandleEndOfFile is called from within Lexer methods that still need
// to access their members after this function returns.
if (LexLevel > 0 && CurLexer) {
PendingDestroyLexers.push_back(std::move(CurLexer));
} else {
CurLexer.reset();
}
}
if (!isIncrementalProcessingEnabled())
CurPPLexer = nullptr;
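
The deferred-destruction change in the hunk above can be illustrated outside of Clang. The following is a minimal sketch, not the patch's code: `ToyLexer`, `ToyPreprocessor`, and `LiveToyLexers` are hypothetical stand-ins, and only the `LexLevel` / `PendingDestroyLexers` idea is taken from the diff. It shows an owner that parks its current lexer in a pending list when end-of-file is reached inside a nested call, and only frees it once the outermost call returns.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical counter of live lexer objects, used only to observe lifetimes.
static int LiveToyLexers = 0;

// Hypothetical stand-in for a lexer.
struct ToyLexer {
  ToyLexer() { ++LiveToyLexers; }
  ~ToyLexer() { --LiveToyLexers; }
};

// Hypothetical owner mirroring the LexLevel / PendingDestroyLexers idea.
class ToyPreprocessor {
  std::unique_ptr<ToyLexer> Cur = std::make_unique<ToyLexer>();
  std::vector<std::unique_ptr<ToyLexer>> Pending;
  unsigned LexLevel = 0;

  // End-of-file handling: defer destruction while a lex() call is active.
  void handleEndOfFile() {
    if (LexLevel > 0 && Cur)
      Pending.push_back(std::move(Cur)); // still reachable by callers up the stack
    else
      Cur.reset();
  }

public:
  // Simulates one top-level lex call that runs into end of file. Returns the
  // number of live lexers observed after EOF handling but before returning.
  int lex() {
    ++LexLevel;
    handleEndOfFile();
    assert(LexLevel > 0 && "EOF handling must not change the nesting level");
    int LiveInsideCall = LiveToyLexers;
    --LexLevel;
    // Only the outermost call releases the deferred lexers.
    if (LexLevel == 0 && !Pending.empty())
      Pending.clear();
    return LiveInsideCall;
  }
};
```

In this sketch the lexer object is still alive while `lex()` is running and is destroyed only after the nesting level drops back to zero, which is the property the patch relies on to avoid a use-after-free.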

@ -35,6 +35,7 @@
#include "clang/Basic/SourceManager.h"
#include "clang/Basic/TargetInfo.h"
#include "clang/Lex/CodeCompletionHandler.h"
#include "clang/Lex/DependencyDirectivesScanner.h"
#include "clang/Lex/ExternalPreprocessorSource.h"
#include "clang/Lex/HeaderSearch.h"
#include "clang/Lex/LexDiagnostic.h"
@ -55,11 +56,14 @@
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Capacity.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/MemoryBufferRef.h"
#include "llvm/Support/SaveAndRestore.h"
#include "llvm/Support/raw_ostream.h"
#include <algorithm>
#include <cassert>
@ -115,6 +119,8 @@ Preprocessor::Preprocessor(const PreprocessorOptions &PPOpts,
// We haven't read anything from the external source.
ReadMacrosFromExternalSource = false;
LastTokenWasExportKeyword.reset();
BuiltinInfo = std::make_unique<Builtin::Context>();
// "Poison" __VA_ARGS__, __VA_OPT__ which can only appear in the expansion of
@ -576,6 +582,11 @@ void Preprocessor::EnterMainSourceFile() {
// export module M; // error: module declaration must occur
// // at the start of the translation unit.
if (getLangOpts().CPlusPlusModules) {
std::optional<StringRef> Input =
getSourceManager().getBufferDataOrNone(MainFileID);
if (!isPreprocessedModuleFile() && Input)
MainFileIsPreprocessedModuleFile =
clang::isPreprocessedModuleFile(*Input);
auto Tracer = std::make_unique<NoTrivialPPDirectiveTracer>(*this);
DirTracer = Tracer.get();
addPPCallbacks(std::move(Tracer));
@ -875,15 +886,13 @@ bool Preprocessor::HandleIdentifier(Token &Identifier) {
// used in contexts where import declarations are disallowed.
//
// Likewise if this is the standard C++ import keyword.
if (((LastTokenWasAt && II.isModulesImport()) ||
if (((LastTokenWasAt && II.isImportKeyword()) ||
Identifier.is(tok::kw_import)) &&
!InMacroArgs && !DisableMacroExpansion &&
(getLangOpts().Modules || getLangOpts().DebuggerSupport) &&
!InMacroArgs &&
(!DisableMacroExpansion || MacroExpansionInDirectivesOverride) &&
CurLexerCallback != CLK_CachingLexer) {
ModuleImportLoc = Identifier.getLocation();
NamedModuleImportPath.clear();
IsAtImport = true;
ModuleImportExpectsIdentifier = true;
CurLexerCallback = CLK_LexAfterModuleImport;
}
return true;
@ -932,6 +941,7 @@ void Preprocessor::Lex(Token &Result) {
// This token is injected to represent the translation of '#include "a.h"'
// into "import a.h;". Mimic the notional ';'.
case tok::annot_module_include:
case tok::annot_repl_input_end:
case tok::semi:
TrackGMFState.handleSemi();
StdCXXImportSeqState.handleSemi();
@ -951,35 +961,23 @@ void Preprocessor::Lex(Token &Result) {
case tok::colon:
ModuleDeclState.handleColon();
break;
case tok::period:
ModuleDeclState.handlePeriod();
break;
case tok::eod:
break;
case tok::identifier:
// Check "import" and "module" when there is no open bracket. The two
// identifiers are not meaningful with open brackets.
case tok::kw_import:
if (StdCXXImportSeqState.atTopLevel()) {
if (Result.getIdentifierInfo()->isModulesImport()) {
TrackGMFState.handleImport(StdCXXImportSeqState.afterTopLevelSeq());
StdCXXImportSeqState.handleImport();
if (StdCXXImportSeqState.afterImportSeq()) {
ModuleImportLoc = Result.getLocation();
NamedModuleImportPath.clear();
IsAtImport = false;
ModuleImportExpectsIdentifier = true;
CurLexerCallback = CLK_LexAfterModuleImport;
}
break;
} else if (Result.getIdentifierInfo() == getIdentifierInfo("module")) {
if (hasSeenNoTrivialPPDirective())
Result.setFlag(Token::HasSeenNoTrivialPPDirective);
TrackGMFState.handleModule(StdCXXImportSeqState.afterTopLevelSeq());
ModuleDeclState.handleModule();
break;
}
TrackGMFState.handleImport(StdCXXImportSeqState.afterTopLevelSeq());
StdCXXImportSeqState.handleImport();
}
ModuleDeclState.handleIdentifier(Result.getIdentifierInfo());
break;
case tok::kw_module:
if (StdCXXImportSeqState.atTopLevel()) {
if (hasSeenNoTrivialPPDirective())
Result.setFlag(Token::HasSeenNoTrivialPPDirective);
TrackGMFState.handleModule(StdCXXImportSeqState.afterTopLevelSeq());
ModuleDeclState.handleModule();
}
break;
case tok::annot_module_name:
ModuleDeclState.handleModuleName(
static_cast<ModuleNameLoc *>(Result.getAnnotationValue()));
if (ModuleDeclState.isModuleCandidate())
break;
[[fallthrough]];
@ -997,8 +995,17 @@ void Preprocessor::Lex(Token &Result) {
}
LastTokenWasAt = Result.is(tok::at);
if (Result.isNot(tok::kw_export))
LastTokenWasExportKeyword.reset();
--LexLevel;
// Destroy any lexers that were deferred while we were in nested Lex calls.
// This must happen after decrementing LexLevel but before any other
// processing that might re-enter Lex.
if (LexLevel == 0 && !PendingDestroyLexers.empty())
PendingDestroyLexers.clear();
if ((LexLevel == 0 || PreprocessToken) &&
!Result.getFlag(Token::IsReinjected)) {
if (LexLevel == 0)
@ -1119,41 +1126,247 @@ bool Preprocessor::LexHeaderName(Token &FilenameTok, bool AllowMacroExpansion) {
return false;
}
std::optional<Token> Preprocessor::peekNextPPToken() const {
// Do some quick tests for rejection cases.
std::optional<Token> Val;
if (CurLexer)
Val = CurLexer->peekNextPPToken();
else
Val = CurTokenLexer->peekNextPPToken();
if (!Val) {
// We have run off the end. If it's a source file we don't
// examine enclosing ones (C99 5.1.1.2p4). Otherwise walk up the
// macro stack.
if (CurPPLexer)
return std::nullopt;
for (const IncludeStackInfo &Entry : llvm::reverse(IncludeMacroStack)) {
if (Entry.TheLexer)
Val = Entry.TheLexer->peekNextPPToken();
else
Val = Entry.TheTokenLexer->peekNextPPToken();
if (Val)
break;
// Ran off the end of a source file?
if (Entry.ThePPLexer)
return std::nullopt;
}
}
// Return the token we found, or std::nullopt if we reached the end of the
// translation unit.
return Val;
}
// We represent the primary and partition names as 'Paths' which are sections
// of the hierarchical access path for a clang module. However for C++20
// the periods in a name are just another character, and we will need to
// flatten them into a string.
std::string ModuleLoader::getFlatNameFromPath(ModuleIdPath Path) {
std::string Name;
if (Path.empty())
return Name;
for (auto &Piece : Path) {
assert(Piece.getIdentifierInfo() && Piece.getLoc().isValid());
if (!Name.empty())
Name += ".";
Name += Piece.getIdentifierInfo()->getName();
}
return Name;
}
ModuleNameLoc *ModuleNameLoc::Create(Preprocessor &PP, ModuleIdPath Path) {
assert(!Path.empty() && "expect at least one identifier in a module name");
void *Mem = PP.getPreprocessorAllocator().Allocate(
totalSizeToAlloc<IdentifierLoc>(Path.size()), alignof(ModuleNameLoc));
return new (Mem) ModuleNameLoc(Path);
}
bool Preprocessor::LexModuleNameContinue(Token &Tok, SourceLocation UseLoc,
SmallVectorImpl<Token> &Suffix,
SmallVectorImpl<IdentifierLoc> &Path,
bool AllowMacroExpansion,
bool IsPartition) {
auto ConsumeToken = [&]() {
if (AllowMacroExpansion)
Lex(Tok);
else
LexUnexpandedToken(Tok);
Suffix.push_back(Tok);
};
while (true) {
if (Tok.isNot(tok::identifier)) {
if (Tok.is(tok::code_completion)) {
CurLexer->cutOffLexing();
CodeComplete->CodeCompleteModuleImport(UseLoc, Path);
return true;
}
Diag(Tok, diag::err_pp_module_expected_ident) << Path.empty();
return true;
}
// [cpp.pre]/p2:
// No identifier in the pp-module-name or pp-module-partition shall
// currently be defined as an object-like macro.
if (MacroInfo *MI = getMacroInfo(Tok.getIdentifierInfo());
MI && MI->isObjectLike() && getLangOpts().CPlusPlus20 &&
!AllowMacroExpansion) {
Diag(Tok, diag::err_pp_module_name_is_macro)
<< IsPartition << Tok.getIdentifierInfo();
Diag(MI->getDefinitionLoc(), diag::note_macro_here)
<< Tok.getIdentifierInfo();
}
// Record this part of the module path.
Path.emplace_back(Tok.getLocation(), Tok.getIdentifierInfo());
ConsumeToken();
if (Tok.isNot(tok::period))
return false;
ConsumeToken();
}
}
/// [cpp.pre]/p2:
/// A preprocessing directive consists of a sequence of preprocessing tokens
/// that satisfies the following constraints: At the start of translation phase
/// 4, the first preprocessing token in the sequence, referred to as a
/// directive-introducing token, begins with the first character in the source
/// file (optionally after whitespace containing no new-line characters) or
/// follows whitespace containing at least one new-line character, and is:
/// - a # preprocessing token, or
/// - an import preprocessing token immediately followed on the same logical
/// source line by a header-name, <, identifier, or : preprocessing token, or
/// - a module preprocessing token immediately followed on the same logical
/// source line by an identifier, :, or ; preprocessing token, or
/// - an export preprocessing token immediately followed on the same logical
/// source line by one of the two preceding forms.
///
///
/// At the start of phase 4 an import or module token is treated as starting a
/// directive and is converted to its respective keyword iff:
/// - After skipping horizontal whitespace, it is
/// - at the start of a logical line, or
/// - preceded by an 'export' at the start of the logical line.
/// - It is followed by an identifier pp token (before macro expansion), or
/// - <, ", or : (but not ::) pp tokens for 'import', or
/// - ; for 'module'
/// Otherwise the token is treated as an identifier.
bool Preprocessor::HandleModuleContextualKeyword(
Token &Result, bool TokAtPhysicalStartOfLine) {
if (!getLangOpts().CPlusPlusModules || !Result.isModuleContextualKeyword())
return false;
if (Result.is(tok::kw_export)) {
LastTokenWasExportKeyword = {Result, TokAtPhysicalStartOfLine};
return false;
}
/// Treat 'module' and 'import' as identifiers when the main file is a
/// preprocessed module file. We only allow '__preprocessed_module' and
/// '__preprocessed_import' in this context.
IdentifierInfo *II = Result.getIdentifierInfo();
if (isPreprocessedModuleFile() &&
(II->isStr(tok::getKeywordSpelling(tok::kw_import)) ||
II->isStr(tok::getKeywordSpelling(tok::kw_module))))
return false;
if (LastTokenWasExportKeyword.isValid()) {
// The export keyword was not at the start of the line, so this is not a
// directive-introducing token.
if (!LastTokenWasExportKeyword.isAtPhysicalStartOfLine())
return false;
// [cpp.pre]/1.4
// export // not a preprocessing directive
// import foo; // preprocessing directive (ill-formed at phase7)
if (TokAtPhysicalStartOfLine)
return false;
} else if (!TokAtPhysicalStartOfLine)
return false;
llvm::SaveAndRestore<bool> SavedParsingPreprocessorDirective(
CurPPLexer->ParsingPreprocessorDirective, true);
// The next token may be an angled string literal after import keyword.
llvm::SaveAndRestore<bool> SavedParsingFilemame(
CurPPLexer->ParsingFilename,
Result.getIdentifierInfo()->isImportKeyword());
std::optional<Token> NextTok =
CurLexer ? CurLexer->peekNextPPToken() : CurTokenLexer->peekNextPPToken();
if (!NextTok)
return false;
if (NextTok->is(tok::raw_identifier))
LookUpIdentifierInfo(*NextTok);
if (Result.getIdentifierInfo()->isImportKeyword()) {
if (NextTok->isOneOf(tok::identifier, tok::less, tok::colon,
tok::header_name)) {
Result.setKind(tok::kw_import);
ModuleImportLoc = Result.getLocation();
IsAtImport = false;
return true;
}
}
if (Result.getIdentifierInfo()->isModuleKeyword() &&
NextTok->isOneOf(tok::identifier, tok::colon, tok::semi)) {
Result.setKind(tok::kw_module);
ModuleDeclLoc = Result.getLocation();
return true;
}
// Ok, it's an identifier.
return false;
}
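
The decision made at the end of `HandleModuleContextualKeyword` can be summarized as a small pure function. The sketch below is a simplification under stated assumptions: `NextKind`, `Position`, `Introducer`, and `classify` are hypothetical names, the position check is collapsed into a single enum instead of the `LastTokenWasExportKeyword` bookkeeping, and the preprocessed-module-file case is ignored. Only the accepted follow-up tokens are taken from the code above.

```cpp
#include <string_view>

// Hypothetical, simplified kinds for the pp-token that follows the candidate.
enum class NextKind { Identifier, Less, HeaderName, Colon, ColonColon, Semi, Other };

// Hypothetical summary of where the candidate token sits on its line.
enum class Position {
  LineStart,             // first token on the line
  AfterExportOnSameLine, // 'export' started the line and the candidate follows it
  Elsewhere              // anything else: never a directive introducer
};

enum class Introducer { None, Import, Module };

// Decide whether 'import' / 'module' starts a directive or stays an identifier.
Introducer classify(std::string_view Spelling, Position Pos, NextKind Next) {
  if (Pos == Position::Elsewhere)
    return Introducer::None;

  if (Spelling == "import") {
    switch (Next) {
    case NextKind::Identifier:
    case NextKind::Less:
    case NextKind::HeaderName:
    case NextKind::Colon:
      return Introducer::Import;
    default:
      return Introducer::None;
    }
  }

  if (Spelling == "module") {
    switch (Next) {
    case NextKind::Identifier:
    case NextKind::Colon:
    case NextKind::Semi:
      return Introducer::Module;
    default:
      return Introducer::None;
    }
  }

  return Introducer::None;
}
```

For example, `import ::foo` at the start of a line is classified as an ordinary identifier in this sketch, because `::` is not one of the accepted follow-up tokens.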
bool Preprocessor::CollectPPImportSuffixAndEnterStream(
SmallVectorImpl<Token> &Toks, bool StopUntilEOD) {
CollectPPImportSuffix(Toks);
EnterModuleSuffixTokenStream(Toks);
return false;
}
/// Collect the tokens of a C++20 pp-import-suffix.
void Preprocessor::CollectPpImportSuffix(SmallVectorImpl<Token> &Toks) {
// FIXME: For error recovery, consider recognizing attribute syntax here
// and terminating / diagnosing a missing semicolon if we find anything
// else? (Can we leave that to the parser?)
unsigned BracketDepth = 0;
void Preprocessor::CollectPPImportSuffix(SmallVectorImpl<Token> &Toks,
bool StopUntilEOD) {
while (true) {
Toks.emplace_back();
Lex(Toks.back());
switch (Toks.back().getKind()) {
case tok::l_paren: case tok::l_square: case tok::l_brace:
++BracketDepth;
break;
case tok::r_paren: case tok::r_square: case tok::r_brace:
if (BracketDepth == 0)
return;
--BracketDepth;
break;
case tok::semi:
if (BracketDepth == 0)
if (!StopUntilEOD)
return;
break;
[[fallthrough]];
case tok::eod:
case tok::eof:
return;
default:
break;
}
}
}
// Allocate a holding buffer for a sequence of tokens and introduce it into
// the token stream.
void Preprocessor::EnterModuleSuffixTokenStream(ArrayRef<Token> Toks) {
if (Toks.empty())
return;
auto ToksCopy = std::make_unique<Token[]>(Toks.size());
std::copy(Toks.begin(), Toks.end(), ToksCopy.get());
EnterTokenStream(std::move(ToksCopy), Toks.size(),
/*DisableMacroExpansion*/ false, /*IsReinject*/ false);
assert(CurTokenLexer && "Must have a TokenLexer");
CurTokenLexer->setLexingCXXModuleDirective();
}
/// Lex a token following the 'import' contextual keyword.
///
@ -1178,186 +1391,47 @@ bool Preprocessor::LexAfterModuleImport(Token &Result) {
// Figure out what kind of lexer we actually have.
recomputeCurLexerKind();
// Lex the next token. The header-name lexing rules are used at the start of
// a pp-import.
//
// For now, we only support header-name imports in C++20 mode.
// FIXME: Should we allow this in all language modes that support an import
// declaration as an extension?
if (NamedModuleImportPath.empty() && getLangOpts().CPlusPlusModules) {
if (LexHeaderName(Result))
return true;
if (Result.is(tok::colon) && ModuleDeclState.isNamedModule()) {
std::string Name = ModuleDeclState.getPrimaryName().str();
Name += ":";
NamedModuleImportPath.emplace_back(Result.getLocation(),
getIdentifierInfo(Name));
CurLexerCallback = CLK_LexAfterModuleImport;
return true;
}
} else {
Lex(Result);
}
// Allocate a holding buffer for a sequence of tokens and introduce it into
// the token stream.
auto EnterTokens = [this](ArrayRef<Token> Toks) {
auto ToksCopy = std::make_unique<Token[]>(Toks.size());
std::copy(Toks.begin(), Toks.end(), ToksCopy.get());
EnterTokenStream(std::move(ToksCopy), Toks.size(),
/*DisableMacroExpansion*/ true, /*IsReinject*/ false);
};
bool ImportingHeader = Result.is(tok::header_name);
// Check for a header-name.
SmallVector<Token, 32> Suffix;
if (ImportingHeader) {
// Enter the header-name token into the token stream; a Lex action cannot
// both return a token and cache tokens (doing so would corrupt the token
// cache if the call to Lex comes from CachingLex / PeekAhead).
Suffix.push_back(Result);
SmallVector<IdentifierLoc, 3> Path;
Lex(Result);
if (LexModuleNameContinue(Result, ModuleImportLoc, Suffix, Path))
return CollectPPImportSuffixAndEnterStream(Suffix);
// Consume the pp-import-suffix and expand any macros in it now. We'll add
// it back into the token stream later.
CollectPpImportSuffix(Suffix);
if (Suffix.back().isNot(tok::semi)) {
// This is not a pp-import after all.
EnterTokens(Suffix);
return false;
}
// C++2a [cpp.module]p1:
// The ';' preprocessing-token terminating a pp-import shall not have
// been produced by macro replacement.
SourceLocation SemiLoc = Suffix.back().getLocation();
if (SemiLoc.isMacroID())
Diag(SemiLoc, diag::err_header_import_semi_in_macro);
// Reconstitute the import token.
Token ImportTok;
ImportTok.startToken();
ImportTok.setKind(tok::kw_import);
ImportTok.setLocation(ModuleImportLoc);
ImportTok.setIdentifierInfo(getIdentifierInfo("import"));
ImportTok.setLength(6);
auto Action = HandleHeaderIncludeOrImport(
/*HashLoc*/ SourceLocation(), ImportTok, Suffix.front(), SemiLoc);
switch (Action.Kind) {
case ImportAction::None:
break;
case ImportAction::ModuleBegin:
// Let the parser know we're textually entering the module.
Suffix.emplace_back();
Suffix.back().startToken();
Suffix.back().setKind(tok::annot_module_begin);
Suffix.back().setLocation(SemiLoc);
Suffix.back().setAnnotationEndLoc(SemiLoc);
Suffix.back().setAnnotationValue(Action.ModuleForHeader);
[[fallthrough]];
case ImportAction::ModuleImport:
case ImportAction::HeaderUnitImport:
case ImportAction::SkippedModuleImport:
// We chose to import (or textually enter) the file. Convert the
// header-name token into a header unit annotation token.
Suffix[0].setKind(tok::annot_header_unit);
Suffix[0].setAnnotationEndLoc(Suffix[0].getLocation());
Suffix[0].setAnnotationValue(Action.ModuleForHeader);
// FIXME: Call the moduleImport callback?
break;
case ImportAction::Failure:
assert(TheModuleLoader.HadFatalFailure &&
"This should be an early exit only to a fatal error");
Result.setKind(tok::eof);
CurLexer->cutOffLexing();
EnterTokens(Suffix);
return true;
}
EnterTokens(Suffix);
return false;
}
// The token sequence
//
// import identifier (. identifier)*
//
// indicates a module import directive. We already saw the 'import'
// contextual keyword, so now we're looking for the identifiers.
if (ModuleImportExpectsIdentifier && Result.getKind() == tok::identifier) {
// We expected to see an identifier here, and we did; continue handling
// identifiers.
NamedModuleImportPath.emplace_back(Result.getLocation(),
Result.getIdentifierInfo());
ModuleImportExpectsIdentifier = false;
CurLexerCallback = CLK_LexAfterModuleImport;
return true;
}
// If we're expecting a '.' or a ';', and we got a '.', then wait until we
// see the next identifier. (We can also see a '[[' that begins an
// attribute-specifier-seq here under the Standard C++ Modules.)
if (!ModuleImportExpectsIdentifier && Result.getKind() == tok::period) {
ModuleImportExpectsIdentifier = true;
CurLexerCallback = CLK_LexAfterModuleImport;
return true;
}
// If we didn't recognize a module name at all, this is not a (valid) import.
if (NamedModuleImportPath.empty() || Result.is(tok::eof))
return true;
ModuleNameLoc *NameLoc = ModuleNameLoc::Create(*this, Path);
Suffix.clear();
Suffix.emplace_back();
Suffix.back().setKind(tok::annot_module_name);
Suffix.back().setAnnotationRange(NameLoc->getRange());
Suffix.back().setAnnotationValue(static_cast<void *>(NameLoc));
Suffix.push_back(Result);
// Consume the pp-import-suffix and expand any macros in it now, if we're not
// at the semicolon already.
SourceLocation SemiLoc = Result.getLocation();
if (Result.isNot(tok::semi)) {
Suffix.push_back(Result);
CollectPpImportSuffix(Suffix);
if (Suffix.back().isNot(tok::semi)) {
if (Suffix.back().isNot(tok::eof))
CollectPPImportSuffix(Suffix);
if (Suffix.back().isNot(tok::semi)) {
// This is not an import after all.
EnterTokens(Suffix);
EnterModuleSuffixTokenStream(Suffix);
return false;
}
SemiLoc = Suffix.back().getLocation();
}
// Under the standard C++ Modules, the dot is just part of the module name,
// and not a real hierarchy separator. Flatten such module names now.
//
// FIXME: Is this the right level to be performing this transformation?
std::string FlatModuleName;
if (getLangOpts().CPlusPlusModules) {
for (auto &Piece : NamedModuleImportPath) {
// If the FlatModuleName ends with colon, it implies it is a partition.
if (!FlatModuleName.empty() && FlatModuleName.back() != ':')
FlatModuleName += ".";
FlatModuleName += Piece.getIdentifierInfo()->getName();
}
SourceLocation FirstPathLoc = NamedModuleImportPath[0].getLoc();
NamedModuleImportPath.clear();
NamedModuleImportPath.emplace_back(FirstPathLoc,
getIdentifierInfo(FlatModuleName));
}
Module *Imported = nullptr;
// We don't/shouldn't load the standard c++20 modules when preprocessing.
if (getLangOpts().Modules && !isInImportingCXXNamedModules()) {
Imported = TheModuleLoader.loadModule(ModuleImportLoc,
NamedModuleImportPath,
Module::Hidden,
if (getLangOpts().Modules) {
Imported = TheModuleLoader.loadModule(ModuleImportLoc, Path, Module::Hidden,
/*IsInclusionDirective=*/false);
if (Imported)
makeModuleVisible(Imported, SemiLoc);
}
if (Callbacks)
Callbacks->moduleImport(ModuleImportLoc, NamedModuleImportPath, Imported);
Callbacks->moduleImport(ModuleImportLoc, Path, Imported);
if (!Suffix.empty()) {
EnterTokens(Suffix);
EnterModuleSuffixTokenStream(Suffix);
return false;
}
return true;
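
The name flattening that moves into `ModuleLoader::getFlatNameFromPath` in this file can be shown with plain strings. This is a minimal sketch under assumptions: `flatNameFromPath` and `withPartition` are hypothetical helpers operating on `std::string` pieces rather than `IdentifierLoc`, and the `primary:partition` composition follows the C++20 spelling of a partition name rather than any specific call in the patch.

```cpp
#include <string>
#include <vector>

// Join the pieces of a module path with '.', since in C++20 the dots are
// simply part of the module name rather than a hierarchy separator.
std::string flatNameFromPath(const std::vector<std::string> &Path) {
  std::string Name;
  for (const std::string &Piece : Path) {
    if (!Name.empty())
      Name += ".";
    Name += Piece;
  }
  return Name;
}

// Hypothetical helper: spell a partition as "primary:partition".
std::string withPartition(const std::vector<std::string> &Primary,
                          const std::vector<std::string> &Partition) {
  std::string Name = flatNameFromPath(Primary);
  if (!Partition.empty()) {
    Name += ":";
    Name += flatNameFromPath(Partition);
  }
  return Name;
}
```

Keeping the join in one helper avoids the special case in the removed code, which had to check whether the name built so far already ended in a colon.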

@ -161,7 +161,8 @@ bool TokenConcatenation::AvoidConcat(const Token &PrevPrevTok,
const Token &PrevTok,
const Token &Tok) const {
// No space is required between header unit name in quote and semi.
if (PrevTok.is(tok::annot_header_unit) && Tok.is(tok::semi))
if (PrevTok.isOneOf(tok::annot_header_unit, tok::annot_module_name) &&
Tok.is(tok::semi))
return false;
// Conservatively assume that every annotation token that has a printable
@ -197,11 +198,12 @@ bool TokenConcatenation::AvoidConcat(const Token &PrevPrevTok,
if (Tok.isAnnotation()) {
// Modules annotation can show up when generated automatically for includes.
assert(Tok.isOneOf(tok::annot_module_include, tok::annot_module_begin,
tok::annot_module_end, tok::annot_embed) &&
tok::annot_module_end, tok::annot_embed,
tok::annot_module_name) &&
"unexpected annotation in AvoidConcat");
ConcatInfo = 0;
if (Tok.is(tok::annot_embed))
if (Tok.isOneOf(tok::annot_embed, tok::annot_module_name))
return true;
}

@ -57,6 +57,7 @@ void TokenLexer::Init(Token &Tok, SourceLocation ELEnd, MacroInfo *MI,
IsReinject = false;
NumTokens = Macro->tokens_end()-Macro->tokens_begin();
MacroExpansionStart = SourceLocation();
LexingCXXModuleDirective = false;
SourceManager &SM = PP.getSourceManager();
MacroStartSLocOffset = SM.getNextLocalOffset();
@ -113,6 +114,7 @@ void TokenLexer::Init(const Token *TokArray, unsigned NumToks,
HasLeadingSpace = false;
NextTokGetsSpace = false;
MacroExpansionStart = SourceLocation();
LexingCXXModuleDirective = false;
// Set HasLeadingSpace/AtStartOfLine so that the first token will be
// returned unmodified.
@ -625,6 +627,18 @@ bool TokenLexer::Lex(Token &Tok) {
// that it is no longer being expanded.
if (Macro) Macro->EnableMacro();
// CWG2947: Allow the following code:
//
// export module m; int x;
// extern "C++" int *y = &x;
//
// The 'extern' token should have the 'StartOfLine' flag set when the current
// TokenLexer exits and propagates line start/leading space info.
if (!Macro && isLexingCXXModuleDirective()) {
AtStartOfLine = true;
setLexingCXXModuleDirective(false);
}
Tok.startToken();
Tok.setFlagValue(Token::StartOfLine , AtStartOfLine);
Tok.setFlagValue(Token::LeadingSpace, HasLeadingSpace || NextTokGetsSpace);
@ -699,7 +713,9 @@ bool TokenLexer::Lex(Token &Tok) {
HasLeadingSpace = false;
// Handle recursive expansion!
if (!Tok.isAnnotation() && Tok.getIdentifierInfo() != nullptr) {
if (!Tok.isAnnotation() && Tok.getIdentifierInfo() != nullptr &&
(!PP.getLangOpts().CPlusPlusModules ||
!Tok.isModuleContextualKeyword())) {
// Change the kind of this identifier to the appropriate token kind, e.g.
// turning "for" into a keyword.
IdentifierInfo *II = Tok.getIdentifierInfo();
@ -947,6 +963,18 @@ bool TokenLexer::isParsingPreprocessorDirective() const {
return Tokens[NumTokens-1].is(tok::eod) && !isAtEnd();
}
/// setLexingCXXModuleDirective - This is set to true if this TokenLexer is
/// created when handling a C++ module or import directive.
void TokenLexer::setLexingCXXModuleDirective(bool Val) {
LexingCXXModuleDirective = Val;
}
/// isLexingCXXModuleDirective - Return true if we are lexing a C++ module or
/// import directive.
bool TokenLexer::isLexingCXXModuleDirective() const {
return LexingCXXModuleDirective;
}
/// HandleMicrosoftCommentPaste - In microsoft compatibility mode, /##/ pastes
/// together to form a comment that comments out everything in the current
/// macro, other active macros, and anything left on the current physical

@ -17,6 +17,9 @@
#include "clang/AST/DeclTemplate.h"
#include "clang/Basic/DiagnosticParse.h"
#include "clang/Basic/StackExhaustionHandler.h"
#include "clang/Basic/TokenKinds.h"
#include "clang/Lex/ModuleLoader.h"
#include "clang/Lex/Preprocessor.h"
#include "clang/Parse/RAIIObjectsForParser.h"
#include "clang/Sema/DeclSpec.h"
#include "clang/Sema/EnterExpressionEvaluationContext.h"
@ -515,8 +518,6 @@ void Parser::Initialize() {
Ident_abstract = nullptr;
Ident_override = nullptr;
Ident_GNU_final = nullptr;
Ident_import = nullptr;
Ident_module = nullptr;
Ident_super = &PP.getIdentifierTable().get("super");
@ -572,11 +573,6 @@ void Parser::Initialize() {
PP.SetPoisonReason(Ident_AbnormalTermination,diag::err_seh___finally_block);
}
if (getLangOpts().CPlusPlusModules) {
Ident_import = PP.getIdentifierInfo("import");
Ident_module = PP.getIdentifierInfo("module");
}
Actions.Initialize();
// Prime the lexer look-ahead.
@ -624,25 +620,8 @@ bool Parser::ParseTopLevelDecl(DeclGroupPtrTy &Result,
switch (NextToken().getKind()) {
case tok::kw_module:
goto module_decl;
// Note: no need to handle kw_import here. We only form kw_import under
// the Standard C++ Modules, and in that case 'export import' is parsed as
// an export-declaration containing an import-declaration.
// Recognize context-sensitive C++20 'export module' and 'export import'
// declarations.
case tok::identifier: {
IdentifierInfo *II = NextToken().getIdentifierInfo();
if ((II == Ident_module || II == Ident_import) &&
GetLookAheadToken(2).isNot(tok::coloncolon)) {
if (II == Ident_module)
goto module_decl;
else
goto import_decl;
}
break;
}
case tok::kw_import:
goto import_decl;
default:
break;
}
@ -710,22 +689,6 @@ bool Parser::ParseTopLevelDecl(DeclGroupPtrTy &Result,
Actions.ActOnEndOfTranslationUnit();
//else don't tell Sema that we ended parsing: more input might come.
return true;
case tok::identifier:
// C++2a [basic.link]p3:
// A token sequence beginning with 'export[opt] module' or
// 'export[opt] import' and not immediately followed by '::'
// is never interpreted as the declaration of a top-level-declaration.
if ((Tok.getIdentifierInfo() == Ident_module ||
Tok.getIdentifierInfo() == Ident_import) &&
NextToken().isNot(tok::coloncolon)) {
if (Tok.getIdentifierInfo() == Ident_module)
goto module_decl;
else
goto import_decl;
}
break;
default:
break;
}
@ -918,8 +881,10 @@ Parser::ParseExternalDeclaration(ParsedAttributes &Attrs,
case tok::kw_import: {
Sema::ModuleImportState IS = Sema::ModuleImportState::NotACXX20Module;
if (getLangOpts().CPlusPlusModules) {
llvm_unreachable("not expecting a c++20 import here");
ProhibitAttributes(Attrs);
Diag(Tok, diag::err_unexpected_module_or_import_decl)
<< /*IsImport*/ true;
SkipUntil(tok::semi);
return nullptr;
}
SingleDecl = ParseModuleImport(SourceLocation(), IS);
} break;
@ -1011,7 +976,7 @@ Parser::ParseExternalDeclaration(ParsedAttributes &Attrs,
return nullptr;
case tok::kw_module:
Diag(Tok, diag::err_unexpected_module_decl);
Diag(Tok, diag::err_unexpected_module_or_import_decl) << /*IsImport*/ false;
SkipUntil(tok::semi);
return nullptr;
@ -2231,6 +2196,11 @@ void Parser::CodeCompleteNaturalLanguage() {
Actions.CodeCompletion().CodeCompleteNaturalLanguage();
}
void Parser::CodeCompleteModuleImport(SourceLocation ImportLoc,
ModuleIdPath Path) {
Actions.CodeCompletion().CodeCompleteModuleImport(ImportLoc, Path);
}
bool Parser::ParseMicrosoftIfExistsCondition(IfExistsCondition& Result) {
assert((Tok.is(tok::kw___if_exists) || Tok.is(tok::kw___if_not_exists)) &&
"Expected '__if_exists' or '__if_not_exists'");
@ -2342,10 +2312,8 @@ Parser::ParseModuleDecl(Sema::ModuleImportState &ImportState) {
? Sema::ModuleDeclKind::Interface
: Sema::ModuleDeclKind::Implementation;
assert(
(Tok.is(tok::kw_module) ||
(Tok.is(tok::identifier) && Tok.getIdentifierInfo() == Ident_module)) &&
"not a module declaration");
assert(Tok.is(tok::kw_module) && "not a module declaration");
SourceLocation ModuleLoc = ConsumeToken();
// Attributes appear after the module name, not before.
@ -2402,6 +2370,10 @@ Parser::ParseModuleDecl(Sema::ModuleImportState &ImportState) {
return nullptr;
}
// This should already have been diagnosed in phase 4; just skip until the
// semicolon.
if (!Tok.isOneOf(tok::semi, tok::l_square))
SkipUntil(tok::semi, SkipUntilFlags::StopBeforeMatch);
// We don't support any module attributes yet; just parse them and diagnose.
ParsedAttributes Attrs(AttrFactory);
MaybeParseCXX11Attributes(Attrs);
@ -2410,7 +2382,9 @@ Parser::ParseModuleDecl(Sema::ModuleImportState &ImportState) {
/*DiagnoseEmptyAttrs=*/false,
/*WarnOnUnknownAttrs=*/true);
ExpectAndConsumeSemi(diag::err_module_expected_semi);
if (ExpectAndConsumeSemi(diag::err_expected_semi_after_module_or_import,
tok::getKeywordSpelling(tok::kw_module)))
SkipUntil(tok::semi);
return Actions.ActOnModuleDecl(StartLoc, ModuleLoc, MDK, Path, Partition,
ImportState,
@ -2424,7 +2398,7 @@ Decl *Parser::ParseModuleImport(SourceLocation AtLoc,
SourceLocation ExportLoc;
TryConsumeToken(tok::kw_export, ExportLoc);
assert((AtLoc.isInvalid() ? Tok.isOneOf(tok::kw_import, tok::identifier)
assert((AtLoc.isInvalid() ? Tok.is(tok::kw_import)
: Tok.isObjCAtKeyword(tok::objc_import)) &&
"Improper start to module import");
bool IsObjCAtImport = Tok.isObjCAtKeyword(tok::objc_import);
@ -2449,12 +2423,12 @@ Decl *Parser::ParseModuleImport(SourceLocation AtLoc,
Diag(ColonLoc, diag::err_unsupported_module_partition)
<< SourceRange(ColonLoc, Path.back().getLoc());
// Recover by leaving partition empty.
else if (ParseModuleName(ColonLoc, Path, /*IsImport*/ true))
else if (ParseModuleName(ColonLoc, Path, /*IsImport=*/true))
return nullptr;
else
IsPartition = true;
} else {
if (ParseModuleName(ImportLoc, Path, /*IsImport*/ true))
if (ParseModuleName(ImportLoc, Path, /*IsImport=*/true))
return nullptr;
}
@ -2514,8 +2488,17 @@ Decl *Parser::ParseModuleImport(SourceLocation AtLoc,
SeenError = false;
break;
}
ExpectAndConsumeSemi(diag::err_module_expected_semi);
TryConsumeToken(tok::eod);
bool LexedSemi = false;
if (getLangOpts().CPlusPlusModules)
LexedSemi =
!ExpectAndConsumeSemi(diag::err_expected_semi_after_module_or_import,
tok::getKeywordSpelling(tok::kw_import));
else
LexedSemi = !ExpectAndConsumeSemi(diag::err_module_expected_semi);
if (!LexedSemi)
SkipUntil(tok::semi);
if (SeenError)
return nullptr;
@ -2546,29 +2529,16 @@ Decl *Parser::ParseModuleImport(SourceLocation AtLoc,
bool Parser::ParseModuleName(SourceLocation UseLoc,
SmallVectorImpl<IdentifierLoc> &Path,
bool IsImport) {
// Parse the module path.
while (true) {
if (!Tok.is(tok::identifier)) {
if (Tok.is(tok::code_completion)) {
cutOffParsing();
Actions.CodeCompletion().CodeCompleteModuleImport(UseLoc, Path);
return true;
}
Diag(Tok, diag::err_module_expected_ident) << IsImport;
SkipUntil(tok::semi);
return true;
}
// Record this part of the module path.
Path.emplace_back(Tok.getLocation(), Tok.getIdentifierInfo());
ConsumeToken();
if (Tok.isNot(tok::period))
return false;
ConsumeToken();
if (Tok.isNot(tok::annot_module_name)) {
SkipUntil(tok::semi);
return true;
}
ModuleNameLoc *NameLoc =
static_cast<ModuleNameLoc *>(Tok.getAnnotationValue());
Path.assign(NameLoc->getModuleIdPath().begin(),
NameLoc->getModuleIdPath().end());
ConsumeAnnotationToken();
return false;
}
bool Parser::parseMisplacedModuleImport() {

@ -59,23 +59,6 @@ static void checkModuleImportContext(Sema &S, Module *M,
}
}
// We represent the primary and partition names as 'Paths' which are sections
// of the hierarchical access path for a clang module. However for C++20
// the periods in a name are just another character, and we will need to
// flatten them into a string.
static std::string stringFromPath(ModuleIdPath Path) {
std::string Name;
if (Path.empty())
return Name;
for (auto &Piece : Path) {
if (!Name.empty())
Name += ".";
Name += Piece.getIdentifierInfo()->getName();
}
return Name;
}
/// Helper function for makeTransitiveImportsVisible to decide whether
/// the \param Imported module unit is in the same module with the \param
/// CurrentModule.
@ -306,7 +289,7 @@ Sema::ActOnModuleDecl(SourceLocation StartLoc, SourceLocation ModuleLoc,
// We were asked to compile a module interface unit but this is a module
// implementation unit.
Diag(ModuleLoc, diag::err_module_interface_implementation_mismatch)
<< FixItHint::CreateInsertion(ModuleLoc, "export ");
<< FixItHint::CreateInsertion(ModuleLoc, "export ");
MDK = ModuleDeclKind::Interface;
break;
@ -373,10 +356,10 @@ Sema::ActOnModuleDecl(SourceLocation StartLoc, SourceLocation ModuleLoc,
// Flatten the dots in a module name. Unlike Clang's hierarchical module map
// modules, the dots here are just another character that can appear in a
// module name.
std::string ModuleName = stringFromPath(Path);
std::string ModuleName = ModuleLoader::getFlatNameFromPath(Path);
if (IsPartition) {
ModuleName += ":";
ModuleName += stringFromPath(Partition);
ModuleName += ModuleLoader::getFlatNameFromPath(Partition);
}
// If a module name was explicitly specified on the command line, it must be
// correct.
@ -389,7 +372,7 @@ Sema::ActOnModuleDecl(SourceLocation StartLoc, SourceLocation ModuleLoc,
<< getLangOpts().CurrentModule;
return nullptr;
}
const_cast<LangOptions&>(getLangOpts()).CurrentModule = ModuleName;
const_cast<LangOptions &>(getLangOpts()).CurrentModule = ModuleName;
auto &Map = PP.getHeaderSearchInfo().getModuleMap();
Module *Mod; // The module we are creating.
@ -434,7 +417,7 @@ Sema::ActOnModuleDecl(SourceLocation StartLoc, SourceLocation ModuleLoc,
Interface = getModuleLoader().loadModule(ModuleLoc, {ModuleNameLoc},
Module::AllVisible,
/*IsInclusionDirective=*/false);
const_cast<LangOptions&>(getLangOpts()).CurrentModule = ModuleName;
const_cast<LangOptions &>(getLangOpts()).CurrentModule = ModuleName;
if (!Interface) {
Diag(ModuleLoc, diag::err_module_not_defined) << ModuleName;
@ -597,12 +580,12 @@ DeclResult Sema::ActOnModuleImport(SourceLocation StartLoc,
// otherwise, the name of the importing named module.
ModuleName = NamedMod->getPrimaryModuleInterfaceName().str();
ModuleName += ":";
ModuleName += stringFromPath(Path);
ModuleName += ModuleLoader::getFlatNameFromPath(Path);
ModuleNameLoc =
IdentifierLoc(Path[0].getLoc(), PP.getIdentifierInfo(ModuleName));
Path = ModuleIdPath(ModuleNameLoc);
} else if (getLangOpts().CPlusPlusModules) {
ModuleName = stringFromPath(Path);
ModuleName = ModuleLoader::getFlatNameFromPath(Path);
ModuleNameLoc =
IdentifierLoc(Path[0].getLoc(), PP.getIdentifierInfo(ModuleName));
Path = ModuleIdPath(ModuleNameLoc);
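
The hunks above replace the local `stringFromPath` helper with `ModuleLoader::getFlatNameFromPath`, and the surrounding comments explain that the dots in a C++20 module name are ordinary characters, with `:` separating the partition. As a rough, standalone illustration of that flattening (not the Clang implementation; `flattenModuleName` and its `std::vector<std::string>` parameters are hypothetical stand-ins for `ModuleIdPath`), one might write:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the flattening done in Sema::ActOnModuleDecl:
// join the name components with '.', then append ":<partition>" if present.
std::string flattenModuleName(const std::vector<std::string> &Path,
                              const std::vector<std::string> &Partition = {}) {
  auto Join = [](const std::vector<std::string> &Pieces) {
    std::string Name;
    for (const std::string &Piece : Pieces) {
      if (!Name.empty())
        Name += ".";
      Name += Piece;
    }
    return Name;
  };

  std::string Name = Join(Path);
  if (!Partition.empty()) {
    Name += ":";
    Name += Join(Partition);
  }
  return Name;
}
```

Under these assumptions, the components `a`, `b` flatten to `a.b`, and the name `M` with partition `part` flattens to `M:part`, which matches the `-fmodule-file=M:part=...` spelling used in the tests.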

@ -13,7 +13,8 @@ struct module { struct inner {}; };
constexpr int n = 123;
export module m; // #1
module y = {}; // expected-error {{multiple module declarations}} expected-error 2{{}}
module y = {}; // expected-error {{multiple module declarations}}
// expected-error@-1 {{unexpected preprocessing token '=' after module name, only ';' and '[' (start of attribute specifier sequence) are allowed}}
// expected-note@#1 {{previous module declaration}}
::import x = {};
@ -23,8 +24,8 @@ import::inner xi = {};
module::inner yi = {};
namespace N {
module a;
import b;
module a; // expected-error {{module declaration can only appear at the top level}}
import b; // expected-error {{import declaration can only appear at the top level}}
}
extern "C++" module cxxm;
@ -45,10 +46,11 @@ constexpr int n = 123;
export module m; // #1
import x = {}; // expected-error {{expected ';' after module name}}
import x = {}; // expected-error {{import directive must end with a ';'}}
// expected-error@-1 {{module 'x' not found}}
//--- ImportError2.cpp
// expected-no-diagnostics
module;
struct module { struct inner {}; };
@ -63,7 +65,4 @@ template<> struct import<n> {
static X y;
};
// This is not valid because the 'import <n>' is a pp-import, even though it
// grammatically can't possibly be an import declaration.
struct X {} import<n>::y; // expected-error {{'n' file not found}}
struct X {} import<n>::y;

@ -107,4 +107,4 @@ void test_late() {
// expected-error@-2 {{undeclared identifier}}
internal_private = 1; // expected-error {{use of undeclared identifier 'internal_private'}}
}
}

@ -0,0 +1,81 @@
// RUN: rm -rf %t
// RUN: mkdir %t
// RUN: split-file %s %t
// RUN: %clang_cc1 -std=c++20 %t/cwg2947_example1.cpp -D'DOT_BAR=.bar' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/cwg2947_example2.cpp -D'MOD_ATTR=[[vendor::shiny_module]]' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/cwg2947_example3.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/cwg2947_example4.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/cwg2947_example5.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/cwg2947_example6.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/cwg2947_ext1.cpp -verify -E | FileCheck %t/cwg2947_ext1.cpp
// RUN: %clang_cc1 -std=c++20 %t/cwg2947_ext2.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/cwg2947_ext3.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++23 %t/cwg2947_example1.cpp -D'DOT_BAR=.bar' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++23 %t/cwg2947_example2.cpp -D'MOD_ATTR=[[vendor::shiny_module]]' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++23 %t/cwg2947_example3.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++23 %t/cwg2947_example4.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++23 %t/cwg2947_example5.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++23 %t/cwg2947_example6.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++23 %t/cwg2947_ext1.cpp -verify -E | FileCheck %t/cwg2947_ext1.cpp
// RUN: %clang_cc1 -std=c++23 %t/cwg2947_ext2.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++23 %t/cwg2947_ext3.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++26 %t/cwg2947_example1.cpp -D'DOT_BAR=.bar' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++26 %t/cwg2947_example2.cpp -D'MOD_ATTR=[[vendor::shiny_module]]' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++26 %t/cwg2947_example3.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++26 %t/cwg2947_example4.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++26 %t/cwg2947_example5.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++26 %t/cwg2947_example6.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++26 %t/cwg2947_ext1.cpp -verify -E | FileCheck %t/cwg2947_ext1.cpp
// RUN: %clang_cc1 -std=c++26 %t/cwg2947_ext2.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++26 %t/cwg2947_ext3.cpp -fsyntax-only -verify
//--- cwg2947_example1.cpp
// #define DOT_BAR .bar
export module foo DOT_BAR; // error: expansion of DOT_BAR; does not begin with ; or [
// expected-error@-1 {{unexpected preprocessing token '.' after module name, only ';' and '[' (start of attribute specifier sequence) are allowed}}
//--- cwg2947_example2.cpp
export module M MOD_ATTR; // OK
// expected-warning@-1 {{unknown attribute 'vendor::shiny_module' ignored}}
//--- cwg2947_example3.cpp
export module a
.b; // error: preprocessing token after pp-module-name is not ; or [
// expected-error@-1 {{unexpected preprocessing token '.' after module name, only ';' and '[' (start of attribute specifier sequence) are allowed}}
//--- cwg2947_example4.cpp
export module M [[
attr1,
// expected-warning@-1 {{unknown attribute 'attr1' ignored}}
attr2 ]] ; // OK
// expected-warning@-1 {{unknown attribute 'attr2' ignored}}
//--- cwg2947_example5.cpp
export module M
[[ attr1,
// expected-warning@-1 {{unknown attribute 'attr1' ignored}}
attr2 ]] ; // OK
// expected-warning@-1 {{unknown attribute 'attr2' ignored}}
//--- cwg2947_example6.cpp
export module M; int
// expected-warning@-1 {{extra tokens after semicolon in 'module' directive}}
n; // OK
//--- cwg2947_ext1.cpp
// CHECK: export __preprocessed_module m; int x;
// CHECK: extern "C++" int *y = &x;
export module m; int x;
// expected-warning@-1 {{extra tokens after semicolon in 'module' directive}}
extern "C++" int *y = &x;
//--- cwg2947_ext2.cpp
export module x _Pragma("GCC warning \"Hi\"");
// expected-warning@-1 {{Hi}}
//--- cwg2947_ext3.cpp
export module x; _Pragma("GCC warning \"hi\""); // expected-warning {{hi}}
// expected-warning@-1 {{extra tokens after semicolon in 'module' directive}}

@ -1,7 +1,7 @@
// RUN: not %clang_cc1 -std=c++2a -E -I%S/Inputs %s -o - | FileCheck %s --strict-whitespace --implicit-check-not=ERROR
// Check for context-sensitive header-name token formation.
// CHECK: import <foo bar>;
// CHECK: __preprocessed_import <foo bar>;
import <foo bar>;
// Not at the top level: these are each 8 tokens rather than 5.
@ -12,59 +12,64 @@ import <foo bar>;
// CHECK: [ import <foo bar>; %>
[ import <foo bar>; %>
// CHECK: import <foo bar>;
// CHECK: __preprocessed_import <foo bar>;
import <foo bar>;
// CHECK: foo; import <foo bar>;
// CHECK: foo; import <foo bar>;
foo; import <foo bar>;
// CHECK: foo import <foo bar>;
foo import <foo bar>;
// CHECK: import <foo bar> {{\[\[ ]]}};
// CHECK: __preprocessed_import <foo bar> {{\[\[ ]]}};
import <foo bar> [[ ]];
// CHECK: import <foo bar> import <foo bar>;
// CHECK: __preprocessed_import <foo bar> import <foo bar>;
import <foo bar> import <foo bar>;
// FIXME: We do not form header-name tokens in the pp-import-suffix of a
// pp-import. Conforming programs can't tell the difference.
// CHECK: import <foo bar> {} import <foo bar>;
// CHECK: __preprocessed_import <foo bar> {} import <foo bar>;
// FIXME: import <foo bar> {} import <foo bar>;
import <foo bar> {} import <foo bar>;
// CHECK: export import <foo bar>;
// CHECK: export __preprocessed_import <foo bar>;
export import <foo bar>;
// CHECK: export export import <foo bar>;
export export import <foo bar>;
#define UNBALANCED_PAREN (
// CHECK: import <foo bar>;
// CHECK: __preprocessed_import <foo bar>;
import <foo bar>;
UNBALANCED_PAREN
// CHECK: import <foo bar>;
// CHECK: __preprocessed_import <foo bar>;
import <foo bar>;
)
_Pragma("clang no_such_pragma (");
// CHECK: import <foo bar>;
// CHECK: __preprocessed_import <foo bar>;
import <foo bar>;
#define HEADER <foo bar>
// CHECK: import <foo bar>;
// CHECK: __preprocessed_import <foo bar>;
import HEADER;
// CHECK: import <foo bar>;
// CHECK: {{^}}foo{{$}}
// CHECK-NEXT: {{^}} bar{{$}}
// CHECK-NEXT: {{^}}>;{{$}}
import <
foo
bar
>;
// CHECK: import{{$}}
// CHECK: {{^}}<foo bar>;
// CHECK-NEXT: {{^}}<{{$}}
// CHECK-NEXT: {{^}}foo{{$}}
// CHECK-NEXT: {{^}} bar{{$}}
// CHECK-NEXT: {{^}}>;{{$}}
import
<
foo
@ -72,7 +77,7 @@ foo
>;
// CHECK: import{{$}}
// CHECK: {{^}}<foo bar>;
// CHECK: {{^}}<foo bar>;
import
<foo bar>;

@ -46,8 +46,8 @@ export module z;
export module x;
//--- invalid_module_name.cppm
export module z elderberry; // expected-error {{expected ';'}} \
// expected-error {{a type specifier is required}}
export module z elderberry;
// expected-error@-1 {{unexpected preprocessing token 'elderberry' after module name, only ';' and '[' (start of attribute specifier sequence) are allowed}}
//--- empty_attribute.cppm
// expected-no-diagnostics

@ -0,0 +1,207 @@
// RUN: rm -rf %t
// RUN: mkdir %t
// RUN: split-file %s %t
// RUN: %clang_cc1 -std=c++20 %t/hash.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/module.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/rightpad.cppm -emit-module-interface -o %t/rightpad.pcm
// RUN: %clang_cc1 -std=c++20 %t/M_part.cppm -emit-module-interface -o %t/M_part.pcm
// RUN: %clang_cc1 -std=c++20 -xc++-system-header %t/string -emit-header-unit -o %t/string.pcm
// RUN: %clang_cc1 -std=c++20 -xc++-user-header %t/squee -emit-header-unit -o %t/squee.pcm
// RUN: %clang_cc1 -std=c++20 %t/import.cpp -isystem %t \
// RUN: -fmodule-file=rightpad=%t/rightpad.pcm \
// RUN: -fmodule-file=M:part=%t/M_part.pcm \
// RUN: -fmodule-file=%t/string.pcm \
// RUN: -fmodule-file=%t/squee.pcm \
// RUN: -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/module_decl_not_in_same_line.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/foo.cppm -emit-module-interface -o %t/foo.pcm
// RUN: %clang_cc1 -std=c++20 %t/import_decl_not_in_same_line.cpp -fmodule-file=foo=%t/foo.pcm -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/not_import.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/import_spaceship.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/leading_empty_macro.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/operator_keyword_and.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/operator_keyword_and2.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/macro_in_module_decl_suffix.cpp -D'ATTR(X)=[[X]]' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/macro_in_module_decl_suffix2.cpp -D'ATTR(X)=[[X]]' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/extra_tokens_after_module_decl1.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/extra_tokens_after_module_decl2.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/object_like_macro_in_module_name.cpp -Dm=x -Dn=y -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/object_like_macro_in_partition_name.cpp -Dm=x -Dn=y -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/unexpected_character_in_pp_module_suffix.cpp -D'm(x)=x' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/semi_in_same_line.cpp -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/preprocessed_module_file.cpp -E | FileCheck %t/preprocessed_module_file.cpp
// RUN: %clang_cc1 -std=c++20 %t/pedantic-errors.cpp -pedantic-errors -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/xcpp-output.cpp -fsyntax-only -verify -xc++-cpp-output
// RUN: %clang_cc1 -std=c++20 %t/func_like_macro.cpp -D'm(x)=x' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/lparen.cpp -D'm(x)=x' -D'LPAREN=(' -fsyntax-only -verify
// RUN: %clang_cc1 -std=c++20 %t/control_line.cpp -fsyntax-only -verify
//--- hash.cpp
// expected-no-diagnostics
# // preprocessing directive
//--- module.cpp
// expected-no-diagnostics
module ; // preprocessing directive
export module leftpad; // preprocessing directive
//--- string
#ifndef STRING_H
#define STRING_H
#endif // STRING_H
//--- squee
#ifndef SQUEE_H
#define SQUEE_H
#endif
//--- rightpad.cppm
export module rightpad;
//--- M_part.cppm
export module M:part;
//--- import.cpp
export module M;
import <string>; // expected-warning {{the implementation of header units is in an experimental phase}}
export import "squee"; // expected-warning {{the implementation of header units is in an experimental phase}}
import rightpad; // preprocessing directive
import :part; // preprocessing directive
//--- module_decl_not_in_same_line.cpp
module // expected-error {{a type specifier is required for all declarations}}
;export module M; // expected-error {{export declaration can only be used within a module interface}} \
// expected-error {{unknown type name 'module'}}
//--- foo.cppm
export module foo;
//--- import_decl_not_in_same_line.cpp
export module M;
export
import // expected-error {{unknown type name 'import'}}
foo;
export
import foo; // expected-error {{unknown type name 'import'}}
//--- not_import.cpp
export module M;
import :: // expected-error {{use of undeclared identifier 'import'}}
import -> // expected-error {{cannot use arrow operator on a type}}
//--- import_spaceship.cpp
export module M;
import <=>; // expected-error {{'=' file not found}}
//--- leading_empty_macro.cpp
// expected-no-diagnostics
export module M;
typedef int import;
#define EMP
EMP import m; // The phase 7 grammar should see import as a typedef-name.
//--- operator_keyword_and.cpp
// expected-no-diagnostics
typedef int import;
extern
import and x;
//--- operator_keyword_and2.cpp
// expected-no-diagnostics
typedef int module;
extern
module and x;
//--- macro_in_module_decl_suffix.cpp
export module m ATTR(x); // expected-warning {{unknown attribute 'x' ignored}}
//--- macro_in_module_decl_suffix2.cpp
export module m [[y]] ATTR(x); // expected-warning {{unknown attribute 'y' ignored}} \
// expected-warning {{unknown attribute 'x' ignored}}
//--- extra_tokens_after_module_decl1.cpp
module; int n; // expected-warning {{extra tokens after semicolon in 'module' directive}}
import foo; int n1; // expected-warning {{extra tokens after semicolon in 'import' directive}}
// expected-error@-1 {{module 'foo' not found}}
const int *p1 = &n1;
//--- extra_tokens_after_module_decl2.cpp
export module m; int n2 // expected-warning {{extra tokens after semicolon in 'module' directive}}
;
const int *p2 = &n2;
//--- object_like_macro_in_module_name.cpp
export module m.n;
// expected-error@-1 {{module name component 'm' cannot be a object-like macro}}
// expected-note@* {{macro 'm' defined here}}
// expected-error@-3 {{module name component 'n' cannot be a object-like macro}}
// expected-note@* {{macro 'n' defined here}}
//--- object_like_macro_in_partition_name.cpp
export module m:n;
// expected-error@-1 {{module name component 'm' cannot be a object-like macro}}
// expected-note@* {{macro 'm' defined here}}
// expected-error@-3 {{partition name component 'n' cannot be a object-like macro}}
// expected-note@* {{macro 'n' defined here}}
//--- unexpected_character_in_pp_module_suffix.cpp
export module m();
// expected-error@-1 {{unexpected preprocessing token '(' after module name, only ';' and '[' (start of attribute specifier sequence) are allowed}}
//--- semi_in_same_line.cpp
export module m // OK
[[]];
import foo // expected-error {{module 'foo' not found}}
;
//--- preprocessed_module_file.cpp
// CHECK: __preprocessed_module;
// CHECK-NEXT: export __preprocessed_module M;
// CHECK-NEXT: __preprocessed_import std;
// CHECK-NEXT: export __preprocessed_import bar;
// CHECK-NEXT: struct import {};
// CHECK-EMPTY:
// CHECK-NEXT: import foo;
module;
export module M;
import std;
export import bar;
struct import {};
#define EMPTY
EMPTY import foo;
//--- pedantic-errors.cpp
export module m; int n; // expected-warning {{extra tokens after semicolon in 'module' directive}}
//--- xcpp-output.cpp
// expected-no-diagnostics
typedef int module;
module x;
//--- func_like_macro.cpp
// #define m(x) x
export module m
(foo); // expected-error {{unexpected preprocessing token '(' after module name, only ';' and '[' (start of attribute specifier sequence) are allowed}}
//--- lparen.cpp
// #define m(x) x
// #define LPAREN (
export module m
LPAREN foo); // expected-error {{unexpected preprocessing token 'LPAREN' after module name, only ';' and '[' (start of attribute specifier sequence) are allowed}}
//--- control_line.cpp
#if 0 // #1
export module m; // expected-error {{module directive lines are not allowed on lines controlled by preprocessor conditionals}}
#else
export module m; // expected-error {{module directive lines are not allowed on lines controlled by preprocessor conditionals}} \
// expected-error {{module declaration must occur at the start of the translation unit}} \
// expected-note@#1 {{add 'module;'}}
#endif
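
Several of the test cases above (`not_import.cpp`, `import_spaceship.cpp`, `leading_empty_macro.cpp`, `operator_keyword_and.cpp`) exercise the rule from the commit message that decides whether `import` or `module` introduces a directive. The following is a simplified sketch of that rule applied to a single logical line of raw text. It is not Clang's lexer: `classifyLine`, `Introducer`, and the helpers are hypothetical names, comments and line splicing are ignored, and alternative tokens such as `and` are assumed not to count as identifiers.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

enum class Introducer { None, Import, Module };

// Skip horizontal whitespace only (spaces and tabs).
static void skipHSpace(std::string_view S, std::size_t &I) {
  while (I < S.size() && (S[I] == ' ' || S[I] == '\t'))
    ++I;
}

static bool isIdentStart(char C) {
  return std::isalpha(static_cast<unsigned char>(C)) || C == '_';
}

// Lex an identifier-shaped word at position I; returns "" if there is none.
static std::string_view lexWord(std::string_view S, std::size_t &I) {
  std::size_t Begin = I;
  if (I < S.size() && isIdentStart(S[I]))
    while (I < S.size() &&
           (std::isalnum(static_cast<unsigned char>(S[I])) || S[I] == '_'))
      ++I;
  return S.substr(Begin, I - Begin);
}

// Assumption: operator spellings are not identifier pp-tokens for this rule.
static bool isAlternativeToken(std::string_view W) {
  static constexpr std::string_view Alt[] = {
      "and", "and_eq", "bitand", "bitor", "compl", "not",
      "not_eq", "or", "or_eq", "xor", "xor_eq"};
  for (std::string_view A : Alt)
    if (W == A)
      return true;
  return false;
}

// Classify one logical line according to the rule in the commit message.
Introducer classifyLine(std::string_view Line) {
  std::size_t I = 0;
  skipHSpace(Line, I);
  std::string_view Word = lexWord(Line, I);
  if (Word == "export") { // at most one leading 'export' is allowed
    skipHSpace(Line, I);
    Word = lexWord(Line, I);
  }
  if (Word != "import" && Word != "module")
    return Introducer::None;
  const bool IsImport = Word == "import";
  const Introducer Kind = IsImport ? Introducer::Import : Introducer::Module;

  skipHSpace(Line, I);
  if (I >= Line.size())
    return Introducer::None; // nothing follows on this logical line
  const char C = Line[I];
  if (isIdentStart(C)) {
    std::size_t J = I;
    return isAlternativeToken(lexWord(Line, J)) ? Introducer::None : Kind;
  }
  if (IsImport) {
    if (C == '<' || C == '"')
      return Kind;
    if (C == ':' && !(I + 1 < Line.size() && Line[I + 1] == ':'))
      return Kind; // ':' but not '::'
    return Introducer::None;
  }
  return C == ';' ? Kind : Introducer::None;
}
```

With this sketch, `import <=>;` is classified as an import directive (which is why the test expects a "file not found" error), while `import ::`, `import ->`, `EMP import m;`, and `import and x;` are left as ordinary tokens.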

@ -44,8 +44,8 @@ import x [[noreturn]]; // expected-error {{'noreturn' attribute cannot be applie
import x [[blarg::noreturn]]; // expected-warning-re {{unknown attribute 'blarg::noreturn' ignored{{.*}}}}
import x.y;
import x.; // expected-error {{expected a module name after 'import'}}
import .x; // expected-error {{expected a module name after 'import'}}
import x.; // expected-error {{expected identifier after '.' in module name}}
import .x; // expected-error {{unknown type name 'import'}} expected-error {{cannot use dot operator on a type}}
import blarg; // expected-error {{module 'blarg' not found}}
@ -62,8 +62,8 @@ import x [[noreturn]]; // expected-error {{'noreturn' attribute cannot be applie
import x [[blarg::noreturn]]; // expected-warning-re {{unknown attribute 'blarg::noreturn' ignored{{.*}}}}
import x.y;
import x.; // expected-error {{expected a module name after 'import'}}
import .x; // expected-error {{expected a module name after 'import'}}
import x.; // expected-error {{expected identifier after '.' in module name}}
import .x; // expected-error {{unknown type name 'import'}} expected-error {{cannot use dot operator on a type}}
import blarg; // expected-error {{module 'blarg' not found}}

@ -0,0 +1,11 @@
// RUN: %clang_cc1 -E -std=c++20 %s
// CHECK: export __preprocessed_module M;
// CHECK-NEXT: export __preprocessed_import K;
// CHECK-NEXT: typedef int import;
// CHECK: import m;
export module M;
export import K;
typedef int import;
#define EMP
EMP import m;

@ -1,4 +1,6 @@
// RUN: %clang_cc1 -std=c++20 -fsyntax-only %s -verify
import mod // expected-error {{expected ';' after module name}}
// This import directive is ill-formed: it is missing a ';' after the
// module name, but we try to recover from the error and import the module.
import mod // expected-error {{import directive must end with a ';'}}
// expected-error@-1 {{module 'mod' not found}}

@ -4,4 +4,4 @@
// RUN: %clang_cc1 -std=c++20 -E %s -o - | FileCheck %s
import non_exist_modules;
// CHECK: import non_exist_modules;
// CHECK: __preprocessed_import non_exist_modules;

@ -193,7 +193,8 @@ TEST_P(ASTMatchersTest, ExportDecl) {
if (!GetParam().isCXX20OrLater()) {
return;
}
const std::string moduleHeader = "module;export module ast_matcher_test;";
const std::string moduleHeader =
"module;\n export module ast_matcher_test;\n";
EXPECT_TRUE(matches(moduleHeader + "export void foo();",
exportDecl(has(functionDecl()))));
EXPECT_TRUE(matches(moduleHeader + "export { void foo(); int v; }",

@ -640,7 +640,7 @@ TEST(MinimizeSourceToDependencyDirectivesTest, AtImport) {
EXPECT_STREQ("@import A;\n", Out.data());
ASSERT_FALSE(minimizeSourceToDependencyDirectives("@import A\n;", Out));
EXPECT_STREQ("@import A\n;\n", Out.data());
EXPECT_STREQ("@import A;\n", Out.data());
ASSERT_FALSE(minimizeSourceToDependencyDirectives("@import A.B;\n", Out));
EXPECT_STREQ("@import A.B;\n", Out.data());
@ -685,18 +685,19 @@ TEST(MinimizeSourceToDependencyDirectivesTest, ImportFailures) {
minimizeSourceToDependencyDirectives("@import MACRO(A);\n", Out));
ASSERT_FALSE(minimizeSourceToDependencyDirectives("@import \" \";\n", Out));
ASSERT_FALSE(minimizeSourceToDependencyDirectives("import <Foo.h>\n"
ASSERT_FALSE(minimizeSourceToDependencyDirectives("import <Foo.h>;\n"
"@import Foo;",
Out));
EXPECT_STREQ("@import Foo;\n", Out.data());
EXPECT_STREQ("import<Foo.h>;\n@import Foo;\n", Out.data());
ASSERT_FALSE(
minimizeSourceToDependencyDirectives("import <Foo.h>\n"
minimizeSourceToDependencyDirectives("import <Foo.h>;\n"
"#import <Foo.h>\n"
"@;\n"
"#pragma clang module import Foo",
Out));
EXPECT_STREQ("#import <Foo.h>\n"
EXPECT_STREQ("import<Foo.h>;\n"
"#import <Foo.h>\n"
"#pragma clang module import Foo\n",
Out.data());
}
@ -1215,4 +1216,41 @@ TEST(MinimizeSourceToDependencyDirectivesTest, TokensBeforeEOF) {
EXPECT_STREQ("#ifndef A\n#define A\n#endif\n<TokBeforeEOF>\n", Out.data());
}
TEST(MinimizeSourceToDependencyDirectivesTest, PreprocessedModule) {
SmallVector<char, 128> Out;
ASSERT_FALSE(
minimizeSourceToDependencyDirectives("export __preprocessed_module M;\n"
"struct import {};\n"
"import foo;\n"
"__preprocessed_import bar;\n",
Out));
EXPECT_STREQ("export __preprocessed_module M;\n"
"__preprocessed_import bar;\n",
Out.data());
}
TEST(MinimizeSourceToDependencyDirectivesTest, ScanningPreprocessedModuleFile) {
StringRef Source = R"(
export __preprocessed_module M;
struct import {};
import foo;
)";
ASSERT_TRUE(clang::isPreprocessedModuleFile(Source));
Source = R"(
export module M;
struct import {};
import foo;
)";
ASSERT_FALSE(clang::isPreprocessedModuleFile(Source));
Source = R"(
__preprocessed_import foo;
)";
ASSERT_TRUE(clang::isPreprocessedModuleFile(Source));
}
} // end anonymous namespace
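
The `ScanningPreprocessedModuleFile` test above checks that a source buffer is recognized as preprocessed output when it contains a `__preprocessed_module` or `__preprocessed_import` directive. The implementation of `clang::isPreprocessedModuleFile` is not part of this section, so the following is only a guess at the observable behavior, written as a line-based scan; `looksLikePreprocessedModuleFile` and `takeWord` are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Strip leading spaces/tabs from Rest, then split off and return the leading
// run of identifier characters (possibly empty).
static std::string_view takeWord(std::string_view &Rest) {
  std::size_t Begin = Rest.find_first_not_of(" \t");
  if (Begin == std::string_view::npos) {
    Rest = {};
    return {};
  }
  Rest.remove_prefix(Begin);
  std::size_t End = 0;
  while (End < Rest.size() &&
         (std::isalnum(static_cast<unsigned char>(Rest[End])) ||
          Rest[End] == '_'))
    ++End;
  std::string_view Word = Rest.substr(0, End);
  Rest.remove_prefix(End);
  return Word;
}

// Returns true if any line starts (optionally after 'export') with one of the
// keywords that -E mode emits for module and import directives.
bool looksLikePreprocessedModuleFile(std::string_view Source) {
  while (!Source.empty()) {
    std::size_t NL = Source.find('\n');
    std::string_view Line = Source.substr(0, NL);
    Source.remove_prefix(NL == std::string_view::npos ? Source.size() : NL + 1);

    std::string_view Word = takeWord(Line);
    if (Word == "export")
      Word = takeWord(Line);
    if (Word == "__preprocessed_module" || Word == "__preprocessed_import")
      return true;
  }
  return false;
}
```

Applied to the three buffers in the test, this sketch agrees with the asserted results: the first and third contain a `__preprocessed_*` keyword at the start of a line, and the second contains only plain `module` and `import`.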

@ -40,7 +40,7 @@ public:
void moduleImport(SourceLocation ImportLoc, ModuleIdPath Path,
const Module *Imported) override {
ASSERT_TRUE(NextCheckingIndex < IsImportingNamedModulesAssertions.size());
EXPECT_EQ(PP.isInImportingCXXNamedModules(),
EXPECT_EQ(PP.isImportingCXXNamedModules(),
IsImportingNamedModulesAssertions[NextCheckingIndex]);
NextCheckingIndex++;

@ -20450,7 +20450,7 @@
<td>[<a href="https://wg21.link/cpp.module">cpp.module</a>]</td>
<td>open</td>
<td>Limiting macro expansion in <I>pp-module</I></td>
<td align="center">Not resolved</td>
<td class="unreleased" align="center">Clang 23</td>
</tr>
<tr class="open" id="2948">
<td><a href="https://cplusplus.github.io/CWG/issues/2948.html">2948</a></td>

@ -910,7 +910,7 @@ C++23, informally referred to as C++26.</p>
</tr>
<tr>
<td><a href="https://wg21.link/p1703r1">P1703R1</a></td>
<td class="none" align="center">Subsumed by P1857</td>
<td class="unreleased" align="center">Subsumed by P1857</td>
</tr>
<tr> <!-- from Belfast -->
<td><a href="https://wg21.link/p1874r1">P1874R1</a></td>
@ -926,14 +926,7 @@ C++23, informally referred to as C++26.</p>
</tr>
<tr>
<td><a href="https://wg21.link/p1857r3">P1857R3</a></td>
<td class="partial" align="center">
<details>
<summary>Clang 21 (Partial)</summary>
The restriction that "[a] module directive may only appear
as the first preprocessing tokens in a file" is enforced
starting in Clang 21.
</details>
</td>
<td class="unreleased" align="center">Clang 23</td>
</tr>
<tr>
<td><a href="https://wg21.link/p2115r0">P2115R0</a></td>