llvm-project

Author	SHA1	Message	Date
yronglin	1da403937e	Reapply "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" (#173130)" (#173789) This patch reapplies https://github.com/llvm/llvm-project/pull/173130. It implements the following papers: [P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3). [P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1). [CWG2947](https://cplusplus.github.io/CWG/issues/2947.html). At the start of phase 4, an import or module token is treated as starting a directive and is converted to its respective keyword iff: - After skipping horizontal whitespace, it is - at the start of a logical line, or - preceded by an export at the start of the logical line. - It is followed by an identifier pp token (before macro expansion), or - <, ", or : (but not ::) pp tokens for import, or - ; for module Otherwise the token is treated as an identifier. Additionally: - The entire import or module directive (including the closing ;) must be on a single logical line and for module must not come from an #include. - The expansion of macros must not result in an import or module directive introducer that was not there prior to macro expansion. - A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment). - Preprocessor conditionals shall not span a module declaration. After this patch, we handle C++ module-import and module-declaration as real pp-directives in the preprocessor. Additionally, we refactor module name lexing, remove the complex state machine, and read the full module name during module/import directive handling. We could possibly introduce a tok::annot_module_name token in the future to avoid parsing the module name twice, in both the preprocessor and the parser, but it makes error recovery much more difficult (e.g. import a; import b; on the same line). This patch also introduces two new keywords, `__preprocessed_module` and `__preprocessed_import`. These two keywords are generated in `-E` mode.
This is useful to avoid confusion with the `module` and `import` keywords in preprocessed output:
```cpp
export module m;
struct import {};
#define EMPTY
EMPTY import foo;
```
Fixes https://github.com/llvm/llvm-project/issues/54047 The previous patch had a use-after-free issue in the Lexer::LexTokenInternal function. Since C++20, the `export`, `import` and `module` identifiers may introduce a C++ module declaration or import directive, and the directive is handled in `LexIdentifierContinue`. Unfortunately, EOF may be encountered in `LexIdentifierContinue` and `CurLexer` might be destroyed in `HandleEndOfFile`. If the code after `LexIdentifierContinue` then tries to access `LangOpts` or other class members of this Lexer, it hits undefined behavior. This patch also fixes the use-after-free issue in Lexer by introducing a mechanism to delay the destruction of `CurLexer` in the `Preprocessor` class. --------- Signed-off-by: yronglin <yronglin777@gmail.com>	2026-01-20 17:42:46 +08:00
yronglin	71bba12587	Revert "Reapply "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" (#173130)" (#173549) This reverts commit 0d1c396ce8178baf05f277b16bf41b8a6b847d6d. Co-authored-by: Yihan Wang <yihwang@nvidia.com>	2025-12-25 19:55:40 +08:00
yronglin	0d1c396ce8	Reapply "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" (#173130) This PR reapplies https://github.com/llvm/llvm-project/pull/107168. --------- Signed-off-by: Wang, Yihan <yronglin777@gmail.com> Signed-off-by: yronglin <yronglin777@gmail.com>	2025-12-25 18:55:44 +08:00
Paul Kirth	2b8b305d46	Revert "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" (#173118) Reverts llvm/llvm-project#107168 This patch broke on bots: - https://lab.llvm.org/buildbot/#/builders/190/builds/33105 - https://lab.llvm.org/buildbot/#/builders/94/builds/13727 - https://lab.llvm.org/buildbot/#/builders/169/builds/18192 and on mac-aarch64 builds. See https://github.com/llvm/llvm-project/pull/107168#issuecomment-3675990781	2025-12-19 15:17:31 -08:00
yronglin	d2e62d9024	[C++20][Modules] Implement P1857R3 Modules Dependency Discovery (#107168) This PR implements the following papers: [P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3). [P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1). [CWG2947](https://cplusplus.github.io/CWG/issues/2947.html). At the start of phase 4, an import or module token is treated as starting a directive and is converted to its respective keyword iff: - After skipping horizontal whitespace, it is - at the start of a logical line, or - preceded by an export at the start of the logical line. - It is followed by an identifier pp token (before macro expansion), or - <, ", or : (but not ::) pp tokens for import, or - ; for module Otherwise the token is treated as an identifier. Additionally: - The entire import or module directive (including the closing ;) must be on a single logical line and for module must not come from an #include. - The expansion of macros must not result in an import or module directive introducer that was not there prior to macro expansion. - A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment). - Preprocessor conditionals shall not span a module declaration. After this patch, we handle C++ module-import and module-declaration as real pp-directives in the preprocessor. Additionally, we refactor module name lexing, remove the complex state machine, and read the full module name during module/import directive handling. We could possibly introduce a tok::annot_module_name token in the future to avoid parsing the module name twice, in both the preprocessor and the parser, but it makes error recovery much more difficult (e.g. import a; import b; on the same line). This patch also introduces two new keywords, `__preprocessed_module` and `__preprocessed_import`. These two keywords are generated in `-E` mode.
This is useful to avoid confusion with the `module` and `import` keywords in preprocessed output:
```cpp
export module m;
struct import {};
#define EMPTY
EMPTY import foo;
```
Fixes https://github.com/llvm/llvm-project/issues/54047 --------- Signed-off-by: yronglin <yronglin777@gmail.com> Signed-off-by: Wang, Yihan <yronglin777@gmail.com>	2025-12-19 23:29:17 +08:00
Sam McCall	7c1ee5e95f	[Pseudo] Token/TokenStream, PP directive parser. The TokenStream class is the representation of the source code that will be fed into the GLR parser. This patch allows a "raw" TokenStream to be built by reading source code. It also supports scanning a TokenStream to find the directive structure. Next steps (with placeholders in the code): heuristically choosing a path through #ifs, preprocessing the code by stripping directives and comments. These will produce a suitable stream to feed into the parser proper. Differential Revision: https://reviews.llvm.org/D119162	2022-02-23 17:52:02 +01:00
Serge Pavlov	fcb6123d05	Use switch instead of series of comparisons This is a style correction, no functional changes. Differential Revision: https://reviews.llvm.org/D65670 llvm-svn: 367759	2019-08-03 16:32:49 +00:00
Serge Pavlov	3c26163d1a	[Parser] Use special definition for pragma annotations Previously pragma annotation tokens were described as any other annotations in TokenKinds.def. This change introduces special macro PRAGMA_ANNOTATION for the pragma descriptions. It allows implementing checks that deal with pragma annotations only. Differential Revision: https://reviews.llvm.org/D65405 llvm-svn: 367575	2019-08-01 15:15:10 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Craig Topper	f1186c5a8f	[C++11] Use 'nullptr'. llvm-svn: 208280	2014-05-08 06:41:40 +00:00
Alp Toker	b3f9501b25	Prospective MSVC 2010 build fix Try to fix Compiler Error C2011 following r198607 by removing enum from 'enum TokenKind' parameter types. llvm-svn: 198621	2014-01-06 15:52:13 +00:00
Alp Toker	a231ad2216	Support diagnostic formatting of keyword tokens Implemented with a new getKeywordSpelling() accessor. Unlike getTokenName() the result of this function is stable and may be used in diagnostic output. Uses of this feature are split out into the subsequent commit. llvm-svn: 198604	2014-01-06 12:54:18 +00:00
Alp Toker	6d35eab5a6	Rename getTokenSimpleSpelling() to getPunctuatorSpelling() That's what it does, what the documentation says it does and what callers expect it to do. llvm-svn: 198603	2014-01-06 12:54:07 +00:00
Alp Toker	637b347ed0	Apply some LLVM_READONLY / LLVM_READNONE on diagnostic functions llvm-svn: 198598	2014-01-06 11:30:15 +00:00
Chandler Carruth	3a02247dc9	Sort all of Clang's files under 'lib', and fix up the broken headers uncovered. This required manually correcting all of the incorrect main-module headers I could find, and running the new llvm/utils/sort_includes.py script over the files. I also manually added quite a few missing headers that were uncovered by shuffling the order or moving headers up to be main-module-headers. llvm-svn: 169237	2012-12-04 09:13:33 +00:00
Kovarththanan Rajaratnam	7632da4b8a	This patch adds a PUNCTUATOR macro (specialization of TOK) in TokenKinds.def and makes use of it in tok::getTokenSimpleSpelling. llvm-svn: 90042	2009-11-28 16:09:28 +00:00
Douglas Gregor	96977da72c	Clean up and document code modification hints. llvm-svn: 65641	2009-02-27 17:53:17 +00:00
Douglas Gregor	87f95b0a6a	Introduce code modification hints into the diagnostics system. When we know how to recover from an error, we can attach a hint to the diagnostic that states how to modify the code, which can be one of: - Insert some new code (a text string) at a particular source location - Remove the code within a given range - Replace the code within a given range with some new code (a text string) Right now, we use these hints to annotate diagnostic information. For example, if one uses the '>>' in a template argument in C++98, as in this code: template<int I> class B { }; B<1000 >> 2> b1; we'll warn that the behavior will change in C++0x. The fix is to insert parentheses, so we use code insertion annotations to illustrate where the parentheses go: test.cpp:10:10: warning: use of right-shift operator ('>>') in template argument will require parentheses in C++0x B<1000 >> 2> b1; ^ ( ) Use of these annotations is partially implemented for HTML diagnostics, but it's not (yet) producing valid HTML, which may be related to PR2386, so it has been #if 0'd out. In the future, we could consider hooking this mechanism up to the rewriter to actually try to fix these problems during compilation (or, after a compilation whose only errors have fixes). For now, however, I suggest that we use these code modification hints whenever we can, so that we get better diagnostics now and will have better coverage when we find better ways to use this information. This also fixes PR3410 by placing the complaint about missing tokens just after the previous token (rather than at the location of the next token). llvm-svn: 65570	2009-02-26 21:00:50 +00:00
Chris Lattner	7a51313d8a	Make a major restructuring of the clang tree: introduce a top-level lib dir and move all the libraries into it. This follows the main llvm tree, and allows the libraries to be built in parallel. The top level now enforces that all the libs are built before Driver, but we don't care what order the libs are built in. This speeds up parallel builds, particularly incremental ones. llvm-svn: 48402	2008-03-15 23:59:48 +00:00

19 Commits
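
The directive-introducer conditions listed in the P1857R3 commit message (1da403937e) can be illustrated with a small standalone classifier. This is only a sketch under stated assumptions: it is not Clang's implementation, the names `Introducer`, `classifyLine`, and the helper functions are hypothetical, and it mirrors only the conditions as worded in the commit message (it does not check the single-logical-line, `#include`, or macro-expansion restrictions, and it treats any letter or underscore as the start of an identifier).

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical result type for this sketch; not a Clang type.
enum class Introducer { None, Import, Module };

// Skip spaces and tabs (horizontal whitespace) at the front of `s`.
static void skipHorizontalSpace(std::string_view &s) {
  while (!s.empty() && (s.front() == ' ' || s.front() == '\t'))
    s.remove_prefix(1);
}

static bool isIdentStart(char c) {
  return std::isalpha(static_cast<unsigned char>(c)) || c == '_';
}

static bool isIdentChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Consume and return the identifier at the front of `s` (empty if none).
static std::string_view takeIdentifier(std::string_view &s) {
  if (s.empty() || !isIdentStart(s.front()))
    return {};
  std::size_t n = 1;
  while (n < s.size() && isIdentChar(s[n]))
    ++n;
  std::string_view id = s.substr(0, n);
  s.remove_prefix(n);
  return id;
}

// Classify one logical line as it appears before macro expansion.
Introducer classifyLine(std::string_view line) {
  skipHorizontalSpace(line);
  std::string_view word = takeIdentifier(line);
  // An optional leading `export` may precede the introducer.
  if (word == "export") {
    skipHorizontalSpace(line);
    word = takeIdentifier(line);
  }
  const bool isImport = (word == "import");
  const bool isModule = (word == "module");
  if (!isImport && !isModule)
    return Introducer::None;

  skipHorizontalSpace(line);
  if (line.empty())
    return Introducer::None;
  const char next = line.front();

  // Both forms accept a following identifier.
  if (isIdentStart(next))
    return isImport ? Introducer::Import : Introducer::Module;
  if (isImport) {
    if (next == '<' || next == '"')
      return Introducer::Import;
    // `:` is accepted, `::` is not.
    if (next == ':' && !(line.size() > 1 && line[1] == ':'))
      return Introducer::Import;
    return Introducer::None;
  }
  // For `module`, the only other accepted follower in this sketch is `;`.
  return next == ';' ? Introducer::Module : Introducer::None;
}
```

Applied to the lines of the example in the commit message, `export module m;` is classified as a module directive, while `struct import {};` and `EMPTY import foo;` are not directives, because `import` is not at the start of the logical line in either case.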