Consider the following code:
```cpp
# 1 __FILE__ 1 3
export module a;
```
According to the wording in
[P1857R3](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1857r3.html):
```
A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.)
```
and the wording in
[[cpp.pre]](https://eel.is/c++draft/cpp.pre#nt:module-file)
```
module-file:
pp-global-module-fragment[opt] pp-module group[opt] pp-private-module-fragment[opt]
```
`#` is the first pp-token in the translation unit, and it was rejected
by clang, but they really should be exempted from this rule. The goal is
to not allow any preprocessor conditionals or most state changes, but
these don't fit that.
State change would mean most semantically observable preprocessor state,
particularly anything that is order dependent. Global flags like being a
system header/module shouldn't matter.
We should exempt a brunch of directives, even though it violates the
current standard wording.
In this patch, we introduce a `TrivialDirectiveTracer` to trace the
**State change** that described above and propose to exempt the
following kind of directive: `#line`, GNU line marker, `#ident`,
`#pragma comment`, `#pragma mark`, `#pragma detect_mismatch`, `#pragma
clang __debug`, `#pragma message`, `#pragma GCC warning`, `#pragma GCC
error`, `#pragma gcc diagnostic`, `#pragma OPENCL EXTENSION`, `#pragma
warning`, `#pragma execution_character_set`, `#pragma clang
assume_nonnull` and builtin macro expansion.
Fixes https://github.com/llvm/llvm-project/issues/145274
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
This is a first pass at implementing
[P2841R7](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2841r7.pdf).
The implementation is far from complete; however, I'm aiming to do that
in chunks, to make our lives easier.
In particular, this does not implement
- Subsumption
- Mangling
- Satisfaction checking is minimal as we should focus on #141776 first
(note that I'm currently very stuck)
FTM, release notes, status page, etc, will be updated once the feature
is more mature. Given the state of the feature, it is not yet allowed in
older language modes.
Of note:
- Mismatches between template template arguments and template template
parameters are a bit wonky. This is addressed by #130603
- We use `UnresolvedLookupExpr` to model template-id. While this is
pre-existing, I have been wondering if we want to introduce a different
OverloadExpr subclass for that. I did not make the change in this patch.
Previously, the newline after a module directive was not properly
captured and printed by `clang::printDependencyDirectivesAsSource`.
According to P1857R3, each directive must, after skipping horizontal
whitespace, appear at the start of a logical line. Because the newline
after module directives was missing, this invalidated the following
line.
This fixes tests that were previously in violation of P1857R3,
including for Objective-C directives, which should also comply with
P1857R3.
This also ensures that the global module fragment `module;` is captured
by the dependency directives scanner.
We have a parsing helper function which parses either a parenthesized
expression or a parenthesized type name. This is used when parsing a
unary operator such as sizeof, for example.
The problem this solves is when that construct is ambiguous. Consider:
enum E : typeof(int) { F };
After we've parsed the 'typeof', what ParseParenExpression() is
responsible for is '(int) { F }' which looks like a compound literal
expression when it's actually the parens and operand for 'typeof'
followed by the enumerator list for the enumeration declaration. Then
consider:
sizeof (int){ 0 };
After we've parsed 'sizeof', ParseParenExpression is responsible for
parsing something grammatically similar to the problematic case.
The solution is to recognize that the expression form of 'typeof' is
required to have parentheses. So we know the open and close parens that
ParseParenExpression handles must be part of the grammar production for
the operator, not part of the operand expression itself.
Fixes#146351
This PR is 2nd part of
[P1857R3](https://github.com/llvm/llvm-project/pull/107168)
implementation, and mainly implement the restriction `A module directive
may only appear as the first preprocessing tokens in a file (excluding
the global module fragment.)`:
[cpp.pre](https://eel.is/c++draft/cpp.pre):
```
module-file:
pp-global-module-fragment[opt] pp-module group[opt] pp-private-module-fragment[opt]
```
We also refine tests use `split-file` instead of conditional macro.
Signed-off-by: yronglin <yronglin777@gmail.com>
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Following the steps of #82217, this patch reorganizes declarations in
`Parse.h`. Highlights are:
1) Declarations are grouped in the same fashion as in `Sema.h`. Table of
contents is provided at the beginning of `Parser` class. `public`
declaration go first, then `private` ones, but unlike `Sema`, most of
the stuff in `Parser` is private.
2) Documentation has been moved from `.cpp` files to the header. Grammar
was consistently put in `\verbatim` blocks to render nicely in Doxygen.
3) File has been formatted with clang-format, except for the grammar,
because clang-format butchers it.
This adds
- The parsing of `trivially_relocatable_if_eligible`,
`replaceable_if_eligible` keywords
- `__builtin_trivially_relocate`, implemented in terms of memmove. In
the future this should
- Add the appropriate start/end lifetime markers that llvm does not have
(`start_lifetime_as`)
- Add support for ptrauth when that's upstreamed
- the `__builtin_is_cpp_trivially_relocatable` and
`__builtin_is_replaceable` traits
Fixes#127609
This patch adds templated `operator<<` for diagnostics that pass scoped
enums, saving people from `llvm::to_underlying()` clutter on the side of
emitting the diagnostic. This eliminates 80 out of 220 usages of
`llvm::to_underlying()` in Clang.
I also backported `std::is_scoped_enum_v` from C++23.
Fixes#79893
---
This PR addresses the issue of _attributes_ being incorrectly allowed on
`extern template` declarations
```cpp
[[deprecated]] extern template struct S<int>;
```
This PR reland https://github.com/llvm/llvm-project/pull/135808, fixed
some missed changes in LLDB.
I found this issue when I working on
https://github.com/llvm/llvm-project/pull/107168.
Currently we have many similiar data structures like:
- std::pair<IdentifierInfo *, SourceLocation>.
- Element type of ModuleIdPath.
- IdentifierLocPair.
- IdentifierLoc.
This PR unify these data structures to IdentifierLoc, moved
IdentifierLoc definition to SourceLocation.h, and deleted other similer
data structures.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
Reverts llvm/llvm-project#135808
Example from the LLDB macOS CI:
https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/24084/execution/node/54/log/?consoleFull
```
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/source/Plugins/ExpressionParser/Clang/ClangModulesDeclVendor.cpp:360:49: error: no viable conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'clang::ModuleIdPath' (aka 'ArrayRef<IdentifierLoc>')
clang::Module *top_level_module = DoGetModule(clang_path.front(), false);
^~~~~~~~~~~~~~~~~~
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:41:40: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'const llvm::ArrayRef<clang::IdentifierLoc> &' for 1st argument
class LLVM_GSL_POINTER [[nodiscard]] ArrayRef {
^
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:41:40: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'llvm::ArrayRef<clang::IdentifierLoc> &&' for 1st argument
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:70:18: note: candidate constructor not viable: no known conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'std::nullopt_t' for 1st argument
/*implicit*/ ArrayRef(std::nullopt_t) {}
```
I found this issue when I working on
https://github.com/llvm/llvm-project/pull/107168.
Currently we have many similiar data structures like:
- `std::pair<IdentifierInfo *, SourceLocation>`.
- Element type of `ModuleIdPath`.
- `IdentifierLocPair`.
- `IdentifierLoc`.
This PR unify these data structures to `IdentifierLoc`, moved
`IdentifierLoc` definition to SourceLocation.h, and deleted other
similer data structures.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
This is a follow-up to #132129.
Currently, only `Parser` and `SemaBase` get a `DiagCompat()` helper; I’m
planning to keep refactoring compatibility warnings and add more helpers
to other classes as needed. I also refactored a single parser compat
warning just to make sure everything works properly when diagnostics
across multiple components (i.e. Sema and Parser in this case) are
involved.
This is the last item of the OpenACC 3.3 spec. It includes the
implicit-name version of 'routine', plus significant refactorings to
make the two work together. The implicit name version is represented as
an attribute on the function call. This patch also implements the
clauses for the implicit-name version, as well as the A.3.4 warning.
This iterates on #104717 (which we had to revert)
In a bid to increase our chances of success, we try to avoid blowing up
the stack by
- Using `runWithSufficientStackSpace` in ParseCompoundStatement
- Reducing the size of `StmtVector` a bit
- Reducing the size of `DeclsInGroup` a bit
- Removing a few `ParsedAttributes` from the stacks in places where they
are not strictly necessary. `ParsedAttributes` is a _huge_ object
On a 64 bits system, the following stack size reductions are observed
```
ParseStatementOrDeclarationAfterAttributes: 344 -> 264
ParseStatementOrDeclaration: 520 -> 376
ParseCompoundStatementBody: 1080 -> 1016
ParseDeclaration: 264 -> 120
```
Fixes#94728
Close https://github.com/llvm/llvm-project/issues/121066
Now we will diagnose that the import statement lacks a semicolon as
expected. Note that the original "not found" diagnose message remains.
I meant to remove that, but the test shows it might be more complex
process (other unexpected diagnose shows up). Given the importance of
the issue, I chose to not dig deeper.
The parser hangs when processing types/variables prefixed by `::` as an
optional scope specifier. For example,
```
- (instancetype)init:(::A *) foo;
```
The parser should not hang, and it should emit an error. This PR
implements the error check.
rdar://140885078
When incremental processing is enabled, the Parser will never report
tok::eof but tok::annot_repl_input_end. However, that case is already
taken care of in IncrementalParser::ParseOrWrapTopLevelDecl() so this
check was never executing.
This PR implement [P3034R1 Module Declarations Shouldn’t be
Macros](https://wg21.link/P3034R1), and refactor the convoluted state
machines in module name lexical analysis.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Implements `export` keyword in HLSL.
There are two ways the `export` keyword can be used:
1. On individual function declarations
```
export void f() {}
```
2. On a group of function declaration:
```
export {
void f1();
void f2() {}
}
```
Functions declared with the `export` keyword have external linkage. The
implementation does not include validation of when a function can or
cannot be exported, such as when it has resource argument or semantic
annotations. That will be covered by llvm/llvm-project#93330.
Currently all function declarations in global or named namespaces have
external linkage by default so there are no specific code changes
required right now to make sure exported function have external linkage
as well. That will change as part of llvm/llvm-project#92071. Any
additional changes to make sure exported functions still have external
linkage will be done as part of this work item.
Fixes#92812
This patch continues previous efforts to split `Sema` up, this time
covering code completion.
Context can be found in #84184.
Dropping `Code` prefix from function names in `SemaCodeCompletion` would
make sense, but I think this PR has enough changes already.
As usual, formatting changes are done as a separate commit. Hopefully
this helps with the review.
After https://github.com/llvm/llvm-project/pull/85605 ([clang] Set
correct FPOptions if attribute 'optnone' presents) the current FP
options in Sema are saved during parsing function because Sema can
modify them if optnone is present. However they were saved too late, it
caused fails in some cases when precompiled headers are used. This patch
moves the storing earlier.
Attribute `optnone` must turn off all optimizations including fast-math
ones. Actually AST nodes in the 'optnone' function still had fast-math
flags. This change implements fixing FP options before function body is
parsed.
This patch tries to make the boundary of clang module and C++20 named
module more clear.
The changes included:
- Rename `TranslationUnitKind::TU_Module` to
`TranslationUnitKind::TU_ClangModule`.
- Rename `Sema::ActOnModuleInclude` to `Sema::ActOnAnnotModuleInclude`.
- Rename `ActOnModuleBegin` to `Sema::ActOnAnnotModuleBegin`.
- Rename `Sema::ActOnModuleEnd` to `Sema::ActOnAnnotModuleEnd`.
- Removes a warning if we're trying to compile a non-module unit as
C++20 module unit. This is not actually useful and makes (the future)
implementation unnecessarily complex.
This patch meant to be a NFC fix. But it shows that it fixed a bug
suprisingly that previously we would surppress the unused-value warning
in named modules. Because it shares the same logic with clang modules,
which has headers semantics. This shows the change is meaningful.
Our usual pattern when issuing an extension warning is to also issue a
default-off diagnostic about the keywords not being compatible with
standards before a certain point. This adds those diagnostics for C11
keywords.
According to [temp.pre] p5:
> In a template-declaration, explicit specialization, or explicit instantiation the init-declarator-list in the declaration shall contain at most one declarator.
A member-declaration that is a template-declaration or explicit-specialization contains a declaration, even though it declares a member. This means it _will_ contain an init-declarator-list (not a member-declarator-list), so [temp.pre] p5 applies.
This diagnoses declarations such as:
```
struct A
{
template<typename T>
static const int x = 0, f(); // error: a template declaration can only declare a single entity
template<typename T>
static const int g(), y = 0; // error: a template declaration can only declare a single entity
};
```
The diagnostic messages are the same as those of the equivalent namespace scope declarations.
Note: since we currently do not diagnose declarations with multiple abbreviated function template declarators at namespace scope e.g., `void f(auto), g(auto);`, so this patch does not add diagnostics for the equivalent member declarations.
This patch also refactors `ParseSingleDeclarationAfterTemplate` (now named `ParseDeclarationAfterTemplate`) to call `ParseDeclGroup` and return the resultant `DeclGroup`.
Implements https://isocpp.org/files/papers/P2662R3.pdf
The feature is exposed as an extension in older language modes.
Mangling is not yet supported and that is something we will have to do before release.
This upstreams more of the Clang API Notes functionality that is
currently implemented in the Apple fork:
https://github.com/apple/llvm-project/tree/next/clang/lib/APINotes
This is the largest chunk of the API Notes functionality in the
upstreaming process. I will soon submit a follow-up patch to actually
enable usage of this functionality by having a Clang driver flag that
enables API Notes, along with tests.