This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
[Discourse
link](https://discourse.llvm.org/t/a-small-proposal-for-extraction-of-inline-assembly-block-information/86658)
We strive for exposing such information using existing stable ABIs. In
doing so, queries are limited to what the original source holds or the
LLVM IR `asm` block would expose in connection with attributes that the
queries are concerned.
These APIs opens new opportunities for `rust-bindgen` to translate
inline assemblies in reasonably cases into Rust inline assembly blocks,
which would further aid better interoperability with other existing
code.
---------
Signed-off-by: Xiangfei Ding <dingxiangfei2009@protonmail.ch>
This offloads some test changes from another PR in order to facilitate
review.
- Adds some new tests.
- Cleans stray spaces and newlines on existing tests.
- Regenerates some AST json dumps, as the generator now includes
offsets.
This fixes emitting undefined behavior where a 64-bit generic
pointer is written to a 32-bit slot allocated for a private pointer.
This can be seen in test/CodeGenOpenCL/amdgcn-automatic-variable.cl's
wrong_pointer_alloca.
In the 'single-file-parse' mode, seeing `#include UNDEFINED_IDENTIFIER`
should not be treated as an error. The identifier might be defined in a
header that we decided to skip, resulting in a nonsensical diagnostic
from the user point of view.
The single file parse mode is supposed to enter both branches of an
`#if` directive whenever the condition contains undefined identifiers.
This patch adds support for undefined function-like macros, where we
would previously emit an error that doesn't make sense from end-user
perspective.
(I discovered this while working on a very similar feature that parses
single module only and doesn't enter either `#if` branch when the
condition contains undefined identifiers.)
This PR fixes two crashes in ExtractAPI that occur when decls are
requested via libclang:
- A null-dereference would sometimes happen in
`DeclarationFragmentsBuilder::getFragmentsForClassTemplateSpecialization`
when the template being processed was loaded indirectly via a typedef,
with parameters filled in. The first commit loads the template parameter
locations ahead of time to perform a null check before dereferencing.
- An assertion (or another null-dereference) was happening in
`CXXRecordDecl::bases` when processing a forward-declaration (i.e. a
record without a definition). The second commit guards the use of
`bases` in `ExtractAPIVisitorBase::getBases` by first checking that the
decl in question has a complete definition.
The added test `extract-api-cursor-cpp` adds tests for these two
scenarios to protect against the crash in the future.
Fixes rdar://140592475, fixes rdar://123430367
Original PR: #130537
Originally reverted due to revert of dependent commit. Relanding with no
changes.
This changes the MemberPointerType representation to use a
NestedNameSpecifier instead of a Type to represent the base class.
Since the qualifiers are always parsed as nested names, there was an
impedance mismatch when converting these back and forth into types, and
this led to issues in preserving sugar.
The nested names are indeed a better match for these, as the differences
which a QualType can represent cannot be expressed syntatically, and
they represent the use case more exactly, being either dependent or
referring to a CXXRecord, unqualified.
This patch also makes the MemberPointerType able to represent sugar for
a {up/downcast}cast conversion of the base class, although for now the
underlying type is canonical, as preserving the sugar up to that point
requires further work.
As usual, includes a few drive-by fixes in order to make use of the
improvements.
Original PR: #130537
Reland after updating lldb too.
This changes the MemberPointerType representation to use a
NestedNameSpecifier instead of a Type to represent the base class.
Since the qualifiers are always parsed as nested names, there was an
impedance mismatch when converting these back and forth into types, and
this led to issues in preserving sugar.
The nested names are indeed a better match for these, as the differences
which a QualType can represent cannot be expressed syntatically, and
they represent the use case more exactly, being either dependent or
referring to a CXXRecord, unqualified.
This patch also makes the MemberPointerType able to represent sugar for
a {up/downcast}cast conversion of the base class, although for now the
underlying type is canonical, as preserving the sugar up to that point
requires further work.
As usual, includes a few drive-by fixes in order to make use of the
improvements.
This changes the MemberPointerType representation to use a
NestedNameSpecifier instead of a Type to represent the class.
Since the qualifiers are always parsed as nested names, there was an
impedance mismatch when converting these back and forth into types, and
this led to issues in preserving sugar.
The nested names are indeed a better match for these, as the differences
which a QualType can represent cannot be expressed syntactically, and it
also represents the use case more exactly, being either dependent or
referring to a CXXRecord, unqualified.
This patch also makes the MemberPointerType able to represent sugar for
a {up/downcast}cast conversion of the base class, although for now the
underlying type is canonical, as preserving the sugar up to that point
requires further work.
As usual, includes a few drive-by fixes in order to make use of the
improvements, and removing some duplications, for example
CheckBaseClassAccess is deduplicated from across SemaAccess and
SemaCast.
The tests updated by this commit were designed to check features in the
clang's driver and index that require clang to be targgeting a darwin
platform while running on a darwin host. For that, their execution is
currently gated by the `REQUIRES: system-darwin` annotation.
This approach becomes a problem when trying to run such tests on a
cross-compiling build of clang on a darwin platform. When the default
target is not darwin (e.g. via `LLVM_DEFAULT_TARGET_TRIPLE `), the
tests will still run on a darwin host and fail spuriously because of the
mismatch with the target detection.
To fix this issue, this patch introduces an extra condition to the
tests' REQUIRES annotation, `target={{.*}}-{{darwin|macos}}{{.*}}`,
ensuring they only run when the relevant target is present.
Currently, for AMDGPU, when compiling for OpenCL, we unconditionally use
`private` as the default address space. This is wrong for cases where
the `generic` address space is available, and is corrected via this
patch. In general, this AS map abuse is a bad hack and we should re-work
it altogether, but at least after this patch we will stop being
incorrect for e.g. OpenCL 2.0.
You can provide more than one AST file as an input. Emit a path for a
file with a problem, so you can disambiguate between multiple files.
rdar://65005546
This is a rework of patch [D10833](https://reviews.llvm.org/D10833)
previously posted on LLVM Phabricator by arthurp in 2015. It allows to
retrieve the type of binary operator via libclangs python bindings.
I did clean up the changes, removed unrelated changes and rebased the
changeset to the latest main branch. As this is my first contribution to
the LLVM project, let me know if any required tests or documentation are
missing.
Given the following:
```
template<typename T>
struct A
{
void f(int); // #1
template<typename U>
void f(U); // #2
template<>
void f<int>(int); // #3
};
```
Clang will generate the same USR for `#1` and `#2`. This patch fixes the
issue by including the template arguments of dependent class scope
explicit specializations in their USRs.
### Background
It's surprisingly common for C++ code in the wild to conditionally
show/hide declarations to Doxygen through the use of preprocessor
directives. One especially common version of this pattern is
demonstrated below:
```cpp
/// @brief Test comment
#ifdef DOXYGEN_BUILD_ENABLED
template<typename T>
#else
template <typename T>
typename std::enable_if<std::is_integral<T>::value>::type
#endif
void f() {}
```
There are more examples I've collected below to demonstrate usage of
this pattern:
- Example 1:
[Magnum](8538610fa2/src/Magnum/Resource.h (L117-L127))
- Example 2:
[libcds](9985d2a87f/cds/container/michael_list_nogc.h (L36-L54))
- Example 3:
[rocPRIM](609ae19565/rocprim/include/rocprim/block/detail/block_reduce_raking_reduce.hpp (L60-L65))
From my research, it seems like the most common rationale for this
functionality is hiding difficult-to-parse code from Doxygen, especially
where template metaprogramming is concerned.
Currently, Clang does not support attaching comments to decls if there
are preprocessor comments between the comment and the decl. This is
enforced here:
b6ebea7972/clang/lib/AST/ASTContext.cpp (L284-L287)
Alongside preprocessor directives, any instance of `;{}#@` between a
comment and decl will cause the comment to not be attached to the decl.
#### Rationale
It would be nice for Clang-based documentation tools, such as
[hdoc](https://hdoc.io), to support code using this pattern. Users
expect to see comments attached to the relevant decl — even if there is
an `#ifdef` in the way — which Clang does not currently do.
#### History
Originally, commas were also in the list of "banned" characters, but
were removed in `b534d3a0ef69`
([link](b534d3a0ef))
because availability macros often have commas in them. From my reading
of the code, it appears that the original intent of the code was to
exclude macros and decorators between comments and decls, possibly in an
attempt to properly attribute comments to macros (discussed further in
"Complications", below). There's some more discussion here:
https://reviews.llvm.org/D125061.
### Change
This modifies Clang comment parsing so that comments are attached to
subsequent declarations even if there are preprocessor directives
between the end of the comment and the start of the decl. Furthermore,
this change:
- Adds tests to verify that comments are attached to their associated
decls even if there are preprocessor directives in between
- Adds tests to verify that current behavior has not changed (i.e. use
of the other characters between comment and decl will result in the
comment not being attached to the decl)
- Updates existing `lit` tests which would otherwise break.
#### Complications
Clang [does not yet
support](https://github.com/llvm/llvm-project/issues/38206) attaching
doc comments to macros. Consequently, the change proposed in this RFC
affects cases where a doc comment attached to a macro is followed
immediately by a normal declaration. In these cases, the macro's doc
comments will be attached to the subsequent decl. Previously they would
be ignored because any preprocessor directives between a comment and a
decl would result in the comment not being attached to the decl. An
example of this is shown below.
```cpp
/// Doc comment for a function-like macro
/// @param n
/// A macro argument
#define custom_sqrt(n) __internal_sqrt(n)
int __internal_sqrt(int n) { return __builtin_sqrt(n); }
// NB: the doc comment for the custom_sqrt macro will actually be attached to __internal_sqrt!
```
There is a real instance of this problem in the Clang codebase, namely
here:
be10070f91/clang/lib/Headers/amxcomplexintrin.h (L65-L114)
As part of this RFC, I've added a semicolon to break up the Clang
comment parsing so that the `-Wdocumentation` errors go away, but this
is a hack. The real solution is to fix Clang comment parsing so that doc
comments are properly attached to macros, however this would be a large
change that is outside of the scope of this RFC.
Reapply 4a7bf42a9b83144db8a11ac06cce4da21166e6a2
which was reverted in 34d44eb41dfbbbf01712719558b02763334fbeb3
Not sure why there are tests elsewhere in clang that rely on the output
of clang-format, but they were wrong
### Background
Doxygen's `\par` command
([link](https://www.doxygen.nl/manual/commands.html#cmdpar)) has an
optional argument, which denotes the header of the paragraph started by
a given `\par` command.
In short, the paragraph command can be used with a heading, or without
one. The code block below shows both forms and how the current version
of LLVM/Clang parses this code:
```
$ cat test.cpp
/// \par User defined paragraph:
/// Contents of the paragraph.
///
/// \par
/// New paragraph under the same heading.
///
/// \par
/// A second paragraph.
class A {};
$ clang++ -cc1 -ast-dump -fcolor-diagnostics -std=c++20 test.cpp
`-CXXRecordDecl 0x1530f3a78 <test.cpp:11:1, col:10> col:7 class A definition
|-FullComment 0x1530fea38 <line:2:4, line:9:23>
| |-ParagraphComment 0x1530fe7e0 <line:2:4>
| | `-TextComment 0x1530fe7b8 <col:4> Text=" "
| |-BlockCommandComment 0x1530fe800 <col:5, line:3:30> Name="par"
| | `-ParagraphComment 0x1530fe878 <line:2:9, line:3:30>
| | |-TextComment 0x1530fe828 <line:2:9, col:32> Text=" User defined paragraph:"
| | `-TextComment 0x1530fe848 <line:3:4, col:30> Text=" Contents of the paragraph."
| |-ParagraphComment 0x1530fe8c0 <line:5:4>
| | `-TextComment 0x1530fe898 <col:4> Text=" "
| |-BlockCommandComment 0x1530fe8e0 <col:5, line:6:41> Name="par"
| | `-ParagraphComment 0x1530fe930 <col:4, col:41>
| | `-TextComment 0x1530fe908 <col:4, col:41> Text=" New paragraph under the same heading."
| |-ParagraphComment 0x1530fe978 <line:8:4>
| | `-TextComment 0x1530fe950 <col:4> Text=" "
| `-BlockCommandComment 0x1530fe998 <col:5, line:9:23> Name="par"
| `-ParagraphComment 0x1530fe9e8 <col:4, col:23>
| `-TextComment 0x1530fe9c0 <col:4, col:23> Text=" A second paragraph."
`-CXXRecordDecl 0x1530f3bb0 <line:11:1, col:7> col:7 implicit class A
```
As we can see above, the optional paragraph heading (`"User defined
paragraph"`) is not an argument of the `\par` `BlockCommandComment`, but
instead a child `TextComment`.
For documentation generators like [hdoc](https://hdoc.io/), it would be
ideal if we could parse Doxygen documentation comments with these
semantics in mind. Currently that's not possible.
### Change
This change parses `\par` command according to how Doxygen parses them,
making an optional header available as a an argument if it is present.
In addition:
- AST unit tests are defined to test this functionality when an argument
is present, isn't present, with additional spacing, etc.
- TableGen is updated with an `IsParCommand` to support this
functionality
- `lit` tests are updated where needed
The -mmacos-version-min flag is preferred over -mmacosx-version-min.
This patch updates the tests and documentation to make this clear and
also adds the missing logic to scan build to handle the new flag.
Fixes#86376.
Co-authored-by: Gabor Horvath <gaborh@apple.com>
Doxygen allows for the `@throw`, `@throws`, and `@exception` commands to
have an attached argument indicating the type being thrown. Currently,
Clang's AST parsing doesn't support parsing out this argument from doc
comments. The result is missing compatibility with Doxygen.
This PR implements parsing of arguments for the `@throw`, `@throws`, and
`@exception` commands. Each command can only have one argument, matching
the semantics of Doxygen.
This patch improves the preservation of qualifiers and loss of type
sugar in TemplateNames.
This problem is analogous to https://reviews.llvm.org/D112374 and this
patch takes a very similar approach to that patch, except the impact
here is much lesser.
When a TemplateName was written bare, without qualifications, we
wouldn't produce a QualifiedTemplate which could be used to disambiguate
it from a Canonical TemplateName. This had effects in the TemplateName
printer, which had workarounds to deal with this, and wouldn't print the
TemplateName as-written in most situations.
There are also some related fixes to help preserve this type sugar along
the way into diagnostics, so that this patch can be properly tested.
- Fix dropping the template keyword.
- Fix type deduction to preserve sugar in TST TemplateNames.
Our current method of storing the template arguments as written for
`(Class/Var)Template(Partial)SpecializationDecl` suffers from a number
of flaws:
- We use `TypeSourceInfo` to store `TemplateArgumentLocs` for class
template/variable template partial/explicit specializations. For
variable template specializations, this is a rather unintuitive hack (as
we store a non-type specialization as a type). Moreover, we don't ever
*need* the type as written -- in almost all cases, we only want the
template arguments (e.g. in tooling use-cases).
- The template arguments as written are stored in a number of redundant
data members. For example, `(Class/Var)TemplatePartialSpecialization`
have their own `ArgsAsWritten` member that stores an
`ASTTemplateArgumentListInfo` (the template arguments).
`VarTemplateSpecializationDecl` has yet _another_ redundant member
"`TemplateArgsInfo`" that also stores an `ASTTemplateArgumentListInfo`.
This patch eliminates all
`(Class/Var)Template(Partial)SpecializationDecl` members which store the
template arguments as written, and turns the `ExplicitInfo` member into
a `llvm::PointerUnion<const ASTTemplateArgumentListInfo*,
ExplicitInstantiationInfo*>` (to avoid unnecessary allocations when the
declaration isn't an explicit instantiation). The template arguments as
written are now accessed via `getTemplateArgsWritten` in all cases.
The "most breaking" change is to AST Matchers, insofar that `hasTypeLoc`
will no longer match class template specializations (since they no
longer store the type as written).
Reapplies #84050, addressing a bug which cases a crash when an
expression with the type of the current instantiation is used as the
_postfix-expression_ in a class member access expression (arrow form).
Consider the following:
```cpp
template<typename T>
struct A
{
auto f()
{
return this->x;
}
};
```
Although `A` has no dependent base classes and the lookup context for
`x` is the current instantiation, we currently do not diagnose the
absence of a member `x` until `A<T>::f` is instantiated. This patch
moves the point of diagnosis for such expressions to occur at the point
of definition (i.e. prior to instantiation).
This is necessary to ensure that functions declared in different
translation units whose parameter types only differ in top-level
cv-qualification generate the same USR.
For example:
```
// A.cpp
void f(const int x); // c:@F@f#1I#
// B.cpp
void f(int x); // c:@F@f#I#
```
With this patch, the USR for both functions will be
`c:@F@f#I#`.
Reenables b31414bf4f9898f7817a9fcf8a91f62ec26f3eaf.
Also adds a new warning for missing `--symbol-graph-dir` arg when
`--emit-extension-symbol-graphs` is provided. This also reverts the
commit that removed.
This extends ExtractAPI to take into account symbols defined in categories to types defined in an external module.
This introduces 2 new command line flags, `--symbol-graph-dir=DIR` and `--emit-extension-symbol-graphs`, when used together this generates additional symbol graph files at `DIR/ExtendedModule@ProductName.symbols.json` for each external module that is extended in this way.
Additionally this makes some cleanups to tests to make them more resilient and cleans up the `APISet` data structure.
Previously committed as 9e08e51a20d0d2b1c5724bb17e969d036fced4cd, and
reverted because a dependency commit was reverted, then committed again
as 4b574008aef5a7235c1f894ab065fe300d26e786 and reverted again because
"dependency commit" 5a391d38ac6c561ba908334d427f26124ed9132e was
reverted. But it doesn't seem that 5a391d38ac6c was a real dependency
for this.
This commit incorporates 4b574008aef5a7235c1f894ab065fe300d26e786 and
18e093faf726d15f210ab4917142beec51848258 by Richard Smith (@zygoloid),
with some minor fixes, most notably:
- `UncommonValue` renamed to `StructuralValue`
- `VK_PRValue` instead of `VK_RValue` as default kind in lvalue and
member pointer handling branch in
`BuildExpressionFromNonTypeTemplateArgumentValue`;
- handling of `StructuralValue` in `IsTypeDeclaredInsideVisitor`;
- filling in `SugaredConverted` along with `CanonicalConverted`
parameter in `Sema::CheckTemplateArgument`;
- minor cleanup in
`TemplateInstantiator::transformNonTypeTemplateParmRef`;
- `TemplateArgument` constructors refactored;
- `ODRHash` calculation for `UncommonValue`;
- USR generation for `UncommonValue`;
- more correct MS compatibility mangling algorithm (tested on MSVC ver.
19.35; toolset ver. 143);
- IR emitting fixed on using a subobject as a template argument when the
corresponding template parameter is used in an lvalue context;
- `noundef` attribute and opaque pointers in `template-arguments` test;
- analysis for C++17 mode is turned off for templates in
`warn-bool-conversion` test; in C++17 and C++20 mode, array reference
used as a template argument of pointer type produces template argument
of UncommonValue type, and
`BuildExpressionFromNonTypeTemplateArgumentValue` makes
`OpaqueValueExpr` for it, and `DiagnoseAlwaysNonNullPointer` cannot see
through it; despite of "These cases should not warn" comment, I'm not
sure about correct behavior; I'd expect a suggestion to replace `if` by
`if constexpr`;
- `temp.arg.nontype/p1.cpp` and `dr18xx.cpp` tests fixed.
This adds a warning when applying the `pure` attribute along with the `const` attribute, or when applying the `pure` attribute to a function with a `void` return type (including constructors and destructors).
Fixes https://github.com/llvm/llvm-project/issues/77482
b8f89b84bc26c46a5a10d01eb5414fbde3c8700a inadvertently replaced
startswith/endswith with starts_with/ends_with even though the test
uses a custom StringRef. This patch reverts the change.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This patch deprecates `module.map` in favor of `module.modulemap`, which
has been the preferred form since 2014. The eventual goal is to remove
support for `module.map` to reduce the number of stats Clang needs to do
while searching for module map files.
This patch touches a lot of files, but the majority of them are just
renaming tests or references to the file in comments or documentation.
The relevant files are:
* lib/Lex/HeaderSearch.cpp
* include/clang/Basic/DiagnosticGroups.td
* include/clang/Basic/DiagnosticLexKinds.td