This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
This is a first pass at implementing
[P2841R7](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2841r7.pdf).
The implementation is far from complete; however, I'm aiming to do that
in chunks, to make our lives easier.
In particular, this does not implement
- Subsumption
- Mangling
- Satisfaction checking is minimal as we should focus on #141776 first
(note that I'm currently very stuck)
FTM, release notes, status page, etc, will be updated once the feature
is more mature. Given the state of the feature, it is not yet allowed in
older language modes.
Of note:
- Mismatches between template template arguments and template template
parameters are a bit wonky. This is addressed by #130603
- We use `UnresolvedLookupExpr` to model template-id. While this is
pre-existing, I have been wondering if we want to introduce a different
OverloadExpr subclass for that. I did not make the change in this patch.
Previously, the `sycl_kernel_entry_point` attribute could be specified
using either the GNU or C++11 spelling styles. Future SYCL attributes
are expected to support only the C++11 spelling style, so support for
the GNU style is being removed.
In order to ensure consistent presentation of the attribute in
diagnostic messages, diagnostics specific to this attribute now require
the attribute to be provided as an argument. This delegates formatting
of the attribute name to the diagnostic engine.
As an additional nicety, "the" is added to some diagnostic messages so
that they read more like proper sentences.
Add `NamespaceBaseDecl` as common base class of `NamespaceDecl` and
`NamespaceAliasDecl`. This simplifies `NestedNameSpecifier` a bit.
Co-authored-by: Matheus Izvekov <mizvekov@gmail.com>
C++23 mandates that temporaries used in range-based for loops are
lifetime-extended
to cover the full loop. This patch adds a check for loop variables and
compiler-
generated `__range` bindings to apply the correct extension.
Includes test cases based on examples from CWG900/P2644R1.
Fixes https://github.com/llvm/llvm-project/issues/109793
All deserialized VarDecl initializers are EvaluatedStmt, but not all
EvaluatedStmt initializers are from a PCH. Calling
`VarDecl::hasInitWithSideEffects` can trigger constant evaluation, but
it's hard to know ahead of time whether that will trigger
deserialization - even if the initializer is fully deserialized, it may
contain a call to a constructor whose body is not deserialized. By
caching the result of `VarDecl::hasInitWithSideEffects` and populating
that cache during deserialization we can guarantee that calling it won't
trigger deserialization regardless of the state of the initializer.
This also reduces memory usage by removing the `InitSideEffectVars` set
in `ASTReader`.
rdar://154717930
Reverts llvm/llvm-project#143739 because it triggers an assert:
```
Assertion failed: (!isNull() && "Cannot retrieve a NULL type pointer"), function getCommonPtr, file Type.h, line 952.
```
Calling `DeclMustBeEmitted` should not lead to more deserialization, as
it may occur before previous deserialization has finished.
When passed a `VarDecl` with an initializer however, `DeclMustBeEmitted`
needs to know whether that initializer contains side effects. When the
`VarDecl` is deserialized but the initializer is not, this triggers
deserialization of the initializer. To avoid this we add a bit to the
serialization format for `VarDecl`s, indicating whether its initializer
contains side effects or not, so that the `ASTReader` can query this
information directly without deserializing the initializer.
rdar://153085264
This is a fix for a completely unrelated patch, that started to cause
failures in the explicit-build.cpp test because the size of the b.pcm
and b-not-a.pcm files became the same. The alignment added by empty
ObjCCategory blobs being written to the file causes them to become the
same size, and the error 'module file has a different size than
expected' will not be emitted as the pcms only track module size, not
content, for whether they are valid.
This prevents that issue by not saving the ObjCCategories if it is
empty. The change in clang/lib/Serialization/ASTReaderDecl.cpp is just
formatting, but shows that the only use of ObjCCategoriesMap loaded from
the file will be OK with null (never loaded) data. It is a bit of a weird
fix, but should help decrease the size of the modules for objects that
are not used.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This changes the type compatibility rules so that it is permitted to
redefine tag types within the same TU so long as they are equivalent
definitions.
It is intentionally not being exposed as an extension in older C
language modes. GCC does not do so and the feature doesn't seem
compelling enough to warrant it.
When serializing and deserializing a FunctionDecl we don't recover
whether or not the decl was a type aware allocator or destroying delete,
because in the final PR that information was placed in a side table in
ASTContext.
In principle it should be possible to re-do the semantic checks to
determine what these flags should be when deserializing, but it seems
like the most robust path is simply recording the flags directly in the
serialized AST.
This introduces a new class 'UnsignedOrNone', which models a lite
version of `std::optional<unsigned>`, but has the same size as
'unsigned'.
This replaces most uses of `std::optional<unsigned>`, and similar
schemes utilizing 'int' and '-1' as sentinel.
Besides the smaller size advantage, this is simpler to serialize, as its
internal representation is a single unsigned int as well.
Fix for regression #130917, changes in #111992 were too broad. This change reduces scope of previous fix. Added `ExternalASTSource::wasThisDeclarationADefinition` to detect cases when FunctionDecl lost body due to declaration merges.
This reverts an earlier attempt
(adb0d8ddceb143749c519d14b8b31b481071da77 and
50e5411e4247421fd606f0a206682fcdf0303ae3) to support these expansions,
which was limited to type arguments and which subverted the purpose
of SubstTemplateTypeParmType.
This propagates the ArgumentPackSubstitutionIndex along with the
AssociatedConstraint, so that the pack expansion works, without
needing any new transforms or otherwise any changes to the template
instantiation process.
This keeps the tests from the reverted commits, and adds a few more
showing the new solution also works for NTTPs.
Fixes https://github.com/llvm/llvm-project/issues/131798
`ASTReader::FinishedDeserializing` uses `NumCurrentElementsDeserializing` to keep track of nested `Deserializing` RAII actions. The `FinishedDeserializing` only performs actions if it is the top-level `Deserializing` layer. This works fine in general, but there is a problematic edge case.
If a call to `redecls()` in `FinishedDeserializing` performs deserialization, we re-enter `FinishedDeserializing` while in the middle of the previous `FinishedDeserializing` call.
The known problematic part of this is that this inner `FinishedDeserializing` can go all the way to `PassInterestingDeclsToConsumer`, which operates on `PotentiallyInterestingDecls` data structure which contain decls that should be handled by the previous `FinishedDeserializing` stage.
The other shared data structures are also somewhat concerning at a high-level in that the inner `FinishedDeserializing` would be handling pending actions that are not "within its scope", but this part is not known to be problematic.
We already have a guard within `PassInterestingDeclsToConsumer` because we can end up with recursive deserialization within `PassInterestingDeclsToConsumer`. The implemented solution is to apply this guard to the portion of `FinishedDeserializing` that performs further deserialization as well. This ensures that recursive deserialization does not trigger `PassInterestingDeclsToConsumer` which may operate on entries that are not ready to be passed.
The 'routine' construct has two forms, one which takes the name of a
function that it applies to, and another where it implicitly figures it
out based on the next declaration. This patch implements the former with
the required restrictions on the name and the function-static-variables
as specified.
What has not been implemented is any clauses for this, any of the A.3.4
warnings, or the other form.
The 'declare' construct is the first of two 'declaration' level
constructs, so it is legal in any place a declaration is, including as a
statement, which this accomplishes by wrapping it in a DeclStmt. All
clauses on this have a 'same scope' requirement, which this enforces as
declaration context instead, which makes it possible to implement these
as a template.
The 'link' and 'device_resident' clauses are also added, which have some
similar/small restrictions, but are otherwise pretty rote.
This patch implements all of the above.
Close https://github.com/llvm/llvm-project/issues/126373
Although the root problems should be we shouldn't place the friend
declaration to the incorrect module, let's avoid bleeding the edge by
stoping diagnosing declarations not in file scope.
Class templates might be only instantiated when they are required to be
complete, but checking the template args against the primary template is
immediate.
This result is cached so that later when the class is instantiated,
checking against the primary template is not repeated.
The 'MatchedPackOnParmToNonPackOnArg' flag is also produced upon
checking against the primary template, so it needs to be cached in the
specialziation as well.
This fixes a bug which has not been in any release, so there are no
release notes.
Fixes#125290
This fixes instantiation of definition for friend function templates,
when the declaration found and the one containing the definition have
different template contexts.
In these cases, the the function declaration corresponding to the
definition is not available; it may not even be instantiated at all.
So this patch adds a bit which tracks which function template
declaration was instantiated from the member template. It's used to find
which primary template serves as a context for the purpose of
obtainining the template arguments needed to instantiate the definition.
Fixes#55509
Relanding patch, with no changes, after it was reverted due to revert of
commit this patch depended on.
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we expect FD->TemplateOrSpecialization to be nonnull.
For deduction guides generated from alias template CTAD, store the
deduction guide they were originated from. The source kind is also
maintained for future expansion in CTAD from inherited constructors.
This tracking is required to determine whether an alias template already
has a deduction guide corresponding to some deduction guide on the
original template, in order to support deduction guides for the alias
from deduction guides declared after the initial usage.
A SYCL kernel entry point function is a non-member function or a static
member function declared with the `sycl_kernel_entry_point` attribute.
Such functions define a pattern for an offload kernel entry point
function to be generated to enable execution of a SYCL kernel on a
device. A SYCL library implementation orchestrates the invocation of
these functions with corresponding SYCL kernel arguments in response to
calls to SYCL kernel invocation functions specified by the SYCL 2020
specification.
The offload kernel entry point function (sometimes referred to as the
SYCL kernel caller function) is generated from the SYCL kernel entry
point function by a transformation of the function parameters followed
by a transformation of the function body to replace references to the
original parameters with references to the transformed ones. Exactly how
parameters are transformed will be explained in a future change that
implements non-trivial transformations. For now, it suffices to state
that a given parameter of the SYCL kernel entry point function may be
transformed to multiple parameters of the offload kernel entry point as
needed to satisfy offload kernel argument passing requirements.
Parameters that are decomposed in this way are reconstituted as local
variables in the body of the generated offload kernel entry point
function.
For example, given the following SYCL kernel entry point function
definition:
```
template<typename KernelNameType, typename KernelType>
[[clang::sycl_kernel_entry_point(KernelNameType)]]
void sycl_kernel_entry_point(KernelType kernel) {
kernel();
}
```
and the following call:
```
struct Kernel {
int dm1;
int dm2;
void operator()() const;
};
Kernel k;
sycl_kernel_entry_point<class kernel_name>(k);
```
the corresponding offload kernel entry point function that is generated
might look as follows (assuming `Kernel` is a type that requires
decomposition):
```
void offload_kernel_entry_point_for_kernel_name(int dm1, int dm2) {
Kernel kernel{dm1, dm2};
kernel();
}
```
Other details of the generated offload kernel entry point function, such
as its name and calling convention, are implementation details that need
not be reflected in the AST and may differ across target devices. For
that reason, only the transformation described above is represented in
the AST; other details will be filled in during code generation.
These transformations are represented using new AST nodes introduced
with this change. `OutlinedFunctionDecl` holds a sequence of
`ImplicitParamDecl` nodes and a sequence of statement nodes that
correspond to the transformed parameters and function body.
`SYCLKernelCallStmt` wraps the original function body and associates it
with an `OutlinedFunctionDecl` instance. For the example above, the AST
generated for the `sycl_kernel_entry_point<kernel_name>` specialization
would look as follows:
```
FunctionDecl 'sycl_kernel_entry_point<kernel_name>(Kernel)'
TemplateArgument type 'kernel_name'
TemplateArgument type 'Kernel'
ParmVarDecl kernel 'Kernel'
SYCLKernelCallStmt
CompoundStmt
<original statements>
OutlinedFunctionDecl
ImplicitParamDecl 'dm1' 'int'
ImplicitParamDecl 'dm2' 'int'
CompoundStmt
VarDecl 'kernel' 'Kernel'
<initialization of 'kernel' with 'dm1' and 'dm2'>
<transformed statements with redirected references of 'kernel'>
```
Any ODR-use of the SYCL kernel entry point function will (with future
changes) suffice for the offload kernel entry point to be emitted. An
actual call to the SYCL kernel entry point function will result in a
call to the function. However, evaluation of a `SYCLKernelCallStmt`
statement is a no-op, so such calls will have no effect other than to
trigger emission of the offload kernel entry point.
Additionally, as a related change inspired by code review feedback,
these changes disallow use of the `sycl_kernel_entry_point` attribute
with functions defined with a _function-try-block_. The SYCL 2020
specification prohibits the use of C++ exceptions in device functions.
Even if exceptions were not prohibited, it is unclear what the semantics
would be for an exception that escapes the SYCL kernel entry point
function; the boundary between host and device code could be an implicit
noexcept boundary that results in program termination if violated, or
the exception could perhaps be propagated to host code via the SYCL
library. Pending support for C++ exceptions in device code and clear
semantics for handling them at the host-device boundary, this change
makes use of the `sycl_kernel_entry_point` attribute with a function
defined with a _function-try-block_ an error.
This reverts commit c3ba6f378ef80d750e2278560c6f95a300114412.
We are seeing performance regressions of up to 40% on some compilations
with this patch, we will investigate and reland after fixing performance
issues.
…ecord level.
This fixes the incorrect diagnostic emitted when compiling the following
snippet
```
// string_view.h
template<class _CharT>
class basic_string_view;
typedef basic_string_view<char> string_view;
template<class _CharT>
class
__attribute__((__preferred_name__(string_view)))
basic_string_view {
public:
basic_string_view()
{
}
};
inline basic_string_view<char> foo()
{
return basic_string_view<char>();
}
// A.cppm
module;
#include "string_view.h"
export module A;
// Use.cppm
module;
#include "string_view.h"
export module Use;
import A;
```
The diagnostic is
```
string_view.h:11:5: error: 'basic_string_view<char>::basic_string_view' from module 'A.<global>' is not present in definition of 'string_view' provided earlier
```
The underlying issue is that deserialization of the `preferred_name`
attribute triggers deserialization of `basic_string_view<char>`, which
triggers the deserialization of the `preferred_name` attribute again
(since it's attached to the `basic_string_view` template).
The deserialization logic is implemented in a way that prevents it from
going on a loop in a literal sense (it detects early on that it has
already seen the `string_view` typedef when trying to start its
deserialization for the second time), but leaves the typedef
deserialization in an unfinished state. Subsequently, the `string_view`
typedef from the deserialized module cannot be merged with the same
typedef from `string_view.h`, resulting in the above diagnostic.
This PR resolves the problem by delaying the deserialization of the
`preferred_name` attribute until the deserialization of the
`basic_string_view` template is completed. As a result of deferring, the
deserialization of the `preferred_name` attribute doesn't need to go on
a loop since the type of the `string_view` typedef is already known when
it's deserialized.
Close https://github.com/llvm/llvm-project/issues/90154
This patch is also an optimization to the lookup process to utilize the
information provided by `export` keyword.
Previously, in the lookup process, the `export` keyword only takes part
in the check part, it doesn't get involved in the lookup process. That
said, previously, in a name lookup for 'name', we would load all of
declarations with the name 'name' and check if these declarations are
valid or not. It works well. But it is inefficient since it may load
declarations that may not be wanted.
Note that this patch actually did a trick in the lookup process instead
of bring module information to DeclarationName or considering module
information when deciding if two declarations are the same. So it may
not be a surprise to me if there are missing cases. But it is not a
regression. It should be already the case. Issue reports are welcomed.
In this patch, I tried to split the big lookup table into a lookup table
as before and a module local lookup table, which takes a combination of
the ID of the DeclContext and hash value of the primary module name as
the key. And refactored `DeclContext::lookup()` method to take the
module information. So that a lookup in a DeclContext won't load
declarations that are local to **other** modules.
And also I think it is already beneficial to split the big lookup table
since it may reduce the conflicts during lookups in the hash table.
BTW, this patch introduced a **regression** for a reachability rule in
C++20 but it was false-negative. See
'clang/test/CXX/module/module.interface/p7.cpp' for details.
This patch is not expected to introduce any other
regressions for non-c++20-modules users since the module local lookup
table should be empty for them.
Close https://github.com/llvm/llvm-project/issues/90154
This patch is also an optimization to the lookup process to utilize the
information provided by `export` keyword.
Previously, in the lookup process, the `export` keyword only takes part
in the check part, it doesn't get involved in the lookup process. That
said, previously, in a name lookup for 'name', we would load all of
declarations with the name 'name' and check if these declarations are
valid or not. It works well. But it is inefficient since it may load
declarations that may not be wanted.
Note that this patch actually did a trick in the lookup process instead
of bring module information to DeclarationName or considering module
information when deciding if two declarations are the same. So it may
not be a surprise to me if there are missing cases. But it is not a
regression. It should be already the case. Issue reports are welcomed.
In this patch, I tried to split the big lookup table into a lookup table
as before and a module local lookup table, which takes a combination of
the ID of the DeclContext and hash value of the primary module name as
the key. And refactored `DeclContext::lookup()` method to take the
module information. So that a lookup in a DeclContext won't load
declarations that are local to **other** modules.
And also I think it is already beneficial to split the big lookup table
since it may reduce the conflicts during lookups in the hash table.
BTW, this patch introduced a **regression** for a reachability rule in
C++20 but it was false-negative. See
'clang/test/CXX/module/module.interface/p7.cpp' for details.
This patch is not expected to introduce any other
regressions for non-c++20-modules users since the module local lookup
table should be empty for them.
---
On the API side, this patch unfortunately add a maybe-confusing argument
`Module *NamedModule` to
`ExternalASTSource::FindExternalVisibleDeclsByName()`. People may think
we can get the information from the first argument `const DeclContext
*DC`. But sadly there are declarations (e.g., namespace) can appear in
multiple different modules as a single declaration. So we have to add
additional information to indicate this.
This is a new Clang-specific attribute to ensure that field
initializations are performed explicitly.
For example, if we have
```
struct B {
[[clang::explicit]] int f1;
};
```
then the diagnostic would trigger if we do `B b{};`:
```
field 'f1' is left uninitialized, but was marked as requiring initialization
```
This prevents callers from accidentally forgetting to initialize fields,
particularly when new fields are added to the class.
The `sycl_kernel_entry_point` attribute is used to declare a function that
defines a pattern for an offload kernel entry point. The attribute requires
a single type argument that specifies a class type that meets the requirements
for a SYCL kernel name as described in section 5.2, "Naming of kernels", of
the SYCL 2020 specification. A unique kernel name type is required for each
function declared with the attribute. The attribute may not first appear on a
declaration that follows a definition of the function. The function is
required to have a non-deduced `void` return type. The function must not be
a non-static member function, be deleted or defaulted, be declared with the
`constexpr` or `consteval` specifiers, be declared with the `[[noreturn]]`
attribute, be a coroutine, or accept variadic arguments.
Diagnostics are not yet provided for the following:
- Use of a type as a kernel name that does not satisfy the forward
declarability requirements specified in section 5.2, "Naming of kernels",
of the SYCL 2020 specification.
- Use of a type as a parameter of the attributed function that does not
satisfy the kernel parameter requirements specified in section 4.12.4,
"Rules for parameter passing to kernels", of the SYCL 2020 specification
(each such function parameter constitutes a kernel parameter).
- Use of language features that are not permitted in device functions as
specified in section 5.4, "Language restrictions for device functions",
of the SYCL 2020 specification.
There are several issues noted by various FIXME comments.
- The diagnostic generated for kernel name conflicts needs additional work
to better detail the relevant source locations; such as the location of
each declaration as well as the original source of each kernel name.
- A number of the tests illustrate spurious errors being produced due to
attributes that appertain to function templates being instantiated too
early (during overload resolution as opposed to after an overload is
selected).
Included changes allow the `SYCLKernelEntryPointAttr` attribute to be
marked as invalid if a `sycl_kernel_entry_point` attribute is used incorrectly.
This is intended to prevent trying to emit an offload kernel entry point
without having to mark the associated function as invalid since doing so
would affect overload resolution; which this attribute should not do.
Unfortunately, Clang eagerly instantiates attributes that appertain to
functions with the result that errors might be issued for function
declarations that are never selected by overload resolution. Tests have
been added to demonstrate this. Further work will be needed to address
these issues (for this and other attributes).
After 0dedd6fe1 and 03229e7c0, invalid concept declarations might lack
expressions for evaluation and normalization. This could make it crash
in certain scenarios, apart from the one of evaluation concepts showed
in 03229e7c0, there's also an issue when checking specializations where
the normalization also relies on a non-null expression.
This patch prevents that by avoiding building up a type constraint in
such situations, thereafter the template parameter wouldn't have a
concept specialization of a null expression.
With this patch, the assumption in ASTWriterDecl is no longer valid.
Namely, HasConstraint and TypeConstraintInitialized must now represent
different meanings for both source fidelity and semantic requirements.
Fixes https://github.com/llvm/llvm-project/issues/115004
Fixes https://github.com/llvm/llvm-project/issues/121980
Reland https://github.com/llvm/llvm-project/pull/83237
---
(Original comments)
Currently all the specializations of a template (including
instantiation, specialization and partial specializations) will be
loaded at once if we want to instantiate another instance for the
template, or find instantiation for the template, or just want to
complete the redecl chain.
This means basically we need to load every specializations for the
template once the template declaration got loaded. This is bad since
when we load a specialization, we need to load all of its template
arguments. Then we have to deserialize a lot of unnecessary
declarations.
For example,
```
// M.cppm
export module M;
export template <class T>
class A {};
export class ShouldNotBeLoaded {};
export class Temp {
A<ShouldNotBeLoaded> AS;
};
// use.cpp
import M;
A<int> a;
```
We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we
instantiate the template `A` in `use.cpp`. Then we will deserialize
`ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this
patch tries to avoid that.
Given that the templates are heavily used in C++, this is a pain point
for the performance.
This patch adds MultiOnDiskHashTable for specializations in the
ASTReader. Then we will only deserialize the specializations with the
same template arguments. We made that by using ODRHash for the template
arguments as the key of the hash table.
To review this patch, I think `ASTReaderDecl::AddLazySpecializations`
may be a good entry point.
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.