The current `ASTReader::visitInputFiles()` function calls into `FileManager` to create `FileEntryRef` objects. This ends up being fairly costly in `clang-scan-deps`, where we mostly only care about file paths.
This patch introduces new `ASTReader` API that gives clients access to just the serialized paths. Since the scanner needs both the as-requested path and the on-disk one (and doesn't want to transform the former into the latter via `FileManager`), this patch starts serializing both of them into the PCM file if they differ.
This increases the size of scanning PCMs by 0.1% and speeds up scanning by 5%.
Reviewed By: benlangmuir, vsapsai
Differential Revision: https://reviews.llvm.org/D157066
This reverts commit 0d12683046ca75fb08e285f4622f2af5c82609dc and
reapplies ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2 with an extension to
fix the Flang build.
Differential Revision: https://reviews.llvm.org/D156184
CUDA and HIP have kernel attributes to tune the code generation (in the
backend). To reuse this functionality for OpenMP target regions we
introduce the `ompx_attribute` clause that takes these kernel
attributes and emits code as if they had been attached to the kernel
fuction (which is implicitly generated).
To limit the impact, we only support three kernel attributes:
`amdgpu_waves_per_eu`, for AMDGPU
`amdgpu_flat_work_group_size`, for AMDGPU
`launch_bounds`, for NVPTX
The existing implementations of those attributes are used for error
checking and code generation. `ompx_attribute` can be attached to any
executable target region and it can hold more than one kernel attribute.
Differential Revision: https://reviews.llvm.org/D156184
Currently, `ASTReader` performs some checks to diagnose relocated modules. This can add quite a bit of overhead to the scanner: it requires looking up, parsing and resolving module maps for all transitively loaded module files (and all the module maps encountered in the search paths on the way). Most of those checks are not really useful in the scanner anyway, since it uses strict context hash and immutable filesystem, which prevent those scenarios in the first place.
This can speed up scanning by up to 30%.
Depends on D150292.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D150320
This is a prep patch for avoiding the quadratic number of calls to `HeaderSearch::lookupModule()` in `ASTReader` for each (transitively) loaded PCM file. (Specifically in the context of `clang-scan-deps`).
This patch explicitly serializes `Module::DefinitionLoc` so that we can stop relying on it being filled by the module map parser. This change also required change to the module map parser, where we used the absence of `DefinitionLoc` to determine whether a file came from a PCM file. We also need to make sure we consider the "containing" module map affecting when writing a PCM, so that it's not stripped during serialization, which ensures `DefinitionLoc` still ends up pointing to the correct offset. This is intended to be a NFC change.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D150292
In D114095, `HeaderFileInfo::NumIncludes` was moved into `Preprocessor`. This still makes sense, because we want to track this on the granularity of submodules (D112915, D114173), but the way this information is serialized is not ideal. In `ASTWriter`, the set of included files gets deserialized eagerly, issuing lots of calls to `FileManager::getFile()` for input files the PCM consumer might not be interested in.
This patch makes the information part of the header file info table, taking advantage of its lazy deserialization which typically happens when a file is about to be included.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D155131
This is recommit of 98390ccb80569e8fbb20e6c996b4b8cff87fbec6, reverted
in 82a3969d710f5fb7a2ee4c9afadb648653923fef, because it caused
https://github.com/llvm/llvm-project/issues/63542. Although the problem
described in the issue is independent of the reverted patch, fail of
PCH/late-parsed-instantiations.cpp indeed obseved on PowerPC and is
likely to be caused by wrong serialization of `LateParsedTemplate`
objects. In this patch the serialization is fixed.
Original commit message is below.
Previously function template instantiations occurred with FP options
that were in effect at the end of translation unit. It was a problem
for late template parsing as these FP options were used as attributes of
AST nodes and may result in crash. To fix it FP options are set to the
state of the point of template definition.
Differential Revision: https://reviews.llvm.org/D143241
In `clang-scan-deps` contexts, the number of interesting identifiers in PCM files is fairly low (only macros), while the number of identifiers in the importing instance is high (builtins). Marking the whole identifier table out-of-date triggers lots of benign and expensive calls to `ASTReader::updateOutOfDateIdentifiers()`. (That unfortunately happens even for unused identifiers due to `SemaRef.IdResolver.begin(II)` line in `ASTWriter::WriteASTCore()`.)
This patch makes the main code path more similar to C++ modules, where the PCM files have `INTERESTING_IDENTIFIERS` section which lists identifiers that get created in the identifier table of the importing instance and marked as out-of-date. The only difference is that the main code path doesn't *create* identifiers in the table and relies on the importing instance calling `ASTReader::get()` when creating new identifier on-demand. It only marks existing identifiers as out-of-date.
This speeds up `clang-scan-deps` by 5-10%.
Reviewed By: Bigcheese, benlangmuir
Differential Revision: https://reviews.llvm.org/D151277
In D152070 we added many new intrinsic types required for the RISC-V
Vector Extension.
This was crashing when loading the AST as those types are intrinsically
added to the AST (they don't come from the disk).
The total number required now by clang exceeds 400 so increasing the
value to 500 solves the problem. This value was already increased in
D92715 but I assume this has some impact on the on-disk format.
Also add a static assert to avoid this happening again in the future.
Differential Revision: https://reviews.llvm.org/D153111
Most users of `Module::Header` already assume its `Entry` is populated. Enforce this assumption in the type system and handle the only case where this is not the case by wrapping the whole struct in `std::optional`. Do the same for `Module::DirectoryName`.
Depends on D151584.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D151586
For modules with umbrellas, we track how they were written in the module map. Unfortunately, the getter for the umbrella directory conflates the "as written" directory and the "effective" directory (either the written one or the parent of the written umbrella header).
This patch makes the distinction between "as written" and "effective" umbrella directories clearer. No functional change intended.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D151581
A step to address https://github.com/llvm/llvm-project/issues/62707.
It is not user friendly enough to drop the implicitly generated path
directly. Let's emit the warning first and drop it in the next version.
ASTReader after we start writing
This is intended to mitigate
https://github.com/llvm/llvm-project/issues/61447.
Before the patch, it takes 5s to compile test.cppm in the above
reproducer. After the patch it takes 3s to compile it. Although this
patch didn't solve the problem completely, it should mitigate the
problem for sure. Noted that the behavior of the patch is consistent
with the comment of the originally empty function
ASTReader::finalizeForWriting. So the change should be consistent with
the original design.
Clang currently updates the mtime of .timestamp files on each load of the corresponding .pcm file. This is not necessary. In a given build session, Clang only needs to write the .timestamp file once, when we first validate the input files. This patch makes it so that we only touch the .timestamp file when it's older than the build session, alleviating some filesystem contention in clang-scan-deps.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D149802
Close https://github.com/llvm/llvm-project/issues/62269
Currently, the compiler will emit errors when we compile C++20 modules
if the referenced files changed or got removed. This is because we reuse
the existing logic from Clang implicit modules. It is helpful for clang
implicit modules since it is implicit and we want to be sure things
don't go wrong. But it is not necessary for C++20 modules. The C++20
modules is explicit and it is build systems' responsibility to maintain
the dependencies. So the check in the compiler side may be an overkill.
Reported by Coverity:
1. Inside "ASTReader.cpp" file, in clang::ASTReader::FindExternalLexicalDecls(clang::DeclContext const *, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl *> &): Using the auto keyword without an & causes a copy.
auto_causes_copy: Using the auto keyword without an & causes the copy of an object of type pair.
2. Inside "ASTReader.cpp" file, in clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, llvm::SmallVectorImpl<clang::ASTReader::ImportedSubmodule> *): Using the auto keyword without an & causes a copy.
auto_causes_copy: Using the auto keyword without an & causes the copy of an object of type DenseMapPair.
3. Inside "CGOpenMPRuntimeGPU.cpp" file, in clang::CodeGen::CGOpenMPRuntimeGPU::emitGenericVarsEpilog(clang::CodeGen::CodeGenFunction &, bool): Using the auto keyword without an & causes a copy.
auto_causes_copy: Using the auto keyword without an & causes the copy of an object of type pair.
4. Inside "ASTWriter.cpp" file, in clang::ASTWriter::WriteHeaderSearch(clang::HeaderSearch const &): Using the auto keyword without an & causes a copy.
auto_causes_copy: Using the auto keyword without an & causes the copy of an object of type UnresolvedHeaderDirective.
Reviewed By: tahonermann
Differential Revision: https://reviews.llvm.org/D149461
Fix several problems related to serialization causing command line
defines to be reported as being built-in defines:
* When serializing the <built-in> and <command line> files don't
convert them into absolute paths.
* When deserializing SM_SLOC_BUFFER_ENTRY we need to call
setHasLineDirectives in the same way as we do for
SM_SLOC_FILE_ENTRY.
* When created suggested predefines based on the current command line
options we need to add line markers in the same way that
InitializePreprocessor does.
* Adjust a place in clangd where it was implicitly relying on command
line defines being treated as builtin.
Differential Revision: https://reviews.llvm.org/D144651
We only want to make PCH imports visible once for the the TU, not
repeatedly after every subsequent import. This causes some incorrect
behaviour with submodule visibility, and causes us to get extra module
dependencies in the scanner. So far I have only seen obviously incorrect
behaviour when building with -fmodule-name to cause a submodule to be
textually included when using the PCH, though the old behaviour seems
wrong regardless.
rdar://107449644
Differential Revision: https://reviews.llvm.org/D148176
Previously, distinct lambdas would get merged, and multiple definitions
of the same lambda would not get merged, because we attempted to
identify lambdas by their ordinal position within their lexical
DeclContext. This failed for lambdas within namespace-scope variables
and within variable templates, where the lexical position in the context
containing the variable didn't uniquely identify the lambda.
In this patch, we instead identify lambda closure types by index within
their context declaration, which does uniquely identify them in a way
that's consistent across modules.
This change causes a deserialization cycle between the type of a
variable with deduced type and a lambda appearing as the initializer of
the variable -- reading the variable's type requires reading and merging
the lambda, and reading the lambda requires reading and merging the
variable. This is addressed by deferring loading the deduced type of a
variable until after we finish recursive deserialization.
This also exposes a pre-existing subtle issue where loading a
variable declaration would trigger immediate loading of its initializer,
which could recursively refer back to properties of the variable. This
particularly causes problems if the initializer contains a
lambda-expression, but can be problematic in general. That is addressed
by switching to lazily loading the initializers of variables rather than
always loading them with the variable declaration. As well as fixing a
deserialization cycle, that should improve laziness of deserialization
in general.
LambdaDefinitionData had 63 spare bits in it, presumably caused by an
off-by-one-error in some previous change. This change claims 32 of those bits
as a counter for the lambda within its context. We could probably move the
numbering to separate storage, like we do for the device-side mangling number,
to optimize the likely-common case where all three numbers (host-side mangling
number, device-side mangling number, and index within the context declaration)
are zero, but that's not done in this change.
Fixes#60985.
Reviewed By: #clang-language-wg, aaron.ballman
Differential Revision: https://reviews.llvm.org/D145737
Fix several problems related to serialization causing command line
defines to be reported as being built-in defines:
* When serializing the <built-in> and <command line> files don't
convert them into absolute paths.
* When deserializing SM_SLOC_BUFFER_ENTRY we need to call
setHasLineDirectives in the same way as we do for
SM_SLOC_FILE_ENTRY.
* When created suggested predefines based on the current command line
options we need to add line markers in the same way that
InitializePreprocessor does.
* Adjust a place in clangd where it was implicitly relying on command
line defines being treated as builtin.
Differential Revision: https://reviews.llvm.org/D144651
Close https://github.com/llvm/llvm-project/issues/60824
The form -fmodule-file=<path-to-BMI> will load modules eagerly and the
form -fmodule-file=<module-name>=<path-to-BMI> will load modules lazily.
The inconsistency adds many additional burdens to the implementations.
And the inconsistency looks not helpful and necessary neither. So I want
to deprecate the form -fmodule-file=<path-to-BMI> for named modules.
This is pretty helpful for us (the developers).
Does this change make any regression from the perspective of the users?
To be honest, yes. But I think such regression is acceptable. Here is
the example:
```
// M.cppm
export module M;
export int m = 5;
// N.cpp
// import M; // woops, we forgot to import M.
int n = m;
```
In the original version, the compiler can diagnose the users to import
`M` since the compiler have already imported M. But in the later style,
the compiler can only say "unknown identifier `m`".
But I think such regression doesn't make a deal since it only works if
the user put `-fmodule-file=M.pcm` in the command line. But how can the
user put `-fmodule-file=M.pcm` in the command line without `import M;`?
Especially currently such options are generated by build systems. And
the build systems will only generate the command line from the source
file.
So I think this change is pretty pretty helpful for developers and
almost innocent for users and we should accept this one.
I'll add the release notes and edit the document after we land this.
Differential Revision: https://reviews.llvm.org/D144707
This patch introduces a new type __externref_t that denotes a WebAssembly opaque
reference type. It also implements builtin __builtin_wasm_ref_null_extern(),
that returns a null value of __externref_t. This lays the ground work
for further builtins and reference types.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D122215
This patch introduces a new type __externref_t that denotes a WebAssembly opaque
reference type. It also implements builtin __builtin_wasm_ref_null_extern(),
that returns a null value of __externref_t. This lays the ground work
for further builtins and reference types.
Differential Revision: https://reviews.llvm.org/D122215
Dynamic memory allows users to allocate fast shared memory when a kernel
is launched. We support a single size for all kernels via the
`LIBOMPTARGET_SHARED_MEMORY_SIZE` environment variable but now we can
control it per kernel invocation, hence allow computed values.
Note: Only the nextgen plugins will allocate memory based on the clause,
the old plugins will silently miscompile.
Differential Revision: https://reviews.llvm.org/D141233
When two modules contain interfaces with the same name, check the
definitions are equivalent and diagnose if they are not.
Differential Revision: https://reviews.llvm.org/D140073
Refactor the `Module::Header` class to use an `OptionalFileEntryRef`
instead of a `FileEntry*`. This is preparation for refactoring the
`TopHeaderNames` to use `FileEntryRef` so that we preserve the
lookup path of the headers when serializing.
This is mostly based on https://reviews.llvm.org/D90497
Reviewed By: jansvoboda11
Differential Revision: https://reviews.llvm.org/D142113
Allow completing a redeclaration check for anonymous structs/unions
inside `RecordDecl`, so we deserialize and compare anonymous entities
from different modules.
Completing the redeclaration chain for `RecordDecl` in
`ASTContext::getASTRecordLayout` mimics the behavior in
`CXXRecordDecl::dataPtr`. Instead of completing the redeclaration chain
every time we request a definition, do that right before we need a
complete definition in `ASTContext::getASTRecordLayout`.
Such code is required only for anonymous `RecordDecl` because we
deserialize named decls when we look them up by name. But it doesn't
work for anonymous decls as they don't have a name. That's why need to
force deserialization of anonymous decls in a different way.
rdar://81864186
Differential Revision: https://reviews.llvm.org/D140055
When two modules contain struct/union with the same name, check the
definitions are equivalent and diagnose if they are not. This is similar
to `CXXRecordDecl` where we already discover and diagnose mismatches.
rdar://problem/56764293
Differential Revision: https://reviews.llvm.org/D71734
The needed tweaks are mostly trivial, the one nasty bit is Clang's usage
of OptionalStorage. To keep this working old Optional stays around as
clang::CustomizableOptional, with the default Storage removed.
Optional<File/DirectoryEntryRef> is replaced with a typedef.
I tested this with GCC 7.5, the oldest supported GCC I had around.
Differential Revision: https://reviews.llvm.org/D140332
This reverts commit 8f0df9f3bbc6d7f3d5cbfd955c5ee4404c53a75d.
The Optional*RefDegradesTo*EntryPtr types want to keep the same size as
the underlying type, which std::optional doesn't guarantee. For use with
llvm::Optional, they define their own storage class, and there is no way
to do that in std::optional.
On top of that, that commit broke builds with older GCCs, where
std::optional was not trivially copyable (static_assert in the clang
sources was failing).
This patch gives basic parsing and semantic support for "modifiers" of order clause introduced in OpenMP 5.1 ( section 2.11.3 )
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D127855
The existing ReadAST block only describes the top-level PCM file being
loaded, when in practice most of the time taken is dealing with other
PCM files which are loaded in turn.
Because this work isn't strictly recursive (first all the modules are
discovered, then processsed in several flat loops), we can't have a neat
recursive structure like processing of source files. Instead, slap a
timer on the largest of these boxes: reading the AST block for modules.
In practice this shows where most of the time goes, and in particular
which modules are most expensive.