This is the correct order according to the function prototype.
This should be NFC, because for PCH, AllowCompatibleDifferences
is always false: it is only used in isAcceptableASTFile, which
calls readASTFileControlBlock, which explicitly passes false.
We explicitly pass in `nullptr` for Diag, so the incorrect error
message isn't printed.
Also deserialize them back again on reading.
The implementation is based on the existing implementation of `#pragma
weak` serialization.
Fixes issue #186742.
---------
Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>
This PR removes the assumption that a deserialized module file is backed
by a `FileEntry`. The uniquing and lookup role of `ModuleFile`'s
`FileEntryRef` member is entirely replaced with the `ModuleFileKey`
member. For checking whether an existing `ModuleFile` conforms to the
expectations of importers, the file size and mod time are now stored
directly on `ModuleFile` (previously provided by its `FileEntry`).
Together, these changes enable removal of the
`ModuleManager::lookupByFileName(StringRef)` and
`ModuleManager::lookup(const FileEntry *)` APIs.
This removes the assumption that a deserialized module is backed by a
`FileEntry`. The `OptionalFileEntryRef` member is replaced with
`ModuleFile{Name,Key}`.
This PR changes how `ModuleManager` deduplicates module files.
Previously, `ModuleManager` used `FileEntry` for assigning unique
identity to module files. This works fine for explicitly-built modules
because they don't change during the lifetime of a single Clang
instance. For implicitly-built modules however, there are two issues:
1. The `FileEntry` objects are deduplicated by `FileManager` based on
the inode number. Some file systems reuse inode numbers of previously
removed files. Because implicitly-built module files are rapidly removed
and created, this deduplication breaks and compilations may fail
spuriously when inode numbers are recycled during the lifetime of a
single Clang instance.
2. The first thing `ModuleManager` does when loading a module file is
consulting the `FileManager` and checking the file size and modification
time match the expectation of the importer. This is done even when such
module file already lives in the `InMemoryModuleCache`. This introduces
racy behavior into the mechanism that explicitly tries to solve race
conditions, and may lead to spurious compilation failures.
This PR identifies implicitly-built module files by a pair of
`DirectoryEntry` of the module cache path and the path suffix
`<context-hash>/<module-name>-<module-map-path-hash>.pcm`. This gives us
canonicalization of the user-provided module cache path without turning
to `FileEntry` for the PCM file. The path suffix is Clang-generated and
is already canonical.
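As a rough illustration of the identification scheme described above, here is a minimal sketch. It assumes a hypothetical `CacheDir` type standing in for the `DirectoryEntry` of the module cache path, and a hypothetical `ModuleRegistry`; neither is Clang's actual data structure.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the canonicalized module cache directory.
// In Clang this role is played by a `DirectoryEntry`, which is not modeled here.
struct CacheDir {
  std::string CanonicalPath;
};

// Hypothetical key: identity of the cache directory plus the Clang-generated
// path suffix. The inode of the PCM file itself is never consulted.
struct ImplicitModuleKey {
  const CacheDir *Dir;
  std::string Suffix; // "<context-hash>/<module-name>-<module-map-path-hash>.pcm"

  bool operator<(const ImplicitModuleKey &RHS) const {
    if (Dir != RHS.Dir)
      return std::less<const CacheDir *>()(Dir, RHS.Dir);
    return Suffix < RHS.Suffix;
  }
};

// Hypothetical registry that deduplicates module files by key.
class ModuleRegistry {
  std::map<ImplicitModuleKey, int> Loaded; // value: illustrative module ID
  int NextID = 0;

public:
  // Returns the ID already assigned to this key, or assigns a fresh one.
  int getOrAdd(const CacheDir &Dir, const std::string &Suffix) {
    assert(!Suffix.empty() && "the suffix is always generated");
    auto Res = Loaded.insert(std::make_pair(ImplicitModuleKey{&Dir, Suffix}, NextID));
    if (Res.second)
      ++NextID;
    return Res.first->second;
  }
};
```

Because the key never depends on the PCM file's inode, removing and recreating the file does not change which entry it maps to.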
Some tests needed to be updated because the module cache path directory
was also used as an include directory. This PR relies on not caching the
non-existence of the module cache directory in the `FileManager`. When
other parts of Clang are trying to look up the same path and cache its
non-existence, things break. This is probably very specific to some of
our tests and not how users are setting up their compilations.
Previously, the normalized module cache path was only accessible via
`HeaderSearch::getSpecificModuleCachePath()` which may or may not also
contain the context hash. Clients would need to parse the result to
learn the normalized module cache path. What `ASTWriter` does instead is
normalize the as-written module cache path redundantly.
Instead, this PR exposes the normalized module cache path in the
`HeaderSearch` interface and moves the computation of specific module
cache path into the clangLex library.
This is motivated by another patch that would've needed to redundantly
perform the module cache path canonicalization or parse the specific
module cache path.
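The relationship between the two paths can be sketched as follows. The function name and the plain `/` join are assumptions made for illustration; this is not the actual `HeaderSearch` interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: derive the specific module cache path from the
// normalized module cache path and the context hash. An empty context hash
// models the case where the specific path does not contain one.
std::string specificModuleCachePath(const std::string &NormalizedCachePath,
                                    const std::string &ContextHash) {
  assert(!NormalizedCachePath.empty() && "expected a normalized cache path");
  if (ContextHash.empty())
    return NormalizedCachePath;
  return NormalizedCachePath + "/" + ContextHash;
}
```

With the normalized path available separately, a client no longer has to strip the hash component back off the specific path.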
* To avoid the build time overhead of checking for relocated modules,
only check it once per build session.
* Enable relocated module checks in the dependency scanner.
* Add remarks to know when this is happening with `-Rmodule-validation`
This check is necessary to be able to handle new libraries appearing in
earlier search paths. This is a valid scenario when dependency info
changes between incremental builds of the same scheme, thus new build
sessions.
It is still malformed to expect new versions of libraries to be added
within the same build session.
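The once-per-build-session idea can be sketched as a timestamp comparison. The function name and the use of raw seconds are assumptions made for illustration, not the code in this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decide whether a module still needs its relocation
// check in the current build session. Both values are seconds since the epoch.
bool needsRelocationCheck(std::uint64_t LastCheckedAt,
                          std::uint64_t BuildSessionStart) {
  assert(BuildSessionStart != 0 && "session timestamp must be set");
  // A check that happened strictly after the session started is still valid.
  // A tie is treated conservatively and triggers a re-check.
  return LastCheckedAt <= BuildSessionStart;
}
```

A new build session has a later start timestamp, so every module is checked again exactly once in that session.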
resolves: rdar://169174750
After header search has found a header it looks for module maps that
cover that header. This patch uses the parsed representation of module
maps to do this search instead of relying on FileEntryRef lookups after
stating headers in module maps.
This behavior is currently gated behind the
`-fmodules-lazy-load-module-maps` `-cc1` flag.
Introduce `OverflowBehaviorType` (OBT), a new type attribute in Clang
that provides developers with fine-grained control over the overflow
behavior of integer types. This feature allows for a more nuanced
approach to integer safety, achieving better granularity than global
compiler flags like `-fwrapv` and `-ftrapv`. Type specifiers are also
available as keywords `__ob_wrap` and `__ob_trap`.
These can be applied to integer types (both signed and unsigned) as well
as typedef declarations, where the behavior is one of the following:
* `wrap`: Guarantees that arithmetic operations on the type will wrap on
overflow, similar to `-fwrapv`. This suppresses UBSan's integer overflow
checks for the attributed type and prevents eager compiler
optimizations.
* `trap`: Enforces overflow checking for the type, even when global
flags like `-fwrapv` would otherwise suppress it.
A key aspect of this feature is its interaction with existing
mechanisms. `OverflowBehaviorType` takes precedence over global flags
and, notably, over entries in the Sanitizer Special Case List (SSCL).
This allows developers to "allowlist" critical types for overflow
instrumentation, even if they are disabled by a broad rule in an SSCL.
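To make the difference between the two behaviors concrete, the following portable sketch emulates them with ordinary functions. It deliberately does not use the new attribute or keywords; `addWrap` and `addTrap` are hypothetical helpers, and throwing an exception stands in for a real trap.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>

// Emulates `wrap`: two's-complement wraparound. The sum is computed in
// unsigned arithmetic so the sketch itself has no undefined behavior.
std::int32_t addWrap(std::int32_t A, std::int32_t B) {
  std::uint32_t Sum = static_cast<std::uint32_t>(A) + static_cast<std::uint32_t>(B);
  std::int32_t Result;
  std::memcpy(&Result, &Sum, sizeof(Result));
  return Result;
}

// Emulates `trap`: overflow is detected before it happens and reported.
// A real trap would abort the program; throwing keeps the sketch testable.
std::int32_t addTrap(std::int32_t A, std::int32_t B) {
  const std::int32_t Max = std::numeric_limits<std::int32_t>::max();
  const std::int32_t Min = std::numeric_limits<std::int32_t>::min();
  if ((B > 0 && A > Max - B) || (B < 0 && A < Min - B))
    throw std::overflow_error("signed addition overflow");
  return A + B;
}
```

With the type-level feature, the choice between these two behaviors is attached to the type rather than to each call site or to the whole translation unit.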
Signed-off-by: Justin Stitt <justinstitt@google.com>
The expected behavior for implicitly built modules is to validate input
files and to rebuild a module if there are any input file changes. But
if for some reason a module hasn't been rebuilt, it is useful to know if
the validation has been done and what kind of validation.
The goal is to make investigations for fixes like
f2a3079a1b48033a92d0a7d9f03251ebeb4a0c30 and
ada79f4c2691ab6546d379a144377162fd4f5191 easier.
rdar://159857416
---------
Co-authored-by: Cyndy Ishida <cyndyishida@gmail.com>
Depends on #169603.
This is the `use_device_ptr` counterpart of #168905.
With OpenMP 6.1, a `fallback` modifier can be specified on the
`use_device_ptr` clause to control the behavior when a pointer lookup
fails, i.e. there is no device pointer to translate into.
The default is `fb_preserve` (i.e. retain the original pointer), while
`fb_nullify` means: use `nullptr` as the translated pointer.
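The two fallback modes can be modeled with a small lookup function. `DeviceMap`, `Fallback`, and `translate` are hypothetical names for this sketch and do not correspond to the OpenMP runtime's actual interface.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical mirror of the `fb_preserve` / `fb_nullify` modifiers.
enum class Fallback { Preserve, Nullify };

// Hypothetical host-to-device pointer table standing in for the runtime's
// mapping information.
using DeviceMap = std::unordered_map<const void *, void *>;

// Translates a host pointer; on lookup failure the fallback decides the result.
void *translate(const DeviceMap &Map, void *HostPtr,
                Fallback FB = Fallback::Preserve) {
  auto It = Map.find(HostPtr);
  if (It != Map.end())
    return It->second; // a device pointer exists
  return FB == Fallback::Preserve ? HostPtr : nullptr;
}
```

The default argument mirrors the default described above: an unmapped pointer is passed through unchanged unless nullification is requested.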
Dependent PR: #173930.
This PR unifies the terminology for:
* "context hash" - previously ambiguously referred to as "module hash"
or as overly specific "module context hash"
* "specific module cache path" - previously referred to as just "module
cache path" - hard to distinguish from the command-line-provided module
cache path without the context hash
NFCI
It is unclear (to me) why this needs to be done "for safety", but
this change significantly improves the effectiveness of lazy loading.
Reviewed as part of https://github.com/llvm/llvm-project/pull/133057
This reverts commit 1928c1ea9b57e9c44325d436bc7bb2f4585031f3.
We have at least one repro, but I won't be able to work on this until
next week. Also, with the Clang 22 cut upcoming, we probably need to revert
for now.
PCHs (but also modules generated from several implicit invocations like
swiftc) previously reported a confusing diagnostic about module caches
being mismatched by subdir. This is an implementation detail of the
module machinery, and not very useful to the end user. Instead, report
this case as a configuration mismatch when the compiler can confirm that
the module cache was passed identically for the current TU and previously
compiled products.
Ideally, each argument that could result in this error would be uniquely
reported (e.g., O3), but as a starting point, providing something more
general is strictly better than pointing the user to the module cache.
This patch also includes NFCs for renaming variable names from Module to
AST and formatting cleanup in related areas.
resolves: rdar://167453135
RISC-V vector intrinsics are generated dynamically at runtime, so they
are not preserved in the AST when using a precompiled header, and neither
is the related information in SemaRISCV. We need to write this information
to the AST record to be able to use precompiled headers for RISC-V.
Fixes #109634
## Problem
Given code such as `N::foo();`, we perform name look-up on `N`. In the
case where `N` is a namespace declared in imported modules, one
namespace decl (the "key declaration") for each module that declares a
namespace `N` is loaded and stored. At large scale, where there are
many such modules (e.g., 1,500) and many uses (e.g., 500,000), this
becomes extremely inefficient because every look-up (500,000 of them)
returns 1,500 results.
The following synthetic script demonstrates the problem:
```bash
#!/usr/bin/env bash
CLANG=${CLANG:-clang++}
NUM_MODULES=${NUM_MODULES:-1500}
NUM_USES=${NUM_USES:-500000}
USE_MODULES=${USE_MODULES:-true}
TMPDIR=$(mktemp -d)
echo "Working in temp directory: $TMPDIR"
cd $TMPDIR
trap "rm -rf \"$TMPDIR\"" EXIT
echo "namespace N { inline void foo() {} }" > m1.h
for i in $(seq 2 $NUM_MODULES); do echo "namespace N {}" > m${i}.h; done
if $USE_MODULES; then
  seq 1 $NUM_MODULES | xargs -I {} -P $(nproc) bash -c "$CLANG -std=c++20 -fmodule-header m{}.h"
fi
> a.cpp
if $USE_MODULES; then
  for i in $(seq 1 $NUM_MODULES); do echo "import \"m${i}.h\";" >> a.cpp; done
else
  for i in $(seq 1 $NUM_MODULES); do echo "#include \"m${i}.h\"" >> a.cpp; done
fi
echo "int main() {" >> a.cpp
for i in $(seq 1 $NUM_USES); do echo " N::foo();" >> a.cpp; done
echo "}" >> a.cpp
if $USE_MODULES; then
  time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null \
    $(for i in $(seq 1 $NUM_MODULES); do echo -n "-fmodule-file=m${i}.pcm "; done)
else
  time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null
fi
```
As of 575d6892bcc5cef926cfc1b95225148262c96a15, without modules
(`USE_MODULES=false`) this takes about **4.5s**, whereas with modules
(`USE_MODULES=true`), this takes about **37s**.
With this PR, without modules there's no change (as expected) at 4.5s,
but with modules it improves to about **5.2s**.
## Approach
The approach taken here aims to maintain status-quo with respect to the
input and output of modules. That is, the `ASTReader` and `ASTWriter`
both read and write the same declarations as they did before. The
difference is in the middle part: the [`StoredDeclsMap` in
`DeclContext`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/AST/DeclBase.h#L2024-L2030).
The `StoredDeclsMap` is roughly a `map<DeclarationName,
StoredDeclsList>`. Currently, we read all of the external namespace
decls from `ASTReader`, they all get stored into the `StoredDeclsList`,
and the `ASTWriter` iterates through that list and writes out the
results.
This PR continues to read all of the external namespace decls from
`ASTReader`, but only stores one namespace decl in the
`StoredDeclsList`. This is okay since the reading of the decls handles
all of the merging and chaining of the namespace decls, and as long as
they're loaded and chained, returning one for look-up purposes is
sufficient.
The other half of the problem is to write out all of the external
namespaces that we used to store in `StoredDeclsList` but no longer do. For
this, we take advantage of the
[`KeyDecls`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/Serialization/ASTReader.h#L1342-L1347)
data structure in `ASTReader`. `KeyDecls` is roughly a `map<Decl *,
vector<GlobalDeclID>>`, and it stores a mapping from the canonical decl
of a redeclarable decl to a list of `GlobalDeclID`s where each ID
represents a "key declaration" from each imported module. More to the
point, if we read external namespaces `N1`, `N2`, `N3` in `ASTReader`,
we'll either have `N1` mapped to `[N2, N3]`, or some newly local
canonical decl mapped to `[N1, N2, N3]`. Either way, we can visit `N1`,
`N2`, and `N3` by doing `ASTReader::forEachImportedKeyDecls(N1,
Visitor)`, and we leverage this to maintain the current behavior of
writing out all of the imported namespace decls in `ASTWriter`.
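The two `KeyDecls` shapes described above can be modeled with a toy table. This sketch uses plain integer IDs instead of `Decl *`, and `importedKeyDecls` is a hypothetical helper, not `ASTReader::forEachImportedKeyDecls` itself.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using DeclID = std::uint64_t; // hypothetical stand-in for GlobalDeclID

// Toy model: canonical decl -> key declarations from other imported modules.
struct KeyDeclTable {
  std::map<DeclID, std::vector<DeclID>> KeyDecls;
};

// Collects every imported key declaration reachable from the canonical decl.
// If the canonical decl is itself imported, it counts as a key declaration.
std::vector<DeclID> importedKeyDecls(const KeyDeclTable &Table, DeclID Canonical,
                                     bool CanonicalIsImported) {
  std::vector<DeclID> Result;
  if (CanonicalIsImported)
    Result.push_back(Canonical);
  auto It = Table.KeyDecls.find(Canonical);
  if (It != Table.KeyDecls.end())
    Result.insert(Result.end(), It->second.begin(), It->second.end());
  return Result;
}
```

In both shapes the traversal yields `N1`, `N2`, and `N3`, which is why the writer can recover the full set even though only one decl is kept for look-up.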
## Alternatives Attempted
- Tried reading fewer declarations on the `ASTReader` side, and writing
out fewer declarations on the `ASTWriter` side, and neither option
worked at all.
- Tried to split `StoredDeclsList` into two pieces, one with
non-namespace decls and one with only namespace decls, but that didn't
work well... I think because the order of the declarations matters
sometimes, and maybe also because the declaration replacement logic gets
more complicated.
- Tried to deduplicate at the `SemaLookup` level. Basically, retrieve
all the stored decls but deduplicate populating the `LookupResult`
[here](https://github.com/llvm/llvm-project/blob/release/21.x/clang/lib/Sema/SemaLookup.cpp#L1137-L1144).
This did improve things slightly, but not quite enough, and this
solution seemed cleaner in the end anyway.
- CUDA's dynamic parallelism extension allows device-side kernel
launches, which share the same syntax as host-side launches, e.g.,
    kernel<<<Dg, Db, Ns, S>>>(arguments);
but differ in code generation. A device-side kernel launch is
eventually translated into the following sequence:
    config = cudaGetParameterBuffer(alignment, size);
    // set up arguments by copying them into `config`.
    cudaLaunchDevice(func, config, Dg, Db, Ns, S);
- To support the device-side kernel launch, 'CUDAKernelCallExpr' is
reused but its config expr is set to a call to 'cudaLaunchDevice'.
During code generation, 'CUDAKernelCallExpr' is expanded into the
aforementioned sequence.
- Because the device-side kernel launch requires the source to be
compiled as relocatable device code and linked with '-lcudadevrt',
linkers are changed to pass the relevant link options to 'nvlink'.
As described in section 2.14.6 of the OpenMP spec, this patch implements
support for iterators in motion clauses.
---------
Co-authored-by: Shashwathi N <nshashwa@pe31.hpc.amslabs.hpecorp.net>
Similar to the previous no-transitive-change patches for decls, types,
identifiers and source locations (
https://github.com/llvm/llvm-project/pull/92083
https://github.com/llvm/llvm-project/pull/92085
https://github.com/llvm/llvm-project/pull/92511
https://github.com/llvm/llvm-project/pull/86912
)
This patch does the same thing for MacroID and PreprocessedEntityID.
---
### Some background
Previously we recorded different IDs linearly. That is, when writing a
module, if we had 17 decls in imported modules, the IDs of decls in the
module would start from 18. This makes the contents of the BMI change if
we add/remove any decls, types, identifiers or source locations in
the imported modules.
This makes it hard for us to reduce recompilations with modules. We want
to skip recompilations, as we think modules can help us remove
fake dependencies. This can be done by splitting the ID into <ModuleIndex,
LocalIndex> pairs.
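A minimal sketch of such a split ID follows. The 32/32 bit layout and the names `SplitID`, `pack`, and `unpack` are assumptions chosen for illustration; the actual encoding in the serialization code may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical split ID: which module file owns the entity, and the entity's
// index local to that module file.
struct SplitID {
  std::uint32_t ModuleIndex;
  std::uint32_t LocalIndex;
};

// Packs the pair into one 64-bit value (module index in the high half).
std::uint64_t pack(SplitID ID) {
  return (static_cast<std::uint64_t>(ID.ModuleIndex) << 32) | ID.LocalIndex;
}

// Recovers the pair from the packed value.
SplitID unpack(std::uint64_t Raw) {
  return SplitID{static_cast<std::uint32_t>(Raw >> 32),
                 static_cast<std::uint32_t>(Raw & 0xFFFFFFFFu)};
}
```

The point of the split is that `LocalIndex` depends only on the module being written, so adding or removing entities in an imported module no longer shifts the IDs recorded for this module's own entities.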
This is ALREADY done for the several different IDs above. We call it
non-cascading changes
(https://clang.llvm.org/docs/StandardCPlusPlusModules.html#experimental-non-cascading-changes).
Our internal users have already been using this feature and it has worked
well for years.
Now we want to extend this to MacroID and PreprocessedEntityID. This is
helpful for us in the downstream as we allowed named modules to export
macros. But I believe this is also helpful for header-like modules if
you'd like to explore the area.
And also I think this is a nice cleanup too.
---
Given that the use of MacroID and PreprocessedEntityID is not as complicated
as other IDs in the above series, I feel the patch itself should be
good. I hope the vendors can test the patch to make sure it won't affect
existing users.
Close https://github.com/llvm/llvm-project/issues/166068
The cause of the problem is that we would import initializers and
pending implicit instantiations from other named modules. This is very
bad and may waste a lot of time.
We didn't observe the bad behavior for a long time because the weak
symbols can live together and the strong symbols would be removed by
other mechanisms. But it does indeed waste compilation time.
This PR adds support for the `dyn_groupprivate` clause, which will be
part of OpenMP 6.1. This feature allows users to request dynamic shared
memory on target regions.
---------
Co-authored-by: Krzysztof Parzyszek <Krzysztof.Parzyszek@amd.com>
This implements the parts of
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3457.htm which were
adopted at the recent meeting in Brno.
Clang already implemented `__COUNTER__`, but needed some changes for
conformance. Specifically, we now diagnose when the macro is expanded
more than 2147483647 times. Additionally, we now give the expected
extension and pre-compat warnings for the feature.
To support testing the limits, this also adds a -cc1-only option,
`-finitial-counter-value=`, which lets you specify the initial value the
`__COUNTER__` macro should expand to.
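For readers unfamiliar with the macro, here is a small sketch of its basic behavior: each expansion yields the next integer. The `UNIQUE_NAME` helper is a common idiom written for this example and is not part of the patch.

```cpp
#include <cassert>

// Each expansion of __COUNTER__ produces the next value in sequence.
constexpr int First = __COUNTER__;
constexpr int Second = __COUNTER__;
constexpr int Third = __COUNTER__;
static_assert(Second == First + 1, "consecutive expansions differ by one");
static_assert(Third == Second + 1, "consecutive expansions differ by one");

// Common idiom: build identifiers that are unique within a translation unit.
// The extra CONCAT layer forces __COUNTER__ to expand before token pasting.
#define CONCAT_IMPL(A, B) A##B
#define CONCAT(A, B) CONCAT_IMPL(A, B)
#define UNIQUE_NAME(Prefix) CONCAT(Prefix, __COUNTER__)

int UNIQUE_NAME(slot_) = 10;
int UNIQUE_NAME(slot_) = 20; // a different identifier, so no redefinition
```

The sketch only relies on the relative difference between expansions, since the starting value depends on earlier expansions in the same translation unit.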
This PR refactors `ASTUnit::LoadFromASTFile()` to be easier to follow.
Conceptually, it tries to read an AST file, adopt the serialized
options, and set up `Sema` and `ASTContext` to deserialize the AST file
contents on-demand.
The implementation of this used to be spread across an
`ASTReaderListener` and the function in question. Figuring out what
listener method gets called when and how it's supposed to interact with
the rest of the functionality was very unclear. The `FileManager`'s VFS
was being swapped-out during deserialization, the options were being
adopted by `Preprocessor` and others just-in-time to pass `ASTReader`'s
validation checks, and the target was being initialized somewhere in
between all of this. This led to very muddy semantics.
This PR splits `ASTUnit::LoadFromASTFile()` into three distinct steps:
1. Read out the options from the AST file.
2. Initialize objects from the VFS to the `ASTContext`.
3. Load the AST file and hook it up with the compiler objects.
This should be much easier to understand, and I've done my best to
clearly document the remaining gotchas.
(This was originally motivated by the desire to remove
`FileManager::setVirtualFileSystem()` and make it impossible to swap out
VFSs from underneath `FileManager` mid-compile.)
This rename was made as part of
https://github.com/llvm/llvm-project/pull/147835 in order to ease
rebasing the PR, and give a nice window for other patches to get rebased
as well.
It has been a while already, so let's go ahead and rename it back.
This PR enhances the OpenMP `nowait` clause implementation by adding
support for its optional argument in both parsing and semantic analysis
phases.
Reference:
1. OpenMP 6.0 Specification, page 481
#137363 was supposed to be NFC for the `CrossProcessModuleCache` (a.k.a
normal implicit module builds), but accidentally passed the wrong path
to `sys::fs::status`. Then, #141358 removed the correct path that
should've been passed instead. (The variable was flagged as unused.)
None of our existing tests caught this regression; we only found out due
to a SourceKit-LSP benchmark getting slower.
This PR re-implements the original behavior, adds a new remark to Clang
for PCM input file validation, and uses it to create more reliable tests
of the `-fmodules-validate-once-per-build-session` flag.
This reverts commit 8d9aecce06.
Additionally, this refactors how we're doing the AST storage to put it
all in the trailing storage, which will hopefully prevent it from
leaking. The problem was that the AST doesn't call destructors on things
in ASTContext storage, so we weren't actually able to delete the
combiner SmallVector (which I should have known...). This patch instead
moves all of that SmallVector data into trailing storage, which shouldn't
have the same problem with leaking as before.
This is the first patch of a handful to get the reduction combiner
recipe lowering properly. THIS patch is NFC as it doesn't actually
change anything except the structure of the AST.
For each 'combiner' recipe we need a 'LHS', a 'RHS', and an expression to
represent the operation.
Each var-reference can have 1 or more combiners.
IF it is a plain scalar, or a struct with the proper operator, or an
array of either of those, there will be 1.
HOWEVER, aggregates without the proper operator are supposed to be
broken down and done from their elements (which can only be scalars). In
this case, we will represent 1 'combiner' recipe per field-decl.
This patch only puts the infrastructure in place to do so; future
patches will do the work to fill this in.
I originally expected that we were going to need the initExpr stored
separately from the allocaDecl when doing arrays/pointers, however after
implementing it, we found that the idea of having the allocaDecl just
store its init directly still works perfectly. This patch removes the
extra field from the AST.
This change implements the fuse directive, `#pragma omp fuse`, as specified in OpenMP 6.0, along with the `looprange` clause in clang.
This change also adds minimal stubs so flang keeps compiling (a full implementation in flang of this directive is still pending).
---------
Co-authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
A DependentTemplateSpecializationType (DTST) is basically just a
TemplateSpecializationType (TST) with a hardcoded DependentTemplateName
(DTN) as its TemplateName.
This removes the DTST and replaces all uses of it with a TST, removing a
lot of duplication in the implementation.
Technically the hardcoded DTN is an optimization for a most common case,
but the TST implementation is in better shape overall and with other
optimizations, so this patch ends up being an overall performance
positive:
<img width="1465" height="38" alt="image"
src="https://github.com/user-attachments/assets/084b0694-2839-427a-b664-eff400f780b5"
/>
A DTST also didn't allow a template name representing a DTN that was
substituted, such as from an alias template, while the TST does allow it
by the simple fact it can hold an arbitrary TemplateName, so this patch
also increases the amount of sugar retained, while still being faster
overall.
Example (from included test case):
```C++
template<template<class> class TT> using T1 = TT<int>;
template<class T> using T2 = T1<T::template X>;
```
Here we can now represent in the AST that `TT` was substituted for the
dependent template name `T::template X`.
Expressions/references with 'bounds' are going to need to do
initialization significantly differently, so we need to have the
initializer and the declaration 'separate' in the future. This patch
splits the AST node into two, and normalizes them a bit.
Additionally, since this required significant work on the recipe
generation, this patch also does a bit of a refactor to improve
readability and future expansion, now that we have a good understanding
of how these are going to look.
OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the
num_threads clause on parallel directives, along with the message and
severity clauses. This commit implements necessary codegen changes.
This reintroduces `Type.h`, having earlier been renamed to `TypeBase.h`,
as a redirection to `TypeBase.h`, and redirects most users to include
the former instead.
This is a preparatory patch for being able to provide inline definitions
for `Type` methods which would otherwise cause a circular dependency
with `Decl{,CXX}.h`.
Doing these operations in their own NFC patch helps the git rename
detection logic work, preserving the history.
This patch makes clang just a little slower to build (~0.17%), just
because it makes more code indirectly include `DeclCXX.h`.