This PR upstreams a piece of infrastructure from Swift's LLVM fork. This
allows adding custom wrappers around the `FrontendAction` that compiles
Clang modules from a module map into a PCM file. This will be used to
implement modules support in the CAS-based compilation caching mode.
When a PCH is compiled with macro definitions on the command line, such
as `-DCONFIG1`, an unexpected warning can occur if the macro definitions
happen to belong to an imported module's config macros. The warning may
look like the following:
```
definition of configuration macro 'CONFIG1' has no effect on the import of 'Mod1'; pass '-DCONFIG1=...' on the command line to configure the module
```
while `-DCONFIG1` is clearly on the command line when `clang` compiles
the source that uses the PCH and the module.
The reason this can happen is a combination of two things:
1. The logic that checks for config macros is not aware of any command
line macros passed through the PCH
([here](7976ac9900/clang/lib/Frontend/CompilerInstance.cpp (L1562))).
2. `clang` _replaces_ the predefined macros on the command line with the
predefined macros from the PCH, which does not include any builtins
([here](7976ac9900/clang/lib/Frontend/CompilerInstance.cpp (L679))).
This PR teaches the preprocessor to recognize the command line macro
definitions passed transitively through the PCH, so that the error check
does not miss these definitions by mistake. The config macro itself
works fine, and it is only the error check that needs fixing.
rdar://95261458
The `DiagnosticConsumer::finish()` API has historically been a source of
friction. Lots of different clients must manually ensure it gets called
for all consumers to work correctly. Idiomatic C++ uses destructors for
this. In Clang, there are some cases where destructors don't run
automatically, such as under `-disable-free` or some signal handling
code in `clang_main()`. This PR squeezes the complexity of ensuring
those destructors do run out of library code and into the tools that
already deal with the complexities of `-disable-free` and signal
handling.
After header search has found a header it looks for module maps that
cover that header. This patch uses the parsed representation of module
maps to do this search instead of relying on FileEntryRef lookups after
stating headers in module maps.
This behavior is currently gated behind the
`-fmodules-lazy-load-module-maps` `-cc1` flag.
When there's a dependency cycle between modules, the dependency scanner
may encounter a deadlock. This was caused by not respecting the lock
timeout. But even with the timeout implemented, leaving
`unsafeMaybeUnlock()` unimplemented means trying to take a lock after a
timeout would still fail and prevent making progress. This PR implements
this API in a way to avoid UB on `std::mutex` (when it's unlocked by
someone else than the owner). Lastly, this PR makes sure that
`unsafeUnlock()` ends the wait of existing threads, so that they don't
need to hit the full timeout amount.
This PR also implements `-fimplicit-modules-lock-timeout=<seconds>` that
allows tweaking the default 90-second lock timeout, and adds `#pragma
clang __debug sleep` that makes it easier to achieve desired execution
ordering.
rdar://170738600
This PR adds new preprocessor callback that's invoked whenever the
single-module-parse-mode skips over a module import. This will be used
later on from the dependency scanner.
Previously, `-fmodules-single-module-parse-mode` only prevented module
compilation/loading when initiated from an `#include` or `#import`
directive. This PR does the same for `@import`, `#pragma clang module
import` and `#pragma clang module load`. This is done by sinking the
logic down into `CompilerInstance::loadModule()`.
This PR adds new member to `CompilerInstance::ThreadSafeCloneConfig` to
allow using a different `ModuleCache` instance in the cloned
`CompilerInstance`. This is done so that the original and the clone
can't concurrently work on the same `InMemoryModuleCache`, which is not
thread safe. This will be made use of shortly from the dependency
scanner along with the single-module-parse-mode to compile modules
asynchronously/concurrently.
This also fixes an old comment that incorrectly claimed that
`CompilerInstance`'s constructor is responsible for finalizing
`InMemoryModuleCache` buffers, which is no longer the case.
The patch reapply https://github.com/llvm/llvm-project/pull/173130.
This patch implement the following papers:
[P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3).
[P3034R1 Module Declarations Shouldn’t be
Macros](https://wg21.link/P3034R1).
[CWG2947](https://cplusplus.github.io/CWG/issues/2947.html).
At the start of phase 4 an import or module token is treated as starting
a directive and are converted to their respective keywords iff:
- After skipping horizontal whitespace are
- at the start of a logical line, or
- preceded by an export at the start of the logical line.
- Are followed by an identifier pp token (before macro expansion), or
- <, ", or : (but not ::) pp tokens for import, or
- ; for module
Otherwise the token is treated as an identifier.
Additionally:
- The entire import or module directive (including the closing ;) must
be on a single logical line and for module must not come from an
#include.
- The expansion of macros must not result in an import or module
directive introducer that was not there prior to macro expansion.
- A module directive may only appear as the first preprocessing tokens
in a file (excluding the global module fragment.)
- Preprocessor conditionals shall not span a module declaration.
After this patch, we handle C++ module-import and module-declaration as
a real pp-directive in preprocessor. Additionally, we refactor module
name lexing, remove the complex state machine and read full module name
during module/import directive handling. Possibly we can introduce a
tok::annot_module_name token in the future, avoid duplicatly parsing
module name in both preprocessor and parser, but it's makes error
recovery much diffcult(eg. import a; import b; in same line).
This patch also introduce 2 new keyword `__preprocessed_module` and
`__preprocessed_import`. These 2 keyword was generated during `-E` mode.
This is useful to avoid confusion with `module` and `import` keyword in
preprocessed output:
```cpp
export module m;
struct import {};
#define EMPTY
EMPTY import foo;
```
Fixes https://github.com/llvm/llvm-project/issues/54047
The previous patch has an use-after-free issue in
Lexer::LexTokenInternal function. Since C++20, the `export`, `import`
and `module` identifiers may be an introducer of a C++ module
declaration/importing directive, and the directive will handled in
`LexIdentifierContinue`. Unfortunately, the EOF may be encountered in
`LexIdentifierContinue` and `CurLexer` might be destructed in
`HandleEndOfFile`, If the code after `LexIdentifierContinue` try to
access `LangOps` or other class members in this Lexer, it will hit
undefined behavior.
This patch also fix the use-after-free issue in Lexer by introduce a
mechanism to delay the destruction of `CurLexer` in `Preprocessor`
class.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
…nd a module with config macros (#174034)"
This reverts commit 71925cbdf80d4fc88ef844e726f89aa71c64bceb.
The test case is causing test failures on downstream Swift forks. This
PR reverts the commit to avoid failing the test case on Linux.
rdar://168174041
This PR unifies the terminology for:
* "context hash" - previously ambiguously referred to as "module hash"
or as overly specific "module context hash"
* "specific module cache path" - previously referred to as just "module
cache path" - hard to distinguish from the command-line-provided module
cache path without the context hash
NFCI
The `ModuleCache` class is currently reference-counted intrusively. As
explained in https://github.com/llvm/llvm-project/pull/139584, this is
problematic. This PR uses `std::shared_ptr` to reference-count
`ModuleCache` instead, which clarifies what happens to its lifetime when
constructing `CompilerInstance`, for example. This also makes the
reference in `ModuleManager` non-owning, simplifying the ownership
relationship further. The
`ASTUnit::transferASTDataFromCompilerInstance()` function now accounts
for that by taking care to keep it alive.
When a PCH is compiled with macro definitions on the command line, such
as `-DCONFIG1`, an unexpected warning can occur if the macro definitions
happen to belong to an imported module's config macros. The warning may
look like the following:
```
definition of configuration macro 'CONFIG1' has no effect on the import of 'Mod1'; pass '-DCONFIG1=...' on the command line to configure the module
```
while `-DCONFIG1` is clearly on the command line when `clang` compiles
the source that uses the PCH and the module.
The reason this can happen is a combination of two things:
1. The logic that checks for config macros is not aware of any command
line macros passed through the PCH
([here](7976ac9900/clang/lib/Frontend/CompilerInstance.cpp (L1562))).
2. `clang` _replaces_ the predefined macros on the command line with the
predefined macros from the PCH, which does not include any builtins
([here](7976ac9900/clang/lib/Frontend/CompilerInstance.cpp (L679))).
This PR teaches the preprocessor to recognize the command line macro
definitions passed transitively through the PCH, so that the error check
does not miss these definitions by mistake. The config macro itself
works fine, and it is only the error check that needs fixing.
rdar://95261458
PCHs (but also modules generated from several implicit invocations like
swiftc) previously reported a confusing diagnostic about module caches
being mismatched by subdir. This is an implementation detail of the
module machinery, and not very useful to the end user. Instead, report
this case as a configuration mismatch when the compiler can confirm the
module cache was passed the same between the current TU & previously
compiled products.
Ideally, each argument that could result in this error would be uniquely
reported (e.g., O3), but as a starting point, providing something more
general is strictly better than pointing the user to the module cache.
This patch also includes NFCs for renaming variable names from Module to
AST and formatting cleanup in related areas.
resolves: rdar://167453135
This permits pass plugins to use llvm:🆑:opt. Additionally, add a test
of -fpass-plugin, this was previously not tested at all.
I'm not sure whether using the LLVM Bye.so in the tests is possible this
way (e.g., if Clang is built standalone).
Reland after #173279.
Pull Request: https://github.com/llvm/llvm-project/pull/173287
This permits pass plugins to use llvm:🆑:opt. Additionally, add a test
of -fpass-plugin, this was previously not tested at all.
I'm not sure whether using the LLVM Bye.so in the tests is possible this
way (e.g., if Clang is built standalone).
Pull Request: https://github.com/llvm/llvm-project/pull/171868
This PR implement the following papers:
[P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3).
[P3034R1 Module Declarations Shouldn’t be
Macros](https://wg21.link/P3034R1).
[CWG2947](https://cplusplus.github.io/CWG/issues/2947.html).
At the start of phase 4 an import or module token is treated as starting
a directive and are converted to their respective keywords iff:
- After skipping horizontal whitespace are
- at the start of a logical line, or
- preceded by an export at the start of the logical line.
- Are followed by an identifier pp token (before macro expansion), or
- <, ", or : (but not ::) pp tokens for import, or
- ; for module
Otherwise the token is treated as an identifier.
Additionally:
- The entire import or module directive (including the closing ;) must
be on a single logical line and for module must not come from an
#include.
- The expansion of macros must not result in an import or module
directive introducer that was not there prior to macro expansion.
- A module directive may only appear as the first preprocessing tokens
in a file (excluding the global module fragment.)
- Preprocessor conditionals shall not span a module declaration.
After this patch, we handle C++ module-import and module-declaration as
a real pp-directive in preprocessor. Additionally, we refactor module
name lexing, remove the complex state machine and read full module name
during module/import directive handling. Possibly we can introduce a
tok::annot_module_name token in the future, avoid duplicatly parsing
module name in both preprocessor and parser, but it's makes error
recovery much diffcult(eg. import a; import b; in same line).
This patch also introduce 2 new keyword `__preprocessed_module` and
`__preprocessed_import`. These 2 keyword was generated during `-E` mode.
This is useful to avoid confusion with `module` and `import` keyword in
preprocessed output:
```cpp
export module m;
struct import {};
#define EMPTY
EMPTY import foo;
```
Fixes https://github.com/llvm/llvm-project/issues/54047
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
Signed-off-by: Wang, Yihan <yronglin777@gmail.com>
d076608d58d1ec55016eb747a995511e3a3f72aa moved some deps around to avoid
cycles and left clang/Frontend/FrontendDiagnostic.h as a shim that
simply includes clang/Basic/DiagnosticFrontend.h. This PR inlines it so
that nothing in tree still includes clang/Frontend/FrontendDiagnostic.h.
Doing this will help prevent future layering issues. See #162865.
Frontend already depends on Basic, so no new deps need to be added
anywhere except for places that do strict dep checking.
It is possible to run into situations where no mcpu is specified for an offload device side compilation. Currently, this'd lead to a rather uninformative blank being presented as the target for a failing compilation, when messaging the error count. This patch changes things so that if there is no `-mcpu` we print the triple, which is slightly more helpful, especially when there are multiple offload targets for a single compilation.
Conceptually, the `CompilerInstance` doesn't need an instance of the
`FileManager` to create an output file. This PR enables that, removing
an edge-case in `cc1_main()`.
PR https://github.com/llvm/llvm-project/pull/150123 changed how we
normalize the modules cache path. Unfortunately, empty path would get
normalized to the current working directory. This means that even
explicitly-built PCMs that don't rely on the CWD now embed it, leading
to surprising behavior. This PR fixes that by normalizing an empty
modules cache path to an empty string.
Since https://github.com/llvm/llvm-project/pull/158381 the
`CompilerInstance` is aware of the VFS and co-owns it. To reduce scope
of that PR, the VFS was being inherited from the `FileManager` during
`setFileManager()` if it wasn't configured before. However, the
implementation of that setter was buggy. This PR fixes the bug, and
moves us closer to the long-term goal of `CompilerInstance` requiring
the VFS to be configured explicitly and owned by the instance.
The `CompilerInstance::createSourceManager()` function currently accepts
the `FileManager` to be used. However, all clients call
`CompilerInstance::createFileManager()` prior to creating the
`SourceManager`, and it never makes sense to use a `FileManager` in the
`SourceManager` that's different from the rest of the compiler. Passing
the `FileManager` explicitly is redundant, error-prone, and deviates
from the style of other `CompilerInstance` initialization APIs.
This PR therefore removes the `FileManager` parameter from
`createSourceManager()` and also stops returning the `FileManager`
pointer from `createFileManager()`, since that was its primary use. Now,
`createSourceManager()` internally calls `getFileManager()` instead.
This PR virtualizes module cache pruning via the new `ModuleCache`
interface. Currently this is an NFC, but I left a FIXME in
`InProcessModuleCache` to make this more efficient for the dependency
scanner.
This PR changes `llvm::FileCollector` to use the `llvm::vfs::FileSystem`
API for making file paths absolute instead of using
`llvm::sys::fs::make_absolute()` directly. This matches the behavior of
the compiler on most other input files.
This PR is a part of the effort to make the VFS used in the compiler
more explicit and consistent.
Instead of creating the VFS deep within the compiler (in
`CompilerInstance::createFileManager()`), clients are now required to
explicitly call `CompilerInstance::createVirtualFileSystem()` and
provide the base VFS from the outside.
This PR also helps in breaking up the dependency cycle where creating a
properly configured `DiagnosticsEngine` requires a properly configured
VFS, but creating properly configuring a VFS requires the
`DiagnosticsEngine`.
Both `CompilerInstance::create{FileManager,Diagnostics}()` now just use
the VFS already in `CompilerInstance` instead of taking one as a
parameter, making the VFS consistent across the instance sub-object.
This PR uses the new-ish `llvm::vfs::FileSystem::visit()` interface to
collect VFS overlay entries from an existing `FileSystem` instance
rather than parsing the VFS YAML file anew. This prevents duplicate
diagnostics as observed by `clang/test/VFS/broken-vfs-module-dep.c`.
This reverts commit 613caa909c78f707e88960723c6a98364656a926, essentially
reapplying 4a4bddec3571d78c8073fa45b57bbabc8796d13d after moving
`normalizeModuleCachePath` from clangFrontend to clangLex.
This PR is part of an effort to remove file system usage from the
command line parsing code. The reason for that is that it's impossible
to do file system access correctly without a configured VFS, and the VFS
can only be configured after the command line is parsed. I don't want to
intertwine command line parsing and VFS configuration, so I decided to
perform the file system access after the command line is parsed and the
VFS is configured - ideally right before the file system entity is used
for the first time.
This patch delays normalization of the module cache path until
`CompilerInstance` is asked for the cache path in the current
compilation context.
This reverts commit 4a4bddec3571d78c8073fa45b57bbabc8796d13d. The Serialization library doesn't link Frontend, where CompilerInstance lives, causing link failures on some build bots.
This PR is part of an effort to remove file system usage from the
command line parsing code. The reason for that is that it's impossible
to do file system access correctly without a configured VFS, and the VFS
can only be configured after the command line is parsed. I don't want to
intertwine command line parsing and VFS configuration, so I decided to
perform the file system access after the command line is parsed and the
VFS is configured - ideally right before the file system entity is used
for the first time.
This patch delays normalization of the module cache path until
`CompilerInstance` is asked for the cache path in the current
compilation context.
Upstreams https://github.com/swiftlang/llvm-project/pull/10943.
https://github.com/llvm/llvm-project/pull/134887 added a clone for the
compiler instance in `compileModuleAndReadASTImpl`, which would then be
destroyed *after* the corresponding read in the importing instance.
Swift has a `SwiftNameLookupExtension` module extension which updates
(effectively) global state - populating the lookup table for a module on
read and removing it when the module is destroyed.
With newly cloned instance, we would then see:
- Module compiled with cloned instance
- Module read with importing instance
- Lookup table for that module added
- Cloned instance destroyed
- Module from that cloned instance destroyed
- Lookup table for that module name removed
Depending on the original semantics is incredibly fragile, but for now
it's good enough to ensure that the read in the importing instance is
after the cloned instanced is destroyed. Ideally we'd only ever add to
the lookup tables in the original importing instance, never its clones.
Co-authored-by: Ben Barham <ben_barham@apple.com>
When two threads are accessing the same `pcm`, it is possible that the
reading thread sees the timestamp update, while the file on disk is not
updated.
This PR moves timestamp update from `writeAST` to
`compileModuleAndReadASTImpl`, so we only update the timestamp after the
file has been committed to disk.
rdar://152097193
This commit handles the following types:
- clang::ExternalASTSource
- clang::TargetInfo
- clang::ASTContext
- clang::SourceManager
- clang::FileManager
Part of cleanup #151026
Handles clang::DiagnosticsEngine and clang::DiagnosticIDs.
For DiagnosticIDs, this mostly migrates from `new DiagnosticIDs` to
convenience method `DiagnosticIDs::create()`.
Part of cleanup https://github.com/llvm/llvm-project/issues/151026
This switches to `makeIntrusiveRefCnt<FileSystem>` where creating a new
object, and to passing/returning by `IntrusiveRefCntPtr<FileSystem>`
instead of `FileSystem*` or `FileSystem&`, when dealing with existing
objects.
Part of cleanup #151026.
report_fatal_error is not a good way to report diagnostics to the users, so this switches to using actual diagnostic reporting mechanisms instead.
Fixes#147187
Some `LangOptions` duplicate their `CodeGenOptions` counterparts. My
understanding is that this was done solely because some infrastructure
(like preprocessor initialization, serialization, module compatibility
checks, etc.) were only possible/convenient for `LangOptions`. This PR
implements the missing support for `CodeGenOptions`, which makes it
possible to remove some duplicate `LangOptions` fields and simplify the
logic. Motivated by https://github.com/llvm/llvm-project/pull/146342.
This patch fixes an issue where Microsoft-specific layout attributes,
such as __declspec(empty_bases), were ignored during CUDA/HIP device
compilation on a Windows host. This caused a critical memory layout
mismatch between host and device objects, breaking libraries that rely
on these attributes for ABI compatibility.
The fix introduces a centralized hasMicrosoftRecordLayout() check within
the TargetInfo class. This check is aware of the auxiliary (host) target
and is set during TargetInfo::adjust if the host uses a Microsoft ABI.
The empty_bases, layout_version, and msvc::no_unique_address attributes
now use this centralized flag, ensuring device code respects them and
maintains layout consistency with the host.
Fixes: https://github.com/llvm/llvm-project/issues/146047
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.