This patch deprecates `module.map` in favor of `module.modulemap`, which
has been the preferred form since 2014. The eventual goal is to remove
support for `module.map` to reduce the number of stats Clang needs to do
while searching for module map files.
This patch touches a lot of files, but the majority of them are just
renaming tests or references to the file in comments or documentation.
The relevant files are:
* lib/Lex/HeaderSearch.cpp
* include/clang/Basic/DiagnosticGroups.td
* include/clang/Basic/DiagnosticLexKinds.td
This patch tries to reduce the size of the BMIs by packing more bits
into an unsigned integer.
This patch was reverted due to a buildbot failure report, but after a
second look the failure appears to be irrelevant to this change. So I am
recommitting this NFC change.
Close https://github.com/llvm/llvm-project/issues/73893
As the issue shows, the diagnostic for an invisible namespace is
generally more confusing than helpful. This patch implements the
solution suggested in the issue: don't diagnose invisible
namespaces.
Fix the decl-params-determinisim test after 48be81e1 packed some information
in the clang module. The test makes sure the decls appear in
a strict order, and it relies on checking the correct field in the
bitcode format.
Add more explanation in the comments to help future updates when
serialization format changes affect this test.
Neither option affects the AST content that is serialized into the PCM. This
commit includes the following changes:
1.) Mark `-fvisibility={}` and `-ftype-visibility={}` as benign options. That
means they are no longer considered part of the module hash, which can
reduce the number of module variants.
2.) Add a test to verify the generated LLVM IR is not affected by the default
visibility mode in the module.
3.) Add a test to clang-scan-deps to ensure only one module is built, even if
the above-mentioned options are used.
This fixes rdar://118246054.
Due to an oversight, when users use an unexported declaration from the
implicit global module, the diagnostic shows "please #include"
instead of "please import". This patch corrects the behavior.
Also, previously, when users used an unexported declaration from a module
partition, the diagnostic message always showed the partition name,
regardless of whether that partition name was visible to the users. Now
users only see the partition name if they are in the same module as
the partition unit.
Close https://github.com/llvm/llvm-project/issues/71347
Previously I misread the concept of module purview. I thought that if a
declaration is attached to an unnamed module, it can't be part of the module
purview. But after the issue report, I recognized that module purview is
more a concept about locations than about semantics.
Concretely, declarations inside a language linkage specification that
appears after the module declaration can be exported.
This patch refactors `Module::isModulePurview()` and introduces some
possible code cleanups.
Close https://github.com/llvm/llvm-project/issues/60996.
Previously, clang tried to import function bodies from other module
units as much as possible to get more optimization opportunities. That
motivation became the direct cause of the above issue.
However, according to the discussion in SG15, importing
function bodies from other module units breaks ABI compatibility and
is unwanted, so the original behavior of clang was incorrect. This patch
chooses not to import function bodies from other module units in any
case, to follow that expectation.
Note that the desired optimized BMI idea is discarded too, since it would
still break ABI compatibility once we import function bodies
separately.
The release note will be added separately.
There is a similar issue for variable definitions. I'll try to handle
that in a different commit.
After #70144 Clang started resolving module maps even for
`__has_include()` expressions. This had the unintended consequence of
emitting diagnostics around header misuse. These don't make sense if you
actually don't bring contents of the header into the importer, so should
be skipped for `__has_include()`. This patch moves emission of these
diagnostics out of `Preprocessor::LookupFile()` up into
`Preprocessor::LookupHeaderIncludeOrImport()`.
Previously, each boolean value occupied a slot that could hold an
integer. This wastes space, especially when boolean values are
serialized consecutively. The patch packs such consecutive
boolean values (and enum values) into shared integers, which saves space
and, with it, time.
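The packing idea can be illustrated with a minimal sketch. The `BitPacker` and `BitUnpacker` names below are hypothetical and are not the helpers the patch adds; the sketch only assumes that several narrow fields (booleans, small enums) are written lowest bits first into one 32-bit word and read back in the same order.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Clang's API): packs narrow fields into one
// 32-bit word, lowest bits first.
class BitPacker {
public:
  // Appends the low `Width` bits of `Value` to the word.
  void add(uint32_t Value, unsigned Width) {
    assert(Width > 0 && Used + Width <= 32 && "word is full");
    assert((Width == 32 || Value < (1u << Width)) && "value does not fit");
    Word |= Value << Used;
    Used += Width;
  }
  uint32_t word() const { return Word; }

private:
  uint32_t Word = 0;
  unsigned Used = 0;
};

// Hypothetical reader: extracts fields in the order they were packed.
class BitUnpacker {
public:
  explicit BitUnpacker(uint32_t W) : Word(W) {}

  // Returns the next `Width` bits as an unsigned value.
  uint32_t next(unsigned Width) {
    assert(Width > 0 && Used + Width <= 32 && "read past the word");
    uint32_t Mask = Width == 32 ? ~0u : (1u << Width) - 1;
    uint32_t Value = (Word >> Used) & Mask;
    Used += Width;
    return Value;
  }

private:
  uint32_t Word;
  unsigned Used = 0;
};
```

With this layout, two booleans and a 3-bit enum share a single record slot instead of taking three, which is where the size reduction would come from.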
Before the patch, building the std module
(https://libcxx.llvm.org/Modules.html) took 4.478s on my machine and
produced a BMI of 28712 bytes. After the patch, the build takes 4.374s
and the BMI is 27388 bytes.
This is intended to be an NFC patch.
This patch doesn't optimize all such cases. We can do that later once we
have consensus on this.
Deserialization of the `DIAGNOSTIC_OPTIONS` and `HEADER_SEARCH_PATHS`
records is slow and done for every transitively loaded PCM.
Deserialization of these records cannot be skipped, because the words
are VBR6-encoded and we don't store the length of the entire record. We
could either turn them into binary blobs that can be skipped during
deserialization, or skip writing them altogether. This patch takes the
latter approach, since these records are not necessary in scanning PCMs.
The scanner doesn't make any guarantees about the accuracy of
diagnostics, and we always have the same header search paths due to
strict context hashing.
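To illustrate why VBR6-encoded words cannot be skipped without decoding, here is a simplified sketch. It assumes one 6-bit chunk per byte for readability (the real bitstream packs chunks contiguously), and the function names are hypothetical. Each chunk carries 5 payload bits plus a continuation bit, so the length of a value is only known after reading all of its chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encodes Value as VBR6 chunks: the low 5 bits are payload, bit 5 means
// "another chunk follows". Chunks are emitted least significant first.
std::vector<uint8_t> encodeVBR6(uint64_t Value) {
  std::vector<uint8_t> Chunks;
  while (Value >= 32) {
    Chunks.push_back(static_cast<uint8_t>((Value & 31) | 32));
    Value >>= 5;
  }
  Chunks.push_back(static_cast<uint8_t>(Value));
  return Chunks;
}

// Decodes one value starting at Pos and advances Pos past it.
uint64_t decodeVBR6(const std::vector<uint8_t> &Chunks, size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (true) {
    assert(Pos < Chunks.size() && "truncated VBR6 value");
    uint8_t Chunk = Chunks[Pos++];
    Value |= static_cast<uint64_t>(Chunk & 31) << Shift;
    if ((Chunk & 32) == 0)
      return Value;
    Shift += 5;
  }
}

// "Skipping" a record of NumValues words still has to visit every chunk,
// because no total length is stored anywhere.
void skipRecord(const std::vector<uint8_t> &Chunks, size_t &Pos,
                size_t NumValues) {
  for (size_t I = 0; I < NumValues; ++I)
    (void)decodeVBR6(Chunks, Pos);
}
```

A blob, by contrast, stores its byte length up front, so a reader can jump over it in one step.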
The commit that makes the `DIAGNOSTIC_OPTIONS` record skippable was
originally implemented by @benlangmuir in a downstream repo.
…off support
Same as D135848. The newly added test fails with `fatal error: error in
backend: Objective-C support is unimplemented for object file format`.
When including builtin headers as part of a system module, ensure we use
relative paths to those headers. Otherwise the module will fail to compile
when specifying relative resource directories without extra search paths.
-fmodule-file=<module-name>= option
Currently, if we have multiple `-fmodule-file=<module-name>=<BMI-path>`
flags for the same `<module-name>`, we pick the BMI path from the
first flag. This is inconsistent with what users generally expect:
for example, later flags are usually expected to override earlier ones.
This patch changes the behavior to match users' expectations.
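The intended "last flag wins" behavior can be sketched as follows. The `resolveModuleFiles` helper is hypothetical and only models the lookup table, not Clang's option parsing; it assumes the flags arrive as (module name, BMI path) pairs in command-line order.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: maps each module name to the BMI path given by the
// last flag that names it.
std::map<std::string, std::string> resolveModuleFiles(
    const std::vector<std::pair<std::string, std::string>> &Flags) {
  std::map<std::string, std::string> Result;
  for (const auto &Flag : Flags) {
    assert(!Flag.first.empty() && "module name must not be empty");
    // operator[] overwrites an existing entry; insert() would instead keep
    // the first path, which is the old behavior described above.
    Result[Flag.first] = Flag.second;
  }
  return Result;
}
```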
When an include from a textual header is resolved, the textual header's
submodule is used as the requesting module. The submodule's uses are
resolved, but that doesn't work because only top level modules have
uses, and only the top level module uses are used for checking uses in
Module::directlyUses. Change ModuleMap::resolveUses to resolve the top level
module instead of the submodule.
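A simplified model of the fix is sketched below. The `Module` struct and both functions are stand-ins with hypothetical shapes, not Clang's real classes; the sketch only assumes that uses are recorded on top-level modules, so any check that starts from a submodule has to walk up to its top-level ancestor first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for a module: only top-level modules carry uses.
struct Module {
  std::string Name;
  Module *Parent = nullptr;
  std::vector<const Module *> DirectUses;
};

// Walks up the parent chain to the top-level module.
const Module *topLevel(const Module *M) {
  assert(M && "expected a module");
  while (M->Parent)
    M = M->Parent;
  return M;
}

// Checks uses on the requesting module's top-level ancestor, so a textual
// header's submodule sees the uses declared for the whole module.
bool directlyUses(const Module *Requesting, const Module *Requested) {
  const Module *RequestedTop = topLevel(Requested);
  for (const Module *Used : topLevel(Requesting)->DirectUses)
    if (Used == RequestedTop)
      return true;
  return false;
}
```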
This prevents redefinition errors due to having multiple paths for the
same module map. (rdar://24116019)
Originally implemented and tested downstream by @bcardosolopes, I just
made use of `FileEntryRef::getNameAsRequested()`.
Summary:
When a PCM file is loaded, it can go wrong in various ways. The current
diagnostic only produces the name of the malformed PCM, not why it is
malformed. Expand the diagnostic to display what went wrong!
There is only one call site for this diagnostic, and it already passes
the error message:
https://github.com/llvm/llvm-project/blob/main/clang/lib/Serialization/ASTReader.cpp#L4763-L4764
Test Plan:
The modified LIT test.
---------
Co-authored-by: Nuri Amari <nuriamari@fb.com>
…ailabl externally
A workaround for https://github.com/llvm/llvm-project/issues/60996
As the title suggests, we can avoid emitting available externally
functions that are already marked as noinline. Such functions should
contribute nothing to optimizations.
The docs update will be sent separately if this gets approved.
In `SourceManager::getFileID()`, Clang performs binary search over its
buffer of `SLocEntries`. For modules, this binary search fully
deserializes the entire `SLocEntry` block for each visited entry. For
some entries, that includes decompressing the associated buffer (e.g.
the predefines buffer, macro expansion buffers, contents of volatile
files), which shows up in profiles of the dependency scanner.
This patch moves the binary search over loaded entries into `ASTReader`,
which can perform cheaper partial deserialization during the binary
search, reducing the wall time of dependency scans by ~3%. This also
reduces the number of retired instructions by ~1.4% on regular
(implicit) modules compilation.
Note that this patch drops the optimizations based on the last lookup ID
(pruning the search space and performing linear search before resorting
to the full binary search). Instead, it reduces the search space by
asking `ASTReader::GlobalSLocOffsetMap` for the containing `ModuleFile`
and only does binary search over the entries of that single module file.
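The two-step lookup can be sketched roughly as below. This is a simplified model with ascending offsets and hypothetical names (`ModuleFileSketch`, `findEntry`); it does not mirror how `ASTReader` or `SourceManager` actually store loaded entries. The point it illustrates is that only the start offsets are consulted during the search, so no entry needs to be fully deserialized.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one module file's slice of the offset space.
struct ModuleFileSketch {
  uint32_t BaseOffset;                // first offset owned by this file
  std::vector<uint32_t> EntryOffsets; // sorted start offsets of its entries
};

struct Hit {
  size_t ModuleIndex;
  size_t EntryIndex;
};

// Modules must be sorted by BaseOffset and Offset must lie inside one of them.
Hit findEntry(const std::vector<ModuleFileSketch> &Modules, uint32_t Offset) {
  assert(!Modules.empty() && Offset >= Modules.front().BaseOffset);
  // Step 1: find the module file whose range contains Offset.
  auto ModIt = std::upper_bound(
      Modules.begin(), Modules.end(), Offset,
      [](uint32_t O, const ModuleFileSketch &M) { return O < M.BaseOffset; });
  --ModIt;
  // Step 2: binary search only that file's entry start offsets.
  const std::vector<uint32_t> &Entries = ModIt->EntryOffsets;
  assert(!Entries.empty() && Offset >= Entries.front());
  auto EntIt = std::upper_bound(Entries.begin(), Entries.end(), Offset);
  --EntIt;
  return {static_cast<size_t>(ModIt - Modules.begin()),
          static_cast<size_t>(EntIt - Entries.begin())};
}
```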
The #if now has a conditional expression, so a user can add
`-D__CLANG_REWRITTEN_SYSTEM_INCLUDES` to include the system headers
instead of using the expanded content, or
`-D__CLANG_REWRITTEN_INCLUDES` to include all headers.
Also added the filename to the comments it emits, to help identify where
included text ends, making it easier to identify and remove the content of
individual headers.
All of the _Builtin_stdarg and _Builtin_stddef submodules need to be
allowed from [no_undeclared_includes] modules. Split the builtin headers
tests out from the compiler_builtins test so that the testing modules
can be modified without affecting the many other tests that use
Inputs/System/usr/include.
Make top level modules for all the C standard library headers.
The `__stddef` implementation headers need header guards now that they're all modular. stdarg.h and stddef.h will be textual headers in the builtin modules, and so need to be repeatedly included in both the system and builtin module case. Define their header guards for consistency, but ignore them when building with modules.
`__stddef_null.h` needs to ignore its header guard when modules aren't being used to fulfill its redefinition obligation.
`__stddef_nullptr_t.h` needs to add a guard for C23 so that `_Builtin_stddef` can compile in C17 and earlier modes. `_Builtin_stddef.nullptr_t` can't require C23 because it also needs to be usable from C++.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D159064
Close https://github.com/llvm/llvm-project/issues/67893
The root cause of the crash is an oversight that we missed the point
that the same module can be imported multiple times. And we should use
`SmallSetVector` instead of `SmallVector` to filter the case.
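The difference between the two containers can be shown with a small standard-library sketch. `OrderedUniqueNames` is a hypothetical stand-in that mimics the set-vector idea (insertion order kept, repeats dropped); it is not LLVM's `SmallSetVector`.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a set-vector: keeps insertion order while
// rejecting elements that were already inserted.
class OrderedUniqueNames {
public:
  // Returns true if Name was newly inserted, false if it was a repeat.
  bool insert(const std::string &Name) {
    if (!Seen.insert(Name).second)
      return false;
    Order.push_back(Name);
    assert(Seen.size() == Order.size() && "set and vector out of sync");
    return true;
  }

  const std::vector<std::string> &items() const { return Order; }

private:
  std::unordered_set<std::string> Seen;
  std::vector<std::string> Order;
};
```

A plain vector would record a module once per import, which is the situation the crash traced back to.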
This change sets the debug compilation directory when generating debug
information for PCH object containers. This allows for overriding the
compilation directory in debug information in precompiled pcm files.
Including select builtin headers in system modules is a workaround for module cycles, primarily in Apple's Darwin module that includes all of its C standard library headers. The workaround is problematic because it doesn't include all of the builtin headers (inttypes.h is notably absent), and it also doesn't include C++ headers. The straightforward fix for this is to make top level modules for all of the C standard library headers and unwind.h in C++, clang, and the OS.
However, doing so in clang before the OS modules are ready re-introduces the module cycles. Add a -fbuiltin-headers-in-system-modules option to control if the special builtin headers belong to system modules or builtin modules. Pass the option by default for Apple.
Reviewed By: ChuanqiXu, Bigcheese, benlangmuir
Differential Revision: https://reviews.llvm.org/D159483
There is a long-standing FIXME in `HeaderSearch.cpp` to use the path separator preferred by the platform instead of forward slash. There was an attempt to fix that (1cf6c28a) which got reverted (cf385dc8). I couldn't find an explanation, but my guess is that some tests assuming forward slash started failing.
This commit fixes tests with that assumption.
This is intended to be NFC, but there are two exceptions to that:
* Some diagnostic messages might now contain backslash instead of forward slash.
* Arguments to the "-remap-file" option that use forward slash might stop kicking in. Separators between potential includer path and header name need to be replaced by backslash in that case.
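The effect on includer-relative paths can be sketched as follows. `joinIncluderAndHeader` and `PreferredSeparator` are hypothetical names, and the sketch selects the separator with a plain `_WIN32` check rather than the LLVM path utilities the real code would use.

```cpp
#include <cassert>
#include <string>

// Assumption for this sketch: backslash on Windows, forward slash elsewhere.
#ifdef _WIN32
constexpr char PreferredSeparator = '\\';
#else
constexpr char PreferredSeparator = '/';
#endif

// Joins an includer's directory and a header name with the platform's
// preferred separator, unless the directory already ends in a separator.
std::string joinIncluderAndHeader(const std::string &IncluderDir,
                                  const std::string &Header) {
  assert(!Header.empty() && "expected a header name");
  if (IncluderDir.empty())
    return Header;
  std::string Result = IncluderDir;
  if (Result.back() != '/' && Result.back() != PreferredSeparator)
    Result.push_back(PreferredSeparator);
  Result += Header;
  return Result;
}
```

On Windows this produces `dir\a.h` rather than `dir/a.h`, which is why a remapped path spelled with a forward slash might no longer match.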
C++20 modules
Previously, we banned checking input files for C++20 modules, since
we thought the BMI of a C++20 module should be a standalone artifact.
However, during recent experiments with clangd for modules, I found
it necessary to tell whether a BMI is out of date by checking the
input files, especially for language servers.
So this patch adds a header search option,
ForceCheckCXX20ModulesInputFiles, to allow tools (concretely, clangd)
to check the input files of a BMI.
stdarg.h and stddef.h have to be textual headers in their upcoming modules to support their `__needs_xxx` macros. That means that they won't get precompiled into their modules' pcm, and instead their declarations will go into every other pcm that uses them. For now that's ok since the type merger can handle the declarations in these headers, but it's suboptimal at best. Make separate headers for all of the pieces so that they can be properly modularized.
Reviewed By: aaron.ballman, ChuanqiXu
Differential Revision: https://reviews.llvm.org/D158709
This reverts commit b6ba804f7775f89f230ee1e62526a2f8225c7966, effectively relanding commit 7d1565727dad3acb54fe76a908630843835d7bc8.
The original commit incorrectly called `ASTWriter::writeUnhashedControlBlock()` before `ASTWriter::collectNonAffectingInputFiles()`, causing SourceLocations/FileIDs in the pragma diagnostic mappings block to be invalid. This is now tested by `clang/test/Modules/diag-mappings-affecting.c`.
We have a new policy in place making links to private resources
something we try to avoid in source and test files. Normally, we'd
organically switch to the new policy rather than make a sweeping change
across a project. However, Clang is in a somewhat special circumstance
currently: recently, I've had several new contributors run into rdar
links around test code which their patch was changing the behavior of.
This turns out to be a surprisingly bad experience, especially for
newer folks, for a handful of reasons: not understanding what the link
is and feeling intimidated by it, wondering whether their changes are
actually breaking something important to a downstream in some way,
having to hunt down strangers not involved with the patch to impose on
them for help, accidental pressure from asking for potentially private
IP to be made public, etc. Because folks run into these links entirely
by chance (through fixing bugs or working on new features), there's not
really a set of problematic links to focus on -- all of the links have
basically the same potential for causing these problems. As a result,
this is an omnibus patch to remove all such links.
This was not a mechanical change; it was done by manually searching for
rdar, radar, radr, and other variants to find all the various
problematic links. From there, I tried to retain or reword the
surrounding comments so that we would lose as little context as
possible. However, because most links were just a plain link with no
supporting context, the majority of the changes are simple removals.
Differential Review: https://reviews.llvm.org/D158071
When loading a (transitively) imported AST file, `ModuleManager::addModule()` first checks that it has the expected signature via `readASTFileSignature()`. The signature is part of `UNHASHED_CONTROL_BLOCK`, which is placed at the end of the AST file. This means that just to verify the signature of an AST file, we need to skip over all top-level blocks, paging in the whole AST file from disk. This is pretty slow.
This patch moves `UNHASHED_CONTROL_BLOCK` to the start of the AST file, so that it can be read more efficiently. To achieve this, we use a dummy signature when first emitting the unhashed control block, and then backpatch the real signature at the end of the serialization process.
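The dummy-then-backpatch scheme can be sketched in a few lines. Everything here is an illustrative assumption: the 4-byte signature size, the FNV-1a hash standing in for the real signature computation, and the `writeFile`/`readSignature` names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr size_t SignatureSize = 4; // assumed size for this sketch

// Stand-in for the real signature: FNV-1a over the bytes from `From` onward.
uint32_t toySignature(const std::vector<uint8_t> &Bytes, size_t From) {
  uint32_t H = 2166136261u;
  for (size_t I = From; I < Bytes.size(); ++I) {
    H ^= Bytes[I];
    H *= 16777619u;
  }
  return H;
}

std::vector<uint8_t> writeFile(const std::vector<uint8_t> &Payload) {
  // 1. Reserve the signature slot at the start with dummy zero bytes.
  std::vector<uint8_t> Out(SignatureSize, 0);
  // 2. Emit the rest of the file.
  Out.insert(Out.end(), Payload.begin(), Payload.end());
  // 3. Compute the signature over everything after the slot.
  uint32_t Sig = toySignature(Out, SignatureSize);
  // 4. Backpatch the slot (little-endian).
  for (size_t I = 0; I < SignatureSize; ++I)
    Out[I] = static_cast<uint8_t>(Sig >> (8 * I));
  return Out;
}

// Reads the signature from the first bytes without touching the payload.
uint32_t readSignature(const std::vector<uint8_t> &File) {
  assert(File.size() >= SignatureSize && "file too short");
  uint32_t Sig = 0;
  for (size_t I = 0; I < SignatureSize; ++I)
    Sig |= static_cast<uint32_t>(File[I]) << (8 * I);
  return Sig;
}
```

Because the slot sits at a fixed position at the start, a reader can verify the signature after paging in only the first few bytes.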
This speeds up dependency scanning by over 9% and significantly reduces run-to-run variability of my benchmarks.
Depends on D158572.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D158573
With implicit modules, it's impossible to load a PCM file that was built using different command-line macro definitions. This is guaranteed by the fact that they contribute to the context hash. This means that we don't need to store those macros into PCM files for validation purposes. This patch avoids serializing them in those circumstances, since there's no other use for command-line macro definitions (besides "-module-file-info").
For a typical Apple project, this speeds up the dependency scan by 5.6% and shrinks the cache with scanning PCMs by 26%.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D158136
Close https://github.com/llvm/llvm-project/issues/64755
This doesn't affect the @import form, as the test shows. The two
affected test cases, `diag-flags.cpp` and `diag-pragma.cpp`, are old tests
from 2017 and 2018, when we were not so clear about the direction of
modules. What these two tests cover can be covered by
clang modules naturally, so I changed them to use clang modules so that
they do not block this patch.
Clang writes the set of textually included files into AST files, so that importers know to avoid including those files again and instead deserialize their contents from the AST on-demand.
Logic for determining the set of included files only considers headers that are either non-modular or that are modular but with `HeaderFileInfo::isCompilingModuleHeader` set. Logic for computing that bit is different than the one that determines whether to include a header textually with the "-fmodule-name=Mod" option. That can lead to a header from module "Mod" being included textually in a PCH, but being omitted from the serialized set of included files. This can then allow such a header to be textually included from an importer of the PCH, wreaking havoc.
This patch fixes that by aligning the logic for computing `HeaderFileInfo::isCompilingModuleHeader` with the logic for deciding whether to include modular header textually.
As far as I can tell, this bug has been in Clang for forever. It got accidentally "fixed" by D114095 (that changed the logic for determining the set of included files) and got broken again in D155131 (which is essentially a revert of the former).
rdar://113520515
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D157559
Add a way to enable -Wsystem-headers only for a specific module. This is
useful for validating a module that would otherwise not see system
header diagnostics without being flooded by diagnostics for unrelated
headers/modules. It's relatively common for a module to be marked
[system] but still wish to validate itself explicitly.
rdar://113401565
Differential Revision: https://reviews.llvm.org/D156948
This patch adds all the language-level function keywords defined in:
https://github.com/ARM-software/acle/pull/188 (merged)
https://github.com/ARM-software/acle/pull/261 (update after D148700 landed)
The keywords are used to control PSTATE.ZA and PSTATE.SM, which are
respectively used for enabling the use of the ZA matrix array and Streaming
mode. This information needs to be available on call sites, since the use
of ZA or streaming mode may have to be enabled or disabled around the
call-site (depending on the IR attributes set on the caller and the
callee). For calls to functions from a function pointer, there is no IR
declaration available, so the IR attributes must be added explicitly to the
call-site.
With the exception of '__arm_locally_streaming' and '__arm_new_za' the
information is part of the function's interface, not just the function
definition, and thus needs to be propagated through the
FunctionProtoType::ExtProtoInfo.
This patch adds the definitions of these keywords, as well as codegen and
semantic analysis to ensure conversions between function pointers are valid
and that no conflicting keywords are set. For example, '__arm_streaming'
and '__arm_streaming_compatible' are mutually exclusive.
Differential Revision: https://reviews.llvm.org/D127762
This is a reduced test case originally meant to be addressed by
https://reviews.llvm.org/D137787. It was recently fixed by commit
61c7a9140b ("Commit to a primary definition for a class when we load
its first member."), which noted the difficulty of coming up with a reduced
test case. This setup with four modules seems to fail consistently
before the fix mentioned above, with an assertion in CGExprCXX.cpp,
CodeGenFunction::EmitCXXDestructorCall():
Assertion `ThisTy->getAsCXXRecordDecl() == DtorDecl->getParent() &&
"Pointer/Object mixup"' failed.
Differential Revision: https://reviews.llvm.org/D156806