This reverts commit 407a2f23 which stopped propagating the callback to module compiles, effectively disabling dependency directive scanning for all modular dependencies. Also added a regression test.
Once a file has been `#import`'ed, it gets stamped as if it was `#pragma
once` and will not be re-entered, even on #include. This means that any
errant #import of a file designed to be included multiple times, such as
<assert.h>, will incorrectly mark it as include-once and break the
multiple include functionality. Normally this isn't a big problem, e.g.
<assert.h> can't have its NDEBUG mode changed after the first #import,
but it is still mostly functional. However, when clang modules are
involved, this can cause the header to be hidden entirely.
Objective-C code most often uses #import for everything, because it's
required for most Objective-C headers to prevent double inclusion and
redeclaration errors. (It's rare for Objective-C headers to use macro
guards or `#pragma once`.) The problem arises when a submodule includes
a multiple-include header. The "already included" state is global across
all modules (which is necessary so that non-modular headers don't get
compiled into multiple translation units and cause redeclaration
errors). If another module or the main file #import's the same header,
it becomes invisible from then on. If the original submodule is not
imported, the include of the header will effectively do nothing and the
header will be invisible. The only way to actually get the header's
declarations is to somehow figure out which submodule consumed the
header, and import that instead. That's basically impossible since it
depends on exactly which modules were built in which order.
#import is a poor indicator of whether a header is actually
include-once, as the #import is external to the header it applies to,
and requires that all inclusions correctly and consistently use #import
vs #include. When modules are enabled, consider a header marked
`textual` in its module as a stronger indicator of multiple-include than
#import's indication of include-once. This will allow headers like
<assert.h> to always be included when modules are enabled, even if
#import is erroneously used somewhere.
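As a rough illustration of what "multiple-include" means for a header
like <assert.h>, the sketch below includes it three times with a
different NDEBUG setting each time. The function names are hypothetical
and exist only for this example; it uses #include throughout, which is
the behaviour an errant #import would break.

```cpp
// Sketch: <assert.h> has no include guard on purpose. Every inclusion
// redefines assert() according to the current state of NDEBUG.
#undef NDEBUG
#include <assert.h>

// assert() is active here, so its argument is evaluated.
inline bool assertEvaluatesWhenEnabled() {
  bool Evaluated = false;
  assert((Evaluated = true));
  return Evaluated;
}

#define NDEBUG
#include <assert.h> // Re-inclusion: assert() now expands to a no-op.

// assert() is disabled here, so its argument is never evaluated.
inline bool assertEvaluatesWhenDisabled() {
  bool Evaluated = false;
  assert((Evaluated = true));
  return Evaluated;
}

#undef NDEBUG
#include <assert.h> // Third inclusion restores the checking assert().
```

If the header were treated as include-once after the first inclusion,
the second and third inclusions would do nothing and both functions
would behave the same way.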
An instance of `PreprocessorOptions` is part of `CompilerInvocation`
which is supposed to be a value type. The `DependencyDirectivesForFile`
member is problematic, since it holds an owning reference of the
scanning VFS. This makes it not a true value type, and it can keep
a potentially large chunk of memory (the local cache in the scanning VFS)
alive for longer than clients might expect. Let's move it into the
`Preprocessor` instead.
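The ownership problem can be sketched with a simplified model. All
types below (`ScanningCacheModel`, `OptionsWithCallback`,
`PlainOptions`, `PreprocessorModel`) are hypothetical stand-ins and not
the real Clang classes; the callback signature is an assumption made
for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the scanning VFS and its local cache.
struct ScanningCacheModel {
  std::vector<std::string> CachedFiles;
};

using DirectivesCallback = std::function<std::size_t(const std::string &)>;

// "Before": the options carry the callback, so every copy of the options
// also co-owns whatever the callback captured.
struct OptionsWithCallback {
  bool DetailedRecord = false;
  DirectivesCallback DirectivesForFile;
};

// "After": the options are plain data; the callback lives on the
// preprocessor-like object that actually uses it.
struct PlainOptions {
  bool DetailedRecord = false;
};
struct PreprocessorModel {
  PlainOptions Opts;
  DirectivesCallback DirectivesForFile;
};

// Returns true if the cache outlives the scope that set up the callback.
inline bool cacheLeaksThroughOptionsCopy() {
  auto Cache = std::make_shared<ScanningCacheModel>();
  std::weak_ptr<ScanningCacheModel> Watch = Cache;
  OptionsWithCallback Saved;
  {
    OptionsWithCallback Opts;
    Opts.DirectivesForFile = [Cache](const std::string &) {
      return Cache->CachedFiles.size();
    };
    Saved = Opts; // A client keeps a copy of the "value type".
  }
  Cache.reset();
  return !Watch.expired(); // Saved still co-owns the cache.
}

inline bool cacheLeaksThroughPreprocessor() {
  auto Cache = std::make_shared<ScanningCacheModel>();
  std::weak_ptr<ScanningCacheModel> Watch = Cache;
  PlainOptions Saved;
  {
    PreprocessorModel PP;
    PP.DirectivesForFile = [Cache](const std::string &) {
      return Cache->CachedFiles.size();
    };
    Saved = PP.Opts; // Copying the options copies no ownership.
  }
  Cache.reset();
  assert(!Saved.DetailedRecord); // The copy is still a usable plain value.
  return !Watch.expired();
}
```

In this model only the first variant keeps the cache alive through a
saved copy of the options.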
The `ASTWriter` algorithm for computing affecting module maps uses
`SourceManager::translateFile()` to get a `FileID` from a `FileEntry`.
This is slow (O(n)) since the function performs a linear walk over
`SLocEntries` until it finds one with a matching `FileEntry`.
This patch removes this use of `SourceManager::translateFile()` by
tracking `FileID` instead of `FileEntry` in a couple of places in
`ModuleMap`, giving `ASTWriter` the desired `FileID` directly. There are
no changes required for clients that still want a `FileEntry` from
`ModuleMap`: the existing APIs internally use `SourceManager` to perform
the reverse `FileID` to `FileEntry` conversion in O(1).
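The difference between the two lookup directions can be sketched with a
toy model. The `*Model` types and `InvalidFileID` below are hypothetical
and far simpler than the real `SourceManager`; they only show why
remembering the ID avoids the linear walk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct FileEntryModel {
  int UID;
};
using FileIDModel = std::size_t;
constexpr FileIDModel InvalidFileID = static_cast<FileIDModel>(-1);

class SourceManagerModel {
  // The index into this table plays the role of the FileID.
  std::vector<const FileEntryModel *> SLocEntries;

public:
  FileIDModel createFileID(const FileEntryModel *FE) {
    SLocEntries.push_back(FE);
    return SLocEntries.size() - 1;
  }

  // FileEntry -> FileID: linear walk over all entries, O(n).
  FileIDModel translateFile(const FileEntryModel *FE) const {
    for (FileIDModel ID = 0; ID < SLocEntries.size(); ++ID)
      if (SLocEntries[ID] == FE)
        return ID;
    return InvalidFileID;
  }

  // FileID -> FileEntry: direct index, O(1).
  const FileEntryModel *getFileEntryForID(FileIDModel ID) const {
    assert(ID < SLocEntries.size() && "unknown FileID");
    return SLocEntries[ID];
  }
};

// Hypothetical module map record that remembers the FileID it came from,
// so neither direction needs the linear walk.
struct ModuleMapRecordModel {
  FileIDModel DefinedIn;
  const FileEntryModel *getFile(const SourceManagerModel &SM) const {
    return SM.getFileEntryForID(DefinedIn);
  }
};
```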
This updates a few warnings that were diagnosing no arguments for a
`...` variadic macro parameter as a GNU extension when it actually is a
C++20/C23 extension now.
This fixes #84495.
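For reference, a minimal case of the construct being diagnosed might
look like the following. `FIRST_OF` and the function names are
hypothetical; the first call passes nothing for the `...` parameter,
which is what C++20 and C23 now permit.

```cpp
#include <cassert>

// Hypothetical macro: yields its first argument and ignores the rest.
#define FIRST_OF(First, ...) First

// No argument at all is supplied for `...` in this invocation.
inline int firstWithoutVariadicArgs() { return FIRST_OF(7); }

// The usual form, with arguments supplied for `...`.
inline int firstWithVariadicArgs() { return FIRST_OF(7, 8, 9); }

inline void checkFirstOf() {
  assert(firstWithoutVariadicArgs() == firstWithVariadicArgs());
}
```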
On Apple platforms, some of the stddef.h types are also declared in
system headers. In particular NULL has a conflicting declaration in
<sys/_types/_null.h>. When that's in a different module from
<__stddef_null.h>, redeclaration errors can occur.
Make the __stddef_ headers non-modular in
-fbuiltin-headers-in-system-modules and restore them to not
respecting their header guards. Still define the header guards though.
__stddef_max_align_t.h was in _Builtin_stddef_max_align_t prior to the
addition of _Builtin_stddef, and it needs to stay in a module because
structs can't be type merged. __stddef_wint_t.h didn't use to have a
module, but leave it in its current module since it doesn't really belong
to stddef.h.
Clang was incorrectly finding the start of the exponent in a fixed point
hex literal. It would unconditionally find the first `e/E/p/P` in a
constant, regardless of whether it was hex, and parse the remaining
digits as an APInt. In a debug build, this would be caught by an
assertion, but in a release build, the assertion is removed and we'd end
up in an infinite loop.
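The core of the problem can be sketched as follows, assuming the digits
are handed over without the `0x` prefix. Both function names are
hypothetical and this is not the actual parser code; it only shows why
the search has to depend on the radix.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Radix-unaware search: in a hex literal, 'e' and 'E' are ordinary
// digits, so this can stop at a digit instead of the exponent marker.
inline std::size_t findExponentAnyRadix(const std::string &Digits) {
  return Digits.find_first_of("eEpP");
}

// Radix-aware search: only 'p'/'P' introduce an exponent in hex
// literals, and only 'e'/'E' do so in decimal literals.
inline std::size_t findExponentStart(const std::string &Digits, bool IsHex) {
  return Digits.find_first_of(IsHex ? "pP" : "eE");
}

inline void checkExponentSearch() {
  // "e.8p1" models the digits of a hex literal after the 0x prefix.
  assert(findExponentAnyRadix("e.8p1") == 0);
  assert(findExponentStart("e.8p1", /*IsHex=*/true) == 3);
}
```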
Fixes #83050
(bad error message on incorrect string literal)
Fixed the error message for incorrect string literal
before:
```
test.cpp:1:19: error: invalid character '
' character in raw string delimiter; use PREFIX( )PREFIX to delimit raw string
char const* a = R"
^
```
now:
```
test.cpp:1:19: error: invalid newline character in raw string delimiter; use PREFIX( )PREFIX to delimit raw string
1 | char const* a = R"
| ^
```
---------
Co-authored-by: Jon Roelofs <jroelofs@gmail.com>
We previously would diagnose them as a GNU extension in C mode, but they
are now a feature of C23. The -Wgnu-binary-literal warning group no
longer controls any diagnostics as this is no longer a GNU extension.
The warning group is retained as a noop to help avoid "unknown warning"
diagnostics.
This also adds the companion compatibility warning which existed for C++
but not for C.
Fixes https://github.com/llvm/llvm-project/issues/72017
This patch provides more information to the
`PPCallbacks::InclusionDirective()` hook. We now always pass the
suggested module, regardless of whether it was actually imported or not.
The extra `bool ModuleImported` parameter then denotes whether the
header `#include` will be automatically translated into an import of the
module.
The main change is in `clang/lib/Lex/PPDirectives.cpp`, where we take
care to not modify `SuggestedModule` after it's been populated by
`LookupHeaderIncludeOrImport()`. We now exclusively use the `SM`
(`ModuleToImport`) variable instead, which has been equivalent to
`SuggestedModule` until now. This allows us to use the original
non-modified `SuggestedModule` for the callback itself.
(This patch turns out to be necessary for
https://github.com/apple/llvm-project/pull/8011).
In https://github.com/llvm/llvm-project/pull/76873 a warning was added
when the macros INFINITY and NAN are used in binary expressions when
-menable-no-nans or -menable-no-infs are used. If the user uses an
option that nullifies these two options, the warning will still be
generated. This patch adds additional information to the warning
message to let the user know about this. It also suppresses the warning
when #ifdef INFINITY, #ifdef NAN or #ifndef NAN are used in
the code.
My recent commit (67c1c1d) made the CPU ID builtins target-independent
so they can be used on PPC as well. However, that had the unintended
consequence of changing the behaviour of __has_builtin in that it
reports these as supported at the pre-processor level. This makes it
impossible to guard the use of these with this feature test macro which
is clearly not ideal.
This patch restores the behaviour of __has_builtin for __builtin_cpu_is,
__builtin_cpu_init, and __builtin_cpu_supports. Now the preprocessor
queries the target to determine whether the target supports the builtin.
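The kind of guard this is meant to keep working might look like the
sketch below. `HAVE_CPU_SUPPORTS_BUILTIN` and `cpuSupportsSSE2` are
hypothetical names, and restricting the builtin path to x86 is an
assumption made here so that the "sse2" feature string is valid wherever
the builtin is actually called.

```cpp
#include <cassert>

// Fallback for compilers that do not provide __has_builtin at all.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Only take the builtin path when the compiler reports the builtin as
// supported. The x86 check is an assumption of this sketch, see above.
#if (defined(__x86_64__) || defined(__i386__)) &&                            \
    __has_builtin(__builtin_cpu_supports)
#define HAVE_CPU_SUPPORTS_BUILTIN 1
#else
#define HAVE_CPU_SUPPORTS_BUILTIN 0
#endif

inline bool cpuSupportsSSE2() {
#if HAVE_CPU_SUPPORTS_BUILTIN
  return __builtin_cpu_supports("sse2") != 0;
#else
  return false; // Builtin not available: report "unsupported".
#endif
}

inline void checkGuard() {
  // Without the builtin, the fallback path must report false.
  assert(HAVE_CPU_SUPPORTS_BUILTIN || !cpuSupportsSSE2());
}
```

If __has_builtin reported the builtin as available on a target that
cannot lower it, a guard like this would select the wrong branch.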
`-ivfsoverlay` files are unused when building most modules. Enable
removing them by:
* Adding a way to visit the filesystem tree with extensible RTTI to
access each `RedirectingFileSystem`.
* Adding tracking to `RedirectingFileSystem` to record when it
actually redirects a file access.
* Storing this information in each PCM.
Usage tracking is only enabled when iterating over the source manager
and affecting modulemaps. Here each path is stated to cause an access.
During scanning these stats all hit the cache.
Add some primitive syntax highlighting to our code snippet output.
This adds "checkpoints" to the Preprocessor, which we can use to start lexing from. When printing a code snippet, we lex from the nearest checkpoint and highlight the tokens based on their token type.
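One way to picture the checkpoint lookup is a binary search over sorted
buffer offsets. This is only a sketch: it assumes checkpoints are stored
as offsets, that "nearest" means the closest checkpoint at or before the
snippet, and that offset 0 is used when there is none. The function name
is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the largest checkpoint offset that is <= Target, or 0 (the
// start of the buffer) if no checkpoint precedes Target.
inline std::size_t
nearestCheckpointBefore(const std::vector<std::size_t> &Checkpoints,
                        std::size_t Target) {
  assert(std::is_sorted(Checkpoints.begin(), Checkpoints.end()));
  // First checkpoint strictly greater than Target.
  auto It = std::upper_bound(Checkpoints.begin(), Checkpoints.end(), Target);
  if (It == Checkpoints.begin())
    return 0;
  return *(It - 1);
}
```

Lexing for highlighting would then start at the returned offset rather
than at the beginning of the file.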
Close https://github.com/llvm/llvm-project/issues/73023
The direct issue of https://github.com/llvm/llvm-project/issues/73023 is
that we entered a header which is marked as `#pragma once`, since the
compiler thinks it is OK to do so if there is a controlling macro.
This doesn't make sense. I feel like it should be sufficient to skip the
header after we have seen the `#pragma once`.
From the context, it looks like the workaround is primarily for
Objective-C, so we might need reviewers familiar with Objective-C.
This removes a long standing piece of technical debt. Most other
platforms have moved all their header search path logic to the driver,
but Darwin still had some logic for setting framework search paths
present in the frontend. This patch moves that logic to the driver
alongside existing logic that already handles part of these search
paths.
This is intended to be a pure refactor without any functional change
visible to users, since the search paths before and after should be the
same, and in the same order. The change in the tests is necessary
because we would previously add the DriverKit framework search path in
the frontend regardless of whether we actually need to, which we now
handle correctly because the driver checks for ld64-605.1+.
Fixes #75638
The existing code incorrectly assumes that `Path` can be empty. It
can't: it always contains at least `<` or `"`. On Unix, this patch fixes
an incorrect diagnostic that suggested `"Userss/blah"` instead of
`"/Users/blah"`. In assert builds, this would outright crash.
This patch also fixes a bug on Windows that would prevent the diagnostic
being triggered due to separator mismatch.
rdar://91172342
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This patch deprecates `module.map` in favor of `module.modulemap`, which
has been the preferred form since 2014. The eventual goal is to remove
support for `module.map` to reduce the number of stats Clang needs to do
while searching for module map files.
This patch touches a lot of files, but the majority of them are just
renaming tests or references to the file in comments or documentation.
The relevant files are:
* lib/Lex/HeaderSearch.cpp
* include/clang/Basic/DiagnosticGroups.td
* include/clang/Basic/DiagnosticLexKinds.td
This patch renames {starts,ends}with to {starts,ends}_with for
consistency with std::{string,string_view}::{starts,ends}_with in
C++20. Since there are only a handful of occurrences, this patch
skips the deprecation phase and simply renames them.
This code was added 17 years ago but never enabled or tested. GCC warns
that -I- is deprecated for them, and Clang gives an error when passed
-I-, so we may as well remove this code rather than hook it up to the
driver and maintain it.
HLSL supports vector swizzles on scalars by implicitly converting the
scalar to a single-element vector. This syntax is a convenient way to
initialize vectors by filling them with a scalar value.
There are two parts of this change. The first part in the Lexer splits
numeric constant tokens when a `.x` or `.r` suffix is encountered. This
splitting is a bit hacky but allows the numeric constant to be parsed
separately from the vector element expression. There is an ambiguity
here with the `r` suffix used by fixed point types, however fixed point
types aren't supported in HLSL so this should not cause any exposable
problems (a separate issue has been filed to track validating language
options for HLSL: #67689).
The second part of this change is in Sema::LookupMemberExpr. For HLSL,
if the base type is a scalar, we implicitly cast the scalar to a
one-element vector then call back to perform the vector lookup.
Fixes #56658 and #67511
Struct Module::Header is not a POD type. As such, qsort() and
llvm::array_pod_sort() must not be used to sort it. This became an issue
with the new implementation of qsort() in glibc 2.39 that is not
guaranteed to be a stable sort, causing Headers to be re-ordered and
corrupted.
Replace the usage of llvm::array_pod_sort() with std::stable_sort() in
order to fix this issue. The signature of compareModuleHeaders() has to
be modified.
Fixes #73145.
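The shape of the change can be sketched with a simplified header
record. `HeaderModel`, `compareHeaders` and `sortHeaders` are
hypothetical, and sorting by name is an assumption; the point is that
the comparator becomes a strict-weak-ordering predicate on references
instead of a three-way comparison on pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Not a POD: it owns a std::string, so byte-wise swapping by qsort()
// is not a valid way to move it around.
struct HeaderModel {
  std::string NameAsWritten;
  int InsertionOrder;
};

// std::stable_sort wants bool(const T &, const T &), not the
// int(const T *, const T *) shape used by qsort-style comparators.
inline bool compareHeaders(const HeaderModel &A, const HeaderModel &B) {
  return A.NameAsWritten < B.NameAsWritten;
}

inline void sortHeaders(std::vector<HeaderModel> &Headers) {
  // Elements that compare equal keep their original relative order.
  std::stable_sort(Headers.begin(), Headers.end(), compareHeaders);
  assert(std::is_sorted(Headers.begin(), Headers.end(), compareHeaders));
}
```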
Close https://github.com/llvm/llvm-project/issues/71347
Previously I misread the concept of module purview. I thought that if a
declaration is attached to an unnamed module, it can't be part of the
module purview. But after the issue report, I recognized that module purview is
more of a concept about locations instead of semantics.
Concretely, the things in the language linkage after module declarations
can be exported.
This patch refactors `Module::isModulePurview()` and introduces some
possible code cleanups.
After #70144 Clang started resolving module maps even for
`__has_include()` expressions. This had the unintended consequence of
emitting diagnostics around header misuse. These don't make sense if you
actually don't bring contents of the header into the importer, so should
be skipped for `__has_include()`. This patch moves emission of these
diagnostics out of `Preprocessor::LookupFile()` up into
`Preprocessor::LookupHeaderIncludeOrImport()`.
The previous representation used an enumeration combined with a switch
to dispatch to the appropriate lexer.
Use a function pointer so that the dispatching is just an indirect call,
which is actually better because lexing is a costly task compared to a
function call.
This also makes the code slightly cleaner. Speedups on the compile time
tracker are consistent and range from -0.05% to -0.20% for NewPM-O0-g,
see
https://llvm-compile-time-tracker.com/compare.php?from=f9906508bc4f05d3950e2219b4c56f6c078a61ef&to=608c85ec1283638db949d73e062bcc3355001ce4&stat=instructions:u
Considering just the preprocessing task, preprocessing the sqlite
amalgamation takes -0.6% instructions (according to valgrind
--tool=callgrind)
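The dispatch scheme can be sketched roughly as below. The class and
member names are hypothetical and the "lexers" just return a label; the
sketch only shows the current lexer being selected by storing a function
pointer rather than by switching on an enumerator at every call.

```cpp
#include <cassert>
#include <string>

class PreprocessorModel {
public:
  using LexCallback = std::string (*)(PreprocessorModel &);

  // One static entry point per kind of lexer.
  static std::string lexFromSource(PreprocessorModel &) { return "source"; }
  static std::string lexFromTokenStream(PreprocessorModel &) {
    return "token-stream";
  }

  // Changing the active lexer is a pointer assignment.
  void enterSourceFile() { CurLexCallback = &lexFromSource; }
  void enterTokenStream() { CurLexCallback = &lexFromTokenStream; }

  // Dispatch is a single indirect call, with no switch.
  std::string lex() {
    assert(CurLexCallback && "no active lexer");
    return CurLexCallback(*this);
  }

private:
  LexCallback CurLexCallback = &lexFromSource;
};
```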
---------
Co-authored-by: serge-sans-paille <sguelton@mozilla.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
`ModuleDeclState` is incorrectly changed to `NamedModuleImplementation`
for `struct module {}; void foo(module a);`. This is mostly benign but
leads to a spurious warning after #69555.
A real world example is:
```
// pybind11.h
class module_ { ... };
using module = module_;
// tensorflow
void DefineMetricsModule(pybind11::module main_module);
// `module main_module);` incorrectly changes `ModuleDeclState` to `NamedModuleImplementation`
#include <algorithm> // spurious warning
```
Instead of passing the Size by reference, assuming it is initialized,
return it alongside the expected char result as a POD.
This makes the interface less error prone: previous interface expected
the Size reference to be initialized, and it was often forgotten,
leading to uninitialized variable usage. This patch fixes the issue.
This also generates faster code, as the returned POD (a char and an
unsigned) fits in 64 bits. The speedup according to the compile time
tracker reaches -0.7%, with a good number of -0.4%. Details are available on
https://llvm-compile-time-tracker.com/compare.php?from=3fe63f81fcb999681daa11b2890c82fda3aaeef5&to=fc76a9202f737472ecad4d6e0b0bf87a013866f3&stat=instructions:u
And icing on the cake, on my setup it also shaves 2kB out of
libclang-cpp :-)
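A minimal sketch of the new interface shape, under the assumption that
the only multi-byte case is a backslash-newline line continuation. The
function body is illustrative and not the real lexer logic; the point is
that the size is always produced by the callee.

```cpp
#include <cassert>

// The character and the number of bytes consumed travel together.
struct SizedChar {
  char Char;
  unsigned Size;
};

// Old, error-prone shape for comparison (declaration only, as a comment):
//   char getCharAndSize(const char *Ptr, unsigned &Size);
// where the caller had to remember to initialize Size first.

// New shape: nothing for the caller to initialize.
inline SizedChar getCharAndSize(const char *Ptr) {
  assert(Ptr && "null buffer");
  unsigned Size = 0;
  // Skip any backslash-newline pairs before the real character.
  while (Ptr[Size] == '\\' && Ptr[Size + 1] == '\n')
    Size += 2;
  return {Ptr[Size], Size + 1};
}
```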
This is a recommit of d8f5a18b6e587aeaa8b99707e87b652f49b160cd for
When including builtin headers as part of a system module, ensure we use
relative paths to those headers. Otherwise the module will fail to compile
when specifying relative resource directories without extra search paths.
Previously, Clang wouldn't try to resolve the module for the header
being checked via `__has_include`. This meant that such a header was
considered textual (a.k.a. part of the module Clang is currently
compiling).
rdar://116926060
When an include from a textual header is resolved, the textual header's
submodule is used as the requesting module. The submodule's uses are
resolved, but that doesn't work because only top level modules have
uses, and only the top level module uses are used for checking uses in
Module::directlyUses. Fix ModuleMap::resolveUses to resolve the top level
module instead of the submodule.
This prevents redefinition errors due to having multiple paths for the
same module map. (rdar://24116019)
Originally implemented and tested downstream by @bcardosolopes, I just
made use of `FileEntryRef::getNameAsRequested()`.