Static analysis flagged the unconditional access of getExternalSource().
We don't initialize ExternalSource during construction but via
setExternalSource(). If this is not set it will violate the invariant
covered by the assert.
This adds a new diagnostic group, -Wc++-keyword, which is off by default
and grouped under -Wc++-compat. The diagnostic catches use of C++
keywords in C code.
This change additionally fixes an issue with -Wreserved-identifier not
diagnosing use of reserved identifiers in function parameter lists in a
function declaration which is not a definition.
Fixes https://github.com/llvm/llvm-project/issues/21898
This PR reland https://github.com/llvm/llvm-project/pull/135808, fixed
some missed changes in LLDB.
I found this issue when I working on
https://github.com/llvm/llvm-project/pull/107168.
Currently we have many similiar data structures like:
- std::pair<IdentifierInfo *, SourceLocation>.
- Element type of ModuleIdPath.
- IdentifierLocPair.
- IdentifierLoc.
This PR unify these data structures to IdentifierLoc, moved
IdentifierLoc definition to SourceLocation.h, and deleted other similer
data structures.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
Reverts llvm/llvm-project#135808
Example from the LLDB macOS CI:
https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/24084/execution/node/54/log/?consoleFull
```
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/source/Plugins/ExpressionParser/Clang/ClangModulesDeclVendor.cpp:360:49: error: no viable conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'clang::ModuleIdPath' (aka 'ArrayRef<IdentifierLoc>')
clang::Module *top_level_module = DoGetModule(clang_path.front(), false);
^~~~~~~~~~~~~~~~~~
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:41:40: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'const llvm::ArrayRef<clang::IdentifierLoc> &' for 1st argument
class LLVM_GSL_POINTER [[nodiscard]] ArrayRef {
^
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:41:40: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'llvm::ArrayRef<clang::IdentifierLoc> &&' for 1st argument
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:70:18: note: candidate constructor not viable: no known conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'std::nullopt_t' for 1st argument
/*implicit*/ ArrayRef(std::nullopt_t) {}
```
I found this issue when I working on
https://github.com/llvm/llvm-project/pull/107168.
Currently we have many similiar data structures like:
- `std::pair<IdentifierInfo *, SourceLocation>`.
- Element type of `ModuleIdPath`.
- `IdentifierLocPair`.
- `IdentifierLoc`.
This PR unify these data structures to `IdentifierLoc`, moved
`IdentifierLoc` definition to SourceLocation.h, and deleted other
similer data structures.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
When we mark a module visible, we normally mark all of its non-explicit
submodules and other exports as visible. However, when we first enter a
submodule we should not make them visible to the submodule itself until
they are actually imported. Marking exports visible before import would
cause bizarre behaviour with local submodule visibility, because it
happened before we discovered the submodule's transitive imports and
could fail to make them visible in the parent module depending on
whether the submodules involved were explicitly defined (module X) or
implicitly defined from an umbrella (module *).
rdar://136524433
Consider the following:
```
template<typename T>
struct A { };
template<typename T>
int A<T>::B::* f(); // error: no member named 'B' in 'A<T>'
```
Although this is clearly valid, clang rejects it because the
_nested-name-specifier_ `A<T>::` is parsed as-if it was declarative,
meaning, we parse it as-if it was the _nested-name-specifier_ in a
redeclaration/specialization. However, we don't (and can't) know whether
the _nested-name-specifier_ is declarative until we see the '`*`' token,
but at that point we have already complained that `A` has no member
named `B`! This patch addresses this bug by adding support for _fully_
unannotated _and_ unbounded tentative parsing, which allows for us to
parse past tokens without having to cache them until we reach a point
where we can guarantee to be past the construct we are disambiguating.
I don't know where the approach taken here is ideal -- alternatives are
welcome. However, the performance impact (as measured by
llvm-compile-time-tracker (https://llvm-compile-time-tracker.com/?config=Overview&stat=instructions%3Au&remote=sdkrystian)
is quite minimal (0.09%, which I plan to further improve).
This PR implement [P3034R1 Module Declarations Shouldn’t be
Macros](https://wg21.link/P3034R1), and refactor the convoluted state
machines in module name lexical analysis.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
The commit adds serialization and de-serialization implementations for
the stored regions. Basically, the serialized representation of the
regions of a PP is a (ordered) sequence of source location encodings.
For de-serialization, regions from loaded files are stored by their ASTs.
When later one queries if a loaded location L is in an opt-out
region, PP looks up the regions of the loaded AST where L is at.
(Background if helps: a pair of `#pragma clang unsafe_buffer_usage begin/end` pragmas marks a
warning-opt-out region. The begin and end locations (opt-out regions)
are stored in preprocessor instances (PP) and will be queried by the
`-Wunsafe-buffer-usage` analyzer.)
The reported issue at upstream: https://github.com/llvm/llvm-project/issues/90501
rdar://124035402
Add some primitive syntax highlighting to our code snippet output.
This adds "checkpoints" to the Preprocessor, which we can use to start lexing from. When printing a code snippet, we lex from the nearest checkpoint and highlight the tokens based on their token type.
Previous representation used an enumeration combined to a switch to
dispatch to the appropriate lexer.
Use function pointer so that the dispatching is just an indirect call,
which is actually better because lexing is a costly task compared to a
function call.
This also makes the code slightly cleaner, speedup on compile time
tracker are consistent and range form -0.05% to -0.20% for NewPM-O0-g,
see
https://llvm-compile-time-tracker.com/compare.php?from=f9906508bc4f05d3950e2219b4c56f6c078a61ef&to=608c85ec1283638db949d73e062bcc3355001ce4&stat=instructions:u
Considering just the preprocessing task, preprocessing the sqlite
amalgametion takes -0.6% instructions (according to valgrind
--tool=callgrind)
---------
Co-authored-by: serge-sans-paille <sguelton@mozilla.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
`ModuleDeclState` is incorrectly changed to `NamedModuleImplementation`
for `struct module {}; void foo(module a);`. This is mostly benign but
leads to a spurious warning after #69555.
A real world example is:
```
// pybind11.h
class module_ { ... };
using module = module_;
// tensorflow
void DefineMetricsModule(pybind11::module main_module);
// `module main_module);` incorrectly changes `ModuleDeclState` to `NamedModuleImplementation`
#include <algorithm> // spurious warning
```
This fixes many unit tests when trying to enable IncrementalExtensions
by default for testing purposes.
Differential Revision: https://reviews.llvm.org/D158415
This new method repeatedly calls Lex() until end of file is reached
and optionally fills a std::vector of Tokens. Use it in Clang's unit
tests to avoid quite some code duplication.
Differential Revision: https://reviews.llvm.org/D158413
The preprocessor `IncrementalProcessing` option was being used to
control whether or not to teardown the lexer or run the end of
translation unit action. In D127284 this was merged with
`-fincremental-extensions`, which also changes top level parsing.
Split these again so that the former behavior can be achieved without
the latter (ie. to allow managing lifetime without also changing
parsing).
Resolves rdar://113406310.
Setting __FLT_EVAL_METHOD__ to -1 with fast-math will set
__GLIBC_FLT_EVAL_METHOD to 2 and long double ends up being used for
float_t and double_t. This creates some ABI breakage with various C libraries.
See details here: https://github.com/llvm/llvm-project/issues/60781
This reverts commit bbf0d1932a3c1be970ed8a580e51edf571b80fd5.
As the diagnostic message shows, we should remove -fmodules-ts flag in
clang/llvm17. Since clang/llvm16 is already branched. We can remove the
depreacared flag now.
This patch prepares the necessary interfaces in the preprocessor part
for D137527 since we need to recognize if we're in a module unit, the
module kinds and the module declaration and the module we're importing
in the preprocessor.
Differential Revision: https://reviews.llvm.org/D137526
Add a pair of clang pragmas:
- `#pragma clang unsafe_buffer_usage begin` and
- `#pragma clang unsafe_buffer_usage end`,
which specify the start and end of an (unsafe buffer checking) opt-out
region, respectively.
Behaviors of opt-out regions conform to the following rules:
- No nested nor overlapped opt-out regions are allowed. One cannot
start an opt-out region with `... unsafe_buffer_usage begin` but never
close it with `... unsafe_buffer_usage end`. Mis-use of the pragmas
will be warned.
- Warnings raised from unsafe buffer operations inside such an opt-out
region will always be suppressed. This behavior CANNOT be changed by
`clang diagnostic` pragmas or command-line flags.
- Warnings raised from unsafe operations outside of such opt-out
regions may be reported on declarations inside opt-out
regions. These warnings are NOT suppressed.
- An un-suppressed unsafe operation warning may be attached with
notes. These notes are NOT suppressed as well regardless of whether
they are in opt-out regions.
The implementation maintains a separate sequence of location pairs
representing opt-out regions in `Preprocessor`. The `UnsafeBufferUsage`
analyzer reads the region sequence to check if an unsafe operation is
in an opt-out region. If it is, discard the warning raised from the
operation immediately.
This is a re-land after I reverting it at 9aa00c8a306561c4e3ddb09058e66bae322a0769.
The compilation error should be resolved.
Reviewed by: NoQ
Differential revision: https://reviews.llvm.org/D140179
Add a pair of clang pragmas:
- `#pragma clang unsafe_buffer_usage begin` and
- `#pragma clang unsafe_buffer_usage end`,
which specify the start and end of an (unsafe buffer checking) opt-out
region, respectively.
Behaviors of opt-out regions conform to the following rules:
- No nested nor overlapped opt-out regions are allowed. One cannot
start an opt-out region with `... unsafe_buffer_usage begin` but never
close it with `... unsafe_buffer_usage end`. Mis-use of the pragmas
will be warned.
- Warnings raised from unsafe buffer operations inside such an opt-out
region will always be suppressed. This behavior CANNOT be changed by
`clang diagnostic` pragmas or command-line flags.
- Warnings raised from unsafe operations outside of such opt-out
regions may be reported on declarations inside opt-out
regions. These warnings are NOT suppressed.
- An un-suppressed unsafe operation warning may be attached with
notes. These notes are NOT suppressed as well regardless of whether
they are in opt-out regions.
The implementation maintains a separate sequence of location pairs
representing opt-out regions in `Preprocessor`. The `UnsafeBufferUsage`
analyzer reads the region sequence to check if an unsafe operation is
in an opt-out region. If it is, discard the warning raised from the
operation immediately.
Reviewed by: NoQ
Differential revision: https://reviews.llvm.org/D140179
The needed tweaks are mostly trivial, the one nasty bit is Clang's usage
of OptionalStorage. To keep this working old Optional stays around as
clang::CustomizableOptional, with the default Storage removed.
Optional<File/DirectoryEntryRef> is replaced with a typedef.
I tested this with GCC 7.5, the oldest supported GCC I had around.
Differential Revision: https://reviews.llvm.org/D140332
This reverts commit 8f0df9f3bbc6d7f3d5cbfd955c5ee4404c53a75d.
The Optional*RefDegradesTo*EntryPtr types want to keep the same size as
the underlying type, which std::optional doesn't guarantee. For use with
llvm::Optional, they define their own storage class, and there is no way
to do that in std::optional.
On top of that, that commit broke builds with older GCCs, where
std::optional was not trivially copyable (static_assert in the clang
sources was failing).
This list was originally used for to make sure MacroInfo's clever memory
management got called (1f1e4bdbf7815c), but that was
simplified in 73a29662b9bf640a and 1f1e4bdbf7815c, and there's nothing left.
Differential Revision: https://reviews.llvm.org/D136725
This patch fixes compilation failure with explicit modules caused by scanner not reporting the module map describing the module whose implementation is being compiled.
Below is a breakdown of the attached test case. Note the VFS that makes frameworks "A" and "B" share the same header "shared/H.h".
In non-modular build, Clang skips the last import, since the "shared/H.h" header has already been included.
During scan (or implicit build), the compiler handles "tu.m" as follows:
* `@import B` imports module "B", as expected,
* `#import <A/H.h>` is resolved textually (due to `-fmodule-name=A`) to "shared/H.h" (due to the VFS remapping),
* `#import <B/H.h>` is resolved to import module "A_Private", since the header "shared/H.h" is already known to be part of that module, and the import is skipped.
In the end, the only modular dependency of the TU is "B".
In explicit modular build without `-fmodule-name=A`, TU does depend on module "A_Private" properly, not just textually. Clang therefore builds & loads its PCM, and knows to ignore the last import, since "shared/H.h" is known to be part of "A_Private".
But with current scanner behavior and `-fmodule-name=A` present, the last import fails during explicit build. Clang doesn't know about "A_Private" (it's included textually) and tries to import "B_Private" instead, which it doesn't know about either (the scanner correctly didn't report it as dependency). This is fixed by reporting the module map describing "A" and matching the semantics of implicit build.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D134222
The patch diagnoses an identifier as a future keyword if it exists in a
future language mode, such as:
int restrict;
in C modes earlier than C99. We now give a warning to the user that
such an identifier is a future keyword. Handles keywords from C as well
as C++.
Differential Revision: https://reviews.llvm.org/D131683
This addresses [cpp.include]/7
(when encountering #include header-name)
If the header identified by the header-name denotes an importable header, it
is implementation-defined whether the #include preprocessing directive is
instead replaced by an import directive.
In this implementation, include translation is performed _only_ for headers
in the Global Module fragment, so:
```
module;
#include "will-be-translated.h" // IFF the header unit is available.
export module M;
#include "will-not-be-translated.h" // even if the header unit is available
```
The reasoning is that, in general, includes in the module purview would not
be validly translatable (they would have to immediately follow the module
decl and without any other intervening decls). Otherwise that would violate
the rules on contiguous import directives.
This would be quite complex to track in the preprocessor, and for relatively
little gain (the user can 'import "will-not-be-translated.h";' instead.)
TODO: This is one area where it becomes increasingly difficult to disambiguate
clang modules in C++ from C++ standard modules. That needs to be addressed in
both the driver and the FE.
Differential Revision: https://reviews.llvm.org/D128981
When running `clang -E -Ofast` on macOS, the `__FLT_EVAL_METHOD__` macro is `0`, which causes the following typedef to be emitted into the preprocessed source: `typedef float float_t`.
However, when running `clang -c -Ofast`, `__FLT_EVAL_METHOD__` is `-1`, and `typedef long double float_t` is emitted.
This causes build errors for certain projects, which are not reproducible when compiling from preprocessed source.
The issue is that `__FLT_EVAL_METHOD__` is configured in `Sema::Sema` which is not executed when running in `-E` mode.
This change moves that logic into the preprocessor initialization method, which is invoked correctly in `-E` mode.
rdar://96134605
rdar://92748429
Differential Revision: https://reviews.llvm.org/D128814
"Ascii" StringLiteral instances are actually narrow strings
that are UTF-8 encoded and do not have an encoding prefix.
(UTF8 StringLiteral are also UTF-8 encoded strings, but with
the u8 prefix.
To avoid possible confusion both with actuall ASCII strings,
and with future works extending the set of literal encodings
supported by clang, this rename StringLiteral::isAscii() to
isOrdinary(), matching C++ standard terminology.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D128762