Depends on #169603.
This is the `use_device_ptr` counterpart of #168905.
With OpenMP 6.1, a `fallback` modifier can be specified on the
`use_device_ptr` clause to control the behavior when a pointer lookup
fails, i.e. there is no device pointer to translate into.
The default is `fb_preserve` (i.e. retain the original pointer), while
`fb_nullify` means: use `nullptr` as the translated pointer.
Dependent PR: #173930.
This PR unifies the terminology for:
* "context hash" - previously ambiguously referred to as "module hash"
or as overly specific "module context hash"
* "specific module cache path" - previously referred to as just "module
cache path" - hard to distinguish from the command-line-provided module
cache path without the context hash
NFCI
It is unclear (to me) why this needs to be done "for safety", but
this change significantly improves the effectiveness of lazy loading.
Reviewed as part of https://github.com/llvm/llvm-project/pull/133057
This reverts commit 1928c1ea9b57e9c44325d436bc7bb2f4585031f3.
We have at least one repro, but I won't be able to work on this until
next week. Also with Clang 22 cut upcoming, we probably need to revert
for now.
PCHs (but also modules generated from several implicit invocations like
swiftc) previously reported a confusing diagnostic about module caches
being mismatched by subdir. This is an implementation detail of the
module machinery, and not very useful to the end user. Instead, report
this case as a configuration mismatch when the compiler can confirm the
module cache was passed the same between the current TU & previously
compiled products.
Ideally, each argument that could result in this error would be uniquely
reported (e.g., O3), but as a starting point, providing something more
general is strictly better than pointing the user to the module cache.
This patch also includes NFCs for renaming variable names from Module to
AST and formatting cleanup in related areas.
resolves: rdar://167453135
RISC-V vector intrinsic is generated dynamically at runtime, thus it's
note preserved in AST yet when using precompile header, neither do
information in SemaRISCV. We need to write these information to ast
record to be able to use precompile header for RISC-V.
Fixes#109634
## Problem
Given code such as `N::foo();`, we perform name look-up on `N`. In the
case where `N` is a namespace declared in imported modules, one
namespace decl (the "key declaration") for each module that declares a
namespace `foo` is loaded and stored. In large scales where there are
many such modules, (e.g., 1,500) and many uses (e.g., 500,000), this
becomes extremely inefficient because every look-up (500,000 of them)
return 1,500 results.
The following synthetic script demonstrates the problem:
```bash
#/usr/bin/env bash
CLANG=${CLANG:-clang++}
NUM_MODULES=${NUM_MODULES:-1500}
NUM_USES=${NUM_USES:-500000}
USE_MODULES=${USE_MODULES:-true}
TMPDIR=$(mktemp -d)
echo "Working in temp directory: $TMPDIR"
cd $TMPDIR
trap "rm -rf \"$TMPDIR\"" EXIT
echo "namespace N { inline void foo() {} }" > m1.h
for i in $(seq 2 $NUM_MODULES); do echo "namespace N {}" > m${i}.h; done
if $USE_MODULES; then
seq 1 $NUM_MODULES | xargs -I {} -P $(nproc) bash -c "$CLANG -std=c++20 -fmodule-header m{}.h"
fi
> a.cpp
if $USE_MODULES; then
for i in $(seq 1 $NUM_MODULES); do echo "import \"m${i}.h\";" >> a.cpp; done
else
for i in $(seq 1 $NUM_MODULES); do echo "#include \"m${i}.h\"" >> a.cpp; done
fi
echo "int main() {" >> a.cpp
for i in $(seq 1 $NUM_USES); do echo " N::foo();" >> a.cpp; done
echo "}" >> a.cpp
if $USE_MODULES; then
time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null \
$(for i in $(seq 1 $NUM_MODULES); do echo -n "-fmodule-file=m${i}.pcm "; done)
else
time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null
fi
```
As of 575d6892bcc5cef926cfc1b95225148262c96a15, without modules
(`USE_MODULES=false`) this takes about **4.5s**, whereas with modules
(`USE_MODULES=true`), this takes about **37s**.
With this PR, without modules there's no change (as expected) at 4.5s,
but with modules it improves to about **5.2s**.
## Approach
The approach taken here aims to maintain status-quo with respect to the
input and output of modules. That is, the `ASTReader` and `ASTWriter`
both read and write the same declarations as it did before. The
difference is in the middle part: the [`StoredDeclsMap` in
`DeclContext`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/AST/DeclBase.h#L2024-L2030).
The `StoredDeclsMap` is roughly a `map<DeclarationName,
StoredDeclsList>`. Currently, we read all of the external namespace
decls from `ASTReader`, they all get stored into the `StoredDeclsList`,
and the `ASTWriter` iterates through that list and writes out the
results.
This PR continues to read all of the external namespace decls from
`ASTReader`, but only stores one namespace decl in the
`StoredDeclsList`. This is okay since the reading of the decls handles
all of the merging and chaining of the namespace decls, and as long as
they're loaded and chained, returning one for look-up purposes is
sufficient.
The other half of the problem is to write out all of the external
namespaces that we used to store in `StoredDeclsList` but no longer. For
this, we take advantage of the
[`KeyDecls`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/Serialization/ASTReader.h#L1342-L1347)
data structure in `ASTReader`. `KeyDecls` is roughly a `map<Decl *,
vector<GlobalDeclID>>`, and it stores a mapping from the canonical decl
of a redeclarable decl to a list of `GlobalDeclID`s where each ID
represents a "key declaration" from each imported module. More to the
point, if we read external namespaces `N1`, `N2`, `N3` in `ASTReader`,
we'll either have `N1` mapped to `[N2, N3]`, or some newly local
canonical decl mapped to `[N1, N2, N3]`. Either way, we can visit `N1`,
`N2`, and `N3` by doing `ASTReader::forEachImportedKeyDecls(N1,
Visitor)`, and we leverage this to maintain the current behavior of
writing out all of the imported namespace decls in `ASTWriter`.
## Alternatives Attempted
- Tried reading fewer declarations on the `ASTReader` side, and writing
out fewer declarations on the `ASTWriter` side, and neither options
worked at all.
- Tried trying to split `StoredDeclsList` into two pieces, one with
non-namespace decls and one with only namespace decls, but that didn't
work well... I think because the order of the declarations matter
sometimes, and maybe also because the declaration replacement logic gets
more complicated.
- Tried to deduplicate at the `SemaLookup` level. Basically, retrieve
all the stored decls but deduplicate populating the `LookupResult`
[here](https://github.com/llvm/llvm-project/blob/release/21.x/clang/lib/Sema/SemaLookup.cpp#L1137-L1144).
This did improve things slightly, but not quite enough, and this
solution seemed cleaner in the end anyway.
- CUDA's dynamic parallelism extension allows device-side kernel
launches, which share the identical syntax to host-side launches, e.g.,
kernel<<<Dg, Db, Ns, S>>>(arguments);
but differ from the code generation. That device-side kernel launches is
eventually translated into the following sequence
config = cudaGetParameterBuffer(alignment, size);
// setup arguments by copying them into `config`.
cudaLaunchDevice(func, config, Dg, Db, Ns, S);
- To support the device-side kernel launch, 'CUDAKernelCallExpr' is
reused but its config expr is set to a call to 'cudaLaunchDevice'.
During the code generation, 'CUDAKernelCallExpr' is expanded into the
sequence aforementioned.
- As the device-side kernel launch requires the source to be compiled as
relocatable device code and linked with '-lcudadevrt'. Linkers are
changed to pass relevant link options to 'nvlink'.
As described in section 2.14.6 of openmp spec, the patch implements
support for iterator in motion clauses.
---------
Co-authored-by: Shashwathi N <nshashwa@pe31.hpc.amslabs.hpecorp.net>
Similar to previous no transitive changes to decls, types, identifiers
and source locations (
https://github.com/llvm/llvm-project/pull/92083https://github.com/llvm/llvm-project/pull/92085https://github.com/llvm/llvm-project/pull/92511https://github.com/llvm/llvm-project/pull/86912
)
This patch does the same thing for MacroID and PreprocessedEntityID.
---
### Some background
Previously we record different IDs linearly. That is, when writing a
module, if we have 17 decls in imported modules, the ID of decls in the
module will start from 18. This makes the contents of the BMI changes if
the we add/remove any decls, types, identifiers and source locations in
the imported modules.
This makes it hard for us to reduce recompilations with modules. We want
to skip recompilations as we think the modules can help us to remove
fake dependencies. This can be done by split the ID into <ModuleIndex,
LocalIndex> pairs.
This is ALREADY done for several different ID above. We call it
non-casacading changes
(https://clang.llvm.org/docs/StandardCPlusPlusModules.html#experimental-non-cascading-changes).
Our internal users have already used this feature and it works well for
years.
Now we want to extend this to MacroID and PreprocessedEntityID. This is
helpful for us in the downstream as we allowed named modules to export
macros. But I believe this is also helpful for header-like modules if
you'd like to explore the area.
And also I think this is a nice cleanup too.
---
Given the use of MacroID and PreprocessedEntityID are not as complicated
as other IDs in the above series, I feel the patch itself should be
good. I hope the vendors can test the patch to make sure it won't affect
existing users.
Close https://github.com/llvm/llvm-project/issues/166068
The cause of the problem is that we would import initializers and
pending implicit instantiations from other named module. This is very
bad and it may waste a lot of time.
And we didn't observe it as the weak symbols can live together and the
strong symbols would be removed by other mechanism. So we didn't observe
the bad behavior for a long time. But it indeeds waste compilation time.
This PR adds support for the `dyn_groupprivate` clause, which will be
part of OpenMP 6.1. This feature allows users to request dynamic shared
memory on target regions.
---------
Co-authored-by: Krzysztof Parzyszek <Krzysztof.Parzyszek@amd.com>
This implements the parts of
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3457.htm which were
adopted at the recent meeting in Brno.
Clang already implemented `__COUNTER__`, but needed some changes for
conformance. Specifically, we now diagnose when the macro is expanded
more than 2147483647 times. Additionally, we now give the expected
extension and pre-compat warnings for the feature.
To support testing the limits, this also adds a -cc1-only option,
`-finitial-counter-value=`, which lets you specify the initial value the
`__COUNTER__` macro should expand to.
This PR refactors `ASTUnit::LoadFromASTFile()` to be easier to follow.
Conceptually, it tries to read an AST file, adopt the serialized
options, and set up `Sema` and `ASTContext` to deserialize the AST file
contents on-demand.
The implementation of this used to be spread across an
`ASTReaderListener` and the function in question. Figuring out what
listener method gets called when and how it's supposed to interact with
the rest of the functionality was very unclear. The `FileManager`'s VFS
was being swapped-out during deserialization, the options were being
adopted by `Preprocessor` and others just-in-time to pass `ASTReader`'s
validation checks, and the target was being initialized somewhere in
between all of this. This lead to a very muddy semantics.
This PR splits `ASTUnit::LoadFromASTFile()` into three distinct steps:
1. Read out the options from the AST file.
2. Initialize objects from the VFS to the `ASTContext`.
3. Load the AST file and hook it up with the compiler objects.
This should be much easier to understand, and I've done my best to
clearly document the remaining gotchas.
(This was originally motivated by the desire to remove
`FileManager::setVirtualFileSystem()` and make it impossible to swap out
VFSs from underneath `FileManager` mid-compile.)
This rename was made as part of
https://github.com/llvm/llvm-project/pull/147835 in order to ease
rebasing the PR, and give a nice window for other patches to get rebased
as well.
It has been a while already, so lets go ahead and rename it back.
This PR enhances the OpenMP `nowait` clause implementation by adding
support for optional argument in both parsing and semantic analysis
phases.
Reference:
1. OpenMP 6.0 Specification, page 481
#137363 was supposed to be NFC for the `CrossProcessModuleCache` (a.k.a
normal implicit module builds), but accidentally passed the wrong path
to `sys::fs::status`. Then, #141358 removed the correct path that
should've been passed instead. (The variable was flagged as unused.)
None of our existing tests caught this regression, we only found out due
to a SourceKit-LSP benchmark getting slower.
This PR re-implements the original behavior, adds new remark to Clang
for PCM input file validation, and uses it to create more reliable tests
of the `-fmodules-validate-once-per-build-session` flag.
This reverts commit
8d9aecce06.
Additionally, this refactors how we're doing the AST storage to put it
all in the trailing storage, which will hopefully prevent it from
leaking. The problem was that the AST doesn't call destructors on things
in ASTContext storage, so we weren't actually able to delete the
combiner
SmallVector (which I should have known...). This patch instead moves all
of that SmallVector data into trailing storage, which shouldn't have the
same
problem with leaking as before.
This is the first patch of a handful to get the reduction combiner
recipe lowering properly. THIS patch is NFC as it doesn't actually
change anything except the structure of the AST.
For each 'combiner' recipe we need a 'LHS' 'RHS' and expression to
represent the operation.
Each var-reference can have 1 or more combiners.
IF it is a plain scalar, or a struct with the proper operator, or an
array of either of those, there will be 1.
HOWEVER, aggregates without the proper operator are supposed to be
broken down and done from their elements (which can only be scalars). In
this case, we will represent 1 'combiner' recipe per field-decl.
This patch only puts the infrastructure in place to do so, future
patches wll do the work to fill this in.
I originally expected that we were going to need the initExpr stored
separately from the allocaDecl when doing arrays/pointers, however after
implementing it, we found that the idea of having the allocaDecl just
store its init directly still works perfectly. This patch removes the
extra field from the AST.
This change implements the fuse directive, `#pragma omp fuse`, as specified in the OpenMP 6.0, along with the `looprange` clause in clang.
This change also adds minimal stubs so flang keeps compiling (a full implementation in flang of this directive is still pending).
---------
Co-authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
A DependentTemplateSpecializationType (DTST) is basically just a
TemplateSpecializationType (TST) with a hardcoded DependentTemplateName
(DTN) as its TemplateName.
This removes the DTST and replaces all uses of it with a TST, removing a
lot of duplication in the implementation.
Technically the hardcoded DTN is an optimization for a most common case,
but the TST implementation is in better shape overall and with other
optimizations, so this patch ends up being an overall performance
positive:
<img width="1465" height="38" alt="image"
src="https://github.com/user-attachments/assets/084b0694-2839-427a-b664-eff400f780b5"
/>
A DTST also didn't allow a template name representing a DTN that was
substituted, such as from an alias template, while the TST does allow it
by the simple fact it can hold an arbitrary TemplateName, so this patch
also increases the amount of sugar retained, while still being faster
overall.
Example (from included test case):
```C++
template<template<class> class TT> using T1 = TT<int>;
template<class T> using T2 = T1<T::template X>;
```
Here we can now represent in the AST that `TT` was substituted for the
dependent template name `T::template X`.
Expressions/references with 'bounds' are going to need to do
initialization significantly differently, so we need to have the
initializer and the declaration 'separate' in the future. This patch
splits the AST node into two, and normalizes them a bit.
Additionally, since this required significant work on the recipe
generation, this patch also does a bit of a refactor to improve
readability and future expansion, now that we have a good understanding
of how these are going to look.
OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the
num_threads clause on parallel directives, along with the message and
severity clauses. This commit implements necessary codegen changes.
OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the
num_threads clause on parallel directives, along with the message and
severity clauses. This commit implements necessary codegen changes.
This reintroduces `Type.h`, having earlier been renamed to `TypeBase.h`,
as a redirection to `TypeBase.h`, and redirects most users to include
the former instead.
This is a preparatory patch for being able to provide inline definitions
for `Type` methods which would otherwise cause a circular dependency
with `Decl{,CXX}.h`.
Doing these operations into their own NFC patch helps the git rename
detection logic work, preserving the history.
This patch makes clang just a little slower to build (~0.17%), just
because it makes more code indirectly include `DeclCXX.h`.
This is a preparatory patch, to be able to provide inline definitions
for `Type` functions which depend on `Decl{,CXX}.h`. As the latter also
depends on `Type.h`, this would not be possible without some
reorganizing.
Splitting this rename into its own patch allows git to track this as a
rename, and preserve all git history, and not force any code
reformatting.
A later NFC patch will reintroduce `Type.h` as redirection to
`TypeBase.h`, rewriting most places back to directly including `Type.h`
instead of `TypeBase.h`, leaving only a handful of places where this is
necessary.
Then yet a later patch will exploit this by making more stuff inline.
`ASTReader::FinishedDeserializing()` calls
`adjustDeductedFunctionResultType(...)` [0], which in turn calls
`FunctionDecl::getMostRecentDecl()`[1]. In modules builds,
`getMostRecentDecl()` may reach out to the `ASTReader` and start
deserializing again. Starting deserialization starts `ReadTimer`;
however, `FinishedDeserializing()` doesn't call `stopTimer()` until
after it's call to `adjustDeductedFunctionResultType(...)` [2]. As a
result, we hit an assert checking that we don't start an already started
timer [3]. To fix this, we simply don't start the timer if it's already
running.
Unfortunately I don't have a test case for this yet as modules builds
are notoriously difficult to reduce.
[0]:
4d2288d318/clang/lib/Serialization/ASTReader.cpp (L11053)
[1]:
4d2288d318/clang/lib/AST/ASTContext.cpp (L3804)
[2]:
4d2288d318/clang/lib/Serialization/ASTReader.cpp (L11065-L11066)
[3]:
4d2288d318/llvm/lib/Support/Timer.cpp (L150)
The new builtin `__builtin_dedup_pack` removes duplicates from list of
types.
The added builtin is special in that they produce an unexpanded pack
in the spirit of P3115R0 proposal.
Produced packs can be used directly in template argument lists and get
immediately expanded as soon as results of the computation are
available.
It allows to easily combine them, e.g.:
```cpp
template <class ...T>
struct Normalize {
// Note: sort is not included in this PR, it illustrates the idea.
using result = std::tuple<
__builtin_sort_pack<
__builtin_dedup_pack<int, double, T...>...
>...>;
}
;
```
Limitations:
- only supported in template arguments and bases,
- can only be used inside the templates, even if non-dependent,
- the builtins cannot be assigned to template template parameters.
The actual implementation proceeds as follows:
- When the compiler encounters a `__builtin_dedup_pack` or other
type-producing
builtin with dependent arguments, it creates a dependent
`TemplateSpecializationType`.
- During substitution, if the template arguments are non-dependent, we
will produce: a new type `SubstBuiltinTemplatePackType`, which stores
an argument pack that needs to be substituted. This type is similar to
the existing `SubstTemplateParmPack` in that it carries the argument
pack that needs to be expanded further. The relevant code is shared.
- On top of that, Clang also wraps the resulting type into
`TemplateSpecializationType`, but this time only as a sugar.
- To actually expand those packs, we collect the produced
`SubstBuiltinTemplatePackType` inside `CollectUnexpandedPacks`.
Because we know the size of the produces packs only after the initial
substitution, places that do the actual expansion will need to have a
second run over the substituted type to finalize the expansions (in
this patch we only support this for template arguments, see
`ExpandTemplateArgument`).
If the expansion are requested in the places we do not currently
support, we will produce an error.
More follow-up work will be needed to fully shape this:
- adding the builtin that sorts types,
- remove the restrictions for expansions,
- implementing P3115R0 (scheduled for C++29, see
https://github.com/cplusplus/papers/issues/2300).
This patch does the bare minimum to start setting up the reduction
recipe support, including adding a type to the AST to store it. No real
additional work is done, and a bunch of static_asserts are left around
to allow us to do this properly.
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
This patch adds the 'init recipes' to firstprivate like I did for
'private', so that we can properly init these types. At the moment,
the recipe init isn't generated (just the VarDecl), and this isn't
really used anywhere as it will be used exclusively in Codegen.
Previously, #151360 implemented 'private' clause lowering, but didn't
properly initialize the variables. This patch adds that behavior to make
sure we correctly get the constructor or other init called.
This fixes an ambiguous type type_info when you try and reference the
`type_info` type while using clang modulemaps with `-fms-compatibility`
enabled
Fixes#38400
Handles clang::DiagnosticsEngine and clang::DiagnosticIDs.
For DiagnosticIDs, this mostly migrates from `new DiagnosticIDs` to
convenience method `DiagnosticIDs::create()`.
Part of cleanup https://github.com/llvm/llvm-project/issues/151026
## Problem
This is a regression that was observed in Clang 20 on modules code that
uses import std.
The lazy-loading mechanism for template specializations introduced in
#119333 can currently load additional nodes when called multiple times,
which breaks assumptions made by code that iterates over
specializations. This leads to iterator invalidation crashes in some
scenarios.
The core issue occurs when:
1. Code calls `spec_begin()` to get an iterator over template
specializations. This invokes `LoadLazySpecializations()`.
2. Code then calls `spec_end()` to get the end iterator.
3. During the `spec_end()` call, `LoadExternalSpecializations()` is
invoked again.
4. This can load additional specializations for certain cases,
invalidating the begin iterator returned in 1.
I was able to trigger the problem when constructing a ParentMapContext.
The regression test demonstrates two ways to trigger the construction of
the ParentMapContext on problematic code:
- The ArrayBoundV2 checker
- Unsigned overflow detection in sanitized builds
Unfortunately, simply dumping the ast (e.g. using `-ast-dump-all`)
doesn't trigger the crash because dumping requires completing the redecl
chain before iterating over the specializations.
## Solution
The fix ensures that the redeclaration chain is always completed
**before** loading external specializations by calling
`CompleteRedeclChain(D)` at the start of
`LoadExternalSpecializations()`. The idea is to ensure that all
`SpecLookups` are fully known and loaded before the call to
`LoadExternalSpecializationsImpl()`.
The checks for the 'z' and 't' format specifiers added in the original
PR #143653 had some issues and were overly strict, causing some build
failures and were consequently reverted at
4c85bf2fe8.
In the latest commit
27c58629ec,
I relaxed the checks for the 'z' and 't' format specifiers, so warnings
are now only issued when they are used with mismatched types.
The original intent of these checks was to diagnose code that assumes
the underlying type of `size_t` is `unsigned` or `unsigned long`, for
example:
```c
printf("%zu", 1ul); // Not portable, but not an error when size_t is unsigned long
```
However, it produced a significant number of false positives. This was
partly because Clang does not treat the `typedef` `size_t` and
`__size_t` as having a common "sugar" type, and partly because a large
amount of existing code either assumes `unsigned` (or `unsigned long`)
is `size_t`, or they define the equivalent of size_t in their own way
(such as
sanitizer_internal_defs.h).2e67dcfdcd/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h (L203)
Including the results of `sizeof`, `sizeof...`, `__datasizeof`,
`__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and
signed `size_t` literals, the results of pointer-pointer subtraction and
checks for standard library functions (and their calls).
The goal is to enable clang and downstream tools such as clangd and
clang-tidy to provide more portable hints and diagnostics.
The previous discussion can be found at #136542.
This PR implements this feature by introducing a new subtype of `Type`
called `PredefinedSugarType`, which was considered appropriate in
discussions. I tried to keep `PredefinedSugarType` simple enough yet not
limited to `size_t` and `ptrdiff_t` so that it can be used for other
purposes. `PredefinedSugarType` wraps a canonical `Type` and provides a
name, conceptually similar to a compiler internal `TypedefType` but
without depending on a `TypedefDecl` or a source file.
Additionally, checks for the `z` and `t` format specifiers in format
strings for `scanf` and `printf` were added. It will precisely match
expressions using `typedef`s or built-in expressions.
The affected tests indicates that it works very well.
Several code require that `SizeType` is canonical, so I kept `SizeType`
to its canonical form.
The failed tests in CI are allowed to fail. See the
[comment](https://github.com/llvm/llvm-project/pull/135386#issuecomment-3049426611)
in another PR #135386.
Add `NamespaceBaseDecl` as common base class of `NamespaceDecl` and
`NamespaceAliasDecl`. This simplifies `NestedNameSpecifier` a bit.
Co-authored-by: Matheus Izvekov <mizvekov@gmail.com>