This patch adds logic to canonicalize `-include-pch`'s input in the
frontend. This way, the `ASTWriter` always serializes the canonicalized
path to the included pch file whether the input is an absolute path or a
relative path.
Fixes rdar://168596546.
Depends on #169603.
This is the `use_device_ptr` counterpart of #168905.
With OpenMP 6.1, a `fallback` modifier can be specified on the
`use_device_ptr` clause to control the behavior when a pointer lookup
fails, i.e. there is no device pointer to translate into.
The default is `fb_preserve` (i.e. retain the original pointer), while
`fb_nullify` means: use `nullptr` as the translated pointer.
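The two fallback behaviors can be modelled outside of OpenMP as a plain lookup with a miss policy. The following is only a sketch: `Fallback`, `DeviceMap` and `translate` are hypothetical names for illustration, not the OpenMP runtime API and not the clause syntax.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical names for illustration; not the OpenMP runtime API.
enum class Fallback { Preserve, Nullify };

// Host-to-device pointer table standing in for the runtime's mapping.
using DeviceMap = std::unordered_map<const void *, void *>;

// Translate a host pointer; on a failed lookup, apply the fallback policy.
inline void *translate(const DeviceMap &Map, void *HostPtr,
                       Fallback FB = Fallback::Preserve) {
  auto It = Map.find(HostPtr);
  if (It != Map.end())
    return It->second;                       // device pointer found
  return FB == Fallback::Preserve ? HostPtr  // keep the original pointer
                                  : nullptr; // or translate to nullptr
}
```

With the default policy a missing entry returns the host pointer unchanged, while `Fallback::Nullify` returns `nullptr`.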
Dependent PR: #173930.
This PR unifies the terminology for:
* "context hash" - previously ambiguously referred to as "module hash"
or as overly specific "module context hash"
* "specific module cache path" - previously referred to as just "module
cache path" - hard to distinguish from the command-line-provided module
cache path without the context hash
NFCI
This reverts commit 1928c1ea9b57e9c44325d436bc7bb2f4585031f3.
We have at least one repro, but I won't be able to work on this until
next week. Also with Clang 22 cut upcoming, we probably need to revert
for now.
RISC-V vector intrinsics are generated dynamically at runtime, so they
are not yet preserved in the AST when using a precompiled header, and
neither is the information in SemaRISCV. We need to write this
information to the AST record to be able to use precompiled headers for
RISC-V.
Fixes #109634
## Problem
Given code such as `N::foo();`, we perform name look-up on `N`. In the
case where `N` is a namespace declared in imported modules, one
namespace decl (the "key declaration") for each module that declares a
namespace `N` is loaded and stored. At large scale, where there are
many such modules (e.g., 1,500) and many uses (e.g., 500,000), this
becomes extremely inefficient because every look-up (500,000 of them)
returns 1,500 results.
The following synthetic script demonstrates the problem:
```bash
#!/usr/bin/env bash
CLANG=${CLANG:-clang++}
NUM_MODULES=${NUM_MODULES:-1500}
NUM_USES=${NUM_USES:-500000}
USE_MODULES=${USE_MODULES:-true}
TMPDIR=$(mktemp -d)
echo "Working in temp directory: $TMPDIR"
cd $TMPDIR
trap "rm -rf \"$TMPDIR\"" EXIT
echo "namespace N { inline void foo() {} }" > m1.h
for i in $(seq 2 $NUM_MODULES); do echo "namespace N {}" > m${i}.h; done
if $USE_MODULES; then
seq 1 $NUM_MODULES | xargs -I {} -P $(nproc) bash -c "$CLANG -std=c++20 -fmodule-header m{}.h"
fi
> a.cpp
if $USE_MODULES; then
for i in $(seq 1 $NUM_MODULES); do echo "import \"m${i}.h\";" >> a.cpp; done
else
for i in $(seq 1 $NUM_MODULES); do echo "#include \"m${i}.h\"" >> a.cpp; done
fi
echo "int main() {" >> a.cpp
for i in $(seq 1 $NUM_USES); do echo " N::foo();" >> a.cpp; done
echo "}" >> a.cpp
if $USE_MODULES; then
time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null \
$(for i in $(seq 1 $NUM_MODULES); do echo -n "-fmodule-file=m${i}.pcm "; done)
else
time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null
fi
```
As of 575d6892bcc5cef926cfc1b95225148262c96a15, without modules
(`USE_MODULES=false`) this takes about **4.5s**, whereas with modules
(`USE_MODULES=true`), this takes about **37s**.
With this PR, without modules there's no change (as expected) at 4.5s,
but with modules it improves to about **5.2s**.
## Approach
The approach taken here aims to maintain status-quo with respect to the
input and output of modules. That is, the `ASTReader` and `ASTWriter`
both read and write the same declarations as they did before. The
difference is in the middle part: the [`StoredDeclsMap` in
`DeclContext`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/AST/DeclBase.h#L2024-L2030).
The `StoredDeclsMap` is roughly a `map<DeclarationName,
StoredDeclsList>`. Currently, we read all of the external namespace
decls from `ASTReader`, they all get stored into the `StoredDeclsList`,
and the `ASTWriter` iterates through that list and writes out the
results.
This PR continues to read all of the external namespace decls from
`ASTReader`, but only stores one namespace decl in the
`StoredDeclsList`. This is okay since the reading of the decls handles
all of the merging and chaining of the namespace decls, and as long as
they're loaded and chained, returning one for look-up purposes is
sufficient.
The other half of the problem is to write out all of the external
namespaces that we used to store in `StoredDeclsList` but no longer do.
For this, we take advantage of the
[`KeyDecls`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/Serialization/ASTReader.h#L1342-L1347)
data structure in `ASTReader`. `KeyDecls` is roughly a `map<Decl *,
vector<GlobalDeclID>>`, and it stores a mapping from the canonical decl
of a redeclarable decl to a list of `GlobalDeclID`s where each ID
represents a "key declaration" from each imported module. More to the
point, if we read external namespaces `N1`, `N2`, `N3` in `ASTReader`,
we'll either have `N1` mapped to `[N2, N3]`, or some newly local
canonical decl mapped to `[N1, N2, N3]`. Either way, we can visit `N1`,
`N2`, and `N3` by doing `ASTReader::forEachImportedKeyDecls(N1,
Visitor)`, and we leverage this to maintain the current behavior of
writing out all of the imported namespace decls in `ASTWriter`.
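The split between "one decl for look-up" and "all key decls for writing" can be sketched with toy data structures. This is a simplified model under stated assumptions: `NamespaceDecl`, `ToyContext`, `addImported` and `forEachImported` are hypothetical stand-ins, not Clang's `StoredDeclsMap`, `KeyDecls` or `ASTReader::forEachImportedKeyDecls`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins; none of these are Clang's real classes.
struct NamespaceDecl {
  std::string Name;
  int OwningModule;
};

struct ToyContext {
  // Look-up table: a single representative decl per name.
  std::map<std::string, const NamespaceDecl *> Lookup;
  // Side table: canonical decl -> key decls from the other imported modules.
  std::map<const NamespaceDecl *, std::vector<const NamespaceDecl *>> KeyDecls;

  // Called once per imported namespace decl.
  void addImported(const NamespaceDecl *D) {
    auto [It, Inserted] = Lookup.try_emplace(D->Name, D);
    if (!Inserted)
      KeyDecls[It->second].push_back(D); // chained, but not stored for look-up
  }

  // Visit the representative decl and every key decl chained to it.
  template <typename Fn>
  void forEachImported(const std::string &Name, Fn Visit) const {
    auto It = Lookup.find(Name);
    if (It == Lookup.end())
      return;
    Visit(It->second);
    auto K = KeyDecls.find(It->second);
    if (K != KeyDecls.end())
      for (const NamespaceDecl *D : K->second)
        Visit(D);
  }
};
```

In this model a look-up of `N` touches one entry regardless of how many modules declare it, while the writer side can still reach all of `N1`, `N2` and `N3` through the side table.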
## Alternatives Attempted
- Tried reading fewer declarations on the `ASTReader` side, and writing
out fewer declarations on the `ASTWriter` side, and neither option
worked at all.
- Tried to split `StoredDeclsList` into two pieces, one with
non-namespace decls and one with only namespace decls, but that didn't
work well... I think because the order of the declarations matters
sometimes, and maybe also because the declaration replacement logic gets
more complicated.
- Tried to deduplicate at the `SemaLookup` level. Basically, retrieve
all the stored decls but deduplicate populating the `LookupResult`
[here](https://github.com/llvm/llvm-project/blob/release/21.x/clang/lib/Sema/SemaLookup.cpp#L1137-L1144).
This did improve things slightly, but not quite enough, and this
solution seemed cleaner in the end anyway.
This reverts commit
54a4da9df6.
MSVC supports an extension that allows deleting an array of objects via
a pointer whose static type doesn't match its dynamic type. This is done
via generation of special destructors - vector deleting destructors.
MSVC's virtual tables always contain a pointer to the vector deleting
destructor for classes with virtual destructors, so not having this
extension implemented causes clang to generate code that is not
compatible with the code generated by MSVC, because clang always puts a
pointer to a scalar deleting destructor in the vtable. As a bonus, the
deletion of an array of polymorphic objects will work just like it does
with MSVC - no memory leaks and correct destructors are called.
This patch will cause clang to emit code that is compatible with code
produced by MSVC but not compatible with code produced with clang of
older versions, so the new behavior can be disabled via passing
-fclang-abi-compat=21 (or lower).
Fixes https://github.com/llvm/llvm-project/issues/19772
- CUDA's dynamic parallelism extension allows device-side kernel
launches, which share the same syntax as host-side launches, e.g.,
    kernel<<<Dg, Db, Ns, S>>>(arguments);
but differ in code generation. A device-side kernel launch is
eventually translated into the following sequence:
    config = cudaGetParameterBuffer(alignment, size);
    // set up arguments by copying them into `config`.
    cudaLaunchDevice(func, config, Dg, Db, Ns, S);
- To support the device-side kernel launch, 'CUDAKernelCallExpr' is
reused but its config expr is set to a call to 'cudaLaunchDevice'.
During code generation, 'CUDAKernelCallExpr' is expanded into the
aforementioned sequence.
- The device-side kernel launch requires the source to be compiled as
relocatable device code and linked with '-lcudadevrt', so linkers are
changed to pass the relevant link options to 'nvlink'.
As described in section 2.14.6 of the OpenMP spec, this patch implements
support for iterators in motion clauses.
---------
Co-authored-by: Shashwathi N <nshashwa@pe31.hpc.amslabs.hpecorp.net>
MSVC supports an extension that allows deleting an array of objects via
a pointer whose static type doesn't match its dynamic type. This is done
via generation of special destructors - vector deleting destructors.
MSVC's virtual tables always contain a pointer to the vector deleting
destructor for classes with virtual destructors, so not having this
extension implemented causes clang to generate code that is not
compatible with the code generated by MSVC, because clang always puts a
pointer to a scalar deleting destructor in the vtable. As a bonus, the
deletion of an array of polymorphic objects will work just like it does
with MSVC - no memory leaks and correct destructors are called.
This patch will cause clang to emit code that is compatible with code
produced by MSVC but not compatible with code produced with clang of
older versions, so the new behavior can be disabled via passing
-fclang-abi-compat=21 (or lower).
This is yet another attempt to land vector deleting destructors support
originally implemented by
https://github.com/llvm/llvm-project/pull/133451.
This PR contains fixes for issues reported in the original PR as well as
fixes for issues related to operator delete[] search reported in several
issues like
https://github.com/llvm/llvm-project/pull/133950#issuecomment-2787510484
https://github.com/llvm/llvm-project/issues/134265
Fixes https://github.com/llvm/llvm-project/issues/19772
Similar to the previous no-transitive-change patches for decls, types,
identifiers and source locations
(https://github.com/llvm/llvm-project/pull/92083,
https://github.com/llvm/llvm-project/pull/92085,
https://github.com/llvm/llvm-project/pull/92511,
https://github.com/llvm/llvm-project/pull/86912),
this patch does the same thing for MacroID and PreprocessedEntityID.
---
### Some background
Previously we recorded different IDs linearly. That is, when writing a
module, if we have 17 decls in imported modules, the ID of decls in the
module will start from 18. This makes the contents of the BMI change if
we add/remove any decls, types, identifiers or source locations in
the imported modules.
This makes it hard for us to reduce recompilations with modules. We want
to skip recompilations, as we think modules can help us remove
fake dependencies. This can be done by splitting the ID into <ModuleIndex,
LocalIndex> pairs.
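The difference between the two numbering schemes can be illustrated with a small sketch. The 32/32-bit layout, the use of module index 0 for the module being written, and the names `SplitID`, `encode`, `decode` and `linearID` are assumptions made for illustration; the actual encoding in the serialization code may differ.

```cpp
#include <cstdint>

// Hypothetical layout: high 32 bits = module index, low 32 bits = local index.
struct SplitID {
  std::uint32_t ModuleIndex; // assumed: 0 means "the module being written"
  std::uint32_t LocalIndex;
};

constexpr std::uint64_t encode(SplitID ID) {
  return (static_cast<std::uint64_t>(ID.ModuleIndex) << 32) | ID.LocalIndex;
}

constexpr SplitID decode(std::uint64_t Raw) {
  return {static_cast<std::uint32_t>(Raw >> 32),
          static_cast<std::uint32_t>(Raw & 0xFFFFFFFFu)};
}

// Linear scheme: local IDs start right after all imported entities.
constexpr std::uint64_t linearID(std::uint64_t NumImported,
                                 std::uint64_t LocalIndex) {
  return NumImported + LocalIndex;
}

// With 17 imported decls, the first local decl gets ID 18.
static_assert(linearID(17, 1) == 18);
// Adding one decl to an imported module shifts every local ID.
static_assert(linearID(18, 1) != linearID(17, 1));
// The split ID of the first local decl does not depend on the import count.
static_assert(encode({0, 1}) == 1);
```

Because the local index is independent of how many entities the imports contain, a change in an imported module no longer perturbs the IDs written for the current module.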
This is ALREADY done for several of the IDs above. We call it
non-cascading changes
(https://clang.llvm.org/docs/StandardCPlusPlusModules.html#experimental-non-cascading-changes).
Our internal users have already used this feature and it has worked well
for years.
Now we want to extend this to MacroID and PreprocessedEntityID. This is
helpful for us in the downstream as we allowed named modules to export
macros. But I believe this is also helpful for header-like modules if
you'd like to explore the area.
And also I think this is a nice cleanup too.
---
Given that the use of MacroID and PreprocessedEntityID is not as
complicated as that of the other IDs in the above series, I feel the
patch itself should be good. I hope the vendors can test the patch to
make sure it won't affect existing users.
Close https://github.com/llvm/llvm-project/issues/166068
The cause of the problem is that we would import initializers and
pending implicit instantiations from other named modules. This is very
bad and it may waste a lot of time.
We didn't observe it because the weak symbols can live together and the
strong symbols would be removed by other mechanisms, so the bad behavior
went unnoticed for a long time. But it does indeed waste compilation time.
This PR adds support for the `dyn_groupprivate` clause, which will be
part of OpenMP 6.1. This feature allows users to request dynamic shared
memory on target regions.
---------
Co-authored-by: Krzysztof Parzyszek <Krzysztof.Parzyszek@amd.com>
Fixes #165445.
Fixes a crash when `ASTWriter::GenerateNameLookupTable` processes enum
constants from C++20 header units.
The special handling for enum constants, introduced in fccc6ee, doesn't
account for declarations whose owning module is a C++20 header unit. It
calls `isNamedModule()` on the result of
`getTopLevelOwningNamedModule()`, which returns null for header units,
causing a null pointer dereference.
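The failure mode and one plausible shape of a guard can be sketched with toy types. `Module`, `Decl` and `isInNamedModule` below are simplified stand-ins written for illustration; they are not Clang's classes, and the actual fix in the patch may be structured differently.

```cpp
#include <cassert>

// Toy stand-ins for illustration; not Clang's real Module/Decl classes.
struct Module {
  bool Named;
  bool isNamedModule() const { return Named; }
};

struct Decl {
  const Module *OwningNamed; // null when the decl is owned by a header unit
  const Module *getTopLevelOwningNamedModule() const { return OwningNamed; }
};

// Guarded query: "no owning named module" is treated as "not in a named
// module" instead of dereferencing a null pointer.
inline bool isInNamedModule(const Decl &D) {
  const Module *M = D.getTopLevelOwningNamedModule();
  return M && M->isNamedModule();
}
```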
This PR enhances the OpenMP `nowait` clause implementation by adding
support for an optional argument in both the parsing and semantic
analysis phases.
Reference:
1. OpenMP 6.0 Specification, page 481
This is the first patch of a handful to get the reduction combiner
recipe lowering properly. THIS patch is NFC as it doesn't actually
change anything except the structure of the AST.
For each 'combiner' recipe we need a 'LHS' 'RHS' and expression to
represent the operation.
Each var-reference can have 1 or more combiners.
IF it is a plain scalar, or a struct with the proper operator, or an
array of either of those, there will be 1.
HOWEVER, aggregates without the proper operator are supposed to be
broken down and done from their elements (which can only be scalars). In
this case, we will represent 1 'combiner' recipe per field-decl.
This patch only puts the infrastructure in place to do so, future
patches will do the work to fill this in.
I originally expected that we were going to need the initExpr stored
separately from the allocaDecl when doing arrays/pointers, however after
implementing it, we found that the idea of having the allocaDecl just
store its init directly still works perfectly. This patch removes the
extra field from the AST.
This change implements the fuse directive, `#pragma omp fuse`, as specified in the OpenMP 6.0, along with the `looprange` clause in clang.
This change also adds minimal stubs so flang keeps compiling (a full implementation in flang of this directive is still pending).
---------
Co-authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
Close https://github.com/llvm/llvm-project/issues/159424
Close https://github.com/llvm/llvm-project/issues/133720
For in-class friend declarations, it is hard for the serializer to decide
if they are visible to other modules. But luckily, Sema can handle it
well enough. So it is fine to mark all of the in-class friend
declarations as generally visible in ASTWriter and let Sema make
the final call. This is safe as long as the corresponding class's
visibility is correct.
While working on vector deleting destructors support
([GH19772](https://github.com/llvm/llvm-project/issues/19772)), I
noticed that MSVC produces different code in scalar deleting destructor
body depending on whether class defined its own operator delete. In
MSABI deleting destructors accept an additional implicit flag parameter
allowing some sort of flexibility. The mismatch I noticed is that
whenever a global operator delete is called, i.e. `::delete`, in the
code produced by MSVC the implicit flag argument has a value that makes
the 3rd bit set, i.e. "5" for scalar deleting destructors "7" for vector
deleting destructors.
Prior to this patch, clang handled `::delete` by calling global
operator delete directly after the destructor call and not calling class
operator delete in the scalar deleting destructor body, by passing "0" as
the implicit flag argument value. This is fine until there is an attempt to
link binaries compiled with clang with binaries compiled with MSVC. The
problem is that in binaries produced by MSVC the callsite of the
destructor won't call global operator delete because it is assumed that
the destructor will do that, and a destructor body generated by clang
never does.
This patch removes the call to global operator delete from the callsite and
adds an additional check of the 3rd bit of the implicit parameter inside
the scalar deleting destructor body.
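The flag check can be sketched as a small decision function. Only the 3rd bit and the values "5" and "7" come from the description above; the names and meanings of the two lower bits (`CallDelete`, `VectorDelete`) are assumptions chosen to be consistent with those values, and `selectDelete` is a hypothetical helper, not code from the patch.

```cpp
// Bit names are assumptions consistent with the values 5 and 7 quoted above.
enum DeletingDtorFlag : unsigned {
  CallDelete = 1u << 0,   // assumed: the destructor should free the memory
  VectorDelete = 1u << 1, // assumed: the object is an array
  GlobalDelete = 1u << 2, // the "3rd bit": use global operator delete
};

enum class DeleteKind { None, ClassDelete, GlobalDeleteCall };

// Decide which operator delete the deleting destructor body should call.
constexpr DeleteKind selectDelete(unsigned Flags) {
  if (!(Flags & CallDelete))
    return DeleteKind::None;
  return (Flags & GlobalDelete) ? DeleteKind::GlobalDeleteCall
                                : DeleteKind::ClassDelete;
}

static_assert(selectDelete(5) == DeleteKind::GlobalDeleteCall); // scalar, ::delete
static_assert(selectDelete(7) == DeleteKind::GlobalDeleteCall); // vector, ::delete
static_assert(selectDelete(1) == DeleteKind::ClassDelete);
static_assert(selectDelete(0) == DeleteKind::None);
```

Under this model the callsite only has to pass the right flag value, and the destructor body is the single place that picks between class and global operator delete.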
---------
Co-authored-by: Tom Honermann <tom@honermann.net>
A DependentTemplateSpecializationType (DTST) is basically just a
TemplateSpecializationType (TST) with a hardcoded DependentTemplateName
(DTN) as its TemplateName.
This removes the DTST and replaces all uses of it with a TST, removing a
lot of duplication in the implementation.
Technically the hardcoded DTN is an optimization for a most common case,
but the TST implementation is in better shape overall and with other
optimizations, so this patch ends up being an overall performance
positive:
<img width="1465" height="38" alt="image"
src="https://github.com/user-attachments/assets/084b0694-2839-427a-b664-eff400f780b5"
/>
A DTST also didn't allow a template name representing a DTN that was
substituted, such as from an alias template, while the TST does allow it
by the simple fact it can hold an arbitrary TemplateName, so this patch
also increases the amount of sugar retained, while still being faster
overall.
Example (from included test case):
```C++
template<template<class> class TT> using T1 = TT<int>;
template<class T> using T2 = T1<T::template X>;
```
Here we can now represent in the AST that `TT` was substituted for the
dependent template name `T::template X`.
This reverts commit 613caa909c78f707e88960723c6a98364656a926, essentially
reapplying 4a4bddec3571d78c8073fa45b57bbabc8796d13d after moving
`normalizeModuleCachePath` from clangFrontend to clangLex.
This PR is part of an effort to remove file system usage from the
command line parsing code. The reason for that is that it's impossible
to do file system access correctly without a configured VFS, and the VFS
can only be configured after the command line is parsed. I don't want to
intertwine command line parsing and VFS configuration, so I decided to
perform the file system access after the command line is parsed and the
VFS is configured - ideally right before the file system entity is used
for the first time.
This patch delays normalization of the module cache path until
`CompilerInstance` is asked for the cache path in the current
compilation context.
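Deferring the normalization until first use can be sketched as a lazily computed accessor. `ToyInstance` is a hypothetical holder, and the purely lexical normalization via `std::filesystem` is a stand-in: the real code is expected to go through the configured VFS rather than the standard library.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <string>
#include <utility>

// Hypothetical holder; Clang's CompilerInstance differs in detail.
class ToyInstance {
  std::string RawCachePath;                      // stored verbatim by option parsing
  mutable std::optional<std::string> Normalized; // filled in on first use

public:
  explicit ToyInstance(std::string Raw) : RawCachePath(std::move(Raw)) {}

  // What command line parsing recorded; no file system access involved.
  const std::string &rawCachePath() const { return RawCachePath; }

  // Normalize lazily. A real implementation would consult the configured VFS;
  // this sketch only normalizes the path lexically.
  const std::string &moduleCachePath() const {
    if (!Normalized)
      Normalized = std::filesystem::path(RawCachePath)
                       .lexically_normal()
                       .generic_string();
    return *Normalized;
  }
};
```

The parsing step only stores the string, and the first call to `moduleCachePath()` pays for the normalization once.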
This reverts commit 4a4bddec3571d78c8073fa45b57bbabc8796d13d. The Serialization library doesn't link Frontend, where CompilerInstance lives, causing link failures on some build bots.
This PR is part of an effort to remove file system usage from the
command line parsing code. The reason for that is that it's impossible
to do file system access correctly without a configured VFS, and the VFS
can only be configured after the command line is parsed. I don't want to
intertwine command line parsing and VFS configuration, so I decided to
perform the file system access after the command line is parsed and the
VFS is configured - ideally right before the file system entity is used
for the first time.
This patch delays normalization of the module cache path until
`CompilerInstance` is asked for the cache path in the current
compilation context.
Expressions/references with 'bounds' are going to need to do
initialization significantly differently, so we need to have the
initializer and the declaration 'separate' in the future. This patch
splits the AST node into two, and normalizes them a bit.
Additionally, since this required significant work on the recipe
generation, this patch also does a bit of a refactor to improve
readability and future expansion, now that we have a good understanding
of how these are going to look.
OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the
num_threads clause on parallel directives, along with the message and
severity clauses. This commit implements necessary codegen changes.
This reintroduces `Type.h`, having earlier been renamed to `TypeBase.h`,
as a redirection to `TypeBase.h`, and redirects most users to include
the former instead.
This is a preparatory patch for being able to provide inline definitions
for `Type` methods which would otherwise cause a circular dependency
with `Decl{,CXX}.h`.
Doing these operations in their own NFC patch helps the git rename
detection logic work, preserving the history.
This patch makes clang just a little slower to build (~0.17%), just
because it makes more code indirectly include `DeclCXX.h`.
This is a preparatory patch, to be able to provide inline definitions
for `Type` functions which depend on `Decl{,CXX}.h`. As the latter also
depends on `Type.h`, this would not be possible without some
reorganizing.
Splitting this rename into its own patch allows git to track this as a
rename, and preserve all git history, and not force any code
reformatting.
A later NFC patch will reintroduce `Type.h` as redirection to
`TypeBase.h`, rewriting most places back to directly including `Type.h`
instead of `TypeBase.h`, leaving only a handful of places where this is
necessary.
Then yet a later patch will exploit this by making more stuff inline.
The new builtin `__builtin_dedup_pack` removes duplicates from a list of
types.
The added builtin is special in that it produces an unexpanded pack
in the spirit of the P3115R0 proposal.
Produced packs can be used directly in template argument lists and get
immediately expanded as soon as results of the computation are
available.
This makes them easy to combine, e.g.:
```cpp
template <class ...T>
struct Normalize {
// Note: sort is not included in this PR, it illustrates the idea.
using result = std::tuple<
__builtin_sort_pack<
__builtin_dedup_pack<int, double, T...>...
>...>;
};
```
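For comparison, deduplicating a type list can also be written as ordinary library-level template code. The sketch below is plain standard C++17 and does not use the builtin; `TypeList`, `AppendUnique`, `DedupImpl` and `Dedup` are hypothetical names, and keeping the first occurrence of each type is a choice made for this sketch.

```cpp
#include <type_traits>

template <class... Ts> struct TypeList {};

// Append T to the list unless it is already present.
template <class List, class T> struct AppendUnique;
template <class... Ts, class T> struct AppendUnique<TypeList<Ts...>, T> {
  using type = std::conditional_t<(std::is_same_v<Ts, T> || ...),
                                  TypeList<Ts...>, TypeList<Ts..., T>>;
};

// Walk the input pack left to right, keeping the first occurrence of each type.
template <class Acc, class... Ts> struct DedupImpl {
  using type = Acc;
};
template <class Acc, class T, class... Ts> struct DedupImpl<Acc, T, Ts...> {
  using type =
      typename DedupImpl<typename AppendUnique<Acc, T>::type, Ts...>::type;
};

template <class... Ts>
using Dedup = typename DedupImpl<TypeList<>, Ts...>::type;

static_assert(std::is_same_v<Dedup<int, double, int, char, double>,
                             TypeList<int, double, char>>);
static_assert(std::is_same_v<Dedup<>, TypeList<>>);
```

Note that this version has to wrap its result in a `TypeList`, whereas the builtin described here yields a pack that can be expanded in place.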
Limitations:
- only supported in template arguments and bases,
- can only be used inside the templates, even if non-dependent,
- the builtins cannot be assigned to template template parameters.
The actual implementation proceeds as follows:
- When the compiler encounters a `__builtin_dedup_pack` or other
  type-producing builtin with dependent arguments, it creates a
  dependent `TemplateSpecializationType`.
- During substitution, if the template arguments are non-dependent, we
  produce a new type, `SubstBuiltinTemplatePackType`, which stores
  an argument pack that needs to be substituted. This type is similar to
  the existing `SubstTemplateParmPack` in that it carries the argument
  pack that needs to be expanded further. The relevant code is shared.
- On top of that, Clang also wraps the resulting type into
  `TemplateSpecializationType`, but this time only as sugar.
- To actually expand those packs, we collect the produced
  `SubstBuiltinTemplatePackType` inside `CollectUnexpandedPacks`.
  Because we know the size of the produced packs only after the initial
  substitution, places that do the actual expansion will need to have a
  second run over the substituted type to finalize the expansions (in
  this patch we only support this for template arguments, see
  `ExpandTemplateArgument`).
If an expansion is requested in a place we do not currently
support, we will produce an error.
More follow-up work will be needed to fully shape this:
- adding the builtin that sorts types,
- removing the restrictions for expansions,
- implementing P3115R0 (scheduled for C++29, see
https://github.com/cplusplus/papers/issues/2300).
This patch does the bare minimum to start setting up the reduction
recipe support, including adding a type to the AST to store it. No real
additional work is done, and a bunch of static_asserts are left around
to allow us to do this properly.
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes to which it could
previously apply can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well.
This patch also further enables other optimizations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscellaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started with the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There are also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
to make rebasing easier. I plan to rename it back after this lands.
Fixes #136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
When two threads are accessing the same `pcm`, it is possible that the
reading thread sees the timestamp update, while the file on disk is not
updated.
This PR moves timestamp update from `writeAST` to
`compileModuleAndReadASTImpl`, so we only update the timestamp after the
file has been committed to disk.
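The ordering this PR establishes, commit the file first and touch the timestamp second, can be sketched with the standard library. `commitThenStamp`, the `.tmp` suffix and the `.timestamp` suffix are assumptions for illustration; this is not the code in `compileModuleAndReadASTImpl`.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: publish Contents at Final, then touch a timestamp file.
inline void commitThenStamp(const fs::path &Final, const std::string &Contents) {
  fs::path Tmp = Final;
  Tmp += ".tmp";
  {
    std::ofstream Out(Tmp, std::ios::binary | std::ios::trunc);
    Out << Contents;
  } // stream closed before the rename

  // Step 1: the file becomes visible under its final name.
  fs::rename(Tmp, Final);

  // Step 2: only now is the timestamp updated, so a reader that sees the
  // new timestamp can also see the committed file.
  fs::path Stamp = Final;
  Stamp += ".timestamp";
  std::ofstream Touch(Stamp, std::ios::trunc);
}
```

Doing the two steps in the opposite order would reopen the window described above, where a reader observes the new timestamp before the file is in place.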
rdar://152097193
This patch adds the 'init recipes' to firstprivate like I did for
'private', so that we can properly init these types. At the moment,
the recipe init isn't generated (just the VarDecl), and this isn't
really used anywhere as it will be used exclusively in Codegen.
Previously, #151360 implemented 'private' clause lowering, but didn't
properly initialize the variables. This patch adds that behavior to make
sure we correctly get the constructor or other init called.
This fixes an ambiguous `type_info` type when you try to reference the
`type_info` type while using clang modulemaps with `-fms-compatibility`
enabled.
Fixes #38400
The checks for the 'z' and 't' format specifiers added in the original
PR #143653 had some issues and were overly strict, causing some build
failures and were consequently reverted at
4c85bf2fe8.
In the latest commit
27c58629ec,
I relaxed the checks for the 'z' and 't' format specifiers, so warnings
are now only issued when they are used with mismatched types.
The original intent of these checks was to diagnose code that assumes
the underlying type of `size_t` is `unsigned` or `unsigned long`, for
example:
```c
printf("%zu", 1ul); // Not portable, but not an error when size_t is unsigned long
```
However, it produced a significant number of false positives. This was
partly because Clang does not treat the `typedef` `size_t` and
`__size_t` as having a common "sugar" type, and partly because a large
amount of existing code either assumes `unsigned` (or `unsigned long`)
is `size_t`, or defines the equivalent of `size_t` in its own way
(such as `compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h`,
line 203, at commit 2e67dcfdcd).
Including the results of `sizeof`, `sizeof...`, `__datasizeof`,
`__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and
signed `size_t` literals, the results of pointer-pointer subtraction and
checks for standard library functions (and their calls).
The goal is to enable clang and downstream tools such as clangd and
clang-tidy to provide more portable hints and diagnostics.
The previous discussion can be found at #136542.
This PR implements this feature by introducing a new subtype of `Type`
called `PredefinedSugarType`, which was considered appropriate in
discussions. I tried to keep `PredefinedSugarType` simple enough yet not
limited to `size_t` and `ptrdiff_t` so that it can be used for other
purposes. `PredefinedSugarType` wraps a canonical `Type` and provides a
name, conceptually similar to a compiler internal `TypedefType` but
without depending on a `TypedefDecl` or a source file.
Additionally, checks for the `z` and `t` format specifiers in format
strings for `scanf` and `printf` were added. It will precisely match
expressions using `typedef`s or built-in expressions.
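As a small illustration of arguments that match these specifiers exactly: `sizeof` yields a `size_t` and pointer subtraction yields a `ptrdiff_t`, so `%zu` and `%td` are the portable choices for them. `describe` below is a hypothetical helper written for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format a size and a pointer difference with the matching length modifiers.
inline std::string describe(std::size_t Size, std::ptrdiff_t Diff) {
  char Buf[64];
  std::snprintf(Buf, sizeof(Buf), "%zu %td", Size, Diff);
  return Buf;
}
```

A call such as `describe(sizeof(arr) / sizeof(arr[0]), &arr[3] - &arr[0])` passes values whose types match the format string on every target, unlike `1ul` with `%zu`.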
The affected tests indicate that it works very well.
Several places in the code require that `SizeType` is canonical, so I
kept `SizeType` in its canonical form.
The failed tests in CI are allowed to fail. See the
[comment](https://github.com/llvm/llvm-project/pull/135386#issuecomment-3049426611)
in another PR #135386.