Summary:
The `__builtin_shufflevector` call would return a GCC vector in all
cases where the vector type was increased. Change this to preserve
whether or not this was an extended vector.
Fixes: https://github.com/llvm/llvm-project/issues/107981
Summary:
Clang has support for boolean vectors, these builtins expose the LLVM
instruction of the same name. This differs from a manual load and select
by potentially suppressing traps from deactivated lanes.
Fixes: https://github.com/llvm/llvm-project/issues/107753
These builtins are modeled on the clzg/ctzg builtins, which accept an
optional second argument. This second argument is returned if the first
argument is 0. These builtins unconditionally exhibit zero-is-undef
behaviour, regardless of target preference for the other ctz/clz
builtins. The builtins have constexpr support.
Fixes#154113
These warnings are reported on a per expression basis, however some
potential misaligned accesses are discarded before that happens. The
problem is when a new expression starts while processing another
expression. The new expression will end first and emit all potential
misaligned accesses collected up to that point. That includes candidates
that were found in the parent expression, even though they might have
gotten discarded later.
Fixed by checking if the candidate is located withing the currently
processed expression.
Fixes#144729
The goal is to correctly identify diagnostics that are emitted by virtue
of -Wformat-signedness.
Before this change, diagnostic messages triggered by -Wformat-signedness
might look like:
format specifies type 'unsigned int' but the argument has type 'int'
[-Wformat]
signedness of format specifier 'u' is incompatible with 'c' [-Wformat]
With this change:
format specifies type 'unsigned int' but the argument has type 'int',
which differs in signedness [-Wformat-signedness]
signedness of format specifier 'u' is incompatible with 'c'
[-Wformat-signedness]
Fix:
- handleFormatSignedness can now return NoMatchSignedness. Callers
handle this.
- warn_format_conversion_argument_type extends the message it used to
emit by a string that
mentions "signedness".
- warn_format_cmp_specifier_sign_mismatch is now correctly categorized
as a
diagnostic controlled by -Wformat-signedness.
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
This is a first pass at implementing
[P2841R7](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2841r7.pdf).
The implementation is far from complete; however, I'm aiming to do that
in chunks, to make our lives easier.
In particular, this does not implement
- Subsumption
- Mangling
- Satisfaction checking is minimal as we should focus on #141776 first
(note that I'm currently very stuck)
FTM, release notes, status page, etc, will be updated once the feature
is more mature. Given the state of the feature, it is not yet allowed in
older language modes.
Of note:
- Mismatches between template template arguments and template template
parameters are a bit wonky. This is addressed by #130603
- We use `UnresolvedLookupExpr` to model template-id. While this is
pre-existing, I have been wondering if we want to introduce a different
OverloadExpr subclass for that. I did not make the change in this patch.
The checks for the 'z' and 't' format specifiers added in the original
PR #143653 had some issues and were overly strict, causing some build
failures and were consequently reverted at
4c85bf2fe8.
In the latest commit
27c58629ec,
I relaxed the checks for the 'z' and 't' format specifiers, so warnings
are now only issued when they are used with mismatched types.
The original intent of these checks was to diagnose code that assumes
the underlying type of `size_t` is `unsigned` or `unsigned long`, for
example:
```c
printf("%zu", 1ul); // Not portable, but not an error when size_t is unsigned long
```
However, it produced a significant number of false positives. This was
partly because Clang does not treat the `typedef` `size_t` and
`__size_t` as having a common "sugar" type, and partly because a large
amount of existing code either assumes `unsigned` (or `unsigned long`)
is `size_t`, or they define the equivalent of size_t in their own way
(such as
sanitizer_internal_defs.h).2e67dcfdcd/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h (L203)
Including the results of `sizeof`, `sizeof...`, `__datasizeof`,
`__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and
signed `size_t` literals, the results of pointer-pointer subtraction and
checks for standard library functions (and their calls).
The goal is to enable clang and downstream tools such as clangd and
clang-tidy to provide more portable hints and diagnostics.
The previous discussion can be found at #136542.
This PR implements this feature by introducing a new subtype of `Type`
called `PredefinedSugarType`, which was considered appropriate in
discussions. I tried to keep `PredefinedSugarType` simple enough yet not
limited to `size_t` and `ptrdiff_t` so that it can be used for other
purposes. `PredefinedSugarType` wraps a canonical `Type` and provides a
name, conceptually similar to a compiler internal `TypedefType` but
without depending on a `TypedefDecl` or a source file.
Additionally, checks for the `z` and `t` format specifiers in format
strings for `scanf` and `printf` were added. It will precisely match
expressions using `typedef`s or built-in expressions.
The affected tests indicates that it works very well.
Several code require that `SizeType` is canonical, so I kept `SizeType`
to its canonical form.
The failed tests in CI are allowed to fail. See the
[comment](https://github.com/llvm/llvm-project/pull/135386#issuecomment-3049426611)
in another PR #135386.
If field width is specified, the sign/space is already accounted for
within the field width, so no additional size is needed.
Fixes https://github.com/llvm/llvm-project/issues/143951.
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
`std::invoke` is currently quite heavy compared to a function call,
since it involves quite heavy SFINAE. This can be done significantly
more efficient by the compiler, since most calls to `std::invoke` are
simple function calls and 6 out of the seven overloads for `std::invoke`
exist only to support member pointers. Even these boil down to a few
relatively simple checks.
Some real-world testing with this patch revealed some significant
results. For example, instantiating `std::format("Banane")` (and its
callees) went down from ~125ms on my system to ~104ms.
This commit addresses a fallout introduced by #126846.
Previously, TryGetExprRange would return an IntRange that has an active
range exceeding the maximum representable range for the expression's
underlying type. This led to clang erroneously issuing warnings about
implicit conversions losing integer precision.
This commit fixes the bug by capping IntRange::Width to MaxWidth.
rdar://149444029
The function already exposes a work list to avoid deep recursion, this
commit starts utilizing it in a helper that could also lead to a deep
recursion.
We have observed this crash on `clang/test/C/C99/n590.c` with our
internal builds that enable aggressive optimizations and hit the limit
earlier than default release builds of Clang.
See the added test for an example with a deeper recursion that used to
crash in upstream Clang before this change with the following stack
trace:
```
#0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13
#1 llvm::sys::RunSignalHandlers() /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Signals.cpp:106:18
#2 SignalHandler(int, siginfo_t*, void*) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
#3 (/lib/x86_64-linux-gnu/libc.so.6+0x3fdf0)
#4 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12772:0
#5 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
#6 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
#7 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
#8 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
#9 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
#10 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
#11 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
#12 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
#13 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
#14 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
#15 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
#16 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
#17 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
#18 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
#19 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
... 700+ more stack frames.
```
This introduces the attribute discussed in
https://discourse.llvm.org/t/rfc-function-type-attribute-to-prevent-cfi-instrumentation/85458.
The proposed name has been changed from `no_cfi` to
`cfi_unchecked_callee` to help differentiate from `no_sanitize("cfi")`
more easily. The proposed attribute has the following semantics:
1. Indirect calls to a function type with this attribute will not be
instrumented with CFI. That is, the indirect call will not be checked.
Note that this only changes the behavior for indirect calls on pointers
to function types having this attribute. It does not prevent all
indirect function calls for a given type from being checked.
2. All direct references to a function whose type has this attribute
will always reference the true function definition rather than an entry
in the CFI jump table.
3. When a pointer to a function with this attribute is implicitly cast
to a pointer to a function without this attribute, the compiler will
give a warning saying this attribute is discarded. This warning can be
silenced with an explicit C-style cast or C++ static_cast.
With pointer authentication it becomes non-trivial to correctly load the
vtable pointer of a polymorphic object.
__builtin_get_vtable_pointer is a function that performs the load and
performs the appropriate authentication operations if necessary.
This PR reverts a change made in #126846.
#126846 introduced an ``-Wimplicit-int-conversion`` diagnosis for
```c++
int8_t x = something;
x = -x; // warning here
```
This is technically correct since -x could have a width of 9, but this
pattern is common in codebases.
Reverting this change would also introduce the false positive I fixed in
#126846:
```c++
bool b(signed char c) {
return -c >= 128; // -c can be 128
}
```
This false positive is uncommon, so I think it makes sense to revert the
change.
Microsoft helpfully defines `THIS` to `void` in two different platform
SDK headers, at least one of which is reachable via <Windows.h>. We have
a user who ran into a build because of `THIS` unfortunate macro name
collision.
Rename the members to better match our naming conventions.
Fixes#142186
Fixes#141397
Element-wise math builtins (e.g.
__builtin_elementwise_max/__builtin_elementwise_pow etc.) fail when
their arguments have different qualifications, but should be
well-formed. The fix is to use hasSameUnqualifiedType to check if the
arguments match.
The patch introduce __builtin_spirv_generic_cast_to_ptr_explicit which
is lowered to the llvm.spv.generic.cast.to.ptr.explicit intrinsic.
The SPIR-V builtins are now split into 3 differents file:
BuiltinsSPIRVCore.td,
BuiltinsSPIRVVK.td for Vulkan specific builtins, BuiltinsSPIRVCL.td for
OpenCL specific builtins
and BuiltinsSPIRVCommon.td for common ones.
The patch also introduces a new header defining its SPIR-V friendly
equivalent (__spirv_GenericCastToPtrExplicit_ToGlobal,
__spirv_GenericCastToPtrExplicit_ToLocal and
__spirv_GenericCastToPtrExplicit_ToPrivate). The functions are declared
as aliases to the new builtin allowing C-like languages to have a
definition to rely on as well as gaining proper front-end diagnostics.
The motivation for the header is to provide a stable binding for
applications or library (such as SYCL) and allows non SPIR-V targets to
provide an implementation (via libclc or similar to how it is done for
gpuintrin.h).
charN_t represent code units of different UTF encodings. Therefore the
values of 2 different charN_t objects do not represent the same
characters.
In order to avoid comparing apples and oranges, we add new warnings to
warn on:
- Implicit conversions
- Comparisons
- Other cases involving arithmetic conversions
We only produce the warning if we cannot establish the comparison would
be safe through constant evaluation.
The new `-Wimplicit-unicode-conversion` warning is enabled by default.
Note that this PR intentionally doesn;t touches char/wchar_t, but it
would be worth considering also warning on extending the new warnings to
these types (in a follow up)
Additionally most arithmetic operations on charN_t don't really make
sense (ie what does it mean to addition code units), so we could add
warnings for that.
Fixes#138526
The passed indices have to be constant integers anyway, which we verify
before creating the ShuffleVectorExpr. Use the value we create there and
save the indices using a ConstantExpr instead. This way, we don't have
to evaluate the args every time we call getShuffleMaskIdx().
When calling a function that expects zero arguments with one argument,
`Call->getArg(1)` will trap when trying to format the diagnostic.
This also seems to improve the rendering of the diagnostic some of the
time. Before:
```
$ ./bin/clang -c a.c
a.c:2:30: error: too many arguments to function call, expected 2, have 4
2 | __builtin_annotation(1, 2, 3, 4);
| ~ ^
```
After:
```
$ ./bin/clang -c a.c
a.c:2:30: error: too many arguments to function call, expected 2, have 4
2 | __builtin_annotation(1, 2, 3, 4);
| ^~~~
```
Split from #139580.
---------
Co-authored-by: Mariya Podchishchaeva <mariya.podchishchaeva@intel.com>
In C++, the type of an enumerator is the type of the enumeration,
whereas in C, the type of the enumerator is 'int'. The type of a comma
operator is the type of the right-hand operand, which means you can get
an implicit conversion with this code in C but not in C++:
```
enum E { Zero };
enum E foo() {
return ((void)0, Zero);
}
```
We were previously incorrectly diagnosing this code as being
incompatible with C++ because the type of the paren expression would be
'int' there, whereas in C++ the type is 'E'.
So now we handle the comma operator with special logic when analyzing
implicit conversions in C. When analyzing the left-hand operand of a
comma operator, we do not need to check for that operand causing an
implicit conversion for the entire comma expression. So we only check
for that case with the right-hand operand.
This addresses a concern brought up post-commit:
https://github.com/llvm/llvm-project/pull/137658#issuecomment-2854525259
This adds
- The parsing of `trivially_relocatable_if_eligible`,
`replaceable_if_eligible` keywords
- `__builtin_trivially_relocate`, implemented in terms of memmove. In
the future this should
- Add the appropriate start/end lifetime markers that llvm does not have
(`start_lifetime_as`)
- Add support for ptrauth when that's upstreamed
- the `__builtin_is_cpp_trivially_relocatable` and
`__builtin_is_replaceable` traits
Fixes#127609
This patch adds templated `operator<<` for diagnostics that pass scoped
enums, saving people from `llvm::to_underlying()` clutter on the side of
emitting the diagnostic. This eliminates 80 out of 220 usages of
`llvm::to_underlying()` in Clang.
I also backported `std::is_scoped_enum_v` from C++23.
clang currently issues a warning when memset is used on a struct that
contains an address-discriminated pointer field, even though this is
entirely valid behavior.
For example:
```
struct S {
int * __ptrauth(1, 1, 100) p;
} s;
memset(&s, 0, sizeof(struct S));
```
Only allow the warning to be emitted in C++ mode to silence the warning.
rdar://142495870
This introduces a new diagnostic group to diagnose implicit casts from
int to an enumeration type. In C, this is valid, but it is not
compatible with C++.
Additionally, this moves the "implicit conversion from enum type to
different enum type" diagnostic from `-Wenum-conversion` to a new group
`-Wimplicit-enum-enum-cast`, which is a more accurate home for it.
`-Wimplicit-enum-enum-cast` is also under `-Wimplicit-int-enum-cast`, as
it is the same incompatibility (the enumeration on the right-hand is
promoted to `int`, so it's an int -> enum conversion).
Fixes#37027
When records contain fields with pointer authentication, even simple
copies can require
additional work be performed. This patch contains the core functionality
required to
handle user defined structs, as well as the implicitly constructed
structs for blocks, etc.
Co-authored-by: Ahmed Bougacha
Co-authored-by: Akira Hatanaka
Co-authored-by: John Mccall
Very simply extends the bitfield sema checks for assignment to fields
with a preferred type specified to consider the preferred type if the
decl storage type is not explicitly an enum type.
This does mean that if the preferred and explicit types have different
storage requirements we may not warn in all possible cases, but that's a
scenario for which the warnings are much more complex and confusing.