The checks for the 'z' and 't' format specifiers added in the original
PR #143653 had some issues and were overly strict, causing some build
failures and were consequently reverted at
4c85bf2fe8.
In the latest commit
27c58629ec,
I relaxed the checks for the 'z' and 't' format specifiers, so warnings
are now only issued when they are used with mismatched types.
The original intent of these checks was to diagnose code that assumes
the underlying type of `size_t` is `unsigned` or `unsigned long`, for
example:
```c
printf("%zu", 1ul); // Not portable, but not an error when size_t is unsigned long
```
However, it produced a significant number of false positives. This was
partly because Clang does not treat the `typedef` `size_t` and
`__size_t` as having a common "sugar" type, and partly because a large
amount of existing code either assumes `unsigned` (or `unsigned long`)
is `size_t`, or they define the equivalent of size_t in their own way
(such as
sanitizer_internal_defs.h).2e67dcfdcd/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h (L203)
Including the results of `sizeof`, `sizeof...`, `__datasizeof`,
`__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and
signed `size_t` literals, the results of pointer-pointer subtraction and
checks for standard library functions (and their calls).
The goal is to enable clang and downstream tools such as clangd and
clang-tidy to provide more portable hints and diagnostics.
The previous discussion can be found at #136542.
This PR implements this feature by introducing a new subtype of `Type`
called `PredefinedSugarType`, which was considered appropriate in
discussions. I tried to keep `PredefinedSugarType` simple enough yet not
limited to `size_t` and `ptrdiff_t` so that it can be used for other
purposes. `PredefinedSugarType` wraps a canonical `Type` and provides a
name, conceptually similar to a compiler internal `TypedefType` but
without depending on a `TypedefDecl` or a source file.
Additionally, checks for the `z` and `t` format specifiers in format
strings for `scanf` and `printf` were added. It will precisely match
expressions using `typedef`s or built-in expressions.
The affected tests indicates that it works very well.
Several code require that `SizeType` is canonical, so I kept `SizeType`
to its canonical form.
The failed tests in CI are allowed to fail. See the
[comment](https://github.com/llvm/llvm-project/pull/135386#issuecomment-3049426611)
in another PR #135386.
A significant number of our tests in C accidentally use functions
without prototypes. This patch converts the function signatures to have
a prototype for the situations where the test is not specific to K&R C
declarations. e.g.,
void func();
becomes
void func(void);
This is the sixth batch of tests being updated (there are a significant
number of other tests left to be updated).
The `printf` specifier `%n` is not supported on Android's libc and will soon be removed from Fuchsia's
Reviewed By: enh
Differential Revision: https://reviews.llvm.org/D117611
In case a char-literal of type int (C/ObjectiveC) corresponds to a
format specifier with the %hh length modifier, don't treat the literal
as of type char for issuing diagnostics, as otherwise this results in:
printf("%hhd", 'e');
warning: format specifies type 'char' but the argument has type 'char'.
Differential revision: https://reviews.llvm.org/D97951
C11 standard refers to the unsigned counterpart of the type ptrdiff_t
in the paragraph 7.21.6.1p7 where it defines the format specifier %tu.
In Clang (in PrintfFormatString.cpp, lines 508-510) there is a FIXME for this case,
in particular, Clang didn't diagnose %tu issues at all, i.e.
it didn't emit any warnings on the code printf("%tu", 3.14).
In this diff we add a method getUnsignedPointerDiffType for getting the corresponding type
similarly to how it's already done in the other analogous cases (size_t, ssize_t, ptrdiff_t etc)
and fix -Wformat diagnostics for %tu plus the emitted fix-it as well.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D38270
llvm-svn: 314470
This diff makes the test FixIt/format.m more robust.
The issue was caught by the build bot clang-cmake-thumbv7-a15.
Test plan: make check-all
llvm-svn: 308073
This diff addresses FIXME in lib/Analysis/PrintfFormatString.cpp
and makes PrintfSpecifier::getArgType return the correct type.
In particular, this change enables Clang to emit a warning on
incorrect using of "%zd"/"%zn" format specifiers.
Differential revision: https://reviews.llvm.org/D35427
Test plan: make check-all
llvm-svn: 308067
These test updates almost exclusively around the change in behavior
around enum: enums without a definition are considered incomplete except
when targeting MSVC ABIs. Since these tests are interested in the
'incomplete-enum' behavior, restrict them to %itanium_abi_triple.
llvm-svn: 249660
This allows us to be more careful when dealing with enums whose fixed
underlying type requires special handling in a format string, like
NSInteger.
A refinement of r163266 from a year and a half ago, which added the
special handling for NSInteger and friends in the first place.
<rdar://problem/16616623>
llvm-svn: 209966
Presumably, if the printf format has the sign explicitly requested, the user
wants to treat the data as signed.
This is a fix-up for r172739, and also includes several test changes that
didn't make it into that commit.
llvm-svn: 172762
For most cases where a conversion specifier doesn't match an argument,
we usually guess that the conversion specifier is wrong. However, if
the argument is an integer type and the specifier is %C, it's likely
the user really did mean to print the integer as a character.
(This is more common than %c because there is no way to specify a unichar
literal -- you have to write an integer literal, such as '0x2603',
and then cast it to unichar.)
This does not change the behavior of %S, since there are fewer cases
where printing a literal Unicode *string* is necessary, but this could
easily be changed in the future.
<rdar://problem/11982013>
llvm-svn: 169400
The type of a character literal is 'int' in C, but if the user writes a
character /as/ a literal, we should assume they meant it to be a
character and not a numeric value, and thus offer %c as a correction
rather than %d.
There's a special case for multi-character literals (like 'MooV'), which
have implementation-defined value and usually cannot be printed with %c.
These still use %d as the suggestion.
In C++, the type of a character literal is 'char', and so this problem
doesn't exist.
<rdar://problem/12282316>
llvm-svn: 169398
We tried to account for 'uint8_t' by saying that /typedefs/ of 'char'
should be corrected as %hhd rather than %c, but the condition was wrong.
llvm-svn: 169397