Prevent an assertion failure in the cstring checker when library
functions like memcpy are defined with non-default address spaces.
Adds a test for this case.
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
We only have 30 instances that rely on this "redirection", with about
half of them under clang/. Since the redirection doesn't improve
readability, this patch replaces SmallSet with SmallPtrSet for pointer
element types.
I'm planning to remove the redirection eventually.
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
This patch improves MallocChecker to detect use-after-free bugs when
a freed structure's field is passed by address (e.g., `&ptr->field`).
Previously, MallocChecker would miss such cases, as it only checked the
top-level symbol of argument values.
This patch analyzes the base region of arguments and extracts the
symbolic region (if any), allowing UAF detection even for field address
expressions.
Fixes#152446
Binding a value to location can happen when a new value is created or
when and existing value is updated. This modification exposes whether
the value binding happens at a declaration.
This helps simplify the hacky logic of the BindToImmutable checker.
This commit converts RetainCountChecker to the new checker family
framework that was introduced in the commit
6833076a5d9f5719539a24e900037da5a3979289
This commit also performs some minor cleanup around the parts that had
to be changed, but lots of technical debt still remains in this old
codebase.
CStringChecker had an AdditionOverflow bug type which was intended for a
situation where the analyzer concludes that the addition of two
size/length values overflows `size_t`.
I strongly suspect that the analyzer could emit bugs of this type in
certain complex corner cases (e.g. due to inaccurate cast modeling), but
these reports would be all false positives because in the real world the
sum of two size/length values is always far below SIZE_MAX. (Although
note that there was no test where the analyzer emitted a bug with this
type.)
To simplify the code (and perhaps eliminate false positives), I
eliminated this bug type and replaced code that emits it by a simple
`addSink()` call (because we still want to get rid of the execution
paths where the analyzer has invalid assumptions).
We sometimes don't know if the operand of [[assume]] is true/false, and
that's okay. We can just ignore the attribute in that case.
If we wanted something more fancy, we could bring the assumption to the
constraints, but dropping them should be just as fine for now.
Fixes#151854
This adds alpha.core.StoreToImmutable, a new alpha checker that detects
writes
to immutable memory regions, implementing part of SEI CERT Rule ENV30-C.
The
original proposal only handled global const variables, but this
implementation
extends it to also detect writes to:
- Local const variables
- String literals
- Const parameters and struct members
- Const arrays and pointers to const data
This checker is the continuation of the work started by zukatsinadze.
Discussion: https://reviews.llvm.org/D124244
This commit handles the following types:
- clang::ExternalASTSource
- clang::TargetInfo
- clang::ASTContext
- clang::SourceManager
- clang::FileManager
Part of cleanup #151026
The factory method `MemRegionManager::getElementRegion()` is the main
way of constructing `ElementRegion` objects, which are widespread in the
analysis and may represent array elements (as lvalues), pointer
arithmetic and type conversions.
This factory method used to strip all qualifiers from the type
associated with the element (after canonicalizing it), but the address
space qualifier can affect the size of a pointer type, so stripping it
caused an assertion failure (in `evalBinOpLL`):
clang: clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:785: void
assertEqualBitWidths(clang::ento::ProgramStateRef, clang::ento::Loc,
clang::ento::Loc): Assertion `RhsBitwidth == LhsBitwidth && "RhsLoc and
LhsLoc bitwidth must be same!"' failed.
---------
Co-authored-by: Vince Bridgers <vince.a.bridgers@ericsson.com>
This commit converts the class CStringChecker to the checker family
framework and slightly simplifies the implementation.
This commit is NFC and preserves the confused garbage descriptions and
categories of the bug types (which was inherited from old mistakes). I'm
planning to clean that up in a follow-up commit.
Fix false positive where warnings were asserted for placement new even
when no additional space is requested
The PlacementNewChecker incorrectly triggered warnings when the storage
provided matched or exceeded the allocated type size, causing false
positives. Now the warning triggers only when the provided storage is
strictly less than the required size.
Add test cases covering exact size, undersize, and oversize scenarios to
validate the fix.
Fixes#149240
This commit converts the class StackAddrEscapeChecker to the checker
family framework and slightly simplifies the implementation.
This commit is almost NFC, the only technically "functional" change is
that it removes the hidden modeling checker `core.StackAddrEscapeBase`
which was only relevant as an implementation detail of the old checker
registration procedure.
This commit converts the class `NSOrCFErrorDerefChecker` to the checker
family framework and simplifies some parts of the implementation (e.g.
removes two very trivial subclasses of `BugType`).
This commit is almost NFC, the only technically "functional" change is
that it removes the hidden modeling checker `NSOrCFErrorDerefChecker`
which was only relevant as an implementation detail of the old checker
registration procedure.
This commit eliminates some corrupted variants of the once-widespread
`mutable std::unique_ptr<BugType>` antipattern from the checker file
`BasicObjCFoundationChecks.cpp`.
Previous purges probably missed these because there are slight mutations
(e.g. using a subclass of `BugType` instead of `BugType`) and therefore
some natural search terms (e.g. `make_unique<BugType>`) didn't produce
matches in this file.
Changed the warning message:
- **From**: 'Attempt to free released memory'
**To**: 'Attempt to release already released memory'
- **From**: 'Attempt to free non-owned memory'
**To**: 'Attempt to release non-owned memory'
- **From**: 'Use of memory after it is freed'
**To**: 'Use of memory after it is released'
All connected tests and their expectations have been changed
accordingly.
Inspired by [this
PR](https://github.com/llvm/llvm-project/pull/147542#discussion_r2195197922)
This commit converts the class DereferenceChecker to the checker family
framework and simplifies some parts of the implementation.
This commit is almost NFC, the only technically "functional" change is
that it removes the hidden modeling checker `DereferenceModeling` which
was only relevant as an implementation detail of the old checker
registration procedure.
MallocChecker.cpp has a complex heuristic that supresses reports where
the memory release happens during the release of a reference-counted
object (to suppress a significant amount of false positives).
Previously this logic asserted that there is at most one release point
corresponding to a symbol, but it turns out that there is a rare corner
case where the symbol can be released, forgotten and then released
again. This commit removes that assertion to avoid the crash. (As this
issue just affects a bug suppression heuristic, I didn't want to dig
deeper and modify the way the state of the symbol is changed.)
Fixes#149754
The checks for the 'z' and 't' format specifiers added in the original
PR #143653 had some issues and were overly strict, causing some build
failures and were consequently reverted at
4c85bf2fe8.
In the latest commit
27c58629ec,
I relaxed the checks for the 'z' and 't' format specifiers, so warnings
are now only issued when they are used with mismatched types.
The original intent of these checks was to diagnose code that assumes
the underlying type of `size_t` is `unsigned` or `unsigned long`, for
example:
```c
printf("%zu", 1ul); // Not portable, but not an error when size_t is unsigned long
```
However, it produced a significant number of false positives. This was
partly because Clang does not treat the `typedef` `size_t` and
`__size_t` as having a common "sugar" type, and partly because a large
amount of existing code either assumes `unsigned` (or `unsigned long`)
is `size_t`, or they define the equivalent of size_t in their own way
(such as
sanitizer_internal_defs.h).2e67dcfdcd/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h (L203)
This patch addresses the lack of support for parenthesized
initialization in the Clang Static Analyzer's `ExprEngine`. Previously,
initializations such as `V v(1, 2);` were not modeled properly, which
could lead to false negatives in analyses like `DivideZero`.
```cpp
struct A {
int x;
A(int v) : x(v) {}
};
int t() {
A a(42);
return 1 / (a.x - 42); // expected-warning {{Division by zero}}
}
```
Fixes#148875
Including the results of `sizeof`, `sizeof...`, `__datasizeof`,
`__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and
signed `size_t` literals, the results of pointer-pointer subtraction and
checks for standard library functions (and their calls).
The goal is to enable clang and downstream tools such as clangd and
clang-tidy to provide more portable hints and diagnostics.
The previous discussion can be found at #136542.
This PR implements this feature by introducing a new subtype of `Type`
called `PredefinedSugarType`, which was considered appropriate in
discussions. I tried to keep `PredefinedSugarType` simple enough yet not
limited to `size_t` and `ptrdiff_t` so that it can be used for other
purposes. `PredefinedSugarType` wraps a canonical `Type` and provides a
name, conceptually similar to a compiler internal `TypedefType` but
without depending on a `TypedefDecl` or a source file.
Additionally, checks for the `z` and `t` format specifiers in format
strings for `scanf` and `printf` were added. It will precisely match
expressions using `typedef`s or built-in expressions.
The affected tests indicates that it works very well.
Several code require that `SizeType` is canonical, so I kept `SizeType`
to its canonical form.
The failed tests in CI are allowed to fail. See the
[comment](https://github.com/llvm/llvm-project/pull/135386#issuecomment-3049426611)
in another PR #135386.
Variables `stdin`, `stdout`, `stderr` are now added to internal memory space
instead of system memory space. The system memory space is invalidated
at calls to system-defined functions which is not the correct behavior for
the standard stream variables.
This commit removes the DoubleDelete bug type from `MallocChecker.cpp`
because it's completely redundant with the `DoubleFree` bug (which is
already used for all allocator families, including new/delete).
This simplifies the code of the checker and prevents the potential
confusion caused by two semantically equivalent and very similar, but
not identical bug report messages.
This commit converts MallocChecker to the new checker family framework
that was introduced in the recent commit
6833076a5d9f5719539a24e900037da5a3979289 -- and gets rid of some awkward
unintended interactions between the checker frontends.
Bounded string functions takes smallest of two values as it's copy size
(`amountCopied` variable in `evalStrcpyCommon`), and it's used to
decided whether this operation will cause out-of-bound access and
invalidate it's super region if it does.
for `strlcat`: `amountCopied = min (size - dstLen - 1 , srcLen)`
for others: `amountCopied = min (srcLen, size)`
Currently when one of two values is unknown or `SValBuilder` can't
decide which one is smaller, `amountCopied` will remain `UnknownVal`,
which will invalidate copy destination's super region unconditionally.
This patch add check to see if one of these two values is definitely
in-bound, if so `amountCopied` has to be in-bound too, because it‘s less
than or equal to them, we can avoid the invalidation of super region and
some related false positives in this situation.
Note: This patch uses `size` as an approximation of `size - dstLen - 1`
in `strlcat` case because currently analyzer doesn't handle complex
expressions like this very well.
Closes#143807.
For the following code in C mode: https://godbolt.org/z/3eo1MeGhe
(There is no warning in C++ mode though).
```c++
struct B {
int i : 2;
int : 30; // unnamed bit-field
};
extern void consume_B(struct B);
void bitfield_B_init(void) {
struct B b1;
b1.i = 1; // b1 is initialized
consume_B(b1); // FP: Passed-by-value struct argument contains uninitialized data (e.g., field: '') [core.CallAndMessage]
}
```
Introduce a type alias for the commonly used `std::pair<FileID,
unsigned>` to improve code readability, and make it easier for future
updates (64-bit source locations).
For lambdas that are converted to C function pointers,
```
int (*ret_zero)() = []() { return 0; };
```
clang will generate conversion method like:
```
CXXConversionDecl implicit used constexpr operator int (*)() 'int (*() const noexcept)()' inline
-CompoundStmt
-ReturnStmt
-ImplicitCastExpr 'int (*)()' <FunctionToPointerDecay>
-DeclRefExpr 'int ()' lvalue CXXMethod 0x5ddb6fe35b18 '__invoke' 'int ()'
-CXXMethodDecl implicit used __invoke 'int ()' static inline
-CompoundStmt (empty)
```
Based on comment in Sema, `__invoke`'s function body is left empty
because it's will be filled in CodeGen, so in AST analysis phase we
should get lambda's `operator()` directly instead of calling `__invoke`
itself.
This commit converts the class DynamicTypePropagation to a very simple
checker family, which has only one checker frontend -- but also supports
enabling the backend ("modeling checker") without the frontend.
As a tangentially related change, this commit adds the backend of
DynamicTypePropagation as a dependency of alpha.core.DynamicTypeChecker
in Checkers.td, because the header comment of DynamicTypeChecker.cpp
claims that it depends on DynamicTypePropagation and the source code
seems to confirm this.
(The lack of this dependency relationship didn't cause problems, because
'core.DynamicTypePropagation' is in the group 'core', so it is
practically always active. However, explicitly declaring the dependency
clarifies the fact that the separate existence of the modeling checker
is warranted.)
## Purpose
This patch makes a minor changes to LLVM and Clang so that LLVM can be
built as a Windows DLL with `clang-cl`. These changes were not required
for building a Windows DLL with MSVC.
## Background
The Windows DLL effort is tracked in #109483. Additional context is
provided in [this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Overview
Specific changes made in this patch:
- Remove `constexpr` fields that reference DLL exported symbols. These
symbols cannot be resolved at compile time when building a Windows DLL
using `clang-cl`, so they cannot be `constexpr`. Instead, they are made
`const` and initialized in the implementation file rather than at
declaration in the header.
- Annotate symbols now defined out-of-line with `LLVM_ABI` so they are
exported when building as a shared library.
- Explicitly add default copy assignment operator for `ELFFile` to
resolve a compiler warning.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
This patch corrects the state of the error node generated by the
core.DivideZero checker when it detects potential division by zero
involving a tainted denominator.
The checker split in
91ac5ed10a
started to introduce a conflicting assumption about the denominator into
the error node:
Node with the Bug Report "Division by a tainted value, possibly zero"
has an assumption "denominator != 0".
This has been done as a shortcut to continue analysis with the correct
assumption *after* the division - if we proceed, we can only assume the
denominator was not zero. However, this assumption is introduced
one-node too soon, leading to a self-contradictory error node.
In this patch, I make the error node with assumption of zero denominator
fatal, but allow analysis to continue on the second half of the state
split with the assumption of non-zero denominator.
---
CPP-6376
This commit converts NullabilityChecker to the new checker family
framework that was introduced in the recent commit
6833076a5d9f5719539a24e900037da5a3979289
This commit removes the dummy checker `nullability.NullabilityBase`
because it was hidden from the users and didn't have any useful role
except for helping the registration of the checker parts in the old
ad-hoc system (which is replaced by the new standardized framework).
Except for the removal of this dummy checker, no functional changes
intended.
Placement new does not allocate memory, so it should not be reported as
a memory leak. A recent MallocChecker refactor changed inlining of
placement-new calls with manual evaluation by MallocChecker.
339282d49f
This change avoids marking the value returned by placement new as
allocated and hence avoids the false leak reports.
Note that the there are two syntaxes to invoke placement new:
`new (p) int` and an explicit operator call `operator new(sizeof(int), p)`.
The first syntax was already properly handled by the engine.
This change corrects handling of the second syntax.
CPP-6375
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Allow @property of a raw pointer when NS_REQUIRES_PROPERTY_DEFINITIONS
is specified on the interface since such an interface does not
automatically synthesize raw pointer ivars.
Also emit a warning for @property(assign) and
@property(unsafe_unretained) under ARC as well as when explicitly
synthesizing a unsafe raw pointer property.
This PR adds the WebKit checker support for
[[clang::annotate_type("webkit.pointerconversion")]].
When this attribute is set on the return value of a function, the
function is treated as safe to call anywhere and the return value's
pointer origin is the argument.`
It's safe for a member function of a class or struct to call a function
or allocate a local variable with a pointer or a reference to a member
variable since "this" pointer, and therefore all its members, will be
kept alive by its caller so recognize as such.
This PR fixes the bug that the checker emits a warning when a function
takes T&& and passes it to another function using std::move. We should
treat std::move like any other pointer conversion and the origin of the
pointer to be that of the argument.