Transfer all casts by kind as we currently do implicit casts. This
obviates the need for specific handling of static casts.
Also transfer CK_BaseToDerived and CK_DerivedToBase and add tests for
these and missing tests for already-handled cast types.
Ensure that CK_BaseToDerived casts result in modeling of the fields of the derived class.
Both these attributes were introduced in ab1dc2d54db5 ("Thread Safety
Analysis: add support for before/after annotations on mutexes") back in
2015 as "beta" features.
Anecdotally, we've been using `-Wthread-safety-beta` for years without
problems.
Furthermore, this feature requires the user to explicitly use these
attributes in the first place.
After 10 years, let's graduate the feature to the stable feature set,
and reserve `-Wthread-safety-beta` for new upcoming features.
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
We only have 30 instances that rely on this "redirection", with about
half of them under clang/. Since the redirection doesn't improve
readability, this patch replaces SmallSet with SmallPtrSet for pointer
element types.
I'm planning to remove the redirection eventually.
The previous debug output only showed numeric IDs for origins, making it
difficult to understand what each origin represented. This change makes
the debug output more informative by showing what kind of entity each
origin refers to (declaration or expression) and additional details like
declaration names or expression class names. This improved output makes
it easier to debug and understand the lifetime safety analysis.
Adds support for compact serialization of Formulas, and a corresponding
parse function. Extends Environment and AnalysisContext with necessary
functions for serializing and deserializing all formula-related parts of
the environment.
Implement use-after-free detection in the lifetime safety analysis with two warning levels.
- Added a `LifetimeSafetyReporter` interface for reporting lifetime safety issues
- Created two warning levels:
- Definite errors (reported with `-Wexperimental-lifetime-safety-permissive`)
- Potential errors (reported with `-Wexperimental-lifetime-safety-strict`)
- Implemented a `LifetimeChecker` class that analyzes loan propagation and expired loans to detect use-after-free issues.
- Added tracking of use sites through a new `UseFact` class.
- Enhanced the `ExpireFact` to track the expressions where objects are destroyed.
- Added test cases for both definite and potential use-after-free scenarios.
The implementation now tracks pointer uses and can determine when a pointer is dereferenced after its loan has been expired, with appropriate diagnostics.
The two warning levels provide flexibility - definite errors for high-confidence issues and potential errors for cases that depend on control flow.
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
This commit fixes the false positive that C++ class methods with libc
function names would be false warned about. For example,
```
struct T {void strcpy() const;};
void test(const T& t) { str.strcpy(); // no warn }
```
rdar://156264388
The FactManager managing the FactEntrys stays alive for the analysis of
a single function. We are already using that by allocating TIL
S-expressions via BumpPtrAllocator. We can do the same with FactEntrys.
If we allocate the facts in an arena, we won't get destructor calls for
them, and they better be trivially destructible. This required replacing
the SmallVector member of ScopedLockableFactEntry with TrailingObjects.
FactEntrys are now passed around by plain pointer. However, to hide the
allocation of TrailingObjects from users, we introduce `create` methods,
and this allows us to make the constructors private.
Fix a crash in the lifetime safety dataflow analysis when handling null CFG blocks.
Added a null check for adjacent blocks in the dataflow analysis algorithm to prevent dereferencing null pointers. This occurs when processing CFG blocks with unreachable successors or predecessors.
Original crash: https://compiler-explorer.com/z/qfzfqG5vM
Fixes https://github.com/llvm/llvm-project/issues/150095
The point of reentrant capabilities is that they can be acquired
multiple times, so they should probably be excluded from requiring a
negative capability on acquisition via -Wthread-safety-negative.
However, we still propagate explicit negative requirements.
The character buffer passed to a "%.*s" specifier may be safely bound if
the precision is properly specified, even if the buffer does not
guarantee null-termination.
For example,
```
void f(std::span<char> span) {
printf("%.*s", (int)span.size(), span.data()); // "span.data()" does not guarantee null-termination but is safely bound by "span.size()", so this call is safe
}
```
rdar://154072130
This PR adds the `ExpiredLoansAnalysis` class to track which loans have expired. The analysis uses a dataflow lattice (`ExpiredLattice`) to maintain the set of expired loans at each program point.
This is a very light weight dataflow analysis and is expected to reach fixed point in ~2 iterations.
In principle, this does not need a dataflow analysis but is used for convenience in favour of lean code.
This commit tweaks the interface of `CheckerRegistry::addChecker` to
make it more practical for plugins and tests:
- The parameter `IsHidden` now defaults to `false` even in the
non-templated overload (because setting it to true is unusual,
especially in plugins).
- The parameter `DocsUri` defaults to the dummy placeholder string
`"NoDocsUri"` because (as of now) nothing queries its value from the
checker registry (it's only used by the logic that generates the
clang-tidy documentation, but that loads it directly from `Checkers.td`
without involving the `CheckerRegistry`), so there is no reason to
demand specifying this value.
In addition to propagating these changes, this commit clarifies,
corrects and extends lots of comments and performs various minor code
quality improvements in the code of unit tests and example plugins.
I originally wrote the bulk of this commit when I was planning to add an
extra parameter to `addChecker` in order to implement some technical
details of the CheckerFamily framework. At the end I decided against
adding that extra parameter, so this cleanup was left out of the PR
https://github.com/llvm/llvm-project/pull/139256 and I'm merging it now
as a separate commit (after minor tweaks).
This commit is mostly NFC: the only functional change is that the
analyzer will be compatible with plugins that rely on the default
argument values and don't specify `IsHidden` or `DocsUri`. (But existing
plugin code will remain valid as well.)
Refactor the Lifetime Safety Analysis infrastructure to support unit testing.
- Created a public API class `LifetimeSafetyAnalysis` that encapsulates the analysis functionality
- Added support for test points via a special `TestPointFact` that can be used to mark specific program points
- Added unit tests that verify loan propagation in various code patterns
Add per-program-point state tracking to the dataflow analysis framework.
- Added a `ProgramPoint` type representing a pair of a CFGBlock and a Fact within that block
- Added a `PerPointStates` map to store lattice states at each program point
- Modified the `transferBlock` method to store intermediate states after each fact is processed
- Added a `getLoans` method to the `LoanPropagationAnalysis` class that uses program points
This change enables more precise analysis by tracking program state at each individual program point rather than just at block boundaries. This is necessary for answering queries about the state of loans, origins, and other properties at specific points in the program, which is required for error reporting in the lifetime safety analysis.
Generalize the dataflow analysis to support both forward and backward analyses.
Some program analyses would be expressed as backward dataflow problems (like liveness analysis). This change enables the framework to support both forward analyses (like the loan propagation analysis) and backward analyses with the same infrastructure.
Refactored the lifetime safety analysis to use a generic dataflow framework with a policy-based design.
### Changes
- Introduced a generic `DataflowAnalysis` template class that can be specialized for different analyses
- Renamed `LifetimeLattice` to `LoanPropagationLattice` to better reflect its purpose
- Created a `LoanPropagationAnalysis` class that inherits from the generic framework
- Moved transfer functions from the standalone `Transferer` class into the analysis class
- Restructured the code to separate the dataflow engine from the specific analysis logic
- Updated debug output and test expectations to use the new class names
### Motivation
In order to add more analyses, e.g. [loan expiry](https://github.com/llvm/llvm-project/pull/148712) and origin liveness, the previous implementation would have separate, nearly identical dataflow runners for each analysis. This change creates a single, reusable component, which will make it much simpler to add subsequent analyses without repeating boilerplate code.
This is quite close to the existing dataflow framework!
Refactor the safe pattern analysis of pointer and size expression pairs
so that the check can be re-used in more places. For example, it can be
used to check whether the following cases are safe:
- `std::span<T>{ptr, size} // span construction`
- `snprintf(ptr, size, "%s", ...) // unsafe libc call`
- `printf("%.*s", size, ptr) // unsafe libc call`
This enables producing a "variable is uninitialized" warning when a
value is passed to a pointer-to-const argument:
```
void foo(const int *);
void test() {
int *v;
foo(v);
}
```
Fixes#37460
This option is similar to -Wuninitialized-const-reference, but diagnoses
the passing of an uninitialized value via a const pointer, like in the
following code:
```
void foo(const int *);
void test() {
int v;
foo(&v);
}
```
This is an extract from #147221 as suggested in [this
comment](https://github.com/llvm/llvm-project/pull/147221#discussion_r2190998730).
This patch introduces the core dataflow analysis infrastructure for the C++ Lifetime Safety checker. This change implements the logic to propagate "loan" information across the control-flow graph. The primary goal is to compute a fixed-point state that accurately models which pointer (Origin) can hold which borrow (Loan) at any given program point.
Key components
* `LifetimeLattice`: Defines the dataflow state, mapping an `OriginID` to a `LoanSet` using `llvm::ImmutableMap`.
* `Transferer`: Implements the transfer function, which updates the `LifetimeLattice` by applying the lifetime facts (Issue, AssignOrigin, etc.) generated for each basic block.
* `LifetimeDataflow`: A forward dataflow analysis driver that uses a worklist algorithm to iterate over the CFG until the lattice state converges.
The existing test suite has been extended to check the final dataflow results.
This work is a prerequisite for the final step of the analysis: consuming these results to identify and report lifetime violations.
This helps to avoid duplicating warnings in cases like:
```
> cat test.cpp
void bar(int);
void foo(const int &);
void test(bool a) {
int v = v;
if (a)
bar(v);
else
foo(v);
}
> clang++.exe test.cpp -fsyntax-only -Wuninitialized
test.cpp:4:11: warning: variable 'v' is uninitialized when used within its own initialization [-Wuninitialized]
4 | int v = v;
| ~ ^
test.cpp:4:11: warning: variable 'v' is uninitialized when used within its own initialization [-Wuninitialized]
4 | int v = v;
| ~ ^
2 warnings generated.
```
This patch introduces the initial implementation of the
intra-procedural, flow-sensitive lifetime analysis for Clang, as
proposed in the recent RFC:
https://discourse.llvm.org/t/rfc-intra-procedural-lifetime-analysis-in-clang/86291
The primary goal of this initial submission is to establish the core
dataflow framework and gather feedback on the overall design, fact
representation, and testing strategy. The focus is on the dataflow
mechanism itself rather than exhaustively covering all C++ AST edge
cases, which will be addressed in subsequent patches.
#### Key Components
* **Conceptual Model:** Introduces the fundamental concepts of `Loan`,
`Origin`, and `Path` to model memory borrows and the lifetime of
pointers.
* **Fact Generation:** A frontend pass traverses the Clang CFG to
generate a representation of lifetime-relevant events, such as pointer
assignments, taking an address, and variables going out of scope.
* **Testing:** `llvm-lit` tests validate the analysis by checking the
generated facts.
### Next Steps
*(Not covered in this PR but planned for subsequent patches)*
The following functionality is planned for the upcoming patches to build
upon this foundation and make the analysis usable in practice:
* **Dataflow Lattice:** A dataflow lattice used to map each pointer's
symbolic `Origin` to the set of `Loans` it may contain at any given
program point.
* **Fixed-Point Analysis:** A worklist-based, flow-sensitive analysis
that propagates the lattice state across the CFG to a fixed point.
* **Placeholder Loans:** Introduce placeholder loans to represent the
lifetimes of function parameters, forming the basis for analysis
involving function calls.
* **Annotation and Opaque Call Handling:** Use placeholder loans to
correctly model **function calls**, both by respecting
`[[clang::lifetimebound]]` annotations and by conservatively handling
opaque/un-annotated functions.
* **Error Reporting:** Implement the final analysis phase that consumes
the dataflow results to generate user-facing diagnostics. This will
likely require liveness analysis to identify live origins holding
expired loans.
* **Strict vs. Permissive Modes:** Add the logic to support both
high-confidence (permissive) and more comprehensive (strict) warning
levels.
* **Expanded C++ Coverage:** Broaden support for common patterns,
including the lifetimes of temporary objects and pointers within
aggregate types (structs/containers).
* Performance benchmarking
* Capping number of iterations or number of times a CFGBlock is
processed.
---------
Co-authored-by: Baranov Victor <bar.victor.2002@gmail.com>
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Introduce a type alias for the commonly used `std::pair<FileID,
unsigned>` to improve code readability, and make it easier for future
updates (64-bit source locations).
Support safe construction of `std::span` from `begin` and `end` calls on
hardened containers or views or `std::initializer_list`s.
For example, the following code is safe:
```
void create(std::initializer_list<int> il) {
std::span<int> input{ il.begin(), il.end() }; // no warn
}
```
rdar://152637380
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
`checkIncorrectLogicOperator` checks if an expression, for example `x !=
0 || x != 1.0`, is always true or false by comparing the two literals
`0` and `1.0`. But in case `x` is a 16-bit float, the two literals have
distinct types---16-bit float and double, respectively. Directly
comparing `APValue`s extracted from the two literals results in an
assertion failure because of their distinct types.
This commit fixes the issue by doing a conversion from the "smaller" one
to the "bigger" one. The two literals must be compatible because both of
them are comparing with `x`.
rdar://152456316
The CI on my PR https://github.com/llvm/llvm-project/pull/138895 is
failing with errors like
`clang/lib/Analysis/UnsafeBufferUsage.cpp:971:45: error: reference to
'PointerType' is ambiguous`
This patch should resolve it by removing `using namespace llvm`
In ScopedLockableFactEntry::unlock(), we can avoid a second search,
pop_back(), and push_back() if we use the already obtained iterator into
the FactSet to replace the old FactEntry and take its position in the
vector.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
The checker classes (i.e. classes derived from `CheckerBase` via the
utility template `Checker<...>`) act as intermediates between the user
and the analyzer engine, so they have two interfaces:
- On the frontend side, they have a public name, can be enabled or
disabled, can accept checker options and can be reported as the source
of bug reports.
- On the backend side, they can handle various checker callbacks and
they "leave a mark" on the `ExplodedNode`s that are created by them.
(These `ProgramPointTag` marks are internal: they appear in debug logs
and can be queried by checker logic; but the user doesn't see them.)
In a significant majority of the checkers there is 1:1 correspondence
between these sides, but there are also many checker classes where
several related user-facing checkers share the same backend class.
Historically each of these "multi-part checker" classes had its own
hacks to juggle its multiple names, which led to lots of ugliness like
lazy initialization of `mutable std::unique_ptr<BugType>` members and
redundant data members (when a checker used its custom `CheckNames`
array and ignored the inherited single `Name`).
My recent commit 27099982da2f5a6c2d282d6b385e79d080669546 tried to unify
and standardize these existing solutions to get rid of some of the
technical debt, but it still used enum values to identify the checker
parts within a "multi-part" checker class, which led to some ugliness.
This commit introduces a new framework which takes a more direct,
object-oriented approach: instead of identifying checker parts with
`{parent checker object, index of part}` pairs, the parts of a
multi-part checker become stand-alone objects that store their own name
(and enabled/disabled status) as a data member.
This is implemented by separating the functionality of `CheckerBase`
into two new classes: `CheckerFrontend` and `CheckerBackend`. The name
`CheckerBase` is kept (as a class derived from both `CheckerFrontend`
and `CheckerBackend`), so "simple" checkers that use `CheckerBase` and
`Checker<...>` continues to work without changes. However we also get
first-class support for the "many frontends - one backend" situation:
- The class `CheckerFamily<...>` works exactly like `Checker<...>` but
inherits from `CheckerBackend` instead of `CheckerBase`, so it won't
have a superfluous single `Name` member.
- Classes deriving from `CheckerFamily` can freely own multiple
`CheckerFrontend` data members, which are enabled within the
registration methods corresponding to their name and can be used to
initialize the `BugType`s that they can emit.
In this scheme each `CheckerFamily` needs to override the pure virtual
method `ProgramPointTag::getTagDescription()` which returns a string
which represents that class for debugging purposes. (Previously this
used the name of one arbitrary sub-checker, which was passable for
debugging purposes, but not too elegant.)
I'm planning to implement follow-up commits that convert all the
"multi-part" checkers to this `CheckerFamily` framework.
Introduce the `reentrant_capability` attribute, which may be specified
alongside the `capability(..)` attribute to denote that the defined
capability type is reentrant. Marking a capability as reentrant means
that acquiring the same capability multiple times is safe, and does not
produce warnings on attempted re-acquisition.
The most significant changes required are plumbing to propagate if the
attribute is present to a CapabilityExpr, and introducing
ReentrancyDepth to the LockableFactEntry class.
The -Wunsafe-buffer-usage analysis currently warns when a const sized
array is safely accessed, with an index that is bound by the remainder
operator. The false alarm is now suppressed.
Example:
int circular_buffer(int array[10], uint idx) {
return array[idx %10]; // Safe operation
}
rdar://148443453
---------
Co-authored-by: MalavikaSamak <malavika2@apple.com>
Closes#57270.
This PR changes the `Stmt *` field in `SymbolConjured` with
`CFGBlock::ConstCFGElementRef`. The motivation is that, when conjuring a
symbol, there might not always be a statement available, causing
information to be lost for conjured symbols, whereas the CFGElementRef
can always be provided at the callsite.
Following the idea, this PR changes callsites of functions to create
conjured symbols, and replaces them with appropriate `CFGElementRef`s.
There is a caveat at loop widening, where the correct location is the
CFG terminator (which is not an element and does not have a ref). In
this case, the first element in the block is passed as a location.
Previous PR #128251, Reverted at #137304.
This PR attempts to improve the diagnostics flag
`-Wtautological-overlap-compare` (#13473). I have added code to warn
about float-point literals and character literals. I have also changed
the warning message for the non-overlapping case to provide a more
correct hint to the user.
Fixes#13473.