The function `tryExpandAsInteger` attempts to extract an integer from a
macro definition. Previously, the attempt would fail when the macro is
from a PCH, because the function tried to access the text buffer of the
source file, which does not exist in case of PCHs. The fix uses
`Preprocessor::getSpelling`, which works in either cases.
rdar://151403070
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
The checker classes (i.e. classes derived from `CheckerBase` via the
utility template `Checker<...>`) act as intermediates between the user
and the analyzer engine, so they have two interfaces:
- On the frontend side, they have a public name, can be enabled or
disabled, can accept checker options and can be reported as the source
of bug reports.
- On the backend side, they can handle various checker callbacks and
they "leave a mark" on the `ExplodedNode`s that are created by them.
(These `ProgramPointTag` marks are internal: they appear in debug logs
and can be queried by checker logic; but the user doesn't see them.)
In a significant majority of the checkers there is 1:1 correspondence
between these sides, but there are also many checker classes where
several related user-facing checkers share the same backend class.
Historically each of these "multi-part checker" classes had its own
hacks to juggle its multiple names, which led to lots of ugliness like
lazy initialization of `mutable std::unique_ptr<BugType>` members and
redundant data members (when a checker used its custom `CheckNames`
array and ignored the inherited single `Name`).
My recent commit 27099982da2f5a6c2d282d6b385e79d080669546 tried to unify
and standardize these existing solutions to get rid of some of the
technical debt, but it still used enum values to identify the checker
parts within a "multi-part" checker class, which led to some ugliness.
This commit introduces a new framework which takes a more direct,
object-oriented approach: instead of identifying checker parts with
`{parent checker object, index of part}` pairs, the parts of a
multi-part checker become stand-alone objects that store their own name
(and enabled/disabled status) as a data member.
This is implemented by separating the functionality of `CheckerBase`
into two new classes: `CheckerFrontend` and `CheckerBackend`. The name
`CheckerBase` is kept (as a class derived from both `CheckerFrontend`
and `CheckerBackend`), so "simple" checkers that use `CheckerBase` and
`Checker<...>` continues to work without changes. However we also get
first-class support for the "many frontends - one backend" situation:
- The class `CheckerFamily<...>` works exactly like `Checker<...>` but
inherits from `CheckerBackend` instead of `CheckerBase`, so it won't
have a superfluous single `Name` member.
- Classes deriving from `CheckerFamily` can freely own multiple
`CheckerFrontend` data members, which are enabled within the
registration methods corresponding to their name and can be used to
initialize the `BugType`s that they can emit.
In this scheme each `CheckerFamily` needs to override the pure virtual
method `ProgramPointTag::getTagDescription()` which returns a string
which represents that class for debugging purposes. (Previously this
used the name of one arbitrary sub-checker, which was passable for
debugging purposes, but not too elegant.)
I'm planning to implement follow-up commits that convert all the
"multi-part" checkers to this `CheckerFamily` framework.
Resolves
https://github.com/llvm/llvm-project/issues/76208#issuecomment-2830854351
Quoting the docs of `[[clang::flag_enum]]`:
https://clang.llvm.org/docs/AttributeReference.html#flag-enum
> This attribute can be added to an enumerator to signal to the compiler that it
> is intended to be used as a flag type. This will cause the compiler to assume
> that the range of the type includes all of the values that you can get by
> manipulating bits of the enumerator when issuing warnings.
Ideally, we should still check the upper bounds but for simplicity let's not bother for now.
This helps to gain contextual information about how we entered a CFG block.
The `noexprcrash.c` test probably changed due to the fact that now
BlockEntrance ProgramPoint Profile also hashes the pointer of the
previous CFG block. I didn't investigate.
CPP-6483
Fixes#139779.
The bug was introduced in #137355 in `SymbolConjured::getStmt`, when
trying to obtain a statement for a CFG initializer without an
initializer. This commit adds a null check before access.
Previous PR #139820, Revert #139936
Additional notes since previous PR:
When conjuring a symbol, sometimes there is no valid CFG element, e.g.
in the file causing the crash, there is no element at all in the CFG. In
these cases, the CFG element reference in the expression engine will be
invalid. As a consequence, there needs to be extra checks to ensure the
validity of the CFG element reference.
This commit documents the process of specifying values for the analyzer
options and checker options implemented in the static analyzer, and adds
a script which includes the documentation of the analyzer options (which
was previously only available through a command-line flag) in the
RST-based web documentation.
Fixes#139779.
The bug was introduced in #137355 in `SymbolConjured::getStmt`, when
trying to obtain a statement for a CFG initializer without an
initializer. This commit adds a null check before access.
Previously, CSA did not handle __builtin_bit_cast correctly. It
evaluated the LvalueToRvalue conversion for the casting expression,
but did not actually convert the value of the expression to be of the
destination type.
This commit fixes the problem.
rdar://149987320
Closes#57270.
This PR changes the `Stmt *` field in `SymbolConjured` with
`CFGBlock::ConstCFGElementRef`. The motivation is that, when conjuring a
symbol, there might not always be a statement available, causing
information to be lost for conjured symbols, whereas the CFGElementRef
can always be provided at the callsite.
Following the idea, this PR changes callsites of functions to create
conjured symbols, and replaces them with appropriate `CFGElementRef`s.
There is a caveat at loop widening, where the correct location is the
CFG terminator (which is not an element and does not have a ref). In
this case, the first element in the block is passed as a location.
Previous PR #128251, Reverted at #137304.
Recently some users reported that they observed large increases of
runtime (up to +600% on some translation units) when they upgraded to a
more recent (slightly patched, internal) clang version. Bisection
revealed that the bulk of this increase was probably caused by my
earlier commit bb27d5e5c6b194a1440b8ac4e5ace68d0ee2a849 ("Don't assume
third iteration in loops").
As I evaluated that earlier commit on several open source project, it
turns out that on average it's runtime-neutral (or slightly helpful: it
reduced the total analysis time by 1.5%) but it can cause runtime spikes
on some code: in particular it more than doubled the time to analyze
`tmux` (one of the smaller test projects).
Further profiling and investigation proved that these spikes were caused
by an _increase of analysis scope_ because there was an heuristic that
placed functions on a "don't inline this" blacklist if they reached the
`-analyzer-max-loop` limit (anywhere, on any one execution path) --
which became significantly rarer when my commit ensured the analyzer no
longer "just assumes" four iterations. (With more inlining significantly
more entry points use up their allocated budgets, which leads to the
increased runtime.)
I feel that this heuristic for the "don't inline" blacklist is
unjustified and arbitrary, because reaching the "retry without inlining"
limit on one path does not imply that inlining the function won't be
valuable on other paths -- so I hope that we can eventually replace it
with more "natural" limits of the analysis scope.
However, the runtime increases are annoying for the users whose project
is affected, so I created this quick workaround commit that approximates
the "don't inline" blacklist effects of ambiguous loops (where the
analyzer doesn't understand the loop condition) without fully reverting
the "Don't assume third iteration" commit (to avoid reintroducing the
false positives that were eliminated by it).
Investigating this issue was a team effort: I'm grateful to Endre Fülöp
(gamesh411) who did the bisection and shared his time measurement setup,
and Gábor Tóthvári (tigbr) who helped me in profiling.
Functions in std::ranges namespace does not store the lambada passed-in
as an arugment in heap so treat such an argument as if it has
[[noescape]] in the WebKit lambda capture checker so that we don't emit
warnings for capturing raw pointers or references to smart-pointer
capable objects.
The original commit assumes that
`CXXConstructExpr->getType()->getAsRecordDecl()` is always a
`CXXRecordDecl` but it is not true for ObjC programs.
This relanding changes
`cast<CXXRecordDecl>(CXXConstructExpr->getType()->getAsRecordDecl())`
to
`dyn_cast_or_null<CXXRecordDecl>(CXXConstructExpr->getType()->getAsRecordDecl())`
This reverts commit 9048c2d4f239cb47fed17cb150e2bbf3934454c2.
rdar://146753089
Allow copy capture of a reference to a CheckedPtr capable object since
such a capture will copy the said object instead of keeping a dangling
reference to the object.
Previously, Static Analyzer initializes empty type fields with zeroes.
This can cause problems when those fields have no unique addresses. For
example, https://github.com/llvm/llvm-project/issues/137252.
rdar://146753089
This PR implements various small enhancements to
alpha.webkit.RetainPtrCtorAdoptChecker:
- Detect leaks from [[X alloc] init] when ARC is disabled.
- Detect leaks from calling Create, Copy, and other +1 CF functions.
- Recognize [allocX() init] pattern where allocX is a C/C++ function.
- Recognize _init in addition to init as an init function.
- Recognize [[[X alloc] init] autorelease].
- Recognize CFBridgingRelease.
- Support CF_RETRUNS_RETAINED on out arguments of a C function.
- Support returning +1 object in Create, Copy, and other +1 functions or
+1 selectors.
- Support variadic Create, Copy, and other +1 C/C++ functions.
To make these enhancements, this PR introduces new visit functions for
ObjCMessageExpr, ReturnStmt, VarDecl, and BinaryOperator. These
functions look for a specific construct mentioned above and adds an
expression such as [[X alloc] init] or CreateX to a DenseSet
CreateOrCopyFnCall when the expression does not result in leaks. When
the code to detect leaks such as the one in visitObjCMessageExpr later
encounters this expression, it can bail out early if the expression is
in the set.
This PR changes the `Stmt *` field in `SymbolConjured` with
`CFGBlock::ConstCFGElementRef`. The motivation is that, when conjuring a
symbol, there might not always be a statement available, causing
information to be lost for conjured symbols, whereas the CFGElementRef
can always be provided at the callsite.
Following the idea, this PR changes callsites of functions to create
conjured symbols, and replaces them with appropriate `CFGElementRef`s.
Closes#57270
This PR fixes the bug that alpha.webkit.UncheckedCallArgsChecker did not
recognize CanMakeCheckedPtrBase due to getAsCXXRecordDecl returning
nullptr for it in hasPublicMethodInBase. Manually grab getTemplatedDecl
out of TemplateSpecializationType then CXXRecordDecl to workaround this
bug in clang frontend.
This PR fixes member variable checker to allow the usage of T* in smart
pointer classes. e.g. alpha.webkit.NoUncheckedPtrMemberChecker should
allow T* to appear within RefPtr.
Previously, analysis-based diagnostics (like -Wconsumed) had to be
enabled at file scope in order to be run at the end of each function
body. This meant that they did not respect #pragma clang diagnostic
enabling or disabling the diagnostic.
Now, these pragmas can control the diagnostic emission.
Fixes#42199
As reported in #135665, C++20 parenthesis initializer list expressions
are not handled correctly and were causing crashes. This commit attempts
to fix the issue by handing parenthesis initializer lists along side
existing initializer lists.
Fixes#135665.
This PR fixes a bug that when a template specialization is declared with
a forward declaration of a template, the checker fails to find its
definition in the same translation unit and erroneously emit an unsafe
forward declaration warning.
This PR adds the support for recognizing calling adoptCF/adoptNS on the
result of a cast operation on the return value of a function which
creates NS or CF types. It also fixes a bug that we weren't reporting
memory leaks when CF types are created without ever calling RetainPtr's
constructor, adoptCF, or adoptNS.
To do this, this PR adds a new mechanism to report a memory leak
whenever create or copy CF functions are invoked unless this CallExpr
has already been visited while validating a call to adoptCF. Also added
an early exit when isOwned returns IsOwnedResult::Skip due to an
unresolved template argument.
This PR adds the support for treating capturing of "self" as safe if the
lambda simultaneously captures "protectedSelf", which is a RetainPtr of
"self".
This PR also fixes a bug that the checker wasn't generating a warning
when "self" is implicitly captured. Note when "self" is implicitly
captured, we use the lambda's getBeginLoc as a fallback source location.
This PR adds the support for recognizing the return value of copy /
mutableCopy as +1. isAllocInit and isOwned now traverses
PseudoObjectExpr to its last semantic expression.
There are some system libraries such as sqlite3 which forward declare a
struct then use a pointer to that forward declared type in various APIs.
Ignore these types ForwardDeclChecker like other pointer types.
Reverts llvm/llvm-project#133265
This causes the whole libc++ CI to fail, since we're not building
against a compiler built from current trunk. Specifically, the CMake
changes causes some feature detection to fail, resulting in CMake
being unable to configure libc++.
Prior to this PR, we were emitting warnings for Objective-C ivars and
properties if the forward declaration of the type appeared first in a
non-system header. This PR fixes the checker so tha we'd ignore ivars
and properties defined for a forward declared type.
This turns on the unnecessary-virtual-specifier warning in general, but
disables it when building LLVM. It also tweaks the warning description
to be slightly more accurate.
Background: I've been working on cleaning up this warning in two
codebases: LLVM and chromium (plus its dependencies). The chromium
cleanup has been straightforward. Git archaeology shows that there are
two reasons for the warnings: classes to which `final` was added after
they were initially committed, and classes with virtual destructors that
nobody remarks on. Presumably the latter case is because people are just
very used to destructors being virtual.
The LLVM cleanup was more surprising: I discovered that we have an [old
policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers)
about including out-of-line virtual functions in every class with a
vtable, even `final` ones. This means our codebase has many virtual
"anchor" functions which do nothing except control where the vtable is
emitted, and which trigger the warning. I looked into alternatives to
satisfy the policy, such as using destructors instead of introducing a
new function, but it wasn't clear if they had larger implications.
Overall, it seems like the warning is genuinely useful in most codebases
(evidenced by chromium and its dependencies), and LLVM is an unusual
case. Therefore we should enable the warning by default, and turn it off
only for LLVM builds.
This PR fixes the bug that we weren't generating warnings when a raw
poiner is used to point to a NS type in Objective-C ivars. Also fix the
bug that we weren't suppressing this warning in system headers.
Previously `optin.taint.GenericTaint` misinterpreted the parameter
indices and produced false positives in situations when a [format
attribute](https://clang.llvm.org/docs/AttributeReference.html#format)
is applied on a non-static method. This commit fixes this bug