Discussion here:
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520/15?u=szelethus
The original patch, #97298 introduced new timeouts backed by thorough
testing and measurements to keep the running time of Z3 within
reasonable limits. The measurements also showed that only certain
reports and certain TUs were responsible for the poor performance of Z3
refutation.
Unfortunately, it seems like that on machines with different
characteristics (slower machines) the current timeouts don't just axe
0.01% of reports, but many more as well. Considering that timeouts are
inherently nondeterministic as a cutoff point, this lead reports sets
being vastly different on the same projects with the same configuration.
The discussion link shows that all configurations introduced in the
patch with their default values lead to severa nondeterminism of the
analyzer. As we, and others use the analyzer as a gating tool for PRs,
we should revert to the original defaults.
We should respect that
* There are still parts of the analyzer that are either proven or
suspected to contain nondeterministic code (like pointer sets),
* A 15s timeout is more likely to hit the same reports every time on a
wider range of machines, but is still inherently nondeterministic, but
an infinite timeout leads to the tool hanging,
* If you measure the performance of the analyzer on your machines, you
can and should achieve some speedup with little or no observable
nondeterminism.
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
In WebKit, we often write Foo::ensureBar function which lazily
initializes m_bar and returns a raw pointer or a raw reference to m_bar.
Such a return value is safe to use for the duration of a member function
call in Foo so long as m_bar is const so that it never gets unset or
updated with a new value once it's initialized.
This PR adds support for recognizing these types of functions and
treating its return value as a safe origin of a function argument
(including "this") or a local variable.
It's safe to have a raw pointer or a raw reference to a singleton
object. Explicitly allow this in UncountedLocalVarsChecker and
UncheckedLocalVarsChecker.
This PR introduces a new checker
`[alpha.webkit.MemoryUnsafeCastChecker]` that warns all downcasts from a base type to a derived type.
rdar://137766829
Currently, we support `-wdeprecated-array-compare` for C++20 or above
and don't report any warning for older versions, this PR supports
`-Warray-compare` for older versions and for GCC compatibility.
Fixes#114770
This change removes the alpha.core.IdenticalExpr static analysis checker
since it's checks are present in the clang-tidy checks
misc-redundant-expression and bugprone-branch-clone. This check was
implemented as a static analysis check using AST matching, and since
alpha and duplicated in 2 clang-tidy checks may be removed.
Co-authored-by: Vince Bridgers <vince.a.bridgers@ericsson.com>
Like in the test case:
```c++
struct String {
String(const String &) {}
};
struct MatchComponent {
unsigned numbers[2];
String prerelease;
MatchComponent(MatchComponent const &) = default;
};
MatchComponent get();
void consume(MatchComponent const &);
MatchComponent parseMatchComponent() {
MatchComponent component = get();
component.numbers[0] = 10;
component.numbers[1] = 20;
return component; // We should have no stack addr escape warning here.
}
void top() {
consume(parseMatchComponent());
}
```
When calling `consume(parseMatchComponent())` the
`parseMatchComponent()` would return a copy of a temporary of
`component`. That copy would invoke the
`MatchComponent::MatchComponent(const MatchComponent &)` ctor.
That ctor would have a (reference typed) ParamVarRegion, holding the
location (lvalue) of the object we are about to copy (&component). So
far so good, but just before evaluating the binding operation for
initializing the `numbers` field of the temporary, we evaluate the
ArrayInitLoopExpr representing the by-value elementwise copy of the
array `component.numbers`. This is represented by a LazyCompoundVal,
because we (usually) don't just copy large arrays and bind many
individual direct bindings. Rather, we take a snapshot by using a LCV.
However, notice that the LCV representing this copy would look like
this:
lazyCompoundVal{ParamVarRegion{"reference param of cctor"}.numbers}
Notice that it refers to the numbers field of a reference. It would be
much better to desugar the reference to the actual object, thus it
should be: `lazyCompoundVal{component.numbers}`
Actually, when binding the result of the ArrayInitLoopExpr to the
`temp_object.numbers` in the compiler-generated member initializer of
the cctor, we should have two individual direct bindings because this is
a "small array":
```
binding &Element{temp_object.numbers, 0} <- loadFrom(&Element{component.numbers, 0})
binding &Element{temp_object.numbers, 1} <- loadFrom(&Element{component.numbers, 1})
```
Where `loadFrom(...)` would be:
```
loadFrom(&Element{component.numbers, 0}): 10 U32b
loadFrom(&Element{component.numbers, 1}): 20 U32b
```
So the store should look like this, after PostInitializer of
`temp_object.numbers`:
```
temp_object at offset 0: 10 U32b
temp_object at offset 32: 20 U32b
```
The lesson is that it's okay to have TypedValueRegions of references as
long as we don't form subregions of those. If we ever want to refer to a
subregion of a "reference" we actually meant to "desugar" the reference
and slice a subregion of the pointee of the reference instead.
Once this canonicalization is in place, we can also drop the special
handling of references in `ProcessInitializer`, because now reference
TypedValueRegions are eagerly desugared into their referee region when
forming a subregion of it.
There should be no practical differences, but there are of course bugs
that this patch may surface.
In #115916 I allowed copying empty structs.
Later in #115917 I changed how objects are copied, and basically when we
would want to copy a struct (an LCV) of a single symbol (likely coming
from an opaque fncall or invalidation), just directly bind that symbol
instead of creating an LCV referring to the symbol. This was an
optimization to skip a layer of indirection.
Now, it turns out I should have apply the same logic in #115916. I
should not have just blindly created an LCV by calling
`createLazyBinding()`, but rather check if I can apply the shortcut
described in #115917 and only create the LCV if the shortcut doesn't
apply.
In this patch I check if there is a single default binding that the copy
would refer to and if so, just return that symbol instead of creating an
LCV.
There shouldn't be any observable changes besides that we should have
fewer LCVs. This change may surface bugs in checkers that were
associating some metadata with entities in a wrong way. Notably,
STLAlgorithmModeling and DebugIteratorModeling checkers would likely
stop working after this change.
I didn't investigate them deeply because they were broken even prior to
this patch. Let me know if I should migrate these checkers to be just as
bugged as they were prior to this patch - thus make the tests pass.
This change modernizes, improves and promotes the chroot checker from
alpha to the Unix family of checkers. This checker covers the POS05
recommendations for use of chroot.
The improvements included modeling of a success or failure from chroot
and not falsely reporting a warning along an error path. This was made
possible through modernizing the checker to be flow sensitive.
---------
Co-authored-by: einvbri <vince.a.bridgers@ericsson.com>
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
The motivating example: https://compiler-explorer.com/z/WjsxYfs43
```C++
#include <stdlib.h>
void inf_loop_break_callee() {
void* data = malloc(10);
while (1) {
(void)data; // line 3
break; // -> execution continues on line 3 ?!!
}
}
```
To correct the flow steps in this example (see the fixed version in the
added test case) I changed two things in the engine:
- Make `processCallExit` create a new StmtPoint only for return
statements. If the last non-jump statement is not a return statement,
e.g. `(void)data;`, it is no longer inserted in the exploded graph after
the function exit.
- Skip the purge program points. In the example above, purge
points are still inserted after the `break;` executes. Now, when the bug
reporter is looking for the next statement executed after the function
execution is finished, it will ignore the purge program points, so it
won't confusingly pick the `(void)data;` statement.
CPP-5778
Fixed a bug that UncountedLambdaCapturesChecker would emit a warning on
a lambda capture when the lambda is invoked with arguments.
LocalVisitor::VisitCallExpr was not tolerating a lambda invocation with
more than 1 arguments.
returned by-value from opaque function calls.
If a struct is returned by-value from an opaque call, the "value" of the
whole struct is represented by a Conjured symbol.
Later fields may slice off smaller subregions by creating Derived
symbols of that Conjured symbol, but those are handled well, and
"isTainted" returns true as expected.
However, passing the whole struct to "isTainted" would be false, because
LazyCompoundVals and CompoundVals are not handled.
This patch addresses this.
Fixes#114270
Split from #114835
This could deduce some complex syms derived from simple ones whose
values could be constrainted to be concrete during execution, thus
reducing some overconstrainted states.
This commit also fix `unix.StdCLibraryFunctions` crash due to these
overconstrainted states being added to the graph, which is marked as
sink node (PosteriorlyOverconstrained). The 'assume' API is used in
non-dual style so the checker should protectively test whether these
newly added nodes are actually impossible.
1. The crash: https://godbolt.org/z/8KKWeKb86
2. The solver needs to solve equivalent: https://godbolt.org/z/ed8WqsbTh
We represent copies of structs by LazyCompoundVals, that is basically a
snapshot of the Store and Region that your copy would refer to.
This snapshot is actually not taken for empty structs (structs that have
no non-static data members), because the users won't be able to access
any fields anyways, so why bother.
However, when it comes to taint propagation, it would be nice if
instances of empty structs would behave similar to non-empty structs.
For this, we need an identity for which taint can bind, so Unknown -
that was used in the past wouldn't work.
Consequently, copying the value of an empty struct should behave the
same way as a non-empty struct, thus be represented by a
LazyCompoundVal.
Split from #114835
Treat member function calls to ref() and incrementCheckedPtrCount() as
trivial so that a function which returns RefPtr/Ref out of a raw
reference / pointer is also considered trivial.
This is reapplies #115615 without using tuples. The eager call of
`getRegion()` and `getOffset()` could cause crashes when the Store had
symbolic bindings.
Here I'm fixing the crash by lazily calling those getters.
Also, the tuple version poorly sorted the Clusters. The memory spaces
should have come before the regular clusters.
Now, that is also fixed here, demonstrated by the test.
This PR makes webkit.UncountedLambdaCapturesChecker ignore trivial
functions as well as the one being passed to an argument with
[[clang::noescape]] attribute. This dramatically reduces the false
positive rate for this checker.
To do this, this PR replaces VisitLambdaExpr in favor of checking
lambdas via VisitDeclRefExpr and VisitCallExpr. The idea is that if a
lambda is defined but never called or stored somewhere, then capturing
whatever variable in such a lambda is harmless.
VisitCallExpr explicitly looks for direct invocation of lambdas and
registers its DeclRefExpr to be ignored in VisitDeclRefExpr. If a lambda
is being passed to a function, it checks whether its argument is
annotated with [[clang::noescape]]. If it's not annotated such, it
checks captures for their safety.
Because WTF::switchOn could not be annotated with [[clang::noescape]] as
function type parameters are variadic template function so we hard-code
this function into the checker.
In order to check whether "this" pointer is ref-counted type or not, we
override TraverseDecl and record the most recent method's declaration.
In addition, this PR fixes a bug in isUnsafePtr that it was erroneously
checking whether std::nullopt was returned by isUncounted and
isUnchecked as opposed to the actual boolean value.
Finally, this PR also converts the accompanying test to use -verify and
adds a bunch of tests.
This PR introduces alpha.webkit.UncheckedCallArgsChecker which detects a
function argument which is a raw reference or a raw pointer to a
CheckedPtr capable object.
This commit removes the `check::BranchCondition` callback of the debug
checker `debug.DumpTraversal` (in `TraversalChecker.cpp`) and the single
broken testcase that was referring to it.
The testcase `traversal-algorithm.mm` was added in 2012 to verify that
we're using DFS traversal -- however it failed to detect that we're no
longer using DFS traversal and in fact it continues to pass even if I
remove large random portions of its code.
This change was motivated by the plan discussed at
https://discourse.llvm.org/t/fixing-or-removing-check-branchcondition/82738
I also added some TODO notes to mark the rest of `TraversalChecker.cpp`
for removal in follow-up commits.
This PR introduces alpha.webkit.UncheckedLocalVarsChecker which detects
a raw reference or a raw pointer local, static, or global variable to a
CheckedPtr capable object without a guardian variable in an outer scope.
Fixes#62985Fixes#58820
When 3rd-party header files are included as system headers, their
overloaded `new` and `delete` operators are also considered as the std
ones. However, those overloaded operator functions will also be inlined.
This makes the same
symbolic memory marked as released twice: during `checkPreCall` of the
overloaded `delete` operator and when calling `::operator delete` after
inlining the overloaded operator function (if it has).
This patch attempts to fix this bug by adjusting the strategy of
verifying whether the callee is a standard `new` or `delete` operator in
the `isStandardNewDelete` function.
This PR makes webkit.UncountedLambdaCapturesChecker ignore trivial
functions as well as the one being passed to an argument with
[[clang::noescape]] attribute. This dramatically reduces the false
positive rate for this checker.
To do this, this PR replaces VisitLambdaExpr in favor of checking
lambdas via VisitDeclRefExpr and VisitCallExpr. The idea is that if a
lambda is defined but never called or stored somewhere, then capturing
whatever variable in such a lambda is harmless.
VisitCallExpr explicitly looks for direct invocation of lambdas and
registers its DeclRefExpr to be ignored in VisitDeclRefExpr. If a lambda
is being passed to a function, it checks whether its argument is
annotated with [[clang::noescape]]. If it's not annotated such, it
checks captures for their safety.
Because WTF::switchOn could not be annotated with [[clang::noescape]] as
function type parameters are variadic template function so we hard-code
this function into the checker.
Finally, this PR also converts the accompanying test to use -verify and
adds a bunch of tests.
This checker has a notion of a guardian variable which is a variable and
keeps the object pointed to by a raw pointer / reference in an inner
scope alive long enough to "guard" it from use-after-free. But such a
guardian variable fails to flawed to keep the object alive if it ever
gets mutated within the scope of a raw pointer / reference.
This PR fixes this bug by introducing a new AST visitor class,
GuardianVisitor, which traverses the compound statements of a guarded
variable (raw pointer / reference) and looks for any operator=, move
constructor, or calls to "swap", "leakRef", or "releaseNonNull"
functions.
This change moves the `alpha.nondeterministic.PointerSorting` and
`alpha.nondeterministic.PointerIteration` static analyzer checkers to a
single `clang-tidy` check. Those checkers were implemented as simple
`clang-tidy` check-like code, wrapped in the static analyzer framework.
The documentation was updated to describe what the checks can and cannot
do, and testing was completed on a broad set of open-source projects.
Co-authored-by: Vince Bridgers <vince.a.bridgers@ericsson.com>
This patch generalizes the way element regions are constructed when an
`ArrayInitLoopExpr` is being analyzed. Previously the base region of the
`ElementRegion` was determined with pattern matching, which led to
crashes, when an unhandled pattern was encountered.
Fixes#112813
When instantiating "callable<T>", the "class CallableType" nested type
will only have a declaration in the copy for the instantiation - because
it's not refereed to directly by any other code that would need a
complete definition.
However, in the past, when conservative eval calling member function, we
took the static type of the "this" expr, and looked up the CXXRecordDecl
it refereed to to see if it has any mutable members (to decide if it
needs to refine invalidation or not). Unfortunately, that query needs a
definition, and it asserts otherwise, thus we crashed.
To fix this, we should consult the dynamic type of the object, because
that will have the definition.
I anyways added a check for "hasDefinition" just to be on the safe side.
Fixes#77378
This patch simplifies the diagnostic message in the core.StackAddrEscape
for stack memory associated with compound literals by removing the
redundant "returned to caller" suffix.
Example: https://godbolt.org/z/KxM67vr7c
```c
// clang --analyze -Xanalyzer -analyzer-checker=core.StackAddressEscape
void* compound_literal() {
return &(unsigned short){((unsigned short)0x22EF)};
}
```
warning: Address of stack memory associated with a compound literal
declared on line 2 **returned to caller returned to caller**
[core.StackAddressEscape]
This PR fixes the bug that alpha.webkit.UncountedLocalVarsChecker
erroneously treats a trivial recursive function as non-trivial. This was
caused by TrivialFunctionAnalysis::isTrivialImpl which takes a statement
as an argument populating the cache with "false" while traversing the
statement to determine its triviality within a recursive function in
TrivialFunctionAnalysisVisitor's WithCachedResult. Because
IsFunctionTrivial honors an entry in the cache, this resulted in the
whole function to be treated as non-trivial.
Thankfully, TrivialFunctionAnalysisVisitor::IsFunctionTrivial already
handles recursive functions correctly so this PR applies the same logic
to TrivialFunctionAnalysisVisitor::WithCachedResult by sharing code
between the two functions. This avoids the cache to be pre-populated
with "false" while traversing statements in a recurisve function.
CStringChecker has a sub-checker alpha.unix.cstring.NotNullTerminated
which checks for invalid objects passed to string functions. The checker
and its name are not exact and more functions could be checked, this
change only adds some tests and improves documentation.
Ideally, we wouldn't workaround our current cast-modeling, but the
experimental "support-symbolic-integer-casts" is not finished so we need
to live with our current modeling.
Ideally, we could probably bind `UndefinedVal` as the result of the call
even without evaluating the call, as the result types mismatch between
the static type of the `CallExpr` and the actually function that happens
to be called.
Nevertheless, let's not crash.
https://compiler-explorer.com/z/WvcqK6MbY
CPP-5768
`builtin_*_overflow` functions return `_Bool` according to [1].
`BuiltinFunctionChecker` was using `makeTruthVal` w/o specifying
explicit type, which creates an `int` value, since it's the type of any
compassion according to C standard.
Fix it by directly passing `BoolTy` to `makeTruthVal`
Closes: #111147
[1]
https://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins