5538 Commits

Author SHA1 Message Date
Ryosuke Niwa
6f508492d1
[webkit.UncountedLambdaCapturesChecker] Fix a regression that [[noescape]] on a member function no longer works. (#126016) 2025-02-06 18:57:44 -08:00
Sarah Spall
14716f2e4b
Revert "[HLSL] Implement HLSL Flat casting (excluding splat cases)" (#126149)
Reverts llvm/llvm-project#118842
2025-02-06 15:25:20 -08:00
Sarah Spall
01072e546f
[HLSL] Implement HLSL Flat casting (excluding splat cases) (#118842)
Implement HLSLElementwiseCast excluding support for splat cases
Do not support casting types that contain bitfields.
Partly closes #100609 and partly closes #100619
2025-02-06 14:38:01 -08:00
Donát Nagy
6e17ed9b04
[analyzer] Consolidate array bound checkers (#125534)
Before this commit, there were two alpha checkers that used different
algorithms/logic for detecting out of bounds memory access: the old
`alpha.security.ArrayBound` and the experimental, more complex
`alpha.security.ArrayBoundV2`.

After lots of quality improvement commits ArrayBoundV2 is now stable
enough to be moved out of the alpha stage. As indexing (and dereference)
are common operations, it still produces a significant amount of false
positives, but not much more than e.g. `core.NullDereference` or
`core.UndefinedBinaryOperatorResult`, so it should be acceptable as a
non-`core` checker.

At this point `alpha.security.ArrayBound` became obsolete (there is a
better tool for the same task), so I'm removing it from the codebase.
With this I can eliminate the ugly "V2" version mark almost everywhere
and rename `alpha.security.ArrayBoundV2` to `security.ArrayBound`.

(The version mark is preserved in the filename "ArrayBoundCheckerV2", to
ensure a clear git history. I'll rename it to "ArrayBoundChecker.cpp" in
a separate commit.)

This commit adapts the unit tests of `alpha.security.ArrayBound` to
testing the new `security.ArrayBound` (= old ArrayBoundV2). Currently
the names of the test files are very haphazard, I'll probably create a
separate followup commit that consolidates this.
2025-02-06 17:45:42 +01:00
Arseniy Zaostrovnykh
f5c4f271ab
[analyzer] Add -ftime-trace scopes for region-store bindings and removeDead (#125884)
From investigation of a few slow analysis cases, I discovered that
`RegionStoreManager::bind*` and `ExprEngine::removeDead` are often the
slowest actions. This change adds explicit scope to the time trace
generated by `-ftime-trace` to enable easy diagnostics of the cases when
these functions are the slowdown culprits.

--
CPP-6109
2025-02-06 16:09:14 +01:00
Utkarsh Saxena
112490ce04
[analyzer] Use getFileName and do not use realpath names (#126039)
Real paths resolve symlinks, which makes the tests fail when the
filesystem is a symlink tree over a content-addressable storage (our
internal environment).
2025-02-06 11:15:20 +01:00
Arseniy Zaostrovnykh
c1d5be8f7f
[analyzer] Add time-trace scopes for high-level analyzer steps (#125508)
Specifically, add a scope for 
- each work-list step,
- each entry point,
- each checker run within a step, and
- bug-suppression phase at the end of the analysis of an entry-point.

These scopes add no perceptible run-time overhead when time-tracing is
disabled. You can enable it and generate a time trace using the
`-ftime-trace=file.json` option.
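
The "no overhead when disabled" property usually comes from an RAII scope object that checks a flag once in its constructor. The following is only a sketch of that idea: `ToyProfiler`, `ToyTraceScope`, and `runStep` are hypothetical names invented for illustration and are not the LLVM time-trace API.

```cpp
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a time-trace profiler; not the LLVM implementation.
struct ToyProfiler {
  bool enabled = false;
  // Each finished scope is recorded as (name, duration in microseconds).
  std::vector<std::pair<std::string, long long>> events;
};

// RAII scope: records an event on destruction only when profiling is enabled.
class ToyTraceScope {
public:
  ToyTraceScope(ToyProfiler &profiler, std::string name)
      : profiler_(profiler), active_(profiler.enabled) {
    if (active_) {
      name_ = std::move(name);
      start_ = std::chrono::steady_clock::now();
    }
  }

  ~ToyTraceScope() {
    if (!active_)
      return; // disabled: nothing is measured or stored
    auto elapsed = std::chrono::steady_clock::now() - start_;
    profiler_.events.emplace_back(
        name_,
        std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count());
  }

  ToyTraceScope(const ToyTraceScope &) = delete;
  ToyTraceScope &operator=(const ToyTraceScope &) = delete;

private:
  ToyProfiler &profiler_;
  bool active_;
  std::string name_;
  std::chrono::steady_clock::time_point start_{};
};

// Simulates one work-list step wrapped in a scope.
inline int runStep(ToyProfiler &profiler, int input) {
  ToyTraceScope scope(profiler, "work-list step");
  return input * 2;
}
```

With `enabled` left at `false`, `runStep` records nothing; with it set to `true`, each call appends one event.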

See also the RFC:
https://discourse.llvm.org/t/analyzer-rfc-ftime-trace-time-scopes-for-steps-and-entry-points/84343

--
CPP-6065
2025-02-05 17:22:18 +01:00
Ryosuke Niwa
d5a2638ae9
[webkit.UncountedLambdaCapturesChecker] Fix a bug that the checker didn't take the object pointer into account. (#125662)
When a callee is a method call (e.g. calling a lambda), we need to skip
the object pointer to match the parameter list with the call arguments.
This manifested as a bug in which the checker erroneously generated a warning
for a lambda capture (L1) that is passed to a no-escape argument of
another lambda (L2).
2025-02-04 21:51:16 -08:00
Chandler Carruth
cd269fee05 [StrTable] Switch Clang builtins to use string tables
This both reapplies #118734, the initial attempt at this, and updates it
significantly.

First, it uses the newly added `StringTable` abstraction for string
tables, and simplifies the construction to build the string table and
info arrays separately. This should reduce any `constexpr` compile time
memory or CPU cost of the original PR while significantly improving the
APIs throughout.

It also restructures the builtins to support sharding across several
independent tables. This accomplishes three improvements over the
original PR:

1) It improves the APIs used significantly.

2) When builtins are defined from different sources (like SVE vs MVE in
   AArch64), this allows each of them to build their own string table
   independently rather than having to merge the string tables and info
   structures.

3) It allows each shard to factor out a common prefix, often cutting the
   size of the strings needed for the builtins by a factor of two.

The second point is important both to allow different mechanisms of
construction (for example a `.def` file and a tablegen'ed `.inc` file,
or different tablegen'ed `.inc` files) and because it simply reduces the sizes
of these tables, which is valuable given how large they are in some
cases. The third builds on that size reduction.

Initially, we use this new sharding rather than merging tables in
AArch64, LoongArch, RISCV, and X86. Mostly this helps ensure the system
works, as without further changes these still push scaling limits.
Subsequent commits will more deeply leverage the new structure,
including using the prefix capabilities which cannot be easily factored
out here and requires deep changes to the targets.
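
As a rough illustration of the sharding and prefix ideas above, here is a toy model. It does not use the real `StringTable` abstraction; `ToyShard`, `fullName`, `findInShard`, and the three sample names are assumptions made up for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical shard: one contiguous character buffer plus per-entry offsets.
struct ToyShard {
  std::string_view prefix;    // common prefix factored out of every name
  const char *strings;        // NUL-separated suffixes stored back to back
  const std::size_t *offsets; // start of each suffix inside `strings`
  std::size_t count;          // number of entries
};

// "abs" starts at offset 0, "fma" at 4, "sqrt" at 8.
constexpr char kMathStrings[] = "abs\0fma\0sqrt";
constexpr std::size_t kMathOffsets[] = {0, 4, 8};
constexpr ToyShard kMathShard{"__builtin_", kMathStrings, kMathOffsets, 3};

// Rebuilds the full name of entry `index` by gluing prefix and suffix.
inline std::string fullName(const ToyShard &shard, std::size_t index) {
  return std::string(shard.prefix) + (shard.strings + shard.offsets[index]);
}

// Returns the index of `name` inside the shard, or -1 if it is not there.
inline int findInShard(const ToyShard &shard, std::string_view name) {
  if (name.substr(0, shard.prefix.size()) != shard.prefix)
    return -1;
  name.remove_prefix(shard.prefix.size());
  for (std::size_t i = 0; i < shard.count; ++i)
    if (name == shard.strings + shard.offsets[i])
      return static_cast<int>(i);
  return -1;
}
```

Each shard owns its own buffer and offsets, so shards built from different sources never need to be merged, and the prefix is stored once per shard instead of once per name.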
2025-02-04 18:04:57 +00:00
erichkeane
99a9133a68 [OpenACC] Implement Sema/AST for 'atomic' construct
The atomic construct is a particularly complicated one.  The directive
itself is pretty simple, it has 5 options for the 'atomic-clause'.
However, the associated statement is fairly complicated.

'read' accepts:
  v = x;
'write' accepts:
  x = expr;
'update' (or no clause) accepts:
  x++;
  x--;
  ++x;
  --x;
  x binop= expr;
  x = x binop expr;
  x = expr binop x;

'capture' accepts either a compound statement, or:
  v = x++;
  v = x--;
  v = ++x;
  v = --x;
  v = x binop= expr;
  v = x = x binop expr;
  v = x = expr binop x;

If 'capture' has a compound statement, it accepts:
  {v = x; x binop= expr; }
  {x binop= expr; v = x; }
  {v = x; x = x binop expr; }
  {v = x; x = expr binop x; }
  {x = x binop expr; v = x; }
  {x = expr binop x; v = x; }
  {v = x; x = expr; }
  {v = x; x++; }
  {v = x; ++x; }
  {x++; v = x; }
  {++x; v = x; }
  {v = x; x--; }
  {v = x; --x; }
  {x--; v = x; }
  {--x; v = x; }

While these are all quite complicated, there is a significant amount
of similarity between the 'capture' and 'update' lists, so this patch
reuses a lot of the same functions.

This patch implements the entirety of 'atomic', creating a new Sema file
for the sema for it, as it is fairly sizable.
2025-02-03 07:22:22 -08:00
Hans Wennborg
90e0dd15ff Revert "[Analyzer][CFG] Correctly handle rebuilt default arg and default init expression (#117437)"
This caused assertion failures:

  clang/lib/Analysis/CFG.cpp:822:
  void (anonymous namespace)::CFGBuilder::appendStmt(CFGBlock *, const Stmt *):
  Assertion `!isa<Expr>(S) || cast<Expr>(S)->IgnoreParens() == S' failed.

See comment on the PR.

This reverts commit 44aa618ef67d302f5ab77cc591fb3434fe967a2e.
2025-02-03 15:52:04 +01:00
yronglin
44aa618ef6
[Analyzer][CFG] Correctly handle rebuilt default arg and default init expression (#117437)
Clang currently supports extending the lifetime of objects bound to reference
members of aggregates that are created from a default member initializer.
This PR addresses this change and updates the CFG and ExprEngine.

This PR reapplies https://github.com/llvm/llvm-project/pull/91879.
Fixes https://github.com/llvm/llvm-project/issues/93725.

---------

Signed-off-by: yronglin <yronglin777@gmail.com>
2025-02-01 16:58:05 +08:00
Balazs Benics
9333d8fb07
[analyzer][NFC] Simplify and eliminate redundant map lookups (#125272) 2025-01-31 22:16:39 +01:00
Ziqing Luo
536606f6f6
[StaticAnalyzer] Fix state update in VisitObjCForCollectionStmt (#124477)
In `VisitObjCForCollectionStmt`, the function does `evalLocation` for
the current element at the original source state `Pred`. The evaluation
may result in a new state, say `PredNew`. I.e., there is a transition
`Pred -> PredNew`, though it is very rare that `Pred` is NOT
identical to `PredNew`. (This explains why the bug existed for many years
without being noticed, until a crash was recently observed downstream.) Later,
the original code does NOT use `PredNew` as the new source state in
`StmtNodeBuilder` for the next transitions. In cases where `Pred != PredNew`,
the program misbehaves.

(rdar://143280254)
2025-01-30 16:21:46 -08:00
Balazs Benics
0e62c748d4
[analyzer][NFC] Remove a redundant container lookup (#125064)
I found this using my experimental checker present at:
https://github.com/steakhal/llvm-project/tree/bb/add-redundant-lookup-checker

The idea for looking for redundant container lookups was inspired by
#123376

If there is interest, I could think of upstreaming this alpha checker.
(For the StaticAnalyzer sources it was the only TP, and I had no FPs
from the checker btw.)
2025-01-30 20:48:43 +01:00
Balazs Benics
025541dded
[analyzer] Relax assertion in BugReporterVisitors.cpp isInitializationOfVar (#125044)
If we see a variable declaration (aka. DeclStmt), and the VarRegion it
declared doesn't have Stack memspace, we assumed that it must be a local
static variable.
However, the declared variable may be an extern declaration of a global.

In this patch, let's admit that local extern declarations are a thing.

For the sake of completeness, I also added one more test for
thread_locals - which are implicitly considered statics btw. (the
`isStaticLocal()` correctly also considers thread locals as local
statics).
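
For readers unfamiliar with the distinction, the three kinds of block-scope declarations that lack stack storage look like this. The function and variable names are hypothetical and chosen only for this sketch; they do not come from the patch or its tests.

```cpp
// A global with static storage duration, defined at namespace scope.
int g_counter = 41;

int readThroughLocalExtern() {
  // Block-scope *declaration* of the global above: it introduces no new
  // object, does not live on the stack, and is not a static local either.
  extern int g_counter;
  return g_counter + 1;
}

int nextId() {
  // A genuine static local, for contrast: one object shared by all calls.
  static int id = 0;
  return ++id;
}

int nextThreadLocalId() {
  // A block-scope thread_local variable is implicitly static as well.
  thread_local int id = 100;
  return ++id;
}
```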

Fixes #124975
2025-01-30 12:48:09 +01:00
Jason Rice
abc8812df0
[Clang][P1061] Add structured binding packs (#121417)
This is an implementation of P1061 Structured Bindings can introduce a Pack
without the ability to use packs outside of templates. There are a couple
of ways the AST could have been sliced, so let me know what you think.
The only part of this change that I am unsure of is the
serialization/deserialization stuff. I followed the implementation of
other Exprs, but I do not really know how it is tested. Thank you for
your time considering this.

---------

Co-authored-by: Yanzuo Liu <zwuis@outlook.com>
2025-01-29 21:43:52 +01:00
Sirraide
39a72be5e5
[Clang] [NFC] Introduce ConstDynamicRecursiveASTVisitor (reland) (#124821)
This relands #122991 (eeefa72).

The last attempt at landing this caused some problems; I’m not entirely
sure what happened, but it might have been due to an unnecessary use
of the `template` keyword in a few places. This removes that and attempts
to land the change again.
2025-01-29 16:14:18 +01:00
Nikolas Klauser
0865ecc515
[clang] Extend diagnose_if to accept more detailed warning information, take 2 (#119712)
This is take two of #70976. This iteration of the patch makes sure that
custom diagnostics without any warning group don't get promoted by
`-Werror` or `-Wfatal-errors`.

This implements parts of the extension proposed in
https://discourse.llvm.org/t/exposing-the-diagnostic-engine-to-c/73092/7.

Specifically, this makes it possible to specify a diagnostic group in an
optional third argument.
2025-01-28 08:41:31 +01:00
Sirraide
7873d3b50b
Revert "[Clang] [NFC] Introduce ConstDynamicRecursiveASTVisitor" (#124667)
Reverts llvm/llvm-project#122991

One of the bots is breaking; I’ll have to investigate what the issue is;
this might be because I haven’t updated the branch in a while.
2025-01-28 02:21:18 +01:00
Sirraide
eeefa72f7c
[Clang] [NFC] Introduce ConstDynamicRecursiveASTVisitor (#122991)
After some discussion around #116823, it was decided that it would be
nice to have a `const` variant of `DynamicRecursiveASTVisitor`, so this
pr does exactly that by making the main DRAV implementation a template
with a single `bool` template parameter that turns several function
parameters from a `T*` or `T&` to a `const T*` or `const T&`.
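
The general shape of "one template with a `bool` parameter that toggles constness" can be sketched as follows. This is a toy model, not the DRAV code: `Node`, `ToyVisitorImpl`, `MaybeConst`, `Counter`, and `Reader` are hypothetical names assumed for illustration.

```cpp
#include <type_traits>

// Toy AST node.
struct Node {
  int visits = 0;
  int value = 7;
};

// A single implementation, parameterized on whether nodes are visited as const.
template <bool IsConst> class ToyVisitorImpl {
public:
  // Turns T into `const T` for the const variant and leaves it alone otherwise.
  template <typename T>
  using MaybeConst = std::conditional_t<IsConst, const T, T>;

  virtual ~ToyVisitorImpl() = default;

  // Overridable hook; the parameter type depends on IsConst.
  virtual bool VisitNode(MaybeConst<Node> *) { return true; }

  bool Traverse(MaybeConst<Node> *N) { return VisitNode(N); }
};

using ToyVisitor = ToyVisitorImpl<false>;
using ConstToyVisitor = ToyVisitorImpl<true>;

// A mutating visitor uses the non-const variant.
struct Counter : ToyVisitor {
  bool VisitNode(Node *N) override {
    ++N->visits;
    return true;
  }
};

// A read-only visitor uses the const variant and can traverse const nodes.
struct Reader : ConstToyVisitor {
  int seen = 0;
  bool VisitNode(const Node *N) override {
    seen = N->value;
    return true;
  }
};
```

The cost noted in the message shows up here too: both `ToyVisitorImpl<false>` and `ToyVisitorImpl<true>` are instantiated, so the implementation is compiled twice.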

Since that made the implementation of a bunch of DRAV functions quite a
bit more verbose, I’ve moved most of them to be stamped out by a macro,
which imo makes it easier to understand what’s actually going on there.

For functions which already accepted `const` parameters in the original
RAV implementation, the parameter is `const` in both versions (e.g.
`TraverseTemplateArgument()` always takes a `const TemplateArgument&`);
conversely, parameters that are passed by value (e.g. in
`TraverseType()`, which takes a `QualType` by value) are *not* `const`
in either variant (i.e. the `QualType` argument is always just a
`QualType`, never a `const QualType`).

As a sanity check, I’ve also migrated some random visitor in the static
analyser to the `const` version (and indeed, it ends up simplifying the
code around that particular visitor actually). It would make sense to do
a pass over all visitors and change all that can be `const` to use the
`const` version, but that can be done in a follow-up pr.

The [performance
impact](https://llvm-compile-time-tracker.com/compare.php?from=e3cd88a7be1dfd912bb6e7c7e888e7b442ffb5de&to=d55c5afe4a485b6d0431386e6f45cb44c1fc8883&stat=instructions:u)
of this change seems to be negligible. Clang’s binary size went up by
0.5%, but that’s expected considering that this effectively adds an
extra instantiation of `RecursiveASTVisitor`. Fortunately, this is of
course a one-time cost.
2025-01-28 02:15:02 +01:00
Tom Honermann
8fb42300a0
[SYCL] AST support for SYCL kernel entry point functions. (#122379)
A SYCL kernel entry point function is a non-member function or a static
member function declared with the `sycl_kernel_entry_point` attribute.
Such functions define a pattern for an offload kernel entry point
function to be generated to enable execution of a SYCL kernel on a
device. A SYCL library implementation orchestrates the invocation of
these functions with corresponding SYCL kernel arguments in response to
calls to SYCL kernel invocation functions specified by the SYCL 2020
specification.

The offload kernel entry point function (sometimes referred to as the
SYCL kernel caller function) is generated from the SYCL kernel entry
point function by a transformation of the function parameters followed
by a transformation of the function body to replace references to the
original parameters with references to the transformed ones. Exactly how
parameters are transformed will be explained in a future change that
implements non-trivial transformations. For now, it suffices to state
that a given parameter of the SYCL kernel entry point function may be
transformed to multiple parameters of the offload kernel entry point as
needed to satisfy offload kernel argument passing requirements.
Parameters that are decomposed in this way are reconstituted as local
variables in the body of the generated offload kernel entry point
function.

For example, given the following SYCL kernel entry point function
definition:
```
template<typename KernelNameType, typename KernelType>
[[clang::sycl_kernel_entry_point(KernelNameType)]]
void sycl_kernel_entry_point(KernelType kernel) {
  kernel();
}
```

and the following call:
```
struct Kernel {
  int dm1;
  int dm2;
  void operator()() const;
};
Kernel k;
sycl_kernel_entry_point<class kernel_name>(k);
```

the corresponding offload kernel entry point function that is generated
might look as follows (assuming `Kernel` is a type that requires
decomposition):
```
void offload_kernel_entry_point_for_kernel_name(int dm1, int dm2) {
  Kernel kernel{dm1, dm2};
  kernel();
}
```

Other details of the generated offload kernel entry point function, such
as its name and calling convention, are implementation details that need
not be reflected in the AST and may differ across target devices. For
that reason, only the transformation described above is represented in
the AST; other details will be filled in during code generation.

These transformations are represented using new AST nodes introduced
with this change. `OutlinedFunctionDecl` holds a sequence of
`ImplicitParamDecl` nodes and a sequence of statement nodes that
correspond to the transformed parameters and function body.
`SYCLKernelCallStmt` wraps the original function body and associates it
with an `OutlinedFunctionDecl` instance. For the example above, the AST
generated for the `sycl_kernel_entry_point<kernel_name>` specialization
would look as follows:
```
FunctionDecl 'sycl_kernel_entry_point<kernel_name>(Kernel)'
  TemplateArgument type 'kernel_name'
  TemplateArgument type 'Kernel'
  ParmVarDecl kernel 'Kernel'
  SYCLKernelCallStmt
    CompoundStmt
      <original statements>
    OutlinedFunctionDecl
      ImplicitParamDecl 'dm1' 'int'
      ImplicitParamDecl 'dm2' 'int'
      CompoundStmt
        VarDecl 'kernel' 'Kernel'
          <initialization of 'kernel' with 'dm1' and 'dm2'>
        <transformed statements with redirected references of 'kernel'>
```

Any ODR-use of the SYCL kernel entry point function will (with future
changes) suffice for the offload kernel entry point to be emitted. An
actual call to the SYCL kernel entry point function will result in a
call to the function. However, evaluation of a `SYCLKernelCallStmt`
statement is a no-op, so such calls will have no effect other than to
trigger emission of the offload kernel entry point.

Additionally, as a related change inspired by code review feedback,
these changes disallow use of the `sycl_kernel_entry_point` attribute
with functions defined with a _function-try-block_. The SYCL 2020
specification prohibits the use of C++ exceptions in device functions.
Even if exceptions were not prohibited, it is unclear what the semantics
would be for an exception that escapes the SYCL kernel entry point
function; the boundary between host and device code could be an implicit
noexcept boundary that results in program termination if violated, or
the exception could perhaps be propagated to host code via the SYCL
library. Pending support for C++ exceptions in device code and clear
semantics for handling them at the host-device boundary, this change
makes use of the `sycl_kernel_entry_point` attribute with a function
defined with a _function-try-block_ an error.
2025-01-22 16:39:08 -05:00
Kazu Hirata
a1f8ce683a
[StaticAnalyzer] Migrate away from PointerUnion::dyn_cast (NFC) (#122856)
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we expect Storage to be nonnull.
2025-01-14 15:26:08 -08:00
Timm Baeder
cfe26358e3
Reapply "[clang] Avoid re-evaluating field bitwidth" (#122289) 2025-01-11 07:12:37 +01:00
Balázs Kéri
854cbbf4a8
[clang][analyzer] Split NullDereferenceChecker into modeling and reporting (#122139)
Besides null dereferences, the checker currently also reports dereferences
of undefined values and of label addresses. If we want to add more kinds
of invalid dereferences (or split the existing functionality), it is more
useful to make them separate checkers.
To make this possible, the existing checker is split into a
DereferenceModeling part and a NullDereference checker that only
switches on the check for null dereferences. This is a similar
architecture to the one in MallocChecker and CStringChecker.

The change is almost NFC, but a new (modeling) checker is added. If the
NullDereference checker is turned off, the invalid dereferences that are
found will still stop the analysis, without a warning being emitted (this is
different from the old behavior).
2025-01-10 11:17:06 +01:00
Arseniy Zaostrovnykh
dd331082e7
[analyzer][NFC] Factor out SymbolManager::get<*> (#121781)
Replace the family of `SymbolManager::get*Symbol(...)` member functions
with a single generic `SymbolManager::get<*>` member function.
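
The refactoring pattern, collapsing a family of `getXSymbol(...)` functions into one variadic template, might look roughly like this. It is a toy model only: `ToySymbolManager` and the symbol classes are hypothetical, and the sketch deliberately omits uniquing of identical symbols, which a real symbol manager would need.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Toy symbol hierarchy (hypothetical, for illustration only).
struct ToySymbol {
  virtual ~ToySymbol() = default;
  unsigned id = 0;
};

struct RegionSymbol : ToySymbol {
  int region;
  explicit RegionSymbol(int r) : region(r) {}
};

struct ConjuredSymbol : ToySymbol {
  int stmt;
  int count;
  ConjuredSymbol(int s, int c) : stmt(s), count(c) {}
};

class ToySymbolManager {
public:
  // One generic entry point instead of getRegionSymbol(), getConjuredSymbol(),
  // and so on. The constructor arguments are forwarded to SymT unchanged.
  template <typename SymT, typename... Args> const SymT *get(Args &&...args) {
    auto sym = std::make_unique<SymT>(std::forward<Args>(args)...);
    sym->id = nextId_++;
    const SymT *raw = sym.get();
    symbols_.push_back(std::move(sym));
    return raw;
  }

private:
  unsigned nextId_ = 0;
  std::vector<std::unique_ptr<ToySymbol>> symbols_; // owns every symbol
};
```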
2025-01-10 08:21:00 +01:00
Timm Bäder
59bdea24b0 Revert "[clang] Avoid re-evaluating field bitwidth (#117732)"
This reverts commit 81fc3add1e627c23b7270fe2739cdacc09063e54.

This breaks some LLDB tests, e.g.
SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp:

lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.
2025-01-08 15:09:52 +01:00
Timm Baeder
81fc3add1e
[clang] Avoid re-evaluating field bitwidth (#117732)
Save the bitwidth value as a `ConstantExpr` with the value set. Remove
the `ASTContext` parameter from `getBitWidthValue()`, so the latter
simply returns the value from the `ConstantExpr` instead of
constant-evaluating the bitwidth expression every time it is called.
2025-01-08 14:45:19 +01:00
erichkeane
db81e8c42e [OpenACC] Initial sema implementation of 'update' construct
This executable construct has a larger list of clauses than some of the
others, plus has some additional restrictions.  This patch implements
the AST node, plus the 'cannot be the body of an if, while, do, switch,
or label' statement restriction.  Future patches will handle the
rest of the restrictions, which are based on clauses.
2025-01-07 08:20:20 -08:00
erichkeane
21c785d7bd [OpenACC] Implement 'set' construct sema
The 'set' construct is another fairly simple one, it doesn't have an
associated statement and only a handful of allowed clauses. This patch
implements it and all the rules for it, allowing 3 of its four clauses.
The only exception is default_async, which will be implemented in a
future patch, because it isn't just being enabled, it needs a complete
new implementation.
2025-01-06 11:03:18 -08:00
Balazs Benics
55391f85ac
[analyzer] Retry UNDEF Z3 queries 2 times by default (#120239)
If a refutation Z3 query times out (UNDEF), allow a couple of
retries to improve the stability of the query. By default, allow 2 retries,
which gives a maximum of 3 solve attempts per query.

Retries should help mitigate flaky Z3 queries.
See the details in the following RFC:

https://discourse.llvm.org/t/analyzer-rfc-retry-z3-crosscheck-queries-on-timeout/83711

Note that with each attempt, we spend more time per query.
Currently, we have a 15-second timeout per query, which is also in
effect for the retry attempts.
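
The retry policy described above amounts to a small loop. This sketch assumes a solver that can be called repeatedly and does not touch Z3 itself; `ToyResult`, `RetryOutcome`, and `solveWithRetries` are hypothetical names.

```cpp
#include <functional>

// Possible outcomes of one (toy) solver call; Undef models a timeout.
enum class ToyResult { Sat, Unsat, Undef };

struct RetryOutcome {
  ToyResult result;
  unsigned attempts; // how many times the solver was actually invoked
};

// Calls `solve` once, then up to `maxRetries` more times while the result
// stays Undef. With the default of 2 retries, that is at most 3 attempts.
inline RetryOutcome solveWithRetries(const std::function<ToyResult()> &solve,
                                     unsigned maxRetries = 2) {
  RetryOutcome outcome{ToyResult::Undef, 0};
  for (unsigned attempt = 0; attempt <= maxRetries; ++attempt) {
    outcome.result = solve();
    ++outcome.attempts;
    if (outcome.result != ToyResult::Undef)
      break; // a definite answer ends the loop early
  }
  return outcome;
}
```

In the worst case every attempt runs into the per-query timeout, so the time spent on one query is bounded by the timeout multiplied by the number of attempts.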

---

Why should this help?
In short, retrying queries should bring stability because if a query
runs long, it's more likely that it did so due to some runtime anomaly than
because it's on the edge of succeeding. This is because most queries run
quickly, and the queries that run long usually run long by a fair amount.
Consequently, retries should improve the stability of the outcome of the
Z3 query.

In general, the retries shouldn't increase the overall analysis time
because it's really rare we hit the 0.1% of the cases when we would do
retries. But keep in mind that the retry attempts can add up if many
retries are allowed, or the individual query timeout is large.

CPP-5920
2025-01-06 18:08:12 +01:00
Arseniy Zaostrovnykh
648e256e54
Reapply "[clang][analyzer] Stable order for SymbolRef-keyed containers" (#121749)
Generalize the SymbolIDs used for SymbolData to all SymExprs and use
these IDs for comparing SymbolRef keys in various containers, such as
ConstraintMap. These IDs are superior to raw pointer values because they
are more controllable and are not randomized across executions (unlike
[pointers](https://en.wikipedia.org/wiki/Address_space_layout_randomization)).

The order of these IDs is stable across runs because SymExprs are allocated in
the same order.

Stability of the constraint order is important for the stability of the
analyzer results. I evaluated this change on a set of 200+ open-source C
and C++ projects with the total number of ~78 000 symbolic-execution
issues passing Z3 refutation.

This patch reduced the run-to-run churn (flakiness) in SE issues from
80-90 to 30-40 (out of 78K) in our CSA deployment (in our setting flaky
issues are mostly due to Z3 refutation instability).

Note, most of the issue churn (flakiness) is caused by the mentioned Z3
refutation. With Z3 refutation disabled, issue churn goes down to ~10
issues out of 83K and this patch has no effect on appearing/disappearing
issues between runs. It does, however, seem to reduce the volatility of the
execution flow: before we had 40-80 issues with changed execution flow,
after - 10-30.

Importantly, this change is necessary for the next step in stabilizing
analysis results by caching Z3 query outcomes between analysis runs
(work in progress).

Across our admittedly noisy CI runs, I detected no significant effect on
memory footprint or analysis time.

This PR reapplies https://github.com/llvm/llvm-project/pull/121551 with
a fix to a g++ compiler error reported on some build bots

CPP-5919
2025-01-06 12:45:31 +01:00
Balazs Benics
a106ad0f1d
Revert "[clang][analyzer] Stable order for SymbolRef-keyed containers" (#121592)
Reverts llvm/llvm-project#121551

We had a bunch of build errors caused by this PR.
https://lab.llvm.org/buildbot/#/builders/144/builds/14875
2025-01-03 19:43:24 +01:00
Arseniy Zaostrovnykh
0844f83fea
[clang][analyzer] Stable order for SymbolRef-keyed containers (#121551)
Generalize the `SymbolID`s used for `SymbolData` to all `SymExpr`s and
use these IDs for comparing `SymbolRef` keys in various containers,
such as `ConstraintMap`. These IDs are superior to raw pointer values
because they are more controllable and are not randomized across
executions (unlike
[pointers](https://en.wikipedia.org/wiki/Address_space_layout_randomization)).

The order of these IDs is stable across runs because SymExprs are allocated in
the same order.

Stability of the constraint order is important for the stability of the
analyzer results. I evaluated this change on a set of 200+ open-source C
and C++ projects with the total number of ~78 000 symbolic-execution
issues passing Z3 refutation.

This patch reduced the run-to-run churn (flakiness) in SE issues from
80-90 to 30-40 (out of 78K) in our CSA deployment (in our setting flaky
issues are mostly due to Z3 refutation instability).

Note, most of the issue churn (flakiness) is caused by the mentioned Z3
refutation. With Z3 refutation disabled, issue churn goes down to ~10
issues out of 83K and this patch has no effect on appearing/disappearing
issues between runs. It does, however, seem to reduce the volatility of the
execution flow: before we had 40-80 issues with changed execution flow,
after - 10-30.

Importantly, this change is necessary for the next step in stabilizing
analysis results by caching Z3 query outcomes between analysis runs
(work in progress).

Across our admittedly noisy CI runs, I detected no significant effect on
memory footprint or analysis time.

CPP-5919
2025-01-03 19:36:24 +01:00
Donát Nagy
bb27d5e5c6
[analyzer] Don't assume third iteration in loops (#119388)
This commit ensures that if the loop condition is opaque (the analyzer
cannot determine whether it's true or false) and there were at least two
iterations, then the analyzer doesn't make the unjustified assumption
that it can enter yet another iteration.

Note that the presence of a loop suggests that the developer thought
that two iterations can happen (otherwise an `if` would've been
sufficient), but it does not imply that the developer expected three or
four iterations -- and in fact there are many false positives where a
loop iterates over a two-element (or three-element) data structure, but
the analyzer cannot understand the loop condition and blindly assumes
that there may be three or more iterations. (In particular, analyzing
the FFMPEG project produces 100+ such false positives.)
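
A hypothetical example of the pattern described above (the names are invented for this sketch and are not taken from FFMPEG):

```cpp
// A structure with exactly two elements.
struct Endpoints {
  int values[2];
};

// `count` is opaque to the analyzer when the caller is not visible. In real
// code it is always at most 2, but nothing in this function says so.
int sumFirst(const Endpoints &e, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += e.values[i]; // a third iteration would read values[2]
  return total;
}
```

If the analyzer assumes that a third iteration is possible, it reaches `e.values[2]` and can report an out-of-bounds access that never happens in practice.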

Moreover, this provides some performance improvements in the sense that
the analyzer won't waste time on traversing the execution paths with 3
or 4 iterations in a loop (which are very similar to the paths with 2
iterations) and therefore will be able to traverse more branches
elsewhere on the `ExplodedGraph`.

This logic is disabled if the user enables the widen-loops analyzer
option (which is disabled by default), because the "simulate one final
iteration after the invalidation" execution path would be suppressed by
the "exit the loop if the loop condition is opaque and there were at
least two iterations" logic. If we want to support loop widening, we
would need to create a follow-up commit which ensures that it "plays
nicely" with this logic.
2025-01-02 15:51:03 +01:00
JOSTAR
8e965d89c9
[analyzer] Fix zext assertion failure in loop unrolling (#121203)
The current implementation of APInt extension in the code can trigger an
assertion failure when the `zext` function is called with a target width
smaller than the current bit width. For example:
```cpp
if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
    InitNum = InitNum.zext(BoundNum.getBitWidth());
    BoundNum = BoundNum.zext(InitNum.getBitWidth());
}
```

This logic does not guarantee that the `zext` target width is always
greater than or equal to the current bit width, leading to potential
crashes.

Expected Behavior:
- Ensure InitNum and BoundNum are extended to the maximum of their respective widths.
- Prevent assertion failures by enforcing correct `zext` usage.
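
A minimal sketch of the expected behavior, using a toy integer type instead of `llvm::APInt` so that it is self-contained. `ToyInt` and `extendToCommonWidth` are hypothetical names, and the exception stands in for the assertion.

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>

// Toy stand-in for an arbitrary-width integer: a value plus its bit width.
struct ToyInt {
  std::uint64_t value;
  unsigned width;

  // Mirrors the precondition described above: the target width must be
  // greater than or equal to the current width.
  ToyInt zext(unsigned newWidth) const {
    if (newWidth < width)
      throw std::invalid_argument("zext target narrower than current width");
    return ToyInt{value, newWidth};
  }
};

// Extends both operands to the larger of the two widths, so neither call
// can ever ask for a narrower target.
inline void extendToCommonWidth(ToyInt &init, ToyInt &bound) {
  unsigned common = std::max(init.width, bound.width);
  init = init.zext(common);
  bound = bound.zext(common);
}
```

With a 64-bit `init` and a 32-bit `bound`, the original snippet would call `zext(32)` on the 64-bit value first, which is exactly the narrowing request that trips the check.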

Fixes #121201
2024-12-28 11:09:29 +01:00
Balazs Benics
8dbb33762c
[analyzer] Simplify CallEvent castArgToParamTypeIfNeeded (#120981)
I noticed recently that this code (that I wrote xD) uses the
`getRuntimeDefinition()` which isn't quite necessary for the simple task
this function was designed for.

Why would it be better not using this API here?
I'm experimenting with improving how virtual functions are inlined,
where depending on our ability of deducing the dynamic type of the
object we may end up with inaccurate type information. Such inaccuracy
would mean that we may have multiple runtime definitions. After that,
this code would become ambiguous.

To resolve this, I decided to refactor this and use a simpler - but
equivalent approach.
2024-12-24 17:05:38 +01:00
Ryosuke Niwa
5162fde6ee
[webkit.UncountedLambdaCapturesChecker] Fix a nullptr dereference. (#120702)
Added a nullptr check.
2024-12-20 10:18:08 -08:00
Congcong Cai
9e33387219
[clang analyzer]consume llvm::Error (#120597)
`llvm::Error` must be consumed; otherwise it causes a trap in its destructor
2024-12-20 15:29:44 +01:00
Ryosuke Niwa
a71f9e6986
[webkit.UncountedLambdaCapturesChecker] Detect protectedThis pattern. (#120528)
In WebKit, we often capture this as Ref or RefPtr in addition to this
itself so that the object lives as long as a capturing lambda stays
alive.

Detect this pattern and treat it as safe. This PR also makes the check
for a lambda being passed as a function argument more robust by handling
CXXBindTemporaryExpr, CXXConstructExpr, and DeclRefExpr referring to the
lambda.
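
The shape of the pattern, sketched with `std::shared_ptr` as a stand-in for WebKit's `Ref`/`RefPtr` (an assumption made so the example is self-contained; `Session` and `makeCallback` are hypothetical):

```cpp
#include <functional>
#include <memory>

class Session : public std::enable_shared_from_this<Session> {
public:
  explicit Session(int id) : id_(id) {}

  // Must be called on an object that is already owned by a shared_ptr.
  std::function<int()> makeCallback() {
    // Capture `this` for convenient member access, plus a strong reference
    // (`protectedThis`) so the object stays alive as long as the lambda does.
    return [this, protectedThis = shared_from_this()] { return id_; };
  }

private:
  int id_;
};
```

Because the lambda owns `protectedThis`, using the raw `this` capture inside it is safe even after every other owner has released the object.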
2024-12-19 18:11:28 -08:00
Ryosuke Niwa
d3c4637cbb
[WebKit checkers] Recognize adoptRef as a safe function (#120629)
adoptRef in WebKit constructs Ref/RefPtr so treat it as such in
isCtorOfRefCounted.
Also removed the support for makeRef and makeRefPtr as they don't exist
any more.
2024-12-19 18:08:24 -08:00
erichkeane
4bbdb018a6 [OpenACC] Implement 'init' and 'shutdown' constructs
These two constructs are very simple and similar, and only support 3
different clauses, two of which are already implemented.  This patch
adds AST nodes for both constructs, and leaves the device_num clause
unimplemented, but enables the other two.
2024-12-19 12:21:50 -08:00
Ryosuke Niwa
b5d02786be
Revert "[WebKit checkers] Recognize adoptRef as a safe function" (#120626)
Reverts llvm/llvm-project#119846. Introduced a failing test.
2024-12-19 11:42:17 -08:00
Ryosuke Niwa
98c97d4a19
[WebKit checkers] Recognize adoptRef as a safe function (#119846)
adoptRef in WebKit constructs Ref/RefPtr so treat it as such in
isCtorOfRefCounted. Also removed the support for makeRef and makeRefPtr
as they don't exist any more.
2024-12-19 11:09:36 -08:00
Thurston Dang
2b9abf0db2 Revert "[analyzer] Handle [[assume(cond)]] as __builtin_assume(cond) (#116462)"
This reverts commit 89da344e5879e5347b5057520d5230e40ae24831.

Reason: buildbot breakages e.g., https://lab.llvm.org/buildbot/#/builders/55/builds/4556 (for which the reverted patch is the only code change)
2024-12-19 17:02:16 +00:00
Vinay Deshmukh
89da344e58
[analyzer] Handle [[assume(cond)]] as __builtin_assume(cond) (#116462)
Resolves #100762 

Gist of the change:
1. All the symbol analysis, constraint manager, and expression parsing
logic was already present, but the previous code didn't "visit" the
expressions within `assume()`. By parsing those expressions, all of the
code "just works" by evaluating the SVals, leaning on the same
logic that makes the code with `__builtin_assume` work.
2. "Ignore" an expression (do not add it to the CFG) if it has side effects
(similar to CGStmt.cpp (todo add link)).
3. Add an additional test case for ternary operator handling and modify
CFG.cpp's VisitGuardedExpr code to `continue` if the `ProgramPoint`
is a `StmtPoint`.

---------

Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
2024-12-19 13:49:16 +01:00
Balazs Benics
23377890d0
[analyzer][NFC] Migrate {SymInt,IntSym}Expr to use APSIntPtr (4/4) (#120438) 2024-12-19 13:01:38 +01:00
Balazs Benics
13e20bcb98
[analyzer][NFC] Migrate loc::ConcreteInt to use APSIntPtr (3/4) (#120437) 2024-12-19 12:57:51 +01:00
Balazs Benics
d0d5101f99
[analyzer][NFC] Migrate nonloc::ConcreteInt to use APSIntPtr (2/4) (#120436) 2024-12-19 12:51:40 +01:00
Balazs Benics
b41240be6b
[analyzer][NFC] Introduce APSIntPtr, a safe wrapper of APSInt (1/4) (#120435)
In the past, one could create dangling APSInt references in various ways; these were sometimes assumed to be persisted in the BasicValueFactory.

One should always use BasicValueFactory to create persistent APSInts, that could be used by ConcreteInts or SymIntExprs and similar long-living objects.
If one used a temporary or local variables for this, these would dangle.
To enforce the contract of the analyzer BasicValueFactory and the uses of APSInts, let's have a dedicated strong-type for this.

The idea is that APSIntPtr is always owned by the BasicValueFactory, and that is the only component that can construct it.
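
The "only the factory can construct it" contract can be expressed with a private constructor and a `friend` declaration. The sketch below is a toy model with plain `long long` values; `StableIntPtr`, `ToyValueFactory`, and `ToyConcreteInt` are hypothetical names, not the analyzer's classes.

```cpp
#include <set>

class ToyValueFactory;

// Strong type: a handle that can only be obtained from ToyValueFactory.
class StableIntPtr {
public:
  const long long &operator*() const { return *ptr_; }
  const long long *get() const { return ptr_; }

private:
  friend class ToyValueFactory;
  // Private constructor: arbitrary pointers to temporaries or locals
  // cannot be wrapped by outside code.
  explicit StableIntPtr(const long long *ptr) : ptr_(ptr) {}
  const long long *ptr_;
};

class ToyValueFactory {
public:
  // Interns `value` and returns a handle to the factory-owned copy.
  StableIntPtr getValue(long long value) {
    return StableIntPtr(&*storage_.insert(value).first);
  }

private:
  // Node-based container: element addresses stay valid across insertions.
  std::set<long long> storage_;
};

// A long-lived consumer accepts only the strong type, never a raw reference.
struct ToyConcreteInt {
  StableIntPtr value;
};
```

Handles stay valid only while the factory is alive, which matches the ownership rule stated above.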

These PRs are all NFC - besides fixing dangling APSInt references.
2024-12-19 12:04:04 +01:00