29 Commits

Author SHA1 Message Date
martinboehme
3b6d63c519
Revert "[clang][dataflow] Retrieve members from accessors called using member…" (#74299)
Reverts llvm/llvm-project#73978
2023-12-04 11:27:31 +01:00
Samira Bazuzi
a3fe9cb24d
[clang][dataflow] Retrieve members from accessors called using member… (#73978)
… pointers.

getMethodDecl does not handle pointers to members and returns nullptr
for them. getMethodDecl contains a decade-plus-old FIXME to handle
pointers to members, but two approaches I looked at for fixing it are
more invasive or complex than simply swapping to getCalleeDecl.

The first, have getMethodDecl call getCalleeDecl, creates a large tree
of const-ness mismatches due to getMethodDecl returning a non-const
value while being a const member function and getCalleeDecl only being a
const member function when it returns a const value.

The second, implementing an AST walk to match how
CXXMemberCallExpr::getImplicitObjectArgument grabs the LHS of the binary
operator, is basically reimplementing Expr::getReferencedDeclOfCallee,
which is used by Expr::getCalleeDecl. We don't need another copy of that
code.
2023-12-04 10:10:07 +01:00
martinboehme
71f2ec2db1
[clang][dataflow] Add synthetic fields to RecordStorageLocation (#73860)
Synthetic fields are intended to model the internal state of a class
(e.g. the value stored in a `std::optional`) without having to depend on
that class's implementation details.

Today, this is typically done with properties on `RecordValue`s, but
these have several drawbacks:

* Care must be taken to call `refreshRecordValue()` before modifying a
property so that the modified property values aren’t seen by other
environments that may have access to the same `RecordValue`.

* Properties aren’t associated with a storage location. If an analysis
needs to associate a location with the value stored in a property (e.g.
to model the reference returned by `std::optional::value()`), it needs
to manually add an indirection using a `PointerValue`. (See for example
the way this is done in UncheckedOptionalAccessModel.cpp, specifically
in `maybeInitializeOptionalValueMember()`.)

* Properties don’t participate in the builtin compare, join, and widen
operations. If an analysis needs to apply these operations to
properties, it needs to override the corresponding methods of
`ValueModel`.

* Longer-term, we plan to eliminate `RecordValue`, as by-value
operations on records aren’t really “a thing” in C++ (see
https://discourse.llvm.org/t/70086#changed-structvalue-api-14). This
would obviously eliminate the ability to set properties on
`RecordValue`s.

To demonstrate the advantages of synthetic fields, this patch converts
UncheckedOptionalAccessModel.cpp to synthetic fields. This greatly
simplifies the implementation of the check.

This PR is pretty big; to make it easier to review, I have broken it
down into a stack of three commits, each of which contains a set of
logically related changes. I considered submitting each of these as a
separate PR, but the commits only really make sense when taken together.

To review, I suggest first looking at the changes in
UncheckedOptionalAccessModel.cpp. This gives a flavor for how the
various API changes work together in the context of an analysis. Then,
review the rest of the changes.
2023-12-04 09:29:22 +01:00
martinboehme
c4c59192e6
[clang][dataflow] Clear ExprToLoc and ExprToVal at the start of a block. (#72985)
We never need to access entries from these maps outside of the current
basic
block. This could only ever become a consideration when flow control
happens
inside a full-expression (i.e. we have multiple basic blocks for a full
expression); there are two kinds of expression where this can happen,
but we
already deal with these in other ways:

* Short-circuiting logical operators (`&&` and `||`) have operands that
live in
different basic blocks than the operator itself, but we already have
code in
the framework to retrieve the value of these operands from the
environment
for the block they are computed in, rather than in the environment of
the
   block containing the operator.

* The conditional operator similarly has operands that live in different
basic
blocks. However, we currently don't implement a transfer function for
the
conditional operator. When we do this, we need to retrieve the values of
the
operands from the environments of the basic blocks they live in, as we
already do for logical operators. This patch adds a comment to this
effect
   to the code.

Clearing out `ExprToLoc` and `ExprToVal` has two benefits:

* We avoid performing joins on boolean expressions contained in
`ExprToVal` and
hence extending the flow condition in cases where this is not needed.
Simpler
flow conditions should reduce the amount of work we do in the SAT
solver.

* Debugging becomes easier when flow conditions are simpler and
`ExprToLoc` /
  `ExprToVal` don’t contain any extraneous entries.

Benchmark results on Crubit's `pointer_nullability_analysis_benchmark
show a
slight runtime increase for simple benchmarks, offset by substantial
runtime
reductions for more complex benchmarks:

```
name                              old cpu/op   new cpu/op   delta
BM_PointerAnalysisCopyPointer     29.8µs ± 1%  29.9µs ± 4%     ~     (p=0.879 n=46+49)
BM_PointerAnalysisIntLoop          101µs ± 3%   104µs ± 4%   +2.96%  (p=0.000 n=55+57)
BM_PointerAnalysisPointerLoop      378µs ± 3%   245µs ± 3%  -35.09%  (p=0.000 n=47+55)
BM_PointerAnalysisBranch           118µs ± 2%   122µs ± 3%   +3.37%  (p=0.000 n=59+59)
BM_PointerAnalysisLoopAndBranch    779µs ± 3%   413µs ± 5%  -47.01%  (p=0.000 n=56+45)
BM_PointerAnalysisTwoLoops         187µs ± 3%   192µs ± 5%   +2.80%  (p=0.000 n=57+58)
BM_PointerAnalysisJoinFilePath    17.4ms ± 3%   7.2ms ± 3%  -58.75%  (p=0.000 n=58+57)
BM_PointerAnalysisCallInLoop      14.7ms ± 4%  10.3ms ± 2%  -29.87%  (p=0.000 n=56+58)
```
2023-11-22 16:34:24 +01:00
martinboehme
d1f59544cf
[clang][dataflow] Add Environment::allows(). (#70046)
This allows querying whether, given the flow condition, a certain
formula still
has a solution (though it is not necessarily implied by the flow
condition, as
`flowConditionImplies()` would check).

This can be checked today, but only with a double negation, i.e. to
check
whether, given the flow condition, a formula F has a solution, you can
check
`!Env.flowConditionImplies(Arena.makeNot(F))`. The double negation makes
this
hard to reason about, and it would be nicer to have a way of directly
checking
this.

For consistency, this patch also renames `flowConditionImplies()` to
`proves()`;
the old name is kept around for compatibility but deprecated.
2023-10-25 16:02:22 +02:00
martinboehme
7cf20f156f
[clang][dataflow] Eliminate RecordValue::getChild(). (#65586)
We want to eliminate the `RecordStorageLocation` from `RecordValue` and,
ultimately, eliminate `RecordValue` entirely (see the discussion linked
in the
`RecordValue` class comment). This is one step in that direction.

To eliminate `RecordValue::getChild()`, we also eliminate the last
remaining
caller, namely the `getFieldValue(const RecordValue *, ...)` overload.
Calls
to this overload have been rewritten to use the
`getFieldValue(const RecordStorageLocation *, ...)` overload. Note that
this
also makes the code slightly simpler in many cases.
2023-09-12 09:17:38 +02:00
martinboehme
7f66cc7d7a
[clang][dataflow] Merge RecordValues with different locations correctly. (#65319)
Now that prvalue expressions map directly to values (see
https://reviews.llvm.org/D158977), it's no longer guaranteed that
`RecordValue`s
associated with the same expression will always have the same storage
location.

In other words, D158977 invalidated the assertion in
`mergeDistinctValues()`.
The newly added test causes this assertion to fail without the other
changes in
the patch.

This patch fixes the issue. However, the real fix will be to eliminate
the
`StorageLocation` from `RecordValue` entirely.
2023-09-12 08:43:29 +02:00
Martin Braenne
9ecdbe3855 [clang][dataflow] Rename AggregateStorageLocation to RecordStorageLocation and StructValue to RecordValue.
- Both of these constructs are used to represent structs, classes, and unions;
  Clang uses the collective term "record" for these.

- The term "aggregate" in `AggregateStorageLocation` implies that, at some
  point, the intention may have been to use it also for arrays, but it don't
  think it's possible to use it for arrays. Records and arrays are very
  different and therefore need to be modeled differently. Records have a fixed
  set of named fields, which can have different type; arrays have a variable
  number of elements, but they all have the same type.

- Futhermore, "aggregate" has a very specific meaning in C++
  (https://en.cppreference.com/w/cpp/language/aggregate_initialization).
  Aggregates of class type may not have any user-declared or inherited
  constructors, no private or protected non-static data members, no virtual
  member functions, and so on, but we use `AggregateStorageLocations` to model all objects of class type.

In addition, for consistency, we also rename the following:

- `getAggregateLoc()` (in `RecordValue`, formerly known as `StructValue`) to
  simply `getLoc()`.

- `refreshStructValue()` to `refreshRecordValue()`

We keep the old names around as deprecated synonyms to enable clients to be migrated to the new names.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D156788
2023-08-01 20:29:40 +00:00
Martin Braenne
b244b6ae0b [clang][dataflow] Remove Strict suffix from accessors.
For the time being, we're keeping the `Strict` versions around as deprecated synonyms so that clients can be migrated, but these synonyms will be removed soon.

Depends On D156673

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D156674
2023-07-31 19:40:09 +00:00
Martin Braenne
0f6cf55567 [clang][dataflow] Bugfix for refreshStructValue().
In the case where the expression was not yet associated with a storage location, we created a new storage location but failed to associate it with the expression.

The newly added test fails without the fix.

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D155465
2023-07-17 18:56:25 +00:00
Sam McCall
6272226b9f [dataflow] Remove deprecated BoolValue flow condition accessors
Use the Formula versions instead, now.

Differential Revision: https://reviews.llvm.org/D155152
2023-07-13 09:39:23 +02:00
Martin Braenne
2902ea3d81 [clang][dataflow] Introduce getFieldValue() test helpers.
These insulate tests against changes to the `getChild()` functions of `AggregateStorageLocation` and `StructValue` that will happen as part of the migration to strict handling of value categories (see https://discourse.llvm.org/t/70086 for details):

- `AggregateStorageLocation::getChild()` will soon return a `StorageLocation *`
  instead of a `StorageLocation &`. When this happens, `getFieldValue()` will be
  changed to return null if `AggregateStorageLocation::getChild()` returns null;
  test code will not need to change as it should already be checking whether the
  return value of `getFieldValue()` is null.

- `StructValue::getChild()` will soon return a `StorageLocation *` instead of a
  `Value *`. When this happens, `getFieldValue()` will be changed to look up the
  `Value *` in the `Environment`. Again, test code will not need to change.

The test helpers will continue to serve a useful purpose once the API changes are complete, so the intent is to leave them in place.

This patch changes DataflowEnvironmentTest.cpp and RecordOpsTest.cpp to use the test helpers. TransferTest.cpp will be changed in an upcoming patch to help keep patch sizes manageable for review.

Depends On D154934

Reviewed By: ymandel, xazax.hun, gribozavr2

Differential Revision: https://reviews.llvm.org/D154935
2023-07-12 04:52:23 +00:00
Martin Braenne
0c852dc88e [clang][dataflow][NFC] Remove SkipPast param from getValue(const ValueDecl &).
This parameter was already a no-op, so removing it doesn't change behavior.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D150137
2023-05-09 07:42:20 +00:00
Martin Braenne
9940fac753 [clang][dataflow][NFC] Remove SkipPast parameter from `getStorageLocation(const ValueDecl &).
This parameter was already a no-op, so removing it doesn't change behavior.

Depends On D149144

Reviewed By: ymandel, xazax.hun, gribozavr2

Differential Revision: https://reviews.llvm.org/D149151
2023-05-08 07:10:44 +00:00
Dmitri Gribenko
cd11f55a0c [clang][dataflow] Fix indentation in a test 2023-03-20 18:34:16 +01:00
Yitzhak Mandelbaum
73c98831f6 [clang][dataflow] Fix missed fields in field set construction.
When building the set of referenced fields for the `DataflowAnalysisContext`,
include fields referenced only in default member initializers. These
initializers are visited in the CFGs of constructors and so the fields must be
included when analysing constructor bodies.

Differential Revision: https://reviews.llvm.org/D144987
2023-02-28 18:56:54 +00:00
Yitzhak Mandelbaum
3ce03c42db [clang][dataflow] Fix 2 bugs in MemberExpr interpretation.
There were two (small) bugs causing crashes in the analysis.  This patch fixes both of them.

1. An enum value was accessed as a class member. Now, the engine gracefully
ignores such member expressions.

2. Field access in `MemberExpr` of struct/class-typed global variables. Analysis
didn't interpret fields of global vars, because the vars were initialized before
the fields were added to the "allowlist". Now, the allowlist is set _before_
init of globals.

Differential Revision: https://reviews.llvm.org/D141384
2023-01-10 15:48:00 +00:00
Yitzhak Mandelbaum
01ccf7b3ce Revert "Revert "[clang][dataflow] Only model struct fields that are used in the function being analyzed.""
This reverts commit 2b1a517a92bfdfa3b692a660e19a2bb22513a567. It's a fix forward
with two memory errors fixed, one of which was the cause of the build breakage
in the buildbots.

Original message:

Previously, the model for structs modeled all fields in a struct when
`createValue` was called for that type. This patch adds a prepass on the
function under analysis to discover the fields referenced in the scope and then
limits modeling to only those fields. This reduces wasted memory usage
(modeling unused fields) which can be important for programs that use large
structs.

Note: This patch obviates the need for https://reviews.llvm.org/D123032.
2023-01-09 19:32:10 +00:00
Yitzhak Mandelbaum
2b1a517a92 Revert "[clang][dataflow] Only model struct fields that are used in the function being analyzed."
This reverts commit 5e8f597c2fedc740b71f07dfdb1ef3c2d348b193. It caused msan and ubsan breakages.
2023-01-06 01:07:28 +00:00
Yitzhak Mandelbaum
5e8f597c2f [clang][dataflow] Only model struct fields that are used in the function being analyzed.
Previously, the model for structs modeled all fields in a struct when
`createValue` was called for that type. This patch adds a prepass on the
function under analysis to discover the fields referenced in the scope and then
limits modeling to only those fields.  This reduces wasted memory usage
(modeling unused fields) which can be important for programss that use large
structs.

Note: This patch obviates the need for https://reviews.llvm.org/D123032.

Differential Revision: https://reviews.llvm.org/D140694
2023-01-05 21:46:39 +00:00
Yitzhak Mandelbaum
f3700bdb7f [clang][dataflow] Account for global variables in constructor initializers.
Previously, the analysis modeled global variables appearing in the _body_ of
any function (including constructors). But, that misses those appearing in
constructor _initializers_. This patch adds the initializers to the set of
expressions used to determine which globals to model.

Differential Revision: https://reviews.llvm.org/D140501
2022-12-22 14:20:50 +00:00
Sam Estep
32dcb759c3 [clang][dataflow] Move NoopAnalysis from unittests to include
This patch moves `Analysis/FlowSensitive/NoopAnalysis.h` from `clang/unittests/` to `clang/include/clang/`, so that we can use it for doing context-sensitive analysis.

Reviewed By: ymandel, gribozavr2, sgatev

Differential Revision: https://reviews.llvm.org/D130304
2022-07-22 14:11:32 +00:00
Wei Yi Tee
00e9d53453 [clang][dataflow] Move logic for creating implication and iff expressions into DataflowAnalysisContext from DataflowEnvironment.
To keep functionality of creating boolean expressions in a consistent location.

Depends On D128357

Reviewed By: gribozavr2, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D128519
2022-06-24 23:16:44 +02:00
Stanislav Gatev
3dd7877b27 Revert "[clang][dataflow] Move dataflow testing support out of unittests"
This reverts commit 26bbde2612b2042c3a8a31aed7f45e065c3dd413.
2022-03-09 15:38:51 +00:00
Stanislav Gatev
26bbde2612 [clang][dataflow] Move dataflow testing support out of unittests
This enables tests out of clang/unittests/Analysis/FlowSensitive to
use the testing support utilities.

Reviewed-by: ymandel, gribozavr2

Differential Revision: https://reviews.llvm.org/D121285
2022-03-09 15:31:02 +00:00
Stanislav Gatev
e0cc28dfdc Revert "[clang][dataflow] Add analysis that detects unsafe accesses to optionals"
This reverts commit ce205cffdfa0f16ce9441ba46fa43e23cecf8be7.
2022-03-09 09:51:03 +00:00
Stanislav Gatev
ce205cffdf [clang][dataflow] Add analysis that detects unsafe accesses to optionals
Adds a dataflow analysis that detects unsafe accesses to values of type
`std::optional`, `absl::optional`, or `base::Optional`.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D121197
2022-03-09 09:42:51 +00:00
Yitzhak Mandelbaum
18c84e2d32 [clang][dataflow] Fix nullptr dereferencing error.
When pre-initializing fields in the environment, the code assumed that all
fields of a struct would be initialized. However, given limits on value
construction, that assumption is incorrect. This patch changes the code to drop
that assumption and thereby avoid dereferencing a nullptr.

Differential Revision: https://reviews.llvm.org/D121158
2022-03-08 03:01:31 +00:00
Stanislav Gatev
ae60884dfe [clang][dataflow] Add flow condition constraints to Environment
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120711
2022-03-02 08:57:27 +00:00