172 Commits

Author SHA1 Message Date
martinboehme
b851c7f1fc
[clang][dataflow] Support StmtExpr in PropagateResultObject(). (#88872)
This patch adds a test that assert-fails without the fix.
2024-04-17 08:05:43 +02:00
Samira Bazuzi
9ec8c96166
[clang][dataflow] Expose getReferencedDecls and relocate free functions. (#88754)
Moves free functions from DataflowEnvironment.h/cc and
DataflowAnalysisContext.h/cc to RecordOps and a new ASTOps and exposes
them as needed for current use and to expose getReferencedDecls for
out-of-tree use.

Minimal change in functionality, only to modify the return type of
getReferencedDecls to return the collected decls instead of using output
params.

Tested with `ninja check-clang-tooling`.
2024-04-16 14:46:05 -04:00
martinboehme
3c6f91e5b6
[clang][dataflow] Fix result object location for builtin <=>. (#88726)
The newly added test causes an assertion failure in
`PropagateResultObject()`
without the fix added here.
2024-04-16 08:49:45 +02:00
martinboehme
71f1932b84
[clang][dataflow] Reland #87320: Propagate locations from result objects to initializers. (#88316)
This relands #87320 and additionally removes the now-unused function
`isOriginalRecordConstructor()`, which was causing buildbots to fail.
2024-04-11 08:20:35 +02:00
martinboehme
7549b45825
Revert "[clang][dataflow] Propagate locations from result objects to initializers." (#88315)
Reverts llvm/llvm-project#87320

This is causing buildbots to fail because
`isOriginalRecordConstructor()` is now unused.
2024-04-10 21:27:10 +02:00
martinboehme
21009f466e
[clang][dataflow] Propagate locations from result objects to initializers. (#87320)
Previously, we were propagating storage locations the other way around,
i.e.
from initializers to result objects, using `RecordValue::getLoc()`. This
gave
the wrong behavior in some cases -- see the newly added or fixed tests
in this
patch.

In addition, this patch now unblocks removing the `RecordValue` class
entirely,
as we no longer need `RecordValue::getLoc()`.

With this patch, the test `TransferTest.DifferentReferenceLocInJoin`
started to
fail because the framework now always uses the same storage location for
a
`MaterializeTemporaryExpr`, meaning that the code under test no longer
set up
the desired state where a variable of reference type is mapped to two
different
storage locations in environments being joined. Rather than trying to
modify
this test to set up the test condition again, I have chosen to replace
the test
with an equivalent test in DataflowEnvironmentTest.cpp that sets up the
test
condition directly; because this test is more direct, it will also be
less
brittle in the face of future changes.
2024-04-10 20:03:35 +02:00
NAKAMURA Takumi
d08a76d1ac Fix warnings discovered by #87348 [-Wunused-but-set-variable] 2024-04-07 11:02:08 +09:00
Yitzhak Mandelbaum
bbd259af0a
[clang][dataflow] Refactor widen API to be explicit about change effect. (#87233)
The previous API relied on pointer equality of inputs and outputs to
signal whether a change occurred. This was too subtle and led to bugs in
practice. It was also very limiting: the override could not return an equivalent (but
not identical) value.
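
As a rough illustration of the design point above, here is a self-contained sketch that does not use the real framework types. `ToyValue`, `ToyChangeEffect`, `ToyWidenResult`, `ToyArena`, and `widenToy` are hypothetical names invented for this example, and a simple interval hull stands in for a real widening. The sketch only shows why an explicit change effect is more expressive than pointer identity: the function can hand back a new object that is equivalent to the previous one and still report "unchanged".

```cxx
#include <memory>
#include <vector>

// Hypothetical stand-in for an analysis value: a closed integer interval.
struct ToyValue {
  int Lower;
  int Upper;
};

// Explicit signal for whether widening changed anything.
enum class ToyChangeEffect { Unchanged, Changed };

struct ToyWidenResult {
  const ToyValue *V;      // Result value; may be a new but equivalent object.
  ToyChangeEffect Effect; // Reported directly, not inferred from pointers.
};

// Owns values created during widening so returned pointers stay valid.
struct ToyArena {
  std::vector<std::unique_ptr<ToyValue>> Storage;
  const ToyValue *make(int Lower, int Upper) {
    Storage.push_back(std::make_unique<ToyValue>(ToyValue{Lower, Upper}));
    return Storage.back().get();
  }
};

// Combines `Prev` and `Cur` into their hull and always returns a fresh
// object, so pointer equality with `Prev` carries no information.
ToyWidenResult widenToy(const ToyValue &Prev, const ToyValue &Cur,
                        ToyArena &A) {
  int Lower = Cur.Lower < Prev.Lower ? Cur.Lower : Prev.Lower;
  int Upper = Cur.Upper > Prev.Upper ? Cur.Upper : Prev.Upper;
  bool Same = Lower == Prev.Lower && Upper == Prev.Upper;
  return {A.make(Lower, Upper),
          Same ? ToyChangeEffect::Unchanged : ToyChangeEffect::Changed};
}
```

With a pointer-equality convention, the first call in a fixpoint loop would look like a change because the returned pointer differs from `&Prev`; with the explicit effect, the loop can stop as soon as `Effect` is `Unchanged`.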
2024-04-04 08:39:51 -04:00
martinboehme
8d77d362af
[clang][dataflow] Introduce a helper class for handling record initializer lists. (#86675)
This is currently only used in one place, but I'm working on a patch
that will
use this from a second place. And I think this already improves the
readability
of the one place this is used so far.
2024-03-28 10:12:45 +01:00
smanna12
4c4ea249ce
[NFC][CLANG] Fix static analyzer bugs about unnecessary object copies with auto keyword (#85962)
Reported by Static Analyzer Tool:
In clang::dataflow::Environment::initialize(): Using the auto keyword
without an & causes the copy of an object of type LambdaCapture
2024-03-26 12:05:40 -05:00
martinboehme
27d504998e
[clang][dataflow] Fix getResultObjectLocation() on CXXDefaultArgExpr. (#85072)
This patch includes a test that causes an assertion failure without the
other
changes in this patch.
2024-03-18 13:36:20 +01:00
martinboehme
2d539db246
[clang][dataflow] When analyzing ctors, don't initialize fields of *this with values. (#84164)
This is the constructor's job, and we want to be able to test that it
does this.
2024-03-08 08:19:02 +01:00
martinboehme
d5aecf0c19
[clang][nullability] Don't discard expression state before end of full-expression. (#82611)
In https://github.com/llvm/llvm-project/pull/72985, I made a change to
discard
expression state (`ExprToLoc` and `ExprToVal`) at the beginning of each
basic
block. I did so with the claim that "we never need to access entries
from these
maps outside of the current basic block", noting that there are
exceptions to
this claim when control flow happens inside a full-expression (the
operands of
`&&`, `||`, and the conditional operator live in different basic blocks
than the
operator itself) but that we already have a mechanism for retrieving the
values
of these operands from the environment for the block they are computed
in.

It turns out, however, that the operands of these operators aren't the
only
expressions whose values can be accessed from a different basic block;
when
control flow happens within a full-expression, that control flow can be
"interposed" between an expression and its parent. Here is an example:

```cxx
void f(int*, int);
bool cond();

void target() {
  int i = 0;
  f(&i, cond() ? 1 : 0);
}
```

([godbolt](https://godbolt.org/z/hrbj1Mj3o))

In the CFG[^1], note how the expression for `&i` is computed in block
B4,
but the parent of this expression (the `CallExpr`) is located in block
B1.
The argument expression `&i` and the `CallExpr` are essentially
"torn apart"
into different basic blocks by the conditional operator in the second
argument.
In other words, the edge between the `CallExpr` and its argument `&i`
straddles
the boundary between two blocks.

I used to think that this scenario -- where an edge between an
expression and
one of its children straddles a block boundary -- could only happen
between the
expression that triggers the control flow (`&&`, `||`, or the
conditional
operator) and its children, but the example above shows that other
expressions
can be affected as well; the control flow is still triggered by `&&`,
`||` or
the conditional operator, but the expressions affected lie outside these
operators.

Discarding expression state too soon is harmful. For example, an
analysis that
checks the arguments of the `CallExpr` above would not be able to
retrieve a
value for the `&i` argument.

This patch therefore ensures that we don't discard expression state
before the
end of a full-expression. In other cases -- when the evaluation of a
full-expression is complete -- we still want to discard expression state
for the
reasons explained in https://github.com/llvm/llvm-project/pull/72985
(avoid
performing joins on boolean values that are no longer needed, which
unnecessarily extends the flow condition; improve debuggability by
removing
clutter from the expression state).

The impact on performance from this change is about a 1% slowdown in the
Crubit nullability check benchmarks:

```
name                              old cpu/op   new cpu/op   delta
BM_PointerAnalysisCopyPointer     71.9µs ± 1%  71.9µs ± 2%    ~     (p=0.987 n=15+20)
BM_PointerAnalysisIntLoop          190µs ± 1%   192µs ± 2%  +1.06%  (p=0.000 n=14+16)
BM_PointerAnalysisPointerLoop      325µs ± 5%   324µs ± 4%    ~     (p=0.496 n=18+20)
BM_PointerAnalysisBranch           193µs ± 0%   192µs ± 4%    ~     (p=0.488 n=14+18)
BM_PointerAnalysisLoopAndBranch    521µs ± 1%   525µs ± 3%  +0.94%  (p=0.017 n=18+19)
BM_PointerAnalysisTwoLoops         337µs ± 1%   341µs ± 3%  +1.19%  (p=0.004 n=17+19)
BM_PointerAnalysisJoinFilePath    1.62ms ± 2%  1.64ms ± 3%  +0.92%  (p=0.021 n=20+20)
BM_PointerAnalysisCallInLoop      1.14ms ± 1%  1.15ms ± 4%    ~     (p=0.135 n=16+18)
```

[^1]:
```
 [B5 (ENTRY)]
   Succs (1): B4

 [B1]
   1: [B4.9] ? [B2.1] : [B3.1]
   2: [B4.4]([B4.6], [B1.1])
   Preds (2): B2 B3
   Succs (1): B0

 [B2]
   1: 1
   Preds (1): B4
   Succs (1): B1

 [B3]
   1: 0
   Preds (1): B4
   Succs (1): B1

 [B4]
   1: 0
   2: int i = 0;
   3: f
   4: [B4.3] (ImplicitCastExpr, FunctionToPointerDecay, void (*)(int *, int))
   5: i
   6: &[B4.5]
   7: cond
   8: [B4.7] (ImplicitCastExpr, FunctionToPointerDecay, _Bool (*)(void))
   9: [B4.8]()
   T: [B4.9] ? ... : ...
   Preds (1): B5
   Succs (2): B2 B3

 [B0 (EXIT)]
   Preds (1): B1
```
2024-03-07 13:31:23 +01:00
martinboehme
128780b06f
[clang][dataflow] Correctly treat empty initializer lists for unions. (#82986)
This fixes a crash introduced by
https://github.com/llvm/llvm-project/pull/82348
but also adds additional handling to make sure that we treat empty
initializer
lists for both unions and structs/classes correctly (see tests added in
this
patch).
2024-03-01 09:27:59 +01:00
Samira Bazuzi
c4e94633e8
Revert "[clang][dataflow] Correctly handle InitListExpr of union type." (#82856)
Reverts llvm/llvm-project#82348, which caused crashes when analyzing
empty InitListExprs for unions, e.g.

```cc
union U {
  double double_value;
  int int_value;
};

void target() {
  U value;
  value = {};
}
```

Co-authored-by: Samira Bazuzi <bazuzi@users.noreply.github.com>
2024-02-26 14:23:46 +01:00
martinboehme
4725993f1a
[clang][dataflow] Correctly handle InitListExpr of union type. (#82348) 2024-02-21 10:10:25 +01:00
Antonio Frighetto
8c6e96d9eb [clang][Dataflow] Fix unnecessary copy in initializeFieldsWithValues (NFC) 2024-02-13 11:05:26 +01:00
martinboehme
270f2c5575
[clang][dataflow] Add Environment::initializeFieldsWithValues(). (#81239)
This function will be useful when we change the behavior of record-type
prvalues
so that they directly initialize the associated result object. See also
the
comment here for more details:


9e73656af5/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h (L354)

As part of this patch, we document and assert that synthetic fields may
not have
reference type.

There is no practical use case for this: A `StorageLocation` may not
have
reference type, and a synthetic field of the corresponding non-reference
type
can serve the same purpose.
2024-02-13 10:01:25 +01:00
Yitzhak Mandelbaum
672fb27b26
[clang][dataflow] Add new join API and replace existing merge implementations. (#80361)
This patch adds a new interface for the join operation, now properly
called `join`. Originally, the framework offered a single `merge`
operation, which could serve either as a join or a widening. In
practice, though, we found this conflation didn't work for non-trivial
analyses, and split off the widening operation (`widen`). This change
completes the transition by introducing a proper `join` with strict join
semantics.

In the process, it drops an odd (and often misused) aspect of `merge`
wherein callees could implicitly instruct the framework to drop the
current entry by returning `false`. This feature was never used
correctly in analyses and doesn't belong in a join operation, so it is
omitted.

---------

Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com>
Co-authored-by: martinboehme <mboehme@google.com>
2024-02-06 15:38:56 -05:00
martinboehme
c83ec847ac
[clang][dataflow] Extend debug output for Environment. (#79982)
*  Print `ReturnLoc`, `ReturnVal`, and `ThisPointeeLoc` if applicable.

* For entries in `LocToVal` that correspond to declarations, print the
names
   of the declarations next to them.

I've removed the FIXME because all relevant fields are now being dumped.
I'm
not sure we actually need the capability for the caller to specify which
fields
to dump, so I've simply deleted this part of the comment.

Some examples of the output:


![image](https://github.com/llvm/llvm-project/assets/29098113/17d0978f-b86d-4555-8a61-d1f2021f8d59)


![image](https://github.com/llvm/llvm-project/assets/29098113/021dbb24-5fe2-4720-8a08-f48dcf4b88f8)
2024-01-31 08:11:13 +01:00
martinboehme
7a6c2628e9
[clang][dataflow] Eliminate two uses of RecordValue::getLoc(). (#79163)
This is a small step towards eventually eliminating `RecordValue`
entirely.
2024-01-24 08:06:32 +01:00
martinboehme
a2caa4929e
[clang][dataflow] Treat comma operator correctly in getResultObjectLocation(). (#78427) 2024-01-22 09:23:06 +01:00
martinboehme
f1226eea52
[clang][dataflow] Consider CXXDefaultInitExpr to be an "original record ctor". (#78423)
The CFG doesn't contain a CFGElement for the
`CXXDefaultInitExpr::getInit()`, so
it makes sense to consider the `CXXDefaultInitExpr` to be the expression
that
originally constructs the object.
2024-01-18 08:59:26 +01:00
martinboehme
23bfc271a3
[clang][dataflow] Use ignoreCFGOmittedNodes() in setValue(). (#78245)
This is to be consistent with `getValue()`, which also uses
`ignoreCFGOmittedNodes()`.

Before this fix, it was not possible to retrieve a `Value` from a "CFG
omitted"
node that had previously been set using `setValue()`; see the
accompanying test,
which fails without the fix.

I discovered this issue while running internal integration tests on
https://github.com/llvm/llvm-project/pull/78127.
2024-01-16 15:48:44 +01:00
martinboehme
c19cacfa34
[clang][dataflow] Tighten checking for existence of a function body. (#78163)
In various places, we would previously call `FunctionDecl::hasBody()`
(which
checks whether any redeclaration of the function has a body, not
necessarily the
one on which `hasBody()` is being called).

This is bug-prone, as a recent bug in Crubit's nullability checker has
shown

([fix](4b01ed0f14),
[fix for the
fix](e0c5d8ddd7)).

Instead, we now use `FunctionDecl::doesThisDeclarationHaveABody()`
which, as the
name implies, checks whether the specific redeclaration it is being
called on
has a body.

Alternatively, I considered being more lenient and "canonicalizing" to
the
`FunctionDecl` that has the body if the `FunctionDecl` being passed is a
different redeclaration. However, this also risks hiding bugs: A caller
might
inadvertently perform the analysis for all redeclarations of a function
and end
up duplicating work without realizing it. By accepting only the
redeclaration
that contains the body, we prevent this.

I've checked, and all clients that I'm aware of do currently pass in the
redeclaration that contains the function body. Typically this is because
they
use the `ast_matchers::hasBody()` matcher which, unlike
`FunctionDecl::hasBody()`, only matches for the redeclaration containing
the
body.
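
The difference between the two checks can be sketched with a toy model that does not depend on Clang. `ToyFunctionDecl` and its members are hypothetical; only the two method names are borrowed from `FunctionDecl`, and their behavior here follows the description in this commit message rather than Clang's implementation.

```cxx
#include <vector>

// Hypothetical toy model of one redeclaration of a function.
struct ToyFunctionDecl {
  bool HasOwnBody;
  // All redeclarations of the same function, including this one.
  const std::vector<const ToyFunctionDecl *> *Redecls;

  // Lenient check: true if any redeclaration in the chain has a body.
  bool hasBody() const {
    for (const ToyFunctionDecl *D : *Redecls)
      if (D->HasOwnBody)
        return true;
    return false;
  }

  // Strict check: true only if this specific redeclaration has the body.
  bool doesThisDeclarationHaveABody() const { return HasOwnBody; }
};
```

For a prototype followed by a definition, the lenient check succeeds on the prototype even though there is nothing to analyze there, which is the kind of mismatch the stricter check rules out.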
2024-01-16 12:52:55 +01:00
martinboehme
2ee396b0b1
[clang][dataflow] Add Environment::get<>(). (#76027)
This template function casts the result of `getValue()` or
`getStorageLocation()` to a given subclass of `Value` or
`StorageLocation` (using `cast_or_null`).

It's a common pattern to do something like this:

```cxx
auto *Val = cast_or_null<PointerValue>(Env.getValue(E));
```

This can now be expressed more concisely like this:

```cxx
auto *Val = Env.get<PointerValue>(E);
```

Instead of adding a new method `get()`, I had originally considered
simply adding a template parameter to `getValue()` and
`getStorageLocation()` (with a default argument of `Value` or
`StorageLocation`), but this results in an undesirable repetition at the
callsite, e.g. `getStorageLocation<RecordStorageLocation>(...)`. The
`Value` and `StorageLocation` in the method name adds nothing of value
when the template argument already contains this information, so it
seemed best to shorten the method name to simply `get()`.
2023-12-21 09:02:20 +01:00
martinboehme
ca1034341c
[clang][dataflow] Fix an issue with Environment::getResultObjectLocation(). (#75483)
So far, if there was a chain of record type prvalues,
`getResultObjectLocation()` would assign a different result object
location to
each one. This makes no sense, of course, as all of these prvalues end
up
initializing the same result object.

This patch fixes this by propagating storage locations up through the
entire
chain of prvalues.

The new implementation also has the desirable effect of making it
possible to
make `getResultObjectLocation()` const, which seems appropriate given
that,
logically, it is just an accessor.
2023-12-18 09:10:03 +01:00
Samira Bazuzi
40381d1264
[clang][dataflow] Re-land: Retrieve members from accessors called usi… (#74336)
…ng member pointers.

This initially landed with a broken test due to a mid-air collision with
a new requirement for Environment initialization before field modeling.
Have added that initialization in the test.

From first landing:

getMethodDecl does not handle pointers to members and returns nullptr
for them. getMethodDecl contains a decade-plus-old FIXME to handle
pointers to members, but two approaches I looked at for fixing it are
more invasive or complex than simply swapping to getCalleeDecl.

The first, have getMethodDecl call getCalleeDecl, creates a large tree
of const-ness mismatches due to getMethodDecl returning a non-const
value while being a const member function and getCalleeDecl only being a
const member function when it returns a const value.

The second, implementing an AST walk to match how
CXXMemberCallExpr::getImplicitObjectArgument grabs the LHS of the binary
operator, is basically reimplementing Expr::getReferencedDeclOfCallee,
which is used by Expr::getCalleeDecl. We don't need another copy of that
code.
2023-12-05 12:09:33 +01:00
martinboehme
3b6d63c519
Revert "[clang][dataflow] Retrieve members from accessors called using member…" (#74299)
Reverts llvm/llvm-project#73978
2023-12-04 11:27:31 +01:00
Samira Bazuzi
a3fe9cb24d
[clang][dataflow] Retrieve members from accessors called using member… (#73978)
… pointers.

getMethodDecl does not handle pointers to members and returns nullptr
for them. getMethodDecl contains a decade-plus-old FIXME to handle
pointers to members, but two approaches I looked at for fixing it are
more invasive or complex than simply swapping to getCalleeDecl.

The first, have getMethodDecl call getCalleeDecl, creates a large tree
of const-ness mismatches due to getMethodDecl returning a non-const
value while being a const member function and getCalleeDecl only being a
const member function when it returns a const value.

The second, implementing an AST walk to match how
CXXMemberCallExpr::getImplicitObjectArgument grabs the LHS of the binary
operator, is basically reimplementing Expr::getReferencedDeclOfCallee,
which is used by Expr::getCalleeDecl. We don't need another copy of that
code.
2023-12-04 10:10:07 +01:00
martinboehme
71f2ec2db1
[clang][dataflow] Add synthetic fields to RecordStorageLocation (#73860)
Synthetic fields are intended to model the internal state of a class
(e.g. the value stored in a `std::optional`) without having to depend on
that class's implementation details.

Today, this is typically done with properties on `RecordValue`s, but
these have several drawbacks:

* Care must be taken to call `refreshRecordValue()` before modifying a
property so that the modified property values aren’t seen by other
environments that may have access to the same `RecordValue`.

* Properties aren’t associated with a storage location. If an analysis
needs to associate a location with the value stored in a property (e.g.
to model the reference returned by `std::optional::value()`), it needs
to manually add an indirection using a `PointerValue`. (See for example
the way this is done in UncheckedOptionalAccessModel.cpp, specifically
in `maybeInitializeOptionalValueMember()`.)

* Properties don’t participate in the builtin compare, join, and widen
operations. If an analysis needs to apply these operations to
properties, it needs to override the corresponding methods of
`ValueModel`.

* Longer-term, we plan to eliminate `RecordValue`, as by-value
operations on records aren’t really “a thing” in C++ (see
https://discourse.llvm.org/t/70086#changed-structvalue-api-14). This
would obviously eliminate the ability to set properties on
`RecordValue`s.

To demonstrate the advantages of synthetic fields, this patch converts
UncheckedOptionalAccessModel.cpp to synthetic fields. This greatly
simplifies the implementation of the check.

This PR is pretty big; to make it easier to review, I have broken it
down into a stack of three commits, each of which contains a set of
logically related changes. I considered submitting each of these as a
separate PR, but the commits only really make sense when taken together.

To review, I suggest first looking at the changes in
UncheckedOptionalAccessModel.cpp. This gives a flavor for how the
various API changes work together in the context of an analysis. Then,
review the rest of the changes.
2023-12-04 09:29:22 +01:00
martinboehme
5bd643e145
[clang][dataflow] Strengthen widening of boolean values. (#73484)
Before we widen to top, we now check if both values can be proved either
true or
false in their respective environments; if so, widening returns a true
or false
literal. The idea is that we avoid losing information if possible.

This patch includes a test that fails without this change to widening.

This change does mean that we call the SAT solver in more places, but
this seems
acceptable given the additional precision we gain.

In tests on an internal codebase, the number of SAT solver timeouts we
observe
with Crubit's nullability checker does increase by about 25%. They can
be
brought back to the previous level by doubling the SAT solver work
limit.
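
The rule described in the first paragraph can be sketched as follows. This is a simplified reading of the commit message, not the framework's code: `ToyProof`, `ToyBool`, and `widenToyBool` are hypothetical names, and the SAT solver queries are replaced by precomputed results passed in as arguments.

```cxx
// What an environment can prove about a boolean value (hypothetical model;
// in the real framework this would come from a solver query).
enum class ToyProof { ProvedFalse, ProvedTrue, Unknown };

// Result of widening in this toy model.
enum class ToyBool { FalseLiteral, TrueLiteral, Top };

// Widen to top only if the two environments do not agree on a proven
// truth value; otherwise keep that value as a literal.
ToyBool widenToyBool(ToyProof Prev, ToyProof Cur) {
  if (Prev == ToyProof::ProvedTrue && Cur == ToyProof::ProvedTrue)
    return ToyBool::TrueLiteral;
  if (Prev == ToyProof::ProvedFalse && Cur == ToyProof::ProvedFalse)
    return ToyBool::FalseLiteral;
  return ToyBool::Top;
}
```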
2023-11-27 14:55:49 +01:00
martinboehme
c4c59192e6
[clang][dataflow] Clear ExprToLoc and ExprToVal at the start of a block. (#72985)
We never need to access entries from these maps outside of the current
basic
block. This could only ever become a consideration when flow control
happens
inside a full-expression (i.e. we have multiple basic blocks for a full
expression); there are two kinds of expression where this can happen,
but we
already deal with these in other ways:

* Short-circuiting logical operators (`&&` and `||`) have operands that
live in
different basic blocks than the operator itself, but we already have
code in
the framework to retrieve the value of these operands from the
environment
for the block they are computed in, rather than in the environment of
the
   block containing the operator.

* The conditional operator similarly has operands that live in different
basic
blocks. However, we currently don't implement a transfer function for
the
conditional operator. When we do this, we need to retrieve the values of
the
operands from the environments of the basic blocks they live in, as we
already do for logical operators. This patch adds a comment to this
effect
   to the code.

Clearing out `ExprToLoc` and `ExprToVal` has two benefits:

* We avoid performing joins on boolean expressions contained in
`ExprToVal` and
hence extending the flow condition in cases where this is not needed.
Simpler
flow conditions should reduce the amount of work we do in the SAT
solver.

* Debugging becomes easier when flow conditions are simpler and
`ExprToLoc` /
  `ExprToVal` don’t contain any extraneous entries.

Benchmark results on Crubit's `pointer_nullability_analysis_benchmark`
show a
slight runtime increase for simple benchmarks, offset by substantial
runtime
reductions for more complex benchmarks:

```
name                              old cpu/op   new cpu/op   delta
BM_PointerAnalysisCopyPointer     29.8µs ± 1%  29.9µs ± 4%     ~     (p=0.879 n=46+49)
BM_PointerAnalysisIntLoop          101µs ± 3%   104µs ± 4%   +2.96%  (p=0.000 n=55+57)
BM_PointerAnalysisPointerLoop      378µs ± 3%   245µs ± 3%  -35.09%  (p=0.000 n=47+55)
BM_PointerAnalysisBranch           118µs ± 2%   122µs ± 3%   +3.37%  (p=0.000 n=59+59)
BM_PointerAnalysisLoopAndBranch    779µs ± 3%   413µs ± 5%  -47.01%  (p=0.000 n=56+45)
BM_PointerAnalysisTwoLoops         187µs ± 3%   192µs ± 5%   +2.80%  (p=0.000 n=57+58)
BM_PointerAnalysisJoinFilePath    17.4ms ± 3%   7.2ms ± 3%  -58.75%  (p=0.000 n=58+57)
BM_PointerAnalysisCallInLoop      14.7ms ± 4%  10.3ms ± 2%  -29.87%  (p=0.000 n=56+58)
```
2023-11-22 16:34:24 +01:00
martinboehme
a0700532dd
[clang][dataflow] Replace one remaining call to deprecated addToFlowCondition(). (#71547) 2023-11-08 05:32:04 +01:00
martinboehme
7c636728c0
[clang][dataflow] Simplify flow conditions displayed in HTMLLogger. (#70848)
This can make the flow condition significantly easier to interpret; see
below
for an example.

I had hoped that adding the simplification as a preprocessing step
before the
SAT solver (in `DataflowAnalysisContext::querySolver()`) would also
speed up SAT
solving and maybe even eliminate SAT solver timeouts, but in my testing,
this
actually turns out to be a pessimization. It appears that these
simplifications
are easy enough for the SAT solver to perform itself.

Nevertheless, the improvement in debugging alone makes this a worthwhile
change.

Example of flow condition output with these changes:

```
Flow condition token: V37
Constraints:
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = ((V12 & ((V14 = V9) | (V14 = V4))) & (V13 = V14)))
True atoms: (V0, V1, V2, V5, V6, V7, V29, V30, V32, V34, V35, V37)
False atoms: (V3, V8, V17)
Equivalent atoms:
(V11, V15)

Flow condition constraints before simplification:
V37
((!V3 & !V8) & !V17)
(V37 = V34)
(V34 = (V29 & (V35 = V30)))
(V29 = (((V16 | V2) & V32) & (V30 = V32)))
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = V11)
(V11 = ((((V7 | V2) & V12) & ((V7 & (V14 = V9)) | (V2 & (V14 = V4)))) & (V13 = V14)))
(V2 = V1)
(V1 = V0)
V0
(V7 = V6)
(V6 = V5)
(V5 = V2)
```
2023-11-07 15:18:34 +01:00
martinboehme
d1f59544cf
[clang][dataflow] Add Environment::allows(). (#70046)
This allows querying whether, given the flow condition, a certain
formula still
has a solution (though it is not necessarily implied by the flow
condition, as
`flowConditionImplies()` would check).

This can be checked today, but only with a double negation, i.e. to
check
whether, given the flow condition, a formula F has a solution, you can
check
`!Env.flowConditionImplies(Arena.makeNot(F))`. The double negation makes
this
hard to reason about, and it would be nicer to have a way of directly
checking
this.

For consistency, this patch also renames `flowConditionImplies()` to
`proves()`;
the old name is kept around for compatibility but deprecated.
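
To make the relationship between the two queries concrete, here is a small brute-force model that is independent of the framework. `ToyEnv` and `ToyFormula` are hypothetical; the method names `allows` and `proves` are reused only to mirror the text, and an exhaustive search over assignments stands in for the SAT solver.

```cxx
#include <cstddef>
#include <functional>
#include <vector>

// A formula over boolean atoms, evaluated against one assignment.
using ToyFormula = std::function<bool(const std::vector<bool> &)>;

// Hypothetical toy environment: a flow condition over `NumAtoms` atoms.
struct ToyEnv {
  std::size_t NumAtoms;
  ToyFormula FlowCondition;

  // True if some assignment satisfies both the flow condition and `F`.
  bool allows(const ToyFormula &F) const {
    return existsAssignment([&](const std::vector<bool> &A) {
      return FlowCondition(A) && F(A);
    });
  }

  // True if every assignment satisfying the flow condition satisfies `F`.
  bool proves(const ToyFormula &F) const {
    return !existsAssignment([&](const std::vector<bool> &A) {
      return FlowCondition(A) && !F(A);
    });
  }

private:
  // Brute-force search over all 2^NumAtoms assignments (small inputs only).
  bool existsAssignment(const ToyFormula &Pred) const {
    for (std::size_t Bits = 0; Bits < (std::size_t{1} << NumAtoms); ++Bits) {
      std::vector<bool> A(NumAtoms);
      for (std::size_t I = 0; I < NumAtoms; ++I)
        A[I] = ((Bits >> I) & 1) != 0;
      if (Pred(A))
        return true;
    }
    return false;
  }
};
```

In this model, `allows(F)` always equals `!proves(not F)`, which is the double negation the new method is meant to hide from callers.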
2023-10-25 16:02:22 +02:00
martinboehme
14b039c1dd
[clang][dataflow] Remove declToLocConsistent() assertion. (#69819)
As described [here](https://discourse.llvm.org/t/70086/6), there are
legitimate
non-bug scenarios where two `DeclToLoc` maps to be joined contain
different
storage locations for the same declaration. This patch also adds a test
containing an example of such a situation. (The test fails without the
other
changes in this patch.)

With the assertion removed, the existing logic in `intersectDenseMaps()`
will
remove the corresponding declaration from the joined DeclToLoc map.

We also remove `removeDecl()`'s precondition (that the declaration must
be
associated with a storage location) because this may no longer hold if
the
declaration was previously removed during a join, as described above.
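
The effect of the join on conflicting entries can be sketched with ordinary standard-library maps. `ToyDeclToLoc` and `intersectToyMaps` are hypothetical names, with strings standing in for declarations and integers for storage locations; the sketch assumes, based on the description above, that an entry survives only when both maps agree on it.

```cxx
#include <map>
#include <string>

// Toy stand-in for a declaration-to-location map.
using ToyDeclToLoc = std::map<std::string, int>;

// Keeps only the entries that are present in both maps with the same
// location; a declaration mapped to two different locations is dropped.
ToyDeclToLoc intersectToyMaps(const ToyDeclToLoc &A, const ToyDeclToLoc &B) {
  ToyDeclToLoc Result;
  for (const auto &Entry : A) {
    auto It = B.find(Entry.first);
    if (It != B.end() && It->second == Entry.second)
      Result.insert(Entry);
  }
  return Result;
}
```

After such a join, a later attempt to remove the dropped declaration finds nothing to remove, which is why a "must be present" precondition on removal no longer holds.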
2023-10-24 08:42:30 +02:00
Stanislav Gatev
52d0696355
[clang][dataflow] Add support for lambda captures (#68558)
This adds support for copy, ref, and this lambda captures to the core
framework and also adds relevant tests in UncheckedOptionalAccessTest.
2023-10-11 22:18:46 +02:00
Corentin Jabot
af4751738d [C++] Implement "Deducing this" (P0847R7)
This patch implements P0847R7 (partially),
CWG2561 and CWG2653.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D140828
2023-10-02 14:33:02 +02:00
martinboehme
834cb919b3
[clang][dataflow] Remove declarations from DeclToLoc when their lifetime ends. (#67300)
After https://reviews.llvm.org/D153273, we're now able to use
`CFGLifetimeEnds`
together with the other CFG options we use.
2023-09-26 08:41:09 +02:00
martinboehme
1d7b59ca8d
[clang][dataflow] Fix two null pointer dereferences in getMemberForAccessor(). (#66742)
The additions to the test trigger crashes without the fixes.
2023-09-19 09:03:20 +02:00
Kinuko Yasuda
03be486ecc
[clang][dataflow] Model the fields that are accessed via inline accessors (#66368)
So that the values that are accessed via such accessors can be analyzed
as a limited version of context-sensitive analysis. We can potentially
do this only when some option is set, but doing additional modeling like
this won't be expensive and intrusive, so we do it by default for now.
2023-09-18 10:46:36 +02:00
martinboehme
7f66cc7d7a
[clang][dataflow] Merge RecordValues with different locations correctly. (#65319)
Now that prvalue expressions map directly to values (see
https://reviews.llvm.org/D158977), it's no longer guaranteed that
`RecordValue`s
associated with the same expression will always have the same storage
location.

In other words, D158977 invalidated the assertion in
`mergeDistinctValues()`.
The newly added test causes this assertion to fail without the other
changes in
the patch.

This patch fixes the issue. However, the real fix will be to eliminate
the
`StorageLocation` from `RecordValue` entirely.
2023-09-12 08:43:29 +02:00
Yitzhak Mandelbaum
80f0dc3aa4 [clang][dataflow] Unsoundly treat "Unknown" as "Equivalent" in widening.
This change makes widening act the same as equivalence checking. When the
analysis does not provide an answer regarding the equivalence of two distinct
values, the framework treats them as equivalent. This is an unsound choice that
enables convergence.

Differential Revision: https://reviews.llvm.org/D159355
2023-09-07 19:06:35 +00:00
Martin Braenne
266c12a1bd [clang][dataflow] When dumping ExprToVal, dump the Value, not just its location.
This makes `ExprToVal` dumping consistent with `LocToVal` dumping.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D159274
2023-09-04 07:38:33 +00:00
Martin Braenne
330d5bcbf6 [clang][dataflow] Don't associate prvalue expressions with storage locations.
Instead, map prvalue expressions directly to values in a newly introduced map `Environment::ExprToVal`.

This change introduces an additional member variable in `Environment` but is an overall win:

- It is more conceptually correct, since prvalues don't have storage
  locations.

- It eliminates complexity from
  `Environment::setValue(const Expr &E, Value &Val)`.

- It reduces the amount of data stored in `Environment`: A prvalue now has a
  single entry in `ExprToVal` instead of one in `ExprToLoc` and one in
  `LocToVal`.

- Not allocating `StorageLocation`s for prvalues additionally reduces memory
  usage.

This patch is the last step in the migration to strict handling of value categories (see https://discourse.llvm.org/t/70086 for details). The changes here are almost entirely internal to `Environment`.

The only externally observable change is that when associating a `RecordValue` with the location returned by `Environment::getResultObjectLocation()` for a given expression, callers additionally need to associate the `RecordValue` with the expression themselves.

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D158977
2023-08-29 07:28:46 +00:00
Martin Braenne
9ecdbe3855 [clang][dataflow] Rename AggregateStorageLocation to RecordStorageLocation and StructValue to RecordValue.
- Both of these constructs are used to represent structs, classes, and unions;
  Clang uses the collective term "record" for these.

- The term "aggregate" in `AggregateStorageLocation` implies that, at some
  point, the intention may have been to use it also for arrays, but I don't
  think it's possible to use it for arrays. Records and arrays are very
  different and therefore need to be modeled differently. Records have a fixed
  set of named fields, which can have different types; arrays have a variable
  number of elements, but they all have the same type.

- Furthermore, "aggregate" has a very specific meaning in C++
  (https://en.cppreference.com/w/cpp/language/aggregate_initialization).
  Aggregates of class type may not have any user-declared or inherited
  constructors, no private or protected non-static data members, no virtual
  member functions, and so on, but we use `AggregateStorageLocation`s to model all objects of class type.

In addition, for consistency, we also rename the following:

- `getAggregateLoc()` (in `RecordValue`, formerly known as `StructValue`) to
  simply `getLoc()`.

- `refreshStructValue()` to `refreshRecordValue()`

We keep the old names around as deprecated synonyms to enable clients to be migrated to the new names.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D156788
2023-08-01 20:29:40 +00:00
Martin Braenne
b244b6ae0b [clang][dataflow] Remove Strict suffix from accessors.
For the time being, we're keeping the `Strict` versions around as deprecated synonyms so that clients can be migrated, but these synonyms will be removed soon.

Depends On D156673

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D156674
2023-07-31 19:40:09 +00:00
Martin Braenne
17ba278f76 [clang][dataflow] Remove deprecated accessors as well as SkipPast.
Depends On D156672

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D156673
2023-07-31 19:40:06 +00:00
Martin Braenne
f76f6674d8 [clang][dataflow] Use Strict accessors where we weren't using them yet.
This eliminates all uses of the deprecated accessors.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D156672
2023-07-31 19:40:04 +00:00