59 Commits

Author SHA1 Message Date
Dmitri Gribenko
941959d69d [clang][dataflow] Use llvm::is_contained()
Reviewed By: samestep, xazax.hun

Differential Revision: https://reviews.llvm.org/D131975
2022-08-16 19:59:21 +02:00
Sam Estep
2efc8f8d65 [clang][dataflow] Add an option for context-sensitive depth
This patch adds a `Depth` field (default value 2) to `ContextSensitiveOptions`, allowing context-sensitive analysis of functions that call other functions. This also requires replacing the `DeclCtx` field on `Environment` with a `CallString` field that contains a vector of decl contexts, to ensure that the analysis doesn't try to analyze recursive or mutually recursive calls (which would result in a crash, due to the way we handle `StorageLocation`s).

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D131809
2022-08-15 19:58:40 +00:00
Sam Estep
d09d4bd66c [clang][dataflow] Don't crash when caller args are missing storage locations
This patch modifies `Environment`'s `pushCall` method to pass over arguments that are missing storage locations, instead of crashing.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D131600
2022-08-11 13:00:42 +00:00
Sam Estep
eb91fd5cbc [clang][dataflow] Analyze constructor bodies
This patch adds the ability to context-sensitively analyze constructor bodies, by changing `pushCall` to allow both `CallExpr` and `CXXConstructExpr`, and extracting the main context-sensitive logic out of `VisitCallExpr` into a new `transferInlineCall` method which is now also called at the end of `VisitCXXConstructExpr`.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131438
2022-08-11 12:46:20 +00:00
Wei Yi Tee
2cb51449f0 [clang][dataflow] Store DeclContext of block being analysed in Environment if available.
Differential Revision: https://reviews.llvm.org/D131065
2022-08-11 07:36:57 +00:00
Evgenii Stepanov
8d3c960295 Revert "[clang][dataflow] Store DeclContext of block being analysed in Environment if available."
Use of uninitialized memory.
https://lab.llvm.org/buildbot/#/builders/74/builds/12713

This reverts commit 8a4c40bfe8e6605ffc9d866f8620618dfdde2875.
2022-08-10 14:22:04 -07:00
Evgenii Stepanov
7587065043 Revert "[clang][dataflow] Analyze constructor bodies"
https://lab.llvm.org/buildbot/#/builders/74/builds/12713

This reverts commit 000c8fef86abb7f056cbea2de99f21dca4b81bf8.
2022-08-10 14:21:56 -07:00
Evgenii Stepanov
26089d4da4 Revert "[clang][dataflow] Don't crash when caller args are missing storage locations"
https://lab.llvm.org/buildbot/#/builders/74/builds/12713

This reverts commit 43b298ea1282f29d448fc0f6ca971bc5fa698355.
2022-08-10 14:21:46 -07:00
Sam Estep
43b298ea12 [clang][dataflow] Don't crash when caller args are missing storage locations
This patch modifies `Environment`'s `pushCall` method to pass over arguments that are missing storage locations, instead of crashing.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D131600
2022-08-10 17:50:34 +00:00
Sam Estep
000c8fef86 [clang][dataflow] Analyze constructor bodies
This patch adds the ability to context-sensitively analyze constructor bodies, by changing `pushCall` to allow both `CallExpr` and `CXXConstructExpr`, and extracting the main context-sensitive logic out of `VisitCallExpr` into a new `transferInlineCall` method which is now also called at the end of `VisitCXXConstructExpr`.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131438
2022-08-10 14:01:45 +00:00
Wei Yi Tee
8a4c40bfe8 [clang][dataflow] Store DeclContext of block being analysed in Environment if available.
Differential Revision: https://reviews.llvm.org/D131065
2022-08-10 11:27:03 +00:00
Sam Estep
8611a77ee7 [clang][dataflow] Analyze method bodies
This patch adds the ability to context-sensitively analyze method bodies, by moving `ThisPointeeLoc` from `DataflowAnalysisContext` to `Environment`, and adding code in `pushCall` to set it.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131170
2022-08-04 17:45:47 +00:00
Sam Estep
0eaecbbc23 [clang][dataflow] Handle return statements
This patch adds a `ReturnLoc` field to the `Environment`, serving a similar to the `ThisPointeeLoc` field in the `DataflowAnalysisContext`. It then uses that (along with a new `VisitReturnStmt` method in `TransferVisitor`) to handle non-`void`-returning functions in context-sensitive analysis.

Reviewed By: ymandel, sgatev

Differential Revision: https://reviews.llvm.org/D130600
2022-08-04 17:42:19 +00:00
Stanislav Gatev
817dd5e3fd [clang][dataflow] Rename member to make it clear that it isn't stable
Rename `DataflowAnalysisContext::getStableStorageLocation(QualType)`
to `createStorageLocation`, to make it clear that it doesn't return
a stable storage location.

Differential Revision: https://reviews.llvm.org/D131021

Reviewed-by: ymandel, xazax.hun, gribozavr2
2022-08-03 06:25:02 +00:00
Sam Estep
a6ddc68487 [clang][dataflow] Handle multiple context-sensitive calls to the same function
This patch enables context-sensitive analysis of multiple different calls to the same function (see the `ContextSensitiveSetBothTrueAndFalse` example in the `TransferTest` suite) by replacing the `Environment` copy-assignment with a call to the new `popCall` method, which  `std::move`s some fields but specifically does not move `DeclToLoc` and `ExprToLoc` from the callee back to the caller.

To enable this, the `StorageLocation` for a given parameter needs to be stable across different calls to the same function, so this patch also improves the modeling of parameter initialization, using `ReferenceValue` when necessary (for arguments passed by reference).

This approach explicitly does not work for recursive calls, because we currently only plan to use this context-sensitive machinery to support specialized analysis models we write, not analysis of arbitrary callees.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130726
2022-07-29 19:40:19 +00:00
Weverything
1f8ae9d7e7 Inline function calls.
Fix unused variable in non-assert builds after
300fbf56f89aebbe2ef9ed490066bab23e5356d1
2022-07-26 21:12:28 -07:00
Sam Estep
300fbf56f8 [clang][dataflow] Analyze calls to in-TU functions
This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `FIXME` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306
2022-07-26 17:54:27 +00:00
Sam Estep
cc9aa157a8 Revert "[clang][dataflow] Analyze calls to in-TU functions"
This reverts commit fa2b83d07ecab3b24b4c5ee2e7dc4b6bbc895317.
2022-07-26 17:30:09 +00:00
Sam Estep
fa2b83d07e [clang][dataflow] Analyze calls to in-TU functions
Depends On D130305

This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) first calls `initGlobalVars`, then maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `TODO` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306
2022-07-26 17:27:19 +00:00
Dmitri Gribenko
c0c9d717df [clang][dataflow] Rename iterators from IT to It
The latter way to abbreviate is a lot more common in the LLVM codebase.

Reviewed By: sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D130423
2022-07-25 20:28:47 +02:00
Dmitri Gribenko
b5414b566a [clang][dataflow] Add DataflowEnvironment::dump()
Start by dumping the flow condition.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D130398
2022-07-23 01:31:53 +02:00
Wei Yi Tee
b611376e7e [clang][dataflow] Singleton pointer values for null pointers.
When a `nullptr` is assigned to a pointer variable, it is wrapped in a `ImplicitCastExpr` with cast kind `CK_NullTo(Member)Pointer`. This patch assigns singleton pointer values representing null to these expressions.

For each pointee type, a singleton null `PointerValue` is created and stored in the `NullPointerVals` map of the `DataflowAnalysisContext` class. The pointee type is retrieved from the implicit cast expression, and used to initialise the `PointeeLoc` field of the `PointerValue`. The `PointeeLoc` created is not mapped to any `Value`, reflecting the absence of value indicated by null pointers.

Reviewed By: gribozavr2, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D128056
2022-06-27 14:17:34 +02:00
Wei Yi Tee
12c7352fa4 [clang][dataflow] Move logic for createStorageLocation from DataflowEnvironment to DataflowAnalysisContext.
`createStorageLocation` in `DataflowEnvironment` is now a trivial wrapper around the logic in `DataflowAnalysisContext`.
Additionally, `getObjectFields` and `getFieldsFromClassHierarchy` (required for the implementation of `createStorageLocation`) are also moved to `DataflowAnalysisContext`.

Reviewed By: gribozavr2, sgatev

Differential Revision: https://reviews.llvm.org/D128359
2022-06-27 11:16:51 +02:00
Stanislav Gatev
e363c5963d [clang][dataflow] Extend flow condition in the body of a do/while loop
Extend flow condition in the body of a do/while loop.

Differential Revision: https://reviews.llvm.org/D128183

Reviewed-by: gribozavr2, xazax.hun
2022-06-20 17:31:00 +00:00
Kazu Hirata
80c12bdb3b [clang] Call *set::insert without checking membership first (NFC) 2022-06-18 10:41:26 -07:00
Wei Yi Tee
97d69cdaf3 [clang][dataflow] Rename getPointeeLoc to getReferentLoc for ReferenceValue.
We distinguish between the referent location for `ReferenceValue` and pointee location for `PointerValue`. The former must be non-empty but the latter may be empty in the case of a `nullptr`

Reviewed By: gribozavr2, sgatev

Differential Revision: https://reviews.llvm.org/D127745
2022-06-15 00:53:30 +02:00
Wei Yi Tee
a1b2b7d979 [clang][dataflow] Remove IndirectionValue class, moving PointeeLoc field into PointerValue and ReferenceValue
This patch precedes a future patch to make PointeeLoc for PointerValue possibly empty (for nullptr), by using a pointer instead of a reference type.
ReferenceValue should maintain a non-empty PointeeLoc reference.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D127312
2022-06-09 01:29:16 +02:00
Yitzhak Mandelbaum
67136d0e8f [clang][dataflow] Remove private-field filtering from StorageLocation creation.
The API for `AggregateStorageLocation` does not allow for missing fields (it asserts). Therefore, it is incorrect to filter out any fields at location-creation time which may be accessed by the code. Currently, we limit filtering to private, base-calss fields on the assumption that those can never be accessed. However, `friend` declarations can invalidate that assumption, thereby breaking our invariants.

This patch removes said field filtering to avoid violating the invariant of "no missing fields" for `AggregateStorageLocation`.

Differential Revision: https://reviews.llvm.org/D126420
2022-05-27 13:39:53 +00:00
Eric Li
5520c58390 [clang][dataflow] Fix incorrect CXXThisExpr pointee for lambdas
When constructing the `Environment`, the `this` pointee is established
for a `CXXMethodDecl` by looking at its parent. However, inside of
lambdas, a `CXXThisExpr` refers to the captured `this` coming from the
enclosing member function.

When establishing the `this` pointee for a function, we check whether
the function is a lambda, and check for an enclosing member function
to establish the `this` pointee storage location.

Differential Revision: https://reviews.llvm.org/D126413
2022-05-25 20:58:02 +00:00
Yitzhak Mandelbaum
2f93bbb9cd [clang][dataflow] Relax Environment comparison operation.
Ignore `MemberLocToStruct` in environment comparison. As an ancillary data
structure, including it is redundant. We also can generate environments which
differ in their `MemberLocToStruct` but are otherwise equivalent.

Differential Revision: https://reviews.llvm.org/D126314
2022-05-24 20:58:18 +00:00
Eric Li
45643cfcc1 [clang][dataflow] Centralize expression skipping logic
A follow-up to 62b2a47 to centralize the logic that skips expressions
that the CFG does not emit. This allows client code to avoid
sprinkling this logic everywhere.

Add redirects in the transfer function to similarly skip such
expressions by forwarding the visit to the sub-expression.

Differential Revision: https://reviews.llvm.org/D124965
2022-05-05 20:28:11 +00:00
Eric Li
62b2a47a9f [clang][dataflow] Only skip ExprWithCleanups when visiting terminators
`IgnoreParenImpCasts` will remove implicit casts to bool
(e.g. `PointerToBoolean`), such that the resulting expression may not
be of the `bool` type. The `cast_or_null<BoolValue>` in
`extendFlowCondition` will then trigger an assert, as the pointer
expression will not have a `BoolValue`.

Instead, we only skip `ExprWithCleanups` and `ParenExpr` nodes, as the
CFG does not emit them.

Differential Revision: https://reviews.llvm.org/D124807
2022-05-04 15:31:49 +00:00
Stanislav Gatev
955a05a278 [clang][dataflow] Optimize flow condition representation
Enable efficient implementation of context-aware joining of distinct
boolean values. It can be used to join distinct boolean values while
preserving flow condition information.

Flow conditions are represented as Token <=> Clause iff formulas. To
perform context-aware joining, one can simply add the tokens of flow
conditions to the formula when joining distinct boolean values, e.g:
`makeOr(makeAnd(FC1, Val1), makeAnd(FC2, Val2))`. This significantly
simplifies the implementation of `Environment::join`.

This patch removes the `DataflowAnalysisContext::getSolver` method.
The `DataflowAnalysisContext::flowConditionImplies` method should be
used instead.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D124395
2022-05-01 16:25:29 +00:00
Yitzhak Mandelbaum
6c81b57237 [clang][dataflow] Perform structural comparison of indirection values in join.
This patch changes `Environment::join`, in the case that two values at the same
location are not (pointer) equal, to structurally compare indirection values
(pointers and references) for equivalence (that is, equivalent pointees) before
resorting to merging.

This change makes join consistent with equivalence, which also performs
structural comparison. It also fixes a bug where the values are `ReferenceValue`
but the merge creates a non-reference value. This case arises when the
`ReferenceValue`s were created to represent an lvalue, so the "reference-ness"
is not reflected in the type. In this case, the pointees will always be
equivalent, because lvalues at the same code location point to the location of a
fixed declaration, whose location is itself stable across blocks.

We were unable to reproduce a unit test for this latter bug, but have verified
the fix in the context of a larger piece of code that triggers the bug.

Differential Revision: https://reviews.llvm.org/D124540
2022-04-28 17:55:09 +00:00
Yitzhak Mandelbaum
37b4782e3e [clang][dataflow] Fix Environment::join's handling of flow-condition merging.
The current implementation mutates the environment as it performs the
join. However, that interferes with the call to the model's `merge` operation,
which can modify `MergedEnv`. Since any modifications are assumed to apply to
the post-join environment, providing the same environment for both is
incorrect. This mismatch is a particular concern for joining the flow
conditions, where modifications in the old environment may not be propagated to
the new one.

Differential Revision: https://reviews.llvm.org/D124104
2022-04-25 15:05:50 +00:00
Nathan James
cfb8169059
[clang] Add a raw_ostream operator<< overload for QualType
Under the hood this prints the same as `QualType::getAsString()` but cuts out the middle-man when that string is sent to another raw_ostream.

Also cleaned up all the call sites where this occurs.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D123926
2022-04-20 22:09:05 +01:00
Yitzhak Mandelbaum
bbcf11f5af [clang][dataflow] Weaken abstract comparison to enable loop termination.
Currently, when the framework is used with an analysis that does not override
`compareEquivalent`, it does not terminate for most loops. The root cause is the
interaction of (the default implementation of) environment comparison
(`compareEquivalent`) and the means by which locations and values are
allocated. Specifically, the creation of certain values (including: reference
and pointer values; merged values) results in allocations of fresh locations in
the environment. As a result, analysis of even trivial loop bodies produces
different (if isomorphic) environments, on identical inputs. At the same time,
the default analysis relies on strict equality (versus some relaxed notion of
equivalence). Together, when the analysis compares these isomorphic, yet
unequal, environments, to determine whether the successors of the given block
need to be (re)processed, the result is invariably "yes", thus preventing loop
analysis from reaching a fixed point.

There are many possible solutions to this problem, including equivalence that is
less than strict pointer equality (like structural equivalence) and/or the
introduction of an explicit widening operation. However, these solutions will
require care to be implemented correctly. While a high priority, it seems more
urgent that we fix the current default implentation to allow
termination. Therefore, this patch proposes, essentially, to change the default
comparison to trivally equate any two values. As a result, we can say precisely
that the analysis will process the loop exactly twice -- once to establish an
initial result state and the second to produce an updated result which will
(always) compare equal to the previous. While clearly unsound -- we are not
reaching a fix point of the transfer function, in practice, this level of
analysis will find many practical issues where a single iteration of the loop
impacts abstract program state.

Note, however, that the change to the default `merge` operation does not affect
soundness, because the framework already produces a fresh (sound) abstraction of
the value when the two values are distinct. The previous setting was overly
conservative.

Differential Revision: https://reviews.llvm.org/D123586
2022-04-13 19:49:50 +00:00
Yitzhak Mandelbaum
01db10365e [clang][dataflow] Add support for correlation of boolean (tracked) values
This patch extends the join logic for environments to explicitly handle
boolean values. It creates the disjunction of both source values, guarded by the
respective flow conditions from each input environment. This change allows the
framework to reason about boolean correlations across multiple branches (and
subsequent joins).

Differential Revision: https://reviews.llvm.org/D122838
2022-04-01 17:25:49 +00:00
Yitzhak Mandelbaum
36d4e84427 [clang][dataflow] Fix handling of base-class fields.
Currently, the framework does not track derived class access to base
fields. This patch adds that support and a corresponding test.

Differential Revision: https://reviews.llvm.org/D122273
2022-04-01 15:01:32 +00:00
Stanislav Gatev
b000b7705a [clang][dataflow] Model the behavior of non-standard optional assignment
Model nullopt, value, and conversion assignment operators.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D121863
2022-03-17 17:11:12 +00:00
Simon Pilgrim
a157d839c5 [clang] Environment::createValueUnlessSelfReferential - use castAs<> instead of getAs<> to avoid dereference of nullptr
The pointer is always dereferenced, so assert the cast is correct instead of returning nullptr
2022-03-09 11:40:37 +00:00
Yitzhak Mandelbaum
18c84e2d32 [clang][dataflow] Fix nullptr dereferencing error.
When pre-initializing fields in the environment, the code assumed that all
fields of a struct would be initialized. However, given limits on value
construction, that assumption is incorrect. This patch changes the code to drop
that assumption and thereby avoid dereferencing a nullptr.

Differential Revision: https://reviews.llvm.org/D121158
2022-03-08 03:01:31 +00:00
Stanislav Gatev
1e5715857a [clang][dataflow] Extend flow conditions from block terminators
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120984
2022-03-07 17:50:44 +00:00
Stanislav Gatev
ae60884dfe [clang][dataflow] Add flow condition constraints to Environment
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120711
2022-03-02 08:57:27 +00:00
Yitzhak Mandelbaum
208c25fcbf [clang][dataflow] Add limits to size of modeled data structures in environment.
Adds two new parameters to control the size of data structures modeled in the environment: # of values and depth of data structure.  The environment already prevents creation of recursive data structures, but that was insufficient in practice. Very large structs still ground the analysis to a halt.  These new parameters allow tuning the size more effectively.

In this patch, the parameters are set as internal constants. We leave to a future patch to make these proper model parameters.

Differential Revision: https://reviews.llvm.org/D120510
2022-02-24 20:51:59 +00:00
Stanislav Gatev
baa0f221d6 [clang][dataflow] Update StructValue child when assigning a value
When assigning a value to a storage location of a struct member we
need to also update the value in the corresponding `StructValue`.

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120414
2022-02-24 16:41:48 +00:00
Stanislav Gatev
03dff12197 Revert "Revert "[clang][dataflow] Add support for global storage values""
This reverts commit 169e1aba55bed9f7ffa000f9f170ab2defbc40b2.

It also fixes an incorrect assumption in `initGlobalVars`.
2022-02-23 13:57:34 +00:00
Stanislav Gatev
169e1aba55 Revert "[clang][dataflow] Add support for global storage values"
This reverts commit 7ea103de140b59a64fc884fa90afd2213619384d.
2022-02-23 10:32:17 +00:00
Stanislav Gatev
7ea103de14 [clang][dataflow] Add support for global storage values
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120149
2022-02-23 08:27:58 +00:00
Stanislav Gatev
6b8800dfb5 [clang][dataflow] Enable comparison of distinct values in Environment
Make specializations of `DataflowAnalysis` extendable with domain-specific
logic for comparing distinct values when comparing environments.

This includes a breaking change to the `runDataflowAnalysis` interface
as the return type is now `llvm::Expected<...>`.

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D118596
2022-02-01 15:25:59 +00:00