83 Commits

Author SHA1 Message Date
Martin Braenne
745a957f9d [clang][dataflow] Add create<T>() methods to Environment and DataflowAnalysisContext.
These methods provide a less verbose way of allocating `StorageLocation`s and
`Value`s than the existing `takeOwnership(make_unique(...))` pattern.

In addition, because allocation of `StorageLocation`s and `Value`s now happens
within the `DataflowAnalysisContext`, the `create<T>()` open up the possibility
of using `BumpPtrAllocator` to allocate these objects if it turns out this
helps performance.

Reviewed By: ymandel, xazax.hun, gribozavr2

Differential Revision: https://reviews.llvm.org/D147302
2023-04-04 07:13:44 +00:00
Martin Braenne
ce0ab9d11c [clang][dataflow][NFC] Share code between Environment ctor and pushCallInternal().
The deduplicated code is moved into initVars().

As an added bonus, pushCallInternal() now also gets the "Add all fields
mentioned in default member initializers" behavior, which apparently had been
added to the Environment ctor but not pushCallInternal().

Reviewed By: xazax.hun, ymandel

Differential Revision: https://reviews.llvm.org/D147326
2023-04-03 08:25:10 +00:00
Sam McCall
ee2cd606ab [dataflow] Log flow condition to the correct stream.
Differential Revision: https://reviews.llvm.org/D146527
2023-03-22 10:57:21 +01:00
Kazu Hirata
7eaa7b0553 [clang] Use *{Map,Set}::contains (NFC) 2023-03-15 18:06:34 -07:00
Yitzhak Mandelbaum
73c98831f6 [clang][dataflow] Fix missed fields in field set construction.
When building the set of referenced fields for the `DataflowAnalysisContext`,
include fields referenced only in default member initializers. These
initializers are visited in the CFGs of constructors and so the fields must be
included when analysing constructor bodies.

Differential Revision: https://reviews.llvm.org/D144987
2023-02-28 18:56:54 +00:00
Yitzhak Mandelbaum
daa316bcaf [clang][dataflow] Fix bug in joining bool values.
Currently, the code assumes that all boolean-typed values are an instance of
`BoolValue` (or its subclasses). Yet, lvalues violate this assumption. This
patch drops the assumption and strengthens the check to confirm the shape of
both values being joined.

The patch also notes as FIXMES a number of problems discovered fixing this bug.

Differential Revision: https://reviews.llvm.org/D141709
2023-01-19 15:59:06 +00:00
Yitzhak Mandelbaum
c441f65f91 [clang][dataflow] Add (initial) debug printing for Value and Environment.
Also adds uses of the new printing in analysis inner loop.

Differential Revision: https://reviews.llvm.org/D141716
2023-01-19 14:33:32 +00:00
Yitzhak Mandelbaum
3ce03c42db [clang][dataflow] Fix 2 bugs in MemberExpr interpretation.
There were two (small) bugs causing crashes in the analysis.  This patch fixes both of them.

1. An enum value was accessed as a class member. Now, the engine gracefully
ignores such member expressions.

2. Field access in `MemberExpr` of struct/class-typed global variables. Analysis
didn't interpret fields of global vars, because the vars were initialized before
the fields were added to the "allowlist". Now, the allowlist is set _before_
init of globals.

Differential Revision: https://reviews.llvm.org/D141384
2023-01-10 15:48:00 +00:00
Yitzhak Mandelbaum
089a54469f [clang][dataflow][NFC] Refine names and comments for field filtering.
Tweaks elements of the new API for filtering the set of modeled fields.

Differential Revision: https://reviews.llvm.org/D141319
2023-01-10 14:28:45 +00:00
Yitzhak Mandelbaum
01ccf7b3ce Revert "Revert "[clang][dataflow] Only model struct fields that are used in the function being analyzed.""
This reverts commit 2b1a517a92bfdfa3b692a660e19a2bb22513a567. It's a fix forward
with two memory errors fixed, one of which was the cause of the build breakage
in the buildbots.

Original message:

Previously, the model for structs modeled all fields in a struct when
`createValue` was called for that type. This patch adds a prepass on the
function under analysis to discover the fields referenced in the scope and then
limits modeling to only those fields. This reduces wasted memory usage
(modeling unused fields) which can be important for programs that use large
structs.

Note: This patch obviates the need for https://reviews.llvm.org/D123032.
2023-01-09 19:32:10 +00:00
serge-sans-paille
a3c248db87
Move from llvm::makeArrayRef to ArrayRef deduction guides - clang/ part
This is a follow-up to https://reviews.llvm.org/D140896, split into
several parts as it touches a lot of files.

Differential Revision: https://reviews.llvm.org/D141139
2023-01-09 12:15:24 +01:00
Yitzhak Mandelbaum
2b1a517a92 Revert "[clang][dataflow] Only model struct fields that are used in the function being analyzed."
This reverts commit 5e8f597c2fedc740b71f07dfdb1ef3c2d348b193. It caused msan and ubsan breakages.
2023-01-06 01:07:28 +00:00
Yitzhak Mandelbaum
5e8f597c2f [clang][dataflow] Only model struct fields that are used in the function being analyzed.
Previously, the model for structs modeled all fields in a struct when
`createValue` was called for that type. This patch adds a prepass on the
function under analysis to discover the fields referenced in the scope and then
limits modeling to only those fields.  This reduces wasted memory usage
(modeling unused fields) which can be important for programss that use large
structs.

Note: This patch obviates the need for https://reviews.llvm.org/D123032.

Differential Revision: https://reviews.llvm.org/D140694
2023-01-05 21:46:39 +00:00
Dani Ferreira Franco Moura
d862f66221 [clang][dataflow] Treat unions as structs.
This is a straightfoward way to handle unions in dataflow analysis. Without this change, nullability verification crashes on files that contain unions.

Reviewed By: gribozavr2, ymandel

Differential Revision: https://reviews.llvm.org/D140696
2023-01-03 18:36:24 +00:00
Jun Zhang
eda2eaabf2
[clang][dataflow] Fix crash when having boolean-to-integral casts.
Since now we just ignore all (implicit) integral casts, treating the
resulting value as the same as the underlying value, it could cause
inconsistency between values after `Join` if in some paths the type
doesn't strictly match. This could cause intermittent crashes.

std::optional<bool> o;
int x;
if (o.has_value()) {
  x = o.value();
}

Fixes: https://github.com/llvm/llvm-project/issues/59728

Signed-off-by: Jun Zhang <jun@junz.org>

Differential Revision: https://reviews.llvm.org/D140753
2022-12-30 13:14:44 +08:00
Yitzhak Mandelbaum
f3700bdb7f [clang][dataflow] Account for global variables in constructor initializers.
Previously, the analysis modeled global variables appearing in the _body_ of
any function (including constructors). But, that misses those appearing in
constructor _initializers_. This patch adds the initializers to the set of
expressions used to determine which globals to model.

Differential Revision: https://reviews.llvm.org/D140501
2022-12-22 14:20:50 +00:00
Yitzhak Mandelbaum
d2e4aaf6ac [clang][dataflow][NFC] Fix comments related to widening.
The comments describing the API for analysis `widen` and the environment `widen`
were overly strict in the preconditions they assumed for the operation. In
particular, both assumed that the previous value preceded the current value in
the relevant ordering. However, that's not generally how widen operators work
and widening itself can violate this property. That is, when the previous value
is the result of a widening, it can easily be "greater" than the current value.

This patch updates the comments to accurately reflect the expectations.

Differential Revision: https://reviews.llvm.org/D140308
2022-12-19 21:01:27 +00:00
Yitzhak Mandelbaum
a18cf8d14f [clang][dataflow] Remove stray lines from Environment::join
Removes an assertion and a useless line. The assertion seems left over from
earlier debugging and the line that follows is a stray line.

Differential Revision: https://reviews.llvm.org/D140306
2022-12-19 15:28:30 +00:00
Yitzhak Mandelbaum
84dd12b290 [clang][dataflow] Add widening API and implement it for built-in boolean model.
* Adds API support for widening of lattice elements and environments,
* Updates the algorithm to apply widening where appropriate,
* Implements widening for boolean values. In the process, moves the unsoundness
  of comparison from the default implementation of
  `Environment::ValueModel::compare` to model-specific handling inside
  `DataflowEnvironment::equivalentTo`. This change is intended to clarify
  the source and location of unsoundess.

This patch is a replacement for, and was based substantially on, https://reviews.llvm.org/D131645.

Differential Revision: https://reviews.llvm.org/D137948
2022-11-22 16:09:28 +00:00
Yitzhak Mandelbaum
c0725865b1 [clang][dataflow] Generalize custom comparison to return tri-value result.
Currently, the API for a model's custom value comparison returns a
boolean. Therefore, models cannot distinguish between situations where the
values are recognized by the model and different and those where the values are
just not recognized.  This patch changes the return value to a tri-valued enum,
allowing models to express "don't know".

This patch is essentially a NFC -- no practical differences result from this
change in this patch. But, it prepares for future patches (particularly,
upcoming patches for widening) which will take advantage of the new flexibility.

Differential Revision: https://reviews.llvm.org/D137334
2022-11-03 23:31:20 +00:00
Yitzhak Mandelbaum
8cadac41e9 [clang][dataflow] Add equivalence relation Value type.
Defines an equivalence relation on the `Value` type to standardize several
places in the code where we replicate the ~same equivalence comparison.

Differential Revision: https://reviews.llvm.org/D135964
2022-10-19 12:23:09 +00:00
Yitzhak Mandelbaum
39b9d4f188 [clang][dataflow] Add support for a Top value in boolean formulas.
Currently, our boolean formulas (`BoolValue`) don't form a lattice, since they
have no Top element. This patch adds such an element, thereby "completing" the
built-in model of bools to be a proper semi-lattice. It still has infinite
height, which is its own problem, but that can be solved separately, through
widening and the like.

Patch 1 for Issue #56931.

Differential Revision: https://reviews.llvm.org/D135397
2022-10-14 17:41:53 +00:00
Yitzhak Mandelbaum
0b12efc7a4 [clang][dataflow] Add support for nested method calls.
Extend the context-sensitive analysis to handle a call to a method (of the same
class) from within a method. That, is a member-call expression through `this`.

Differential Revision: https://reviews.llvm.org/D134432
2022-09-22 19:16:31 +00:00
Yitzhak Mandelbaum
abc16c7a5b [NFC] Remove a FIXME fixed by an earlier patch.
Commit 28bd7945eabdbde2b1fc071ab2f9b78e6e754a1a incidentally fixed the
associated FIXME, but didn't delete it.

Differential Revision: https://reviews.llvm.org/D133588
2022-09-09 17:13:52 +00:00
Dmitri Gribenko
941959d69d [clang][dataflow] Use llvm::is_contained()
Reviewed By: samestep, xazax.hun

Differential Revision: https://reviews.llvm.org/D131975
2022-08-16 19:59:21 +02:00
Sam Estep
2efc8f8d65 [clang][dataflow] Add an option for context-sensitive depth
This patch adds a `Depth` field (default value 2) to `ContextSensitiveOptions`, allowing context-sensitive analysis of functions that call other functions. This also requires replacing the `DeclCtx` field on `Environment` with a `CallString` field that contains a vector of decl contexts, to ensure that the analysis doesn't try to analyze recursive or mutually recursive calls (which would result in a crash, due to the way we handle `StorageLocation`s).

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D131809
2022-08-15 19:58:40 +00:00
Sam Estep
d09d4bd66c [clang][dataflow] Don't crash when caller args are missing storage locations
This patch modifies `Environment`'s `pushCall` method to pass over arguments that are missing storage locations, instead of crashing.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D131600
2022-08-11 13:00:42 +00:00
Sam Estep
eb91fd5cbc [clang][dataflow] Analyze constructor bodies
This patch adds the ability to context-sensitively analyze constructor bodies, by changing `pushCall` to allow both `CallExpr` and `CXXConstructExpr`, and extracting the main context-sensitive logic out of `VisitCallExpr` into a new `transferInlineCall` method which is now also called at the end of `VisitCXXConstructExpr`.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131438
2022-08-11 12:46:20 +00:00
Wei Yi Tee
2cb51449f0 [clang][dataflow] Store DeclContext of block being analysed in Environment if available.
Differential Revision: https://reviews.llvm.org/D131065
2022-08-11 07:36:57 +00:00
Evgenii Stepanov
8d3c960295 Revert "[clang][dataflow] Store DeclContext of block being analysed in Environment if available."
Use of uninitialized memory.
https://lab.llvm.org/buildbot/#/builders/74/builds/12713

This reverts commit 8a4c40bfe8e6605ffc9d866f8620618dfdde2875.
2022-08-10 14:22:04 -07:00
Evgenii Stepanov
7587065043 Revert "[clang][dataflow] Analyze constructor bodies"
https://lab.llvm.org/buildbot/#/builders/74/builds/12713

This reverts commit 000c8fef86abb7f056cbea2de99f21dca4b81bf8.
2022-08-10 14:21:56 -07:00
Evgenii Stepanov
26089d4da4 Revert "[clang][dataflow] Don't crash when caller args are missing storage locations"
https://lab.llvm.org/buildbot/#/builders/74/builds/12713

This reverts commit 43b298ea1282f29d448fc0f6ca971bc5fa698355.
2022-08-10 14:21:46 -07:00
Sam Estep
43b298ea12 [clang][dataflow] Don't crash when caller args are missing storage locations
This patch modifies `Environment`'s `pushCall` method to pass over arguments that are missing storage locations, instead of crashing.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D131600
2022-08-10 17:50:34 +00:00
Sam Estep
000c8fef86 [clang][dataflow] Analyze constructor bodies
This patch adds the ability to context-sensitively analyze constructor bodies, by changing `pushCall` to allow both `CallExpr` and `CXXConstructExpr`, and extracting the main context-sensitive logic out of `VisitCallExpr` into a new `transferInlineCall` method which is now also called at the end of `VisitCXXConstructExpr`.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131438
2022-08-10 14:01:45 +00:00
Wei Yi Tee
8a4c40bfe8 [clang][dataflow] Store DeclContext of block being analysed in Environment if available.
Differential Revision: https://reviews.llvm.org/D131065
2022-08-10 11:27:03 +00:00
Sam Estep
8611a77ee7 [clang][dataflow] Analyze method bodies
This patch adds the ability to context-sensitively analyze method bodies, by moving `ThisPointeeLoc` from `DataflowAnalysisContext` to `Environment`, and adding code in `pushCall` to set it.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131170
2022-08-04 17:45:47 +00:00
Sam Estep
0eaecbbc23 [clang][dataflow] Handle return statements
This patch adds a `ReturnLoc` field to the `Environment`, serving a similar to the `ThisPointeeLoc` field in the `DataflowAnalysisContext`. It then uses that (along with a new `VisitReturnStmt` method in `TransferVisitor`) to handle non-`void`-returning functions in context-sensitive analysis.

Reviewed By: ymandel, sgatev

Differential Revision: https://reviews.llvm.org/D130600
2022-08-04 17:42:19 +00:00
Stanislav Gatev
817dd5e3fd [clang][dataflow] Rename member to make it clear that it isn't stable
Rename `DataflowAnalysisContext::getStableStorageLocation(QualType)`
to `createStorageLocation`, to make it clear that it doesn't return
a stable storage location.

Differential Revision: https://reviews.llvm.org/D131021

Reviewed-by: ymandel, xazax.hun, gribozavr2
2022-08-03 06:25:02 +00:00
Sam Estep
a6ddc68487 [clang][dataflow] Handle multiple context-sensitive calls to the same function
This patch enables context-sensitive analysis of multiple different calls to the same function (see the `ContextSensitiveSetBothTrueAndFalse` example in the `TransferTest` suite) by replacing the `Environment` copy-assignment with a call to the new `popCall` method, which  `std::move`s some fields but specifically does not move `DeclToLoc` and `ExprToLoc` from the callee back to the caller.

To enable this, the `StorageLocation` for a given parameter needs to be stable across different calls to the same function, so this patch also improves the modeling of parameter initialization, using `ReferenceValue` when necessary (for arguments passed by reference).

This approach explicitly does not work for recursive calls, because we currently only plan to use this context-sensitive machinery to support specialized analysis models we write, not analysis of arbitrary callees.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130726
2022-07-29 19:40:19 +00:00
Weverything
1f8ae9d7e7 Inline function calls.
Fix unused variable in non-assert builds after
300fbf56f89aebbe2ef9ed490066bab23e5356d1
2022-07-26 21:12:28 -07:00
Sam Estep
300fbf56f8 [clang][dataflow] Analyze calls to in-TU functions
This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `FIXME` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306
2022-07-26 17:54:27 +00:00
Sam Estep
cc9aa157a8 Revert "[clang][dataflow] Analyze calls to in-TU functions"
This reverts commit fa2b83d07ecab3b24b4c5ee2e7dc4b6bbc895317.
2022-07-26 17:30:09 +00:00
Sam Estep
fa2b83d07e [clang][dataflow] Analyze calls to in-TU functions
Depends On D130305

This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) first calls `initGlobalVars`, then maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `TODO` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306
2022-07-26 17:27:19 +00:00
Dmitri Gribenko
c0c9d717df [clang][dataflow] Rename iterators from IT to It
The latter way to abbreviate is a lot more common in the LLVM codebase.

Reviewed By: sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D130423
2022-07-25 20:28:47 +02:00
Dmitri Gribenko
b5414b566a [clang][dataflow] Add DataflowEnvironment::dump()
Start by dumping the flow condition.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D130398
2022-07-23 01:31:53 +02:00
Wei Yi Tee
b611376e7e [clang][dataflow] Singleton pointer values for null pointers.
When a `nullptr` is assigned to a pointer variable, it is wrapped in a `ImplicitCastExpr` with cast kind `CK_NullTo(Member)Pointer`. This patch assigns singleton pointer values representing null to these expressions.

For each pointee type, a singleton null `PointerValue` is created and stored in the `NullPointerVals` map of the `DataflowAnalysisContext` class. The pointee type is retrieved from the implicit cast expression, and used to initialise the `PointeeLoc` field of the `PointerValue`. The `PointeeLoc` created is not mapped to any `Value`, reflecting the absence of value indicated by null pointers.

Reviewed By: gribozavr2, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D128056
2022-06-27 14:17:34 +02:00
Wei Yi Tee
12c7352fa4 [clang][dataflow] Move logic for createStorageLocation from DataflowEnvironment to DataflowAnalysisContext.
`createStorageLocation` in `DataflowEnvironment` is now a trivial wrapper around the logic in `DataflowAnalysisContext`.
Additionally, `getObjectFields` and `getFieldsFromClassHierarchy` (required for the implementation of `createStorageLocation`) are also moved to `DataflowAnalysisContext`.

Reviewed By: gribozavr2, sgatev

Differential Revision: https://reviews.llvm.org/D128359
2022-06-27 11:16:51 +02:00
Stanislav Gatev
e363c5963d [clang][dataflow] Extend flow condition in the body of a do/while loop
Extend flow condition in the body of a do/while loop.

Differential Revision: https://reviews.llvm.org/D128183

Reviewed-by: gribozavr2, xazax.hun
2022-06-20 17:31:00 +00:00
Kazu Hirata
80c12bdb3b [clang] Call *set::insert without checking membership first (NFC) 2022-06-18 10:41:26 -07:00
Wei Yi Tee
97d69cdaf3 [clang][dataflow] Rename getPointeeLoc to getReferentLoc for ReferenceValue.
We distinguish between the referent location for `ReferenceValue` and pointee location for `PointerValue`. The former must be non-empty but the latter may be empty in the case of a `nullptr`

Reviewed By: gribozavr2, sgatev

Differential Revision: https://reviews.llvm.org/D127745
2022-06-15 00:53:30 +02:00