llvm-project

Author	SHA1	Message	Date
martinboehme	4d839d8f18	[clang][dataflow] Don't propagate result objects in unevaluated contexts (reland #90438 ) (#91172 ) This relands #90348 with a fix for a [buildbot failure](https://lab.llvm.org/buildbot/#/builders/216/builds/38446) caused by the test being run with `-fno-rtti`.	2024-05-06 14:21:15 +02:00
Weaver	2252c5c42b	Revert "[clang][dataflow] Don't propagate result objects in unevaluated contexts (#90438 )" This reverts commit 597a3150e932a9423c65b5ea4b53dd431aff5865. Caused test failure on the following buildbot: https://lab.llvm.org/buildbot/#/builders/216/builds/38446	2024-05-02 11:51:45 +01:00
martinboehme	597a3150e9	[clang][dataflow] Don't propagate result objects in unevaluated contexts (#90438 ) Trying to do so can cause crashes -- see newly added test and the comments in the fix. We're starting to see a repeating pattern here: We're getting crashes because `ResultObjectVisitor` and `getReferencedDecls()` don't agree on which parts of the AST to visit and, hence, which fields should be modeled. I think we should ensure consistency between these two parts of the code by using a `RecursiveASTVisitor` in `getReferencedDecls()`[^1]; the `Traverse...()` functions that control which parts of the AST we visit would go in a common base class that would be used for both `ResultObjectVisitor` and `getReferencedDecls()`. I'd like to focus this PR, however, on a targeted fix for the current crash and postpone the refactoring to a later PR (which will be easier to revert if there are unintended side-effects). [^1]: As an added bonus, this would make the code better structured and more efficient than the current sequence of `if (dyn_cast<T>(...))` statements).	2024-05-02 08:35:13 +02:00
martinboehme	b9208ce318	[clang][dataflow] Crash fix for `widenDistinctValues()`. (#89895 ) We used to crash if the previous iteration contained a `BoolValue` and the current iteration contained an `IntegerValue`. The accompanying test sets up this situation -- see comments there for details. While I'm here, clean up the tests for integral casts to use the test helpers we have available now. I was looking at these tests to understand how we handle integral casts, and the test helpers make the tests easier to read.	2024-04-25 09:24:08 +02:00
martinboehme	9b0651f5ae	[clang][dataflow] Don't propagate result objects in nested declarations. (#89903 ) Trying to do so can cause crashes -- see newly added test and the comments in the fix.	2024-04-25 09:22:14 +02:00
martinboehme	9ba6961ce0	Reapply "[clang][dataflow] Model conditional operator correctly." with fixes (#89596 ) I reverted https://github.com/llvm/llvm-project/pull/89213 beause it was causing buildbots to fail with assertion failures. Embarrassingly, it turns out I had been running tests locally in `Release` mode, i.e. with `assert()` compiled away. This PR re-lands #89213 with fixes for the failing assertions.	2024-04-23 08:10:55 +02:00
martinboehme	8ff6434546	Revert "[clang][dataflow] Model conditional operator correctly." (#89577 ) Reverts llvm/llvm-project#89213 This is causing buildbot failures.	2024-04-22 09:35:29 +02:00
martinboehme	abb958f161	[clang][dataflow] Model conditional operator correctly. (#89213 )	2024-04-22 09:23:13 +02:00
martinboehme	1412210632	[clang][dataflow][NFC] Fix code formatting in DataflowEnvironment.cpp (#89352 ) For some reason, when I merged #89235, two lines were mis-formatted. This patch corrects this; while I'm here, I'm also correcting other existing formatting errors.	2024-04-19 10:12:57 +02:00
martinboehme	e8fce95887	[clang][nullability] Remove `RecordValue`. (#89052 ) This class no longer serves any purpose; see also the discussion here: https://reviews.llvm.org/D155204#inline-1503204 A lot of existing tests in TransferTest.cpp check for the existence of `RecordValue`s. Some of these checks are now simply redundant and have been removed. In other cases, tests were checking for the existence of a `RecordValue` as a way of testing whether a record has been initialized. I have typically changed these test to instead check whether a field of the record has a value.	2024-04-19 09:39:52 +02:00
martinboehme	ca7d9442ba	[clang][dataflow] Support `CXXParenListInitExpr` in `PropagateResultObject()`. (#89235 )	2024-04-19 09:06:13 +02:00
martinboehme	b5f2cecf83	Revert "[clang][dataflow] Refactor `PropagateResultObject()` with a switch statement." (#89176 ) Reverts llvm/llvm-project#88865 There were failing tests in the CI that I didn't notice. Sorry.	2024-04-18 09:23:03 +02:00
martinboehme	da579ad807	[clang][dataflow] Refactor `PropagateResultObject()` with a switch statement. (#88865 ) See also discussion in #88726.	2024-04-18 09:21:10 +02:00
martinboehme	1bccbe1f49	[clang][dataflow] Treat `BuiltinBitCastExpr` correctly in `PropagateResultObject()`. (#88875 ) This patch includes a test that assert-fails without the fix.	2024-04-17 08:17:56 +02:00
martinboehme	b851c7f1fc	[clang][dataflow] Support `StmtExpr` in `PropagateResultObject()`. (#88872 ) This patch adds a test that assert-fails without the fix.	2024-04-17 08:05:43 +02:00
Samira Bazuzi	9ec8c96166	[clang][dataflow] Expose getReferencedDecls and relocate free functions. (#88754 ) Moves free functions from DataflowEnvironment.h/cc and DataflowAnalysisContext.h/cc to RecordOps and a new ASTOps and exposes them as needed for current use and to expose getReferencedDecls for out-of-tree use. Minimal change in functionality, only to modify the return type of getReferenceDecls to return the collected decls instead of using output params. Tested with `ninja check-clang-tooling`.	2024-04-16 14:46:05 -04:00
martinboehme	3c6f91e5b6	[clang][dataflow] Fix result object location for builtin `<=>`. (#88726 ) The newly added test causes an assertion failure in `PropagateResultObject()` without the fix added here.	2024-04-16 08:49:45 +02:00
martinboehme	71f1932b84	[clang][dataflow] Reland #87320 : Propagate locations from result objects to initializers. (#88316 ) This relands #87320 and additionally removes the now-unused function `isOriginalRecordConstructor()`, which was causing buildbots to fail.	2024-04-11 08:20:35 +02:00
martinboehme	7549b45825	Revert "[clang][dataflow] Propagate locations from result objects to initializers." (#88315 ) Reverts llvm/llvm-project#87320 This is causing buildbots to fail because `isOriginalRecordConstructor()` is now unused.	2024-04-10 21:27:10 +02:00
martinboehme	21009f466e	[clang][dataflow] Propagate locations from result objects to initializers. (#87320 ) Previously, we were propagating storage locations the other way around, i.e. from initializers to result objects, using `RecordValue::getLoc()`. This gave the wrong behavior in some cases -- see the newly added or fixed tests in this patch. In addition, this patch now unblocks removing the `RecordValue` class entirely, as we no longer need `RecordValue::getLoc()`. With this patch, the test `TransferTest.DifferentReferenceLocInJoin` started to fail because the framework now always uses the same storge location for a `MaterializeTemporaryExpr`, meaning that the code under test no longer set up the desired state where a variable of reference type is mapped to two different storage locations in environments being joined. Rather than trying to modify this test to set up the test condition again, I have chosen to replace the test with an equivalent test in DataflowEnvironmentTest.cpp that sets up the test condition directly; because this test is more direct, it will also be less brittle in the face of future changes.	2024-04-10 20:03:35 +02:00
NAKAMURA Takumi	d08a76d1ac	Fix warnings discovered by #87348 [-Wunused-but-set-variable]	2024-04-07 11:02:08 +09:00
Yitzhak Mandelbaum	bbd259af0a	[clang][dataflow] Refactor `widen` API to be explicit about change effect. (#87233 ) The previous API relied on pointer equality of inputs and outputs to signal whether a change occured. This was too subtle and led to bugs in practice. It was also very limiting: the override could not return an equivalent (but not identical) value.	2024-04-04 08:39:51 -04:00
martinboehme	8d77d362af	[clang][dataflow] Introduce a helper class for handling record initializer lists. (#86675 ) This is currently only used in one place, but I'm working on a patch that will use this from a second place. And I think this already improves the readability of the one place this is used so far.	2024-03-28 10:12:45 +01:00
smanna12	4c4ea249ce	[NFC][CLANG] Fix static analyzer bugs about unnecessary object copies with auto keyword (#85962 ) Reported by Static Analyzer Tool: In clang::dataflow::Environment::initialize(): Using the auto keyword without an & causes the copy of an object of type LambdaCapture	2024-03-26 12:05:40 -05:00
martinboehme	27d504998e	[clang][dataflow] Fix `getResultObjectLocation()` on `CXXDefaultArgExpr`. (#85072 ) This patch includes a test that causes an assertion failure without the other changes in this patch.	2024-03-18 13:36:20 +01:00
martinboehme	2d539db246	[clang][dataflow] When analyzing ctors, don't initialize fields of `*this` with values. (#84164 ) This is the constructor's job, and we want to be able to test that it does this.	2024-03-08 08:19:02 +01:00
martinboehme	d5aecf0c19	[clang][nullability] Don't discard expression state before end of full-expression. (#82611 ) In https://github.com/llvm/llvm-project/pull/72985, I made a change to discard expression state (`ExprToLoc` and `ExprToVal`) at the beginning of each basic block. I did so with the claim that "we never need to access entries from these maps outside of the current basic block", noting that there are exceptions to this claim when control flow happens inside a full-expression (the operands of `&&`, `\|\|`, and the conditional operator live in different basic blocks than the operator itself) but that we already have a mechanism for retrieving the values of these operands from the environment for the block they are computed in. It turns out, however, that the operands of these operators aren't the only expressions whose values can be accessed from a different basic block; when control flow happens within a full-expression, that control flow can be "interposed" between an expression and its parent. Here is an example: ```cxx void f(int, int); bool cond(); void target() { int i = 0; f(&i, cond() ? 1 : 0); } ``` ([godbolt](https://godbolt.org/z/hrbj1Mj3o)) In the CFG[^1] , note how the expression for `&i` is computed in block B4, but the parent of this expression (the `CallExpr`) is located in block B1. The the argument expression `&i` and the `CallExpr` are essentially "torn apart" into different basic blocks by the conditional operator in the second argument. In other words, the edge between the `CallExpr` and its argument `&i` straddles the boundary between two blocks. I used to think that this scenario -- where an edge between an expression and one of its children straddles a block boundary -- could only happen between the expression that triggers the control flow (`&&`, `\|\|`, or the conditional operator) and its children, but the example above shows that other expressions can be affected as well; the control flow is still triggered by `&&`, `\|\|` or the conditional operator, but the expressions affected lie outside these operators. Discarding expression state too soon is harmful. For example, an analysis that checks the arguments of the `CallExpr` above would not be able to retrieve a value for the `&i` argument. This patch therefore ensures that we don't discard expression state before the end of a full-expression. In other cases -- when the evaluation of a full-expression is complete -- we still want to discard expression state for the reasons explained in https://github.com/llvm/llvm-project/pull/72985 (avoid performing joins on boolean values that are no longer needed, which unnecessarily extends the flow condition; improve debuggability by removing clutter from the expression state). The impact on performance from this change is about a 1% slowdown in the Crubit nullability check benchmarks: ``` name old cpu/op new cpu/op delta BM_PointerAnalysisCopyPointer 71.9µs ± 1% 71.9µs ± 2% ~ (p=0.987 n=15+20) BM_PointerAnalysisIntLoop 190µs ± 1% 192µs ± 2% +1.06% (p=0.000 n=14+16) BM_PointerAnalysisPointerLoop 325µs ± 5% 324µs ± 4% ~ (p=0.496 n=18+20) BM_PointerAnalysisBranch 193µs ± 0% 192µs ± 4% ~ (p=0.488 n=14+18) BM_PointerAnalysisLoopAndBranch 521µs ± 1% 525µs ± 3% +0.94% (p=0.017 n=18+19) BM_PointerAnalysisTwoLoops 337µs ± 1% 341µs ± 3% +1.19% (p=0.004 n=17+19) BM_PointerAnalysisJoinFilePath 1.62ms ± 2% 1.64ms ± 3% +0.92% (p=0.021 n=20+20) BM_PointerAnalysisCallInLoop 1.14ms ± 1% 1.15ms ± 4% ~ (p=0.135 n=16+18) ``` [^1]: ``` [B5 (ENTRY)] Succs (1): B4 [B1] 1: [B4.9] ? [B2.1] : [B3.1] 2: [B4.4]([B4.6], [B1.1]) Preds (2): B2 B3 Succs (1): B0 [B2] 1: 1 Preds (1): B4 Succs (1): B1 [B3] 1: 0 Preds (1): B4 Succs (1): B1 [B4] 1: 0 2: int i = 0; 3: f 4: [B4.3] (ImplicitCastExpr, FunctionToPointerDecay, void ()(int , int)) 5: i 6: &[B4.5] 7: cond 8: [B4.7] (ImplicitCastExpr, FunctionToPointerDecay, _Bool ()(void)) 9: [B4.8]() T: [B4.9] ? ... : ... Preds (1): B5 Succs (2): B2 B3 [B0 (EXIT)] Preds (1): B1 ```	2024-03-07 13:31:23 +01:00
martinboehme	128780b06f	[clang][dataflow] Correctly treat empty initializer lists for unions. (#82986 ) This fixes a crash introduced by https://github.com/llvm/llvm-project/pull/82348 but also adds additional handling to make sure that we treat empty initializer lists for both unions and structs/classes correctly (see tests added in this patch).	2024-03-01 09:27:59 +01:00
Samira Bazuzi	c4e94633e8	Revert "[clang][dataflow] Correctly handle `InitListExpr` of union type." (#82856 ) Reverts llvm/llvm-project#82348, which caused crashes when analyzing empty InitListExprs for unions, e.g. ```cc union U { double double_value; int int_value; }; void target() { U value; value = {}; } ``` Co-authored-by: Samira Bazuzi <bazuzi@users.noreply.github.com>	2024-02-26 14:23:46 +01:00
martinboehme	4725993f1a	[clang][dataflow] Correctly handle `InitListExpr` of union type. (#82348 )	2024-02-21 10:10:25 +01:00
Antonio Frighetto	8c6e96d9eb	[clang][Dataflow] Fix unnecessary copy in `initializeFieldsWithValues` (NFC)	2024-02-13 11:05:26 +01:00
martinboehme	270f2c5575	[clang][dataflow] Add `Environment::initializeFieldsWithValues()`. (#81239 ) This function will be useful when we change the behavior of record-type prvalues so that they directly initialize the associated result object. See also the comment here for more details: `9e73656af5/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h (L354)` As part of this patch, we document and assert that synthetic fields may not have reference type. There is no practical use case for this: A `StorageLocation` may not have reference type, and a synthetic field of the corresponding non-reference type can serve the same purpose.	2024-02-13 10:01:25 +01:00
Yitzhak Mandelbaum	672fb27b26	[clang][dataflow] Add new `join` API and replace existing `merge` implementations. (#80361 ) This patch adds a new interface for the join operation, now properly called `join`. Originally, the framework offered a single `merge` operation, which could serve either as a join or a widening. In practice, though we found this conflation didn't work for non-trivial anlyses, and split of the widening operation (`widen`). This change completes the transition by introducing a proper `join` with strict join semantics. In the process, it drops an odd (and often misused) aspect of `merge` wherein callees could implictly instruct the framework to drop the current entry by returning `false`. This features was never used correctly in analyses and doesn't belong in a join operation, so it is omitted. --------- Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com> Co-authored-by: martinboehme <mboehme@google.com>	2024-02-06 15:38:56 -05:00
martinboehme	c83ec847ac	[clang][dataflow] Extend debug output for `Environment`. (#79982 ) * Print `ReturnLoc`, `ReturnVal`, and `ThisPointeeLoc` if applicable. * For entries in `LocToVal` that correspond to declarations, print the names of the declarations next to them. I've removed the FIXME because all relevant fields are now being dumped. I'm not sure we actually need the capability for the caller to specify which fields to dump, so I've simply deleted this part of the comment. Some examples of the output: ![image](https://github.com/llvm/llvm-project/assets/29098113/17d0978f-b86d-4555-8a61-d1f2021f8d59) ![image](https://github.com/llvm/llvm-project/assets/29098113/021dbb24-5fe2-4720-8a08-f48dcf4b88f8)	2024-01-31 08:11:13 +01:00
martinboehme	7a6c2628e9	[clang][dataflow] Eliminate two uses of `RecordValue::getLoc()`. (#79163 ) This is a small step towards eventually eliminating `RecordValue` entirely.	2024-01-24 08:06:32 +01:00
martinboehme	a2caa4929e	[clang][dataflow] Treat comma operator correctly in `getResultObjectLocation()`. (#78427 )	2024-01-22 09:23:06 +01:00
martinboehme	f1226eea52	[clang][dataflow] Consider `CXXDefaultInitExpr` to be an "original record ctor". (#78423 ) The CFG doesn't contain a CFGElement for the `CXXDefaultInitExpr::getInit()`, so it makes sense to consider the `CXXDefaultInitExpr` to be the expression that originally constructs the object.	2024-01-18 08:59:26 +01:00
martinboehme	23bfc271a3	[clang][dataflow] Use `ignoreCFGOmittedNodes()` in `setValue()`. (#78245 ) This is to be consistent with `getValue()`, which also uses `ignoreCFGOmittedNodes()`. Before this fix, it was not possible to retrieve a `Value` from a "CFG omitted" node that had previously been set using `setValue()`; see the accompanying test, which fails without the fix. I discovered this issue while running internal integration tests on https://github.com/llvm/llvm-project/pull/78127.	2024-01-16 15:48:44 +01:00
martinboehme	c19cacfa34	[clang][dataflow] Tighten checking for existence of a function body. (#78163 ) In various places, we would previously call `FunctionDecl::hasBody()` (which checks whether any redeclaration of the function has a body, not necessarily the one on which `hasBody()` is being called). This is bug-prone, as a recent bug in Crubit's nullability checker has shown ([fix](`4b01ed0f14`), [fix for the fix](`e0c5d8ddd7`)). Instead, we now use `FunctionDecl::doesThisDeclarationHaveABody()` which, as the name implies, checks whether the specific redeclaration it is being called on has a body. Alternatively, I considered being more lenient and "canonicalizing" to the `FunctionDecl` that has the body if the `FunctionDecl` being passed is a different redeclaration. However, this also risks hiding bugs: A caller might inadverently perform the analysis for all redeclarations of a function and end up duplicating work without realizing it. By accepting only the redeclaration that contains the body, we prevent this. I've checked, and all clients that I'm aware of do currently pass in the redeclaration that contains the function body. Typically this is because they use the `ast_matchers::hasBody()` matcher which, unlike `FunctionDecl::hasBody()`, only matches for the redeclaration containing the body.	2024-01-16 12:52:55 +01:00
martinboehme	2ee396b0b1	[clang][dataflow] Add `Environment::get<>()`. (#76027 ) This template function casts the result of `getValue()` or `getStorageLocation()` to a given subclass of `Value` or `StorageLocation` (using `cast_or_null`). It's a common pattern to do something like this: ```cxx auto Val = cast_or_null<PointerValue>(Env.getValue(E)); ``` This can now be expressed more concisely like this: ```cxx auto Val = Env.get<PointerValue>(E); ``` Instead of adding a new method `get()`, I had originally considered simply adding a template parameter to `getValue()` and `getStorageLocation()` (with a default argument of `Value` or `StorageLocation`), but this results in an undesirable repetition at the callsite, e.g. `getStorageLocation<RecordStorageLocation>(...)`. The `Value` and `StorageLocation` in the method name adds nothing of value when the template argument already contains this information, so it seemed best to shorten the method name to simply `get()`.	2023-12-21 09:02:20 +01:00
martinboehme	ca1034341c	[clang][dataflow] Fix an issue with `Environment::getResultObjectLocation()`. (#75483 ) So far, if there was a chain of record type prvalues, `getResultObjectLocation()` would assign a different result object location to each one. This makes no sense, of course, as all of these prvalues end up initializing the same result object. This patch fixes this by propagating storage locations up through the entire chain of prvalues. The new implementation also has the desirable effect of making it possible to make `getResultObjectLocation()` const, which seems appropriate given that, logically, it is just an accessor.	2023-12-18 09:10:03 +01:00
Samira Bazuzi	40381d1264	[clang][dataflow] Re-land: Retrieve members from accessors called usi… (#74336 ) …ng member pointers. This initially landed with a broken test due to a mid-air collision with a new requirement for Environment initialization before field modeling. Have added that initialization in the test. From first landing: getMethodDecl does not handle pointers to members and returns nullptr for them. getMethodDecl contains a decade-plus-old FIXME to handle pointers to members, but two approaches I looked at for fixing it are more invasive or complex than simply swapping to getCalleeDecl. The first, have getMethodDecl call getCalleeDecl, creates a large tree of const-ness mismatches due to getMethodDecl returning a non-const value while being a const member function and getCalleeDecl only being a const member function when it returns a const value. The second, implementing an AST walk to match how CXXMemberCallExpr::getImplicitObjectArgument grabs the LHS of the binary operator, is basically reimplementing Expr::getReferencedDeclOfCallee, which is used by Expr::getCalleeDecl. We don't need another copy of that code.	2023-12-05 12:09:33 +01:00
martinboehme	3b6d63c519	Revert "[clang][dataflow] Retrieve members from accessors called using member…" (#74299 ) Reverts llvm/llvm-project#73978	2023-12-04 11:27:31 +01:00
Samira Bazuzi	a3fe9cb24d	[clang][dataflow] Retrieve members from accessors called using member… (#73978 ) … pointers. getMethodDecl does not handle pointers to members and returns nullptr for them. getMethodDecl contains a decade-plus-old FIXME to handle pointers to members, but two approaches I looked at for fixing it are more invasive or complex than simply swapping to getCalleeDecl. The first, have getMethodDecl call getCalleeDecl, creates a large tree of const-ness mismatches due to getMethodDecl returning a non-const value while being a const member function and getCalleeDecl only being a const member function when it returns a const value. The second, implementing an AST walk to match how CXXMemberCallExpr::getImplicitObjectArgument grabs the LHS of the binary operator, is basically reimplementing Expr::getReferencedDeclOfCallee, which is used by Expr::getCalleeDecl. We don't need another copy of that code.	2023-12-04 10:10:07 +01:00
martinboehme	71f2ec2db1	[clang][dataflow] Add synthetic fields to `RecordStorageLocation` (#73860 ) Synthetic fields are intended to model the internal state of a class (e.g. the value stored in a `std::optional`) without having to depend on that class's implementation details. Today, this is typically done with properties on `RecordValue`s, but these have several drawbacks: * Care must be taken to call `refreshRecordValue()` before modifying a property so that the modified property values aren’t seen by other environments that may have access to the same `RecordValue`. * Properties aren’t associated with a storage location. If an analysis needs to associate a location with the value stored in a property (e.g. to model the reference returned by `std::optional::value()`), it needs to manually add an indirection using a `PointerValue`. (See for example the way this is done in UncheckedOptionalAccessModel.cpp, specifically in `maybeInitializeOptionalValueMember()`.) * Properties don’t participate in the builtin compare, join, and widen operations. If an analysis needs to apply these operations to properties, it needs to override the corresponding methods of `ValueModel`. * Longer-term, we plan to eliminate `RecordValue`, as by-value operations on records aren’t really “a thing” in C++ (see https://discourse.llvm.org/t/70086#changed-structvalue-api-14). This would obviously eliminate the ability to set properties on `RecordValue`s. To demonstrate the advantages of synthetic fields, this patch converts UncheckedOptionalAccessModel.cpp to synthetic fields. This greatly simplifies the implementation of the check. This PR is pretty big; to make it easier to review, I have broken it down into a stack of three commits, each of which contains a set of logically related changes. I considered submitting each of these as a separate PR, but the commits only really make sense when taken together. To review, I suggest first looking at the changes in UncheckedOptionalAccessModel.cpp. This gives a flavor for how the various API changes work together in the context of an analysis. Then, review the rest of the changes.	2023-12-04 09:29:22 +01:00
martinboehme	5bd643e145	[clang][dataflow] Strengthen widening of boolean values. (#73484 ) Before we widen to top, we now check if both values can be proved either true or false in their respective environments; if so, widening returns a true or false literal. The idea is that we avoid losing information if posssible. This patch includes a test that fails without this change to widening. This change does mean that we call the SAT solver in more places, but this seems acceptable given the additional precision we gain. In tests on an internal codebase, the number of SAT solver timeouts we observe with Crubit's nullability checker does increase by about 25%. They can be brought back to the previous level by doubling the SAT solver work limit.	2023-11-27 14:55:49 +01:00
martinboehme	c4c59192e6	[clang][dataflow] Clear `ExprToLoc` and `ExprToVal` at the start of a block. (#72985 ) We never need to access entries from these maps outside of the current basic block. This could only ever become a consideration when flow control happens inside a full-expression (i.e. we have multiple basic blocks for a full expression); there are two kinds of expression where this can happen, but we already deal with these in other ways: * Short-circuiting logical operators (`&&` and `\|\|`) have operands that live in different basic blocks than the operator itself, but we already have code in the framework to retrieve the value of these operands from the environment for the block they are computed in, rather than in the environment of the block containing the operator. * The conditional operator similarly has operands that live in different basic blocks. However, we currently don't implement a transfer function for the conditional operator. When we do this, we need to retrieve the values of the operands from the environments of the basic blocks they live in, as we already do for logical operators. This patch adds a comment to this effect to the code. Clearing out `ExprToLoc` and `ExprToVal` has two benefits: * We avoid performing joins on boolean expressions contained in `ExprToVal` and hence extending the flow condition in cases where this is not needed. Simpler flow conditions should reduce the amount of work we do in the SAT solver. * Debugging becomes easier when flow conditions are simpler and `ExprToLoc` / `ExprToVal` don’t contain any extraneous entries. Benchmark results on Crubit's `pointer_nullability_analysis_benchmark show a slight runtime increase for simple benchmarks, offset by substantial runtime reductions for more complex benchmarks: ``` name old cpu/op new cpu/op delta BM_PointerAnalysisCopyPointer 29.8µs ± 1% 29.9µs ± 4% ~ (p=0.879 n=46+49) BM_PointerAnalysisIntLoop 101µs ± 3% 104µs ± 4% +2.96% (p=0.000 n=55+57) BM_PointerAnalysisPointerLoop 378µs ± 3% 245µs ± 3% -35.09% (p=0.000 n=47+55) BM_PointerAnalysisBranch 118µs ± 2% 122µs ± 3% +3.37% (p=0.000 n=59+59) BM_PointerAnalysisLoopAndBranch 779µs ± 3% 413µs ± 5% -47.01% (p=0.000 n=56+45) BM_PointerAnalysisTwoLoops 187µs ± 3% 192µs ± 5% +2.80% (p=0.000 n=57+58) BM_PointerAnalysisJoinFilePath 17.4ms ± 3% 7.2ms ± 3% -58.75% (p=0.000 n=58+57) BM_PointerAnalysisCallInLoop 14.7ms ± 4% 10.3ms ± 2% -29.87% (p=0.000 n=56+58) ```	2023-11-22 16:34:24 +01:00
martinboehme	a0700532dd	[clang][dataflow] Replace one remaining call to deprecated `addToFlowCondition()`. (#71547 )	2023-11-08 05:32:04 +01:00
martinboehme	7c636728c0	[clang][dataflow] Simplify flow conditions displayed in HTMLLogger. (#70848 ) This can make the flow condition significantly easier to interpret; see below for an example. I had hoped that adding the simplification as a preprocessing step before the SAT solver (in `DataflowAnalysisContext::querySolver()`) would also speed up SAT solving and maybe even eliminate SAT solver timeouts, but in my testing, this actually turns out to be a pessimization. It appears that these simplifications are easy enough for the SAT solver to perform itself. Nevertheless, the improvement in debugging alone makes this a worthwhile change. Example of flow condition output with these changes: ``` Flow condition token: V37 Constraints: (V16 = (((V15 & (V19 = V12)) & V22) & V25)) (V15 = ((V12 & ((V14 = V9) \| (V14 = V4))) & (V13 = V14))) True atoms: (V0, V1, V2, V5, V6, V7, V29, V30, V32, V34, V35, V37) False atoms: (V3, V8, V17) Equivalent atoms: (V11, V15) Flow condition constraints before simplification: V37 ((!V3 & !V8) & !V17) (V37 = V34) (V34 = (V29 & (V35 = V30))) (V29 = (((V16 \| V2) & V32) & (V30 = V32))) (V16 = (((V15 & (V19 = V12)) & V22) & V25)) (V15 = V11) (V11 = ((((V7 \| V2) & V12) & ((V7 & (V14 = V9)) \| (V2 & (V14 = V4)))) & (V13 = V14))) (V2 = V1) (V1 = V0) V0 (V7 = V6) (V6 = V5) (V5 = V2) ```	2023-11-07 15:18:34 +01:00
martinboehme	d1f59544cf	[clang][dataflow] Add `Environment::allows()`. (#70046 ) This allows querying whether, given the flow condition, a certain formula still has a solution (though it is not necessarily implied by the flow condition, as `flowConditionImplies()` would check). This can be checked today, but only with a double negation, i.e. to check whether, given the flow condition, a formula F has a solution, you can check `!Env.flowConditionImplies(Arena.makeNot(F))`. The double negation makes this hard to reason about, and it would be nicer to have a way of directly checking this. For consistency, this patch also renames `flowConditionImplies()` to `proves()`; the old name is kept around for compatibility but deprecated.	2023-10-25 16:02:22 +02:00

1 2 3 4

186 Commits