llvm-project

Author	SHA1	Message	Date
Kazu Hirata	e955e4fba6	[clang] Replace None with std::nullopt in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-05-04 22:42:52 -07:00
Balazs Benics	afcf70aa6d	[analyzer] Remove unjustified assert from EquivalenceClass::simplify One might think that by merging the equivalence classes (eqclasses) of `Sym1` and `Sym2` symbols we would end up with a `State` in which the eqclass of `Sym1` and `Sym2` should now be the same. Surprisingly, it's not //always// true. Such an example triggered the assertion enforcing this _unjustified_ invariant in https://github.com/llvm/llvm-project/issues/58677. ```lang=C++ unsigned a, b; #define assert(cond) if (!(cond)) return void f(unsigned c) { /(1)/ assert(c == b); /(2)/ assert((c \| a) != a); /(3)/ assert(a); // a = 0 => c \| 0 != 0 => c != 0 => b != 0 } ``` I believe, that this assertion hold for reasonable cases - where both `MemberSym` and `SimplifiedMemberSym` refer to live symbols. It can only fail if `SimplifiedMemberSym` refers to an already dead symbol. See the reasoning at the very end. In this context, I don't know any way of determining if a symbol is alive/dead, so I cannot refine the assertion in that way. So, I'm proposing to drop this unjustified assertion. --- Let me elaborate on why I think the assertion is wrong in its current shape. Here is a quick reminder about equivalence classes in CSA. We have 4 mappings: 1) `ClassMap`: map, associating `Symbols` with an `EquivalenceClass`. 2) `ClassMembers`: map, associating `EquivalenceClasses` with its members - basically an enumeration of the `Symbols` which are known to be equal. 3) `ConstraintRange`: map, associating `EquivalenceClasses` with the range constraint which should hold for all the members of the eqclass. 4) `DisequalityMap`: I'm omitting this, as it's irrelevant for our purposes now. Whenever we encounter/assume that two `SymbolRefs` are equal, we update the `ClassMap` so that now both `SymbolRefs` are referring to the same eqclass. This operation is known as `merge` or `union`. Each eqclass is uniquely identified by the `representative` symbol, but it could have been just a unique number or anything else - the point is that an eqclass is identified by something unique. Initially, all Symbols form - by itself - a trivial eqclass, as there are no other Symbols to which it is assumed to be equal. A trivial eqclass has //notionally// exactly one member, the representative symbol. I'm emphasizing that //notionally// because for such cases we don't store an entry in the `ClassMap` nor in the `ClassMembers` map, because it could be deduced. By `merging` two eqclasses, we essentially do what you would think it does. An important thing to highlight is that the representative symbol of the resulting eqclass will be the same as one of the two eqclasses. This operation should be commutative, so that `merge(eq1,eq2)` and `merge(eq2,eq1)` should result in the same eqclass - except for the representative symbol, which is just a unique identifier of the class, a name if you will. Using the representative symbol of `eq1` or `eq2` should have no visible effect on the analysis overall. When merging `eq1` into `eq2`, we take each of the `ClassMembers` of `eq1` and add them to the `ClassMembers` of `eq2` while we also redirect all the `Symbol` members of `eq1` to map to the `eq2` eqclass in the `ClassMap`. This way all members of `eq1` will refer to the correct eqclass. After these, `eq1` key is unreachable in the `ClassMembers`, hence we can drop it. --- Let's get back to the example. Note that when I refer to symbols `a`, `b`, `c`, I'm referring to the `SymbolRegionValue{VarRegion{.}}` - the value of that variable. After `(1)`, we will have a constraint `c == b : [1,1]` and an eqclass `c,b` with the `c` representative symbol. After `(2)`, we will have an additional constraint `c\|b != a : [1,1]` with the same eqclass. We will also have disequality info about that `c\|a` is disequal to `a` - and the other way around. However, after the full-expression, `c` will become dead. This kicks in the garbage collection, which transforms the state into this: - We no longer have any constraints, because only `a` is alive, for which we don't have any constraints. - We have a single (non-trivial) eqclass with a single element `b` and representative symbol `c`. (Dead symbols can be representative symbols.) - We have the same disequality info as before. At `(3)` within the false branch, `a` get perfectly constrained to zero. This kicks in the simplification, so we try to substitute (simplify) the variable in each SymExpr-tree. In our case, it means that the `c\|a != a : [1,1]` constraint gets re-evaluated as `c\|0 != 0 : [1,1]`, which is `c != 0 : [1,1]`. Under the hood, it means that we will call `merge(c\|a, c)`. where `c` is the result of `simplifyToSVal(State, MemberSym).getAsSymbol()` inside `EquivalenceClass::simplify()`. Note that the result of `simplifyToSVal()` was a dead symbol. We shouldn't have acquired an already dead symbol. AFAIK, this is the only way we can get one at this point. Since `c` is dead, we no longer have a mapping in `ClassMap` for it; hence when we try to `find()` the eqclass of `c`, it will report that it's a trivial eqclass with the representative symbol `c`. After `merge(c\|a, c)`, we will have a single (non-trivial) eqclass of `b, c\|a` with the representative symbol `c\|a` - because we merged the eqclass of `c` into the eqclass of `c\|a`. This means that `find(c\|a)` will result in the eqclass with the representative symbol `c\|a`. So, we ended up having different eqclasses for `find(c\|a)` and `find(c)` after `merge(c\|a, c)`, firing the assertion. I believe, that this assertion hold for reasonable cases - where both `MemberSym` and `SimplifiedMemberSym` refer to live symbols. `MemberSym` should be live in all cases here, because it is from `ClassMembers` which is pruned in `removeDeadBindings()` so these must be alive. In this context, I don't know any way of determining if a symbol is alive/dead, so I cannot refine the assertion in that way. So, I'm proposing to drop this unjustified assertion. I'd like to thank @martong for helping me to conclude the investigation. It was really difficult to track down. PS: I mentioned that `merge(eq1, eq2)` should be commutative. We actually exploit this for merging the smaller eqclass into the bigger one within `EquivalenceClass::merge()`. Yea, for some reason, if you swap the operands, 3 tests break (only failures, no crashes) aside from the test which checks the state dumps. But I believe, that is a different bug and orthogonal to this one. I just wanted to mention that. - `Analysis/solver-sym-simplification-adjustment.c` - `Analysis/symbol-simplification-fixpoint-iteration-unreachable-code.cpp` - `Analysis/symbol-simplification-reassume.cpp` Fixes #58677 Reviewed By: vabridgers Differential Revision: https://reviews.llvm.org/D138037	2023-02-17 11:37:02 +01:00
Manas	5f02ad880e	[analyzer][solver] Improve reasoning for not equal to operator This patch fixes certain cases where solver was not able to infer disequality due to overlapping of values in rangeset. This case was casting from lower signed type to bigger unsigned type. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D140086	2023-01-25 02:32:55 +05:30
Kazu Hirata	6ad0788c33	[clang] Use std::optional instead of llvm::Optional (NFC) This patch replaces (llvm::\|)Optional< with std::optional<. I'll post a separate patch to remove #include "llvm/ADT/Optional.h". This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-01-14 12:31:01 -08:00
Kazu Hirata	a1580d7b59	[clang] Add #include <optional> (NFC) This patch adds #include <optional> to those files containing llvm::Optional<...> or Optional<...>. I'll post a separate patch to actually replace llvm::Optional with std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-01-14 11:07:21 -08:00
Balazs Benics	f61a08b67f	[analyzer] Fix crash inside RangeConstraintManager.cpp introduced by D112621 It seems like `LHS` and `RHS` could be empty range sets. This caused an assertion failure inside RangeConstraintManager. I'm hoisting out the check from the function into the call-site. This way we could assert that we only want to deal with non-empty range sets. The relevant part of the trace: ``` #6 0x00007fe6ff5f81a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6) #7 0x00007fe6ff5f8252 (/lib64/libc.so.6+0x2f252) #8 0x00000000049caed2 (anonymous namespace)::SymbolicRangeInferrer::VisitBinaryOperator(clang::ento::RangeSet, clang::BinaryOperatorKind, clang::ento::RangeSet, clang::QualType) RangeConstraintManager.cpp:0:0 #9 0x00000000049c9867 (anonymous namespace)::SymbolicRangeInferrer::infer(clang::ento::SymExpr const) RangeConstraintManager.cpp:0:0 #10 0x00000000049bebf5 (anonymous namespace)::RangeConstraintManager::assumeSymNE(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const, llvm::APSInt const&, llvm::APSInt const&) RangeConstraintManager.cpp:0:0 #11 0x00000000049d368c clang::ento::RangedConstraintManager::assumeSymUnsupported(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const, bool) (../../main-github/llvm/build-all/bin/clang+0x49d368c) #12 0x00000000049f0b09 clang::ento::SimpleConstraintManager::assumeAux(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (../../main-github/llvm/build-all/bin/clang+0x49f0b09) #13 0x00000000049f096a clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (../../main-github/llvm/build-all/bin/clang+0x49f096a) #14 0x00000000049f086d clang::ento::SimpleConstraintManager::assumeInternal(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal, bool) (../../main-github/llvm/build-all/bin/clang+0x49f086d) #15 0x000000000492d3e3 clang::ento::ConstraintManager::assumeDual(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal) (../../main-github/llvm/build-all/bin/clang+0x492d3e3) #16 0x0000000004955b6d clang::ento::ExprEngine::evalEagerlyAssumeBinOpBifurcation(clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNodeSet&, clang::Expr const) (../../main-github/llvm/build-all/bin/clang+0x4955b6d) #17 0x00000000049514b6 clang::ento::ExprEngine::Visit(clang::Stmt const, clang::ento::ExplodedNode, clang::ento::ExplodedNodeSet&) (../../main-github/llvm/build-all/bin/clang+0x49514b6) #18 0x000000000494c73e clang::ento::ExprEngine::ProcessStmt(clang::Stmt const, clang::ento::ExplodedNode) (../../main-github/llvm/build-all/bin/clang+0x494c73e) #19 0x000000000494c459 clang::ento::ExprEngine::processCFGElement(clang::CFGElement, clang::ento::ExplodedNode, unsigned int, clang::ento::NodeBuilderContext) (../../main-github/llvm/build-all/bin/clang+0x494c459) #20 0x000000000492f3d0 clang::ento::CoreEngine::HandlePostStmt(clang::CFGBlock const, unsigned int, clang::ento::ExplodedNode) (../../main-github/llvm/build-all/bin/clang+0x492f3d0) #21 0x000000000492e1f6 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) (../../main-github/llvm/build-all/bin/clang+0x492e1f6) ``` Differential Revision: https://reviews.llvm.org/D112621	2022-12-19 12:49:43 +01:00
Manas	77ab7281aa	[analyzer][solver] Introduce reasoning for not equal to operator With this patch, the solver can infer results for not equal (!=) operator over Ranges as well. This also fixes the issue of comparison between different types, by first converting the RangeSets to the resulting type, which then can be used for comparisons. Patch by Manas. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D112621	2022-12-09 13:30:57 +01:00
Kazu Hirata	22731dbd75	[clang] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 20:31:05 -08:00
Kazu Hirata	35b4fbb559	[clang] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 15:57:24 -08:00
Kazu Hirata	180600660b	[StaticAnalyzer] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-03 11:34:24 -08:00
Fangrui Song	32197830ef	[clang][clang-tools-extra] LLVM_NODISCARD => [[nodiscard]]. NFC	2022-08-09 07:11:18 +00:00
Denys Petrov	bc08c3cb7f	[analyzer] Add new function `clang_analyzer_value` to ExprInspectionChecker Summary: Introduce a new function 'clang_analyzer_value'. It emits a report that in turn prints a RangeSet or APSInt associated with SVal. If there is no associated value, prints "n/a".	2022-07-15 20:07:04 +03:00
Denys Petrov	82f76c0477	[analyzer][NFC] Tidy up handler-functions in SymbolicRangeInferrer Summary: Sorted some handler-functions into more appropriate visitor functions of the SymbolicRangeInferrer. - Spread `getRangeForNegatedSub` body over several visitor functions: `VisitSymExpr`, `VisitSymIntExpr`, `VisitSymSymExpr`. - Moved `getRangeForComparisonSymbol` from `infer` to `VisitSymSymExpr`. Differential Revision: https://reviews.llvm.org/D129678	2022-07-15 19:24:57 +03:00
Gabor Marton	81e44414aa	[analyzer][NFC] Move overconstrained check from reAssume to assumeDualImpl Depends on D126406. Checking of the overconstrained property is much better suited here. Differential Revision: https://reviews.llvm.org/D126707	2022-06-02 11:41:19 +02:00
Gabor Marton	88abc50398	[analyzer][solver] Handle UnarySymExpr in RangeConstraintSolver Fixes https://github.com/llvm/llvm-project/issues/55241 Differential Revision: https://reviews.llvm.org/D125395	2022-05-26 14:09:46 +02:00
Gabor Marton	ca3d962548	[analyzer] Return from reAssume if State is posteriorly overconstrained Depends on D124758. That patch introduced serious regression in the run-time in some special cases. This fixes that. Differential Revision: https://reviews.llvm.org/D126406	2022-05-26 13:50:40 +02:00
Gabor Marton	f75bc5bfc8	[analyzer] Fix symbol simplification assertion failure Fixes https://github.com/llvm/llvm-project/issues/55546 The assertion mentioned in the issue is triggered because an inconsistency is formed in the Sym->Class and Class->Sym relations. A simpler but similar inconsistency is demonstrated here: https://reviews.llvm.org/D114887 . Previously in `removeMember`, we didn't remove the old symbol's Sym->Class relation. Back then, we explained it with the following two bullet points: > 1) This way constraints for the old symbol can still be found via it's > equivalence class that it used to be the member of. > 2) Performance and resource reasons. We can spare one removal and thus one > additional tree in the forest of `ClassMap`. This patch do remove the old symbol's Sym->Class relation in order to keep the Sym->Class relation consistent with the Class->Sym relations. Point 2) above has negligible performance impact, empirical measurements do not show any noticeable difference in the run-time. Point 1) above seems to be a not well justified statement. This is because we cannot create a new symbol that would be equal to the old symbol after the simplification had happened. The reason for this is that the SValBuilder uses the available constant constraints for each sub-symbol. Differential Revision: https://reviews.llvm.org/D126281	2022-05-25 10:55:50 +02:00
Denys Petrov	e37726beb2	[analyzer] Implemented RangeSet::Factory::castTo function to perform promotions, truncations and conversions. Summary: Handle casts for ranges working similarly to APSIntType::apply function but for the whole range set. Support promotions, truncations and conversions. Example: promotion: char [0, 42] -> short [0, 42] -> int [0, 42] -> llong [0, 42] truncation: llong [4295033088, 4295033130] -> int [65792, 65834] -> short [256, 298] -> char [0, 42] conversion: char [-42, 42] -> uint [0, 42]U[4294967254, 4294967295] -> short[-42, 42] Differential Revision: https://reviews.llvm.org/D103094	2022-04-19 22:34:03 +03:00
Denys Petrov	6a399bf4b3	[analyzer] Implemented RangeSet::Factory::unite function to handle intersections and adjacency Summary: Handle intersected and adjacent ranges uniting them into a single one. Example: intersection [0, 10] U [5, 20] = [0, 20] adjacency [0, 10] U [11, 20] = [0, 20] Differential Revision: https://reviews.llvm.org/D99797	2021-12-10 18:48:02 +02:00
Gabor Marton	978431e80b	[Analyzer] SValBuilder: Simlify a SymExpr to the absolute simplest form Move the SymExpr simplification fixpoint logic into SValBuilder. Differential Revision: https://reviews.llvm.org/D114938	2021-12-07 10:02:32 +01:00
Gabor Marton	20f8733d4b	[Analyzer][solver] Simplification: Do a fixpoint iteration before the eq class merge This reverts commit f02c5f3478318075d1a469203900e452ba651421 and addresses the issue mentioned in D114619 differently. Repeating the issue here: Currently, during symbol simplification we remove the original member symbol from the equivalence class (`ClassMembers` trait). However, we keep the reverse link (`ClassMap` trait), in order to be able the query the related constraints even for the old member. This asymmetry can lead to a problem when we merge equivalence classes: ``` ClassA: [a, b] // ClassMembers trait, a->a, b->a // ClassMap trait, a is the representative symbol ``` Now let,s delete `a`: ``` ClassA: [b] a->a, b->a ``` Let's merge ClassA into the trivial class `c`: ``` ClassA: [c, b] c->c, b->c, a->a ``` Now, after the merge operation, `c` and `a` are actually in different equivalence classes, which is inconsistent. This issue manifests in a test case (added in D103317): ``` void recurring_symbol(int b) { if (b * b != b) if ((b * b) * b * b != (b * b) * b) if (b * b == 1) } ``` Before the simplification we have these equivalence classes: ``` trivial EQ1: [b * b != b] trivial EQ2: [(b * b) * b * b != (b * b) * b] ``` During the simplification with `b * b == 1`, EQ1 is merged with `1 != b` `EQ1: [b * b != b, 1 != b]` and we remove the complex symbol, so `EQ1: [1 != b]` Then we start to simplify the only symbol in EQ2: `(b * b) * b * b != (b * b) * b --> 1 * b * b != 1 * b --> b * b != b` But `b * b != b` is such a symbol that had been removed previously from EQ1, thus we reach the above mentioned inconsistency. This patch addresses the issue by making it impossible to synthesise a symbol that had been simplified before. We achieve this by simplifying the given symbol to the absolute simplest form. Differential Revision: https://reviews.llvm.org/D114887	2021-12-01 22:23:41 +01:00
Gabor Marton	f02c5f3478	[Analyzer][solver] Do not remove the simplified symbol from the eq class Currently, during symbol simplification we remove the original member symbol from the equivalence class (`ClassMembers` trait). However, we keep the reverse link (`ClassMap` trait), in order to be able the query the related constraints even for the old member. This asymmetry can lead to a problem when we merge equivalence classes: ``` ClassA: [a, b] // ClassMembers trait, a->a, b->a // ClassMap trait, a is the representative symbol ``` Now lets delete `a`: ``` ClassA: [b] a->a, b->a ``` Let's merge the trivial class `c` into ClassA: ``` ClassA: [c, b] c->c, b->c, a->a ``` Now after the merge operation, `c` and `a` are actually in different equivalence classes, which is inconsistent. One solution to this problem is to simply avoid removing the original member and this is what this patch does. Other options I have considered: 1) Always merge the trivial class into the non-trivial class. This might work most of the time, however, will fail if we have to merge two non-trivial classes (in that case we no longer can track equivalences precisely). 2) In `removeMember`, update the reverse link as well. This would cease the inconsistency, but we'd loose precision since we could not query the constraints for the removed member. Differential Revision: https://reviews.llvm.org/D114619	2021-11-30 11:13:13 +01:00
Gabor Marton	01c9700aaa	[analyzer][solver] Remove reference to RangedConstraintManager We no longer need a reference to RangedConstraintManager, we call top level `State->assume` functions. Differential Revision: https://reviews.llvm.org/D113261	2021-11-12 11:44:49 +01:00
Gabor Marton	806329da07	[analyzer][solver] Iterate to a fixpoint during symbol simplification with constants D103314 introduced symbol simplification when a new constant constraint is added. Currently, we simplify existing equivalence classes by iterating over all existing members of them and trying to simplify each member symbol with simplifySVal. At the end of such a simplification round we may end up introducing a new constant constraint. Example: ``` if (a + b + c != d) return; if (c + b != 0) return; // Simplification starts here. if (b != 0) return; ``` The `c == 0` constraint is the result of the first simplification iteration. However, we could do another round of simplification to reach the conclusion that `a == d`. Generally, we could do as many new iterations until we reach a fixpoint. We can reach to a fixpoint by recursively calling `State->assume` on the newly simplified symbol. By calling `State->assume` we re-ignite the whole assume machinery (along e.g with adjustment handling). Why should we do this? By reaching a fixpoint in simplification we are capable of discovering infeasible states at the moment of the introduction of the first constant constraint. Let's modify the previous example just a bit, and consider what happens without the fixpoint iteration. ``` if (a + b + c != d) return; if (c + b != 0) return; // Adding a new constraint. if (a == d) return; // This brings in a contradiction. if (b != 0) return; clang_analyzer_warnIfReached(); // This produces a warning. // The path is already infeasible... if (c == 0) // ...but we realize that only when we evaluate `c == 0`. return; ``` What happens currently, without the fixpoint iteration? As the inline comments suggest, without the fixpoint iteration we are doomed to realize that we are on an infeasible path only after we are already walking on that. With fixpoint iteration we can detect that before stepping on that. With fixpoint iteration, the `clang_analyzer_warnIfReached` does not warn in the above example b/c during the evaluation of `b == 0` we realize the contradiction. The engine and the checkers do rely on that either `assume(Cond)` or `assume(!Cond)` should be feasible. This is in fact assured by the so called expensive checks (LLVM_ENABLE_EXPENSIVE_CHECKS). The StdLibraryFuncionsChecker is notably one of the checkers that has a very similar assertion. Before this patch, we simply added the simplified symbol to the equivalence class. In this patch, after we have added the simplified symbol, we remove the old (more complex) symbol from the members of the equivalence class (`ClassMembers`). Removing the old symbol is beneficial because during the next iteration of the simplification we don't have to consider again the old symbol. Contrary to how we handle `ClassMembers`, we don't remove the old Sym->Class relation from the `ClassMap`. This is important for two reasons: The constraints of the old symbol can still be found via it's equivalence class that it used to be the member of (1). We can spare one removal and thus one additional tree in the forest of `ClassMap` (2). Performance and complexity: Let us assume that in a State we have N non-trivial equivalence classes and that all constraints and disequality info is related to non-trivial classes. In the worst case, we can simplify only one symbol of one class in each iteration. The number of symbols in one class cannot grow b/c we replace the old symbol with the simplified one. Also, the number of the equivalence classes can decrease only, b/c the algorithm does a merge operation optionally. We need N iterations in this case to reach the fixpoint. Thus, the steps needed to be done in the worst case is proportional to `N*N`. Empirical results (attached) show that there is some hardly noticeable run-time and peak memory discrepancy compared to the baseline. In my opinion, these differences could be the result of measurement error. This worst case scenario can be extended to that cases when we have trivial classes in the constraints and in the disequality map are transforming to such a State where there are only non-trivial classes, b/c the algorithm does merge operations. A merge operation on two trivial classes results in one non-trivial class. Differential Revision: https://reviews.llvm.org/D106823	2021-11-12 11:44:49 +01:00
Gabor Marton	a8297ed994	[Analyzer][solver] Handle adjustments in constraint assignor remainder We can reuse the "adjustment" handling logic in the higher level of the solver by calling `State->assume`. Differential Revision: https://reviews.llvm.org/D112296	2021-10-27 17:14:34 +02:00
Gabor Marton	888af47095	[Analyzer][solver] Simplification: reorganize equalities with adjustment Initiate the reorganization of the equality information during symbol simplification. E.g., if we bump into `c + 1 == 0` during simplification then we'd like to express that `c == -1`. It makes sense to do this only with `SymIntExpr`s. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D111642	2021-10-27 16:48:55 +02:00
Balazs Benics	f9db6a44eb	Revert "[analyzer][solver] Introduce reasoning for not equal to operator" This reverts commit cac8808f154cef6446e507d55aba5721c3bd5352. #5 0x00007f28ec629859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x25859) #6 0x00007f28ec629729 (/lib/x86_64-linux-gnu/libc.so.6+0x25729) #7 0x00007f28ec63af36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36) #8 0x00007f28ecc2cc46 llvm::APInt::compareSigned(llvm::APInt const&) const (libLLVMSupport.so.14git+0xeac46) #9 0x00007f28e7bbf957 (anonymous namespace)::SymbolicRangeInferrer::VisitBinaryOperator(clang::ento::RangeSet, clang::BinaryOperatorKind, clang::ento::RangeSet, clang::QualType) (libclangStaticAnalyzerCore.so.14git+0x1df957) #10 0x00007f28e7bbf2db (anonymous namespace)::SymbolicRangeInferrer::infer(clang::ento::SymExpr const) (libclangStaticAnalyzerCore.so.14git+0x1df2db) #11 0x00007f28e7bb2b5e (anonymous namespace)::RangeConstraintManager::assumeSymNE(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const, llvm::APSInt const&, llvm::APSInt const&) (libclangStaticAnalyzerCore.so.14git+0x1d2b5e) #12 0x00007f28e7bc67af clang::ento::RangedConstraintManager::assumeSymUnsupported(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const, bool) (libclangStaticAnalyzerCore.so.14git+0x1e67af) #13 0x00007f28e7be3578 clang::ento::SimpleConstraintManager::assumeAux(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (libclangStaticAnalyzerCore.so.14git+0x203578) #14 0x00007f28e7be33d8 clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (libclangStaticAnalyzerCore.so.14git+0x2033d8) #15 0x00007f28e7be32fb clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal, bool) (libclangStaticAnalyzerCore.so.14git+0x2032fb) #16 0x00007f28e7b15dbc clang::ento::ConstraintManager::assumeDual(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal) (libclangStaticAnalyzerCore.so.14git+0x135dbc) #17 0x00007f28e7b4780f clang::ento::ExprEngine::evalEagerlyAssumeBinOpBifurcation(clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNodeSet&, clang::Expr const) (libclangStaticAnalyzerCore.so.14git+0x16780f) This is known to be triggered on curl, tinyxml2, tmux, twin and on xerces. But @bjope also reported similar crashes. So, I'm reverting it to make our internal bots happy again. Differential Revision: https://reviews.llvm.org/D106102	2021-10-23 21:01:59 +02:00
Manas	cac8808f15	[analyzer][solver] Introduce reasoning for not equal to operator Prior to this, the solver was only able to verify whether two symbols are equal/unequal, only when constants were involved. This patch allows the solver to work over ranges as well. Reviewed By: steakhal, martong Differential Revision: https://reviews.llvm.org/D106102 Patch by: @manas (Manas Gupta)	2021-10-22 12:00:08 +02:00
Gabor Marton	5f8dca0235	[Analyzer] Extend ConstraintAssignor to handle remainder op Summary: `a % b != 0` implies that `a != 0` for any `a` and `b`. This patch extends the ConstraintAssignor to do just that. In fact, we could do something similar with division and in case of multiplications we could have some other inferences, but I'd like to keep these for future patches. Fixes https://bugs.llvm.org/show_bug.cgi?id=51940 Reviewers: noq, vsavchenko, steakhal, szelethus, asdenyspetrov Subscribers: Differential Revision: https://reviews.llvm.org/D110357	2021-10-22 10:47:25 +02:00
Gabor Marton	e2a2c8328f	[Analyzer][NFC] Add RangedConstraintManager to ConstraintAssignor In this patch we store a reference to `RangedConstraintManager` in the `ConstraintAssignor`. This way it is possible to call back and reuse some functions of it. This patch is exclusively needed for its child patches, it is not intended to be a standalone patch. Differential Revision: https://reviews.llvm.org/D111640	2021-10-22 10:46:28 +02:00
Gabor Marton	01b4ddbfbb	[Analyzer][NFC] Move RangeConstraintManager's def before ConstraintAssignor's def In this patch we simply move the definition of RangeConstraintManager before the definition of ConstraintAssignor. This patch is exclusively needed for it's child patch, so in the child the diff would be clean and the review would be easier. Differential Revision: https://reviews.llvm.org/D110387	2021-10-22 10:46:28 +02:00
Gabor Marton	ac3edc5af0	[analyzer][solver] Handle simplification to ConcreteInt The solver's symbol simplification mechanism was not able to handle cases when a symbol is simplified to a concrete integer. This patch adds the capability. E.g., in the attached lit test case, the original symbol is `c + 1` and it has a `[0, 0]` range associated with it. Then, a new condition `c == 0` is assumed, so a new range constraint `[0, 0]` comes in for `c` and simplification kicks in. `c + 1` becomes `0 + 1`, but the associated range is `[0, 0]`, so now we are able to realize the contradiction. Differential Revision: https://reviews.llvm.org/D110913	2021-10-14 17:53:29 +02:00
Gabor Marton	b8f6c85a83	[analyzer][NFC] Add RangeSet::dump This tiny change improves the debugging experience of the solver a lot! Differential Revision: https://reviews.llvm.org/D110911	2021-10-06 18:45:07 +02:00
Gabor Marton	792be5df92	[analyzer][solver] Fix CmpOpTable handling bug There is an error in the implementation of the logic of reaching the `Unknonw` tristate in CmpOpTable. ``` void cmp_op_table_unknownX2(int x, int y, int z) { if (x >= y) { // x >= y [1, 1] if (x + z < y) return; // x + z < y [0, 0] if (z != 0) return; // x < y [0, 0] clang_analyzer_eval(x > y); // expected-warning{{TRUE}} expected-warning{{FALSE}} } } ``` We miss the `FALSE` warning because the false branch is infeasible. We have to exploit simplification to discover the bug. If we had `x < y` as the second condition then the analyzer would return the parent state on the false path and the new constraint would not be part of the State. But adding `z` to the condition makes both paths feasible. The root cause of the bug is that we reach the `Unknown` tristate twice, but in both occasions we reach the same `Op` that is `>=` in the test case. So, we reached `>=` twice, but we never reached `!=`, thus querying the `Unknonw2x` column with `getCmpOpStateForUnknownX2` is wrong. The solution is to ensure that we reached both different `Op`s once. Differential Revision: https://reviews.llvm.org/D110910	2021-10-06 18:28:03 +02:00
Jay Foad	d933adeaca	[APInt] Stop using soft-deprecated constructors and methods in clang. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in clang. Differential Revision: https://reviews.llvm.org/D110808	2021-10-04 09:38:11 +01:00
Gabor Marton	4761321d49	[Analyzer][solver][NFC] print constraints deterministically (ordered by their string representation) This change is an extension to D103967 where I added dump methods for (dis)equality classes of the State. There, the (dis)equality classes and their contents are dumped in an ordered fashion, they are ordered based on their string representation. This is very useful once we start to use FileCheck to test the State dump in certain tests. Differential Revision: https://reviews.llvm.org/D106642	2021-07-26 16:27:23 +02:00
Gabor Marton	44fa31fa6d	[Analyzer][solver] Fix inconsistent equivalence class data https://bugs.llvm.org/show_bug.cgi?id=51109 When we merged two classes, `this` became an obsolete representation of the new `State`. This is b/c the member relations had changed during the previous merge of another member of the same class in a way that `this` had no longer any members. (`mergeImpl` might keep the member relations to `Other` and could dissolve `*this`.) Differential Revision: https://reviews.llvm.org/D106285	2021-07-23 14:25:32 +02:00
Gabor Marton	732a8a9dfb	[Analyzer][solver][NFC] Add explanatory comments to trivial eq classes Differential Revision: https://reviews.llvm.org/D106370	2021-07-21 11:59:56 +02:00
Gabor Marton	d0d37fcc4e	[Analyzer][solver] Remove unused functions ../../git/llvm-project/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:2395:17: warning: 'clang::ento::ProgramStateRef {anonymous}::RangeConstraintManager::setRange(clang::ento::ProgramStateRef, {anonymous}::EquivalenceClass, clang::ento::RangeSet)' defined but not used [-Wunused-function] ../../git/llvm-project/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:2384:10: warning: 'clang::ento::RangeSet {anonymous}::RangeConstraintManager::getRange(clang::ento::ProgramStateRef, {anonymous}::EquivalenceClass)' defined but not used [-Wunused-function] Differential Revision: https://reviews.llvm.org/D106063	2021-07-15 16:36:01 +02:00
Gabor Marton	bdf31471c7	[Analyzer][solver] Add dump methods for (dis)equality classes. This proved to be very useful during debugging. Differential Revision: https://reviews.llvm.org/D103967	2021-07-14 13:45:02 +02:00
Valeriy Savchenko	60bd8cbc0c	[analyzer][solver][NFC] Refactor how we detect (dis)equalities This patch simplifies the way we deal with (dis)equalities. Due to the symmetry between constraint handler and range inferrer, we can have very similar implementations of logic handling questions about (dis)equality and assumptions involving (dis)equality. It also helps us to remove one more visitor, and removes uncertainty that we got all the right places to put `trackNE` and `trackEQ`. Differential Revision: https://reviews.llvm.org/D105693	2021-07-13 21:00:30 +03:00
Valeriy Savchenko	f26deb4e6b	[analyzer][solver][NFC] Introduce ConstraintAssignor The new component is a symmetric response to SymbolicRangeInferrer. While the latter is the unified component, which answers all the questions what does the solver knows about a particular symbolic expression, assignor associates new constraints (aka "assumes") with symbolic expressions and can imply additional knowledge that the solver can extract and use later on. - Why do we need it and why is SymbolicRangeInferrer not enough? As it is noted before, the inferrer only helps us to get the most precise range information based on the existing knowledge and on the mathematical foundations of different operations that symbolic expressions actually represent. It doesn't introduce new constraints. The assignor, on the other hand, can impose constraints on other symbols using the same domain knowledge. - But for some expressions, SymbolicRangeInferrer looks into constraints for similar expressions, why can't we do that for all the cases? That's correct! But in order to do something like this, we should have a finite number of possible "similar expressions". Let's say we are asked about `$a - $b` and we know something about `$b - $a`. The inferrer can invert this expression and check constraints for `$b - $a`. This is simple! But let's say we are asked about `$a` and we know that `$a * $b != 0`. In this situation, we can imply that `$a != 0`, but the inferrer shouldn't try every possible symbolic expression `X` to check if `$a * X` or `X * $a` is constrained to non-zero. With the assignor mechanism, we can catch this implication right at the moment we associate `$a * $b` with non-zero range, and set similar constraints for `$a` and `$b` as well. Differential Revision: https://reviews.llvm.org/D105692	2021-07-13 21:00:30 +03:00
Valeriy Savchenko	6017cb31bb	[analyzer][solver] Use all sources of constraints Prior to this patch, we always gave priority to constraints that we actually know about symbols in question. However, these can get outdated and we can get better results if we look at all possible sources of knowledge, including sub-expressions. Differential Revision: https://reviews.llvm.org/D105436	2021-07-06 11:09:08 +03:00
Nico Weber	d5402a2fee	Revert "[Analyzer][solver] Add dump methods for (dis)equality classes." This reverts commit 6f3b775c3e9c685f74ecbe2ce1a94af52cc17c2f. Test fails flakily, see comments on https://reviews.llvm.org/D103967 Also revert follow-up "[Analyzer] Attempt to fix windows bots test failure b/c of new-line" This reverts commit fe0e861a4d9946a3e7de1bc95a3ec12fa602b492.	2021-06-28 11:32:57 -04:00
Valeriy Savchenko	8474bb13c3	[analyzer][solver][NFC] Simplify function signatures Since RangeSet::Factory actually contains BasicValueFactory, we can remove value factory from many function signatures inside the solver. Differential Revision: https://reviews.llvm.org/D105005	2021-06-28 14:20:06 +03:00
Gabor Marton	6f3b775c3e	[Analyzer][solver] Add dump methods for (dis)equality classes. This proved to be very useful during debugging. Differential Revision: https://reviews.llvm.org/D103967	2021-06-28 12:57:14 +02:00
Gabor Marton	0646e36254	[Analyzer][solver] Fix crashes during symbol simplification Consider the code ``` void f(int a0, int b0, int c) { int a1 = a0 - b0; int b1 = (unsigned)a1 + c; if (c == 0) { int d = 7L / b1; } } ``` At the point of divisiion by `b1` that is considered to be non-zero, which results in a new constraint for `$a0 - $b0 + $c`. The type of this sym is unsigned, however, the simplified sym is `$a0 - $b0` and its type is signed. This is probably the result of the inherent improper handling of casts. Anyway, Range assignment for constraints use this type information. Therefore, we must make sure that first we simplify the symbol and only then we assign the range. Differential Revision: https://reviews.llvm.org/D104844	2021-06-25 11:49:26 +02:00
Gabor Marton	8ddbb442b6	[Analyzer][solver] Simplify existing eq classes and constraints when a new constraint is added Update `setConstraint` to simplify existing equivalence classes when a new constraint is added. In this patch we iterate over all existing equivalence classes and constraints and try to simplfy them with simplifySVal. This solves problematic cases where we have two symbols in the tree, e.g.: ``` int test_rhs_further_constrained(int x, int y) { if (x + y != 0) return 0; if (y != 0) return 0; clang_analyzer_eval(x + y == 0); // expected-warning{{TRUE}} clang_analyzer_eval(y == 0); // expected-warning{{TRUE}} return 0; } ``` Differential Revision: https://reviews.llvm.org/D103314	2021-06-14 12:19:09 +02:00
Simon Pilgrim	61cdaf66fe	[ADT] Remove APInt/APSInt toString() std::string variants <string> is currently the highest impact header in a clang+llvm build: https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps. This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code. Differential Revision: https://reviews.llvm.org/D103888	2021-06-11 13:19:15 +01:00
Valeriy Savchenko	45212dec01	[analyzer][solver] Prevent use of a null state rdar://77686137 Differential Revision: https://reviews.llvm.org/D102240	2021-05-13 20:16:29 +03:00

1 2 3

119 Commits