This is reapplies #115615 without using tuples. The eager call of
`getRegion()` and `getOffset()` could cause crashes when the Store had
symbolic bindings.
Here I'm fixing the crash by lazily calling those getters.
Also, the tuple version poorly sorted the Clusters. The memory spaces
should have come before the regular clusters.
Now, that is also fixed here, demonstrated by the test.
Combined constructs (OpenACC 3.3 section 2.11) are a short-cut for
writing a `loop` construct immediately inside of a `compute` construct.
However, this interaction requires we do additional work to ensure that
we get the semantics between the two correct, as well as diagnostics.
This patch adds the semantic analysis for the constructs (but no
clauses), as well as the AST nodes.
Reverts llvm/llvm-project#115615
There are two problems with this PR:
1) If any of the dumps contains a store with a symbolic binding, we
crash.
2) The memory space clusters come last among the clusters, which is not
what I intended.
I'm reverting because of the crash.
Dump the memory space clusters before the other clusters, in
alphabetical order. Then default bindings over direct bindings, and if
any has symbolic offset, then those should come before the ones with
concrete offsets.
In theory, we should either have a symbolic offset OR concrete offsets,
but never both at the same time.
Needed for #114835
SValBuilder::getKnownValue, getMinValue, getMaxValue use
SValBuilder::simplifySVal.
simplifySVal does repeated simplification until a fixed-point is
reached. A single step is done by SimpleSValBuilder::simplifySValOnce,
using a Simplifier visitor. That will basically decompose SymSymExprs,
and apply constant folding using the constraints we have in the State.
Once it decomposes a SymSymExpr, it simplifies both sides and then uses
the SValBuilder::evalBinOp to reconstruct the same - but now simpler -
SymSymExpr, while applying some caching to remain performant.
This decomposition, and then the subsequent re-composition poses new
challenges to the SValBuilder::evalBinOp, which is built to handle
expressions coming from real C/C++ code, thus applying some implicit
assumptions.
One previous assumption was that nobody would form an expression like
"((int*)0) - q" (where q is an int pointer), because it doesn't really
makes sense to write code like that.
However, during simplification, we may end up with a call to evalBinOp
similar to this.
To me, simplifying a SymbolRef should never result in Unknown or Undef,
unless it was Unknown or Undef initially or, during simplification we
realized that it's a division by zero once we did the constant folding,
etc.
In the following case the simplified SVal should not become UnknownVal:
```c++
void top(char *p, char *q) {
int diff = p - q; // diff: reg<p> - reg<q>
if (!p) // p: NULL
simplify(diff); // diff after simplification should be: 0(loc) - reg<q>
}
```
Returning Unknown from the simplifySVal can weaken analysis precision in
other places too, such as in SValBuilder::getKnownValue, getMinValue, or
getMaxValue because we call simplifySVal before doing anything else.
For nonloc::SymbolVals, this loss of precision is critical, because for
those the SymbolRef carries an accurate type of the encoded computation,
thus we should at least have a conservative upper or lower bound that we
could return from getMinValue or getMaxValue - yet we would just return
nullptr.
```c++
const llvm::APSInt *SimpleSValBuilder::getKnownValue(ProgramStateRef state,
SVal V) {
return getConstValue(state, simplifySVal(state, V));
}
const llvm::APSInt *SimpleSValBuilder::getMinValue(ProgramStateRef state,
SVal V) {
V = simplifySVal(state, V);
if (const llvm::APSInt *Res = getConcreteValue(V))
return Res;
if (SymbolRef Sym = V.getAsSymbol())
return state->getConstraintManager().getSymMinVal(state, Sym);
return nullptr;
}
```
For now, I don't plan to make the simplification bullet-proof, I'm just
explaining why I made this change and what you need to look out for in
the future if you see a similar issue.
CPP-5750
This patch generalizes the way element regions are constructed when an
`ArrayInitLoopExpr` is being analyzed. Previously the base region of the
`ElementRegion` was determined with pattern matching, which led to
crashes, when an unhandled pattern was encountered.
Fixes#112813
The current implementation of MemRegion::getDescriptiveName fails for
FieldRegions whose SuperRegion is an ElementRegion. As outlined below:
```Cpp
struct val_struct { int val; };
extern struct val_struct val_struct_array[3];
void func(){
// FieldRegion with ElementRegion as SuperRegion.
val_struct_array[0].val;
}
```
For this special case, the expression cannot be pretty printed and must
therefore be obtained separately.
When instantiating "callable<T>", the "class CallableType" nested type
will only have a declaration in the copy for the instantiation - because
it's not refereed to directly by any other code that would need a
complete definition.
However, in the past, when conservative eval calling member function, we
took the static type of the "this" expr, and looked up the CXXRecordDecl
it refereed to to see if it has any mutable members (to decide if it
needs to refine invalidation or not). Unfortunately, that query needs a
definition, and it asserts otherwise, thus we crashed.
To fix this, we should consult the dynamic type of the object, because
that will have the definition.
I anyways added a check for "hasDefinition" just to be on the safe side.
Fixes#77378
This commit is a collection of several very minor code quality
improvements. The main goal is removing the misleading "Bin" substring
from the names of several methods and variables (like
`evalEagerlyAssumedBinOpBifurcation`) that are also applied for the
unary logical not operator.
In addition to this, I clarified the doc-comment of the method
`evalEagerlyAssumedBinOpBifurcation` and refactored the body of this
method to fix the capitalization of variable names and replace an
obsolete use of `std::tie` with a structured binding.
Finally, the data member `eagerlyAssumeBinOpBifurcation` of the class
`AnalyzerOptions` was completely removed (including a line in clang-tidy
that sets it to true), because it was never read by any code.
Note that the eagerly-assume mode of the analyzer is controlled by a
different boolean member of `AnalyzerOptions` which is called
`ShouldEagerlyAssume` and is defined via the macro magic from
`AnalyzerOptions.def`.
Ideally, we wouldn't workaround our current cast-modeling, but the
experimental "support-symbolic-integer-casts" is not finished so we need
to live with our current modeling.
Ideally, we could probably bind `UndefinedVal` as the result of the call
even without evaluating the call, as the result types mismatch between
the static type of the `CallExpr` and the actually function that happens
to be called.
Nevertheless, let's not crash.
https://compiler-explorer.com/z/WvcqK6MbY
CPP-5768
The 'tile' clause shares quite a bit of the rules with 'collapse', so a
followup patch will add those tests/behaviors. This patch deals with
adding the AST node.
The 'tile' clause takes a series of integer constant expressions, or *.
The asterisk is now represented by a new OpenACCAsteriskSizeExpr node,
else this clause is very similar to others.
This reverts commit e39205654dc11c50bd117e8ccac243a641ebd71f.
There are further discussions in
https://github.com/llvm/llvm-project/pull/70976, happening for past two
weeks. Since there were no responses for couple weeks now, reverting
until author is back.
As was reported
[here](https://github.com/llvm/llvm-project/pull/103714#pullrequestreview-2238037812),
`invalidateRegions` should accept `Stmt` instead of `Expr`. This
conversion is possible, since `Expr` was anyway converted back to `Stmt`
later.
This refactoring is needed to fix another FP related to use of inline
assembly. The fix would be to change `State->bindLoc` to
`state->invalidateRegions` inside inline assembly visitor, since
`bindLoc` only binds to offset 0, which is not really correct semantics
in case of inline assembly.
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
…ring.UninitRead
This is a drastic simplification of #106982. If you read that patch,
this is the same thing with all BugReporterVisitors.cpp and
SValBuilder.cpp changes removed! (since all replies came regarding
changed to those files, I felt the new PR was justified)
The patch was inspired by a pretty poor bug report on FFMpeg:

In this bug report, block is uninitialized, hence the bug report that it
should not have been passed to memcpy. The confusing part is in line 93,
where block was passed as a non-const pointer to seq_unpack_rle_block,
which was obviously meant to initialize block. As developers, we know
that clang likely didn't skip this function and found a path of
execution on which this initialization failed, but NoStoreFuncVisitor
failed to attach the usual "returning without writing to block" message.
I fixed this by instead of tracking the entire array, I tracked the
actual element which was found to be uninitialized (Remember, we
heuristically only check if the first and last-to-access element is
initialized, not the entire array). This is how the bug report looks
now, with 'seq_unpack_rle_block' having notes describing the path of
execution and lack of a value change:


Since NoStoreFuncVisitor was a TU-local class, I moved it back to
BugReporterVisitors.h, and registered it manually in CStringChecker.cpp.
This was done because we don't have a good trackRegionValue() function,
only a trackExpressionValue() function. We have an expression for the
array, but not for its first (or last-to-access) element, so I only had
a MemRegion on hand.
PR refactors `MallocChecker` to not violate invariant of `BindExpr`,
which should be called only during `evalCall` to avoid conflicts.
To achieve this, most of `postCall` logic was moved to `evalCall` with
addition return value binding in case of processing of allocation
functions. Check functions prototypes was changed to use `State` with
bound return value.
`checkDelim` logic was left in `postCall` to avoid conflicts with
`StreamChecker` which also evaluates `getline` and friends.
PR also introduces breaking change in the unlikely case when the
definition of an allocation function (e.g. `malloc()`) is visible: now
checker does not try to inline allocation functions and assumes their
initial semantics.
Closes#73830
This reverts commit e7f782e7481cea23ef452a75607d3d61f5bd0d22.
This had UBSan failures:
[----------] 1 test from ConfigCompileTests
[ RUN ] ConfigCompileTests.DiagnosticSuppression
Config fragment: compiling <unknown>:0 -> 0x00007B8366E2F7D8 (trusted=false)
/usr/local/google/home/fmayer/large/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:203:33: runtime error: reference binding to null pointer of type 'clang::DiagnosticIDs'
UndefinedBehaviorSanitizer: undefined-behavior /usr/local/google/home/fmayer/large/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:203:33
Pull Request: https://github.com/llvm/llvm-project/pull/108645
As reported in
https://github.com/llvm/llvm-project/pull/103714#issuecomment-2295769193.
CSA crashes on trying to bind value to symbolic region with `void *`.
This happens when such region gets passed as inline asm input and engine
tries to bind `UnknownVal` to that region.
Fix it by changing type from void to char before calling
`GetElementZeroRegion`
Random testing revealed it's possible to crash the analyzer with the
command line invocation:
clang -cc1 -analyze -analyzer-checker=nullability empty.c
where the source file, empty.c is an empty source file.
```
clang: <root>/clang/lib/StaticAnalyzer/Core/CheckerManager.cpp:56:
void clang::ento::CheckerManager::finishedCheckerRegistration():
Assertion `Event.second.HasDispatcher && "No dispatcher registered for an event"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/
Stack dump:
0. Program arguments: clang -cc1 -analyze -analyzer-checker=nullability nullability-nocrash.c
#0 ...
...
#7 <addr> clang::ento::CheckerManager::finishedCheckerRegistration()
#8 <addr> clang::ento::CheckerManager::CheckerManager(clang::ASTContext&,
clang::AnalyzerOptions&, clang::Preprocessor const&,
llvm::ArrayRef<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char>>>, llvm::ArrayRef<std::function<void (clang::ento::CheckerRegistry&)>>)
```
This commit removes the assertion which failed here, because it was
logically incorrect: it required that if an Event is handled by some
(enabled) checker, then there must be an **enabled** checker which can
emit that kind of Event. It should be OK to disable the event-producing
checkers but enable an event-consuming checker which has different
responsibilities in addition to handling the events.
Note that this assertion was in an `#ifndef NDEBUG` block, so this
change does not impact the non-debug builds.
Co-authored-by: Vince Bridgers <vince.a.bridgers@ericsson.com>
Bind the array member to the compound region associated with the
initializer list, e.g.:
class C {
int arr[2];
C() : arr{1, 2} {}
};
C c;
This change enables correct values in `c.arr[0]` and `c.arr[1]`
CPP-5647
HLSL output parameters are denoted with the `inout` and `out` keywords
in the function declaration. When an argument to an output parameter is
constructed a temporary value is constructed for the argument.
For `inout` pamameters the argument is initialized via copy-initialization
from the argument lvalue expression to the parameter type. For `out`
parameters the argument is not initialized before the call.
In both cases on return of the function the temporary value is written
back to the argument lvalue expression through an implicit assignment
binary operator with casting as required.
This change introduces a new HLSLOutArgExpr ast node which represents
the output argument behavior. The OutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.
Fixes#87526
---------
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
If a mutex interface is split in inheritance chain, e.g. struct mutex
has `unlock` and inherits `lock` from __mutex_base then calls m.lock()
and m.unlock() have different "this" targets: m and the __mutex_base of
m, which used to confuse the `ActiveCritSections` list.
Taking base region canonicalizes the region used to identify a critical
section and enables search in ActiveCritSections list regardless of
which class the callee is the member of.
This likely fixes#104241
CPP-5541
Fix some false negatives of StackAddrEscapeChecker:
- Output parameters
```
void top(int **out) {
int local = 42;
*out = &local; // Noncompliant
}
```
- Indirect global pointers
```
int **global;
void top() {
int local = 42;
*global = &local; // Noncompliant
}
```
Note that now StackAddrEscapeChecker produces a diagnostic if a function
with an output parameter is analyzed as top-level or as a callee. I took
special care to make sure the reports point to the same primary location
and, in many cases, feature the same primary message. That is the
motivation to modify Core/BugReporter.cpp and Core/ExplodedGraph.cpp
To avoid false positive reports when a global indirect pointer is
assigned a local address, invalidated, and then reset, I rely on the
fact that the invalidation symbol will be a DerivedSymbol of a
ConjuredSymbol that refers to the same memory region.
The checker still has a false negative for non-trivial escaping via a
returned value. It requires a more sophisticated traversal akin to
scanReachableSymbols, which out of the scope of this change.
CPP-4734
---------
This is the last of the 3 stacked PRs, it must not be merged before
https://github.com/llvm/llvm-project/pull/105652 and
https://github.com/llvm/llvm-project/pull/105653
If pointer is passed as input operand for inline assembly, it's possible
that asm block will change memory behind this pointer. So if pointer is
passed inside inline asm block, it's better to not guess and assume
memory has unknown state.
Without such change, we observed a lot of FP with hand-written `memcpy`
and friends.
This commit removes `invalidateRegionsImpl()`, moving its body to
`invalidateRegions(ValueList Values, ...)`, because it was a completely
useless layer of indirection.
Moreover I'm fixing some strange indentation within this function body
and renaming two variables to the proper `UpperCamelCase` format.
Current CSA logic does not expect `LazyCompoundValKind` as array index.
This may happen if array is used as subscript to another, in case of
bitcast to integer type.
Catch such cases and return `UnknownVal`, since CSA cannot model
array -> int casts.
Closes#94496
Previously there were certain situations where
alpha.security.ArrayBoundV2 produced lots of very similar and redundant
reports that only differed in their full `Description` that contained
the (negative) byte offset value. (See
https://github.com/llvm/llvm-project/issues/86969 for details.)
This change updates the `Profile()` method of `PathSensitiveBugReport`
to ensure that it uses `getShortDescription()` instead of the full
`Description` so the standard report deduplication eliminates most of
these redundant reports.
Note that the effects of this change are very limited because there are
very few checkers that specify a separate short description, and so
`getShortDescription()` practically always defaults to returning the
full `Description`.
For the sake of consistency `BasicBugReport::Profile()` is also updated
to use the short description. (Right now there are no checkers that use
`BasicBugReport` with separate long and short descriptions.)
This commit also includes some small code quality improvements in
`ArrayBoundV2` that are IMO too trivial to be moved into a separate
commit.
Since `raw_string_ostream` doesn't own the string buffer, it is
desirable (in terms of memory safety) for users to directly reference
the string buffer rather than use `raw_string_ostream::str()`.
Work towards TODO comment to remove `raw_string_ostream::str()`.
This is exactly as originally landed in #95129,
but now the minimal Z3 version was increased to meet this change in #96682.
https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4
---
This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520
As a result of this patch, individual Z3 queries in refutation will be
bound by 300ms. Every report equivalence class will be processed in at
most 1 second.
The heuristic should have only really marginal observable impact -
except for the cases when we had big report eqclasses with long-running
(15s) Z3 queries, where previously CSA effectively halted. After this
patch, CSA will tackle such extreme cases as well.
(cherry picked from commit eacc3b3504be061f7334410dd0eb599688ba103a)