This is a performance optimization for HTML diagnostics output mode.
Currently they're incredibly inefficient:
* The HTMLRewriter is re-run from scratch on every file on every report.
Each such re-run involves re-lexing the entire file and producing
a syntax-highlighted webpage of the entire file, with text behind macros
duplicated as pop-up macro expansion tooltips. Then, warning and note
bubbles are injected into the page. Only the bubble part is different
across reports; everything else can theoretically be cached.
* Additionally, if duplicate reports are emitted (with the same issue hash),
HTMLRewriter will be re-run even though the output file is going to be
discarded due to filename collision. This is mostly an issue for
path-insensitive bug reports because path-sensitive bug reports
are already deduplicated by the BugReporter as part of searching
for the shortest bug path. But on some translation units almost 80% of
bug reports are dry-run here.
We only get away with all this because there are usually very few reports
emitted per file. But if loud checkers are enabled, such as `webkit.*`,
this may explode in complexity and even cause the compiler to run over
the 32-bit SourceLocation addressing limit. (We're re-lexing everything
each time, remember?)
This patch hotfixes the *second* problem. Adds a FIXME for the first problem,
which will require more yak shaving to solve.
rdar://120801986
Cleanup most of the lazy-init `BugType` legacy.
Some will be preserved, as those are slightly more complicated to
refactor.
Notice, that the default category for `BugType` is `LogicError`. I
omitted setting this explicitly where I could.
Please, actually have a look at the diff. I did this manually, and we
rarely check the bug type descriptions and stuff in tests, so the
testing might be shallow on this one.
The new attribute can be placed on statements in order to suppress
arbitrary warnings produced by static analysis tools at those statements.
Previously such suppressions were implemented as either informal comments
(eg. clang-tidy `// NOLINT:`) or with preprocessor macros (eg.
clang static analyzer's `#ifdef __clang_analyzer__`). The attribute
provides a universal, formal, flexible and neat-looking suppression mechanism.
Implement support for the new attribute in the clang static analyzer;
clang-tidy coming soon.
The attribute allows specifying which specific warnings to suppress,
in the form of free-form strings that are intended to be specific to
the tools, but currently none are actually supported; so this is also
going to be a future improvement.
Differential Revision: https://reviews.llvm.org/D93110
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This commit extends the class `SValBuilder` with the methods
`getMinValue()` and `getMaxValue()` to that work like
`SValBuilder::getKnownValue()` but return the minimal/maximal possible
value the `SVal` is not perfectly constrained.
This extension of the ConstraintManager API is discussed at:
https://discourse.llvm.org/t/expose-the-inferred-range-information-in-warning-messages/75192
As a simple proof-of-concept application of this new API, this commit
extends a message from `core.BitwiseShift` with some range information
that reports the assumptions of the analyzer.
My main motivation for adding these methods is that I'll also want to
use them in `ArrayBoundCheckerV2` to make the error messages less
awkward, but I'm starting with this simpler and less important usecase
because I want to avoid merge conflicts with my other commit
https://github.com/llvm/llvm-project/pull/72107 which is currently under
review.
The testcase `too_large_right_operand_compound()` shows a situation
where querying the range information does not work (and the extra
information is not added to the error message). This also affects the
debug utility `clang_analyzer_value()`, so the problem isn't in the
fresh code. I'll do some investigations to resolve this, but I think
that this commit is a step forward even with this limitation.
...to model the results of alloca() and _alloca() calls. Previously it
acted as if these functions were returning memory from the heap, which
led to alpha.security.ArrayBoundV2 producing incorrect messages.
As my BSc thesis I've implemented a checker for std::variant and
std::any, and in the following weeks I'll upload a revised version of
them here.
# Prelude
@Szelethus and I sent out an email with our initial plans here:
https://discourse.llvm.org/t/analyzer-new-checker-for-std-any-as-a-bsc-thesis/65613/2
We also created a stub checker patch here:
https://reviews.llvm.org/D142354.
Upon the recommendation of @haoNoQ , we explored an option where instead
of writing a checker, we tried to improve on how the analyzer natively
inlined the methods of std::variant and std::any. Our attempt is in this
patch https://reviews.llvm.org/D145069, but in a nutshell, this is what
happened: The analyzer was able to model much of what happened inside
those classes, but our false positive suppression machinery erroneously
suppressed it. After months of trying, we could not find a satisfying
enhancement on the heuristic without introducing an allowlist/denylist
of which functions to not suppress.
As a result (and partly on the encouragement of @Xazax-hun) I wrote a
dedicated checker!
The advantage of the checker is that it is not dependent on the
standard's implementation and won't put warnings in the standard library
definitions. Also without the checker it would be difficult to create
nice user-friendly warnings and NoteTags -- as per the standard's
specification, the analysis is sinked by an exception, which we don't
model well now.
# Design ideas
The working of the checker is straightforward: We find the creation of
an std::variant instance, store the type of the variable we want to
store in it, then save this type for the instance. When retrieving type
from the instance we check what type we want to retrieve as, and compare
it to the actual type. If the two don't march we emit an error.
Distinguishing variants by instance (e.g. MemRegion *) is not the most
optimal way. Other checkers, like MallocChecker uses a symbol-to-trait
map instead of region-to-trait. The upside of using symbols (which would
be the value of a variant, not the variant itself itself) is that the
analyzer would take care of modeling copies, moves, invalidation, etc,
out of the box. The problem is that for compound types, the analyzer
doesn't create a symbol as a result of a constructor call that is fit
for this job. MallocChecker in contrast manipulates simple pointers.
My colleges and I considered the option of making adjustments directly
to the memory model of the analyzer, but for the time being decided
against it, and go with the bit more cumbersome, but immediately viable
option of simply using MemRegions.
# Current state and review plan
This patch contains an already working checker that can find and report
certain variant/any misuses, but still lands it in alpha. I plan to
upload the rest of the checker in later patches.
The full checker is also able to "follow" the symbolic value held by the
std::variant and updates the program state whenever we assign the value
stored in the variant. I have also built a library that is meant to
model union-like types similar to variant, hence some functions being a
bit more multipurpose then is immediately needed.
I also intend to publish my std::any checker in a later commit.
---------
Co-authored-by: Gabor Spaits <gabor.spaits@ericsson.com>
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
This patch converts `ImplicitParamDecl::ImplicitParamKind` into a scoped enum at namespace scope, making it eligible for forward declaring. This is useful for `preferred_type` annotations on bit-fields.
This patch converts `CXXConstructExpr::ConstructionKind` into a scoped enum in namespace scope, making it eligible for forward declaring. This is useful in cases like annotating bit-fields with `preferred_type`.
The goal of this patch is to refine how the `SVal` base and sub-kinds
are represented by forming one unified enum describing the possible
SVals. This means that the `unsigned SVal::Kind` and the attached
bit-packing semantics would be replaced by a single unified enum. This
is more conventional and leads to a better debugging experience by
default. This eases the need of using debug pretty-printers, or the use
of runtime functions doing the printing for us like we do today by
calling `Val.dump()` whenever we inspect the values.
Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated
the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`.
The rest of the upper bits represented the sub-kind, where the value
represented the index among only the `Loc`s or `NonLoc`s, effectively
attaching 2 meanings of the upper bits depending on the base-kind. We
don't need to pack these bits, as we have plenty even if we would use
just a plan-old `unsigned char`.
Consequently, in this patch, I propose to lay out all the (non-abstract)
`SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`,
`END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar
how we do with the `MemRegions`.
Note that in the unified `SVal::Kind` enum, to differentiate
`nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with
`Loc` and `NonLoc` to resolve this ambiguity.
This should not surface in general, because I'm replacing the
`nonloc::Kind` enum items with `inline constexpr` global constants to
mimic the original behavior - and offer nicer spelling to these enum
values.
Some `SVal` constructors were not marked explicit, which I now mark as
such to follow best practices, and marked others as `/*implicit*/` to
clarify the intent.
During refactoring, I also found at least one function not marked
`LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that.
The `TypeRetrievingVisitor` visitor had some accidental dead code,
namely: `VisitNonLocConcreteInt` and `VisitLocConcreteInt`.
Previously, the `SValVisitor` expected visit handlers of
`VisitNonLocXXXXX(nonloc::XXXXX)` and `VisitLocXXXXX(loc::XXXXX)`, where
I felt that envoding `NonLoc` and `Loc` in the name is not necessary as
the type of the parameter would select the right overload anyways, so I
simplified the naming of those visit functions.
The rest of the diff is a lot of times just formatting, because
`getKind()` by nature, frequently appears in switches, which means that
the whole switch gets automatically reformatted. I could probably undo
the formatting, but I didn't want to deviate from the rule unless
explicitly requested.
Workaround the case when the `this` pointer is actually a `NonLoc`, by
returning `Unknown` instead.
The solution isn't ideal, as `this` should be really a `Loc`, but due to
how casts work, I feel this is our easiest and best option.
As this patch presents, I'm evaluating a cast to transform the `NonLoc`.
However, given that `evalCast()` can't be cast from `NonLoc` to a
pointer type thingy (`Loc`), we end up with `Unknown`.
It is because `EvalCastVisitor::VisitNonLocSymbolVal()` only evaluates
casts that happen from NonLoc to NonLocs.
When I tried to actually implement that case, I figured:
1) Create a `SymbolicRegion` from that `nonloc::SymbolVal`; but
`SymbolRegion` ctor expects a pointer type for the symbol.
2) Okay, just have a `SymbolCast`, getting us the pointer type; but
`SymbolRegion` expects `SymbolData` symbols, not generic `SymExpr`s, as
stated:
> // Because pointer arithmetic is represented by ElementRegion layers,
> // the base symbol here should not contain any arithmetic.
3) We can't use `ElementRegion`s to perform this cast because to have an
`ElementRegion`, you already have to have a `SubRegion` that you want to
cast, but the point is that we don't have that.
At this point, I gave up, and just left a FIXME instead, while still
returning `Unknown` on that path.
IMO this is still better than having a crash.
Fixes#69922
Fixes#70464
When ctor is not declared in the base class, initializing the base class
with the initializer list will not trigger a proper assignment of the
base region, as a CXXConstructExpr doing that is not available in the
AST.
This patch checks whether the init expr is an InitListExpr under a base
initializer, and adds a binding if so.
Static analyze can't report diagnose when statement after a
CXXForRangeStmt and enable widen, because
`ExprEngine::processCFGBlockEntrance` lacks of CXXForRangeStmt and when
`AMgr.options.maxBlockVisitOnPath - 1` equals to `blockCount`, it can't
widen. After next iteration, `BlockCount >=
AMgr.options.maxBlockVisitOnPath` holds and generate a sink node. Add
`CXXForRangeStmt` makes it work.
Co-authored-by: huqizhi <836744285@qq.com>
In the following code:
```cpp
int main() {
struct Wrapper {char c; int &ref; };
Wrapper w = {.c = 'a', .ref = *(int *)0 };
w.ref = 1;
}
```
The clang static analyzer will produce the following warnings and notes:
```
test.cpp:12:11: warning: Dereference of null pointer [core.NullDereference]
12 | w.ref = 1;
| ~~~~~~^~~
test.cpp:11:5: note: 'w' initialized here
11 | Wrapper w = {.c = 'a', .ref = *(int *)0 };
| ^~~~~~~~~
test.cpp:12:11: note: Dereference of null pointer
12 | w.ref = 1;
| ~~~~~~^~~
1 warning generated.
```
In the line where `w` is created, the note gives information about the
initialization of `w` instead of `w.ref`. Let's compare it to a similar
case where a null pointer dereference happens to a pointer member:
```cpp
int main() {
struct Wrapper {char c; int *ptr; };
Wrapper w = {.c = 'a', .ptr = nullptr };
*w.ptr = 1;
}
```
Here the following error and notes are seen:
```
test.cpp:18:12: warning: Dereference of null pointer (loaded from field 'ptr') [core.NullDereference]
18 | *w.ptr = 1;
| ~~~ ^
test.cpp:17:5: note: 'w.ptr' initialized to a null pointer value
17 | Wrapper w = {.c = 'a', .ptr = nullptr };
| ^~~~~~~~~
test.cpp:18:12: note: Dereference of null pointer (loaded from field 'ptr')
18 | *w.ptr = 1;
| ~~~ ^
1 warning generated.
```
Here the note that shows the initialization the initialization of
`w.ptr` in shown instead of `w`.
This commit is here to achieve similar notes for member reference as the
notes of member pointers, so the report looks like the following:
```
test.cpp:12:11: warning: Dereference of null pointer [core.NullDereference]
12 | w.ref = 1;
| ~~~~~~^~~
test.cpp:11:5: note: 'w.ref' initialized to a null pointer value
11 | Wrapper w = {.c = 'a', .ref = *(int *)0 };
| ^~~~~~~~~
test.cpp:12:11: note: Dereference of null pointer
12 | w.ref = 1;
| ~~~~~~^~~
1 warning generated.
```
Here the initialization of `w.ref` is shown instead of `w`.
---------
Authored-by: Gábor Spaits <gabor.spaits@ericsson.com>
Reviewed-by: Donát Nagy <donat.nagy@ericsson.com>
This change avoids a crash in BasicValueFactory by checking the bit
width of an APSInt to avoid calling getZExtValue if greater than
64-bits. This was caught by our internal, randomized test generator.
Clang invocation
clang -cc1 -analyzer-checker=optin.portability.UnixAPI case.c
<src-root>/llvm/include/llvm/ADT/APInt.h:1488:
uint64_t llvm::APInt::getZExtValue() const: Assertion `getActiveBits()
<= 64
&& "Too many bits for uint64_t"' failed.
...
#9 <address> llvm::APInt::getZExtValue() const
<src-root>/llvm/include/llvm/ADT/APInt.h:1488:5
clang::BinaryOperatorKind, llvm::APSInt const&, llvm::APSInt const&)
<src-root>/clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp:307:37
llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>,
clang::BinaryOperatorKind, clang::ento::NonLoc, clang::ento::NonLoc,
clang::QualType)
<src-root>/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:531:31
llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>,
clang::BinaryOperatorKind, clang::ento::SVal, clang::ento::SVal,
clang::QualType)
<src-root>/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:532:26
...
Previously, bitwise shifts with constant operands were validated by the
checker `core.UndefinedBinaryOperatorResult`. However, this logic was
unreliable, and commit 25b9696b61e53a958e217bb3d0eab66350dc187f added
the dedicated checker `core.BitwiseShift` which validated the
preconditions of all bitwise shifts with a more accurate logic (that
uses the real types from the AST instead of the unreliable type
information encoded in `APSInt` objects).
This commit disables the inaccurate logic that could mark bitwise shifts
as 'undefined' and removes the redundant shift-related warning messages
from core.UndefinedBinaryOperatorResult. The tests that were validating
this logic are also deleted by this commit; but I verified that those
testcases trigger the expected bug reports from `core.BitwiseShift`. (I
didn't convert them to tests of `core.BitwiseShift`, because that
checker already has its own extensive test suite with many analogous
testcases.)
I hope that there will be a time when the constant folding will be
reliable, but until then we need hacky solutions like this improve the
quality of results.
evalIntegralCast was using makeIntVal, and when _BitInt() types were
introduced this exposed a crash in evalIntegralCast as a result.
This is a reapply of a previous patch that failed post merge on the arm
buildbots, because arm cannot handle large
BitInts. Pinning the triple for the testcase solves that problem.
Improve evalIntegralCast to use makeIntVal more efficiently to avoid the
crash exposed by use of _BitInt.
This was caught with our internal randomized testing.
<src-root>/llvm/include/llvm/ADT/APInt.h:1510:
int64_t llvm::APInt::getSExtValue() const: Assertion
`getSignificantBits() <= 64 && "Too many bits for int64_t"' failed.a
...
#9 <address> llvm::APInt::getSExtValue() const
<src-root>/llvm/include/llvm/ADT/APInt.h:1510:5
llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>,
clang::ento::SVal, clang::QualType, clang::QualType)
<src-root>/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:607:24
clang::Expr const*, clang::ento::ExplodedNode*,
clang::ento::ExplodedNodeSet&)
<src-root>/clang/lib/StaticAnalyzer/Core/ExprEngineC.cpp:413:61
...
Fixes: https://github.com/llvm/llvm-project/issues/61960
Reviewed By: donat.nagy
This reverts commit 4898c33527f90b067f353a115442a9a702319fce.
Lots of buildbots are failing, probably because lots of targets not supporting
large _BitInt types.
evalIntegralCast was using makeIntVal, and when _BitInt() types were
introduced this exposed a crash in evalIntegralCast as a result.
Improve evalIntegralCast to use makeIntVal more efficiently to avoid the
crash exposed by use of _BitInt.
This was caught with our internal randomized testing.
<src-root>/llvm/include/llvm/ADT/APInt.h:1510:
int64_t llvm::APInt::getSExtValue() const: Assertion
`getSignificantBits() <= 64 && "Too many bits for int64_t"' failed.a
...
#9 <address> llvm::APInt::getSExtValue() const
<src-root>/llvm/include/llvm/ADT/APInt.h:1510:5
llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>,
clang::ento::SVal, clang::QualType, clang::QualType)
<src-root>/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:607:24
clang::Expr const*, clang::ento::ExplodedNode*,
clang::ento::ExplodedNodeSet&)
<src-root>/clang/lib/StaticAnalyzer/Core/ExprEngineC.cpp:413:61
...
Fixes: https://github.com/llvm/llvm-project/issues/61960
Reviewed By: donat.nagy
Reapply after fixing the test by enabling the `debug.ExprInspection` checker.
-----
NonLoc symbolic SVal to Loc casts are not supported except for
nonloc::ConcreteInt.
This change simplifies the source SVals so that the more casts can
go through nonloc::ConcreteInt->loc::ConcreteInt path. For example:
void test_simplified_before_cast_add(long long t1) {
long long t2 = t1 + 3;
if (!t2) {
int *p = (int *) t2;
clang_analyzer_eval(p == 0); // expected-warning{{TRUE}}
}
}
If simplified, 't2' is 0, resulting 'p' is nullptr, otherwise 'p'
is unknown.
Fixes#62232
This reverts commit 3ebf3dd30da219f9f9aee12f42d45d18d55e7580.
I thought "Mergeing" will wait and confirm if the checks pass, and only
merge it if they succeed. Apparently, it's not the case here xD
The test is just broken in x86. See:
https://lab.llvm.org/buildbot/#/builders/109/builds/73686
NonLoc symbolic SVal to Loc casts are not supported except for
nonloc::ConcreteInt.
This change simplifies the source SVals so that the more casts can go
through nonloc::ConcreteInt->loc::ConcreteInt path. For example:
```c
void test_simplified_before_cast_add(long long t1) {
long long t2 = t1 + 3;
if (!t2) {
int *p = (int *) t2;
clang_analyzer_eval(p == 0); // expected-warning{{TRUE}}
}
}
```
If simplified, `t2` is 0, resulting `p` is nullptr, otherwise `p` is
unknown.
Fixes#62232
This reapplies ddbcc10b9e26b18f6a70e23d0611b9da75ffa52f, except for a tiny part that was reverted separately: 65331da0032ab4253a4bc0ddcb2da67664bd86a9. That will be reapplied later on, since it turned out to be more involved.
This commit is enabled by 5523fefb01c282c4cbcaf6314a9aaf658c6c145f and f0f548a65a215c450d956dbcedb03656449705b9, specifically the part that makes 'clang-tidy/checkers/misc/header-include-cycle.cpp' separator agnostic.
This commit replaces some calls to the deprecated `FileEntry::getName()` with `FileEntryRef::getName()` by swapping current usages of `SourceManager::getFileEntryForID()` with `SourceManager::getFileEntryRefForID()`. This lowers the number of usages of the deprecated `FileEntry::getName()` from 95 to 50.
Size-type inconsistency (signedness) causes confusion and even bugs.
For example when signed compared to unsigned the result might not
be expected. Summary of this commit:
Related APIs changes:
1. getDynamicExtent() returns signed version of extent;
2. Add getDynamicElementCountWithOffset() for offset version of element count;
3. getElementExtent() could be 0, add defensive checking for
getDynamicElementCount(), if element is of zero-length, try
ConstantArrayType::getSize() as element count;
Related checker changes:
1. ArrayBoundCheckerV2: add testcase for signed <-> unsigned comparison from
type-inconsistency results by getDynamicExtent()
2. ExprInspection: use more general API to report more results
Fixes https://github.com/llvm/llvm-project/issues/64920
Reviewed By: donat.nagy, steakhal
Differential Revision: https://reviews.llvm.org/D158499
SVal argument 'Cond' passed in is corrupted in release mode with
exception handling enabled (result in an UndefinedSVal), or changing
lambda capture inside the callee can workaround this.
Known problematic VS Versions:
- VS 2022 17.4.4
- VS 2022 17.5.4
- VS 2022 17.7.2
Verified working VS Version:
- VS 2019 16.11.25
Fixes https://github.com/llvm/llvm-project/issues/62130
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D159163
...because it provides no useful functionality compared to its base
class `BugType`.
A long time ago there were substantial differences between `BugType` and
`BuiltinBug`, but they were eliminated by commit 1bd58233 in 2009 (!).
Since then the only functionality provided by `BuiltinBug` was that it
specified `categories::LogicError` as the bug category and it stored an
extra data member `desc`.
This commit sets `categories::LogicError` as the default value of the
third argument (bug category) in the constructors of BugType and
replaces use of the `desc` field with simpler logic.
Note that `BugType` has a data member `Description` and a non-virtual
method `BugType::getDescription()` which queries it; these are distinct
from the member `desc` of `BuiltinBug` and the identically named method
`BuiltinBug::getDescription()` which queries it. This confusing name
collision was a major motivation for the elimination of `BuiltinBug`.
As this commit touches many files, I avoided functional changes and left
behind FIXME notes to mark minor issues that should be fixed later.
Differential Revision: https://reviews.llvm.org/D158855
structured-block
where clause is one of the following:
private(list)
reduction([reduction-modifier ,] reduction-identifier : list)
nowait
Differential Revision: https://reviews.llvm.org/D157933
IDs of the note list start from 1. Link generated for each note started
with index 0 i.e #Note0, #Note1 and so on.
As a result, first link ("#Note0") was invalid, subsequent links pointed
at wrong note.
Now, generated links to the notes start with index 1 i.e (#Note1, #Note2
and so on.
Patch by Guruprasad Hegde (gruuprasad)!
Fixes https://github.com/llvm/llvm-project/issues/64054
Differential Revision: https://reviews.llvm.org/D156724
I actually visited each link and added relevant context directly to the code.
This is related to the effort to eliminate internal bug tracker links
(d618f1c, e0ac46e).
Test files still have a lot of rdar links and ids in them.
I haven't touched them yet.
The `GenericTaintChecker` checker was crashing, when the taint
was propagated to `AllocaRegion` region in following code:
```
int x;
void* p = alloca(10);
mempcy(p, &x, sizeof(x));
```
This crash was caused by the fact that determining type of
`AllocaRegion` returns a null `QualType`.
This patch makes `AllocaRegion` expose its type as `void`,
making them consistent with results of `malloc` or `new`
that produce `SymRegion` with `void*` symbol.
Reviewed By: steakhal, xazax.hun
Differential Revision: https://reviews.llvm.org/D155847
We now properly bind return value of the trivial copy constructor
and assignments of the empty objects. Such operations do not
perform any loads from the source, however they preserve identity
of the assigned object:
```
Empty e;
auto& x = (e = Empty());
clang_analyzer_dump(x); // &e, was Unknown
```
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D155442
When we construct a `NonParamVarRegion`, we canonicalize the decl to
always use the same entity for consistency.
At the moment that is the canonical decl - which is the first decl in
the redecl chain.
However, this can cause problems with tentative declarations and extern
declarations if we declare an array with unknown bounds.
Consider this C example: https://godbolt.org/z/Kdvr11EqY
```lang=C
typedef typeof(sizeof(int)) size_t;
size_t clang_analyzer_getExtent(const void *p);
void clang_analyzer_dump(size_t n);
extern const unsigned char extern_redecl[];
const unsigned char extern_redecl[] = { 1,2,3,4 };
const unsigned char tentative_redecl[];
const unsigned char tentative_redecl[] = { 1,2,3,4 };
const unsigned char direct_decl[] = { 1,2,3,4 };
void test_redeclaration_extent(void) {
clang_analyzer_dump(clang_analyzer_getExtent(direct_decl)); // 4
clang_analyzer_dump(clang_analyzer_getExtent(extern_redecl)); // should be 4 instead of Unknown
clang_analyzer_dump(clang_analyzer_getExtent(tentative_redecl)); // should be 4 instead of Unknown
}
```
The `getType()` of the canonical decls for the forward declared globals,
will return `IncompleteArrayType`, unlike the
`getDefinition()->getType()`, which would have returned
`ConstantArrayType` of 4 elements.
This makes the `MemRegionManager::getStaticSize()` return `Unknown` as
the extent for the array variables, leading to FNs.
To resolve this, I think we should prefer the definition decl (if
present) over the canonical decl when constructing `NonParamVarRegion`s.
FYI The canonicalization of the decl was introduced by D57619 in 2019.
Differential Revision: https://reviews.llvm.org/D154827
I'm involved with the Static Analyzer for the most part.
I think we should embrace newer language standard features and gradually
move forward.
Differential Revision: https://reviews.llvm.org/D154325
This patch introduces a new `CXXLifetimeExtendedObjectRegion` as a representation
of the memory for the temporary object that is lifetime extended by the reference
to which they are bound.
This separation provides an ability to detect the use of dangling pointers
(either binding or dereference) in a robust manner.
For example, the `ref` is conditionally dangling in the following example:
```
template<typename T>
T const& select(bool cond, T const& t, T const& u) { return cond ? t : u; }
int const& le = Composite{}.x;
auto&& ref = select(cond, le, 10);
```
Before the change, regardless of the value of `cond`, the `select()` call would
have returned a `temp_object` region.
With the proposed change we would produce a (non-dangling) `lifetime_extended_object`
region with lifetime bound to `le` or a `temp_object` region for the dangling case.
We believe that such separation is desired, as such lifetime extended temporaries
are closer to the variables. For example, they may have a static storage duration
(this patch removes a static temporary region, which was an abomination).
We also think that alternative approaches are not viable.
While for some cases it may be possible to determine if the region is lifetime
extended by searching the parents of the initializer expr, this quickly becomes
complex in the presence of the conditions operators like this one:
```
Composite cc;
// Ternary produces prvalue 'int' which is extended, as branches differ in value category
auto&& x = cond ? Composite{}.x : cc.x;
// Ternary produces xvalue, and extends the Composite object
auto&& y = cond ? Composite{}.x : std::move(cc).x;
```
Finally, the lifetime of the `CXXLifetimeExtendedObjectRegion` is tied to the lifetime of
the corresponding variables, however, the "liveness" (or reachability) of the extending
variable does not imply the reachability of all symbols in the region.
In conclusion `CXXLifetimeExtendedObjectRegion`, in contrast to `VarRegions`, does not
need any special handling in `SymReaper`.
RFC: https://discourse.llvm.org/t/rfc-detecting-uses-of-dangling-references/70731
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D151325
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
The last use was removed by:
commit e2e37b9afc0a0a66a1594377a88221e115d95348
Author: Ted Kremenek <kremenek@apple.com>
Date: Thu Jul 28 23:08:02 2011 +0000
The `TrackConstraintBRVisitor` should accept a message for the note
instead of creating one. It would let us inject domain-specific
knowledge in a non-intrusive way, leading to a more generic visitor.
Differential Revision: https://reviews.llvm.org/D152255
MemRegion::getMemorySpace() is annotated with
LLVM_ATTRIBUTE_RETURNS_NONNULL (which triggers instant UB if a null
pointer is returned), and callers indeed don't check the return value
for null. Thus, even though llvm::dyn_cast is called, it can never
return null in this context. Therefore, we can safely call llvm::cast.
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D151727