These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Closes#57270.
This PR changes the `Stmt *` field in `SymbolConjured` with
`CFGBlock::ConstCFGElementRef`. The motivation is that, when conjuring a
symbol, there might not always be a statement available, causing
information to be lost for conjured symbols, whereas the CFGElementRef
can always be provided at the callsite.
Following the idea, this PR changes callsites of functions to create
conjured symbols, and replaces them with appropriate `CFGElementRef`s.
There is a caveat at loop widening, where the correct location is the
CFG terminator (which is not an element and does not have a ref). In
this case, the first element in the block is passed as a location.
Previous PR #128251, Reverted at #137304.
This PR changes the `Stmt *` field in `SymbolConjured` with
`CFGBlock::ConstCFGElementRef`. The motivation is that, when conjuring a
symbol, there might not always be a statement available, causing
information to be lost for conjured symbols, whereas the CFGElementRef
can always be provided at the callsite.
Following the idea, this PR changes callsites of functions to create
conjured symbols, and replaces them with appropriate `CFGElementRef`s.
Closes#57270
`CheckerNameRef` is a trivial wrapper around a `StringRef` which is
guaranteed to be owned by the `CheckerRegistry` (the only `friend` of
the class) because other code can't call the private constructor.
This class had offered two ways to recover the plain `StringRef`: an an
`operator StringRef()` for implicit conversion and a method `StringRef
getName()` which could be called explicitly.
However this method name was really confusing, because it implies "get
the name of this object" instead of "get this name as a plain
`StringRef`"; so I removed it from the codebase and used
`static_cast<StringRef>` in the two locations where the cast wasn't
performed implicitly.
This commit "prepares the ground" for planned improvements in checker
name handling.
In general, if we see an allocation, we associate the immutable memory
space with the constructed memory region.
This works fine if we see the allocation.
However, with symbolic regions it's not great because there we don't
know anything about their memory spaces, thus put them into the Unknown
space.
The unfortunate consequence is that once we learn about some aliasing
with this Symbolic Region, we can't change the memory space to the
deduced one.
In this patch, we open up the memory spaces as a trait, basically
allowing associating a better memory space with a memregion that
was created with the Unknown memory space.
As a side effect, this means that now queriing the memory space of a
region depends on the State, but many places in the analyzer, such as
the Store, doesn't have (and cannot have) access to the State by design.
This means that some uses must solely rely on the memspaces of the
region, but any other users should use the getter taking a State.
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
Fixes#62985Fixes#58820
When 3rd-party header files are included as system headers, their
overloaded `new` and `delete` operators are also considered as the std
ones. However, those overloaded operator functions will also be inlined.
This makes the same
symbolic memory marked as released twice: during `checkPreCall` of the
overloaded `delete` operator and when calling `::operator delete` after
inlining the overloaded operator function (if it has).
This patch attempts to fix this bug by adjusting the strategy of
verifying whether the callee is a standard `new` or `delete` operator in
the `isStandardNewDelete` function.
There is no good way to tell CSA if function with `ownership_returns`
attribute returns initialized or not initialized memory. To make FP rate
lower, let's assume that memory returned from such functions is unknown
and do not reason about it.
In future it would be great to add a way to annotate such behavior
Current MalloChecker logic suppresses FP caused by refcounting only for
C++ destructors. The same pattern occurs a lot in C in objects with
intrusive refcounting. See #104229 for code example.
To extend current logic to C, suspect all release functions as candidate
for suppression.
Closes: #104229
PR refactors `MallocChecker` to not violate invariant of `BindExpr`,
which should be called only during `evalCall` to avoid conflicts.
To achieve this, most of `postCall` logic was moved to `evalCall` with
addition return value binding in case of processing of allocation
functions. Check functions prototypes was changed to use `State` with
bound return value.
`checkDelim` logic was left in `postCall` to avoid conflicts with
`StreamChecker` which also evaluates `getline` and friends.
PR also introduces breaking change in the unlikely case when the
definition of an allocation function (e.g. `malloc()`) is visible: now
checker does not try to inline allocation functions and assumes their
initial semantics.
Closes#73830
In preparation for adding essentially the same visitor to StreamChecker,
this patch factors this visitor out to a common header.
I'll be the first to admit that the interface of these classes are not
terrific, but it rather tightly held back by its main technical debt,
which is NoStoreFuncVisitor, the main descendant of
NoStateChangeVisitor.
Change-Id: I99d73ccd93a18dd145bbbc83afadbb432dd42b90
A new optional checker (optin.taint.TaintedAlloc) will warn if a memory
allocation function (malloc, calloc, realloc, alloca, operator new[]) is
called with a tainted (attacker controlled) size parameter.
A large, maliciously set size value can trigger memory exhaustion. To
get this warning, the alpha.security.taint.TaintPropagation checker also
needs to be switched on.
The warning will only be emitted, if the analyzer cannot prove that the
size is below reasonable bounds (<SIZE_MAX/4).
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of
24 under clang/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
Fixes#90498.
Same as 5337efc69cdd5 for atomic builtins, but for `std::atomic` this
time. This is useful because even though the actual builtin atomic is
still there, it may be buried beyond the inlining depth limit.
Also add one popular custom smart pointer class name to the name-based
heuristics, which isn't necessary to fix the bug but arguably a good
idea regardless.
Before this commit the the checker alpha.security.taint.TaintPropagation always reported warnings when the size argument of a memcpy-like or malloc-like function was tainted. However, this produced false positive reports in situations where the size was tainted, but correctly performed bound checks guaranteed the safety of the call.
This commit removes the rough "always warn if the size argument is tainted" heuristic; but it would be good to add a more refined "warns if the size argument is tainted and can be too large" heuristic in follow-up commits. That logic would belong to CStringChecker and MallocChecker, because those are the checkers responsible for the more detailed modeling of memcpy-like and malloc-like functions. To mark this plan, TODO comments are added in those two checkers.
There were several test cases that used these sinks to test generic properties of taint tracking; those were adapted to use different logic.
As a minor unrelated change, this commit ensures that strcat (and its wide variant, wcsncat) propagates taint from the first argument to the first argument, i.e. a tainted string remains tainted if we concatenate it with another string. This change was required because the adapted variant of multipleTaintedArgs is relying on strncat to compose a value that combines taint from two different sources.
This commit ensures that the `CallDescription`s in `MallocChecker` are
matched with the mode `CDM::CLibrary`, so:
- they don't match methods or functions within user-defined namespaces;
- they also match builtin variants of these functions (if any), so the
checker can model `__builtin_alloca()` like `alloca()`.
This change fixes https://github.com/llvm/llvm-project/issues/81597. New
tests were added to verify that `std::malloc` and `std::free` (from
`<cstdlib>`) are modeled, but a method that's named e.g. `free` isn't
confused with the memory release function.
The responsibility for modeling `__builtin_alloca` and
`__builtin_alloca_with_align` was moved from `BuiltinFunctionChecker` to
`MallocChecker`, to avoid buggy interactions between the checkers and
ensure that the builtin and non-builtin variants are handled by exactly
the same logic.
This change might be a step backwards for the users who don't have
`unix.Malloc` enabled; but I suspect that `__builtin_alloca()` is so
rare that it would be a waste of time to implement backwards
compatibility for them.
There were several test files that relied on `__builtin_alloca()` calls
to get an `AllocaRegion`, these were modified to enable `unix.Malloc`.
One of these files (cxx-uninitialized-object-ptr-ref.cpp) had some tests
that relied on the fact that `malloc()` was treated as a "black box" in
them, these were updated to use `calloc()` (to get initialized memory)
and `free()` (to avoid memory leak reports).
While I was developing this change, I found a very suspicious assert in
`MallocChecker`. As it isn't blocking the goals of this commit, I just
marked it with a FIXME, but I'll try to investigate and fix it in a
follow-up change.
This reapplies 80ab8234ac309418637488b97e0a62d8377b2ecf again, after
fixing a name collision warning in the unit tests (see the revert commit
13ccaf9b9d4400bb128b35ff4ac733e4afc3ad1c for details).
In addition to the previously applied changes, this commit also clarifies the
code in MallocChecker that distinguishes POSIX "getline()" and C++ standard
library "std::getline()" (which are two completely different functions). Note
that "std::getline()" was (accidentally) handled correctly even without this
clarification; but it's better to explicitly handle and test this corner case.
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
According to POSIX 2018.
1. lineptr, n and stream can not be NULL.
2. If *n is non-zero, *lineptr must point to a region of at least *n
bytes, or be a NULL pointer.
Additionally, if *lineptr is not NULL, *n must not be undefined.
`getdelim` and `getline` may free, allocate, or re-allocate the input
buffer, ensuring its size is enough to hold the incoming line, the
delimiter, and the null terminator.
`*lineptr` must be a valid argument to `free`, which means it can be
either
1. `NULL`, in which case these functions perform an allocation
equivalent to a call to `malloc` even on failure.
2. A pointer returned by the `malloc` family of functions. Other
pointers are UB (`alloca`, a pointer to a static, to a stack variable, etc.)
The class `CallDescription` is used to define patterns that are used for
matching `CallEvent`s. For example, a
`CallDescription{{"std", "find_if"}, 3}`
matches a call to `std::find_if` with 3 arguments.
However, these patterns are somewhat fuzzy, so this pattern could also
match something like `std::__1::find_if` (with an additional namespace
layer), or, unfortunately, a `CallDescription` for the well-known
function `free()` can match a C++ method named `free()`:
https://github.com/llvm/llvm-project/issues/81597
To prevent this kind of ambiguity this commit introduces the enum
`CallDescription::Mode` which can limit the pattern matching to
non-method function calls (or method calls etc.). After this NFC change,
one or more follow-up commits will apply the right pattern matching
modes in the ~30 checkers that use `CallDescription`s.
Note that `CallDescription` previously had a `Flags` field which had
only two supported values:
- `CDF_None` was the default "match anything" mode,
- `CDF_MaybeBuiltin` was a "match only C library functions and accept
some inexact matches" mode.
This commit preserves `CDF_MaybeBuiltin` under the more descriptive
name `CallDescription::Mode::CLibrary` (or `CDM::CLibrary`).
Instead of this "Flags" model I'm switching to a plain enumeration
becasue I don't think that there is a natural usecase to combine the
different matching modes. (Except for the default "match anything" mode,
which is currently kept for compatibility, but will be phased out in the
follow-up commits.)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
...to model the results of alloca() and _alloca() calls. Previously it
acted as if these functions were returning memory from the heap, which
led to alpha.security.ArrayBoundV2 producing incorrect messages.
This trivial commit removes a 13-year-old FIXME comment from
MallocChecker.cpp because it was fixed at least 10 years ago when
`getConjuredHeapSymbolVal()` was added.
I'm involved with the Static Analyzer for the most part.
I think we should embrace newer language standard features and gradually
move forward.
Differential Revision: https://reviews.llvm.org/D154325
This patch adds CFGElementRef to ProgramPoints
and helps the analyzer to differentiate between
two otherwise identically looking ProgramPoints.
Fixes#60412
Differential Revision: https://reviews.llvm.org/D143328
This avoids recomputing string length that is already known at compile time.
It has a slight impact on preprocessing / compile time, see
https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u
This a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.
The above patchset caused some version of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.
Differential Revision: https://reviews.llvm.org/D139881
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4 (part 2)"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4"
GCC build hangs on this bot https://lab.llvm.org/buildbot/#/builders/37/builds/19104
compiling CMakeFiles/obj.clangBasic.dir/Targets/AArch64.cpp.d
The bot uses GNU 11.3.0, but I can reproduce locally with gcc (Debian 12.2.0-3) 12.2.0.
This reverts commit caa713559bd38f337d7d35de35686775e8fb5175.
This reverts commit 06b90e2e9c991e211fecc97948e533320a825470.
This reverts commit e953ae5bbc313fd0cc980ce021d487e5b5199ea4.
std::optional::value() has undesired exception checking semantics and is
unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The
call sites block std::optional migration.
This makes `ninja clang` work in the absence of llvm::Optional::value.
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes clang.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This revision fixes typos where there are 2 consecutive words which are
duplicated. There should be no code changes in this revision (only
changes to comments and docs). Do let me know if there are any
undesirable changes in this revision. Thanks.
This patch introduces a new checker, called NewArraySize checker,
which detects if the expression that yields the element count of
the array in new[], results in an Undefined value.
Differential Revision: https://reviews.llvm.org/D131299