In builds that use source hardening (-D_FORTIFY_SOURCE), many standard
functions are implemented as macros that expand to calls of hardened
functions that take one additional argument compared to the "usual"
variant and perform additional input validation. For example, a `memcpy`
call may expand to `__memcpy_chk()` or `__builtin___memcpy_chk()`.
Before this commit, `CallDescription`s created with the matching mode
`CDM::CLibrary` automatically matched these hardened variants (in a
addition to the "usual" function) with a fairly lenient heuristic.
Unfortunately this heuristic meant that the `CLibrary` matching mode was
only usable by checkers that were prepared to handle matches with an
unusual number of arguments.
This commit limits the recognition of the hardened functions to a
separate matching mode `CDM::CLibraryMaybeHardened` and applies this
mode for functions that have hardened variants and were previously
recognized with `CDM::CLibrary`.
This way checkers that are prepared to handle the hardened variants will
be able to detect them easily; while other checkers can simply use
`CDM::CLibrary` for matching C library functions (and they won't
encounter surprising argument counts).
The initial motivation for refactoring this area was that previously
`CDM::CLibrary` accepted calls with more arguments/parameters than the
expected number, so I wasn't able to use it for `malloc` without
accidentally matching calls to the 3-argument BSD kernel malloc.
After this commit this "may have more args/params" logic will only
activate when we're actually matching a hardened variant function (in
`CDM::CLibraryMaybeHardened` mode). The recognition of "sprintf()" and
"snprintf()" in CStringChecker was refactored, because previously it was
abusing the behavior that extra arguments are accepted even if the
matched function is not a hardened variant.
This commit also fixes the oversight that the old code would've
recognized e.g. `__wmemcpy_chk` as a hardened variant of `memcpy`.
After this commit I'm planning to create several follow-up commits that
ensure that checkers looking for C library functions use `CDM::CLibrary`
as a "sane default" matching mode.
This commit is not truly NFC (it eliminates some buggy corner cases),
but it does not intentionally modify the behavior of CSA on real-world
non-crazy code.
As a minor unrelated change I'm eliminating the argument/variable
"IsBuiltin" from the evalSprintf function family in CStringChecker,
because it was completely unused.
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
This reapplies 80ab8234ac309418637488b97e0a62d8377b2ecf again, after
fixing a name collision warning in the unit tests (see the revert commit
13ccaf9b9d4400bb128b35ff4ac733e4afc3ad1c for details).
In addition to the previously applied changes, this commit also clarifies the
code in MallocChecker that distinguishes POSIX "getline()" and C++ standard
library "std::getline()" (which are two completely different functions). Note
that "std::getline()" was (accidentally) handled correctly even without this
clarification; but it's better to explicitly handle and test this corner case.
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
This reverts commit e48d5a838f69e0a8e0ae95a8aed1a8809f45465a.
Fails to build on x86-64 w/gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
with the following message:
../llvm-project/clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp:41:28: error: declaration of ‘std::unique_ptr<clang::ASTUnit> IsCLibraryFunctionTest::ASTUnit’ changes meaning of ‘ASTUnit’ [-fpermissive]
41 | std::unique_ptr<ASTUnit> ASTUnit;
| ^~~~~~~
In file included from ../llvm-project/clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp:4:
../llvm-project/clang/include/clang/Frontend/ASTUnit.h:89:7: note: ‘ASTUnit’ declared here as ‘class clang::ASTUnit’
89 | class ASTUnit {
| ^~~~~~~
Previously, the function `isCLibraryFunction()` and logic relying on it
only accepted functions that are declared directly within a TU (i.e. not
in a namespace or a class). However C++ headers like <cstdlib> declare
many C standard library functions within the namespace `std`, so this
commit ensures that functions within the namespace `std` are also
accepted.
After this commit it will be possible to match functions like `malloc`
or `free` with `CallDescription::Mode::CLibrary`.
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
`CallDescriptions` for builtin functions relaxes the match rules
somewhat, so that the `CallDescription` will match for calls that have
some prefix or suffix. This was achieved by doing a `StringRef::contains()`.
However, this is somewhat problematic for builtins that are substrings
of each other.
Consider the following:
`CallDescription{ builtin, "memcpy"}` will match for
`__builtin_wmemcpy()` calls, which is unfortunate.
This patch addresses/works around the issue by checking if the
characters around the function's name are not part of the 'name'
semantically. In other words, to accept a match for `"memcpy"` the call
should not have alphanumeric (`[a-zA-Z]`) characters around the 'match'.
So, `CallDescription{ builtin, "memcpy"}` will not match on:
- `__builtin_wmemcpy: there is a `w` alphanumeric character before the match.
- `__builtin_memcpyFOoBar_inline`: there is a `F` character after the match.
- `__builtin_memcpyX_inline`: there is an `X` character after the match.
But it will still match for:
- `memcpy`: exact match
- `__builtin_memcpy`: there is an _ before the match
- `__builtin_memcpy_inline`: there is an _ after the match
- `memcpy_inline_builtinFooBar`: there is an _ after the match
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D118388
It turns out llvm::isa<> is variadic, and we could have used this at a
lot of places.
The following patterns:
x && isa<T1>(x) || isa<T2>(x) ...
Will be replaced by:
isa_and_non_null<T1, T2, ...>(x)
Sometimes it caused further simplifications, when it would cause even
more code smell.
Aside from this, keep in mind that within `assert()` or any macro
functions, we need to wrap the isa<> expression within a parenthesis,
due to the parsing of the comma.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D111982
This fixes a crash in MallocChecker for the situation when operator new (delete) is invoked via NTTP and makes the behavior of CallContext.getCalleeDecl(Expr) identical to CallEvent.getDecl().
Reviewed By: vsavchenko
Differential Revision: https://reviews.llvm.org/D103025
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
In most cases using
`N->getState()->getSVal(E, N->getLocationContext())`
is ugly, verbose, and also opens up more surface area for bugs if an
inconsistent location context is used.
This patch introduces a helper on an exploded node, and ensures
consistent usage of either `ExplodedNode::getSVal` or
`CheckContext::getSVal` across the codebase.
As a result, a large number of redundant lines is removed.
Differential Revision: https://reviews.llvm.org/D42155
llvm-svn: 322753
- Include the position of the argument on which the nullability is violated
- Differentiate between a 'method' and a 'function' in the message wording
- Test for the error message text in the tests
- Fix a bug with setting 'IsDirectDereference' which resulted in regular dereferences assumed to have call context.
llvm-svn: 259221
Use getRedeclContext() instead of a manually-written loop and fix a comment.
A patch by Aleksei Sidorin!
Differential Revision: http://reviews.llvm.org/D15794
llvm-svn: 256524
This patch renames getLinkage to getLinkageInternal. Only code that
needs to handle UniqueExternalLinkage specially should call this.
Linkage, as defined in the c++ standard, is provided by
getFormalLinkage. It maps UniqueExternalLinkage to ExternalLinkage.
Most places in the compiler actually want isExternallyVisible, which
handles UniqueExternalLinkage as internal.
llvm-svn: 181677
These are CallEvent-equivalents of helpers already accessible in
CheckerContext, as part of making it easier for new checkers to be written
using CallEvent rather than raw CallExprs.
llvm-svn: 167338
At this point this is largely cosmetic, but it opens the door to replace
ProgramStateRef with a smart pointer that more eagerly acts in the role
of reclaiming unused ProgramState objects.
llvm-svn: 149081
(Stmt*,LocationContext*) pairs to SVals instead of Stmt* to SVals.
This is needed to support basic IPA via inlining. Without this, we cannot tell
if a Stmt* binding is part of the current analysis scope (StackFrameContext) or
part of a parent context.
This change introduces an uglification of the use of getSVal(), and thus takes
two steps forward and one step back. There are also potential performance implications
of enlarging the Environment. Both can be addressed going forward by refactoring the
APIs and optimizing the internal representation of Environment. This patch
mainly introduces the functionality upon when we want to build upon (and clean up).
llvm-svn: 147688
We are getting name of the called function or it's declaration in a few checkers. Refactor them to use the helper function in the CheckerContext.
llvm-svn: 145576
First step toward removing the global Stmt builder. Added several transitional methods (like takeNodes/addNodes).
+ Stop early if the set of exploded nodes for the next iteration is empty.
llvm-svn: 142827
This moves the responsibility for storing the output node set from the
builder to the clients. The builder is just responsible for transforming
an input set into the output set: {SrcSet/SrcNode} -> {Frontier}.
llvm-svn: 142826
It now only depends on a generic NodeBuilder instead. As part of this change, make the generic node builder results finalized by default.
llvm-svn: 142452
Currently we have a bunch of different node builders which provide some common
functionality but are difficult to refactor. Each builder generates nodes of
different kinds and calculates the frontier nodes, which should be propagated
to the next step (after the builder dies).
Introduce a new NodeBuilder which provides very basic node generation facilities
but takes care of the second problem. The idea is that all the other builders
will eventually use it. Use this builder in CheckerContext instead of
StmtNodeBuilder (the way the frontier is propagated to the StmtBuilder
is a hack and will be removed later on).
llvm-svn: 142443
Having a notion of an actual ProgramPointTag will aid in introspection of the analyzer's behavior.
For example, the GraphViz output of the analyzer will pretty-print the tags in a useful manner.
llvm-svn: 137529