This patch corrects the state of the error node generated by the
core.DivideZero checker when it detects potential division by zero
involving a tainted denominator.
The checker split in
91ac5ed10a
started to introduce a conflicting assumption about the denominator into
the error node:
Node with the Bug Report "Division by a tainted value, possibly zero"
has an assumption "denominator != 0".
This has been done as a shortcut to continue analysis with the correct
assumption *after* the division - if we proceed, we can only assume the
denominator was not zero. However, this assumption is introduced
one-node too soon, leading to a self-contradictory error node.
In this patch, I make the error node with assumption of zero denominator
fatal, but allow analysis to continue on the second half of the state
split with the assumption of non-zero denominator.
---
CPP-6376
Before this commit, there were two alpha checkers that used different
algorithms/logic for detecting out of bounds memory access: the old
`alpha.security.ArrayBound` and the experimental, more complex
`alpha.security.ArrayBoundV2`.
After lots of quality improvement commits ArrayBoundV2 is now stable
enough to be moved out of the alpha stage. As indexing (and dereference)
are common operations, it still produces a significant amount of false
positives, but not much more than e.g. `core.NullDereference` or
`core.UndefinedBinaryOperatorResult`, so it should be acceptable as a
non-`core` checker.
At this point `alpha.security.ArrayBound` became obsolete (there is a
better tool for the same task), so I'm removing it from the codebase.
With this I can eliminate the ugly "V2" version mark almost everywhere
and rename `alpha.security.ArrayBoundV2` to `security.ArrayBound`.
(The version mark is preserved in the filename "ArrayBoundCheckerV2", to
ensure a clear git history. I'll rename it to "ArrayBoundChecker.cpp" in
a separate commit.)
This commit adapts the unit tests of `alpha.security.ArrayBound` to
testing the new `security.ArrayBound` (= old ArrayBoundV2). Currently
the names of the test files are very haphazard, I'll probably create a
separate followup commit that consolidates this.
Tainted division operation is separated out from the core.DivideZero
checker into the optional optin.taint.TaintedDiv checker. The checker
warns when the denominator in a division operation is an attacker
controlled value.
This commit moves the **alpha.security.taint.TaintPropagation** and
**alpha.security.taint.GenericTaint** checkers to the **optin.taint**
optional package.
These checkers were stabilized and improved by recent commits thus
they are ready for production use.
As discussed in
https://discourse.llvm.org/t/rfc-make-istainted-and-complex-symbols-friends/79570/10
Some `isTainted()` queries can blow up the analysis times, and
effectively halt the analysis under specific workloads.
We don't really have the time now to do a caching re-implementation of
`isTainted()`, so we need to workaround the case.
The workaround with the smallest blast radius was to limit what symbols
`isTainted()` does the query (by walking the SymExpr). So far, the
threshold 10 worked for us, but this value can be overridden using the
"max-tainted-symbol-complexity" config value.
This new option is "deprecated" from the getgo, as I expect this issue
to be fixed within the next few months and I don't want users to
override this value anyways. If they do, this message will let them know
that they are on their own, and the next release may break them (as we
no longer recognize this option if we drop it).
Mitigates #89720
CPP-5414
Before this commit the the checker alpha.security.taint.TaintPropagation always reported warnings when the size argument of a memcpy-like or malloc-like function was tainted. However, this produced false positive reports in situations where the size was tainted, but correctly performed bound checks guaranteed the safety of the call.
This commit removes the rough "always warn if the size argument is tainted" heuristic; but it would be good to add a more refined "warns if the size argument is tainted and can be too large" heuristic in follow-up commits. That logic would belong to CStringChecker and MallocChecker, because those are the checkers responsible for the more detailed modeling of memcpy-like and malloc-like functions. To mark this plan, TODO comments are added in those two checkers.
There were several test cases that used these sinks to test generic properties of taint tracking; those were adapted to use different logic.
As a minor unrelated change, this commit ensures that strcat (and its wide variant, wcsncat) propagates taint from the first argument to the first argument, i.e. a tainted string remains tainted if we concatenate it with another string. This change was required because the adapted variant of multipleTaintedArgs is relying on strncat to compose a value that combines taint from two different sources.
The checker reported a false positive on this code
void testTaintedSanitizedVLASize(void) {
int x;
scanf("%d", &x);
if (x<1)
return;
int vla[x]; // no-warning
}
After the fix, the checker only emits tainted warning if the vla size is
coming from a tainted source and it cannot prove that it is positive.
Previously alpha.security.ArrayBoundV2 produced very spartan bug
reports; this commit ensures that the relevant (and known) details are
reported to the user.
The logic for detecting bugs is not changed, after this commit the
checker will report the same set of issues, but with better messages.
To test the details of the message generation this commit adds a new
test file 'out-of-bounds-diagnostics.c'. Three of the testcases are
added with FIXME notes because they reveal shortcomings of the existing
modeling and bounds checking code. I will try to fix them in separate
follow-up commits.
Variadic arguments were not considered as taint sink arguments. I also
decided to extend the list of exec-like functions.
(Juliet CWE78_OS_Command_Injection__char_connect_socket_execl)
Functions like `fgets`, `strlen`, `strcat` propagate taint.
However, their `wchar_t` variants don't. This patch fixes that.
Notice, that there could be many more APIs missing.
This patch intends to fix those that so far surfaced,
instead of exhaustively fixing this issue.
https://github.com/llvm/llvm-project/pull/66074
This patch warns on snprintf calls whose n argument is known to be smaller than the size of the formatted string like
```
char buf[5];
snprintf(buf, 5, "Hello");
```
This is a counterpart of gcc's Wformat-truncation
Fixes https://github.com/llvm/llvm-project/issues/64871
Reviewed By: aaron.ballman, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D158562
The `GenericTaintChecker` checker was crashing, when the taint
was propagated to `AllocaRegion` region in following code:
```
int x;
void* p = alloca(10);
mempcy(p, &x, sizeof(x));
```
This crash was caused by the fact that determining type of
`AllocaRegion` returns a null `QualType`.
This patch makes `AllocaRegion` expose its type as `void`,
making them consistent with results of `malloc` or `new`
that produce `SymRegion` with `void*` symbol.
Reviewed By: steakhal, xazax.hun
Differential Revision: https://reviews.llvm.org/D155847
There was a bug in alpha.security.taint.TaintPropagation checker
in Clang Static Analyzer.
Taint filtering could only sanitize const arguments.
After this patch, taint filtering is effective also
on non-const parameters.
Differential Revision: https://reviews.llvm.org/D155848
The checker alpha.security.ArrayBoundV2 created bug reports in
situations when the (tainted) result of fgetc() or getchar() was passed
to one of the isXXXXX() macros from ctype.h.
This is a common input handling pattern (within the limited toolbox of
the C language) and several open source projects contained code where it
led to false positive reports; so this commit suppresses ArrayBoundV2
reports generated within the isXXXXX() macros.
Note that here even true positive reports would be difficult to
understand, as they'd refer to the implementation details of these
macros.
Differential Revision: https://reviews.llvm.org/D149460
Fix test case for GenericTaintChecker.
Redefinition of types is a C11 feature, and it broke a buildbot.
Commit amended: 4fd6c6e65ab59f82284d8272aa3bec8d5084511e.
There was a typo in the rule.
`{{0}, ReturnValueIndex}` meant that the discrete index is `0` and the
variadic index is `-1`.
What we wanted instead is that both `0` and `-1` are in the discrete index
list.
Instead of this, we wanted to express that both `0` and the
`ReturnValueIndex` is in the discrete arg list.
The manual inspection revealed that `setproctitle_init` also suffered a
probably incomplete propagation rule.
Reviewed By: Szelethus, gamesh411
Differential Revision: https://reviews.llvm.org/D119129
without prototypes. This patch converts the function signatures to have
a prototype for the situations where the test is not specific to K&R C
declarations. e.g.,
void func();
becomes
void func(void);
This is the ninth batch of tests being updated (there are a
significant number of other tests left to be updated).
Due to a typo, `sprintf()` was recognized as a taint source instead of a
taint propagator. It was because an empty taint source list - which is
the first parameter of the `TaintPropagationRule` - encoded the
unconditional taint sources.
This typo effectively turned the `sprintf()` into an unconditional taint
source.
This patch fixes that typo and demonstrated the correct behavior with
tests.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D112558
This reverts commit 52aeacfbf5ce5f949efe0eae029e56db171ea1f7.
There isn't full agreement on a path forward yet, but there is agreement that
this shouldn't land as-is. See discussion on https://reviews.llvm.org/D105338
Also reverts unreviewed "[clang] Improve `-Wnull-dereference` diag to be more in-line with reality"
This reverts commit f4877c78c0fc98be47b926439bbfe33d5e1d1b6d.
And all the related changes to tests:
This reverts commit 9a0152799f8e4a59e0483728c9f11c8a7805616f.
This reverts commit 3f7c9cc27422f7302cf5a683eeb3978e6cb84270.
This reverts commit 329f8197ef59f9bd23328b52d623ba768b51dbb2.
This reverts commit aa9f58cc2c48ca6cfc853a2467cd775dc7622746.
This reverts commit 2df37d5ddd38091aafbb7d338660e58836f4ac80.
This reverts commit a72a44181264fd83e05be958c2712cbd4560aba7.
On z/OS, other error messages are not matched correctly in lit tests.
```
EDC5121I Invalid argument.
EDC5111I Permission denied.
```
This patch adds a lit substitution to fix it.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D95808
This patch is the last of the series of patches which allow the user to
annotate their functions with taint propagation rules.
I implemented the use of the configured filtering functions. These
functions can remove taintedness from the symbols which are passed at
the specified arguments to the filters.
Differential Revision: https://reviews.llvm.org/D59516
While we implemented taint propagation rules for several
builtin/standard functions, there's a natural desire for users to add
such rules to custom functions.
A series of patches will implement an option that allows users to
annotate their functions with taint propagation rules through a YAML
file. This one adds parsing of the configuration file, which may be
specified in the commands line with the analyzer config:
alpha.security.taint.TaintPropagation:Config. The configuration may
contain propagation rules, filter functions (remove taint) and sink
functions (give a warning if it gets a tainted value).
I also added a new header for future checkers to conveniently read YAML
files as checker options.
Differential Revision: https://reviews.llvm.org/D59555
llvm-svn: 367190
The gets function has no SrcArgs. Because the default value for isTainted was
false, it didn't mark its DstArgs as tainted.
Patch by Gábor Borsik!
Differential Revision: https://reviews.llvm.org/D58828
llvm-svn: 355396
When a function takes the address of a field the analyzer will no longer
assume that the function will change other fields of the enclosing structs.
Differential Revision: https://reviews.llvm.org/D57230
llvm-svn: 352473
Summary:
GenericTaintChecker can't recognize stdin in some cases. The reason is that `if (PtrTy->getPointeeType() == C.getASTContext().getFILEType()` does not hold when stdin is encountered.
My platform is ubuntu16.04 64bit, gcc 5.4.0, glibc 2.23. The definition of stdin is as follows:
```
__BEGIN_NAMESPACE_STD
/* The opaque type of streams. This is the definition used elsewhere. */
typedef struct _IO_FILE FILE;
___END_NAMESPACE_STD
...
/* The opaque type of streams. This is the definition used elsewhere. */
typedef struct _IO_FILE __FILE;
...
/* Standard streams. */
extern struct _IO_FILE *stdin; /* Standard input stream. */
extern struct _IO_FILE *stdout; /* Standard output stream. */
extern struct _IO_FILE *stderr; /* Standard error output stream. */
```
The type of stdin is as follows AST:
```
ElaboratedType 0xc911170'struct _IO_FILE'sugar
`-RecordType 0xc911150'struct _IO_FILE'
`-CXXRecord 0xc923ff0'_IO_FILE'
```
`C.getASTContext().GetFILEType()` is as follows AST:
```
TypedefType 0xc932710 'FILE' sugar
|-Typedef 0xc9111c0 'FILE'
`-ElaboratedType 0xc911170 'struct _IO_FILE' sugar
`-RecordType 0xc911150 'struct _IO_FILE'
`-CXXRecord 0xc923ff0 '_IO_FILE'
```
So I think it's better to use `getCanonicalType()`.
Reviewers: zaks.anna, NoQ, george.karpenkov, a.sidorin
Reviewed By: zaks.anna, a.sidorin
Subscribers: a.sidorin, cfe-commits, xazax.hun, szepet, MTC
Differential Revision: https://reviews.llvm.org/D39159
llvm-svn: 326709
The analyzer's taint analysis can now reason about structures or arrays
originating from taint sources in which only certain sections are tainted.
In particular, it also benefits modeling functions like read(), which may
read tainted data into a section of a structure, but RegionStore is incapable of
expressing the fact that the rest of the structure remains intact, even if we
try to model read() directly.
Patch by Vlad Tsyrklevich!
Differential revision: https://reviews.llvm.org/D28445
llvm-svn: 304162