As discussed in
https://discourse.llvm.org/t/rfc-make-istainted-and-complex-symbols-friends/79570/10
Some `isTainted()` queries can blow up the analysis times, and
effectively halt the analysis under specific workloads.
We don't really have the time now to do a caching re-implementation of
`isTainted()`, so we need to workaround the case.
The workaround with the smallest blast radius was to limit what symbols
`isTainted()` does the query (by walking the SymExpr). So far, the
threshold 10 worked for us, but this value can be overridden using the
"max-tainted-symbol-complexity" config value.
This new option is "deprecated" from the getgo, as I expect this issue
to be fixed within the next few months and I don't want users to
override this value anyways. If they do, this message will let them know
that they are on their own, and the next release may break them (as we
no longer recognize this option if we drop it).
Mitigates #89720
CPP-5414
Previously the function
```
std::vector<SymbolRef> taint::getTaintedSymbolsImpl(ProgramStateRef State,
const MemRegion *Reg,
TaintTagType K,
bool returnFirstOnly)
```
(one of the 4 overloaded variants under this name) was handling element
regions in a highly inefficient manner: it performed the "also examine
the super-region" step twice. (Once in the branch for element regions,
and once in the more general branch for all `SubRegion`s -- note that
`ElementRegion` is a subclass of `SubRegion`.)
As pointer arithmetic produces `ElementRegion`s, it's not too difficult
to get a chain of N nested element regions where this inefficient
recursion would produce 2^N calls.
This commit is essentially NFC, apart from the performance improvements
and the removal of (probably irrelevant) duplicate entries from the
return value of `getTaintedSymbols()` calls.
Fixes#89045
I'm involved with the Static Analyzer for the most part.
I think we should embrace newer language standard features and gradually
move forward.
Differential Revision: https://reviews.llvm.org/D154325
This patch improves the diagnostics of the alpha.security.taint.TaintPropagation
checker and taint related checkers by showing the "Taint originated here" note
at the correct place, where the attacker may inject it. This greatly improves
the understandability of the taint reports.
In the baseline the taint source was pointing to an invalid location, typically
somewhere between the real taint source and sink.
After the fix, the "Taint originated here" tag is correctly shown at the taint
source. This is the function call where the attacker can inject a malicious data
(e.g. reading from environment variable, reading from file, reading from
standard input etc.).
This patch removes the BugVisitor from the implementation and replaces it with 2
new NoteTags. One, in the taintOriginTrackerTag() prints the "taint originated
here" Note and the other in taintPropagationExplainerTag() explaining how the
taintedness is propagating from argument to argument or to the return value
("Taint propagated to the Xth argument"). This implementation uses the
interestingess BugReport utility to track back the tainted symbols through
propagating function calls to the point where the taintedness was introduced by
a source function call.
The checker which wishes to emit a Taint related diagnostic must use the
categories::TaintedData BugType category and must mark the tainted symbols as
interesting. Then the TaintPropagationChecker will automatically generate the
"Taint originated here" and the "Taint propagated to..." diagnostic notes.
Summary: Simplify functions SVal::getAsSymbolicExpression SVal::getAsSymExpr and SVal::getAsSymbol. After revision I concluded that `getAsSymbolicExpression` and `getAsSymExpr` repeat functionality of `getAsSymbol`, thus them can be removed.
Fix: Remove functions SVal::getAsSymbolicExpression and SVal::getAsSymExpr.
Differential Revision: https://reviews.llvm.org/D85034
This patch is the last of the series of patches which allow the user to
annotate their functions with taint propagation rules.
I implemented the use of the configured filtering functions. These
functions can remove taintedness from the symbols which are passed at
the specified arguments to the filters.
Differential Revision: https://reviews.llvm.org/D59516
These static functions deal with ExplodedNodes which is something we don't want
the PathDiagnostic interface to know anything about, as it's planned to be
moved out of libStaticAnalyzerCore.
Differential Revision: https://reviews.llvm.org/D67382
llvm-svn: 371659
Checkers are now required to specify whether they're creating a
path-sensitive report or a path-insensitive report by constructing an
object of the respective type.
This makes BugReporter more independent from the rest of the Static Analyzer
because all Analyzer-specific code is now in sub-classes.
Differential Revision: https://reviews.llvm.org/D66572
llvm-svn: 371450
find clang/ -type f -exec sed -i 's/std::shared_ptr<PathDiagnosticPiece>/PathDiagnosticPieceRef/g' {} \;
git diff -U3 --no-color HEAD^ | clang-format-diff-6.0 -p1 -i
Just as C++ is meant to be refactored, right?
Differential Revision: https://reviews.llvm.org/D65381
llvm-svn: 368717
At least gcc 7.4 complained with
../tools/clang/lib/StaticAnalyzer/Checkers/Taint.cpp:26:53: warning: extra ';' [-Wpedantic]
TaintTagType);
^
llvm-svn: 357461
It is now an inter-checker communication API, similar to the one that
connects MallocChecker/CStringChecker/InnerPointerChecker: simply a set of
setters and getters for a state trait.
Differential Revision: https://reviews.llvm.org/D59861
llvm-svn: 357326