120 Commits

Author SHA1 Message Date
Balazs Benics
f90e063308
[analyzer] Fix taint sink rules for exec-like functions (#66358)
Variadic arguments were not considered as taint sink arguments. I also
decided to extend the list of exec-like functions.

(Juliet CWE78_OS_Command_Injection__char_connect_socket_execl)
2023-09-22 07:14:32 +02:00
Daniel Krupp
97495d3159
[analyzer] TaintPropagation checker strlen() should not propagate (#66086)
strlen(..) call should not propagate taintedness,
because it brings in many false positive findings. It is a common
pattern to copy user provided input to another buffer. In these cases we
always
get warnings about tainted data used as the malloc parameter:

buf = malloc(strlen(tainted_txt) + 1); // false warning

This pattern can lead to a denial of service attack only, when the
attacker can directly specify the size of the allocated area as an
arbitrary large number (e.g. the value is converted from a user provided
string).

Later, we could reintroduce strlen() as a taint propagating function
with the consideration not to emit warnings when the tainted value
cannot be "arbitrarily large" (such as the size of an already allocated
memory block).

The change has been evaluated on the following open source projects:

- memcached: [1 lost false
positive](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=memcached_1.6.8_ednikru_taint_nostrlen_baseline&newcheck=memcached_1.6.8_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved)

- tmux: 0 lost reports
- twin [3 lost false
positives](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=twin_v0.8.1_ednikru_taint_nostrlen_baseline&newcheck=twin_v0.8.1_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved)
- vim [1 lost false
positive](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=vim_v8.2.1920_ednikru_taint_nostrlen_baseline&newcheck=vim_v8.2.1920_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved)
- openssl 0 lost reports
- sqliste [2 lost false
positives](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=sqlite_version-3.33.0_ednikru_taint_nostrlen_baseline&newcheck=sqlite_version-3.33.0_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved)
- ffmpeg 0 lost repots
- postgresql [3 lost false
positives](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=postgres_REL_13_0_ednikru_taint_nostrlen_baseline&newcheck=postgres_REL_13_0_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved)
- tinyxml 0 lost reports
- libwebm 0 lost reports
- xerces 0 lost reports

In all cases the lost reports are originating from copying untrusted
environment variables into another buffer.

There are 2 types of lost false positive reports:
1) [Where the warning is emitted at the malloc call by the
TaintPropagation Checker
](https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=memcached_1.6.8_ednikru_taint_nostrlen_baseline&newcheck=memcached_1.6.8_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved&report-id=2648506&report-hash=2079221954026f17e1ecb614f5f054db&report-filepath=%2amemcached.c)
`
            len = strlen(portnumber_filename)+4+1;
            temp_portnumber_filename = malloc(len);
`

2) When pointers are set based on the length of the tainted string by
the ArrayOutofBoundsv2 checker.
For example [this
](https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=vim_v8.2.1920_ednikru_taint_nostrlen_baseline&newcheck=vim_v8.2.1920_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved&report-id=2649310&report-hash=79dc8522d2cd34ca8e1b2dc2db64b2df&report-filepath=%2aos_unix.c)case.
2023-09-19 11:04:50 +02:00
Balazs Benics
61924da630 [analyzer] Propagate taint for wchar variants of some APIs (#66074)
Functions like `fgets`, `strlen`, `strcat` propagate taint.
However, their `wchar_t` variants don't. This patch fixes that.

Notice, that there could be many more APIs missing.
This patch intends to fix those that so far surfaced,
instead of exhaustively fixing this issue.

https://github.com/llvm/llvm-project/pull/66074
2023-09-14 11:55:10 +02:00
Balazs Benics
8243bc4045 [analyzer] Make socket accept() propagate taint (#66074)
This allows to track taint on real code from `socket()`
to reading into a buffer using `recv()`.

https://github.com/llvm/llvm-project/pull/66074
2023-09-14 11:55:10 +02:00
Balazs Benics
909c963999 [analyzer] Fix stdin declaration in C++ tests (#66074)
The `stdin` declaration should be within `extern "C" {...}`, in C++
mode. In addition, it should be also marked `extern` in both C and
C++ modes.

I tightened the check to ensure we only accept `stdin` if both of these
match. However, from the Juliet test suite's perspective, this commit
should not matter.

https://github.com/llvm/llvm-project/pull/66074
2023-09-14 11:55:10 +02:00
Daniel Krupp
26b19a67e5 [clang][analyzer]Fix non-effective taint sanitation
There was a bug in alpha.security.taint.TaintPropagation checker
in Clang Static Analyzer.
Taint filtering could only sanitize const arguments.
After this patch, taint filtering is effective also
on non-const parameters.

Differential Revision: https://reviews.llvm.org/D155848
2023-07-21 15:11:13 +02:00
Jordan Rupprecht
8b39527535 [NFC] Wrap entire debug logging loop in LLVM_DEBUG 2023-04-26 05:28:15 -07:00
Daniel Krupp
343bdb1094 [analyzer] Show taint origin and propagation correctly
This patch improves the diagnostics of the alpha.security.taint.TaintPropagation
checker and taint related checkers by showing the "Taint originated here" note
at the correct place, where the attacker may inject it. This greatly improves
the understandability of the taint reports.

In the baseline the taint source was pointing to an invalid location, typically
somewhere between the real taint source and sink.

After the fix, the "Taint originated here" tag is correctly shown at the taint
source. This is the function call where the attacker can inject a malicious data
(e.g. reading from environment variable, reading from file, reading from
standard input etc.).

This patch removes the BugVisitor from the implementation and replaces it with 2
new NoteTags. One, in the taintOriginTrackerTag() prints the "taint originated
here" Note and the other in taintPropagationExplainerTag() explaining how the
taintedness is propagating from argument to argument or to the return value
("Taint propagated to the Xth argument"). This implementation uses the
interestingess BugReport utility to track back the tainted symbols through
propagating function calls to the point where the taintedness was introduced by
a source function call.

The checker which wishes to emit a Taint related diagnostic must use the
categories::TaintedData BugType category and must mark the tainted symbols as
interesting. Then the TaintPropagationChecker will automatically generate the
"Taint originated here" and the "Taint propagated to..." diagnostic notes.
2023-04-26 12:43:36 +02:00
Kazu Hirata
6ad0788c33 [clang] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<.  I'll post
a separate patch to remove #include "llvm/ADT/Optional.h".

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14 12:31:01 -08:00
serge-sans-paille
d9ab3e82f3
[clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.

The above patchset caused some version of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.

Differential Revision: https://reviews.llvm.org/D139881
2022-12-27 09:55:19 +01:00
Vitaly Buka
aa171833ab Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4 (part 2)"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4"

GCC build hangs on this bot https://lab.llvm.org/buildbot/#/builders/37/builds/19104
compiling CMakeFiles/obj.clangBasic.dir/Targets/AArch64.cpp.d

The bot uses GNU 11.3.0, but I can reproduce locally with gcc (Debian 12.2.0-3) 12.2.0.

This reverts commit caa713559bd38f337d7d35de35686775e8fb5175.
This reverts commit 06b90e2e9c991e211fecc97948e533320a825470.
This reverts commit e953ae5bbc313fd0cc980ce021d487e5b5199ea4.
2022-12-25 23:12:47 -08:00
serge-sans-paille
e953ae5bbc
[clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of 719d98dfa841c522d8d452f0685e503538415a53 that into
account a GGC issue (probably
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92181) when dealing with
intiailizer_list and constant expressions.

Workaround this by avoiding initializer list, at the expense of a
temporary plain old array.

Differential Revision: https://reviews.llvm.org/D139881
2022-12-24 10:25:06 +01:00
serge-sans-paille
07d9ab9aa5
Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"
There are still remaining issues with GCC 12, see for instance

https://lab.llvm.org/buildbot/#/builders/93/builds/12669

This reverts commit 5ce4e92264102de21760c94db9166afe8f71fcf6.
2022-12-23 13:29:21 +01:00
serge-sans-paille
5ce4e92264
[clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of 719d98dfa841c522d8d452f0685e503538415a53 with a
change to llvm/utils/TableGen/OptParserEmitter.cpp to cope with GCC bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108158

Differential Revision: https://reviews.llvm.org/D139881
2022-12-23 12:48:17 +01:00
serge-sans-paille
b7065a31b5
Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"
Failing builds: https://lab.llvm.org/buildbot#builders/9/builds/19030
This is GCC specific and has been reported upstream: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108158

This reverts commit 719d98dfa841c522d8d452f0685e503538415a53.
2022-12-23 11:36:56 +01:00
serge-sans-paille
719d98dfa8
[clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D139881
2022-12-23 10:31:47 +01:00
Kazu Hirata
6c8b8a6a2a [Checkers] Use std::optional in GenericTaintChecker.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 08:00:24 -08:00
Kazu Hirata
180600660b [StaticAnalyzer] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 11:34:24 -08:00
Kazu Hirata
ca4af13e48 [clang] Don't use Optional::getValue (NFC) 2022-06-20 22:59:26 -07:00
Kazu Hirata
064a08cd95 Don't use Optional::hasValue (NFC) 2022-06-20 20:05:16 -07:00
Kazu Hirata
06decd0b41 [clang] Use value_or instead of getValueOr (NFC) 2022-06-18 23:21:34 -07:00
Balazs Benics
f4fc3f6ba3 [analyzer] Treat system globals as mutable if they are not const
Previously, system globals were treated as immutable regions, unless it
was the `errno` which is known to be frequently modified.

D124244 wants to add a check for stores to immutable regions.
It would basically turn all stores to system globals into an error even
though we have no reason to believe that those mutable sys globals
should be treated as if they were immutable. And this leads to
false-positives if we apply D124244.

In this patch, I'm proposing to treat mutable sys globals actually
mutable, hence allocate them into the `GlobalSystemSpaceRegion`, UNLESS
they were declared as `const` (and a primitive arithmetic type), in
which case, we should use `GlobalImmutableSpaceRegion`.

In any other cases, I'm using the `GlobalInternalSpaceRegion`, which is
no different than the previous behavior.

---

In the tests I added, only the last `expected-warning` was different, compared to the baseline.
Which is this:
```lang=C++
void test_my_mutable_system_global_constraint() {
  assert(my_mutable_system_global > 2);
  clang_analyzer_eval(my_mutable_system_global > 2); // expected-warning {{TRUE}}
  invalidate_globals();
  clang_analyzer_eval(my_mutable_system_global > 2); // expected-warning {{UNKNOWN}} It was previously TRUE.
}
void test_my_mutable_system_global_assign(int x) {
  my_mutable_system_global = x;
  clang_analyzer_eval(my_mutable_system_global == x); // expected-warning {{TRUE}}
  invalidate_globals();
  clang_analyzer_eval(my_mutable_system_global == x); // expected-warning {{UNKNOWN}} It was previously TRUE.
}
```

---

Unfortunately, the taint checker will be also affected.
The `stdin` global variable is a pointer, which is assumed to be a taint
source, and the rest of the taint propagation rules will propagate from
it.
However, since mutable variables are no longer treated immutable, they
also get invalidated, when an opaque function call happens, such as the
first `scanf(stdin, ...)`. This would effectively remove taint from the
pointer, consequently disable all the rest of the taint propagations
down the line from the `stdin` variable.

All that said, I decided to look through `DerivedSymbol`s as well, to
acquire the memregion in that case as well. This should preserve the
previously existing taint reports.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127306
2022-06-15 17:08:27 +02:00
Balazs Benics
96ccb690a0 [analyzer][NFC] Prefer using isa<> instead getAs<> in conditions
Depends on D125709

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127742
2022-06-15 16:58:13 +02:00
Balazs Benics
f1b18a79b7 [analyzer][NFC] Remove dead code and modernize surroundings
Thanks @kazu for helping me clean these parts in D127799.

I'm leaving the dump methods, along with the unused visitor handlers and
the forwarding methods.

The dead parts actually helped to uncover two bugs, to which I'm going
to post separate patches.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127836
2022-06-15 16:50:12 +02:00
Tom Ritter
82f3ed9904 [analyzer] Expose Taint.h to plugins
Reviewed By: NoQ, xazax.hun, steakhal

Differential Revision: https://reviews.llvm.org/D123155
2022-04-19 16:55:01 +02:00
Endre Fülöp
4fd6c6e65a [analyzer] Add more propagations to Taint analysis
Add more functions as taint propators to GenericTaintChecker.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D120369
2022-03-07 13:18:54 +01:00
Endre Fülöp
34a7387986 [analyzer] Add more sources to Taint analysis
Add more functions as taint sources to GenericTaintChecker.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D120236
2022-02-28 11:33:02 +01:00
Fangrui Song
ecff9b65b5 [analyzer] Just use default capture after 7fd60ee6e0a87957a718297a4a42d9881fc561e3 2022-02-24 10:06:11 -08:00
Fangrui Song
7fd60ee6e0 [analyzer] Fix -Wunused-lambda-capture in -DLLVM_ENABLE_ASSERTIONS=off builds 2022-02-24 00:13:13 -08:00
Balazs Benics
7036413dc2 Revert "Revert "[analyzer] Fix taint rule of fgets and setproctitle_init""
This reverts commit 2acead35c1289d2b3593a992b0639ca6427e481f.

Let's try `REQUIRES: asserts`.
2022-02-23 12:55:31 +01:00
Balazs Benics
a848a5cf2f Revert "Revert "[analyzer] Fix taint propagation by remembering to the location context""
This reverts commit d16c5f4192c30d53468a472c6820163a81192825.

Let's try `REQUIRES: asserts`.
2022-02-23 12:53:07 +01:00
Balazs Benics
fa0a80e017 Revert "Revert "[analyzer] Add failing test case demonstrating buggy taint propagation""
This reverts commit b8ae323cca61dc1edcd36e9ae18c7e4c3d76d52e.

Let's try `REQUIRES: asserts`.
2022-02-23 10:48:06 +01:00
Balazs Benics
b8ae323cca Revert "[analyzer] Add failing test case demonstrating buggy taint propagation"
This reverts commit 744745ae195f0997e5bfd5aa2de47b9ea156b6a6.

I'm reverting this since this patch caused a build breakage.

https://lab.llvm.org/buildbot/#/builders/91/builds/3818
2022-02-14 18:45:46 +01:00
Balazs Benics
d16c5f4192 Revert "[analyzer] Fix taint propagation by remembering to the location context"
This reverts commit b099e1e562555fbc67e2e0abbc15074e14a85ff1.

I'm reverting this since the head of the patch stack caused a build
breakage.

https://lab.llvm.org/buildbot/#/builders/91/builds/3818
2022-02-14 18:45:46 +01:00
Balazs Benics
2acead35c1 Revert "[analyzer] Fix taint rule of fgets and setproctitle_init"
This reverts commit bf5963bf19670ea58facdf57492e147c13bb650f.

I'm reverting this since the head of the patch stack caused a build
breakage.

https://lab.llvm.org/buildbot/#/builders/91/builds/3818
2022-02-14 18:45:46 +01:00
Balazs Benics
bf5963bf19 [analyzer] Fix taint rule of fgets and setproctitle_init
There was a typo in the rule.
`{{0}, ReturnValueIndex}` meant that the discrete index is `0` and the
variadic index is `-1`.
What we wanted instead is that both `0` and `-1` are in the discrete index
list.

Instead of this, we wanted to express that both `0` and the
`ReturnValueIndex` is in the discrete arg list.

The manual inspection revealed that `setproctitle_init` also suffered a
probably incomplete propagation rule.

Reviewed By: Szelethus, gamesh411

Differential Revision: https://reviews.llvm.org/D119129
2022-02-14 16:55:55 +01:00
Balazs Benics
b099e1e562 [analyzer] Fix taint propagation by remembering to the location context
Fixes the issue D118987 by mapping the propagation to the callsite's
LocationContext.
This way we can keep track of the in-flight propagations.

Note that empty propagation sets won't be inserted.

Reviewed By: NoQ, Szelethus

Differential Revision: https://reviews.llvm.org/D119128
2022-02-14 16:55:55 +01:00
Balazs Benics
744745ae19 [analyzer] Add failing test case demonstrating buggy taint propagation
Recently we uncovered a serious bug in the `GenericTaintChecker`.
It was already flawed before D116025, but that was the patch that turned
this silent bug into a crash.

It happens if the `GenericTaintChecker` has a rule for a function, which
also has a definition.

  char *fgets(char *s, int n, FILE *fp) {
    nested_call();   // no parameters!
    return (char *)0;
  }

  // Within some function:
  fgets(..., tainted_fd);

When the engine inlines the definition and finds a function call within
that, the `PostCall` event for the call will get triggered sooner than the
`PostCall` for the original function.
This mismatch violates the assumption of the `GenericTaintChecker` which
wants to propagate taint information from the `PreCall` event to the
`PostCall` event, where it can actually bind taint to the return value
**of the same call**.

Let's get back to the example and go through step-by-step.
The `GenericTaintChecker` will see the `PreCall<fgets(..., tainted_fd)>`
event, so it would 'remember' that it needs to taint the return value
and the buffer, from the `PostCall` handler, where it has access to the
return value symbol.
However, the engine will inline fgets and the `nested_call()` gets
evaluated subsequently, which produces an unimportant
`PreCall<nested_call()>`, then a `PostCall<nested_call()>` event, which is
observed by the `GenericTaintChecker`, which will unconditionally mark
tainted the 'remembered' arg indexes, trying to access a non-existing
argument, resulting in a crash.
If it doesn't crash, it will behave completely unintuitively, by marking
completely unrelated memory regions tainted, which is even worse.

The resulting assertion is something like this:
  Expr.h: const Expr *CallExpr::getArg(unsigned int) const: Assertion
          `Arg < getNumArgs() && "Arg access out of range!"' failed.

The gist of the backtrace:
  CallExpr::getArg(unsigned int) const
  SimpleFunctionCall::getArgExpr(unsigned int)
  CallEvent::getArgSVal(unsigned int) const
  GenericTaintChecker::checkPostCall(const CallEvent &, CheckerContext&) const

Prior to D116025, there was a check for the argument count before it
applied taint, however, it still suffered from the same underlying
issue/bug regarding propagation.

This path does not intend to fix the bug, rather start a discussion on
how to fix this.

---

Let me elaborate on how I see this problem.

This pre-call, post-call juggling is just a workaround.
The engine should by itself propagate taint where necessary right where
it invalidates regions.
For the tracked values, which potentially escape, we need to erase the
information we know about them; and this is exactly what is done by
invalidation.
However, in the case of taint, we basically want to approximate from the
opposite side of the spectrum.
We want to preserve taint in most cases, rather than cleansing them.

Now, we basically sanitize all escaping tainted regions implicitly,
since invalidation binds a fresh conjured symbol for the given region,
and that has not been associated with taint.

IMO this is a bad default behavior, we should be more aggressive about
preserving taint if not further spreading taint to the reachable
regions.

We have a couple of options for dealing with it (let's call it //tainting
policy//):
  1) Taint only the parameters which were tainted prior to the call.
  2) Taint the return value of the call, since it likely depends on the
     tainted input - if any arguments were tainted.
  3) Taint all escaped regions - (maybe transitively using the cluster
     algorithm) - if any arguments were tainted.
  4) Not taint anything - this is what we do right now :D

The `ExprEngine` should not deal with taint on its own. It should be done
by a checker, such as the `GenericTaintChecker`.
However, the `Pre`-`PostCall` checker callbacks are not designed for this.
`RegionChanges` would be a much better fit for modeling taint propagation.
What we would need in the `RegionChanges` callback is the `State` prior
invalidation, the `State` after the invalidation, and a `CheckerContext` in
which the checker can create transitions, where it would place `NoteTags`
for the modeled taint propagations and report errors if a taint sink
rule gets violated.
In this callback, we could query from the prior State, if the given
value was tainted; then act and taint if necessary according to the
checker's tainting policy.

By using RegionChanges for this, we would 'fix' the mentioned
propagation bug 'by-design'.

Reviewed By: Szelethus

Differential Revision: https://reviews.llvm.org/D118987
2022-02-14 16:55:55 +01:00
Tres Popp
262cc74e0b Fix pair construction with an implicit constructor inside. 2022-01-18 18:01:52 +01:00
Endre Fülöp
17f74240e6 [analyzer][NFC] Refactor GenericTaintChecker to use CallDescriptionMap
GenericTaintChecker now uses CallDescriptionMap to describe the possible
operation in code which trigger the introduction (sources), the removal
(filters), the passing along (propagations) and detection (sinks) of
tainted values.

Reviewed By: steakhal, NoQ

Differential Revision: https://reviews.llvm.org/D116025
2022-01-18 16:04:04 +01:00
Balazs Benics
49285f43e5 [analyzer] sprintf is a taint propagator not a source
Due to a typo, `sprintf()` was recognized as a taint source instead of a
taint propagator. It was because an empty taint source list - which is
the first parameter of the `TaintPropagationRule` - encoded the
unconditional taint sources.
This typo effectively turned the `sprintf()` into an unconditional taint
source.

This patch fixes that typo and demonstrated the correct behavior with
tests.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D112558
2021-10-28 11:03:02 +02:00
Kazu Hirata
0abb5d293c [Sema, StaticAnalyzer] Use StringRef::contains (NFC) 2021-10-20 08:02:36 -07:00
Kazu Hirata
e567f37dab [clang] Use llvm::is_contained (NFC) 2021-10-13 20:41:55 -07:00
Balazs Benics
edde4efc66 [analyzer] Introduce the assume-controlled-environment config option
If the `assume-controlled-environment` is `true`, we should expect `getenv()`
to succeed, and the result should not be considered tainted.
By default, the option will be `false`.

Reviewed By: NoQ, martong

Differential Revision: https://reviews.llvm.org/D111296
2021-10-13 10:50:26 +02:00
Barry Revzin
92310454bf Make LLVM build in C++20 mode
Part of the <=> changes in C++20 make certain patterns of writing equality
operators ambiguous with themselves (sorry!).
This patch goes through and adjusts all the comparison operators such that
they should work in both C++17 and C++20 modes. It also makes two other small
C++20-specific changes (adding a constructor to a type that cases to be an
aggregate, and adding casts from u8 literals which no longer have type
const char*).

There were four categories of errors that this review fixes.
Here are canonical examples of them, ordered from most to least common:

// 1) Missing const
namespace missing_const {
    struct A {
    #ifndef FIXED
        bool operator==(A const&);
    #else
        bool operator==(A const&) const;
    #endif
    };

    bool a = A{} == A{}; // error
}

// 2) Type mismatch on CRTP
namespace crtp_mismatch {
    template <typename Derived>
    struct Base {
    #ifndef FIXED
        bool operator==(Derived const&) const;
    #else
        // in one case changed to taking Base const&
        friend bool operator==(Derived const&, Derived const&);
    #endif
    };

    struct D : Base<D> { };

    bool b = D{} == D{}; // error
}

// 3) iterator/const_iterator with only mixed comparison
namespace iter_const_iter {
    template <bool Const>
    struct iterator {
        using const_iterator = iterator<true>;

        iterator();

        template <bool B, std::enable_if_t<(Const && !B), int> = 0>
        iterator(iterator<B> const&);

    #ifndef FIXED
        bool operator==(const_iterator const&) const;
    #else
        friend bool operator==(iterator const&, iterator const&);
    #endif
    };

    bool c = iterator<false>{} == iterator<false>{} // error
          || iterator<false>{} == iterator<true>{}
          || iterator<true>{} == iterator<false>{}
          || iterator<true>{} == iterator<true>{};
}

// 4) Same-type comparison but only have mixed-type operator
namespace ambiguous_choice {
    enum Color { Red };

    struct C {
        C();
        C(Color);
        operator Color() const;
        bool operator==(Color) const;
        friend bool operator==(C, C);
    };

    bool c = C{} == C{}; // error
    bool d = C{} == Red;
}

Differential revision: https://reviews.llvm.org/D78938
2020-12-17 10:44:10 +00:00
Artem Dergachev
8781944141 [analyzer] GenericTaint: Don't expect CallEvent to always have a Decl.
This isn't the case when the callee is completely unknown,
eg. when it is a symbolic function pointer.
2020-04-20 15:31:43 +03:00
Kirstóf Umann
bda3dd0d98 [analyzer][NFC] Change LangOptions to CheckerManager in the shouldRegister* functions
Some checkers may not only depend on language options but also analyzer options.
To make this possible this patch changes the parameter of the shouldRegister*
function to CheckerManager to be able to query the analyzer options when
deciding whether the checker should be registered.

Differential Revision: https://reviews.llvm.org/D75271
2020-03-27 14:34:09 +01:00
Balazs Benics
95a94df5a9 [analyzer][NFC] Use CallEvent checker callback in GenericTaintChecker
Summary:
Intended to be a non-functional change but it turned out CallEvent handles
constructor calls unlike CallExpr which doesn't triggered for constructors.

All in all, this change shouldn't be observable since constructors are not
yet propagating taintness like functions.
In the future constructors should propagate taintness as well.

This change includes:
 - NFCi change all uses of the CallExpr to CallEvent
 - NFC rename some functions, mark static them etc.
 - NFC omit explicit TaintPropagationRule type in switches
 - NFC apply some clang-tidy fixits

Reviewers: NoQ, Szelethus, boga95

Reviewed By: Szelethus

Subscribers: martong, whisperity, xazax.hun, baloghadamsoftware, szepet,
a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72035
2020-03-04 17:03:59 +01:00
Benjamin Kramer
adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Borsik Gabor
273e674252 [analyzer] Add support for namespaces to GenericTaintChecker
This patch introduces the namespaces for the configured functions and
also enables the use of the member functions.

I added an optional Scope field for every configured function. Functions
without Scope match for every function regardless of the namespace.
Functions with Scope will match if the full name of the function starts
with the Scope.
Multiple functions can exist with the same name.

Differential Revision: https://reviews.llvm.org/D70878
2019-12-15 12:11:22 +01:00