This commit handles the following types:
- clang::ExternalASTSource
- clang::TargetInfo
- clang::ASTContext
- clang::SourceManager
- clang::FileManager
Part of cleanup #151026
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This PR makes it so that `CompilerInvocation` needs to be provided to
`CompilerInstance` on construction. There are a couple of benefits in my
view:
* This makes it impossible to misuse some `CompilerInstance` APIs. For
example, there were cases where `createDiagnostics()` was called before
`setInvocation()`, causing the `DiagnosticsEngine` to use the
default-constructed `DiagnosticOptions` instead of the intended ones.
* This shrinks `CompilerInstance`'s state space.
* This makes it possible to access **the** invocation in
`CompilerInstance`'s constructor (to be used in a follow-up).
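The ordering problem described above can be illustrated with a minimal
sketch. `Invocation` and `Instance` are hypothetical stand-ins, not the
real clang classes; the only point is that a constructor parameter removes
the "used before set" state.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; these are NOT the real clang classes.
struct Invocation {
  std::string DiagnosticFormat = "default";
};

class Instance {
  // Set once in the constructor, so it can never be observed unset.
  std::shared_ptr<Invocation> Inv;

public:
  explicit Instance(std::shared_ptr<Invocation> I) : Inv(std::move(I)) {
    assert(Inv && "an invocation must be supplied on construction");
  }

  // Every later step can rely on the invocation being present.
  std::string createDiagnostics() const {
    return "diagnostics using format: " + Inv->DiagnosticFormat;
  }
};
```

Because there is no setter in this sketch, there is no way to call
`createDiagnostics()` before the options it depends on exist.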
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
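As a rough illustration of the rewrite, the sketch below uses a
hypothetical free function `insertRange` over standard containers. The
LLVM containers provide `insert_range` as a member function instead; the
helper only shows that both spellings insert the same elements.

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical free-function equivalent of the member `insert_range`.
template <typename Set, typename Range>
void insertRange(Set &Dest, const Range &Src) {
  Dest.insert(std::begin(Src), std::end(Src));
}

std::set<int> mergeIds(const std::vector<int> &A, const std::vector<int> &B) {
  std::set<int> Dest;
  // Before: Dest.insert(A.begin(), A.end());
  insertRange(Dest, A);
  insertRange(Dest, B);
  return Dest;
}
```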
Reapply "[analyzer] Delay the checker constructions after parsing"
(#128350)
This reverts commit db836edf47f36ed04cab919a7a2c4414f4d0d7e6, as-is.
Depends on #128368
So far CSA was relying on the LLVM Statistic package that allowed us to
gather some data about analysis of an entire translation unit. However,
the translation unit consists of a collection of loosely related entry
points. Aggregating data across multiple such entry points is often
counterproductive.
This change introduces a new lightweight always-on facility to collect
Boolean or numerical statistics for each entry point and dump them in a
CSV format. Such format makes it easy to aggregate data across multiple
translation units and analyze it with common data-processing tools.
We break down the existing statistics that were collected on the per-TU
basis into values per entry point.
Additionally, we enable the statistics unconditionally (STATISTIC ->
ALWAYS_ENABLED_STATISTIC) to facilitate their use (you can gather the
data with a simple run-time flag rather than having to recompile the
analyzer). These statistics are very light and add virtually no
overhead.
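A minimal sketch of the per-entry-point CSV idea follows. The record type
and the column names are assumptions made for illustration and do not
reflect the analyzer's actual output.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record; the column names are illustrative only.
struct EntryPointStats {
  std::string Name;
  unsigned NumSteps;
  bool ReachedMaxSteps;
};

// Emits one header line, then one row per entry point.
std::string toCSV(const std::vector<EntryPointStats> &Rows) {
  std::ostringstream OS;
  OS << "EntryPoint,NumSteps,ReachedMaxSteps\n";
  for (const EntryPointStats &R : Rows) {
    // Quote the name, since C++ signatures may contain commas.
    OS << '"' << R.Name << "\"," << R.NumSteps << ','
       << (R.ReachedMaxSteps ? "true" : "false") << '\n';
  }
  return OS.str();
}
```

One row per entry point means rows from several translation units can be
concatenated and aggregated without any per-TU bookkeeping.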
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
CPP-6160
Previously checker objects were created by raw `new` calls, which
necessitated managing and calling their destructors explicitly. This
commit refactors this convoluted logic by introducing `unique_ptr`s
to manage the ownership of these objects automatically.
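The ownership change can be sketched as follows. `Checker` and `Manager`
are hypothetical simplifications, not the real `CheckerBase` or
`CheckerManager` interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical checker interface, not the real clang::ento classes.
struct Checker {
  virtual ~Checker() = default;
  virtual std::string name() const = 0;
};

struct NullDerefChecker : Checker {
  std::string name() const override { return "NullDeref"; }
};

class Manager {
  // The vector owns the checkers; they are destroyed with the manager,
  // so no explicit destructor calls are needed.
  std::vector<std::unique_ptr<Checker>> Checkers;

public:
  template <typename T> T &registerChecker() {
    auto Owned = std::make_unique<T>();
    T &Ref = *Owned;
    Checkers.push_back(std::move(Owned));
    return Ref;
  }

  std::size_t size() const { return Checkers.size(); }
};
```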
This change can be thought of as stand-alone code quality improvement;
but I also have a secondary motivation that I'm planning further changes
in the checker registration/initialization process (to formalize our
tradition of multi-part checkers) and this commit "prepares the ground"
for those changes.
Well, yes. It's not pretty.
At least after this we would have a bit more unique pointers than
before.
This is for fixing the memory leak diagnosed by:
https://lab.llvm.org/buildbot/#/builders/24/builds/5580
And that caused the revert of #127409.
With these `unique_ptr`s in place, that patch can finally re-land.
If we were to delay checker constructions after we have a filled
ASTContext, then we could get rid of a bunch of "lazy initializers" in
checkers.
Turns out in the register functions of the checkers we could transfer
the ASTContext and all other things to checkers, so those could benefit
from in-class initializers and const fields.
For example, if a checker would take the ASTContext as the first field,
then the rest of the fields could use it in their in-class initializers,
so the ctor of the checker would only need to set a single field!
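A small sketch of that pattern, with a hypothetical `Context` type
standing in for the ASTContext. It relies on the language rule that
non-static data members are initialized in declaration order, so later
in-class initializers may read the first field.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for clang::ASTContext.
struct Context {
  unsigned PointerWidth = 64;
  std::string IntTypeName = "int";
};

class ExampleChecker {
  // Declared first, therefore initialized first.
  const Context &Ctx;
  // These in-class initializers may safely use Ctx.
  const unsigned PtrWidth = Ctx.PointerWidth;
  const std::string IntName = Ctx.IntTypeName;

public:
  // The constructor only needs to set a single field.
  explicit ExampleChecker(const Context &C) : Ctx(C) {}

  unsigned pointerWidth() const { return PtrWidth; }
  const std::string &intName() const { return IntName; }
};
```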
This would open up countless opportunities for cleaning up the
aesthetics of our checkers.
I attached a single use-case for the AST and the PP for demonstration
purposes. You can imagine the rest.
**FYI: This may be a breaking change** for some downstream users that may
have had some means to attach different listeners and whatnot to e.g. the
Preprocessor inside their checker register functions. Since we delay the
calls to these register fns after parsing is already done, they would of
course miss the parsing Preprocessor events.
Specifically, add a scope for
- each work-list step,
- each entry point,
- each checker run within a step, and
- bug-suppression phase at the end of the analysis of an entry-point.
These scopes add no perceptible run-time overhead when time-tracing is
disabled. You can enable it and generate a time trace using the
`-ftime-trace=file.json` option.
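The idea of a scope that costs almost nothing when tracing is off can be
sketched with a small RAII type. `ScopedTimer` and `TraceEvent` are
hypothetical; LLVM's own facility for this is `llvm::TimeTraceScope`.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event record for illustration.
struct TraceEvent {
  std::string Name;
  std::chrono::steady_clock::duration Elapsed;
};

class ScopedTimer {
  std::vector<TraceEvent> *Sink; // null means tracing is disabled
  std::string Name;
  std::chrono::steady_clock::time_point Start;

public:
  ScopedTimer(std::vector<TraceEvent> *S, std::string N)
      : Sink(S), Name(std::move(N)) {
    if (Sink) // Only read the clock when tracing is enabled.
      Start = std::chrono::steady_clock::now();
  }

  ~ScopedTimer() {
    if (Sink)
      Sink->push_back({Name, std::chrono::steady_clock::now() - Start});
  }

  ScopedTimer(const ScopedTimer &) = delete;
  ScopedTimer &operator=(const ScopedTimer &) = delete;
};
```

Nested scopes record the inner event first, because the inner destructor
runs before the outer one.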
See also the RFC:
https://discourse.llvm.org/t/analyzer-rfc-ftime-trace-time-scopes-for-steps-and-entry-points/84343
--
CPP-6065
Starting with 41e3919ded78d8870f7c95e9181c7f7e29aa3cc4 DiagnosticsEngine
creation might perform IO. It was implicitly defaulting to
getRealFileSystem. This patch makes it explicit by pushing the decision
making to callers.
It uses the ambient VFS if one is available, and keeps using
`getRealFileSystem` if there is none.
Random testing revealed it's possible to crash the analyzer with the
command line invocation:
clang -cc1 -analyze -analyzer-checker=nullability empty.c
where empty.c is an empty source file.
```
clang: <root>/clang/lib/StaticAnalyzer/Core/CheckerManager.cpp:56:
void clang::ento::CheckerManager::finishedCheckerRegistration():
Assertion `Event.second.HasDispatcher && "No dispatcher registered for an event"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/
Stack dump:
0. Program arguments: clang -cc1 -analyze -analyzer-checker=nullability nullability-nocrash.c
#0 ...
...
#7 <addr> clang::ento::CheckerManager::finishedCheckerRegistration()
#8 <addr> clang::ento::CheckerManager::CheckerManager(clang::ASTContext&,
clang::AnalyzerOptions&, clang::Preprocessor const&,
llvm::ArrayRef<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char>>>, llvm::ArrayRef<std::function<void (clang::ento::CheckerRegistry&)>>)
```
This commit removes the assertion which failed here, because it was
logically incorrect: it required that if an Event is handled by some
(enabled) checker, then there must be an **enabled** checker which can
emit that kind of Event. It should be OK to disable the event-producing
checkers but enable an event-consuming checker which has different
responsibilities in addition to handling the events.
Note that this assertion was in an `#ifndef NDEBUG` block, so this
change does not impact the non-debug builds.
Co-authored-by: Vince Bridgers <vince.a.bridgers@ericsson.com>
When debugging CSA issues, sometimes it would be useful to have a
dedicated note for the analysis entry point, aka. the function name you
would need to pass as "-analyze-function=XYZ" to reproduce a specific
issue.
One way we use (or will use) this downstream is to provide tooling on
top of creduce to supercharge productivity by automatically reducing
test cases on crashes, for example.
This will be added only if the "-analyzer-note-analysis-entry-points" is
set or the "analyzer-display-progress" is on.
This additional entry point marker will be the first "note" if enabled,
with the following message: "[debug] analyzing from XYZ". They are
prefixed by "[debug]" to remind the CSA developer that this is only
meant to be visible for them, for debugging purposes.
CPP-5012
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
I'm involved with the Static Analyzer for the most part.
I think we should embrace newer language standard features and gradually
move forward.
Differential Revision: https://reviews.llvm.org/D154325
Thanks @kazu for helping me clean these parts in D127799.
I'm leaving the dump methods, along with the unused visitor handlers and
the forwarding methods.
The dead parts actually helped to uncover two bugs, to which I'm going
to post separate patches.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D127836
I'm trying to remove unused options from the `Analyses.def` file, then
merge the rest of the useful options into the `AnalyzerOptions.def`.
Then make sure one can set these by an `-analyzer-config XXX=YYY` style
flag.
Then surface the `-analyzer-config` to the `clang` frontend.
After all of this, we can pursue the tablegen approach described in
https://discourse.llvm.org/t/rfc-tablegen-clang-static-analyzer-engine-options-for-better-documentation/61488
In this patch, I'm proposing flag deprecations.
We should support deprecated analyzer flags for exactly one release. In
this case I'm planning to drop this flag in `clang-16`.
In the clang frontend, we no longer pass this option to the cc1
frontend; instead, we emit a warning diagnostic reminding users about
this deprecated flag, which will be turned into an error in clang-16.
Unfortunately, I had to remove all the tests referring to this flag,
causing a mass change. I've also added a test for checking this warning.
I've seen that `scan-build` also uses this flag, but I think we should
remove that part only after we turn this into a hard error.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D126215
This reverts commit d50d9946d1d7e5f20881f0bc71fbd025040b1c96.
Broke check-clang, see comments on https://reviews.llvm.org/D126067
Also revert dependent change "[analyzer] Deprecate the unused 'analyzer-opt-analyze-nested-blocks' cc1 flag"
This reverts commit 07b4a6d0461fe64e10d30029ed3d598e49ca3810.
Also revert "[analyzer] Fix buildbots after introducing a new frontend warning"
This reverts commit 90374df15ddc58d823ca42326a76f58e748f20eb.
(See https://reviews.llvm.org/rG90374df15ddc58d823ca42326a76f58e748f20eb)
I'm trying to remove unused options from the `Analyses.def` file, then
merge the rest of the useful options into the `AnalyzerOptions.def`.
Then make sure one can set these by an `-analyzer-config XXX=YYY` style
flag.
Then surface the `-analyzer-config` to the `clang` frontend.
After all of this, we can pursue the tablegen approach described in
https://discourse.llvm.org/t/rfc-tablegen-clang-static-analyzer-engine-options-for-better-documentation/61488
In this patch, I'm proposing flag deprecations.
We should support deprecated analyzer flags for exactly one release. In
this case I'm planning to drop this flag in `clang-16`.
In the clang frontend, we no longer pass this option to the cc1
frontend; instead, we emit a warning diagnostic reminding users about
this deprecated flag, which will be turned into an error in clang-16.
Unfortunately, I had to remove all the tests referring to this flag,
causing a mass change. I've also added a test for checking this warning.
I've seen that `scan-build` also uses this flag, but I think we should
remove that part only after we turn this into a hard error.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D126215
This reverts commit 3988bd13988aad72ec979beb2361e8738584926b.
Did not build on this bot:
https://lab.llvm.org/buildbot#builders/215/builds/6372
/usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to
‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock*>&, const std::pair<long unsigned int, std::nullptr_t>&)’
177 | { return bool(_M_comp(*__it, __val)); }
One could reuse this functor instead of rolling one's own version.
There were a couple other cases where the code was similar, but not
quite the same, such as it might have an assertion in the lambda or other
constructs. Thus, I've not touched any of those, as it might change the
behavior in some way.
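For readers unfamiliar with the functor, here is a minimal sketch of the
kind of replacement involved. `LessFirst` is a hypothetical
re-implementation written for illustration; LLVM ships its own
`llvm::less_first`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical re-implementation; compares two pairs by their first member.
struct LessFirst {
  template <typename T> bool operator()(const T &L, const T &R) const {
    return L.first < R.first;
  }
};

using Entry = std::pair<int, std::string>;

std::vector<Entry> sortByKey(std::vector<Entry> V) {
  // Before: a hand-written lambda returning L.first < R.first.
  std::sort(V.begin(), V.end(), LessFirst());
  return V;
}
```

Note that this sketch takes both arguments as the same type `T`, so it
cannot compare two different pair types against each other.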
As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal
Chris Lattner
> LLVM intentionally has a “yes, you can apply common sense judgement to
> things” policy when it comes to code review. If you are doing mechanical
> patches (e.g. adopting less_first) that apply to the entire monorepo,
> then you don’t need everyone in the monorepo to sign off on it. Having
> some +1 validation from someone is useful, but you don’t need everyone
> whose code you touch to weigh in.
Differential Revision: https://reviews.llvm.org/D126068
This new CTU implementation is the natural extension of the normal single TU
analysis. The approach consists of two analysis phases. During the first phase,
we do a normal single TU analysis. During this phase, if we find a foreign
function (that could be inlined from another TU), then we do not inline it
immediately; instead, we mark it to be analysed later.
When the first phase is finished then we start the second phase, the CTU phase.
In this phase, we continue the analysis from that point (exploded node)
which had been enqueued during the first phase. We gradually extend the
exploded graph of the single TU analysis with the new node that was
created by the inlining of the foreign function.
We count the number of analysis steps of the first phase and we limit the
second (ctu) phase with this number.
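The two phases and the step budget can be sketched with a toy work list.
This is a heavy simplification under stated assumptions: `WorkItem` and
`analyze` are hypothetical, and the real analyzer enqueues exploded nodes
rather than function names.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical work item for illustration.
struct WorkItem {
  std::string Function;
  bool IsForeign; // defined in another translation unit
};

struct Result {
  std::vector<std::string> Phase1, Phase2;
  unsigned Phase1Steps = 0;
};

Result analyze(std::deque<WorkItem> WorkList) {
  Result R;
  std::deque<WorkItem> Deferred;

  // Phase 1: single-TU analysis; foreign functions are deferred.
  while (!WorkList.empty()) {
    WorkItem Item = WorkList.front();
    WorkList.pop_front();
    ++R.Phase1Steps;
    if (Item.IsForeign)
      Deferred.push_back(Item);
    else
      R.Phase1.push_back(Item.Function);
  }

  // Phase 2: CTU analysis, limited by the step count of phase 1.
  unsigned Budget = R.Phase1Steps;
  while (!Deferred.empty() && Budget > 0) {
    R.Phase2.push_back(Deferred.front().Function);
    Deferred.pop_front();
    --Budget;
  }
  return R;
}
```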
This new implementation makes it convenient for users to run the
single-TU and the CTU analyses in one go; they don't need to run the two
analyses separately. Thus, we call this new implementation "onego" CTU.
Discussion:
https://discourse.llvm.org/t/rfc-much-faster-cross-translation-unit-ctu-analysis-implementation/61728
Differential Revision: https://reviews.llvm.org/D123773
Do import the definition of objects from a foreign translation unit if their type is const and trivial.
Differential Revision: https://reviews.llvm.org/D122805
This reverts commit 620d99b7edc64ee87b1ce209f179305e6a919006.
Let's see if removing the two offending RUN lines makes this patch pass.
Not ideal to drop tests but, it's just a debugging feature, probably not
that important.
This reverts commit 841817b1ed26c1fbb709957d54c0e2751624fbf8.
Ah, it still fails on build bots for some reason.
Pinning the target triple was not enough.
Sometimes when I pass the mentioned option I forget about passing the
parameter list for c++ sources.
It would also be useful for newcomers to learn about this.
This patch introduces some logic checking common misuses involving
`-analyze-function`.
Reviewed-By: martong
Differential Revision: https://reviews.llvm.org/D118690
This reverts commit 9d6a6159730171bc0faf78d7f109d6543f4c93c2.
Exit Code: 1
Command Output (stderr):
--
/scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp:53:21: error: CHECK-EMPTY-NOT: excluded string found in input // CHECK-EMPTY-NOT: Every top-level function was skipped.
^
<stdin>:1:1: note: found here
Every top-level function was skipped.
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Input file: <stdin>
Check file: /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp
-dump-input=help explains the following input dump.
Input was:
<<<<<<
1: Every top-level function was skipped.
not:53 !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match expected
2: Pass the -analyzer-display-progress for tracking which functions are analyzed.
>>>>>>
Sometimes when I pass the mentioned option I forget about passing the
parameter list for c++ sources.
It would also be useful for newcomers to learn about this.
This patch introduces some logic checking common misuses involving
`-analyze-function`.
Reviewed-By: martong
Differential Revision: https://reviews.llvm.org/D118690
Some projects [1,2,3] have flex-generated files besides bison-generated
ones.
Unfortunately, the comment `"/* A lexical scanner generated by flex */"`
generated by the tools is not necessarily at the beginning of the file,
thus we need to quickly skim through the file for this needle string.
Luckily, StringRef can do this operation in an efficient way.
That being said, now the bison comment is not required to be at the very
beginning of the file. This allows us to detect a couple more cases
[4,5,6].
Alternatively, we could say that we only allow whitespace characters
before matching the bison/flex header comment. That would prevent the
(probably) unnecessary string search in the buffer. However, I could not
verify that these tools would actually respect this assumption.
In addition, the Twin project [1], for example, has other non-whitespace
characters (some preprocessor directives) before the flex-generated
header comment. So the heuristic in the previous paragraph won't work
with that.
Thus, I would advocate for the current implementation.
According to my measurement, this patch won't introduce measurable
performance degradation, even though we will do 2 linear scans.
I introduce the ignore-bison-generated-files and
ignore-flex-generated-files options to control skipping these files.
Both of these options are true by default.
[1]: https://github.com/cosmos72/twin/blob/master/server/rcparse_lex.cpp#L7
[2]: 22362cdcf9/sandbox/count-words/lexer.c (L6)
[3]: 11abdf6462/lab1/lex.yy.c (L6)
[4]: 47f5b2cfe2/B_yacc/1/y1.tab.h (L2)
[5]: 71d1bf9b1e/src/VBox/Additions/x11/x11include/xorg-server-1.8.0/parser.h (L2)
[6]: 3f773ceb13/Framework/OpenEars.framework/Versions/A/Headers/jsgf_parser.h (L2)
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D114510
I just read this part of the code, and I found the nested ifs less
readable.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D114441
Summary: This patch is a part of an attempt to obtain more
timer data from the analyzer. In this patch, we try to use
`llvm::TimeRecord` to record the time before starting the analysis
and to print the time that a specific function takes while
being analyzed.
The timer data is printed along with the
-analyzer-display-progress outputs.
ANALYZE (Syntax): test.c functionName : 0.4 ms
ANALYZE (Path, Inline_Regular): test.c functionName : 2.6 ms
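A rough sketch of producing such a line follows. It swaps in `std::chrono`
for `llvm::TimeRecord`, and both helper names are hypothetical; the
output format is copied from the sample lines above.

```cpp
#include <cassert>
#include <chrono>
#include <iomanip>
#include <sstream>
#include <string>

using Millis = std::chrono::duration<double, std::milli>;

// Hypothetical formatter mirroring the sample output above.
std::string formatProgress(const std::string &Mode, const std::string &File,
                           const std::string &Function, Millis Elapsed) {
  std::ostringstream OS;
  OS << "ANALYZE (" << Mode << "): " << File << ' ' << Function << " : "
     << std::fixed << std::setprecision(1) << Elapsed.count() << " ms";
  return OS.str();
}

// Runs Work and returns how long it took, in milliseconds.
template <typename Fn> Millis timeIt(Fn &&Work) {
  auto Start = std::chrono::steady_clock::now();
  Work();
  return std::chrono::steady_clock::now() - Start;
}
```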
Authored By: RithikSharma
Reviewer: NoQ, xazax.hun, teemperor, vsavchenko
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105565
The `-analyzer-display-progress` option displays the name of the
currently analyzed function. The format differs between C and C++. In
C++, it prints the argument types as well, in a comma-separated list,
while in C only the function name is displayed, without the brackets.
E.g.:
C++: foo(), foo(int, float)
C: foo
In crash traces, the analyzer dumps the location contexts, but the
string is not enough for `-analyze-function` in C++ mode.
This patch addresses the issue by dumping the proper function names
even in stack traces.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105708
Adds a `MacroExpansionContext` member to the `AnalysisConsumer` class.
Tracks macro expansions only if the `ShouldDisplayMacroExpansions` is set.
Passes a reference down the pipeline letting AnalysisConsumers query macro
expansions during bugreport construction.
Reviewed By: martong, Szelethus
Differential Revision: https://reviews.llvm.org/D93223