llvm-project

Author	SHA1	Message	Date
Matheus Izvekov	91cdd35008	[clang] Improve nested name specifier AST representation (#147835 ) This is a major change on how we represent nested name qualifications in the AST. * The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy. * The ElaboratedType node is removed, all type nodes in which it could previously apply to can now store the elaborated keyword and name qualifier, tail allocating when present. * TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size. This patch offers a great performance benefit. It greatly improves compilation time for [stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for `test_on2.cpp` in that project, which is the slowest compiling test, this patch improves `-c` compilation time by about 7.2%, with the `-fsyntax-only` improvement being at ~12%. 
This has great results on compile-time-tracker as well: ![image](https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831) This patch also further enables other optimizations in the future, and will reduce the performance impact of template specialization resugaring when that lands. It has some other miscellaneous drive-by fixes. About the review: Yes the patch is huge, sorry about that. Part of the reason is that I started with the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact. There are also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work. How to review: The important changes are all in `clang/include/clang/AST` and `clang/lib/AST`, with also important changes in `clang/lib/Sema/TreeTransform.h`. The rest and bulk of the changes are mostly consequences of the changes in API. PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just to make rebasing easier. I plan to rename it back after this lands. Fixes #136624 Fixes https://github.com/llvm/llvm-project/issues/43179 Fixes https://github.com/llvm/llvm-project/issues/68670 Fixes https://github.com/llvm/llvm-project/issues/92757	2025-08-09 05:06:53 -03:00
Endre Fülöp	4f2ed926db	[analyzer][NFCi] Pass if bind is to a Decl or not to checkBind (#152137 ) Binding a value to a location can happen when a new value is created or when an existing value is updated. This modification exposes whether the value binding happens at a declaration. This helps simplify the hacky logic of the BindToImmutable checker.	2025-08-08 19:16:47 +02:00
Donát Nagy	a807e8ea9f	[analyzer] Prettify checker registration and unittest code (#147797 ) This commit tweaks the interface of `CheckerRegistry::addChecker` to make it more practical for plugins and tests: - The parameter `IsHidden` now defaults to `false` even in the non-templated overload (because setting it to true is unusual, especially in plugins). - The parameter `DocsUri` defaults to the dummy placeholder string `"NoDocsUri"` because (as of now) nothing queries its value from the checker registry (it's only used by the logic that generates the clang-tidy documentation, but that loads it directly from `Checkers.td` without involving the `CheckerRegistry`), so there is no reason to demand specifying this value. In addition to propagating these changes, this commit clarifies, corrects and extends lots of comments and performs various minor code quality improvements in the code of unit tests and example plugins. I originally wrote the bulk of this commit when I was planning to add an extra parameter to `addChecker` in order to implement some technical details of the CheckerFamily framework. At the end I decided against adding that extra parameter, so this cleanup was left out of the PR https://github.com/llvm/llvm-project/pull/139256 and I'm merging it now as a separate commit (after minor tweaks). This commit is mostly NFC: the only functional change is that the analyzer will be compatible with plugins that rely on the default argument values and don't specify `IsHidden` or `DocsUri`. (But existing plugin code will remain valid as well.)	2025-07-22 13:36:58 +02:00
Kazu Hirata	6ab6321d03	[clang] Use range-based for loops (NFC) (#143153 ) Note that use of llvm::for_each is discouraged unless we have functors readily available.	2025-06-06 09:16:41 -07:00
Balázs Benics	104f5d1ff8	[analyzer] Introduce the check::BlockEntrance checker callback (#140924 ) Traversing the CFG blocks of a function is a fundamental operation. Many C++ constructs can create splits in the control-flow, such as `if`, `for`, and similar control structures or ternary expressions, gnu conditionals, gotos, switches and possibly more. Checkers should be able to get notifications about entering or leaving a CFG block of interest. Note that in the ExplodedGraph there is always a BlockEntrance ProgramPoint right after the BlockEdge ProgramPoint. I considered naming this callback check::BlockEdge, but then that may leave the observer of the graph puzzled to see BlockEdge points followed by more BlockEdge nodes describing the same CFG transition. This confusion could also apply to Bug Report Visitors. Because of this, I decided to hook BlockEntrance ProgramPoints instead. The same confusion applies here, but I find this still a better place TBH. Only one BlockEntrance ProgramPoint would appear in the graph if no checkers modify the state or emit a bug report. Otherwise they modify some GDM (aka. State) and thus create a new ExplodedNode with the same BlockEntrance ProgramPoint in the graph. CPP-6484	2025-05-27 10:11:12 +02:00
Kazu Hirata	1cc9e6e5aa	[StaticAnalyzer] Use llvm::count (NFC) (#141370 )	2025-05-24 14:45:46 -07:00
Kazu Hirata	f002f300c5	[clang] Remove unused local variables (NFC) (#138453 )	2025-05-04 10:51:40 -07:00
Reid Kleckner	e3c0565b74	Reapply "[cmake] Refactor clang unittest cmake" (#134195 ) This reapplies 5ffd9bdb50b57 (#133545) with fixes. The BUILD_SHARED_LIBS=ON build was fixed by adding missing LLVM dependencies to the InterpTests binary in unittests/AST/ByteCode/CMakeLists.txt .	2025-04-02 21:07:30 -07:00
dpalermo	03a791f703	Revert "[cmake] Refactor clang unittest cmake" (#134022 ) Reverts llvm/llvm-project#133545 This change is breaking several buildbots as well as developer's builds. Reverting to allow people to make progress.	2025-04-01 22:19:27 -05:00
Reid Kleckner	5ffd9bdb50	[cmake] Refactor clang unittest cmake (#133545 ) Pass all the dependencies into add_clang_unittest. This is consistent with how it is done for LLDB. I borrowed the same named argument list structure from add_lldb_unittest. This is a necessary step towards consolidating unit tests into fewer binaries, but seems like a good refactoring in its own right.	2025-04-01 14:12:44 -07:00
Qinkun Bao	0cd82327ff	Fix some typos (NFC) (#133558 )	2025-03-29 20:54:15 +01:00
Donát Nagy	ea107d5c63	[NFC][analyzer] Use `CheckerBase::getName` in checker option handling (#131612 ) The virtual method `ProgramPointTag::getTagDescription` had two very distinct use cases: - It is printed in the DOT graph visualization of the exploded graph (that is, a debug printout). - The checker option handling code used it to query the name of a checker, which relied on the coincidence that in `CheckerBase` this method is defined to be equivalent with `getName()`. This commit switches to using `getName` in the second use case, because this way we will be able to properly support checkers that have multiple (separately named) parts. The method `reportInvalidCheckerOptionName` is extended with an additional overload that allows specifying the `CheckerPartIdx`. The methods `getChecker*Option` could be extended analogously in the future, but they are just convenience wrappers around the variants that directly take `StringRef CheckerName`, so I'll only do this extension if it's needed.	2025-03-18 16:11:43 +01:00
T-Gruber	9c542bcf0a	[analyzer] performTrivialCopy triggers checkLocation before binding (#129016 ) The triggered callbacks for the default copy constructed instance and the instance used for initialization now behave in the same way. The LHS already calls checkBind. To keep this consistent, checkLocation is now triggered accordingly for the RHS. Further details on the previous discussion: https://discourse.llvm.org/t/checklocation-for-implicitcastexpr-of-kind-ck-noop/84729 --------- Authored-by: tobias.gruber <tobias.gruber@concentrio.io>	2025-03-04 17:00:55 +01:00
Balázs Benics	22a5bb32b7	[analyzer] Limit Store by region-store-binding-limit (#127602 ) In our test pool, the max entry point RT was improved by this change: 1'181 seconds (~19.7 minutes) -> 94 seconds (1.6 minutes) BTW, the 1.6 minutes is still really bad. But a few orders of magnitude better than it was before. This was the most severe RT edge-case as you can see from the numbers. There are more known RT bottlenecks, such as: - Large environment sizes, and `removeDead`. See more about the failed attempt on improving it at: https://discourse.llvm.org/t/unsuccessful-attempts-to-fix-a-slow-analysis-case-related-to-removedead-and-environment-size/84650 - A large chunk of time could be spent inside `assume`, to reach a fixed point. This is something we want to look into a bit later if we have time. We have 3'075'607 entry points in our test set. About 393'352 entry points ran longer than 1 second when measured. To give a sense of the distribution, if we ignore the slowest 500 entry points, then the maximum entry point runs for about 14 seconds. These 500 slow entry points are in 332 translation units. With this patch, out of the slowest 500 entry points, 72 entry points were improved by at least 10x. We measured no RT regression on the "usual" entry points. ![slow-entrypoints-before-and-after-bind-limit](https://github.com/user-attachments/assets/44425a76-f1cb-449c-bc3e-f44beb8c5dc7) (The dashed lines represent the maximum of their RT) CPP-6092	2025-02-24 15:48:06 +01:00
Balazs Benics	3dc159431b	[analyzer] Clean up slightly the messed up ownership model of the analyzer (#128368 ) Well, yes. It's not pretty. At least after this we would have a bit more unique pointers than before. This is for fixing the memory leak diagnosed by: https://lab.llvm.org/buildbot/#/builders/24/builds/5580 And that caused the revert of #127409. After these uptrs that patch can re-land finally.	2025-02-24 11:34:36 +01:00
Ziqing Luo	536606f6f6	[StaticAnalyzer] Fix state update in VisitObjCForCollectionStmt (#124477 ) In `VisitObjCForCollectionStmt`, the function does `evalLocation` for the current element at the original source state `Pred`. The evaluation may result in a new state, say `PredNew`. I.e., there is a transition: `Pred -> PredNew`, though it is a very rare case that `Pred` is NOT identical to `PredNew`. (This explains why the bug existed for many years but no one noticed until a crash was recently observed downstream.) Later, the original code does NOT use `PredNew` as the new source state in `StmtNodeBuilder` for next transitions. In cases where `Pred != PredNew`, the program misbehaves. (rdar://143280254)	2025-01-30 16:21:46 -08:00
Balazs Benics	5f6b714507	[analyzer][NFC] Simplify PositiveAnalyzerOption handling (#121910 ) This simplifies #120239 Addresses my comment at: https://github.com/llvm/llvm-project/pull/120239#issuecomment-2574600543 CPP-5920	2025-01-07 15:19:16 +01:00
Balazs Benics	55391f85ac	[analyzer] Retry UNDEF Z3 queries 2 times by default (#120239 ) If a refutation Z3 query times out (UNDEF), allow a couple of retries to improve stability of the query. By default allow 2 retries, which gives us a maximum of 3 solve attempts per query. Retries should help mitigate flaky Z3 queries. See the details in the following RFC: https://discourse.llvm.org/t/analyzer-rfc-retry-z3-crosscheck-queries-on-timeout/83711 Note that with each attempt, we spend more time per query. Currently, we have a 15 seconds timeout per query - which is also in effect for the retry attempts. --- Why should this help? In short, retrying queries should bring stability because if a query runs long it's more likely that it did so due to some runtime anomaly than that it's on the edge of succeeding. This is because most queries run quick, and the queries that run long usually run long by a fair amount. Consequently, retries should improve the stability of the outcome of the Z3 query. In general, the retries shouldn't increase the overall analysis time because it's really rare that we hit the 0.1% of the cases when we would do retries. But keep in mind that the retry attempts can add up if many retries are allowed, or the individual query timeout is large. CPP-5920	2025-01-06 18:08:12 +01:00
Kristóf Umann	ea8e328ae2	[analyzer][Z3] Restore the original timeout of 15s (#118291 ) Discussion here: https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520/15?u=szelethus The original patch, #97298 introduced new timeouts backed by thorough testing and measurements to keep the running time of Z3 within reasonable limits. The measurements also showed that only certain reports and certain TUs were responsible for the poor performance of Z3 refutation. Unfortunately, it seems that on machines with different characteristics (slower machines) the current timeouts don't just axe 0.01% of reports, but many more as well. Considering that timeouts are inherently nondeterministic as a cutoff point, this led to report sets being vastly different on the same projects with the same configuration. The discussion link shows that all configurations introduced in the patch with their default values lead to severe nondeterminism of the analyzer. As we, and others use the analyzer as a gating tool for PRs, we should revert to the original defaults. We should respect that * There are still parts of the analyzer that are either proven or suspected to contain nondeterministic code (like pointer sets), * A 15s timeout is more likely to hit the same reports every time on a wider range of machines, but is still inherently nondeterministic, but an infinite timeout leads to the tool hanging, * If you measure the performance of the analyzer on your machines, you can and should achieve some speedup with little or no observable nondeterminism. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-12-13 14:31:06 +01:00
Balazs Benics	e67e03a22c	[analyzer] EvalBinOpLL should return Unknown less often (#114222 ) SValBuilder::getKnownValue, getMinValue, getMaxValue use SValBuilder::simplifySVal. simplifySVal does repeated simplification until a fixed-point is reached. A single step is done by SimpleSValBuilder::simplifySValOnce, using a Simplifier visitor. That will basically decompose SymSymExprs, and apply constant folding using the constraints we have in the State. Once it decomposes a SymSymExpr, it simplifies both sides and then uses the SValBuilder::evalBinOp to reconstruct the same - but now simpler - SymSymExpr, while applying some caching to remain performant. This decomposition, and then the subsequent re-composition poses new challenges to the SValBuilder::evalBinOp, which is built to handle expressions coming from real C/C++ code, thus applying some implicit assumptions. One previous assumption was that nobody would form an expression like "((int)0) - q" (where q is an int pointer), because it doesn't really make sense to write code like that. However, during simplification, we may end up with a call to evalBinOp similar to this. To me, simplifying a SymbolRef should never result in Unknown or Undef, unless it was Unknown or Undef initially or, during simplification we realized that it's a division by zero once we did the constant folding, etc. In the following case the simplified SVal should not become UnknownVal: ```c++ void top(char *p, char *q) { int diff = p - q; // diff: reg<p> - reg<q> if (!p) // p: NULL simplify(diff); // diff after simplification should be: 0(loc) - reg<q> } ``` Returning Unknown from the simplifySVal can weaken analysis precision in other places too, such as in SValBuilder::getKnownValue, getMinValue, or getMaxValue because we call simplifySVal before doing anything else. 
For nonloc::SymbolVals, this loss of precision is critical, because for those the SymbolRef carries an accurate type of the encoded computation, thus we should at least have a conservative upper or lower bound that we could return from getMinValue or getMaxValue - yet we would just return nullptr. ```c++ const llvm::APSInt *SimpleSValBuilder::getKnownValue(ProgramStateRef state, SVal V) { return getConstValue(state, simplifySVal(state, V)); } const llvm::APSInt *SimpleSValBuilder::getMinValue(ProgramStateRef state, SVal V) { V = simplifySVal(state, V); if (const llvm::APSInt *Res = getConcreteValue(V)) return Res; if (SymbolRef Sym = V.getAsSymbol()) return state->getConstraintManager().getSymMinVal(state, Sym); return nullptr; } ``` For now, I don't plan to make the simplification bullet-proof, I'm just explaining why I made this change and what you need to look out for in the future if you see a similar issue. CPP-5750	2024-10-31 11:01:47 +01:00
T-Gruber	86d65ae794	[analyzer] Improve FieldRegion descriptive name (#112313 ) The current implementation of MemRegion::getDescriptiveName fails for FieldRegions whose SuperRegion is an ElementRegion. As outlined below: ```Cpp struct val_struct { int val; }; extern struct val_struct val_struct_array[3]; void func(){ // FieldRegion with ElementRegion as SuperRegion. val_struct_array[0].val; } ``` For this special case, the expression cannot be pretty printed and must therefore be obtained separately.	2024-10-25 11:59:16 +02:00
JOE1994	918972bded	[clang] Strip unneeded calls to raw_string_ostream::str() (NFC) Avoid extra layer of indirection. p.s. Also, remove calls to raw_string_ostream::flush(), which are no-ops.	2024-09-14 04:38:50 -04:00
T-Gruber	87c51e2af0	Run PreStmt/PostStmt checker for GCCAsmStmt (#95409 ) Fixes #94940 Run PreStmt and PostStmt checker for GCCAsmStmt. Unittest to validate that corresponding callback functions are triggered.	2024-07-10 14:15:53 +02:00
Martin Storsjö	45b360d4a2	[clang] Disable C++14 sized deallocation by default for MinGW targets (#97232 ) This reverts 130e93cc26ca9d3ac50ec5a92e3109577ca2e702 for the MinGW target. This avoids the issue that is discussed in https://github.com/llvm/llvm-project/issues/96899 (and which is summarized in the code comment). This is intended as a temporary workaround until the issue is handled better within libc++.	2024-07-02 00:03:15 +03:00
Xing Xue	668ee3f547	[clang] Default to -fno-sized-deallocation for AIX (#97076 ) Some `libc++` LIT test cases and user code define their own version of `operator delete` that are not sized. With `-fno-sized-deallocation`, destructors call the non-sized `operator delete` and it will be resolved to the user defined version. However, with `-fsized-deallocation`, destructors will call the sized `operator delete` which will be resolved to the weak definition in `libc++abi` because the user code does not define the corresponding sized version. The `libc++abi` sized `operator delete` in turn calls the non-sized version of `operator delete` of the same shared object inside `libc++abi` instead of the user defined version on AIX because runtime linking is not the default for AIX and therefore, fails the tests or user code. This patch sets `-fno-sized-deallocation` as the default for AIX if neither `-fsized-deallocation` nor `-fno-sized-deallocation` is explicitly set, similar to what is done for ZOS.	2024-07-01 15:37:52 -04:00
Balazs Benics	ae570d82e8	Reland "[analyzer] Harden safeguards for Z3 query times" (#97298 ) This is exactly as originally landed in #95129, but now the minimal Z3 version was increased to meet this change in #96682. https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4 --- This patch is a functional change. https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520 As a result of this patch, individual Z3 queries in refutation will be bound by 300ms. Every report equivalence class will be processed in at most 1 second. The heuristic should have only really marginal observable impact - except for the cases when we had big report eqclasses with long-running (15s) Z3 queries, where previously CSA effectively halted. After this patch, CSA will tackle such extreme cases as well. (cherry picked from commit eacc3b3504be061f7334410dd0eb599688ba103a)	2024-07-01 17:22:24 +02:00
Balazs Benics	b3b0d09cce	Reland "[analyzer][NFC] Reorganize Z3 report refutation" (#97265 ) This is exactly as originally landed in #95128, but now the minimal Z3 version was increased to meet this change in #96682. https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4 --- This change keeps existing behavior, namely that if we hit a Z3 timeout we will accept the report as "satisfiable". This prepares for the commit "Harden safeguards for Z3 query times". https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520 (cherry picked from commit 89c26f6c7b0a6dfa257ec090fcf5b6e6e0c89aab)	2024-07-01 16:03:18 +02:00
Balazs Benics	8fc9c03cde	[analyzer] Revert Z3 changes (#95916 ) Requested in: https://github.com/llvm/llvm-project/pull/95128#issuecomment-2176008007 Revert "[analyzer] Harden safeguards for Z3 query times" Revert "[analyzer][NFC] Reorganize Z3 report refutation" This reverts commit eacc3b3504be061f7334410dd0eb599688ba103a. This reverts commit 89c26f6c7b0a6dfa257ec090fcf5b6e6e0c89aab.	2024-06-18 14:59:28 +02:00
Balazs Benics	eacc3b3504	[analyzer] Harden safeguards for Z3 query times This patch is a functional change. https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520 As a result of this patch, individual Z3 queries in refutation will be bound by 300ms. Every report equivalence class will be processed in at most 1 second. The heuristic should have only really marginal observable impact - except for the cases when we had big report eqclasses with long-running (15s) Z3 queries, where previously CSA effectively halted. After this patch, CSA will tackle such extreme cases as well. Reviewers: NagyDonat, haoNoQ, Xazax-hun, Szelethus, mikhailramalho Reviewed By: NagyDonat Pull Request: https://github.com/llvm/llvm-project/pull/95129	2024-06-18 09:48:22 +02:00
Balazs Benics	89c26f6c7b	[analyzer][NFC] Reorganize Z3 report refutation This change keeps existing behavior, namely that if we hit a Z3 timeout we will accept the report as "satisfiable". This prepares for the commit "Harden safeguards for Z3 query times". https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520 Reviewers: NagyDonat, haoNoQ, Xazax-hun, mikhailramalho, Szelethus Reviewed By: NagyDonat Pull Request: https://github.com/llvm/llvm-project/pull/95128	2024-06-18 09:42:29 +02:00
Martin Storsjö	f31b197d9d	[analyzer] Fix a test issue in mingw configurations (#92737 ) On Windows, long is always 32 bit, thus one can't use long for casting pointers to integers, on 64 bit architectures. Instead use long long, which should be large enough. This avoids errors like "error: cast from pointer to smaller type 'long' loses information" in this testcase. This condition only seems to be an error in mingw mode; in MSVC mode (clang-cl), this is only a warning.	2024-05-27 10:18:03 +03:00
Pengcheng Wang	130e93cc26	Reland "[clang] Enable sized deallocation by default in C++14 onwards" (#90373 ) Since C++14 has been released for about nine years and most standard libraries have implemented sized deallocation functions, it's time to make this feature default again. This is another try of https://reviews.llvm.org/D112921. The original commit cf5a8b4 was reverted by 2e5035a due to some failures (see #83774). Fixes #60061	2024-05-22 12:37:27 +08:00
Donát Nagy	58bad2862c	[analyzer][NFC] Require explicit matching mode for CallDescriptions (#92454 ) This commit deletes the "simple" constructor of `CallDescription` which did not require a `CallDescription::Mode` argument and always used the "wildcard" mode `CDM::Unspecified`. A few months ago, this vague matching mode was used by many checkers, which caused bugs like https://github.com/llvm/llvm-project/issues/81597 and https://github.com/llvm/llvm-project/issues/88181. Since then, my commits improved the available matching modes and ensured that all checkers explicitly specify the right matching mode. After those commits, the only remaining references to the "simple" constructor were some unit tests; this commit updates them to use an explicitly specified matching mode (often `CDM::SimpleFunc`). The mode `CDM::Unspecified` was not deleted in this commit because it's still a reasonable choice in `GenericTaintChecker` and a few unit tests.	2024-05-17 13:08:45 +02:00
Vitaly Buka	2e5035aeed	Revert "[clang] Enable sized deallocation by default in C++14 onwards (#83774 )" (#90299 ) https://lab.llvm.org/buildbot/#/builders/168/builds/20063 (should be fixed with #90292) More details in #83774 This reverts commit cf5a8b489464d09dfdd7a48ce7c8b41d3c9bf819.	2024-04-26 17:14:43 -07:00
Pengcheng Wang	cf5a8b4894	[clang] Enable sized deallocation by default in C++14 onwards (#83774 ) Since C++14 has been released for about nine years and most standard libraries have implemented sized deallocation functions, it's time to make this feature default again. This is another try of https://reviews.llvm.org/D112921. Fixes #60061	2024-04-26 16:59:12 +08:00
NagyDonat	fb299cae51	[analyzer] Make recognition of hardened __FOO_chk functions explicit (#86536 ) In builds that use source hardening (-D_FORTIFY_SOURCE), many standard functions are implemented as macros that expand to calls of hardened functions that take one additional argument compared to the "usual" variant and perform additional input validation. For example, a `memcpy` call may expand to `__memcpy_chk()` or `__builtin___memcpy_chk()`. Before this commit, `CallDescription`s created with the matching mode `CDM::CLibrary` automatically matched these hardened variants (in addition to the "usual" function) with a fairly lenient heuristic. Unfortunately this heuristic meant that the `CLibrary` matching mode was only usable by checkers that were prepared to handle matches with an unusual number of arguments. This commit limits the recognition of the hardened functions to a separate matching mode `CDM::CLibraryMaybeHardened` and applies this mode for functions that have hardened variants and were previously recognized with `CDM::CLibrary`. This way checkers that are prepared to handle the hardened variants will be able to detect them easily; while other checkers can simply use `CDM::CLibrary` for matching C library functions (and they won't encounter surprising argument counts). The initial motivation for refactoring this area was that previously `CDM::CLibrary` accepted calls with more arguments/parameters than the expected number, so I wasn't able to use it for `malloc` without accidentally matching calls to the 3-argument BSD kernel malloc. After this commit this "may have more args/params" logic will only activate when we're actually matching a hardened variant function (in `CDM::CLibraryMaybeHardened` mode). The recognition of "sprintf()" and "snprintf()" in CStringChecker was refactored, because previously it was abusing the behavior that extra arguments are accepted even if the matched function is not a hardened variant. 
This commit also fixes the oversight that the old code would've recognized e.g. `__wmemcpy_chk` as a hardened variant of `memcpy`. After this commit I'm planning to create several follow-up commits that ensure that checkers looking for C library functions use `CDM::CLibrary` as a "sane default" matching mode. This commit is not truly NFC (it eliminates some buggy corner cases), but it does not intentionally modify the behavior of CSA on real-world non-crazy code. As a minor unrelated change I'm eliminating the argument/variable "IsBuiltin" from the evalSprintf function family in CStringChecker, because it was completely unused. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-04-05 11:20:27 +02:00
NagyDonat	e1d4ddb0c6	Reapply "[analyzer] Accept C library functions from the `std` namespace" again (#85791 ) This reapplies 80ab8234ac309418637488b97e0a62d8377b2ecf again, after fixing a name collision warning in the unit tests (see the revert commit 13ccaf9b9d4400bb128b35ff4ac733e4afc3ad1c for details). In addition to the previously applied changes, this commit also clarifies the code in MallocChecker that distinguishes POSIX "getline()" and C++ standard library "std::getline()" (which are two completely different functions). Note that "std::getline()" was (accidentally) handled correctly even without this clarification; but it's better to explicitly handle and test this corner case. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-03-25 12:43:51 +01:00
T-Gruber	86d479fd7c	Adapted MemRegion::getDescriptiveName to handle ElementRegions (#85104 ) Fixes https://github.com/llvm/llvm-project/issues/84463 Changes: - Adapted MemRegion::getDescriptiveName - Added unittest to check name for a given clang::ento::ElementRegion - Some format changes due to clang-format --------- Co-authored-by: Andreas Steinhausen <andreas.steinhausen@concenrio.io> Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-03-21 18:27:53 +01:00
Philip Reames	13ccaf9b9d	Revert "Reapply "[analyzer] Accept C library functions from the `std` namespace"" This reverts commit e48d5a838f69e0a8e0ae95a8aed1a8809f45465a. Fails to build on x86-64 w/gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04) with the following message: ../llvm-project/clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp:41:28: error: declaration of ‘std::unique_ptr<clang::ASTUnit> IsCLibraryFunctionTest::ASTUnit’ changes meaning of ‘ASTUnit’ [-fpermissive] 41 \| std::unique_ptr<ASTUnit> ASTUnit; \| ^~~~~~~ In file included from ../llvm-project/clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp:4: ../llvm-project/clang/include/clang/Frontend/ASTUnit.h:89:7: note: ‘ASTUnit’ declared here as ‘class clang::ASTUnit’ 89 \| class ASTUnit { \| ^~~~~~~	2024-03-13 10:19:42 -07:00
NagyDonat	e48d5a838f	Reapply "[analyzer] Accept C library functions from the `std` namespace" This reapplies f32b04d4ea91ad1018c25a1d4178cc4392d34968, after fixing the use-after-free of ASTUnit in the unittest. https://github.com/llvm/llvm-project/pull/84469#issuecomment-1992163439 Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-03-13 14:48:42 +01:00
NagyDonat	f32b04d4ea	Revert "[analyzer] Accept C library functions from the `std` namespace" (#84926 ) Reverts llvm/llvm-project#84469 because it causes buildbot failures. I'll examine them and re-submit the change.	2024-03-12 16:01:04 +01:00
NagyDonat	80ab8234ac	[analyzer] Accept C library functions from the `std` namespace (#84469 ) Previously, the function `isCLibraryFunction()` and logic relying on it only accepted functions that are declared directly within a TU (i.e. not in a namespace or a class). However C++ headers like <cstdlib> declare many C standard library functions within the namespace `std`, so this commit ensures that functions within the namespace `std` are also accepted. After this commit it will be possible to match functions like `malloc` or `free` with `CallDescription::Mode::CLibrary`. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-03-12 13:51:12 +01:00
NagyDonat	52a460f9d4	[analyzer] Refactor CallDescription match mode (NFC) (#83432 ) The class `CallDescription` is used to define patterns that are used for matching `CallEvent`s. For example, a `CallDescription{{"std", "find_if"}, 3}` matches a call to `std::find_if` with 3 arguments. However, these patterns are somewhat fuzzy, so this pattern could also match something like `std::__1::find_if` (with an additional namespace layer), or, unfortunately, a `CallDescription` for the well-known function `free()` can match a C++ method named `free()`: https://github.com/llvm/llvm-project/issues/81597 To prevent this kind of ambiguity this commit introduces the enum `CallDescription::Mode` which can limit the pattern matching to non-method function calls (or method calls etc.). After this NFC change, one or more follow-up commits will apply the right pattern matching modes in the ~30 checkers that use `CallDescription`s. Note that `CallDescription` previously had a `Flags` field which had only two supported values: - `CDF_None` was the default "match anything" mode, - `CDF_MaybeBuiltin` was a "match only C library functions and accept some inexact matches" mode. This commit preserves `CDF_MaybeBuiltin` under the more descriptive name `CallDescription::Mode::CLibrary` (or `CDM::CLibrary`). Instead of this "Flags" model I'm switching to a plain enumeration because I don't think that there is a natural use case to combine the different matching modes. (Except for the default "match anything" mode, which is currently kept for compatibility, but will be phased out in the follow-up commits.)	2024-03-04 15:43:37 +01:00
Balazs Benics	18f219c5ac	[analyzer][NFC] Cleanup BugType lazy-init patterns (#76655 ) Cleanup most of the lazy-init `BugType` legacy. Some will be preserved, as those are slightly more complicated to refactor. Notice, that the default category for `BugType` is `LogicError`. I omitted setting this explicitly where I could. Please, actually have a look at the diff. I did this manually, and we rarely check the bug type descriptions and stuff in tests, so the testing might be shallow on this one.	2024-01-01 18:53:36 +01:00
Kazu Hirata	f3dcc2351c	[clang] Use StringRef::{starts,ends}_with (NFC) (#75149 ) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-13 08:54:13 -08:00
Jan Svoboda	8e0c9bb91f	[clang] NFCI: Change returned AnalyzerOptions smart pointer to reference	2023-09-05 13:23:53 -07:00
Aaron Ballman	a02f9a7756	Revert "[clang] Enable sized deallocation by default in C++14 onwards" This reverts commit 2916b125f686115deab2ba573dcaff3847566ab9. Reverting due to failures on: https://lab.llvm.org/buildbot/#/builders/216/builds/26407 https://lab.llvm.org/staging/#/builders/247/builds/5659 http://45.33.8.238/win/83485/step_7.txt	2023-08-29 09:36:59 -04:00
wangpc	2916b125f6	[clang] Enable sized deallocation by default in C++14 onwards Since C++14 has been released for about nine years and most standard libraries have implemented sized deallocation functions, it's time to make this feature default again. Reviewed By: rnk, aaron.ballman, #libc, ldionne, Mordante, MaskRay Differential Revision: https://reviews.llvm.org/D112921	2023-08-29 15:42:50 +08:00
Donát Nagy	8a5cfdf785	[analyzer][NFC] Remove useless class BuiltinBug ...because it provides no useful functionality compared to its base class `BugType`. A long time ago there were substantial differences between `BugType` and `BuiltinBug`, but they were eliminated by commit 1bd58233 in 2009 (!). Since then the only functionality provided by `BuiltinBug` was that it specified `categories::LogicError` as the bug category and it stored an extra data member `desc`. This commit sets `categories::LogicError` as the default value of the third argument (bug category) in the constructors of BugType and replaces use of the `desc` field with simpler logic. Note that `BugType` has a data member `Description` and a non-virtual method `BugType::getDescription()` which queries it; these are distinct from the member `desc` of `BuiltinBug` and the identically named method `BuiltinBug::getDescription()` which queries it. This confusing name collision was a major motivation for the elimination of `BuiltinBug`. As this commit touches many files, I avoided functional changes and left behind FIXME notes to mark minor issues that should be fixed later. Differential Revision: https://reviews.llvm.org/D158855	2023-08-28 15:20:14 +02:00
isuckatcs	d65379c8d4	[analyzer] Remove the loop from the exploded graph caused by missing information in program points This patch adds CFGElementRef to ProgramPoints and helps the analyzer to differentiate between two otherwise identically looking ProgramPoints. Fixes #60412 Differential Revision: https://reviews.llvm.org/D143328	2023-03-04 02:01:45 +01:00

162 Commits