This reverts commit e2a885537f11f8d9ced1c80c2c90069ab5adeb1d. Build failures were fixed right away and reverting the original commit without the fixes breaks the build again.
The `DiagnosticOptions` class is currently intrusively
reference-counted, which makes reasoning about its lifetime very
difficult in some cases. For example, `CompilerInvocation` owns the
`DiagnosticOptions` instance (wrapped in `llvm::IntrusiveRefCntPtr`) and
only exposes an accessor returning `DiagnosticOptions &`. One would
think this gives `CompilerInvocation` exclusive ownership of the object,
but that's not the case:
```c++
void shareOwnership(CompilerInvocation &CI) {
llvm::IntrusiveRefCntPtr<DiagnosticOptions> CoOwner = &CI.getDiagnosticOptions();
// ...
}
```
This is a perfectly valid pattern that is being actually used in the
codebase.
I would like to ensure the ownership of `DiagnosticOptions` by
`CompilerInvocation` is guaranteed to be exclusive. This can be
leveraged for a copy-on-write optimization later on. This PR changes
usages of `DiagnosticOptions` across `clang`, `clang-tools-extra` and
`lldb` to not be intrusively reference-counted.
This reapplies 5ffd9bdb50b57 (#133545) with fixes.
The BUILD_SHARED_LIBS=ON build was fixed by adding missing LLVM
dependencies to the InterpTests binary in
unittests/AST/ByteCode/CMakeLists.txt .
Pass all the dependencies into add_clang_unittest. This is consistent
with how it is done for LLDB. I borrowed the same named argument list
structure from add_lldb_unittest. This is a necessary step towards
consolidating unit tests into fewer binaries, but seems like a good
refactoring in its own right.
Relands #129502.
Previously when the framework encountered unsupported values (such as
enum classes), they were always treated as equal when comparing with
`==`, regardless of their actual values being different.
Now the two sides are only equal if there's a Value assigned to them.
Added handling for the special case of `nullptr == nullptr`, to preserve
the behavior of untyped `nullptr` having no value.
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
Report the range in diagnostics, in addition to the location
in case the range helps disambiguate a little in chained `->`
expressions.
```
b->a->f->x = 1;
^~~~~~~
```
instead of just:
```
b->a->f->x = 1;
^
```
As a followup we should probably also report the location/range
of an `->` if that operator is used. Like:
```
b->a->f->x = 1;
^~
```
Add test for https://github.com/llvm/llvm-project/issues/125589
The crash is actually incidentally fixed by
https://github.com/llvm/llvm-project/pull/128437 since it added a branch
for the reference case and would no longer fall through when the return
type is a reference to a pointer.
Clean up a bit as well:
- make the fallback for early returns more consistent (check if
returning optional and call transfer function for that case)
- check RecordLoc == nullptr in one place
- clean up extra spaces in test
- clean up parameterization in test of `std::` vs `$ns::$`
Previously when the framework encountered unsupported values (such as
enum classes), they were always treated as equal when comparing with
`==`, regardless of their actual values being different.
Now the two sides are only equal if there's a Value assigned to them.
Added a Value assignment for `nullptr`, to handle the special case of
`nullptr == nullptr`.
We've already migrated known users from the old to the new version of
getOrCreateConstMethodReturnStorageLocation. The conversion is pretty
straightforward as well, if there are out-of-tree users:
Previously: use CallExpr as argument
New: get the direct Callee from CallExpr, null check, and use that as
the argument instead.
Part 2 (and final part) following
https://github.com/llvm/llvm-project/pull/120102
Allows users to do things like:
```
if (o->x.has_value()) {
((*o).x).value();
}
```
where the `->` and `*` are operator overload calls.
A user could instead extract the nested optional into a local variable
once instead of doing two accessor calls back to back, but currently
they are unsure why the code is flagged.
This is part 1 of caching for smart pointer accessors, building on top
of the CachedConstAccessorsLattice, which caches "normal" accessors.
Smart pointer accessors are a bit different in that they may:
- have aliases to access the same underlying data (but potentially
returning slightly different types like `&` vs `*`). Within a
"checked" sequence users may mix uses of the different aliases and the
check should apply to any of the spellings.
- may have non-const overloads in addition to the const version, where
the non-const doesn't actually modify the container
Part 2 will follow and add transfer functions utilities. It will also
add a user UncheckedOptionalAccessModel. We'd seen false positives when
nesting StatusOr<optional<T>> and optional<StatusOr<T>>, etc. which this
can help address.
Tuple-like types introduce `VarDecl`s in the AST for their "holding
vars", but AST visitors do not visit those. As a result the `VarDecl`
for the holding var is orphaned when trying to retreive its parents.
Fix a `FlowSensitive` test that assumes that only a `BindingDecl` is
introduced with the given name (the matcher now can also reach the
`VarDecl` for the holding var).
…da call operators.
This doesn't require that they be used in the operator's body, unlike
other ReferencedDecls. This is most obviously different from captured
local variables, which can be captured but will not appear in
ReferencedDecls unless they appear in the operator's body.
This difference simplifies the collection of the captured parameters,
but probably could be eliminated if desirable.
Previously, we covered returning refs, or copies of optional, and bools.
Now cover returning pointers (to any type).
This is useful for cases like operator-> of smart pointers.
Addresses more of issue llvm#58510
Treat calls to zero-param const methods as having stable return values
(with a cache) to address issue #58510. The cache is invalidated when
non-const methods are called. This uses the infrastructure from PR
#111006.
For now we cache methods returning:
- ref to optional
- optional by value
- booleans
We can extend that to pointers to optional in a next change.
By caching const accessor methods we can sometimes treat method call
results as stable (e.g., for issue
https://github.com/llvm/llvm-project/issues/58510). Users can clear the
cache when a non-const method is called that may modify the state of an
object.
This is represented as mixin.
It will be used in a follow on patch to change
bugprone-unchecked-optional-access's lattice from NoopLattice to
CachedConstAccessorsLattice<NoopLattice>, along with some additional
transfer functions.
…n/statement.
We don't need these for the same in-tree purposes as the other sets,
i.e. for making sure we model these Decls that are declared outside the
function, but we have an out-of-tree use for these sets that would
benefit from this simple addition and would avoid duplicating so much of
this code.
This was missing a call to `ignoreCFGOmittedNodes()`. As a result, the
function
would erroneously conclude that a block did not contain an expression
consumed
in a different block if the expression in question was surrounded by a
`ParenExpr` in the consuming block. The patch adds a test that triggers
this
scenario (and fails without the fix).
To prevent this kind of bug in the future, the patch also adds a new
method
`blockForStmt()` to `AdornedCFG` that calls `ignoreCFGOmittedNodes()`
and is
preferred over accessing `getStmtToBlock()` directly.
`CXXInheritedCtorInitExpr` is another of the node kinds that should be
considered an "original initializer". An assertion failure in
`assert(Children.size() == 1)` happens without this fix.
---------
Co-authored-by: martinboehme <mboehme@google.com>
- **Reapply "[clang][dataflow] Teach `AnalysisASTVisitor` that
`typeid()` can be evaluated." (#96766)**
- **Turn on RTTI explicitly in `checkDataflowWithNoopAnalysis()`.**
We definitely know that these operations change the value of their
operand, so
clear out any value associated with it. We don't create a new value,
instead
leaving it to the analysis to do this if desired.
We were previously treating the operand of `typeid()` as being
definitely
unevaluated, but it can be evaluated if it is a glvalue of polymorphic
type.
This patch includes a test that fails without the fix.
At the same time, rename `PostVisitCFG` to the more descriptive
`PostAnalysisCallbacks` (which emphasizes the fact that these callbacks
are run
after the dataflow analysis itself has converged).
Before this patch, it was only possible to run a callback on the state
_after_
the transfer function had been applied, but for many analyses, it's more
natural
to to check the state _before_ the transfer function has been applied,
because we
are usually checking the preconditions for some operation. Some checks
are
impossible to perform on the "after" state because we can no longer
check the
precondition; for example, the `++` / `--` operators on raw pointers
require the
operand to be nonnull, but after the transfer function for the operator
has been
applied, the original value of the pointer can no longer be accessed.
`UncheckedOptionalAccessModelTest` has been modified to run the
diagnosis
callback on the "before" state. In this particular case, diagnosis can
be run
unchanged on either the "before" or "after" state, but we want this test
to
demonstrate that running diagnosis on the "before" state is usually the
preferred approach.
This change is backwards-compatible; all existing analyses will continue
to run
the callback on the "after" state.
This checks if the layout of `std::initializer_list` is something Clang
can handle much earlier and deduplicates the checks in
CodeGen/CGExprAgg.cpp and AST/ExprConstant.cpp
Also now diagnose `union initializer_list` (Fixes#95495), bit-field for
the size (Fixes a crash that would happen during codegen if it were
unnamed), base classes (that wouldn't be initialized) and polymorphic
classes (whose vtable pointer wouldn't be initialized).
The patch includes a repro for a case where we were returning a null
`FieldDecl`
when calling `getReferencedDecls()` on the `InitListExpr` for a union.
Also, I noticed while working on this that `RecordInitListHelper` has a
bug
where it doesn't work correctly for empty unions. This patch also
includes a
repro and fix for this bug.
This is one of the node kinds that should be considered an "original
initializer". The patch adds a test that was causing an assertion
failure in
`assert(Children.size() == 1)` without the fix.
Assume in fewer places that the analysis is of a `FunctionDecl`, and
initialize the `Environment` properly for `Stmt`s.
Moves constructors for `Environment` to header to make it more obvious
that there are only minor differences between them and very little
initialization in the constructors.
Tested with check-clang-tooling.
# Observed erroneous behavior
Prior to this change, a `MemberExpr` that accesses an anonymous class
might have a prvalue as its base (even though C++ mandates that the base
of a `MemberExpr` must be a glvalue), if the code containing the
`MemberExpr` was in a template.
Here's an example on [godbolt](https://godbolt.org/z/Gz1Mer9oz) (that is
essentially identical to the new test this patch adds).
This example sets up a struct containing an anonymous struct:
```cxx
struct S {
struct {
int i;
};
};
```
It then accesses the member `i` using the expression `S().i`.
When we do this in a non-template function, we get the following AST:
```
`-ExprWithCleanups <col:10, col:14> 'int'
`-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
`-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
`-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue .S::(anonymous struct at line:2:3) 0xbdcb488
`-MaterializeTemporaryExpr <col:10, col:12> 'S' xvalue
`-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```
As expected, the AST contains a `MaterializeTemporarExpr` to materialize
the prvalue `S()` before accessing its members.
When we perform this access in a function template (that doesn't
actually even use its template parameter), the AST for the template
itself looks the same as above. However, the AST for an instantiation of
the template looks different:
```
`-ExprWithCleanups <col:10, col:14> 'int'
`-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
`-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
`-MaterializeTemporaryExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue
`-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' .S::(anonymous struct at line:2:3) 0xbdcb488
`-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```
Note how the inner `MemberExpr` (the one accessing the anonymous struct)
acts on a prvalue.
Interestingly, this does not appear to cause any problems for CodeGen,
probably because CodeGen is set up to deal with `MemberExpr`s on rvalues
in C. However, it does cause issues in the dataflow framework, which
only supports C++ today and expects the base of a `MemberExpr` to be a
glvalue.
Beyond the issues with the dataflow framework, I think this issue should
be fixed because it goes contrary to what the C++ standard mandates, and
the AST produced for the non-template case indicates that we want to
follow the C++ rules here.
# Reasons for erroneous behavior
Here's why we're getting this malformed AST.
First of all, `TreeTransform` [strips any
`MaterializeTemporaryExpr`s](cd132dcbeb/clang/lib/Sema/TreeTransform.h (L14853))
from the AST.
It is therefore up to
[`TreeTransform::RebuildMemberExpr()`](cd132dcbeb/clang/lib/Sema/TreeTransform.h (L2853))
to recreate a `MaterializeTemporaryExpr` if needed. In the [general
case](cd132dcbeb/clang/lib/Sema/TreeTransform.h (L2915)),
it does this: It calls `Sema::BuildMemberReferenceExpr()`, which ensures
that the base is a glvalue by [materializing a
temporary](cd132dcbeb/clang/lib/Sema/SemaExprMember.cpp (L1016))
if needed. However, when `TreeTransform::RebuildMemberExpr()` encounters
an anonymous class, it [calls
`Sema::BuildFieldReferenceExpr()`](cd132dcbeb/clang/lib/Sema/TreeTransform.h (L2880)),
which, unlike `Sema::BuildMemberReferenceExpr()`, does not make sure
that the base is a glvalue.
# Proposed fix
I considered several possible ways to fix this issue:
- Add logic to `Sema::BuildFieldReferenceExpr()` that materializes a
temporary if needed. This appears to work, but it feels like the fix is
in the wrong place:
- AFAIU, other callers of `Sema::BuildFieldReferenceExpr()` don't need
this logic.
- The issue is caused by `TreeTransform` removing the
`MaterializeTemporaryExpr`, so it seems the fix should also be in
`TreeTransform`
- Materialize the temporary directly in
`TreeTransform::RebuildMemberExpr()` if needed (within the case that
deals with anonymous classes).
This would work, too, but it would duplicate logic that already exists
in `Sema::BuildMemberReferenceExpr()` (which we leverage for the general
case).
- Use `Sema::BuildMemberReferenceExpr()` instead of
`Sema::BuildFieldReferenceExpr()` for the anonymous class case, so that
it also uses the existing logic for materializing the temporary.
This is the option I've decided to go with here. There's a slight
wrinkle in that we create a `LookupResult` that claims we looked up the
unnamed field for the anonymous class -- even though we would obviously
never be able to look up an unnamed field. I think this is defensible
and still better than the other alternatives, but I would welcome
feedback on this from others who know the code better.
- Instead of comparing the identity of the `PointerValue`s, compare the
underlying `StorageLocation`s.
- If the `StorageLocation`s are the same, return a definite "true" as
the
result of the comparison. Before, if the `PointerValue`s were different,
we
would return an atom, even if the storage locations themselves were the
same.
- If the `StorageLocation`s are different, return an atom (as before).
Pointers
that have different storage locations may still alias, so we can't
return a
definite "false" in this case.
The application-level gains from this are relatively modest. For the
Crubit
nullability check running on an internal codebase, this change reduces
the
number of functions on which the SAT solver times out from 223 to 221;
the
number of "pointer expression not modeled" errors reduces from 3815 to
3778.
Still, it seems that the gain in precision is generally worthwhile.
@Xazax-hun inspired me to think about this with his
[comments](https://github.com/llvm/llvm-project/pull/73860#pullrequestreview-1761484615)
on a different PR.