199 Commits

Author SHA1 Message Date
Mital Ashok
482c41e992
[Clang] [Sema] Diagnose unknown std::initializer_list layout in SemaInit (#95580)
This checks if the layout of `std::initializer_list` is something Clang
can handle much earlier and deduplicates the checks in
CodeGen/CGExprAgg.cpp and AST/ExprConstant.cpp

Also now diagnose `union initializer_list` (Fixes #95495), bit-field for
the size (Fixes a crash that would happen during codegen if it were
unnamed), base classes (that wouldn't be initialized) and polymorphic
classes (whose vtable pointer wouldn't be initialized).
2024-06-20 19:44:06 +02:00
martinboehme
282534268e
[clang][dataflow] Handle AtomicExpr in ResultObjectVisitor. (#94963)
This is one of the node kinds that should be considered an "original
initializer". The patch adds a test that was causing an assertion
failure in
`assert(Children.size() == 1)` without the fix.
2024-06-11 08:38:03 +02:00
martinboehme
492417278d
[clang][dataflow] Propagate storage location of compound assignment operators. (#94332)
To avoid generating unnecessary values, we don't create a new value but
instead
leave it to the specific analysis to do this if desired.
2024-06-04 17:08:20 +02:00
martinboehme
68761a9e05
[clang][nullability] Propagate storage location / value of ++/-- operators. (#94217)
To avoid generating unnecessary values, we don't create a new value but
instead
leave it to the specific analysis to do this if desired.
2024-06-04 08:32:29 +02:00
martinboehme
fcd020d561
[Clang][Sema] Fix malformed AST for anonymous class access in template. (#90842)
# Observed erroneous behavior

Prior to this change, a `MemberExpr` that accesses an anonymous class
might have a prvalue as its base (even though C++ mandates that the base
of a `MemberExpr` must be a glvalue), if the code containing the
`MemberExpr` was in a template.

Here's an example on [godbolt](https://godbolt.org/z/Gz1Mer9oz) (that is
essentially identical to the new test this patch adds).

This example sets up a struct containing an anonymous struct:

```cxx
struct S {
  struct {
    int i;
  };
};
```

It then accesses the member `i` using the expression `S().i`.

When we do this in a non-template function, we get the following AST:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue .S::(anonymous struct at line:2:3) 0xbdcb488
        `-MaterializeTemporaryExpr <col:10, col:12> 'S' xvalue
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

As expected, the AST contains a `MaterializeTemporarExpr` to materialize
the prvalue `S()` before accessing its members.

When we perform this access in a function template (that doesn't
actually even use its template parameter), the AST for the template
itself looks the same as above. However, the AST for an instantiation of
the template looks different:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MaterializeTemporaryExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue
        `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' .S::(anonymous struct at line:2:3) 0xbdcb488
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

Note how the inner `MemberExpr` (the one accessing the anonymous struct)
acts on a prvalue.

Interestingly, this does not appear to cause any problems for CodeGen,
probably because CodeGen is set up to deal with `MemberExpr`s on rvalues
in C. However, it does cause issues in the dataflow framework, which
only supports C++ today and expects the base of a `MemberExpr` to be a
glvalue.

Beyond the issues with the dataflow framework, I think this issue should
be fixed because it goes contrary to what the C++ standard mandates, and
the AST produced for the non-template case indicates that we want to
follow the C++ rules here.

# Reasons for erroneous behavior

Here's why we're getting this malformed AST.

First of all, `TreeTransform` [strips any
`MaterializeTemporaryExpr`s](cd132dcbeb/clang/lib/Sema/TreeTransform.h (L14853))
from the AST.

It is therefore up to
[`TreeTransform::RebuildMemberExpr()`](cd132dcbeb/clang/lib/Sema/TreeTransform.h (L2853))
to recreate a `MaterializeTemporaryExpr` if needed. In the [general
case](cd132dcbeb/clang/lib/Sema/TreeTransform.h (L2915)),
it does this: It calls `Sema::BuildMemberReferenceExpr()`, which ensures
that the base is a glvalue by [materializing a
temporary](cd132dcbeb/clang/lib/Sema/SemaExprMember.cpp (L1016))
if needed. However, when `TreeTransform::RebuildMemberExpr()` encounters
an anonymous class, it [calls
`Sema::BuildFieldReferenceExpr()`](cd132dcbeb/clang/lib/Sema/TreeTransform.h (L2880)),
which, unlike `Sema::BuildMemberReferenceExpr()`, does not make sure
that the base is a glvalue.

# Proposed fix

I considered several possible ways to fix this issue:

- Add logic to `Sema::BuildFieldReferenceExpr()` that materializes a
temporary if needed. This appears to work, but it feels like the fix is
in the wrong place:
- AFAIU, other callers of `Sema::BuildFieldReferenceExpr()` don't need
this logic.
- The issue is caused by `TreeTransform` removing the
`MaterializeTemporaryExpr`, so it seems the fix should also be in
`TreeTransform`

- Materialize the temporary directly in
`TreeTransform::RebuildMemberExpr()` if needed (within the case that
deals with anonymous classes).

This would work, too, but it would duplicate logic that already exists
in `Sema::BuildMemberReferenceExpr()` (which we leverage for the general
case).

- Use `Sema::BuildMemberReferenceExpr()` instead of
`Sema::BuildFieldReferenceExpr()` for the anonymous class case, so that
it also uses the existing logic for materializing the temporary.

This is the option I've decided to go with here. There's a slight
wrinkle in that we create a `LookupResult` that claims we looked up the
unnamed field for the anonymous class -- even though we would obviously
never be able to look up an unnamed field. I think this is defensible
and still better than the other alternatives, but I would welcome
feedback on this from others who know the code better.
2024-05-14 09:45:54 +02:00
martinboehme
f3fbd21fa4
[clang][dataflow] Strengthen pointer comparison. (#75170)
-  Instead of comparing the identity of the `PointerValue`s, compare the
   underlying `StorageLocation`s.

- If the `StorageLocation`s are the same, return a definite "true" as
the
result of the comparison. Before, if the `PointerValue`s were different,
we
would return an atom, even if the storage locations themselves were the
same.

- If the `StorageLocation`s are different, return an atom (as before).
Pointers
that have different storage locations may still alias, so we can't
return a
   definite "false" in this case.

The application-level gains from this are relatively modest. For the
Crubit
nullability check running on an internal codebase, this change reduces
the
number of functions on which the SAT solver times out from 223 to 221;
the
number of "pointer expression not modeled" errors reduces from 3815 to
3778.

Still, it seems that the gain in precision is generally worthwhile.

@Xazax-hun inspired me to think about this with his

[comments](https://github.com/llvm/llvm-project/pull/73860#pullrequestreview-1761484615)
on a different PR.
2024-05-07 10:12:23 +02:00
martinboehme
4d839d8f18
[clang][dataflow] Don't propagate result objects in unevaluated contexts (reland #90438) (#91172)
This relands #90348 with a fix for a [buildbot
failure](https://lab.llvm.org/buildbot/#/builders/216/builds/38446)
caused by the test being run with `-fno-rtti`.
2024-05-06 14:21:15 +02:00
martinboehme
0348e71885
[clang][dataflow] Fix crash when operator= result type is not destination type. (#90898)
The existing code was full of comments about how we assume this is
always the
case, but it's not mandated by the standard, and there is code out there
that
returns a different type. So check that the result type is in fact the
same as
the destination type before attempting to copy to the result.

To make sure that we don't bail out in more cases than intended, I've
extended
existing tests to verify that in the common case, we do return the
destination
object (by reference or value, as the case may be).
2024-05-06 08:15:12 +02:00
Weaver
2252c5c42b Revert "[clang][dataflow] Don't propagate result objects in unevaluated contexts (#90438)"
This reverts commit 597a3150e932a9423c65b5ea4b53dd431aff5865.

Caused test failure on the following buildbot:
https://lab.llvm.org/buildbot/#/builders/216/builds/38446
2024-05-02 11:51:45 +01:00
martinboehme
597a3150e9
[clang][dataflow] Don't propagate result objects in unevaluated contexts (#90438)
Trying to do so can cause crashes -- see newly added test and the
comments in
the fix.

We're starting to see a repeating pattern here: We're getting crashes
because
`ResultObjectVisitor` and `getReferencedDecls()` don't agree on which
parts of
the AST to visit and, hence, which fields should be modeled.

I think we should ensure consistency between these two parts of the code
by
using a `RecursiveASTVisitor` in `getReferencedDecls()`[^1]; the
`Traverse...()` functions that control which parts of the AST we visit
would go
in a common base class that would be used for both `ResultObjectVisitor`
and
`getReferencedDecls()`.

I'd like to focus this PR, however, on a targeted fix for the current
crash and
postpone the refactoring to a later PR (which will be easier to revert
if there
are unintended side-effects).

[^1]: As an added bonus, this would make the code better structured and
more
efficient than the current sequence of `if (dyn_cast<T>(...))`
statements).
2024-05-02 08:35:13 +02:00
martinboehme
c70f058316
[clang][dataflow] Fix crash when ConstantExpr is used in conditional operator. (#90112)
`ConstantExpr` does not appear as a `CFGStmt` in the CFG, so
`StmtToEnvMap::getEnvironment()` was not finding an entry for it in the
map,
causing a crash when we tried to access the iterator resulting from the
map
lookup.

The fix is to make `ignoreCFGOmittedNodes()` ignore `ConstantExpr`, but
in
addition, I'm hardening `StmtToEnvMap::getEnvironment()` to make sure
release
builds don't crash in similar situations in the future.
2024-04-26 09:30:07 +02:00
martinboehme
b9208ce318
[clang][dataflow] Crash fix for widenDistinctValues(). (#89895)
We used to crash if the previous iteration contained a `BoolValue` and
the
current iteration contained an `IntegerValue`. The accompanying test
sets up
this situation -- see comments there for details.

While I'm here, clean up the tests for integral casts to use the test
helpers we
have available now. I was looking at these tests to understand how we
handle
integral casts, and the test helpers make the tests easier to read.
2024-04-25 09:24:08 +02:00
martinboehme
9b0651f5ae
[clang][dataflow] Don't propagate result objects in nested declarations. (#89903)
Trying to do so can cause crashes -- see newly added test and the
comments in
the fix.
2024-04-25 09:22:14 +02:00
martinboehme
9ba6961ce0
Reapply "[clang][dataflow] Model conditional operator correctly." with fixes (#89596)
I reverted https://github.com/llvm/llvm-project/pull/89213 beause it was
causing buildbots to fail with assertion failures.

Embarrassingly, it turns out I had been running tests locally in
`Release` mode, i.e. with `assert()` compiled away.

This PR re-lands #89213 with fixes for the failing assertions.
2024-04-23 08:10:55 +02:00
martinboehme
8ff6434546
Revert "[clang][dataflow] Model conditional operator correctly." (#89577)
Reverts llvm/llvm-project#89213

This is causing buildbot failures.
2024-04-22 09:35:29 +02:00
martinboehme
abb958f161
[clang][dataflow] Model conditional operator correctly. (#89213) 2024-04-22 09:23:13 +02:00
martinboehme
e8fce95887
[clang][nullability] Remove RecordValue. (#89052)
This class no longer serves any purpose; see also the discussion here:
https://reviews.llvm.org/D155204#inline-1503204

A lot of existing tests in TransferTest.cpp check for the existence of
`RecordValue`s. Some of these checks are now simply redundant and have
been
removed. In other cases, tests were checking for the existence of a
`RecordValue` as a way of testing whether a record has been initialized.
I have
typically changed these test to instead check whether a field of the
record has
a value.
2024-04-19 09:39:52 +02:00
martinboehme
ca7d9442ba
[clang][dataflow] Support CXXParenListInitExpr in PropagateResultObject(). (#89235) 2024-04-19 09:06:13 +02:00
martinboehme
1bccbe1f49
[clang][dataflow] Treat BuiltinBitCastExpr correctly in PropagateResultObject(). (#88875)
This patch includes a test that assert-fails without the fix.
2024-04-17 08:17:56 +02:00
martinboehme
b851c7f1fc
[clang][dataflow] Support StmtExpr in PropagateResultObject(). (#88872)
This patch adds a test that assert-fails without the fix.
2024-04-17 08:05:43 +02:00
martinboehme
3c6f91e5b6
[clang][dataflow] Fix result object location for builtin <=>. (#88726)
The newly added test causes an assertion failure in
`PropagateResultObject()`
without the fix added here.
2024-04-16 08:49:45 +02:00
martinboehme
71f1932b84
[clang][dataflow] Reland #87320: Propagate locations from result objects to initializers. (#88316)
This relands #87320 and additionally removes the now-unused function
`isOriginalRecordConstructor()`, which was causing buildbots to fail.
2024-04-11 08:20:35 +02:00
martinboehme
7549b45825
Revert "[clang][dataflow] Propagate locations from result objects to initializers." (#88315)
Reverts llvm/llvm-project#87320

This is causing buildbots to fail because
`isOriginalRecordConstructor()` is now unused.
2024-04-10 21:27:10 +02:00
martinboehme
21009f466e
[clang][dataflow] Propagate locations from result objects to initializers. (#87320)
Previously, we were propagating storage locations the other way around,
i.e.
from initializers to result objects, using `RecordValue::getLoc()`. This
gave
the wrong behavior in some cases -- see the newly added or fixed tests
in this
patch.

In addition, this patch now unblocks removing the `RecordValue` class
entirely,
as we no longer need `RecordValue::getLoc()`.

With this patch, the test `TransferTest.DifferentReferenceLocInJoin`
started to
fail because the framework now always uses the same storge location for
a
`MaterializeTemporaryExpr`, meaning that the code under test no longer
set up
the desired state where a variable of reference type is mapped to two
different
storage locations in environments being joined. Rather than trying to
modify
this test to set up the test condition again, I have chosen to replace
the test
with an equivalent test in DataflowEnvironmentTest.cpp that sets up the
test
condition directly; because this test is more direct, it will also be
less
brittle in the face of future changes.
2024-04-10 20:03:35 +02:00
martinboehme
e6f63a942a
[clang][dataflow] Bail out if input is Objective-C++. (#86479)
We only ever intended to support C++, but the condition we were testing
allowed
Objective-C++ code by mistake.
2024-03-25 14:08:25 +01:00
Eric Li
a6a6066290
[clang][dataflow] Fix crash when analyzing a coroutine (#85957)
A coroutine function body (`CoroutineBodyStmt`) may have null children,
which causes `isa` to segfault.
2024-03-20 12:45:30 -04:00
martinboehme
b788e4655c
[clang][dataflow] Model assignment to derived class from base. (#85064)
This is a relatively rare case, but

- It's still nice to get this right,
- We can remove the special case for this in
`VisitCXXOperatorCallExpr()` (that
  simply bails out), and
- With this in place, I can avoid having to add a similar special case
in an
  upcoming patch.
2024-03-19 09:22:35 +01:00
martinboehme
27d504998e
[clang][dataflow] Fix getResultObjectLocation() on CXXDefaultArgExpr. (#85072)
This patch includes a test that causes an assertion failure without the
other
changes in this patch.
2024-03-18 13:36:20 +01:00
martinboehme
9b74c43d70
[clang][dataflow] Add context-sensitive test for returning a record by value. (#84317)
I'm making some changes to `Environment::getResultObjectLocation()`,
with the
ultimate goal of eliminating `RecordValue` entirely, and I'd like to
make sure
I don't break this behavior (and I've realized we don't have a test for
it yet).
2024-03-08 08:19:41 +01:00
martinboehme
2d539db246
[clang][dataflow] When analyzing ctors, don't initialize fields of *this with values. (#84164)
This is the constructor's job, and we want to be able to test that it
does this.
2024-03-08 08:19:02 +01:00
martinboehme
128780b06f
[clang][dataflow] Correctly treat empty initializer lists for unions. (#82986)
This fixes a crash introduced by
https://github.com/llvm/llvm-project/pull/82348
but also adds additional handling to make sure that we treat empty
initializer
lists for both unions and structs/classes correctly (see tests added in
this
patch).
2024-03-01 09:27:59 +01:00
Samira Bazuzi
2730a5c68c
[clang][dataflow] Skip array types when handling InitListExprs. (#83013)
Crashes resulted from single-element InitListExprs for arrays with
elements of a record type after #80970.
2024-02-26 10:53:33 -05:00
Samira Bazuzi
c4e94633e8
Revert "[clang][dataflow] Correctly handle InitListExpr of union type." (#82856)
Reverts llvm/llvm-project#82348, which caused crashes when analyzing
empty InitListExprs for unions, e.g.

```cc
union U {
  double double_value;
  int int_value;
};

void target() {
  U value;
  value = {};
}
```

Co-authored-by: Samira Bazuzi <bazuzi@users.noreply.github.com>
2024-02-26 14:23:46 +01:00
martinboehme
4725993f1a
[clang][dataflow] Correctly handle InitListExpr of union type. (#82348) 2024-02-21 10:10:25 +01:00
Yitzhak Mandelbaum
60cb09ba4f
[clang][dataflow] Fix crash on unions introduced in ba279934c6ab09d5394a89d8318651aefd8d565b (#81918)
The commit was itself a crash fix, but inadvertently changed the
behavior for unions, which results in crashes.
2024-02-15 16:19:10 -05:00
Paul Semel
ba279934c6
[dataflow] Fix crash when InitListExpr is not a prvalue (#80970) 2024-02-15 10:59:51 +01:00
Paul Semel
a8fb0dcc41
[dataflow] CXXOperatorCallExpr equal operator might not be a glvalue (#80991)
Although in a normal implementation the assumption is reasonable, it
seems that some esoteric implementation are not returning a T&. This
should be handled correctly and the values be propagated.

---------

Co-authored-by: martinboehme <mboehme@google.com>
2024-02-13 11:39:27 +01:00
Danny Mösch
00e80fbfb9
[NFC] Correct C++ standard names (#81421) 2024-02-11 19:43:34 +01:00
Paul Semel
5c2da289d2
[clang][dataflow] fix assert in Environment::getResultObjectLocation (#79608)
When calling `Environment::getResultObjectLocation` with a
CXXOperatorCallExpr that is a prvalue, we just hit an assert because no
record was ever created.

---------

Co-authored-by: martinboehme <mboehme@google.com>
2024-01-31 17:18:16 +01:00
martinboehme
ccf1e322bd
[clang][dataflow] Process terminator condition within transferCFGBlock(). (#78127)
In particular, it's important that we create the "fallback" atomic at
this point
(which we produce if the transfer function didn't produce a value for
the
expression) so that it is placed in the correct environment.

Previously, we processed the terminator condition in the
`TerminatorVisitor`,
which put the fallback atomic in a copy of the environment that is
produced as
input for the _successor_ block, rather than the environment for the
block
containing the expression for which we produce the fallback atomic.

As a result, we produce different fallback atomics every time we process
the
successor block, and hence we don't have a consistent representation of
the
terminator condition in the flow condition.

This patch includes a test (authored by ymand@) that fails without the
fix.
2024-01-23 10:19:06 +01:00
Yitzhak Mandelbaum
f3dd8f10c7
[clang][dataflow] Make cap on block visits configurable by caller. (#77481)
Previously, we hard-coded the cap on block visits inside the framework.
This
patch enables the caller to specify the cap in the APIs for running an
analysis.
2024-01-22 22:41:48 -05:00
martinboehme
a2caa4929e
[clang][dataflow] Treat comma operator correctly in getResultObjectLocation(). (#78427) 2024-01-22 09:23:06 +01:00
martinboehme
f1226eea52
[clang][dataflow] Consider CXXDefaultInitExpr to be an "original record ctor". (#78423)
The CFG doesn't contain a CFGElement for the
`CXXDefaultInitExpr::getInit()`, so
it makes sense to consider the `CXXDefaultInitExpr` to be the expression
that
originally constructs the object.
2024-01-18 08:59:26 +01:00
martinboehme
1aacdfe473
Revert "[clang][dataflow] Process terminator condition within transferCFGBlock()." (#77895)
Reverts llvm/llvm-project#77750
2024-01-12 09:54:50 +01:00
martinboehme
537bbb4688
[clang][dataflow] Process terminator condition within transferCFGBlock(). (#77750)
In particular, it's important that we create the "fallback" atomic at
this point
(which we produce if the transfer function didn't produce a value for
the
expression) so that it is placed in the correct environment.

Previously, we processed the terminator condition in the
`TerminatorVisitor`,
which put the fallback atomic in a copy of the environment that is
produced as
input for the _successor_ block, rather than the environment for the
block
containing the expression for which we produce the fallback atomic.

As a result, we produce different fallback atomics every time we process
the
successor block, and hence we don't have a consistent representation of
the
terminator condition in the flow condition.

This patch includes a test (authored by ymand@) that fails without the
fix.
2024-01-12 09:20:58 +01:00
martinboehme
ca1034341c
[clang][dataflow] Fix an issue with Environment::getResultObjectLocation(). (#75483)
So far, if there was a chain of record type prvalues,
`getResultObjectLocation()` would assign a different result object
location to
each one. This makes no sense, of course, as all of these prvalues end
up
initializing the same result object.

This patch fixes this by propagating storage locations up through the
entire
chain of prvalues.

The new implementation also has the desirable effect of making it
possible to
make `getResultObjectLocation()` const, which seems appropriate given
that,
logically, it is just an accessor.
2023-12-18 09:10:03 +01:00
martinboehme
5bd643e145
[clang][dataflow] Strengthen widening of boolean values. (#73484)
Before we widen to top, we now check if both values can be proved either
true or
false in their respective environments; if so, widening returns a true
or false
literal. The idea is that we avoid losing information if posssible.

This patch includes a test that fails without this change to widening.

This change does mean that we call the SAT solver in more places, but
this seems
acceptable given the additional precision we gain.

In tests on an internal codebase, the number of SAT solver timeouts we
observe
with Crubit's nullability checker does increase by about 25%. They can
be
brought back to the previous level by doubling the SAT solver work
limit.
2023-11-27 14:55:49 +01:00
Samira Bazuzi
3001d6ddaa
[clang][dataflow] Fix buggy assertion: Compare an unqualified type to an unqualified type. (#71573)
Includes crash-reproducing test case.

---------

Co-authored-by: martinboehme <mboehme@google.com>
2023-11-09 16:57:04 +01:00
martinboehme
6b573f4611
[clang][dataflow] Fix assert-fail when calling assignment operator with by-value parameter. (#71384)
The code assumed that the source parameter of an assignment operator is
always
passed by reference, but it is legal for it to be passed by value.

This patch includes a test that assert-fails without the fix.
2023-11-07 09:48:40 +01:00
martinboehme
526c9b7e37
[clang][nullability] Use proves() and assume() instead of deprecated synonyms. (#70297) 2023-10-30 13:18:57 +01:00