This is a relatively rare case, but
- It's still nice to get this right,
- We can remove the special case for this in
`VisitCXXOperatorCallExpr()` (that
simply bails out), and
- With this in place, I can avoid having to add a similar special case
in an
upcoming patch.
I'm making some changes to `Environment::getResultObjectLocation()`,
with the
ultimate goal of eliminating `RecordValue` entirely, and I'd like to
make sure
I don't break this behavior (and I've realized we don't have a test for
it yet).
This fixes a crash introduced by
https://github.com/llvm/llvm-project/pull/82348
but also adds additional handling to make sure that we treat empty
initializer
lists for both unions and structs/classes correctly (see tests added in
this
patch).
Reverts llvm/llvm-project#82348, which caused crashes when analyzing
empty InitListExprs for unions, e.g.
```cc
union U {
double double_value;
int int_value;
};
void target() {
U value;
value = {};
}
```
Co-authored-by: Samira Bazuzi <bazuzi@users.noreply.github.com>
Although in a normal implementation the assumption is reasonable, it
seems that some esoteric implementation are not returning a T&. This
should be handled correctly and the values be propagated.
---------
Co-authored-by: martinboehme <mboehme@google.com>
When calling `Environment::getResultObjectLocation` with a
CXXOperatorCallExpr that is a prvalue, we just hit an assert because no
record was ever created.
---------
Co-authored-by: martinboehme <mboehme@google.com>
In particular, it's important that we create the "fallback" atomic at
this point
(which we produce if the transfer function didn't produce a value for
the
expression) so that it is placed in the correct environment.
Previously, we processed the terminator condition in the
`TerminatorVisitor`,
which put the fallback atomic in a copy of the environment that is
produced as
input for the _successor_ block, rather than the environment for the
block
containing the expression for which we produce the fallback atomic.
As a result, we produce different fallback atomics every time we process
the
successor block, and hence we don't have a consistent representation of
the
terminator condition in the flow condition.
This patch includes a test (authored by ymand@) that fails without the
fix.
Previously, we hard-coded the cap on block visits inside the framework.
This
patch enables the caller to specify the cap in the APIs for running an
analysis.
The CFG doesn't contain a CFGElement for the
`CXXDefaultInitExpr::getInit()`, so
it makes sense to consider the `CXXDefaultInitExpr` to be the expression
that
originally constructs the object.
In particular, it's important that we create the "fallback" atomic at
this point
(which we produce if the transfer function didn't produce a value for
the
expression) so that it is placed in the correct environment.
Previously, we processed the terminator condition in the
`TerminatorVisitor`,
which put the fallback atomic in a copy of the environment that is
produced as
input for the _successor_ block, rather than the environment for the
block
containing the expression for which we produce the fallback atomic.
As a result, we produce different fallback atomics every time we process
the
successor block, and hence we don't have a consistent representation of
the
terminator condition in the flow condition.
This patch includes a test (authored by ymand@) that fails without the
fix.
So far, if there was a chain of record type prvalues,
`getResultObjectLocation()` would assign a different result object
location to
each one. This makes no sense, of course, as all of these prvalues end
up
initializing the same result object.
This patch fixes this by propagating storage locations up through the
entire
chain of prvalues.
The new implementation also has the desirable effect of making it
possible to
make `getResultObjectLocation()` const, which seems appropriate given
that,
logically, it is just an accessor.
Before we widen to top, we now check if both values can be proved either
true or
false in their respective environments; if so, widening returns a true
or false
literal. The idea is that we avoid losing information if posssible.
This patch includes a test that fails without this change to widening.
This change does mean that we call the SAT solver in more places, but
this seems
acceptable given the additional precision we gain.
In tests on an internal codebase, the number of SAT solver timeouts we
observe
with Crubit's nullability checker does increase by about 25%. They can
be
brought back to the previous level by doubling the SAT solver work
limit.
The code assumed that the source parameter of an assignment operator is
always
passed by reference, but it is legal for it to be passed by value.
This patch includes a test that assert-fails without the fix.
As described [here](https://discourse.llvm.org/t/70086/6), there are
legitimate
non-bug scenarios where two `DeclToLoc` maps to be joined contain
different
storage locations for the same declaration. This patch also adds a test
containing an example of such a situation. (The test fails without the
other
changes in this patch.)
With the assertion removed, the existing logic in `intersectDenseMaps()`
will
remove the corresponding declaration from the joined DeclToLoc map.
We also remove `removeDecl()`'s precondition (that the declaration must
be
associated with a storage location) because this may no longer hold if
the
declaration was previously removed during a join, as described above.
Widen on backedge nodes, instead of nodes with a loop statement as
terminator.
This fixes#67834 and a precision loss from assignment in a loop
condition. The
commit contains tests for both of these issues.
And simplify formulas containing true/false
It's unclear to me how useful this is, it does make formulas more
conveniently self-contained now (we can usefully print them without
carrying around the "true/false" labels)
(while here, simplify !!X to X, too)
Differential Revision: https://reviews.llvm.org/D153485
So that the values that are accessed via such accessors can be analyzed
as a limited version of context-sensitive analysis. We can potentially
do this only when some option is set, but doing additional modeling like
this won't be expensive and intrusive, so we do it by default for now.
The test demonstrates that the `this` pointer seen in the constructor
has the
same value as the address of the variable the object is constructed
into.
In C++ it seems it is legit to use base class's operator (e.g. `using
Base::operator=`) to perform copy if the base class is the common
ancestor of the source and destination object. In such a case we
shouldn't try to access fields beyond that of the base class, however
such a case seems to be very rare (typical code would implement a copy
constructor instead), and could add complexities, so in this patch we
simply bail if the method operator's parent class is different from the
type of the destination object that this framework recognizes.
We want to eliminate the `RecordStorageLocation` from `RecordValue` and,
ultimately, eliminate `RecordValue` entirely (see the discussion linked
in the
`RecordValue` class comment). This is one step in that direction.
To eliminate `RecordValue::getChild()`, we also eliminate the last
remaining
caller, namely the `getFieldValue(const RecordValue *, ...)` overload.
Calls
to this overload have been rewritten to use the
`getFieldValue(const RecordStorageLocation *, ...)` overload. Note that
this
also makes the code slightly simpler in many cases.
When we call `getEnvironment`, `BlockToState[BlockId]` for the block can
return null even if CFCtx.isBlockReachable(B) returns true if it is
called from a particular block that is marked unreachable to the block.
This change makes widening act the same as equivalence checking. When the
analysis does not provide an answer regarding the equivalence of two distinct
values, the framework treats them as equivalent. This is an unsound choice that
enables convergence.
Differential Revision: https://reviews.llvm.org/D159355
Usually RecordValues for record objects (e.g. struct) are initialized with
`Environment::createValue()` which internally calls `getObjectFields()` to
collects all fields from the current and base classes, and then filter them
with `ModeledValues` via `DACtx::getModeledFields()` so that the fields that
are actually referenced are modeled.
The consistent set of fields should be initialized when a record is initialized
with an initializer list (InitListExpr), however the existing code's behavior
was different.
Before this patch:
* When a struct is initialized with InitListExpr, its fields are
initialized based on what is returned by `getFieldsForInitListExpr()`, which
only collects the direct fields in the current class, but not from the base
classes. Moreover, if the base classes have their own InitListExpr, values
that are initialized by their InitListExpr's weren't merged into the
child objects.
After this patch:
* When a struct is initialized with InitListExpr, it collects and merges the
fields in the base classes that were initialized by their InitListExpr's.
The code also asserts that the consistent set of fields are initialized
with the ModeledFields.
Reviewed By: mboehme
Differential Revision: https://reviews.llvm.org/D159284
We want to work towards eliminating the `RecordStorageLocation` from
`RecordValue`. These particular uses of `RecordValue::getChild()` can
simply
be replaced with `RecordStorageLocation::getChild()`.
Calls to member operators are a special case in that their callees have pointer
type. The callees of non-operator non-static member functions are not pointers.
See the comments in the code for details.
This issue came up in the Crubit nullability check; the fact that we weren't
modeling the `PointerValue` caused non-convergence.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D158592
These are broken out from https://reviews.llvm.org/D156658, which it now seems obvious isn't the right way to solve the non-convergence.
Instead, my plan is to address the non-convergence through pointer value widening, but the exact way this should be implemented is TBD. In the meantime, I think there's value in getting these repros submitted to record the current undesirable behavior.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D158513
- Both of these constructs are used to represent structs, classes, and unions;
Clang uses the collective term "record" for these.
- The term "aggregate" in `AggregateStorageLocation` implies that, at some
point, the intention may have been to use it also for arrays, but it don't
think it's possible to use it for arrays. Records and arrays are very
different and therefore need to be modeled differently. Records have a fixed
set of named fields, which can have different type; arrays have a variable
number of elements, but they all have the same type.
- Futhermore, "aggregate" has a very specific meaning in C++
(https://en.cppreference.com/w/cpp/language/aggregate_initialization).
Aggregates of class type may not have any user-declared or inherited
constructors, no private or protected non-static data members, no virtual
member functions, and so on, but we use `AggregateStorageLocations` to model all objects of class type.
In addition, for consistency, we also rename the following:
- `getAggregateLoc()` (in `RecordValue`, formerly known as `StructValue`) to
simply `getLoc()`.
- `refreshStructValue()` to `refreshRecordValue()`
We keep the old names around as deprecated synonyms to enable clients to be migrated to the new names.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D156788
In the [value categories RFC](https://discourse.llvm.org/t/70086), I proposed that the end state of the migration should be that `getValue()` should only be legal to call on prvalues.
As a stepping stone, to allow migrating off existing calls to `getValue()`, I proposed introducing `getValueStrict()`, which would already have the new semantics.
However, I've now reconsidered this. Any expression, whether prvalue or glvalue, has a value, so really there isn't any reason to forbid calling `getValue()` on glvalues. I'm therefore removing the deprecation from `getValue()` and transitioning existing `getValueStrict()` calls back to `getValue()`.
The other "strict" accessors are a different case. `setValueStrict()` should only be called on prvalues because glvalues need to have a storage location associated with them; it doesn't make sense to only set a value for them. And, of course, `getStorageLocationStrict()` and `setStorageLocationStrict()` should obviously only be called on glvalues because prvalues don't have storage locations.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155921