In the [value categories RFC](https://discourse.llvm.org/t/70086), I proposed that the end state of the migration should be that `getValue()` should only be legal to call on prvalues.
As a stepping stone, to allow migrating off existing calls to `getValue()`, I proposed introducing `getValueStrict()`, which would already have the new semantics.
However, I've now reconsidered this. Any expression, whether prvalue or glvalue, has a value, so really there isn't any reason to forbid calling `getValue()` on glvalues. I'm therefore removing the deprecation from `getValue()` and transitioning existing `getValueStrict()` calls back to `getValue()`.
The other "strict" accessors are a different case. `setValueStrict()` should only be called on prvalues because glvalues need to have a storage location associated with them; it doesn't make sense to only set a value for them. And, of course, `getStorageLocationStrict()` and `setStorageLocationStrict()` should obviously only be called on glvalues because prvalues don't have storage locations.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155921
There are no remaining uses of this class in the framework.
This patch is part of the ongoing migration to strict handling of value categories (see https://discourse.llvm.org/t/70086 for details).
Reviewed By: ymandel, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D155922
After this change, `StructValue` is just a wrapper for an `AggregateStorageLocation`. For the wider context, see https://discourse.llvm.org/t/70086.
## How to review
- Start by looking at the comments added / changed in Value.h, StorageLocation.h,
and DataflowEnvironment.h. This will give you a good overview of the semantic
changes.
- Look at the corresponding .cpp files that implement the semantic changes.
- Transfer.cpp, TypeErasedDataflowAnalysis.cpp, and RecordOps.cpp show how the
core of the framework is affected by the semantic changes.
- UncheckedOptionalAccessModel.cpp shows how this complex model is affected by
the changes.
- Many of the changes in the rest of the patch are mechanical in nature.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155446
In the case where the expression was not yet associated with a storage location, we created a new storage location but failed to associate it with the expression.
The newly added test fails without the fix.
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D155465
Besides being a useful abstraction, this function will help insulate existing clients of the framework from upcoming changes to the API of `StructValue` and `AggregateStorageLocation`.
Depends On D155202
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155204
This consolidates the code used in various places to initialize objects (usually for variables) into one central location.
It will also help reduce the number of changes needed when we make the upcoming API changes to `AggregateStorageLocation` and `StructValue`.
Depends On D155074
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155075
Fixes two places where we relied on map iteration order when processing
values, which leaked nondeterminism into the generated SAT formulas.
Adds a couple of tests that directly assert that the SAT system is
equivalent on each run.
It's desirable that the formulas are deterministic based on the input:
- our SAT solver is naive and perfermance is sensitive to even simple
semantics-preserving transformations like A|B to B|A.
(e.g. it's likely to choose a different variable to split on).
Timeout failures are bad, but *flaky* ones are terrible to debug.
- similarly when debugging, it's important to have a consistent
understanding of what e.g. "V23" means across runs.
---
Both changes in this patch were isolated from a nullability analysis of
real-world code which was extremely slow, spending ages in the SAT
solver at "random" points that varied on each run.
I've included a reduced version of the code as a regression test.
One of the changes shows up directly as flow-condition nondeterminism
with a no-op analysis, the other relied on bits of the nullability
analysis but I found a synthetic example to show the problem.
Differential Revision: https://reviews.llvm.org/D154948
Previously, we were including these fields only in the specific instance that was initialized by the `InitListExpr`, but not in other instances of the same type. This is inconsistent and error-prone.
Depends On D154952
Reviewed By: xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D154961
- Rename `getReferencedFields()` to `getModeledFields()`. Move the logic that
returns all object fields when doing a context-sensitive analysis to here from
`DataflowAnalysisContext::createStorageLocation()`. I think all callers of
the previous `getReferencedFields()` should use this logic; the fact that they
were not doing so looks like a bug.
- Make `getModeledFields()` public. I have an upcoming patch that will need to
use this function from Transfer.cpp, and there doesn't seem to be any reason
why this function should not be public.
- Use a `SmallSetVector` to get deterministic iteration order. I have a pending
patch where I'm getting flaky tests because
`Environment::createValueUnlessSelfReferential()` is non-deterministically
populating different fields depending on the iteration order. This change
fixes those flaky tests.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D154586
This reverts commit 7a72ce98224be76d9328e65eee472381f7c8e7fe.
Test problems were due to unspecified order of function arg evaluation.
Reland "[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)"
This properly frees the Value hierarchy from managing boolean formulas.
We still distinguish AtomicBoolValue; this type is used in client code.
However we expect to convert such uses to BoolValue (where the
distinction is not needed) or Atom (where atomic identity is intended),
and then fold AtomicBoolValue into FormulaBoolValue.
We also distinguish TopBoolValue; this has distinct rules for
widen/join/equivalence, and top-ness is not represented in Formula.
It'd be nice to find a cleaner representation (e.g. the absence of a
formula), but no immediate plans.
For now, BoolValues with the same Formula are deduplicated. This doesn't
seem desirable, as Values are mutable by their creators (properties).
We can probably drop this for FormulaBoolValue immediately (not in this
patch, to isolate changes). For AtomicBoolValue we first need to update
clients to stop using value pointers for atom identity.
The data structures around flow conditions are updated:
- flow condition tokens are Atom, rather than AtomicBoolValue*
- conditions are Formula, rather than BoolValue
Most APIs were changed directly, some with many clients had a
new version added and the existing one deprecated.
The factories for BoolValues in Environment keep their existing
signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue)
and are not deprecated. These have very many clients and finding the
most ergonomic API & migration path still needs some thought.
Differential Revision: https://reviews.llvm.org/D153469
These changes are OK, but they break downstream stuff that needs more time to adapt :-(
This reverts commit 71579569f4399d3cf6bc618dcd449b5388d749cc.
This reverts commit 5e4ad816bf100b0325ed45ab1cfea18deb3ab3d1.
This reverts commit 1c3ac8dfa16c42a631968aadd0396cfe7f7778e0.
This properly frees the Value hierarchy from managing boolean formulas.
We still distinguish AtomicBoolValue; this type is used in client code.
However we expect to convert such uses to BoolValue (where the
distinction is not needed) or Atom (where atomic identity is intended),
and then fold AtomicBoolValue into FormulaBoolValue.
We also distinguish TopBoolValue; this has distinct rules for
widen/join/equivalence, and top-ness is not represented in Formula.
It'd be nice to find a cleaner representation (e.g. the absence of a
formula), but no immediate plans.
For now, BoolValues with the same Formula are deduplicated. This doesn't
seem desirable, as Values are mutable by their creators (properties).
We can probably drop this for FormulaBoolValue immediately (not in this
patch, to isolate changes). For AtomicBoolValue we first need to update
clients to stop using value pointers for atom identity.
The data structures around flow conditions are updated:
- flow condition tokens are Atom, rather than AtomicBoolValue*
- conditions are Formula, rather than BoolValue
Most APIs were changed directly, some with many clients had a
new version added and the existing one deprecated.
The factories for BoolValues in Environment keep their existing
signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue)
and are not deprecated. These have very many clients and finding the
most ergonomic API & migration path still needs some thought.
Differential Revision: https://reviews.llvm.org/D153469
The newly added test fails without the other changes in this patch.
Reviewed By: sammccall, gribozavr2
Differential Revision: https://reviews.llvm.org/D154420
This avoids the need for casts at callsites.
Depends On D153852
Reviewed By: sammccall, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D153854
Mutating join() isn't used and so appears to be an anti-optimization.
Having Lattice vs Environment inconsistent is awkward, particularly when trying
to minimize copies while joining.
This patch eliminates the difference, but doesn't actually change the signature
of join on concrete lattice types (as that's a breaking change).
Differential Revision: https://reviews.llvm.org/D153908
This serves two purposes:
- Because, today, we only copy the `StructValue`, modifying the destination of
the copy also modifies the source. This is demonstrated by the new checks
added to `CopyConstructor` and `MoveConstructor`, which fail without the
deep copy.
- It lays the groundwork for eliminating the redundancy between
`AggregateStorageLocation` and `StructValue`, which will happen as part of the
ongoing migration to strict handling of value categories (seeo
https://discourse.llvm.org/t/70086 for details). This will involve turning
`StructValue` into essentially just a wrapper for `AggregateStorageLocation`;
under this scheme, the current "shallow" copy (copying a `StructValue` from
one `AggregateStorageLocation` to another) will no longer be possible.
Because we now perform deep copies, tests need to perform a deep equality
comparison instead of just comparing for equal identity of the `StructValue`s.
The new function `recordsEqual()` provides such a deep equality comparison.
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D153006
Environments are heavyweight, and copies are observably different from the
original: they introduce new SAT variables, which degrade performance &
debugging. Copies should only be done deliberately, where justified.
Empirically there are several places in the framework where we perform dubious
copies, sometimes entirely accidentally. (see e.g. D153491). Making these
explicit makes this mistake harder.
This patch forces copies to go through fork(), the copy-constructor is private.
This requires changes to existing callsites: some are correct and call fork(),
some are incorrect and are fixed, others are difficult and I left a FIXME.
Differential Revision: https://reviews.llvm.org/D153674
This patch changes the way `Environment::ReturnLoc` is set: Whereas previously it was set by the caller, it is now set by the callee (obviously, as we otherwise would not be able to return references).
The patch also introduces `Environment::ReturnVal`, which is used for non-reference-type return values. This allows these to be handled with the correct value category semantics; see also https://discourse.llvm.org/t/70086, which describes the ongoing migration to strict value category semantics.
Depends On D150776
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D151194
This is part of the gradual migration to strict handling of value categories, as described in the RFC at https://discourse.llvm.org/t/70086.
This patch migrates some representative calls of the newly deprecated accessors to the new `Strict` functions. Followup patches will migrate the remaining callers. (There are a large number of callers, with some subtlety involved in some of them, so it makes sense to split this up into multiple patches rather than migrating all callers in one go.)
The `Strict` accessors as implemented here have some differences in semantics compared to the semantics originally proposed in the RFC; specifically:
* `setStorageLocationStrict()`: The RFC proposes to create an intermediate
`ReferenceValue` that then refers to the `StorageLocation` for the glvalue.
It turns out though that, even today, most places in the code are not doing
this but are instead associating glvalues directly with their
`StorageLocation`. It therefore didn't seem to make sense to introduce new
`ReferenceValue`s where there were none previously, so I have chosen to
instead make `setStorageLocationStrict()` simply call through to
`setStorageLocation(const Expr &, StorageLocation &)` and merely add the
assertion that the expression must be a glvalue.
* `getStorageLocationStrict()`: The RFC proposes that this should assert that
the storage location for the glvalue expression is associated with an
intermediate `ReferenceValue`, but, as explained, this is often not true.
The current state is inconsistent: Sometimes the intermediate
`ReferenceValue` is there, sometimes it isn't. For this reason,
`getStorageLocationStrict()` skips past a `ReferenceValue` if it is there but
otherwise directly returns the storage location associated with the
expression. This behavior is equivalent to the existing behavior of
`SkipPast::Reference`.
* `setValueStrict()`: The RFC proposes that this should always create the same
`StorageLocation` for a given `Value`, but, in fact, the transfer functions
that exist today don't guarantee this; almost all transfer functions
unconditionally create a new `StorageLocation` when associating an expression
with a `Value`.
There appears to be one special case:
`TerminatorVisitor::extendFlowCondition()` checks whether the expression is
already associated with a `StorageLocation` and, if so, reuses the existing
`StorageLocation` instead of creating a new one.
For this reason, `setValueStrict()` implements this logic (preserve an
existing `StorageLocation`) but makes no attempt to always associate the same
`StorageLocation` with a given `Value`, as nothing in the framework appers to
require this.
As `TerminatorVisitor::extendFlowCondition()` is an interesting special case,
the `setValue()` call there is among the ones that this patch migrates to
`setValueStrict()`.
Reviewed By: sammccall, ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D150653
As a replacement, we provide the accessors `getImplicitObjectLocation()` and
`getBaseObjectLocation()`, which are higher-level constructs that cover the use
cases in which `SkipPast::ReferenceThenPointer` was typically used.
Unfortunately, it isn't possible to use these accessors in
UncheckedOptionalAccessModel.cpp; I've added a FIXME to the code explaining the
details. I initially attempted to resolve the issue as part of this patch, but
it turned out to be non-trivial to fix. Instead, I have therefore added a
lower-level replacement for `SkipPast::ReferenceThenPointer` that is used only
within this file.
The wider context of this change is that `SkipPast` will be going away entirely.
See also the RFC at https://discourse.llvm.org/t/70086.
Reviewed By: ymandel, gribozavr2
Differential Revision: https://reviews.llvm.org/D149838
This parameter was already a no-op, so removing it doesn't change behavior.
Reviewed By: ymandel
Differential Revision: https://reviews.llvm.org/D150137
This parameter was already a no-op, so removing it doesn't change behavior.
Depends On D149144
Reviewed By: ymandel, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D149151
For the wider context of this change, see the RFC at
https://discourse.llvm.org/t/70086.
After this change, global and local variables of reference type are associated
directly with the `StorageLocation` of the referenced object instead of the
`StorageLocation` of a `ReferenceValue`.
Some tests that explicitly check for an existence of `ReferenceValue` for a
variable of reference type have been modified accordingly.
As discussed in the RFC, I have added an assertion to `Environment::join()` to
check that if both environments contain an entry for the same declaration in
`DeclToLoc`, they both map to the same `StorageLocation`. As discussed in
https://discourse.llvm.org/t/70086/5, this also necessitates removing
declarations from `DeclToLoc` when they go out of scope.
In the RFC, I proposed a gradual migration for this change, but it appears
that all of the callers of `Environment::setStorageLocation(const ValueDecl &,
SkipPast` are in the dataflow framework itself, and that there are only a few of
them.
As this is the function whose semantics are changing in a way that callers
potentially need to adapt to, I've decided to change the semantics of the
function directly.
The semantics of `getStorageLocation(const ValueDecl &, SkipPast SP` now no
longer depend on the behavior of the `SP` parameter. (There don't appear to be
any callers that use `SkipPast::ReferenceThenPointer`, so I've added an
assertion that forbids this usage.)
This patch adds a default argument for the `SP` parameter and removes the
explicit `SP` argument at the callsites that are touched by this change. A
followup patch will remove the argument from the remaining callsites,
allowing the `SkipPast` parameter to be removed entirely. (I don't want to do
that in this patch so that semantics-changing changes can be reviewed separately
from semantics-neutral changes.)
Reviewed By: ymandel, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D149144
This will eliminate the need for more pass-through APIs. Also replace pass-through usages with this exposure.
Reviewed By: ymandel, gribozavr2, xazax.hun
Differential Revision: https://reviews.llvm.org/D149464
DataflowAnalysisContext has a few too many responsibilities, this narrows them.
It also allows the Arena to be shared with analysis steps, which need to create
Values, without exposing the whole DACtx API (flow conditions etc).
This means Environment no longer needs to proxy all these methods.
(For now it still does, because there are many callsites to update, and maybe
if we separate bool formulas from values we can avoid churning them twice)
In future, if we untangle the concepts of Values from boolean formulas/atoms,
Arena would also be responsible for creating formulas and managing atom IDs.
Differential Revision: https://reviews.llvm.org/D148554
To ensure that we have a pointee for the `PointerValue`, we also create
storage locations for `FunctionDecl`s referenced in the function under analysis.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D148006
This is less verbose than checking for class, struct, and union individually,
and I believe it's also more efficient (not that that should be the overriding
concern).
Reviewed By: sammccall, xazax.hun
Differential Revision: https://reviews.llvm.org/D147603
These methods provide a less verbose way of allocating `StorageLocation`s and
`Value`s than the existing `takeOwnership(make_unique(...))` pattern.
In addition, because allocation of `StorageLocation`s and `Value`s now happens
within the `DataflowAnalysisContext`, the `create<T>()` open up the possibility
of using `BumpPtrAllocator` to allocate these objects if it turns out this
helps performance.
Reviewed By: ymandel, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D147302
The deduplicated code is moved into initVars().
As an added bonus, pushCallInternal() now also gets the "Add all fields
mentioned in default member initializers" behavior, which apparently had been
added to the Environment ctor but not pushCallInternal().
Reviewed By: xazax.hun, ymandel
Differential Revision: https://reviews.llvm.org/D147326
When building the set of referenced fields for the `DataflowAnalysisContext`,
include fields referenced only in default member initializers. These
initializers are visited in the CFGs of constructors and so the fields must be
included when analysing constructor bodies.
Differential Revision: https://reviews.llvm.org/D144987
Currently, the code assumes that all boolean-typed values are an instance of
`BoolValue` (or its subclasses). Yet, lvalues violate this assumption. This
patch drops the assumption and strengthens the check to confirm the shape of
both values being joined.
The patch also notes as FIXMES a number of problems discovered fixing this bug.
Differential Revision: https://reviews.llvm.org/D141709
There were two (small) bugs causing crashes in the analysis. This patch fixes both of them.
1. An enum value was accessed as a class member. Now, the engine gracefully
ignores such member expressions.
2. Field access in `MemberExpr` of struct/class-typed global variables. Analysis
didn't interpret fields of global vars, because the vars were initialized before
the fields were added to the "allowlist". Now, the allowlist is set _before_
init of globals.
Differential Revision: https://reviews.llvm.org/D141384