This reverts commit 9e08b083a09ef4e02fb0a4de2c0d3ddc0eccadde and ensures
signature rewriting also updates dead call sites to avoid the call graph
assertion.
In f3ad8cf00e213 we introduced a bug that caused us to skip callees
when we replace uses. This is not sound since subsequent IR cleanup will
assume the replacement has happened. As a result, we created poison
callees for a long while. The original intent of the check was to
prevent call graph invalidation; however, we now properly check whether
the instructions (here the call) are inside the SCC or not.
In CGSCC mode we cannot delete internal library functions, esp.
__kmpc_alloc_shared, or we trigger an assertion. While the assertion is
probably too narrow, we avoid deleting those unused functions for now to
unblock the AMDGPU buildbot.
The externalization was always a stopgap solution. One of its drawbacks
is that it is very conservative, regardless of whether we actually
require the functions at the end of the pass. The new concept is more
generic and properly integrates into the dependence graph. Whenever we
might need a function, it has a "virtual use" that cannot be analyzed.
If we do not need it because of some AA state, there will be a
dependence to ensure state changes trigger revisits of uses, including
a potentially new virtual use.
Future AAs might need to iterate their own state until they reach a
fixpoint. We do not want to forbid that but we want to avoid negative
effects or bugs once this happens. As a precaution, we now rerun an AA
that did not require outside information. If it does not change anymore
we are done, otherwise the AA needs to iterate some more.
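A minimal sketch of the rerun precaution described above, using toy
stand-ins rather than the Attributor's real classes. `ToyAA`,
`UsedOutsideInfo`, and `runToFixpoint` are hypothetical names introduced
only for illustration:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins; these are not the Attributor's real types.
enum class ChangeStatus { Unchanged, Changed };

struct ToyAA {
  int State = 0;                // Toy state refined by each update.
  bool UsedOutsideInfo = false; // True if the update queried another AA.

  // One update step: move State toward Limit by at most Step.
  ChangeStatus update(int Limit, int Step) {
    if (State >= Limit)
      return ChangeStatus::Unchanged;
    State = std::min(Limit, State + Step);
    return ChangeStatus::Changed;
  }
};

// Rerun an AA that did not need outside information until an update
// reports no change. An AA that used outside information is not rerun
// here; in this sketch we assume its dependences trigger the revisit.
// Returns the number of update calls performed.
int runToFixpoint(ToyAA &AA, int Limit, int Step, int MaxIterations) {
  assert(Step > 0 && MaxIterations > 0 && "sketch expects positive bounds");
  int Iterations = 0;
  while (Iterations < MaxIterations) {
    ++Iterations;
    if (AA.update(Limit, Step) == ChangeStatus::Unchanged)
      break; // The rerun confirmed the fixpoint.
    if (AA.UsedOutsideInfo)
      break; // Leave the revisit to the dependence tracking.
  }
  return Iterations;
}
```

The extra, unchanged update is the price of the precaution: it confirms
that a self-iterating AA has actually settled.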
This patch adds two checks for situations that caused issues in
experiments. One was an oversight that allowed new AAs created during
cleanup to be optimistic. The other treated functions as functions even
if they were used as values, e.g., in a cast instruction. In such cases
we might have assumed the value is dead if the function is not entered,
which isn't true.
The new test functions don't expose a bug but I kept them around.
This patch introduces a new AA, `AAUnderlyingObjects`. It is essentially a
wrapper AA around the function `AA::getAssumedUnderlyingObjects`, but it can
recursively query further if the underlying object is an indirect access, such
as a phi node or a select instruction.
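The recursive lookup through phi and select nodes can be sketched on a toy
value graph. This is an illustration under assumptions, not the code of
`AAUnderlyingObjects`; `ToyValue` and `collectUnderlyingObjects` are
hypothetical names:

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical toy IR; this is not LLVM's Value hierarchy.
struct ToyValue {
  enum Kind { Object, Phi, Select } K;
  std::vector<const ToyValue *> Operands; // Incoming values or select arms.
};

// Collect the objects V may be based on. Phi and select nodes are indirect,
// so we recurse into their operands. Visited guards against cycles, which
// phi nodes in loops can create.
void collectUnderlyingObjects(const ToyValue *V,
                              std::set<const ToyValue *> &Objects,
                              std::set<const ToyValue *> &Visited) {
  assert(V && "expected a non-null value");
  if (!Visited.insert(V).second)
    return; // Already handled, e.g., a phi that feeds itself.
  if (V->K == ToyValue::Object) {
    Objects.insert(V);
    return;
  }
  for (const ToyValue *Op : V->Operands)
    collectUnderlyingObjects(Op, Objects, Visited);
}
```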
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D141164
Previously, we might have missed calling the destructor on an abstract
attribute if it was created outside the seeding or update phase.
All AAs are now in the AAMap and we can use it to delete them all.
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
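For illustration, a small sketch of the access style that avoids `value()`.
The helper names are hypothetical; only the `std::optional` members are
standard:

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: the has_value() check guards the access, so the
// unchecked operator* is safe and no exception path is involved.
int getOffsetOrDefault(const std::optional<int> &Offset, int Default) {
  return Offset.has_value() ? *Offset : Default;
}

// Hypothetical helper: the caller guarantees an engaged optional. The
// assertion documents that contract; operator* itself performs no check,
// unlike Offset.value(), which throws std::bad_optional_access when empty.
int getKnownOffset(const std::optional<int> &Offset) {
  assert(Offset && "caller must pass an engaged optional");
  return *Offset;
}
```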
This fixes clang.
We had two AAs for reachability but it was very cumbersome to extend
them. We also had some fallback to use LLVM-core mechanisms and cache
the result. The new design shares the query code and interface nicely
between AAIntraFnReachability and AAInterFnReachability.
As part of the rewrite we also added the ExclusionSet to the queries.
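To illustrate what an exclusion set means for a reachability query, here is
a sketch on a plain integer graph. It is not the interface of
AAIntraFnReachability or AAInterFnReachability; `Graph`, `Node`, and
`isReachable` are hypothetical, and the choice to exempt the endpoints from
the exclusion check is an assumption of this sketch:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using Node = int;                                // Stand-in for an instruction.
using Graph = std::map<Node, std::vector<Node>>; // Successor lists.

// Returns true if To can be reached from From without passing through a
// node in ExclusionSet. From and To themselves are not checked against the
// set; only intermediate nodes are.
bool isReachable(const Graph &G, Node From, Node To,
                 const std::set<Node> &ExclusionSet) {
  if (From == To)
    return true;
  std::set<Node> Visited;
  std::vector<Node> Worklist{From};
  while (!Worklist.empty()) {
    Node N = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(N).second)
      continue; // Already explored.
    auto It = G.find(N);
    if (It == G.end())
      continue; // No known successors.
    for (Node Succ : It->second) {
      if (Succ == To)
        return true;
      if (ExclusionSet.count(Succ))
        continue; // Paths through excluded nodes do not count.
      Worklist.push_back(Succ);
    }
  }
  return false;
}
```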
This is in preparation for future changes that introduce an actual list of
ranges per Access, to be called a RangeList.
Differential Revision: https://reviews.llvm.org/D138644
We keep loads if they feed into assumes but even if we cannot predict
their value we should delete them if the associated stores are deleted
as well. This is not perfect but prioritizes deleting stores now.
Assumptions can help us reason about memory content. This patch teaches
AAPointerInfo to reason about memory assumptions of the following form:
```
%x = load %ptr
... code not writing memory, may include branches ...
%c = %x == %val
... code not writing memory, may include branches ...
llvm.assume(%c)
```
Assumption accesses are recognized from the involved load (%x above).
They are treated specially, as neither an ordinary read nor an ordinary
write. We use the read encoding with an extra flag. Reads do not impact
other reads or writes, whereas writes could. We don't want assumptions
to impact other writes as they themselves only confirm a value, not
write it. So the "other" write might still be required, as the
assumption only confirms the effect of that write.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
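As an illustration of the spelling change, a sketch with a hypothetical
helper (`parseDigit` is not part of the patch):

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper, for illustration only.
std::optional<int> parseDigit(const std::string &S) {
  if (S.size() != 1 || S[0] < '0' || S[0] > '9')
    return std::nullopt; // Spelled `return None;` before the migration.
  return S[0] - '0';
}

// Usage: both outcomes compare directly against the expected optional.
void exampleUsage() {
  assert(parseDigit("7") == std::optional<int>(7));
  assert(parseDigit("x") == std::nullopt);
}
```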
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This restores commit b756096b0cbef0918394851644649b3c28a886e2, which was
originally reverted in 00b09a7b18abb253d36b3d3e1c546007288f6e89.
AAPointerInfo now maintains a list of all Access objects that it owns, along
with the following maps:
- OffsetBins: OffsetAndSize -> { Access }
- InstTupleMap: RemoteI x LocalI -> Access
A RemoteI is any instruction that accesses memory. RemoteI is different from
LocalI if and only if LocalI is a call; then RemoteI is some instruction in the
callgraph starting from LocalI.
Motivation: When AAPointerInfo recomputes the offset for an instruction, it sets
the value to Unknown if the new offset is not the same as the old offset. The
instruction must now be moved from its current bin to the bin corresponding to
the new offset. This happens, for example, when:
- A PHINode has operands that result in different offsets.
- The same remote inst is reachable from the same local inst via different paths
in the callgraph:
```
A (local inst)
|
B
/ \
C1 C2
\ /
D (remote inst)
```
This fixes a bug where a store is incorrectly eliminated in a lit test.
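The bookkeeping described above can be sketched with standard containers.
This is a simplified illustration, not the AAPointerInfo implementation:
`AccessTable`, `InstId`, `UnknownRange`, and the use of indices instead of
pointers are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

using InstId = int; // Stand-in for an instruction pointer.

struct OffsetAndSize {
  int64_t Offset;
  int64_t Size;
  bool operator==(const OffsetAndSize &O) const {
    return Offset == O.Offset && Size == O.Size;
  }
  bool operator<(const OffsetAndSize &O) const {
    return std::tie(Offset, Size) < std::tie(O.Offset, O.Size);
  }
};

// Hypothetical encoding of the "Unknown" offset used by this sketch.
const OffsetAndSize UnknownRange{std::numeric_limits<int64_t>::min(), -1};

struct Access {
  InstId RemoteI;
  InstId LocalI;
  OffsetAndSize Range;
};

class AccessTable {
  std::vector<Access> AccessList;                             // Owns all accesses.
  std::map<OffsetAndSize, std::set<std::size_t>> OffsetBins;  // Range -> accesses.
  std::map<std::pair<InstId, InstId>, std::size_t> InstTupleMap;

public:
  // Record an access. If the same (RemoteI, LocalI) pair shows up again with
  // a different range, move the access from its old bin to the Unknown bin.
  void addAccess(InstId RemoteI, InstId LocalI, OffsetAndSize Range) {
    auto Key = std::make_pair(RemoteI, LocalI);
    auto It = InstTupleMap.find(Key);
    if (It == InstTupleMap.end()) {
      AccessList.push_back(Access{RemoteI, LocalI, Range});
      std::size_t Idx = AccessList.size() - 1;
      InstTupleMap[Key] = Idx;
      OffsetBins[Range].insert(Idx);
      return;
    }
    Access &Acc = AccessList[It->second];
    if (Acc.Range == Range)
      return; // Same offset as before, nothing to move.
    auto BinIt = OffsetBins.find(Acc.Range);
    assert(BinIt != OffsetBins.end() && "access must be in its bin");
    BinIt->second.erase(It->second);
    if (BinIt->second.empty())
      OffsetBins.erase(BinIt);
    Acc.Range = UnknownRange;
    OffsetBins[UnknownRange].insert(It->second);
  }

  std::size_t numAccesses() const { return AccessList.size(); }

  std::size_t binSize(OffsetAndSize Range) const {
    auto It = OffsetBins.find(Range);
    return It == OffsetBins.end() ? 0 : It->second.size();
  }
};
```

Keying the second map on the (RemoteI, LocalI) pair is what lets the sketch
find the existing access when the same remote instruction is reached along a
second path, as in the diamond above.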
Reviewed By: jdoerfert, ye-luo
Differential Revision: https://reviews.llvm.org/D136526
The struct OffsetAndSize is a simple tuple of two int64_t. Treating it as a
derived class of std::pair has no special benefit, but it makes the code
verbose since we need get/set functions that avoid using "first" and "second" in
client code. Eliminating the std::pair makes this more readable.
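A rough sketch of the plain-struct shape this describes. The member and
method names here are assumptions for illustration and may differ from the
actual definition:

```cpp
#include <cassert>
#include <cstdint>

// Sketch only: a plain struct with named members instead of a class derived
// from std::pair, so client code reads Offset/Size rather than first/second.
struct OffsetAndSize {
  int64_t Offset;
  int64_t Size;

  // Two ranges overlap if each one starts before the other one ends.
  bool mayOverlap(const OffsetAndSize &Other) const {
    assert(Size >= 0 && Other.Size >= 0 && "sketch expects known sizes");
    return Other.Offset < Offset + Size && Offset < Other.Offset + Other.Size;
  }
};
```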
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D136745
When determining the initial value of the object, use the constant
folding API to load a given type at a given offset in the global
initializer. This makes it work for cases where the load doesn't
directly correspond to an aggregate member.
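As a plain C++ analogy (this does not use LLVM's constant folding API), the
idea of reading a given type at a given byte offset of an initializer can be
sketched as follows. `loadFromInitializer` is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <type_traits>

// Read a T at byte Offset from a constant byte initializer. The read does
// not have to line up with any element boundary of the initializer.
// Returns std::nullopt if the read would go out of bounds.
template <typename T, std::size_t N>
std::optional<T> loadFromInitializer(const unsigned char (&Init)[N],
                                     std::size_t Offset) {
  static_assert(std::is_trivially_copyable<T>::value,
                "sketch only handles trivially copyable types");
  if (Offset > N || sizeof(T) > N - Offset)
    return std::nullopt;
  T Result;
  std::memcpy(&Result, Init + Offset, sizeof(T));
  return Result;
}
```

For example, a 16-bit load at offset 1 of a four-byte array spans elements 1
and 2, which is the kind of access that does not correspond to a single
aggregate member.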
Differential Revision: https://reviews.llvm.org/D135435
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
Revert "[Attributor] Teach AAPointerInfo to look into aggregates"
This reverts commit 844f6c5d03d58e7ac0c6b838e4a7834ac575ab9b and
4ed0a88cd8a77370073feb270d77a9e8b27bd68c as they broke the buildbots
that run openmp/libomptarget/test/offloading/bug49021.cpp.
If a function is non-recursive we only performed intra-procedural
reasoning for reachability (via AA::isPotentiallyReachable). However,
a non-recursive function can still be re-entrant, in which case we
cannot rule out reachability. Instead of this problematic logic in the
reachability reasoning we utilize logic in AAPointerInfo. If a location
is for sure written by a function, then even if the function can be
re-entrant or recursive, we know that intra-procedural reasoning alone
is sufficient.
If we have a dominating must-write access we do not need to know the
initial value of some object to perform reasoning about the potential
values. The dominating must-write has overwritten the initial value.
If we only have exact accesses we should never require the bit-pattern
to be uniform (in this case 0). Only a non-exact access should force us
to require only 0 values.
For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
only as an afterthought. `genericValueTraversal` did offer an option
but `AAValueSimplify` did not. Thus, we could end up with "too much"
simplification in certain situations and then had to give up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
problems like the infinite recursion bug (#54981) as well as code
duplication.
This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.
`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences
from the former handling. For example, we may not fold trivially
foldable instructions right now: `add i32 1, 1` is not folded to
`i32 2`, but if an operand is simplified to `i32 1` we still fold it.
We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good even if some tests look like they regress.
Fixes: https://github.com/llvm/llvm-project/issues/54981
Note: A previous version was flawed and consequently reverted in
6555558a80589d1c5a1154b92cc3af9495f8f86c.
This reverts commit f17639ea0cd30f52ac853ba2eb25518426cc3bb8 as three
AMDGPU tests haven't been updated. Will need to verify the changes are
not regressions we should avoid.