Previously, we checked and manifested attributes directly in the IR. This
was bad as it modified the IR before the manifest stage. Now we can
add/remove/inspect attributes without going to the IR (except for the
initial query).
We had some custom handling for existing MemoryEffects, but it now lives
where we check other existing attributes before we manifest new ones. If
we later decide to curb duplication (of attributes on the call site and
the callee), we can do that at a single location and for all attributes.
The test changes add known `memory` callee information to the call sites.
If the IR already has a boolean attribute, or the function is not IPO
amendable, we can avoid creating AAs that would just be forced into a
trivial fixpoint anyway. Since we check boolean IR attributes via
`AA::hasAssumedIRAttr`, we don't need AAs even if they would be fixed
optimistic right away. The only change is in the dependency graph
ordering, as we move AAs around to simplify the code flow. There is no
inherent reason for the order in which we seed AAs, so the new order is
just as fine.
While we can disallow AAs, liveness checks are everywhere, and if the
user doesn't want them it is costly to go through them all just to find
out everything is assumed live.
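A rough sketch of the seeding guard described above (the helper and the
`hasExactDefinition` check below are illustrative stand-ins; in-tree the
query goes through `AA::hasAssumedIRAttr`):

```
#include "llvm/IR/Function.h"

// Hypothetical sketch, not the in-tree code: skip creating an AA when a
// boolean IR attribute already answers the query or the function is not
// IPO amendable.
static bool needsNoRecurseAA(const llvm::Function &F) {
  // The boolean IR attribute already answers the query; no AA needed.
  if (F.hasFnAttribute(llvm::Attribute::NoRecurse))
    return false;
  // A function we cannot analyze IPO-wise would be forced into a trivial
  // fixpoint anyway (simplified stand-in for "IPO amendable").
  if (!F.hasExactDefinition())
    return false;
  return true;
}
```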
We now consistently use `CallBase::getCalledOperand` rather than
`getCalledFunction`, as we do not want the type check performed by the
latter. This exposed various missing checks for handling mismatches
properly, but it is good to have them explicit now.
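For illustration, a minimal sketch of the difference (assuming recent
LLVM headers):

```
#include "llvm/IR/Function.h"
#include "llvm/IR/InstrTypes.h"

using namespace llvm;

// getCalledFunction() returns null if the callee is not a Function or its
// signature does not match the call's function type; getCalledOperand()
// always returns the callee value, so mismatches must be handled explicitly.
static Function *getCalleeIgnoringTypeMismatch(CallBase &CB) {
  return dyn_cast<Function>(CB.getCalledOperand()->stripPointerCasts());
}
```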
In a follow-up we might want to flag certain calls as UB, but for now we
allow everything to cut down on unexpected differences.
It was never really useful to track #iterations, though it helped during
initial development. What we should track instead, in a follow-up, is
#updates; that is also what we should restrict rather than #iterations.
For a lightweight pass we do not want to instantiate or use the
MustBeExecutedContextExplorer. This simply allows such a configuration.
While at it, the explorer is now allocated with the bump allocator.
Derive the mustprogress attribute based on the willreturn attribute
or the fact that all callers are mustprogress.
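A standalone sketch of that rule (illustrative, not the Attributor
implementation; it assumes all callers are visible, e.g., for internal
functions):

```
#include "llvm/IR/Function.h"
#include "llvm/IR/InstrTypes.h"

using namespace llvm;

static bool deriveMustProgress(const Function &F) {
  // willreturn implies mustprogress.
  if (F.hasFnAttribute(Attribute::WillReturn))
    return true;
  // Otherwise require every user to be a call to F from a mustprogress
  // function; any other use blocks the deduction.
  for (const Use &U : F.uses()) {
    const auto *CB = dyn_cast<CallBase>(U.getUser());
    if (!CB || !CB->isCallee(&U) ||
        !CB->getFunction()->hasFnAttribute(Attribute::MustProgress))
      return false;
  }
  return true;
}
```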
Differential Revision: https://reviews.llvm.org/D94740
AS(4), when targeting GPUs, is constant. Accesses to constant memory are
(historically) not treated as "memory accesses", hence such accesses
should not prevent us from deducing `memory(none)`.
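A sketch of the check (assuming the AMDGPU convention that address space
4 is constant memory; names are illustrative):

```
#include "llvm/IR/Instructions.h"

// Target-specific assumption: on AMDGPU, address space 4 is constant memory.
constexpr unsigned ConstantAddrSpace = 4;

// Loads from constant memory need not count as memory accesses for the
// memory(...) deduction.
static bool isConstantMemoryLoad(const llvm::Instruction &I) {
  const auto *LI = llvm::dyn_cast<llvm::LoadInst>(&I);
  return LI && LI->getPointerAddressSpace() == ConstantAddrSpace;
}
```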
Currently we don't check callbacks for global variable simplification.
What's more, the only way we can register a simplification callback for a
global variable is through its initializer (essentially a `Constant *`),
which might not correspond to the right global variable. In this patch,
we set up a dedicated simplification map for `GlobalVariable`.
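A sketch of what such a dedicated map could look like (the callback type
and names are hypothetical):

```
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/GlobalVariable.h"
#include <functional>
#include <optional>

// Hypothetical callback: return a simplified initializer for the variable,
// or std::nullopt if no simplification applies.
using GVSimplifyCB = std::function<std::optional<llvm::Constant *>(
    const llvm::GlobalVariable &)>;

// Keying on the GlobalVariable itself avoids the problem described above,
// where two globals can share a structurally identical initializer Constant.
llvm::DenseMap<const llvm::GlobalVariable *,
               llvm::SmallVector<GVSimplifyCB, 1>>
    GVSimplificationCallbacks;
```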
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D144749
This patch adds the AANonConvergent abstract attribute. It removes the
convergent attribute from functions that only call non-convergent
functions.
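The deduction, as a standalone sketch (not the AA itself, which works on
assumed information during the fixpoint iteration rather than in a single
pass):

```
#include "llvm/IR/Function.h"
#include "llvm/IR/InstIterator.h"
#include "llvm/IR/InstrTypes.h"

using namespace llvm;

static void removeConvergentIfProvable(Function &F) {
  if (!F.hasFnAttribute(Attribute::Convergent))
    return;
  for (Instruction &I : instructions(F))
    if (auto *CB = dyn_cast<CallBase>(&I))
      if (CB->isConvergent())
        return; // A potentially convergent call site keeps the attribute.
  F.removeFnAttr(Attribute::Convergent);
}
```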
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D143228
When we simplify loads we need to adjust types (esp. null values)
properly to avoid inconsistencies down the line. Add a cast and an
error message.
Fixes: https://github.com/llvm/llvm-project/issues/60788
We know kernels (generally) cannot be called from within the module. Thus,
for reachability we would need to step back out of a kernel, which would
allow us to reach anything anyway. Even if a kernel is invoked from
another kernel, values like allocas and shared memory are not accessible.
We implicitly check for this situation to avoid costly lookups.
We used to check the query instructions for effects, but that does not
work well with the complex accesses we will probably support in the
future. Now we simply let the user decide what accesses to look for.
With this patch we track aligned barriers in AAExecutionDomain and also
delete unnecessary barriers there. This allows us to eliminate barriers
across blocks, across functions, and in the presence of complex accesses
that do not force a barrier. Further, we can use the collected
information to enable store-load forwarding in a threaded environment
(follow up patch).
Differential Revision: https://reviews.llvm.org/D140463
This reverts commit 9e08b083a09ef4e02fb0a4de2c0d3ddc0eccadde and ensures
signature rewriting also updates dead call sites to avoid the call graph
assertion.
Back in f3ad8cf00e213 we introduced a bug that caused us to skip callees
when we replaced uses. This is not sound, since subsequent IR cleanup
will assume the replacement has happened. As such, we created poison
callees for a long while. The original intent of the check was to prevent
call graph invalidation; however, we now properly check whether the
instructions (here, the call) are inside the SCC or not.
In CGSCC mode we cannot delete internal library functions, esp.
__kmpc_alloc_shared, or we trigger an assertion. While the assertion is
probably too narrow, we avoid deleting those unused functions for now to
unblock the AMDGPU buildbot.
The externalization was always a stopgap solution. One of its drawbacks
is that it is very conservative, no matter whether we actually require
the functions at the end of the pass. The new concept is more generic and
properly integrates into the dependence graph. Whenever we might need a
function, it has a "virtual use" that cannot be analyzed. If we do not
need it because of some AA state, there will be a dependence to ensure
state changes trigger revisits of the uses, including a potentially new
virtual use.
Future AAs might need to iterate their own state until they reach a
fixpoint. We do not want to forbid that, but we want to avoid negative
effects or bugs once it happens. As a precaution, we now rerun an AA that
did not require outside information: if it does not change anymore, we
are done; otherwise, the AA needs to iterate some more.
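As a sketch of the precaution (types and names here are hypothetical; the
in-tree logic lives in the Attributor's fixpoint verification):

```
// Hypothetical sketch: an AA whose last update used no outside information
// is updated again; only an unchanged result counts as a real fixpoint.
enum class ChangeStatus { UNCHANGED, CHANGED };

struct AASketch {
  virtual ~AASketch() = default;
  virtual ChangeStatus update() = 0;
  bool UsedOutsideInformation = false;
};

static void verifySelfFixpoint(AASketch &AA) {
  // Outside dependences trigger revisits on their own; nothing to do here.
  if (AA.UsedOutsideInformation)
    return;
  // Rerun until the AA's own iteration stabilizes.
  while (AA.update() == ChangeStatus::CHANGED) {
  }
}
```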
This patch adds two checks for issues that surfaced in experiments. One
was an oversight that allowed new AAs created during cleanup to be
optimistic. The other treated functions as functions even if they were
used as values, e.g., in a cast instruction; in such cases we might have
assumed the value is dead if the function is not entered, which isn't
true. The new test functions don't expose a bug, but I kept them around.
This patch introduces a new AA, `AAUnderlyingObjects`. It is basically a
wrapper AA around the function `AA::getAssumedUnderlyingObjects`, but it
can query recursively if the underlying object is an indirect access,
such as a phi node or a select instruction.
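A rough standalone sketch of the recursion (using
`llvm::getUnderlyingObject`; illustrative, not the AA implementation):

```
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

static void collectUnderlyingObjects(const Value *V,
                                     SmallPtrSetImpl<const Value *> &Visited,
                                     SmallPtrSetImpl<const Value *> &Objects) {
  const Value *Obj = getUnderlyingObject(V);
  if (!Visited.insert(Obj).second)
    return; // Guard against phi cycles.
  // Recurse through "indirect" objects such as phis and selects.
  if (const auto *PHI = dyn_cast<PHINode>(Obj)) {
    for (const Value *Incoming : PHI->incoming_values())
      collectUnderlyingObjects(Incoming, Visited, Objects);
    return;
  }
  if (const auto *SI = dyn_cast<SelectInst>(Obj)) {
    collectUnderlyingObjects(SI->getTrueValue(), Visited, Objects);
    collectUnderlyingObjects(SI->getFalseValue(), Visited, Objects);
    return;
  }
  Objects.insert(Obj);
}
```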
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D141164
Before, we might have missed calling the destructor of an abstract
attribute if it was created outside the seeding or update phase.
All AAs are now in the AAMap, and we can use it to delete them all.
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes clang.
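The replacement pattern, for illustration:

```
#include <optional>

int deref(const std::optional<int> &O) {
  // O.value() checks for emptiness and may call
  // __throw_bad_optional_access in libc++; operator* only asserts in
  // debug builds and carries no availability restriction.
  return *O; // was: O.value()
}
```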
We had two AAs for reachability, but it was very cumbersome to extend
them. We also had a fallback that used LLVM-core mechanisms and cached
the result. The new design shares the query code and interface nicely
between AAIntraFnReachability and AAInterFnReachability.
As part of the rewrite, we also added the ExclusionSet to the queries.
This is in preparation for future changes that introduce an actual list of
ranges per Access, to be called a RangeList.
Differential Revision: https://reviews.llvm.org/D138644
We keep loads if they feed into assumes, but even if we cannot predict
their value we should delete them if the associated stores are deleted
as well. This is not perfect, but it prioritizes deleting stores for now.
Assumptions can help us reason about memory content. This patch teaches
AAPointerInfo to reason about memory assumptions of the following form:
```
%x = load i32, ptr %ptr
; ... code not writing memory, may include branches ...
%c = icmp eq i32 %x, %val
; ... code not writing memory, may include branches ...
call void @llvm.assume(i1 %c)
```
Assumption accesses are recognized via the involved load (%x above).
They are treated specially: neither as an ordinary read nor as a write.
We use the read encoding with an extra flag. Reads do not impact other
reads or writes; writes could. We don't want assumptions to impact other
writes, as they themselves only confirm a value, not write it. The
"other" write might still be required, since the assumption only confirms
the effect of that write.
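For illustration, a hypothetical flag layout (the in-tree encoding lives
with AAPointerInfo's access kinds):

```
// Hypothetical flags: an assumption access reuses the read bit plus a
// dedicated marker, so it behaves like a read in interference queries
// (it never clobbers) while staying distinguishable from an ordinary read.
enum AccessKind : unsigned {
  AK_READ = 1 << 0,
  AK_WRITE = 1 << 1,
  AK_ASSUMPTION = (1 << 2) | AK_READ,
};
```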