When we run the CGSCC pass we should only invest time on the SCC. We can
initialize AAs with information from the module slice but we should not
update those AAs. We make an exception for are call site of the SCC as
they are helpful providing information for the SCC.
Minor modifications to pointer privatization allow us to perform it even
in the CGSCC pass, similar to ArgumentPromotion.
The Attributor, as many other parts in LLVM, uses pointer equivalence
for `llvm::Value`s. This only works as long as `llvm::Value`s are
dynamically unique, or, to be exact, we will never end up with the same
`llvm::Value` representing two dynamic instances. We already provided a
helper to check the former, namely `AA::isDynamicallyUnique`, however we
could not check the latter. In this patch we move the logic into a
separate AA which helps with the growing complexity and use cases. We
also extend the interface to answer the second question rather than the
first. So we do not determine dynamically uniqueness but if we might end
up with the `llvm::Value` describing a different dynamic instance. Note
that the latter is very much tied to the Attributor capabilities to look
through memory, recursion, etc. so we need to update the logic as we go.
We look through loads in the "generic value traversal" and we
consequently don't need to look through them again in AAValueSimplify*.
The test changes stem from the fact that we allowed any simplified
value, incl. non-dynamically unique ones, as long as the underlying
memory was an alloca. This doesn't seem to make sense as allocas do not
protect against dynamically non-unique values. We need to make the
unique check better rather than excluding allocas. That in mind, we can
remove a lot of code by simply relying on the generic value traversal
load look through.
To soften the blow some minor adjustments have been made that allow more
simplification through the now used scheme and some tests have been
given a `norecurse` for now.
With D106397 we ensured that `AAReachability` will not answer queries for
potentially recursive functions. This was necessary as we did not treat
recursion explicitly otherwise. Now that we have
`AA::isPotentiallyReachable` we can make `AAReachability` a purely
intra-procedural AA which does not care about recursion.
`AA::isPotentiallyReachable`, however, does already deal with "going
back" the call graph and can now do so for potentially recursive
functions.
If we ignore droppable users everything only used in llvm.assume (among
other things) is going to be deleted as dead. This is not helpful.
Instead we want to only delete things we actually don't need anymore. A
follow up will deal with loads in a smarter way.
When simplify values we might end up with an instruction from a
different scope or just one that does not dominate the use. If the
instruction can be reproduced without side-effect (incl. UB) we can
now do that. For now this is mostly used for speculatable (intrinsic)
calls but as we learn to make things like arguments or loads available
this will become more powerful.
This will also allow us to remove dead stores more easily in a follow
up.
I didn't dig into this very much because it appears to be totally valid
(especially once these properties can come from attributes instead
of only from hard-coded library functions) for TLI to not be defined,
and nothing broke when I added this check, including with all my other
patches applied.
Differential Revision: https://reviews.llvm.org/D122917
Inline assembly is scary but we need to support it for the OpenMP GPU
device runtime. The new assumption expresses the fact that it may not
have call semantics, that is, it will not call another function but
simply perform an operation or side-effect. This is important for
reachability in the presence of inline assembly.
Differential Revision: https://reviews.llvm.org/D109986
This reverts commit 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 as it
breaks the buildbots.
I didn't see these failures in the pre-merge checks, looking into it.
Most intrinsics, especially "default" ones, will not call back into the
IR module. `nocallback` encodes this nicely. As it was not used before,
this patch also makes use of `nocallback` in the Attributor which
results in many more `norecurse` deductions.
Tablegen part is mechanical, test updates by script.
Differential Revision: https://reviews.llvm.org/D118680
There is potential for endless recursion if we try to determine the
underlying objects of a load, just to end up with the load as underlying
object. A proper solution will require us to pass a visited set around.
This will happen as we cleanup genericValueTraversal soon.
Before we used the capture tracker to follow pointer uses, now we do it
explicitly ourselves through the Attributor API. There are multiple
benefits: For one, the boilerplate is cut down by a lot. The class,
potential copies vector, etc. is all not needed anymore. We also do
avoid explicitly looking through memory here, something that was
duplicated and should only live in the `checkForAllUses~ helper. More
importantly, as we do simplifications we need to make sure all parties
are in sync when they reason about uses. The old way did not allow us to
do this but the new one does as every use visiting AA goes through
`checkForAllUses` now..
As replacements will become more complex it is better to have a single
AA responsible for replacing a use. Before this patch AAValueSimplify*
and AAValueSimplifyReturned could both try to replace the returned
value. The latter was marginally better for the old pass manager
when a function was already carrying a `returned` attribute and when
the context of the return instruction was important. The second
shortcoming was resolved by looking for return attributes in the
AAValueSimplifyCallSiteReturned initialization. The old PM impact is
not concerning.
This is yet another step towards the removal of AAReturnedValues, the
very first AA we should now try to eliminate due to the overlapping
logic with value simplification.
There was some ad-hoc handling of liveness and manifest to avoid
breaking CGSCC guarantees. Things always slipped through though.
This cleanup will:
1) Prevent us from manifesting any "information" outside the CGSCC.
This might be too conservative but we need to opt-in to annotation
not try to avoid some problematic ones.
2) Avoid running any liveness analysis outside the CGSCC. We did have
some AAIsDeadFunction handling to this end but we need this for all
AAIsDead classes. The reason is that AAIsDead information is only
correct if we actually manifest it, since we don't (see point 1) we
cannot actually derive/use it at all. We are currently trying to
avoid running any AA updates outside the CGSCC but that seems to
impact things quite a bit.
3) Assert, don't check, that our modifications (during cleanup) modifies
only CGSCC functions.
In an attempt to remove the memory leak we introduced a double free.
The problem was that we allowed a plain copy of the state and it was
actually used. The use was useless, so it is gone now. The copy
constructor is gone as well. The move constructor ensures the Accesses
pointers are owned by a single state, I hope.
Reported by: https://lab.llvm.org/buildbot/#/builders/16/builds/25820
Dropping this restriction seems to work fine (there are no assertion
failures), so it appears that either the updater got smarter or the
problematic cases are restricted elsewhere.
If doing this still causes issues, then the place to address it
would probably be 8f5bdaf481/llvm/lib/Transforms/IPO/Attributor.cpp (L1856-L1859),
which already prevents replacement outside the SCC, so I'm not
quite sure what this check is intended to avoid.
Differential Revision: https://reviews.llvm.org/D120987
This check is not compatible with opaque pointers. We can avoid
it by adjusting the getPointerAlignment() implementation to avoid
creating unnecessary ptrtoint expressions for bitcasted pointers.
The code already uses OnlyIfReduced to not create an expression
if it does not simplify, and this makes sure that folding a
bitcast and ptrtoint into a ptrtoint doesn't count as a
simplification.
Differential Revision: https://reviews.llvm.org/D120904
We already look through memory to determine where a value that is stored
might pop up again (potential copies). This patch introduces the other
direction with similar logic. If a value is loaded, we can follow all
the accesses to the pointer (or better object) and try to determine what
value might have been stored.
Both `undef` and `nullptr` are maximally aligned. This is especially
important as we often see `undef` until a proper value has been
identified during simplification.
With D106397 we used CFG reasoning to filter out writes that will not
interfere with a given load instruction. With this patch we use the
same logic (modulo the reversal in reachability check order) for store
instructions. As an example, we can now proof stores to shared memory
are dead if all the loads of the shared memory are not reachable from
them.
Heap-2-stack and heap-2-shared can replace an allocation call with
something else. To avoid us deriving information from the allocator
implementation we register a simplification callback now that will
force us to stop at the call site. We probably should create the
replacement memory eagerly and return that instead though.
While we can use range information when we derive dereferenceability we
must make sure to pick he right end of the range. Before we always went
with the minimal offset, which is not correct if we want to combine
the base dereferenceability with some offset. In that case it's the
maximum that gives the correct result.
We already have a check for !InstQueries.empty(), so move the for-range over InstQueries inside to avoid the AAReachability uninitialized variable static analysis warnings.
Prior to this change, LLVM would attempt to optimize an
aligned_alloc(33, ...) call to the stack. This flunked an assertion when
trying to emit the alloca, which crashed LLVM. Avoid that with extra
checks.
Differential Revision: https://reviews.llvm.org/D119604
Prior to this change, LLVM would attempt to optimize an
aligned_alloc(33, ...) call to the stack. This flunked an assertion when
trying to emit the alloca, which crashed LLVM. Avoid that with extra
checks.
Differential Revision: https://reviews.llvm.org/D119604
With
668c5c688b
we introduced an ordering issue revealed by the reverse iteration
buildbot. Depending on the order of the map that tracks the AAIsDead AAs
we ended up with slightly different attributes. This is not totally
unexpected and can happen. We should however be deterministic in our
orderings to avoid such issues.
When we move an allocation from the heap to the stack we need to
allocate it in the alloca AS and then cast the result. This also
prevents us from inserting the alloca after the allocation call but
rather right before.
Fixes https://github.com/llvm/llvm-project/issues/53858
When we use liveness for edges during the `genericValueTraversal` we
need to make sure to use the AAIsDead of the correct function. This
patch adds the proper logic and some simple caching scheme. We also
add an assertion to the `isEdgeDead` call to make sure future misuse
is detected earlier.
Fixes https://github.com/llvm/llvm-project/issues/53872
`UsedAssumedInformation` is a return argument utilized to determine what
information is known. Most APIs used it already but
`genericValueTraversal` did not. This adds it to `genericValueTraversal`
and replaces `AllCallSitesKnown` of `checkForAllCallSites` with the
commonly used `UsedAssumedInformation`.
This was supposed to be a NFC commit, then the test change appeared.
Turns out, we had one user of `AllCallSitesKnown` (AANoReturn) and the
way we set `AllCallSitesKnown` was wrong as we ignored the fact some
call sites were optimistically assumed dead. Included a dedicated test
for this as well now.
Fixes https://github.com/llvm/llvm-project/issues/53884
New users might want to check bins without a load or store instruction
at hand. Since we use those instructions only to find the offset and
size of the access anyway, we can expose an offset and size interface
to the outside world as well.
This commit mainly moves code around and exposes a class (OffsetAndSize)
as well as a method forallInterferingAccesses in AAPointerInfo.
Differential Revision: https://reviews.llvm.org/D119249
The oversight caused us to ignore call sites that are effectively dead
when we computed reachability (or more precise the call edges of a
function). The problem is that loads in the readonly callee might depend
on stores prior to the callee. If we do not track the call edge we
mistakenly assumed the store before the call cannot reach the load.
The problem is nicely visible in:
`llvm/test/Transforms/Attributor/ArgumentPromotion/basictest.ll`
Caused by D118673.
Fixes https://github.com/llvm/llvm-project/issues/53726