gcc warned about it:
[232/4788] Building CXX object lib/Transforms/IPO/CMakeFiles/LLVMipo.dir/AttributorAttributes.cpp.o
../lib/Transforms/IPO/AttributorAttributes.cpp: In lambda function:
../lib/Transforms/IPO/AttributorAttributes.cpp:12555:17: warning: unused variable 'SI' [-Wunused-variable]
12555 | if (auto *SI = dyn_cast<StoreInst>(Inst)) {
| ^~
Fix the warning by removing the variable and turn dyn_cast into isa.
`AAAddressSpace` currently only works for `LoadInst` and `StoreInst`
currently. For `StoreInst`, the corresponding use can be the pointer
operand, or value operand, or both. When it is used as value operand, it
can prevent `AMDGPUPromoteAlloca` from optimization in certain cases.
This patch changes the manifest method such that only pointer operand
will be rewritten.
Through the new `Attributor::checkForAllCallees` we can look through
indirect calls and visit all potential callees if they are known. Most
AAs will do that implicitly now via `AACalleeToCallSite`, thus, most AAs
are able to deal with missing callees for call site IR positions.
Differential Revision: https://reviews.llvm.org/D112290
Many AAs translated callee information to the call site explicitly but
they now all use the helper we already had for callee return to call
site return propagation. In a follow up the helper is going to be
extended to handle multiple callees.
Allow specialization of functions with "dynamic" denormal modes to a
known IEEE or DAZ mode based on callers. This should make it possible
to implement a is-denormal-flushing-enabled test using
llvm.canonicalize and have it be free after LTO.
https://reviews.llvm.org/D156129
GlobalValues are often interesting, especially if they have local
linkage. We now track all uses of those and refine potential callees
with it. Effectively, if an internal function cannot reach an indirect
call site, it cannot be a potential callee, even if it has its address
taken.
The Attributor user can now set the closed world flag
(`AttributorConfig.IsClosedWorldModule` or
`-attributor-assume-closed-world`) in order to specialize call edges
based only on available callees. That means, we assume all functions are
known and hence all potential callees must be declared/defined in the
module. We will use this for GPUs and LTO cases, but for now the user
has to set it via a flag.
If a potential callee has more arguments than the call site, the values
passed will be `poison`. If the potential callee would exhibit UB for
such `undef` argument, e.g., they are marked `noundef`, we can rule the
potential callee out.
The user can now limit the number of indirect calls specialized for a
given call site with `-attributor-max-specializations-per-call-base=N`
or the AttributorConfig callback. We further attach the `!callee`
metadata if all remaining callees are known.
The old code did not account for new queries during an update, which
caused us to leave stack RQIs in the map. We are now explicit about
temporary vs non-temporary RQIs.
Fixes: https://github.com/llvm/llvm-project/issues/64959
The visited set was used to not visit the same function twice, however,
the (new) algorithm requires we do since we start the queries at
different call sites.
AAIndirectCallInfo will collect information and specialize indirect call
sites. It is similar to our IndirectCallPromotion but runs as part of
the Attributor (so with assumed callee information). It also expands
more calls and let's the rest of the pipeline figure out what is UB, for
now. We use existing call promotion logic to improve the result,
otherwise we rely on the (implicit) function pointer cast.
This effectively "fixes" #60327 as it will undo the type punning early
enough for the inliner to work with the (now specialized, thus direct)
call.
Fixes: https://github.com/llvm/llvm-project/issues/60327
Before, we allowed the condition to be simplified to a simple constant
only, otherwise we assumed all successors are live. Now we allow
multiple constants, and mark the default successor as dead accordingly.
We know that __kmpc_alloc_shared is by construction matched with a
unique __kmpc_free_shared. Making the compiler aware of these facts
helps to avoid mallocs/allocas.
Fixes: https://github.com/llvm/llvm-project/issues/64551
If we end up visiting a store use, we couldn't follow to the "reloads".
The capture effect of a store is more than memory as the reloads are
unknown.
Fixes: https://github.com/llvm/llvm-project/issues/64613
If we cannot reach the target from the entry of a function without
exclusion set, we cannot reach it at all. This can allow us to filter
unique queries based on a uniform one.
We can improve our deduction if we stop at PHI and select instructions
and also iterate the returned values explicitly. The latter helps with
isImpliedByIR deductions.
The following pattern fails on recent GCC versions with -std=c++20 flag passed
and succeeds with -std=c++17. Such behavior is not observed on Clang 16.0.
```c++
template <typename T>
struct Foo {
Foo<T>(int a) {}
};
```
This patch removes template parameter from constructor in two occurences to
make the following command complete successfully:
bazel build -c fastbuild --cxxopt=-std=c++20 --host_cxxopt=-std=c++20 @llvm-project//llvm/...
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D154782
The very first AA, at least the first one in order, is not necessary
anymore. `AAReturnedValues` was from a different time; one might say, a
simpler time.
It was rewriten once to use `Attribute::getAssumedSimplifiedValues`,
which is what the replacement, `AAPotentialValuesReturned`, does too.
To match the old behavior we needed to avoid the helper
`AAReturnedFromReturnedValues` and iterate the return instructions
explicitly, however, it is still less complexity than it was before.
`AAReturnedFromReturnedValues` and `getAssumedSimplifiedValues` now
allow users to stop at PHI and select nodes or to ignore those and look
through. `AANoFPClass` will stop at select and phi nodes to read the
fast math flags.
Fixes: https://github.com/llvm/llvm-project/issues/63404
Differential Revision: https://reviews.llvm.org/D154917
This patch adds initial support for the `AAAddressSpace` abstract
attributor interface to deduce and query address space information for a
pointer. We simply query the underlying objects that a pointer can point
to and find a common address space if they exist. This is the minimal
support for the interface, we currently manifest changes on loads and
stores. Additionally we should use the target transform information to
deduce if an address space transformation is a no-op for the target
machine when calculating compatibility.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120586