For the two remaining uses that did not explicitly specify it,
set UndefAllowed=false. In both cases, I believe that treating
undef as a full range is the correct behavior.
* Remove pointer AddressSpaceCast in function `constructPointer`
* Remove 1st parameter (`ResTy`) of function `constructPointer`
1st input argument to function `constructPointer` in all 4 call-sites is
`ptr addrspace(0)`. Function `constructPointer` performs a pointer
AddressSpaceCast to `ResTy`, making the returned pointer have type `ptr
addrspace(0)` in all 4 call-sites.
Unless there's a clear reason to discard the addrspace info of input
parameter `Ptr`, I think we should keep and forward that info to the
returned pointer of `constructPointer`.
Opaque ptr cleanup effort.
Changes the size of allocations automatically.
For now, implements the case when a single range from start of the
allocation is alive and the allocation can be reduced.
This adds a writable attribute, which in conjunction with
dereferenceable(N) states that a spurious store of N bytes is
introduced on function entry. This implies that this many bytes
are writable without trapping or introducing data races. See
https://llvm.org/docs/Atomics.html#optimization-outside-atomic for
why the second point is important.
This attribute can be added to sret arguments. I believe Rust will
also be able to use it for by-value (moved) arguments. Rust likely
won't be able to use it for &mut arguments (tree borrows does not
appear to allow spurious stores).
In this patch the new attribute is only used by LICM scalar promotion.
However, the actual motivation for this is to fix a correctness issue
in call slot optimization, which needs this attribute to avoid
optimization regressions.
Followup to the discussion on D157499.
Differential Revision: https://reviews.llvm.org/D158081
This is the first of a series of patch to improve Alias Analysis on
Scalable quantities.
Keep Scalable information from TypeSize which
will be used in Alias Analysis.
If a potential interfering access is in a different kernel and the
underlying object has kernel lifetime we can straight out ignore the
interfering access.
TODO: This should be made much stronger via "reaching kernels", which we
already track in AAKernelInfo.
gcc warned about it:
[232/4788] Building CXX object lib/Transforms/IPO/CMakeFiles/LLVMipo.dir/AttributorAttributes.cpp.o
../lib/Transforms/IPO/AttributorAttributes.cpp: In lambda function:
../lib/Transforms/IPO/AttributorAttributes.cpp:12555:17: warning: unused variable 'SI' [-Wunused-variable]
12555 | if (auto *SI = dyn_cast<StoreInst>(Inst)) {
| ^~
Fix the warning by removing the variable and turn dyn_cast into isa.
`AAAddressSpace` currently only works for `LoadInst` and `StoreInst`
currently. For `StoreInst`, the corresponding use can be the pointer
operand, or value operand, or both. When it is used as value operand, it
can prevent `AMDGPUPromoteAlloca` from optimization in certain cases.
This patch changes the manifest method such that only pointer operand
will be rewritten.
Through the new `Attributor::checkForAllCallees` we can look through
indirect calls and visit all potential callees if they are known. Most
AAs will do that implicitly now via `AACalleeToCallSite`, thus, most AAs
are able to deal with missing callees for call site IR positions.
Differential Revision: https://reviews.llvm.org/D112290
Many AAs translated callee information to the call site explicitly but
they now all use the helper we already had for callee return to call
site return propagation. In a follow up the helper is going to be
extended to handle multiple callees.
Allow specialization of functions with "dynamic" denormal modes to a
known IEEE or DAZ mode based on callers. This should make it possible
to implement a is-denormal-flushing-enabled test using
llvm.canonicalize and have it be free after LTO.
https://reviews.llvm.org/D156129
GlobalValues are often interesting, especially if they have local
linkage. We now track all uses of those and refine potential callees
with it. Effectively, if an internal function cannot reach an indirect
call site, it cannot be a potential callee, even if it has its address
taken.
The Attributor user can now set the closed world flag
(`AttributorConfig.IsClosedWorldModule` or
`-attributor-assume-closed-world`) in order to specialize call edges
based only on available callees. That means, we assume all functions are
known and hence all potential callees must be declared/defined in the
module. We will use this for GPUs and LTO cases, but for now the user
has to set it via a flag.
If a potential callee has more arguments than the call site, the values
passed will be `poison`. If the potential callee would exhibit UB for
such `undef` argument, e.g., they are marked `noundef`, we can rule the
potential callee out.
The user can now limit the number of indirect calls specialized for a
given call site with `-attributor-max-specializations-per-call-base=N`
or the AttributorConfig callback. We further attach the `!callee`
metadata if all remaining callees are known.
The old code did not account for new queries during an update, which
caused us to leave stack RQIs in the map. We are now explicit about
temporary vs non-temporary RQIs.
Fixes: https://github.com/llvm/llvm-project/issues/64959
The visited set was used to not visit the same function twice, however,
the (new) algorithm requires we do since we start the queries at
different call sites.
AAIndirectCallInfo will collect information and specialize indirect call
sites. It is similar to our IndirectCallPromotion but runs as part of
the Attributor (so with assumed callee information). It also expands
more calls and let's the rest of the pipeline figure out what is UB, for
now. We use existing call promotion logic to improve the result,
otherwise we rely on the (implicit) function pointer cast.
This effectively "fixes" #60327 as it will undo the type punning early
enough for the inliner to work with the (now specialized, thus direct)
call.
Fixes: https://github.com/llvm/llvm-project/issues/60327
Before, we allowed the condition to be simplified to a simple constant
only, otherwise we assumed all successors are live. Now we allow
multiple constants, and mark the default successor as dead accordingly.
We know that __kmpc_alloc_shared is by construction matched with a
unique __kmpc_free_shared. Making the compiler aware of these facts
helps to avoid mallocs/allocas.
Fixes: https://github.com/llvm/llvm-project/issues/64551
If we end up visiting a store use, we couldn't follow to the "reloads".
The capture effect of a store is more than memory as the reloads are
unknown.
Fixes: https://github.com/llvm/llvm-project/issues/64613
If we cannot reach the target from the entry of a function without
exclusion set, we cannot reach it at all. This can allow us to filter
unique queries based on a uniform one.
We can improve our deduction if we stop at PHI and select instructions
and also iterate the returned values explicitly. The latter helps with
isImpliedByIR deductions.
The following pattern fails on recent GCC versions with -std=c++20 flag passed
and succeeds with -std=c++17. Such behavior is not observed on Clang 16.0.
```c++
template <typename T>
struct Foo {
Foo<T>(int a) {}
};
```
This patch removes template parameter from constructor in two occurences to
make the following command complete successfully:
bazel build -c fastbuild --cxxopt=-std=c++20 --host_cxxopt=-std=c++20 @llvm-project//llvm/...
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D154782