Instead of visiting call sites in Attributor::checkForAllUses, we now
keep track of returns in AAPointerInfo and use the call site return
information as required. This way, the user of
AAPointerInfo(CallSite)Argument can determine whether the call site
return should be visited. If a return user is found, we do not collect
its accesses as "may accesses" in the AAPointerInfo(CallSite)Argument
itself.
We don't have any legacy pass manager CGSCC passes that modify the call
graph (we only use it in the codegen pipeline to run function passes in
call graph order). This is the beginning of removing CallGraphUpdater
and making all the relevant CGSCC passes directly use the new pass
manager APIs.
Global variables that reference themselves alongside a function that is
called indirectly can cause an infinite loop in
`AAGlobalValueInfoFloating`. The recursive reference is continually
pushed back into the worklist, causing the attributor to hang
indefinitely.
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
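As a minimal sketch of the pattern (assuming the iterator-taking
LoadInst overload this series adds; BB, Ty, and Ptr stand in for values
from the surrounding pass):

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Keep the position as an iterator end-to-end; converting it to a
  // bare Instruction * (&*IP) would drop the debug-info bit the
  // iterator now carries.
  Value *loadAtBlockStart(BasicBlock &BB, Type *Ty, Value *Ptr) {
    BasicBlock::iterator IP = BB.getFirstInsertionPt();
    return new LoadInst(Ty, Ptr, "val", IP);
  }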
Noteworthy changes:
* FindInsertedValue now takes an optional iterator rather than an
instruction pointer, as we need to always insert with iterators,
* I've added a few iterator-taking versions of some value-tracking and
DomTree methods -- they just unwrap the iterator. These are purely
convenience methods to avoid extra syntax in some passes.
* A few calls to getNextNode become std::next instead (to keep in the
theme of using iterators for positions),
* SeparateConstOffsetFromGEP has its insertion-position field changed.
Noteworthy because it's not a purely localised spelling change.
All this should be NFC.
Teaching ConstantFoldLoadFromUniformValue that types that are padded in
memory can't be considered uniform.
Using the big hammer to prevent optimizations when loading from a
constant for which DataLayout::typeSizeEqualsStoreSize would return
false.
The main problem solved looks something like this:
store i17 -1, ptr %p, align 4
%v = load i8, ptr %p, align 1
If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't
really tell where the padding goes. And even if we assume that the 15
most significant bits are padding, then they should be considered as
undefined (even if LLVM backend typically would pad with zeroes).
Anyway, for a big-endian target the load would read those most
significant bits, which aren't guaranteed to be ones. So it would be
wrong to constant fold the load as returning -1.
If LLVM IR had been more explicit about the placement of padding, then
we could allow the constant fold of the load in the example, but only
for little-endian.
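A minimal sketch of the new guard (DL and Ty stand for the DataLayout
and the type of the uniform value; its exact placement within the fold
is illustrative):

  // Bail out when the type has padding bits in memory, i.e. its bit
  // size differs from its store size (as for i17); the content of the
  // padding is unspecified, so the fold would be unsound.
  if (!DL.typeSizeEqualsStoreSize(Ty))
    return nullptr;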
Fixes: https://github.com/llvm/llvm-project/issues/81793
We're using this flag (IsNewDbgInfoFormat) to detect the boundaries in
LLVM of what's treating debug-info as intrinsics (i.e. dbg.value), and
what's using DPValue objects (the non-intrinsic replacement). The
attributor tends to create new wrapper functions and doesn't insert them
into Modules in the usual way, thus we have to manually update that flag
to signal what debug-info mode it's using.
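A sketch of the manual update (Wrapper and M stand for the new function
and its parent module; the creation site is illustrative):

  // The wrapper is not inserted through the usual Module APIs, so
  // mirror the parent module's debug-info mode by hand.
  Wrapper->IsNewDbgInfoFormat = M.IsNewDbgInfoFormat;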
I've added some --try-experimental-debuginfo-iterators RUN lines to
tests that would otherwise crash because of this, so that they're
exercised by our new-debuginfo-iterators buildbot.
NB: there's an attributor test with a dbg.value in it; however,
attributes re-order themselves in RemoveDIs mode for various reasons,
so we're going to address that in a different patch.
Changes the size of allocations automatically.
For now, this implements the case in which a single range from the
start of the allocation is alive and the allocation can be shrunk.
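For illustration, a hypothetical source-level analogue of a shrinkable
allocation:

  #include <cstdlib>
  #include <cstring>

  void demo() {
    // Only the first 8 bytes are ever accessed, so the 64-byte
    // allocation can be reduced to 8 bytes.
    char *Buf = static_cast<char *>(std::malloc(64));
    std::memset(Buf, 0, 8);
    std::free(Buf);
  }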
Through the new `Attributor::checkForAllCallees` we can look through
indirect calls and visit all potential callees if they are known. Most
AAs will now do that implicitly via `AACalleeToCallSite`; thus, most
AAs are able to deal with missing callees for call site IR positions.
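Roughly, a user (here, an AA querying from within its update) iterates
the potential callee set like so; this is a sketch, and the exact
signature and argument order are assumptions:

  // The predicate receives all potential callees of the call base CB;
  // checkForAllCallees fails if the callee set is not known.
  bool AllKnown = A.checkForAllCallees(
      [&](ArrayRef<const Function *> Callees) {
        for (const Function *Callee : Callees) {
          // ... inspect each potential callee ...
        }
        return true;
      },
      *this, CB);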
Differential Revision: https://reviews.llvm.org/D112290
A direct call to a function cast to a different type is still not
really an address-taken event. We now allow the user to opt out of
these.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D159149
Allow specialization of functions with "dynamic" denormal modes to a
known IEEE or DAZ mode based on callers. This should make it possible
to implement an is-denormal-flushing-enabled test using
llvm.canonicalize and have it be free after LTO.
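An illustrative probe (hypothetical code; __builtin_canonicalizef is
Clang's spelling of llvm.canonicalize):

  #include <limits>

  // Under a DAZ/FTZ mode, canonicalizing the smallest subnormal yields
  // 0.0f; under IEEE it is preserved. Once callers' denormal modes are
  // specialized in, this folds to a constant.
  bool denormalsAreFlushed() {
    float Tiny = std::numeric_limits<float>::denorm_min();
    return __builtin_canonicalizef(Tiny) == 0.0f;
  }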
https://reviews.llvm.org/D156129
The Attributor user can now set the closed world flag
(`AttributorConfig.IsClosedWorldModule` or
`-attributor-assume-closed-world`) in order to specialize call edges
based only on available callees. That means we assume all functions are
known and hence all potential callees must be declared/defined in the
module. We will use this for GPUs and LTO cases, but for now the user
has to set it via a flag.
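In code, the flag lives on the configuration object (a minimal sketch;
the surrounding setup, and names like CGUpdater and Functions, are
elided context):

  // Assume every potential callee is declared/defined in this module.
  AttributorConfig AC(CGUpdater);
  AC.IsClosedWorldModule = true;
  Attributor A(Functions, InfoCache, AC);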
The user can now limit the number of indirect calls specialized for a
given call site with `-attributor-max-specializations-per-call-base=N`
or the AttributorConfig callback. We further attach the `!callees`
metadata if all remaining callees are known.
AAIndirectCallInfo will collect information and specialize indirect call
sites. It is similar to our IndirectCallPromotion but runs as part of
the Attributor (so with assumed callee information). It also expands
more calls and lets the rest of the pipeline figure out what is UB, for
now. We use existing call promotion logic to improve the result;
otherwise, we rely on the (implicit) function pointer cast.
This effectively "fixes" #60327 as it will undo the type punning early
enough for the inliner to work with the (now specialized, thus direct)
call.
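Conceptually, the specialization turns the indirect call into guarded
direct calls (a source-level analogue with hypothetical names, not the
IR the pass emits):

  if (Fp == &f1)
    f1(X);   // direct call, visible to the inliner
  else if (Fp == &f2)
    f2(X);
  else
    Fp(X);   // fallback; dropped if the callee set is known complete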
Fixes: https://github.com/llvm/llvm-project/issues/60327
When other AAs query the current value of KernelEnvC via the callback
KernelConfigurationSimplifyCB we need to ensure they are now dependent
on the AAKernelInfo that is in charge of the KernelEnvC.
This patch adds a lightweight instance of Attributor that only deduces
attributes.
This is just an initial version, with the goal of replacing the
function-attrs pass with a version that focuses only on attributes.
The initial version has a few open issues pending until default
enablement, the main one probably being compile time. The main
additional functionality this will provide in general is propagating
attributes to call sites.
Open issues:
* compile time
The current version increases O3 compile time by +2.67% and ThinLTO by +6.18% when replacing FunctionAttrs:
https://llvm-compile-time-tracker.com/compare.php?from=c4bb3e073548cf436d5fa0406e3ae75e94684dec&to=d992630a69c79a2587d736e6a88f448850413bd1&stat=instructions%3Au
Both numbers include an additional change to preserve more analyses, like the FunctionAttrs CGSCC run does.
* some missed attribute inference
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D152081
We can improve our deduction if we stop at PHI and select instructions
and also iterate the returned values explicitly. The latter helps with
isImpliedByIR deductions.
The very first AA, at least the first one in order, is not necessary
anymore. `AAReturnedValues` was from a different time; one might say, a
simpler time.
It was rewritten once to use `Attributor::getAssumedSimplifiedValues`,
which is what the replacement, `AAPotentialValuesReturned`, does too.
To match the old behavior we needed to avoid the helper
`AAReturnedFromReturnedValues` and iterate the return instructions
explicitly; however, the result is still less complex than what we had
before.
`AAReturnedFromReturnedValues` and `getAssumedSimplifiedValues` now
allow users to stop at PHI and select nodes or to ignore those and look
through. `AANoFPClass` will stop at select and phi nodes to read the
fast math flags.
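A sketch of the new query from within an AA (the RecurseForSelectAndPHI
parameter is per this change; the other arguments follow the existing
API, with signature details assumed):

  SmallVector<AA::ValueAndContext> Values;
  bool UsedAssumedInformation = false;
  // Stop at PHI/select so e.g. AANoFPClass can read their fast math
  // flags instead of looking through to the incoming values.
  A.getAssumedSimplifiedValues(IRPosition::returned(F), this, Values,
                               AA::ValueScope::Intraprocedural,
                               UsedAssumedInformation,
                               /*RecurseForSelectAndPHI=*/false);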
Fixes: https://github.com/llvm/llvm-project/issues/63404
Differential Revision: https://reviews.llvm.org/D154917
If the function is not IPO amendable, we skip most attributes/AAs.
However, if an AA has an isImpliedByIR that can deduce the attribute
from other attributes, we can still run it. For now, we enable them
manually; if we have more later, we can use some automation/flag.
This patch adds initial support for the `AAAddressSpace` abstract
attributor interface to deduce and query address space information for a
pointer. We simply query the underlying objects that a pointer can point
to and find a common address space if they exist. This is the minimal
support for the interface, we currently manifest changes on loads and
stores. Additionally we should use the target transform information to
deduce if an address space transformation is a no-op for the target
machine when calculating compatibility.
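The core of the deduction is simple (a sketch; the object-visiting
callback and how it is driven are illustrative):

  // Intersect the address spaces of all assumed underlying objects;
  // the deduction only holds if they agree on a single one.
  std::optional<unsigned> CommonAS;
  auto CheckObject = [&](Value &Obj) {
    unsigned AS = Obj.getType()->getPointerAddressSpace();
    if (!CommonAS) {
      CommonAS = AS;
      return true;
    }
    return *CommonAS == AS;
  };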
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120586
AANoCapture is now the first non-boolean AA that is always queried via
the new APIs and not created manually.
We explicitly do not manifest nocapture for `null` and `undef` anymore.
AANonNull is now the first AA that is always queried via the new APIs
and not created manually. Others will follow shortly to avoid trivial
AAs whenever possible.
This commit introduces some helper logic that will make it simpler to
port the next one. It also untangles AADereferenceable and AANonNull
such that the former does not keep a handle on the latter. Finally,
we stop deducing `nonnull` for `undef`, which was incorrect.
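The new query path looks roughly like this (a sketch; the exact helper
signature is an assumption):

  // Ask whether `nonnull` holds for V, possibly implied by existing IR
  // attributes, without manually creating an AANonNull.
  bool IsKnownNonNull = false;
  bool AssumedNonNull = AA::hasAssumedIRAttr<Attribute::NonNull>(
      A, this, IRPosition::value(V), DepClassTy::OPTIONAL,
      IsKnownNonNull);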