Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to mark only a prefix of an alloca as
alive/dead. We never used that capability, and dropping it removes the
need to handle that possibility everywhere (though many key places,
including stack coloring, did not actually respect it).
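A minimal before/after sketch of the change (illustrative IR, not taken
from the patch):

```llvm
%buf = alloca [16 x i8]
; Before: the size of the marked region is passed explicitly.
call void @llvm.lifetime.start.p0(i64 16, ptr %buf)
; After: the size is implied by the alloca itself.
call void @llvm.lifetime.start.p0(ptr %buf)
```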
lifetime.start and lifetime.end are primarily intended for use on
allocas, to enable stack coloring and other liveness optimizations. This
is necessary because all (static) allocas are hoisted into the entry
block, so lifetime markers are the only way to convey the actual
lifetimes.
However, lifetime.start and lifetime.end are currently *allowed* to be
used on non-alloca pointers. We don't actually do this in practice, but
the mere fact that it is possible breaks the core purpose of the
lifetime markers, which is stack coloring of allocas. Stack coloring can
only work correctly if all lifetime markers for an alloca are
analyzable.
* If a lifetime marker may operate on multiple allocas via a select/phi,
we don't know which lifetime actually starts/ends and handle it
incorrectly (https://github.com/llvm/llvm-project/issues/104776).
* Stack coloring operates on the assumption that all lifetime markers
are visible, and not, for example, hidden behind a function call or
escaped pointer. It's not possible to change this, as part of the
purpose of lifetime markers is that they work even in the presence of
escaped pointers, where simple use analysis is insufficient.
I don't think there is any way to have coherent semantics for lifetime
markers on allocas, while also permitting them on arbitrary pointer
values.
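As a concrete illustration of the select/phi problem from the first
bullet above (a hypothetical example, shown with the size-less marker
form):

```llvm
%a = alloca i64
%b = alloca i64
%p = select i1 %c, ptr %a, ptr %b
; This marker may begin the lifetime of either %a or %b, so stack
; coloring cannot tell which slot actually becomes live here.
call void @llvm.lifetime.start.p0(ptr %p)
```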
This PR restricts lifetimes to operate on allocas only. As a followup, I
will also drop the size argument, which is superfluous if we always
operate on an alloca. (This change also renders dead various code that
handles lifetime markers on non-allocas. I plan to clean up that code
after dropping the size argument as well.)
In practice, I've only found a few places that currently produce
lifetimes on non-allocas:
* CoroEarly replaces the promise alloca with the result of an intrinsic,
which will later be replaced back with an alloca. I think this is the
only place where there is some legitimate loss of functionality, but I
don't think this is particularly important (I don't think we'd expect
the promise in a coroutine to admit useful lifetime optimization.)
* SafeStack moves unsafe allocas onto a separate frame. We can safely
drop lifetimes here, as SafeStack performs its own stack coloring.
* Similarly for AddressSanitizer, which also moves allocas into
separate memory.
* LSR sometimes replaces the lifetime argument with a GEP chain of the
alloca (where the offsets ultimately cancel out). This is just
unnecessary. (Fixed separately in
https://github.com/llvm/llvm-project/pull/149492.)
* InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast
of an alloca. I don't think this is necessary.
Before commit 14dee0a, NewGVN would not miscompile the function foo,
because `isMustAlias` would return false for non-pointers, particularly
for `lifetime.start`. We need to check whether the loaded pointer is
defined by `lifetime.start` in order to safely simplify the load to
uninitialized memory.
For `getInitialValueOfAllocation`, the behavior depends on the
allocation function. Therefore, we take a conservative approach: we
check whether it's a pointer type. If it is, then—based on the earlier
check—we know that the allocation function defines the loaded pointer.
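For reference, a hypothetical sketch of the shape being guarded against
(not the actual test function foo): a load whose defining memory access
is a `lifetime.start` reads uninitialized memory, and the simplification
is only safe once we have verified that this is really the case.

```llvm
%slot = alloca ptr
call void @llvm.lifetime.start.p0(ptr %slot)
; The defining memory access of this load is the lifetime.start, so the
; loaded pointer is uninitialized and the load may be simplified.
%p = load ptr, ptr %slot
```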
This is a (very belated) reland of 0a362f12ec60a49a054befec8620a8e69523af54,
which I originally reverted due to flang test failures.
This marks mul constant expressions as undesirable, which means that
we will no longer create them by default, but they can still be
created explicitly.
Part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
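For illustration, the kind of expression affected (a hypothetical
global): such a `mul` constant expression will no longer be created by
default, but it remains valid when written explicitly.

```llvm
@g = global i64 0
; A mul constant expression: no longer produced by default, still
; accepted when spelled out explicitly.
@scaled = global i64 mul (i64 ptrtoint (ptr @g to i64), i64 4)
```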
When storing a scalable vector and loading a fixed-size vector, where
the scalable vector is known to be larger based on vscale_range, perform
store-to-load forwarding through temporary @llvm.vector.extract calls.
InstCombine then folds the insert/extract pair away.
The use case is shown in https://godbolt.org/z/KT3sMrMbd: clang
generates IR that matches this pattern when the "arm_sve_vector_bits"
attribute is used:
```cpp
#include <arm_sve.h>

typedef svfloat32_t svfloat32_fixed_t
    __attribute__((arm_sve_vector_bits(512)));

struct svfloat32_wrapped_t {
  svfloat32_fixed_t v;
};

static inline svfloat32_wrapped_t
add(svfloat32_wrapped_t a, svfloat32_wrapped_t b) {
  return {svadd_f32_x(svptrue_b32(), a.v, b.v)};
}

svfloat32_wrapped_t
foo(svfloat32_wrapped_t a, svfloat32_wrapped_t b) {
  // The IR pattern this patch matches is generated for this return:
  return add(a, b);
}
```
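Roughly, the IR pattern and the forwarding look like this (a simplified
sketch, assuming the vscale_range attribute guarantees the scalable
store covers the 512-bit load):

```llvm
store <vscale x 4 x float> %sv, ptr %p
%fv = load <16 x float>, ptr %p
; With this patch, the load is forwarded from the stored value via:
%fv.fwd = call <16 x float> @llvm.vector.extract.v16f32.nxv4f32(<vscale x 4 x float> %sv, i64 0)
```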
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.
Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
make it easier to use old IR files and somewhat reduce the test churn in
this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
attribute. The representation in the LLVM IR dialect should be updated
separately.
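A small before/after sketch of the upgrade performed by the readers
(hypothetical declaration):

```llvm
; Old attribute spelling:
declare void @use(ptr nocapture %p)
; New spelling after the upgrade:
declare void @use(ptr captures(none) %p)
```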
Preserve !alias.scope, !noalias and !mem.parallel_loop_access metadata
on the replacement instruction, if it does not move. In that case, the
program would have undefined behavior if the aliasing property encoded
in the metadata did not hold. This makes use of the clarification that
aliasing metadata implies UB if the property does not hold: #116220
Same as #115868, but for !alias.scope, !noalias and
!mem.parallel_loop_access.
PR: https://github.com/llvm/llvm-project/pull/117716
This patch intersects attributes of two calls to avoid introducing UB.
It also skips incompatible call pairs in GVN/NewGVN. However, I cannot
provide negative tests for these changes.
Fixes https://github.com/llvm/llvm-project/issues/113997.
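A hypothetical illustration of the kind of call pair involved: the
earlier call carries a return attribute that yields poison when
violated, while the later call does not, so blindly replacing the later
value with the earlier one could propagate poison (or UB) that the
original program did not have.

```llvm
%a = call nonnull ptr @get()  ; poison if @get() actually returns null
; ...
%b = call ptr @get()          ; a well-defined (possibly null) pointer
; Replacing %b with %a without first intersecting the attributes would
; substitute a potentially-poison value for a well-defined one.
```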
Cyclic dependencies in NewGVN can result in miscompilation and
termination issues.
This patch ensures that the Class Leader is always the member with the
lowest RPO number. This ensures that the Class Leader is processed
before all other members, making the cyclic dependence impossible.
This fixes #35683.
Regressions:
- 'simp-to-self.ll' regresses due to a known limitation in the way
NewGVN and InstSimplify interact. With the new leader, InstSimplify does
not know that %conv is 1 and fails to simplify.
The caching mechanism for 'OpIsSafeForPhiOfOps' is unsound. An operand
is deemed unsafe for PhiOfOps if it depends on a phi that resides in the
Phi block, i.e., the block where we are performing the PhiOfOps
translation. This is to avoid having to materialize the translated
subexpressions. To avoid redundant code walking, a cache is used to
store these results.
Note, however, that since the safety is specific to the Phi block, we
cannot, in general, use the cached results for other blocks.
This patch addresses this by having a cache per block instead of a
single one for the entire function.

Closes #63335
Previously we only handled the `L0 == R0` case if both `L1` and `R1`
were constant.
We can get more out of the analysis using general constant ranges
instead.
For example, `X u> Y` implies `X != 0`.
In general, any strict comparison on `X` implies that `X` is not equal
to the boundary value for the sign, and constant ranges with/without
sign bits can be useful in deducing implications.
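A small illustration of the kind of implication this enables
(hypothetical IR):

```llvm
%c1 = icmp ugt i64 %x, %y
br i1 %c1, label %taken, label %exit

taken:
  ; On this path %x u> %y holds, so %x is at least 1 and the compare
  ; below is known to be true.
  %c2 = icmp ne i64 %x, 0
```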
Closes #85557
The data-layout independent constant folding currently has some rather
gnarly code for canonicalizing GEP indices to reduce "notional
overindexing", and then infers inbounds based on that canonicalization.
Now that we canonicalize to i8 GEPs, this canonicalization is
essentially useless, as we'll discard it as soon as the GEP hits the
data-layout aware constant folder anyway. As such, I'd like to remove
this code entirely.
This shouldn't have any impact on optimization capabilities.
This patch canonicalizes constant expression GEPs to use i8 source
element type, aka ptradd. This is the ConstantFolding equivalent of the
InstCombine canonicalization introduced in #68882.
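For example (an illustrative constant expression, not taken from the
patch), a typed GEP constant expression is now emitted in its i8
(ptradd) form instead:

```llvm
@g = global [4 x i32] zeroinitializer
; Before canonicalization:
@p = global ptr getelementptr (i32, ptr @g, i64 3)
; After canonicalization (assuming 4-byte i32):
@p = global ptr getelementptr (i8, ptr @g, i64 12)
```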
I believe all our optimizations working on constant expression GEPs
(like GlobalOpt etc) have already been switched to work on offsets, so I
don't expect any significant fallout from this change.
This is part of:
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699
At the moment, Clang is rather liberal in assuming that 0 (and by extension unqualified) is always a safe default. This does not work for targets that actually use a different value for the default / generic AS (for example, the SPIRV produced by HIPSPV or SYCL). This patch is a first, fairly safe step towards trying to clear things up by querying a module's default AS from the target, rather than assuming it is 0, alongside fixing a few places where things break / where we encode the 0 == DefaultAS assumption. A bunch of existing tests are extended to check for non-zero default AS usage.
Currently, the builtins used for implementing `va_list` handling
unconditionally take their arguments as unqualified `ptr`s i.e. pointers
to AS 0. This does not work for targets where the default AS is not 0 or
AS 0 is not a viable AS (for example, a target might choose 0 to
represent the constant address space). This patch changes the builtins'
signature to take generic `anyptr` args, which corrects this issue. It
is noisy due to the number of tests affected. A test for an upstream
target which does not use 0 as its default AS (SPIRV for HIP device
compilations) is added as well.
There are many tests that specify a target triple/CPU flags but no
DataLayout, which can lead to IR being generated that has unusual
behaviour. This commit attempts to use the default DataLayout based
on the relevant flags if there is no explicit override on the command
line or in the IR file.
One thing that is currently not possible is differentiating an
explicitly empty `target datalayout = ""` in the IR file from a missing
datalayout, since the current APIs don't allow detecting this case. If
it is considered useful to support this case (instead of passing
"-data-layout=" on the command line), I can change the IR parsers to
track whether they have seen such a directive and change the callback
type.
Differential Revision: https://reviews.llvm.org/D141060
Some folds using m_NUW, m_NSW style matchers were missed; make
sure they respect UseInstrInfo.
This is part of #53218, but not a complete fix for the issue.
NewGVN isn't enabled by default so some test failures were being missed.
Reverting to do more testing offline.
This reverts commit c59bc230e738036050f4c9b1ebde88699eec74bf.
Currently, we consider any instruction that might read from memory to be unsafe for phi-of-ops.
This patch refines that by walking the clobbering memDefs until we either hit a block that
strictly dominates the phi block (safe) or we hit a clobbering memPhi (unsafe).
Differential Revision: https://reviews.llvm.org/D156055
Note that this is very conservative since it will not even combine
convergent calls that appear in the same basic block, but EarlyCSE will
handle that case.
Differential Revision: https://reviews.llvm.org/D150974