This used a single check to make sure that the object is both
writable and thread-local. Separate them out to make the
deficiencies in the current code more obvious.
Currently, we bail out of scalar promotion if the loop may unwind
and the memory may be visible on unwind. This is because we can't
insert stores of the promoted value on unwind edges.
However, nowadays scalar promotion also has support for only
promoting loads, while leaving stores in place. This kind of
promotion is safe even in the presence of unwinding.
Differential Revision: https://reviews.llvm.org/D133111
This code was relying on a very subtle contract: The expectation
was that for non-allocas, the unwind safety check would already
perform a capture check, so we don't need to perform it later.
This held true when this unwind safety was only handled for allocas
and noalias calls, but became incorrect when byval support was
added.
To avoid this kind of issue, just remove the dependency between the
unwind and thread-safety checks entirely. At worst, this means we
perform a redundant capture check. If this should turn out to be
problematic for compile-time, we can cache that query in a more
explicit way.
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.
Original Patch by @samparker (Sam Parker)
Differential Revision: https://reviews.llvm.org/D79483
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.
The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.
I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.
Differential Revision: https://reviews.llvm.org/D129205
This reverts the revert commit ad95255b9215a.
The updated version also creates a load when the store may not execute.
In those cases, we still need to introduce a load in a function where
there may not have been one before, so this doesn't completely resolve
issue #51248.
Original message:
When only a store is sunk, there is no need to create a load in the
pre-header, as the result of the load will never get used.
The dead load can can introduce UB, if the function is marked as
writeonly.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123473
This reverts the revert commit 1ddc719680c21f3.
This version of the patch sets the initial available value to poison,
which resolves an issue with the SSAUpdater breaking LCSSA form.
When only a store is sunk, there is no need to create a load in the
pre-header, as the result of the load will never get used.
The dead load can can introduce UB, if the function is marked as
writeonly.
Fixes#51248.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123473
This makes MemorySSA in LoopSink required, and removes the AST-based
implementation, as well as the related support code in LICM.
Differential Revision: https://reviews.llvm.org/D123288
Rather than iterating over users and comparing operands, iterate
over uses and check operand number. Otherwise, we'll end up
promoting a store twice if it has two equal operands.
This can only happen with opaque pointers, as otherwise both
operands differ by a level of indirection, so a bitcast would have
to be involved.
Fixes https://github.com/llvm/llvm-project/issues/54495.
This adds a new option to control AllowSpeculation added in D119965 when
using `-passes=...`.
This allows reproducing #54023 using opt.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D121944
This changes MemorySSA to be constructed in unoptimized form.
MemorySSA::ensureOptimizedUses() can be called to optimize all
uses (once). This should be done by passes where having optimized
uses is beneficial, either because we're going to query all uses
anyway, or because we're doing def-use walks.
This should help reduce the compile-time impact of MemorySSA for
some use cases (the reason why I started looking into this is
D117926), which can avoid optimizing all uses upfront, and instead
only optimize those that are actually queried.
Actually, we have an existing use-case for this, which is EarlyCSE.
Disabling eager use optimization there gives a significant
compile-time improvement, because EarlyCSE will generally only query
clobbers for a subset of all uses (this change is not included in
this patch).
Differential Revision: https://reviews.llvm.org/D121381
LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (https://github.com/llvm/llvm-project/issues/53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information.
This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D119965
This extract a common isNotVisibleOnUnwind() helper into
AliasAnalysis, which handles allocas, byval arguments and noalias
calls. After D116998 this could also handle sret arguments. We
have similar logic in DSE and MemCpyOpt, which will be switched
to use this helper as well.
The noalias call case is a bit different from the others, because
it also requires that the object is not captured. The caller is
responsible for doing the appropriate check.
Differential Revision: https://reviews.llvm.org/D117000
When determining whether the memory is local to the function (and
we can thus introduce spurious writes without thread-safety issues),
check for a noalias call rather than the hardcoded list of memory
allocation functions. Noalias calls are the more general way to
determine allocation functions, as long as we're only interested
in the property that the returned value is distinct from any other
accessible memory.
Differential Revision: https://reviews.llvm.org/D116728
This reverts change 2c391a5/D87551. As noted in the llvm-dev thread "LICM as canonical form" sent earlier today, introducing this was a major design change made without sufficient cause.
A profile driven LICM is not an unreasonable design, it simply is not what we have. Switching to such a model requires a lot more work than just this patch, and broad aggeement that is the right direction for the optimizer as a whole.
Worth noting is that all the tests included in the reverted changed are probably handled if we allow running unconstrained LICM, and later run LoopSink. As such, we have no public examples which motivate a profit based hoisting approach.
When doing load/store promotion within LICM, if we
cannot prove that it is safe to sink the store we won't
hoist the load, even though we can prove the load could
be dereferenced and moved outside the loop. This patch
implements the load promotion by moving it in the loop
preheader by inserting proper PHI in the loop. The store
is kept as is in the loop. By doing this, we avoid doing
the load from a memory location in each iteration.
Please consider this small example:
loop {
var = *ptr;
if (var) break;
*ptr= var + 1;
}
After this patch, it will be:
var0 = *ptr;
loop {
var1 = phi (var0, var2);
if (var1) break;
var2 = var1 + 1;
*ptr = var2;
}
This addresses some problems from [0].
[0] https://bugs.llvm.org/show_bug.cgi?id=51193
Differential revision: https://reviews.llvm.org/D113289
When doing load/store promotion within LICM, if we
cannot prove that it is safe to sink the store we won't
hoist the load, even though we can prove the load could
be dereferenced and moved outside the loop. This patch
implements the load promotion by moving it in the loop
preheader by inserting proper PHI in the loop. The store
is kept as is in the loop. By doing this, we avoid doing
the load from a memory location in each iteration.
Please consider this small example:
loop {
var = *ptr;
if (var) break;
*ptr= var + 1;
}
After this patch, it will be:
var0 = *ptr;
loop {
var1 = phi (var0, var2);
if (var1) break;
var2 = var1 + 1;
*ptr = var2;
}
This addresses some problems from [0].
[0] https://bugs.llvm.org/show_bug.cgi?id=51193
Differential revision: https://reviews.llvm.org/D113289
When we check if a load is loop invariant by finding a dominating
invariant.start call, we strip bitcasts until we get to an i8* Value,
and look for an invariant.start use of the i8* Value.
We may accidentally end up at an i8 global and look at a global's uses,
which we shouldn't do in a loop pass. Although we could make this
logic work with globals, that's not currently intended.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D111098
getMetadata() currently uses a weird API where it populates a
structure passed to it, and optionally merges into it. Instead,
we can return the AAMDNodes and provide a separate merge() API.
This makes usages more compact.
Differential Revision: https://reviews.llvm.org/D109852
MSSA-based LICM has been enabled by default for a few years now.
This drops the old AST-based implementation. Using loop(licm) will
result in a fatal error, the use of loop-mssa(licm) is required
(or just licm, which defaults to loop-mssa).
Note that the core canSinkOrHoistInst() logic has to retain AST
support for now, because it is shared with LoopSink.
Differential Revision: https://reviews.llvm.org/D108244
This was a diagnostic option used to demonstrate a weakness in
the AST-based LICM implementation. This problem does not exist
in the MSSA-based LICM implementation, which has been enabled
for a long time now. As such, this option is no longer relevant.
This option has been enabled by default for quite a while now.
The practical impact of removing the option is that MSSA use
cannot be disabled in default pipelines (both LPM and NPM) and
in manual LPM invocations. NPM can still choose to enable/disable
MSSA using loop vs loop-mssa.
The next step will be to require MSSA for LICM and drop the
AST-based implementation entirely.
Differential Revision: https://reviews.llvm.org/D108075
Currently, LNICM pass does not support sinking instructions out of loop nest.
This patch enables LNICM to sink down as many instructions to the exit block of outermost loop as possible.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D107219
When hoisting/moving calls to locations, we strip unknown metadata. Such calls are usually marked `speculatable`, i.e. they are guaranteed to not cause undefined behaviour when run anywhere. So, we should strip attributes that can cause immediate undefined behaviour if those attributes are not valid in the context where the call is moved to.
This patch introduces such an API and uses it in relevant passes. See
updated tests.
Fix for PR50744.
Reviewed By: nikic, jdoerfert, lebedev.ri
Differential Revision: https://reviews.llvm.org/D104641
This patch adds a new pass called LNICM which is a LoopNest version of LICM and a test case to show how LNICM works.
Basically, LNICM only hoists invariants out of loop nest (not a loop) to keep/make perfect loop nest. This enables later optimizations that require perfect loop nest.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D104180