Because AA does not support vectors of pointers, we have to
treat pointers that are inserted into a vector as captures. We
mostly already do so, but missed the case where getelementptr
is used to produce a vector.
For calls, we are only interested in captures before the call, not
captures by the call itself -- arguments that get passed to the call are
checked explicitly.
In particular, the current implementation is not optimal if the pointer
is captured via a readonly argument -- in that case, we know that even
if the argument is captured, the call will not modify the argument (at
least not via that argument).
Make this more precise by renaming to isCapturedBefore() and adding an
OrAt argument that allows us to toggle whether to consider captures in
the instruction itself or not.
This is the first of a series of patch to improve Alias Analysis on
Scalable quantities.
Keep Scalable information from TypeSize which
will be used in Alias Analysis.
Use BatchAA with EarliestEscapeInfo instead of callCapturesBefore() in
MemDepAnalysis. The advantage of this is that it will also take
not-captured-before information into account for non-calls (see
test_store_before_capture for a representative example), and that this
is a cached analysis. The disadvantage is that EII is slightly less
precise than full CapturedBefore analysis.
In practice the impact is positive, with gvn.NumGVNLoad going from 22022
to 22808 on test-suite.
The impact to compile-time is also positive, mainly in the ThinLTO
configuration.
replaceValuesPerBlockEntry() only handled simple and coerced load
values, however the load may also be referenced by a select value.
Additionally, I suspect that the previous code might have been incorrect
if a load had an offset, as it always constructed the AvailableValue
from scratch.
Fixes https://github.com/llvm/llvm-project/issues/69301.
In some cases clobbering store can be safely skipped if it can only must
or no alias with memory location and it writes the same value. This
patch supports simple case when the value from memory location was
loaded in the same basic block before the store and there are no
modifications between them.
When performing store to load forwarding, replacing users of the
load may turn an indirect call into one with a known callee, in
which case it might become willreturn, invalidating cached ICF
information. Avoid this by removing users.
This is a bit more aggressive than strictly necessary (e.g. this
shouldn't be necessary when doing load-load CSE), but better safe
than sorry.
Fixes https://github.com/llvm/llvm-project/issues/48805.
Duplicate phi nodes were being directly removed, without
invalidating MDA. This could result in a new phi node being
allocated at the same address, incorrectly reusing a cache entry.
Fix this by optionally allowing EliminateDuplicatePHINodes() to
collect phi nodes to remove into a vector, which allows GVN to
handle removal itself.
Fixes https://github.com/llvm/llvm-project/issues/64598.
Differential Revision: https://reviews.llvm.org/D158849
findDominatingValue has a search limit, and when it is reached, optimization is
not applied. This patch fixes the issue that this limit also takes into account
debug intrinsics, so the result of optimization can depend from the presence of
debug info.
This patch implements the enhancement proposed by
https://github.com/llvm/llvm-project/issues/59312.
Suppose we have following code
v0 = load %addr
br %LoadBB
LoadBB:
v1 = load %addr
...
PredBB:
...
br %cond, label %LoadBB, label %SuccBB
SuccBB:
v2 = load %addr
...
Instruction v1 in LoadBB is partially redundant, edge (PredBB, LoadBB) is a
critical edge. SuccBB is another successor of PredBB, it contains another load
v2 which is identical to v1. Current GVN splits the critical edge
(PredBB, LoadBB) and inserts a new load in it. A better method is move the load
of v2 into PredBB, then v1 can be changed to a PHI instruction.
If there are two or more similar predecessors, like the test case in the bug
entry, current GVN simply gives up because otherwise it needs to split multiple
critical edges. But we can move all loads in successor blocks into predecessors.
Differential Revision: https://reviews.llvm.org/D141712
Comparison between two AliasResults implicitly decayed to comparison
of AliasResult::Kind. As a result, MergeAliasResults() ended up
considering two PartialAlias results with different offsets as
equivalent.
Fix this by adding an operator== implementation. To stay
compatible with extensive use of comparisons between AliasResult
and AliasResult::Kind, add an overload for that as well, which
will ignore the offset. In the future, it would probably be a
good idea to remove these implicit decays to AliasResult::Kind
and add dedicated methods to check for specific AliasResult kinds.
Fixes https://github.com/llvm/llvm-project/issues/63019.
Note that this is very conservative since it will not even combine
convergent calls that appear in the same basic block, but EarlyCSE will
handle that case.
Differential Revision: https://reviews.llvm.org/D150974
This patch implements the enhancement proposed by
https://github.com/llvm/llvm-project/issues/59312.
Suppose we have following code
v0 = load %addr
br %LoadBB
LoadBB:
v1 = load %addr
...
PredBB:
...
br %cond, label %LoadBB, label %SuccBB
SuccBB:
v2 = load %addr
...
Instruction v1 in LoadBB is partially redundant, edge (PredBB, LoadBB) is a
critical edge. SuccBB is another successor of PredBB, it contains another load
v2 which is identical to v1. Current GVN splits the critical edge
(PredBB, LoadBB) and inserts a new load in it. A better method is move the load
of v2 into PredBB, then v1 can be changed to a PHI instruction.
If there are two or more similar predecessors, like the test case in the bug
entry, current GVN simply gives up because otherwise it needs to split multiple
critical edges. But we can move all loads in successor blocks into predecessors.
Differential Revision: https://reviews.llvm.org/D141712
Make MaterializeAdjustedValue() responsible for adjusting load
metadata in all cases, so it also covers the non-local case.
In conjunction with that, we no longer need to call
patchReplacementInstruction() for the local case, which would
unnecessarily drop metadata if the replacement value just happened
to be a load (without actual load CSE).
When reusing a load in a way that requires coercion (i.e. casts or
bit extraction) we currently fail to adjust metadata. Unfortunately,
none of our existing tooling for this is really suitable, because
combineMetadataForCSE() expects both loads to have the same type.
In this case we may work on loads of different types and possibly
offset memory location.
As such, what this patch does is to simply drop all metadata, with
the following exceptions:
* Metadata for which violation is known to always cause UB.
* If the load is !noundef, keep all metadata, as this will turn
poison-generating metadata into UB as well.
This fixes the miscompile that was exposed by D146629.
Differential Revision: https://reviews.llvm.org/D148129
I believe !dereferencable violation is immediate undefined behavior,
but this was not explicitly spelled out in LangRef. We already
assume that !dereferenceable is implicitly !noundef and cannot
return poison in isGuaranteedNotToBeUndefOrPoison().
The reason why we made dereferenceable implicitly noundef is that
the purpose of this metadata is to allow speculation, and that
would not be legal on a potential poison pointer.
Differential Revision: https://reviews.llvm.org/D148202
Per LangRef:
> If a load instruction tagged with the !invariant.load metadata
> is executed, the memory location referenced by the load has to
> contain the same value at all points in the program where the
> memory location is dereferenceable; otherwise, the behavior is
> undefined.
As invariant.load violation is immediate undefined behavior, it
is sufficient for it to be present on the dominating load (for
the case where K does not move).
Since D141386 has changed the return value of !range from IUB to poison,
metadata !range shouldn't be preserved even if K dominates J.
If this patch was accepted, I plan to adjust metadata !nonnull as well.
BTW, I found that metadata !noundef is not handled in combineMetadata,
is this intentional?
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D142687
Process cases when phi incoming in predecessor block has select
instruction, and this select address is unavailable, but there
are addresses translated from both sides of select instruction.
Differential Revision: https://reviews.llvm.org/D142705
SimplifyCFG currently drops !nontemporal metadata when sinking
common instructions. With this change, SimplifyCFG and similar
transforms will preserve !nontemporal metadata as long as it is
set on both original instructions.
Differential Revision: https://reviews.llvm.org/D144298