The last attempt failed a sanitiser build because we were
creating a reference to a null Predicates pointer in
isDereferenceableAndAlignedInLoop. This was exposed by
the unit test IsDerefReadOnlyLoop in
unittests/Analysis/LoadsTest.cpp. I fixed this by falling
back on getConstantMaxBackedgeTakenCount if Predicates is
null - see line 316 in llvm/lib/Analysis/Loads.cpp. There
are no other changes.
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.
This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).
---------
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
If a pointer gets freed, it may not be dereferenceable any longer, even
though there is a dominating dereferenceable assumption. As first step,
only consider assumptions if the pointer value cannot be freed if
UseDerefAtPointSemantics is used.
PR: https://github.com/llvm/llvm-project/pull/123196
Currently when we encounter a negative step in the induction
variable isDereferenceableAndAlignedInLoop bails out because
the element size is signed greater than the step. This patch
adds support for negative steps in cases where we detect the
start address for the load is of the form base + offset. In
this case the address decrements in each iteration so we need
to calculate the access size differently. I have done this by
caling getStartAndEndForAccess from LoopAccessAnalysis.cpp.
The motivation for this patch comes from PR #88385 where a
reviewer requested reusing isDereferenceableAndAlignedInLoop,
but that PR itself does support reverse loops.
The changed test in LoopVectorize/X86/load-deref-pred.ll now
passes because previously we were calculating the total access
size incorrectly, whereas now it is 412 bytes and fits
perfectly into the alloca.
Also use getPointerAlignment when trying to use alignment and
dereferenceable assumptions. This catches cases where dereferencable is
known via the assumption but alignment is known via getPointerAlignment
(e.g. via argument attribute or align of 1)
PR: https://github.com/llvm/llvm-project/pull/120916
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.
The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.
Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
Currently if a loop contains loads that we can prove at compile time
are dereferenceable when certain conditions are satisfied the function
isDereferenceableAndAlignedInLoop will still return false because
getSmallConstantMaxTripCount will return 0 when SCEV predicates
are required. This patch changes getSmallConstantMaxTripCount to take
an optional Predicates pointer argument so that we can permit
functions such as isDereferenceableAndAlignedInLoop to consider more
cases.
If a dereferenceability fact is provided through `!dereferenceable` (or
similar), it may only hold on the given control flow path. When we use
`isSafeToSpeculativelyExecute()` to check multiple instructions, we
might make use of `!dereferenceable` information that does not hold at
the speculation target. This doesn't happen when speculating
instructions one by one, because `!dereferenceable` will be dropped
while speculating.
Fix this by checking whether the instruction with `!dereferenceable`
dominates the context instruction. If this is not the case, it means we
are speculating, and cannot guarantee that it holds at the speculation
target.
Fixes https://github.com/llvm/llvm-project/issues/108854.
Even if memory is valid from `llvm` point of view,
e.g. local alloca, sanitizers have API for user
specific memory annotations.
These annotations can be used to track size of the
local object, e.g. inline vectors may prevent
accesses beyond the current vector size.
So valid programs should not access those parts of
alloca before checking preconditions.
Fixes#100639.
This addresses an optimization regression in Rust we have observed after
https://github.com/llvm/llvm-project/pull/82458. We now only perform
pointer replacement if they have the same underlying object. However,
getUnderlyingObject() by default only looks through linear chains, not
selects/phis. In particular, this means that we miss cases involving
involving pointer induction variables.
This patch fixes this by introducing a new helper
getUnderlyingObjectAggressive() which basically does what
getUnderlyingObjects() does, just specialized to the case where we must
arrive at a single underlying object in the end, and with a limit on the
number of inspected values.
Doing this more expensive underlying object check has no measurable
compile-time impact on CTMark.
This patch now bails out explicitly for negative offsets so that it's
more consistent with the unsigned remainder and add calculations,
and it fixes a genuine bug as shown with the new test.
I created this patch due to a reviewer request on PR #88385 to split off
the analysis changes, however without the other code in that PR I can
only test the new function with unit tests.
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
This patch does the following:
Adds the following functions:
- replaceDominatedUsesWithIf() that takes a callback.
- canReplacePointersIfEqual(...) returns true if the underlying object
is the same, and for null and const dereferencable pointer replacements.
- canReplacePointersIfEqualInUse(...) returns true for the above as well
as if the use is in icmp/ptrtoint or phi/selects feeding into them.
Updates GVN using the functions above so that the pointer replacements
are only made using the above API.
https://reviews.llvm.org/D143129
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
Extend handling to support `%base + offset` as start for AddRecs in
isDereferenceableAndAlignedInLoop. This is done by adjusting AccessSize
by the offset and effectively checking if the full object starting from
%base to %base + offset + access-size is dereferenceable.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D147260
Use existing functionality for identifying total access size by strided
loads. If we can speculate the load across all vector iterations, we can
avoid predication for these strided loads (or masked gathers in
architectures which support it).
Differential Revision: https://reviews.llvm.org/D145616
This patch adds AArch64 CodeGen support such that the type can be passed
and returned to/from functions, and also adds support to use this type in
load/store operations and PHI nodes.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D136862
In practice this means that we can speculate more loads in SROA.
This e.g. comes up in https://godbolt.org/z/G8716s6sj,
although we are missing second half of the puzzle to optimize that.
InstCombine does some basic store to load forwarding. One case it
currently misses is the case where the store is actually a memset.
This patch adds support for this case. This is a minimal
implementation that only handles a load at the memset base address,
without an offset.
GVN is already capable of performing this optimization. Having it
in InstCombine can help with phase ordering issues, similar to the
existing store to load forwarding.
Differential Revision: https://reviews.llvm.org/D137323
`isSafeToLoadUnconditionally` currently assumes sized types. Bail out for now.
This fixes a TypeSize warning reachable from instcombine via (load (select
cond, ptr, ptr)).
Differential Revision: https://reviews.llvm.org/D129477
Rather than checking the rounded type store size, check the type
size in bits. We don't want to forward a store of i1 to a load
of i8 for example, even though they have the same type store size.
The padding bits have unspecified contents.
This is a partial fix for the issue reported at
https://reviews.llvm.org/D115924#inline-1179482,
the problem also needs to be addressed more generally in the
constant folding code.
This follows up on D111023 by exporting the generic "load value
from constant at given offset as given type" and using it in the
store to load forwarding code. We now need to make sure that the
load size is smaller than the store size, previously this was
implicitly ensured by ConstantFoldLoadThroughBitcast().
Differential Revision: https://reviews.llvm.org/D112260
When using a datalayout that has pointer index width != pointer size this
code triggers an assertion in Value::stripAndAccumulateConstantOffsets().
I encountered this this while compiling FreeBSD for CHERI-RISC-V.
Also update LoadsTest.cpp to use a DataLayout with index width != pointer
width to ensure this case is tested.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D110406
As a follow-up to D95982, this patch continues unblocking optimizations that are blocked by pseudu probe instrumention.
The optimizations unblocked are:
- In-block load propagation.
- In-block dead store elimination
- Memory copy optimization that turns stores to consecutive memories into a memset.
These optimizations are local to a block, so they shouldn't affect the profile quality.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D100075