TBAA/NoAlias/AliasScope and other information is currently preserved
when upgrading to a memcpy/memset. However, this is missing when upgrading to
the macOS memset_pattern function. This adds the same alias information preservation
to memset_pattern
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D152934
This is a follow on to D147117 and D147355. In both cases, we were adding special cases to compute zext(BTC+1) instead of zext(BTC)+1 when the BTC+1 computation was known not to overflow.
Differential Revision: https://reviews.llvm.org/D148661
The pattern we're using for the memset_pattern* call gets put into a static
global variable initialized, which means it has to be representable with
relocations on the target. Most `ConstantExpr` instances do not satisfy this
constraint, so avoid all of them for now.
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
Mark ModRefInfo as a bitmask enum, which allows using normal
& and | operators on it. This supersedes various functions like
unionModRef() and intersectModRef(). I think this makes the code
cleaner than going through helper functions...
Differential Revision: https://reviews.llvm.org/D130870
isSafeToExpand() for addrecs depends on whether the SCEVExpander
will be used in CanonicalMode. At least one caller currently gets
this wrong, resulting in PR50506.
Fix this by a) making the CanonicalMode argument on the freestanding
functions required and b) adding member functions on SCEVExpander
that automatically take the SCEVExpander mode into account. We can
use the latter variant nearly everywhere, and thus make sure that
there is no chance of CanonicalMode mismatch.
Fixes https://github.com/llvm/llvm-project/issues/50506.
Differential Revision: https://reviews.llvm.org/D129630
Commit dd5991cc modified the aliasing checks here to allow transforming
a memcpy where the source and destination point into the same object.
However, the change accidentally made the code skip the alias check for
other operations in the loop.
Instead of completely skipping the alias check, just skip the check for
whether the memcpy aliases itself.
Differential Revision: https://reviews.llvm.org/D126486
libcalls." (was 0f8c626). This reverts commit 14d9390.
The patch previously failed to recognize cases where user had defined a
function alias with an identical name as that of the library
function. Module::getFunction() would then return nullptr which is what the
sanitizer discovered.
In this updated version a new function isLibFuncEmittable() has as well been
introduced which is now used instead of TLI->has() anytime a library function
is to be emitted . It additionally also makes sure there is e.g. no function
alias with the same name in the module.
Reviewed By: Eli Friedman
Differential Revision: https://reviews.llvm.org/D123198
test/Transforms/InstCombine/pr39177.ll failed in a -DLLVM_USE_SANITIZER=Undefined build.
```
lib/Transforms/Utils/BuildLibCalls.cpp:1217:17: runtime error: reference binding to null pointer of type 'llvm::Function'
```
`Function &F = *M->getFunction(Name);`
This reverts commit 0f8c626723d2bbd547e78dcab5ab260dfbc437e1.
A new set of overloaded functions named getOrInsertLibFunc() are now supposed
to be used instead of getOrInsertFunction() when building a libcall from
within an LLVM optimizer(). The idea is that this new function also makes
sure that any mandatory argument attributes are added to the function
prototype (after calling getOrInsertFunction()).
inferLibFuncAttributes() is renamed to inferNonMandatoryLibFuncAttrs() as it
only adds attributes that are not necessary for correctness but merely
helping with later optimizations.
Generally, the front end is responsible for building a correct function
prototype with the needed argument attributes. If the middle end however is
the one creating the call, e.g. when replacing one libcall with another, it
then must take this responsibility.
This continues the work of properly handling argument extension if required
by the target ABI when building a lib call. getOrInsertLibFunc() now does
this for all libcalls currently built by any LLVM optimizer. It is expected
that when in the future a new optimization builds a new libcall with an
integer argument it is to be added to getOrInsertLibFunc() with the proper
handling. Note that not all targets have it in their ABI to sign/zero extend
integer arguments to the full register width, but this will be done
selectively as determined by getExtAttrForI32Param().
Review: Eli Friedman, Nikita Popov, Dávid Bolvanský
Differential Revision: https://reviews.llvm.org/D123198
Factor in the TBAA of adjacent stores instead of just the head store
when merging stores into a memset. We were seeing GVN remove a load that
had a TBAA that matched the 2nd store because GVN determined it didn't
match the TBAA of the memset. The memset had the TBAA of only the first
store.
i.e. Loading the field pi_ of shared_count after memset to create an
array of shared_ptr
template<class T>
class shared_ptr {
T *p;
shared_count refcount;
};
class shared_count {
sp_counted_base *pi_;
};
Differential Revision: https://reviews.llvm.org/D122205
When upgrading a loop of load/store to a memcpy, the existing pass does not keep existing aliasing information. This patch allows existing aliasing information to be kept.
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D108221
There is no need to sort inserted instructions by dominance, as the
deletion loop still requires RAUW with undef before deleting. Removing
instructions in reverse insertion order should still insure that the
number of uselist updates is kept to a minimum.
These are deprecated and should be replaced with getAlign().
Some of these asserts don't do anything because Load/Store/AllocaInst never have a 0 align value.
Expression guraded in loop entry can be folded prior to comparison. This patch
proceeds D107353 and makes LIR able to deal with nested for-loop.
Reviewed By: qianzhen, bmahjour
Differential Revision: https://reviews.llvm.org/D108112
We were using the type of the loop back edge count to represent the
store size. This failed for small loop counts (e.g. in the added test,
the loop count was an i2).
Use the index type instead.
Fixes PR52104.
Differential Revision: https://reviews.llvm.org/D111401
This change fixes issue found by Markus: https://reviews.llvm.org/rG11338e998df1
Before this patch following code was transformed to memmove:
for (int i = 15; i >= 1; i--) {
p[i] = p[i-1];
sum += p[i-1];
}
However load from p[i-1] is used not only by store to p[i] but also by sum computation.
Therefore we cannot emit memmove in loop header.
Differential Revision: https://reviews.llvm.org/D107964
Letting it take SCEV allows further modification on the function to optimize
if the StoreSize / Stride is runtime determined.
The plan is to let memcpy / memmove deal with runtime-determined sizes, just
like what D107353 did to memset.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D108289
When dealing with memmove, we also add the load instruction to the ignored
instructions list passed to `mayLoopAccessLocation`. Renaming "Stores" to
"IgnoredInsts" to be more precise.
Differential Revision: https://reviews.llvm.org/D108275
The current LIR does not deal with runtime-determined memset-size. This patch
utilizes SCEV and check if the PointerStrideSCEV and the MemsetSizeSCEV are equal.
Before comparison the pass would try to fold the expression that is already
protected by the loop guard.
Testcase file `memset-runtime.ll`, `memset-runtime-debug.ll` added.
This patch deals with proper loop-idiom. Proceeding patch wants to deal with SCEV-s
that are inequal after folding with the loop guards.
Reviewed By: lebedev.ri, Whitney
Differential Revision: https://reviews.llvm.org/D107353
Letting it take SCEV allows further modification on the function to optimize
if the StoreSize / Stride is runtime determined.
This is a preceeding of D107353.
The big picture is to let LoopIdiom deal with runtime-determined sizes.
Reviewed By: Whitney, lebedev.ri
Differential Revision: https://reviews.llvm.org/D104595
The purpose of patch is to learn Loop idiom recognition pass how to recognize simple memmove patterns
in similar way like GCC: https://godbolt.org/z/fh95e83od
LoopIdiomRecognize already has machinery for memset and memcpy recognition, patch tries to extend exisiting capabilities with minimal effort.
Differential Revision: https://reviews.llvm.org/D104464
Essentially, the cover function simply combines the loop level check and the function level scope into one call. This simplifies several callers and is (subjectively) less error prone.
Nowadays LLVM does not assume that all loops are finite,
so if we want to produce a finite loop from a potentially-infinite one,
we must ensure that the original loop is known to be a finite one.
For this transform, it only matters for arithmetic right-shifts.
For them, either the function or the loop must be known to
be `mustprogress`, or the original value being shifted must be known
to be non-negative (because iff the sign bit was set,
it will never become zero, but will become `-1` in the "end").
It would be really good for alive2 to actually complain about this,
but it currently does not: https://github.com/AliveToolkit/alive2/issues/726
This adds support for the "count active bits" pattern, i.e.:
```
int countBits(unsigned val) {
int cnt = 0;
for( ; (val << cnt) != 0; ++cnt)
;
return cnt;
}
```
but a somewhat more general one:
```
int countBits(unsigned val, int start, int off) {
int cnt;
for (cnt = start; val << (cnt + off); cnt++)
;
return cnt;
}
```
alive2 is happy with all the tests there.
Note that, again, much like with the right-shift cases,
we don't require the `val != 0` guard.
This is the last pattern that was supported by
`detectShiftUntilZeroIdiom()`, which now becomes obsolete.
This adds support for the "count active bits" pattern, i.e.:
```
int countActiveBits(signed val) {
int cnt = 0;
for( ; (val >> cnt) != 0; ++cnt)
;
return cnt;
}
```
but a somewhat more general one:
```
int countActiveBits(signed val, int start, int off) {
int cnt;
for (cnt = start; val >> (cnt + off); cnt++)
;
return cnt;
}
```
This directly matches the existing 'logical right-shift until zero' idiom.
alive2 is happy with all the tests there.
Note that, again, much like with the original unsigned case,
we don't require the `val != 0` guard.
The old `detectShiftUntilZeroIdiom()` already supports this pattern,
the idea here is that the `val` must be positive (have at least one
leading zero), because otherwise the loop is non-terminating,
but since it is not `while(1)`, that would have been UB.