Relocate the GEP modification to a later stage of the function
performCallSlotOptzn(), ensuring that the code remains unchanged if the
optimization fails.
Co-authored-by: aklkaiyan <aklkaiyan@tencent.com>
…ptzn.
This changes performStackMoveOptzn to take a TypeSize instead of
uint64_t to avoid an implicit conversion when called from
processStoreOfLoad.
performStackMoveOptzn has been updated to allow scalable types in the
rest of its code.
Given that transition to opaque pointers a call to CreatePointerCast
in processMemSetMemCpyDependence was found redundant. It would cast
from "ptr" to "ptr" (both associated with the same address space).
Stack-move optimization, the optimization that merges src and dest
alloca of the full-size copy, replaces all uses of the dest alloca with
src alloca. For safety, we needed to check all uses of the dest alloca
locations are dominated by src alloca, to be replaced. This PR adds the
check for that.
Fixes#65225
This adds an additional transform to drop zero-size memcpys, also in
the case where the size is only zero after instruction simplification.
The motivation is the case from PR54983 where the size is non-trivially
zero, and processMemSetMemCpyDependence() keeps trying to reduce the
memset size by zero bytes.
This fix it's not really principled. It only works on the premise that
if InstSimplify doesn't realize the size is zero, then AA also won't.
The principled approach would be to instead add a isKnownNonZero()
guard to the processMemSetMemCpyDependence() transform, but I
suspect that would render that optimization mostly useless (at least
it breaks all the existing test coverage -- worth noting that the
constant size case is also handled by DSE, so I think this transform
is primarily about the dynamic size case).
Fixes https://github.com/llvm/llvm-project/issues/54983.
Fixes https://github.com/llvm/llvm-project/issues/64886.
Differential Revision: https://reviews.llvm.org/D124078
This reverts commit 3bb32c61b2f1f5d14dd056dd198dc898dce5a44e.
Use InsertionPt for DT to handle non-memory access dominators
Differential Revision: https://reviews.llvm.org/D155406
New memory accesses are usually inserted by using one of the
createMemoryAccessXYZ() methods followed by insertUse() or
insertDef(). createMemoryAccessXYZ() accepts a defining access,
however this defining access will always be overwritten by
insertUse() / insertDef().
Update the documentation to clarify this, and stop passing
Definition to createMemoryAccessXYZ() if it's followed by
insertUse/insertDef.
Alternatively, we could also make insertUse / insertDef keep the
defining access if it is specified, and only recompute it if it's
missing.
Differential Revision: https://reviews.llvm.org/D157979
This reverts commit 36a6eb7d12a9f827bf3d5d4e5fdc68b8a62807b2.
[MemCpyOpt] check that load/store and dest/src alloca are all in the same bb
Differential Revision: https://reviews.llvm.org/D153453
Co-authored-by: serge-sans-paille <sguelton@mozilla.com>
Make sure the code comments in processMemSetMemCpyDependence match
with the actual transform. They indicated that the memset being
rewritten was sunk to after a memcpy, while it actually is inserted
just before the memcpy.
Also make sure we use the debug location of the original memset
when creating the new simplified memset. In the past we've been
using the debug location for the memcpy which could be a bit
confusing.
Differential Revision: https://reviews.llvm.org/D135574
The llvm.memcpy.inline intrinsic must be expanded into code that
does not contain any function calls because it is intended for
the implementation of low-level functions like memcpy. Currently the
MemCpyOpt might covert llvm.memcpy.inline into llvm.memmove in
certain circumstances. This patch fixes the issue.
Fixes https://github.com/llvm/llvm-project/issues/61791.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D147162
When fetching allocation sizes, we almost always want to have the
size in bytes, but we were only providing an InBits API. Also add
the corresponding byte-based conjugate to save some *8 and /8
juggling everywhere.
Inspecting the downstream use of the cpyAlign, it is clear that
`performCallSlotOptzn` is expecting it to represent the alignment
of the copy destination, not the minimum of the src and dest
alignments. This patch renames the parameter to make this more
obvious.
I believe this change is NFC, because the downstream code has
alignment checks such that it all works out in the end. I have not
been able to construct a test case that actually triggers a change
in output.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D140603
While we can't use a single BatchAA instance for the entire
MemCpyOpt run without further justification, we can use BatchAA
while performing the queries related to a single instruction
(these will first perform some AA-based checks, and then modify
the IR only afterwards).
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
Maintain and propagate DIAssignID attachments in memcpyopt.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133312
Currently call slot optimization may be prevented because the
lifetime markers for the destination only start after the call.
In this case, rather than aborting the transform, we should move
the lifetime.start before the call to enable the transform.
Differential Revision: https://reviews.llvm.org/D135886