446 Commits

Author SHA1 Message Date
Nikita Popov
71f7b972c3
[Local] Make combineAAMetadata() more principled (#122091)
This moves combineAAMetadata() into Local and implements it via a new
AAOnly flag, which will intersect only AA metadata and keep other known
metadata.

The existing KnownIDs list is dropped, because it is redundant with the
switch in combineMetadata(), which already drops unknown metadata.

I tried a few variants of this, and ultimately went with the AAOnly flag
because this way we make an explicit choice for each metadata kind
supported by combineMetadata(), and ignoring the flag gives you
conservatively correct behavior.

I checked that the memcpy tests still pass if we adjust the logic for
MD_memprof/MD_callsite to drop the metadata instead of arbitrarily
picking one.

Fixes https://github.com/llvm/llvm-project/issues/121495.
2025-01-09 09:34:46 +01:00
Teresa Johnson
3a423a10ff
[MemProf][PGO] Prevent dropping of profile metadata during optimization (#121359)
This patch fixes a couple of places where memprof-related metadata
(!memprof and !callsite) were being dropped, and one place where PGO
metadata (!prof) was being dropped.

All were due to instances of combineMetadata() being invoked. That
function drops all metadata not in the list provided by the client, and
also drops any not in its switch statement.

Memprof metadata needed a case in the combineMetadata switch statement.
For now we simply keep the metadata of the instruction being kept, which
doesn't retain all the profile information when two calls with
memprof metadata are being combined, but at least retains some.

For the memprof metadata being dropped during call CSE, add memprof and
callsite metadata to the list of known ids in combineMetadataForCSE.

Neither memprof nor regular prof metadata were in the list of known ids
for the callsite in MemCpyOptimizer, which was added to combine AA
metadata after optimization of byval arguments fed by memcpy
instructions, and similar types of optimizations of memcpy uses.

There is one other callsite of combineMetadata, but it is only invoked
on load instructions, which do not carry these types of metadata.
2025-01-02 12:11:59 -08:00
Momchil Velikov
5d9c321e8d
Handle scalable store size in MemCpyOptimizer (#118957)
The compiler crashes with an ICE when it tries to create a `memset` with
scalable size.
2024-12-06 20:48:48 +00:00
Antonio Frighetto
1d6ab189be [MemCpyOpt] Drop dead memmove calls on memset'd source data
When a memmove happens to clobber source data, and such data have
been previously memset'd, the memmove may be redundant.
2024-12-03 09:50:57 +01:00
Nikita Popov
1e32a7d42c
[AA] Rename CaptureInfo -> CaptureAnalysis (NFC) (#116842)
I'd like to use the name CaptureInfo to represent the new attribute
proposed at
https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420,
but it's already taken by AA, and I can't think of great alternatives
(CaptureEffects would be something of a stretch).

As such, I'd like to rename CaptureInfo -> CaptureAnalysis in AA, which
also seems like the more accurate terminology.
2024-11-20 09:42:28 +01:00
Kazu Hirata
94f9cbbe49
[Scalar] Remove unused includes (NFC) (#114645)
Identified with misc-include-cleaner.
2024-11-02 08:32:26 -07:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Nikita Popov
f5c02dd06e
[MemCpyOpt] Use EarliestEscapeInfo (#110280)
Pass EarliestEscapeInfo to BatchAA in MemCpyOpt. This allows memcpy
elimination in cases where one of the involved pointers is captured
after the relevant memcpy/call.
2024-09-30 09:35:54 +02:00
Nikita Popov
296901fd00 [MemCpyOpt] Use BatchAA in one more place (NFCI)
Everything else in this method using BatchAA, apart from this
call.
2024-09-27 16:44:35 +02:00
Ramkumar Ramachandra
f664d313cd
MemCpyOpt: replace an AA query with MSSA query (NFC) (#108535)
Fix a long-standing TODO.
2024-09-24 11:18:37 +01:00
Ramkumar Ramachandra
7e9bd12cd9
MemCpyOpt: clarify logic in processStoreOfLoad (NFC) (#108400) 2024-09-12 21:16:43 +01:00
Ramkumar Ramachandra
159e5b3fdf
MemCpyOpt: avoid unnecessary getMemorySSA (NFC) (#108405) 2024-09-12 20:35:01 +01:00
Nikita Popov
2afe678f0a
[MemCpyOpt] Allow memcpy elision for non-noalias arguments (#107860)
We currently elide memcpys for readonly nocapture noalias arguments.
noalias is checked to make sure that there are no other ways to write
the memory, e.g. through a different argument or an escaped pointer.

In addition to the current noalias check, also query alias analysis, in
case it can prove that modification is not possible through other means.

This fixes the problem reported in
https://discourse.llvm.org/t/problem-about-memcpy-elimination/81121.
2024-09-11 10:04:37 +02:00
Yingwei Zheng
378daa6c6f
[MemCpyOpt] Avoid infinite loops in MemCpyOptPass::processMemCpyMemCpyDependence (#103218)
Closes https://github.com/llvm/llvm-project/issues/102994.
2024-08-22 17:20:47 +08:00
Yingwei Zheng
f364b2ee22
[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
2024-08-13 22:38:50 +08:00
Nikita Popov
71051deff2
[MemCpyOpt] Fix infinite loop in memset+memcpy fold (#98638)
For the case where the memcpy size is zero, this transform is a complex
no-op. This can lead to an infinite loop when the size is zero in a way
that BasicAA understands, because it can still understand that dst and
dst + src_size are MustAlias.

I've tried to mitigate this before using the isZeroSize() check, but we
can hit cases where InstSimplify doesn't understand that the size is
zero, but BasicAA does.

As such, this bites the bullet and adds an explicit isKnownNonZero()
check to guard against no-op transforms.

Fixes https://github.com/llvm/llvm-project/issues/98610.
2024-07-15 09:41:11 +02:00
Yingwei Zheng
99685a54d1
[MemCpyOpt] Use dyn_cast to fix assertion failure in processMemCpyMemCpyDependence (#98686)
Fixes https://github.com/llvm/llvm-project/issues/98675.
2024-07-13 04:27:07 +08:00
DianQK
fa24213928
[MemCpyOpt] Forward memcpy based on the actual copy memory location. (#87190)
Fixes #85560.

We can forward `memcpy` as long as the actual memory location being
copied have not been altered.

alive2: https://alive2.llvm.org/ce/z/q9JaHV
2024-07-12 22:58:28 +08:00
DianQK
117cc4abea
[MemCpyOpt] No need to create memcpy(a <- a) (#98321)
When forwarding `memcpy`, we don't need to create `memcpy(a, a)`.
2024-07-11 19:54:28 +08:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Nikita Popov
ac7c482ca5 [MemCpyOpt] Add extra debug output (NFC) 2024-05-21 07:29:26 +02:00
XChy
b5e8555607
[MemCpyOpt][NFC] Format codebase (#90225)
This patch automatically formats the code.
2024-04-27 20:17:35 +08:00
Philip Reames
42d6eb5475
[MemCpyOpt] Handle scalable aggregate types in memmove/memset formation (#80487)
Without this change, the included test cases crash the compiler. I
believe this is fallout from the homogenous scalable struct work from a
while back; I think we just forgot to update this case.

Likely to fix https://github.com/llvm/llvm-project/issues/80463.
2024-02-02 18:47:18 -08:00
Nikita Popov
6c2fbc3a68
[IRBuilder] Add CreatePtrAdd() method (NFC) (#77582)
This abstracts over the common pattern of creating a gep with i8 element
type.
2024-01-12 14:21:21 +01:00
Wang Pengcheng
6aa6ef73ec
[MemCpyOpt] Don't perform call slot opt if alloc type is scalable (#75027)
This fixes #75010.
2023-12-11 19:45:13 +08:00
Sander de Smalen
81b7f115fb
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize,
and LHS and RHS are both objects of class TypeSize. So this is
evaluating
if the pointer to the function Scalable == the pointer to the function
Scalable,
which is always true because LHS and RHS have the same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`,
so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
2023-11-22 08:52:53 +00:00
Nikita Popov
369c9b791b
[MemCpyOpt] Require writable object during call slot optimization (#71542)
Call slot optimization may introduce writes to the destination object
that occur earlier than in the original function. We currently already
check that that the destination is dereferenceable and aligned, but we
do not make sure that it is writable. As such, we might introduce a
write to read-only memory, or introduce a data race.

Fix this by checking that the object is writable. For arguments, this is
indicated by the new writable attribute. Tests using
sret/dereferenceable are updated to use it.
2023-11-09 15:55:44 +01:00
Nikita Popov
5c3beb7b1e [MemCpyOpt] Handle memcpy marked as memory(none)
Fixes #71183.
2023-11-03 15:20:21 +01:00
DianQK
0c4f326d8b
[MemCpyOpt] Combine alias metadatas when replacing byval arguments (#70580)
Fixes #70578.
2023-10-29 16:07:55 +08:00
Fangrui Song
8e247b8f47 Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC 2023-10-27 00:30:41 -07:00
Nikita Popov
7c7896b1be [MemCpyOpt] Remove unnecessary typed pointer handling (NFC)
Drop code inserting pointer casts. Check pointer types instead of
address spaces.
2023-10-20 12:49:07 +02:00
Kai Yan
df116d1dc4
[MemCpyOpt] Fix the invalid code modification for GEP (#68479)
Relocate the GEP modification to a later stage of the function
performCallSlotOptzn(), ensuring that the code remains unchanged if the
optimization fails.

Co-authored-by: aklkaiyan <aklkaiyan@tencent.com>
2023-10-09 12:54:16 +02:00
Craig Topper
689ace53a5
[MemCpyOptimizer] Support scalable vectors in performStackMoveO… (#67632)
…ptzn.

This changes performStackMoveOptzn to take a TypeSize instead of
uint64_t to avoid an implicit conversion when called from
processStoreOfLoad.

performStackMoveOptzn has been updated to allow scalable types in the
rest of its code.
2023-09-28 12:25:38 -07:00
DianQK
4e6e476329
[MemCpyOpt] Merge alias metadatas when replacing arguments (#67539)
Alias metadata may no longer be valid after replacing the call argument.
Fix this by merging it with the memcpy alias metadata.

This fixes a miscompilation encountered in
https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Failing.20tests.20when.20rustc.20is.20compiled.20with.201.20CGU.
2023-09-28 10:13:21 +02:00
Kohei Asano
2a207128a7
[MemCpyOpt] move SrcAlloca to the entry if transformation is performed (#67226)
This is fixup for
https://github.com/llvm/llvm-project/pull/66618#discussion_r1328523770 .
This transformation checks whether allocas are static, if the
transformation is performed. This patch moves the SrcAlloca to the entry
of the BB when the optimization performed.
2023-09-26 16:27:34 +09:00
Bjorn Pettersson
08fdbbeb09 [MemCpyOpt] Drop redundant CreatePointerCast
Given that transition to opaque pointers a call to CreatePointerCast
in processMemSetMemCpyDependence was found redundant. It would cast
from "ptr" to "ptr" (both associated with the same address space).
2023-09-18 22:17:10 +02:00
Kohei Asano
baf031a853
[MemCpyOpt] fix miscompile for non-dominated use of src alloca for stack-move optimization (#66618)
Stack-move optimization, the optimization that merges src and dest
alloca of the full-size copy, replaces all uses of the dest alloca with
src alloca. For safety, we needed to check all uses of the dest alloca
locations are dominated by src alloca, to be replaced. This PR adds the
check for that.

Fixes #65225
2023-09-18 21:29:10 +09:00
Nikita Popov
07460b6666 [MemCpyOpt] Avoid infinite loop in processMemSetMemCpyDependence (PR54983)
This adds an additional transform to drop zero-size memcpys, also in
the case where the size is only zero after instruction simplification.
The motivation is the case from PR54983 where the size is non-trivially
zero, and processMemSetMemCpyDependence() keeps trying to reduce the
memset size by zero bytes.

This fix it's not really principled. It only works on the premise that
if InstSimplify doesn't realize the size is zero, then AA also won't.

The principled approach would be to instead add a isKnownNonZero()
guard to the processMemSetMemCpyDependence() transform, but I
suspect that would render that optimization mostly useless (at least
it breaks all the existing test coverage -- worth noting that the
constant size case is also handled by DSE, so I think this transform
is primarily about the dynamic size case).

Fixes https://github.com/llvm/llvm-project/issues/54983.
Fixes https://github.com/llvm/llvm-project/issues/64886.

Differential Revision: https://reviews.llvm.org/D124078
2023-09-15 09:10:15 +02:00
khei4
7f3610ac69 Reapply "Revert "[MemCpyOpt] implement multi BB stack-move optimization"
This reverts commit efe8aa2e618122e8050af10cc5d6ad83f24ef557.

Differential Revision: https://reviews.llvm.org/D155406
2023-09-14 19:42:36 +09:00
Vitaly Buka
efe8aa2e61 Revert "Reapply "Revert "[MemCpyOpt] implement multi BB stack-move optimization""
Suspecting incorrect lifetime markers.

This reverts commit 3a1409f93da32bf626f76257e0aac71716f2f67e.
2023-09-07 11:14:19 -07:00
khei4
3a1409f93d Reapply "Revert "[MemCpyOpt] implement multi BB stack-move optimization"
This reverts commit e0f9cc71cb6f4eb2e1566177e05425c497759dc6.
Differential Revision: https://reviews.llvm.org/D155406
2023-08-29 19:40:29 +09:00
Vitaly Buka
e0f9cc71cb Revert "Reapply "Revert "[MemCpyOpt] implement multi BB stack-move optimization"""
Breaks multiple bots. e.g. https://lab.llvm.org/buildbot/#/builders/19/builds/18856

This reverts commit ac0072602c9d01fc031a2d0acb418f7191480ef0.
2023-08-26 19:24:50 -07:00
khei4
ac0072602c Reapply "Revert "[MemCpyOpt] implement multi BB stack-move optimization""
This reverts commit 3bb32c61b2f1f5d14dd056dd198dc898dce5a44e.

Use InsertionPt for DT to handle non-memory access dominators

Differential Revision: https://reviews.llvm.org/D155406
2023-08-27 06:50:19 +09:00
khei4
3bb32c61b2 Revert "[MemCpyOpt] implement multi BB stack-move optimization"
This reverts commit ef867d2ea10e8246be20c608160e07a54eb2ed14.

crash on sanitizer build https://lab.llvm.org/buildbot/#/builders/70/builds/42861/steps/10/logs/stdio
2023-08-24 22:56:39 +09:00
khei4
ef867d2ea1 [MemCpyOpt] implement multi BB stack-move optimization
Differential Revision: https://reviews.llvm.org/D155406
2023-08-24 22:19:01 +09:00
Nikita Popov
9d2f8ecac8 [MSSAU] Clarify that the defining access does not matter (NFC)
New memory accesses are usually inserted by using one of the
createMemoryAccessXYZ() methods followed by insertUse() or
insertDef(). createMemoryAccessXYZ() accepts a defining access,
however this defining access will always be overwritten by
insertUse() / insertDef().

Update the documentation to clarify this, and stop passing
Definition to createMemoryAccessXYZ() if it's followed by
insertUse/insertDef.

Alternatively, we could also make insertUse / insertDef keep the
defining access if it is specified, and only recompute it if it's
missing.

Differential Revision: https://reviews.llvm.org/D157979
2023-08-16 09:00:32 +02:00
khei4
ca68a7f956 Reapply: [MemCpyOpt] implement single BB stack-move optimization which unify the static unescaped allocas
This reverts commit 207718029e1e62d82145b479f6349941b6384045.
2023-08-15 22:13:09 +09:00
Vitaly Buka
207718029e Revert "Reapply: [MemCpyOpt] implement single BB stack-move optimization which unify the static unescaped allocas"
Fails on https://lab.llvm.org/buildbot/#/builders/85/builds/18296

This reverts commit 43698c1ddc179ccd97b3f3b2bb03f4a3fe9556f3.
2023-08-13 16:29:39 -07:00
khei4
43698c1ddc Reapply: [MemCpyOpt] implement single BB stack-move optimization which unify the static unescaped allocas
Differential Revision: https://reviews.llvm.org/D153453

This reverts commit 00653889883f2d818536efcb21c6c8b739f0888b.
2023-08-13 21:38:00 +09:00