468 Commits

Author SHA1 Message Date
Nikita Popov
0a8ebdb2f0 [MemCpyOpt] Remove handling for lifetime sizes
Split out from #150248:

Since #150944 the size passed to lifetime.start/end is considered
meaningless. The lifetime always applies to the whole alloca.

Accordingly, remove checks of the lifetime size from MemCpyOpt.
2025-08-05 17:22:12 +02:00
Jameson Nash
4d859dbae1
[MemCpyOpt] fix incorrect handling of lifetime markers (#143782)
Lifetime markers should only increase the information available to
LLVM. Instead, the callback gave up entirely whenever it encountered a
lifetime marker that was not full size. Sub-optimal lifetime markers
are not supposed to forbid optimizations that would otherwise apply if
the markers were either absent or optimal. This pass was not tracking
GEP offsets either, so that case was not handled quite correctly,
although earlier sub-optimal checks that this size is the same as the
alloca's made this safe in the past, and made it unlikely that anything
else was ever encountered.
2025-07-26 14:03:18 -04:00
Jeremy Morse
57a5f9c47e
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this
skipping. Hooray!
2025-07-15 15:34:10 +01:00
Jameson Nash
c04fc5596e
[MemCpyOpt] allow some undef contents overread in processMemCpyMemCpyDependence (#143745)
Allows memcpy to memcpy forwarding in cases where the second memcpy is
larger, but the overread is known to be undef, by shrinking the memcpy
size.

Refs https://github.com/llvm/llvm-project/pull/140954 which laid some of
the groundwork for this.
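
A minimal sketch of the idea, not the pass's actual code: bytes are
modelled as `std::optional` values where `std::nullopt` stands for
undef, and `copyBytes`, `copyViaTemporary` and `copyForwarded` are
hypothetical helpers invented for this illustration. Under that
assumption, copying through a temporary with a larger second copy gives
the same result as a single forwarded copy shrunk to the smaller size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model: a byte is either a concrete value or undef (std::nullopt).
using Byte = std::optional<std::uint8_t>;
using Buffer = std::vector<Byte>;

// Toy memcpy over the model: copies N bytes, undef bytes included.
void copyBytes(Buffer &Dst, const Buffer &Src, std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = Src[I];
}

// Original sequence: A -> B (SmallN bytes), then B -> C (LargeN bytes).
Buffer copyViaTemporary(const Buffer &A, std::size_t SmallN,
                        std::size_t LargeN) {
  Buffer B(LargeN); // fresh buffer: every byte starts as undef
  Buffer C(LargeN);
  copyBytes(B, A, SmallN);
  copyBytes(C, B, LargeN); // overreads LargeN - SmallN undef bytes
  return C;
}

// Forwarded sequence: A -> C directly, with the size shrunk to SmallN.
Buffer copyForwarded(const Buffer &A, std::size_t SmallN,
                     std::size_t LargeN) {
  Buffer C(LargeN);
  copyBytes(C, A, SmallN);
  return C;
}
```

In this model the destination also starts out as undef, so the two
sequences compare equal byte for byte; the real transform relies on
refinement of undef rather than on such an equality.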
2025-06-18 15:38:34 -04:00
Jameson Nash
bc7ea63e9c
[MemCpyOpt] handle memcpy from memset for non-constant sizes (#143727)
Allows forwarding memset to memcpy for mismatching unknown sizes if
the overread has undef contents. In that case we can refine the undef
bytes to the memset value.

Refs #140954 which laid some of the groundwork for this.
2025-06-11 20:04:27 -04:00
Jameson Nash
7460c700ae
[MemCpyOpt] handle memcpy from memset in more cases (#140954)
This aims to reduce the divergence between the initial checks in this
function and processMemCpyMemCpyDependence (in particular, adding
handling of offsets), with the goal to eventually reduce duplication
there and improve this pass in other ways.
2025-06-11 10:42:05 +02:00
dianqk
e573ffe11f
[MemCpyOpt] Check MDep aliases to avoid infinite loops (NFC) (#140376)
cc #103218.
2025-05-27 20:01:22 +08:00
Philip Reames
c0a264e6a9
[IntrinsicInst] Remove MemCpyInlineInst and MemSetInlineInst [nfc] (#138568)
I'm looking for ways to simplify the Mem*Inst class structure, and these
two seem to have fairly minimal justification, so let's remove them.
2025-05-05 14:07:31 -07:00
Nikita Popov
541ad3fb71 [MemCpyOpt] Drop outdated TODO (NFC)
This code was already changed to make use of UseCC/ResultCC.
We can't restrict the check to provenance or address only, as both
are relevant here.
2025-05-05 16:26:16 +02:00
Kazu Hirata
031475594a
[llvm] Use llvm::SmallVector::pop_back_val (NFC) (#136441) 2025-04-19 11:49:19 -07:00
Nikita Popov
d69ee885cc
[CaptureTracking] Remove dereferenceable_or_null special case (#135613)
Remove the special case where comparing a dereferenceable_or_null
pointer with null results in captures(none) instead of
captures(address_is_null).

This special case is not entirely correct. Let's say we have an
allocated object of size 2 at address 1 and have a pointer `%p` pointing
either to address 1 or 2. Then passing `gep p, -1` to a
`dereferenceable_or_null(1)` function is well-defined, and allows us to
distinguish between the two possible pointers, capturing information
about the address.

Now that we ignore address captures in alias analysis, I think we're
ready to drop this special case. Additionally, if there are regressions
in other places, the fact that this is inferred as address_is_null
should allow us to easily address them if necessary.
2025-04-17 12:44:57 +02:00
Dominik Adamski
716b02d8c5
[LLVM][MemCpyOpt] Unify alias tags if we optimize allocas (#129537)
Optimization of alloca instructions may lead to invalid alias tags.
Incorrect alias tags can result in incorrect optimization outcomes for
Fortran source code compiled by Flang with flags: `-O3 -mmlir
-local-alloc-tbaa -flto`.

This commit removes alias tags when memcpy optimization replaces two
arrays with one array, thus ensuring correct compilation of Fortran
source code using flags: `-O3 -mmlir -local-alloc-tbaa -flto`.

This commit is also a proposal to fix the reported issue:
https://github.com/llvm/llvm-project/issues/133984

---------

Co-authored-by: Shilei Tian <i@tianshilei.me>
2025-04-10 12:23:53 +02:00
Nikita Popov
5da9044c40 [MemCpyOpt] Fix clobber check in fca2memcpy optimization
This effectively reverts #108535. The old AA code was looking for
the *first* clobber between the load and store and then trying to
move all the way up there. The new MSSA based code instead found
the *last* clobber. There might still be an earlier clobber that
has not been accounted for.
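
The difference between the two searches can be sketched on a toy
instruction list. This is only an illustration under simplifying
assumptions: `ToyInst`, `firstClobber` and `lastClobber` are
hypothetical names, and a single boolean replaces the real alias and
MemorySSA queries.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy stand-in for an instruction between the load and the store: all
// we track is whether it may write the loaded location.
struct ToyInst {
  bool ClobbersLoadedLoc;
};

// Forward scan from the load: index of the earliest clobber, if any.
std::optional<std::size_t> firstClobber(const std::vector<ToyInst> &Between) {
  for (std::size_t I = 0; I < Between.size(); ++I)
    if (Between[I].ClobbersLoadedLoc)
      return I;
  return std::nullopt;
}

// Backward scan from the store: index of the latest clobber, if any.
std::optional<std::size_t> lastClobber(const std::vector<ToyInst> &Between) {
  for (std::size_t I = Between.size(); I-- > 0;)
    if (Between[I].ClobbersLoadedLoc)
      return I;
  return std::nullopt;
}
```

With clobbers at positions 1 and 3, the backward scan reports 3 and
says nothing about position 1, which is the kind of earlier clobber the
message refers to.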

Fixes #130632.
2025-03-12 14:53:50 +01:00
Nikita Popov
e56a6a2683
Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880) (#128020)
Relative to the previous attempt this includes two fixes:
 * Adjust callCapturesBefore() to not skip captures(ret: address,
    provenance) arguments, as these will not count as a capture
    at the call-site.
 * When visiting uses during stack slot optimization, don't skip
    the ModRef check for passthru captures. Calls can both modref
    and be passthru for captures.

------

This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs;
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return
    value if it has additional CaptureComponents.
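
As a rough, self-contained sketch of how a tracker might choose between
the three actions: the names `Action`, `UseCaptureInfo`, `UseCC` and
`ResultCC` come from the description above, while the `Components`
bitmask values and the policy in `decide` are assumptions made up for
this illustration and are not LLVM's definitions.

```cpp
#include <cassert>

// Hypothetical bitmask standing in for capture components.
enum Components : unsigned { None = 0, Address = 1, Provenance = 2 };

struct UseCaptureInfo {
  unsigned UseCC;    // what is captured at this use
  unsigned ResultCC; // what may be captured via the user's return value
};

enum class Action { Stop, ContinueIgnoringReturn, Continue };

// Hypothetical policy: stop once provenance is captured at the use,
// skip the return value when nothing flows through it, and otherwise
// keep following the return value.
Action decide(UseCaptureInfo CI) {
  if (CI.UseCC & Provenance)
    return Action::Stop;
  if (CI.ResultCC == None)
    return Action::ContinueIgnoringReturn;
  return Action::Continue;
}
```

A real tracker would pick its own policy; the point is only that both
components are visible when the decision is made.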

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support; there are various possible improvements marked with
TODOs.
2025-02-27 09:38:29 +01:00
Nikita Popov
9cbdcfcafd [CaptureTracking] Remove StoreCaptures parameter (NFC)
The implementation doesn't use it, and is unlikely to use it in
the future.

The places that do set StoreCaptures=false do so incorrectly and
would be broken if the parameter actually did anything.
2025-02-24 12:00:57 +01:00
Nico Weber
e2ba1b6ffd Revert "Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)"
This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729.
Seems to break LTO builds of clang on Windows, see comments on
https://github.com/llvm/llvm-project/pull/125880
2025-02-19 11:32:57 -05:00
Nikita Popov
7e3735d1a1 Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)
Relative to the previous attempt, this adjusts isEscapeSource()
to not treat calls with captures(ret: address, provenance) or similar
arguments as escape sources. This addresses the miscompile reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577

The implementation uses a helper function on CallBase to make this
check a bit more efficient (e.g. by skipping the byval checks) as
checking attributes on all arguments is fairly expensive.

------

This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs;
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return
    value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support; there are various possible improvements marked with
TODOs.
2025-02-14 12:38:04 +01:00
Nikita Popov
1e64ea9914 Revert "[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)"
This reverts commit ee655ca27aad466bcc54f6eba03f7e564940ad5a.

A miscompilation has been reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577
2025-02-13 14:56:12 +01:00
Nikita Popov
ee655ca27a
[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)
This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs;
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return
    value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support; there are various possible improvements marked with
TODOs.
2025-02-13 09:36:35 +01:00
Yingwei Zheng
9fbd5fbcc6
[IR][NFC] Switch to use LifetimeIntrinsic (#125528) 2025-02-04 02:18:33 +08:00
Jeremy Morse
8e70273509
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; however, to ensure some type safety, we'd like to
have all calls to moveBefore use iterators.

This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
2025-01-24 10:53:11 +00:00
Nikita Popov
1393f4e69f [MemCpyOpt] Use doesNotCapture() helper (NFC)
No difference in semantics here as byval is already handled
separately. This simplifies migration to the captures attribute.
2025-01-14 14:28:11 +01:00
Nikita Popov
71f7b972c3
[Local] Make combineAAMetadata() more principled (#122091)
This moves combineAAMetadata() into Local and implements it via a new
AAOnly flag, which will intersect only AA metadata and keep other known
metadata.

The existing KnownIDs list is dropped, because it is redundant with the
switch in combineMetadata(), which already drops unknown metadata.

I tried a few variants of this, and ultimately went with the AAOnly flag
because this way we make an explicit choice for each metadata kind
supported by combineMetadata(), and ignoring the flag gives you
conservatively correct behavior.

I checked that the memcpy tests still pass if we adjust the logic for
MD_memprof/MD_callsite to drop the metadata instead of arbitrarily
picking one.

Fixes https://github.com/llvm/llvm-project/issues/121495.
2025-01-09 09:34:46 +01:00
Teresa Johnson
3a423a10ff
[MemProf][PGO] Prevent dropping of profile metadata during optimization (#121359)
This patch fixes a couple of places where memprof-related metadata
(!memprof and !callsite) were being dropped, and one place where PGO
metadata (!prof) was being dropped.

All were due to instances of combineMetadata() being invoked. That
function drops all metadata not in the list provided by the client, and
also drops any not in its switch statement.

Memprof metadata needed a case in the combineMetadata switch statement.
For now we simply keep the metadata of the instruction being kept, which
doesn't retain all the profile information when two calls with
memprof metadata are being combined, but at least retains some.

For the memprof metadata being dropped during call CSE, add memprof and
callsite metadata to the list of known ids in combineMetadataForCSE.

Neither memprof nor regular prof metadata were in the list of known ids
for the callsite in MemCpyOptimizer, which was added to combine AA
metadata after optimization of byval arguments fed by memcpy
instructions, and similar types of optimizations of memcpy uses.

There is one other callsite of combineMetadata, but it is only invoked
on load instructions, which do not carry these types of metadata.
2025-01-02 12:11:59 -08:00
Momchil Velikov
5d9c321e8d
Handle scalable store size in MemCpyOptimizer (#118957)
The compiler crashes with an ICE when it tries to create a `memset` with
scalable size.
2024-12-06 20:48:48 +00:00
Antonio Frighetto
1d6ab189be [MemCpyOpt] Drop dead memmove calls on memset'd source data
When a memmove happens to clobber source data, and such data have
been previously memset'd, the memmove may be redundant.
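
The underlying observation can be checked in plain C++. This is a
sketch on ordinary buffers, not the pass's implementation, and
`fillThenMove` is a hypothetical helper: once a region has been filled
with one byte value, a memmove whose source and destination both lie
inside that region leaves its contents unchanged.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Fill a 16-byte buffer with one value, then optionally memmove an
// overlapping slice of it onto the start of the same buffer.
std::array<unsigned char, 16> fillThenMove(bool DoMove) {
  std::array<unsigned char, 16> Buf{};
  std::memset(Buf.data(), 0x2A, Buf.size());
  if (DoMove)
    std::memmove(Buf.data(), Buf.data() + 4, Buf.size() - 4);
  return Buf;
}
```

Comparing `fillThenMove(true)` with `fillThenMove(false)` shows the
same bytes either way, which is why such a memmove can be dropped.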
2024-12-03 09:50:57 +01:00
Nikita Popov
1e32a7d42c
[AA] Rename CaptureInfo -> CaptureAnalysis (NFC) (#116842)
I'd like to use the name CaptureInfo to represent the new attribute
proposed at
https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420,
but it's already taken by AA, and I can't think of great alternatives
(CaptureEffects would be something of a stretch).

As such, I'd like to rename CaptureInfo -> CaptureAnalysis in AA, which
also seems like the more accurate terminology.
2024-11-20 09:42:28 +01:00
Kazu Hirata
94f9cbbe49
[Scalar] Remove unused includes (NFC) (#114645)
Identified with misc-include-cleaner.
2024-11-02 08:32:26 -07:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation for
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e., just lookup, no creation).
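
The two behaviors being distinguished can be sketched with a toy
symbol table. `ToyModule`, `lookup` and `getOrInsert` are hypothetical
names used only for this illustration and do not mirror LLVM's
signatures.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy module: maps a declaration name to a hypothetical declaration id.
struct ToyModule {
  std::map<std::string, int> Decls;

  // Pure lookup: returns nullptr when absent and never mutates the table.
  const int *lookup(const std::string &Name) const {
    auto It = Decls.find(Name);
    return It == Decls.end() ? nullptr : &It->second;
  }

  // Lookup-or-create: inserts a new entry on first use, then returns it.
  int &getOrInsert(const std::string &Name) {
    return Decls.try_emplace(Name, static_cast<int>(Decls.size()))
        .first->second;
  }
};
```

Calling `lookup` on a missing name leaves the table empty, while
`getOrInsert` grows it exactly once per name; a name that says
"OrInsert" makes that side effect visible at the call site.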
2024-10-11 05:26:03 -07:00
Nikita Popov
f5c02dd06e
[MemCpyOpt] Use EarliestEscapeInfo (#110280)
Pass EarliestEscapeInfo to BatchAA in MemCpyOpt. This allows memcpy
elimination in cases where one of the involved pointers is captured
after the relevant memcpy/call.
2024-09-30 09:35:54 +02:00
Nikita Popov
296901fd00 [MemCpyOpt] Use BatchAA in one more place (NFCI)
Everything else in this method uses BatchAA, apart from this
call.
2024-09-27 16:44:35 +02:00
Ramkumar Ramachandra
f664d313cd
MemCpyOpt: replace an AA query with MSSA query (NFC) (#108535)
Fix a long-standing TODO.
2024-09-24 11:18:37 +01:00
Ramkumar Ramachandra
7e9bd12cd9
MemCpyOpt: clarify logic in processStoreOfLoad (NFC) (#108400) 2024-09-12 21:16:43 +01:00
Ramkumar Ramachandra
159e5b3fdf
MemCpyOpt: avoid unnecessary getMemorySSA (NFC) (#108405) 2024-09-12 20:35:01 +01:00
Nikita Popov
2afe678f0a
[MemCpyOpt] Allow memcpy elision for non-noalias arguments (#107860)
We currently elide memcpys for readonly nocapture noalias arguments.
noalias is checked to make sure that there are no other ways to write
the memory, e.g. through a different argument or an escaped pointer.

In addition to the current noalias check, also query alias analysis, in
case it can prove that modification is not possible through other means.

This fixes the problem reported in
https://discourse.llvm.org/t/problem-about-memcpy-elimination/81121.
2024-09-11 10:04:37 +02:00
Yingwei Zheng
378daa6c6f
[MemCpyOpt] Avoid infinite loops in MemCpyOptPass::processMemCpyMemCpyDependence (#103218)
Closes https://github.com/llvm/llvm-project/issues/102994.
2024-08-22 17:20:47 +08:00
Yingwei Zheng
f364b2ee22
[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
2024-08-13 22:38:50 +08:00
Nikita Popov
71051deff2
[MemCpyOpt] Fix infinite loop in memset+memcpy fold (#98638)
For the case where the memcpy size is zero, this transform is a complex
no-op. This can lead to an infinite loop when the size is zero in a way
that BasicAA understands, because it can still understand that dst and
dst + src_size are MustAlias.

I've tried to mitigate this before using the isZeroSize() check, but we
can hit cases where InstSimplify doesn't understand that the size is
zero, but BasicAA does.

As such, this bites the bullet and adds an explicit isKnownNonZero()
check to guard against no-op transforms.

Fixes https://github.com/llvm/llvm-project/issues/98610.
2024-07-15 09:41:11 +02:00
Yingwei Zheng
99685a54d1
[MemCpyOpt] Use dyn_cast to fix assertion failure in processMemCpyMemCpyDependence (#98686)
Fixes https://github.com/llvm/llvm-project/issues/98675.
2024-07-13 04:27:07 +08:00
DianQK
fa24213928
[MemCpyOpt] Forward memcpy based on the actual copy memory location. (#87190)
Fixes #85560.

We can forward `memcpy` as long as the actual memory location being
copied has not been altered.

alive2: https://alive2.llvm.org/ce/z/q9JaHV
2024-07-12 22:58:28 +08:00
DianQK
117cc4abea
[MemCpyOpt] No need to create memcpy(a <- a) (#98321)
When forwarding `memcpy`, we don't need to create `memcpy(a, a)`.
2024-07-11 19:54:28 +08:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Nikita Popov
ac7c482ca5 [MemCpyOpt] Add extra debug output (NFC) 2024-05-21 07:29:26 +02:00
XChy
b5e8555607
[MemCpyOpt][NFC] Format codebase (#90225)
This patch automatically formats the code.
2024-04-27 20:17:35 +08:00
Philip Reames
42d6eb5475
[MemCpyOpt] Handle scalable aggregate types in memmove/memset formation (#80487)
Without this change, the included test cases crash the compiler. I
believe this is fallout from the homogeneous scalable struct work from a
while back; I think we just forgot to update this case.

Likely to fix https://github.com/llvm/llvm-project/issues/80463.
2024-02-02 18:47:18 -08:00
Nikita Popov
6c2fbc3a68
[IRBuilder] Add CreatePtrAdd() method (NFC) (#77582)
This abstracts over the common pattern of creating a gep with i8 element
type.
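
As a plain C++ analogy (not the IRBuilder API itself), a gep with i8
element type is a byte-granular pointer offset. `ptrAdd` below is a
hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte-granular pointer offset: the offset counts bytes, regardless of
// the type the pointer is later used with.
inline void *ptrAdd(void *Base, std::ptrdiff_t ByteOffset) {
  return static_cast<char *>(Base) + ByteOffset;
}
```

For an array of four `std::int32_t` values, `ptrAdd(A, 8)` yields the
address of `A[2]`, since each element occupies four bytes.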
2024-01-12 14:21:21 +01:00
Wang Pengcheng
6aa6ef73ec
[MemCpyOpt] Don't perform call slot opt if alloc type is scalable (#75027)
This fixes #75010.
2023-12-11 19:45:13 +08:00
Sander de Smalen
81b7f115fb
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize, and LHS and RHS are both objects of class TypeSize. So this is
evaluating if the pointer to the function Scalable == the pointer to the
function Scalable, which is always true because LHS and RHS have the
same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`, so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
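
The name clash can be reproduced in a small standalone example. The
classes below (`QuantityBase`, `BuggySize`, `RenamedSize`) are
simplified, hypothetical stand-ins and not the real TypeSize
definitions; they only assume a base class with a `Scalable` data
member and a derived class with a static factory of the same name.

```cpp
#include <cassert>

// Base class holding the state, including a data member named Scalable.
struct QuantityBase {
  unsigned MinValue;
  bool Scalable;
  QuantityBase(unsigned V, bool S) : MinValue(V), Scalable(S) {}
};

// The static factory Scalable() hides the inherited data member.
struct BuggySize : QuantityBase {
  using QuantityBase::QuantityBase;
  static BuggySize Fixed(unsigned V) { return BuggySize(V, false); }
  static BuggySize Scalable(unsigned V) { return BuggySize(V, true); }
};

// Compares the address of BuggySize::Scalable with itself: always true.
bool sameKindBuggy(const BuggySize &L, const BuggySize &R) {
  return L.Scalable == R.Scalable;
}

// After renaming the factories, Scalable names the data member again.
struct RenamedSize : QuantityBase {
  using QuantityBase::QuantityBase;
  static RenamedSize getFixed(unsigned V) { return RenamedSize(V, false); }
  static RenamedSize getScalable(unsigned V) { return RenamedSize(V, true); }
};

// Compares the two boolean flags, as the assert intended.
bool sameKindFixed(const RenamedSize &L, const RenamedSize &R) {
  return L.Scalable == R.Scalable;
}
```

Here `sameKindBuggy` accepts a fixed and a scalable size as the same
kind, while `sameKindFixed` tells them apart. Some compilers may warn
about the self-comparison in `sameKindBuggy`.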
2023-11-22 08:52:53 +00:00
Nikita Popov
369c9b791b
[MemCpyOpt] Require writable object during call slot optimization (#71542)
Call slot optimization may introduce writes to the destination object
that occur earlier than in the original function. We currently already
check that the destination is dereferenceable and aligned, but we
do not make sure that it is writable. As such, we might introduce a
write to read-only memory, or introduce a data race.

Fix this by checking that the object is writable. For arguments, this is
indicated by the new writable attribute. Tests using
sret/dereferenceable are updated to use it.
2023-11-09 15:55:44 +01:00