llvm-project

Author	SHA1	Message	Date
Antonio Frighetto	034e5d6f86	[MemCpyOpt] Extend `performMemCpyToMemSetOptzn` to partially memset'd region While doing memset-to-memcpy forwarding, take into account memset that covers memory regions from a given offset, and the leading bytes of such a region are undef. Fixes: https://github.com/llvm/llvm-project/issues/172326.	2026-01-30 10:09:08 +01:00
Arthur Eubanks	f8c4974963	Revert "[MemCpyOpt] support offset slices for performStackMoveOptzn and processMemCpy (#176436)" (#177482) This reverts commit 019eb855dd6a18a8f7ae5dd86abf6bc3ad0d9fa4. Causes miscompiles, see original PR and #177185 Fixes https://github.com/llvm/llvm-project/issues/177185	2026-01-22 22:34:35 +00:00
Jameson Nash	019eb855dd	[MemCpyOpt] support offset slices for performStackMoveOptzn and processMemCpy (#176436) In particular, support offset of src, since offset of dest will be a followup change when dest is allowed to be not full-sized with copy. Extracted from https://github.com/llvm/llvm-project/pull/150792	2026-01-19 16:45:31 -05:00
Jameson Nash	ba2bd3fbba	Use AllocaInst::getAllocationSize instead of manual size calculations (#176486) Replace patterns that manually compute allocation sizes by multiplying getTypeAllocSize(getAllocatedType()) by the array size with calls to the getAllocationSize(DL) API, which handles this correctly and concisely, returning nullopt for VLAs. This fixes several places that were not accounting for array allocations when computing sizes, simplifies code that was doing this manually, and adds some explicit isFixed checks where implied convert was being used. This PR is because now that we have opaque pointers, I hate that some AllocaInst still has type information being consumed by some passes instead of just using the size, since passes rarely handle that type information well or correctly. I hope this will grow into a sequence of commits to slowly eliminate uses of getAllocatedType from AllocaInst. And similarly later to remove type information from GlobalValue too (it can be replaced with just dereferenceable bytes, similar to arguments). Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 09:55:52 -05:00
Jameson Nash	94ffc754d2	[MemCpyOpt] keep src/dest alloca ordering (#176012) Rather than test dominator of every use, just check which of src or dest is first, and use that insert location. This minimizes unnecessary dominator queries while also helping to preserve the order of allocas (for better code readability / diff). Extracted from PR optimization improvement series at https://github.com/llvm/llvm-project/pull/150792	2026-01-14 15:18:37 -05:00
Jameson Nash	d275182924	[MemCpyOpt] allow memcpy-to-memcpy optimization with smaller dest than src (#176010) Resize the alloca if needed to a common size, as long as the dest was still fully initialized by the copy. Extracted from PR optimization improvement series at https://github.com/llvm/llvm-project/pull/150792 (included all tests additions from there as well)	2026-01-14 15:16:11 -05:00
Victor Chernyakin	c438773432	[LLVM][ADT] Migrate users of `make_scope_exit` to CTAD (#174030) This is a followup to #173131, which introduced the CTAD functionality.	2026-01-02 20:42:56 -08:00
Mircea Trofin	17789e9fa8	[MemCpyOpt][profcheck] Set `unknown` branch weights for certain selects (#167597) Issue #147390	2025-11-14 10:36:50 -08:00
Nikita Popov	8f624815bf	[MemCpyOpt] Allow stack move optimization if one address captured (#165527) Allow the stack move optimization (which merges two allocas) when the address of only one alloca is captured (and the provenance is not captured). Both addresses need to be captured to observe that the allocas were merged. Fixes https://github.com/llvm/llvm-project/issues/165484.	2025-10-30 10:23:40 +01:00
Kazu Hirata	07eb7b7692	[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068) This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.	2025-08-18 07:01:29 -07:00
Nikita Popov	c23b4fbdbb	[IR] Remove size argument from lifetime intrinsics (#150248) Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).	2025-08-08 11:09:34 +02:00
Nikita Popov	0a8ebdb2f0	[MemCpyOpt] Remove handling for lifetime sizes Split out from #150248: Since #150944 the size passed to lifetime.start/end is considered meaningless. The lifetime always applies to the whole alloca. Accordingly, remove checks of the lifetime size from MemCpyOpt.	2025-08-05 17:22:12 +02:00
Jameson Nash	4d859dbae1	[MemCpyOpt] fix incorrect handling of lifetime markers (#143782) Having lifetime markers should only increase the information available to LLVM, but it would instead rely on the callback to entirely give up if it encountered a lifetime marker that wasn't full size, but sub-optimal lifetime markers are not supposed to be forbidding optimizations that would otherwise apply if they were either absent or optimal. This pass wasn't tracking GEP offsets either, so it wasn't quite correctly handled either, although earlier sub-optimal checks that this size is the same as the alloca test made this safe in the past, and unlikely to have encountered anything else in the past.	2025-07-26 14:03:18 -04:00
Jeremy Morse	57a5f9c47e	[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383) There are no longer debug-info instructions, thus we don't need this skipping. Hooray!	2025-07-15 15:34:10 +01:00
Jameson Nash	c04fc5596e	[MemCpyOpt] allow some undef contents overread in processMemCpyMemCpyDependence (#143745) Allows memcpy to memcpy forwarding in cases where the second memcpy is larger, but the overread is known to be undef, by shrinking the memcpy size. Refs https://github.com/llvm/llvm-project/pull/140954 which laid some of the groundwork for this.	2025-06-18 15:38:34 -04:00
Jameson Nash	bc7ea63e9c	[MemCpyOpt] handle memcpy from memset for non-constant sizes (#143727) Allows forwarding memset to memcpy for mismatching unknown sizes if overread has undef contents. In that case we can refine the undef bytes to the memset value. Refs #140954 which laid some of the groundwork for this.	2025-06-11 20:04:27 -04:00
Jameson Nash	7460c700ae	[MemCpyOpt] handle memcpy from memset in more cases (#140954) This aims to reduce the divergence between the initial checks in this function and processMemCpyMemCpyDependence (in particular, adding handling of offsets), with the goal to eventually reduce duplication there and improve this pass in other ways.	2025-06-11 10:42:05 +02:00
dianqk	e573ffe11f	[MemCpyOpt] Check `MDep` aliases to avoid infinite loops (NFC) (#140376) cc #103218.	2025-05-27 20:01:22 +08:00
Philip Reames	c0a264e6a9	[IntrinsicInst] Remove MemCpyInlineInst and MemSetInlineInst [nfc] (#138568) I'm looking for ways to simplify the Mem*Inst class structure, and these two seem to have fairly minimal justification, so let's remove them.	2025-05-05 14:07:31 -07:00
Nikita Popov	541ad3fb71	[MemCpyOpt] Drop outdated TODO (NFC) This code was already changed to make use of UseCC/ResultCC. We can't restrict the check to provenance or address only, as both are relevant here.	2025-05-05 16:26:16 +02:00
Kazu Hirata	031475594a	[llvm] Use llvm::SmallVector::pop_back_val (NFC) (#136441)	2025-04-19 11:49:19 -07:00
Nikita Popov	d69ee885cc	[CaptureTracking] Remove dereferenceable_or_null special case (#135613) Remove the special case where comparing a dereferenceable_or_null pointer with null results in captures(none) instead of captures(address_is_null). This special case is not entirely correct. Let's say we have an allocated object of size 2 at address 1 and have a pointer `%p` pointing either to address 1 or 2. Then passing `gep p, -1` to a `dereferenceable_or_null(1)` function is well-defined, and allows us to distinguish between the two possible pointers, capturing information about the address. Now that we ignore address captures in alias analysis, I think we're ready to drop this special case. Additionally, if there are regressions in other places, the fact that this is inferred as address_is_null should allow us to easily address them if necessary.	2025-04-17 12:44:57 +02:00
Dominik Adamski	716b02d8c5	[LLVM][MemCpyOpt] Unify alias tags if we optimize allocas (#129537) Optimization of alloca instructions may lead to invalid alias tags. Incorrect alias tags can result in incorrect optimization outcomes for Fortran source code compiled by Flang with flags: `-O3 -mmlir -local-alloc-tbaa -flto`. This commit removes alias tags when memcpy optimization replaces two arrays with one array, thus ensuring correct compilation of Fortran source code using flags: `-O3 -mmlir -local-alloc-tbaa -flto`. This commit is also a proposal to fix the reported issue: https://github.com/llvm/llvm-project/issues/133984 --------- Co-authored-by: Shilei Tian <i@tianshilei.me>	2025-04-10 12:23:53 +02:00
Nikita Popov	5da9044c40	[MemCpyOpt] Fix clobber check in fca2memcpy optimization This effectively reverts #108535. The old AA code was looking for the first clobber between the load and store and then trying to move all the way up there. The new MSSA based code instead found the last clobber. There might still be an earlier clobber that has not been accounted for. Fixes #130632.	2025-03-12 14:53:50 +01:00
Nikita Popov	e56a6a2683	Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880) (#128020) Relative to the previous attempt this includes two fixes: * Adjust callCapturesBefore() to not skip captures(ret: address, provenance) arguments, as these will not count as a capture at the call-site. * When visiting uses during stack slot optimization, don't skip the ModRef check for passthru captures. Calls can both modref and be passthru for captures. ------ This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups. The key API changes here are: * DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures. * The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of: Stop: stop traversal. ContinueIgnoringReturn: continue traversal but don't follow the instruction return value. Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents. For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it. This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.	2025-02-27 09:38:29 +01:00
Nikita Popov	9cbdcfcafd	[CaptureTracking] Remove StoreCaptures parameter (NFC) The implementation doesn't use it, and is unlikely to use it in the future. The places that do set StoreCaptures=false, do so incorrectly and would be broken if the parameter actually did anything.	2025-02-24 12:00:57 +01:00
Nico Weber	e2ba1b6ffd	Revert "Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)" This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729. Seems to break LTO builds of clang on Windows, see comments on https://github.com/llvm/llvm-project/pull/125880	2025-02-19 11:32:57 -05:00
Nikita Popov	7e3735d1a1	Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880) Relative to the previous attempt, this adjusts isEscapeSource() to not treat calls with captures(ret: address, provenance) or similar arguments as escape sources. This addresses the miscompile reported at: https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577 The implementation uses a helper function on CallBase to make this check a bit more efficient (e.g. by skipping the byval checks) as checking attributes on all arguments is fairly expensive. ------ This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups. The key API changes here are: * DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures. * The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of: Stop: stop traversal. ContinueIgnoringReturn: continue traversal but don't follow the instruction return value. Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents. For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it. This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.	2025-02-14 12:38:04 +01:00
Nikita Popov	1e64ea9914	Revert "[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)" This reverts commit ee655ca27aad466bcc54f6eba03f7e564940ad5a. A miscompilation has been reported at: https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577	2025-02-13 14:56:12 +01:00
Nikita Popov	ee655ca27a	[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880) This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups. The key API changes here are: * DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures. * The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of: Stop: stop traversal. ContinueIgnoringReturn: continue traversal but don't follow the instruction return value. Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents. For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it. This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.	2025-02-13 09:36:35 +01:00
Yingwei Zheng	9fbd5fbcc6	[IR][NFC] Switch to use `LifetimeIntrinsic` (#125528)	2025-02-04 02:18:33 +08:00
Jeremy Morse	8e70273509	[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety however, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).	2025-01-24 10:53:11 +00:00
Nikita Popov	1393f4e69f	[MemCpyOpt] Use doesNotCapture() helper (NFC) No difference in semantics here as byval is already handled separately. This simplifies migration to the captures attribute.	2025-01-14 14:28:11 +01:00
Nikita Popov	71f7b972c3	[Local] Make combineAAMetadata() more principled (#122091) This moves combineAAMetadata() into Local and implements it via a new AAOnly flag, which will intersect only AA metadata and keep other known metadata. The existing KnownIDs list is dropped, because it is redundant with the switch in combineMetadata(), which already drops unknown metadata. I tried a few variants of this, and ultimately went with the AAOnly flag because this way we make an explicit choice for each metadata kind supported by combineMetadata(), and ignoring the flag gives you conservatively correct behavior. I checked that the memcpy tests still pass if we adjust the logic for MD_memprof/MD_callsite to drop the metadata instead of arbitrarily picking one. Fixes https://github.com/llvm/llvm-project/issues/121495.	2025-01-09 09:34:46 +01:00
Teresa Johnson	3a423a10ff	[MemProf][PGO] Prevent dropping of profile metadata during optimization (#121359) This patch fixes a couple of places where memprof-related metadata (!memprof and !callsite) were being dropped, and one place where PGO metadata (!prof) was being dropped. All were due to instances of combineMetadata() being invoked. That function drops all metadata not in the list provided by the client, and also drops any not in its switch statement. Memprof metadata needed a case in the combineMetadata switch statement. For now we simply keep the metadata of the instruction being kept, which doesn't retain all the profile information when two calls with memprof metadata are being combined, but at least retains some. For the memprof metadata being dropped during call CSE, add memprof and callsite metadata to the list of known ids in combineMetadataForCSE. Neither memprof nor regular prof metadata were in the list of known ids for the callsite in MemCpyOptimizer, which was added to combine AA metadata after optimization of byval arguments fed by memcpy instructions, and similar types of optimizations of memcpy uses. There is one other callsite of combineMetadata, but it is only invoked on load instructions, which do not carry these types of metadata.	2025-01-02 12:11:59 -08:00
Momchil Velikov	5d9c321e8d	Handle scalable store size in MemCpyOptimizer (#118957) The compiler crashes with an ICE when it tries to create a `memset` with scalable size.	2024-12-06 20:48:48 +00:00
Antonio Frighetto	1d6ab189be	[MemCpyOpt] Drop dead `memmove` calls on `memset`'d source data When a memmove happens to clobber source data, and such data have been previously memset'd, the memmove may be redundant.	2024-12-03 09:50:57 +01:00
Nikita Popov	1e32a7d42c	[AA] Rename CaptureInfo -> CaptureAnalysis (NFC) (#116842) I'd like to use the name CaptureInfo to represent the new attribute proposed at https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420, but it's already taken by AA, and I can't think of great alternatives (CaptureEffects would be something of a stretch). As such, I'd like to rename CaptureInfo -> CaptureAnalysis in AA, which also seems like the more accurate terminology.	2024-11-20 09:42:28 +01:00
Kazu Hirata	94f9cbbe49	[Scalar] Remove unused includes (NFC) (#114645) Identified with misc-include-cleaner.	2024-11-02 08:32:26 -07:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Nikita Popov	f5c02dd06e	[MemCpyOpt] Use EarliestEscapeInfo (#110280) Pass EarliestEscapeInfo to BatchAA in MemCpyOpt. This allows memcpy elimination in cases where one of the involved pointers is captured after the relevant memcpy/call.	2024-09-30 09:35:54 +02:00
Nikita Popov	296901fd00	[MemCpyOpt] Use BatchAA in one more place (NFCI) Everything else in this method using BatchAA, apart from this call.	2024-09-27 16:44:35 +02:00
Ramkumar Ramachandra	f664d313cd	MemCpyOpt: replace an AA query with MSSA query (NFC) (#108535) Fix a long-standing TODO.	2024-09-24 11:18:37 +01:00
Ramkumar Ramachandra	7e9bd12cd9	MemCpyOpt: clarify logic in processStoreOfLoad (NFC) (#108400)	2024-09-12 21:16:43 +01:00
Ramkumar Ramachandra	159e5b3fdf	MemCpyOpt: avoid unnecessary getMemorySSA (NFC) (#108405)	2024-09-12 20:35:01 +01:00
Nikita Popov	2afe678f0a	[MemCpyOpt] Allow memcpy elision for non-noalias arguments (#107860) We currently elide memcpys for readonly nocapture noalias arguments. noalias is checked to make sure that there are no other ways to write the memory, e.g. through a different argument or an escaped pointer. In addition to the current noalias check, also query alias analysis, in case it can prove that modification is not possible through other means. This fixes the problem reported in https://discourse.llvm.org/t/problem-about-memcpy-elimination/81121.	2024-09-11 10:04:37 +02:00
Yingwei Zheng	378daa6c6f	[MemCpyOpt] Avoid infinite loops in `MemCpyOptPass::processMemCpyMemCpyDependence` (#103218) Closes https://github.com/llvm/llvm-project/issues/102994.	2024-08-22 17:20:47 +08:00
Yingwei Zheng	f364b2ee22	[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889) Since we are using opaque pointers now, we don't need to peek through bitcast on pointers and gep with zero indices.	2024-08-13 22:38:50 +08:00
Nikita Popov	71051deff2	[MemCpyOpt] Fix infinite loop in memset+memcpy fold (#98638) For the case where the memcpy size is zero, this transform is a complex no-op. This can lead to an infinite loop when the size is zero in a way that BasicAA understands, because it can still understand that dst and dst + src_size are MustAlias. I've tried to mitigate this before using the isZeroSize() check, but we can hit cases where InstSimplify doesn't understand that the size is zero, but BasicAA does. As such, this bites the bullet and adds an explicit isKnownNonZero() check to guard against no-op transforms. Fixes https://github.com/llvm/llvm-project/issues/98610.	2024-07-15 09:41:11 +02:00
Yingwei Zheng	99685a54d1	[MemCpyOpt] Use `dyn_cast` to fix assertion failure in `processMemCpyMemCpyDependence` (#98686) Fixes https://github.com/llvm/llvm-project/issues/98675.	2024-07-13 04:27:07 +08:00

479 Commits