354 Commits

Nikita Popov
6b69985da4 [MemCpyOpt] Use helper for unwind check
This extends support to byval arguments. It could be further
extended to handle the case of non-captured noalias returns.
2022-01-26 12:43:31 +01:00
Nikita Popov
0d20407d1a Reapply [MemCpyOpt] Look through pointer casts when checking capture
This is a recommit of the patch without changes. The reason for
the revert has been addressed in D117679.

-----

The user scanning loop above looks through pointer casts, so we
also need to strip pointer casts in the capture check. Previously
the source was incorrectly considered not captured if a bitcast
was passed to the call.
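
A minimal sketch of the previously missed case (hypothetical IR, names
invented for illustration):

    %src = alloca i64
    %src.i8 = bitcast i64* %src to i8*
    call void @f(i8* %src.i8) ; capture of %src through a bitcast

Without stripping the bitcast, the check compared %src against the call
arguments and concluded the source was not captured.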
2022-01-20 09:30:21 +01:00
Nikita Popov
655a7024db Reapply [MemCpyOpt] Make capture check during call slot optimization more precise
This is a recommit of the patch without changes. The reason for
the revert has been addressed in D117679.

-----

Call slot optimization is currently supposed to be prevented if
the call can capture the source pointer. Due to an implementation
bug, this check currently doesn't trigger if a bitcast of the source
pointer is passed instead. I'm somewhat afraid of the fallout of
fixing this bug (due to heavy reliance on call slot optimization
in rust), so I'd like to strengthen the capture reasoning a bit first.

In particular, I believe that the capture is fine as long as a)
the call itself cannot depend on the pointer identity, because
dest has not been captured before or at the call and src has not
been captured before it, and b) there is no potential use of the
captured pointer before the lifetime of the source alloca ends,
either due to lifetime.end or a return from the function. At that
point the potentially captured pointer becomes dangling.
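
For illustration, a sketch of the pattern in question (hypothetical IR,
not from the patch itself):

    %src = alloca [16 x i8]
    %src.i8 = bitcast [16 x i8]* %src to i8*
    call void @init(i8* %src.i8) ; may capture %src.i8
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src.i8, i64 16, i1 false)
    call void @llvm.lifetime.end.p0i8(i64 16, i8* %src.i8)

Per a) and b), rewriting @init to write into %dst directly stays sound
here: any pointer @init captured dangles once the alloca's lifetime ends.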

Differential Revision: https://reviews.llvm.org/D115615
2022-01-20 09:30:20 +01:00
Nikita Popov
d7bff2e9d2 [MemCpyOpt] Fix metadata merging during call slot optimization
Call slot optimization currently merges the metadata between the
call and the load. However, we also need to merge in the metadata
of the store.

Part of the reason why we might have gotten away with this
previously is that the load and the store are usually the same
instruction (a memcpy); the issue can only show up if call slot
optimization occurs on an actual load/store pair.

This addresses the issue reported in
https://reviews.llvm.org/D115615#3251386.

Differential Revision: https://reviews.llvm.org/D117679
2022-01-20 09:25:13 +01:00
Nikita Popov
4dc4815f56 [MemCpyOpt] Add some debug output to call slot optimization (NFC) 2022-01-19 15:51:10 +01:00
Hans Wennborg
53a51acc36 Revert "[MemCpyOpt] Make capture check during call slot optimization more precise"
This caused a miscompile due to call slot optimization replacing a call
argument without considering the call's !noalias metadata; see the
discussion on the code review.

> Call slot optimization is currently supposed to be prevented if
> the call can capture the source pointer. Due to an implementation
> bug, this check currently doesn't trigger if a bitcast of the source
> pointer is passed instead. I'm somewhat afraid of the fallout of
> fixing this bug (due to heavy reliance on call slot optimization
> in rust), so I'd like to strengthen the capture reasoning a bit first.
>
> In particular, I believe that the capture is fine as long as a)
> the call itself cannot depend on the pointer identity, because
> dest has not been captured before or at the call and src has not
> been captured before it, and b) there is no potential use of the
> captured pointer before the lifetime of the source alloca ends,
> either due to lifetime.end or a return from the function. At that
> point the potentially captured pointer becomes dangling.
>
> Differential Revision: https://reviews.llvm.org/D115615

Also reverting the dependent commit:

> [MemCpyOpt] Look through pointer casts when checking capture
>
> The user scanning loop above looks through pointer casts, so we
> also need to strip pointer casts in the capture check. Previously
> the source was incorrectly considered not captured if a bitcast
> was passed to the call.

This reverts commit 487a34ed9d7d24a7b1fb388c8856c784a459b22b
and 00e6869463ae6023d0d48f30de8511d6d748b14f.
2022-01-18 17:41:49 +01:00
Simon Pilgrim
5bbcff6181 [MemCpyOptimizer] hasUndefContents - only look for underlying object if we've found an alloca
Provides an early-out if we fail to find an AllocaInst, and avoids a static analyzer warning about null dereferencing.
2022-01-06 15:15:03 +00:00
Simon Pilgrim
8399fa673b [MemCpyOptimizer] Use auto* for cast<> results (style). NFC. 2022-01-06 15:15:03 +00:00
Nikita Popov
00e6869463 [MemCpyOpt] Look through pointer casts when checking capture
The user scanning loop above looks through pointer casts, so we
also need to strip pointer casts in the capture check. Previously
the source was incorrectly considered not captured if a bitcast
was passed to the call.
2022-01-05 09:50:33 +01:00
Nikita Popov
487a34ed9d [MemCpyOpt] Make capture check during call slot optimization more precise
Call slot optimization is currently supposed to be prevented if
the call can capture the source pointer. Due to an implementation
bug, this check currently doesn't trigger if a bitcast of the source
pointer is passed instead. I'm somewhat afraid of the fallout of
fixing this bug (due to heavy reliance on call slot optimization
in rust), so I'd like to strengthen the capture reasoning a bit first.

In particular, I believe that the capture is fine as long as a)
the call itself cannot depend on the pointer identity, because
dest has not been captured before or at the call and src has not
been captured before it, and b) there is no potential use of the
captured pointer before the lifetime of the source alloca ends,
either due to lifetime.end or a return from the function. At that
point the potentially captured pointer becomes dangling.

Differential Revision: https://reviews.llvm.org/D115615
2022-01-05 09:39:25 +01:00
Fraser Cormack
7fb66d4035 [MemCpyOpt] Fix a variety of scalable-type crashes
This patch fixes a variety of crashes resulting from the `MemCpyOptPass`
casting `TypeSize` to a constant integer, whether implicitly or
explicitly.

Since `MemsetRanges` requires a constant size to work, all but one
of the fixes in this patch simply involve skipping the various
optimizations for scalable types as cleanly as possible.

The optimization of `byval` parameters, however, has been updated to
work on scalable types in theory. In practice, this optimization is only
valid when the length of the `memcpy` is known to be larger than the
scalable type size, which is currently never the case. This could
perhaps be done in the future using the `vscale_range` attribute.

Some implicit casts have been left as they were, under the
knowledge that they only occur on aggregate types. These should
never be scalably-sized.
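
As a sketch, a store like the following (hypothetical IR) used to crash
the pass when its size was queried as a constant and is now skipped:

    store <vscale x 4 x i32> zeroinitializer, <vscale x 4 x i32>* %p
    ; the store size is a scalable TypeSize, not a compile-time constant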

Reviewed By: nikic, tra

Differential Revision: https://reviews.llvm.org/D109329
2021-09-08 11:21:36 +01:00
Artem Belevich
30dfd3449e [MemCpyOpt] Allow specifying --enable-memcpyopt-without-libcalls more than once
so we can override it via clang's CLI if necessary.
2021-08-30 13:55:55 -07:00
Nikita Popov
17db125b48 [MemCpyOpt] Optimize MemoryDef insertion
When converting a store into a memset, we currently insert the new
MemoryDef after the store MemoryDef, which requires all uses to be
renamed to the new def using a whole block scan. Instead, we can
insert the new MemoryDef before the store and not rename uses,
because we know that the location is immediately overwritten, so
all uses should still refer to the old MemoryDef. Those uses will
get renamed when the old MemoryDef is actually dropped, which is
efficient.
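
A rough sketch in MemorySSA's printed form (access numbers invented):

    ; before the transform:
    ; 1 = MemoryDef(liveOnEntry)
      store i8 0, i8* %p
    ; after it, the memset's def is inserted *before* the store's def:
    ; 2 = MemoryDef(liveOnEntry)
      call void @llvm.memset.p0i8.i64(i8* %p, i8 0, i64 1, i1 false)
    ; 1 = MemoryDef(2)
      store i8 0, i8* %p ; dead, erased afterwards

Uses of 1 stay valid and only get renamed once the store is dropped.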

I expect something similar can be done for some of the other MSSA
updates in MemCpyOpt. This is an alternative to D107513, at least
for this particular case.

Differential Revision: https://reviews.llvm.org/D107702
2021-08-10 21:28:29 +02:00
Nikita Popov
88003cea1c [MemCpyOpt] Remove MemDepAnalysis-based implementation
The MemorySSA-based implementation has been enabled for a few months
(since D94376). This patch drops the old MDA-based implementation
entirely.

I've kept this to only the basic cleanup of dropping various
conditions -- the code could be further cleaned up now that there
is only one implementation.

Differential Revision: https://reviews.llvm.org/D102113
2021-08-07 22:35:44 +02:00
Artem Belevich
6a9cf21f5a [CUDA, MemCpyOpt] Add a flag to force-enable memcpyopt and use it for CUDA.
The attempt to enable MemCpyOpt unconditionally in D104801 uncovered the fact
that there are users that do not expect LLVM to materialize the `memset`
intrinsic.

While other passes can do that, too, MemCpyOpt triggers it more frequently and
breaks sanitizers and some downstream users.

For now, introduce a flag to force-enable the pass, and opt in only CUDA
compilation with the NVPTX back-end.

Differential Revision: https://reviews.llvm.org/D106401
2021-08-06 11:13:52 -07:00
Michael Liao
d1cacd5928 [MemCpyOpt] Teach memcpyopt to handle loads from constant memory.
- Loads from constant memory (either explicit loads or the sources of
  memory transfer intrinsics) won't alias any stores, as sketched below.
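
A sketch (hypothetical IR):

    @c = external constant [16 x i8]
    %v = load i8, i8* getelementptr inbounds ([16 x i8], [16 x i8]* @c, i64 0, i64 0)
    ; a load from @c cannot alias any store, so it no longer blocks
    ; e.g. memcpy forwarding across it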

Reviewed By: asbirlea, efriedma

Differential Revision: https://reviews.llvm.org/D107605
2021-08-06 12:43:52 -04:00
Nikita Popov
bb15861e14 [MemCpyOpt] Relax libcall checks
Rather than blocking the whole MemCpyOpt pass if the libcalls are
not available, only disable creation of new memset/memcpy intrinsics
where only load/stores were used previously. This only affects the
store merging and load-store conversion optimization. Other
optimizations are derived from existing intrinsics, which are
well-defined in the absence of libcalls -- not having the libcalls
just means that call simplification won't convert them to intrinsics.
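
For reference, the store-merging transform that remains gated on libcall
availability turns runs of adjacent stores into a memset, roughly
(hypothetical IR):

    store i8 0, i8* %p
    %p1 = getelementptr i8, i8* %p, i64 1
    store i8 0, i8* %p1
    %p2 = getelementptr i8, i8* %p, i64 2
    store i8 0, i8* %p2
    %p3 = getelementptr i8, i8* %p, i64 3
    store i8 0, i8* %p3
    ; => call void @llvm.memset.p0i8.i64(i8* %p, i8 0, i64 4, i1 false)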

This is a weaker variation of D104801, which dropped these checks
entirely. Ideally we would not couple emission of intrinsics to
libcall availability at all, but as the intrinsics may be legalized
to libcalls we need to be a bit careful right now.

Differential Revision: https://reviews.llvm.org/D106769
2021-08-04 21:17:51 +02:00
Artem Belevich
1a43ee65d1 Revert "[MemCpyOpt] Enable memcpy optimizations unconditionally."
This reverts commit 2c98298a7559dfe4a264ef1adaad0921526768cc which breaks
sanitizers.
2021-07-19 14:27:41 -07:00
Artem Belevich
2c98298a75 [MemCpyOpt] Enable memcpy optimizations unconditionally.
The patch does not depend on the availability of the library functions for
memcpy/memset, as it operates on LLVM intrinsics. The optimizations are
useful on targets that have these functions disabled (e.g. NVPTX & AMDGPU).

Differential Revision: https://reviews.llvm.org/D104801
2021-07-19 11:58:02 -07:00
Arthur Eubanks
ab5693aa4a [OpaquePtr] Use byval type more 2021-07-13 09:34:34 -07:00
Jon Roelofs
37b6e03c18 [Intrinsics] Make MemCpyInlineInst a MemCpyInst
This opens up more optimization opportunities in passes that already handle MemCpyInsts.

Differential revision: https://reviews.llvm.org/D105247
2021-07-02 10:25:24 -07:00
Nikita Popov
9aa951e80e [MemCpyOpt] Preserve address space
Preserve address space when generating the cast to i8*.
2021-06-27 20:21:19 +02:00
Nikita Popov
f025053977 [MemCpyOpt] Handle unusual memcpy element type
Apparently, it is legal to use memcpy/memset with pointer types
other than i8*. Prior to 81fcdae68c5ff656c30032fd26c6a21af4c51dbb
this case was silently miscompiled, as the i8 offset calculation
was performed on some other type. Now it would crash due to a
type mismatch. Fix this by inserting an explicit bitcast to i8*.
2021-06-27 16:21:44 +02:00
Nikita Popov
81fcdae68c [MemCpyOpt] Support opaque pointers 2021-06-27 15:52:38 +02:00
Simon Pilgrim
596004a947 MemCpyOptimizer.cpp - hasUndefContentsMSSA - Pass DataLayout by reference. NFCI. 2021-06-08 10:41:02 +01:00
Arthur Eubanks
6b9524a05b [NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved, they won't be invalidated unless their analyses are.

SCEVAAResults was the one exception to this; it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D102032
2021-05-18 13:49:03 -07:00
Hongtao Yu
30bb5be389 [CSSPGO] Unblock optimizations with pseudo probe instrumentation part 2.
As a follow-up to D95982, this patch continues unblocking optimizations that are blocked by pseudo probe instrumentation.

The optimizations unblocked are:
- In-block load propagation.
- In-block dead store elimination.
- Memory copy optimization that turns stores to consecutive memories into a
  memset.

These optimizations are local to a block, so they shouldn't affect the profile quality.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D100075
2021-04-26 16:52:33 -07:00
Olle Fredriksson
f5446b769a [MemCpyOpt] Allow variable lengths in memcpy optimizer
This makes the memcpy-memcpy and memcpy-memset optimizations work for
variable sizes as long as they are equal, relaxing the old restriction
that they are constant integers. If they're not equal, the old
requirement that they are constant integers with certain size
restrictions is used.

The implementation works by pushing the length tests further down in the
code, which reveals some places where it's enough that the lengths are
equal (but not necessarily constant).
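
A sketch of the memcpy-memcpy case with equal, non-constant lengths
(hypothetical IR, assuming nothing writes %a or %b in between):

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 %n, i1 false)
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %c, i8* %b, i64 %n, i1 false)
    ; the second memcpy may now read from %a directly, since both
    ; lengths are the same %n even though %n is not a constant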

Differential Revision: https://reviews.llvm.org/D100870
2021-04-21 23:23:38 +02:00
Liam Keegan
edf9565a86 [MemCpyOpt] Add missing MemorySSAWrapperPass dependency macro
Add MemorySSAWrapperPass as a dependency to MemCpyOptLegacyPass,
since MemCpyOpt now uses MemorySSA by default.

Differential Revision: https://reviews.llvm.org/D98484
2021-03-16 20:30:00 +01:00
Nikita Popov
5556660971 [MemCpyOpt] Handle read from lifetime.start with offset
This fixes a regression from the MemDep-based implementation:
MemDep completely ignores lifetime.start intrinsics that aren't
MustAlias -- this is probably unsound, but it does mean that the
MemDep-based implementation successfully eliminated memcpys from
lifetime.start if the memcpy happens at an offset, rather than at
the base address of the alloca.

Add a special case for when the lifetime.start spans the whole
alloca (which is pretty much the only kind of lifetime.start that
frontends ever emit), as we don't need to figure out the exact
aliasing relationship in that case: the whole alloca is dead prior
to the call.
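
A sketch of the offset case (hypothetical IR):

    %a = alloca [16 x i8]
    %a.i8 = bitcast [16 x i8]* %a to i8*
    call void @llvm.lifetime.start.p0i8(i64 16, i8* %a.i8) ; whole alloca
    %off = getelementptr inbounds i8, i8* %a.i8, i64 8
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %off, i64 8, i1 false)
    ; the copied bytes are undef, so the memcpy is removable even though
    ; %off is not the base address of the alloca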

If this doesn't cover all practically relevant cases, then it
would be possible to make use of the recently added PartialAlias
clobber offsets to make this more precise.
2021-03-13 20:38:09 +01:00
Nikita Popov
2902bdeea1 [MemCpyOpt] Use AA to check for MustAlias between memset and memcpy
Rather than checking for simple equality, check for MustAlias, as
we do in other transforms. This catches equivalent GEPs.
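
For instance (hypothetical IR), these are distinct values but MustAlias:

    %p1 = getelementptr inbounds i8, i8* %base, i64 4
    %p2 = getelementptr inbounds i8, i8* %base, i64 4
    ; simple pointer equality would miss that a memset of %p1 and a
    ; memcpy to %p2 touch the same location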
2021-03-13 11:41:15 +01:00
Nikita Popov
9080444f33 [MemCpyOpt] Don't generate zero-size memset
If a memset destination is overwritten by a memcpy and the sizes
are exactly the same, then the memset is simply dead. We can
directly drop it, instead of replacing it with a memset of zero
size, which is particularly ugly for the case of a dynamic size.
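
A sketch (hypothetical IR):

    call void @llvm.memset.p0i8.i64(i8* %p, i8 0, i64 %n, i1 false)
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %p, i8* %src, i64 %n, i1 false)
    ; the memset is fully overwritten; drop it rather than emit a
    ; memset whose dynamic size works out to zero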
2021-03-13 11:41:15 +01:00
Nikita Popov
4125afc357 [MemCpyOpt] Fix handling of readnone byval arguments
If the call is readnone, then there may not be any MemoryAccess
associated with the call. Bail out in that case.

This fixes the issue reported at
https://reviews.llvm.org/D94376#2578312.
2021-02-22 18:48:31 +01:00
Nikita Popov
71a8e4e7d6 [MemCpyOpt] Enable MemorySSA by default
This enables use of MemorySSA instead of MemDep in MemCpyOpt. To
allow this without significant compile-time impact, the MemCpyOpt
pass is moved directly before DSE (in the cases where this was not
already the case), which allows us to reuse the existing MemorySSA
analysis.

Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt
can also perform simple optimizations across basic blocks.

Differential Revision: https://reviews.llvm.org/D94376
2021-02-19 18:06:25 +01:00
Kazu Hirata
e53472de68 [Transforms] Use llvm::append_range (NFC) 2021-01-20 21:35:54 -08:00
Kazu Hirata
530c5af6a4 [Transforms] Construct SmallVector with iterator ranges (NFC) 2021-01-02 09:24:17 -08:00
Nikita Popov
624af932a8 [MemCpyOpt] Port to MemorySSA
This is a straightforward port of MemCpyOpt to MemorySSA following
the approach of D26739. MemDep queries are replaced with MSSA queries
without changing the overall structure of the pass. Some care has
to be taken to account for differences between these APIs
(MemDep also returns reads, MSSA doesn't).

Differential Revision: https://reviews.llvm.org/D89207
2020-12-01 17:57:41 +01:00
Nikita Popov
3e37543111 [MemCpyOpt] Move GEP during call slot optimization
When performing call slot optimization with a GEP destination, the
transform will currently usually fail, because the GEP is directly
before the memcpy and as such does not dominate the call. We should
move it above the call if that satisfies the domination requirement.

I think that a constant-index GEP is the only useful thing to move
here, as otherwise isDereferenceablePointer couldn't look through
it anyway. As such I'm not trying to generalize this further.
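
A sketch of the pattern (hypothetical IR):

    call void @init(i8* %tmp) ; call slot candidate writing %tmp
    %dst = getelementptr inbounds i8, i8* %p, i64 8 ; constant-index GEP
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %tmp, i64 8, i1 false)
    ; hoisting the GEP above the call makes %dst dominate it, so @init
    ; can be rewritten to write into %dst directly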

Differential Revision: https://reviews.llvm.org/D89623
2020-10-22 20:40:56 +02:00
Nikita Popov
50cc9a0e61 [MemCpyOpt] Extract common function for unwinding check
These two cases should be using the same logic. Not NFC, as this
resolves the TODO regarding use of the underlying object.
2020-10-17 15:30:39 +02:00
Nikita Popov
cd6f40f432 [MemCpyOpt] Add test scaffolding for MSSA based MemCpyOpt
This adds an -enable-memcpyopt-memoryssa option that currently does
nothing apart from requiring MSSA as a dependency. The tests are
split to run both with the option disabled and enabled. I went with
this rather than the separate directory DSE uses, as I found it
convenient to have a direct side-by-side comparison of differences.

Differential Revision: https://reviews.llvm.org/D89206
2020-10-13 21:45:05 +02:00
Nikita Popov
e79ca751fc [MemCpyOpt] Fix MemorySSA preservation
moveUp() moves instructions, so we should move the corresponding
memory accesses as well. We should also move the store instruction
itself: Even though we'll end up removing it later, this gives us
a correct MemoryDef to replace.

The implementation is somewhat more complicated than it should be,
because we also handle the case where P does not have a memory
access due to a degenerate AA pipeline. Hopefully, the need for this
will go away in the future, when the rest of the pass is based on
MSSA.

Differential Revision: https://reviews.llvm.org/D88778
2020-10-13 21:39:09 +02:00
Nikita Popov
baa3b87015 [MemCpyOpt] Don't shorten memset if memcpy operands may be the same
If the memcpy operands are the same (which is allowed since D86815)
then the memcpy is effectively a no-op and the partially overlapping
memset is not dead.
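
A sketch of the case that must not be shortened (hypothetical IR):

    call void @llvm.memset.p0i8.i64(i8* %p, i8 0, i64 16, i1 false)
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %p, i8* %p, i64 8, i1 false)
    ; the memcpy is a no-op, so the first 8 bytes are still zeroed only
    ; by the memset; shrinking it to the last 8 bytes would be wrong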

Differential Revision: https://reviews.llvm.org/D89192
2020-10-13 21:19:19 +02:00
Nikita Popov
39c39e8a7f [MemCpyOpt] Don't shorten memset if destination observable through unwinding
MemCpyOpt can shorten a memset if it is later partially overwritten
by a memcpy. It checks that the destination is not read in between,
but we also need to make sure that the destination cannot be observed
via unwinding.
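
A sketch where shortening would be unsound (hypothetical IR; @maythrow
is an invented function that may unwind):

    call void @llvm.memset.p0i8.i64(i8* %p, i8 0, i64 16, i1 false)
    call void @maythrow() ; if %p escapes, an unwind here observes %p
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %p, i8* %src, i64 8, i1 false)
    ; shortening the memset to the last 8 bytes changes what the
    ; unwinder can see in the first 8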

Differential Revision: https://reviews.llvm.org/D89190
2020-10-13 21:12:19 +02:00
Nikita Popov
5e855f1e80 [MemCpyOpt] Don't hoist store that's not guaranteed to execute
MemCpyOpt can hoist stores while combining load+store pairs into a
memcpy. This hoisting can currently result in stores being executed
that weren't guaranteed to execute in the original program.
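
A sketch (hypothetical IR; @maythrow is an invented function):

    %v = load i32, i32* %src
    call void @maythrow()
    store i32 %v, i32* %dst
    ; hoisting the store above @maythrow() to form a memcpy would execute
    ; it even on paths where the original program unwinds first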

Differential Revision: https://reviews.llvm.org/D89154
2020-10-10 10:26:28 +02:00
Nikita Popov
616f545048 [MemCpyOpt] Use dereferenceable pointer helper
The call slot optimization has some home-grown code for checking
whether the destination is dereferenceable. Replace this with the
generic isDereferenceableAndAlignedPointer() helper.

I'm not checking alignment here, because that is currently handled
separately and may be an enforced alignment for allocas. The clean
way of integrating that part would probably be to accept a callback
in isDereferenceableAndAlignedPointer() for the actual isAligned check,
which would then have a chance to use an enforced alignment instead.

This allows the destination to be a GEP (among other things), though
the two open TODOs may prevent it from working in practice.

Differential Revision: https://reviews.llvm.org/D88805
2020-10-06 18:41:19 +02:00
Nikita Popov
6b441ca523 [MemCpyOpt] Check for throwing calls during call slot optimization
When performing call slot optimization for a non-local destination,
we need to check whether there may be throwing calls between the
call and the copy. Otherwise, the early write to the destination
may be observable by the caller.
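
A sketch of the problematic pattern (hypothetical IR, names invented):

    call void @compute(i8* %tmp) ; writes its result into %tmp
    call void @mayunwind() ; the caller can observe %dst if this unwinds
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %tmp, i64 16, i1 false)
    ; rewriting @compute to store into a non-local %dst directly would
    ; make the early write visible to the caller on unwind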

This was already done for call slot optimization of load/store,
but not for memcpys. For the sake of clarity, I'm moving this check
into the common optimization function, even if that does need an
additional instruction scan for the load/store case.

As efriedma pointed out, this check is not sufficient due to
potential accesses from another thread. This case is left as a TODO.

Differential Revision: https://reviews.llvm.org/D88799
2020-10-06 18:24:40 +02:00
Nikita Popov
80cde02e85 [MemCpyOpt] Add separate statistic for call slot optimization (NFC) 2020-10-06 18:14:10 +02:00
Nikita Popov
fbf818724f [MemCpyOpt] Make moveUp() a member method (NFC)
So we don't have to pass through more parameters in the future.
2020-10-03 11:28:49 +02:00
Nikita Popov
94704ed008 [MemCpyOpt] Add helper to erase instructions (NFC)
In addition to erasing the instruction, we also always want to remove
it from MSSA and MD. Use a common function to do so.

This is a refactoring split out from D26739.
2020-10-02 21:52:10 +02:00
Nikita Popov
87b63c1726 [MemCpyOpt] Avoid double invalidation (NFCI)
The removal of the cpy instruction is left to the caller of
performCallSlotOptzn(), including the invalidation of MD. Both
call-sites already do this.

Also handle incrementation of NumMemCpyInstr consistently at the
call-site. One of the call-sites was already doing this, which
ended up incrementing the statistic twice.

This fix was part of D26739.
2020-10-02 21:50:46 +02:00