llvm-project

Author	SHA1	Message	Date
Peter Collingbourne	75bb30ddbf	Move {load,store}(llvm.protected.field.ptr) lowering to InstCombine. The previous position of llvm.protected.field.ptr lowering for loads and stores was problematic as it not only inhibited optimizations such as DSE (as stores to a llvm.protected.field.ptr were not considered to must-alias stores to the non-protected.field pointer) but also required changes to other optimization passes to avoid transformations that would reduce PFP coverage. Address this by moving the load/store part of the lowering to InstCombine, where it will run earlier than the PFP-breaking and AA-relying transformations. The deactivation symbol, null comparison and EmuPAC parts of the lowering remain in PreISelLowering. Now that the transformation inhibitions are no longer needed, remove them (i.e. partially revert #151649, and revert #182976). This change resulted in a 2.4% reduction in Fleetbench .text size and the following improvements to PFP performance overhead for BM_PROTO_Arena on various microarchitectures: Apple M2 Ultra: 3.5% before, 3.3% after; Google Axion C4A: 3.3% before, 2.9% after; Google Axion N4A: 2.7% before, 2.2% after. Reviewers: fmayer, nikic, vitalybuka Reviewed By: fmayer Pull Request: https://github.com/llvm/llvm-project/pull/186548	2026-04-06 17:47:24 -07:00
Alexis Engelke	01571f1b4a	[CodeGen] Drop uses of BranchInst (#186391) Largely a straightforward replacement with occasional simplifications. For AMDGPU, I assumed that unconditional branches are always uniform and therefore "simplified"/changed AMDGPUAnnotateUniformValues to only annotate conditional branches. Target-specific FastISel only selects conditional branches; unconditional branches are already handled by the non-target-specific code.	2026-03-13 21:51:38 +00:00
Fabian Ritter	f2749f6645	[LowerMemIntrinsics][AMDGPU] Optimize memset.pattern lowering (#185901 ) This patch changes the lowering of the [experimental.memset.pattern intrinsic](https://llvm.org/docs/LangRef.html#llvm-experimental-memset-pattern-intrinsic) to match the optimized memset and memcpy lowering when possible. (The tl;dr of memset.pattern is that it is like memset, except that you can use it to set values that are wider than a single byte.) The memset.pattern lowering now queries `TTI::getMemcpyLoopLoweringType` for a preferred memory access type. If the size of that type is a multiple of the set value's type, and if both types have consistent store and alloc sizes (since memset.pattern behaves in a way that is not well suitable for access widening if store and alloc size differ), the memset.pattern is lowered into two loops: a main loop that stores a sufficiently wide vector splat of the SetValue with the preferred memory access type and a residual loop that covers the remaining set values individually. In contrast to the memset lowering, this patch doesn't include a specialized lowering for residual loops with known constant lengths. Loops that are statically known to be unreachable will not be emitted. For backends that don't override `TTI::getMemcpyLoopLoweringType`, the generated code is mostly unchanged except for more consistent basic block names, no more `br i1 false` for memset.patterns with known size, and a flipped loop condition for memset.patterns with known size (see test changes). This is a follow-up to a similar patch for memset: #169040	2026-03-13 10:37:33 +01:00
Peter Collingbourne	b703f63697	Add llvm.looptrap intrinsic. The '``llvm.looptrap``' intrinsic is equivalent to ``llvm.cond.loop(true)``, but is also considered to be ``noreturn``, which enables certain optimizations by allowing the optimizer to assume that a branch leading to a call to this intrinsic was not taken. A late optimization pass will convert this intrinsic to either ``llvm.cond.loop(true)`` or ``llvm.cond.loop(pred)``, where ``pred`` is a predicate for a conditional branch leading to the intrinsic call, if possible. Reviewers: fmayer, vitalybuka Pull Request: https://github.com/llvm/llvm-project/pull/181299	2026-02-13 14:12:51 -08:00
Aiden Grossman	4d5d2ffd3e	[ProfCheck] Add prof data for lowering of @llvm.cond.loop When there is no target-specific lowering of @llvm.cond.loop, it is lowered into a simple loop by PreISelIntrinsicLowering. Mark the branch weights into the no-return loop as unknown given we do not have value metadata to fix the profcheck test for this feature. Reviewers: mtrofin, alanzhao1, snehasish, pcc Pull Request: https://github.com/llvm/llvm-project/pull/180390	2026-02-08 10:16:58 -08:00
Peter Collingbourne	191af6c254	Add llvm.cond.loop intrinsic. The llvm.cond.loop intrinsic is semantically equivalent to a conditional branch conditioned on ``pred`` to a basic block consisting only of an unconditional branch to itself. Unlike such a branch, it is guaranteed to use specific instructions. This allows an interrupt handler or other introspection mechanism to straightforwardly detect whether the program is currently spinning in the infinite loop and possibly terminate the program if so. The intent is that this intrinsic may be used as a more efficient alternative to a conditional branch to a call to ``llvm.trap`` in circumstances where the loop detection is guaranteed to be present. This construct has been experimentally determined to be executed more efficiently (when the branch is not taken) than a conditional branch to a trap instruction on AMD and older Intel microarchitectures, and is also more code size efficient by avoiding the need to emit a trap instruction and possibly a long branch instruction. On i386 and x86_64, the infinite loop is guaranteed to consist of a short conditional branch instruction that branches to itself. Specifically, the first byte of the instruction will be between 0x70 and 0x7F, and the second byte will be 0xFE. Part of this RFC: https://discourse.llvm.org/t/rfc-optimizing-conditional-traps/89456 Reviewers: arsenm, RKSimon, fmayer, vitalybuka Pull Request: https://github.com/llvm/llvm-project/pull/177686	2026-02-06 17:11:15 -08:00
Fabian Ritter	d24a6754ce	[LowerMemIntrinsics] Optimize memset lowering (#169040 ) This patch changes the memset lowering to match the optimized memcpy lowering. The memset lowering now queries TTI.getMemcpyLoopLoweringType for a preferred memory access type. If that type is larger than a byte, the memset is lowered into two loops: a main loop that stores a sufficiently wide vector splat of the SetValue with the preferred memory access type and a residual loop that covers the remaining bytes individually. If the memset size is statically known, the residual loop is replaced by a sequence of stores. This improves memset performance on gfx1030 (AMDGPU) in microbenchmarks by around 7-20x. I'm planning similar treatment for memset.pattern as a follow-up PR. For SWDEV-543208.	2026-02-04 13:35:13 +01:00
Matt Arsenault	539412914a	GlobalISel: Use LibcallLoweringInfo analysis in legalizer (#170328 ) This is mostly boilerplate to move various freestanding utility functions into LegalizerHelper. LibcallLoweringInfo is currently optional, mostly because threading it through assorted other uses of LegalizerHelper is more difficult. I had a lot of trouble getting this to work in the legacy pass manager with setRequiresCodeGenSCCOrder, and am not happy with the result. A sub-pass manager is introduced and this is invalidated, so we're re-computing this unnecessarily.	2026-01-16 14:42:10 +01:00
Peter Collingbourne	4afc2562fb	Add llvm.protected.field.ptr intrinsic and pre-ISel lowering. This intrinsic is used to implement pointer field protection. For more information, see the included LangRef update and the RFC: https://discourse.llvm.org/t/rfc-structure-protection-a-family-of-uaf-mitigation-techniques/85555 Reviewers: nikic, fmayer, ahmedbougacha Reviewed By: nikic, fmayer Pull Request: https://github.com/llvm/llvm-project/pull/151647	2025-12-03 17:40:25 -08:00
Matt Arsenault	04c81a9973	CodeGen: Add LibcallLoweringInfo analysis pass (#168622 ) The libcall lowering decisions should be program dependent, depending on the current module's RuntimeLibcallInfo. We need another related analysis derived from that plus the current function's subtarget to provide concrete lowering decisions. This takes on a somewhat unusual form. It's a Module analysis, with a lookup keyed on the subtarget. This is a separate module analysis from RuntimeLibraryAnalysis to avoid that depending on codegen. It's not a function pass to avoid depending on any particular function, to avoid repeated subtarget map lookups in most of the use passes, and to avoid any recomputation in the common case of one subtarget (and keeps it reusable across repeated compilations). This also switches ExpandFp and PreISelIntrinsicLowering as a sample function and module pass. Note this is not yet wired up to SelectionDAG, which is still using the LibcallLoweringInfo constructed inside of TargetLowering.	2025-12-03 22:00:12 +01:00
Matt Arsenault	ad8f6b44be	DAG: Avoid some libcall string name comparisons (#166321 ) Move to the libcall impl based functions.	2025-11-05 07:09:02 -08:00
paperchalice	f3df058b03	[Passes] Report error when pass requires target machine (#142550) Fixes #142146 Do a nullptr check when a pass accepts `const TargetMachine &` in its constructor; the check is still not exhaustive.	2025-10-23 12:57:03 +08:00
Daniel Paoliello	f99b0f3de4	[NFC] RuntimeLibcalls: Prefix the impls with 'Impl_' (#153850 ) As noted in #153256, TableGen is generating reserved names for RuntimeLibcalls, which resulted in a build failure for Arm64EC since `vcruntime.h` defines `__security_check_cookie` as a macro. To avoid using reserved names, all impl names will now be prefixed with `Impl_`. `NumLibcallImpls` was lifted out as a `constexpr size_t` instead of being an enum field. While I was churning the dependent code, I also removed the TODO to move the impl enum into its own namespace and use an `enum class`: I experimented with using an `enum class` and adding a namespace, but we decided it was too verbose so it was dropped.	2025-09-02 09:57:33 -07:00
Matt Arsenault	3e5d8a1439	Reapply "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864 ) This reverts commit 334e9bf2dd01fbbfe785624c0de477b725cde6f2. Check if llvm-nm exists before building the benchmark.	2025-08-16 09:53:50 +09:00
gulfemsavrun	334e9bf2dd	Revert "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864 ) …210)" This reverts commit 9a14b1d254a43dc0d4445c3ffa3d393bca007ba3. Revert "RuntimeLibcalls: Return StringRef for libcall names (#153209)" This reverts commit cb1228fbd535b8f9fe78505a15292b0ba23b17de. Revert "TableGen: Emit statically generated hash table for runtime libcalls (#150192)" This reverts commit 769a9058c8d04fc920994f6a5bbb03c8a4fbcd05. Reverted three changes because of a CMake error while building llvm-nm as reported in the following PR: https://github.com/llvm/llvm-project/pull/150192#issuecomment-3192223073	2025-08-15 13:32:27 -07:00
Matt Arsenault	cb1228fbd5	RuntimeLibcalls: Return StringRef for libcall names (#153209 ) Does not yet fully propagate this down into the TargetLowering uses, many of which are relying on null checks on the returned value.	2025-08-15 09:55:39 +09:00
Stephen Long	19ada02086	PreISelIntrinsicLowering: Lower llvm.log to a loop if scalable vec arg (#129744 ) Similar to ab976a1, but for llvm.log.	2025-08-12 01:04:28 +09:00
Matt Arsenault	14b2d2cc3a	RuntimeLibcalls: Add entries for objc runtime calls (#147920 ) Stop emitting these calls by name in PreISelIntrinsicLowering. This is still kind of a hack. We should be going through the abstract RTLIB:Libcall, and then checking if the call is really supported in this module. Do this as a placeholder until RuntimeLibcalls is a module analysis.	2025-07-11 07:45:12 +09:00
Alex Bradbury	9f7567d33a	[PreISelIntrinsicLowering] Reuse previously generated GlobalVariable for memset_pattern16 when possible (#144677 ) As Constants are already uniquified, we can use a map to keep track of whether a GlobalVariable was produced for a given Constant or not. Repeated globals with the same value was one of the codegen differences noted in #126736. This patch removes that diff, producing cleaner output.	2025-06-23 16:35:48 +01:00
Matt Arsenault	a65e0edd6a	PowerPC: Stop reporting memcpy as an alias of memmove on AIX (#143836 ) Instead of reporting ___memmove as an implementation of memcpy, make it unavailable and let the lowering logic consider memmove as a fallback path. This avoids a special case 1:N mapping for libcall implementations.	2025-06-23 22:15:37 +09:00
Marina Taylor	4b794c8aff	[ObjC] Support objc_claimAutoreleasedReturnValue (#139720 ) This adds basic support for objc_claimAutoreleasedReturnValue, which is mostly equivalent to objc_retainAutoreleasedReturnValue, with the difference that it doesn't require the marker nop to be emitted between it and the call it was attached to. To achieve that, this also teaches the AArch64 attachedcall bundle lowering to pick whether the marker should be emitted or not based on whether the attachedcall target is claimARV or retainARV. Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>	2025-05-13 16:04:49 +01:00
Philip Reames	c0a264e6a9	[IntrinsicInst] Remove MemCpyInlineInst and MemSetInlineInst [nfc] (#138568 ) I'm looking for ways to simplify the Mem*Inst class structure, and these two seem to have fairly minimal justification, so let's remove them.	2025-05-05 14:07:31 -07:00
Alex Bradbury	6db2594c48	[PreISelIntrinsicLowering] Zext/trunc count parameter as necessary for memset_pattern16 emission (#129239) This patch cleans up the handling of the count parameter in general, though it was initially motivated by a compiler crash on a memset.pattern with a narrow count, caused by mismatched types for CreateMul when converting the count to the number of bytes. The logic used to name globals means there is some minor renaming churn in the output of test/Transforms/PreISelIntrinsicLowering/X86/memset-pattern.ll that is irrelevant to the newly added tests (which would crash before).	2025-03-19 11:16:24 +00:00
Alex Bradbury	059ada405c	[PreISelintrinsicLowering] getTypeSizeInBits/8 => getTypeAllocSize in memset.pattern lowering As noted during review of #129329.	2025-03-12 12:18:03 +00:00
Alex Bradbury	8fcb1263f4	[PreISelIntrinsicLowering] Produce a memset_pattern16 libcall for llvm.experimental.memset.pattern when available (#120420 ) This is to enable a transition of LoopIdiomRecognize to selecting the llvm.experimental.memset.pattern intrinsic as requested in #118632 (as opposed to supporting selection of the libcall or the intrinsic). As such, although it _is_ a TODO to add costing considerations on whether to lower to the libcall (when available) or expand directly, lacking such logic is helpful at this stage in order to minimise any unexpected code gen changes in this transition.	2025-01-30 07:12:53 +00:00
Stephen Long	ab976a1712	PreISelIntrinsicLowering: Lower llvm.exp/llvm.exp2 to a loop if scalable vec arg (#117568 )	2025-01-24 14:02:06 -05:00
Alex Bradbury	298127dcbe	Reapply [IR] Initial introduction of llvm.experimental.memset_pattern (#97583 ) Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the test case. Supersedes the draft PR #94992, taking a different approach following feedback: * Lower in PreISelIntrinsicLowering * Don't require that the number of bytes to set is a compile-time constant * Define llvm.memset_pattern rather than llvm.memset_pattern.inline As discussed in the [RFC thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496), the intent is that the intrinsic will be lowered to loops, a sequence of stores, or libcalls depending on the expected cost and availability of libcalls on the target. Right now, there's just a single lowering path that aims to handle all cases. My intent would be to follow up with additional PRs that add additional optimisations when possible (e.g. when libcalls are available, when arguments are known to be constant etc).	2024-11-15 15:21:39 +00:00
Alex Bradbury	0fb8fac5d6	Revert "[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)" This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3. Recent scheduling changes mean tests need to be re-generated. Reverting to green while I do that.	2024-11-15 14:48:32 +00:00
Alex Bradbury	7ff3a9acd8	[IR] Initial introduction of llvm.experimental.memset_pattern (#97583 ) Supersedes the draft PR #94992, taking a different approach following feedback: * Lower in PreISelIntrinsicLowering * Don't require that the number of bytes to set is a compile-time constant * Define llvm.memset_pattern rather than llvm.memset_pattern.inline As discussed in the [RFC thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496), the intent is that the intrinsic will be lowered to loops, a sequence of stores, or libcalls depending on the expected cost and availability of libcalls on the target. Right now, there's just a single lowering path that aims to handle all cases. My intent would be to follow up with additional PRs that add additional optimisations when possible (e.g. when libcalls are available, when arguments are known to be constant etc).	2024-11-15 14:07:46 +00:00
Roger Ferrer Ibáñez	e1a16cd88d	[ExpandVectorPredication] Be more precise reporting changes (#102313 ) This is used by PreISelIntrinsicLowering. With this change, PreISelIntrinsicLowering does not have to assume that there were changes just because we encountered a VP intrinsic.	2024-08-09 08:01:42 +02:00
Alexis Engelke	5313d2e6d0	[CodeGen] Fix lower constant intrinsics for dead code (#102442) lowerConstantIntrinsics does an RPO traversal, which doesn't reach dead blocks. Remove the assertion that all intrinsics are lowered, because some intrinsics might remain.	2024-08-08 13:06:33 +02:00
Alexis Engelke	85bf0a6b44	[CodeGen] Fix PreISelLowering not reporting changes (#102184 ) expandVectorPredication may change code, even if the intrinsic itself remains in the code. Report changes whenever such an intrinsic is encountered, because code could have been changed. Another follow-up fix for #101652 to fix expensive-checks-only failure.	2024-08-06 19:30:42 +02:00
Alexis Engelke	a4837fe3c1	[CodeGen] Allow PreISel lowering to run without TM (#102150 ) Fixes #101652 after build bot failures where TM in the opt pass builder is nullptr.	2024-08-06 16:21:56 +02:00
Alexis Engelke	fa92d51f9e	[VP] Merge ExpandVP pass into PreISelIntrinsicLowering (#101652 ) Similar to #97727; avoid an extra pass over the entire IR by performing the lowering as part of the pre-isel-intrinsic-lowering pass.	2024-08-06 09:27:59 +02:00
Alexis Engelke	b5fc083dc3	[CodeGen] Merge lowerConstantIntrinsics into pre-isel lowering (#97727 ) Currently, the LowerConstantIntrinsics pass does an RPO traversal of every function... only to find that many functions don't have constant intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is already a pre-isel intrinsic lowering pass, which iterates over intrinsic declarations and lowers all users. Call lowerConstantIntrinsics from this pass to avoid the extra iteration over the entire IR and the RPO traversal.	2024-08-01 17:44:32 +02:00
Alex Bradbury	fdf94e1632	Reapply "[Intrinsics][PreISelInstrinsicLowering] llvm.memcpy.inline length no longer needs to be constant (#98281 )" This reverts commit ac4b6b662630cd4d3bf6929f2b39ea203c0054a1. A test change was missing for mlir/test/Target/LLVMIR/llvmir-intrinsics.mlir in the initial commit.	2024-07-16 14:48:59 +01:00
Alex Bradbury	ac4b6b6626	Revert "[Intrinsics][PreISelInstrinsicLowering] llvm.memcpy.inline length no longer needs to be constant (#98281 )" This reverts commit 522fd53838d577add8c19b5eccccae756fd27899 while unexpected mlir failures are investigated and resolved.	2024-07-16 14:31:14 +01:00
Alex Bradbury	522fd53838	[Intrinsics][PreISelInstrinsicLowering] llvm.memcpy.inline length no longer needs to be constant (#98281 ) Following on from the discussion in https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496 and the equivalent change for llvm.memset.inline (#95397), this removes the requirement that the length of llvm.memcpy.inline is constant. PreISelInstrinsicLowering will expand llvm.memcpy.inline with non-constant lengths, while the codegen path for constant lengths is left unaltered.	2024-07-16 14:13:13 +01:00
Alex Bradbury	4d052a7618	[Intrinsics][PreISelIntrinsicLowering] llvm.memset.inline length no longer needs to be constant (#95397) As requested in https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496 this patch removes the requirement that the length of llvm.memset.inline is a constant, and adjusts PreISelIntrinsicLowering so it supports expanding the intrinsic when it has a non-constant length.	2024-07-10 07:58:52 +01:00
Stephen Tozer	d75f9dd1d2	Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497 )" Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.	2024-06-24 18:00:22 +01:00
Stephen Tozer	6481dc5761	[IR][NFC] Update IRBuilder to use InsertPosition (#96497 ) Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.	2024-06-24 17:27:43 +01:00
Nikita Popov	6c2fbc3a68	[IRBuilder] Add CreatePtrAdd() method (NFC) (#77582 ) This abstracts over the common pattern of creating a gep with i8 element type.	2024-01-12 14:21:21 +01:00
Jon Roelofs	fa71f9e87a	Reland "[Intrinsics][ObjC] Mark objc_retain and friends as thisreturn." This reverts commit cb62f67088aaf79493350547f74870318b71acc5. Fixes: https://github.com/llvm/llvm-project/issues/69658	2023-11-06 11:10:59 -08:00
Jon Roelofs	d9ccacee13	Revert "Reland "[Intrinsics][ObjC] Mark objc_retain and friends as thisreturn."" This reverts commit 30414fc614d80a45bad4c89763a353f50d3e04d6. Broke some buildbots.	2023-11-06 10:04:22 -08:00
Jon Roelofs	30414fc614	Reland "[Intrinsics][ObjC] Mark objc_retain and friends as thisreturn." This reverts commit cb62f67088aaf79493350547f74870318b71acc5. Fixes: https://github.com/llvm/llvm-project/issues/69658	2023-11-06 08:47:05 -08:00
Jon Roelofs	cb62f67088	Revert "[Intrinsics][ObjC] Mark objc_retain and friends as thisreturn." This reverts commit ed83797f3cbfc8fb2a1af63542f97d7ec1d5505a. Reverting pending the investigation of https://github.com/llvm/llvm-project/issues/69658	2023-10-20 09:22:12 -07:00
Nikita Popov	66bb752162	[PreISelIntrinsicLowering] Use TLI for correct function We should query the subtarget of the calling function, not of the intrinsic. This probably makes no functional difference (as libcalls are unlikely to vary across subtargets), but fixes minor compile-time regressions from unnecessary subtarget instantiations. Followup to D157567. Differential Revision: https://reviews.llvm.org/D157848	2023-08-16 10:02:18 +02:00
Matt Arsenault	c8cac15613	PreISelIntrinsicLowering: Check RuntimeLibcalls instead of TLI for memory functions We need a better mechanism for expressing which calls you are allowed to emit and which calls are recognized. This should be applied to the 17 branch.	2023-08-10 16:40:04 -04:00
Bjorn Pettersson	4ce7c4a92a	[llvm] Drop some typed pointer handling/bitcasts Differential Revision: https://reviews.llvm.org/D157016	2023-08-03 22:54:33 +02:00
Jon Roelofs	ed83797f3c	[Intrinsics][ObjC] Mark objc_retain and friends as thisreturn. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain rdar://79869679 Differential revision: https://reviews.llvm.org/D105671	2023-08-01 18:02:00 -07:00
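
Commit 191af6c254 above states that on i386 and x86_64 the `llvm.cond.loop` infinite loop is a short conditional branch to itself, with a first byte between 0x70 and 0x7F and a second byte of 0xFE. As a rough illustration of how an introspection mechanism might test for that pattern, here is a minimal Python sketch. The function name `is_cond_loop_spin` and the idea of passing a raw byte buffer plus an instruction offset are assumptions made for this example, not part of LLVM.

```python
def is_cond_loop_spin(code: bytes, ip: int) -> bool:
    """Return True if the two bytes at offset `ip` match the pattern
    described in the commit message: a short conditional branch
    (first byte 0x70..0x7F) whose second byte is 0xFE, i.e. a
    displacement of -2 that targets the branch itself.

    `code` and `ip` are hypothetical inputs for this sketch."""
    # Both bytes of the two-byte instruction must lie inside the buffer.
    if ip < 0 or ip + 1 >= len(code):
        return False
    opcode, rel8 = code[ip], code[ip + 1]
    return 0x70 <= opcode <= 0x7F and rel8 == 0xFE
```

For instance, the bytes `75 FE` match, while `EB FE` (an unconditional short jump to itself) does not, because its first byte falls outside the 0x70 to 0x7F range.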


79 Commits
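
Commit d24a6754ce in the log describes lowering memset into a main loop of wide stores followed by a residual loop that covers the remaining bytes individually. The following Python sketch models only that split on a byte buffer. It is not the LowerMemIntrinsics implementation: `lower_memset` is a hypothetical name, and `access_width` is an assumed stand-in for the size of the type that `TTI.getMemcpyLoopLoweringType` would report.

```python
def lower_memset(buf: bytearray, value: int, length: int,
                 access_width: int) -> tuple[int, int]:
    """Fill buf[:length] with `value` using a main loop of wide stores
    and a byte-wise residual loop. Returns (wide_stores, byte_stores).

    `access_width` (in bytes) is an assumption of this sketch; it
    models the preferred memory access size chosen by the target."""
    byte = value & 0xFF
    splat = bytes([byte]) * access_width  # the set value repeated to the access width

    # Main loop: one wide store per full access_width-sized chunk.
    main_iters = length // access_width
    for i in range(main_iters):
        start = i * access_width
        buf[start:start + access_width] = splat

    # Residual loop: the remaining bytes are stored one at a time.
    residual_start = main_iters * access_width
    for offset in range(residual_start, length):
        buf[offset] = byte

    return main_iters, length - residual_start
```

With a length of 37 bytes and an access width of 16, the sketch performs two wide stores and five single-byte stores.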