llvm-project

Author	SHA1	Message	Date
Rahul Joshi	74b7abf154	[IRBuilder] Add new overload for CreateIntrinsic (#131942 ) Add a new `CreateIntrinsic` overload with no `Types`, useful for creating calls to non-overloaded intrinsics that don't need additional mangling.	2025-03-31 08:10:34 -07:00
Alex MacLean	10fd5b925f	[NVPTX] Auto-Upgrade !"align" metadata on return values to stackalign (#131726 ) This commit follows up 0191307b by auto-upgrading !"align" metadata on return values to stackalign. This allows us to remove all logic to check the metadata from NVPTXUtilities.	2025-03-24 12:00:44 -07:00
Alex MacLean	30ff508614	[NVPTX] Auto-Upgrade llvm.nvvm.swap.lo.hi.b64 to llvm.fshl (#132098 ) After 3c8c2914e067e132af951f70d2b3577fe049e19a the lowering of 64-bit funnel shifts has been improved to the point where this intrinsic is no longer needed.	2025-03-20 19:21:24 -07:00
AdityaK	b17af9d8ee	[NFC][llvm/IR] comparison of unsigned expression in ‘>= 0’ is always true (#130843 )	2025-03-14 11:59:29 -07:00
Frederik Harwath	c979ce7e36	Add IRBuilder::CreateFMA (#131112 ) This commit adds a function for creating fma intrinsic calls to the IRBuilder. If the "IsFPConstrained" flag of the builder is set, the function creates a call to "experimental.constrained.fma" instead of "llvm.fma" . To support the creation of the constrained intrinsic, a function "CreateConstrainedFPIntrinsic" is introduced.	2025-03-14 13:20:58 +01:00
Alex MacLean	6c2e170d04	[NVPTX] Convert vector function nvvm.annotations to attributes (#127736 ) Replace some more nvvm.annotations with function attributes, auto-upgrading the annotations as needed. These new attributes will be more idiomatic and compile-time efficient than the annotations. - !"maxntid[xyz]" -> "nvvm.maxntid" - !"reqntid[xyz]" -> "nvvm.reqntid" - !"cluster_dim_[xyz]" -> "nvvm.cluster_dim"	2025-02-26 08:45:27 -08:00
Alex MacLean	a282b6c486	[NVPTX] Convert scalar function nvvm.annotations to attributes (#125908 ) Replace some more nvvm.annotations with function attributes, auto-upgrading the annotations as needed. These new attributes will be more idiomatic and compile-time efficient than the annotations. - !"maxclusterrank" / !"cluster_max_blocks" -> "nvvm.maxclusterrank" - !"minctasm" -> "nvvm.minctasm" - !"maxnreg" -> "nvvm.maxnreg"	2025-02-12 07:33:22 -08:00
Alex MacLean	de7438e472	[NVPTX] Auto-Upgrade some nvvm.annotations to attributes (#119261 ) Add a new AutoUpgrade function to convert some legacy nvvm.annotations metadata to function level attributes. These attributes are quicker to look-up so improve compile time and are more idiomatic than using metadata which should not include required information that changes the meaning of the program. Currently supported annotations are: - !"kernel" -> ptx_kernel calling convention - !"align" -> alignstack parameter attributes (return not yet supported)	2025-01-29 16:27:27 -08:00
David Green	547bfda56b	[AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (#120363 ) This started out as trying to combine bf16 fpround to BFCVT2 instructions, but ended up removing the aarch64.neon.bfcvt intrinsics in favour of generating fpround instructions directly. This simplifies the patterns and can lead to other optimizations. The BFCVT2 instruction is adjusted to make sure the types are valid, and a bfcvt2 is now generated in more places. The old intrinsics are auto-upgraded to fptrunc instructions too.	2025-01-21 09:16:04 +00:00
Nikita Popov	0b1ae8963e	[AutoUpgrade] Avoid unnecessary pointer bitcasts (NFCI) Not needed with opaque pointers.	2025-01-20 09:55:35 +01:00
Dan Gohman	c5ab70c508	[WebAssembly] Add `-i128:128` to the `datalayout` string. (#119204 ) Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM `datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is currently using the defaults for both, so it's inconsistent. Fix this by adding `-i128:128` to Wasm's `datalayout` string so that it aligns `i128` to 16 bytes too. This is similar to [llvm/llvm-project@dbad963](`dbad963a69`) for SPARC. This fixes rust-lang/rust#133991; see that issue for further discussion. [defaults to aligning `__int128_t` to 16 bytes]: `f8b4182f07/clang/lib/Basic/TargetInfo.cpp (L77)` [default to aligning `i128` to 8 bytes]: https://llvm.org/docs/LangRef.html#langref-datalayout	2024-12-10 09:21:58 -08:00
Lei Huang	a13ec9cd54	[PowerPC] Update data layout aligment of i128 to 16 (#118004 ) Fix 64-bit PowerPC part of https://github.com/llvm/llvm-project/issues/102783.	2024-12-09 18:02:24 -05:00
Graham Hunter	ed5aaddd7b	[IR] Vector extract last active element intrinsic (#113587 ) As discussed in #112738, it may be better to have an intrinsic to represent vector element extracts based on mask bits. This intrinsic is for the case of extracting the last active element, if any, or a default value if the mask is all-false. The target-agnostic SelectionDAG lowering is similar to the IR in #106560.	2024-11-14 17:48:43 +00:00
yingopq	86e4beb702	[MIPS] LLVM data layout give i128 an alignment of 16 for mips64 (#112084 ) Fix parts of #102783.	2024-11-06 16:14:30 +01:00
Alex MacLean	fb33af08e4	[NVPTX] Remove nvvm.ldg.global.* intrinsics (#112834 ) Remove these intrinsics which can be better represented by load instructions with `!invariant.load` metadata: - llvm.nvvm.ldg.global.i - llvm.nvvm.ldg.global.f - llvm.nvvm.ldg.global.p	2024-10-27 16:14:13 -07:00
goldsteinn	c85611e858	[SimplifyLibCall][Attribute] Fix bug where we may keep `range` attr with incompatible type (#112649 ) In a variety of places we change the bitwidth of a parameter but don't update the attributes. The issue in this case is from the `range` attribute when inlining `__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an `i8`, and if the `i32` had a `range` attr associated with it, this will cause an error. Fixes #112633	2024-10-17 10:32:55 -05:00
Jay Foad	85c17e4092	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706 ) Convert many instances of: Fn = Intrinsic::getOrInsertDeclaration(...); CreateCall(Fn, ...) to the equivalent CreateIntrinsic call.	2024-10-17 16:20:43 +01:00
Jay Foad	d9c95efb6c	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112546 ) Convert almost every instance of: CreateCall(Intrinsic::getOrInsertDeclaration(...), ...) to the equivalent CreateIntrinsic call.	2024-10-16 15:43:30 +01:00
Daniel Paoliello	c9f27275c1	[clang][aarch64] Add support for the MSVC qualifiers __ptr32, __ptr64, __sptr, __uptr for AArch64 (#111879 ) MSVC has a set of qualifiers to allow using 32-bit signed/unsigned pointers when building 64-bit targets. This is useful for WoW code (i.e., the part of Windows that handles running 32-bit application on a 64-bit OS). Currently this is supported on x64 using the 270, 271 and 272 address spaces, but does not work for AArch64 at all. This change adds the same 270, 271 and 272 address spaces to AArch64 and adjusts the data layout string accordingly. Clang will generate the correct address space casts, but these will currently be ignored until the AArch64 backend is updated to handle them. Partially fixes #62536 This is a resurrected version of <https://reviews.llvm.org/D158857> (originally created by @a_vorobev) - I've cleaned it up a little, fixed the rest of the tests and added to auto-upgrade for the data layout.	2024-10-15 10:37:36 -07:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Matt Arsenault	c198f775cd	AMDGPU: Remove flat/global fmin/fmax intrinsics (#105642 ) These have been replaced with atomicrmw	2024-10-09 09:27:28 +04:00
Matt Arsenault	9dca83f2e1	AMDGPU: Add noalias.addrspace metadata when autoupgrading atomic intrinsics (#102599 ) This will be needed to continue generating the raw instruction in the flat case.	2024-10-08 00:13:28 +04:00
Paul Walker	d283705829	[AArch64][SVE] Fix definition of bfloat fcvt intrinsics. (#110281 ) Affected intrinsics: llvm.aarch64.sve.fcvt.bf16f32 llvm.aarch64.sve.fcvtnt.bf16f32 The named intrinsics took a predicate based on the smallest element type when it should be based on the largest. The intrinsics have been replaced by v2 equivalents and affected code ported to use them. Patch includes changes to getSVEPredicateBitCast() that ensure the generated code for the auto-upgraded old intrinsics is unchanged.	2024-10-03 12:36:01 +01:00
Koakuma	076392b0aa	[SPARC] Fix regression from UpgradeDataLayoutString change (#110608 ) It turns out that we cannot rely on the presence of `-i64:64` as a position reference when adding the `-i128:128` datalayout string due to some custom datalayout strings lacking it (e.g ones used by bugpoint, among other things). Do not add the `-i128:128` string in that case. This fixes the regression introduced in https://github.com/llvm/llvm-project/pull/106951.	2024-10-03 05:20:56 +07:00
Koakuma	dbad963a69	[SPARC] Align i128 to 16 bytes in SPARC datalayouts (#106951 ) Align i128s to 16 bytes, following the example at https://reviews.llvm.org/D86310. clang already does this implicitly, but do it in backend code too for the benefit of other frontends (see e.g https://github.com/llvm/llvm-project/issues/102783 & https://github.com/rust-lang/rust/issues/128950).	2024-09-30 08:32:33 +07:00
Alex MacLean	e7621f4199	Reland "[NVVM] Upgrade nvvm.ptr.* intrinics to addrspace cast" (#110262 ) Remove the following intrinsics which can be trivially replaced with an `addrspacecast` * llvm.nvvm.ptr.gen.to.global * llvm.nvvm.ptr.gen.to.shared * llvm.nvvm.ptr.gen.to.constant * llvm.nvvm.ptr.gen.to.local * llvm.nvvm.ptr.global.to.gen * llvm.nvvm.ptr.shared.to.gen * llvm.nvvm.ptr.constant.to.gen * llvm.nvvm.ptr.local.to.gen Also, cleanup the NVPTX lowering of `addrspacecast` making it more concise. This was reverted to avoid conflicts while reverting #107655. Re-landing unchanged.	2024-09-28 14:13:17 -07:00
Alex MacLean	a131fbf168	Reland "[NVPTX] deprecate nvvm.rotate.* intrinsics, cleanup funnel-shift handling" (#110025 ) This change deprecates the following intrinsics which can be trivially converted to llvm funnel-shift intrinsics: - @llvm.nvvm.rotate.b32 - @llvm.nvvm.rotate.right.b64 - @llvm.nvvm.rotate.b64 This fixes a bug in the previous version (#107655) which flipped the order of the operands to the PTX funnel shift instruction. In LLVM IR the high bits are the first arg and the low bits are the second arg, while in PTX this is reversed.	2024-09-27 05:23:08 -07:00
Dmitry Chernenkov	4cb61c20ef	Revert "[NVPTX] deprecate nvvm.rotate.* intrinsics, cleanup funnel-shift handling (#107655 )" This reverts commit 9ac00b85e05d21be658d6aa0c91cbe05bb5dbde2.	2024-09-25 14:50:26 +00:00
Dmitry Chernenkov	9a0e281e8c	Revert "[NVVM] Upgrade nvvm.ptr.* intrinics to addrspace cast (#109710 )" This reverts commit 36757613b73908f055674a8df0b51cc00aa04373.	2024-09-25 14:50:26 +00:00
Alex MacLean	36757613b7	[NVVM] Upgrade nvvm.ptr.* intrinics to addrspace cast (#109710 ) Remove the following intrinsics which can be trivially replaced with an `addrspacecast` * llvm.nvvm.ptr.gen.to.global * llvm.nvvm.ptr.gen.to.shared * llvm.nvvm.ptr.gen.to.constant * llvm.nvvm.ptr.gen.to.local * llvm.nvvm.ptr.global.to.gen * llvm.nvvm.ptr.shared.to.gen * llvm.nvvm.ptr.constant.to.gen * llvm.nvvm.ptr.local.to.gen Also, cleanup the NVPTX lowering of `addrspacecast` making it more concise.	2024-09-24 08:15:14 -07:00
Alex MacLean	9ac00b85e0	[NVPTX] deprecate nvvm.rotate.* intrinsics, cleanup funnel-shift handling (#107655 ) This change deprecates the following intrinsics which can be trivially converted to llvm funnel-shift intrinsics: - @llvm.nvvm.rotate.b32 - @llvm.nvvm.rotate.right.b64 - @llvm.nvvm.rotate.b64	2024-09-23 14:58:52 -07:00
Alex MacLean	8be6b108fb	[NVPTX] Remove nvvm.bitcast.* intrinsics (#107936 ) Remove the following intrinsics which correspond directly to a bitcast: - llvm.nvvm.bitcast.f2i - llvm.nvvm.bitcast.i2f - llvm.nvvm.bitcast.d2ll - llvm.nvvm.bitcast.ll2d	2024-09-23 11:24:07 -07:00
Nikita Popov	5dcea4628d	[AutoUpgrade] Preserve attributes when upgrading named struct return For example, if the argument has an alignment attribute, preserve it.	2024-09-02 12:42:52 +02:00
Maciej Gabka	95d2d1cba0	Move stepvector intrinsic out of experimental namespace (#98043 ) This patch is moving out stepvector intrinsic from the experimental namespace. This intrinsic exists in LLVM for several years now, and is widely used.	2024-08-28 12:48:20 +01:00
Matt Arsenault	ee08d9cba5	AMDGPU: Remove global/flat atomic fadd intrinics (#97051 ) These have been replaced with atomicrmw.	2024-08-22 23:27:33 +04:00
Matt Arsenault	9d364286f3	AMDGPU: Remove flat/global atomic fadd v2bf16 intrinsics (#97050 ) These are now fully covered by atomicrmw.	2024-08-21 14:26:42 +04:00
Matt Arsenault	70feafdb27	IR/AMDGPU: Autoupgrade amdgpu-unsafe-fp-atomics attribute (#101698 ) Delete the attribute and annotate any atomicrmw instructions in the function with new metadata.	2024-08-12 14:56:53 +04:00
Alexis Engelke	b7cd564fa3	[IR] Don't verify module flags on every access (#102153 ) 8b4306ce050bd5 introduced validity checks for every module flag access, because the auto-upgrader uses named metadata before verifying the module. This causes overhead for all other accesses, and the check is, in fact, only needed at that single place. Change the upgrader to be careful when accessing module flags before the module is verified and remove the checks on all other occasions. There are two tangential optimizations included: first, when querying a specific flag, don't enumerate all other flags into a vector as well. Second, don't use a Twine for getNamedMetadata(), which has materialization overhead -- all call sites use simple strings that can be implicitly converted to a StringRef.	2024-08-06 18:33:26 +02:00
Justin Holewinski	9374f83a73	Outline X86 autoupgrade patterns (#97851 ) Outlining these patterns has a significant impact on the overall stack frame size of llvm::UpgradeIntrinsicCall. This is helpful for scenarios where compilation threads are stack-constrained. The overall impact is low when using clang as the host compiler, but very pronounced when using MSVC 2022 with release builds. Clang: 1,624 -> 824 bytes MSVC: 23,560 -> 6,120 bytes	2024-07-06 09:24:36 -04:00
Matt Arsenault	f55bcc5dbe	AMDGPU: Add amdgpu.no.fine.grained.memory when upgrading old atomic intrinsics (#89655 ) This should replicate the old intrinsic behavior better when codegen of the raw instruction will require metadata in the future.	2024-06-27 19:52:23 +02:00
Matt Arsenault	4477ff6836	AMDGPU: Remove ds_fmin/ds_fmax intrinsics (#96739 ) These have been replaced with atomicrmw.	2024-06-27 15:35:24 +02:00
Stephen Tozer	d75f9dd1d2	Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497 )" Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.	2024-06-24 18:00:22 +01:00
Stephen Tozer	6481dc5761	[IR][NFC] Update IRBuilder to use InsertPosition (#96497 ) Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.	2024-06-24 17:27:43 +01:00
Matt Arsenault	70c8b9c24a	AMDGPU: Remove ds atomic fadd intrinsics (#95396 ) These have been replaced with atomicrmw fadd	2024-06-23 10:30:20 +02:00
Simon Pilgrim	2615e69ec2	[IR] AutoUpgrade.cpp - don't directly dereference pointers from dyn_cast Static analysis was reporting that dyn_cast<> can return null on failure - use cast<> instead	2024-06-21 17:42:01 +01:00
hev	46edc02eaa	[LoongArch] Adjust LA64 data layout by using n32:64 in layout string (#93814 ) Although i32 type is illegal in the backend, LA64 has pretty good support for i32 types by using W instructions. By adding n32 to the DataLayout string, middle end optimizations will consider i32 to be a native type. One known effect of this is enabling LoopStrengthReduce on loops with i32 induction variables. This can be beneficial because C/C++ code often has loops with i32 induction variables due to the use of `int` or `unsigned int`. If this patch exposes performance issues, those are better addressed by tuning LSR or other passes.	2024-06-06 14:05:56 +08:00
Doug Wyatt	ddecadabeb	[clang backend] In AArch64's DataLayout, specify a minimum function alignment of 4. (#90702 ) This addresses an issue where the explicit alignment of 2 (for C++ ABI reasons) was being propagated to the back end and causing under-aligned functions (in special sections). This is an alternate approach suggested by @efriedma-quic in PR #90415. Fixes #90358	2024-05-05 19:05:15 -07:00
Kazu Hirata	4e6f6fda8b	[IR] Use StringRef::operator== instead of StringRef::equals (NFC) (#90550 ) I'm planning to remove StringRef::equals in favor of StringRef::operator==. - StringRef::operator== outnumbers StringRef::equals by a factor of 22 under llvm/ in terms of their usage. - The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals. - S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".	2024-04-30 12:23:31 -07:00
Maciej Gabka	bfc0317153	Move several vector intrinsics out of experimental namespace (#88748 ) This patch is moving out following intrinsics: * vector.interleave2/deinterleave2 * vector.reverse * vector.splice from the experimental namespace. All these intrinsics exist in LLVM for more than a year now, and are widely used, so should not be considered as experimental.	2024-04-29 10:16:45 +01:00
Paul Walker	0fa1f1f2d1	[LLVM][SVE] Seperate the int and floating-point variants of addqv. (#89762 ) We only use common intrinsics for operations that treat their element type as a container of bits.	2024-04-26 11:25:55 +01:00

533 Commits