llvm-project

Author	SHA1	Message	Date
Amina Chabane	62744f3681	[AArch64][NEON] NEON intrinsic compilation error with -fno-lax-vector-conversion flag fix (#149329 ) Issue originally raised in https://github.com/llvm/llvm-project/issues/71362#issuecomment-3028515618. Certain NEON intrinsics that operate on poly types (e.g. poly8x8_t) failed to compile with the -fno-lax-vector-conversions flag. This patch updates NeonEmitter.cpp to insert an explicit __builtin_bit_cast from poly types to the required signed integer vector types when generating lane-related intrinsics. A test 'neon-bitcast-poly.ll' is included.	2025-07-30 10:56:14 +01:00
Jon Roelofs	a0fcb50bf9	[ARM] Improve arm_neon.h header diagnostic when included on unsupported targets (#147817 ) The footgun here was that the preprocessor diagnostic that looks for __ARM_FP would fire when included on targets like x86_64, but the suggestion it gives in that case is totally bogus. Avoid giving bad advice, by first checking whether we're being built for an appropriate target, and only then do the soft-fp check. rdar://155449666	2025-07-11 10:21:13 -07:00
Rahul Joshi	3932360b14	[LLVM][TableGen] Rename `ListInit::getValues()` to `getElements()` (#140289 ) Rename `ListInit::getValues()` to `getElements()` to better match with other `ListInit` members like `getElement`. Keep `getValues()` for existing downstream code but mark it deprecated.	2025-05-19 12:16:33 -07:00
Lukacma	6fc0312919	[Clang][AArch64] Add fp8 variants for untyped NEON intrinsics (#128019 ) This patch adds fp8 variants to existing intrinsics, whose operation doesn't depend on arguments being a specific type. It also changes mfloat8 type representation in memory from `i8` to `<1xi8>`	2025-05-15 14:01:41 +01:00
Kazu Hirata	8e2a9fa9a5	[TableGen] Use std::tie to implement operator< (NFC) (#139405 )	2025-05-10 16:04:26 -07:00
Jay Foad	2bc6f9d4b6	[TableGen] Only store direct superclasses in Record (#123072 ) In Record only store the direct superclasses instead of all superclasses. getSuperClasses recurses to find all superclasses when necessary. This gives a small reduction in memory usage. On lib/Target/X86/X86.td I measured about 2.0% reduction in total bytes allocated (measured by valgrind) and 1.3% reduction in peak memory usage (measured by /usr/bin/time -v). --------- Co-authored-by: Min-Yih Hsu <min@myhsu.dev>	2025-04-24 18:57:51 +01:00
Kazu Hirata	f2ec5e40d9	[clang] Use llvm::unique (NFC) (#136469 )	2025-04-19 20:33:53 -07:00
Lukacma	6c3adaafe3	[AARCH64][Neon] switch to using bitcasts in arm_neon.h where appropriate (#127043 ) Currently arm_neon.h emits C-style casts to do vector type casts. This relies on implicit conversion between vector types being enabled, which is currently deprecated behaviour and will soon disappear. To ensure NEON code keeps working afterwards, this patch changes all these vector type casts into bitcasts. Co-authored-by: Momchil Velikov <momchil.velikov@arm.com>	2025-04-01 09:45:16 +01:00
Kazu Hirata	c6c394634c	[clang] Use *Set::insert_range (NFC) (#132507 ) DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch replaces: Dest.insert(Src.begin(), Src.end()); with: Dest.insert_range(Src); This patch does not touch custom begin like succ_begin for now.	2025-03-22 08:06:38 -07:00
Kazu Hirata	69b70110b7	[TableGen] Avoid repeated hash lookups (NFC) (#132142 )	2025-03-20 09:10:23 -07:00
Oliver Stannard	a619a2e53a	[ARM] Fix lane ordering for AdvSIMD intrinsics on big-endian targets (#127068 ) In arm_neon.h, we insert shufflevectors around each intrinsic when the target is big-endian, to compensate for the difference between the ABI-defined memory format of vectors (with the whole vector stored as one big-endian access) and LLVM's target-independent expectations (with the lowest-numbered lane in the lowest address). However, this code was written for the AArch64 ABI, and the AArch32 ABI differs slightly: it requires that vectors are stored in memory as-if stored with VSTM, which does a series of 64-bit accesses, instead of the AArch64 VSTR, which does a single 128-bit access. This means that for AArch32 we need to reverse the lanes in each 64-bit chunk of the vector, instead of in the whole vector. Since there are only a small number of different shufflevector orderings needed, I've split them out into macros, so that this doesn't need separate conditions in each intrinsic definition.	2025-03-04 08:10:22 +00:00
Chandler Carruth	64ea3f5a47	[StrTable] Switch AArch64 and ARM to use directly TableGen-ed builtin tables This leverages the sharded structure of the builtins to make it easy to directly tablegen most of the AArch64 and ARM builtins while still using X-macros for a few edge cases. It also extracts common prefixes as part of that. This makes the string tables for these targets dramatically smaller. This is especially important as the SVE builtins represent (by far) the largest string table and largest builtin table across all the targets in Clang.	2025-02-04 18:04:58 +00:00
Momchil Velikov	db6fa74dfe	[AArch64] Implement FP8 Neon reinterpret intrinsics (#120476 )	2025-01-28 11:06:24 +00:00
Momchil Velikov	99bd2e3f12	[AArch64] Add Neon FP8 conversion intrinsics (#123612 ) The patch adds the following intrinsics: bfloat16x8_t vcvt1_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) bfloat16x8_t vcvt1_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) bfloat16x8_t vcvt2_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) bfloat16x8_t vcvt2_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) bfloat16x8_t vcvt1_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) bfloat16x8_t vcvt2_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt1_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) float16x8_t vcvt1_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt2_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) float16x8_t vcvt2_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt1_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt2_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) mfloat8x8_t vcvt_mf8_f32_fpm(float32x4_t vn, float32x4_t vm, fpm_t fpm) mfloat8x16_t vcvt_high_mf8_f32_fpm(mfloat8x8_t vd, float32x4_t vn, float32x4_t vm, fpm_t fpm) mfloat8x8_t vcvt_mf8_f16_fpm(float16x4_t vn, float16x4_t vm, fpm_t fpm) mfloat8x16_t vcvtq_mf8_f16_fpm(float16x8_t vn, float16x8_t vm, fpm_t fpm) Co-Authored-By: Caroline Concatto <caroline.concatto@arm.com>	2025-01-27 17:32:47 +00:00
Momchil Velikov	87103a016f	[AArch64] Implement NEON FP8 vectors as VectorType (#123603 ) Reimplement Neon FP8 vector types using attribute `neon_vector_type` instead of having them as builtin types. This allows to implement FP8 Neon intrinsics without the need to add special cases for these types when using `__builtin_shufflevector` or bitcast (using C-style cast operator) between vectors, both extensively used in the generated code in `arm_neon.h`.	2025-01-27 10:41:53 +00:00
Momchil Velikov	dac49e8ddd	[Arm] Fix generating code with UB in NeonEmitter (#121802 ) When generating `arm_neon.h`, NeonEmitter outputs code that violates strict aliasing rules (C23 6.5 Expressions #7, C++23 7.2.1 Value category [basic.lval] #11), for example: bfloat16_t __reint = __p0; uint32_t __reint1 = (uint32_t)(*(uint16_t *) &__reint) << 16; __ret = *(float32_t *) &__reint1; This patch fixes the offending code by replacing it with a call to `__builtin_bit_cast`.	2025-01-24 10:57:23 +00:00
CarolineConcatto	aaba8406c5	[NFC][Clang][AArch64]Refactor implementation of Neon vectors MFloat8… (#114804 ) …x8 and MFloat8x16 This patch adds MFloat8 as a TypeFlag and Kind on Neon to generate the typedefs using emitNeonTypeDefs. It does not need any change in Clang, because SEMA and CodeGen use the Builtins defined in AArch64SVEACLETypes.def	2024-11-21 10:29:28 +00:00
Rahul Joshi	63aa8cf6be	[NFC][Clang][TableGen] Fix file header comments (#116491 )	2024-11-17 07:54:10 -08:00
CarolineConcatto	91aad9bfb2	[Clang][AArch64]Fix Name and Mangle name for scalar fp8 (#114983 ) The scalar __mfp8 type has the wrong name and mangle name in AArch64SVEACLETypes.def According to the ACLE[1] the name should be __mfp8 This patch fixes this problem by replacing the Name __MFloat8_t by __mfp8 and the Mangle Name __MFloat8_t by u6__mfp8 And we revert the incorrect typedef in NeonEmitter. [1]https://github.com/ARM-software/acle	2024-11-15 09:19:39 +00:00
Kazu Hirata	173529104d	[TableGen] Use heterogenous lookups with std::map (NFC) (#115682 ) Heterogenous lookups allow us to call find with StringRef, avoiding a temporary heap allocation of std::string.	2024-11-11 07:34:42 -08:00
Kazu Hirata	a44ee8ec1c	[TableGen] Use heterogenous lookups with std::map (NFC) (#115633 ) Heterogenous lookups allow us to call find with StringRef, avoiding a temporary heap allocation of std::string.	2024-11-10 07:24:27 -08:00
Momchil Velikov	1df5c94343	[AArch64] Implement FP8 floating-point mode helper intrinsics (#100608 ) Implement FP8 mode helper intrinsics (as inline functions) as specified in ACLE 2024Q3 "14.2 Helper intrinsics" https://github.com/ARM-software/acle/releases/download/r2024Q3/acle-2024Q3.pdf	2024-10-28 11:22:38 +00:00
CarolineConcatto	49940514e2	[CLANG][AArch64] Add the modal 8 bit floating-point scalar type (#97277 ) ARM ACLE PR#323[1] adds new modal types for 8-bit floating point intrinsics. From the PR#323: "ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3 8-bit floating-point formats. It is a storage and interchange only type with no arithmetic operations other than intrinsic calls." The type should be an opaque type and its format is undefined in Clang. It is only defined in the backend by a status/format register, for AArch64 the FPMR. This patch is an attempt to add the mfloat8_t scalar type. It has a parser and codegen for the new scalar type. The patch lowers it to an 8-bit unsigned integer, as it has no format. But maybe we should add another opaque type. [1] https://github.com/ARM-software/acle/pull/323	2024-10-25 13:59:46 +01:00
Jay Foad	4dd55c567a	[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399 ) Follow up to #109133.	2024-10-24 10:23:40 +01:00
CarolineConcatto	6dad29aebc	[CLANG][AArch64]Add Neon vectors for mfloat8_t (#99865 ) This patch adds these new vector sizes for neon: mfloat8x16_t and mfloat8x8_t According to the ARM ACLE PR#323[1]. [1] ARM-software/acle#323	2024-10-23 13:23:18 +01:00
Rahul Joshi	9b422d14f3	[Clang][TableGen] Use const pointers for various Init objects in NeonEmitter (#112317 ) Use const pointers for various Init objects in NeonEmitter. This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089	2024-10-15 15:48:42 -07:00
Rahul Joshi	e9dbdb20f2	[Clang][TableGen] Change NeonEmitter to use const Record * (#110597 ) This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089	2024-10-01 10:47:09 -07:00
Rahul Joshi	0e948bfd31	[NFC][clang][TableGen] Remove redundant llvm:: namespace qualifier (#108627 ) Remove llvm:: from .cpp files, and add "using namespace llvm" if needed.	2024-09-16 06:35:34 -07:00
Kazu Hirata	bae275f65e	[TableGen] Avoid repeated map lookups (NFC) (#108675 )	2024-09-14 07:39:00 -07:00
Rahul Joshi	a4b1617368	[clang][TableGen] Change NeonEmitter to use const RecordKeeper (#108501 ) Change NeonEmitter to use const RecordKeeper. This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089	2024-09-13 07:52:37 -07:00
Rahul Joshi	1651014960	[TableGen] Change SetTheory set/vec to use const Record * (#107692 ) Change SetTheory::RecSet/RecVec to use const Record pointers.	2024-09-09 08:47:42 -07:00
SpencerAbson	1f70fcefa9	[Clang][AArch64] Add customisable immediate range checking to NEON (#100278 ) This patch moves NEON immediate argument specification and checking to the system currently shared by both SVE and SME. In its current form, the TableGen definition of a NEON intrinsic cannot control how its immediate arguments are range-checked; this information must be inferred from the name of the intrinsic by NeonEmitter, which also assumes that any NEON instruction will only ever receive a single immediate argument. For SVE/SME intrinsics, this information is more conveniently supplied in the TableGen definition. As a result, for each immediate argument, NEON instructions must define - The index of the immediate argument to be checked - The type of immediate range check to be performed, (e.g., ImmCheckShiftRight) - The index of the argument whose type defines the context of this immediate check (base type, vector size). - Difference from SVE/SME If this definition generates a polymorphic NEON builtin, the base type defined by this argument is overwritten by that of the type code supplied to the overloaded builtin call. This third argument is omitted in some cases due to this. Here is an example for [`vfma_laneq`](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=vfma_laneq) - The immediate is supplied in argument 3 - The immediate is used as an index into the lanes of argument 2 - So we must perform an immediate check on argument 3, based on the type information of argument 2. - `ImmCheck<3, ImmCheckLaneIndex, 2>` During this work, we discovered that the existing immediate range-checking system was largely untested, which made it difficult to make reliable progress. Missing tests have been added to verify this implementation against all intrinsics which take constrained immediate arguments. All immediate range checking tests for NEON intrinsics are moved to a dedicated directory `clang/test/Sema/aarch64-neon-immediate-ranges/`.	2024-09-06 13:12:37 +01:00
Rahul Joshi	d7da79f2cd	[NFC][SetTheory] Refactor to use const pointers and range loops (#105544 ) - Refactor SetTheory code to use const pointers when possible. - Use auto for variables initialized using dyn_cast<>. - Use range based for loops and early continue.	2024-08-22 05:47:31 -07:00
Lukacma	0284b4b4b6	[Clang][NEON] Add neon target guard to intrinsics (#99870 ) This patch improves reported error when NEON intrinsics are used without neon target feature.	2024-07-22 14:21:31 +01:00
Lukacma	c1622cae10	Revert "[Clang][NEON] Add neon target guard to intrinsics" (#99864 ) Reverts llvm/llvm-project#98624	2024-07-22 12:32:47 +01:00
Lukacma	dc82c774a7	[Clang][NEON] Add neon target guard to intrinsics (#98624 ) This patch improves reported error when NEON intrinsics are used without neon target feature.	2024-07-22 12:03:59 +01:00
Lukacma	8a46bbbc22	[Clang] Remove preprocessor guards and global feature checks for NEON (#95224 ) To enable function multi-versioning (FMV), current checks which rely on cmd line options or global macros to see if target feature is present need to be removed. This patch removes those for NEON and also implements changes to NEON header file as proposed in [ACLE](https://github.com/ARM-software/acle/pull/321).	2024-06-25 17:19:42 +02:00
Eli Friedman	8c9f45e2de	[ARM64EC] Fix arm_neon.h on ARM64EC. (#88572 ) Since 97fe519d, in ARM64EC mode, we don't define `__aarch64__`. Fix various preprocessor guards to account for this.	2024-04-16 17:08:02 -07:00
Kazu Hirata	e6bafbe726	[TableGen] Use StringRef::consume_{front,back} (NFC)	2024-01-25 18:17:24 -08:00
Sam Tebbs	945c645acb	[AArch64][SME] Warn when using a streaming builtin from a non-streaming function (#75487 ) This PR adds a warning that's emitted when a non-streaming or non-streaming-compatible builtin is called in an unsuitable function. Uses work by Kerry McLaughlin. This is a re-upload of #74064 and fixes a compile time increase.	2023-12-18 09:32:34 +00:00
Sam Tebbs	342384ca05	Revert "[AArch64][SME] Warn when using a streaming builtin from a non-streaming function" (#75449 ) Reverts llvm/llvm-project#74064	2023-12-14 09:31:55 +00:00
Sam Tebbs	2e45326b08	[AArch64][SME] Warn when using a streaming builtin from a non-streaming function (#74064 ) This PR adds a warning that's emitted when a non-streaming or non-streaming-compatible builtin is called in an unsuitable function. Uses work by Kerry McLaughlin.	2023-12-14 00:11:04 +00:00
CarolineConcatto	ed2d497291	[Clang][AArch64] Add fix vector types to header into SVE (#73258 ) This patch is needed for the reduction instructions in sve2.1. It adds a new header to sve with all the fixed vector types. The new types are only added if neon is not declared.	2023-12-13 08:59:41 +00:00
Simon Pilgrim	141122ece3	[TableGen] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC. startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)	2023-11-03 17:53:56 +00:00
Kazu Hirata	dd27036ff7	[TableGen] Modernize OverloadInfo (NFC)	2023-09-04 13:35:26 -07:00
Lucas Prates	2b7ac62606	[AArch64][RCPC3] Add Neon intrinsics for LDAP1 and STL1 This adds new intrinsics to support the LDAP1 and STL1 Advanced SIMD (Neon) instructions introduced as part of FEAT_LRCPC3. The new intrinsics `vldap1(q)_lane`/`vstl1(q)_lane` generate IR code similar to the existing `vld1(q)_lane/st1(q)_lane` ones, but capturing the difference in the atomic release/acquire memory model. The LLVM code generation changes to ensure that this instruction pair is lowered to the correct LDAP1/STL1 instructions will be covered in a separate commit. Based on a patch by Sam Elliott. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D153128	2023-07-07 12:31:55 +01:00
Dimitry Andric	db49231639	[clang][BFloat] Avoid redefining bfloat16_t in arm_neon.h As of https://reviews.llvm.org/D79708, clang-tblgen generates `arm_neon.h`, `arm_sve.h` and `arm_bf16.h`, and all those generated files will contain a typedef of `bfloat16_t`. However, `arm_neon.h` and `arm_sve.h` include `arm_bf16.h` immediately before their own typedef: #include <arm_bf16.h> typedef __bf16 bfloat16_t; With a recent version of clang (I used 16.0.1) this results in warnings: /usr/lib/clang/16/include/arm_neon.h:38:16: error: redefinition of typedef 'bfloat16_t' is a C11 feature [-Werror,-Wtypedef-redefinition] Since `arm_bf16.h` is very likely supposed to be the one true place where `bfloat16_t` is defined, I propose to delete the duplicate typedefs from the generated `arm_neon.h` and `arm_sve.h`. Reviewed By: sdesmalen, simonbutcher Differential Revision: https://reviews.llvm.org/D148822	2023-05-03 17:54:58 +02:00
Manna, Soumi	38ecb9767c	[NFC][clang] Fix Coverity bugs with AUTO_CAUSES_COPY Reported by Coverity: AUTO_CAUSES_COPY Unnecessary object copies can affect performance. 1. Inside "ExtractAPIVisitor.h" file, in clang::extractapi::impl::ExtractAPIVisitorBase<<unnamed>::BatchExtractAPIVisitor>::VisitFunctionDecl(clang::FunctionDecl const *): Using the auto keyword without an & causes the copy of an object of type DynTypedNode. 2. Inside "NeonEmitter.cpp" file, in <unnamed>::Intrinsic::Intrinsic(llvm::Record *, llvm::StringRef, llvm::StringRef, <unnamed>::TypeSpec, <unnamed>::TypeSpec, <unnamed>::ClassKind, llvm::ListInit *, <unnamed>::NeonEmitter &, llvm::StringRef, llvm::StringRef, bool, bool): Using the auto keyword without an & causes the copy of an object of type Type. 3. Inside "MicrosoftCXXABI.cpp" file, in <unnamed>::MSRTTIBuilder::getClassHierarchyDescriptor(): Using the auto keyword without an & causes the copy of an object of type MSRTTIClass. 4. Inside "CGGPUBuiltin.cpp" file, in clang::CodeGen::CodeGenFunction::EmitAMDGPUDevicePrintfCallExpr(clang::CallExpr const *): Using the auto keyword without an & causes the copy of an object of type CallArg. 5. Inside "SemaDeclAttr.cpp" file, in threadSafetyCheckIsSmartPointer(clang::Sema &, clang::RecordType const *): Using the auto keyword without an & causes the copy of an object of type CXXBaseSpecifier. 6. Inside "ComputeDependence.cpp" file, in clang::computeDependence(clang::DesignatedInitExpr *): Using the auto keyword without an & causes the copy of an object of type Designator. 7. Inside "Format.cpp" file, in clang::format::affectsRange(llvm::ArrayRef<clang::tooling::Range>, unsigned int, unsigned int): Using the auto keyword without an & causes the copy of an object of type Range. Reviewed By: tahonermann Differential Revision: https://reviews.llvm.org/D149074	2023-04-24 14:52:55 -07:00
Kazu Hirata	9cf4419e24	[clang] Use std::optional instead of llvm::Optional (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-01-02 15:54:57 -08:00
Kazu Hirata	f7dffc28b3	Don't include None.h (NFC) I've converted all known uses of None to std::nullopt, so we no longer need to include None.h. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-10 11:24:26 -08:00

209 Commits