llvm-project

Author	SHA1	Message	Date
Pavel Iliin	8ec50d6446	[AArch64] Fix FMV ifunc resolver usage on old Android APIs. Rename internal compiler-rt FMV functions. The patch fixes Function Multi Versioning features detection by ifunc resolver on Android API levels < 30. Ifunc hwcaps parameters are not supported on Android API levels 23-29, so all CPU features are set unsupported if they were not initialized before ifunc resolver call. There is no support for ifunc on Android API levels < 23, so Function Multi Versioning is disabled in this case. Also use two underscore prefix for FMV runtime support functions to avoid conflict with user program ones. Differential Revision: https://reviews.llvm.org/D158641	2023-09-29 17:10:48 +01:00
Fangrui Song	0d8b864829	CGBuiltin: emit llvm.abs.* instead of neg+icmp+select for abs instcombine will combine neg+icmp+select to llvm.abs.. Let's just emit llvm.abs. in the first place.	2023-09-27 21:29:56 -07:00
Phoebe Wang	31631d307f	[X86][FP16] Add missing handling for FP16 constrained cmp intrinsics (#67400 )	2023-09-26 19:27:57 +08:00
Björn Pettersson	b4858c634e	[clang][CodeGen] Simplify code based on opaque pointers (#65624 ) - Update CodeGenTypeCache to use a single union for all pointers in address space zero. - Introduce a UnqualPtrTy in CodeGenTypeCache, and use that (for example instead of llvm::PointerType::getUnqual) in some places. - Drop some redundant bit/pointers casts from ptr to ptr.	2023-09-25 11:21:24 +02:00
Carlos Eduardo Seo	7523550853	[Clang][CodeGen] Add __builtin_bcopy (#67130 ) Add __builtin_bcopy to the list of GNU builtins. This was causing a series of test failures in glibc. Adjust the tests to reflect the changes in codegen. Fixes #51409. Fixes #63065.	2023-09-24 11:58:14 -03:00
Amy Huang	03c698a431	[MSVC, ARM64] Add _Copy* and _Count* intrinsics (#66554 ) Implement the _Count* and _Copy* Windows ARM intrinsics: ``` double _CopyDoubleFromInt64(__int64) float _CopyFloatFromInt32(__int32) __int32 _CopyInt32FromFloat(float) __int64 _CopyInt64FromDouble(double) unsigned int _CountLeadingOnes(unsigned long) unsigned int _CountLeadingOnes64(unsigned __int64) unsigned int _CountLeadingSigns(long) unsigned int _CountLeadingSigns64(__int64) unsigned int _CountLeadingZeros(unsigned long) unsigned int _CountLeadingZeros64(unsigned __int64) unsigned int _CountOneBits(unsigned long) unsigned int _CountOneBits64(unsigned __int64) ``` Full list of intrinsics here: [https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics](https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics) Bug: [65405](https://github.com/llvm/llvm-project/issues/65405)	2023-09-21 14:34:59 -07:00
Zahira Ammarguellat	a292e7edf8	Fix math-errno issue (#66381 ) Update handling of math errno. This change updates the logic for generation of math intrinics in place of math library function calls. The previous logic https://reviews.llvm.org/D151834 was incorrectly using intrinsics when math errno handling was needed at optimization levels above -O0. This also fixes issue mentioned in https://reviews.llvm.org/D151834 by @uabelho This is joint work with @andykaylor Andy.	2023-09-19 09:13:02 -04:00
Matt Arsenault	ddc3346a6b	clang/AMDGPU: Fix accidental behavior change for __builtin_amdgcn_ldexph (#66340 )	2023-09-14 18:15:44 +03:00
CarolineConcatto	ee31ba0dd9	[AArch64][SME]Update intrinsic interface for ld1/st1 (#65582 ) The new ACLE PR#225[1] now combines the slice parameters for some builtins. Slice specifies the ZA slice number directly and needs to be explicity implemented by the "user" with the base register plus the immediate offset [1]https://github.com/ARM-software/acle/pull/225/files	2023-09-13 15:24:09 +01:00
Joseph Huber	49ff6a96a7	[Clang] Define AMDGPU ABI when referenced in CodeGen for ABI "none" (#66162 ) Summary: We use the `llvm.amgcn.abi.version` varaible to control code generation. This is emitted in every module now to indicate what should be used when compiling. Previously, the logic caused us to emit an external reference to this variable when creating the code for the `none` type. This would then cause us not to emit the actual definition. This patch refines the logic to create the external reference, and then update it if it is found unset by the time we emit the global. I had to remove the reference to `GetOrCreateLLVmGlobal` because it did not accept the proper address space.	2023-09-13 08:31:31 -05:00
CarolineConcatto	dc8d2ecc5e	[AArch64][SME]Update intrinsic interface for read/write (#65594 ) The new ACLE PR#225[1] now combines the slice parameters for some builtins. This patch is the #2 of 3 patches to update the interface. Slice specifies the ZA slice number directly and needs to be explicity implemented by the "user" with the base register plus the immediate offset [1]https://github.com/ARM-software/acle/pull/225/files	2023-09-12 18:08:57 +01:00
CarolineConcatto	7b8d4eff02	[AArch64][SME]Update intrinsic interface for ldr/str (#65593 ) The new ACLE PR#225[1] now combines the slice parameters for some builtins. [1]https://github.com/ARM-software/acle/pull/225/files	2023-09-12 17:31:51 +01:00
Max Iyengar	dbeb3d029d	Add missing vrnd intrinsics This patch adds 8 missing intrinsics as specified in the Arm ACLE document section 2.12.1.1 : [[ https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#rounding-3 \| https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#rounding-3]] The intrinsics implemented are: - vrnd32z_f64 - vrnd32zq_f64 - vrnd64z_f64 - vrnd64zq_f64 - vrnd32x_f64 - vrnd32xq_f64 - vrnd64x_f64 - vrnd64xq_f64 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D158626	2023-09-11 12:59:18 +01:00
Matt Arsenault	6a08cf12d9	clang: Add __builtin_exp10* and use new llvm.exp10 intrinsic https://reviews.llvm.org/D157911	2023-09-09 23:14:12 +03:00
Zahira Ammarguellat	2c93e3c1c8	Take math-errno into account with '#pragma float_control(precise,on)' and 'attribute__((optnone)). Differential Revision: https://reviews.llvm.org/D151834	2023-09-08 09:48:53 -04:00
Saiyedul Islam	f616c3eeb4	[OpenMP][DeviceRTL][AMDGPU] Support code object version 5 Update DeviceRTL and the AMDGPU plugin to support code object version 5. Default is code object version 4. CodeGen for __builtin_amdgpu_workgroup_size generates code for cov4 as well as cov5 if -mcode-object-version=none is specified. DeviceRTL compilation passes this argument via Xclang option to generate abi-agnostic code. Generated code for the above builtin uses a clang control constant "llvm.amdgcn.abi.version" to branch on the abi version, which is available during linking of user's OpenMP code. Load of this constant gets eliminated during linking. AMDGPU plugin queries the ELF for code object version and then prepares various implicitargs accordingly. Differential Revision: https://reviews.llvm.org/D139730 Reviewed By: jhuber6, yaxunl	2023-08-29 06:35:44 -05:00
Kazu Hirata	722474969e	[clang] Use SmallDenseMap::contains (NFC)	2023-08-27 08:26:50 -07:00
Fangrui Song	27da15381c	[X86] __builtin_cpu_supports: support x86-64{,-v2,-v3,-v4} GCC 12 (https://gcc.gnu.org/PR101696) allows __builtin_cpu_supports("x86-64") (and -v2 -v3 -v4). This patch ports the feature. * Add `FEATURE_X86_64_{BASELINE,V2,V3,V4}` to enum ProcessorFeatures, but keep CPU_FEATURE_MAX unchanged to make FeatureInfos/FeatureInfos_WithPLUS happy. * Change validateCpuSupports to allow `x86-64{,-v2,-v3,-v4}` * Change getCpuSupportsMask to return `std::array<uint32_t, 4>` where `x86-64{,-v2,-v3,-v4}` set bits `FEATURE_X86_64_{BASELINE,V2,V3,V4}`. * `target("x86-64")` and `cpu_dispatch(x86_64)` are invalid. Tested by commit 9de3b35ac9159d5bae6e6796cb91e4f877a07189 Close https://github.com/llvm/llvm-project/issues/59961 Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D158811	2023-08-25 20:56:25 -07:00
Yaxun (Sam) Liu	63f0833ec4	[clang] Fix missing contract flag in sqrt intrinsic The fp options specified through pragma are already encoded in Expr. This patch takes the same approach used by clang codegen to emit fastmath flags for fadd insts, basically use RAII to set the current fastmath flags in IRBuilder, which is then used to emit sqrt intrinsic. Fixes: https://github.com/llvm/llvm-project/issues/64653	2023-08-24 19:15:17 -04:00
Fangrui Song	7a41af8604	[X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute GCC 12 (https://gcc.gnu.org/PR101696) allows `arch=x86-64` `arch=x86-64-v2` `arch=x86-64-v3` `arch=x86-64-v4` in the target_clones function attribute. This patch ports the feature. * Set KeyFeature to `x86-64{,-v2,-v3,-v4}` in `Processors[]`, to be used by X86TargetInfo::multiVersionSortPriority * builtins: change `__cpu_features2` to an array like libgcc. Define `FEATURE_X86_64_{BASELINE,V2,V3,V4}` and depended ISA feature bits. * CGBuiltin.cpp: update EmitX86CpuSupports to handle `arch=x86-64*`. Close https://github.com/llvm/llvm-project/issues/55830 Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D158329	2023-08-23 22:08:55 -07:00
Manna, Soumi	30c60ec52f	[NFC][CLANG] Fix static analyzer bugs about large copy by values Static Analyzer Tool complains about a large function call parameter which is is passed by value in CGBuiltin.cpp file. 1. In CodeGenFunction::EmitSMELdrStr(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value > &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value. 2. In CodeGenFunction::EmitSMEZero(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value > &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value. 3. In CodeGenFunction::EmitSMEReadWrite(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value > &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value. 4. In CodeGenFunction::EmitSMELd1St1(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value > &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value. I see many places in CGBuiltin.cpp file, we are passing parameter TypeFlags of type clang::SVETypeFlags by reference. clang::SVETypeFlags inherits several other types. This patch passes parameter TypeFlags by reference instead of by value in the function. Reviewed By: tahonermann, sdesmalen Differential Revision: https://reviews.llvm.org/D158522	2023-08-23 07:57:04 -07:00
Artem Labazov	f0bbda00bd	[CodeGen] [ubsan] Respect integer overflow handling in abs builtin Currenly both Clang and GCC support the following set of flags that control code gen of signed overflow: * -fwrapv: overflow is defined as in two-complement * -ftrapv: overflow traps * -fsanitize=signed-integer-overflow: if undefined (no -fwrapv), then overflow behaviour is controlled by UBSan runtime, overrides -ftrapv Howerver, clang ignores these flags for __builtin_abs(int) and its higher-width versions, so passing minimum integer value always causes poison. The same holds for abs(), which are not handled in frontend at all but folded to llvm.abs. intrinsics during InstCombinePass. The intrinsics are not instrumented by UBSan, so the functions need special handling as well. This patch does a few things: * Handle abs() in CGBuiltin the same way as __builtin_abs() * -fsanitize=signed-integer-overflow now properly instruments abs() with UBSan * -fwrapv and -ftrapv handling for abs() is made consistent with GCC Fixes #45129 and #45794 Reviewed By: efriedma, MaskRay Differential Revision: https://reviews.llvm.org/D156821	2023-08-21 09:13:25 -07:00
Thurston Dang	fc06cce30d	Revert "Respect integer overflow handling in abs builtin" This reverts commit 1783185790de29b24d3850d33d9a9d586e6bbd39, which broke the buildbots, starting with when it was first built in https://lab.llvm.org/buildbot/#/builders/85/builds/18390 (N.B. I think the patch is uncovering real bugs; the revert is simply to keep the tree green and the buildbots useful, because I'm not confident how to fix-forward all the found bugs.)	2023-08-18 19:59:34 +00:00
Artem Labazov	1783185790	Respect integer overflow handling in abs builtin Currenly both Clang and GCC support the following set of flags that control code gen of signed overflow: * -fwrapv: overflow is defined as in two-complement * -ftrapv: overflow traps * -fsanitize=signed-integer-overflow: if undefined (no -fwrapv), then overflow behaviour is controlled by UBSan runtime, overrides -ftrapv. However, clang ignores these flags for __builtin_abs(int) and its higher-width versions, so passing minimum integer value always causes poison. The same holds for abs(), which are not handled in frontend at all but folded to llvm.abs. intrinsics during InstCombinePass. The intrinsics are not instrumented by UBSan, so the functions need special handling as well. This patch does a few things: * Handle abs() in CGBuiltin the same way as __builtin_abs() * -fsanitize=signed-integer-overflow now properly instruments abs() with UBSan * -fwrapv and -ftrapv handling for abs() is made consistent with GCC Fixes https://github.com/llvm/llvm-project/issues/45129 Fixes https://github.com/llvm/llvm-project/issues/45794 Differential Revision: https://reviews.llvm.org/D156821	2023-08-17 14:36:00 -04:00
Matt Arsenault	9e3d9c9eae	clang: Add __builtin_elementwise_sqrt This will be used in the opencl builtin headers to provide direct intrinsic access with proper !fpmath metadata. https://reviews.llvm.org/D156737	2023-08-11 19:32:39 -04:00
Bjorn Pettersson	d03f4177df	[clang] Drop some references to typed pointers (getInt8PtrTy). NFC Differential Revision: https://reviews.llvm.org/D157550	2023-08-10 15:07:06 +02:00
Matt Arsenault	25bc999d1f	Intrinsics: Add type overload to stacksave and stackstore This allows use with non-0 address space stacks. llvm_ptr_ty should never be used. This could use some more percolation up through mlir, but this is enough to fix existing tests. https://reviews.llvm.org/D156666	2023-08-09 18:33:11 -04:00
wanglei	ea8d3b1f9f	[Clang][LoongArch] Use the ClangBuiltin class to automatically generate support for CBE and CFE Fixed the type modifier (L->W), removed redundant feature checking code since the feature has already been checked in `EmitBuiltinExpr`. And Cleaned up unused diagnostic information. Reviewed By: SixWeining Differential Revision: https://reviews.llvm.org/D156866	2023-08-09 16:04:09 +08:00
Piyou Chen	2df05cd01c	[RISCV] Support overloaded version ntlh intrinsic function Here is the proposal https://github.com/riscv-non-isa/riscv-c-api-doc/pull/47. The version that omit the domain argument imply domain=__RISCV_NTLH_ALL. ``` type __riscv_ntl_load (type ptr); void __riscv_ntl_store (type ptr, type val); ``` Reviewed By: kito-cheng, craig.topper Differential Revision: https://reviews.llvm.org/D156221	2023-08-04 00:39:25 -07:00
Bjorn Pettersson	2bdc86484d	[clang][CodeGen] Drop some typed pointer bitcasts Differential Revision: https://reviews.llvm.org/D156911	2023-08-03 22:54:33 +02:00
Bjorn Pettersson	fd05c34b18	Stop using legacy helpers indicating typed pointer types. NFC Since we no longer support typed LLVM IR pointer types, the code can be simplified into for example using PointerType::get directly instead of using Type::getInt8PtrTy and Type::getInt32PtrTy etc. Differential Revision: https://reviews.llvm.org/D156733	2023-08-02 12:08:37 +02:00
Alex Voicu	51a014cb2d	[Clang][CodeGen] `__builtin_alloca`s should care about address spaces `alloca` instructions always return pointers to the `alloca` address space. This composes poorly with most HLLs which are address space agnostic and thus have all pointers point to generic/default. Static `alloca`s were already handled on the AST level, however dynamic `alloca`s were not, which would lead to subtly incorrect IR. This patch addresses that by inserting an address space cast iff the `alloca` address space is different from the default / expected. Reviewed By: rjmccall, arsenm Differential Revision: https://reviews.llvm.org/D156539	2023-08-01 21:55:36 +01:00
Joshua Batista	57f879cdd4	clang: Add elementwise bitreverse builtin Add codegen for llvm bitreverse elementwise builtin The bitreverse elementwise builtin is necessary for HLSL codegen. Tests were added to make sure that the expected errors are encountered when these functions are given inputs of incompatible types, or too many inputs. The new builtin is restricted to integer types only. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D156357	2023-07-31 10:59:13 -07:00
ranapratap55	970569b6cc	[AMDGPU] __builtin_amdgcn_read_exec_* should be implemented with llvm.amdgcn.ballot Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D156219	2023-07-26 16:21:31 +05:30
Joshua Batista	3a98e73169	clang: Add elementwise pow builtin Add codegen for llvm pow elementwise builtin The pow elementwise builtin is necessary for HLSL codegen. Tests were added to make sure that the expected errors are encountered when these functions are given inputs of incompatible types, or too many inputs. The new builtin is restricted to floating point types only. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D153310	2023-07-24 14:03:58 -07:00
Sander de Smalen	a8cbd27d1f	[Clang][AArch64] svldr_vnum/svstr_vnum should use cntsb iso vscale for the offset The specification for LDR/STR says that: The ZA array vector is selected by the sum of the vector select register and immediate offset, modulo the number of bytes in a Streaming SVE vector. [..] This instruction does not require the PE to be in Streaming SVE mode When the instruction is used outside of streaming mode, 'vscale' will result in the wrong value being used for the offset because LLVM's code-generator will emit the non-streaming 'RDVL/ADDVL' instead of the 'RDSVL/ADDSVL' instructions which are used to get the Streaming-SVE vector length. Reviewed By: bryanpkc Differential Revision: https://reviews.llvm.org/D156121	2023-07-24 14:29:45 +00:00
Bryan Chan	4ae900c063	[Clang][AArch64][SME] Add intrinsics for adding vector elements to ZA tile This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html): - svaddha_za32[_u32]_m // also for s32 - svaddva_za32[_u32]_m // also for s32 - svaddha_za64[_u64]_m // also for s64 - svaddva_za64[_u64]_m // also for s64 The _za64 versions are available only when the sme-i16i64 feature is enabled. Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com> Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D134680	2023-07-20 06:06:36 -04:00
Bryan Chan	15d16a79a0	[Clang][AArch64][SME] Add intrinsics for reading streaming vector length This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html): - svcntsb - svcntsh - svcntsw - svcntsd Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com> Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D134679	2023-07-20 06:06:35 -04:00
Bryan Chan	f225898a7c	[Clang][AArch64][SME] Add intrinsics for ZA array load/store (LDR/STR) This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html): - svldr_vnum_za - svstr_vnum_za Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com> Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D134678	2023-07-20 06:06:35 -04:00
Bryan Chan	578b0bd4e6	[Clang][AArch64][SME] Add ZA zeroing intrinsics This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html): - svzero_mask_za - svzero_za Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com> Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D134677	2023-07-20 06:06:34 -04:00
Bryan Chan	6dc94c54e5	[Clang][AArch64][SME] Add vector read/write (mova) intrinsics This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html): - svread_hor_za8[_s8]_m // also for u8 - svread_hor_za16[_s16]_m // also for u16, f16, bf16 - svread_hor_za32[_s32]_m // also for u32, f32 - svread_hor_za64[_s64]_m // also for u64, f64 - svread_hor_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64 - svread_ver_za8[_s8]_m // also for u8 - svread_ver_za16[_s16]_m // also for u16, f16, bf16 - svread_ver_za32[_s32]_m // also for u32, f32 - svread_ver_za64[_s64]_m // also for u64, f64 - svread_ver_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64 - svwrite_hor_za8[_s8]_m // also for u8 - svwrite_hor_za16[_s16]_m // also for u16, f16, bf16 - svwrite_hor_za32[_s32]_m // also for u32, f32 - svwrite_hor_za64[_s64]_m // also for u64, f64 - svwrite_hor_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64 - svwrite_ver_za8[_s8]_m // also for u8 - svwrite_ver_za16[_s16]_m // also for u16, f16, bf16 - svwrite_ver_za32[_s32]_m // also for u32, f32 - svwrite_ver_za64[_s64]_m // also for u64, f64 - svwrite_ver_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64 Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com> Reviewed By: sdesmalen, kmclaughlin Differential Revision: https://reviews.llvm.org/D128648	2023-07-20 06:06:33 -04:00
Craig Topper	a64b3e92c7	[RISCV] Re-define sha256, Zksed, and Zksh intrinsics to use i32 types. Previously we returned i32 on RV32 and i64 on RV64. The instructions only consume 32 bits and only produce 32 bits. For RV64, the result is sign extended to 64 bits like *W instructions. This patch removes this detail from the interface to improve portability and consistency. This matches the proposal for scalar intrinsics here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44 I've included IR autoupgrade support as well. I'll be doing this for other builtins/intrinsics that currently use 'long' in other patches. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D154647	2023-07-17 08:58:29 -07:00
Craig Topper	143e2c2ac0	[RISCV] Split clmul/clmulh/clmulr builtins into _32 and _64 versions. This removes another use of 'long' to mean xlen from builtins. I've also converted the types to unsigned as proposed in D154616. clmul_32 is available to RV64 as its emulation is clmul+sext.w clmulh_32 and clmulr_32 are not available on RV64 as their emulation is currently 6 instructions in the worst case.	2023-07-14 19:09:15 -07:00
Matt Arsenault	bac2a07540	clang: Attach !fpmath metadata to __builtin_sqrt based on language flags OpenCL and HIP have -cl-fp32-correctly-rounded-divide-sqrt and -fno-hip-correctly-rounded-divide-sqrt. The corresponding fpmath metadata was only set on fdiv, and not sqrt. The backend is currently underutilizing sqrt lowering options, and the responsibility is split between the libraries and backend and this metadata is needed. CUDA/NVCC has -prec-div and -prev-sqrt but clang doesn't appear to be aiming for compatibility with those. Don't know if OpenMP has a similar control.	2023-07-14 18:46:18 -04:00
Craig Topper	85b27ace52	[ARM][AArch64] Add ARM specific builtin for clz that is not undefined for 0 in ubsan. D152023 made ubsan consider __builtin_clz of 0 undefined regardless of the target. This ensures portability and matches gcc. This causes the ACLE intrinsics to also be considered to also be considered to be undefined for 0 since they used the generic builtins as their implementation. This patch adds builtins for ARM that ubsan doesn't know about to make the behavior defined for 0. Alternatively, I could have added a zero check to the intrinsics, but the dedicated builtin will give better -O0 codegen. Fixes #63113. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D154915	2023-07-12 09:29:18 -07:00
Serge Pavlov	7d6c2e1811	[clang] Use llvm.is_fpclass to implement FP classification functions Builtin floating-point number classification functions: - __builtin_isnan, - __builtin_isinf, - __builtin_finite, and - __builtin_isnormal now are implemented using `llvm.is_fpclass`. This change makes the target callback `TargetCodeGenInfo::testFPKind` unneeded. It is preserved in this change and should be removed later. Differential Revision: https://reviews.llvm.org/D112932	2023-07-11 21:34:53 +07:00
Craig Topper	939f818a66	[RISCV] Split __builtin_riscv_brev8 into _32 and _64 builtin. Allow _32 builtin on RV64 since it only brev8+sext.w. Part of an effort to remove 'long' to mean XLen from the builtin interface. Matches the proposal here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44 Reviewed By: asb Differential Revision: https://reviews.llvm.org/D154683	2023-07-10 13:01:07 -07:00
Craig Topper	a1b7db3e4c	[RISCV] Split __builtin_riscv_xperm4/8 into separate _32 and _64 builtins. Part of an effort to remove uses of 'long' to mean XLen from the builtin interfaces. Also makes the builtin names match https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D154681	2023-07-10 12:18:20 -07:00
Sergio Afonso	63ca93c7d1	[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes `IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to `-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed to `omp.is_target_device`. Getters and setters of all these renamed properties are also updated accordingly. Many unit tests have been updated to use the new names, but an alias for the `-fopenmp-is-device` option is created so that external programs do not stop working after the name change. `IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the `-fopenmp-is-target-device` compiler frontend option, which is only added to the OpenMP device invocation for offloading-enabled programs. Differential Revision: https://reviews.llvm.org/D154591	2023-07-10 14:14:16 +01:00
Lucas Prates	2b7ac62606	[AArch64][RCPC3] Add Neon intrinsics for LDAP1 and STL1 This adds new intrisics to support the LDAP1 and STL1 Advanced SIMD (Neon) instructions introduced as part of FEAT_LRCPC3. The new intrinsics `vldap1(q)_lane`/`vstl1(q)_lane` generate IR code similar to the existing `vld1(q)_lane/st1(q)_lane` ones, but capturing the difference in the atomic release/acquire memory model. The LLVM code generation changes to ensure that this instruction pair is lowered to the correct LDAP1/STL1 instructions will be covered in a separate commit. Based on a patch by Sam Elliott. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D153128	2023-07-07 12:31:55 +01:00

1 2 3 4 5 ...

1787 Commits