Add clang builtins for the new tied wmma intrinsics.
These variations tie the destination
accumulator matrix to the input
accumulator matrix.
See https://github.com/llvm/llvm-project/pull/69903 for context.
This reverts commit e8fe4de64ffb84924c41e54116a04570046eed74.
memcpy/memmove instrumentation for -fsanitize=alignment has been tested
on a huge code base. There were some cleanups, but their number does not
justify a workaround.
Break down the counted_by calculations so that they correctly handle
anonymous structs, which are specified internally as IndirectFieldDecls.
Improves the calculation of __bdos on a different field member in the struct,
and also improves support for __bdos on an index into the FAM. If the index
is beyond the length of the FAM, then we return __bdos's "can't
determine the size" value (zero or negative one, depending on type).
Also simplify the code to use helper methods to get the field referenced
by counted_by and the flexible array member itself, which also had some
issues with FAMs in sub-structs.
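A minimal sketch of the FAM-index case described above (the struct layout and the type-1 __bdos call are illustrative assumptions, not code from this patch):
```
#include <stddef.h>

struct flex {
  int count;
  int fam[] __attribute__((counted_by(count)));
};

size_t bytes_at_index(struct flex *p, int idx) {
  /* For an index within p->count the size can be derived from counted_by;
     for an index beyond the FAM's length, __bdos falls back to its "can't
     determine the size" value (-1 for type 1, 0 for type 3). */
  return __builtin_dynamic_object_size(&p->fam[idx], 1);
}
```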
Fixes the DeviceRTL compilation to ensure it is ABI agnostic. Uses the
already available global variable "oclc_ABI_version" instead of
"llvm.amdgcn.abi.version".
It also adds some minor fields to the ImplicitArg structure.
amdgcn_update_dpp intrinsic (#71139)""
This reverts commit d1fb9307951319eea3e869d78470341d603c8363 and fixes
the lit test clang/test/CodeGenHIP/dpp-const-fold.hip
---------
Authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
Operands of `__builtin_amdgcn_update_dpp` need to evaluate to constants
to match the intrinsic requirements.
Fixes: SWDEV-426822, SWDEV-431138
---------
Authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
Now that `align_{up,down}` use `llvm.ptrmask` (as of #71238), the
assume doesn't preserve any information that is not still easily
re-computable.
Closes #71295
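For reference, a minimal sketch of the builtins involved (the callers and constants are made up for illustration):
```
void *round_up(void *p) {
  /* __builtin_align_up rounds p up to a 16-byte boundary; with #71238 this
     lowers through llvm.ptrmask, so a separate llvm.assume on the result's
     alignment no longer adds information. */
  return __builtin_align_up(p, 16);
}

void *round_down(void *p) {
  /* Likewise for rounding down. */
  return __builtin_align_down(p, 16);
}
```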
This patch converts `ImplicitParamDecl::ImplicitParamKind` into a scoped enum at namespace scope, making it eligible for forward declaring. This is useful for `preferred_type` annotations on bit-fields.
This patch adds reinterpret builtins as proposed here:
https://github.com/ARM-software/acle/pull/275.
The builtins take the form:
sv<dst>x<N>_t svreinterpret_<dst>_<src>_x<N>(sv<src>x<N>_t op)
where
- <src> and <dst> designate the source and the destination type,
respectively, all pairs chosen from {s8, u8, s16, u16, s32, u32, s64,
u64, bf16, f16, f32, f64}
- <N> designates the number of tuple elements, 2, 3 or 4
A short (overloaded) form is also provided, where the destination type is
explicitly designated and the source type is deduced from the parameter
type. These take the form
sv<dst>x<N>_t svreinterpret_<dst>(sv<src>x<N>_t op)
For example:
svuint16x2_t svreinterpret_u16_s32_x2(svint32x2_t op);
svuint16x2_t svreinterpret_u16(svint32x2_t op);
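A hedged usage sketch of the two forms (assumes a compiler and target with the SVE tuple reinterprets available):
```
#include <arm_sve.h>

svuint16x2_t to_u16_explicit(svint32x2_t op) {
  return svreinterpret_u16_s32_x2(op); /* source and destination spelled out */
}

svuint16x2_t to_u16_overloaded(svint32x2_t op) {
  return svreinterpret_u16(op);        /* destination only; source deduced */
}
```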
This patch removes duplicated code in EmitAArch64SVEBuiltinExpr and
EmitAArch64SMEBuiltinExpr by creating a new function called
GetAArch64SVEProcessedOperands which handles splitting up multi-vector
arguments using vector extracts.
These changes are non-functional.
The C language standard defines the library functions `iszero`, `issignaling`
and `issubnormal`, which did not have counterparts among the clang builtin
functions. This change adds new builtins:
__builtin_iszero
__builtin_issubnormal
__builtin_issignaling
They provide builtin implementations for the missing standard functions.
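A hedged usage sketch (values are illustrative; 1e-310 is assumed to be subnormal for IEEE-754 binary64):
```
#include <stdio.h>

int main(void) {
  double zero = 0.0;
  double tiny = 1e-310; /* subnormal for IEEE-754 binary64 */

  printf("%d\n", __builtin_iszero(zero));      /* 1 */
  printf("%d\n", __builtin_issubnormal(tiny)); /* 1 */
  printf("%d\n", __builtin_issignaling(zero)); /* 0 */
  return 0;
}
```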
Pull request: https://github.com/llvm/llvm-project/pull/69041
This patch moves `ArraySizeModifier` before `Type` declaration so that it's complete at `ArrayTypeBitfields` declaration. It's also converted to scoped enum along the way.
smmintrin.h uses __builtin_mffs, __builtin_mffsl, __builtin_mtfsf and
__builtin_set_fpscr_rn. This patch replaces those uses with the ppc-prefixed
versions and implements the missing ones.
Deploying #67766 to a large internal codebase uncovers many bugs (many are
probably benign but need cleaning up). There are also issues in high-profile
open-source projects like v8. Add a cl::opt to disable builtin instrumentation
for -fsanitize=alignment to help large codebase users.
In the long term, this cl::opt option may still be useful to debug
-fsanitize=alignment instrumentation on builtins, so we probably want to
keep it around.
The svldr_vnum_za and svstr_vnum_za builtins/intrinsics currently
require that the vnum argument be an immediate, but since vnum is used
to modify the base register via a mul and add, that restriction is not
necessary. This patch removes that restriction.
This patch adds the CodeGen changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. This change relaxes restrictions on what gets emitted on the device path, when compiling in `hipstdpar` mode:
1. Unless a function is explicitly marked `__host__`, it will get emitted, whereas before only `__device__` and `__global__` functions would be emitted;
2. Unsupported builtins are ignored as opposed to being marked as an error, as the decision on their validity is deferred to the `hipstdpar` specific code selection pass;
3. We add a `hipstdpar` specific pass to the opt pipeline, independent of optimisation level:
- When compiling for the host, iff the user requested it via the `--hipstdpar-interpose-alloc` flag, we add a pass which replaces canonical allocation / deallocation functions with accelerator aware equivalents.
A test to validate that unannotated functions get correctly emitted is added as well.
Reviewed by: yaxunl, efriedma
Differential Revision: https://reviews.llvm.org/D155850
The 'counted_by' attribute is used on flexible array members. The
argument for the attribute is the name of the field member in the same
structure holding the count of elements in the flexible array. This
information can be used to improve the results of the array bound
sanitizer and the '__builtin_dynamic_object_size' builtin.
This example specifies that the flexible array member 'array' has
the number of elements allocated for it in 'count':
struct bar;
struct foo {
  size_t count;
  /* ... */
  struct bar *array[] __attribute__((counted_by(count)));
};
This establishes a relationship between 'array' and 'count',
specifically that 'p->array' must have *at least* 'p->count' number of
elements available. It's the user's responsibility to ensure that this
relationship is maintained through changes to the structure.
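Building on the 'struct foo' definition above, a hedged sketch of what the
attribute buys __builtin_dynamic_object_size (the helper and the type-1 call
are illustrative, not part of this patch):
```
size_t available_bytes(struct foo *p) {
  /* With counted_by, __bdos can derive p->count * sizeof(struct bar *) here
     instead of returning the "unknown" value (-1 for type 1). */
  return __builtin_dynamic_object_size(p->array, 1);
}
```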
In the following, the allocated array erroneously has fewer elements
than what's specified by 'p->count'. This would result in an
out-of-bounds access not being detected:
struct foo *p;

void foo_alloc(size_t count) {
  p = malloc(MAX(sizeof(struct foo),
                 offsetof(struct foo, array[0]) + count * sizeof(struct bar *)));
  p->count = count + 42;
}
The next example updates 'p->count', breaking the relationship
requirement that 'p->array' must have at least 'p->count' number of
elements available:
struct foo *p;

void foo_alloc(size_t count) {
  p = malloc(MAX(sizeof(struct foo),
                 offsetof(struct foo, array[0]) + count * sizeof(struct bar *)));
  p->count = count + 42;
}

void use_foo(int index) {
  p->count += 42;
  p->array[index] = 0; /* The sanitizer cannot properly check this access */
}
Reviewed By: nickdesaulniers, aaron.ballman
Differential Revision: https://reviews.llvm.org/D148381
The -fsanitize=alignment implementation follows the model that we allow
forming unaligned pointers but disallow accessing unaligned pointers.
See [RFC: Enforcing pointer type alignment in Clang](https://lists.llvm.org/pipermail/llvm-dev/2016-January/094012.html)
for detail.
memcpy is a memory access and we require an `int *` argument to be aligned.
Similar to https://reviews.llvm.org/D9673, emit a -fsanitize=alignment check for
arguments of the builtin memcpy and memmove functions to catch misaligned loads like:
```
// Check the alignment of a but ignore the alignment of b
void unaligned_load(int *a, void *b) { memcpy(a, b, sizeof(*a)); }
```
For a reference parameter, we emit a -fsanitize=alignment check as well, which
can be optimized out by InstCombinePass. We rely on the call site
`TCK_ReferenceBinding` check instead.
```
// The alignment check of a will be optimized out.
void unaligned_load(int &a, void *b) { memcpy(&a, b, sizeof(a)); }
```
The diagnostic message looks like
```
runtime error: store to misaligned address [[PTR:0x[0-9a-f]*]] for type 'int *'
```
We could use a better message for memcpy, but we don't do it for now as it would
require a new check name like misaligned-pointer-use, which is probably not
necessary. *RFC: Enforcing pointer type alignment in Clang* is not well documented,
but this patch does not intend to change that.
Technically builtin memset functions can be checked for -fsanitize=alignment as
well, but it does not seem too useful.
This reverts commit 9a954c693573281407f6ee3f4eb1b16cc545033d, which
causes clang crashes when compiling with `-fsanitize=bounds`. See
9a954c6935 (commitcomment-129529574)
for details.
The 'counted_by' attribute is used on flexible array members. The
argument for the attribute is the name of the field member in the same
structure holding the count of elements in the flexible array. This
information can be used to improve the results of the array bound sanitizer
and the '__builtin_dynamic_object_size' builtin.
This example specifies that the flexible array member 'array' has the
number of elements allocated for it in 'count':
struct bar;
struct foo {
  size_t count;
  /* ... */
  struct bar *array[] __attribute__((counted_by(count)));
};
This establishes a relationship between 'array' and 'count', specifically
that 'p->array' must have *at least* 'p->count' number of elements available.
It's the user's responsibility to ensure that this relationship is maintained
through changes to the structure.
In the following, the allocated array erroneously has fewer elements than
what's specified by 'p->count'. This would result in an out-of-bounds access
not being detected:
struct foo *p;

void foo_alloc(size_t count) {
  p = malloc(MAX(sizeof(struct foo),
                 offsetof(struct foo, array[0]) + count * sizeof(struct bar *)));
  p->count = count + 42;
}
The next example updates 'p->count', breaking the relationship requirement that
'p->array' must have at least 'p->count' number of elements available:
struct foo *p;

void foo_alloc(size_t count) {
  p = malloc(MAX(sizeof(struct foo),
                 offsetof(struct foo, array[0]) + count * sizeof(struct bar *)));
  p->count = count + 42;
}

void use_foo(int index) {
  p->count += 42;
  p->array[index] = 0; /* The sanitizer cannot properly check this access */
}
Reviewed By: nickdesaulniers, aaron.ballman
Differential Revision: https://reviews.llvm.org/D148381
The patch fixes Function Multi Versioning feature detection by the ifunc
resolver on Android API levels < 30.
Ifunc hwcaps parameters are not supported on Android API levels 23-29,
so all CPU features are marked as unsupported if they were not initialized
before the ifunc resolver call.
There is no support for ifunc on Android API levels < 23, so Function
Multi Versioning is disabled in this case.
Also use a two-underscore prefix for the FMV runtime support functions to
avoid conflicts with user program ones.
Differential Revision: https://reviews.llvm.org/D158641
- Update CodeGenTypeCache to use a single union for all pointers in
address space zero.
- Introduce a UnqualPtrTy in CodeGenTypeCache, and use that (for
example instead of llvm::PointerType::getUnqual) in some places.
- Drop some redundant bit/pointer casts from ptr to ptr.
Add __builtin_bcopy to the list of GNU builtins. Its absence was causing a
series of test failures in glibc.
Adjust the tests to reflect the changes in codegen.
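A hedged sketch of a call the builtin now covers (the overlapping-copy scenario is illustrative):
```
#include <stddef.h>

void shift_right_by_one(char *buf, size_t n) {
  /* bcopy(src, dst, len) has memmove semantics (overlap-safe); recognizing
     it as a GNU builtin lets clang treat this call like the library bcopy. */
  __builtin_bcopy(buf, buf + 1, n);
}
```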
Fixes #51409.
Fixes #63065.
Implement the _Count* and _Copy* Windows ARM intrinsics:
```
double _CopyDoubleFromInt64(__int64)
float _CopyFloatFromInt32(__int32)
__int32 _CopyInt32FromFloat(float)
__int64 _CopyInt64FromDouble(double)
unsigned int _CountLeadingOnes(unsigned long)
unsigned int _CountLeadingOnes64(unsigned __int64)
unsigned int _CountLeadingSigns(long)
unsigned int _CountLeadingSigns64(__int64)
unsigned int _CountLeadingZeros(unsigned long)
unsigned int _CountLeadingZeros64(unsigned __int64)
unsigned int _CountOneBits(unsigned long)
unsigned int _CountOneBits64(unsigned __int64)
```
Full list of intrinsics here:
[https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics](https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics)
Bug: [65405](https://github.com/llvm/llvm-project/issues/65405)
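A hedged usage sketch (assumes an ARM64 Windows target where these intrinsics are declared in intrin.h):
```
#include <intrin.h>
#include <stdio.h>

int main(void) {
  __int64 bits = _CopyInt64FromDouble(1.0);     /* raw IEEE-754 bit pattern */
  unsigned int lz = _CountLeadingZeros64(1ULL); /* 63 */
  unsigned int ones = _CountOneBits(0xFFUL);    /* 8 */
  printf("%lld %u %u\n", bits, lz, ones);
  return 0;
}
```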
Update handling of math errno. This change updates the logic for
generation of math intrinsics in place of math library function calls.
The previous logic https://reviews.llvm.org/D151834 was incorrectly
using intrinsics when math errno handling was needed at optimization
levels above -O0.
This also fixes the issue mentioned in https://reviews.llvm.org/D151834 by
@uabelho.
This is joint work with Andy Kaylor (@andykaylor).
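A hedged illustration of why the errno distinction matters (plain C, nothing specific to this patch):
```
#include <errno.h>
#include <math.h>
#include <stdio.h>

int main(void) {
  errno = 0;
  double r = sqrt(-1.0); /* domain error */
  /* When math errno handling is required, clang must call the library sqrt
     (which sets errno to EDOM) rather than lowering to the llvm.sqrt
     intrinsic, which would drop the errno side effect. */
  printf("%f %d\n", r, errno == EDOM);
  return 0;
}
```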
The new ACLE PR#225[1] now combines the slice parameters for some
builtins.
Slice specifies the ZA slice number directly and needs to be explicitly
computed by the "user" as the base register plus the immediate offset.
[1] https://github.com/ARM-software/acle/pull/225/files
Summary:
We use the `llvm.amdgcn.abi.version` variable to control code generation.
This is emitted in every module now to indicate what should be used when
compiling. Previously, the logic caused us to emit an external reference
to this variable when creating the code for the `none` type. This would
then cause us not to emit the actual definition. This patch refines the
logic to create the external reference, and then update it if it is
found unset by the time we emit the global. I had to remove the
reference to `GetOrCreateLLVMGlobal` because it did not accept the
proper address space.
The new ACLE PR#225[1] now combines the slice parameters for some
builtins. This patch is the second of 3 patches to update the interface.
Slice specifies the ZA slice number directly and needs to be explicitly
computed by the "user" as the base register plus the immediate offset.
[1] https://github.com/ARM-software/acle/pull/225/files
Update DeviceRTL and the AMDGPU plugin to support code
object version 5. Default is code object version 4.
CodeGen for __builtin_amdgcn_workgroup_size generates code
for cov4 as well as cov5 if -mcode-object-version=none
is specified. DeviceRTL compilation passes this argument
via Xclang option to generate abi-agnostic code.
Generated code for the above builtin uses a clang
control constant "llvm.amdgcn.abi.version" to branch on
the abi version, which is available during linking of
user's OpenMP code. Load of this constant gets eliminated
during linking.
AMDGPU plugin queries the ELF for code object version
and then prepares various implicitargs accordingly.
Differential Revision: https://reviews.llvm.org/D139730
Reviewed By: jhuber6, yaxunl
GCC 12 (https://gcc.gnu.org/PR101696) allows
__builtin_cpu_supports("x86-64") (and -v2 -v3 -v4).
This patch ports the feature.
* Add `FEATURE_X86_64_{BASELINE,V2,V3,V4}` to enum ProcessorFeatures,
but keep CPU_FEATURE_MAX unchanged to make
FeatureInfos/FeatureInfos_WithPLUS happy.
* Change validateCpuSupports to allow `x86-64{,-v2,-v3,-v4}`
* Change getCpuSupportsMask to return `std::array<uint32_t, 4>` where
`x86-64{,-v2,-v3,-v4}` set bits `FEATURE_X86_64_{BASELINE,V2,V3,V4}`.
* `target("x86-64")` and `cpu_dispatch(x86_64)` are invalid. Tested by commit 9de3b35ac9159d5bae6e6796cb91e4f877a07189
Close https://github.com/llvm/llvm-project/issues/59961
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D158811
The FP options specified through pragmas are already encoded in the Expr.
This patch takes the same approach used by clang codegen to emit
fast-math flags for fadd instructions: basically, use RAII to set the
current fast-math flags on the IRBuilder, which is then used to emit the
sqrt intrinsic.
Fixes: https://github.com/llvm/llvm-project/issues/64653
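A hedged source-level illustration of the scenario (assumes -fno-math-errno so the call lowers to the llvm.sqrt intrinsic; float_control is just one way to set FP options via pragma):
```
#include <math.h>

float scaled_sqrt(float x) {
  #pragma float_control(precise, off)
  /* The FP options from the pragma are recorded on the expression; with the
     RAII-scoped fast-math flags on the IRBuilder, the emitted sqrt intrinsic
     is expected to carry matching flags. */
  return 2.0f * sqrtf(x);
}
```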