https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170
unsigned char __readx18byte(unsigned long)
unsigned short __readx18word(unsigned long)
unsigned long __readx18dword(unsigned long)
unsigned __int64 __readx18qword(unsigned long)
Given the lack of documentation of the intrinsics, we chose to assume an
alignment of just `CharUnits::One()` when calling `IRBuilderBase::CreateAlignedLoad()`.
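Given only the prototypes above, a minimal usage sketch (assuming an AArch64 Windows target, where x18 holds the TEB pointer; the wrapper name is hypothetical):
```
#include <intrin.h>

// Reads one byte from [x18 + offset]; the offset is a byte count with no
// alignment guarantee, which is why codegen assumes 1-byte alignment.
unsigned char read_teb_byte(unsigned long offset) {
  return __readx18byte(offset);
}
```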
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126024
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170
void __writex18byte(unsigned long, unsigned char)
void __writex18word(unsigned long, unsigned short)
void __writex18dword(unsigned long, unsigned long)
void __writex18qword(unsigned long, unsigned __int64)
Given the lack of documentation of the intrinsics, we chose to assume an
alignment of just `CharUnits::One()` when calling `IRBuilderBase::CreateAlignedStore()`.
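A companion sketch for the store side, under the same assumptions (hypothetical wrapper):
```
#include <intrin.h>

// Stores one byte to [x18 + offset]; again only 1-byte alignment is assumed.
void write_teb_byte(unsigned long offset, unsigned char value) {
  __writex18byte(offset, value);
}
```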
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126023
Most clients only used the zextOrSelf, sextOrSelf and truncOrSelf methods
because they wanted to be able to extend or truncate to the same bit width
(which is a no-op). Now that the standard zext, sext and trunc allow this,
there is no reason to use the OrSelf versions.
The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.
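A minimal migration sketch against LLVM's APInt API (the helper name is illustrative):
```
#include "llvm/ADT/APInt.h"
using llvm::APInt;

// Before: V.zextOrSelf(64), which silently no-opped even when 64 was
// *smaller* than V's width. Plain zext now accepts extension to the same
// width as a no-op, but still asserts on a genuinely smaller target width.
APInt widenTo64(const APInt &V) {
  return V.zext(64);
}
```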
Differential Revision: https://reviews.llvm.org/D125557
D117829 added the generic "__builtin_reduce_mul", which we can use to replace the x86-specific integer mul reduction builtins - internally these were already mapping to the same intrinsic, so no test changes are required.
Differential Revision: https://reviews.llvm.org/D125222
Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.mul intrinsic call.
For other reductions, we've tried to share builtins for float/integer vectors, but the fmul reduction intrinsic also takes a starting value argument and can do either unordered or serialized reduction, but not the reduction trees specified for the builtins. However we eventually address fmul support, it shouldn't affect the integer case.
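A minimal usage sketch with clang's vector extension (the typedef and function name are illustrative):
```
typedef int v4si __attribute__((ext_vector_type(4)));

// Multiplies all lanes together; lowers to llvm.vector.reduce.mul.v4i32.
int product(v4si v) {
  return __builtin_reduce_mul(v);
}
```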
Differential Revision: https://reviews.llvm.org/D117829
Compared to the old implementation:
* In C++, we only recurse into aggregate classes.
* Unnamed bit-fields are not printed.
* Constant evaluation is supported.
* Proper conversion is done when passing arguments through `...`.
* Additional arguments are supported and are injected prior to the
format string; this directly supports use with `fprintf`, for example
(see the sketch after this list).
* An arbitrary callable can be passed rather than only a function
pointer. In particular, in C++, a function template or overload set is
acceptable.
* All text generated by Clang is printed via `%s` rather than directly;
this avoids issues where Clang's pretty-printing output might itself
contain a `%` character.
* Fields of types that we don't know how to print are printed with a
`"*%p"` format and passed by address to the print function.
* No return value is produced.
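For example, the additional-arguments support means `fprintf` works directly (a sketch; the struct is illustrative):
```
#include <cstdio>

struct Point { int x, y; };

void dump(const Point &p) {
  // stderr is injected before each format string the builtin generates.
  __builtin_dump_struct(&p, fprintf, stderr);
}
```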
Reviewed By: aaron.ballman, erichkeane, yihanaa
Differential Revision: https://reviews.llvm.org/D124221
D124741 added the generic "__builtin_reduce_add", which we can use to replace the x86-specific integer add reduction builtins - internally these were already mapping to the same intrinsic, so no test changes are required.
Differential Revision: https://reviews.llvm.org/D124757
Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.add intrinsic call.
For other reductions, we've tried to share builtins for float/integer vectors, but the fadd reduction intrinsics also take a starting value argument and can do either unordered or serialized reduction, but not the reduction trees specified for the builtins. However we eventually address fadd support, it shouldn't affect the integer case.
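A usage sketch mirroring the mul case above (illustrative names):
```
typedef short v8ss __attribute__((ext_vector_type(8)));

// Sums all lanes; lowers to llvm.vector.reduce.add.v8i16.
short sum(v8ss v) {
  return __builtin_reduce_add(v);
}
```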
(Split off from D117829)
Differential Revision: https://reviews.llvm.org/D124741
Thanks to @rsmith for pointing this out; I'm sorry for introducing this bug.
See @rsmith's comment in https://reviews.llvm.org/D122248.
E.g. (by @rsmith): https://godbolt.org/z/o7vcbWaEf
I have added a test case.
struct:
```
struct U19A {
  int a;
};
struct U19B {
  struct U19A a;
};
struct U19B a = {
  .a.a = 2022
};
```
Dump result:
```
struct U19B {
  struct U19A a = {
    int a = 2022
  }
}
```
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D122920
This is extended to all `std::` functions that take a reference to a
value and return a reference (or pointer) to that same value: `move`,
`forward`, `move_if_noexcept`, `as_const`, `addressof`, and the
libstdc++-specific function `__addressof`.
We still require these functions to be declared before they can be used,
but don't instantiate their definitions unless their addresses are
taken. Instead, code generation, constant evaluation, and static
analysis are given direct knowledge of their effect.
This change aims to reduce various costs associated with these functions
-- per-instantiation memory costs, compile time and memory costs due to
creating out-of-line copies and inlining them, code size at -O0, and so
on -- so that they are not substantially more expensive than a cast.
Most of these improvements are very small, but I measured a 3% decrease
in -O0 object file size for a simple C++ source file using the standard
library after this change.
We now automatically infer the `const` and `nothrow` attributes on these
now-builtin functions, in particular meaning that we get a warning for
an unused call to one of these functions.
In C++20 onwards, we disallow taking the addresses of these functions,
per the C++20 "addressable function" rule. In earlier language modes, a
compatibility warning is produced but the address can still be taken.
The same infrastructure is extended to the existing MSVC builtin
`__GetExceptionInfo`, which is now only recognized in namespace `std`
like it always should have been.
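A sketch of the user-visible effects, assuming C++20 (the function is illustrative):
```
#include <utility>

int consume() {
  int x = 42;
  std::move(x);           // now warns: unused call to a const, nothrow builtin
  int &&r = std::move(x); // handled directly; no definition is instantiated
  // &std::move           // error in C++20: std::move is not an "addressable
                          // function", so its address may not be taken
  return r;
}
```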
This is a re-commit of
fc3090109643af8d2da9822d0f99c84742b9c877,
a571f82a50416b767fd3cce0fb5027bb5dfec58c,
64c045e25b8471bbb572bd29159c294a82a86a25, and
de6ddaeef3aaa8a9ae3663c12cdb57d9afc0f906,
and reverts aa643f455a5362de7189eac630050d2c8aefe8f2.
This change also includes a workaround for users using libc++ 3.1 and
earlier (!!), as apparently happens on AIX, where std::move sometimes
returns by value.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D123345
Revert "Fixup D123950 to address revert of D123345"
This reverts commit aa643f455a5362de7189eac630050d2c8aefe8f2.
This is extended to all `std::` functions that take a reference to a
value and return a reference (or pointer) to that same value: `move`,
`forward`, `move_if_noexcept`, `as_const`, `addressof`, and the
libstdc++-specific function `__addressof`.
We still require these functions to be declared before they can be used,
but don't instantiate their definitions unless their addresses are
taken. Instead, code generation, constant evaluation, and static
analysis are given direct knowledge of their effect.
This change aims to reduce various costs associated with these functions
-- per-instantiation memory costs, compile time and memory costs due to
creating out-of-line copies and inlining them, code size at -O0, and so
on -- so that they are not substantially more expensive than a cast.
Most of these improvements are very small, but I measured a 3% decrease
in -O0 object file size for a simple C++ source file using the standard
library after this change.
We now automatically infer the `const` and `nothrow` attributes on these
now-builtin functions, in particular meaning that we get a warning for
an unused call to one of these functions.
In C++20 onwards, we disallow taking the addresses of these functions,
per the C++20 "addressable function" rule. In earlier language modes, a
compatibility warning is produced but the address can still be taken.
The same infrastructure is extended to the existing MSVC builtin
`__GetExceptionInfo`, which is now only recognized in namespace `std`
like it always should have been.
This is a re-commit of
fc3090109643af8d2da9822d0f99c84742b9c877,
a571f82a50416b767fd3cce0fb5027bb5dfec58c, and
64c045e25b8471bbb572bd29159c294a82a86a25
which were reverted in
e75d8b70370435b0ad10388afba0df45fcf9bfcc
due to a crasher bug where CodeGen would emit a builtin glvalue as an
rvalue if it constant-folds.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D123345
Revert "Extend support for std::move etc to also cover std::as_const and"
Revert "Update test to handle opaque pointers flag flip."
It crashes on libcxx tests https://lab.llvm.org/buildbot/#/builders/85/builds/8174
This reverts commit fc3090109643af8d2da9822d0f99c84742b9c877.
This reverts commit a571f82a50416b767fd3cce0fb5027bb5dfec58c.
This reverts commit 64c045e25b8471bbb572bd29159c294a82a86a25.
std::addressof, plus the libstdc++-specific std::__addressof.
This brings us to parity with the corresponding GCC behavior.
Remove STDBUILTIN macro that ended up not being used.
We still require these functions to be declared before they can be used,
but don't instantiate their definitions unless their addresses are
taken. Instead, code generation, constant evaluation, and static
analysis are given direct knowledge of their effect.
This change aims to reduce various costs associated with these functions
-- per-instantiation memory costs, compile time and memory costs due to
creating out-of-line copies and inlining them, code size at -O0, and so
on -- so that they are not substantially more expensive than a cast.
Most of these improvements are very small, but I measured a 3% decrease
in -O0 object file size for a simple C++ source file using the standard
library after this change.
We now automatically infer the `const` and `nothrow` attributes on these
now-builtin functions, in particular meaning that we get a warning for
an unused call to one of these functions.
In C++20 onwards, we disallow taking the addresses of these functions,
per the C++20 "addressable function" rule. In earlier language modes, a
compatibility warning is produced but the address can still be taken.
The same infrastructure is extended to the existing MSVC builtin
`__GetExceptionInfo`, which is now only recognized in namespace `std`
like it always should have been.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D123345
This patch changes `EmitPPCBuiltinExpr` in `CGBuiltin.cpp` to remove
the loop at the beginning of the function that emits the arguments and
to delay emitting the arguments until inside the switch statement. These
changes will put `EmitPPCBuiltinExpr` in line with the strategy of the
target independent function `EmitBuiltinExpr`. Also, this patch
ensures that arguments are only emitted once.
Tests that included builtins affected by these changes have been
modified to match expected behaviour.
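A self-contained analogy of the refactor (not the actual CGBuiltin.cpp code): each switch case emits exactly the operands it needs, exactly once, instead of an eager up-front loop:
```
#include <cstdio>

static int emissions = 0;
static int emitArg(int v) { ++emissions; return v; } // stands in for EmitScalarExpr

static int emitBuiltin(int id, int a, int b) {
  switch (id) {
  case 0: return emitArg(a) + emitArg(b); // binary builtin: two emissions
  case 1: return emitArg(a);              // unary builtin: emits only one argument
  default: return 0;
  }
}

int main() {
  emitBuiltin(1, 5, 7);
  std::printf("arguments emitted: %d\n", emissions); // prints 1, not 2
}
```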
Reviewed By: #powerpc, nemanjai, amyk
Differential Revision: https://reviews.llvm.org/D121637
Add support for builtin_[max|min], which have the prototype below:
A builtin_max (A1, A2, A3, ...)
All arguments must have the same type; they must all be float, double, or long double.
Internally use SelectCC to get the result.
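A usage sketch, assuming the builtin is spelled as in the prototype above (the exact spelling is an assumption, not confirmed here):
```
// All operands must share one of float, double, or long double; the
// result is selected via SelectCC, per the note above.
double largest(double a, double b, double c) {
  return __builtin_max(a, b, c);
}
```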
Reviewed By: qiucf
Differential Revision: https://reviews.llvm.org/D122478
We had some discussion in D99152 and on llvm-dev, and finally came up with
a solution: add AMX-specific cast intrinsics. The intrinsics are already
supported in LLVM IR. This patch replaces bitcast with the AMX cast
intrinsics when emitting code in the frontend.
Differential Revision: https://reviews.llvm.org/D122567
Beautify the dump format: add indentation for nested structs and struct members, and fix the test cases in dump-struct-builtin.c.
for example:
struct:
```
struct A {
  int a;
  struct B {
    int b;
    struct C {
      struct D {
        int d;
        union E {
          int x;
          int y;
        } e;
      } d;
      int c;
    } c;
  } b;
};
```
Before:
```
struct A {
int a = 0
struct B {
int b = 0
struct C {
struct D {
int d = 0
union E {
int x = 0
int y = 0
}
}
int c = 0
}
}
}
```
After:
```
struct A {
  int a = 0
  struct B {
    int b = 0
    struct C {
      struct D {
        int d = 0
        union E {
          int x = 0
          int y = 0
        }
      }
      int c = 0
    }
  }
}
```
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D122704
Remove anonymous tag locations from the dump (controlled by 'PrintingPolicy');
@aaron.ballman once suggested removing this extra information in
https://reviews.llvm.org/D122248.
struct:
```
struct S {
  int a;
  struct /* Anonymous */ {
    int x;
  } b;
  int c;
};
```
Before:
```
struct S {
  int a = 0
  struct S::(unnamed at ./builtin_dump_struct.c:20:3) {
    int x = 0
  }
  int c = 0
}
```
After:
```
struct S {
  int a = 0
  struct S::(unnamed) {
    int x = 0
  }
  int c = 0
}
```
Differential Revision: https://reviews.llvm.org/D122670
Fix a clang crash and add bit-field support in __builtin_dump_struct.
In clang 13.0.x, a struct with three or more members plus a bit-field
crashes clang. In clang 15.x, any struct containing a bit-field crashes
clang.
Open issue: https://github.com/llvm/llvm-project/issues/54462
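A minimal reproducer sketch (any struct containing a bit-field was enough to trigger the crash; names are illustrative):
```
#include <cstdio>

struct S {
  int i : 3; // the bit-field is what used to crash __builtin_dump_struct
};

void repro() {
  S s{1};
  __builtin_dump_struct(&s, &printf);
}
```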
Differential Revision: https://reviews.llvm.org/D122248
https://reviews.llvm.org/D23944 implemented the #pragma intrinsic from
MSVC. This causes the statement #pragma intrinsic(cpuid) to fail [0]
in Clang, because cpuid is currently implemented in intrin.h rather
than as a Clang builtin. Reimplementing cpuid (as well as its related
function, cpuidex) as builtins should resolve this.
[0]: https://crbug.com/1279344
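A sketch of what now works (MSVC-style code, assuming an x86 target; MSVC spells the intrinsic __cpuid):
```
#include <intrin.h>
#pragma intrinsic(__cpuid) // previously failed in Clang, per the bug above

void vendor(int regs[4]) {
  __cpuid(regs, 0); // leaf 0: vendor string returned in EBX, EDX, ECX
}
```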
Differential Revision: https://reviews.llvm.org/D121653
Summary:
Specifically, for trap handling, for targets that do not support getDoorbellID,
we load the queue_ptr from the implicit kernarg, and move queue_ptr to s[0:1].
To get aperture bases when targets do not have aperture registers, we load
private_base or shared_base directly from the implicit kernarg. In clang, we use
implicitarg_ptr + offsets to implement __builtin_amdgcn_workgroup_size_{xyz}.
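A device-side usage sketch (AMDGPU target assumed; the wrapper is illustrative):
```
// Each builtin now lowers to a load from implicitarg_ptr at a fixed offset.
unsigned workgroup_volume() {
  return __builtin_amdgcn_workgroup_size_x() *
         __builtin_amdgcn_workgroup_size_y() *
         __builtin_amdgcn_workgroup_size_z();
}
```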
Reviewers: arsenm, sameerds, yaxunl
Differential Revision: https://reviews.llvm.org/D120265
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.
Basically the same as D120527.
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D121847
Fix the instruction names to match the WebAssembly spec:
- `i32x4.trunc_sat_zero_f64x2_{s,u}` => `i32x4.trunc_sat_f64x2_{s,u}_zero`
- `f32x4.demote_zero_f64x2` => `f32x4.demote_f64x2_zero`
Also rename related things like intrinsics, builtins, and test functions to
match.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D121661
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D120527
Currently in Clang, we have two types of builtins for the fnmsub operation:
one for float/double vectors, which are transformed into IR operations;
one for float/double scalars, which generate the corresponding intrinsics.
But for the vector version of the builtin, the three-op chain may be
recognized as expensive by some passes (like early CSE). We need some way
to keep the fnmsub form until code generation.
This patch introduces the ppc.fnmsub.* intrinsic to unify the four fnmsub
intrinsics.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D116015
To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().
While doing the rename, I've filled in element types in cases
where it was relatively obvious, but we're still left with 135
calls to the deprecated constructor.
In Clang we can attach TBAA metadata to the load/store intrinsics based on
the operation's element type.
This also contains changes to InstCombine where the AArch64-specific
intrinsics are transformed into generic LLVM load/store operations,
to ensure that all metadata is transferred to the new instruction.
There will be some further work after this patch to also emit TBAA
metadata for SVE's gather/scatter- and struct load/store intrinsics.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D119319
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions.
This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:
```
__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
  // CHECK-LABEL: test_mm256_adds_epi8
  // CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
  return _mm256_adds_epi8(a, b);
}
```
This patch implements `__builtin_elementwise_add_sat` and `__builtin_elementwise_sub_sat` builtins.
These map to the add/sub saturated math intrinsics described here:
https://llvm.org/docs/LangRef.html#saturation-arithmetic-intrinsics
With this in place we should then be able to replace the x86 SSE adds/subs intrinsics with these generic variants - it looks like other targets should be able to use these as well (arm/aarch64/webassembly all have similar examples in cgbuiltin).
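A usage sketch with clang's vector extension (the typedef is illustrative):
```
typedef signed char v16qb __attribute__((ext_vector_type(16)));

// Per-lane saturating add: maps to llvm.sadd.sat.v16i8.
v16qb clamped_add(v16qb a, v16qb b) {
  return __builtin_elementwise_add_sat(a, b);
}
```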
Differential Revision: https://reviews.llvm.org/D117898