llvm-project

Author	SHA1	Message	Date
Koakuma	c2fba6df94	[clang][SPARC] Treat empty structs as if it's a one-bit type in the CC (#90338 ) Make sure that empty structs are treated as if they have a size of one bit in function parameters and return types so that they occupy a full argument and/or return register slot. This fixes crashes and miscompilations when passing and/or returning empty structs. Reviewed by: @s-barannikov	2024-05-15 20:49:28 +07:00
Lukacma	421862f8e4	[Clang] Fix incorrect passing of _BitInt args (#90741 ) This patch removes incorrect `byval` attribute from pointer argument passed with >128 bit long _BitInt types.	2024-05-15 10:51:32 +01:00
Phoebe Wang	5bde8017a1	[X86][vectorcall] Pass built types byval when xmm0~6 exhausted (#91846 ) This is how MSVC handles it. https://godbolt.org/z/fG386bjnf	2024-05-13 08:31:49 +08:00
Fangrui Song	e9f53e4095	[test] Move RISCV tests to clang/test/CodeGen/RISCV/ The directory was created by 2f1fe9a3a60d6f18998c5f3b7e643d4cbaa4e65d (2020). Pull Request: https://github.com/llvm/llvm-project/pull/91783	2024-05-10 13:22:07 -07:00
Momchil Velikov	2371a6410d	[AArch64] Add intrinsics for non-widening FMOPA/FMOPS (#88105 ) According to the specification in https://github.com/ARM-software/acle/pull/309 this adds the intrinsics void svmopa_za16[_f16]_m(uint64_t tile, svbool_t pn, svbool_t pm, svfloat16_t zn, svfloat16_t zm) __arm_streaming __arm_inout("za"); void svmops_za16[_f16]_m(uint64_t tile, svbool_t pn, svbool_t pm, svfloat16_t zn, svfloat16_t zm) __arm_streaming __arm_inout("za"); as well as the corresponding `bf16` variants.	2024-05-10 11:57:08 +01:00
Momchil Velikov	64d4ade3bb	[AArch64] Add intrinsics for 16-bit non-widening FMLA/FMLS (#88553 ) According to the specification in https://github.com/ARM-software/acle/pull/309 add the following intrinsics void svmla[_single]_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn, svfloat16_t zm) void svmla[_single]_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn, svfloat16_t zm) void svmls[_single]_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn, svfloat16_t zm) void svmls[_single]_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn, svfloat16_t zm) void svmla_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn, svfloat16x2_t zm) void svmla_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn, svfloat16x4_t zm) void svmls_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn, svfloat16x2_t zm) void svmls_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn, svfloat16x4_t zm) void svmla_lane_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn, svfloat16_t zm, uint64_t imm_idx) void svmla_lane_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn, svfloat16_t zm, uint64_t imm_idx) void svmls_lane_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn, svfloat16_t zm, uint64_t imm_idx) void svmls_lane_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn, svfloat16_t zm, uint64_t imm_idx) as well as the corresponding `_bf16` variants.	2024-05-10 11:14:26 +01:00
Brendan Dahl	8a3277acbc	[WebAssembly] Implement prototype f32.store_f16 instruction. (#91545 ) Adds a builtin and intrinsic for the f32.store_f16 instruction. The instruction stores an f32 value as an f16 in memory. Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect and will be changed to 0xFC31 soon.	2024-05-09 15:38:13 -07:00
Tomas Matheson	ddad7c3c84	[AArch64] add some more tests for FMV (#91490 ) Add a couple of tests to make it clear: - when FMV should be enabled and disabled by the driver. - which extensions are enabled/disabled based on the dependencies specified in TargetParser.	2024-05-09 21:51:39 +01:00
Momchil Velikov	139e0aa68d	Revert "[AArch64] Add intrinsics for multi-vector to ZA array vector accumulators" (#91597 ) Reverts llvm/llvm-project#88266 due to test failures error: 'expected-error' diagnostics seen but not expected: (frontend): '-fsyntax-only' action ignored; '-emit-llvm' action specified previously	2024-05-09 15:03:52 +01:00
Momchil Velikov	e88ba6d975	[AArch64] Add intrinsics for multi-vector to ZA array vector accumulators (#88266 ) According to the specification in https://github.com/ARM-software/acle/pull/309 this adds the intrinsics void svadd_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn) __arm_streaming __arm_inout("za"); void svadd_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn) __arm_streaming __arm_inout("za"); void svsub_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn) __arm_streaming __arm_inout("za"); void svsub_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn) __arm_streaming __arm_inout("za"); as well as the corresponding `bf16` variants.	2024-05-09 14:30:53 +01:00
Daniil Kovalev	ad652efa1f	[AArch64][PAC][clang][ELF] Support PAuth ABI core info (#85235 ) Depends on #87545 Emit PAuth ABI compatibility tag values as llvm module flags: - `aarch64-elf-pauthabi-platform` - `aarch64-elf-pauthabi-version` For platform 0x10000002 (llvm_linux), the version value bits correspond to the following LangOptions defined in #85232: - bit 0: `PointerAuthIntrinsics`; - bit 1: `PointerAuthCalls`; - bit 2: `PointerAuthReturns`; - bit 3: `PointerAuthAuthTraps`; - bit 4: `PointerAuthVTPtrAddressDiscrimination`; - bit 5: `PointerAuthVTPtrTypeDiscrimination`; - bit 6: `PointerAuthInitFini`. --------- Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>	2024-05-09 15:32:18 +03:00
Lukacma	105dd60fc8	[Clang][AArch64] Fixed incorrect _BitInt alignment (#90602 ) This patch makes determining alignment and width of BitInt to be target ABI specific and makes it consistent with [Procedure Call Standard for the Arm® 64-bit Architecture (AArch64)](https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst) for AArch64 targets.	2024-05-09 10:45:19 +01:00
Jacob Lambert	11a6799740	[clang][CodeGen] Omit pre-opt link when post-opt link is requested (#85672 ) Currently, when the -relink-builtin-bitcodes-postop option is used, we link builtin bitcodes twice: once before optimization, and again after optimization. With this change, we omit the pre-opt linking when the option is set, and we rename the option to the following: -Xclang -mlink-builtin-bitcodes-postopt (-Xclang -mno-link-builtin-bitcodes-postopt) The goal of this change is to reduce compile time. We do lose the theoretical benefits of pre-opt linking, but in practice these are smaller than the overhead of linking twice. However, we may be able to address this in a future patch by adjusting the position of the builtin-bitcode linking pass. Compilations not setting the option are unaffected.	2024-05-08 08:11:15 -07:00
Freddy Ye	e44600f3ab	[X86][CFE] Support EGPR in GCCRegNames. (#91323 )	2024-05-08 15:07:18 +08:00
Farzon Lotfi	31b45a9d0d	[clang][hlsl] Add tan intrinsic part 1 (#90276 ) This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 If you want an overarching view of how this will all connect see: https://github.com/llvm/llvm-project/pull/90088 Changes: - `clang/docs/LanguageExtensions.rst` - Document the new elementwise tan builtin. - `clang/include/clang/Basic/Builtins.td` - Implement the tan builtin. - `clang/lib/CodeGen/CGBuiltin.cpp` - Invoke the tan intrinsic on uses of the builtin. - `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the tan builtin with the equivalent HLSL APIs. - `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks as well as HLSL-specific sema checks to the tan builtin. - `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic. - `llvm/docs/LangRef.rst` - Document the tan intrinsic.	2024-05-07 22:54:15 -04:00
Max Winkler	3f37397c95	[clang][CodeGen] Fix MSVC ABI for classes with a deleted copy assignment operator (#90547 ) For global functions and static methods the MSVC ABI returns structs/classes with a deleted copy assignment operator indirectly. From local testing this ABI holds true for all currently supported architectures including ARM64EC.	2024-05-07 19:46:19 -04:00
Sean Perry	c9ab1d8905	Mark test cases as unsupported on z/OS (#90990 ) These test cases either test features not available when targeting s390x-ibm-zos or use tools/features not available on the z/OS operating system. In a couple of cases the lit test had a number of subtests with one or two that aren't supported on z/OS. Rather than mark the entire test as unsupported, I split out the unsupported tests into a separate test case.	2024-05-07 15:23:50 -04:00
Brendan Dahl	1a2a1fbd7c	[WebAssembly] Implement prototype f32.load_f16 instruction. (#90906 ) Adds a builtin and intrinsic for the f32.load_f16 instruction. The instruction loads an f16 value from memory and puts it in an f32. Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is incorrect and will be changed to 0xFC30 soon.	2024-05-07 11:33:10 -07:00
Petr Hosek	8bcb073705	[Clang] -fseparate-named-sections option (#91028 ) When set, the compiler will use separate unique sections for global symbols in named special sections (e.g. symbols that are annotated with __attribute__((section(...)))). Doing so enables linker GC to collect unused symbols without having to use a different section per-symbol.	2024-05-07 09:18:55 -07:00
ostannard	1fd196c8df	[AArch64] Diagnose more functions when FP not enabled (#90832 ) When using a hard-float ABI for a target without FP registers, it's not possible to correctly generate code for functions with arguments which must be passed in floating-point registers. This is diagnosed in CodeGen instead of Sema, to more closely match GCC's behaviour around inline functions, which is relied on by the Linux kernel. Previously, this only checked function signatures as they were code-generated, but this missed some cases: * Calls to functions not defined in this translation unit. * Calls through function pointers. * Calls to variadic functions, where the variadic arguments have a floating-point type. This adds checks to function calls, as well as definitions, so that these cases are correctly diagnosed.	2024-05-07 09:17:05 +01:00
Doug Wyatt	ddecadabeb	[clang backend] In AArch64's DataLayout, specify a minimum function alignment of 4. (#90702 ) This addresses an issue where the explicit alignment of 2 (for C++ ABI reasons) was being propagated to the back end and causing under-aligned functions (in special sections). This is an alternate approach suggested by @efriedma-quic in PR #90415. Fixes #90358	2024-05-05 19:05:15 -07:00
Fangrui Song	d33937b623	[test] %clang_cc1: remove redundant actions ParseFrontendArgs takes the last OPT_Action_Group option. The other actions are overridden.	2024-05-05 11:42:04 -07:00
Fangrui Song	7e59223ac4	[test] %clang_cc1: remove redundant actions ParseFrontendArgs takes the last OPT_Action_Group option. The other actions are overridden.	2024-05-05 10:46:06 -07:00
Fangrui Song	7c1d9b15ee	[test] %clang_cc1: remove redundant actions	2024-05-04 23:08:11 -07:00
Fangrui Song	3c311b0222	[test] %clang_cc1 -S: remove overridden -emit-llvm	2024-05-04 17:49:32 -07:00
Fangrui Song	a312dd68c0	[BPF,test] %clang_cc1 -emit-llvm: remove redundant -S	2024-05-04 17:37:36 -07:00
Fangrui Song	c4c3efa161	[test] %clang_cc1 -emit-llvm: remove redundant -S	2024-05-04 17:31:08 -07:00
Fangrui Song	0d501f38f3	[test] %clang_cc1 -emit-llvm: remove redundant -S Also replace aarch64-none-linux-gnu (none can indicate an OS as well) with aarch64	2024-05-04 17:15:51 -07:00
Fangrui Song	c5de4dd1ea	[test] %clang_cc1 -emit-llvm: remove redundant -S And replace -emit-llvm -o - with -emit-llvm-only	2024-05-04 17:00:29 -07:00
Karl-Johan Karlsson	cb015b9ec9	[clang][CodeGen] Propagate pragma set fast-math flags to floating point builtins (#90377 ) This is a fix for the issue #87758 where fast-math flags are not propagated to all builtins. It seems that fast-math flags set by pragmas were only propagated to calls of unary floating point builtins. This patch also propagates them to binary and ternary floating point builtins.	2024-05-04 17:47:48 +02:00
Fangrui Song	f34a5205aa	[clang,test] Convert text files from CRLF to LF Skip files with intentional CRLF line endings.	2024-05-03 10:23:53 -07:00
Pavel Iliin	804202292b	[FMV][AArch64] Don't optimize backward compatible features in resolver. (#90928 ) For AArch64 features, such as Branch Target Identification or MTE (Memory Tagging Extension), that are compatible with targets lacking support for them, we may encounter scenarios where a binary compiled with MTE, for example, is executed on both MTE and non-MTE hardware, and we still need to detect at runtime whether the MTE feature is available to choose the appropriate function version. So, we cannot optimize the function multi versioning resolver by removing checks for these features even when they are enabled for the target during compilation.	2024-05-03 18:07:17 +01:00
cor3ntin	642117105d	[Clang] Implement P2809: Trivial infinite loops are not Undefined Behavior (#90066 ) https://wg21.link/P2809R3 This is applied as a DR to C++11 (C++98 did not guarantee forward progress and is left untouched). As an extension (and to preserve existing behavior in C), we consider all controlling expressions that can be constant folded in the front end, not just standard constant expressions.	2024-05-03 14:10:54 +02:00
Pavel Iliin	ff210b94d4	[FMV][NFC] Add test for bti and mte check in resolver.	2024-05-03 00:58:17 +01:00
Björn Pettersson	7298ae3b6d	[clang][CodeGen] Fix in codegen for __builtin_popcountg/ctzg/clzg (#90845 ) Make sure that the result from the popcnt/ctlz/cttz intrinsics is cast to int as an unsigned value, rather than as a signed value, when expanding the __builtin_popcountg/__builtin_ctzg/__builtin_clzg builtins. An example would be unsigned _BitInt(1) x = ...; int y = __builtin_popcountg(x); which previously was incorrectly expanded to %1 = call i1 @llvm.ctpop.i1(i1 %0) %cast = sext i1 %1 to i32 Since the input type is generic for those "g" versions of the builtins, the intrinsic call may return a value for which the sign bit is set (this can typically happen for _BitInt of size 1 and 2). So we need to emit a zext rather than a sext to avoid negative results.	2024-05-02 22:49:39 +02:00
zhijian lin	d4a25976df	Implement a subset of builtin_cpu_supports() features (#82809 ) The PR implements a subset of the features of the function __builtin_cpu_supports() for AIX OS, based on the information that the AIX kernel runtime variable `_system_configuration` and the function call `getsystemcfg()` of /usr/include/sys/systemcfg.h in AIX OS can provide. The following subset of features is supported in the PR: "arch_3_00", "arch_3_1","booke","cellbe","darn","dfp","dscr" ,"ebb","efpsingle","efpdouble","fpu","htm","isel", "mma","mmu","pa6t","power4","power5","power5+","power6x","ppc32","ppc601","ppc64","ppcle","smt", "spe","tar","true_le","ucache","vsx"	2024-05-02 14:59:33 -04:00
Krishna Narayanan	f17b1fb667	[Clang][CodeGen] Optimised LLVM IR for atomic increments/decrements on floats (#89362 ) Fixes #53079	2024-05-02 10:42:34 +01:00
Nikita Popov	74aa1abfae	[InstCombine] Canonicalize scalable GEPs to use llvm.vscale intrinsic (#90569 ) Canonicalize getelementptr instructions for scalable vector types into ptradd representation with an explicit llvm.vscale call. This representation has better support in BasicAA, which can reason about llvm.vscale, but not plain scalable GEPs.	2024-05-01 14:53:43 +09:00
wanglei	eb148aecb3	[LoongArch][Codegen] Add support for TLSDESC The implementation is only enabled when the `-enable-tlsdesc` option is passed and the TLS model is `dynamic`. LoongArch's GCC has the same option (-mtls-dialect=) as RISC-V. Reviewers: heiher, MaskRay, SixWeining Reviewed By: SixWeining, MaskRay Pull Request: https://github.com/llvm/llvm-project/pull/90159	2024-04-30 15:14:44 +08:00
Kees Cook	869ffcf3f6	[CodeGen][i386] Move -mregparm storage earlier and fix Runtime calls (#89707 ) When building the Linux kernel for i386, the -mregparm=3 option is enabled. Crashes were observed in the sanitizer handler functions, and the problem was found to be mismatched calling convention. As was fixed in commit c167c0a4dcdb ("[BuildLibCalls] infer inreg param attrs from NumRegisterParameters"), call arguments need to be marked as "in register" when -mregparm is set. Use the same helper developed there to update the function arguments. Since CreateRuntimeFunction() is actually part of CodeGenModule, storage of the -mregparm value is also moved to the constructor, as doing this in Release() is too late. Fixes: https://github.com/llvm/llvm-project/issues/89670	2024-04-29 14:54:10 -07:00
Eli Friedman	3ab4ae9e58	[clang codegen] Fix MS ABI detection of user-provided constructors. (#90151 ) In the context of determining whether a class counts as an "aggregate", a constructor template counts as a user-provided constructor. Fixes #86384	2024-04-29 12:00:12 -07:00
Lawrence Benson	bd07c22e53	[Clang] Add support for scalable vectors in __builtin_reduce_* functions (#87750 ) Currently, a lot of `__builtin_reduce_*` functions do not support scalable vectors, i.e., ARM SVE and RISC-V V. This PR adds support for them. The main code change is to use a different path to extract the type from the vectors; the rest is the same, and LLVM supports the reduce functions for `vscale` vectors. This PR adds scalable vector support for: - `__builtin_reduce_add` - `__builtin_reduce_mul` - `__builtin_reduce_xor` - `__builtin_reduce_or` - `__builtin_reduce_and` - `__builtin_reduce_min` - `__builtin_reduce_max` Note: For all except `min/max`, the element type must still be an integer value. Adding floating point support for `add` and `mul` is still an open TODO.	2024-04-29 16:45:33 +02:00
nihui	cb3174bd78	[clang][CodeGen] fix UB in aarch64 bfloat16 scalar conversion (#89062 ) Do not bitcast 16-bit `bfloat16` to 32-bit `int32_t` directly; bitcast to `int16_t`, and then upcast to `int32_t`. Fixes an ASAN runtime error when calling vcvtah_f32_bf16: `==21842==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x007fda1dd063 at pc 0x005c0361c234 bp 0x007fda1dd030 sp 0x007fda1dd028 ` Without the patch: ```c __ai __attribute__((target("bf16"))) float32_t vcvtah_f32_bf16(bfloat16_t __p0) { float32_t __ret; bfloat16_t __reint = __p0; int32_t __reint1 = *(int32_t *) &__reint << 16; __ret = *(float32_t *) &__reint1; return __ret; } ``` With this patch: ```c __ai __attribute__((target("bf16"))) float32_t vcvtah_f32_bf16(bfloat16_t __p0) { float32_t __ret; bfloat16_t __reint = __p0; int32_t __reint1 = (int32_t)(*(int16_t *) &__reint) << 16; __ret = *(float32_t *) &__reint1; return __ret; } ``` Fixes issue https://github.com/llvm/llvm-project/issues/61983	2024-04-29 13:18:37 +01:00
Paul Walker	0fa1f1f2d1	[LLVM][SVE] Separate the int and floating-point variants of addqv. (#89762 ) We only use common intrinsics for operations that treat their element type as a container of bits.	2024-04-26 11:25:55 +01:00
Andreas Jonson	93de97d750	[SCCP] Swap out range metadata to range attribute (#90134 ) Also moved the range from the function's call sites to the functions return value as that is possible now.	2024-04-26 11:04:47 +09:00
Andreas Jonson	b8f3024a31	[InstCombine] Swap out range metadata to range attribute for cttz/ctlz/ctpop (#88776 ) Since all optimizations that use range metadata now also handle range attribute, this patch replaces writes of range metadata for call instructions to range attributes.	2024-04-25 01:45:50 +08:00
Brandon Wu	418bdb49a7	[clang][RISCV] Remove LMUL=8 scalar input for some vector crypto instructions (#89867 ) Since the requirement is EEW=32, it's impossible that EGW=128 needs LMUL=8.	2024-04-24 22:43:25 +08:00
Brandon Wu	3fa6b9c69e	[clang][RISCV] Support RVV bfloat16 C intrinsics (#89354 ) It follows the interface defined here: https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/293	2024-04-24 07:42:58 +08:00
Karl-Johan Karlsson	31480b0cc8	[test] Avoid writing to a potentially write-protected dir (#89242 ) These tests don't check the output written to the current directory. The current directory may be write-protected, e.g. in a sandboxed environment. The test cases that use -emit-llvm and -verify only care about stdout/stderr and are changed in this patch to use -emit-llvm-only to avoid writing to an output file. The verify-inlineasmbr.mir test case, which also only cares about stdout/stderr, is changed in this patch to throw away the output file and just write to /dev/null.	2024-04-20 12:26:58 +02:00
Bill Wendling	c32712d176	[Clang] Handle structs with inner structs and no fields (#89126 ) A struct that declares an inner struct, but no fields, won't have a field count. So getting the offset of the inner struct fails. This happens in both C and C++: struct foo { struct bar { int Quantizermatrix[]; }; }; Here 'struct foo' has no fields. Closes: https://github.com/llvm/llvm-project/issues/88931	2024-04-19 19:48:33 +00:00
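
The __builtin_popcountg/ctzg/clzg fix in 7298ae3b6d (#90845) hinges on the difference between sign-extending and zero-extending a narrow intrinsic result. The following is a minimal sketch in portable C, not the code from the patch: `zext_bits` and `sext_bits` are hypothetical helpers that only model what LLVM's `zext` and `sext` do when a `width`-bit value is widened to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models `zext iN ... to i32`.
   Keeps the low `width` bits and fills the upper bits with zeros. */
static int32_t zext_bits(uint32_t value, unsigned width) {
    assert(width >= 1 && width <= 30);
    uint32_t mask = (UINT32_C(1) << width) - 1u;
    return (int32_t)(value & mask);
}

/* Hypothetical helper: models `sext iN ... to i32`.
   If the top bit of the `width`-bit value is set, the widened
   result is that value minus 2^width, i.e. a negative number. */
static int32_t sext_bits(uint32_t value, unsigned width) {
    assert(width >= 1 && width <= 30);
    uint32_t mask = (UINT32_C(1) << width) - 1u;
    uint32_t sign = UINT32_C(1) << (width - 1);
    int32_t low = (int32_t)(value & mask);
    if ((value & sign) != 0) {
        low -= (int32_t)(UINT32_C(1) << width);
    }
    return low;
}
```

Under this model, a population count of 1 held in an `i1` becomes -1 with `sext_bits(1, 1)` but stays 1 with `zext_bits(1, 1)`, which is the negative result the commit message describes.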


9018 Commits
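
The vcvtah_f32_bf16 fix in cb3174bd78 (#89062) relies on a bfloat16 value being the upper 16 bits of an IEEE binary32 value. The following is a minimal sketch in portable C, not the arm_neon.h implementation: it assumes the bfloat16 is supplied as its raw 16-bit pattern and that `float` is IEEE binary32, and `bf16_bits_to_float` is a hypothetical name. It uses `memcpy` instead of pointer casts so that exactly the bytes of each object are read.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen a raw bfloat16 bit pattern to float.
   The 16 bits become the high half of a 32-bit word; the low
   half (the extra mantissa bits of binary32) is zero. */
static float bf16_bits_to_float(uint16_t bits) {
    float out;
    uint32_t word = (uint32_t)bits << 16;
    /* Assumption: float is 32 bits wide on this platform. */
    assert(sizeof out == sizeof word);
    memcpy(&out, &word, sizeof out);
    return out;
}
```

For example, the bit pattern 0x3F80 widens to 0x3F800000, which is 1.0f, and 0xC000 widens to 0xC0000000, which is -2.0f.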