llvm-project

Author	SHA1	Message	Date
Nikolas Klauser	508263824f	[Clang] Start moving X86Builtins.def to X86Builtins.td (#106005 ) This starts moving `X86Builtins.def` to be a tablegen file. It's quite large, so I think it'd be good to move things in multiple steps to avoid a bunch of merge conflicts due to the amount of time this takes to complete.	2024-10-30 14:23:35 +01:00
Alexandros Lamprineas	5dac2db5a8	[FMV][AArch64] Remove features which can be expressed as a combination of others. (#113580 ) Removes sve-bf16, sve-ebf16, and sve-i8mm since they are obsolete. One could write target_version("sve+bf16") instead of sve-bf16 for instance. Approved in ACLE as https://github.com/ARM-software/acle/pull/353	2024-10-30 11:53:50 +00:00
Jesse Huang	335e68d8bc	[Clang][RISCV] Support -fcf-protection=return for RISC-V (#112477 ) Enables the support of `-fcf-protection=return` on RISC-V, which requires Zicfiss. It also adds a string attribute "hw-shadow-stack" to every function if the option is set on RISC-V	2024-10-29 15:47:49 +08:00
Craig Topper	7bd8a165f9	[X86] Don't allow '+f' as an inline asm constraint. (#113871 ) f cannot be used as an output constraint. We already errored for '=f' but not '+f'. Fixes #113692.	2024-10-28 13:20:46 -07:00
Aaron Ballman	af7c58b7ea	Remove support for RenderScript (#112916 ) See https://discourse.llvm.org/t/rfc-deprecate-and-eventually-remove-renderscript-support/81284 for the RFC	2024-10-28 12:48:42 -04:00
Dan Gohman	1bc2cd98c5	[WebAssembly] Enable nontrapping-fptoint and bulk-memory by default. (#112049 ) We were prepared to enable these features [back in February], but they got pulled for what appear to be unrelated reasons. So let's have another try at enabling them! Another motivation here is that it'd be convenient for the [Lime1 proposal] if "lime1" is close to a subset of "generic" (missing only for extended-const). [back in February]: https://github.com/WebAssembly/tool-conventions/issues/158#issuecomment-1931119512 [Lime1 proposal]: https://github.com/llvm/llvm-project/pull/112035	2024-10-25 13:52:51 -07:00
Freddy Ye	c4248fa3ed	[X86] Support MOVRS and AVX10.2 instructions. (#113274 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368	2024-10-25 09:00:19 +08:00
Jay Foad	4dd55c567a	[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399 ) Follow up to #109133.	2024-10-24 10:23:40 +01:00
Alex Crichton	c2293b33dd	[WebAssembly] Implement the wide-arithmetic proposal (#111598 ) This commit implements the [wide-arithmetic] proposal which has recently reached phase 2 in the WebAssembly proposals process. The goal here is to implement support in LLVM for emitting these instructions which are gated behind a new feature flag by default. A new `wide-arithmetic` feature flag is introduced which gates these four new instructions from being emitted. Emission of each instruction itself is relatively simple given LLVM's preexisting lowering rules and infrastructure. The main gotcha is that due to the multi-result nature of all of these instructions it needed the lowerings to be implemented in C++ rather than in TableGen. [wide-arithmetic]: https://github.com/WebAssembly/wide-arithmetic	2024-10-23 11:39:58 -07:00
tangaac	5b9c76b6e7	[LoongArch] Support LoongArch-specific amswap[_db].{b/h} and amadd[_db].{b/h} instructions (#113255 ) Two options for clang: -mlam-bh & -mno-lam-bh. Enable or disable amswap[__db].{b/h} and amadd[__db].{b/h} instructions. The default is -mno-lam-bh. Only works on LoongArch64.	2024-10-23 16:03:15 +08:00
Carl Ritson	076aac59ac	[AMDGPU] Add a new target for gfx1153 (#113138 )	2024-10-23 12:56:58 +09:00
Alex Voicu	6e0b0038cd	[clang][OpenCL][CodeGen][AMDGPU] Do not use `private` as the default AS for when `generic` is available (#112442 ) Currently, for AMDGPU, when compiling for OpenCL, we unconditionally use `private` as the default address space. This is wrong for cases where the `generic` address space is available, and is corrected via this patch. In general, this AS map abuse is a bad hack and we should re-work it altogether, but at least after this patch we will stop being incorrect for e.g. OpenCL 2.0.	2024-10-22 12:05:48 +01:00
Alexandros Lamprineas	b6e9ba017f	[FMV][AArch64] Unify features memtag and memtag2. (#112511 ) If we split these features in the compiler (see relevant pull request https://github.com/llvm/llvm-project/pull/109299), we would only be able to hand-write a 'memtag2' version using inline assembly since the compiler cannot generate the instructions that become available with FEAT_MTE2. However these instructions only work at Exception Level 1, so they would be unusable since FMV is a user space facility. I am therefore unifying them. Approved in ACLE as https://github.com/ARM-software/acle/pull/351	2024-10-21 21:40:57 +01:00
Alex Rønne Petersen	d906ac52ab	[clang][AVR] Fix basic type size/alignment values to match avr-gcc. (#111290 ) Closes #102172	2024-10-21 12:30:03 +02:00
Sam Elliott	228f88fdc8	[RISCV] Inline Assembly: RVC constraint and N modifier (#112561 ) This change implements support for the `cr` and `cf` register constraints (which allocate a RVC GPR or RVC FPR respectively), and the `N` modifier (which prints the raw encoding of a register rather than the name). The intention behind these additions is to make it easier to use inline assembly when assembling raw instructions that are not supported by the compiler, for instance when experimenting with new instructions or when supporting proprietary extensions outside the toolchain. These implement part of my proposal in riscv-non-isa/riscv-c-api-doc#92 As part of the implementation, I felt there was not enough coverage of inline assembly and the "in X" floating-point extensions, so I have added more regression tests around these configurations.	2024-10-18 10:40:38 +01:00
Hans	4ddea298e6	[clang-cl]: Add /std:c++23preview and update _MSVC_LANG for C++23 (#112378 ) As discussed in https://discourse.llvm.org/t/clang-cl-adding-std-c-23preview/82553	2024-10-16 10:06:43 +02:00
Daniel Paoliello	c9f27275c1	[clang][aarch64] Add support for the MSVC qualifiers __ptr32, __ptr64, __sptr, __uptr for AArch64 (#111879 ) MSVC has a set of qualifiers to allow using 32-bit signed/unsigned pointers when building 64-bit targets. This is useful for WoW code (i.e., the part of Windows that handles running 32-bit application on a 64-bit OS). Currently this is supported on x64 using the 270, 271 and 272 address spaces, but does not work for AArch64 at all. This change adds the same 270, 271 and 272 address spaces to AArch64 and adjusts the data layout string accordingly. Clang will generate the correct address space casts, but these will currently be ignored until the AArch64 backend is updated to handle them. Partially fixes #62536 This is a resurrected version of <https://reviews.llvm.org/D158857> (originally created by @a_vorobev) - I've cleaned it up a little, fixed the rest of the tests and added to auto-upgrade for the data layout.	2024-10-15 10:37:36 -07:00
Artem Belevich	30a06e8022	[CUDA] Add support for CUDA-12.6 and sm_100 (#112028 ) This is a copy of #97402(with minor updates), which is now ready to land. --------- Co-authored-by: Sergey Kozub <skozub@nvidia.com>	2024-10-14 11:51:05 -07:00
Michał Górny	387b37af1a	[LLVM] [Clang] Support for Gentoo `t64` triples (64-bit time_t ABIs) (#111302 ) Gentoo is planning to introduce a `t64` suffix for triples that will be used by 32-bit platforms that use 64-bit `time_t`. Add support for parsing and accepting these triples, and while at it make clang automatically enable the necessary glibc feature macros when this suffix is used. An open question is whether we can backport this to LLVM 19.x. After all, adding new triplets to Triple sounds like an ABI change — though I suppose we can minimize the risk of breaking something if we move new enum values to the very end.	2024-10-14 11:18:04 +00:00
Jim Lin	dba54fb074	[RISCV] Add support for inline asm constraint vd (#111653 ) It constrains vector registers excluding v0. Refer to https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html RISC-V part. This patch also adds a testcase for constraints vr, vd and vm.	2024-10-14 10:47:59 +08:00
Greg Roth	c2063de159	Switch DirectX Target to use the Itanium ABI (#111632 ) To consolidate behavior of function mangling and limit the number of places that ABI changes will need to be made, this switches the DirectX target used for HLSL to use the Itanium ABI from the Microsoft ABI. The Itanium ABI has greater flexibility in decisions regarding mangling of new types of which we have more than a few yet to add. One effect of this will be that linking library shaders compiled with DXC will not be possible with shaders compiled with clang. That isn't considered a terribly interesting use case and one that would likely have been onerous to maintain anyway. This involved adding a function to call all global destructors as the Microsoft ABI had done. This requires a few changes to tests. Most notably the mangling style has changed which accounts for most of the changes. In making those changes, I took the opportunity to harmonize some very similar tests for greater consistency. I also shaved off some unneeded run flags that had probably been copied over from one test to another. Other changes effected by using the new ABI include using different types when manipulating smaller bitfields, eliminating an unnecessary alloca in one instance in this-assignment.hlsl, changing the way static local initialization is guarded, and changing the order of inout parameters getting copied in and out. That last is a subtle change in functionality, but one where there was sufficient inconsistency in the past that standardizing is important, but the particular direction of the standardization is less important for the sake of existing shaders. fixes #110736	2024-10-10 12:58:28 -06:00
Jonathan Thackray	d0756caedc	[ARM][AArch64] Introduce the Armv9.6-A architecture version (#110825 ) This introduces the Armv9.6-A architecture version, including the relevant command-line option for -march. More details about the Armv9.6-A architecture version can be found at: * https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2024 * https://developer.arm.com/documentation/ddi0602/2024-09/	2024-10-04 10:12:41 +01:00
Brandon Wu	23c0850d2e	[RISCV][VCIX] Add vcix_state to GNU inline assembly register set (#106914 ) https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/56 Resolved https://github.com/llvm/llvm-project/issues/106700. This enables inline asm to have vcix_state to be a clobbered register thus disable reordering between VCIX intrinsics and inline asm.	2024-09-30 23:52:35 -07:00
Koakuma	dbad963a69	[SPARC] Align i128 to 16 bytes in SPARC datalayouts (#106951 ) Align i128s to 16 bytes, following the example at https://reviews.llvm.org/D86310. clang already does this implicitly, but do it in backend code too for the benefit of other frontends (see e.g https://github.com/llvm/llvm-project/issues/102783 & https://github.com/rust-lang/rust/issues/128950).	2024-09-30 08:32:33 +07:00
Alex Voicu	e13cbaca69	[clang][CodeGen][SPIR-V] Fix incorrect SYCL usage, implement missing interface (#109415 ) This is primarily meant to address the issue identified in #109182, around incorrect usage of `-fsycl-is-device`; we now have AMDGCN flavoured SPIR-V which retains the desired behaviour around the default AS and does not depend on the SYCL language being enabled to do so. Overall, there are three changes: 1. We unconditionally use the `SPIRDefIsGen` AS map for AMDGCNSPIRV target, as there is no case where the hack of setting default to private would be desirable, and it can be used for languages other than OCL/HIP; 2. We implement `SPIRVTargetCodeGenInfo::getGlobalVarAddressSpace` for SPIR-V in general, because otherwise using it from languages other than HIP or OpenCL would yield 0, incorrectly; 3. We remove the incorrect usage of `-fsycl-is-device`.	2024-09-26 14:06:14 +01:00
Ming-Yi Lai	9f33eb861a	[clang][RISCV] Introduce command line options for RISC-V Zicfilp CFI This patch enables the following command line flags for RISC-V targets: + `-fcf-protection=branch` turns on forward-edge control-flow integrity conditioning + `-mcf-branch-label-scheme=unlabeled\|func-sig` selects the label scheme used in the forward-edge CFI conditioning	2024-09-26 18:30:43 +08:00
Alex Voicu	3cfd0c0d36	[SPIRV][RFC] Rework / extend support for memory scopes (#106429 ) This change adds support for correctly lowering the `__scoped` Clang builtins, and corresponding scoped LLVM instructions. These were previously unconditionally lowered to Device scope, which is possibly incorrect. Furthermore, the default / implicit scope is changed from Device (an OpenCL assumption) to AllSvmDevices (aka System), since the SPIR-V BE is not OpenCL specific / can ingest IR coming from other language front-ends. OpenCL defaulting to Device scope is now reflected in the front-end handling of atomic ops, which seems preferable.	2024-09-25 00:44:57 +01:00
yonghong-song	4c4fb6ada7	[BPF] Do atomic_fetch_() pattern matching with memory ordering (#107343 ) Three commits in this pull request: commit 1: implement pattern matching for memory ordering seq_cst, acq_rel, release, acquire and monotonic. Specially, for monotonic memory ordering (relaxed memory model), if no return value is used, locked insn is used. commit 2: add support to handle dwarf atomic modifier in BTF generation. Actually atomic modifier is ignored in BTF. commit 3: add tests for new atomic ordering support and BTF support with _Atomic type. I removed RFC tag as now patch sets are in reasonable states. For atomic fetch_and_() operations, do pattern matching with memory ordering seq_cst, acq_rel, release, acquire and monotonic (relaxed). For fetch_and_() operations with seq_cst/acq_rel/release/acquire ordering, atomic_fetch_() instructions are generated. For monotonic ordering, locked insns are generated if return value is not used. Otherwise, atomic_fetch_() insns are used. The main motivation is to resolve the kernel issue [1]. The following are memory ordering are supported: seq_cst, acq_rel, release, acquire, relaxed Current gcc style __sync_fetch_and_() operations are all seq_cst. To use explicit memory ordering, the _Atomic type is needed. The following is an example: ``` $ cat test.c \#include <stdatomic.h> void f1(_Atomic int i) { (void)__c11_atomic_fetch_and(i, 10, memory_order_relaxed); } void f2(_Atomic int i) { (void)__c11_atomic_fetch_and(i, 10, memory_order_acquire); } void f3(_Atomic int i) { (void)__c11_atomic_fetch_and(i, 10, memory_order_seq_cst); } $ cat run.sh clang -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -O2 --target=bpf -c test.c -o test.o && llvm-objdum p -d test.o $ ./run.sh test.o: file format elf64-bpf Disassembly of section .text: 0000000000000000 <f1>: 0: b4 02 00 00 0a 00 00 00 w2 = 0xa 1: c3 21 00 00 50 00 00 00 lock (u32 )(r1 + 0x0) &= w2 2: 95 00 00 00 00 00 00 00 exit 0000000000000018 <f2>: 3: b4 02 00 00 0a 00 00 00 w2 = 0xa 4: c3 21 00 00 51 00 00 00 w2 = atomic_fetch_and((u32 )(r1 + 0x0), w2) 5: 95 00 00 00 00 00 00 00 exit 0000000000000030 <f3>: 6: b4 02 00 00 0a 00 00 00 w2 = 0xa 7: c3 21 00 00 51 00 00 00 w2 = atomic_fetch_and((u32 )(r1 + 0x0), w2) 8: 95 00 00 00 00 00 00 00 exit ``` The following is another example where return value is used: ``` $ cat test1.c \#include <stdatomic.h> int f1(_Atomic int i) { return __c11_atomic_fetch_and(i, 10, memory_order_relaxed); } int f2(_Atomic int i) { return __c11_atomic_fetch_and(i, 10, memory_order_acquire); } int f3(_Atomic int i) { return __c11_atomic_fetch_and(i, 10, memory_order_seq_cst); } $ cat run.sh clang -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -O2 --target=bpf -c test1.c -o test1.o && llvm-objdump -d test1.o $ ./run.sh test.o: file format elf64-bpf Disassembly of section .text: 0000000000000000 <f1>: 0: b4 00 00 00 0a 00 00 00 w0 = 0xa 1: c3 01 00 00 51 00 00 00 w0 = atomic_fetch_and((u32 )(r1 + 0x0), w0) 2: 95 00 00 00 00 00 00 00 exit 0000000000000018 <f2>: 3: b4 00 00 00 0a 00 00 00 w0 = 0xa 4: c3 01 00 00 51 00 00 00 w0 = atomic_fetch_and((u32 )(r1 + 0x0), w0) 5: 95 00 00 00 00 00 00 00 exit 0000000000000030 <f3>: 6: b4 00 00 00 0a 00 00 00 w0 = 0xa 7: c3 01 00 00 51 00 00 00 w0 = atomic_fetch_and((u32 )(r1 + 0x0), w0) 8: 95 00 00 00 00 00 00 00 exit ``` You can see that for relaxed memory ordering, if return value is used, atomic_fetch_and() insn is used. Otherwise, if return value is not used, locked insn is used. Here is another example with global _Atomic variable: ``` $ cat test3.c \#include <stdatomic.h> _Atomic int i; void f1(void) { (void)__c11_atomic_fetch_and(&i, 10, memory_order_relaxed); } void f2(void) { (void)__c11_atomic_fetch_and(&i, 10, memory_order_seq_cst); } $ cat run.sh clang -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -O2 --target=bpf -c test3.c -o test3.o && llvm-objdump -d test3.o $ ./run.sh test3.o: file format elf64-bpf Disassembly of section .text: 0000000000000000 <f1>: 0: b4 01 00 00 0a 00 00 00 w1 = 0xa 1: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0x0 ll 3: c3 12 00 00 50 00 00 00 lock (u32 )(r2 + 0x0) &= w1 4: 95 00 00 00 00 00 00 00 exit 0000000000000028 <f2>: 5: b4 01 00 00 0a 00 00 00 w1 = 0xa 6: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0x0 ll 8: c3 12 00 00 51 00 00 00 w1 = atomic_fetch_and((u32 )(r2 + 0x0), w1) 9: 95 00 00 00 00 00 00 00 exit ``` Note that in the above compilations, '-g' is not used. The reason is due to the following IR related to _Atomic type: ``` $clang -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -O2 --target=bpf -g -S -emit-llvm test3.c ``` The related debug info for test3.c: ``` !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression()) !1 = distinct !DIGlobalVariable(name: "i", scope: !2, file: !3, line: 3, type: !16, isLocal: false, isDefinition: true) ... !16 = !DIDerivedType(tag: DW_TAG_atomic_type, baseType: !17) !17 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) ``` If compiling test.c, the related debug info: ``` ... !19 = distinct !DISubprogram(name: "f1", scope: !1, file: !1, line: 3, type: !20, scopeLine: 3, flags: DIFlagPrototyped \| DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition \| DISPFlagOptimized, unit: !0, retainedNodes: !25) !20 = !DISubroutineType(types: !21) !21 = !{null, !22} !22 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !23, size: 64) !23 = !DIDerivedType(tag: DW_TAG_atomic_type, baseType: !24) !24 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) !25 = !{!26} !26 = !DILocalVariable(name: "i", arg: 1, scope: !19, file: !1, line: 3, type: !22) ``` All the above suggests _Atomic behaves like a modifier (e.g. const, restrict, volatile). This seems true based on doc [1]. Without proper handling DW_TAG_atomic_type, llvm BTF generation will be incorrect since the current implementation assumes no existence of DW_TAG_atomic_type. So we have two choices here: (1). llvm bpf backend processes DW_TAG_atomic_type but ignores it in BTF encoding. (2). Add another type, e.g., BTF_KIND_ATOMIC to BTF. BTF_KIND_ATOMIC behaves as a modifier like const/volatile/restrict. For choice (1), llvm bpf backend should skip dwarf::DW_TAG_atomic_type during BTF generation whenever necessary. For choice (2), BTF_KIND_ATOMIC will be added to BTF so llvm backend and kernel needs to handle that properly. The main advantage of it probably is to maintain this atomic type so it is also available to skeleton. But I think for skeleton a raw type might be good enough unless user space intends to do some atomic operation with that, which is a unlikely case. So I choose choice (1) in this RFC implementation. See the commit message of the second commit for details. [1] https://lore.kernel.org/bpf/7b941f53-2a05-48ec-9032-8f106face3a3@linux.dev/ [2] https://dwarfstd.org/issues/131112.1.html ---------	2024-09-24 15:55:50 -07:00
eddyz87	eabc8857e7	[BPF] make __BPF_FEATURE_MAY_GOTO available for cpuv1 (#108071 ) For some reason `__BPF_FEATURE_MAY_GOTO` is available for CPUs v{2,3,4} but is not available for CPU v1. This limitation is arbitrary: - the instruction is never produced by LLVM backend; - on Linux Kernel side this instruction is available in kernels that also support CPUv4. Hence, it is more consistent to either always allow `__BPF_FEATURE_MAY_GOTO` or only allow it for CPUv4.	2024-09-24 11:46:33 +03:00
Craig Topper	f7d088b616	[RISCV] Implement validateGlobalRegisterVariable. (#109596 ) Only allow GPR registers and verify the size is the same as XLen. This fixes the crash seen in #109588 by making it a frontend error. gcc does accept the code so we may need to consider if we can fix the backend. Some other targets I tried appear to have similar issues so it might not be straightforward to fix.	2024-09-23 10:24:27 -07:00
Prabhuk	fb78495376	Reland "[Driver] Add toolchain for X86_64 UEFI target" (#109364 ) Reverts llvm/llvm-project#109340 Addressing the failed MAC Clang Driver test as part of this reland.	2024-09-20 11:16:36 -07:00
Alex Rønne Petersen	72a218056d	[llvm][Triple] Add `Environment` members and parsing for glibc/musl parity. (#107664 ) This adds support for: * `muslabin32` (MIPS N32) * `muslabi64` (MIPS N64) * `muslf32` (LoongArch ILP32F/LP64F) * `muslsf` (LoongArch ILP32S/LP64S) As we start adding glibc/musl cross-compilation support for these targets in Zig, it would make our life easier if LLVM recognized these triples. I'm hoping this'll be uncontroversial since the same has already been done for `musleabi`, `musleabihf`, and `muslx32`. I intentionally left out a musl equivalent of `gnuf64` (LoongArch ILP32D/LP64D); my understanding is that Loongson ultimately settled on simply `gnu` for this much more common case, so there doesn't seem to be a particularly compelling reason to add a `muslf64` that's basically deprecated on arrival. Note: I don't have commit access.	2024-09-20 08:53:03 +08:00
Prabhuk	d2df2e41ca	Revert "[Driver] Add toolchain for X86_64 UEFI target" (#109340 ) Reverts llvm/llvm-project#76838 Appears to be causing failures in MAC builders. First reverting the patch and will investigate after.	2024-09-19 15:28:07 -07:00
Prabhuk	d1335fb864	[Driver] Add toolchain for X86_64 UEFI target (#76838 ) Introduce changes necessary for UEFI X86_64 target Clang driver. Addressed the review comments originally suggested in Phabricator. Differential Revision: https://reviews.llvm.org/D159541	2024-09-19 11:46:55 -07:00
Benjamin Kramer	c23d6df60d	[AArch64] Don't define reserved macros It's not allowed. It also prevents Clang from compiling itself on Aarch64. lib/Basic/Targets/AArch64.cpp:404:9: warning: '__ARM_ACLE_VERSION' macro redefined [-Wmacro-redefined] 404 \| #define __ARM_ACLE_VERSION(Y, Q, P) (100 * (Y) + 10 * (Q) + (P))	2024-09-17 19:07:36 +02:00
Alexandros Lamprineas	b1d7694c12	[AArch64] Add missing ACLE predefined macros and update __ARM_ACLE. (#108857 ) Adds __ARM_ACLE_VERSION and __FUNCTION_MULTI_VERSIONING_SUPPORT_LEVEL as defined here https://github.com/ARM-software/acle/pull/301 and here https://github.com/ARM-software/acle/pull/302. Also bumps __ARM_ACLE to 202420.	2024-09-17 11:07:07 +01:00
Ganesh	02e4186d0b	[X86] AMD Zen 5 Initial enablement (#107964 ) This patch enables the basic skeleton enablement of AMD next gen zen5 CPUs.	2024-09-13 17:45:33 +01:00
Piyou Chen	9cd9377409	[RISCV][FMV] Support target_clones (#85786 ) This patch enable the function multiversion(FMV) and `target_clones` attribute for RISC-V target. The proposal of `target_clones` syntax can be found at the https://github.com/riscv-non-isa/riscv-c-api-doc/pull/48 (which has landed), as modified by the proposed https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85 (which adds the priority syntax). It supports the `target_clones` function attribute and function multiversioning feature for RISC-V target. It will generate the ifunc resolver function for the function that declared with target_clones attribute. The resolver function will check the version support by runtime object `__riscv_feature_bits`. For example: ``` __attribute__((target_clones("default", "arch=+ver1", "arch=+ver2"))) int bar() { return 1; } ``` the corresponding resolver will be like: ``` bar.resolver() { __init_riscv_feature_bits(); // Check arch=+ver1 if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION1) == BITMASK_OF_VERSION1) { return bar.arch=+ver1; } else { // Check arch=+ver2 if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION2) == BITMASK_OF_VERSION2) { return bar.arch=+ver2; } else { // Default return bar.default; } } } ```	2024-09-13 18:04:53 +08:00
Jim Lin	dee058f9e3	[RISCV] Emit predefined macro __riscv_cmodel_large for large code model (#108131 ) Co-authored-by: patrick <patrick@andestech.com>	2024-09-13 10:37:48 +08:00
Sean Perry	e62bf7cd0b	[z/OS] Set the default arch for z/OS to be arch10 (#89854 ) The default arch level on z/OS is arch10. Update the code so z/OS has arch10 without changing the default for zLinux.	2024-09-09 15:24:16 -04:00
Piyou Chen	022b3c27e2	[Clang][RISCV] Recognize unsupport target feature by supporting isValidFeatureName (#106495 ) This patch makes unsupported target attributes emit a warning and ignore the target attribute during semantic checks. The changes include: 1. Adding the RISCVTargetInfo::isValidFeatureName function. 2. Rejecting non-full-arch strings in the handleFullArchString function. 3. Adding test cases to demonstrate the warning behavior.	2024-09-09 15:07:39 +08:00
Alex Rønne Petersen	b55186eefd	[clang][Driver] Define soft float macros for PPC. (#106012 ) Fixes #105972. Co-authored-by: Qiu Chaofan <qcf@ecnelises.com>	2024-09-04 10:07:35 +08:00
Freddy Ye	83ad644afa	[X86][AVX10.2] Support AVX10.2-BF16 new instructions. (#101603 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2024-09-04 08:13:24 +08:00
yonghong-song	7852ebc088	[BPF] Make -mcpu=v3 as the default (#107008 ) Before llvm20, (void)__sync_fetch_and_add(...) always generates locked xadd insns. In linux kernel upstream discussion [1], it is found that for arm64 architecture, the original semantics of (void)__sync_fetch_and_add(...), i.e., __atomic_fetch_add(...), is preferred in order for jit to emit proper native barrier insns. In llvm commits [2] and [3], (void)__sync_fetch_and_add(...) will generate the following insns: - for cpu v1/v2: locked xadd insns to keep backward compatibility - for cpu v3/v4: __atomic_fetch_add() insns To ensure proper barrier semantics for (void)__sync_fetch_and_add(...), cpu v3/v4 is recommended. This patch enables cpu=v3 as the default cpu version. For users wanting to use cpu v1, -mcpu=v1 needs to be explicitly added to clang/llc command line. [1] https://lore.kernel.org/bpf/ZqqiQQWRnz7H93Hc@google.com/T/#mb68d67bc8f39e35a0c3db52468b9de59b79f021f [2] https://github.com/llvm/llvm-project/pull/101428 [3] https://github.com/llvm/llvm-project/pull/106494	2024-09-03 07:15:18 -07:00
Piyou Chen	b0276ec6b7	[RISCV][NFC] Reimplementation of target attribute override mechanism (#106680 ) This patch aims to replace the target attribute override mechanism based on `__RISCV_TargetAttrNeedOverride` with the insertion of several negative target features When the target attribute uses the full architecture string ("arch=rv64gc") or specifies the CPU ("cpu=rocket-rv64") as the version, it will override the module-level target feature. Currently, this mechanism is implemented by inserting `__RISCV_TargetAttrNeedOverride` as a dummy target feature immediately before the target attribute's feature. ``` module target features + __RISCV_TargetAttrNeedOverride + target attribute's feature ``` The RISCVTargetInfo::initFeatureMap function will remove the "module target features" and use only the "target attribute's features". This patch changes the process as follows: ``` module target features + negative target feature for all supported extension + target attribute's feature ``` The `module target features` will be disable by `negative target feature for all supported extension` in `TargetInfo::initFeatureMap`	2024-08-31 20:02:46 +08:00
Greg Roth	26c582bb45	[DXIL] Don't generate per-variable guards for DirectX (#106096 ) Thread init guards are generated for local static variables when using the Microsoft CXX ABI. This ABI is also used for HLSL generation, but DXIL doesn't need the corresponding _Init_thread_header/footer calls and doesn't really have a way to handle them in its output targets. This modifies the language ops when the target is DXIL to exclude this so that they won't be generated and an alternate guardvar method is used that is compatible with the usage. Done to facilitate testing for #89806, but isn't really related	2024-08-28 14:08:44 -07:00
SpencerAbson	2617023923	[clang][AArch64] Add SME2.1 feature macros (#105657 )	2024-08-23 14:27:49 +01:00
Brendan Dahl	7d373cef49	[WebAssembly] Change half-precision feature name to fp16. (#105434 ) This better aligns with how the feature is being referred to and what runtimes (V8) are calling it.	2024-08-22 09:44:33 -07:00
Phoebe Wang	3f25f23a2b	[X86][AVX10] Fix unexpected error and warning when using intrinsic (#104781 ) E.g.: https://godbolt.org/z/G8zK5svjK Based on Evgenii's work.	2024-08-20 19:56:19 +08:00
Zaara Syeda	3e7135750c	[PPC] Disable vsx and altivec when -msoft-float is used (#100450 ) We emit an error when -msoft-float and -maltivec/-mvsx is used together, but when -msoft-float is used on its own, there is still +altivec and +vsx in the IR attributes. This patch disables altivec and vsx and all related sub features when -msoft-float is used.	2024-08-08 12:27:26 -04:00

... 3 4 5 6 7 ...

1643 Commits