llvm-project

Author	SHA1	Message	Date
Rainer Orth	39e30508a7	[Driver][Sparc] Default to -mcpu=v9 for 32-bit Linux/sparc64 (#109278 ) While working on supporting PR #109101 on Linux/sparc64, I was reminded that `clang -m32` still defaults to generating V8 code, although the 64-bit kernel requires a V9 CPU. This patch corrects that. Tested on `sparc64-unknown-linux-gnu`, `x86_64-pc-linux-gnu`, `sparcv9-sun-solaris2.11`, and `amd64-pc-solaris2.11`.	2024-09-21 19:53:35 +02:00
Ganesh	02e4186d0b	[X86] AMD Zen 5 Initial enablement (#107964 ) This patch enables the basic skeleton enablement of AMD next gen zen5 CPUs.	2024-09-13 17:45:33 +01:00
Sean Perry	e62bf7cd0b	[z/OS] Set the default arch for z/OS to be arch10 (#89854 ) The default arch level on z/OS is arch10. Update the code so z/OS has arch10 without changing the default for zLinux.	2024-09-09 15:24:16 -04:00
James Y Knight	f0eb5587ce	Remove support for 3DNow!, both intrinsics and builtins. (#96246 ) This set of instructions was only supported by AMD chips starting in the K6-2 (introduced 1998), and before the "Bulldozer" family (2011). They were never much used, as they were effectively superseded by the more-widely-implemented SSE (first implemented on the AMD side in Athlon XP in 2001). This is being done as a predecessor towards general removal of MMX register usage. Since there is almost no usage of the 3DNow! intrinsics, and no modern hardware even implements them, simple removal seems like the best option. (Clang half originally uploaded in https://reviews.llvm.org/D94213) Works towards issue #41665 and issue #98272.	2024-07-16 12:08:48 -04:00
Freddy Ye	4def1ce101	Reland "[X86] Remove knl/knm specific ISAs supports (#92883 )" (#93136 ) This reverts commit aa4069ea96e5eb62bc8c7895b9d920f129611b3a.	2024-05-24 13:46:34 +08:00
Freddy Ye	aa4069ea96	Revert "[X86] Remove knl/knm specific ISAs supports (#92883 )" (#93123 ) This reverts commit 282d2ab58f56c89510f810a43d4569824a90c538.	2024-05-23 10:25:23 +08:00
Freddy Ye	282d2ab58f	[X86] Remove knl/knm specific ISAs supports (#92883 ) Cont. patch after https://github.com/llvm/llvm-project/pull/75580	2024-05-23 09:46:44 +08:00
Joseph Huber	7155c1ef65	[NVPTX] Allow compiling LLVM-IR without `-march` set (#79873 ) Summary: The NVPTX tools require an architecture to be used, however if we are creating generic LLVM-IR we should be able to leave it unspecified. This will result in the `target-cpu` attributes not being set on the functions so it can be changed when linked into code. This allows the standalone `--target=nvptx64-nvidia-cuda` toolchain to create LLVM-IR simmilar to how CUDA's deviceRTL looks from C/C++	2024-01-30 21:44:43 -06:00
Joseph Huber	626fe71fa5	[Clang] Fix test failing on systems without ROCm installed Summary: Forgot to specify `-nogpulib` which makes this test look for ROCm.	2024-01-30 13:17:02 -06:00
Joseph Huber	f2a78e68ee	[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#80035 ) Summary: Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means to create a sort of "generic" IR. The resulting IR will not contain any target dependent attributes and can then be inserted into another program via `-mlink-builtin-bitcode` to inherit its attributes. However, there are a handful of macros that can leak incorrect information when compiling for an unspecified architecture. Currently, things like the wavefront size will default to 64, which is actually variable. We should not expose these macros unless it is known.	2024-01-30 13:05:29 -06:00
Joseph Huber	72d4fc1b4d	Revert "[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#79660 )" This reverts commit c9a6e993f7b349405b6c8f9244cd9cf0f56a6a81. This breaks HIP code that incorrectly depended on GPU-specific macros to be set. The code is totally wrong as using `__WAVEFRTONSIZE__` on the host is absolutely meaningless, but it seems this entire corner of the toolchain is fundmentally broken. Reverting for now to avoid breakages.	2024-01-29 11:11:25 -06:00
Joseph Huber	c9a6e993f7	[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#79660 ) Summary: Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means to create a sort of "generic" IR. The resulting IR will not contain any target dependent attributes and can then be inserted into another program via `-mlink-builtin-bitcode` to inherit its attributes. However, there are a handful of macros that can leak incorrect information when compiling for an unspecified architecture. Currently, things like the wavefront size will default to 64, which is actually variable. We should not expose these macros unless it is known.	2024-01-29 08:46:14 -06:00
Freddy Ye	19e784604c	[X86] Remove RAO-INT from Grandridge (#76420 ) According to latest spec: https://cdrdv2.intel.com/v1/dl/getContent/671368	2023-12-28 10:06:54 +08:00
Phoebe Wang	c78aeabaec	[X86] Add a EVEX256 macro to match with GCC and MSVC (#71317 )	2023-11-07 14:39:24 +08:00
Freddy Ye	278e533ee9	[X86] Support -march=pantherlake,clearwaterforest (#69277 )	2023-10-19 15:11:15 +08:00
XinWang10	057ec767ad	[X86][NFC]Update test cases after D159250 (#68517 )	2023-10-10 09:32:32 +08:00
Fangrui Song	8cfe9d8f2a	[Driver] Remove remnant myriad pieces after Myriad.cpp removal after D104279 and D158706.	2023-08-25 13:29:10 -07:00
Freddy Ye	6acff5390d	[X86] Support -march=gracemont gracemont has some different tuning features from alderlake. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D158046	2023-08-21 08:49:01 +08:00
Freddy Ye	c9d92e6638	[X86] Support -march=arrowlake,arrowlake-s,lunarlake Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D156239	2023-07-28 15:05:54 +08:00
Freddy Ye	6d23a3faa4	[X86] Support -march=graniterapids-d and update -march=graniterapids Reviewed By: pengfei, RKSimon, skan Differential Revision: https://reviews.llvm.org/D155798	2023-07-25 13:48:31 +08:00
Freddy Ye	5cc4b1059b	[X86] Update features for sierraforest, grandridge Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D155784	2023-07-25 11:00:41 +08:00
Freddy Ye	548e08c3f6	[NFC] Add missing cpu tests in predefined-arch-macros.c Added tests for penryn, nehalem, westmere, sandybridge, ivybridge, haswell, bonnell, silvermont. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D153714	2023-06-29 13:30:13 +08:00
Freddy Ye	847abddedc	[X86] Add AMX_COMPLEX to Graniterapids This patch also rename __AMXCOMPLEX__ to __AMX_COMPLEX__ Reviewed By: skan, xiangzhangllvm Differential Revision: https://reviews.llvm.org/D147525	2023-04-06 13:19:44 +08:00
Joe Loser	8998fa6c14	[clang] Change AMX macros to match names from GCC The current behavior for AMX macros is: ``` gcc -march=native -dM -E - < /dev/null \| grep TILE clang -march=native -dM -E - < /dev/null \| grep TILE ``` which is not ideal. Change `__AMXTILE__` and friends to `__AMX_TILE__` (i.e. have an underscore in them). This makes GCC and Clang agree on the naming of these AMX macros to simplify downstream user code. Fix this for `__AMXTILE__`, `__AMX_INT8__`, `__AMX_BF16__`, and `__AMX_FP16__`. Differential Revision: https://reviews.llvm.org/D143094	2023-02-03 07:00:16 -07:00
Ben Shi	16f9451b07	[clang] Redefine some AVR specific macros Fixes https://github.com/llvm/llvm-project/issues/58855 Reviewed By: aykevl, Miss_Grape Differential Revision: https://reviews.llvm.org/D141598	2023-01-13 17:22:15 +08:00
Ben Shi	485ba407a6	[clang][test] Remove unnecessary 'REQUIRES' The test 'Preprocessor/predefined-arch-macros.c' contains many target tests than 'amdgpu'. If clang is built without 'amdgpu', then failures in other target tests will not be reported. Reviewed By: aaron.ballman, MaskRay Differential Revision: https://reviews.llvm.org/D141647	2023-01-13 10:04:22 +08:00
Brad Smith	f70d17fc2c	[LoongArch] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros Reviewed By: SixWeining, MaskRay Differential Revision: https://reviews.llvm.org/D141070	2023-01-05 20:21:22 -05:00
Freddy Ye	27b8f54f51	[X86] Support -march=emeraldrapids Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D140950	2023-01-05 20:27:32 +08:00
Brad Smith	d227c3b68c	[Hexagon][VE][WebAssembly] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros Reviewed By: kparzysz, aheejin, MaskRay Differential Revision: https://reviews.llvm.org/D140757	2023-01-05 04:45:07 -05:00
Brad Smith	2784b243e3	[M68k] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros Fixes #58974 Reviewed By: myhsu, glaubitz, 0x59616e Differential Revision: https://reviews.llvm.org/D140695	2022-12-29 05:07:35 -05:00
Ganesh Gopalasubramanian	1f057e365f	[X86] AMD Zen 4 Initial enablement Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D139073	2022-12-17 16:15:22 +05:30
Freddy Ye	84a18a260e	[X86] Support -march=sierraforest, grandridge, graniterapids. Reviewed By: skan, pengfei, MaskRay Differential Revision: https://reviews.llvm.org/D137153	2022-11-09 16:56:03 +08:00
Freddy Ye	a806fc2767	[X86] Support -march=raptorlake, meteorlake Reviewed By: pengfei, skan, MaskRay Differential Revision: https://reviews.llvm.org/D135937	2022-11-04 09:32:17 +08:00
Simon Pilgrim	6e19e6ce36	[clang][X86] Add RDPRU predefined macro tests for znver2/znver3 targets These were missed in D128934	2022-08-11 15:48:39 +01:00
Ulrich Weigand	1283ccb610	Support z16 processor name The recently announced IBM z16 processor implements the architecture already supported as "arch14" in LLVM. This patch adds support for "z16" as an alternate architecture name for arch14.	2022-04-21 19:58:22 +02:00
John Paul Adrian Glaubitz	5061eb6b01	[Sparc] Don't define __sparcv9 and __sparcv9__ when targeting V8+ Currently, clang defines the three macros __sparcv9, __sparcv9__ and __sparc_v9__ when targeting the V8+ baseline, i.e. using the V9 instruction set on a 32-bit target. Since neither gcc nor SolarisStudio define __sparcv9 and __sparcv9__ when targeting V8+, some existing code such as the glibc breaks when defining either of these two macros on a 32-bit target as they are used to detect a 64-bit target. Update the tests accordingly. Fixes PR49562. Reviewed By: jrtc27, MaskRay, hvdijk Differential Revision: https://reviews.llvm.org/D98574	2022-01-21 09:57:17 -08:00
Wang, Pengfei	6f7f5b54c8	[X86] AVX512FP16 instructions enabling 1/6 1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105263	2021-08-10 12:46:01 +08:00
Ulrich Weigand	8cd8120a7b	[SystemZ] Add support for new cpu architecture - arch14 This patch adds support for the next-generation arch14 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Detection of arch14 as host processor. - Assembler/disassembler support for new instructions. - New LLVM intrinsics for certain new instructions. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10304. Note: No currently available Z system supports the arch14 architecture. Once new systems become available, the official system name will be added as supported -march name.	2021-07-26 16:57:28 +02:00
Freddy Ye	3fc1fe8db8	[X86] Support -march=rocketlake Reviewed By: skan, craig.topper, MaskRay Differential Revision: https://reviews.llvm.org/D100085	2021-04-13 09:48:13 +08:00
Freddy Ye	5cb47be410	[X86] Remove FeatureCLWB from FeaturesICLClient Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100279	2021-04-12 12:08:59 +08:00
Freddy Ye	5f9489b754	[X86] Refine "Support -march=alderlake" Refine "Support -march=alderlake" Compare with tremont, it includes 25 more new features. They are adx, aes, avx, avx2, avxvnni, bmi, bmi2, cldemote, f16c, fma, hreset, invpcid, kl, lzcnt, movdir64b, movdiri, pclmulqdq, pconfig, pku, serialize, shstk, vaes, vpclmulqdq, waitpkg, widekl. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D97832	2021-03-08 13:17:18 +08:00
Yaxun (Sam) Liu	efc063b621	Fix lit test failure due to 0b81d9 These lit tests now requires amdgpu-registered-target since they use clang driver and clang driver passes an LLVM option which is available only if amdgpu target is registered. Change-Id: I2df31967409f1627fc6d342d1ab5cc8aa17c9c0c	2020-12-07 19:50:21 -05:00
Liu, Chen3	756f597841	[X86] Support Intel avxvnni This patch mainly made the following changes: 1. Support AVX-VNNI instructions; 2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix. Differential Revision: https://reviews.llvm.org/D89105	2020-10-31 12:39:51 +08:00
Benjamin Kramer	39a0d6889d	[X86] Add a stub for Intel's alderlake. No scheduling, no autodetection.	2020-10-24 19:01:22 +02:00
Benjamin Kramer	bd2cf96c09	[X86] Add a stub for znver3 based on the little public information there is in AMD's manuals No scheduling, no autodetection. Just enough so -march=znver3 works.	2020-10-24 19:01:22 +02:00
Tianqing Wang	be39a6fe6f	[X86] Add User Interrupts(UINTR) instructions For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89301	2020-10-22 17:33:07 +08:00
Fangrui Song	012dd42e02	[X86] Support -march=x86-64-v[234] PR47686. These micro-architecture levels are defined in the x86-64 psABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9 GCC 11 will support these levels. Note, -mtune=x86-64-v[234] are invalid and __builtin_cpu_is cannot be used on them. Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D89197	2020-10-12 10:29:46 -07:00
Fangrui Song	cbe4d973ed	[X86] Define __LAHF_SAHF__ if feature 'sahf' is set or 32-bit mode GCC 11 will define this macro. In LLVM, the feature flag only applies to 64-bit mode and we always define the macro in 32-bit mode. This is different from GCC -m32 in which -mno-sahf can suppress the macro. The discrepancy can unlikely cause trouble. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89198	2020-10-11 09:46:00 -07:00
Rainer Orth	76e85ae268	[clang][Sparc] Default to -mcpu=v9 for Sparc V8 on Solaris As reported in Bug 42535, `clang` doesn't inline atomic ops on 32-bit Sparc, unlike `gcc` on Solaris. In a 1-stage build with `gcc`, only two testcases are affected (currently `XFAIL`ed), while in a 2-stage build more than 100 tests `FAIL` due to this issue. The reason for this `gcc`/`clang` difference is that `gcc` on 32-bit Solaris/SPARC defaults to `-mpcu=v9` where atomic ops are supported, unlike with `clang`'s default of `-mcpu=v8`. This patch changes `clang` to use `-mcpu=v9` on 32-bit Solaris/SPARC, too. Doing so uncovered two bugs: `clang -m32 -mcpu=v9` chokes with any Solaris system headers included: /usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined" #error "Both _ILP32 and _LP64 are defined" While `clang` currently defines `__sparcv9` in a 32-bit `-mcpu=v9` compilation, neither `gcc` nor Studio `cc` do. In fact, the Studio 12.6 `cc(1)` man page clearly states: These predefinitions are valid in all modes: [...] __sparcv8 (SPARC) __sparcv9 (SPARC -m64) At the same time, the patch defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]` for a 32-bit Sparc compilation with any V9 cpu. I've also changed `MaxAtomicInlineWidth` for V9, matching what `gcc` does and the Oracle Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic Types, 3.1 Size and Alignment of Atomic C Types). The two testcases that had been `XFAIL`ed for Bug 42535 are un-`XFAIL`ed again. Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`. Differential Revision: https://reviews.llvm.org/D86621	2020-09-11 09:53:19 +02:00
Craig Topper	e6bb4c8e7b	[X86] SSE4_A should only imply SSE3 not SSSE3 in the frontend. SSE4_1 and SSE4_2 due imply SSSE3. So I guess I got confused when switching the code to being table based in D83273. Fixes PR47464	2020-09-08 10:50:59 -07:00

1 2 3 4

177 Commits