177 Commits

Author SHA1 Message Date
Rainer Orth
39e30508a7
[Driver][Sparc] Default to -mcpu=v9 for 32-bit Linux/sparc64 (#109278)
While working on supporting PR #109101 on Linux/sparc64, I was reminded
that `clang -m32` still defaults to generating V8 code, although the
64-bit kernel requires a V9 CPU.

This patch corrects that.

Tested on `sparc64-unknown-linux-gnu`, `x86_64-pc-linux-gnu`,
`sparcv9-sun-solaris2.11`, and `amd64-pc-solaris2.11`.
2024-09-21 19:53:35 +02:00
Ganesh
02e4186d0b
[X86] AMD Zen 5 Initial enablement (#107964)
This patch enables the basic skeleton enablement of AMD next gen zen5 CPUs.
2024-09-13 17:45:33 +01:00
Sean Perry
e62bf7cd0b
[z/OS] Set the default arch for z/OS to be arch10 (#89854)
The default arch level on z/OS is arch10. Update the code so z/OS has
arch10 without changing the default for zLinux.
2024-09-09 15:24:16 -04:00
James Y Knight
f0eb5587ce
Remove support for 3DNow!, both intrinsics and builtins. (#96246)
This set of instructions was only supported by AMD chips starting in
the K6-2 (introduced 1998), and before the "Bulldozer" family
(2011). They were never much used, as they were effectively superseded
by the more-widely-implemented SSE (first implemented on the AMD side
in Athlon XP in 2001).

This is being done as a predecessor towards general removal of MMX
register usage. Since there is almost no usage of the 3DNow!
intrinsics, and no modern hardware even implements them, simple
removal seems like the best option.

(Clang half originally uploaded in https://reviews.llvm.org/D94213)

Works towards issue #41665 and issue #98272.
2024-07-16 12:08:48 -04:00
Freddy Ye
4def1ce101
Reland "[X86] Remove knl/knm specific ISAs supports (#92883)" (#93136)
This reverts commit aa4069ea96e5eb62bc8c7895b9d920f129611b3a.
2024-05-24 13:46:34 +08:00
Freddy Ye
aa4069ea96
Revert "[X86] Remove knl/knm specific ISAs supports (#92883)" (#93123)
This reverts commit 282d2ab58f56c89510f810a43d4569824a90c538.
2024-05-23 10:25:23 +08:00
Freddy Ye
282d2ab58f
[X86] Remove knl/knm specific ISAs supports (#92883)
Cont. patch after https://github.com/llvm/llvm-project/pull/75580
2024-05-23 09:46:44 +08:00
Joseph Huber
7155c1ef65
[NVPTX] Allow compiling LLVM-IR without -march set (#79873)
Summary:
The NVPTX tools require an architecture to be used, however if we are
creating generic LLVM-IR we should be able to leave it unspecified. This
will result in the `target-cpu` attributes not being set on the
functions so it can be changed when linked into code. This allows the
standalone `--target=nvptx64-nvidia-cuda` toolchain to create LLVM-IR
simmilar to how CUDA's deviceRTL looks from C/C++
2024-01-30 21:44:43 -06:00
Joseph Huber
626fe71fa5 [Clang] Fix test failing on systems without ROCm installed
Summary:
Forgot to specify `-nogpulib` which makes this test look for ROCm.
2024-01-30 13:17:02 -06:00
Joseph Huber
f2a78e68ee
[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#80035)
Summary:
Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means
to create a sort of "generic" IR. The resulting IR will not contain any
target dependent attributes and can then be inserted into another
program via `-mlink-builtin-bitcode` to inherit its attributes.

However, there are a handful of macros that can leak incorrect
information when compiling for an unspecified architecture. Currently,
things like the wavefront size will default to 64, which is actually
variable. We should not expose these macros unless it is known.
2024-01-30 13:05:29 -06:00
Joseph Huber
72d4fc1b4d Revert "[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#79660)"
This reverts commit c9a6e993f7b349405b6c8f9244cd9cf0f56a6a81.

This breaks HIP code that incorrectly depended on GPU-specific macros to
be set. The code is totally wrong as using `__WAVEFRTONSIZE__` on the
host is absolutely meaningless, but it seems this entire corner of the
toolchain is fundmentally broken. Reverting for now to avoid breakages.
2024-01-29 11:11:25 -06:00
Joseph Huber
c9a6e993f7
[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#79660)
Summary:
Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means
to create a sort of "generic" IR. The resulting IR will not contain any
target dependent attributes and can then be inserted into another
program via `-mlink-builtin-bitcode` to inherit its attributes.

However, there are a handful of macros that can leak incorrect
information when compiling for an unspecified architecture. Currently,
things like the wavefront size will default to 64, which is actually
variable. We should not expose these macros unless it is known.
2024-01-29 08:46:14 -06:00
Freddy Ye
19e784604c
[X86] Remove RAO-INT from Grandridge (#76420)
According to latest spec:
https://cdrdv2.intel.com/v1/dl/getContent/671368
2023-12-28 10:06:54 +08:00
Phoebe Wang
c78aeabaec
[X86] Add a EVEX256 macro to match with GCC and MSVC (#71317) 2023-11-07 14:39:24 +08:00
Freddy Ye
278e533ee9
[X86] Support -march=pantherlake,clearwaterforest (#69277) 2023-10-19 15:11:15 +08:00
XinWang10
057ec767ad
[X86][NFC]Update test cases after D159250 (#68517) 2023-10-10 09:32:32 +08:00
Fangrui Song
8cfe9d8f2a [Driver] Remove remnant myriad pieces after Myriad.cpp removal
after D104279 and D158706.
2023-08-25 13:29:10 -07:00
Freddy Ye
6acff5390d [X86] Support -march=gracemont
gracemont has some different tuning features from alderlake.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D158046
2023-08-21 08:49:01 +08:00
Freddy Ye
c9d92e6638 [X86] Support -march=arrowlake,arrowlake-s,lunarlake
Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D156239
2023-07-28 15:05:54 +08:00
Freddy Ye
6d23a3faa4 [X86] Support -march=graniterapids-d and update -march=graniterapids
Reviewed By: pengfei, RKSimon, skan

Differential Revision: https://reviews.llvm.org/D155798
2023-07-25 13:48:31 +08:00
Freddy Ye
5cc4b1059b [X86] Update features for sierraforest, grandridge
Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D155784
2023-07-25 11:00:41 +08:00
Freddy Ye
548e08c3f6 [NFC] Add missing cpu tests in predefined-arch-macros.c
Added tests for penryn, nehalem, westmere, sandybridge, ivybridge,
haswell, bonnell, silvermont.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D153714
2023-06-29 13:30:13 +08:00
Freddy Ye
847abddedc [X86] Add AMX_COMPLEX to Graniterapids
This patch also rename __AMXCOMPLEX__ to __AMX_COMPLEX__

Reviewed By: skan, xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D147525
2023-04-06 13:19:44 +08:00
Joe Loser
8998fa6c14 [clang] Change AMX macros to match names from GCC
The current behavior for AMX macros is:

```
gcc -march=native -dM -E - < /dev/null | grep TILE

clang -march=native -dM -E - < /dev/null | grep TILE
```

which is not ideal.  Change `__AMXTILE__` and friends to `__AMX_TILE__` (i.e.
have an underscore in them).  This makes GCC and Clang agree on the naming of
these AMX macros to simplify downstream user code.

Fix this for `__AMXTILE__`, `__AMX_INT8__`, `__AMX_BF16__`, and `__AMX_FP16__`.

Differential Revision: https://reviews.llvm.org/D143094
2023-02-03 07:00:16 -07:00
Ben Shi
16f9451b07 [clang] Redefine some AVR specific macros
Fixes https://github.com/llvm/llvm-project/issues/58855

Reviewed By: aykevl, Miss_Grape

Differential Revision: https://reviews.llvm.org/D141598
2023-01-13 17:22:15 +08:00
Ben Shi
485ba407a6 [clang][test] Remove unnecessary 'REQUIRES'
The test 'Preprocessor/predefined-arch-macros.c' contains many
target tests than 'amdgpu'. If clang is built without 'amdgpu',
then failures in other target tests will not be reported.

Reviewed By: aaron.ballman, MaskRay

Differential Revision: https://reviews.llvm.org/D141647
2023-01-13 10:04:22 +08:00
Brad Smith
f70d17fc2c [LoongArch] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros
Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros

Reviewed By: SixWeining, MaskRay

Differential Revision: https://reviews.llvm.org/D141070
2023-01-05 20:21:22 -05:00
Freddy Ye
27b8f54f51 [X86] Support -march=emeraldrapids
Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D140950
2023-01-05 20:27:32 +08:00
Brad Smith
d227c3b68c [Hexagon][VE][WebAssembly] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros
Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros

Reviewed By: kparzysz, aheejin, MaskRay

Differential Revision: https://reviews.llvm.org/D140757
2023-01-05 04:45:07 -05:00
Brad Smith
2784b243e3 [M68k] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros
Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros

Fixes #58974

Reviewed By: myhsu, glaubitz, 0x59616e

Differential Revision: https://reviews.llvm.org/D140695
2022-12-29 05:07:35 -05:00
Ganesh Gopalasubramanian
1f057e365f [X86] AMD Zen 4 Initial enablement
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D139073
2022-12-17 16:15:22 +05:30
Freddy Ye
84a18a260e [X86] Support -march=sierraforest, grandridge, graniterapids.
Reviewed By: skan, pengfei, MaskRay

Differential Revision: https://reviews.llvm.org/D137153
2022-11-09 16:56:03 +08:00
Freddy Ye
a806fc2767 [X86] Support -march=raptorlake, meteorlake
Reviewed By: pengfei, skan, MaskRay

Differential Revision: https://reviews.llvm.org/D135937
2022-11-04 09:32:17 +08:00
Simon Pilgrim
6e19e6ce36 [clang][X86] Add RDPRU predefined macro tests for znver2/znver3 targets
These were missed in D128934
2022-08-11 15:48:39 +01:00
Ulrich Weigand
1283ccb610 Support z16 processor name
The recently announced IBM z16 processor implements the architecture
already supported as "arch14" in LLVM.  This patch adds support for
"z16" as an alternate architecture name for arch14.
2022-04-21 19:58:22 +02:00
John Paul Adrian Glaubitz
5061eb6b01 [Sparc] Don't define __sparcv9 and __sparcv9__ when targeting V8+
Currently, clang defines the three macros __sparcv9, __sparcv9__
and __sparc_v9__ when targeting the V8+ baseline, i.e. using the
V9 instruction set on a 32-bit target.

Since neither gcc nor SolarisStudio define __sparcv9 and __sparcv9__
when targeting V8+, some existing code such as the glibc breaks when
defining either of these two macros on a 32-bit target as they are
used to detect a 64-bit target. Update the tests accordingly.

Fixes PR49562.

Reviewed By: jrtc27, MaskRay, hvdijk

Differential Revision: https://reviews.llvm.org/D98574
2022-01-21 09:57:17 -08:00
Wang, Pengfei
6f7f5b54c8 [X86] AVX512FP16 instructions enabling 1/6
1. Enable FP16 type support and basic declarations used by following patches.
2. Enable new instructions VMOVW and VMOVSH.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105263
2021-08-10 12:46:01 +08:00
Ulrich Weigand
8cd8120a7b [SystemZ] Add support for new cpu architecture - arch14
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining  __VEC__ == 10304.

Note: No currently available Z system supports the arch14
architecture.  Once new systems become available, the
official system name will be added as supported -march name.
2021-07-26 16:57:28 +02:00
Freddy Ye
3fc1fe8db8 [X86] Support -march=rocketlake
Reviewed By: skan, craig.topper, MaskRay

Differential Revision: https://reviews.llvm.org/D100085
2021-04-13 09:48:13 +08:00
Freddy Ye
5cb47be410 [X86] Remove FeatureCLWB from FeaturesICLClient
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100279
2021-04-12 12:08:59 +08:00
Freddy Ye
5f9489b754 [X86] Refine "Support -march=alderlake"
Refine "Support -march=alderlake"
Compare with tremont, it includes 25 more new features. They are
adx, aes, avx, avx2, avxvnni, bmi, bmi2, cldemote, f16c, fma, hreset, invpcid,
kl, lzcnt, movdir64b, movdiri, pclmulqdq, pconfig, pku, serialize, shstk, vaes,
vpclmulqdq, waitpkg, widekl.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D97832
2021-03-08 13:17:18 +08:00
Yaxun (Sam) Liu
efc063b621 Fix lit test failure due to 0b81d9
These lit tests now requires amdgpu-registered-target since they
use clang driver and clang driver passes an LLVM option which
is available only if amdgpu target is registered.

Change-Id: I2df31967409f1627fc6d342d1ab5cc8aa17c9c0c
2020-12-07 19:50:21 -05:00
Liu, Chen3
756f597841 [X86] Support Intel avxvnni
This patch mainly made the following changes:

1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.

Differential Revision: https://reviews.llvm.org/D89105
2020-10-31 12:39:51 +08:00
Benjamin Kramer
39a0d6889d [X86] Add a stub for Intel's alderlake.
No scheduling, no autodetection.
2020-10-24 19:01:22 +02:00
Benjamin Kramer
bd2cf96c09 [X86] Add a stub for znver3 based on the little public information there is in AMD's manuals
No scheduling, no autodetection. Just enough so -march=znver3 works.
2020-10-24 19:01:22 +02:00
Tianqing Wang
be39a6fe6f [X86] Add User Interrupts(UINTR) instructions
For more details about these instructions, please refer to the latest
ISE document:
https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D89301
2020-10-22 17:33:07 +08:00
Fangrui Song
012dd42e02 [X86] Support -march=x86-64-v[234]
PR47686. These micro-architecture levels are defined in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9

GCC 11 will support these levels.

Note, -mtune=x86-64-v[234] are invalid and __builtin_cpu_is cannot be
used on them.

Reviewed By: craig.topper, RKSimon

Differential Revision: https://reviews.llvm.org/D89197
2020-10-12 10:29:46 -07:00
Fangrui Song
cbe4d973ed [X86] Define __LAHF_SAHF__ if feature 'sahf' is set or 32-bit mode
GCC 11 will define this macro.

In LLVM, the feature flag only applies to 64-bit mode and we always define the
macro in 32-bit mode. This is different from GCC -m32 in which -mno-sahf can
suppress the macro. The discrepancy can unlikely cause trouble.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D89198
2020-10-11 09:46:00 -07:00
Rainer Orth
76e85ae268 [clang][Sparc] Default to -mcpu=v9 for Sparc V8 on Solaris
As reported in Bug 42535, `clang` doesn't inline atomic ops on 32-bit
Sparc, unlike `gcc` on Solaris.  In a 1-stage build with `gcc`, only two
testcases are affected (currently `XFAIL`ed), while in a 2-stage build more
than 100 tests `FAIL` due to this issue.

The reason for this `gcc`/`clang` difference is that `gcc` on 32-bit
Solaris/SPARC defaults to `-mpcu=v9` where atomic ops are supported, unlike
with `clang`'s default of `-mcpu=v8`.  This patch changes `clang` to use
`-mcpu=v9` on 32-bit Solaris/SPARC, too.

Doing so uncovered two bugs:

`clang -m32 -mcpu=v9` chokes with any Solaris system headers included:

  /usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined"
  #error "Both _ILP32 and _LP64 are defined"

While `clang` currently defines `__sparcv9` in a 32-bit `-mcpu=v9`
compilation, neither `gcc` nor Studio `cc` do.  In fact, the Studio 12.6
`cc(1)` man page clearly states:

            These predefinitions are valid in all modes:
  [...]
               __sparcv8 (SPARC)
               __sparcv9 (SPARC -m64)

At the same time, the patch defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]`
for a 32-bit Sparc compilation with any V9 cpu.  I've also changed
`MaxAtomicInlineWidth` for V9, matching what `gcc` does and the Oracle
Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic
Types, 3.1 Size and Alignment of Atomic C Types).

The two testcases that had been `XFAIL`ed for Bug 42535 are un-`XFAIL`ed
again.

Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D86621
2020-09-11 09:53:19 +02:00
Craig Topper
e6bb4c8e7b [X86] SSE4_A should only imply SSE3 not SSSE3 in the frontend.
SSE4_1 and SSE4_2 due imply SSSE3. So I guess I got confused when
switching the code to being table based in D83273.

Fixes PR47464
2020-09-08 10:50:59 -07:00