253 Commits

Author SHA1 Message Date
Freddy Ye
6d23a3faa4 [X86] Support -march=graniterapids-d and update -march=graniterapids
Reviewed By: pengfei, RKSimon, skan

Differential Revision: https://reviews.llvm.org/D155798
2023-07-25 13:48:31 +08:00
Fangrui Song
14b466b940 [X86] Fix a typo of Broadwell after D74918. NFC
Close #64053
2023-07-23 15:15:05 -07:00
Freddy Ye
1c154bd755 [X86] Add AVX-VNNI-INT16 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D155145
2023-07-20 14:31:16 +08:00
Freddy Ye
049d6a3f42 [X86] Add SM4 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D155148
2023-07-20 13:35:15 +08:00
Freddy Ye
c6f66de21a [X86] Add SM3 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D155147
2023-07-20 10:24:16 +08:00
Freddy Ye
fc3b7874b6 [X86] Add SHA512 instructions.
For more details about this instruction, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D155146
2023-07-20 09:44:44 +08:00
Freddy Ye
7717c0071d [X86] Remove CPU_SPECIFIC* MACROs and add getCPUDispatchMangling
This refactoring patch removes the CPU_SPECIFIC* macros from
X86TargetParser.def and moves their information into ProcInfo in
X86TargetParser.cpp, since both files maintain a table with redundant
information such as the CPU name and its supported features. The CPU_SPECIFIC*
macros also define some additional information, which this patch handles as
follows when moving it:
1. Mangling
This now lives in the Mangling member of ProcInfo and is initialized directly
in the Processors array. CPUs that don't support cpu_dispatch/specific are
assigned '\0' as their mangling.
2. CPU alias
An alias CPU is also initialized in the Processors array; its attributes are
the same as those of its alias target CPU: same feature list, same mangling.
3. TUNE_NAME
Before this change, some CPU names supported by cpu_dispatch/specific were not
supported in X86.td, which means the optimizer/backend didn't recognize them,
so they used a different TUNE_NAME when generating IR. This patch adds the
missing CPUs to X86.td using existing Features and XXXTunings, so that each CPU
name can use its own name as TUNE_NAME and be supported by the
optimizer/backend.
4. Feature list
The feature list for a CPU in X86TargetParser.def is not the same as the one
in X86TargetParser.cpp: the former contains only part of the CPU's features
(those defined by X86_FEATURE_COMPAT), while X86TargetParser.cpp maintains the
complete list. This patch drops the feature list maintained by the
CPU_SPECIFIC* macros, because assigning a CPU the complete list doesn't affect
the functionality of cpu_dispatch/specific.
Beyond these four items, some CPUs supported by cpu_dispatch/specific did not
previously support Clang options such as -march and -mtune; this patch
preserves that behavior by adding another member, OnlyForCPUDispatchSpecific,
to ProcInfo.
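
The consolidated table described above might look roughly like the following sketch. Only the member names `Mangling` and `OnlyForCPUDispatchSpecific` come from the commit message; the struct name, the helper functions, the CPU names, and the mangling characters are hypothetical and are not copied from X86TargetParser.cpp.

```cpp
#include <array>
#include <cassert>
#include <string_view>

// Hypothetical, simplified stand-in for the ProcInfo record.
struct ProcInfoSketch {
  std::string_view Name;
  char Mangling;                   // '\0': no cpu_dispatch/cpu_specific support
  bool OnlyForCPUDispatchSpecific; // true: not accepted by -march/-mtune
};

// Illustrative entries only; names and mangling letters are made up.
constexpr std::array<ProcInfoSketch, 4> Processors{{
    {"cpu_a", 'A', false},
    {"cpu_a_alias", 'A', true}, // alias: same mangling as its target
    {"cpu_b", '\0', false},     // usable with -march, but not dispatchable
    {"cpu_c", 'C', true},       // dispatch-only name
}};

// Linear lookup by name; returns nullptr for unknown CPUs.
constexpr const ProcInfoSketch *findCPU(std::string_view Name) {
  for (const ProcInfoSketch &P : Processors)
    if (P.Name == Name)
      return &P;
  return nullptr;
}

// Plays the role the commit assigns to getCPUDispatchMangling.
constexpr char getCPUDispatchManglingSketch(std::string_view Name) {
  const ProcInfoSketch *P = findCPU(Name);
  return P ? P->Mangling : '\0';
}

// A CPU is valid for -march only if it is not dispatch-only.
constexpr bool isValidForMarch(std::string_view Name) {
  const ProcInfoSketch *P = findCPU(Name);
  return P && !P->OnlyForCPUDispatchSpecific;
}

// Usage: an alias shares its target's mangling but stays hidden from -march.
inline void demoProcInfo() {
  assert(getCPUDispatchManglingSketch("cpu_a_alias") == 'A');
  assert(!isValidForMarch("cpu_a_alias"));
}
```

Keeping one table means a single lookup answers both questions (how to mangle, and whether the name is accepted on the command line), which is the redundancy the patch set out to remove.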

Reviewed By: pengfei, RKSimon

Differential Revision: https://reviews.llvm.org/D151696
2023-07-05 17:32:00 +08:00
M. Zeeshan Siddiqui
e621757365 [Clang][BFloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support
Pursuant to discussions at
https://discourse.llvm.org/t/rfc-c-23-p1467r9-extended-floating-point-types-and-standard-names/70033/22,
this commit enhances the handling of the __bf16 type in Clang.
- Firstly, it upgrades __bf16 from a storage-only type to an arithmetic
  type.
- Secondly, it changes the mangling of __bf16 to DF16b on all
  architectures except ARM. This change has been made in
  accordance with the finalization of the mangling for the
  std::bfloat16_t type, as discussed at
  https://github.com/itanium-cxx-abi/cxx-abi/pull/147.
- Finally, this commit extends the existing excess precision support to
  the __bf16 type. This applies to hardware architectures that do not
  natively support bfloat16 arithmetic.
Appropriate tests have been added to verify the effects of these
changes and ensure no regressions in other areas of the compiler.
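
To illustrate what excess precision means for a 16-bit type, here is a software emulation, not what Clang emits. It assumes bfloat16 is the upper 16 bits of an IEEE-754 binary32 value; all function names are hypothetical, and NaN handling is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Round a float to the nearest bfloat16 value (ties to even) and return it
// widened back to float. NaN inputs are not handled in this sketch.
inline float roundToBF16(float X) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  std::uint32_t Lsb = (Bits >> 16) & 1u; // lowest bit that survives
  Bits += 0x7FFFu + Lsb;                 // round half to even
  Bits &= 0xFFFF0000u;                   // drop the low 16 bits
  float Out;
  std::memcpy(&Out, &Bits, sizeof(Out));
  return Out;
}

// Every intermediate result is rounded back to bfloat16.
inline float sumRoundedEachStep(float A, float B, float C) {
  return roundToBF16(roundToBF16(A + B) + C);
}

// Intermediates stay in float; a single rounding happens at the end.
inline float sumWithExcessPrecision(float A, float B, float C) {
  return roundToBF16(A + B + C);
}

// Usage: near 256 the bfloat16 spacing is 2, so adding 1 twice is lost
// when rounding after each step but survives with excess precision.
inline void demoBF16() {
  assert(sumRoundedEachStep(256.0f, 1.0f, 1.0f) == 256.0f);
  assert(sumWithExcessPrecision(256.0f, 1.0f, 1.0f) == 258.0f);
}
```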

Reviewed By: rjmccall, pengfei, zahiraam

Differential Revision: https://reviews.llvm.org/D150913
2023-05-27 13:33:50 +08:00
Mingming Liu
739f5578c4 [NFC][Clang] Remove a reference on argument since 'Name' is not modified
Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D149274
2023-04-26 11:21:34 -07:00
Freddy Ye
847abddedc [X86] Add AMX_COMPLEX to Graniterapids
This patch also renames __AMXCOMPLEX__ to __AMX_COMPLEX__

Reviewed By: skan, xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D147525
2023-04-06 13:19:44 +08:00
Xiang1 Zhang
038b7e6b76 [X86] Support AMX Complex instructions
Reviewed By: Wang Pengfei

Differential Revision: https://reviews.llvm.org/D147420
2023-04-04 09:54:46 +08:00
Dominik Adamski
baca3c1507 Move SIMD alignment calculation to LLVM Frontend
Currently default simd alignment is defined by Clang specific TargetInfo class.
This class cannot be reused for LLVM Flang. That's why default simd alignment
calculation has been moved to OMPIRBuilder which is common for Flang and Clang.

Previous attempt: https://reviews.llvm.org/D138496 was wrong because
the default alignment depended on the number of built LLVM targets.

If we wanted to calculate the default alignment for PPC and we hadn't specified
PPC LLVM target to build, then we would get 0 as the alignment because
OMPIRBuilder couldn't create PPCTargetMachine object and it returned 0 as
the default value.

If PPC LLVM target had been built earlier, then OMPIRBuilder could have created
PPCTargetMachine object and it would have returned 128.
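
The failure mode of the previous attempt can be modeled with a toy registry. This is a hypothetical sketch: `TargetRegistry` and `defaultSimdAlignViaRegistry` are invented names, not OMPIRBuilder or TargetMachine APIs, and the x86_64 value is made up. Only the results 0 and 128 for PPC come from the description above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model: only targets compiled into this toolchain are registered,
// each with its default SIMD alignment.
using TargetRegistry = std::map<std::string, unsigned>;

// Models the flawed lookup: an unregistered target silently yields 0
// instead of its real default alignment.
inline unsigned defaultSimdAlignViaRegistry(const TargetRegistry &Built,
                                            const std::string &Target) {
  auto It = Built.find(Target);
  return It == Built.end() ? 0u : It->second;
}

// Usage: the answer for PPC depends on whether the PPC target was built.
inline void demoSimdAlign() {
  TargetRegistry Built;
  Built["x86_64"] = 64u; // illustrative value
  assert(defaultSimdAlignViaRegistry(Built, "ppc64") == 0u);
  Built["ppc64"] = 128u;
  assert(defaultSimdAlignViaRegistry(Built, "ppc64") == 128u);
}
```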

Differential Revision: https://reviews.llvm.org/D141910

Reviewed By: jdoerfert
2023-02-10 04:11:54 -06:00
Archibald Elliott
b590f99712 [NFC][TargetParser] Remove llvm/Support/X86TargetParser.h
2023-02-07 11:06:00 +00:00
Joe Loser
8998fa6c14 [clang] Change AMX macros to match names from GCC
The current behavior for AMX macros is:

```
gcc -march=native -dM -E - < /dev/null | grep TILE

clang -march=native -dM -E - < /dev/null | grep TILE
```

which is not ideal.  Change `__AMXTILE__` and friends to `__AMX_TILE__` (i.e.
have an underscore in them).  This makes GCC and Clang agree on the naming of
these AMX macros to simplify downstream user code.

Fix this for `__AMXTILE__`, `__AMX_INT8__`, `__AMX_BF16__`, and `__AMX_FP16__`.
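
Downstream code that must build with compilers from both before and after this rename could test for either spelling. The following is a minimal sketch; `SKETCH_HAS_AMX_TILE` and `hasAmxTileAtCompileTime` are hypothetical names introduced here.

```cpp
#include <cassert>

// Accept the underscore spelling and the older Clang spelling without it.
#if defined(__AMX_TILE__) || defined(__AMXTILE__)
#define SKETCH_HAS_AMX_TILE 1
#else
#define SKETCH_HAS_AMX_TILE 0
#endif

// True when the translation unit was compiled with AMX-TILE enabled.
constexpr bool hasAmxTileAtCompileTime() { return SKETCH_HAS_AMX_TILE != 0; }

// Usage: the helper mirrors the macro, so callers need only one check.
inline void demoAmxMacro() {
  assert(hasAmxTileAtCompileTime() == (SKETCH_HAS_AMX_TILE == 1));
}
```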

Differential Revision: https://reviews.llvm.org/D143094
2023-02-03 07:00:16 -07:00
Argyrios Kyrtzidis
4de51483ef Revert "[OpenMP][OMPIRBuilder]Move SIMD alignment calculation to LLVM Frontend"
Causes clang build failures, see https://reviews.llvm.org/D141910#4089465 for details.

This reverts commit ca446037af019d1aa01b1352a30a18df33038359.
2023-01-31 12:11:57 -08:00
Dominik Adamski
ca446037af [OpenMP][OMPIRBuilder]Move SIMD alignment calculation to LLVM Frontend
Currently default simd alignment is defined by Clang specific TargetInfo class.
This class cannot be reused for LLVM Flang. That's why default simd alignment
calculation has been moved to OMPIRBuilder which is common for Flang and Clang.

Previous attempt: https://reviews.llvm.org/D138496 was wrong because
the default alignment depended on the number of built LLVM targets.

If we wanted to calculate the default alignment for PPC and we hadn't specified
PPC LLVM target to build, then we would get 0 as the alignment because
OMPIRBuilder couldn't create PPCTargetMachine object and it returned 0 as
the default value.

If PPC LLVM target had been built earlier, then OMPIRBuilder could have created
PPCTargetMachine object and it would have returned 128.

Differential Revision: https://reviews.llvm.org/D141910

Reviewed By: jdoerfert
2023-01-26 15:10:19 -06:00
serge-sans-paille
5a7f47cc02 [clang] Optimize clang::Builtin::Info density
Reorganize clang::Builtin::Info to have them naturally align on 4 bytes
boundaries.

Instead of storing builtin headers as a straight char pointer, enumerate
them and store the enum. This allows using a small enum instead of a
pointer to reference them.

On a 64 bit machine, this brings sizeof(clang::Builtin::Info) from 56
down to 48 bytes.

On a release build on my Linux 64 bit machine, it shrinks the size of
libclang-cpp.so by 193kB.

The impact on performance is negligible in terms of instruction count,
but the wall time seems better, see
https://llvm-compile-time-tracker.com/compare.php?from=b3d8639f3536a4876b511aca9fb7948ff9266cee&to=a89b56423f98b550260a58c41e64aff9e56b76be&stat=task-clock
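
The pointer-to-enum idea can be sketched with two simplified layouts. These structs are hypothetical and much smaller than the real clang::Builtin::Info, so the sizes differ from the 56 and 48 bytes quoted above; the point is only that a one-byte enum can share a word with a neighboring field where a pointer cannot.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical "before" layout: the header is a full pointer.
struct InfoWithPointer {
  const char *Name;
  const char *Type;
  const char *HeaderName;
  std::uint32_t Langs;
};

// Small enumeration of the known headers (illustrative entries).
enum class HeaderID : std::uint8_t { None, StdioH, MathH };

// Hypothetical "after" layout: the one-byte enum packs next to Langs.
struct InfoWithEnum {
  const char *Name;
  const char *Type;
  std::uint32_t Langs;
  HeaderID Header;
};

static_assert(sizeof(InfoWithEnum) <= sizeof(InfoWithPointer),
              "the enum layout should never be larger");

// The header name is recovered from the enum on demand.
constexpr std::string_view headerName(HeaderID ID) {
  switch (ID) {
  case HeaderID::StdioH:
    return "stdio.h";
  case HeaderID::MathH:
    return "math.h";
  case HeaderID::None:
    break;
  }
  return "";
}

// Usage: look up the header for an entry.
inline void demoBuiltinInfo() {
  InfoWithEnum Entry{"sqrt", "dd", 0u, HeaderID::MathH};
  assert(headerName(Entry.Header) == "math.h");
}
```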

Differential Revision: https://reviews.llvm.org/D142024
2023-01-23 14:27:44 +01:00
Kazu Hirata
6ad0788c33 [clang] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<.  I'll post
a separate patch to remove #include "llvm/ADT/Optional.h".

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
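
For readers unfamiliar with the target API of this migration, here is a small standalone sketch using only the standard library. The function `vectorWidthFor` and its feature names are hypothetical; the comments map each call to the deprecated llvm::Optional accessor named in the linked thread.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical lookup that may or may not produce a value.
inline std::optional<unsigned> vectorWidthFor(std::string_view Feature) {
  if (Feature == "sse2")
    return 128u;
  if (Feature == "avx2")
    return 256u;
  return std::nullopt; // empty result
}

// Usage of the std::optional accessors.
inline void demoOptional() {
  std::optional<unsigned> W = vectorWidthFor("avx2");
  assert(W.has_value());                              // was hasValue()
  assert(W.value() == 256u);                          // was getValue()
  assert(vectorWidthFor("mmx").value_or(0u) == 0u);   // was getValueOr()
}
```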
2023-01-14 12:31:01 -08:00
Kazu Hirata
a1580d7b59 [clang] Add #include <optional> (NFC)
This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.

I'll post a separate patch to actually replace llvm::Optional with
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14 11:07:21 -08:00
Dominik Adamski
6809af1a23 Revert "[OpenMP][OMPIRBuilder] Move SIMD alignment calculation to LLVM Frontend"
This reverts commit ed01de67433174d3157e9d239d59dd465d52c6a5.
2023-01-13 14:38:17 -06:00
Dominik Adamski
ed01de6743 [OpenMP][OMPIRBuilder] Move SIMD alignment calculation to LLVM Frontend
Currently default simd alignment is specified by Clang specific TargetInfo
class. This class cannot be reused for LLVM Flang. If we move the default
alignment field into TargetMachine class then we can create TargetMachine
objects and query them to find SIMD alignment.

Scope of changes:
  1) Added information about maximal allowed SIMD alignment to TargetMachine
     classes.
  2) Removed getSimdDefaultAlign function from Clang TargetInfo class.
  3) Refactored createTargetMachine function.

Reviewed By: jsjodin

Differential Revision: https://reviews.llvm.org/D138496
2023-01-13 14:07:29 -06:00
serge-sans-paille
a3c248db87 Move from llvm::makeArrayRef to ArrayRef deduction guides - clang/ part
This is a follow-up to https://reviews.llvm.org/D140896, split into
several parts as it touches a lot of files.

Differential Revision: https://reviews.llvm.org/D141139
2023-01-09 12:15:24 +01:00
Freddy Ye
27b8f54f51 [X86] Support -march=emeraldrapids
Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D140950
2023-01-05 20:27:32 +08:00
serge-sans-paille
d9ab3e82f3 [clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.

The above patchset caused some versions of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make the BuiltinInfo tables compilation-unit static variables instead of private static variables.
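
The length-caching idea can be sketched with `std::string_view` standing in for llvm::StringRef (an assumption made so the example is standalone). The struct, table, and function names are hypothetical, and the tables are file-local statics in the spirit of the fix described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// "Before": only the pointer is stored, so the length is unknown.
struct InfoCharPtr {
  const char *Name;
};

// "After": pointer and length are stored together.
struct InfoView {
  std::string_view Name;
};

// File-local tables; the view lengths are computed at compile time.
static constexpr InfoCharPtr PtrTable[] = {{"__builtin_a"}, {"__builtin_bb"}};
static constexpr InfoView ViewTable[] = {{"__builtin_a"}, {"__builtin_bb"}};

// Walks the string on every call to find the terminator.
inline std::size_t nameLengthPtr(const InfoCharPtr &I) {
  return std::strlen(I.Name);
}

// Returns the stored length without touching the characters.
constexpr std::size_t nameLengthView(const InfoView &I) {
  return I.Name.size();
}

// Usage: both agree on the result; only the cost differs.
inline void demoStringView() {
  assert(nameLengthPtr(PtrTable[1]) == nameLengthView(ViewTable[1]));
}
```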

Differential Revision: https://reviews.llvm.org/D139881
2022-12-27 09:55:19 +01:00
Steven Wu
9cd6fbee7e Fix module build after TargetParser
Need to include the textual header from the correct module.
2022-12-20 10:31:19 -08:00
Ganesh Gopalasubramanian
1f057e365f [X86] AMD Zen 4 Initial enablement
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D139073
2022-12-17 16:15:22 +05:30
Kazu Hirata
eeee3fee37 [Basic] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 11:34:27 -08:00
Freddy Ye
84a18a260e [X86] Support -march=sierraforest, grandridge, graniterapids.
Reviewed By: skan, pengfei, MaskRay

Differential Revision: https://reviews.llvm.org/D137153
2022-11-09 16:56:03 +08:00
Freddy Ye
a806fc2767 [X86] Support -march=raptorlake, meteorlake
Reviewed By: pengfei, skan, MaskRay

Differential Revision: https://reviews.llvm.org/D135937
2022-11-04 09:32:17 +08:00
Freddy Ye
aee2a35ac4 [X86] Add AVX-NE-CONVERT instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D135930
2022-10-31 23:39:38 +08:00
Freddy Ye
23f02693ec [X86] Add AVX-VNNI-INT8 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135938
2022-10-28 10:39:54 +08:00
Freddy Ye
0e720e6ada [X86] Add AVX-IFMA instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135932
2022-10-28 09:42:30 +08:00
Phoebe Wang
b51b90d6e2 [X86][1/2] SUPPORT RAO-INT
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Initially authored by Liu Chen (@LiuChen3)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D135951
2022-10-27 17:20:07 +08:00
Freddy Ye
fdac4c4e92 [X86] Add CMPCCXADD instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135933
2022-10-25 14:33:39 +08:00
Xiang1 Zhang
661881d436 [X86] Add AMX-FP16 instructions.
Differential Revision: https://reviews.llvm.org/D135941
2022-10-22 08:05:22 +08:00
Phoebe Wang
62ca79102c [X86][1/2] Support PREFETCHI instructions
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D136040
2022-10-20 08:46:01 +08:00
Zahira Ammarguellat
5def954a5b Support of expression granularity for _Float16.
Differential Revision: https://reviews.llvm.org/D113107
2022-08-25 08:26:53 -04:00
Freddy Ye
e4888a37d3 [X86][BF16] Enable __bf16 for x86 targets.
The X86 psABI has been updated to support the __bf16 type, whose ABI is the
same as FP16's. See https://discourse.llvm.org/t/patch-add-optional-bfloat16-support/63149

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D130964
2022-08-10 09:00:47 +08:00
Fangrui Song
3f18f7c007 [clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
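
For reference, the standard attribute is written as its own statement directly before the next case label. The function below is a hypothetical illustration, not code from the patch.

```cpp
#include <cassert>

// Counts how many levels are implied by a given level, relying on
// intentional fallthrough between the cases.
inline int impliedLevels(int Level) {
  int Count = 0;
  switch (Level) {
  case 3:
    ++Count;
    [[fallthrough]]; // previously spelled LLVM_FALLTHROUGH
  case 2:
    ++Count;
    [[fallthrough]];
  case 1:
    ++Count;
    break;
  default:
    break;
  }
  return Count;
}

// Usage: level 3 implies levels 2 and 1 as well.
inline void demoFallthrough() { assert(impliedLevels(3) == 3); }
```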

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131346
2022-08-08 09:12:46 -07:00
Paul Robinson
08e4fe6c61 [X86] Add RDPRU instruction
Add support for the RDPRU instruction on Zen2 processors.

User-facing features:

- Clang option -m[no-]rdpru to enable/disable the feature
- Support is implicit for znver2/znver3 processors
- Preprocessor symbol __RDPRU__ to indicate support
- Header rdpruintrin.h to define intrinsics
- "rdpru" mnemonic supported for assembler code

Internal features:

- Clang builtin __builtin_ia32_rdpru
- IR intrinsic @llvm.x86.rdpru

Differential Revision: https://reviews.llvm.org/D128934
2022-07-06 07:17:47 -07:00
Phoebe Wang
abeeae570e [X86] Support _Float16 on SSE2 and up
This is split from D113107 to address #56204 and https://discourse.llvm.org/t/how-to-build-compiler-rt-for-new-x86-half-float-abi/63366

Reviewed By: zahiraam, rjmccall, bkramer, MaskRay

Differential Revision: https://reviews.llvm.org/D128571
2022-06-30 17:21:37 +08:00
Ben Langmuir
eab2a06f0f Revert "Reland "[X86] Support _Float16 on SSE2 and up""
Broke compiler-rt on Darwin: https://green.lab.llvm.org/green/job/clang-stage1-RA/29920/

This reverts commit 527ef8ca981e88a35758c0e4143be6853ea26dfc.
2022-06-28 10:59:03 -07:00
Phoebe Wang
527ef8ca98 Reland "[X86] Support _Float16 on SSE2 and up"
Enable `COMPILER_RT_HAS_FLOAT16` to fix the lit failure.

This is split from D113107 to address #56204 and https://discourse.llvm.org/t/how-to-build-compiler-rt-for-new-x86-half-float-abi/63366

Reviewed By: zahiraam, rjmccall, bkramer

Differential Revision: https://reviews.llvm.org/D128571
2022-06-28 14:38:56 +08:00
Vitaly Buka
8f7cca90af Revert "[X86] Support _Float16 on SSE2 and up"
Breaks buildbot
https://lab.llvm.org/buildbot/#/builders/37/builds/14334

This reverts commit f5d781d6273cc56dd8b44ee9e4cfb2ae5579bb04.
2022-06-27 12:43:29 -07:00
Phoebe Wang
f5d781d627 [X86] Support _Float16 on SSE2 and up
This is split from D113107 to address #56204 and https://discourse.llvm.org/t/how-to-build-compiler-rt-for-new-x86-half-float-abi/63366

Reviewed By: zahiraam, rjmccall, bkramer

Differential Revision: https://reviews.llvm.org/D128571
2022-06-27 21:37:30 +08:00
Jonas Paulsson
46f83caebc [InlineAsm] Add support for address operands ("p").
This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.

This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).

These operands should probably be treated the same as memory operands by
CodeGenPrepare, which have been commented with "TODO" there.

Review: Xiang Zhang and Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122220
2022-04-13 12:50:21 +02:00
Erich Keane
dc152659b4 Have cpu-specific variants set 'tune-cpu' as an optimization hint
Due to various implementation constraints, despite the programmer
choosing a 'processor', cpu_dispatch/cpu_specific needs to use the
'feature' list of a processor to identify it. This results in the
processor identified in source code not being propagated to the
optimizer, and thus it cannot be tuned for.

This patch changes to use the actual cpu as written for tune-cpu so that
opt can make decisions based on the cpu-as-spelled, which should better
match the behavior expected by the programmer.

Note that the 'valid' list of processors for x86 is in
llvm/include/llvm/Support/X86TargetParser.def. At the moment, this list
contains only Intel processors, but other vendors may wish to add their
own entries as 'alias'es (or with different feature lists!).

If this is not done, there are two potential performance issues with the
patch, but I believe them to be worth it in light of the improvements to
behavior and performance.

1- In the event that the user spelled "ProcessorB", but we only have the
features available to test for "ProcessorA" (where A is B minus features),
AND there is an optimization opportunity for "B" that negatively affects
"A", the optimizer will likely choose to do so.

2- In the event that the user spelled VendorI's processor, and the feature
list allows it to run on VendorA's processor of similar features, AND there
is an optimization opportunity for VendorI's that negatively affects
VendorA's, the optimizer will likely choose to do so. This can be fixed by
adding an alias to X86TargetParser.def.

Differential Revision: https://reviews.llvm.org/D121410
2022-03-14 06:14:30 -07:00
Phoebe Wang
925ec98d00 Revert "[X86][clang] Emit diagnostic for float and double when we have features -x87 and -sse on 64-bits"
This reverts commit 4a2c827b178f89d4cdeb56153d9440ad4ba786a3.

Need to fix the problem when using `-mno-sse` together with "x86intrin.h"
2021-12-10 10:31:09 +08:00
Phoebe Wang
4a2c827b17 [X86][clang] Emit diagnostic for float and double when we have features -x87 and -sse on 64-bits
A follow up of D114162.

Reviewed By: asavonic

Differential Revision: https://reviews.llvm.org/D114782
2021-12-08 09:50:26 +08:00
Phoebe Wang
42c15c7edf [X86][clang] Enable floating-point type for -mno-x87 option on 32-bits
We should match GCC's behavior, which allows floating-point types with the -mno-x87 option on 32-bit targets. https://godbolt.org/z/KrbhfWc9o

The previous blocking issues have been partially fixed by D112143.

Reviewed By: asavonic, nickdesaulniers

Differential Revision: https://reviews.llvm.org/D114162
2021-11-30 14:08:10 +08:00