This refactor patch means to remove CPU_SPECIFIC* MACROs in X86TargetParser.def
and move those information into ProcInfo of X86TargetParser.cpp. Since these
two files both maintain a table with redundant info such as cpuname and its
features supported. CPU_SPECIFIC* MACROs define some different information. This
patch dealt with them in these ways when moving:
1.mangling
This is now moved to Mangling in ProcInfo and directly initialized at array of
Processors. CPUs don't support cpu_dispatch/specific are assigned '\0' as
mangling.
2.CPU alias
The alias cpu will also be initialized in array of Processors, its attributes
will be same as its alias target cpu. Same feature list, same mangling.
3.TUNE_NAME
Before my change, some cpu names support cpu_dispatch/specific are not
supported in X86.td, which means optimizer/backend doesn't recognize them. So
they use a different TUNE_NAME to generate in IR. In this patch, I added these
missing cpu support at X86.td by utilizing existing Features and XXXTunings, so
that each cpu name can directly use its own name as TUNE_NAME to be supported
by optimizer/backend.
4.Feature list
The feature list of one CPU maintained in X86TargetParser.def is not same as
the one in X86TargetParser.cpp. It only maintains part of features of one CPU
(features defined by X86_FEATURE_COMPAT). While X86TargetParser.cpp maintains
a complete one. This patch abandons the feature list maintained by CPU_SPECIFIC*
MACROs because assigning a CPU with a complete one doesn't affect the
functionality of cpu_dispatch/specific.
Except these four info, since some of CPUs supported by cpu_dispatch/specific
doesn's support clang options like -march, -mtune before, this patch also kept
this behavior still by adding another member OnlyForCPUDispatchSpecific in
ProcInfo.
Reviewed By: pengfei, RKSimon
Differential Revision: https://reviews.llvm.org/D151696
Pursuant to discussions at
https://discourse.llvm.org/t/rfc-c-23-p1467r9-extended-floating-point-types-and-standard-names/70033/22,
this commit enhances the handling of the __bf16 type in Clang.
- Firstly, it upgrades __bf16 from a storage-only type to an arithmetic
type.
- Secondly, it changes the mangling of __bf16 to DF16b on all
architectures except ARM. This change has been made in
accordance with the finalization of the mangling for the
std::bfloat16_t type, as discussed at
https://github.com/itanium-cxx-abi/cxx-abi/pull/147.
- Finally, this commit extends the existing excess precision support to
the __bf16 type. This applies to hardware architectures that do not
natively support bfloat16 arithmetic.
Appropriate tests have been added to verify the effects of these
changes and ensure no regressions in other areas of the compiler.
Reviewed By: rjmccall, pengfei, zahiraam
Differential Revision: https://reviews.llvm.org/D150913
Currently default simd alignment is defined by Clang specific TargetInfo class.
This class cannot be reused for LLVM Flang. That's why default simd alignment
calculation has been moved to OMPIRBuilder which is common for Flang and Clang.
Previous attempt: https://reviews.llvm.org/D138496 was wrong because
the default alignment depended on the number of built LLVM targets.
If we wanted to calculate the default alignment for PPC and we hadn't specified
PPC LLVM target to build, then we would get 0 as the alignment because
OMPIRBuilder couldn't create PPCTargetMachine object and it returned 0 as
the default value.
If PPC LLVM target had been built earlier, then OMPIRBuilder could have created
PPCTargetMachine object and it would have returned 128.
Differential Revision: https://reviews.llvm.org/D141910
Reviewed By: jdoerfert
The current behavior for AMX macros is:
```
gcc -march=native -dM -E - < /dev/null | grep TILE
clang -march=native -dM -E - < /dev/null | grep TILE
```
which is not ideal. Change `__AMXTILE__` and friends to `__AMX_TILE__` (i.e.
have an underscore in them). This makes GCC and Clang agree on the naming of
these AMX macros to simplify downstream user code.
Fix this for `__AMXTILE__`, `__AMX_INT8__`, `__AMX_BF16__`, and `__AMX_FP16__`.
Differential Revision: https://reviews.llvm.org/D143094
Currently default simd alignment is defined by Clang specific TargetInfo class.
This class cannot be reused for LLVM Flang. That's why default simd alignment
calculation has been moved to OMPIRBuilder which is common for Flang and Clang.
Previous attempt: https://reviews.llvm.org/D138496 was wrong because
the default alignment depended on the number of built LLVM targets.
If we wanted to calculate the default alignment for PPC and we hadn't specified
PPC LLVM target to build, then we would get 0 as the alignment because
OMPIRBuilder couldn't create PPCTargetMachine object and it returned 0 as
the default value.
If PPC LLVM target had been built earlier, then OMPIRBuilder could have created
PPCTargetMachine object and it would have returned 128.
Differential Revision: https://reviews.llvm.org/D141910
Reviewed By: jdoerfert
Reorganize clang::Builtin::Info to have them naturally align on 4 bytes
boundaries.
Instead of storing builtin headers as a straight char pointer, enumerate
them and store the enum. It allows to use a small enum instead of a
pointer to reference them.
On a 64 bit machine, this brings sizeof(clang::Builtin::Info) from 56
down to 48 bytes.
On a release build on my Linux 64 bit machine, it shrinks the size of
libclang-cpp.so by 193kB.
The impact on performance is negligible in terms of instruction count,
but the wall time seems better, see
https://llvm-compile-time-tracker.com/compare.php?from=b3d8639f3536a4876b511aca9fb7948ff9266cee&to=a89b56423f98b550260a58c41e64aff9e56b76be&stat=task-clock
Differential Revision: https://reviews.llvm.org/D142024
Currently default simd alignment is specified by Clang specific TargetInfo
class. This class cannot be reused for LLVM Flang. If we move the default
alignment field into TargetMachine class then we can create TargetMachine
objects and query them to find SIMD alignment.
Scope of changes:
1) Added information about maximal allowed SIMD alignment to TargetMachine
classes.
2) Removed getSimdDefaultAlign function from Clang TargetInfo class.
3) Refactored createTargetMachine function.
Reviewed By: jsjodin
Differential Revision: https://reviews.llvm.org/D138496
This avoids recomputing string length that is already known at compile time.
It has a slight impact on preprocessing / compile time, see
https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u
This a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.
The above patchset caused some version of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.
Differential Revision: https://reviews.llvm.org/D139881
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Add support for the RDPRU instruction on Zen2 processors.
User-facing features:
- Clang option -m[no-]rdpru to enable/disable the feature
- Support is implicit for znver2/znver3 processors
- Preprocessor symbol __RDPRU__ to indicate support
- Header rdpruintrin.h to define intrinsics
- "rdpru" mnemonic supported for assembler code
Internal features:
- Clang builtin __builtin_ia32_rdpru
- IR intrinsic @llvm.x86.rdpru
Differential Revision: https://reviews.llvm.org/D128934
This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.
This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).
These operands should probably be treated the same as memory operands by
CodeGenPrepare, which have been commented with "TODO" there.
Review: Xiang Zhang and Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D122220
Due to various implementation constraints, despite the programmer
choosing a 'processor' cpu_dispatch/cpu_specific needs to use the
'feature' list of a processor to identify it. This results in the
identified processor in source-code not being propogated to the
optimizer, and thus, not able to be tuned for.
This patch changes to use the actual cpu as written for tune-cpu so that
opt can make decisions based on the cpu-as-spelled, which should better
match the behavior expected by the programmer.
Note that the 'valid' list of processors for x86 is in
llvm/include/llvm/Support/X86TargetParser.def. At the moment, this list
contains only Intel processors, but other vendors may wish to add their
own entries as 'alias'es (or with different feature lists!).
If this is not done, there is two potential performance issues with the
patch, but I believe them to be worth it in light of the improvements to
behavior and performance.
1- In the event that the user spelled "ProcessorB", but we only have the
features available to test for "ProcessorA" (where A is B minus
features),
AND there is an optimization opportunity for "B" that negatively affects
"A", the optimizer will likely choose to do so.
2- In the event that the user spelled VendorI's processor, and the
feature
list allows it to run on VendorA's processor of similar features, AND
there
is an optimization opportunity for VendorIs that negatively affects
"A"s,
the optimizer will likely choose to do so. This can be fixed by adding
an
alias to X86TargetParser.def.
Differential Revision: https://reviews.llvm.org/D121410
We should match GCC's behavior which allows floating-point type for -mno-x87 option on 32-bits. https://godbolt.org/z/KrbhfWc9o
The previous block issues have partially been fixed by D112143.
Reviewed By: asavonic, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D114162