Update feature flag names from:
AEK_SMEF64 to AEK_SMEF64F64
and
AEK_SMEI64 to AEK_SMEI16I64
These feature flags had their name changed in this previous patch
https://reviews.llvm.org/D135974
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D143989
Removes the forwarding header `llvm/Support/AArch64TargetParser.h`.
I am proposing to do this for all the forwarding headers left after
rGf09cf34d00625e57dea5317a3ac0412c07292148 - for each header:
- Update all relevant in-tree includes
- Remove the forwarding Header
Differential Revision: https://reviews.llvm.org/D140999
Currently negative architecture features passes to clang like -Xclang
-target-feature -Xclang -v9.3a will end up _enabling_ dependant target
features (like FEAT_MOPS). This patch fixes that by ensuring we don't
enable dependant target features when !Enabled.
Fixes#60375
Differential Revision: https://reviews.llvm.org/D142963
This updates the AArch64's Target Parser and its uses to capture
information about default features directly from ArchInfo and CpuInfo
objects, instead of relying on an API function to access them
indirectly.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D142540
This updates the parsing methods in AArch64's Target Parser to make use
of optional returns instead of "invalid" enum values, making the API's
behaviour clearer.
Reviewed By: lenary, tmatheson
Differential Revision: https://reviews.llvm.org/D142539
The Armv8.6-a and later architecture definitions included AES, SHA2,
SHA3 and SM4, but this did not have an effect when specifying
-march=armv8.6-a. The did not set preprocessor features
(https://godbolt.org/z/1YKad6M8e) or enable the relevant instructions
(like eor3 from sha3: https://godbolt.org/z/vY9v4MqvG). Similarly
architectures armv8 to armv8.5 defined +crypto, but this did not effect
the -march's, only the -mcpu with those architectures. I believe this
was working as intended.
After D141411 we now add the default features for architectures except
for +crypto, which has had the effect of enabling aes/sha2/sha3/sm4 when
-march=armv8.6-a is used. This patch removed those crypto features
again, going back to how things were before. It also removes the
AEK_CRYPTO feature from lower architecture levels, moving it to the cpus
that use it. This shouldn't make any changes, but a few extra tests have
been added for preprocessor features that have improved since llvm 15.
The -mcpu=ampere1 cpu is the only armv8.6+ cpu at present. For that, the
AES, SHA2 and SHA3 features have been re-added to the CPU definition to
keep it in-line with the gcc definition from
db2f5d6612.
Differential Revision: https://reviews.llvm.org/D141606
Reorganize clang::Builtin::Info to have them naturally align on 4 bytes
boundaries.
Instead of storing builtin headers as a straight char pointer, enumerate
them and store the enum. It allows to use a small enum instead of a
pointer to reference them.
On a 64 bit machine, this brings sizeof(clang::Builtin::Info) from 56
down to 48 bytes.
On a release build on my Linux 64 bit machine, it shrinks the size of
libclang-cpp.so by 193kB.
The impact on performance is negligible in terms of instruction count,
but the wall time seems better, see
https://llvm-compile-time-tracker.com/compare.php?from=b3d8639f3536a4876b511aca9fb7948ff9266cee&to=a89b56423f98b550260a58c41e64aff9e56b76be&stat=task-clock
Differential Revision: https://reviews.llvm.org/D142024
Reworked after several other major changes to the TargetParser since
this was reverted. Combined with several other changes.
Inline calls for the following macros and delete AArch64TargetParser.def:
AARCH64_ARCH, AARCH64_CPU_NAME, AARCH64_CPU_ALIAS, AARCH64_ARCH_EXT_NAME
Squashed changes from D139278 and D139102.
Differential Revision: https://reviews.llvm.org/D138792
Specifying an architecture revision should also add feature strings for
any dependent default extensions. Otherwise the new checks for
target-dependent features for acle intrinsics from D134353 and D132034
can fail.
This patch does that in setFeatureEnabled, similar to the addition of
dependent architecture revisions. +sve also needs to be added to armv9
architectures in the target parser, as it is implied by +sve2.
Fixes#59911
Differential Revision: https://reviews.llvm.org/D141411
This patch makes SVE intrinsics more useable by gating them on the
target, not by ifdef preprocessor macros. See #56480. This alters the
SVEEmitter for arm_sve.h to remove the #ifdef guards and instead use
TARGET_BUILTIN with the correct features so that the existing "'func'
needs target feature sve" error will be generated when sve is not
present.
The ArchGuard containing defines in the SVEEmitter are changed to
TargetGuard containing target features. In the arm_neon.h emitter there
are both existing ArchGuard ifdefs mixed with new TargetGuard target
feature guards, so the name is change in the SVE too for consistency.
The few functions that are present in arm_sve.h (as opposed to builtin
aliases) have __attribute__((target("sve"))) added. Some of the tests
needed to be rejigged a little, as well as updating the error message,
as the error now happens at a later point.
Differential Revision: https://reviews.llvm.org/D131064
This avoids recomputing string length that is already known at compile time.
It has a slight impact on preprocessing / compile time, see
https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u
This a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.
The above patchset caused some version of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.
Differential Revision: https://reviews.llvm.org/D139881
This reverts commit e43924a75145d2f9e722f74b673145c3e62bfd07.
Reason: Patch broke the MSan buildbots. More information is available on
the original phabricator review: https://reviews.llvm.org/D127812
hasFeature fields need to be initialised to false. Easy to miss as missed for hasPAuth and hasFlagM.
Maybe the code less error prone like this.
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D139622
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Re-land with constexpr StringRef::substr():
The TargetParser depends heavily on a collection of macros and enums to tie
together information about architectures, CPUs and extensions. Over time this
has led to some pretty awkward API choices. For example, recently a custom
operator-- has been added to the enum, which effectively turns iteration into
a graph traversal and makes the ordering of the macro calls in the header
significant. More generally there is a lot of string <-> enum conversion
going on. I think this shows the extent to which the current data structures
are constraining us, and the need for a rethink.
Key changes:
- Get rid of Arch enum, which is used to bind fields together. Instead of
passing around ArchKind, use the named ArchInfo objects directly or via
references.
- The list of all known ArchInfo becomes an array of pointers.
- ArchKind::operator-- is replaced with ArchInfo::implies(), which defines
which architectures are predecessors to each other. This allows features
from predecessor architectures to be added in a more intuitive way.
- Free functions of the form f(ArchKind) are converted to ArchInfo::f(). Some
functions become unnecessary and are deleted.
- Version number and profile are added to the ArchInfo. This makes comparison
of architectures easier and moves a couple of functions out of clang and
into AArch64TargetParser.
- clang::AArch64TargetInfo ArchInfo is initialised to Armv8a not INVALID.
- AArch64::ArchProfile which is distinct from ARM::ArchProfile
- Give things sensible names and add some comments.
Differential Revision: https://reviews.llvm.org/D138792
The TargetParser depends heavily on a collection of macros and enums to tie
together information about architectures, CPUs and extensions. Over time this
has led to some pretty awkward API choices. For example, recently a custom
operator-- has been added to the enum, which effectively turns iteration into
a graph traversal and makes the ordering of the macro calls in the header
significant. More generally there is a lot of string <-> enum conversion
going on. I think this shows the extent to which the current data structures
are constraining us, and the need for a rethink.
Key changes:
- Get rid of Arch enum, which is used to bind fields together. Instead of
passing around ArchKind, use the named ArchInfo objects directly or via
references.
- The list of all known ArchInfo becomes an array of pointers.
- ArchKind::operator-- is replaced with ArchInfo::implies(), which defines
which architectures are predecessors to each other. This allows features
from predecessor architectures to be added in a more intuitive way.
- Free functions of the form f(ArchKind) are converted to ArchInfo::f(). Some
functions become unnecessary and are deleted.
- Version number and profile are added to the ArchInfo. This makes comparison
of architectures easier and moves a couple of functions out of clang and
into AArch64TargetParser.
- clang::AArch64TargetInfo ArchInfo is initialised to Armv8a not INVALID.
- AArch64::ArchProfile which is distinct from ARM::ArchProfile
- Give things sensible names and add some comments.
Differential Revision: https://reviews.llvm.org/D138792
Virtual Memory System Architecture (VMSA)
This is part of the 2022 A-Profile Architecture extensions and adds support for
the following:
- Translation Hardening Extension (FEAT_THE)
- 128-bit Page Table Descriptors (FEAT_D128)
- 56-bit Virtual Address (FEAT_LVA3)
- Support for 128-bit System Registers (FEAT_SYSREG128)
- System Instructions that can take 128-bit inputs (FEAT_SYSINSTR128)
- 128-bit Atomic Instructions (FEAT_LSE128)
- Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE)
- Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE)
- Memory Attribute Index Enhancement (FEAT_AIE)
New instructions added:
- FEAT_SYSREG128 adds MRRS and MSRR.
- FEAT_SYSINSTR128 adds the SYSP instruction and TLBIP aliases.
- FEAT_LSE128 adds LDCLRP, LDSET, and SWPP instructions.
- FEAT_THE adds the set of RCW* instructions.
Specs for individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/
Contributors:
Keith Walker
Lucas Prates
Sam Elliott
Son Tuan Vu
Tomas Matheson
Differential Revision: https://reviews.llvm.org/D138920
D133848 added support for the GCC format of target("..") attributes. The
supported formats to match gcc are:
// "arch=<arch>" - parsed to features as per -march=..
// "cpu=<cpu>" - parsed to features as per -mcpu=.., with CPU set to <cpu>
// "tune=<cpu>" - TuneCPU set to <cpu>
// "+feature", "+nofeature" - Add (or remove) feature.
We also support the existing formats, previously accepted by clang, for
compatibility with the existing code and intrinsics code:
// "feature", "no-feature" - Add (or remove) feature.
The clang formats would accept and use internal feature names
("fullfp16"/"neon"/"sve") as opposed to the user facing names
("fp16"/"simd"/"sve"). Usually they use the same names, but can be
different for cases like fp, fullfp16 and mte (among others).
This patch makes the clang format also except the user facing names, by
parsing the features through getArchExtFeature. There is a fallback if
the name is not recognized (like "fullfp16"), where we add the existing
string which should then be checked later for consistency. This allows
the internal names to be used as before, so long as they are recognized
as internal names. (Note that we currently don't have an implementation
of isValidFeatureName. The backend will currently give an error like
"'-sid' is not a recognized feature for this target (ignoring feature)."
This should be improved in a later patch once an implementation of
isValidFeatureName in clang is present).
Differential Revision: https://reviews.llvm.org/D137617
Similar to D131064, this alters most of the intrinsics in arm_neon.h to
be target based, not preprocessor based. The intrinsics that are changed
are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm
and bf16). The ones that are not yet altered are the ones without target
features like rdma (8.1) and complex (8.3). Those will be switched in a
followup patch that allows targeting architecture versions.
The existing ArchGuard in arm_neon.td is split into ArchGuard that still
adds ifdef defines (for example for intrinsics that require __aarch64__),
and TargetGuards for intrinsics dependant on target features. From there
the TargetGuards are used in two ways:
- For intrinsics emitted as functions, __attribute__((target(TargetGuard)))
is added to the definition of the function. Along with the existing
always_inline intrinsic, this will give a compile time error if the
function is used in a context where the target feature is not available.
- For intrinsics emitted as macros, the __builtins are emitted into
arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes
the target feature and gives an error if the builtin is found in a
function without the required features, similar to arm_sve.h.
The second method requires that the intrinsics be separable from the
existing _v intrinsics used in other types. For example
__builtin_neon_splat_lane_bf16 is used as opposed to
__builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin
to account for intrinsics that can be treated similarly, except for
their target features.
Differential Revision: https://reviews.llvm.org/D132034
This adds support under AArch64 for the target("..") attributes. The
current parsing is very X86-shaped, this patch attempts to bring it line
with the GCC implementation from
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes.
The supported formats are:
- "arch=<arch>" strings, that specify the architecture features for a
function as per the -march=arch+feature option.
- "cpu=<cpu>" strings, that specify the target-cpu and any implied
atributes as per the -mcpu=cpu+feature option.
- "tune=<cpu>" strings, that specify the tune-cpu cpu for a function as
per -mtune.
- "+<feature>", "+no<feature>" enables/disables the specific feature, for
compatibility with GCC target attributes.
- "<feature>", "no-<feature>" enabled/disables the specific feature, for
backward compatibility with previous releases.
To do this, the parsing of target attributes has been moved into
TargetInfo to give the target the opportunity to override the existing
parsing. The only non-aarch64 change should be a minor alteration to the
error message, specifying using "CPU" to describe the cpu, not
"architecture", and the DuplicateArch/Tune from ParsedTargetAttr have
been combined into a single option.
Differential Revision: https://reviews.llvm.org/D133848
This cc1 option -fallow-half-arguments-and-returns allows __fp16 to be
passed by argument and returned, without giving an error. It is
currently always enabled for Arm and AArch64, by forcing the option in
the driver. This means any cc1 tests (especially those needing
arm_neon.h) need to specify the option too, to prevent the error from
being emitted.
This changes it to a target option instead, set to true for Arm and
AArch64. This allows the option to be removed. Previously it was implied
by -fnative_half_arguments_and_returns, which is set for certain
languages like open_cl, renderscript and hlsl, so that option now too
controls the errors. There were are few other non-arm uses of
-fallow-half-arguments-and-returns but I believe they were unnecessary.
The strictfp_builtins.c tests were converted from __fp16 to _Float16 to
avoid the issues.
Differential Revision: https://reviews.llvm.org/D133885
A given function is compatible with all previous arch versions.
To avoid compering values of the attribute this logic adds all predecessor
architecture values.
Reviewed By: dmgreen, DavidSpickett
Differential Revision: https://reviews.llvm.org/D134353
The __ARM_FEATURE_SVE_VECTOR_OPERATORS macro should be changed to
indicate that this feature is now supported on VLA vectors as well as
VLS vectors. There is a complementary PR to the ACLE spec here
https://github.com/ARM-software/acle/pull/213
Reviewed By: peterwaller-arm
Differential Revision: https://reviews.llvm.org/D131573
We would like to make the ACLE NEON and SVE intrinsics more useable by
gating them on the target, not by ifdef preprocessor macros. In order to
do this the types they use need to be available. This patches makes
__bf16 always available under AArch64 not just when the bf16
architecture feature is present. This bringing it in-line with GCC. In
subsequent patches the NEON bfloat16x8_t and SVE svbfloat16_t types
(along with bfloat16_t used in arm_sve.h) will be made unconditional
too.
The operations valid on the types are still very limited. They can be
used as a storage type, but the intrinsics used for convertions are
still behind an ifdef guard in arm_neon.h/arm_bf16.h.
Differential Revision: https://reviews.llvm.org/D130973
This implements minimum support in clang for default HBC/MOPS features
on v8.8-a/v9.3-a or later architectures.
Differential Revision: https://reviews.llvm.org/D120111
This removes two feature flags from `AArch64TargetInfo` class:
- `HasHBC`: this feature does not involve generating any IR intrinsics,
so clang does not need to know about whether it is set
- `HasCrypto`: this feature is deprecated in favor of finer grained
features such as AES, SHA2, SHA3 and SM4. The associated ACLE macro
__ARM_FEATURE_CRYPTO is thus no longer used.
Differential Revision: https://reviews.llvm.org/D118757