This patch adds support for -mcpu=gb10 (NVIDIA GB10). This is a
big.LITTLE cluster of Cortex-X925 and Cortex-A725 cores. The appropriate
MIDR numbers are added to detect them in -mcpu=native.
We did not add an -mcpu=cortex-x925.cortex-a725 option because GB10 does
include the crypto instructions which we want on by default, and the
current convention is to not enable such extensions for Arm Cortex cores
in -mcpu where they are optional in the IP.
Relevant GCC patch:
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687005.html
Adds sve-sha3 to reference FEAT_SVE_SHA3 without specifically enabling
SVE2. The SVE2 requirement for AES, SHA3 and Bitperm is replaced with
SVE for non-streaming function.
The Andes CPU is configurable with optional extensions. The minimal
required extension set does not include `B` and `Zbc` extensions. So we
decided to remove them.
This patch adds initial support for the recently announced Armv9
Cortex-A320 processor.
For more information, including the Technical Reference Manual, see:
https://developer.arm.com/Processors/Cortex-A320
---------
Co-authored-by: Oliver Stannard <oliver.stannard@arm.com>
FEAT_FP8DOT4 and FEAT_FP8FMA are supported by FUJITSU-MONAKA. These were
previously enabled due to dependencies, but now require explicit
activation due to modifications in the dependencies.
Enable optional ISA extensions on Grace when mcpu=grace
is used: sve2-sm4, sve2-aes, sve2-sha3.
Grace is no longer an alias, but a separate CPU definition.
The FEAT_SPEv1p2 feature (known to LLVM as FeatureSPE_EEF and +spe-eef)
was incorrectly marked as a required feature of Armv8.7-A (and later),
which is incorrect because it is optional, and some CPUs do not
implement it. This moves it to the default features list, so that it is
still enabled by -march=armv8.7-a, but can be configured individually
for each processor.
For Cortex-A520 and Cortex-A520AE, I've checked that these do not have any of
the FEAT_SPE* features, so updated the tests accordingly. All other
Arm-designed v8.7A+ and v9.2A+ CPUs should continue to have it enabled. For
Ampere1B and Fujitsu Monaka, these CPUs do not have the feature, so I've
removed it from their tests. For Apple M4, I haven't found any reference for
whether that CPU should have this feature, so I've added it to the CPU
definition to avoid this being a functional change.
The ArmARM says:
```
"In an Armv9.4 implementation, if FEAT_SVE2 is implemented,
FEAT_SVE2p1 is implemented."
```
Since FEAT_SVE2 is already enabled for Armv9.0-A and later, then
FEAT_SVE2p1 should also be enabled by default.
The `SPE-EEF` system-register only feature introduced in Armv8.7-a adds
support for an extra system register (`PMSNEVFR_EL1`) to the Statistical
Profiling extension.
However, `SPE-EEF` is gated even for Armv8.7-a and the `spe-eef`
subtarget-feature is needed to enable it.
This behavior is inconsistent with the implementation for other
system-register only features as they can be used ungated under
supported architectures.
(e.g. HCX : Enable Armv8.7-A `HCRX_EL2` system register).
GCC/Binutils too do not add command line flags for features that only
enable system registers.
Fix by enabling `SPE-EEF` unconditionally under v8.7-A and above.
This adds a check that all ExtensionWithMArch which are marked as
implied features for an architecture are also present in the list of
default features. It doesn't make sense to have something mandatory but
not on by default.
There were a number of existing cases that violated this rule, and some
changes to which features are mandatory (indicated by the Implies
field).
This resulted in a bug where if a feature was marked as `Implies` but
was not added to `DefaultExt`, then for `-march=base_arch+nofeat` the
Driver would consider `feat` to have never been added and therefore
would do nothing to disable it (no `-target-feature -feat` would be
added, but the backend would enable the feature by default because of
`Implies`). See
clang/test/Driver/aarch64-negative-modifiers-for-default-features.c.
Note that the processor definitions do not respect the architecture
DefaultExts. These apply only when specifying `-march=<some architecture
version>`. So when a feature is moved from `Implies` to `DefaultExts` on
the Architecture definition, the feature needs to be added to all
processor definitions (that are based on that architecture) in order to
preserve the existing behaviour. I have checked the TRMs for many cases
(see specific commit messages) but in other cases I have just kept the
current behaviour and not tried to fix it.
We added FPAC recently in d7e8a7487cd7 to allow ptrauth codegen to rely
on the cpu auth failure checks rather than emitting its own auth failure
check/brk sequence.
Add it to the Apple processors that do have it: A15, A16, A17, M4.
While there, tweak the description to refer to Armv8.3-A rather than
v8.3-A, matching the other features.
Textual strings for architecture feature flags have not been
consistently written, so there are a wide variety of styles,
capitalisation, etc. and some are missing information. I have
tidied this up mechanically for AArch64, so that the output of
`--print-supported-extensions` looks much more consistent,
since it's user-visible.
But since SVE and friends have been added to the default extensions
list, and every CPU was opted into those extensions by default, we
couldn't correctly announce its architecutral version to the backend.
Additionally, we FEAT_MEC from llvm's "required" list for v9.0 to the
optional list for v9.2, as the spec considers it optional, and M4 does
not implement it. Similarly, fixes up several bugs w.r.t. FEAT_RME.
As a drive-by, I noticed that saphira did not have an
AArch64CPUTestParams entry, and thus added one.
For AArch64, we have existing tests for `--print-enabled-extensions` for
each architecture. However:
- These are added to the end of the existing tests which check for
`"-target-feature"`, which complicates them slightly.
- They do not test the descriptions printed next to each feature.
- Part of the output was tested separately in `TargetParserTest`.
- We did not have _any_ tests of this output for CPUs (only for
architectures).
Similarly, the tests for `--print-supported-extensions` do not give
complete coverage of either the full list of features or the
descriptions.
In my opinion we should be testing the full output, as this is what the
user sees. Descriptions and formatting can contain errors and be
accidentally broken.