The "target-features" function attribute is not currently considered
when adding vscale_range to a function. When +sve/+sme are pushed onto
functions with "#pragma attribute push(+sve/+sme)", those functions
can miss optimizations that rely on vscale_range being present.
The FEAT_SPEv1p2 feature (known to LLVM as FeatureSPE_EEF and +spe-eef)
was marked as a required feature of Armv8.7-A (and later). This is
incorrect: the feature is optional, and some CPUs do not implement it.
This change moves it to the default features list, so that it is still
enabled by -march=armv8.7-a, but can be configured individually for
each processor.
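
To illustrate the difference between a required and a default feature, here
is a minimal toy model. It is not LLVM's TableGen or API: the `Arch` struct,
the bitmask encoding, and the `marchFeatures`/`cpuFeatures` helpers are all
hypothetical names. It assumes a CPU definition inherits only the required
features of its architecture, plus whatever it lists itself.

```cpp
#include <cstdint>

// Toy model only; every name here is hypothetical, not LLVM's API.
enum Feature : std::uint32_t { SPE_EEF = 1u << 0 };

struct Arch {
  std::uint32_t required;  // forced onto every CPU of this architecture
  std::uint32_t defaults;  // enabled by -march=<arch>, but not forced on CPUs
};

// Features enabled when only -march=<arch> is given.
constexpr std::uint32_t marchFeatures(Arch a) { return a.required | a.defaults; }

// Features of a specific CPU: the required set plus the CPU's own list.
constexpr std::uint32_t cpuFeatures(Arch a, std::uint32_t cpuOwn) {
  return a.required | cpuOwn;
}

constexpr Arch before{SPE_EEF, 0};  // SPE_EEF modelled as required
constexpr Arch after{0, SPE_EEF};   // SPE_EEF modelled as a default

// -march still enables the feature in both models.
static_assert((marchFeatures(before) & SPE_EEF) != 0, "march before");
static_assert((marchFeatures(after) & SPE_EEF) != 0, "march after");
// A CPU that does not list the feature no longer gets it implicitly.
static_assert((cpuFeatures(before, 0) & SPE_EEF) != 0, "forced before");
static_assert((cpuFeatures(after, 0) & SPE_EEF) == 0, "optional after");
```

In this model a CPU that should keep the feature has to list it explicitly,
which matches the per-CPU adjustments described next.
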
For Cortex-A520 and Cortex-A520AE, I've checked that these do not have any of
the FEAT_SPE* features, so I've updated the tests accordingly. All other
Arm-designed v8.7A+ and v9.2A+ CPUs should continue to have it enabled. For
Ampere1B and Fujitsu Monaka, these CPUs do not have the feature, so I've
removed it from their tests. For Apple M4, I haven't found any reference for
whether that CPU should have this feature, so I've added it to the CPU
definition to avoid this being a functional change.
The 2024-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm,
which allows the sve-bitperm instructions to run in streaming mode.
It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The
sve2-bitperm feature is now an alias for sve-bitperm and sve2.
A new feature flag sve-bitperm is added to reflect that the
instructions under FEAT_SVE_BitPerm are supported:
 - in non-streaming mode, with FEAT_SVE2 and FEAT_SVE_BitPerm, or
 - in streaming mode, with FEAT_SME and FEAT_SSVE_BitPerm.
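
The availability rule above can be written as a single predicate. This is a
minimal sketch, not the predicate used in the LLVM backend: the `Features`
struct and `bitPermAvailable` are hypothetical names chosen for illustration.

```cpp
// Toy model only; these names are hypothetical and are not LLVM's API.
struct Features {
  bool sve2;         // FEAT_SVE2
  bool sveBitPerm;   // FEAT_SVE_BitPerm
  bool sme;          // FEAT_SME
  bool ssveBitPerm;  // FEAT_SSVE_BitPerm
};

// True when the bit-permute instructions may be used in the given mode.
constexpr bool bitPermAvailable(const Features &f, bool streaming) {
  return streaming ? (f.sme && f.ssveBitPerm)    // streaming mode
                   : (f.sve2 && f.sveBitPerm);   // non-streaming mode
}

// SVE2 + SVE_BitPerm: usable in non-streaming mode only.
static_assert(bitPermAvailable({true, true, false, false}, false), "non-streaming");
static_assert(!bitPermAvailable({true, true, false, false}, true), "no SME");
// SME + SSVE_BitPerm: usable in streaming mode only.
static_assert(bitPermAvailable({false, false, true, true}, true), "streaming");
static_assert(!bitPermAvailable({false, false, true, true}, false), "no SVE2");
```
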
As with other targets (AMDGPU, Mips, PowerPC, RISCV, X86, ...),
`ninja check-clang-codegen-aarch64` can be used to test this subfolder.
Pull Request: https://github.com/llvm/llvm-project/pull/115818