I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of
70 under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
This was [addressed for AArch64
here](https://github.com/llvm/llvm-project/pull/79004), but the same
applies to ARM.
Move the enablement of neon+fp64 to `-mcpu=cortex-r52`, which optionally
supports these features.
When `+sve` is passed in the command line, if the Architecture being
targeted is V8.6A/V9.1A or later, `+f32mm` is also added. This enables
FEAT_32MM, however at the time of writing no CPU's support this. This
leads to the FEAT_32MM instructions being compiled for CPU's that do not
support them.
This commit removes the automatic enablement, however the option is
still able to be used by passing `+f32mm`.
This is something we generally tend to avoid due to it confusing the git
history, but with the new github formatting bots being more noisy we
keep running into issues with the existing formatting when adding or
adjusting CPUs. This patch formats the code to make sure we are in a
good state going forward.
Correct an issue with Arm Neoverse N2 after it was
changed to a v9a core in change
f576cbe44eabb8a5ac0af817424a0d1e7c8fbf85:
* FEAT_FHM should be enabled for this core.
This reverts commit 89c1bf1230e011f2f0e43554c278205fa1819de5.
This has been unimplemenented for a while, and GCC does not implement
it, therefore we need to consider whether we should just deprecate it
in the ACLE instead.
This patch changes how the macro __ARM_ARCH is defined to match its
defintion in the ACLE. In ACLE 5.4.1, __ARM_ARCH is defined as equal to
the major architecture version for ISAs up to and including v8. From
v8.1 onwards, its definition is changed to include minor versions, such
that for an architecture vX.Y, __ARM_ARCH = X*100 + Y. Before this
patch, LLVM defined __ARM_ARCH using only the major architecture version
for all architecture versions. This patch adds functionality to define
__ARM_ARCH correctly for architectures greater than or equal to v8.1.
The Ampere1B is Ampere's third-generation core implementing a
superscalar, out-of-order microarchitecture with nested virtualization,
speculative side-channel mitigation and architectural support for
defense against ROP/JOP style software attacks.
Ampere1B is an ARMv8.7+ implementation, adding support for the FEAT
WFxT, FEAT CSSC, FEAT PAN3 and FEAT AFP extensions. It also includes all
features of the second-generation Ampere1A, such as the Memory Tagging
Extension and SM3/SM4 cryptography instructions.
Before:
```
[----------] 65 tests from AArch64CPUTests/AArch64CPUTestFixture
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/0
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/0 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/1
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/1 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/2
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/2 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/3
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/3 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/4
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/4 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/5
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/5 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/6
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/6 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/7
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/7 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/8
...
```
After:
```
[----------] 65 tests from AArch64CPUTests/AArch64CPUTestFixture
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a34
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a34 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a35
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a35 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a53
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a53 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a55
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a55 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a510
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a510 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a520
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a520 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a57
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a57 (0 ms)
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a65
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a65 (0 ms)
...
```
Which improves the experience of finding and running this:
```
$ ./unittests/TargetParser/TargetParserTests --gtest_filter=AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a65
Note: Google Test filter = AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a65
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from AArch64CPUTests/AArch64CPUTestFixture
[ RUN ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a65
[ OK ] AArch64CPUTests/AArch64CPUTestFixture.testAArch64CPU/cortex_a65 (0 ms)
[----------] 1 test from AArch64CPUTests/AArch64CPUTestFixture (0 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (0 ms total)
[ PASSED ] 1 test.
```
Add AEK_PAUTH to ARMV8_3A in TargetParser and let it propagate to
ARMV8R, as it aligns with GCC defaults.
After adding AEK_PAUTH, several tests from TargetParserTest.cpp crashed
when trying to format an error message, thus update a format string in
AssertSameExtensionFlags to account for bitmask being pre-formatted as
std::string.
The CHECK-PAUTH* lines in aarch64-target-features.c are updated to
account for the fact that FEAT_PAUTH support and pac-ret can be enabled
independently and all four combinations are possible.
With a690e86 we added -mcpu/mtune=native support to handle the Microsoft
Azure Cobalt 100 CPU as a Neoverse N2. This patch adds a CPU alias in
TargetParser to maintain compatibility with GCC.
Currently there are several bits of code in the AArch64 driver which
attempt to enforce dependencies between optional features in the -march=
and -mcpu= options. However, these are based on the list of feature
names being enabled/disabled, so they have a lot of logic to consider
the order in which features were turned on and off, which doesn't scale
well as dependency chains get longer.
This patch moves the code handling these dependencies to TargetParser,
and changes them to use a Bitset of enabled features. This makes it easy
to check which features are enabled, and is converted back to a list of
LLVM feature names once all of the command-line options are parsed.
The motivating example for this was the -mcpu=cortex-r82+nofp option.
Previously, the code handling the dependency between the fp16 and
fp16fml extensions did not consider the nofp modifier, so it added
+fullfp16 to the feature list. This should have been disabled by the
+nofp modifier, and also the backend did follow the dependency between
fullfp16 and fp, resulting in fp being turned back on in the backend.
Most of the dependencies added to AArch64TargetParser.h weren't known
about by clang before, I built that list by checking what the backend
thinks the dependencies between SubtargetFeatures are.
This implements assembly support for the Memory Systems Extensions
introduced as part of the Armv9.5-A architecture version.
The changes include:
* New subtarget feature for FEAT_TLBIW.
* New system registers for FEAT_HDBSS:
* HDBSSBR_EL2 and HDBSSPROD_EL2.
* New system registers for FEAT_HACDBS:
* HACDBSBR_EL2 and HACDBSCONS_EL2.
* New TLBI instructions for FEAT_TLBIW:
* VMALLWS2E1(nXS), VMALLWS2E1IS(nXS) and VMALLWS2E1OS(nXS).
* New system register for FEAT_FGWTE3:
* FGWTE3_EL3.
- Adds a new +pc option to -mbranch-protection that will enable
the use of PC as a diversifier in PAC branch protection code.
- When +pauth-lr is enabled (-march=armv9.5a+pauth-lr) in combination
with -mbranch-protection=pac-ret+pc, the new 9.5-a instructions
(pacibsppc, retaasppc, etc) are used.
Documentation for the relevant instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/
Co-authored-by: Lucas Prates <lucas.prates@arm.com>
This introduces assembly support for the Checked Pointer Arithmetic
Extension (FEAT_CPA), annouced as part of the Armv9.5-A architecture
version.
The changes include:
* New subtarget feature for FEAT_CPA
* New scalar instruction for pointer arithmetic
* ADDPT, SUBPT, MADDPT, and MSUBPT
* New SVE instructions for pointer arithmetic
* ADDPT (vectors, predicated), ADDPT (vectors, unpredicated)
* SUBPT (vectors, predicated), SUBPT (vectors, unpredicated)
* MADPT and MLAPT
* New ID_AA64ISAR3_EL1 system register
Mode details about the extension can be found at:
* https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2023
* https://developer.arm.com/documentation/ddi0602/2023-09/
Co-authored-by: Rodolfo Wottrich <rodolfo.wottrich@arm.com>
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
Neoverse N2 was incorrectly marked as an Armv8.5a core. This has been
changed to an Armv9.0a core. However, crypto options are not enabled
by default for Armv9 cores, so -mcpu=neoverse-n2+crypto is required
to enable crypto for this core.
Neoverse N2 Technical Reference Manual:
https://developer.arm.com/documentation/102099/0003/
This introduces assembly support for the Checked Pointer Arithmetic
Extension (FEAT_CPA), annouced as part of the Armv9.5-A architecture
version.
The changes include:
* New subtarget feature for FEAT_CPA
* New scalar instruction for pointer arithmetic
* ADDPT, SUBPT, MADDPT, and MSUBPT
* New SVE instructions for pointer arithmetic
* ADDPT (vectors, predicated), ADDPT (vectors, unpredicated)
* SUBPT (vectors, predicated), SUBPT (vectors, unpredicated)
* MADPT and MLAPT
* New ID_AA64ISAR3_EL1 system register
Mode details about the extension can be found at:
* https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2023
* https://developer.arm.com/documentation/ddi0602/2023-09/
Co-authored-by: Rodolfo Wottrich <rodolfo.wottrich@arm.com>
We pretended they were v8.5a in the past because LLVM's modelling used
to fold SM4 crypto support into v8.6a (which the CPUs don't actually
have). That's changed in the last year so we can use the real value.
This is mostly a tidy-up commit before one that'll bring in A17 and M3.
This patch adds the feature flags of SME_F8F16 and SME_F8F32,
and the assembly/disassembly for the following instructions of SME2:
* SME:
- FMLAL, FMLALL
- FVDOT, FVDOTT
- FVDOTB
- FMOPA
That is according to this documentation:
https://developer.arm.com/documentation/ddi0602/2023-09
Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
This patch adds the feature flags of LUT and SME_LUTv2, and the
assembly/disassembly
for the following instructions of NEON, SVE2 and SME2:
* NEON:
- LUT2
- LUT4
* SVE2:
- LUTI2_ZZZI
- LUTI4_ZZZI
- LUTI4_Z2ZZI
* SME:
- MOVT
- LUTI4_4ZZT2Z
- LUTI4_S_4ZZT2Z
That is according to this documentation:
https://developer.arm.com/documentation/ddi0602/2023-09
This patch adds the feature flag FDOT2/FDOT4 and the
assembly/disassembly
for the following instructions of NEON and SVE2:
* NEON:
- FDOTlane
- FDOT
* SVE2:
- FDOT_ZZZI_BtoH
- FDOT_ZZZ_BtoH
- FDOT_ZZZI_BtoS
- FDOT_ZZZ_BtoS
That is according to this documentation:
https://developer.arm.com/documentation/ddi0602/2023-09
…mbly.
This patch adds the feature flag FAMINMAX and the assembly/disassembly
for the following instructions of NEON, SVE2 and SME2:
* NEON:
- FAMIN
- FAMAX
* SVE2:
- FAMIN_ZPmZ
- FAMAX_ZPmZ
* SME2:
- FAMAX_2Z2Z
- FAMIN_2Z2Z
- FAMAX_4Z4Z
- FAMIN_4Z4Z
That is according to this documentation:
https://developer.arm.com/documentation/ddi0602/2023-09
Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
```
$ ./bin/clang --target=arm-linux-gnueabihf --print-supported-extensions
<...>
All available -march extensions for ARM
crc
crypto
sha2
aes
dotprod
<...>
```
This follows the format set by RISC-V and AArch64. As for AArch64, ARM
doesn't have versioned extensions like RISC-V does. So there is only 1
column, which contains the name.
Any extension without a "feature" is hidden as these cannot be used with
-march.
This follows the RISC-V work done in
4b40ced4e5ba10b841516b3970e7699ba8ded572.
This uses AArch64's target parser instead. We just list the names,
without the "+" on them, which matches RISC-V's format.
```
$ ./bin/clang -target aarch64-linux-gnu --print-supported-extensions
clang version 18.0.0 (https://github.com/llvm/llvm-project.git 154da8aec20719c82235a6957aa6e461f5a5e030)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: <...>
All available -march extensions for AArch64
aes
b16b16
bf16
brbe
crc
crypto
cssc
<...>
```
Since our extensions don't have versions in the same way there's just
one column with the name in.
Any extension without a feature name (including the special "none") is
not listed as those cannot be passed to -march, they're just for the
backend. For example the MTE extension can be added with "+memtag" but
MTE2 and MTE3 do not have feature names so they cannot be added to
-march.
This does not attempt to tackle the fact that clang allows invalid
combinations of AArch64 extensions, it simply lists the possible
options. It's still up to the user to ask for something sensible.
Equally, this has no context of what CPU is being selected. Neither does
the RISC-V option, the user has to be aware of that.
I've added a target parser test, and a high level clang test that checks
RISC-V and AArch64 work and that Intel, that doesn't support this, shows
the correct error.
This implements support for two new 2022 A-profile extensions:
- FEAT_CHK - Check Feature Status Extension
- FEAT_GCS - Guarded Control Stacks
FEAT_CHK is mandatory from armv8.0-a, but is in the hint space so
there's no clang command-line flag for it, and we only print the hint as
`chkfeat x16` at v8.9a and above, to be compatible when using a
non-integrated assembler that might not yet know about the extension.
FEAT_GCS is optional from armv9.4-a onwards. It is enabled using `+gcs`
in a clang `-march=` or `-mcpu=` option string, or using a
`.arch_extension gcs` assembly directive.
This patch includes changes by Ties Stuij, Tomas Matheson, and Keith
Walker.
Differential Revision: https://reviews.llvm.org/D145563
This replaces AEK_CRYPTO in the AArch64TargetParser definitions,
replacing the composite Crypto features with the constituent parts.
AEK_CRYPTO is replaced with either AEK_AES | AEK_SHA2 or AEK_AES |
AEK_SHA2 | AEK_SHA3 | AEK_SHA4 depending on if the cpu is Arm-v8.4+.
This helps get the features correct in some more places like
target(cpu=..) attributes.
Otherwise this is hopefully an NFC for -mcpu options but seems like a
cleaner design.
Differential Revision: https://reviews.llvm.org/D142548
This updates the AArch64's Target Parser and its uses to capture
information about default features directly from ArchInfo and CpuInfo
objects, instead of relying on an API function to access them
indirectly.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D142540