llvm-project

Author	SHA1	Message	Date
Lucas Prates	f516e91715	[AArch64] Add new v9.4-A PM pstate system register This adds support for the new PM pstate system register introduced by the v9.4-A Exception-based Event Profiling extension (FEAT_EBEP). The new PM pstate register takes a 1-bit immediate and requires different values to be specified for the higher bits of the Crm field. To enable that, this patch creates an explicit separation between the pstate system registers that take 4-bit and 1-bit immediate operands, allowing each entry to specify the value for the 3 high bits of Crm. This also updates other pstate registers to correctly accept 4-bit immediates, matching their decoding specification from the Arm ARM. These include: `PAN`, `UAO`, `DIT` and `SSBS`. More information about this extension and the new register can be found at: * https://developer.arm.com/documentation/ddi0601/2022-09/AArch64-Registers/PM--PMU-Exception-Mask Contributors: * Lucas Prates * Sam Elliott Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D139925	2022-12-19 15:07:52 +00:00
Sergei Barannikov	4d48ccfc88	[MC] Use `MCRegister` instead of `unsigned` in `MCTargetAsmParser` Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D140273	2022-12-18 12:12:05 -08:00
Archibald Elliott	947d4fb373	[AArch64] RASv2 Assembly Support This feature adds upstream support for FEAT_RASv2 and FEAT_PFAR. Both are system-register-only, but FEAT_RAS is behind the command-line extension "+ras", so FEAT_RASv2 is behind "+rasv2". This patch includes support for ID_AA64MMFR4_EL1. This is an ID system register so it is not behind any feature flags. Differential Revision: https://reviews.llvm.org/D139936	2022-12-16 14:37:35 +00:00
Lucas Prates	2050e7ebe1	[Arm][AArch64] Add support for v8.9-A/v9.4-A base extensions This implements the base extensions that are part of the v8.9-A and v9.4-A architecture versions, including: * The Clear BHB Instruction (FEAT_CLRBHB) * The Speculation Restriction Instruction (FEAT_SPECRES2) * The SLC target for the PRFM instruction * New system registers: * ID_AA64PFR2_EL1 * ID_AA64MMFR3_EL1 * HFGITR2_EL2 * SCTLR2_EL3 More information on the new extensions can be found on: * https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022 * https://developer.arm.com/downloads/-/exploration-tools Contributors: Sam Elliott, Tomas Matheson and Son Tuan Vu. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D139424	2022-12-08 10:15:29 +00:00
Tomas Matheson	541a1371c0	Revert "[AArch64] Improve TargetParser API" This reverts commit e83f1502f1be7a2a3b9a33f5a73867767e78ba6b. Did not build with C++20 and caused problems with dynamic libs.	2022-12-05 11:09:03 +00:00
Fangrui Song	b0df70403d	[Target] llvm::Optional => std::optional The updated functions are mostly internal with a few exceptions (virtual functions in TargetInstrInfo.h, TargetRegisterInfo.h). To minimize changes to LLVMCodeGen, GlobalISel files are skipped. https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 22:43:14 +00:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Fangrui Song	3c2f9daf2d	[AArch64] Remove following .inst/after directive from AsmParser diagnostics The part of the diagnostic is not useful because the instruction line is printed. The new style follows generic code.	2022-12-01 21:44:03 +00:00
Tomas Matheson	e83f1502f1	[AArch64] Improve TargetParser API Re-land with constexpr StringRef::substr(): The TargetParser depends heavily on a collection of macros and enums to tie together information about architectures, CPUs and extensions. Over time this has led to some pretty awkward API choices. For example, recently a custom operator-- has been added to the enum, which effectively turns iteration into a graph traversal and makes the ordering of the macro calls in the header significant. More generally there is a lot of string <-> enum conversion going on. I think this shows the extent to which the current data structures are constraining us, and the need for a rethink. Key changes: - Get rid of Arch enum, which is used to bind fields together. Instead of passing around ArchKind, use the named ArchInfo objects directly or via references. - The list of all known ArchInfo becomes an array of pointers. - ArchKind::operator-- is replaced with ArchInfo::implies(), which defines which architectures are predecessors to each other. This allows features from predecessor architectures to be added in a more intuitive way. - Free functions of the form f(ArchKind) are converted to ArchInfo::f(). Some functions become unnecessary and are deleted. - Version number and profile are added to the ArchInfo. This makes comparison of architectures easier and moves a couple of functions out of clang and into AArch64TargetParser. - clang::AArch64TargetInfo ArchInfo is initialised to Armv8a not INVALID. - AArch64::ArchProfile which is distinct from ARM::ArchProfile - Give things sensible names and add some comments. Differential Revision: https://reviews.llvm.org/D138792	2022-12-01 15:30:07 +00:00
Tomas Matheson	d1ef4b0a8d	Revert "[AArch64] Improve TargetParser API" Buildbots unhappy about constexpr function. This reverts commit 450de8008bb0ccb5dfc9dd69b6f5b434158772bd.	2022-12-01 13:06:54 +00:00
Tomas Matheson	450de8008b	[AArch64] Improve TargetParser API The TargetParser depends heavily on a collection of macros and enums to tie together information about architectures, CPUs and extensions. Over time this has led to some pretty awkward API choices. For example, recently a custom operator-- has been added to the enum, which effectively turns iteration into a graph traversal and makes the ordering of the macro calls in the header significant. More generally there is a lot of string <-> enum conversion going on. I think this shows the extent to which the current data structures are constraining us, and the need for a rethink. Key changes: - Get rid of Arch enum, which is used to bind fields together. Instead of passing around ArchKind, use the named ArchInfo objects directly or via references. - The list of all known ArchInfo becomes an array of pointers. - ArchKind::operator-- is replaced with ArchInfo::implies(), which defines which architectures are predecessors to each other. This allows features from predecessor architectures to be added in a more intuitive way. - Free functions of the form f(ArchKind) are converted to ArchInfo::f(). Some functions become unnecessary and are deleted. - Version number and profile are added to the ArchInfo. This makes comparison of architectures easier and moves a couple of functions out of clang and into AArch64TargetParser. - clang::AArch64TargetInfo ArchInfo is initialised to Armv8a not INVALID. - AArch64::ArchProfile which is distinct from ARM::ArchProfile - Give things sensible names and add some comments. Differential Revision: https://reviews.llvm.org/D138792	2022-12-01 12:50:23 +00:00
Tomas Matheson	f57f086714	[AArch64TargetParser] getArchFeatures -> getArchFeature Differential Revision: https://reviews.llvm.org/D138753	2022-12-01 12:50:17 +00:00
Tomas Matheson	7fea6f2e0e	[AArch64] Assembly support for VMSA Virtual Memory System Architecture (VMSA) This is part of the 2022 A-Profile Architecture extensions and adds support for the following: - Translation Hardening Extension (FEAT_THE) - 128-bit Page Table Descriptors (FEAT_D128) - 56-bit Virtual Address (FEAT_LVA3) - Support for 128-bit System Registers (FEAT_SYSREG128) - System Instructions that can take 128-bit inputs (FEAT_SYSINSTR128) - 128-bit Atomic Instructions (FEAT_LSE128) - Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE) - Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE) - Memory Attribute Index Enhancement (FEAT_AIE) New instructions added: - FEAT_SYSREG128 adds MRRS and MSRR. - FEAT_SYSINSTR128 adds the SYSP instruction and TLBIP aliases. - FEAT_LSE128 adds LDCLRP, LDSETP, and SWPP instructions. - FEAT_THE adds the set of RCW* instructions. Specs for individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ Contributors: Keith Walker Lucas Prates Sam Elliott Son Tuan Vu Tomas Matheson Differential Revision: https://reviews.llvm.org/D138920	2022-11-30 13:37:02 +00:00
Sander de Smalen	7f01737687	[AArch64][AsmParser] SME: Allow h/v suffix to be upper-case.	2022-11-28 11:42:43 +00:00
Kazu Hirata	fc07a54ef6	[AsmParser] Use std::optional in AArch64AsmParser.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-25 22:14:44 -08:00
Tomas Matheson	a6aaa969f7	[AArch64] Assembly support for FEAT_LRCPC3 This patch implements assembly support for the 2022 A-Profile Architecture extension FEAT_LRCPC3. FEAT_LRCPC3 is AArch64 only and introduces new variants of load/store instructions with release consistency ordering. Specs for individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ This feature is optionally available from v8.2a and therefore not enabled by default. Contributors: Lucas Prates Sam Elliott Son Tuan Vu Tomas Matheson Differential Revision: https://reviews.llvm.org/D138579	2022-11-25 18:59:07 +00:00
Ties Stuij	cb261e30fb	[AArch64][clang] implement 2022 General Data-Processing instructions This patch implements the 2022 Architecture General Data-Processing Instructions They include: Common Short Sequence Compression (CSSC) instructions - scalar comparison instructions SMAX, SMIN, UMAX, UMIN (32/64 bits) with or without immediate - ABS (absolute), CNT (count non-zero bits), CTZ (count trailing zeroes) - command-line options for CSSC Associated with these instructions in the documentation is the Range Prefetch Memory (RPRFM) instruction, which signals to the memory system that data memory accesses from a specified range of addresses are likely to occur in the near future. The instruction lies in hint space, and is made unconditional. Specs for the individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ contributors to this patch: - Cullen Rhodes - Son Tuan Vu - Mark Murray - Tomas Matheson - Sam Elliott - Ties Stuij Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D138488	2022-11-22 14:23:12 +00:00
Kazu Hirata	6ba4b62af8	Return None instead of Optional<T>() (NFC) This patch replaces: return Optional<T>(); with: return None; to make the migration from llvm::Optional to std::optional easier. Specifically, I can deprecate None (in my source tree, that is) to identify all the instances of None that should be replaced with std::nullopt. Note that "return None" far outnumbers "return Optional<T>();". There are more than 2000 instances of "return None" in our source tree. All of the instances in this patch come from functions that return Optional<T> except Archive::findSym and ASTNodeImporter::import, where we return Expected<Optional<T>>. Note that we can construct Expected<Optional<T>> from any parameter convertible to Optional<T>, which None certainly is. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Differential Revision: https://reviews.llvm.org/D138464	2022-11-21 19:06:42 -08:00
Ties Stuij	983f63f7f0	[AArch64][ARM] add Armv8.9-a/Armv9.4-a identifier support For both ARM and AArch64 add support for specifying -march=armv8.9a/armv9.4a to clang. Add backend plumbing like target parser and predicate support. For a summary of Armv8.9/Armv9.4 features, see: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022 For detailed information, consult the Arm Architecture Reference Manual for A-profile architecture: https://developer.arm.com/documentation/ddi0487/latest/ People who contributed to this patch: - Keith Walker - Ties Stuij Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D138010	2022-11-16 10:20:14 +00:00
Caroline Concatto	4dad168e67	[NFC][AArch64] SME2 Add instruction name convention and fix LookupTable number of registers This patch adds the naming convention for SME instructions. It also fixes the number of registers for LookupTable in the AsmParser. The number of registers is not currently used, but it is needed because the switch in getNumRegsForRegKind must cover every RegKind enumerator.	2022-11-15 12:09:51 +00:00
Caroline Concatto	3eacda4547	[AArch64] Add all SME2.1 instructions Assembly/Disassembly This patch adds a new feature flag: sme-f16f16 to represent FEAT_SME-F16F16 This patch adds the following instructions: SME2.1 stand alone instructions: MOVAZ (array to vector, four registers): Move and zero four ZA single-vector groups to vector registers. (array to vector, two registers): Move and zero two ZA single-vector groups to vector registers. (tile to vector, four registers): Move and zero four ZA tile slices to vector registers. (tile to vector, single): Move and zero ZA tile slice to vector register. (tile to vector, two registers): Move and zero two ZA tile slices to vector registers. LUTI2 (Strided four registers): Lookup table read with 2-bit indexes. (Strided two registers): Lookup table read with 2-bit indexes. LUTI4 (Strided four registers): Lookup table read with 4-bit indexes. (Strided two registers): Lookup table read with 4-bit indexes. ZERO (double-vector): Zero ZA double-vector groups. (quad-vector): Zero ZA quad-vector groups. (single-vector): Zero ZA single-vector groups. SME2p1 and SME-F16F16: All instructions are half precision elements: FADD: Floating-point add multi-vector to ZA array vector accumulators. FSUB: Floating-point subtract multi-vector from ZA array vector accumulators. FMLA (multiple and indexed vector): Multi-vector floating-point fused multiply-add by indexed element. (multiple and single vector): Multi-vector floating-point fused multiply-add by vector. (multiple vectors): Multi-vector floating-point fused multiply-add. FMLS (multiple and indexed vector): Multi-vector floating-point fused multiply-subtract by indexed element. (multiple and single vector): Multi-vector floating-point fused multiply-subtract by vector. (multiple vectors): Multi-vector floating-point fused multiply-subtract. FCVT (widening): Multi-vector floating-point convert from half-precision to single-precision (in-order). 
FCVTL: Multi-vector floating-point convert from half-precision to deinterleaved single-precision. FMOPA (non-widening): Floating-point outer product and accumulate. FMOPS (non-widening): Floating-point outer product and subtract. SME2p1 and B16B16: BFADD: BFloat16 floating-point add multi-vector to ZA array vector accumulators. BFSUB: BFloat16 floating-point subtract multi-vector from ZA array vector accumulators. BFCLAMP: Multi-vector BFloat16 floating-point clamp to minimum/maximum number. BFMLA (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-add by indexed element. (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-add by vector. (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-add. BFMLS (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-subtract by indexed element. (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-subtract by vector. (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-subtract. BFMAX (multiple and single vector): Multi-vector BFloat16 floating-point maximum by vector. (multiple vectors): Multi-vector BFloat16 floating-point maximum. BFMAXNM (multiple and single vector): Multi-vector BFloat16 floating-point maximum number by vector. (multiple vectors): Multi-vector BFloat16 floating-point maximum number. BFMIN (multiple and single vector): Multi-vector BFloat16 floating-point minimum by vector. (multiple vectors): Multi-vector BFloat16 floating-point minimum. BFMINNM (multiple and single vector): Multi-vector BFloat16 floating-point minimum number by vector. (multiple vectors): Multi-vector BFloat16 floating-point minimum number. BFMOPA (non-widening): BFloat16 floating-point outer product and accumulate. BFMOPS (non-widening): BFloat16 floating-point outer product and subtract. 
The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D137571	2022-11-14 14:56:16 +00:00
Caroline Concatto	ecab1bc0dc	[AArch64]SME2 Multi vector Sel Load and Store instructions This patch adds the assembly/disassembly for the following instructions: SEL: Multi-vector conditionally select elements from two vectors for 2 and 4 registers Non-contiguous load with strided registers: LD1B (scalar + immediate): Contiguous load of bytes to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of bytes to multiple strided vectors (scalar index). LD1D (scalar + immediate): Contiguous load of doublewords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of doublewords to multiple strided vectors (scalar index). LD1H (scalar + immediate): Contiguous load of halfwords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of halfwords to multiple strided vectors (scalar index). LD1W (scalar + immediate): Contiguous load of words to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of words to multiple strided vectors (scalar index). LDNT1B (scalar + immediate): Contiguous load non-temporal of bytes to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of bytes to multiple strided vectors (scalar index). LDNT1D (scalar + immediate): Contiguous load non-temporal of doublewords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of doublewords to multiple strided vectors (scalar index). LDNT1H (scalar + immediate): Contiguous load non-temporal of halfwords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of halfwords to multiple strided vectors (scalar index). LDNT1W (scalar + immediate): Contiguous load non-temporal of words to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of words to multiple strided vectors (scalar index). 
Non-contiguous store with strided registers: ST1B (scalar + immediate): Contiguous store of bytes from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of bytes from multiple strided vectors (scalar index). ST1D (scalar + immediate): Contiguous store of doublewords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of doublewords from multiple strided vectors (scalar index). ST1H (scalar + immediate): Contiguous store of halfwords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of halfwords from multiple strided vectors (scalar index). ST1W (scalar + immediate): Contiguous store of words from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of words from multiple strided vectors (scalar index). STNT1B (scalar + immediate): Contiguous store non-temporal of bytes from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of bytes from multiple strided vectors (scalar index). STNT1D (scalar + immediate): Contiguous store non-temporal of doublewords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of doublewords from multiple strided vectors (scalar index). STNT1H (scalar + immediate): Contiguous store non-temporal of halfwords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of halfwords from multiple strided vectors (scalar index). STNT1W (scalar + immediate): Contiguous store non-temporal of words from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of words from multiple strided vectors (scalar index). 
The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 This patch also adds a new SVE vector list to represent the strided loads/stores, ZPRVectorListStrided, and the sets of 2 and 4 ZA registers: ZZ_[b|h|w|d]_strided and ZZZZ_[b|h|w|d]_strided Differential Revision: https://reviews.llvm.org/D136172	2022-11-10 16:04:57 +00:00
Keith Walker	00d98e6572	[AArch64] RME MEC instructions and system registers This patch adds assembler/disassembler support for RME MEC (Memory Encryption Contexts). Cache maintenance instructions added: - DC CIPAPA - DC CIGDPAPA System registers added: - MECIDR_EL2 - MECID_P0_EL2 - MECID_A0_EL2 - MECID_P1_EL2 - MECID_A1_EL2 - VMECID_P_EL2 - VMECID_A_EL2 - MECID_RL_A_EL3 Differential Revision: https://reviews.llvm.org/D137431	2022-11-10 14:05:12 +00:00
Caroline Concatto	1a917568e7	[AArch64]SME2 MOV Instructions This patch adds the assembly/disassembly for the following instructions: MOVA (array to vector, four registers): Move four ZA single-vector groups to four vector registers. (array to vector, two registers): Move two ZA single-vector groups to two vector registers. (tile to vector, four registers): Move four ZA tile slices to four vector registers. (tile to vector, single): Move ZA tile slice to vector register. (tile to vector, two registers): Move two ZA tile slices to two vector registers. (vector to array, four registers): Move four vector registers to four ZA single-vector groups. (vector to array, two registers): Move two vector registers to two ZA single-vector groups. (vector to tile, four registers): Move four vector registers to four ZA tile slices. (vector to tile, single): Move vector register to ZA tile slice. (vector to tile, two registers): Move two vector registers to two ZA tile slices. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 It also adds more sizes for the matrix operand, MatrixOp8 and MatrixOp16, as well as two implicit operands, uimm0s2range and uimm0s4range, and uimm1s2range, which are immediates. Differential Revision: https://reviews.llvm.org/D136142	2022-11-08 09:51:56 +00:00
David Sherwood	cf69895ab3	[AArch64][SVE2] Add the SVE2.1 BF16 instructions This patch adds the new FEAT_B16B16 feature as well as the assembly/disassembly for all of the B16B16 instructions: bfadd: BFloat16 floating-point add vectors bfsub: BFloat16 floating-point subtract vectors bfmul: BFloat16 floating-point multiply vectors bfclamp: BFloat16 floating-point clamp to minimum/maximum number bfmax: BFloat16 floating-point maximum bfmaxnm: BFloat16 floating-point maximum number bfmin: BFloat16 floating-point minimum bfminnm: BFloat16 floating-point minimum number bfmla: BFloat16 floating-point fused multiply-add vectors bfmls: BFloat16 floating-point fused multiply-subtract vectors The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D137321	2022-11-07 15:29:40 +00:00
David Sherwood	12a6572d41	[AArch64] Add SME2.1 target feature for Armv9-A 2022 Architecture Extension First patch in a series adding MC layer support for SME2.1. This patch adds the following feature: sme2p1 The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D137410	2022-11-07 14:38:28 +00:00
Caroline Concatto	73d2a4cfd8	[AArch64] SME2 -Fix failing buildbots because of warning This patch is to solve this: https://lab.llvm.org/buildbot#builders/36/builds/26801 Created by this patch: a20112a74cb34f [AArch64]SME2 instructions that use ZTO operand	2022-11-03 08:34:56 +00:00
Caroline Concatto	a20112a74c	[AArch64]SME2 instructions that use ZTO operand This patch adds the assembly/disassembly for the following instructions: ZERO (ZT0): Zero ZT0. LDR (ZT0): Load ZT0 register. STR (ZT0): Store ZT0 register. MOVT (scalar to ZT0): Move 8 bytes from general-purpose register to ZT0. (ZT0 to scalar): Move 8 bytes from ZT0 to general-purpose register. Consecutive: LUTI2 (single): Lookup table read with 2-bit indexes. (two registers): Lookup table read with 2-bit indexes. (four registers): Lookup table read with 2-bit indexes. LUTI4 (single): Lookup table read with 4-bit indexes. (two registers): Lookup table read with 4-bit indexes. (four registers): Lookup table read with 4-bit indexes. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 This patch also adds a new register class and operand for zt0 and another index operand, uimm3s8 Differential Revision: https://reviews.llvm.org/D136088	2022-11-03 07:35:21 +00:00
David Sherwood	be369ea31b	[AArch64][SVE2] Add the SVE2.1 while & pext predicate pair instructions This patch adds the assembly/disassembly for the following predicate pair instructions: pext: Set pair of predicates from predicate-as-counter whilelt: While incrementing signed scalar less than scalar whilele: While incrementing signed scalar less than or equal to scalar whilegt: While incrementing signed scalar greater than scalar whilege: While incrementing signed scalar greater than or equal to scalar whilelo: While incrementing unsigned scalar lower than scalar whilels: While incrementing unsigned scalar lower or same as scalar whilehs: While decrementing unsigned scalar higher or same as scalar whilehi: While decrementing unsigned scalar higher than scalar The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136759	2022-11-02 08:39:03 +00:00
David Sherwood	5f7a8cf026	[AArch64][SVE2] Add the SVE2.1 cntp instruction This patch adds the assembly/disassembly for the following instructions: cntp : Set scalar to count from predicate-as-counter The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136747	2022-11-01 13:24:37 +00:00
David Sherwood	3eafca58e2	[AArch64][SVE2] Add the SVE2.1 contiguous load to multiple consecutive vector This patch adds the assembly/disassembly for the following instructions: ld1* : Contiguous load of bytes to multiple consecutive vectors - (scalar + scalar) and (scalar + immediate) ldnt1* : Contiguous load non-temporal of bytes to multiple consecutive vectors - (scalar + scalar) and (scalar + immediate) The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136680	2022-11-01 09:31:51 +00:00
Caroline Concatto	340cf2118a	[AArch64]SME2 Multi-vector - Index/Single/Multi Array Vectors LONG INT MLA sources This patch adds the assembly/disassembly for the following instructions: SMLALL: (multiple and indexed vector): Multi-vector signed integer multiply-add long long by indexed element. (multiple and single vector): Multi-vector signed integer multiply-add long long by vector. (multiple vectors): Multi-vector signed integer multiply-add long long. SMLSLL: (multiple and indexed vector): Multi-vector signed integer multiply-subtract long long by indexed element. (multiple and single vector): Multi-vector signed integer multiply-subtract long long by vector. (multiple vectors): Multi-vector signed integer multiply-subtract long long. SUMLALL: (multiple and indexed vector): Multi-vector signed by unsigned integer multiply-add long long by indexed element. (multiple and single vector): Multi-vector signed by unsigned integer multiply-add long long by vector. UMLALL: (multiple and indexed vector): Multi-vector unsigned integer multiply-add long long by indexed element. (multiple and single vector): Multi-vector unsigned integer multiply-add long long by vector. (multiple vectors): Multi-vector unsigned integer multiply-add long long. UMLSLL: (multiple and indexed vector): Multi-vector unsigned integer multiply-subtract long long by indexed element. (multiple and single vector): Multi-vector unsigned integer multiply-subtract long long by vector. (multiple vectors): Multi-vector unsigned integer multiply-subtract long long. USMLALL: (multiple and indexed vector): Multi-vector unsigned by signed integer multiply-add long long by indexed element. (multiple and single vector): Multi-vector unsigned by signed integer multiply-add long long by vector. (multiple vectors): Multi-vector unsigned by signed integer multiply-add long long. 
The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 It also adds new immediates, uimm2s4range for off2 and uimm1s4range for o1, to represent the vector select offset. The new operands have the range between the first and the last vector position. Depends on: D135785 Differential Revision: https://reviews.llvm.org/D136075	2022-10-28 17:39:23 +01:00
David Sherwood	3232725852	Fix whitespace introduced by 891aaff9a8a9997582eac1bb1edb8d4b4e117ef1	2022-10-27 18:24:51 +00:00
David Sherwood	891aaff9a8	[AArch64][SVE2] Add the SVE2.1 pext and ptrue predicate-as-counter instructions This patch adds the assembly/disassembly for the following instructions: pext (predicate) : Set predicate from predicate-as-counter ptrue (predicate-as-counter) : Initialise predicate-as-counter to all active This patch also introduces the predicate-as-counter registers pn8, etc. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136678	2022-10-27 18:23:35 +00:00
David Sherwood	fcd545863d	[AArch64] Add SVE2.1 target feature for Armv9-A 2022 Architecture Extension First patch in a series adding MC layer support for SVE2.1. This patch adds the following feature: sve2p1 Some of the existing SVE instructions added for SME are now also available under the sve2p1 feature, which are now guarded by the HasSVE2p1orSME predicate. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136352	2022-10-21 14:02:32 +00:00
Caroline Concatto	fc0cde8af5	[AArch64]SME2 Multi-vector - Index/Single/Multi Array Vectors FMA sources This patch adds the assembly/disassembly for the following instructions: INT: SMLAL SMLSL UMLAL UMLSL FP: BFMLAL BFMLSL FMLAL FMLSL For multiple and indexed vector, Multiple and Single vector and Multi vectors, for 1, 2 and 4 ZA registers. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 It also adds new immediates, uimm3s2range for off3 and uimm2s2range for off2, to represent the vector select offset. The new operands have the range between the first and the last vector position. Depends on: D135563 Reviewed By: aemerson, sdesmalen Re-landing the patch as the problem with https://reviews.llvm.org/D135563 is fixed in this commit: 1e4f82c2578cf5045ffe Differential Revision: https://reviews.llvm.org/D135785	2022-10-21 14:20:44 +01:00
Caroline Concatto	1e4f82c257	[AArch64]SME2 Multi-single vector SVE Destructive 2 and 4 Registers This patch adds the assembly/disassembly for the following instructions: ADD (to vector): Add replicated single vector to multi-vector with multi-vector result. SQDMULH (multiple and single vector): Multi-vector signed saturating doubling multiply high by vector. for 2 and 4 ZA SVE registers. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 It also adds more sizes for the multiple-register tuples: ZZ_b_mul_r, ZZ_h_mul_r, ZZZZ_b_mul_r, ZZZZ_h_mul_r, for 8 bits and 16 bits with 2 and 4 ZA registers. Depends on: D135468 With a fix for Mips for this test: llvm/test/MC/Mips/mips64r6/valid.s Differential Revision: https://reviews.llvm.org/D135563	2022-10-21 14:01:29 +01:00
Caroline Concatto	9895447006	Revert "[AArch64]SME2 Multi-single vector SVE Destructive 2 and 4 Registers" This reverts commit 4c4909703d74883e5cc49edcbd22b783135d2897. This patch was breaking this test: llvm/test/MC/Mips/mips64r6/valid.s I will push again when fixed	2022-10-20 19:43:31 +01:00
Caroline Concatto	c05b1bde34	Revert "[AArch64]SME2 Multi-vector - Index/Single/Multi Array Vectors FMA sources" This reverts commit 3fee9358baab54e4ed646a106297e7fb6f1b4cff.	2022-10-20 19:43:30 +01:00
Caroline Concatto	3fee9358ba	[AArch64]SME2 Multi-vector - Index/Single/Multi Array Vectors FMA sources This patch adds the assembly/disassembly for the following instructions: INT: SMLAL SMLSL UMLAL UMLSL FP: BFMLAL BFMLSL FMLAL FMLSL For multiple and indexed vector, Multiple and Single vector and Multi vectors, for 1, 2 and 4 ZA registers. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 It also adds a new immediate: uimm3s2range for off3 uimm2s2range for off2 to represent the vector select offset. The new operands have the range between the first and the last vector position. Depends on: D135563 Reviewed By: aemerson, sdesmalen Differential Revision: https://reviews.llvm.org/D135785	2022-10-20 19:09:48 +01:00
Caroline Concatto	4c4909703d	[AArch64]SME2 Multi-single vector SVE Destructive 2 and 4 Registers This patch adds the assembly/disassembly for the following instructions: ADD (to vector): Add replicated single vector to multi-vector with multi-vector result. SQDMULH (multiple and single vector): Multi-vector signed saturating doubling multiply high by vector. for 2 and 4 ZA SVE registers. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 It also adds more size for the multiple register tuple: ZZ_b_mul_r, ZZ_h_mul_r, ZZZZ_b_mul_r, ZZZZ_h_mul_r, for 8 bits and 16 bits with 2 and 4 ZA registers. Depends on: D135468 Differential Revision: https://reviews.llvm.org/D135563	2022-10-20 18:54:41 +01:00
Caroline Concatto	9db12a45e4	[AArch64]SME2 Multiple vector ternary int/float 2 and 4 registers This patch adds the assembly/disassembly for the following instructions: For INT: ADD(array results, multiple vectors): Add multi-vector to multi-vector with ZA array vector results. SUB(array results, multiple vectors): Subtract multi-vector from multi-vector with ZA array vector results. For FP: FMLA (multiple vectors): Multi-vector floating-point fused multiply-add. FMLS (multiple vectors): Multi-vector floating-point fused multiply-subtract. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 This patch also adds a register operand to represent multiples of ZA multi-vectors. They are: ZZ_s_mul_r, ZZ_d_mul_r, ZZZZ_s_mul_r and ZZZZ_d_mul_r and represent the Zn or Zm times 2 or 4 according to the vector group. Depends on: D135455 Differential Revision: https://reviews.llvm.org/D135468	2022-10-20 18:44:22 +01:00
Caroline Concatto	2ecbe8c38c	[AArch64] SME2 Single-multi vector ternary int/FP 2 and 4 registers This patch adds the assembly/disassembly for the following instructions: For INT: ADD(array results, multiple and single vector): Add replicated single vector to multi-vector with ZA array vector results. SUB(array results, multiple and single vector): Subtract replicated single vector from multi-vector with ZA array vector results. For FP: FMLA (multiple and single vector): Multi-vector floating-point fused multiply-add by vector. FMLS (multiple and single vector): Multi-vector floating-point multiply-subtract long by vector. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 The Matrix Operand has 2 new sizes 32(.s) and 64(.d) bits (MatrixOp32 and MatrixOp64) Depends on: D135448 Depends on: D135952 Differential Revision: https://reviews.llvm.org/D135455	2022-10-19 17:49:48 +01:00
Caroline Concatto	579ca5e7e1	[AArch64] Replace sme-i64 by sme-i16i64 and sme-f64 by sme-f64f64 The names in developer.arm for these SME features are: HaveSMEI16I64 and HaveSMEF64F64 so the new flag names are consistent with the documentation page Reviewed By: sdesmalen, c-rhodes Differential Revision: https://reviews.llvm.org/D135974	2022-10-19 10:56:46 +01:00
Eli Friedman	d6481dc88c	[AArch64][Windows] Add MC support for save_any_reg. Representing this as 12 separate operations is a bit ugly, but trying to represent the different modes using a bitfield seemed worse. Differential Revision: https://reviews.llvm.org/D135417	2022-10-18 11:45:27 -07:00
Martin Storsjö	918f6f581d	[AArch64] [SEH] Rename pac_sign_return_address to pac_sign_lr This new opcode was initially documented as "pac_sign_return_address" in https://github.com/MicrosoftDocs/cpp-docs/pull/4202, but was soon afterwards renamed into "pac_sign_lr" in https://github.com/MicrosoftDocs/cpp-docs/pull/4209, as the other name was unwieldy, and there were no other external references to that name anywhere. Rename our external .seh assembler directive - it hasn't been merged for very long yet, so there's probably no external use to account for. Rename all other internal references to the opcode similarly. Differential Revision: https://reviews.llvm.org/D135762	2022-10-12 22:19:59 +03:00
Martin Storsjö	c43bff64e9	[AArch64] Add support for the SEH opcode for return address signing This was documented upstream in https://github.com/MicrosoftDocs/cpp-docs/pull/4202. Differential Revision: https://reviews.llvm.org/D135276	2022-10-12 11:07:11 +03:00
Kazu Hirata	8b1b0d1d81	Revert "Use std::is_same_v instead of std::is_same (NFC)" This reverts commit c5da37e42d388947a40654b7011f2a820ec51601. This patch seems to break builds with some versions of MSVC.	2022-08-20 23:00:39 -07:00
Kazu Hirata	c5da37e42d	Use std::is_same_v instead of std::is_same (NFC)	2022-08-20 22:36:26 -07:00
Fangrui Song	de9d80c1c5	[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051.	2022-08-08 11:24:15 -07:00

412 Commits