llvm-project

Author	SHA1	Message	Date
Hassnaa Hamdi	835c885ddb	[llvm][AArch64][Assembly]: Add LUT assembly/disassembly. (#70802 ) This patch adds the feature flags of LUT and SME_LUTv2, and the assembly/disassembly for the following instructions of NEON, SVE2 and SME2: * NEON: - LUT2 - LUT4 * SVE2: - LUTI2_ZZZI - LUTI4_ZZZI - LUTI4_Z2ZZI * SME: - MOVT - LUTI4_4ZZT2Z - LUTI4_S_4ZZT2Z That is according to this documentation: https://developer.arm.com/documentation/ddi0602/2023-09	2023-11-02 17:17:20 +00:00
CarolineConcatto	e4e02e31c2	[AArch64][NFC] Refactor NEON, SVE and SME classes and multiclasses for the assembly disassembly (#68800 ) This NFC patch refactors the assembly/disassembly class and multiclass in the AArch64 backend to receive a new 2023/09 AArch64[1] ISA release. The encoding for the 2023 instructions re-uses encoding blocks from previous assembly/disassembly instructions. The refactoring makes the class and multiclass for assembly/disassembly generic so it can be used to describe the instructions for the new ISA. [1]https://developer.arm.com/documentation/ddi0602/2023-09	2023-10-13 14:25:42 +01:00
Matthew Devereau	b967f3a1d7	[AArch64] Separate PNR into its own Register Class (#65306 ) This patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand.	2023-09-21 19:53:16 +01:00
Jonas Devlieghere	77d1032516	[llvm] Add assembly color highlighting Add support for syntax highlighting assembly. The patch introduces a new RAII helper called WithMarkup that takes care of both emitting colors and markup annotations. It makes adding markup easier and ensures colors and annotations remain consistent. This patch adopts the new helper in the AArch64 backend. If your backend already uses markup annotations, adoption is as easy as using the new MCInstPrinter::markup overload. Differential revision: https://reviews.llvm.org/D159162	2023-09-01 07:57:45 -07:00
Jon Roelofs	aba4e4d6c1	[AArch64] Add hex comments to mov-imm spellings in the InstPrinter Differential Revision: https://reviews.llvm.org/D146105	2023-03-15 14:29:44 -07:00
Jon Roelofs	cdee83b015	Revert "[AArch64] Add hex comments to mov-imm spellings in the InstPrinter" This reverts commit 1def3141135c072a1d3e51e82e113dd67b0def97.	2023-03-15 14:21:08 -07:00
Jon Roelofs	1def314113	[AArch64] Add hex comments to mov-imm spellings in the InstPrinter Differential Revision: https://reviews.llvm.org/D146105	2023-03-15 14:08:51 -07:00
Kristina Bessonova	5dde2bcdd1	[AArch64InstPrinter][llvm-objdump] Print ADR PC-relative label as a target address hexadecimal form This is similar to ADRP and matches GNU objdump: GNU objdump: ``` 0000000000200100 <_start>: 200100: adr x0, 201000 <_start+0xf00> ``` llvm-objdump (before patch): ``` 0000000000200100 <_start>: 200100: adr x0, #3840 ``` llvm-objdump (after patch): ``` 0000000000200100 <_start>: 200100: adr x0, 0x201000 <_start+0xf00> ``` Reviewed By: simon_tatham, peter.smith Differential Revision: https://reviews.llvm.org/D144079	2023-02-18 18:31:21 +02:00
Fangrui Song	d9e4c10440	[AArch64] Simplify with MCSubtargetInfo::hasFeature. NFC	2023-02-17 14:29:21 -08:00
Philip Reames	b3154d08e9	[ARM][AArch64] Switch to generic MEMBARRIER node This change switches both targets from using target specific CompilerBarrier nodes to the recently introduced generic MEMBARRIER instruction. A couple things to call out. First, this changes the assembly comment printed. I'm not sure this matters, but if it does, we can simply drop this patch. This is a minor clean up at best. Second, the ordering operand on the target instruction appears to be unused. We could easily add ordering to the generic instruction, but since we don't seem to have a motivating case in tree, I simply dropped the ordering when selecting to the generic instruction. Differential Revision: https://reviews.llvm.org/D141513	2023-01-20 08:54:34 -08:00
Sergei Barannikov	6ae84d668f	[MC] Use MCRegister instead of unsigned in MCInstPrinter (NFC) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D140654	2023-01-17 22:39:39 +03:00
Lucas Prates	f516e91715	[AArch64] Add new v9.4-A PM pstate system register This adds support for the new PM pstate system register introduced by the v9.4-A Exception-based Event Profiling extension (FEAT_EBEP). The new PM pstate register takes a 1-bit immediate and requires different values to be specified for the higher bits of the Crm field. To enable that, this patch creates an explicit separation between the pstate system registers that take 4-bit and 1-bit immediate operands, allowing each entry to specify the value for the 3 high bits of Crm. This also updates other pstate registers to correctly accept 4-bit immediates, matching their decoding specification from the Arm ARM. These include: `PAN`, `UAO`, `DIT` and `SSBS`. More information about this extension and the new register can be found at: * https://developer.arm.com/documentation/ddi0601/2022-09/AArch64-Registers/PM--PMU-Exception-Mask Contributors: * Lucas Prates * Sam Elliott Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D139925	2022-12-19 15:07:52 +00:00
Paul Walker	b4028fbc1a	[MC][AArch64] Remove bogus whitespace of markup'd immediate.	2022-12-08 18:26:38 +00:00
Lucas Prates	2050e7ebe1	[Arm][AArch64] Add support for v8.9-A/v9.4-A base extensions This implements the base extensions that are part of the v8.9-A and v9.4-A architecture versions, including: * The Clear BHB Instruction (FEAT_CLRBHB) * The Speculation Restriction Instruction (FEAT_SPECRES2) * The SLC target for the PRFM instruction * New system registers: * ID_AA64PFR2_EL1 * ID_AA64MMFR3_EL1 * HFGITR2_EL2 * SCTLR2_EL3 More information on the new extensions can be found on: * https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022 * https://developer.arm.com/downloads/-/exploration-tools Contributors: Sam Elliott, Tomas Matheson and Son Tuan Vu. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D139424	2022-12-08 10:15:29 +00:00
Tomas Matheson	7fea6f2e0e	[AArch64] Assembly support for VMSA Virtual Memory System Architecture (VMSA) This is part of the 2022 A-Profile Architecture extensions and adds support for the following: - Translation Hardening Extension (FEAT_THE) - 128-bit Page Table Descriptors (FEAT_D128) - 56-bit Virtual Address (FEAT_LVA3) - Support for 128-bit System Registers (FEAT_SYSREG128) - System Instructions that can take 128-bit inputs (FEAT_SYSINSTR128) - 128-bit Atomic Instructions (FEAT_LSE128) - Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE) - Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE) - Memory Attribute Index Enhancement (FEAT_AIE) New instructions added: - FEAT_SYSREG128 adds MRRS and MSRR. - FEAT_SYSINSTR128 adds the SYSP instruction and TLBIP aliases. - FEAT_LSE128 adds LDCLRP, LDSET, and SWPP instructions. - FEAT_THE adds the set of RCW* instructions. Specs for individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ Contributors: Keith Walker Lucas Prates Sam Elliott Son Tuan Vu Tomas Matheson Differential Revision: https://reviews.llvm.org/D138920	2022-11-30 13:37:02 +00:00
Ties Stuij	cb261e30fb	[AArch64][clang] implement 2022 General Data-Processing instructions This patch implements the 2022 Architecture General Data-Processing Instructions They include: Common Short Sequence Compression (CSSC) instructions - scalar comparison instructions SMAX, SMIN, UMAX, UMIN (32/64 bits) with or without immediate - ABS (absolute), CNT (count non-zero bits), CTZ (count trailing zeroes) - command-line options for CSSC Associated with these instructions in the documentation is the Range Prefetch Memory (RPRFM) instruction, which signals to the memory system that data memory accesses from a specified range of addresses are likely to occur in the near future. The instruction lies in hint space, and is made unconditional. Specs for the individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ contributors to this patch: - Cullen Rhodes - Son Tuan Vu - Mark Murray - Tomas Matheson - Sam Elliott - Ties Stuij Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D138488	2022-11-22 14:23:12 +00:00
Woody Lin	409eaff5dd	[AArch64InstPrinter] Print TargetAddress as an uint64_t Outputs readable addresses by printing 'TargetAddress' as a uint64_t value. `bl -0x37efd56628` => `bl 0xffffffc8102a99d8` Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D137260	2022-11-16 13:34:22 +08:00
Caroline Concatto	ecab1bc0dc	[AArch64]SME2 Multi vector Sel Load and Store instructions This patch adds the assembly/disassembly for the following instructions: SEL: Multi-vector conditionally select elements from two vectors for 2 and 4 registers Non-contiguous load with stride registers: LD1B (scalar + immediate): Contiguous load of bytes to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of bytes to multiple strided vectors (scalar index). LD1D (scalar + immediate): Contiguous load of doublewords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of doublewords to multiple strided vectors (scalar index). LD1H (scalar + immediate): Contiguous load of halfwords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of halfwords to multiple strided vectors (scalar index). LD1W (scalar + immediate): Contiguous load of words to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of words to multiple strided vectors (scalar index). LDNT1B (scalar + immediate): Contiguous load non-temporal of bytes to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of bytes to multiple strided vectors (scalar index). LDNT1D (scalar + immediate): Contiguous load non-temporal of doublewords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of doublewords to multiple strided vectors (scalar index). LDNT1H (scalar + immediate): Contiguous load non-temporal of halfwords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of halfwords to multiple strided vectors (scalar index). LDNT1W (scalar + immediate): Contiguous load non-temporal of words to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of words to multiple strided vectors (scalar index). 
Non-contiguous store with stride registers: ST1B (scalar + immediate): Contiguous store of bytes from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of bytes from multiple strided vectors (scalar index). ST1D (scalar + immediate): Contiguous store of doublewords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of doublewords from multiple strided vectors (scalar index). ST1H (scalar + immediate): Contiguous store of halfwords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of halfwords from multiple strided vectors (scalar index). ST1W (scalar + immediate): Contiguous store of words from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of words from multiple strided vectors (scalar index). STNT1B (scalar + immediate): Contiguous store non-temporal of bytes from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of bytes from multiple strided vectors (scalar index). STNT1D (scalar + immediate): Contiguous store non-temporal of doublewords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of doublewords from multiple strided vectors (scalar index). STNT1H (scalar + immediate): Contiguous store non-temporal of halfwords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of halfwords from multiple strided vectors (scalar index). STNT1W (scalar + immediate): Contiguous store non-temporal of words from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of words from multiple strided vectors (scalar index). 
The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 This patch also adds a new SVE vector list to represent the stride loads/stores ZPRVectorListStrided and the sets of 2 and 4 ZA registers: ZZ_[b\|h\|w\|d]_strided and ZZZZ_[b\|h\|w\|d]_strided Differential Revision: https://reviews.llvm.org/D136172	2022-11-10 16:04:57 +00:00
Caroline Concatto	a20112a74c	[AArch64]SME2 instructions that use ZT0 operand This patch adds the assembly/disassembly for the following instructions: ZERO (ZT0): Zero ZT0. LDR (ZT0): Load ZT0 register. STR (ZT0): Store ZT0 register. MOVT (scalar to ZT0): Move 8 bytes from general-purpose register to ZT0. (ZT0 to scalar): Move 8 bytes from ZT0 to general-purpose register. Consecutive: LUTI2 (single): Lookup table read with 2-bit indexes. (two registers): Lookup table read with 2-bit indexes. (four registers): Lookup table read with 2-bit indexes. LUTI4 (single): Lookup table read with 4-bit indexes. (two registers): Lookup table read with 4-bit indexes. (four registers): Lookup table read with 4-bit indexes. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 This patch also adds a new register class and operand for zt0 and another index operand uimm3s8 Differential Revision: https://reviews.llvm.org/D136088	2022-11-03 07:35:21 +00:00
David Sherwood	be369ea31b	[AArch64][SVE2] Add the SVE2.1 while & pext predicate pair instructions This patch adds the assembly/disassembly for the following predicate pair instructions: pext: Set pair of predicates from predicate-as-counter whilelt: While incrementing signed scalar less than scalar whilele: While incrementing signed scalar less than or equal to scalar whilegt: While incrementing signed scalar greater than scalar whilege: While incrementing signed scalar greater than or equal to scalar whilelo: While incrementing unsigned scalar lower than scalar whilels: While incrementing unsigned scalar lower or same as scalar whilehs: While decrementing unsigned scalar higher or same as scalar whilehi: While decrementing unsigned scalar higher than scalar The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136759	2022-11-02 08:39:03 +00:00
David Sherwood	5f7a8cf026	[AArch64][SVE2] Add the SVE2.1 cntp instruction This patch adds the assembly/disassembly for the following instructions: cntp : Set scalar to count from predicate-as-counter The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136747	2022-11-01 13:24:37 +00:00
David Sherwood	891aaff9a8	[AArch64][SVE2] Add the SVE2.1 pext and ptrue predicate-as-counter instructions This patch adds the assembly/disassembly for the following instructions: pext (predicate) : Set predicate from predicate-as-counter ptrue (predicate-as-counter) : Initialise predicate-as-counter to all active This patch also introduces the predicate-as-counter registers pn8, etc. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136678	2022-10-27 18:23:35 +00:00
Caroline Concatto	fc0cde8af5	[AArch64]SME2 Multi-vector - Index/Single/Multi Array Vectors FMA sources This patch adds the assembly/disassembly for the following instructions: INT: SMLAL SMLSL UMLAL UMLSL FP: BFMLAL BFMLSL FMLAL FMLSL For multiple and indexed vector, Multiple and Single vector and Multi vectors, for 1, 2 and 4 ZA registers. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 It also adds a new immediate: uimm3s2range for off3 uimm2s2range for off2 to represent the vector select offset. The new operands have the range between the first and the last vector position. Depends on: D135563 Reviewed By: aemerson, sdesmalen Re-landing the patch as the problem with https://reviews.llvm.org/D135563 is fixed in this commit: 1e4f82c2578cf5045ffe Differential Revision: https://reviews.llvm.org/D135785	2022-10-21 14:20:44 +01:00
Caroline Concatto	c05b1bde34	Revert "[AArch64]SME2 Multi-vector - Index/Single/Multi Array Vectors FMA sources" This reverts commit 3fee9358baab54e4ed646a106297e7fb6f1b4cff.	2022-10-20 19:43:30 +01:00
Caroline Concatto	3fee9358ba	[AArch64]SME2 Multi-vector - Index/Single/Multi Array Vectors FMA sources This patch adds the assembly/disassembly for the following instructions: INT: SMLAL SMLSL UMLAL UMLSL FP: BFMLAL BFMLSL FMLAL FMLSL For multiple and indexed vector, Multiple and Single vector and Multi vectors, for 1, 2 and 4 ZA registers. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 It also adds a new immediate: uimm3s2range for off3 uimm2s2range for off2 to represent the vector select offset. The new operands have the range between the first and the last vector position. Depends on: D135563 Reviewed By: aemerson, sdesmalen Differential Revision: https://reviews.llvm.org/D135785	2022-10-20 19:09:48 +01:00
Caroline Concatto	60e2aad109	[AArch64]Change printVectorList to print SVE vector range This patch changes the preferred disassembly for SVE vector lists. For instance, instead of printing this assembly: ld4d { z1.d, z2.d, z3.d, z4.d }, p0/z, [x0] it will print this: ld4d { z1.d-z4.d }, p0/z, [x0] Differential Revision: https://reviews.llvm.org/D135952	2022-10-14 18:59:56 +01:00
Antonio Frighetto	c63e05dc07	[AArch64InstPrinter] Introduce register markup tags emission AArch64 assembly syntax emission now leverages markup tags for registers, if enabled. Reviewed By: MaskRay, david-arm Differential Revision: https://reviews.llvm.org/D129870	2022-09-13 20:52:02 -07:00
Fangrui Song	1b726f0a4c	[AArch64InstPrinter] Add some `<reg:...>` for llvm-mc --mdis output	2022-09-01 21:34:56 -07:00
Antonio Frighetto	4e99079774	[AArch64InstPrinter] Introduce immediate markup tags emission AArch64 assembly syntax emission now leverages markup tags for immediates, if enabled. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129871	2022-09-01 20:58:42 -07:00
Kazu Hirata	ec5eab7e87	Use range-based for loops (NFC)	2022-08-20 21:18:32 -07:00
David Sherwood	8f9d73fbd6	[NFC][AArch64] Minor refactor of AArch64InstPrinter::printMatrixTileList We can remove the MatrixZADRegisterTable table of tile registers and just calculate the register index directly. Differential Revision: https://reviews.llvm.org/D127757	2022-06-15 09:52:24 +01:00
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit ef8206320769ad31422a803a0d6de6077fd231d2. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a consequence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
Alexandros Lamprineas	8689f5e6e7	[AArch64] Add support for the 'R' architecture profile. This change introduces subtarget features to predicate certain instructions and system registers that are available only on 'A' profile targets. Those features are not present when targeting a generic CPU, which is the default processor. In other words the generic CPU now means the intersection of 'A' and 'R' profiles. To maintain backwards compatibility we enable the features that correspond to -march=armv8-a when the architecture is not explicitly specified on the command line. References: https://developer.arm.com/documentation/ddi0600/latest Differential Revision: https://reviews.llvm.org/D110065	2021-10-27 12:32:30 +01:00
Cullen Rhodes	42ba79b7b0	[AArch64][SME] Update tile slice index offset Changes in architecture revision 00eac1: * Tile slice index offset no longer prefixed with '#'. * The syntax for 128-bit (.Q) ZA tile slice accesses must now include an explicit zero index. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-09 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D111212	2021-10-07 08:55:10 +00:00
Jason Molenda	0d8cd4e2d5	[AArch64InstPrinter] Change printAddSubImm to comment imm value when shifted Add a comment when there is a shifted value, add x9, x0, #291, lsl #12 ; =1191936 but not when the immediate value is unshifted, subs x9, x0, #256 ; =256 when the comment adds nothing additional to the reader. Differential Revision: https://reviews.llvm.org/D107196	2021-08-03 02:28:46 -07:00
Cullen Rhodes	2e27c4e1f1	[AArch64][SME] Add zero instruction This patch adds the zero instruction for zeroing a list of 64-bit element ZA tiles. The instruction takes a list of up to eight tiles ZA0.D-ZA7.D, which must be in order, e.g. zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6.d,za7.d} zero {za1.d,za3.d,za5.d,za7.d} The assembler also accepts 32-bit, 16-bit and 8-bit element tiles which are mapped to corresponding 64-bit element tiles in accordance with the architecturally defined mapping between different element size tiles, e.g. * Zeroing ZA0.B, or the entire array name ZA, is equivalent to zeroing all eight 64-bit element tiles ZA0.D to ZA7.D. * Zeroing ZA0.S is equivalent to zeroing ZA0.D and ZA4.D. The preferred disassembly of this instruction uses the shortest list of tile names that represent the encoded immediate mask, e.g. * An immediate which encodes 64-bit element tiles ZA0.D, ZA1.D, ZA4.D and ZA5.D is disassembled as {ZA0.S, ZA1.S}. * An immediate which encodes 64-bit element tiles ZA0.D, ZA2.D, ZA4.D and ZA6.D is disassembled as {ZA0.H}. * An all-ones immediate is disassembled as {ZA}. * An all-zeros immediate is disassembled as an empty list {}. This patch adds the MatrixTileList asm operand and related parsing to support this. Depends on D105570. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105575	2021-07-27 08:35:45 +00:00
Cullen Rhodes	15af3aaa2e	[AArch64][SME] Add system registers and related instructions This patch adds the new system registers introduced in SME: - ID_AA64SMFR0_EL1 (ro) SME feature identifier. - SMCR_ELx (r/w) streaming mode control register for configuring effective SVE Streaming SVE Vector length when the PE is in Streaming SVE mode. - SVCR (r/w) streaming vector control register, visible at all exception levels. Provides access to PSTATE.SM and PSTATE.ZA using MSR and MRS instructions. - SMPRI_EL1 (r/w) streaming mode execution priority register. - SMPRIMAP_EL2 (r/w) streaming mode priority mapping register. - SMIDR_EL1 (ro) streaming mode identification register. - TPIDR2_EL0 (r/w) for use by SME software to manage per-thread SME context. - MPAMSM_EL1 (r/w) MPAM (v8.4) streaming mode register, for labelling memory accesses performed in streaming mode. Also added in this patch are the SME mode change instructions. Three MSR immediate instructions are implemented to set or clear PSTATE.SM, PSTATE.ZA, or both respectively: - MSR SVCRSM, #<imm1> - MSR SVCRZA, #<imm1> - MSR SVCRSMZA, #<imm1> The following smstart/smstop aliases are also implemented for convenience: smstart -> MSR SVCRSMZA, #1 smstart sm -> MSR SVCRSM, #1 smstart za -> MSR SVCRZA, #1 smstop -> MSR SVCRSMZA, #0 smstop sm -> MSR SVCRSM, #0 smstop za -> MSR SVCRZA, #0 The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105576	2021-07-20 08:06:26 +00:00
Cullen Rhodes	c08dabb0f4	[AArch64][SME] Add matrix register definitions and parsing support SME introduces the ZA array, a new piece of architectural register state consisting of a matrix of [SVLb x SVLb] bytes, where SVL is the implementation defined Streaming SVE vector length and SVLb is the number of 8-bit elements in a vector of SVL bits. SME instructions consist of three types of matrix operands: * Tiles: a ZA tile is a square, two-dimensional sub-array of elements within the ZA array. These tiles make up the larger accumulator array and the granularity varies based on the element size, i.e. - ZAQ0..ZAQ15 (smallest tile granule) - ZAD0..ZAD7 - ZAS0..ZAS3 - ZAH0..ZAH1 or ZAB0 (largest tile granule, single tile) * Tile vectors: similar to regular tiles, but have an extra 'h' or 'v' to tell how the vector at [reg+offset] is laid out in the tile, horizontally or vertically. E.g. za1h.h or za15v.q, which corresponds to vectors in registers ZAH1 and ZAQ15, respectively. * Accumulator matrix: this is the entire accumulator array ZA. This patch adds the register classes and related operands and parsing for SME instructions operating on the accumulator array. The ADDHA and ADDVA instructions which operate on tiles are also added in this patch to make some use of the code added, later patches will make use of the other operands introduced here. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Co-authored by: Sander de Smalen (@sdesmalen) Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105570	2021-07-14 08:25:49 +00:00
Fangrui Song	5852582532	[AArch64] Support llvm-mc/llvm-objdump -M no-aliases This enables the no-aliases forms of many instructions. Depends on D103004 Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D103005	2021-05-26 13:35:31 -07:00
Alexandros Lamprineas	1079870971	[llvm-mc][AArch64] HINT instruction disassembled as BTI The Arm Architecture Reference Manual says that the SystemHintOp_BTI opcode is preferred when CRm:op2 matches 0100:xx0, but llvm-mc currently accepts 0100:xxx, which isn't right. Differential Revision: https://reviews.llvm.org/D102415	2021-05-14 10:05:37 +01:00
Dan Gohman	698c6b0a09	[WebAssembly] Support single-floating-point immediate value As mentioned in TODO comment, casting double to float causes NaNs to change bits. To avoid the change, this patch adds support for single-floating-point immediate value on MachineCode. Patch by Yuta Saito. Differential Revision: https://reviews.llvm.org/D77384	2021-02-04 18:05:06 -08:00
Lucas Prates	97c006aabb	[AArch64] Add a GPR64x8 register class This adds a GPR64x8 register class that will be needed as the data operand to the LD64B/ST64B family of instructions in the v8.7-A Accelerator Extension, which load or store a contiguous range of eight x-regs. It has to be its own register class so that register allocation will have visibility of the full set of registers actually read/written by the instructions, which will be needed when we add intrinsics and/or inline asm access to this piece of architecture. Patch written by Simon Tatham. Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D91774	2020-12-17 13:45:46 +00:00
Lucas Prates	42b92b31b8	[ARM][AArch64] Adding basic support for the v8.7-A architecture This introduces support for the v8.7-A architecture through a new subtarget feature called "v8.7a". It adds two new "WFET" and "WFIT" instructions, the nXS limited-TLB-maintenance qualifier for DSB and TLBI instructions, a new CPU id register, ID_AA64ISAR2_EL1, and the new HCRX_EL2 system register. Based on patches written by Simon Tatham and Victor Campos. Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D91772	2020-12-17 13:45:08 +00:00
Fangrui Song	1bd928e50b	[AArch64InstPrinter] Use * 4096 instead of << 12 Left shifting a negative integer is undefined before C++20.	2020-12-16 14:02:25 -08:00
Fangrui Song	66bcbdbc9c	[AArch64InstPrinter] Change printADRPLabel to print the target address in hexadecimal form Similar to D77853. Change ADRP to print the target address in hex, instead of the raw immediate. The behavior is similar to GNU objdump but we also include `0x`. Note: GNU objdump is not consistent whether or not to emit `0x` for different architectures. We try emitting 0x consistently for all targets. ``` GNU objdump: adrp x16, 10000000 Old llvm-objdump: adrp x16, #0 New llvm-objdump: adrp x16, 0x10000000 ``` `adrp Xd, 0x...` assembles to a relocation referencing `ABS+0x10000` which is not intended. We need to use a linker or use yaml2obj. The main test is `test/tools/llvm-objdump/ELF/AArch64/pcrel-address.yaml` Differential Revision: https://reviews.llvm.org/D93241	2020-12-16 09:20:55 -08:00
Fangrui Song	a3b5f428c1	[AArch64] Print the immediate operand for SPACE pseudo instruction Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D81814	2020-06-15 20:55:53 -07:00
Fangrui Song	7f36cb1f1a	[AArch64InstPrinter] Change printAlignedLabel to print the target address in hexadecimal form Similar to D76580 (x86) and D76591 (PPC). ``` // llvm-objdump -d output (before) 10000: 08 00 00 94 bl #32 10004: 08 00 00 94 bl #32 // llvm-objdump -d output (after) 10000: 08 00 00 94 bl 0x10020 10004: 08 00 00 94 bl 0x10024 // GNU objdump -d. The lack of 0x is not ideal due to ambiguity. 10000: 94000008 bl 10020 <bar+0x18> 10004: 94000008 bl 10024 <bar+0x1c> ``` The new output makes it easier to find the jump target. Differential Revision: https://reviews.llvm.org/D77853	2020-04-10 09:21:09 -07:00
Fangrui Song	b3cc5dcef0	[MCInstPrinter] Add parameter `Address` to MCInstPrinter::printAliasInstr. NFC Follow-up of D72172.	2020-03-27 00:03:32 -07:00
Fangrui Song	5fad05e80d	[MCInstPrinter] Pass `Address` parameter to MCOI::OPERAND_PCREL typed operands. NFC Follow-up of D72172 and D72180 This patch passes `uint64_t Address` to print methods of PC-relative operands so that subsequent target specific patches can change `*InstPrinter::print{Operand,PCRelImm,...}` to customize the output. Add MCInstPrinter::PrintBranchImmAsAddress which is set to true by llvm-objdump. ``` // Current llvm-objdump -d output aarch64: 20000: bl #0 ppc: 20000: bl .+4 x86: 20000: callq 0 // Ideal output aarch64: 20000: bl 0x20000 ppc: 20000: bl 0x20004 x86: 20000: callq 0x20005 // GNU objdump -d. The lack of 0x is not ideal because the result cannot be re-assembled aarch64: 20000: bl 20000 ppc: 20000: bl 0x20004 x86: 20000: callq 20005 ``` In `lib/Target/X86/X86GenAsmWriter1.inc` (generated by `llvm-tblgen -gen-asm-writer`): ``` case 12: // CALL64pcrel32, CALLpcrel16, CALLpcrel32, EH_SjLj_Setup, JCXZ, JECXZ, J... - printPCRelImm(MI, 0, O); + printPCRelImm(MI, Address, 0, O); return; ``` Some targets have 2 `printOperand` overloads, one without `Address` and one with `Address`. They should annotate derived `Operand` properly with `let OperandType = "OPERAND_PCREL"`. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D76574	2020-03-26 08:21:15 -07:00
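
Several of the address-printing commits in this log (D76574, D77853, D93241, D137260, D107196) describe the same small piece of arithmetic: add a signed PC-relative offset to the instruction address and print the result as an unsigned 64-bit hex value. The Python sketch below is a simplified model of that arithmetic, not the LLVM implementation. The function names are hypothetical, branch offsets are assumed to be given in bytes, ADRP immediates are assumed to be given in 4 KiB pages, and the ADRP instruction address `0x10000abc` is made up for illustration.

```python
MASK64 = (1 << 64) - 1  # wrap values to the width of uint64_t


def as_uint64_hex(value: int) -> str:
    """Format a possibly negative address as an unsigned 64-bit hex literal."""
    return hex(value & MASK64)


def branch_target(address: int, byte_offset: int) -> str:
    """Model of printing a PC-relative branch operand as a target address.

    `address` is the address of the branch instruction and `byte_offset`
    is the signed offset in bytes (an assumption of this sketch).
    """
    return as_uint64_hex(address + byte_offset)


def adrp_target(address: int, page_imm: int) -> str:
    """Model of ADRP: the 4 KiB page containing `address`, plus `page_imm` pages.

    The offset uses `* 4096` rather than `<< 12`, echoing the commit that
    avoids left-shifting a negative value in C++.
    """
    return as_uint64_hex((address & ~0xFFF) + page_imm * 4096)


def add_sub_imm_comment(imm: int, shift: int) -> str:
    """Return the '=value' comment only when the immediate is shifted."""
    return f"={imm * (1 << shift)}" if shift != 0 else ""


if __name__ == "__main__":
    print("bl", branch_target(0x10000, 32))   # bl #32 at 0x10000
    print("bl", branch_target(0x10004, 32))   # bl #32 at 0x10004
    print("bl", as_uint64_hex(-0x37efd56628)) # negative address shown as uint64_t
    print("adrp x16,", adrp_target(0x10000abc, 0))
    print("add x9, x0, #291, lsl #12 ;", add_sub_imm_comment(291, 12))
```

With the inputs taken from the commit messages, this model reproduces `bl 0x10020`, `bl 0x10024`, `bl 0xffffffc8102a99d8` and the `=1191936` comment. An unshifted immediate such as `#256` gets no comment.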


58 Commits