llvm-project

Author	SHA1	Message	Date
David Green	f33245a5c4	[AArch64] Fix a always true condition warning. NFC As ImmVal is unsigned, it will always be >= 0	2024-01-01 15:56:03 +00:00
Tomas Matheson	192f720178	Re-land "[AArch64] Add FEAT_PAuthLR assembler support" (#75947 ) This reverts commit 199a0f9f5aaf72ff856f68e3bb708e783252af17. Fixed the left-shift of signed integer which was causing UB.	2023-12-21 18:09:31 +00:00
Tomas Matheson	199a0f9f5a	Revert "[AArch64] Add FEAT_PAuthLR assembler support" This reverts commit 934b1099cbf14fa3f86a269dff957da8e5fb619f. Buildbot failues on sanitizer-x86_64-linux-fast	2023-12-21 16:26:39 +00:00
Oliver Stannard	934b1099cb	[AArch64] Add FEAT_PAuthLR assembler support Add assembly/disassembly support for the new PAuthLR instructions introduced in Armv9.5-A: - AUTIASPPC/AUTIBSPPC - PACIASPPC/PACIBSPPC - PACNBIASPPC/PACNBIBSPPC - RETAASPPC/RETABSPPC - PACM Documentation for these instructions can be found here: https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/	2023-12-21 14:18:33 +00:00
hassnaaHamdi	6477b41a0b	[llvm][AArch64][Assembly]: Add FP8FMA assembly and disassembly. (#70134 ) This patch adds the feature flag FP8FMA and the assembly/disassembly for the following instructions of NEON and SVE2: * NEON: - FMLALBlane - FMLALTlane - FMLALLBBlane - FMLALLBTlane - FMLALLTBlane - FMLALLTTlane - FMLALB - FMLALT - FMLALLB - FMLALLBT - FMLALLTB - FMLALLTT * SVE2: - FMLALB_ZZZI - FMLALT_ZZZI - FMLALB_ZZZ - FMLALT_ZZZ - FMLALLBB_ZZZI - FMLALLBT_ZZZI - FMLALLTB_ZZZI - FMLALLTT_ZZZI - FMLALLBB_ZZZ - FMLALLBT_ZZZ - FMLALLTB_ZZZ - FMLALLTT_ZZZ That is according to this documentation: https://developer.arm.com/documentation/ddi0602/2023-09	2023-11-01 13:03:00 +00:00
Matthew Devereau	b967f3a1d7	[AArch64] Separate PNR into its own Register Class (#65306 ) This patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand.	2023-09-21 19:53:16 +01:00
Craig Topper	c1d94ea08c	[AArch64] Remove global constructors from AArch64Disassembler.cpp. Instead of using SmallVectors of SmallVectors, use a plain array. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D150077	2023-05-09 10:37:20 -07:00
Jay Foad	768aed1378	[MC] Make more use of MCInstrDesc::operands. NFC. Change MCInstrDesc::operands to return an ArrayRef so we can easily use it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end. A future patch will remove opInfo_begin and opInfo_end. Also use it instead of raw access to the OpInfo pointer. A future patch will remove this pointer. Differential Revision: https://reviews.llvm.org/D142213	2023-01-23 11:31:41 +00:00
Lucas Prates	f516e91715	[AArch64] Add new v9.4-A PM pstate system register This adds support for the new PM pstate system register introduced by the v9.4-A Exception-based Event Profiling extension (FEAT_EBEP). The new PM pstate register takes a 1-bit immediate and requires different values to be specified for the higher bits of the Crm field. To enable that, this patch creates an explicit separation between the pstate system registers that take 4-bit and 1-bit immediate operands, allowing each entry to specify the value for the 3 high bits of Crm. This also updates other pstate registers to correctly accept 4-bit immediates, matching their decoding specification from the Arm ARM. These include: `PAN`, `UAO`, `DIT` and `SSBS`. More information about this extension and the new register can be found at: * https://developer.arm.com/documentation/ddi0601/2022-09/AArch64-Registers/PM--PMU-Exception-Mask Contributors: * Lucas Prates * Sam Elliott Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D139925	2022-12-19 15:07:52 +00:00
Tomas Matheson	7fea6f2e0e	[AArch64] Assembly support for VMSA Virtual Memory System Architecture (VMSA) This is part of the 2022 A-Profile Architecture extensions and adds support for the following: - Translation Hardening Extension (FEAT_THE) - 128-bit Page Table Descriptors (FEAT_D128) - 56-bit Virtual Address (FEAT_LVA3) - Support for 128-bit System Registers (FEAT_SYSREG128) - System Instructions that can take 128-bit inputs (FEAT_SYSINSTR128) - 128-bit Atomic Instructions (FEAT_LSE128) - Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE) - Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE) - Memory Attribute Index Enhancement (FEAT_AIE) New instructions added: - FEAT_SYSREG128 adds MRRS and MSRR. - FEAT_SYSINSTR128 adds the SYSP instruction and TLBIP aliases. - FEAT_LSE128 adds LDCLRP, LDSET, and SWPP instructions. - FEAT_THE adds the set of RCW* instructions. Specs for individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ Contributors: Keith Walker Lucas Prates Sam Elliott Son Tuan Vu Tomas Matheson Differential Revision: https://reviews.llvm.org/D138920	2022-11-30 13:37:02 +00:00
Ties Stuij	cb261e30fb	[AArch64][clang] implement 2022 General Data-Processing instructions This patch implements the 2022 Architecture General Data-Processing Instructions They include: Common Short Sequence Compression (CSSC) instructions - scalar comparison instructions SMAX, SMIN, UMAX, UMIN (32/64 bits) with or without immediate - ABS (absolute), CNT (count non-zero bits), CTZ (count trailing zeroes) - command-line options for CSSC Associated with these instructions in the documentation is the Range Prefetch Memory (RPRFM) instruction, which signals to the memory system that data memory accesses from a specified range of addresses are likely to occur in the near future. The instruction lies in hint space, and is made unconditional. Specs for the individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ contributors to this patch: - Cullen Rhodes - Son Tuan Vu - Mark Murray - Tomas Matheson - Sam Elliott - Ties Stuij Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D138488	2022-11-22 14:23:12 +00:00
Caroline Concatto	ecab1bc0dc	[AArch64]SME2 Multi vector Sel Load and Store instructions This patch adds the assembly/disassembly for the following instruction: SEL: Multi-vector conditionally select elements from two vectors for 2 and 4 registers Non-constiguous load with stride resgisters: LD1B (scalar + immediate): Contiguous load of bytes to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of bytes to multiple strided vectors (scalar index). LD1D (scalar + immediate): Contiguous load of doublewords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of doublewords to multiple strided vectors (scalar index). LD1H (scalar + immediate): Contiguous load of halfwords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of halfwords to multiple strided vectors (scalar index). LD1W (scalar + immediate): Contiguous load of words to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load of words to multiple strided vectors (scalar index). LDNT1B (scalar + immediate): Contiguous load non-temporal of bytes to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of bytes to multiple strided vectors (scalar index). LDNT1D (scalar + immediate): Contiguous load non-temporal of doublewords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of doublewords to multiple strided vectors (scalar index). LDNT1H (scalar + immediate): Contiguous load non-temporal of halfwords to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of halfwords to multiple strided vectors (scalar index). LDNT1W (scalar + immediate): Contiguous load non-temporal of words to multiple strided vectors (immediate index). (scalar + scalar): Contiguous load non-temporal of words to multiple strided vectors (scalar index). Non-constiguous store with stride resgisters: ST1B (scalar + immediate): Contiguous store of bytes from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of bytes from multiple strided vectors (scalar index). ST1D (scalar + immediate): Contiguous store of doublewords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of doublewords from multiple strided vectors (scalar index). ST1H (scalar + immediate): Contiguous store of halfwords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of halfwords from multiple strided vectors (scalar index). ST1W (scalar + immediate): Contiguous store of words from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store of words from multiple strided vectors (scalar index). STNT1B (scalar + immediate): Contiguous store non-temporal of bytes from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of bytes from multiple strided vectors (scalar index). STNT1D (scalar + immediate): Contiguous store non-temporal of doublewords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of doublewords from multiple strided vectors (scalar index). STNT1H (scalar + immediate): Contiguous store non-temporal of halfwords from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of halfwords from multiple strided vectors (scalar index). STNT1W (scalar + immediate): Contiguous store non-temporal of words from multiple strided vectors (immediate index). (scalar + scalar): Contiguous store non-temporal of words from multiple strided vectors (scalar index). The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 This patch also adds a new SVE vector list to represent the stride loads/stores ZPRVectorListStrided and the sets of 2 and 4 ZA registers: ZZ_[b\|h\|w\|d]_strided and ZZZZ_[b\|h\|w\|d]_strided Differential Revision: https://reviews.llvm.org/D136172	2022-11-10 16:04:57 +00:00
Caroline Concatto	a20112a74c	[AArch64]SME2 instructions that use ZTO operand This patch adds the assembly/disassembly for the following instructions: ZERO (ZT0): Zero ZT0. LDR (ZT0): Load ZT0 register. STR (ZT0): Store ZT0 register. MOVT (scalar to ZT0): Move 8 bytes from general-purpose register to ZT0. (ZT0 to scalar): Move 8 bytes from ZT0 to general-purpose register. Consecutive: LUTI2 (single): Lookup table read with 2-bit indexes. (two registers): Lookup table read with 2-bit indexes. (four registers): Lookup table read with 2-bit indexes. LUTI4 (single): Lookup table read with 4-bit indexes. (two registers): Lookup table read with 4-bit indexes. (four registers): Lookup table read with 4-bit indexes. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 This patch also adds a new register class and operand for zt0 and a another index operand uimm3s8 Differential Revision: https://reviews.llvm.org/D136088	2022-11-03 07:35:21 +00:00
David Sherwood	be369ea31b	[AArch64][SVE2] Add the SVE2.1 while & pext predicate pair instructions This patch adds the assembly/disassembly for the following predicate pair instructions: pext: Set pair of predicates from predicate-as-counter whilelt: While incrementing signed scalar less than scalar whilele: While incrementing signed scalar less than or equal to scalar whilegt: While incrementing signed scalar greater than scalar whilege: While incrementing signed scalar greater than or equal to scalar whilelo: While incrementing unsigned scalar lower than scalar whilels: While incrementing unsigned scalar lower or same as scalar whilehs: While decrementing unsigned scalar higher or same as scalar whilehi: While decrementing unsigned scalar higher than scalar The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136759	2022-11-02 08:39:03 +00:00
James Y Knight	9a26f89316	[llvm-tblgen] NFC: Simplify DecoderEmitter. Currently the DecoderEmitter constructor takes a bunch of string parameters containing bits of code to interpolate. However, there's only two ways it can be called. The one used for most targets which doesn't handle the SoftFail DecoderStatus (not a problem, because they don't use SoftFail). The other mode, which is used for ARM/AArch64, does handle SoftFail, but requires an externally defined helper function in those targets. This is unnecessary complication; remove the parameters, and unify onto a single version which does support SoftFail, defining the helper itself.	2022-10-28 19:45:20 -04:00
David Sherwood	891aaff9a8	[AArch64][SVE2] Add the SVE2.1 pext and ptrue predicate-as-counter instructions This patch adds the assembly/disassembly for the following instructions: pext (predicate) : Set predicate from predicate-as-counter ptrue (predicate-as-counter) : Initialise predicate-as-counter to all active This patch also introduces the predicate-as-counter registers pn8, etc. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D136678	2022-10-27 18:23:35 +00:00
Caroline Concatto	9db12a45e4	[AArch64]SME2 Multiple vector ternary int/float 2 and 4 registers This patch adds the assembly/disassembly for the following instructions: For INT: ADD(array results, multiple vectors): Add multi-vector to multi-vector with ZA array vector results. SUB(array results, multiple vectors): Subtract multi-vector from multi-vector with ZA array vector results. For FP: FMLA (multiple vectors): Multi-vector floating-point fused multiply-add. FMLS (multiple vectors): Multi-vector floating-point fused multiply-subtract. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 This patch also adds a register operand to represent multiples of ZA multi-vectors. They are: ZZ_s_mul_r, ZZ_d_mul_r, ZZZZ_s_mul_r and ZZZZ_d_mul_r and represent the Zn or Zm times 2 or 4 according to the vector group. Depends on: D135455 Differential Revision: https://reviews.llvm.org/D135468	2022-10-20 18:44:22 +01:00
Caroline Concatto	2ecbe8c38c	[AArch64] SME2 Single-multi vector ternary int/FP 2 and 4 registers This patch adds the assembly/disassembly for the following instructions: For INT: ADD(array results, multiple and single vector): Add replicated single vector to multi-vector with ZA array vector results. SUB(array results, multiple and single vector): Subtract replicated single vector from multi-vector with ZA array vector results. For FP: FMLA (multiple and single vector): Multi-vector floating-point fused multiply-add by vector. FMLS (multiple and single vector): Multi-vector floating-point multiply-subtract long by vector. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 The Matriz Operand has 2 new sizes 32(.s) and 64(.d) bits (MatrixOp32 and MatrixOp64) Depends on: D135448 Depends on: D135952 Differential Revision: https://reviews.llvm.org/D135455	2022-10-19 17:49:48 +01:00
Kazu Hirata	109df7f9a4	[llvm] Qualify auto in range-based for loops (NFC) Identified with readability-qualified-auto.	2022-08-13 12:55:42 -07:00
Fangrui Song	de9d80c1c5	[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051.	2022-08-08 11:24:15 -07:00
Simon Tatham	55f1fbf005	[MC,llvm-objdump,ARM] Target-dependent disassembly resync policy. Currently, when llvm-objdump is disassembling a code section and encounters a point where no instruction can be decoded, it uses the same policy on all targets: consume one byte of the section, emit it as "<unknown>", and try disassembling from the next byte position. On an architecture where instructions are always 4 bytes long and 4-byte aligned, this makes no sense at all. If a 4-byte word cannot be decoded as an instruction, then the next place that a valid instruction could //possibly// be found is 4 bytes further on. Disassembling from a misaligned address can't possibly produce anything that the code generator intended, or that the CPU would even attempt to execute. This patch introduces a new MCDisassembler virtual method called `suggestBytesToSkip`, which allows each target to choose its own resynchronization policy. For Arm (as opposed to Thumb) and AArch64, I've filled in the new method to return a fixed width of 4. Thumb is a more interesting case, because the criterion for identifying 2-byte and 4-byte instruction encodings is very simple, and doesn't require the particular instruction to be recognized. So `suggestBytesToSkip` is also passed an ArrayRef of the bytes in question, so that it can take that into account. The new test case shows Thumb disassembly skipping over two unrecognized instructions, and identifying one as 2-byte and one as 4-byte. For targets other than Arm and AArch64, this is NFC: the base class implementation of `suggestBytesToSkip` still returns 1, so that the existing behavior is unchanged. Other targets can fill in their own implementations as they see fit; I haven't attempted to choose a new behavior for each one myself. I've updated all the call sites of `MCDisassembler::getInstruction` in llvm-objdump, and also one in sancov, which was the only other place I spotted the same idiom of `if (Size == 0) Size = 1` after a call to `getInstruction`. Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D130357	2022-07-26 09:35:30 +01:00
Maksim Panchenko	bed9efed71	[MCDisassembler] Disambiguate Size parameter in tryAddingSymbolicOperand() MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter to specify either the instruction size or the operand size depending on the architecture. However, for proper symbolic disassembly on X86, we need to know both sizes, as an instruction can have two operands, and the instruction size cannot be reliably calculated based on the operand offset and its size. Hence, split Size into OpSize and InstSize. For X86, the new interface allows to fix a couple of issues: * Correctly adjust the value of PC-relative operands. * Set operand size to zero when the operand is specified implicitly. Differential Revision: https://reviews.llvm.org/D126101	2022-05-25 13:44:32 -07:00
Caroline Concatto	8765ad42cd	[AArch64][SME][NFC] Add implicit operands for SME instructions in the disassembly. This patch simplifies the switch statement in getInstruction to add implicit operands (register ZA and Immediate equal to zero) in the SME operands when disassembly. The register ZA and the zero immediate can be added by checking the operand in MCInstDesc. Differential Revision: https://reviews.llvm.org/D125534	2022-05-20 10:29:21 +01:00
Sheng	c644488a8b	Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h` The name `MCFixedLenDisassembler.h` is out of date after D120958. Rename it as `MCDecoderOps.h` to reflect the change. Reviewed By: myhsu Differential Revision: https://reviews.llvm.org/D124987	2022-05-15 08:44:58 +08:00
Maksim Panchenko	4ae9745af1	[Disassember][NFCI] Use strong type for instruction decoder All LLVM backends use MCDisassembler as a base class for their instruction decoders. Use "const MCDisassembler " for the decoder instead of "const void ". Remove unnecessary static casts. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D122245	2022-03-25 18:53:59 -07:00
Kazu Hirata	410480e32b	Ensure newlines at the end of files (NFC)	2022-01-06 23:44:02 -08:00
Simon Tatham	e35a3f188f	[AArch64] Adding "armv8.8-a" memcpy/memset support. This family of instructions includes CPYF (copy forward), CPYB (copy backward), SET (memset) and SETG (memset + initialise MTE tags), with some sub-variants to indicate whether address translation is done in a privileged or unprivileged way. For the copy instructions, you can separately specify the read and write translations (so that kernels can safely use these instructions in syscall handlers, to memcpy between the calling process's user-space memory map and the kernel's own privileged one). The unusual thing about these instructions is that they write back to multiple registers, because they perform an implementation-defined amount of copying each time they run, and write back to _all_ the address and size registers to indicate how much remains to be done (and the code is expected to loop on them until the size register becomes zero). But this is no problem in LLVM - you just define each instruction to have multiple outputs, multiple inputs, and a set of constraints tying their register numbers together appropriately. This commit introduces a special subtarget feature called MOPS (after the name the spec gives to the CPU id field), which is a dependency of the top-level 8.8-A feature, and uses that to enable most of the new instructions. The SETMG instructions also depend on MTE (and the test checks that). Differential Revision: https://reviews.llvm.org/D116157	2022-01-05 14:44:24 +00:00
Reid Kleckner	89b57061f7	Move TargetRegistry.(h\|cpp) from Support to MC This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454	2021-10-08 14:51:48 -07:00
Cullen Rhodes	42ba79b7b0	[AArch64][SME] Update tile slice index offset Changes in architecture revision 00eac1: * Tile slice index offset no longer prefixed with '#'. * The syntax for 128-bit (.Q) ZA tile slice accesses must now include an explicit zero index. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-09 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D111212	2021-10-07 08:55:10 +00:00
Cullen Rhodes	dc5dd77ac7	[AArch64][SME] Support NEON vector to GPR integer moves in streaming mode A small subset of the NEON instruction set is legal in streaming mode. This patch adds support for the following vector to integer move instructions: 0x00 1110 0000 0001 0010 11xx xxxx xxxx # SMOV W\|Xd,Vn.B[0] 0x00 1110 0000 0010 0010 11xx xxxx xxxx # SMOV W\|Xd,Vn.H[0] 0100 1110 0000 0100 0010 11xx xxxx xxxx # SMOV Xd,Vn.S[0] 0000 1110 0000 0001 0011 11xx xxxx xxxx # UMOV Wd,Vn.B[0] 0000 1110 0000 0010 0011 11xx xxxx xxxx # UMOV Wd,Vn.H[0] 0000 1110 0000 0100 0011 11xx xxxx xxxx # UMOV Wd,Vn.S[0] 0100 1110 0000 1000 0011 11xx xxxx xxxx # UMOV Xd,Vn.D[0] Only the zero index variants are legal, all others indexes are illegal. To support this, new instructions are defined specifically for zero index which is hardcoded, along an implicit 'VectorIndex0' operand. Since the index operand is implicit and takes no bits in the encoding, custom decoding is required to add the operand. I'm not sure if this is the best approach but the predicate constraint on a subset of an operand is unusual. Would be interested to hear some alternatives. The instructions are predicated on 'HasNEONorStreamingSVE', i.e. they're enabled by either +neon or +streaming-sve. This follows on from the work in D106272 to support the subset of SVE(2) instructions that are legal in streaming mode. Depends on D107902. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D107903	2021-09-03 07:59:17 +00:00
Cullen Rhodes	419deccfd1	[AArch64] NFC: Remove register decoder tables in disassembler The register classes are generated by TableGen, use them instead of handwritten tables. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D107763	2021-08-12 07:28:56 +00:00
Cullen Rhodes	1a18bb9270	[AArch64] NFC: Remove DecodeVectorRegisterClass from disassembler The decoder function and table are the same as FPR128, use that instead. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D107644	2021-08-09 06:52:47 +00:00
Cullen Rhodes	08bc441174	[AArch64] NFC: drop unnecessary llvm:: namespace prefix on MCInst	2021-08-06 09:23:18 +00:00
Cullen Rhodes	2e27c4e1f1	[AArch64][SME] Add zero instruction This patch adds the zero instruction for zeroing a list of 64-bit element ZA tiles. The instruction takes a list of up to eight tiles ZA0.D-ZA7.D, which must be in order, e.g. zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6.d,za7.d} zero {za1.d,za3.d,za5.d,za7.d} The assembler also accepts 32-bit, 16-bit and 8-bit element tiles which are mapped to corresponding 64-bit element tiles in accordance with the architecturally defined mapping between different element size tiles, e.g. * Zeroing ZA0.B, or the entire array name ZA, is equivalent to zeroing all eight 64-bit element tiles ZA0.D to ZA7.D. * Zeroing ZA0.S is equivalent to zeroing ZA0.D and ZA4.D. The preferred disassembly of this instruction uses the shortest list of tile names that represent the encoded immediate mask, e.g. * An immediate which encodes 64-bit element tiles ZA0.D, ZA1.D, ZA4.D and ZA5.D is disassembled as {ZA0.S, ZA1.S}. * An immediate which encodes 64-bit element tiles ZA0.D, ZA2.D, ZA4.D and ZA6.D is disassembled as {ZA0.H}. * An all-ones immediate is disassembled as {ZA}. * An all-zeros immediate is disassembled as an empty list {}. This patch adds the MatrixTileList asm operand and related parsing to support this. Depends on D105570. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105575	2021-07-27 08:35:45 +00:00
Cullen Rhodes	2d80bbd939	[AArch64][SME] Add mova instructions This patch adds the mova instruction to insert/extract an SVE vector register to/from a ZA tile vector. The preferred MOV aliases are also implemented. Depends on D105572. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm, CarolineConcatto Differential Revision: https://reviews.llvm.org/D105574	2021-07-21 08:20:01 +00:00
Cullen Rhodes	6c32cfe85c	[AArch64][SME] Add ldr and str instructions The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D105573	2021-07-21 08:17:13 +00:00
Cullen Rhodes	15af3aaa2e	[AArch64][SME] Add system registers and related instructions This patch adds the new system registers introduced in SME: - ID_AA64SMFR0_EL1 (ro) SME feature identifier. - SMCR_ELx (r/w) streaming mode control register for configuring effective SVE Streaming SVE Vector length when the PE is in Streaming SVE mode. - SVCR (r/w) streaming vector control register, visible at all exception levels. Provides access to PSTATE.SM and PSTATE.ZA using MSR and MRS instructions. - SMPRI_EL1 (r/w) streaming mode execution priority register. - SMPRIMAP_EL2 (r/w) streaming mode priority mapping register. - SMIDR_EL1 (ro) streaming mode identification register. - TPIDR2_EL0 (r/w) for use by SME software to manage per-thread SME context. - MPAMSM_EL1 (r/w) MPAM (v8.4) streaming mode register, for labelling memory accesses performed in streaming mode. Also added in this patch are the SME mode change instructions. Three MSR immediate instructions are implemented to set or clear PSTATE.SM, PSTATE.ZA, or both respectively: - MSR SVCRSM, #<imm1> - MSR SVCRZA, #<imm1> - MSR SVCRSMZA, #<imm1> The following smstart/smstop aliases are also implemented for convenience: smstart -> MSR SVCRSMZA, #1 smstart sm -> MSR SVCRSM, #1 smstart za -> MSR SVCRZA, #1 smstop -> MSR SVCRSMZA, #0 smstop sm -> MSR SVCRSM, #0 smstop za -> MSR SVCRZA, #0 The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105576	2021-07-20 08:06:26 +00:00
Cullen Rhodes	99eb96f031	[AArch64][SME] Add load and store instructions This patch adds support for following contiguous load and store instructions: * LD1B, LD1H, LD1W, LD1D, LD1Q * ST1B, ST1H, ST1W, ST1D, ST1Q A new register class and operand is added for the 32-bit vector select register W12-W15. The differences in the following tests which have been re-generated are caused by the introduction of this register class: * llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll * llvm/test/CodeGen/AArch64/GlobalISel/regbank-inlineasm.mir * llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir * llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir D88663 attempts to resolve the issue with the store pair test differences in the AArch64 load/store optimizer. The GlobalISel differences are caused by changes in the enum values of register classes, tests have been updated with the new values. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: CarolineConcatto Differential Revision: https://reviews.llvm.org/D105572	2021-07-16 10:11:10 +00:00
Cullen Rhodes	c08dabb0f4	[AArch64][SME] Add matrix register definitions and parsing support SME introduces the ZA array, a new piece of architectural register state consisting of a matrix of [SVLb x SVLb] bytes, where SVL is the implementation defined Streaming SVE vector length and SVLb is the number of 8-bit elements in a vector of SVL bits. SME instructions consist of three types of matrix operands: * Tiles: a ZA tile is a square, two-dimensional sub-array of elements within the ZA array. These tiles make up the larger accumulator array and the granularity varies based on the element size, i.e. - ZAQ0..ZAQ15 (smallest tile granule) - ZAD0..ZAD7 - ZAS0..ZAS3 - ZAH0..ZAH1 or ZAB0 (largest tile granule, single tile) * Tile vectors: similar to regular tiles, but have an extra 'h' or 'v' to tell how the vector at [reg+offset] is layed out in the tile, horizontally or vertically. E.g. za1h.h or za15v.q, which corresponds to vectors in registers ZAH1 and ZAQ15, respectively. * Accumulator matrix: this is the entire accumulator array ZA. This patch adds the register classes and related operands and parsing for SME instructions operating on the accumulator array. The ADDHA and ADDVA instructions which operate on tiles are also added in this patch to make some use of the code added, later patches will make use of the other operands introduced here. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Co-authored by: Sander de Smalen (@sdesmalen) Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105570	2021-07-14 08:25:49 +00:00
Lucas Prates	313889191e	[AArch64] Adding the v8.7-A LD64B/ST64B Accelerator extension This adds support for the v8.7-A LD64B/ST64B Accelerator extension through a subtarget feature called "ls64". It adds four 64-byte load/store instructions with an operand in the new GPR64x8 register class, and one system register that's part of the same extension. Based on patches written by Simon Tatham. Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D91775	2020-12-17 13:46:23 +00:00
Lucas Prates	97c006aabb	[AArch64] Add a GPR64x8 register class This adds a GPR64x8 register class that will be needed as the data operand to the LD64B/ST64B family of instructions in the v8.7-A Accelerator Extension, which load or store a contiguous range of eight x-regs. It has to be its own register class so that register allocation will have visibility of the full set of registers actually read/written by the instructions, which will be needed when we add intrinsics and/or inline asm access to this piece of architecture. Patch written by Simon Tatham. Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D91774	2020-12-17 13:45:46 +00:00
Lucas Prates	b5bbb4b2b7	[NFC][AArch64] Move AArch64 MSR/MRS into a new decoder namespace This removes the general forms of the AArch64 MSR and MRS instructions from the same decoding table that contains many more specific instructions that supersede them. They're now in a separate decoding table of their own, called "Fallback", which is only consulted in the event of the main decoder table failing to produce an answer. This should avoid decoding conflicts on future specialized instructions in the MSR space. Patch written by Simon Tatham. Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D91771	2020-12-17 13:40:10 +00:00
Victor Campos	da852b03b0	[AArch64] Emit warning when disassembling unpredictable LDRAA and LDRAB Summary: LDRAA and LDRAB in their writeback variant should softfail when the same register is used as result and base. This patch adds a custom decoder that catches such case and emits a warning when it occurs. Differential Revision: https://reviews.llvm.org/D82541	2020-06-25 15:56:36 +01:00
Tom Stellard	0dbcb36394	CMake: Make most target symbols hidden by default Summary: For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF this change makes all symbols in the target specific libraries hidden by default. A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these libraries public, which is mainly needed for the definitions of the LLVMInitialize* functions. This patch reduces the number of public symbols in libLLVM.so by about 25%. This should improve load times for the dynamic library and also make abi checker tools, like abidiff require less memory when analyzing libLLVM.so One side-effect of this change is that for builds with LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that access symbols that are no longer public will need to be statically linked. Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1): nm before/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 36221 nm after/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 26278 Reviewers: chandlerc, beanz, mgorny, rnk, hans Reviewed By: rnk, hans Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54439	2020-01-14 19:46:52 -08:00
Fangrui Song	6fdd6a7b3f	[Disassembler] Delete the VStream parameter of MCDisassembler::getInstruction() The argument is llvm::null() everywhere except llvm::errs() in llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds. If we ever have the needs to add verbose log to disassemblers, we can record log with a member function, instead of passing it around as an argument.	2020-01-11 13:34:52 -08:00
Tom Stellard	4b0b26199b	Revert CMake: Make most target symbols hidden by default This reverts r362990 (git commit 374571301dc8e9bc9fdd1d70f86015de198673bd) This was causing linker warnings on Darwin: ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)' from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol 'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&), std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings. llvm-svn: 363028	2019-06-11 03:21:13 +00:00
Tom Stellard	374571301d	CMake: Make most target symbols hidden by default Summary: For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF this change makes all symbols in the target specific libraries hidden by default. A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these libraries public, which is mainly needed for the definitions of the LLVMInitialize* functions. This patch reduces the number of public symbols in libLLVM.so by about 25%. This should improve load times for the dynamic library and also make abi checker tools, like abidiff require less memory when analyzing libLLVM.so One side-effect of this change is that for builds with LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that access symbols that are no longer public will need to be statically linked. Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1): nm before/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 36221 nm after/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 26278 Reviewers: chandlerc, beanz, mgorny, rnk, hans Reviewed By: rnk, hans Subscribers: Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54439 llvm-svn: 362990	2019-06-10 22:12:56 +00:00
Richard Trieu	b26592e04d	[AArch64] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360709	2019-05-14 21:33:53 +00:00
Tim Northover	ff6875acd9	AArch64: support binutils-like things on arm64_32. This adds support for the arm64_32 watchOS ABI to LLVM's low level tools, teaching them about the specific MachO choices and constants needed to disassemble things. llvm-svn: 360663	2019-05-14 11:25:44 +00:00
Simon Pilgrim	aa49be4926	Avoid cppcheck operator precedence warnings. NFCI. Prefer ((X & Y) ? A : B) to (X & Y ? A : B) llvm-svn: 359884	2019-05-03 13:50:38 +00:00

1 2 3

140 Commits