140 Commits

Author SHA1 Message Date
David Green
f33245a5c4 [AArch64] Fix a always true condition warning. NFC
As ImmVal is unsigned, it will always be >= 0
2024-01-01 15:56:03 +00:00
Tomas Matheson
192f720178 Re-land "[AArch64] Add FEAT_PAuthLR assembler support" (#75947)
This reverts commit 199a0f9f5aaf72ff856f68e3bb708e783252af17.
Fixed the left-shift of signed integer which was causing UB.
2023-12-21 18:09:31 +00:00
Tomas Matheson
199a0f9f5a Revert "[AArch64] Add FEAT_PAuthLR assembler support"
This reverts commit 934b1099cbf14fa3f86a269dff957da8e5fb619f.

Buildbot failues on sanitizer-x86_64-linux-fast
2023-12-21 16:26:39 +00:00
Oliver Stannard
934b1099cb [AArch64] Add FEAT_PAuthLR assembler support
Add assembly/disassembly support for the new PAuthLR instructions
introduced in Armv9.5-A:

- AUTIASPPC/AUTIBSPPC
- PACIASPPC/PACIBSPPC
- PACNBIASPPC/PACNBIBSPPC
- RETAASPPC/RETABSPPC
- PACM

Documentation for these instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/
2023-12-21 14:18:33 +00:00
hassnaaHamdi
6477b41a0b
[llvm][AArch64][Assembly]: Add FP8FMA assembly and disassembly. (#70134)
This patch adds the feature flag FP8FMA and the assembly/disassembly
for the following instructions of NEON and SVE2:
  * NEON: 
    - FMLALBlane
    - FMLALTlane
    - FMLALLBBlane
    - FMLALLBTlane
    - FMLALLTBlane
    - FMLALLTTlane
    - FMLALB
    - FMLALT
    - FMLALLB
    - FMLALLBT
    - FMLALLTB
    - FMLALLTT
  * SVE2:
    - FMLALB_ZZZI
    - FMLALT_ZZZI
    - FMLALB_ZZZ 
    - FMLALT_ZZZ 
    - FMLALLBB_ZZZI 
    - FMLALLBT_ZZZI 
    - FMLALLTB_ZZZI 
    - FMLALLTT_ZZZI 
    - FMLALLBB_ZZZ 
    - FMLALLBT_ZZZ 
    - FMLALLTB_ZZZ 
    - FMLALLTT_ZZZ

That is according to this documentation:
https://developer.arm.com/documentation/ddi0602/2023-09
2023-11-01 13:03:00 +00:00
Matthew Devereau
b967f3a1d7
[AArch64] Separate PNR into its own Register Class (#65306)
This patch separates PNR registers into their own register class instead
of sharing a register class with PPR registers. This primarily allows us
to return more accurate register classes when applying assembly
constraints, but also more protection from supplying an incorrect
predicate type to an invalid register operand.
2023-09-21 19:53:16 +01:00
Craig Topper
c1d94ea08c [AArch64] Remove global constructors from AArch64Disassembler.cpp.
Instead of using SmallVectors of SmallVectors, use a plain array.

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D150077
2023-05-09 10:37:20 -07:00
Jay Foad
768aed1378 [MC] Make more use of MCInstrDesc::operands. NFC.
Change MCInstrDesc::operands to return an ArrayRef so we can easily use
it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end.
A future patch will remove opInfo_begin and opInfo_end.

Also use it instead of raw access to the OpInfo pointer. A future patch
will remove this pointer.

Differential Revision: https://reviews.llvm.org/D142213
2023-01-23 11:31:41 +00:00
Lucas Prates
f516e91715 [AArch64] Add new v9.4-A PM pstate system register
This adds support for the new PM pstate system register introduced by
the v9.4-A Exception-based Event Profiling extension (FEAT_EBEP).

The new PM pstate register takes a 1-bit immediate and requires
different values to be specified for the higher bits of the Crm field.
To enable that, this patch creates an explicit separation between the
pstate system registers that take 4-bit and 1-bit immediate operands,
allowing each entry to specify the value for the 3 high bits of Crm.

This also updates other pstate registers to correctly accept 4-bit
immediates, matching their decoding specification from the Arm ARM.
These include: `PAN`,  `UAO`, `DIT` and `SSBS`.

More information about this extension and the new register can be found
at:
* https://developer.arm.com/documentation/ddi0601/2022-09/AArch64-Registers/PM--PMU-Exception-Mask

Contributors:
* Lucas Prates
* Sam Elliott

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D139925
2022-12-19 15:07:52 +00:00
Tomas Matheson
7fea6f2e0e [AArch64] Assembly support for VMSA
Virtual Memory System Architecture (VMSA)

This is part of the 2022 A-Profile Architecture extensions and adds support for
the following:

 - Translation Hardening Extension (FEAT_THE)
 - 128-bit Page Table Descriptors (FEAT_D128)
 - 56-bit Virtual Address (FEAT_LVA3)
 - Support for 128-bit System Registers (FEAT_SYSREG128)
 - System Instructions that can take 128-bit inputs (FEAT_SYSINSTR128)
 - 128-bit Atomic Instructions (FEAT_LSE128)
 - Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE)
 - Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE)
 - Memory Attribute Index Enhancement (FEAT_AIE)

New instructions added:
 - FEAT_SYSREG128 adds MRRS and MSRR.
 - FEAT_SYSINSTR128 adds the SYSP instruction and TLBIP aliases.
 - FEAT_LSE128 adds LDCLRP, LDSET, and SWPP instructions.
 - FEAT_THE adds the set of RCW* instructions.

Specs for individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/

Contributors:
  Keith Walker
  Lucas Prates
  Sam Elliott
  Son Tuan Vu
  Tomas Matheson

Differential Revision: https://reviews.llvm.org/D138920
2022-11-30 13:37:02 +00:00
Ties Stuij
cb261e30fb [AArch64][clang] implement 2022 General Data-Processing instructions
This patch implements the 2022 Architecture General Data-Processing Instructions

They include:

Common Short Sequence Compression (CSSC) instructions
- scalar comparison instructions
  SMAX, SMIN, UMAX, UMIN (32/64 bits) with or without immediate
- ABS (absolute), CNT (count non-zero bits), CTZ (count trailing zeroes)
- command-line options for CSSC

Associated with these instructions in the documentation is the Range Prefetch
Memory (RPRFM) instruction, which signals to the memory system that data memory
accesses from a specified range of addresses are likely to occur in the near
future. The instruction lies in hint space, and is made unconditional.

Specs for the individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/

contributors to this patch:
- Cullen Rhodes
- Son Tuan Vu
- Mark Murray
- Tomas Matheson
- Sam Elliott
- Ties Stuij

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D138488
2022-11-22 14:23:12 +00:00
Caroline Concatto
ecab1bc0dc [AArch64]SME2 Multi vector Sel Load and Store instructions
This patch adds the assembly/disassembly for the following instruction:

   SEL: Multi-vector conditionally select elements from two vectors
        for 2 and 4 registers

Non-constiguous load with stride resgisters:

  LD1B (scalar + immediate): Contiguous load of bytes to multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous load of bytes to multiple strided vectors (scalar index).
  LD1D (scalar + immediate): Contiguous load of doublewords to multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous load of doublewords to multiple strided vectors (scalar index).
  LD1H (scalar + immediate): Contiguous load of halfwords to multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous load of halfwords to multiple strided vectors (scalar index).
  LD1W (scalar + immediate): Contiguous load of words to multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous load of words to multiple strided vectors (scalar index).

  LDNT1B (scalar + immediate): Contiguous load non-temporal of bytes to multiple strided vectors (immediate index).
         (scalar + scalar): Contiguous load non-temporal of bytes to multiple strided vectors (scalar index).
  LDNT1D (scalar + immediate): Contiguous load non-temporal of doublewords to multiple strided vectors (immediate index).
         (scalar + scalar): Contiguous load non-temporal of doublewords to multiple strided vectors (scalar index).
  LDNT1H (scalar + immediate): Contiguous load non-temporal of halfwords to multiple strided vectors (immediate index).
         (scalar + scalar): Contiguous load non-temporal of halfwords to multiple strided vectors (scalar index).
  LDNT1W (scalar + immediate): Contiguous load non-temporal of words to multiple strided vectors (immediate index).
         (scalar + scalar): Contiguous load non-temporal of words to multiple strided vectors (scalar index).

Non-constiguous store with stride resgisters:

  ST1B (scalar + immediate): Contiguous store of bytes from multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous store of bytes from multiple strided vectors (scalar index).
  ST1D (scalar + immediate): Contiguous store of doublewords from multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous store of doublewords from multiple strided vectors (scalar index).
  ST1H (scalar + immediate): Contiguous store of halfwords from multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous store of halfwords from multiple strided vectors (scalar index).
  ST1W (scalar + immediate): Contiguous store of words from multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous store of words from multiple strided vectors (scalar index).

  STNT1B (scalar + immediate): Contiguous store non-temporal of bytes from multiple strided vectors (immediate index).
         (scalar + scalar): Contiguous store non-temporal of bytes from multiple strided vectors (scalar index).
  STNT1D (scalar + immediate): Contiguous store non-temporal of doublewords from multiple strided vectors (immediate index).
         (scalar + scalar): Contiguous store non-temporal of doublewords from multiple strided vectors (scalar index).
  STNT1H (scalar + immediate): Contiguous store non-temporal of halfwords from multiple strided vectors (immediate index).
         (scalar + scalar): Contiguous store non-temporal of halfwords from multiple strided vectors (scalar index).
  STNT1W (scalar + immediate): Contiguous store non-temporal of words from multiple strided vectors (immediate index).
         (scalar + scalar): Contiguous store non-temporal of words from multiple strided vectors (scalar index).

    The reference can be found here:

        https://developer.arm.com/documentation/ddi0602/2022-09

This patch also adds a new SVE vector list to represent the stride loads/stores
ZPRVectorListStrided and the sets of 2 and 4 ZA registers:
ZZ_[b|h|w|d]_strided and ZZZZ_[b|h|w|d]_strided

Differential Revision: https://reviews.llvm.org/D136172
2022-11-10 16:04:57 +00:00
Caroline Concatto
a20112a74c [AArch64]SME2 instructions that use ZTO operand
This patch adds the assembly/disassembly for the following instructions:
  ZERO (ZT0): Zero ZT0.
  LDR (ZT0): Load ZT0 register.
  STR (ZT0): Store ZT0 register.
  MOVT (scalar to ZT0): Move 8 bytes from general-purpose register to ZT0.
       (ZT0 to scalar): Move 8 bytes from ZT0 to general-purpose register.
 Consecutive:
   LUTI2 (single): Lookup table read with 2-bit indexes.
         (two registers): Lookup table read with 2-bit indexes.
         (four registers): Lookup table read with 2-bit indexes.
   LUTI4 (single): Lookup table read with 4-bit indexes.
         (two registers): Lookup table read with 4-bit indexes.
         (four registers): Lookup table read with 4-bit indexes.

The reference can be found here:

https://developer.arm.com/documentation/ddi0602/2022-09

This patch also adds a new register class and operand for zt0
and a another index operand uimm3s8

Differential Revision: https://reviews.llvm.org/D136088
2022-11-03 07:35:21 +00:00
David Sherwood
be369ea31b [AArch64][SVE2] Add the SVE2.1 while & pext predicate pair instructions
This patch adds the assembly/disassembly for the following
predicate pair instructions:

pext:    Set pair of predicates from predicate-as-counter
whilelt: While incrementing signed scalar less than scalar
whilele: While incrementing signed scalar less than or equal to scalar
whilegt: While incrementing signed scalar greater than scalar
whilege: While incrementing signed scalar greater than or equal to scalar
whilelo: While incrementing unsigned scalar lower than scalar
whilels: While incrementing unsigned scalar lower or same as scalar
whilehs: While decrementing unsigned scalar higher or same as scalar
whilehi: While decrementing unsigned scalar higher than scalar

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09

Differential Revision: https://reviews.llvm.org/D136759
2022-11-02 08:39:03 +00:00
James Y Knight
9a26f89316 [llvm-tblgen] NFC: Simplify DecoderEmitter.
Currently the DecoderEmitter constructor takes a bunch of string
parameters containing bits of code to interpolate.

However, there's only two ways it can be called. The one used for most
targets which doesn't handle the SoftFail DecoderStatus (not a
problem, because they don't use SoftFail). The other mode, which is
used for ARM/AArch64, does handle SoftFail, but requires an externally
defined helper function in those targets.

This is unnecessary complication; remove the parameters, and unify
onto a single version which does support SoftFail, defining the helper
itself.
2022-10-28 19:45:20 -04:00
David Sherwood
891aaff9a8 [AArch64][SVE2] Add the SVE2.1 pext and ptrue predicate-as-counter instructions
This patch adds the assembly/disassembly for the following instructions:

pext (predicate) : Set predicate from predicate-as-counter
ptrue (predicate-as-counter) : Initialise predicate-as-counter to all active

This patch also introduces the predicate-as-counter registers pn8, etc.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09

Differential Revision: https://reviews.llvm.org/D136678
2022-10-27 18:23:35 +00:00
Caroline Concatto
9db12a45e4 [AArch64]SME2 Multiple vector ternary int/float 2 and 4 registers
This patch adds the assembly/disassembly for the following instructions:
       For INT:
               ADD(array results, multiple vectors): Add multi-vector to multi-vector with ZA array vector results.
               SUB(array results, multiple vectors): Subtract multi-vector from multi-vector with ZA array vector results.
       For FP:
              FMLA (multiple vectors): Multi-vector floating-point fused multiply-add.
              FMLS (multiple vectors): Multi-vector floating-point fused multiply-subtract.

The reference can be found here:

https://developer.arm.com/documentation/ddi0602/2022-09

   This patch also adds a  register operand to represent multiples of ZA multi-vectors.
They are:
        ZZ_s_mul_r, ZZ_d_mul_r, ZZZZ_s_mul_r and ZZZZ_d_mul_r
and represent the Zn or Zm times 2 or 4 according to the vector group.

Depends on: D135455

Differential Revision: https://reviews.llvm.org/D135468
2022-10-20 18:44:22 +01:00
Caroline Concatto
2ecbe8c38c [AArch64] SME2 Single-multi vector ternary int/FP 2 and 4 registers
This patch adds the assembly/disassembly for the following instructions:

For INT:
    ADD(array results, multiple and single vector): Add replicated single
        vector to multi-vector with ZA array vector results.
    SUB(array results, multiple and single vector): Subtract replicated single
        vector from multi-vector with ZA array vector results.
For FP:
    FMLA (multiple and single vector): Multi-vector floating-point fused
          multiply-add by vector.
    FMLS (multiple and single vector): Multi-vector floating-point
          multiply-subtract long by vector.
The reference can be found here:

https://developer.arm.com/documentation/ddi0602/2022-09

The Matriz Operand has 2 new sizes 32(.s) and 64(.d) bits
(MatrixOp32 and MatrixOp64)

Depends on: D135448

Depends on:  D135952

Differential Revision: https://reviews.llvm.org/D135455
2022-10-19 17:49:48 +01:00
Kazu Hirata
109df7f9a4 [llvm] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-13 12:55:42 -07:00
Fangrui Song
de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Simon Tatham
55f1fbf005 [MC,llvm-objdump,ARM] Target-dependent disassembly resync policy.
Currently, when llvm-objdump is disassembling a code section and
encounters a point where no instruction can be decoded, it uses the
same policy on all targets: consume one byte of the section, emit it
as "<unknown>", and try disassembling from the next byte position.

On an architecture where instructions are always 4 bytes long and
4-byte aligned, this makes no sense at all. If a 4-byte word cannot be
decoded as an instruction, then the next place that a valid
instruction could //possibly// be found is 4 bytes further on.
Disassembling from a misaligned address can't possibly produce
anything that the code generator intended, or that the CPU would even
attempt to execute.

This patch introduces a new MCDisassembler virtual method called
`suggestBytesToSkip`, which allows each target to choose its own
resynchronization policy. For Arm (as opposed to Thumb) and AArch64,
I've filled in the new method to return a fixed width of 4.

Thumb is a more interesting case, because the criterion for
identifying 2-byte and 4-byte instruction encodings is very simple,
and doesn't require the particular instruction to be recognized. So
`suggestBytesToSkip` is also passed an ArrayRef of the bytes in
question, so that it can take that into account. The new test case
shows Thumb disassembly skipping over two unrecognized instructions,
and identifying one as 2-byte and one as 4-byte.

For targets other than Arm and AArch64, this is NFC: the base class
implementation of `suggestBytesToSkip` still returns 1, so that the
existing behavior is unchanged. Other targets can fill in their own
implementations as they see fit; I haven't attempted to choose a new
behavior for each one myself.

I've updated all the call sites of `MCDisassembler::getInstruction` in
llvm-objdump, and also one in sancov, which was the only other place I
spotted the same idiom of `if (Size == 0) Size = 1` after a call to
`getInstruction`.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D130357
2022-07-26 09:35:30 +01:00
Maksim Panchenko
bed9efed71 [MCDisassembler] Disambiguate Size parameter in tryAddingSymbolicOperand()
MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter
to specify either the instruction size or the operand size depending on
the architecture. However, for proper symbolic disassembly on X86, we
need to know both sizes, as an instruction can have two operands, and
the instruction size cannot be reliably calculated based on the operand
offset and its size. Hence, split Size into OpSize and InstSize.

For X86, the new interface allows to fix a couple of issues:
  * Correctly adjust the value of PC-relative operands.
  * Set operand size to zero when the operand is specified implicitly.

Differential Revision: https://reviews.llvm.org/D126101
2022-05-25 13:44:32 -07:00
Caroline Concatto
8765ad42cd [AArch64][SME][NFC] Add implicit operands for SME instructions in the disassembly.
This patch simplifies the switch statement in getInstruction to add
implicit operands (register ZA and Immediate  equal to zero)
in the SME operands when disassembly.

The register ZA and the zero immediate  can be added by checking the operand
in MCInstDesc.

Differential Revision: https://reviews.llvm.org/D125534
2022-05-20 10:29:21 +01:00
Sheng
c644488a8b Rename MCFixedLenDisassembler.h as MCDecoderOps.h
The name `MCFixedLenDisassembler.h` is out of date after D120958.

Rename it as `MCDecoderOps.h` to reflect the change.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D124987
2022-05-15 08:44:58 +08:00
Maksim Panchenko
4ae9745af1 [Disassember][NFCI] Use strong type for instruction decoder
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D122245
2022-03-25 18:53:59 -07:00
Kazu Hirata
410480e32b Ensure newlines at the end of files (NFC) 2022-01-06 23:44:02 -08:00
Simon Tatham
e35a3f188f [AArch64] Adding "armv8.8-a" memcpy/memset support.
This family of instructions includes CPYF (copy forward), CPYB (copy
backward), SET (memset) and SETG (memset + initialise MTE tags), with
some sub-variants to indicate whether address translation is done in a
privileged or unprivileged way. For the copy instructions, you can
separately specify the read and write translations (so that kernels
can safely use these instructions in syscall handlers, to memcpy
between the calling process's user-space memory map and the kernel's
own privileged one).

The unusual thing about these instructions is that they write back to
multiple registers, because they perform an implementation-defined
amount of copying each time they run, and write back to _all_ the
address and size registers to indicate how much remains to be done
(and the code is expected to loop on them until the size register
becomes zero). But this is no problem in LLVM - you just define each
instruction to have multiple outputs, multiple inputs, and a set of
constraints tying their register numbers together appropriately.

This commit introduces a special subtarget feature called MOPS (after
the name the spec gives to the CPU id field), which is a dependency of
the top-level 8.8-A feature, and uses that to enable most of the new
instructions. The SETMG instructions also depend on MTE (and the test
checks that).

Differential Revision: https://reviews.llvm.org/D116157
2022-01-05 14:44:24 +00:00
Reid Kleckner
89b57061f7 Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.

This allows us to ensure that Support doesn't have includes from MC/*.

Differential Revision: https://reviews.llvm.org/D111454
2021-10-08 14:51:48 -07:00
Cullen Rhodes
42ba79b7b0 [AArch64][SME] Update tile slice index offset
Changes in architecture revision 00eac1:
  * Tile slice index offset no longer prefixed with '#'.
  * The syntax for 128-bit (.Q) ZA tile slice accesses must now include
    an explicit zero index.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-09

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D111212
2021-10-07 08:55:10 +00:00
Cullen Rhodes
dc5dd77ac7 [AArch64][SME] Support NEON vector to GPR integer moves in streaming mode
A small subset of the NEON instruction set is legal in streaming mode.
This patch adds support for the following vector to integer move
instructions:

  0x00 1110 0000 0001 0010 11xx xxxx xxxx # SMOV W|Xd,Vn.B[0]
  0x00 1110 0000 0010 0010 11xx xxxx xxxx # SMOV W|Xd,Vn.H[0]
  0100 1110 0000 0100 0010 11xx xxxx xxxx # SMOV Xd,Vn.S[0]
  0000 1110 0000 0001 0011 11xx xxxx xxxx # UMOV Wd,Vn.B[0]
  0000 1110 0000 0010 0011 11xx xxxx xxxx # UMOV Wd,Vn.H[0]
  0000 1110 0000 0100 0011 11xx xxxx xxxx # UMOV Wd,Vn.S[0]
  0100 1110 0000 1000 0011 11xx xxxx xxxx # UMOV Xd,Vn.D[0]

Only the zero index variants are legal, all others indexes are illegal.
To support this, new instructions are defined specifically for zero
index which is hardcoded, along an implicit 'VectorIndex0' operand.
Since the index operand is implicit and takes no bits in the encoding,
custom decoding is required to add the operand.

I'm not sure if this is the best approach but the predicate constraint
on a subset of an operand is unusual. Would be interested to hear some
alternatives.

The instructions are predicated on 'HasNEONorStreamingSVE', i.e. they're
enabled by either +neon or +streaming-sve. This follows on from the work
in D106272 to support the subset of SVE(2) instructions that are legal
in streaming mode.

Depends on D107902.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D107903
2021-09-03 07:59:17 +00:00
Cullen Rhodes
419deccfd1 [AArch64] NFC: Remove register decoder tables in disassembler
The register classes are generated by TableGen, use them instead of
handwritten tables.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D107763
2021-08-12 07:28:56 +00:00
Cullen Rhodes
1a18bb9270 [AArch64] NFC: Remove DecodeVectorRegisterClass from disassembler
The decoder function and table are the same as FPR128, use that instead.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D107644
2021-08-09 06:52:47 +00:00
Cullen Rhodes
08bc441174 [AArch64] NFC: drop unnecessary llvm:: namespace prefix on MCInst 2021-08-06 09:23:18 +00:00
Cullen Rhodes
2e27c4e1f1 [AArch64][SME] Add zero instruction
This patch adds the zero instruction for zeroing a list of 64-bit
element ZA tiles. The instruction takes a list of up to eight tiles
ZA0.D-ZA7.D, which must be in order, e.g.

  zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6.d,za7.d}
  zero {za1.d,za3.d,za5.d,za7.d}

The assembler also accepts 32-bit, 16-bit and 8-bit element tiles which
are mapped to corresponding 64-bit element tiles in accordance with the
architecturally defined mapping between different element size tiles,
e.g.

  * Zeroing ZA0.B, or the entire array name ZA, is equivalent to zeroing
    all eight 64-bit element tiles ZA0.D to ZA7.D.
  * Zeroing ZA0.S is equivalent to zeroing ZA0.D and ZA4.D.

The preferred disassembly of this instruction uses the shortest list of
tile names that represent the encoded immediate mask, e.g.

  * An immediate which encodes 64-bit element tiles ZA0.D, ZA1.D, ZA4.D and
    ZA5.D is disassembled as {ZA0.S, ZA1.S}.
  * An immediate which encodes 64-bit element tiles ZA0.D, ZA2.D, ZA4.D and
    ZA6.D is disassembled as {ZA0.H}.
  * An all-ones immediate is disassembled as {ZA}.
  * An all-zeros immediate is disassembled as an empty list {}.

This patch adds the MatrixTileList asm operand and related parsing to support
this.

Depends on D105570.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105575
2021-07-27 08:35:45 +00:00
Cullen Rhodes
2d80bbd939 [AArch64][SME] Add mova instructions
This patch adds the mova instruction to insert/extract an SVE vector
register to/from a ZA tile vector.

The preferred MOV aliases are also implemented.

Depends on D105572.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: david-arm, CarolineConcatto

Differential Revision: https://reviews.llvm.org/D105574
2021-07-21 08:20:01 +00:00
Cullen Rhodes
6c32cfe85c [AArch64][SME] Add ldr and str instructions
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: kmclaughlin

Differential Revision: https://reviews.llvm.org/D105573
2021-07-21 08:17:13 +00:00
Cullen Rhodes
15af3aaa2e [AArch64][SME] Add system registers and related instructions
This patch adds the new system registers introduced in SME:

  - ID_AA64SMFR0_EL1 (ro) SME feature identifier.
  - SMCR_ELx (r/w) streaming mode control register for configuring
    effective SVE Streaming SVE Vector length when the PE is in
    Streaming SVE mode.
  - SVCR (r/w) streaming vector control register, visible at all
    exception levels. Provides access to PSTATE.SM and PSTATE.ZA
    using MSR and MRS instructions.
  - SMPRI_EL1 (r/w) streaming mode execution priority register.
  - SMPRIMAP_EL2 (r/w) streaming mode priority mapping register.
  - SMIDR_EL1 (ro) streaming mode identification register.
  - TPIDR2_EL0 (r/w) for use by SME software to manage per-thread
    SME context.
  - MPAMSM_EL1 (r/w) MPAM (v8.4) streaming mode register, for
    labelling memory accesses performed in streaming mode.

Also added in this patch are the SME mode change instructions.
Three MSR immediate instructions are implemented to set or clear
PSTATE.SM, PSTATE.ZA, or both respectively:

  - MSR SVCRSM, #<imm1>
  - MSR SVCRZA, #<imm1>
  - MSR SVCRSMZA, #<imm1>

The following smstart/smstop aliases are also implemented for
convenience:

  smstart    -> MSR SVCRSMZA, #1
  smstart sm -> MSR SVCRSM,   #1
  smstart za -> MSR SVCRZA,   #1

  smstop     -> MSR SVCRSMZA, #0
  smstop sm  -> MSR SVCRSM,   #0
  smstop za  -> MSR SVCRZA,   #0

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105576
2021-07-20 08:06:26 +00:00
Cullen Rhodes
99eb96f031 [AArch64][SME] Add load and store instructions
This patch adds support for following contiguous load and store
instructions:

  * LD1B, LD1H, LD1W, LD1D, LD1Q
  * ST1B, ST1H, ST1W, ST1D, ST1Q

A new register class and operand is added for the 32-bit vector select
register W12-W15. The differences in the following tests which have been
re-generated are caused by the introduction of this register class:

  * llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll
  * llvm/test/CodeGen/AArch64/GlobalISel/regbank-inlineasm.mir
  * llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir
  * llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir

D88663 attempts to resolve the issue with the store pair test
differences in the AArch64 load/store optimizer.

The GlobalISel differences are caused by changes in the enum values of
register classes, tests have been updated with the new values.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: CarolineConcatto

Differential Revision: https://reviews.llvm.org/D105572
2021-07-16 10:11:10 +00:00
Cullen Rhodes
c08dabb0f4 [AArch64][SME] Add matrix register definitions and parsing support
SME introduces the ZA array, a new piece of architectural register state
consisting of a matrix of [SVLb x SVLb] bytes, where SVL is the
implementation defined Streaming SVE vector length and SVLb is the
number of 8-bit elements in a vector of SVL bits.

SME instructions consist of three types of matrix operands:

  * Tiles: a ZA tile is a square, two-dimensional sub-array of elements
  within the ZA array. These tiles make up the larger accumulator array
  and the granularity varies based on the element size, i.e.
    - ZAQ0..ZAQ15 (smallest tile granule)
    - ZAD0..ZAD7
    - ZAS0..ZAS3
    - ZAH0..ZAH1
    or ZAB0       (largest tile granule, single tile)
  * Tile vectors: similar to regular tiles, but have an extra 'h' or 'v'
  to tell how the vector at [reg+offset] is layed out in the tile,
  horizontally or vertically. E.g. za1h.h or za15v.q, which corresponds
  to vectors in registers ZAH1 and ZAQ15, respectively.
  * Accumulator matrix: this is the entire accumulator array ZA.

This patch adds the register classes and related operands and parsing
for SME instructions operating on the accumulator array.

The ADDHA and ADDVA instructions which operate on tiles are also added
in this patch to make some use of the code added, later patches will
make use of the other operands introduced here.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Co-authored by: Sander de Smalen (@sdesmalen)

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105570
2021-07-14 08:25:49 +00:00
Lucas Prates
313889191e [AArch64] Adding the v8.7-A LD64B/ST64B Accelerator extension
This adds support for the v8.7-A LD64B/ST64B Accelerator extension
through a subtarget feature called "ls64". It adds four 64-byte
load/store instructions with an operand in the new GPR64x8 register
class, and one system register that's part of the same extension.

Based on patches written by Simon Tatham.

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D91775
2020-12-17 13:46:23 +00:00
Lucas Prates
97c006aabb [AArch64] Add a GPR64x8 register class
This adds a GPR64x8 register class that will be needed as the data
operand to the LD64B/ST64B family of instructions in the v8.7-A
Accelerator Extension, which load or store a contiguous range of eight
x-regs. It has to be its own register class so that register allocation
will have visibility of the full set of registers actually read/written
by the instructions, which will be needed when we add intrinsics and/or
inline asm access to this piece of architecture.

Patch written by Simon Tatham.

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D91774
2020-12-17 13:45:46 +00:00
Lucas Prates
b5bbb4b2b7 [NFC][AArch64] Move AArch64 MSR/MRS into a new decoder namespace
This removes the general forms of the AArch64 MSR and MRS instructions
from the same decoding table that contains many more specific
instructions that supersede them. They're now in a separate decoding
table of their own, called "Fallback", which is only consulted in the
event of the main decoder table failing to produce an answer.

This should avoid decoding conflicts on future specialized instructions
in the MSR space.

Patch written by Simon Tatham.

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D91771
2020-12-17 13:40:10 +00:00
Victor Campos
da852b03b0 [AArch64] Emit warning when disassembling unpredictable LDRAA and LDRAB
Summary:
LDRAA and LDRAB in their writeback variant should softfail when the same
register is used as result and base.

This patch adds a custom decoder that catches such case and emits a
warning when it occurs.

Differential Revision: https://reviews.llvm.org/D82541
2020-06-25 15:56:36 +01:00
Tom Stellard
0dbcb36394 CMake: Make most target symbols hidden by default
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.

A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.

This patch reduces the number of public symbols in libLLVM.so by about
25%.  This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so

One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.

Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278

Reviewers: chandlerc, beanz, mgorny, rnk, hans

Reviewed By: rnk, hans

Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D54439
2020-01-14 19:46:52 -08:00
Fangrui Song
6fdd6a7b3f [Disassembler] Delete the VStream parameter of MCDisassembler::getInstruction()
The argument is llvm::null() everywhere except llvm::errs() in
llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no
target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds.

If we ever have the needs to add verbose log to disassemblers, we can
record log with a member function, instead of passing it around as an
argument.
2020-01-11 13:34:52 -08:00
Tom Stellard
4b0b26199b Revert CMake: Make most target symbols hidden by default
This reverts r362990 (git commit 374571301dc8e9bc9fdd1d70f86015de198673bd)

This was causing linker warnings on Darwin:

ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)'
from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol
'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&),
std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)'
means the weak symbol cannot be overridden at runtime. This was likely caused by different translation
units being compiled with different visibility settings.

llvm-svn: 363028
2019-06-11 03:21:13 +00:00
Tom Stellard
374571301d CMake: Make most target symbols hidden by default
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.

A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.

This patch reduces the number of public symbols in libLLVM.so by about
25%.  This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so

One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.

Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278

Reviewers: chandlerc, beanz, mgorny, rnk, hans

Reviewed By: rnk, hans

Subscribers: Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D54439

llvm-svn: 362990
2019-06-10 22:12:56 +00:00
Richard Trieu
b26592e04d [AArch64] Create a TargetInfo header. NFC
Move the declarations of getThe<Name>Target() functions into a new header in
TargetInfo and make users of these functions include this new header.
This fixes a layering problem.

llvm-svn: 360709
2019-05-14 21:33:53 +00:00
Tim Northover
ff6875acd9 AArch64: support binutils-like things on arm64_32.
This adds support for the arm64_32 watchOS ABI to LLVM's low level tools,
teaching them about the specific MachO choices and constants needed to
disassemble things.

llvm-svn: 360663
2019-05-14 11:25:44 +00:00
Simon Pilgrim
aa49be4926 Avoid cppcheck operator precedence warnings. NFCI.
Prefer ((X & Y) ? A : B) to (X & Y ? A : B)

llvm-svn: 359884
2019-05-03 13:50:38 +00:00