[AMDGPU][GFX90A][DOC][NFC] Update assembler syntax description

Summary of changes:
- Enable register tuples with 9, 10, 11 and 12 registers (https://reviews.llvm.org/D138205).
- Enable VOP3 variants of dot2c/dot4c/dot8c instructions (https://reviews.llvm.org/D138494).
- Enable omod modifiers for v_max3_f16, v_min3_f16, etc. (https://reviews.llvm.org/D139469).
- Enable abs and neg modifiers for v_cndmask_b32 (https://reviews.llvm.org/D135900).
- Correct v_mov_b32_sdwa (it does not support abs and neg input modifiers yet).
- Enable abs and neg modifiers for v_dot2c_f32_f16_dpp.
- Minor corrections and improvements.
This commit is contained in:
Dmitry Preobrazhensky 2022-12-13 14:18:20 +03:00
parent bcb457c68e
commit 0d0018e709
27 changed files with 1142 additions and 1139 deletions


@ -10,7 +10,7 @@
FX Operand
==========
This is an *f32* or *f16* operand depending on instruction modifiers:
This is a *f32* or *f16* operand depending on instruction modifiers:
* Operand size is controlled by :ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
* Location of 16-bit operand is controlled by :ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.
* Location of the 16-bit operand is controlled by :ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.


@ -24,27 +24,27 @@ The bits of this operand have the following meaning:
This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* An *hwreg* value described below.
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range from 0 to 0xFFFF.
* An *hwreg* value which is described below.
==================================== ============================================================================
==================================== ===============================================================================
Hwreg Value Syntax Description
==================================== ============================================================================
hwreg({0..63}) All bits of a register indicated by its *id*.
hwreg(<*name*>) All bits of a register indicated by its *name*.
hwreg({0..63}, {0..31}, {1..32}) Register bits indicated by register *id*, first bit *offset* and *size*.
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
==================================== ============================================================================
==================================== ===============================================================================
hwreg({0..63}) All bits of a register indicated by the register *id*.
hwreg(<*name*>) All bits of a register indicated by the register *name*.
hwreg({0..63}, {0..31}, {1..32}) Register bits indicated by the register *id*, first bit *offset* and *size*.
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by the register *name*, first bit *offset* and *size*.
==================================== ===============================================================================
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
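
As a rough illustration of how the four *hwreg* forms in the table above relate to one 16-bit value, the sketch below packs *id*, *offset* and *size* into an integer. The bit layout used here (id in bits 5:0, offset in bits 10:6, size minus one in bits 15:11) is an assumption, since the bit table is outside this hunk; the function name ``encode_hwreg`` is hypothetical and is not part of the assembler.

```python
def encode_hwreg(reg_id, offset=0, size=32):
    """Pack hwreg(id, offset, size) into a 16-bit immediate.

    Assumed layout (not shown in the hunk above):
      bits 5:0   register id   (0..63)
      bits 10:6  first bit     (0..31)
      bits 15:11 size minus 1  (size is 1..32)
    Omitting offset and size selects all 32 bits, as in hwreg(<id>).
    """
    if not 0 <= reg_id <= 63:
        raise ValueError("register id must be in the range from 0 to 63")
    if not 0 <= offset <= 31:
        raise ValueError("offset must be in the range from 0 to 31")
    if not 1 <= size <= 32:
        raise ValueError("size must be in the range from 1 to 32")
    return reg_id | (offset << 6) | ((size - 1) << 11)


if __name__ == "__main__":
    # hwreg(1): all bits of the register with id 1.
    print(hex(encode_hwreg(1)))
    # hwreg(2, 4, 1): a single bit at offset 4 of the register with id 2.
    print(hex(encode_hwreg(2, 4, 1)))
```

Under this assumed layout every result fits in the range from 0 to 0xFFFF, which matches the range given for the plain integer form of the operand.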
Defined register *names* include:
Predefined register *names* include:
============================== ==========================================
Name Description
============================== ==========================================
HW_REG_MODE Shader writeable mode bits.
HW_REG_MODE Shader writable mode bits.
HW_REG_STATUS Shader read-only status.
HW_REG_TRAPSTS Trap status.
HW_REG_HW_ID Id of wave, simd, compute unit, etc.


@ -12,7 +12,7 @@ imask
This operand is a mask which controls indexing mode for operands of subsequent instructions.
Bits 0, 1 and 2 control indexing of *src0*, *src1* and *src2*, while bit 3 controls indexing of *dst*.
Value 1 enables indexing and value 0 disables it.
Value 1 enables indexing, and value 0 disables it.
===== ========================================
Bit Meaning
@ -25,31 +25,31 @@ Value 1 enables indexing and value 0 disables it.
This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..15.
* A *gpr_idx* value described below.
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range from 0 to 15.
* A *gpr_idx* value which is described below.
==================================== ===========================================
==================================== =============================================
Gpr_idx Value Syntax Description
==================================== ===========================================
gpr_idx(*<operands>*) Enable indexing for specified *operands*
==================================== =============================================
gpr_idx(*<operand list>*) Enable indexing for the specified *operands*
and disable it for the rest.
*Operands* is a comma-separated list of
*Operand list* is a comma-separated list of
values which may include:
* "SRC0" - enable *src0* indexing.
* SRC0 - enable *src0* indexing.
* "SRC1" - enable *src1* indexing.
* SRC1 - enable *src1* indexing.
* "SRC2" - enable *src2* indexing.
* SRC2 - enable *src2* indexing.
* "DST" - enable *dst* indexing.
* DST - enable *dst* indexing.
Each of these values may be specified only
once.
*Operands* list may be empty; this syntax
*Operand list* may be empty; this syntax
disables indexing for all operands.
==================================== ===========================================
==================================== =============================================
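
The mapping from a *gpr_idx* operand list to the 4-bit mask can be sketched as follows, using the bit assignment stated above (bits 0, 1 and 2 for *src0*, *src1* and *src2*, bit 3 for *dst*). This is an illustration only; the helper name ``gpr_idx_mask`` is hypothetical and does not describe how the assembler is implemented.

```python
# Bit positions as described in the text: SRC0..SRC2 use bits 0..2, DST uses bit 3.
GPR_IDX_BITS = {"SRC0": 0, "SRC1": 1, "SRC2": 2, "DST": 3}


def gpr_idx_mask(*operands):
    """Return the imask value for gpr_idx(<operand list>).

    Listed operands get indexing enabled (bit = 1); all others stay disabled.
    An empty list yields 0, which disables indexing for all operands.
    """
    mask = 0
    for name in operands:
        if name not in GPR_IDX_BITS:
            raise ValueError("unknown operand: " + name)
        bit = 1 << GPR_IDX_BITS[name]
        if mask & bit:
            # Each value may be specified only once.
            raise ValueError("operand specified twice: " + name)
        mask |= bit
    return mask


if __name__ == "__main__":
    print(gpr_idx_mask())               # empty list
    print(gpr_idx_mask("SRC0", "DST"))  # bits 0 and 3
```

The result always lies in the range from 0 to 15, the same range accepted for the plain integer form of this operand.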
Examples:


@ -5,9 +5,9 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_imm16_a04fb3:
.. _amdgpu_synid_gfx90a_imm16_0533c2:
imm16
=====
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..65535.
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range from -32768 to 65535.


@ -5,9 +5,9 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_imm16_73139a:
.. _amdgpu_synid_gfx90a_imm16_169952:
imm16
=====
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range from 0 to 65535.


@ -10,11 +10,11 @@
label
=====
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
A branch target, which is a 16-bit signed integer treated as a PC-relative dword offset.
This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range from -32768 to 65535.
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
Examples:


@ -5,9 +5,9 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_m_254bcb:
.. _amdgpu_synid_gfx90a_m_28b494:
m
=
This operand may be used with integer operand modifier :ref:`sext<amdgpu_synid_sext>`.
This operand may be used with an integer operand modifier :ref:`sext<amdgpu_synid_sext>`.


@ -5,9 +5,9 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_m_f5d306:
.. _amdgpu_synid_gfx90a_m_c141fc:
m
=
This operand may be used with floating point operand modifiers :ref:`abs<amdgpu_synid_abs>` and :ref:`neg<amdgpu_synid_neg>`.
This operand may be used with floating-point operand modifiers :ref:`abs<amdgpu_synid_abs>` and :ref:`neg<amdgpu_synid_neg>`.


@ -24,8 +24,8 @@ A 16-bit message code. The bits of this operand have the following meaning:
This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* A *sendmsg* value described below.
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range from 0 to 0xFFFF.
* A *sendmsg* value which is described below.
==================================== ====================================================
Sendmsg Value Syntax Description
@ -40,7 +40,7 @@ This operand may be specified as one of the following:
*Op* may be specified using operation *name* or operation *id*.
Stream *id* is an integer in the range 0..3.
Stream *id* is an integer in the range from 0 to 3.
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
@ -73,7 +73,7 @@ Each message type supports specific operations:
*Sendmsg* arguments are validated depending on how *type* value is specified:
* If message *type* is specified by name, arguments values must satisfy limitations detailed in the table above.
* If message *type* is specified as a number, each argument must not exceed corresponding value range (see the first table).
* If message *type* is specified as a number, each argument must not exceed the corresponding value range (see the first table).
Examples:


@ -5,12 +5,12 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_sbase_010ce0:
.. _amdgpu_synid_gfx90a_sbase_b0aa25:
sbase
=====
A 128-bit buffer resource constant for scalar memory operations which provides a base address, a size and a stride.
A 128-bit buffer resource constant for scalar memory operations which provides a base address, a size, and a stride.
*Size:* 4 dwords.


@ -5,14 +5,14 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_sdata_eb6f2a:
.. _amdgpu_synid_gfx90a_sdata_45d924:
sdata
=====
Input data for an atomic instruction.
Optionally may serve as an output data:
Optionally, this operand may be used to store output data:
* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.


@ -5,14 +5,14 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_sdata_aefe00:
.. _amdgpu_synid_gfx90a_sdata_ba98a3:
sdata
=====
Input data for an atomic instruction.
Optionally may serve as an output data:
Optionally, this operand may be used to store output data:
* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.


@ -5,14 +5,14 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_sdata_c6aec1:
.. _amdgpu_synid_gfx90a_sdata_c1aec6:
sdata
=====
Input data for an atomic instruction.
Optionally may serve as an output data:
Optionally, this operand may be used to store output data:
* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.


@ -5,12 +5,12 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_soffset_ba92ce:
.. _amdgpu_synid_gfx90a_soffset_02ec85:
soffset
=======
An unsigned offset from the base address. My be specified as either a register or a 20-bit immediate.
An unsigned offset from the base address. May be specified as either a register or a 20-bit immediate.
Note that an *immediate* offset may be specified using either :ref:`uimm20<amdgpu_synid_uimm20>` operand or :ref:`offset20u<amdgpu_synid_smem_offset20u>` modifier, but not both.


@ -5,12 +5,12 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_srsrc_e73d16:
.. _amdgpu_synid_gfx90a_srsrc_80eef6:
srsrc
=====
Buffer resource constant which defines the address and characteristics of the buffer in memory.
Buffer resource constant, which defines the address and characteristics of the buffer in memory.
*Size:* 4 dwords.


@ -10,4 +10,4 @@
Type Deviation
==============
*Type* of this operand differs from *type* :ref:`implied by the opcode<amdgpu_syn_instruction_mnemo>`. This tag specifies actual operand *type*.
The *type* of this operand differs from the *type* :ref:`implied by the opcode<amdgpu_syn_instruction_mnemo>`. This tag specifies the actual operand *type*.


@ -5,17 +5,15 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_vaddr_5d0b42:
.. _amdgpu_synid_gfx90a_vaddr_cc213c:
vaddr
=====
Image address which includes from one to four dimensional coordinates and other data used to locate a position in the image.
*Size:* 1, 2, 3, 4, 8 or 16 dwords. Actual size depends on opcode, specific image being handled and :ref:`a16<amdgpu_synid_a16>`.
*Size:* 1-12 dwords. Actual size depends on opcode, specific image being handled and :ref:`a16<amdgpu_synid_a16>`.
Note 1. Image format and dimensions are encoded in the image resource constant but not in the instruction.
Note 2. Actually image address size may vary from 1 to 13 dwords, but assembler currently supports a limited range of register sequences.
Note. Image format and dimensions are encoded in the image resource constant, but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>`


@ -5,14 +5,14 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_vdata_8e9b87:
.. _amdgpu_synid_gfx90a_vdata_0c567e:
vdata
=====
Input data for an atomic instruction.
Optionally may serve as an output data:
Optionally, this operand may be used to store output data:
* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.


@ -5,14 +5,14 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_vdata_af2725:
.. _amdgpu_synid_gfx90a_vdata_898c08:
vdata
=====
Input data for an atomic instruction.
Optionally may serve as an output data:
Optionally, this operand may be used to store output data:
* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
@ -21,6 +21,6 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
Note: the surface data format is indicated in the image resource constant but not in the instruction.
Note: the surface data format is indicated in the image resource constant, but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`a<amdgpu_synid_a>`


@ -5,14 +5,14 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_vdata_ca6e5f:
.. _amdgpu_synid_gfx90a_vdata_999247:
vdata
=====
Input data for an atomic instruction.
Optionally may serve as an output data:
Optionally, this operand may be used to store output data:
* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
@ -21,6 +21,6 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
Note: the surface data format is indicated in the image resource constant but not in the instruction.
Note: the surface data format is indicated in the image resource constant, but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`a<amdgpu_synid_a>`


@ -5,14 +5,14 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_vdata_2d0375:
.. _amdgpu_synid_gfx90a_vdata_ae1132:
vdata
=====
Input data for an atomic instruction.
Optionally may serve as an output data:
Optionally, this operand may be used to store output data:
* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.


@ -5,14 +5,14 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_vdata_2a60db:
.. _amdgpu_synid_gfx90a_vdata_bbcfbb:
vdata
=====
Input data for an atomic instruction.
Optionally may serve as an output data:
Optionally, this operand may be used to store output data:
* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.


@ -5,7 +5,7 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_vdata_a5f23e:
.. _amdgpu_synid_gfx90a_vdata_cbb01e:
vdata
=====
@ -14,7 +14,7 @@ Image data to store by an *image_store* instruction.
*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`d16<amdgpu_synid_d16>`:
* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies either 32 bits or 16 bits depending on :ref:`d16<amdgpu_synid_d16>`.
* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies either 32 bits or 16 bits, depending on :ref:`d16<amdgpu_synid_d16>`.
* :ref:`d16<amdgpu_synid_d16>` specifies that data in registers are packed; each value occupies 16 bits.
*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`a<amdgpu_synid_a>`


@ -5,16 +5,16 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_vdst_7c9848:
.. _amdgpu_synid_gfx90a_vdst_a9ee3f:
vdst
====
Image data to load by an image instruction.
Image data to be loaded by an image instruction.
*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`d16<amdgpu_synid_d16>`:
* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies either 32 bits or 16 bits depending on :ref:`d16<amdgpu_synid_d16>`.
* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies either 32 bits or 16 bits, depending on :ref:`d16<amdgpu_synid_d16>`.
* :ref:`d16<amdgpu_synid_d16>` specifies that data elements in registers are packed; each value occupies 16 bits.


@ -5,12 +5,12 @@
* *
**************************************************
.. _amdgpu_synid_gfx90a_vdst_f47b9b:
.. _amdgpu_synid_gfx90a_vdst_f5eb9d:
vdst
====
Image data to load by an image instruction.
Image data to be loaded by an image instruction.
*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>`:


@ -24,7 +24,7 @@ The bits of this operand have the following meaning:
This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range from 0 to 0xFFFF.
* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below.
====================== ======================================================================
@ -38,7 +38,8 @@ This operand may be specified as one of the following:
lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
====================== ======================================================================
These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators.
These values may be specified in any order. Spaces, ampersands, and commas may be used as optional separators.
If some values are omitted, the corresponding fields will default to their maximum value.
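
The defaulting and saturation rules above can be illustrated with a small sketch. The maximum counter values used here (63, 7 and 15) are assumptions chosen for illustration and are not taken from this text, and the helper ``resolve_waitcnt`` is hypothetical. It only resolves field values and does not pack them into the 16-bit operand.

```python
# Assumed largest values per counter; illustrative only, not stated in the text above.
MAX_COUNT = {"vmcnt": 63, "expcnt": 7, "lgkmcnt": 15}


def resolve_waitcnt(**given):
    """Resolve counter fields from values such as vmcnt=0 or lgkmcnt_sat=100.

    Omitted fields default to their maximum value. A "<name>_sat" value is
    clamped to min(N, largest value); a plain value must already be in range.
    """
    fields = dict(MAX_COUNT)  # every field starts at its maximum
    for name, n in given.items():
        saturate = name.endswith("_sat")
        base = name[:-4] if saturate else name
        if base not in MAX_COUNT:
            raise ValueError("unknown counter: " + name)
        if n < 0:
            raise ValueError("counter value must not be negative")
        if saturate:
            n = min(n, MAX_COUNT[base])
        elif n > MAX_COUNT[base]:
            raise ValueError("value too large for " + base)
        fields[base] = n
    return fields


if __name__ == "__main__":
    print(resolve_waitcnt(vmcnt=0))                    # other fields stay at maximum
    print(resolve_waitcnt(lgkmcnt_sat=100, expcnt=1))  # lgkmcnt is clamped
```

Keyword arguments may be passed in any order, mirroring the statement that the values may be specified in any order.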
*N* is either an
:ref:`integer number<amdgpu_synid_integer_number>` or an