[AMDGPU][docs] Replace gfx940 and gfx941 with gfx942 in llvm/docs (#126887)
gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base.

This PR removes all documentation occurrences of gfx940/gfx941 except for the gfx940 ISA description, which will be the subject of a separate PR.

For SWDEV-512631
parent 2260d59257
commit db597084c5
@ -63,7 +63,7 @@ Note: *N* and *K* must satisfy the following conditions:
|
|||||||
* 0 <= *K* <= 255.
|
* 0 <= *K* <= 255.
|
||||||
* *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
|
* *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
|
||||||
|
|
||||||
GFX90A and GFX940 have an additional alignment requirement:
|
GFX90A and GFX942 have an additional alignment requirement:
|
||||||
pairs of *vector* registers must be even-aligned
|
pairs of *vector* registers must be even-aligned
|
||||||
(first register must be even).
|
(first register must be even).
|
||||||
|
|
||||||
@ -183,7 +183,7 @@ Note: *N* and *K* must satisfy the following conditions:
|
|||||||
* 0 <= *K* <= 255.
|
* 0 <= *K* <= 255.
|
||||||
* *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
|
* *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
|
||||||
|
|
||||||
GFX90A and GFX940 have an additional alignment requirement:
|
GFX90A and GFX942 have an additional alignment requirement:
|
||||||
pairs of *accumulator* registers must be even-aligned
|
pairs of *accumulator* registers must be even-aligned
|
||||||
(first register must be even).
|
(first register must be even).
|
||||||
|
|
||||||
|
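The register-range conditions touched by the two hunks above can be sketched as a small check. This is a minimal illustration, not code from LLVM: the function name `is_valid_register_range` and the `target` strings are hypothetical, the `N <= K` ordering is an assumption (the hunks show only part of the condition list), and the even-alignment rule is applied only to two-register ranges, following the wording "pairs".

```python
# Targets that, per the hunks above, require even-aligned register pairs.
EVEN_ALIGNED_TARGETS = {"gfx90a", "gfx942"}


def is_valid_register_range(n, k, target="gfx942"):
    """Check a register range N..K against the conditions quoted in the diff.

    Assumption: N <= K and N >= 0, which the visible hunk lines do not state.
    """
    if not (0 <= n <= k <= 255):
        return False
    count = k - n + 1
    # K-N+1 must be in the range 1..12, or equal to 16 or 32.
    if not (1 <= count <= 12 or count in (16, 32)):
        return False
    # GFX90A and GFX942: pairs must start at an even register.
    if target in EVEN_ALIGNED_TARGETS and count == 2 and n % 2 != 0:
        return False
    return True
```

Under these assumptions, N=0, K=1 is accepted for `gfx942`, while N=1, K=2 is rejected there and accepted for a target without the alignment rule.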
@ -323,7 +323,7 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
|
|||||||
Add product
|
Add product
|
||||||
names.
|
names.
|
||||||
|
|
||||||
**GCN GFX9 (Vega)** [AMD-GCN-GFX900-GFX904-VEGA]_ [AMD-GCN-GFX906-VEGA7NM]_ [AMD-GCN-GFX908-CDNA1]_ [AMD-GCN-GFX90A-CDNA2]_ [AMD-GCN-GFX940-GFX942-CDNA3]_
|
**GCN GFX9 (Vega)** [AMD-GCN-GFX900-GFX904-VEGA]_ [AMD-GCN-GFX906-VEGA7NM]_ [AMD-GCN-GFX908-CDNA1]_ [AMD-GCN-GFX90A-CDNA2]_ [AMD-GCN-GFX942-CDNA3]_
|
||||||
-----------------------------------------------------------------------------------------------------------------------
|
-----------------------------------------------------------------------------------------------------------------------
|
||||||
``gfx900`` ``amdgcn`` dGPU - xnack - Absolute - *rocm-amdhsa* - Radeon Vega
|
``gfx900`` ``amdgcn`` dGPU - xnack - Absolute - *rocm-amdhsa* - Radeon Vega
|
||||||
flat - *pal-amdhsa* Frontier Edition
|
flat - *pal-amdhsa* Frontier Edition
|
||||||
@ -378,20 +378,6 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
|
|||||||
- Ryzen 3 Pro 4350G
|
- Ryzen 3 Pro 4350G
|
||||||
- Ryzen 3 Pro 4350GE
|
- Ryzen 3 Pro 4350GE
|
||||||
|
|
||||||
``gfx940`` ``amdgcn`` dGPU - sramecc - Architected *TBA*
|
|
||||||
- tgsplit flat
|
|
||||||
- xnack scratch .. TODO::
|
|
||||||
- kernarg preload - Packed
|
|
||||||
work-item Add product
|
|
||||||
IDs names.
|
|
||||||
|
|
||||||
``gfx941`` ``amdgcn`` dGPU - sramecc - Architected *TBA*
|
|
||||||
- tgsplit flat
|
|
||||||
- xnack scratch .. TODO::
|
|
||||||
- kernarg preload - Packed
|
|
||||||
work-item Add product
|
|
||||||
IDs names.
|
|
||||||
|
|
||||||
``gfx942`` ``amdgcn`` dGPU - sramecc - Architected - AMD Instinct MI300X
|
``gfx942`` ``amdgcn`` dGPU - sramecc - Architected - AMD Instinct MI300X
|
||||||
- tgsplit flat - AMD Instinct MI300A
|
- tgsplit flat - AMD Instinct MI300A
|
||||||
- xnack scratch
|
- xnack scratch
|
||||||
@ -583,10 +569,10 @@ Generic processor code objects are versioned. See :ref:`amdgpu-generic-processor
|
|||||||
- ``v_dot2_f32_f16``
|
- ``v_dot2_f32_f16``
|
||||||
|
|
||||||
|
|
||||||
``gfx9-4-generic`` ``amdgcn`` - ``gfx940`` - sramecc - Architected FP8 and BF8 instructions,
|
``gfx9-4-generic`` ``amdgcn`` - ``gfx942`` - sramecc - Architected FP8 and BF8 instructions,
|
||||||
- ``gfx941`` - tgsplit flat scratch FP8 and BF8 conversion
|
- ``gfx950`` - tgsplit flat scratch FP8 and BF8 conversion
|
||||||
- ``gfx942`` - xnack - Packed instructions, as well as
|
- xnack - Packed instructions, as well as
|
||||||
- ``gfx950`` - kernarg preload work-item instructions with XF32 format
|
- kernarg preload work-item instructions with XF32 format
|
||||||
IDs support are not available.
|
IDs support are not available.
|
||||||
|
|
||||||
``gfx10-1-generic`` ``amdgcn`` - ``gfx1010`` - xnack - Absolute flat - The following instructions are
|
``gfx10-1-generic`` ``amdgcn`` - ``gfx1010`` - xnack - Absolute flat - The following instructions are
|
||||||
@ -4985,7 +4971,7 @@ The fields used by CP for code objects before V3 also match those specified in
|
|||||||
bytes
|
bytes
|
||||||
383:352 4 bytes COMPUTE_PGM_RSRC3 GFX6-GFX9
|
383:352 4 bytes COMPUTE_PGM_RSRC3 GFX6-GFX9
|
||||||
Reserved, must be 0.
|
Reserved, must be 0.
|
||||||
GFX90A, GFX940
|
GFX90A, GFX942
|
||||||
Compute Shader (CS)
|
Compute Shader (CS)
|
||||||
program settings used by
|
program settings used by
|
||||||
CP to set up
|
CP to set up
|
||||||
@ -5070,7 +5056,7 @@ The fields used by CP for code objects before V3 also match those specified in
|
|||||||
463:460 4 bits Reserved, must be 0.
|
463:460 4 bits Reserved, must be 0.
|
||||||
470:464 7 bits KERNARG_PRELOAD_SPEC_LENGTH GFX6-GFX9
|
470:464 7 bits KERNARG_PRELOAD_SPEC_LENGTH GFX6-GFX9
|
||||||
- Reserved, must be 0.
|
- Reserved, must be 0.
|
||||||
GFX90A, GFX940
|
GFX90A, GFX942
|
||||||
- The number of dwords from
|
- The number of dwords from
|
||||||
the kernarg segment to preload
|
the kernarg segment to preload
|
||||||
into User SGPRs before kernel
|
into User SGPRs before kernel
|
||||||
@ -5078,7 +5064,7 @@ The fields used by CP for code objects before V3 also match those specified in
|
|||||||
:ref:`amdgpu-amdhsa-kernarg-preload`).
|
:ref:`amdgpu-amdhsa-kernarg-preload`).
|
||||||
479:471 9 bits KERNARG_PRELOAD_SPEC_OFFSET GFX6-GFX9
|
479:471 9 bits KERNARG_PRELOAD_SPEC_OFFSET GFX6-GFX9
|
||||||
- Reserved, must be 0.
|
- Reserved, must be 0.
|
||||||
GFX90A, GFX940
|
GFX90A, GFX942
|
||||||
- An offset in dwords into the
|
- An offset in dwords into the
|
||||||
kernarg segment to begin
|
kernarg segment to begin
|
||||||
preloading data into User
|
preloading data into User
|
||||||
@ -5104,7 +5090,7 @@ The fields used by CP for code objects before V3 also match those specified in
|
|||||||
GFX6-GFX9
|
GFX6-GFX9
|
||||||
- vgprs_used 0..256
|
- vgprs_used 0..256
|
||||||
- max(0, ceil(vgprs_used / 4) - 1)
|
- max(0, ceil(vgprs_used / 4) - 1)
|
||||||
GFX90A, GFX940
|
GFX90A, GFX942
|
||||||
- vgprs_used 0..512
|
- vgprs_used 0..512
|
||||||
- vgprs_used = align(arch_vgprs, 4)
|
- vgprs_used = align(arch_vgprs, 4)
|
||||||
+ acc_vgprs
|
+ acc_vgprs
|
||||||
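The VGPR arithmetic quoted in the hunk above can be worked through with a short sketch. The helper names are hypothetical and chosen for illustration; the hunk does not show how `vgprs_used` is encoded into a granule count on GFX90A and GFX942, so the sketch stops at computing `vgprs_used` for those targets rather than guessing the rest.

```python
import math


def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def granulated_vgpr_count_gfx6_gfx9(vgprs_used):
    """GFX6-GFX9 formula from the hunk: max(0, ceil(vgprs_used / 4) - 1)."""
    if not 0 <= vgprs_used <= 256:
        raise ValueError("vgprs_used must be in 0..256 on GFX6-GFX9")
    return max(0, math.ceil(vgprs_used / 4) - 1)


def vgprs_used_gfx90a_gfx942(arch_vgprs, acc_vgprs):
    """GFX90A/GFX942 formula from the hunk: align(arch_vgprs, 4) + acc_vgprs."""
    used = align(arch_vgprs, 4) + acc_vgprs
    if not 0 <= used <= 512:
        raise ValueError("vgprs_used must be in 0..512 on GFX90A/GFX942")
    return used
```

For example, 10 architected VGPRs and 6 accumulator VGPRs give `align(10, 4) + 6 = 18` registers in use on GFX90A or GFX942, and 5 VGPRs on GFX6-GFX9 give a granulated count of 1.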
@ -5570,7 +5556,7 @@ The fields used by CP for code objects before V3 also match those specified in
|
|||||||
|
|
||||||
..
|
..
|
||||||
|
|
||||||
.. table:: compute_pgm_rsrc3 for GFX90A, GFX940
|
.. table:: compute_pgm_rsrc3 for GFX90A, GFX942
|
||||||
:name: amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table
|
:name: amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table
|
||||||
|
|
||||||
======= ======= =============================== ===========================================================================
|
======= ======= =============================== ===========================================================================
|
||||||
@ -9981,15 +9967,15 @@ only accessed by a single thread, and is always write-before-read, there is
|
|||||||
never a need to invalidate these entries from the L1 cache. Hence all cache
|
never a need to invalidate these entries from the L1 cache. Hence all cache
|
||||||
invalidates are done as ``*_vol`` to only invalidate the volatile cache lines.
|
invalidates are done as ``*_vol`` to only invalidate the volatile cache lines.
|
||||||
|
|
||||||
The code sequences used to implement the memory model for GFX940, GFX941, GFX942
|
The code sequences used to implement the memory model for GFX942 are defined in
|
||||||
are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx941-gfx942-table`.
|
table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx942-table`.
|
||||||
|
|
||||||
.. table:: AMDHSA Memory Model Code Sequences GFX940, GFX941, GFX942
|
.. table:: AMDHSA Memory Model Code Sequences GFX942
|
||||||
:name: amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx941-gfx942-table
|
:name: amdgpu-amdhsa-memory-model-code-sequences-gfx942-table
|
||||||
|
|
||||||
============ ============ ============== ========== ================================
|
============ ============ ============== ========== ================================
|
||||||
LLVM Instr LLVM Memory LLVM Memory AMDGPU AMDGPU Machine Code
|
LLVM Instr LLVM Memory LLVM Memory AMDGPU AMDGPU Machine Code
|
||||||
Ordering Sync Scope Address GFX940, GFX941, GFX942
|
Ordering Sync Scope Address GFX942
|
||||||
Space
|
Space
|
||||||
============ ============ ============== ========== ================================
|
============ ============ ============== ========== ================================
|
||||||
**Non-Atomic**
|
**Non-Atomic**
|
||||||
@ -10024,18 +10010,12 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
|
|||||||
load *none* *none* - local 1. ds_load
|
load *none* *none* - local 1. ds_load
|
||||||
store *none* *none* - global - !volatile & !nontemporal
|
store *none* *none* - global - !volatile & !nontemporal
|
||||||
- generic
|
- generic
|
||||||
- private 1. GFX940, GFX941
|
- private 1. GFX942
|
||||||
- constant buffer/global/flat_store
|
- constant buffer/global/flat_store
|
||||||
sc0=1 sc1=1
|
|
||||||
GFX942
|
|
||||||
buffer/global/flat_store
|
|
||||||
|
|
||||||
- !volatile & nontemporal
|
- !volatile & nontemporal
|
||||||
|
|
||||||
1. GFX940, GFX941
|
1. GFX942
|
||||||
buffer/global/flat_store
|
|
||||||
nt=1 sc0=1 sc1=1
|
|
||||||
GFX942
|
|
||||||
buffer/global/flat_store
|
buffer/global/flat_store
|
||||||
nt=1
|
nt=1
|
||||||
|
|
||||||
@ -10707,11 +10687,8 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
|
|||||||
|
|
||||||
**Release Atomic**
|
**Release Atomic**
|
||||||
------------------------------------------------------------------------------------
|
------------------------------------------------------------------------------------
|
||||||
store atomic release - singlethread - global 1. GFX940, GFX941
|
store atomic release - singlethread - global 1. GFX942
|
||||||
- wavefront - generic buffer/global/flat_store
|
- wavefront - generic buffer/global/flat_store
|
||||||
sc0=1 sc1=1
|
|
||||||
GFX942
|
|
||||||
buffer/global/flat_store
|
|
||||||
|
|
||||||
store atomic release - singlethread - local *If TgSplit execution mode,
|
store atomic release - singlethread - local *If TgSplit execution mode,
|
||||||
- wavefront local address space cannot
|
- wavefront local address space cannot
|
||||||
@ -10749,10 +10726,7 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
|
|||||||
store that is being
|
store that is being
|
||||||
released.
|
released.
|
||||||
|
|
||||||
2. GFX940, GFX941
|
2. GFX942
|
||||||
buffer/global/flat_store
|
|
||||||
sc0=1 sc1=1
|
|
||||||
GFX942
|
|
||||||
buffer/global/flat_store
|
buffer/global/flat_store
|
||||||
sc0=1
|
sc0=1
|
||||||
store atomic release - workgroup - local *If TgSplit execution mode,
|
store atomic release - workgroup - local *If TgSplit execution mode,
|
||||||
@ -10813,10 +10787,7 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
|
|||||||
store that is being
|
store that is being
|
||||||
released.
|
released.
|
||||||
|
|
||||||
3. GFX940, GFX941
|
3. GFX942
|
||||||
buffer/global/flat_store
|
|
||||||
sc0=1 sc1=1
|
|
||||||
GFX942
|
|
||||||
buffer/global/flat_store
|
buffer/global/flat_store
|
||||||
sc1=1
|
sc1=1
|
||||||
store atomic release - system - global 1. buffer_wbl2 sc0=1 sc1=1
|
store atomic release - system - global 1. buffer_wbl2 sc0=1 sc1=1
|
||||||
@ -17574,11 +17545,7 @@ in this description.
|
|||||||
|
|
||||||
CDNA 2 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx90a<AMDGPU/AMDGPUAsmGFX90a>`
|
CDNA 2 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx90a<AMDGPU/AMDGPUAsmGFX90a>`
|
||||||
|
|
||||||
CDNA 3 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx940<AMDGPU/AMDGPUAsmGFX940>`
|
CDNA 3 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx942<AMDGPU/AMDGPUAsmGFX940>`
|
||||||
|
|
||||||
:doc:`gfx941<AMDGPU/AMDGPUAsmGFX940>`
|
|
||||||
|
|
||||||
:doc:`gfx942<AMDGPU/AMDGPUAsmGFX940>`
|
|
||||||
|
|
||||||
RDNA 1 :doc:`GFX10 RDNA1<AMDGPU/AMDGPUAsmGFX10>` :doc:`gfx1010<AMDGPU/AMDGPUAsmGFX10>`
|
RDNA 1 :doc:`GFX10 RDNA1<AMDGPU/AMDGPUAsmGFX10>` :doc:`gfx1010<AMDGPU/AMDGPUAsmGFX10>`
|
||||||
|
|
||||||
@ -17616,7 +17583,7 @@ combinations of operands, refer to one of instruction set architecture manuals
|
|||||||
[AMD-GCN-GFX6]_, [AMD-GCN-GFX7]_, [AMD-GCN-GFX8]_,
|
[AMD-GCN-GFX6]_, [AMD-GCN-GFX7]_, [AMD-GCN-GFX8]_,
|
||||||
[AMD-GCN-GFX900-GFX904-VEGA]_, [AMD-GCN-GFX906-VEGA7NM]_,
|
[AMD-GCN-GFX900-GFX904-VEGA]_, [AMD-GCN-GFX906-VEGA7NM]_,
|
||||||
[AMD-GCN-GFX908-CDNA1]_, [AMD-GCN-GFX90A-CDNA2]_,
|
[AMD-GCN-GFX908-CDNA1]_, [AMD-GCN-GFX90A-CDNA2]_,
|
||||||
[AMD-GCN-GFX940-GFX942-CDNA3]_, [AMD-GCN-GFX10-RDNA1]_, [AMD-GCN-GFX10-RDNA2]_,
|
[AMD-GCN-GFX942-CDNA3]_, [AMD-GCN-GFX10-RDNA1]_, [AMD-GCN-GFX10-RDNA2]_,
|
||||||
[AMD-GCN-GFX11-RDNA3]_ and [AMD-GCN-GFX11-RDNA3.5]_.
|
[AMD-GCN-GFX11-RDNA3]_ and [AMD-GCN-GFX11-RDNA3.5]_.
|
||||||
|
|
||||||
Operands
|
Operands
|
||||||
@ -18129,7 +18096,7 @@ terminated by an ``.end_amdhsa_kernel`` directive.
|
|||||||
:ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`
|
:ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`
|
||||||
``.amdhsa_user_sgpr_private_segment_buffer`` 0 GFX6-GFX10 Controls ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER in
|
``.amdhsa_user_sgpr_private_segment_buffer`` 0 GFX6-GFX10 Controls ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER in
|
||||||
(except :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
(except :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
||||||
GFX940)
|
GFX942)
|
||||||
``.amdhsa_user_sgpr_dispatch_ptr`` 0 GFX6-GFX12 Controls ENABLE_SGPR_DISPATCH_PTR in
|
``.amdhsa_user_sgpr_dispatch_ptr`` 0 GFX6-GFX12 Controls ENABLE_SGPR_DISPATCH_PTR in
|
||||||
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
||||||
``.amdhsa_user_sgpr_queue_ptr`` 0 GFX6-GFX12 Controls ENABLE_SGPR_QUEUE_PTR in
|
``.amdhsa_user_sgpr_queue_ptr`` 0 GFX6-GFX12 Controls ENABLE_SGPR_QUEUE_PTR in
|
||||||
@ -18140,7 +18107,7 @@ terminated by an ``.end_amdhsa_kernel`` directive.
|
|||||||
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
||||||
``.amdhsa_user_sgpr_flat_scratch_init`` 0 GFX6-GFX10 Controls ENABLE_SGPR_FLAT_SCRATCH_INIT in
|
``.amdhsa_user_sgpr_flat_scratch_init`` 0 GFX6-GFX10 Controls ENABLE_SGPR_FLAT_SCRATCH_INIT in
|
||||||
(except :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
(except :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
||||||
GFX940)
|
GFX942)
|
||||||
``.amdhsa_user_sgpr_private_segment_size`` 0 GFX6-GFX12 Controls ENABLE_SGPR_PRIVATE_SEGMENT_SIZE in
|
``.amdhsa_user_sgpr_private_segment_size`` 0 GFX6-GFX12 Controls ENABLE_SGPR_PRIVATE_SEGMENT_SIZE in
|
||||||
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
||||||
``.amdhsa_wavefront_size32`` Target GFX10-GFX12 Controls ENABLE_WAVEFRONT_SIZE32 in
|
``.amdhsa_wavefront_size32`` Target GFX10-GFX12 Controls ENABLE_WAVEFRONT_SIZE32 in
|
||||||
@ -18151,8 +18118,8 @@ terminated by an ``.end_amdhsa_kernel`` directive.
|
|||||||
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
||||||
``.amdhsa_system_sgpr_private_segment_wavefront_offset`` 0 GFX6-GFX10 Controls ENABLE_PRIVATE_SEGMENT in
|
``.amdhsa_system_sgpr_private_segment_wavefront_offset`` 0 GFX6-GFX10 Controls ENABLE_PRIVATE_SEGMENT in
|
||||||
(except :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
|
(except :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
|
||||||
GFX940)
|
GFX942)
|
||||||
``.amdhsa_enable_private_segment`` 0 GFX940, Controls ENABLE_PRIVATE_SEGMENT in
|
``.amdhsa_enable_private_segment`` 0 GFX942, Controls ENABLE_PRIVATE_SEGMENT in
|
||||||
GFX11-GFX12 :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
|
GFX11-GFX12 :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
|
||||||
``.amdhsa_system_sgpr_workgroup_id_x`` 1 GFX6-GFX12 Controls ENABLE_SGPR_WORKGROUP_ID_X in
|
``.amdhsa_system_sgpr_workgroup_id_x`` 1 GFX6-GFX12 Controls ENABLE_SGPR_WORKGROUP_ID_X in
|
||||||
:ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
|
:ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
|
||||||
@ -18173,14 +18140,14 @@ terminated by an ``.end_amdhsa_kernel`` directive.
|
|||||||
Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
|
Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
|
||||||
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
|
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
|
||||||
``.amdhsa_accum_offset`` Required GFX90A, Offset of a first AccVGPR in the unified register file.
|
``.amdhsa_accum_offset`` Required GFX90A, Offset of a first AccVGPR in the unified register file.
|
||||||
GFX940 Used to calculate ACCUM_OFFSET in
|
GFX942 Used to calculate ACCUM_OFFSET in
|
||||||
:ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table`.
|
:ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table`.
|
||||||
``.amdhsa_reserve_vcc`` 1 GFX6-GFX12 Whether the kernel may use the special VCC SGPR.
|
``.amdhsa_reserve_vcc`` 1 GFX6-GFX12 Whether the kernel may use the special VCC SGPR.
|
||||||
Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
|
Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
|
||||||
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
|
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
|
||||||
``.amdhsa_reserve_flat_scratch`` 1 GFX7-GFX10 Whether the kernel may use flat instructions to access
|
``.amdhsa_reserve_flat_scratch`` 1 GFX7-GFX10 Whether the kernel may use flat instructions to access
|
||||||
(except scratch memory. Used to calculate
|
(except scratch memory. Used to calculate
|
||||||
GFX940) GRANULATED_WAVEFRONT_SGPR_COUNT in
|
GFX942) GRANULATED_WAVEFRONT_SGPR_COUNT in
|
||||||
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
|
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
|
||||||
``.amdhsa_reserve_xnack_mask`` Target GFX8-GFX10 Whether the kernel may trigger XNACK replay.
|
``.amdhsa_reserve_xnack_mask`` Target GFX8-GFX10 Whether the kernel may trigger XNACK replay.
|
||||||
Feature Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
|
Feature Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
|
||||||
@ -18211,7 +18178,7 @@ terminated by an ``.end_amdhsa_kernel`` directive.
|
|||||||
``.amdhsa_fp16_overflow`` 0 GFX9-GFX12 Controls FP16_OVFL in
|
``.amdhsa_fp16_overflow`` 0 GFX9-GFX12 Controls FP16_OVFL in
|
||||||
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
|
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
|
||||||
``.amdhsa_tg_split`` Target GFX90A, Controls TG_SPLIT in
|
``.amdhsa_tg_split`` Target GFX90A, Controls TG_SPLIT in
|
||||||
Feature GFX940, :ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table`.
|
Feature GFX942, :ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table`.
|
||||||
Specific GFX11-GFX12
|
Specific GFX11-GFX12
|
||||||
(tgsplit)
|
(tgsplit)
|
||||||
``.amdhsa_workgroup_processor_mode`` Target GFX10-GFX12 Controls ENABLE_WGP_MODE in
|
``.amdhsa_workgroup_processor_mode`` Target GFX10-GFX12 Controls ENABLE_WGP_MODE in
|
||||||
@ -18242,9 +18209,9 @@ terminated by an ``.end_amdhsa_kernel`` directive.
|
|||||||
``.amdhsa_exception_int_div_zero`` 0 GFX6-GFX12 Controls ENABLE_EXCEPTION_INT_DIVIDE_BY_ZERO in
|
``.amdhsa_exception_int_div_zero`` 0 GFX6-GFX12 Controls ENABLE_EXCEPTION_INT_DIVIDE_BY_ZERO in
|
||||||
:ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
|
:ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
|
||||||
``.amdhsa_user_sgpr_kernarg_preload_length`` 0 GFX90A, Controls KERNARG_PRELOAD_SPEC_LENGTH in
|
``.amdhsa_user_sgpr_kernarg_preload_length`` 0 GFX90A, Controls KERNARG_PRELOAD_SPEC_LENGTH in
|
||||||
GFX940 :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
GFX942 :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
||||||
``.amdhsa_user_sgpr_kernarg_preload_offset`` 0 GFX90A, Controls KERNARG_PRELOAD_SPEC_OFFSET in
|
``.amdhsa_user_sgpr_kernarg_preload_offset`` 0 GFX90A, Controls KERNARG_PRELOAD_SPEC_OFFSET in
|
||||||
GFX940 :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
GFX942 :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
|
||||||
======================================================== =================== ============ ===================
|
======================================================== =================== ============ ===================
|
||||||
|
|
||||||
.amdgpu_metadata
|
.amdgpu_metadata
|
||||||
@ -18414,7 +18381,7 @@ Additional Documentation
|
|||||||
.. [AMD-GCN-GFX906-VEGA7NM] `AMD Vega 7nm Instruction Set Architecture <https://gpuopen.com/wp-content/uploads/2019/11/Vega_7nm_Shader_ISA_26November2019.pdf>`__
|
.. [AMD-GCN-GFX906-VEGA7NM] `AMD Vega 7nm Instruction Set Architecture <https://gpuopen.com/wp-content/uploads/2019/11/Vega_7nm_Shader_ISA_26November2019.pdf>`__
|
||||||
.. [AMD-GCN-GFX908-CDNA1] `AMD Instinct MI100 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf>`__
|
.. [AMD-GCN-GFX908-CDNA1] `AMD Instinct MI100 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf>`__
|
||||||
.. [AMD-GCN-GFX90A-CDNA2] `AMD Instinct MI200 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf>`__
|
.. [AMD-GCN-GFX90A-CDNA2] `AMD Instinct MI200 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf>`__
|
||||||
.. [AMD-GCN-GFX940-GFX942-CDNA3] `AMD Instinct MI300 Instruction Set Architecture <https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf>`__
|
.. [AMD-GCN-GFX942-CDNA3] `AMD Instinct MI300 Instruction Set Architecture <https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf>`__
|
||||||
.. [AMD-GCN-GFX10-RDNA1] `AMD RDNA 1.0 Instruction Set Architecture <https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf>`__
|
.. [AMD-GCN-GFX10-RDNA1] `AMD RDNA 1.0 Instruction Set Architecture <https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf>`__
|
||||||
.. [AMD-GCN-GFX10-RDNA2] `AMD RDNA 2 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf>`__
|
.. [AMD-GCN-GFX10-RDNA2] `AMD RDNA 2 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf>`__
|
||||||
.. [AMD-GCN-GFX11-RDNA3] `AMD RDNA 3 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf>`__
|
.. [AMD-GCN-GFX11-RDNA3] `AMD RDNA 3 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf>`__
|
||||||
|