[NFC][Docs] Add documentation for NVPTX conversion intrinsics (#175536)

This change adds documentation for the NVPTX narrow floating-point
conversion intrinsics.

PTX ISA Reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt
Srinivasa Ravi 2026-03-03 15:39:20 +05:30 committed by GitHub
parent 5d8c6c198d
commit b4fffcd8e4

@@ -1106,6 +1106,183 @@ If the given pointer in the generic address space refers to memory which falls
within the state space of the intrinsic (and therefore could be safely address
space casted to this space), 1 is returned, otherwise 0 is returned.
Narrow Floating-Point Conversion intrinsics
-------------------------------------------
These intrinsics perform conversions involving narrow floating-point formats.
The following table describes the rounding modes used across these intrinsics:
.. _narrow-fp-rounding-modes:

.. table:: Narrow Floating-Point Conversion Rounding Modes
   :widths: 30 60

   +-----------------------+---------------------------------------------------+
   | Rounding Mode         | Description                                       |
   +=======================+===================================================+
   | ``rn`` (default)      | Round to nearest, with ties to even               |
   +-----------------------+---------------------------------------------------+
   | ``rz``                | Round towards zero                                |
   +-----------------------+---------------------------------------------------+
   | ``rp``                | Round towards positive infinity                   |
   +-----------------------+---------------------------------------------------+
   | ``rs``                | Stochastic rounding, achieved through the use of  |
   |                       | the supplied random bits (``%rnd_bits``). The     |
   |                       | result is rounded towards zero or away from zero  |
   |                       | based on the carry-out of the integer addition of |
   |                       | the random bits to the discarded mantissa bits of |
   |                       | the input.                                        |
   +-----------------------+---------------------------------------------------+

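
As a rough illustration of the ``rs`` mode, the sketch below calls one of the
stochastic-rounding intrinsics listed later in this section. The wrapper name
``@quantize_rs`` is hypothetical, and how ``%rnd_bits`` is produced (for
example, by a per-thread random number generator) is left to the caller.
Target feature requirements are not shown.

.. code-block:: llvm

  declare <4 x i8> @llvm.nvvm.f32x4.to.e4m3x4.rs.satfinite(<4 x float>, i32)

  ; Hypothetical wrapper: converts four floats to four e4m3 bytes using
  ; stochastic rounding driven by the caller-supplied random bits.
  define <4 x i8> @quantize_rs(<4 x float> %v, i32 %rnd_bits) {
    %q = call <4 x i8> @llvm.nvvm.f32x4.to.e4m3x4.rs.satfinite(<4 x float> %v, i32 %rnd_bits)
    ret <4 x i8> %q
  }

Because the rounding direction depends on ``%rnd_bits``, callers would
typically pass fresh random bits on each call rather than a fixed constant.
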
``fp8`` Conversion Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
.. code-block:: llvm

  declare i16 @llvm.nvvm.ff.to{.e4m3x2, .e5m2x2}.rn{.relu}(float %a, float %b)
  declare i16 @llvm.nvvm.ff.to.ue8m0x2{.rz, .rp}{.satfinite}(float %a, float %b)
  declare i16 @llvm.nvvm.f16x2.to{.e4m3x2, .e5m2x2}.rn{.relu}(<2 x half> %a)
  declare i16 @llvm.nvvm.bf16x2.to{.e4m3x2, .e5m2x2}.rn{.relu}.satfinite(<2 x bfloat> %a)
  declare i16 @llvm.nvvm.bf16x2.to.ue8m0x2{.rz, .rp}{.satfinite}(<2 x bfloat> %a)
  declare <2 x half> @llvm.nvvm{.e4m3x2, .e5m2x2}.to.f16x2.rn{.relu}(i16 %a)
  declare <2 x bfloat> @llvm.nvvm.ue8m0x2.to.bf16x2(i16 %a)
  declare <4 x i8> @llvm.nvvm.f32x4.to{.e4m3x4, .e5m2x4}.rs{.relu}.satfinite(<4 x float> %a, i32 %rnd_bits)

Overview:
"""""""""
These intrinsics perform conversions involving the ``e4m3``, ``e5m2``, and
``ue8m0`` narrow floating-point formats. When an intrinsic takes two inputs,
the value converted from ``%a`` is stored in the upper 8 bits of the result,
and the value converted from ``%b`` is stored in the lower 8 bits.
For rounding modes, see :ref:`narrow-fp-rounding-modes`.
The ``relu`` modifier clamps negative results to 0.
When ``satfinite`` is specified, if the absolute value of the input is greater
than the ``MAX_NORM`` of the destination format, the result is the
sign-preserved ``MAX_NORM`` of that format. For ``.ue8m0x2``, which has no
sign bit, the result is the positive ``MAX_NORM``. If the input is ``NaN``,
the result is ``NaN`` in the destination format. The ``satfinite`` modifier is
implicitly assumed for conversions whose destination type is ``e4m3`` or
``e5m2``.
For more information, see `PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt>`__.
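
The packing described above can be sketched as follows. The function names
``@e4m3_from_a`` and ``@roundtrip_e4m3`` are hypothetical and exist only for
illustration; the intrinsic names are instances of the syntax listed above.

.. code-block:: llvm

  declare i16 @llvm.nvvm.ff.to.e4m3x2.rn(float, float)
  declare <2 x half> @llvm.nvvm.e4m3x2.to.f16x2.rn(i16)

  ; Hypothetical helper: converts two floats to a packed e4m3x2 value and
  ; returns the byte that was converted from %a.
  define i8 @e4m3_from_a(float %a, float %b) {
    %packed = call i16 @llvm.nvvm.ff.to.e4m3x2.rn(float %a, float %b)
    ; The value converted from %a occupies the upper 8 bits of the result.
    %hi = lshr i16 %packed, 8
    %a8 = trunc i16 %hi to i8
    ret i8 %a8
  }

  ; Hypothetical helper: narrows two floats to e4m3x2 and widens the packed
  ; pair back to f16x2.
  define <2 x half> @roundtrip_e4m3(float %a, float %b) {
    %packed = call i16 @llvm.nvvm.ff.to.e4m3x2.rn(float %a, float %b)
    %wide = call <2 x half> @llvm.nvvm.e4m3x2.to.f16x2.rn(i16 %packed)
    ret <2 x half> %wide
  }

The value converted from ``%b`` could be recovered the same way by truncating
``%packed`` to ``i8`` without the shift.
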
``s2f6`` Conversion Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
.. code-block:: llvm

  declare i16 @llvm.nvvm.ff.to.s2f6x2.rn{.relu}.satfinite.scale.n2.ue8m0(float %a, float %b, i16 %scale_factor)
  declare i16 @llvm.nvvm.bf16x2.to.s2f6x2.rn{.relu}.satfinite.scale.n2.ue8m0(<2 x bfloat> %a, i16 %scale_factor)
  declare <2 x bfloat> @llvm.nvvm.s2f6x2.to.bf16x2.rn{.relu}{.satfinite}.scale.n2.ue8m0(i16 %a, i16 %scale_factor)

Overview:
"""""""""
These intrinsics perform conversions involving the ``s2f6`` narrow
floating-point format. When an intrinsic takes two inputs, the value converted
from ``%a`` is stored in the upper 8 bits of the result, and the value
converted from ``%b`` is stored in the lower 8 bits.
For rounding modes, see :ref:`narrow-fp-rounding-modes`.
The ``relu`` modifier clamps negative results to 0.
When ``satfinite`` is specified, if the absolute value of the input is greater
than the ``MAX_NORM`` of the destination format, the result is the
sign-preserved ``MAX_NORM`` of that format. If the input is ``NaN``, the
result is the positive ``MAX_NORM`` of the destination format.

The operand ``%scale_factor`` holds two packed scaling factors of type
``ue8m0``, one for each input value. For down-conversion, each input is
divided by its scaling factor and then converted. For up-conversion, each
input is converted to the destination type and then multiplied by its scaling
factor.
For more information, see `PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt>`__.
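
A minimal sketch of a scaled down-conversion is shown below. It assumes that
``ue8m0`` is an unsigned 8-bit exponent with a bias of 127, so that the byte
``0x7F`` encodes a scale of 1.0; that encoding is not stated in this section
and should be checked against the PTX ISA. The function name
``@to_s2f6x2_unit_scale`` is hypothetical.

.. code-block:: llvm

  declare i16 @llvm.nvvm.ff.to.s2f6x2.rn.satfinite.scale.n2.ue8m0(float, float, i16)

  ; Hypothetical helper: converts two floats to a packed s2f6x2 value using a
  ; scale factor of 1.0 for both inputs.
  define i16 @to_s2f6x2_unit_scale(float %a, float %b) {
    ; Assumption: 0x7F7F (32639) packs two ue8m0 bytes that each encode 2^0.
    %r = call i16 @llvm.nvvm.ff.to.s2f6x2.rn.satfinite.scale.n2.ue8m0(float %a, float %b, i16 32639)
    ret i16 %r
  }

With a unit scale, the division step described above leaves the inputs
unchanged, so only the rounding and saturation affect the result.
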
``fp6`` Conversion Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
.. code-block:: llvm

  declare i16 @llvm.nvvm.ff.to{.e2m3x2, .e3m2x2}.rn{.relu}.satfinite(float %a, float %b)
  declare i16 @llvm.nvvm.f16x2.to{.e2m3x2, .e3m2x2}.rn{.relu}.satfinite(<2 x half> %a)
  declare i16 @llvm.nvvm.bf16x2.to{.e2m3x2, .e3m2x2}.rn{.relu}.satfinite(<2 x bfloat> %a)
  declare <2 x half> @llvm.nvvm{.e2m3x2, .e3m2x2}.to.f16x2.rn{.relu}(i16 %a)
  declare <4 x i8> @llvm.nvvm.f32x4.to{.e2m3x4, .e3m2x4}.rs{.relu}.satfinite(<4 x float> %a, i32 %rnd_bits)

Overview:
"""""""""
These intrinsics perform conversions involving the ``e2m3`` and ``e3m2`` narrow
floating-point formats. When an intrinsic takes two inputs, the value converted
from ``%a`` is stored in the upper 8 bits of the result, and the value
converted from ``%b`` is stored in the lower 8 bits. In both cases, the two
most significant bits of the 8-bit field are padded with zeros.
For rounding modes, see :ref:`narrow-fp-rounding-modes`.
The ``relu`` modifier clamps negative results to 0.
When ``satfinite`` is specified, if the absolute value of the input is greater
than the ``MAX_NORM`` of the destination format, the result is the
sign-preserved ``MAX_NORM`` of that format. If the input is ``NaN``, the
result is the positive ``MAX_NORM`` of the destination format.
For more information, see `PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt>`__.
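
The layout described above can be sketched as follows. The function name
``@e2m3_from_a`` is hypothetical; the intrinsic name is an instance of the
syntax listed above.

.. code-block:: llvm

  declare i16 @llvm.nvvm.ff.to.e2m3x2.rn.satfinite(float, float)

  ; Hypothetical helper: converts two floats to a packed e2m3x2 value and
  ; returns the 6-bit value that was converted from %a.
  define i8 @e2m3_from_a(float %a, float %b) {
    %packed = call i16 @llvm.nvvm.ff.to.e2m3x2.rn.satfinite(float %a, float %b)
    ; The upper 8 bits hold the value converted from %a.
    %hi = lshr i16 %packed, 8
    ; The two MSBs of the field are zero padding, so masking with 0x3F (63)
    ; keeps only the 6 payload bits.
    %val = and i16 %hi, 63
    %res = trunc i16 %val to i8
    ret i8 %res
  }

Since the padding bits are documented as zero, the mask is redundant in
practice; it is kept here only to make the 6-bit payload explicit.
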
``fp4`` Conversion Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
.. code-block:: llvm

  declare i16 @llvm.nvvm.ff.to.e2m1x2.rn{.relu}.satfinite(float %a, float %b)
  declare i16 @llvm.nvvm.f16x2.to.e2m1x2.rn{.relu}.satfinite(<2 x half> %a)
  declare i16 @llvm.nvvm.bf16x2.to.e2m1x2.rn{.relu}.satfinite(<2 x bfloat> %a)
  declare <2 x half> @llvm.nvvm.e2m1x2.to.f16x2.rn{.relu}(i16 %a)
  declare i16 @llvm.nvvm.f32x4.to.e2m1x4.rs{.relu}.satfinite(<4 x float> %a, i32 %rnd_bits)

Overview:
"""""""""
These intrinsics perform conversions involving the ``e2m1`` narrow
floating-point format. For conversions involving ``e2m1x2``, the packed
``e2m1x2`` value is stored in the lower byte of the ``i16`` argument or result.
When an intrinsic takes two inputs, the value converted from ``%a`` is stored
in the upper 4 bits of that byte, and the value converted from ``%b`` is
stored in the lower 4 bits.
For rounding modes, see :ref:`narrow-fp-rounding-modes`.
The ``relu`` modifier clamps negative results to 0.
When ``satfinite`` is specified, if the absolute value of the input is greater
than the ``MAX_NORM`` of the destination format, the result is the
sign-preserved ``MAX_NORM`` of that format. If the input is ``NaN``, the
result is the positive ``MAX_NORM`` of the destination format.
For more information, see `PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt>`__.
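
The nibble layout described above can be sketched as follows. The function
names ``@e2m1_from_a`` and ``@e2m1_from_b`` are hypothetical; the intrinsic
name is an instance of the syntax listed above.

.. code-block:: llvm

  declare i16 @llvm.nvvm.ff.to.e2m1x2.rn.satfinite(float, float)

  ; Hypothetical helper: returns the 4-bit e2m1 value converted from %a.
  define i8 @e2m1_from_a(float %a, float %b) {
    %packed = call i16 @llvm.nvvm.ff.to.e2m1x2.rn.satfinite(float %a, float %b)
    ; The packed e2m1x2 pair lives in the lower byte of the i16 result.
    %byte = trunc i16 %packed to i8
    ; The upper 4 bits of that byte hold the value converted from %a.
    %a4 = lshr i8 %byte, 4
    ret i8 %a4
  }

  ; Hypothetical helper: returns the 4-bit e2m1 value converted from %b.
  define i8 @e2m1_from_b(float %a, float %b) {
    %packed = call i16 @llvm.nvvm.ff.to.e2m1x2.rn.satfinite(float %a, float %b)
    %byte = trunc i16 %packed to i8
    ; The lower 4 bits of the byte hold the value converted from %b.
    %b4 = and i8 %byte, 15
    ret i8 %b4
  }
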
Arithmetic Intrinsics
---------------------