[Mesa-dev] [PATCH] gallium: tgsi documentation updates and clarification for integer opcodes.

Fri May 3 06:20:23 PDT 2013

----- Original Message -----
> From: Roland Scheidegger <sroland at vmware.com>
> 
> A lot of them were missing. Others were moved from the Compute ISA
> to a new Integer ISA section as that seemed more appropriate.
> ---
>  src/gallium/docs/source/tgsi.rst |  362 ++++++++++++++++++++++++++++++--------
>  1 file changed, 289 insertions(+), 73 deletions(-)
> 
> diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
> index a528fd2..b7caf63 100644
> --- a/src/gallium/docs/source/tgsi.rst
> +++ b/src/gallium/docs/source/tgsi.rst
> @@ -872,6 +872,16 @@ This instruction replicates its result.
>    as an integer register.
>  
>  
> +.. opcode:: CONT - Continue
> +
> +  TBD
> +
> +.. note::
> +
> +   Support for CONT is determined by a special capability bit,
> +   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
> +
> +
>  .. opcode:: IF - Float If
>  
>    Start an IF ... ELSE .. ENDIF block.  Condition evaluates to true if
> @@ -977,6 +987,7 @@ These opcodes are primarily provided for special-use computational shaders.
>  Support for these opcodes indicated by a special pipe capability bit (TBD).
>  
>  XXX so let's discuss it, yeah?
> +XXX doesn't look like most of the opcodes really belong here.
>  
>  .. opcode:: CEIL - Ceiling
>  
> @@ -991,7 +1002,89 @@ XXX so let's discuss it, yeah?
>    dst.w = \lceil src.w\rceil
>  
>  
> -.. opcode:: I2F - Integer To Float
> +.. opcode:: TRUNC - Truncate
> +
> +.. math::
> +
> +  dst.x = trunc(src.x)
> +
> +  dst.y = trunc(src.y)
> +
> +  dst.z = trunc(src.z)
> +
> +  dst.w = trunc(src.w)
> +
> +
> +.. opcode:: MOD - Modulus
> +
> +.. math::
> +
> +  dst.x = src0.x \bmod src1.x
> +
> +  dst.y = src0.y \bmod src1.y
> +
> +  dst.z = src0.z \bmod src1.z
> +
> +  dst.w = src0.w \bmod src1.w
> +
> +
> +.. opcode:: UARL - Integer Address Register Load
> +
> +  Moves the contents of the source register, assumed to be an integer, into the
> +  destination register, which is assumed to be an address (ADDR) register.
> +
> +
> +.. opcode:: SAD - Sum Of Absolute Differences
> +
> +.. math::
> +
> +  dst.x = |src0.x - src1.x| + src2.x
> +
> +  dst.y = |src0.y - src1.y| + src2.y
> +
> +  dst.z = |src0.z - src1.z| + src2.z
> +
> +  dst.w = |src0.w - src1.w| + src2.w
> +
> +
> +.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
> +                  from a specified texture image. The source sampler may
> +		  not be a CUBE or SHADOW.
> +                  src 0 is a four-component signed integer vector used to
> +		  identify the single texel accessed. 3 components + level.
> +		  src 1 is a 3 component constant signed integer vector,
> +		  with each component only having a range of
> +		  -8..+8 (hw only seems to deal with this range, interface
> +		  allows for up to unsigned int).
> +		  TXF(uint_vec coord, int_vec offset).
> +
> +
> +.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
> +                  retrieve the dimensions of the texture
> +                  depending on the target. For 1D (width), 2D/RECT/CUBE
> +		  (width, height), 3D (width, height, depth),
> +		  1D array (width, layers), 2D array (width, height, layers)
> +
> +.. math::
> +
> +  lod = src0

  src0.x ?
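
  Spelled out, and only as a sketch of the suggestion (the other lines of
  the TXQ block would stay as in the patch), the first line would read:

  .. math::

    lod = src0.x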

Otherwise looks good. Thanks for taking the time to clean these up.

Jose

> +
> +  dst.x = texture_width(unit, lod)
> +
> +  dst.y = texture_height(unit, lod)
> +
> +  dst.z = texture_depth(unit, lod)
> +
> +
> +Integer ISA
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +These opcodes are used for integer operations.
> +Support for these opcodes is indicated by PIPE_SHADER_CAP_INTEGERS (all of them?)
> +
> +
> +.. opcode:: I2F - Signed Integer To Float
> +
> +   Rounding is unspecified (round to nearest even suggested).
>  
>  .. math::
>  
> @@ -1004,56 +1097,157 @@ XXX so let's discuss it, yeah?
>    dst.w = (float) src.w
>  
>  
> -.. opcode:: NOT - Bitwise Not
> +.. opcode:: U2F - Unsigned Integer To Float
> +
> +   Rounding is unspecified (round to nearest even suggested).
>  
>  .. math::
>  
> -  dst.x = ~src.x
> +  dst.x = (float) src.x
>  
> -  dst.y = ~src.y
> +  dst.y = (float) src.y
>  
> -  dst.z = ~src.z
> +  dst.z = (float) src.z
>  
> -  dst.w = ~src.w
> +  dst.w = (float) src.w
>  
>  
> -.. opcode:: TRUNC - Truncate
> +.. opcode:: F2I - Float to Signed Integer
> +
> +   Rounding is towards zero (truncate).
> +   Values outside signed range (including NaNs) produce undefined results.
>  
>  .. math::
>  
> -  dst.x = trunc(src.x)
> +  dst.x = (int) src.x
>  
> -  dst.y = trunc(src.y)
> +  dst.y = (int) src.y
>  
> -  dst.z = trunc(src.z)
> +  dst.z = (int) src.z
>  
> -  dst.w = trunc(src.w)
> +  dst.w = (int) src.w
>  
>  
> -.. opcode:: SHL - Shift Left
> +.. opcode:: F2U - Float to Unsigned Integer
> +
> +   Rounding is towards zero (truncate).
> +   Values outside unsigned range (including NaNs) produce undefined results.
>  
>  .. math::
>  
> -  dst.x = src0.x << src1.x
> +  dst.x = (unsigned) src.x
>  
> -  dst.y = src0.y << src1.x
> +  dst.y = (unsigned) src.y
>  
> -  dst.z = src0.z << src1.x
> +  dst.z = (unsigned) src.z
>  
> -  dst.w = src0.w << src1.x
> +  dst.w = (unsigned) src.w
>  
>  
> -.. opcode:: SHR - Shift Right
> +.. opcode:: UADD - Integer Add
> +
> +   This instruction works the same for signed and unsigned integers.
> +   The low 32 bits of the result are returned.
>  
>  .. math::
>  
> -  dst.x = src0.x >> src1.x
> +  dst.x = src0.x + src1.x
>  
> -  dst.y = src0.y >> src1.x
> +  dst.y = src0.y + src1.y
>  
> -  dst.z = src0.z >> src1.x
> +  dst.z = src0.z + src1.z
>  
> -  dst.w = src0.w >> src1.x
> +  dst.w = src0.w + src1.w
> +
> +
> +.. opcode:: UMAD - Integer Multiply And Add
> +
> +   This instruction works the same for signed and unsigned integers.
> +   The multiplication returns the low 32 bits (as does the result itself).
> +
> +.. math::
> +
> +  dst.x = src0.x \times src1.x + src2.x
> +
> +  dst.y = src0.y \times src1.y + src2.y
> +
> +  dst.z = src0.z \times src1.z + src2.z
> +
> +  dst.w = src0.w \times src1.w + src2.w
> +
> +
> +.. opcode:: UMUL - Integer Multiply
> +
> +   This instruction works the same for signed and unsigned integers.
> +   The low 32 bits of the result are returned.
> +
> +.. math::
> +
> +  dst.x = src0.x \times src1.x
> +
> +  dst.y = src0.y \times src1.y
> +
> +  dst.z = src0.z \times src1.z
> +
> +  dst.w = src0.w \times src1.w
> +
> +
> +.. opcode:: IDIV - Signed Integer Division
> +
> +   TBD: behavior for division by zero.
> +
> +.. math::
> +
> +  dst.x = \frac{src0.x}{src1.x}
> +
> +  dst.y = \frac{src0.y}{src1.y}
> +
> +  dst.z = \frac{src0.z}{src1.z}
> +
> +  dst.w = \frac{src0.w}{src1.w}
> +
> +
> +.. opcode:: UDIV - Unsigned Integer Division
> +
> +   For division by zero, 0xffffffff is returned.
> +
> +.. math::
> +
> +  dst.x = \frac{src0.x}{src1.x}
> +
> +  dst.y = \frac{src0.y}{src1.y}
> +
> +  dst.z = \frac{src0.z}{src1.z}
> +
> +  dst.w = \frac{src0.w}{src1.w}
> +
> +
> +.. opcode:: UMOD - Unsigned Integer Remainder
> +
> +   If src1 is zero, 0xffffffff is returned.
> +
> +.. math::
> +
> +  dst.x = src0.x \bmod src1.x
> +
> +  dst.y = src0.y \bmod src1.y
> +
> +  dst.z = src0.z \bmod src1.z
> +
> +  dst.w = src0.w \bmod src1.w
> +
> +
> +.. opcode:: NOT - Bitwise Not
> +
> +.. math::
> +
> +  dst.x = ~src.x
> +
> +  dst.y = ~src.y
> +
> +  dst.z = ~src.z
> +
> +  dst.w = ~src.w
>  
>  
>  .. opcode:: AND - Bitwise And
> @@ -1082,114 +1276,136 @@ XXX so let's discuss it, yeah?
>    dst.w = src0.w | src1.w
>  
>  
> -.. opcode:: MOD - Modulus
> +.. opcode:: XOR - Bitwise Xor
>  
>  .. math::
>  
> -  dst.x = src0.x \bmod src1.x
> +  dst.x = src0.x \oplus src1.x
>  
> -  dst.y = src0.y \bmod src1.y
> +  dst.y = src0.y \oplus src1.y
>  
> -  dst.z = src0.z \bmod src1.z
> +  dst.z = src0.z \oplus src1.z
>  
> -  dst.w = src0.w \bmod src1.w
> +  dst.w = src0.w \oplus src1.w
>  
>  
> -.. opcode:: XOR - Bitwise Xor
> +.. opcode:: IMAX - Maximum of Signed Integers
>  
>  .. math::
>  
> -  dst.x = src0.x \oplus src1.x
> +  dst.x = max(src0.x, src1.x)
>  
> -  dst.y = src0.y \oplus src1.y
> +  dst.y = max(src0.y, src1.y)
>  
> -  dst.z = src0.z \oplus src1.z
> +  dst.z = max(src0.z, src1.z)
>  
> -  dst.w = src0.w \oplus src1.w
> +  dst.w = max(src0.w, src1.w)
>  
>  
> -.. opcode:: UCMP - Integer Conditional Move
> +.. opcode:: UMAX - Maximum of Unsigned Integers
>  
>  .. math::
>  
> -  dst.x = src0.x ? src1.x : src2.x
> +  dst.x = max(src0.x, src1.x)
>  
> -  dst.y = src0.y ? src1.y : src2.y
> +  dst.y = max(src0.y, src1.y)
>  
> -  dst.z = src0.z ? src1.z : src2.z
> +  dst.z = max(src0.z, src1.z)
>  
> -  dst.w = src0.w ? src1.w : src2.w
> +  dst.w = max(src0.w, src1.w)
>  
>  
> -.. opcode:: UARL - Integer Address Register Load
> +.. opcode:: IMIN - Minimum of Signed Integers
>  
> -  Moves the contents of the source register, assumed to be an integer, into the
> -  destination register, which is assumed to be an address (ADDR) register.
> +.. math::
>  
> +  dst.x = min(src0.x, src1.x)
>  
> -.. opcode:: IABS - Integer Absolute Value
> +  dst.y = min(src0.y, src1.y)
> +
> +  dst.z = min(src0.z, src1.z)
> +
> +  dst.w = min(src0.w, src1.w)
> +
> +
> +.. opcode:: UMIN - Minimum of Unsigned Integers
>  
>  .. math::
>  
> -  dst.x = |src.x|
> +  dst.x = min(src0.x, src1.x)
>  
> -  dst.y = |src.y|
> +  dst.y = min(src0.y, src1.y)
>  
> -  dst.z = |src.z|
> +  dst.z = min(src0.z, src1.z)
>  
> -  dst.w = |src.w|
> +  dst.w = min(src0.w, src1.w)
>  
>  
> -.. opcode:: SAD - Sum Of Absolute Differences
> +.. opcode:: SHL - Shift Left
>  
>  .. math::
>  
> -  dst.x = |src0.x - src1.x| + src2.x
> +  dst.x = src0.x << src1.x
>  
> -  dst.y = |src0.y - src1.y| + src2.y
> +  dst.y = src0.y << src1.x
>  
> -  dst.z = |src0.z - src1.z| + src2.z
> +  dst.z = src0.z << src1.x
>  
> -  dst.w = |src0.w - src1.w| + src2.w
> +  dst.w = src0.w << src1.x
>  
>  
> -.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
> -                  from a specified texture image. The source sampler may
> -		  not be a CUBE or SHADOW.
> -                  src 0 is a four-component signed integer vector used to
> -		  identify the single texel accessed. 3 components + level.
> -		  src 1 is a 3 component constant signed integer vector,
> -		  with each component only have a range of
> -		  -8..+8 (hw only seems to deal with this range, interface
> -		  allows for up to unsigned int).
> -		  TXF(uint_vec coord, int_vec offset).
> +.. opcode:: ISHR - Arithmetic Shift Right (of Signed Integer)
>  
> +.. math::
>  
> -.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
> -                  retrieve the dimensions of the texture
> -                  depending on the target. For 1D (width), 2D/RECT/CUBE
> -		  (width, height), 3D (width, height, depth),
> -		  1D array (width, layers), 2D array (width, height, layers)
> +  dst.x = src0.x >> src1.x
> +
> +  dst.y = src0.y >> src1.x
> +
> +  dst.z = src0.z >> src1.x
> +
> +  dst.w = src0.w >> src1.x
> +
> +
> +.. opcode:: USHR - Logical Shift Right
>  
>  .. math::
>  
> -  lod = src0
> +  dst.x = src0.x >> (unsigned) src1.x
>  
> -  dst.x = texture_width(unit, lod)
> +  dst.y = src0.y >> (unsigned) src1.x
>  
> -  dst.y = texture_height(unit, lod)
> +  dst.z = src0.z >> (unsigned) src1.x
>  
> -  dst.z = texture_depth(unit, lod)
> +  dst.w = src0.w >> (unsigned) src1.x
>  
>  
> -.. opcode:: CONT - Continue
>  
> -  TBD
>  
> -.. note::
> +.. opcode:: UCMP - Integer Conditional Move
>  
> -   Support for CONT is determined by a special capability bit,
> -   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
> +.. math::
> +
> +  dst.x = src0.x ? src1.x : src2.x
> +
> +  dst.y = src0.y ? src1.y : src2.y
> +
> +  dst.z = src0.z ? src1.z : src2.z
> +
> +  dst.w = src0.w ? src1.w : src2.w
> +
> +
> +.. opcode:: IABS - Integer Absolute Value
> +
> +.. math::
> +
> +  dst.x = |src.x|
> +
> +  dst.y = |src.y|
> +
> +  dst.z = |src.z|
> +
> +  dst.w = |src.w|
>  
>  
>  Geometry ISA
> --
> 1.7.9.5
>
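
For readers who want the per-channel integer semantics described in the
patch in executable form, here is a minimal C sketch. It is not taken from
Mesa; the function names (tgsi_uadd, tgsi_udiv, and so on) are hypothetical
and chosen only for illustration. It follows the patch text for the wrap to
the low 32 bits, for 0xffffffff on unsigned division or remainder by zero,
and for the shift count coming from src1.x on every channel. Masking the
shift count to 0..31 is an assumption made here to avoid undefined
behaviour in C; the patch does not specify it.

```c
#include <stdint.h>

/* UADD / UMUL: only the low 32 bits are kept. Unsigned arithmetic in C
 * already wraps modulo 2^32, so the same code serves signed and unsigned
 * operands reinterpreted as uint32_t. */
static uint32_t tgsi_uadd(uint32_t a, uint32_t b) { return a + b; }
static uint32_t tgsi_umul(uint32_t a, uint32_t b) { return a * b; }

/* UDIV / UMOD: the patch documents 0xffffffff when src1 is zero. */
static uint32_t tgsi_udiv(uint32_t a, uint32_t b)
{
    return b ? a / b : 0xffffffffu;
}

static uint32_t tgsi_umod(uint32_t a, uint32_t b)
{
    return b ? a % b : 0xffffffffu;
}

/* USHR: logical shift right, vacated bits are filled with zero.
 * Assumption: the count is masked to 0..31 (not stated in the patch). */
static uint32_t tgsi_ushr(uint32_t v, uint32_t count)
{
    return v >> (count & 31u);
}

/* ISHR: arithmetic shift right, vacated bits are filled with the sign.
 * Written without relying on implementation-defined signed >> in C.
 * Same masking assumption as above. */
static int32_t tgsi_ishr(int32_t v, uint32_t count)
{
    uint32_t n = count & 31u;
    if (v >= 0)
        return (int32_t)((uint32_t)v >> n);
    /* For negative v, ~v is non-negative, so it can be shifted logically
     * and mapped back with -1 - x. */
    return -1 - (int32_t)(~(uint32_t)v >> n);
}

/* Vector form as documented: every channel uses src1.x as the count. */
static void tgsi_ushr_vec4(uint32_t dst[4], const uint32_t src0[4],
                           const uint32_t src1[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = tgsi_ushr(src0[i], src1[0]);
}
```

IDIV is left out on purpose, since the patch still marks its
division-by-zero behaviour as TBD.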