[Mesa-dev] [PATCH] gallium: tgsi documentation updates and clarification for integer opcodes.
sroland at vmware.com
Thu May 2 16:16:18 PDT 2013
From: Roland Scheidegger <sroland at vmware.com>
A lot of them were missing. Others were moved from the Compute ISA
to a new Integer ISA section as that seemed more appropriate.
---
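For reference, a minimal C sketch of the wrapping and division-by-zero behaviour this patch documents for UADD, UMUL, UMAD, UDIV and UMOD. The `ref_*` helper names are hypothetical and not part of Mesa; the sketch only restates the per-component rules from the patch and is not taken from any driver.

```c
#include <stdint.h>

/* Hypothetical reference helpers, one 32-bit component at a time. */

/* UADD: only the low 32 bits are kept, so the same bit pattern results
 * whether the operands are read as signed or unsigned. */
static uint32_t ref_uadd(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a + b);
}

/* UMUL: low 32 bits of the full product. */
static uint32_t ref_umul(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}

/* UMAD: low 32 bits of a * b + c. */
static uint32_t ref_umad(uint32_t a, uint32_t b, uint32_t c)
{
    return (uint32_t)((uint64_t)a * b + c);
}

/* UDIV: a zero divisor returns 0xffffffff, as stated in the patch. */
static uint32_t ref_udiv(uint32_t a, uint32_t b)
{
    return b ? a / b : 0xffffffffu;
}

/* UMOD: a zero second argument returns 0xffffffff, as stated in the patch. */
static uint32_t ref_umod(uint32_t a, uint32_t b)
{
    return b ? a % b : 0xffffffffu;
}
```

IDIV is left out on purpose, since the patch still marks its division-by-zero behaviour as TBD.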
src/gallium/docs/source/tgsi.rst | 362 ++++++++++++++++++++++++++++++--------
1 file changed, 289 insertions(+), 73 deletions(-)
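A second hedged sketch, this time for the shift and float-to-integer opcodes documented below (ISHR, USHR, F2I, F2U). Again the `ref_*` names are hypothetical. Masking the shift count to 5 bits is an assumption made here only to keep the C well defined; the patch does not say what happens for counts of 32 or more.

```c
#include <stdint.h>

/* USHR: logical shift right, vacated high bits are filled with zeros.
 * The "& 31u" mask is an assumption of this sketch, not of the patch. */
static uint32_t ref_ushr(uint32_t v, uint32_t count)
{
    return v >> (count & 31u);
}

/* ISHR: arithmetic shift right, vacated high bits copy the sign bit.
 * Written without relying on C's implementation-defined ">>" for
 * negative values; the final cast assumes two's complement int32_t. */
static int32_t ref_ishr(int32_t v, uint32_t count)
{
    uint32_t u;

    count &= 31u;
    u = (uint32_t)v >> count;
    if (v < 0 && count)
        u |= ~(0xffffffffu >> count);   /* replicate the sign bit */
    return (int32_t)u;
}

/* F2I: rounds toward zero. Out-of-range inputs and NaNs are undefined
 * both in the documented opcode and in this C cast. */
static int32_t ref_f2i(float f)
{
    return (int32_t)f;
}

/* F2U: rounds toward zero, same caveat for out-of-range inputs. */
static uint32_t ref_f2u(float f)
{
    return (uint32_t)f;
}
```

Per the formulas in the patch, all four destination components of SHL, ISHR and USHR take their shift count from `src1.x`, so a full-vector version would call the helper four times with the same count.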
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index a528fd2..b7caf63 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -872,6 +872,16 @@ This instruction replicates its result.
as an integer register.
+.. opcode:: CONT - Continue
+
+ TBD
+
+.. note::
+
+ Support for CONT is determined by a special capability bit,
+ ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
+
+
.. opcode:: IF - Float If
Start an IF ... ELSE .. ENDIF block. Condition evaluates to true if
@@ -977,6 +987,7 @@ These opcodes are primarily provided for special-use computational shaders.
Support for these opcodes indicated by a special pipe capability bit (TBD).
XXX so let's discuss it, yeah?
+XXX doesn't look like most of the opcodes really belong here.
.. opcode:: CEIL - Ceiling
@@ -991,7 +1002,89 @@ XXX so let's discuss it, yeah?
dst.w = \lceil src.w\rceil
-.. opcode:: I2F - Integer To Float
+.. opcode:: TRUNC - Truncate
+
+.. math::
+
+ dst.x = trunc(src.x)
+
+ dst.y = trunc(src.y)
+
+ dst.z = trunc(src.z)
+
+ dst.w = trunc(src.w)
+
+
+.. opcode:: MOD - Modulus
+
+.. math::
+
+ dst.x = src0.x \bmod src1.x
+
+ dst.y = src0.y \bmod src1.y
+
+ dst.z = src0.z \bmod src1.z
+
+ dst.w = src0.w \bmod src1.w
+
+
+.. opcode:: UARL - Integer Address Register Load
+
+ Moves the contents of the source register, assumed to be an integer, into the
+ destination register, which is assumed to be an address (ADDR) register.
+
+
+.. opcode:: SAD - Sum Of Absolute Differences
+
+.. math::
+
+ dst.x = |src0.x - src1.x| + src2.x
+
+ dst.y = |src0.y - src1.y| + src2.y
+
+ dst.z = |src0.z - src1.z| + src2.z
+
+ dst.w = |src0.w - src1.w| + src2.w
+
+
+.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
+ from a specified texture image. The source sampler may
+ not be a CUBE or SHADOW.
+ src 0 is a four-component signed integer vector used to
+ identify the single texel accessed. 3 components + level.
+ src 1 is a 3 component constant signed integer vector,
+ with each component only having a range of
+ -8..+8 (hw only seems to deal with this range, interface
+ allows for up to unsigned int).
+ TXF(uint_vec coord, int_vec offset).
+
+
+.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
+ retrieve the dimensions of the texture
+ depending on the target. For 1D (width), 2D/RECT/CUBE
+ (width, height), 3D (width, height, depth),
+ 1D array (width, layers), 2D array (width, height, layers)
+
+.. math::
+
+ lod = src0
+
+ dst.x = texture_width(unit, lod)
+
+ dst.y = texture_height(unit, lod)
+
+ dst.z = texture_depth(unit, lod)
+
+
+Integer ISA
+^^^^^^^^^^^^^^^^^^^^^^^^
+These opcodes are used for integer operations.
+Support for these opcodes is indicated by PIPE_SHADER_CAP_INTEGERS (all of them?)
+
+
+.. opcode:: I2F - Signed Integer To Float
+
+ Rounding is unspecified (round to nearest even suggested).
.. math::
@@ -1004,56 +1097,157 @@ XXX so let's discuss it, yeah?
dst.w = (float) src.w
-.. opcode:: NOT - Bitwise Not
+.. opcode:: U2F - Unsigned Integer To Float
+
+ Rounding is unspecified (round to nearest even suggested).
.. math::
- dst.x = ~src.x
+ dst.x = (float) src.x
- dst.y = ~src.y
+ dst.y = (float) src.y
- dst.z = ~src.z
+ dst.z = (float) src.z
- dst.w = ~src.w
+ dst.w = (float) src.w
-.. opcode:: TRUNC - Truncate
+.. opcode:: F2I - Float to Signed Integer
+
+ Rounding is towards zero (truncate).
+ Values outside signed range (including NaNs) produce undefined results.
.. math::
- dst.x = trunc(src.x)
+ dst.x = (int) src.x
- dst.y = trunc(src.y)
+ dst.y = (int) src.y
- dst.z = trunc(src.z)
+ dst.z = (int) src.z
- dst.w = trunc(src.w)
+ dst.w = (int) src.w
-.. opcode:: SHL - Shift Left
+.. opcode:: F2U - Float to Unsigned Integer
+
+ Rounding is towards zero (truncate).
+ Values outside unsigned range (including NaNs) produce undefined results.
.. math::
- dst.x = src0.x << src1.x
+ dst.x = (unsigned) src.x
- dst.y = src0.y << src1.x
+ dst.y = (unsigned) src.y
- dst.z = src0.z << src1.x
+ dst.z = (unsigned) src.z
- dst.w = src0.w << src1.x
+ dst.w = (unsigned) src.w
-.. opcode:: SHR - Shift Right
+.. opcode:: UADD - Integer Add
+
+ This instruction works the same for signed and unsigned integers.
+ The low 32 bits of the result are returned.
.. math::
- dst.x = src0.x >> src1.x
+ dst.x = src0.x + src1.x
- dst.y = src0.y >> src1.x
+ dst.y = src0.y + src1.y
- dst.z = src0.z >> src1.x
+ dst.z = src0.z + src1.z
- dst.w = src0.w >> src1.x
+ dst.w = src0.w + src1.w
+
+
+.. opcode:: UMAD - Integer Multiply And Add
+
+ This instruction works the same for signed and unsigned integers.
+ The multiplication returns the low 32 bits (as does the result itself).
+
+.. math::
+
+ dst.x = src0.x \times src1.x + src2.x
+
+ dst.y = src0.y \times src1.y + src2.y
+
+ dst.z = src0.z \times src1.z + src2.z
+
+ dst.w = src0.w \times src1.w + src2.w
+
+
+.. opcode:: UMUL - Integer Multiply
+
+ This instruction works the same for signed and unsigned integers.
+ The low 32 bits of the result are returned.
+
+.. math::
+
+ dst.x = src0.x \times src1.x
+
+ dst.y = src0.y \times src1.y
+
+ dst.z = src0.z \times src1.z
+
+ dst.w = src0.w \times src1.w
+
+
+.. opcode:: IDIV - Signed Integer Division
+
+ TBD: behavior for division by zero.
+
+.. math::
+
+ dst.x = src0.x / src1.x
+
+ dst.y = src0.y / src1.y
+
+ dst.z = src0.z / src1.z
+
+ dst.w = src0.w / src1.w
+
+
+.. opcode:: UDIV - Unsigned Integer Division
+
+ For division by zero, 0xffffffff is returned.
+
+.. math::
+
+ dst.x = src0.x / src1.x
+
+ dst.y = src0.y / src1.y
+
+ dst.z = src0.z / src1.z
+
+ dst.w = src0.w / src1.w
+
+
+.. opcode:: UMOD - Unsigned Integer Remainder
+
+ If the second argument is zero, 0xffffffff is returned.
+
+.. math::
+
+ dst.x = src0.x \bmod src1.x
+
+ dst.y = src0.y \bmod src1.y
+
+ dst.z = src0.z \bmod src1.z
+
+ dst.w = src0.w \bmod src1.w
+
+
+.. opcode:: NOT - Bitwise Not
+
+.. math::
+
+ dst.x = ~src.x
+
+ dst.y = ~src.y
+
+ dst.z = ~src.z
+
+ dst.w = ~src.w
.. opcode:: AND - Bitwise And
@@ -1082,114 +1276,136 @@ XXX so let's discuss it, yeah?
dst.w = src0.w | src1.w
-.. opcode:: MOD - Modulus
+.. opcode:: XOR - Bitwise Xor
.. math::
- dst.x = src0.x \bmod src1.x
+ dst.x = src0.x \oplus src1.x
- dst.y = src0.y \bmod src1.y
+ dst.y = src0.y \oplus src1.y
- dst.z = src0.z \bmod src1.z
+ dst.z = src0.z \oplus src1.z
- dst.w = src0.w \bmod src1.w
+ dst.w = src0.w \oplus src1.w
-.. opcode:: XOR - Bitwise Xor
+.. opcode:: IMAX - Maximum of Signed Integers
.. math::
- dst.x = src0.x \oplus src1.x
+ dst.x = max(src0.x, src1.x)
- dst.y = src0.y \oplus src1.y
+ dst.y = max(src0.y, src1.y)
- dst.z = src0.z \oplus src1.z
+ dst.z = max(src0.z, src1.z)
- dst.w = src0.w \oplus src1.w
+ dst.w = max(src0.w, src1.w)
-.. opcode:: UCMP - Integer Conditional Move
+.. opcode:: UMAX - Maximum of Unsigned Integers
.. math::
- dst.x = src0.x ? src1.x : src2.x
+ dst.x = max(src0.x, src1.x)
- dst.y = src0.y ? src1.y : src2.y
+ dst.y = max(src0.y, src1.y)
- dst.z = src0.z ? src1.z : src2.z
+ dst.z = max(src0.z, src1.z)
- dst.w = src0.w ? src1.w : src2.w
+ dst.w = max(src0.w, src1.w)
-.. opcode:: UARL - Integer Address Register Load
+.. opcode:: IMIN - Minimum of Signed Integers
- Moves the contents of the source register, assumed to be an integer, into the
- destination register, which is assumed to be an address (ADDR) register.
+.. math::
+ dst.x = min(src0.x, src1.x)
-.. opcode:: IABS - Integer Absolute Value
+ dst.y = min(src0.y, src1.y)
+
+ dst.z = min(src0.z, src1.z)
+
+ dst.w = min(src0.w, src1.w)
+
+
+.. opcode:: UMIN - Minimum of Unsigned Integers
.. math::
- dst.x = |src.x|
+ dst.x = min(src0.x, src1.x)
- dst.y = |src.y|
+ dst.y = min(src0.y, src1.y)
- dst.z = |src.z|
+ dst.z = min(src0.z, src1.z)
- dst.w = |src.w|
+ dst.w = min(src0.w, src1.w)
-.. opcode:: SAD - Sum Of Absolute Differences
+.. opcode:: SHL - Shift Left
.. math::
- dst.x = |src0.x - src1.x| + src2.x
+ dst.x = src0.x << src1.x
- dst.y = |src0.y - src1.y| + src2.y
+ dst.y = src0.y << src1.x
- dst.z = |src0.z - src1.z| + src2.z
+ dst.z = src0.z << src1.x
- dst.w = |src0.w - src1.w| + src2.w
+ dst.w = src0.w << src1.x
-.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
- from a specified texture image. The source sampler may
- not be a CUBE or SHADOW.
- src 0 is a four-component signed integer vector used to
- identify the single texel accessed. 3 components + level.
- src 1 is a 3 component constant signed integer vector,
- with each component only have a range of
- -8..+8 (hw only seems to deal with this range, interface
- allows for up to unsigned int).
- TXF(uint_vec coord, int_vec offset).
+.. opcode:: ISHR - Arithmetic Shift Right (of Signed Integer)
+.. math::
-.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
- retrieve the dimensions of the texture
- depending on the target. For 1D (width), 2D/RECT/CUBE
- (width, height), 3D (width, height, depth),
- 1D array (width, layers), 2D array (width, height, layers)
+ dst.x = src0.x >> src1.x
+
+ dst.y = src0.y >> src1.x
+
+ dst.z = src0.z >> src1.x
+
+ dst.w = src0.w >> src1.x
+
+
+.. opcode:: USHR - Logical Shift Right
.. math::
- lod = src0
+ dst.x = src0.x >> (unsigned) src1.x
- dst.x = texture_width(unit, lod)
+ dst.y = src0.y >> (unsigned) src1.x
- dst.y = texture_height(unit, lod)
+ dst.z = src0.z >> (unsigned) src1.x
- dst.z = texture_depth(unit, lod)
+ dst.w = src0.w >> (unsigned) src1.x
-.. opcode:: CONT - Continue
- TBD
-.. note::
+.. opcode:: UCMP - Integer Conditional Move
- Support for CONT is determined by a special capability bit,
- ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
+.. math::
+
+ dst.x = src0.x ? src1.x : src2.x
+
+ dst.y = src0.y ? src1.y : src2.y
+
+ dst.z = src0.z ? src1.z : src2.z
+
+ dst.w = src0.w ? src1.w : src2.w
+
+
+.. opcode:: IABS - Integer Absolute Value
+
+.. math::
+
+ dst.x = |src.x|
+
+ dst.y = |src.y|
+
+ dst.z = |src.z|
+
+ dst.w = |src.w|
Geometry ISA
--
1.7.9.5