[Mesa-dev] [PATCH 1/6] gallium: document PK2H/UP2H

Sun Jan 3 10:02:50 PST 2016

On Sun, Jan 3, 2016 at 12:33 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> Am 03.01.2016 um 01:37 schrieb Ilia Mirkin:
>> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
>> ---
>>  src/gallium/docs/source/tgsi.rst | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
>> index 955ece8..f69998f 100644
>> --- a/src/gallium/docs/source/tgsi.rst
>> +++ b/src/gallium/docs/source/tgsi.rst
>> @@ -458,7 +458,9 @@ while DDY is allowed to be the same for the entire 2x2 quad.
>>
>>  .. opcode:: PK2H - Pack Two 16-bit Floats
>>
>> -  TBD
>> +.. math::
>> +
>> +  dst.x = f32\_to\_f16(src.x) | f32\_to\_f16(src.y) << 16
> This doesn't quite match the tgsi info description (which says that the
> result is replicated). If you don't want channel replication probably
> should make that CHAN there instead.

I'll add the replication to the docs. Looks like NV_fragment_program
also wanted this:

      tmp0 = VectorLoad(op0);
      /* result obtained by combining raw bits of tmp0.x, tmp0.y */
      result.x = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
      result.y = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
      result.z = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
      result.w = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);

But it looks like that's just packing the raw bits, not actually
converting. And it's unclear whether UP2H is converting or not... let's
assume that they both do the conversions, or else these opcodes are
going to be useless.
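
To make the intended PK2H semantics concrete (convert, pack, then
replicate to all four channels), here is a rough C sketch. It is only an
illustration, not the tgsi_exec or gallivm code: pk2h() is a
hypothetical name, f32_to_f16() is a hand-written stand-in for whatever
conversion helper an implementation would really use, and the
truncating rounding is my assumption since nothing above specifies a
rounding mode.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: float32 -> IEEE 754 half bits.
 * Assumption: extra mantissa bits are truncated (round toward zero). */
static uint16_t f32_to_f16(float f)
{
    uint32_t x;
    memcpy(&x, &f, sizeof x);

    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t exp = (x >> 23) & 0xffu;
    uint32_t man = x & 0x7fffffu;

    if (exp == 0xffu)                      /* Inf or NaN */
        return (uint16_t)(sign | 0x7c00u | (man ? 0x200u : 0u));

    int32_t e = (int32_t)exp - 127 + 15;   /* rebias the exponent */
    if (e >= 31)                           /* too large: becomes Inf */
        return (uint16_t)(sign | 0x7c00u);
    if (e <= 0) {                          /* half subnormal or zero */
        if (e < -10)
            return (uint16_t)sign;
        man |= 0x800000u;                  /* restore the implicit 1 */
        return (uint16_t)(sign | (man >> (14 - e)));
    }
    return (uint16_t)(sign | ((uint32_t)e << 10) | (man >> 13));
}

/* PK2H as discussed: pack src.x and src.y as halves, replicate to xyzw. */
static void pk2h(uint32_t dst[4], const float src[4])
{
    uint32_t packed = (uint32_t)f32_to_f16(src[0]) |
                      ((uint32_t)f32_to_f16(src[1]) << 16);

    for (int i = 0; i < 4; i++)
        dst[i] = packed;
}
```

With src.xy = (1.0, 2.0), every channel of dst holds 0x40003c00
(0x3c00 is 1.0 and 0x4000 is 2.0 as halves).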

>
>
>
>>
>>  .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
>> @@ -615,7 +617,11 @@ This instruction replicates its result.
>>
>>  .. opcode:: UP2H - Unpack Two 16-Bit Floats
>>
>> -  TBD
>> +.. math::
>> +
>> +  dst.x = f16\_to\_f32(src0.x \& 0xffff)
>> +
>> +  dst.y = f16\_to\_f32(src0.x >> 16)
>>
> I'm certainly ok with that, albeit (just like PK2H unless you do
> replication) it's not what the original source for this opcode does
> (which would have been NV_fragment_program).

      tmp = ScalarLoad(op0);
      result.x = (fp16) (RawBits(tmp) & 0xFFFF);
      result.y = (fp16) ((RawBits(tmp) >> 16) & 0xFFFF);
      result.z = (fp16) (RawBits(tmp) & 0xFFFF);
      result.w = (fp16) ((RawBits(tmp) >> 16) & 0xFFFF);

Happy to add the .zw = .xy bit here as well. I was previously not
aware that these ops came from NV_fragment_program, and instead
assumed that they came from some incomplete attempt to do...
something. (I guess it was for implementing NV_fragment_program ;) )
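
For reference, a rough C sketch of UP2H with the .zw = .xy replication
added, assuming the opcode really does convert from half to float. As
before this is only an illustration: up2h() is a hypothetical name and
f16_to_f32() is a hand-written stand-in for a real conversion helper.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: IEEE 754 half bits -> float32 (always exact). */
static float f16_to_f32(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1fu;
    uint32_t man = h & 0x3ffu;
    uint32_t bits;
    float f;

    if (exp == 0x1fu) {                    /* Inf or NaN */
        bits = sign | 0x7f800000u | (man << 13);
    } else if (exp != 0) {                 /* normal number: rebias */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man != 0) {                 /* subnormal: normalize it */
        exp = 113u;
        while (!(man & 0x400u)) {
            man <<= 1;
            exp--;
        }
        bits = sign | (exp << 23) | ((man & 0x3ffu) << 13);
    } else {                               /* signed zero */
        bits = sign;
    }

    memcpy(&f, &bits, sizeof f);
    return f;
}

/* UP2H as discussed: low half -> x and z, high half -> y and w. */
static void up2h(float dst[4], uint32_t src_x)
{
    float lo = f16_to_f32((uint16_t)(src_x & 0xffffu));
    float hi = f16_to_f32((uint16_t)(src_x >> 16));

    dst[0] = lo;
    dst[1] = hi;
    dst[2] = lo;
    dst[3] = hi;
}
```

Feeding it 0x40003c00 gives (1.0, 2.0, 1.0, 2.0), i.e. it undoes the
PK2H packing of src.xy = (1.0, 2.0).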

>
> For the series (with the first point addressed either way, though a tgsi
> exec implementation which should be trivial wouldn't hurt either)
> Reviewed-by: Roland Scheidegger <sroland at vmware.com>

Thanks! I'll do a patch for that shortly (tgsi_exec). Unfortunately I
won't be able to enable the cap since it will still use gallivm by
default for vertices. I have a gallivm implementation as well, but it
hits asserts on LLVM 3.5. I'm pretty sure I tested it at one point or
another, but it must have been on another box with a more recent LLVM.

  -ilia