[Mesa-dev] [PATCH 1/6] gallium: document PK2H/UP2H

Roland Scheidegger sroland at vmware.com
Sun Jan 3 11:08:51 PST 2016


Am 03.01.2016 um 19:02 schrieb Ilia Mirkin:
> On Sun, Jan 3, 2016 at 12:33 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> Am 03.01.2016 um 01:37 schrieb Ilia Mirkin:
>>> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
>>> ---
>>>  src/gallium/docs/source/tgsi.rst | 10 ++++++++--
>>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
>>> index 955ece8..f69998f 100644
>>> --- a/src/gallium/docs/source/tgsi.rst
>>> +++ b/src/gallium/docs/source/tgsi.rst
>>> @@ -458,7 +458,9 @@ while DDY is allowed to be the same for the entire 2x2 quad.
>>>
>>>  .. opcode:: PK2H - Pack Two 16-bit Floats
>>>
>>> -  TBD
>>> +.. math::
>>> +
>>> +  dst.x = f32\_to\_f16(src.x) | f32\_to\_f16(src.y) << 16
>> This doesn't quite match the tgsi info description (which says that the
>> result is replicated). If you don't want channel replication, you should
>> probably make that CHAN there instead.
> 
> I'll add the replication to the docs. Looks like NV_fragment_program
> also wanted this:
> 
>       tmp0 = VectorLoad(op0);
>       /* result obtained by combining raw bits of tmp0.x, tmp0.y */
>       result.x = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
>       result.y = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
>       result.z = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
>       result.w = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
> 
> But looks like it's just packing, not actually converting. And it's
> unclear whether UP2H is converting or not... let's assume that they do
> the conversions or else this is going to be useless.
I don't think it's quite true that it only packs (the pseudo-code is
probably a bit sloppy...); given what nv30 could do, that wouldn't make
sense. Also, the UP2H description clearly states "...undoes the type
conversion and packing performed by the PK2H instruction", although the
pseudo-code there doesn't really mention float anywhere either. I think
this is due to the possibility of the src (for pk2h) or dst (for up2h)
being either a float or a half reg, so in the latter case you wouldn't
get any conversion (but don't quote me on that...).
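
To make the "convert, then pack" reading concrete, here is a minimal C sketch of PK2H with the result replicated to all four channels. The name f32_to_f16 is taken from the patch's formula, but its body, the pk2h() wrapper, and the choice of round-to-nearest-even are assumptions made for illustration; this is not Mesa's helper and not what any particular hardware does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical f32 -> f16 conversion, round-to-nearest-even (assumed). */
static uint16_t f32_to_f16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);          /* raw bits of the float */

    uint32_t sign = (bits >> 16) & 0x8000u;
    uint32_t mag  = bits & 0x7fffffffu;
    uint32_t exp  = mag >> 23;               /* biased f32 exponent */
    uint32_t man  = mag & 0x007fffffu;
    uint32_t half, rem, halfway;

    if (mag > 0x7f800000u)
        return (uint16_t)(sign | 0x7e00u);   /* NaN -> quiet NaN */
    if (mag >= 0x47800000u)
        return (uint16_t)(sign | 0x7c00u);   /* Inf, or magnitude >= 2^16 */

    if (exp >= 113u) {
        /* Result is a normal half: rebias exponent, drop 13 mantissa bits. */
        half    = ((exp - 112u) << 10) | (man >> 13);
        rem     = man & 0x1fffu;
        halfway = 0x1000u;
    } else {
        /* Result is a half denormal (or zero). */
        uint32_t shift = 126u - exp;
        if (shift > 24u)
            return (uint16_t)sign;           /* too small: rounds to zero */
        man    |= 0x00800000u;               /* make the implicit 1 explicit */
        half    = man >> shift;
        rem     = man & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1u);
    }

    /* Round to nearest, ties to even; a carry may bump the exponent. */
    if (rem > halfway || (rem == halfway && (half & 1u)))
        half++;

    return (uint16_t)(sign | half);
}

/* PK2H with replication: every dst channel gets the same packed word. */
static void pk2h(const float src[4], uint32_t dst[4])
{
    uint32_t packed = (uint32_t)f32_to_f16(src[0]) |
                      ((uint32_t)f32_to_f16(src[1]) << 16);
    for (int i = 0; i < 4; i++)
        dst[i] = packed;
}
```

If the source were already a half register, the f32_to_f16 calls would simply drop out and only the shift-and-or would remain, which may be why the extension's pseudo-code only shows RawBits.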


> 
>>
>>
>>
>>>
>>>  .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
>>> @@ -615,7 +617,11 @@ This instruction replicates its result.
>>>
>>>  .. opcode:: UP2H - Unpack Two 16-Bit Floats
>>>
>>> -  TBD
>>> +.. math::
>>> +
>>> +  dst.x = f16\_to\_f32(src0.x \& 0xffff)
>>> +
>>> +  dst.y = f16\_to\_f32(src0.x >> 16)
>>>
>> I'm certainly ok with that, albeit (just like PK2H unless you do
>> replication) it's not what the original source for this opcode does
>> (which would have been NV_fragment_program).
> 
>       tmp = ScalarLoad(op0);
>       result.x = (fp16) (RawBits(tmp) & 0xFFFF);
>       result.y = (fp16) ((RawBits(tmp) >> 16) & 0xFFFF);
>       result.z = (fp16) (RawBits(tmp) & 0xFFFF);
>       result.w = (fp16) ((RawBits(tmp) >> 16) & 0xFFFF);
> 
> Happy to add the .zw = .xy bit here as well. I was previously not
> aware that these ops came from NV_fragment_program, and instead
> assumed that they came from some incomplete attempt to do...
> something. (I guess it was for implementing NV_fragment_program ;) )
Yes. I don't think any real effort was ever made to support it, but at
the time tgsi was supposed to provide a superset of all available opcodes
coming from somewhere (be it gl extensions or d3d9).
There's actually an ooooooooooold branch sitting on fdo where Michal
removed support for all these opcodes, but it was never merged
(http://cgit.freedesktop.org/mesa/mesa/commit/?id=5efeade4dc7ffe2d10b231b56fac60dbaa8aa0c8)

So, if you want slightly different semantics that should be fine, but if
the original ones aren't annoying you could of course just stick to them.
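
For reference, the original (NV_fragment_program-style) UP2H semantics with .zw = .xy would look roughly like the C sketch below. The name f16_to_f32 comes from the patch's formula; its body and the up2h() wrapper are assumptions for illustration only, not the tgsi_exec or gallivm code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical f16 -> f32 conversion; exact, since every half fits a float. */
static float f16_to_f32(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1fu;       /* biased f16 exponent */
    uint32_t man  = h & 0x3ffu;
    uint32_t bits;
    float f;

    if (exp == 0x1fu) {
        /* Inf or NaN: keep the mantissa payload. */
        bits = sign | 0x7f800000u | (man << 13);
    } else if (exp != 0u) {
        /* Normal number: rebias exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man != 0u) {
        /* Denormal half: normalize until the implicit bit appears. */
        exp = 113u;
        while (!(man & 0x400u)) {
            man <<= 1;
            exp--;
        }
        bits = sign | (exp << 23) | ((man & 0x3ffu) << 13);
    } else {
        bits = sign;                         /* signed zero */
    }

    memcpy(&f, &bits, sizeof f);
    return f;
}

/* UP2H with the original replication: .x/.z from low half, .y/.w from high. */
static void up2h(uint32_t src_x, float dst[4])
{
    dst[0] = f16_to_f32((uint16_t)(src_x & 0xffffu));
    dst[1] = f16_to_f32((uint16_t)(src_x >> 16));
    dst[2] = dst[0];
    dst[3] = dst[1];
}
```

Dropping the last two assignments gives the non-replicating variant from the patch, so the difference between the two semantics is small either way.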

Roland


> 
>>
>> For the series (with the first point addressed either way, though a tgsi
>> exec implementation, which should be trivial, wouldn't hurt either)
>> Reviewed-by: Roland Scheidegger <sroland at vmware.com>
> 
> Thanks! I'll do a patch for that shortly (tgsi_exec). Unfortunately I
> won't be able to enable the cap since it will still use gallivm by
> default for vertices. I have a gallivm implementation as well, but it
> hits asserts on LLVM 3.5. I'm pretty sure I tested it at one point or
> another, but it must have been on another box with a more recent LLVM.

Ah right. f16 conversion is pretty annoying indeed, though I'd hope the
helpers for that work. In any case, I only really suggested that
because I thought it would be trivial, so if it's not I don't consider
it important...

Roland



