[Mesa-dev] [PATCH 1/4] gallium: add new opcodes for ARB_gs5 bit manipulation support

Fri Apr 25 15:20:42 PDT 2014

On 25.04.2014 23:19, Ilia Mirkin wrote:
> On Fri, Apr 25, 2014 at 5:02 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> On 25.04.2014 19:41, Ilia Mirkin wrote:
>>> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
>>> ---
>>>  src/gallium/auxiliary/tgsi/tgsi_info.c     |  8 +++++
>>>  src/gallium/docs/source/tgsi.rst           | 51 ++++++++++++++++++++++++++++++
>>>  src/gallium/include/pipe/p_shader_tokens.h | 11 ++++++-
>>>  3 files changed, 69 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c
>>> index 5bcc3c9..d03a920 100644
>>> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
>>> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
>>> @@ -223,6 +223,14 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] =
>>>     { 1, 2, 0, 0, 0, 0, COMP, "UMUL_HI", TGSI_OPCODE_UMUL_HI },
>>>     { 1, 3, 1, 0, 0, 0, OTHR, "TG4", TGSI_OPCODE_TG4 },
>>>     { 1, 2, 1, 0, 0, 0, OTHR, "LODQ", TGSI_OPCODE_LODQ },
>>> +   { 1, 3, 0, 0, 0, 0, COMP, "IBFE", TGSI_OPCODE_IBFE },
>>> +   { 1, 3, 0, 0, 0, 0, COMP, "UBFE", TGSI_OPCODE_UBFE },
>>> +   { 1, 4, 0, 0, 0, 0, COMP, "BFI", TGSI_OPCODE_BFI },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "BREV", TGSI_OPCODE_BREV },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "POPC", TGSI_OPCODE_POPC },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "LSB", TGSI_OPCODE_LSB },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "IMSB", TGSI_OPCODE_IMSB },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "UMSB", TGSI_OPCODE_UMSB },
>>>  };
>>>
>>>  const struct tgsi_opcode_info *
>>> diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
>>> index 0ea0759..95b069f 100644
>>> --- a/src/gallium/docs/source/tgsi.rst
>>> +++ b/src/gallium/docs/source/tgsi.rst
>>> @@ -1558,6 +1558,57 @@ Support for these opcodes indicated by PIPE_SHADER_CAP_INTEGERS (all of them?)
>>>
>>>    dst.w = |src.w|
>>>
>>> +Bitwise ISA
>>> +^^^^^^^^^^^
>>> +These opcodes are used for bit-level manipulation of integers.
>>> +
>>> +.. opcode:: IBFE - Signed Bitfield Extract
>>> +
>>> +.. math::
>>> +
>>> +  value = src0
>>> +
>>> +  offset = src1
>>> +
>>> +  bits = src2
>>> +
>>> +  dst = bitfield\_extract(value, offset, bits)
>>> +
>>> +.. opcode:: UBFE - Unsigned Bitfield Extract
>>> +
>>> +.. math::
>>> +
>>> +  value = src0
>>> +
>>> +  offset = src1
>>> +
>>> +  bits = src2
>>> +
>>> +  dst = bitfield\_extract(value, offset, bits)
>> I think the description for these two leaves a bit to be desired (you'd
>> even think they are the same).
> 
> They basically are the same, except for the sign extension.
Yes, of course. But you can't tell that from the description.

> What's the
> standard for such operations which don't map into "math" nicely?
> Should I stick some pseudo-code in?
Some paragraph including pseudo-code is fine by me. Or you could explain
the bitfield_extract term below under the Functions section (though I'm
not sure it's such a good idea - bitfield_extract() just isn't a very
well known term).
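
For reference, here is a rough C sketch of what such pseudo-code could
look like. It assumes the semantics of GLSL's bitfieldExtract(): the
field is returned in the low bits, UBFE zero-fills the upper bits, IBFE
sign-extends from the top bit of the field, and bits == 0 yields 0. The
function names ubfe/ibfe and the assert on offset + bits <= 32 are
illustrative only (GLSL leaves the out-of-range case undefined), so this
is not the wording of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned bitfield extract: return bits [offset, offset + bits) of
 * value in the low bits of the result, zero-extended. */
static uint32_t ubfe(uint32_t value, unsigned offset, unsigned bits)
{
    uint32_t mask;

    /* Assumption: out-of-range fields are a caller error here. */
    assert(offset + bits <= 32);

    if (bits == 0)
        return 0;

    /* Avoid the undefined 32-bit shift when the whole word is taken. */
    mask = (bits == 32) ? 0xffffffffu : ((1u << bits) - 1u);
    return (value >> offset) & mask;
}

/* Signed bitfield extract: same field, sign-extended from its top bit. */
static int32_t ibfe(int32_t value, unsigned offset, unsigned bits)
{
    uint32_t field = ubfe((uint32_t)value, offset, bits);
    uint32_t sign;

    if (bits == 0 || bits == 32)
        return (int32_t)field;

    /* (field ^ sign) - sign replicates the field's top bit upward;
     * the final cast assumes the usual two's-complement conversion. */
    sign = 1u << (bits - 1);
    return (int32_t)((field ^ sign) - sign);
}
```

With this sketch, extracting 4 bits at offset 4 from 0xF0 gives 0xF for
ubfe and -1 for ibfe, which is exactly the difference the current
description hides.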

> 
>>
>>> +
>>> +.. opcode:: BFI - Bitfield Insert
>>> +
>>> +.. math::
>>> +
>>> +  base = src0
>>> +
>>> +  insert = src1
>>> +
>>> +  offset = src2
>>> +
>>> +  bits = src3
>>> +
>>> +  dst = bitfield\_insert(base, insert, offset, bits)
>> Same as above.
>>
>>> +
>>> +.. opcode:: BREV - Bitfield Reverse
>> Could also be a bit more descriptive.
>>
>>> +
>>> +.. opcode:: POPC - Population Count (Count Set Bits)
>>> +
>>> +.. opcode:: LSB - Index of lowest set bit
>>> +
>>> +.. opcode:: IMSB - Index of highest non-sign bit
>> That looks very confusing to me, since it apparently is meant to give
>> the highest set bit if the number is positive, and the highest cleared
>> bit if the number is negative.
> 
> Right, so if the sign-bit is 1 (negative), it's the index of the
> highest 0. If the sign bit is 0 (positive), it's the index of the
> highest 1. And -1 if all the bits are the same. None of these at all
> map nicely to a "math" style of description. Perhaps I should just put
> in a paragraph for these?
Sounds good to me.
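
A short C sketch of the behaviour described above may help when writing
that paragraph. The function names umsb/imsb are illustrative. The IMSB
rule follows the description quoted above (highest 1 for non-negative
input, highest 0 for negative input, -1 if all bits are the same);
returning -1 from UMSB for a zero input is an assumption taken from
GLSL's findMSB().

```c
#include <stdint.h>

/* Index of the highest set bit, or -1 if no bit is set. */
static int umsb(uint32_t v)
{
    int i;

    for (i = 31; i >= 0; i--) {
        if (v & (1u << i))
            return i;
    }
    return -1;
}

/* Index of the highest bit that differs from the sign bit:
 * the highest 1 for non-negative input, the highest 0 for negative
 * input, and -1 when all bits are equal (0 or -1). */
static int imsb(int32_t v)
{
    uint32_t u = (uint32_t)v;

    /* Inverting a negative value turns "highest 0" into "highest 1". */
    if (v < 0)
        u = ~u;
    return umsb(u);
}
```

So imsb(5) is 2, imsb(-2) is 0, and both imsb(0) and imsb(-1) are -1.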

> 
>>
>>> +
>>> +.. opcode:: UMSB - Index of highest 1-bit
>> highest set bit?
> 
> Sure.
> 
>>
>> Otherwise these look reasonable to me.
>> As for the addc/subb I guess this is an area where just about everything
>> you do won't really match hw in any case. A quick glance at radeonsi
>> tells me that gcn actually _always_ sets the carry bit for normal int
>> adds/subs but does so in the VCC reg - so if you'd want to get this to a
>> "normal" register you'd have to do some other instruction (maybe
>> conditional 0/1 move based on VCC). However, gcn actually has subb/addc
>> instructions, these just do add/sub honoring that VCC bit (and again
>> still outputting VCC bit themselves).
>> But sm5 and glsl agree there - they both have addc/subb with just 2
>> inputs (so no carry/borrow input) but an additional "normal" overflow
>> output. Maybe this is easiest to transform into what hw will actually do
>> usually.
> 
> I was hoping to not have to deal with carry/borrow at the TGSI level
> at all and just have the GLSL lower to ADD + USLT or so, and then for
> hw capable of dealing with it (not nvc0, or at least the blob driver
> doesn't make use of a mechanism that'd enable it), having a peephole
> opt that converts the USLT to a "recover wherever the flag is at".
I guess an explicit carry instruction makes it somewhat more obvious
this really came from an addc. Not sure if that really matters, though.
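
For illustration, the ADD + USLT lowering mentioned above amounts to the
following in C. This is only a sketch of the idea, with hypothetical
helper names; it returns the carry/borrow as 0 or 1, which is what
GLSL's uaddCarry()/usubBorrow() expect.

```c
#include <stdint.h>

/* Two-input add with a separate carry output: the carry is recovered
 * with an unsigned compare, since the sum wrapped iff it is < a. */
static uint32_t add_with_carry(uint32_t a, uint32_t b, uint32_t *carry)
{
    uint32_t sum = a + b;   /* ADD: wraps modulo 2^32 */

    *carry = sum < a;       /* USLT-style compare, as 0 or 1 */
    return sum;
}

/* Two-input subtract with a separate borrow output. */
static uint32_t sub_with_borrow(uint32_t a, uint32_t b, uint32_t *borrow)
{
    *borrow = a < b;        /* a borrow occurs iff a < b (unsigned) */
    return a - b;           /* SUB: wraps modulo 2^32 */
}
```

If the compare opcode yields all-ones for true rather than 1 (which, as
far as I know, is how the TGSI integer compares behave), the lowering
would additionally need an AND with 1 to produce this result.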

Roland


> 
>>
>> Roland
>>
>>
>>>
>>>  Geometry ISA
>>>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
>>> index b537166..d095bd3 100644
>>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>>> @@ -462,7 +462,16 @@ struct tgsi_property_data {
>>>
>>>  #define TGSI_OPCODE_LODQ                183
>>>
>>> -#define TGSI_OPCODE_LAST                184
>>> +#define TGSI_OPCODE_IBFE                184
>>> +#define TGSI_OPCODE_UBFE                185
>>> +#define TGSI_OPCODE_BFI                 186
>>> +#define TGSI_OPCODE_BREV                187
>>> +#define TGSI_OPCODE_POPC                188
>>> +#define TGSI_OPCODE_LSB                 189
>>> +#define TGSI_OPCODE_IMSB                190
>>> +#define TGSI_OPCODE_UMSB                191
>>> +
>>> +#define TGSI_OPCODE_LAST                192
>>>
>>>  #define TGSI_SAT_NONE            0  /* do not saturate */
>>>  #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */
>>>