[Mesa-dev] [PATCH 1/4] gallium: add new opcodes for ARB_gs5 bit manipulation support
Roland Scheidegger
sroland at vmware.com
Fri Apr 25 15:20:42 PDT 2014
On 25.04.2014 23:19, Ilia Mirkin wrote:
> On Fri, Apr 25, 2014 at 5:02 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> On 25.04.2014 19:41, Ilia Mirkin wrote:
>>> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
>>> ---
>>> src/gallium/auxiliary/tgsi/tgsi_info.c | 8 +++++
>>> src/gallium/docs/source/tgsi.rst | 51 ++++++++++++++++++++++++++++++
>>> src/gallium/include/pipe/p_shader_tokens.h | 11 ++++++-
>>> 3 files changed, 69 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c
>>> index 5bcc3c9..d03a920 100644
>>> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
>>> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
>>> @@ -223,6 +223,14 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] =
>>> { 1, 2, 0, 0, 0, 0, COMP, "UMUL_HI", TGSI_OPCODE_UMUL_HI },
>>> { 1, 3, 1, 0, 0, 0, OTHR, "TG4", TGSI_OPCODE_TG4 },
>>> { 1, 2, 1, 0, 0, 0, OTHR, "LODQ", TGSI_OPCODE_LODQ },
>>> + { 1, 3, 0, 0, 0, 0, COMP, "IBFE", TGSI_OPCODE_IBFE },
>>> + { 1, 3, 0, 0, 0, 0, COMP, "UBFE", TGSI_OPCODE_UBFE },
>>> + { 1, 4, 0, 0, 0, 0, COMP, "BFI", TGSI_OPCODE_BFI },
>>> + { 1, 1, 0, 0, 0, 0, COMP, "BREV", TGSI_OPCODE_BREV },
>>> + { 1, 1, 0, 0, 0, 0, COMP, "POPC", TGSI_OPCODE_POPC },
>>> + { 1, 1, 0, 0, 0, 0, COMP, "LSB", TGSI_OPCODE_LSB },
>>> + { 1, 1, 0, 0, 0, 0, COMP, "IMSB", TGSI_OPCODE_IMSB },
>>> + { 1, 1, 0, 0, 0, 0, COMP, "UMSB", TGSI_OPCODE_UMSB },
>>> };
>>>
>>> const struct tgsi_opcode_info *
>>> diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
>>> index 0ea0759..95b069f 100644
>>> --- a/src/gallium/docs/source/tgsi.rst
>>> +++ b/src/gallium/docs/source/tgsi.rst
>>> @@ -1558,6 +1558,57 @@ Support for these opcodes indicated by PIPE_SHADER_CAP_INTEGERS (all of them?)
>>>
>>> dst.w = |src.w|
>>>
>>> +Bitwise ISA
>>> +^^^^^^^^^^^
>>> +These opcodes are used for bit-level manipulation of integers.
>>> +
>>> +.. opcode:: IBFE - Signed Bitfield Extract
>>> +
>>> +.. math::
>>> +
>>> + value = src0
>>> +
>>> + offset = src1
>>> +
>>> + bits = src2
>>> +
>>> + dst = bitfield\_extract(value, offset, bits)
>>> +
>>> +.. opcode:: UBFE - Unsigned Bitfield Extract
>>> +
>>> +.. math::
>>> +
>>> + value = src0
>>> +
>>> + offset = src1
>>> +
>>> + bits = src2
>>> +
>>> + dst = bitfield\_extract(value, offset, bits)
>> I think the description for these two leaves a bit to be desired (you'd
>> even think they are the same).
>
> They basically are the same, except for the sign extension.
Yes of course. But you can't tell from that description.
> What's the
> standard for such operations which don't map into "math" nicely?
> Should I stick some pseudo-code in?
Some paragraph including pseudo-code is fine by me. Or you could explain
the bitfield_extract term below under the Functions section (though I'm
not sure it's such a good idea - bitfield_extract() just isn't a very
well known term).
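
As an illustration of the kind of pseudo-code that would separate the two opcodes, here is a minimal C sketch. It assumes the semantics of GLSL's bitfieldExtract() (the field is bits [offset, offset + bits - 1], the result is 0 when bits is 0, and offset + bits <= 32). The helper names ubfe() and ibfe() are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned extract. Returns `bits` bits of `value`
 * starting at bit `offset`, zero-extended.
 * Assumption: offset + bits <= 32 (anything else is left undefined here). */
static uint32_t ubfe(uint32_t value, uint32_t offset, uint32_t bits)
{
    if (bits == 0)
        return 0;
    /* Avoid the undefined shift by 32 when the whole word is requested. */
    uint32_t mask = (bits >= 32) ? 0xffffffffu : ((1u << bits) - 1u);
    return (value >> offset) & mask;
}

/* Hypothetical helper: signed extract. Same field as ubfe(), but the top
 * bit of the field is replicated into the upper bits of the result. */
static int32_t ibfe(int32_t value, uint32_t offset, uint32_t bits)
{
    if (bits == 0)
        return 0;
    uint32_t field = ubfe((uint32_t)value, offset, bits);
    uint32_t sign = 1u << (bits - 1);
    /* (field ^ sign) - sign sign-extends a `bits`-wide field. The final
     * cast assumes the usual two's complement conversion. */
    return (int32_t)((field ^ sign) - sign);
}
```

With this, the only difference between the two is visible in the last step: the unsigned variant stops after masking, the signed one sign-extends the masked field.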
>
>>
>>> +
>>> +.. opcode:: BFI - Bitfield Insert
>>> +
>>> +.. math::
>>> +
>>> + base = src0
>>> +
>>> + insert = src1
>>> +
>>> + offset = src2
>>> +
>>> + bits = src3
>>> +
>>> + dst = bitfield\_insert(base, insert, offset, bits)
>> Same as above.
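
A similar sketch for the insert case, again only as an illustration. It assumes the semantics of GLSL's bitfieldInsert(): bits [offset, offset + bits - 1] of the result come from the low bits of insert, all other bits come from base, and bits == 0 returns base unchanged. The helper name bfi() is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: replace `bits` bits of `base`, starting at bit
 * `offset`, with the low `bits` bits of `insert`.
 * Assumption: offset + bits <= 32 (anything else is left undefined here). */
static uint32_t bfi(uint32_t base, uint32_t insert,
                    uint32_t offset, uint32_t bits)
{
    if (bits == 0)
        return base;
    /* Avoid the undefined shift by 32 when the whole word is replaced. */
    uint32_t mask = (bits >= 32) ? 0xffffffffu : ((1u << bits) - 1u);
    mask <<= offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```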
>>
>>> +
>>> +.. opcode:: BREV - Bitfield Reverse
>> Could also be a bit more descriptive.
>>
>>> +
>>> +.. opcode:: POPC - Population Count (Count Set Bits)
>>> +
>>> +.. opcode:: LSB - Index of lowest set bit
>>> +
>>> +.. opcode:: IMSB - Index of highest non-sign bit
>> That looks very confusing to me, since it apparently is meant to give
>> the highest set bit if the number is positive, and the highest cleared
>> bit if the number is negative.
>
> Right, so if the sign-bit is 1 (negative), it's the index of the
> highest 0. If the sign bit is 0 (positive), it's the index of the
> highest 1. And -1 if all the bits are the same. None of these at all
> map nicely to a "math" style of description. Perhaps I should just put
> in a paragraph for these?
Sounds good to me.
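
To make the behaviour described above concrete, a minimal C sketch follows. The IMSB rule is the one from the quoted text (highest 1 for non-negative inputs, highest 0 for negative inputs, -1 if all bits are the same). Returning -1 from the unsigned variant for an input of 0 is an assumption borrowed from GLSL's findMSB(). The helper names umsb() and imsb() are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: index of the highest set bit, or -1 if v == 0
 * (the -1 case is an assumption following GLSL findMSB). */
static int32_t umsb(uint32_t v)
{
    for (int32_t i = 31; i >= 0; i--) {
        if (v & (1u << i))
            return i;
    }
    return -1;
}

/* Hypothetical helper: index of the highest bit that differs from the
 * sign bit, or -1 if all bits are equal (v == 0 or v == -1). */
static int32_t imsb(int32_t v)
{
    uint32_t u = (uint32_t)v;
    /* For negative inputs, look for the highest 0 bit by inverting. */
    if (v < 0)
        u = ~u;
    return umsb(u);
}
```

Written this way, IMSB is just UMSB applied to the input after inverting negative values, which may be an easier way to phrase it in the docs than "highest non-sign bit".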
>
>>
>>> +
>>> +.. opcode:: UMSB - Index of highest 1-bit
>> highest set bit?
>
> Sure.
>
>>
>> Otherwise these look reasonable to me.
>> As for the addc/subb I guess this is an area where just about everything
>> you do won't really match hw in any case. A quick glance at radeonsi
>> tells me that gcn actually _always_ sets the carry bit for normal int
>> adds/subs but does so in the VCC reg - so if you'd want to get this to a
>> "normal" register you'd have to do some other instruction (maybe
>> conditional 0/1 move based on VCC). However, gcn actually has subb/addc
>> instructions, these just do add/sub honoring that VCC bit (and again
>> still outputting VCC bit themselves).
>> But sm5 and glsl agree there - they both have addc/subb with just 2
>> inputs (so no carry/borrow input) but an additional "normal" overflow
>> output. Maybe this is easiest to transform into what hw will actually do
>> usually.
>
> I was hoping to not have to deal with carry/borrow at the TGSI level
> at all and just have the GLSL lower to ADD + USLT or so, and then for
> hw capable of dealing with it (not nvc0, or at least the blob driver
> doesn't make use of a mechanism that'd enable it), having a peephole
> opt that converts the USLT to a "recover wherever the flag is at".
I guess an explicit carry instruction makes it somewhat more obvious
this really came from an addc. Not sure if that really matters, though.
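
For reference, the ADD + USLT lowering mentioned above can be sketched in C as below. This is only an illustration of the arithmetic, not driver code: the helper names are hypothetical, and the carry/borrow is returned as 0 or 1 the way GLSL's uaddCarry()/usubBorrow() report it.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit add with carry-out and no carry-in.
 * The carry is recovered with an unsigned compare: the wrapped sum is
 * smaller than an operand exactly when the addition overflowed. */
static uint32_t add_with_carry(uint32_t a, uint32_t b, uint32_t *carry)
{
    uint32_t sum = a + b;            /* the plain ADD (wraps modulo 2^32) */
    *carry = (sum < a) ? 1u : 0u;    /* the unsigned less-than compare */
    return sum;
}

/* Hypothetical helper: 32-bit subtract with borrow-out and no borrow-in.
 * A borrow occurs exactly when a < b. */
static uint32_t sub_with_borrow(uint32_t a, uint32_t b, uint32_t *borrow)
{
    *borrow = (a < b) ? 1u : 0u;
    return a - b;
}
```

A peephole pass on hw with a real carry flag would have to recognize the compare-against-the-sum pattern, which is where an explicit opcode would be more obvious.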
Roland
>
>>
>> Roland
>>
>>
>>>
>>> Geometry ISA
>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
>>> index b537166..d095bd3 100644
>>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>>> @@ -462,7 +462,16 @@ struct tgsi_property_data {
>>>
>>> #define TGSI_OPCODE_LODQ 183
>>>
>>> -#define TGSI_OPCODE_LAST 184
>>> +#define TGSI_OPCODE_IBFE 184
>>> +#define TGSI_OPCODE_UBFE 185
>>> +#define TGSI_OPCODE_BFI 186
>>> +#define TGSI_OPCODE_BREV 187
>>> +#define TGSI_OPCODE_POPC 188
>>> +#define TGSI_OPCODE_LSB 189
>>> +#define TGSI_OPCODE_IMSB 190
>>> +#define TGSI_OPCODE_UMSB 191
>>> +
>>> +#define TGSI_OPCODE_LAST 192
>>>
>>> #define TGSI_SAT_NONE 0 /* do not saturate */
>>> #define TGSI_SAT_ZERO_ONE 1 /* clamp to [0,1] */
>>>