[Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

Tue Dec 13 14:46:32 PST 2011

On 12/13/2011 04:22 PM, Jose Fonseca wrote:
> ----- Original Message -----
>>
>> ----- Original Message -----
>>> On 12/13/2011 03:48 PM, Jose Fonseca wrote:
>>>> ----- Original Message -----
>>>>> On 12/13/2011 03:25 PM, Jose Fonseca wrote:
>>>>>> ----- Original Message -----
>>>>>>> On 12/13/2011 03:09 PM, Jose Fonseca wrote:
>>>>>>>> ----- Original Message -----
>>>>>>>>> On 12/13/2011 12:26 PM, Bryan Cain wrote:
>>>>>>>>>> On 12/13/2011 02:11 PM, Jose Fonseca wrote:
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> This is an updated version of the patch set I sent to the
>>>>>>>>>>>> list
>>>>>>>>>>>> a
>>>>>>>>>>>> few
>>>>>>>>>>>> hours
>>>>>>>>>>>> ago.
>>>>>>>>>>>> There is now a TGSI property called
>>>>>>>>>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES
>>>>>>>>>>>> that drivers can use to determine how many of the 8
>>>>>>>>>>>> available
>>>>>>>>>>>> clip
>>>>>>>>>>>> distances
>>>>>>>>>>>> are actually used by a shader.
>>>>>>>>>>> Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be
>>>>>>>>>>> easily
>>>>>>>>>>> derived from the shader, and queried through
>>>>>>>>>>> src/gallium/auxiliary/tgsi/tgsi_scan.h ?
>>>>>>>>>> No.  The clip distances can be indirectly addressed (there
>>>>>>>>>> are
>>>>>>>>>> up
>>>>>>>>>> to 2
>>>>>>>>>> of them in vec4 form for a total of 8 floats), which makes
>>>>>>>>>> it
>>>>>>>>>> impossible
>>>>>>>>>> to determine which ones are used by analyzing the shader.
>>>>>>>>> The description is almost complete. :)  The issue is that
>>>>>>>>> the
>>>>>>>>> shader
>>>>>>>>> may
>>>>>>>>> declare
>>>>>>>>>
>>>>>>>>> out float gl_ClipDistance[4];
>>>>>>>>>
>>>>>>>>> the use non-constant addressing of the array.  The compiler
>>>>>>>>> knows
>>>>>>>>> that
>>>>>>>>> gl_ClipDistance has at most 4 elements, but post-hoc
>>>>>>>>> analysis
>>>>>>>>> would
>>>>>>>>> not
>>>>>>>>> be able to determine that.  Often the fixed-function
>>>>>>>>> hardware
>>>>>>>>> (see
>>>>>>>>> below) needs to know which clip distance values are actually
>>>>>>>>> written.
>>>>>>>> But don't all the clip distances written by the shader need
>>>>>>>> to
>>>>>>>> be
>>>>>>>> declared?
>>>>>>>>
>>>>>>>> E.g.:
>>>>>>>>
>>>>>>>> DCL OUT[0], CLIPDIST[0]
>>>>>>>> DCL OUT[1], CLIPDIST[1]
>>>>>>>> DCL OUT[2], CLIPDIST[2]
>>>>>>>> DCL OUT[3], CLIPDIST[3]
>>>>>>>>
>>>>>>>> therefore a trivial analysis of the declarations convey that?
>>>>>>> No.  Clip distance is an array of up to 8 floats in GLSL, but
>>>>>>> it's
>>>>>>> represented in the hardware as 2 vec4s.  You can tell by
>>>>>>> analyzing
>>>>>>> the
>>>>>>> declarations whether there are more than 4 clip distances in
>>>>>>> use,
>>>>>>> but
>>>>>>> not which components the shader writes to.
>>>>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components
>>>>>>> in
>>>>>>> use,
>>>>>>> not
>>>>>>> the number of full vectors.
>>>>>> Lets imagine
>>>>>>
>>>>>>   out float gl_ClipDistance[6];
>>>>>>
>>>>>> Each a clip distance is a scalar float.
>>>>>>
>>>>>> Either all hardware represents the 8 clip distances as two 4
>>>>>> vectors, and we do:
>>>>>>
>>>>>>   DCL OUT[0].xywz, CLIPDIST[0]
>>>>>>   DCL OUT[1].xy, CLIPDIST[1]
>>>>>>
>>>>>> using the full range of struct tgsi_declaration::UsageMask [1]
>>>>>> or
>>>>>> we represent them as as scalars:
>>>>>>
>>>>>>   DCL OUT[0].x, CLIPDIST[0]
>>>>>>   DCL OUT[1].x, CLIPDIST[1]
>>>>>>   DCL OUT[2].x, CLIPDIST[2]
>>>>>>   DCL OUT[3].x, CLIPDIST[3]
>>>>>>   DCL OUT[4].x, CLIPDIST[4]
>>>>>>   DCL OUT[5].x, CLIPDIST[5]
>>>>>>
>>>>>> If indirect addressing is allowed as I read bore, then maybe
>>>>>> the
>>>>>> later is better.
>>>>>>
>>>>>> I confess my ignorance about clipping and maybe I'm being
>>>>>> dense,
>>>>>> but I still don't see the need for this
>>>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES.  Could you please draft an
>>>>>> example TGSI shader showing this property (or just paste one
>>>>>> generated with your change)?  I think that would help a lot.
>>>>>>
>>>>>>
>>>>>> Jose
>>>>>>
>>>>>>
>>>>>> [1] I don't know if tgsi_dump pays much attention to
>>>>>>  tgsi_declaration::UsageMask, but it does exist.
>>>>> UsageMask might work, but before that can be considered a viable
>>>>> solution, someone will need to make it possible to actually
>>>>> declare
>>>>> it
>>>>> from ureg.  As it is, ureg is hardcoded to set UsageMask to xyzw
>>>>> no
>>>>> matter what on all declared inputs and outputs.
>>>> ureg automatically fills the UsageMask from the destionation
>>>> register masks, since it easy to determine from the opcodes.
>>> Wait, where does it do that?  When I search through tgsi_ureg.c for
>>> "UsageMask", all it shows are assignments of TGSI_WRITEMASK_XYZW to
>>> the
>>> UsageMask property.
>> ah. I may be lying. But I'm pretty sure I wrote such code somewhere,
>> sometime. Let me dig it.
> I was lying.  I wrote tgsi_util_get_inst_usage_mask() in src/gallium/auxiliary/tgsi/tgsi_util.c , but it only analyses which registers are _read_, and never got hooked into ureg anyway.
>
> I don't want you to go over hoops just to pass a scalar quantity. So may be just add ability to ureg to specify declaration's output mask?

One problem with this is that the output mask would have to be a
parameter to a new declaration function in ureg, like
ureg_DECL_output_with_mask() instead of just ureg_DECL_output().  In
this case, ureg_DECL_output_with_mask() would be the only DECL function
with a usage mask specified.  If that asymmetry is okay with you, I
think I could go the UsageMask route.

> Another approach would be just to add the property, and kill output mask. Two ways of doing the same is what I'd like to avoid. I'll need a day (it's late here) to think about this and see how output mask is being actually used in the code.
>
> Jose