[Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

Tue Dec 13 14:17:49 PST 2011

On 12/13/2011 10:48 PM, Jose Fonseca wrote:
> ----- Original Message -----
>> On 12/13/2011 03:25 PM, Jose Fonseca wrote:
>>>
>>> ----- Original Message -----
>>>> On 12/13/2011 03:09 PM, Jose Fonseca wrote:
>>>>> ----- Original Message -----
>>>>>> On 12/13/2011 12:26 PM, Bryan Cain wrote:
>>>>>>> On 12/13/2011 02:11 PM, Jose Fonseca wrote:
>>>>>>>> ----- Original Message -----
>>>>>>>>> This is an updated version of the patch set I sent to the
>>>>>>>>> list
>>>>>>>>> a
>>>>>>>>> few
>>>>>>>>> hours
>>>>>>>>> ago.
>>>>>>>>> There is now a TGSI property called
>>>>>>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES
>>>>>>>>> that drivers can use to determine how many of the 8 available
>>>>>>>>> clip
>>>>>>>>> distances
>>>>>>>>> are actually used by a shader.
>>>>>>>> Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily
>>>>>>>> derived from the shader, and queried through
>>>>>>>> src/gallium/auxiliary/tgsi/tgsi_scan.h ?
>>>>>>> No.  The clip distances can be indirectly addressed (there are
>>>>>>> up
>>>>>>> to 2
>>>>>>> of them in vec4 form for a total of 8 floats), which makes it
>>>>>>> impossible
>>>>>>> to determine which ones are used by analyzing the shader.
>>>>>> The description is almost complete. :)  The issue is that the
>>>>>> shader
>>>>>> may
>>>>>> declare
>>>>>>
>>>>>> out float gl_ClipDistance[4];
>>>>>>
>>>>>> the use non-constant addressing of the array.  The compiler
>>>>>> knows
>>>>>> that
>>>>>> gl_ClipDistance has at most 4 elements, but post-hoc analysis
>>>>>> would
>>>>>> not
>>>>>> be able to determine that.  Often the fixed-function hardware
>>>>>> (see
>>>>>> below) needs to know which clip distance values are actually
>>>>>> written.
>>>>> But don't all the clip distances written by the shader need to be
>>>>> declared?
>>>>>
>>>>> E.g.:
>>>>>
>>>>> DCL OUT[0], CLIPDIST[0]
>>>>> DCL OUT[1], CLIPDIST[1]
>>>>> DCL OUT[2], CLIPDIST[2]
>>>>> DCL OUT[3], CLIPDIST[3]
>>>>>
>>>>> therefore a trivial analysis of the declarations convey that?
>>>> No.  Clip distance is an array of up to 8 floats in GLSL, but it's
>>>> represented in the hardware as 2 vec4s.  You can tell by analyzing
>>>> the
>>>> declarations whether there are more than 4 clip distances in use,
>>>> but
>>>> not which components the shader writes to.
>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in
>>>> use,
>>>> not
>>>> the number of full vectors.
>>> Lets imagine
>>>
>>>   out float gl_ClipDistance[6];
>>>
>>> Each a clip distance is a scalar float.
>>>
>>> Either all hardware represents the 8 clip distances as two 4
>>> vectors, and we do:
>>>
>>>   DCL OUT[0].xywz, CLIPDIST[0]
>>>   DCL OUT[1].xy, CLIPDIST[1]
>>>
>>> using the full range of struct tgsi_declaration::UsageMask [1] or
>>> we represent them as as scalars:
>>>
>>>   DCL OUT[0].x, CLIPDIST[0]
>>>   DCL OUT[1].x, CLIPDIST[1]
>>>   DCL OUT[2].x, CLIPDIST[2]
>>>   DCL OUT[3].x, CLIPDIST[3]
>>>   DCL OUT[4].x, CLIPDIST[4]
>>>   DCL OUT[5].x, CLIPDIST[5]
>>>
>>> If indirect addressing is allowed as I read bore, then maybe the
>>> later is better.
>>>
>>> I confess my ignorance about clipping and maybe I'm being dense,
>>> but I still don't see the need for this
>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES.  Could you please draft an
>>> example TGSI shader showing this property (or just paste one
>>> generated with your change)?  I think that would help a lot.
>>>
>>>
>>> Jose
>>>
>>>
>>> [1] I don't know if tgsi_dump pays much attention to
>>>  tgsi_declaration::UsageMask, but it does exist.
>>
>> UsageMask might work, but before that can be considered a viable
>> solution, someone will need to make it possible to actually declare
>> it
>> from ureg.  As it is, ureg is hardcoded to set UsageMask to xyzw no
>> matter what on all declared inputs and outputs.
> 
> ureg automatically fills the UsageMask from the destionation register masks, since it easy to determine from the opcodes.
> 
> Which leads me to my second point, if indirect addressing of CLIPDIST is allowed, then we can't really pack the clip distance as 4-elem vectors in TGSI: not only the syntax would be very weird, but it would create havoc on all tgsi-translating code that makes decisions based on indirect addressing of registers.
> 
> That is, 
> 
>   float gl_ClipDistance[6];
> 
>   gl_ClipDistance[i] = foo;
> 
> would become
> 
>    DCL OUT[0].x, CLIPDIST[0]
>    DCL OUT[1].x, CLIPDIST[1]
>    DCL OUT[2].x, CLIPDIST[2]
>    DCL OUT[3].x, CLIPDIST[3]
>    DCL OUT[4].x, CLIPDIST[4]
>    DCL OUT[5].x, CLIPDIST[5]
>    MOV OUT[ADDR[0].x].x, foo
> 

This cannot work properly yet.

For instance, the clip distance slots in my hardware's output memory
space are packed, i.e. consuming NUM_CLIP_DISTANCES * 4 bytes.
(This cannot be changed, except by spilling outputs to high latency
memory and moving them all at the end, which is very undesirable.)

The MOV OUT[ADDR[0].x].x, however, has no way to know whether to scale
ADDR[0].x by 4 or by 16 bytes (as it would for arrays of vec4s) since it
is not clear which declaration/output range the instruction accesses.

The plan is to remedy this, at some point, by augmenting indirect
accesses with an extra "declaration index", and let a declaration
constitute an array. This would also make the TGSI_FILE_*_ARRAY superfluous.

So it would be:
DCL OUT[0-8].x, CLIPDIST[0-8], making use of tgsi_declaration_range.

> and the info from TGSI_PROPERTY_NUM_CLIP_DISTANCES can be obtained by walking the declaration (which can/should be done only once in tgsi_scan).
> 
> But this just doesn't look like it would ever work:
> 
>    DCL OUT[0].xyzw, CLIPDIST[0]
>    DCL OUT[1].xy  , CLIPDIST[1]
>    MOV OUT[ADDR[0].x].????, foo
> 
> Jose
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev