[Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

Ian Romanick idr at freedesktop.org
Tue Dec 13 16:01:41 PST 2011


On 12/13/2011 02:12 PM, Jose Fonseca wrote:
>
>
> ----- Original Message -----
>> On 12/13/2011 03:48 PM, Jose Fonseca wrote:
>>> ----- Original Message -----
>>>> On 12/13/2011 03:25 PM, Jose Fonseca wrote:
>>>>> ----- Original Message -----
>>>>>> On 12/13/2011 03:09 PM, Jose Fonseca wrote:
>>>>>>> ----- Original Message -----
>>>>>>>> On 12/13/2011 12:26 PM, Bryan Cain wrote:
>>>>>>>>> On 12/13/2011 02:11 PM, Jose Fonseca wrote:
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>> This is an updated version of the patch set I sent to the
>>>>>>>>>>> list
>>>>>>>>>>> a
>>>>>>>>>>> few
>>>>>>>>>>> hours
>>>>>>>>>>> ago.
>>>>>>>>>>> There is now a TGSI property called
>>>>>>>>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES
>>>>>>>>>>> that drivers can use to determine how many of the 8
>>>>>>>>>>> available
>>>>>>>>>>> clip
>>>>>>>>>>> distances
>>>>>>>>>>> are actually used by a shader.
>>>>>>>>>> Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily
>>>>>>>>>> derived from the shader, and queried through
>>>>>>>>>> src/gallium/auxiliary/tgsi/tgsi_scan.h ?
>>>>>>>>> No.  The clip distances can be indirectly addressed (there
>>>>>>>>> are
>>>>>>>>> up
>>>>>>>>> to 2
>>>>>>>>> of them in vec4 form for a total of 8 floats), which makes it
>>>>>>>>> impossible
>>>>>>>>> to determine which ones are used by analyzing the shader.
>>>>>>>> The description is almost complete. :)  The issue is that the
>>>>>>>> shader
>>>>>>>> may
>>>>>>>> declare
>>>>>>>>
>>>>>>>> out float gl_ClipDistance[4];
>>>>>>>>
>>>>>>>> the use non-constant addressing of the array.  The compiler
>>>>>>>> knows
>>>>>>>> that
>>>>>>>> gl_ClipDistance has at most 4 elements, but post-hoc analysis
>>>>>>>> would
>>>>>>>> not
>>>>>>>> be able to determine that.  Often the fixed-function hardware
>>>>>>>> (see
>>>>>>>> below) needs to know which clip distance values are actually
>>>>>>>> written.
>>>>>>> But don't all the clip distances written by the shader need to
>>>>>>> be
>>>>>>> declared?
>>>>>>>
>>>>>>> E.g.:
>>>>>>>
>>>>>>> DCL OUT[0], CLIPDIST[0]
>>>>>>> DCL OUT[1], CLIPDIST[1]
>>>>>>> DCL OUT[2], CLIPDIST[2]
>>>>>>> DCL OUT[3], CLIPDIST[3]
>>>>>>>
>>>>>>> therefore a trivial analysis of the declarations convey that?
>>>>>> No.  Clip distance is an array of up to 8 floats in GLSL, but
>>>>>> it's
>>>>>> represented in the hardware as 2 vec4s.  You can tell by
>>>>>> analyzing
>>>>>> the
>>>>>> declarations whether there are more than 4 clip distances in
>>>>>> use,
>>>>>> but
>>>>>> not which components the shader writes to.
>>>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in
>>>>>> use,
>>>>>> not
>>>>>> the number of full vectors.
>>>>> Lets imagine
>>>>>
>>>>>    out float gl_ClipDistance[6];
>>>>>
>>>>> Each a clip distance is a scalar float.
>>>>>
>>>>> Either all hardware represents the 8 clip distances as two 4
>>>>> vectors, and we do:
>>>>>
>>>>>    DCL OUT[0].xywz, CLIPDIST[0]
>>>>>    DCL OUT[1].xy, CLIPDIST[1]
>>>>>
>>>>> using the full range of struct tgsi_declaration::UsageMask [1] or
>>>>> we represent them as as scalars:
>>>>>
>>>>>    DCL OUT[0].x, CLIPDIST[0]
>>>>>    DCL OUT[1].x, CLIPDIST[1]
>>>>>    DCL OUT[2].x, CLIPDIST[2]
>>>>>    DCL OUT[3].x, CLIPDIST[3]
>>>>>    DCL OUT[4].x, CLIPDIST[4]
>>>>>    DCL OUT[5].x, CLIPDIST[5]
>>>>>
>>>>> If indirect addressing is allowed as I read bore, then maybe the
>>>>> later is better.
>>>>>
>>>>> I confess my ignorance about clipping and maybe I'm being dense,
>>>>> but I still don't see the need for this
>>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES.  Could you please draft an
>>>>> example TGSI shader showing this property (or just paste one
>>>>> generated with your change)?  I think that would help a lot.
>>>>>
>>>>>
>>>>> Jose
>>>>>
>>>>>
>>>>> [1] I don't know if tgsi_dump pays much attention to
>>>>>   tgsi_declaration::UsageMask, but it does exist.
>>>> UsageMask might work, but before that can be considered a viable
>>>> solution, someone will need to make it possible to actually
>>>> declare
>>>> it
>>>> from ureg.  As it is, ureg is hardcoded to set UsageMask to xyzw
>>>> no
>>>> matter what on all declared inputs and outputs.
>>> ureg automatically fills the UsageMask from the destionation
>>> register masks, since it easy to determine from the opcodes.
>>>
>>> Which leads me to my second point, if indirect addressing of
>>> CLIPDIST is allowed, then we can't really pack the clip distance
>>> as 4-elem vectors in TGSI: not only the syntax would be very
>>> weird, but it would create havoc on all tgsi-translating code that
>>> makes decisions based on indirect addressing of registers.
>>>
>>> That is,
>>>
>>>    float gl_ClipDistance[6];
>>>
>>>    gl_ClipDistance[i] = foo;
>>>
>>> would become
>>>
>>>     DCL OUT[0].x, CLIPDIST[0]
>>>     DCL OUT[1].x, CLIPDIST[1]
>>>     DCL OUT[2].x, CLIPDIST[2]
>>>     DCL OUT[3].x, CLIPDIST[3]
>>>     DCL OUT[4].x, CLIPDIST[4]
>>>     DCL OUT[5].x, CLIPDIST[5]
>>>     MOV OUT[ADDR[0].x].x, foo
>>>
>>> and the info from TGSI_PROPERTY_NUM_CLIP_DISTANCES can be obtained
>>> by walking the declaration (which can/should be done only once in
>>> tgsi_scan).
>>>
>>> But this just doesn't look like it would ever work:
>>>
>>>     DCL OUT[0].xyzw, CLIPDIST[0]
>>>     DCL OUT[1].xy  , CLIPDIST[1]
>>>     MOV OUT[ADDR[0].x].????, foo
>>>
>>> Jose
>>
>> If ureg automatically fills the UsageMask from the accessed
>> components,
>> it's probably a better solution than the property.
>>
>> About the indirect addressing of components: the GLSL compiler lowers
>> indirect addressing of the gl_ClipDistance array to indirect
>> addressing
>> of the 2 vec4s, combined with conditional moves to the different
>> components.  Which is okay, because although indirect addressing of
>> gl_ClipDistance is allowed by the GLSL specification, meaning have to
>> support it, it's not something that's actually useful in practical
>> situations.
>
> It sounds a bit complicated -- the lowered indirect addressing code will certainly result in inefficient code for software rendering (e.g, draw module + llvm), but that alone is probably not reason to be different.  Could everybody confirm that all hardware uses 2 vec4s for clip distances and is not able to natively do indirect addressing?

Those are (or should be) orthogonal.  For example, i965 uses vec4[2], 
but it can indirect access components of a vector.  We don't do that 
yet, but we could.  I don't know if there's a way to represent that in 
TGSI, but it can be represented in the GLSL IR.  We just have two 
separate lowering passes.  One does the float[]->vec4[] conversion, and 
the other does the vec4 indexing conversion.


More information about the mesa-dev mailing list