[Mesa-dev] [PATCH 1/2] mesa, gallium: add a cap for GPUs without unified color+generic varying slots

Wed Dec 14 15:48:52 PST 2011

On Thu, Dec 15, 2011 at 12:01 AM, Ian Romanick <idr at freedesktop.org> wrote:
>> Simple. I cannot pass the test glsl-max-varyings if I report 40
>> varying components, because I have only 32 texcoord components + 8
>> color components. I could re-use the color varyings, but it's not full
>> single-precision. r300 uses a fixed-point type S3.12 for color
>> interpolators, which can represent values in the range [-7.9999,
>> 7.9999]. That's unusable for user-defined varyings. r500 uses a 20-bit
>> float, which is better, but I am not sure if it's good enough.
>
>
> The desktop GL spec isn't very specific about range or precision before
> either a late 3.x version or one of the 4.x versions (I don't recall which).
>  However, I'm pretty sure ES2 requires 24-bit float.  Is 20 bits even a real
> computer number? :)

I did not design the hardware. The 20-bit float interpolation might
have been faster than 32-bit or even 24-bit, who knows.

>
> This also means that r300 and r500 can't handle unclamped colors. Right?

r300 cannot handle unclamped colors outside of the range [-8, 8]. For
that reason, ARB_color_buffer_float is not exposed on r300. r500 can
handle unclamped colors, but only with 20 bits of precision (probably
m14e6 or something like that).
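
To make the S3.12 limitation concrete, here is a rough sketch of what
such a format does to a value. It assumes 1 sign bit, 3 integer bits and
12 fraction bits with a symmetric clamp; the real r300 encoding and
rounding mode may differ, and the s312_* names are made up for this
example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 1 sign bit, 3 integer bits, 12 fraction bits. */
#define S312_FRAC_BITS 12
#define S312_ONE       (1 << S312_FRAC_BITS)  /* 4096 represents 1.0 */
#define S312_MAX       32767                  /* 8.0 - 2^-12, about 7.9998 */

/* Quantize a float to the assumed S3.12 format (hypothetical helper).
 * Values outside the representable range are clamped. */
static int16_t s312_from_float(float v)
{
    float scaled;
    int16_t q;

    if (v != v)                 /* NaN: pick 0, an arbitrary choice here */
        return 0;

    scaled = v * (float)S312_ONE;
    if (scaled >= (float)S312_MAX)
        return S312_MAX;
    if (scaled <= (float)-S312_MAX)
        return -S312_MAX;

    /* Round to nearest; the hardware rounding mode is not known. */
    q = (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
    assert(q >= -S312_MAX && q <= S312_MAX);
    return q;
}

/* Convert back to float to see what the interpolator would carry. */
static float s312_to_float(int16_t q)
{
    return (float)q / (float)S312_ONE;
}
```

Under these assumptions, 100.0 comes back as roughly 7.9998 and 1.0001
comes back as exactly 1.0, which is why such a slot is no good for a
generic varying.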

>  I'm thinking of the scenario:
>
>  - The driver advertises 32 varying components.
>
>  - The application calls glClampColorARB(GL_CLAMP_VERTEX_COLOR_ARB,
> GL_FALSE).
>
>  - The shader uses gl_Color and 32 components worth of other varyings.
>
> What happens?  Are the colors partially clamped or what?  Since color
> clamping is set independent of compiling or linking, you don't have an
> opportunity to generate any errors up front.
>
> It sounds like the colors have to go through different interpolators anyway
> if glClampColorARB(GL_CLAMP_VERTEX_COLOR_ARB, GL_FALSE) is used.  Is that
> handled correctly?

Like I said, ARB_color_buffer_float is not exposed on r300-r400, so
the colors are always clamped.

>
> Now I'm really curious to see glsl-max-varyings run on an Apple system with
> an r300.  Scouting around, that looks like it would have to be a G5 iMac or
> PowerMac.  Hrm...
>
>
>> Also, r500 actually has 10 texcoord interpolators, but we don't use
>> the last two yet (it's non-trivial, there is a special PS3 mode for
>> it). Whether color interpolators can be enabled in that mode is
>> undocumented, though I am almost sure the back colors cannot be. (the
>> DX9 ps_3_0 shader profile doesn't have color inputs at all)
>
>
> If colors are counted, you don't need to worry about that.  40 varyings
> means 40 varyings.  Clamped colors would go to the usual slots (leaving 2
> texcoords unused), and unclamped colors would go in two of the texcoord slots.
>
>
>> Yes, the patch is a hack. However modifying glsl-max-varyings to only
>> test MAX_VARYING_FLOATS - 8 doesn't feel right either. Do you have a
>> better idea?
>>
>> What about the other patch?
>
>
> I need to do some research about that.  I'm pretty sure i915 needs a slot
> for WPOS.  I want to collect some more data and see if there's a coherent
> architecture that works for the various platforms.  We're starting to get a
> big pile of band-aids, and that can't hold up in the long run.

Right. I used the ps_3_0 spec, which requires 10 varying vectors +
face + wpos.xy (yes, two channels only). That matches r500, which has
dedicated shader inputs for face + wpos.xy. However, OpenGL also
requires wpos.zw, so r300g emulates wpos through another texcoord
anyway. SM4 and later hardware should not have that limitation.

The point of these patches is that if we want the linker to fail when
it should, we had better calculate occupied resources correctly.
Otherwise we may regret this one day.
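
As a rough illustration of the accounting I have in mind (not the code
from the patch; the struct and function names are made up, and the
numbers in the comments are the r300 ones from above), the check could
look something like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of what the hardware offers. */
struct varying_limits {
    unsigned generic_components; /* e.g. 32 texcoord components on r300 */
    unsigned color_components;   /* e.g. 8 color components on r300 */
    bool unified;                /* true if colors and generics share one pool */
};

/* Hypothetical description of what a linked program uses. */
struct varying_usage {
    unsigned generic_components; /* user-defined varyings, texcoords, ... */
    unsigned color_components;   /* gl_Color, gl_SecondaryColor, ... */
};

/* Return true if the program fits. With unified slots only the total
 * matters; without them each pool has to be checked on its own. */
static bool varyings_fit(const struct varying_limits *lim,
                         const struct varying_usage *use)
{
    assert(lim && use);

    if (lim->unified)
        return use->generic_components + use->color_components <=
               lim->generic_components + lim->color_components;

    return use->generic_components <= lim->generic_components &&
           use->color_components <= lim->color_components;
}
```

With limits of 32 + 8 and no unified slots, a shader using 40 generic
components is rejected even though the total is 40, which is exactly the
glsl-max-varyings case.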

Marek