[Mesa-dev] Mesa (master): nv30: report 8 maximum inputs

Mon Feb 10 18:16:08 PST 2014

On Mon, Feb 10, 2014 at 7:12 PM, Ian Romanick <idr at freedesktop.org> wrote:
> On 02/10/2014 02:04 PM, Ilia Mirkin wrote:
>> On Mon, Feb 10, 2014 at 4:43 PM, Ian Romanick <idr at freedesktop.org> wrote:
>>> On 02/08/2014 04:18 PM, Ilia Mirkin wrote:
>>>> Module: Mesa
>>>> Branch: master
>>>> Commit: 356aff3a5c08be055d6befff99a72f5551b3ac2d
>>>> URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=356aff3a5c08be055d6befff99a72f5551b3ac2d
>>>>
>>>> Author: Ilia Mirkin <imirkin at alum.mit.edu>
>>>> Date:   Wed Jan 29 12:36:13 2014 -0500
>>>>
>>>> nv30: report 8 maximum inputs
>>>>
>>>> nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for
>>>> nv40/nv30. This fixes compilation of the varying-packing tests.
>>>> Furthermore it appears that the last 2 inputs on nv4x don't seem to
>>>> work in those tests, so just report 8 everywhere for now.
>>>
>>> Is it possible that the last two inputs are supposed to be used for
>>> gl_Color and gl_SecondaryColor?  In that case, they may have clamping
>>> enabled by default (or always enabled, if the hardware cannot disable
>>> GL_CLAMP_VERTEX_COLOR).  Does that match the behavior that you saw?
>>
>> I'm definitely out of my depth here. What I saw were piglit tests
>> failing and hard-to-understand code with little additional
>> documentation.
>>
>> There's a mask that enables passing of outputs from VP -> FP. This
>> mask has separate entries for colors, fog, psize, and clipping. The
>> texcoords, as they are called, are in a different part of the mask.
>> The last 2 are in yet another part of the mask, separate from the first 8.
>
> Hm... that does sound like something different.
>
>> This of course does not preclude them getting clamped/modified in some
>> way. Ideally those varying-packing tests would be rewritten in a way
>> that better exposes what actually went wrong, but I realize it's not
>> easy to add printf("interesting values: %f %f") into a shader.
>
> Can that hardware render to floating point?  You could try rendering to
> a fp FBO, doing glReadPixels, then printf the data.  It's messy, but it
> works.

The hardware? Yes. But it's disabled in the software. I asked Ben why,
and he gave me a reasonable explanation. That explanation has since
escaped my memory, but it's in the IRC logs if it's really important.

For those specific tests, assuming a reasonable number of varyings
(i.e. < 255), one could just set the color results to e.g.
varying_index/255 -- that way we'd know which varying was the first
bad one. But in general, it's hard :( Shader debugging does seem like
a prime target for some extension though -- nv30 will never get
support for it, but at least future hardware driver writers may be
spared...
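
To make the varying_index/255 idea concrete, here is a rough sketch in C
of the encode/decode round trip, assuming an 8-bit normalized color
buffer read back with glReadPixels. The helper names are made up for
illustration and are not piglit or Mesa functions; the shader side would
write the equivalent of encode_varying_index() into one color channel.

```c
#include <assert.h>

/* Hypothetical helper: the color value a test shader would write for the
 * first varying that holds an unexpected value. Assumes fewer than 255
 * varyings, as in the text above. */
static float encode_varying_index(int index)
{
    assert(index >= 0 && index < 255);
    return (float)index / 255.0f;
}

/* Assumption: the framebuffer stores 8-bit unorm channels, so the float
 * color is clamped to [0, 1] and rounded to the nearest of 256 levels. */
static unsigned char to_unorm8(float c)
{
    if (c < 0.0f)
        c = 0.0f;
    if (c > 1.0f)
        c = 1.0f;
    return (unsigned char)(c * 255.0f + 0.5f);
}

/* After readback, the channel byte is the varying index again. */
static int decode_varying_index(unsigned char channel)
{
    return (int)channel;
}
```

Reading back one pixel and passing its channel byte to
decode_varying_index() would then tell you which varying was the first
bad one, without needing a floating point render target.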

>
>> It's also entirely possible that those last 2 texcoords have to be
>> addressed in the FP in some different way than the first 8. Ideally
>> I (or someone) would trace the blob and
>> analyze the bytecode/engine setup to see what the difference is. I
>> haven't gotten around to that (and unfortunately envydis doesn't
>> support the nv30/nv40 shader ISA).
>>
>> However I was trying to make a dent in the nv4x failures and crashes.
>> This is the latest state, btw:
>> http://people.freedesktop.org/~imirkin/nv40-comparison/problems.html
>> -- I think actually a bunch of the test failures are due to incorrect
>> piglit tests (e.g. "shader uses too many input components (48 > 32)"),
>
> Is that just the variable indexing tests?  Hm... in at least

At least those, yes. I forget if there were others.

> vs-varying-array-mat4-col-rd.shader_test, it looks like the varying
> array is too big.  OpenGL 2.1 only requires 32 varying components, and
> that test clearly uses 16*3+4 = 52.  It seems like the linker should
> chop off the last element, but that still only reduces the usage to 36.
>  A few of those tests could be made to work with 32 varying floats by
> reducing the array size from 3 to 2.
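
The component arithmetic in the quoted paragraph can be sketched in C as
follows. This is only an illustration: it assumes each mat4 array element
costs 16 components and that the remaining 4 components are the extra
term in the 16*3+4 formula, and the function names are hypothetical.

```c
#include <assert.h>

/* Minimum number of varying components required by OpenGL 2.1,
 * as stated in the quoted text. */
enum { GL21_MIN_VARYING_COMPONENTS = 32 };

/* Components used by an array of mat4 varyings of the given size, plus
 * the 4 extra components from the quoted 16*3+4 formula. */
static int varying_components(int mat4_array_size)
{
    assert(mat4_array_size >= 0);
    return 16 * mat4_array_size + 4;
}

/* Returns 1 if the usage fits the GL 2.1 minimum, 0 otherwise. */
static int fits_gl21_minimum(int components)
{
    return components <= GL21_MIN_VARYING_COMPONENTS;
}
```

With an array size of 3 this gives 52 components, and 36 once the last
element is dropped, both over the limit. If the test declares an array of
2 and the linker drops the last element (an assumption on my part), the
effective size is 1 and the usage is 20, which fits.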

Also interestingly, some of those tests used to pass before I threw in
the restriction down to 32. So I guess something can work with those
last 2 texcoords, or perhaps it was just coincidence... (Compare nv42
vs nv44 -- the nv42 run had the limit down to 8, the nv44 run didn't.)

>
> There are a couple that can't be fixed that way because you'd have to
> reduce the array size to 1.  For those tests, you'll have to add a
> MAX_VARYING_COMPONENTS requirement that behaves like the existing
> MAX_FRAGMENT_UNIFORM_VECTORS requirement.

Sounds reasonable, I'll look into it. I'm less interested in making
piglit tests run on nv30 than I am in making them not fail :)

  -ilia