[Mesa-dev] [PATCH 11/15] i965: abort linking if we exhaust the registers

Wed May 11 11:19:01 UTC 2016

On Fri, 2016-04-29 at 14:23 +0200, Juan A. Suarez Romero wrote:
> On Fri, 2016-04-29 at 11:15 +0200, Ian Romanick wrote:
> > 
> > The driver supports up to 16 vertex attributes.
> > > 
> > > ARB_vertex_attrib_64bit
> > > states that attribute variables of type dvec3, dvec4, dmat2x3,
> > > dmat2x4,
> > > dmat3, dmat3x4, dmat4x3, and dmat4 *may* count as consuming twice
> > > as
> > > many attributes as equivalent single-precision types.
> > > 
> > > 
> > > I highlight the may, because it is not mandatory. If we count
> > > those
> > > types as consuming the same as a single-precision type (which is
> > > what
> > > is happening in Mesa), we are consuming 15 attributes, so we are
> > > under 
> > > the limit.
> > This is the thing we need to fix.  Bailing from deep inside the
> > driver
> > code generation (which may happen long, long after linking) is not
> > allowed.  If a shader is not going to work, we are required to
> > generate
> > the error in glLinkProgram.
> > 
> I'm not sure if I am following you. In which cases there can be code
> generation after the linking?
> 
> On the other side, the error is generated when calling
> glLinkProgram():
> it happens inside the stack that follows the call, when the NIR code
> is
> transformed in the intermediate BRW code.
> 
> So from user pov, the error happens when calling glLinkProgram(), not
> afterwards.
> 
> 
> > 
> > > 
> > > > 
> > > > But I see couple of drawbacks with this approach:
> > > >  
> > > >  
> > > > - There are tests that under the same conditions (less than the
> > > limit
> > > > 
> > > > if you count those types as occupying the same as single-
> > > precision, but
> > > > 
> > > > beyond the limit if those types are considered as consuming
> > > twice) they
> > > > 
> > > > still works. An example is the attached shader2 test: it
> > > > requires
> > > 13
> > > > 
> > > > attributes (or 19 counting as twice the mentioned types) and it
> > > works
> > > > 
> > > > fine.
> > > > 
> > > - This check affects to all the backends. And there could be some
> > > backend that works perfectly fine with the current
> > > implementation,
> > > which is less conservative. In fact, we have an example: the same
> > > driver running in vec4 mode (SIMD4x2) works perfectly fine.
> > I think we can handle this by having a per-type (double, dvec2,
> > dvec3,
> > and dvec4) flag to select the double or don't-double behavior.
> > 
> Do you mean only dvec3 and dvec4 (and types based on those: dmat2x3,
> dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4)?
> 
> Because only restrict to those types to count them as twice when
> checking if we bypass the max attributes limit.
> 
> Also, do we really need a flag per type? Wouldn't it be enough to
> have
> a single flag stating if all of those types are counted twice or not?
> 
> Finally, I understand the idea is that the flag would be like a
> boolean
> with true (count twice) or false (don't count twice), contained in
> each
> backend/driver (like the max attribs limit, for instance).
> 
> 
> Anyway, this would fix the second problem, but not the former: if our
> gen8 backend counts doubles as twice, it will prevent to run some
> shaders that otherway would work fine (like the shader2). If this is
> an
> acceptable solution (shaders that work fine now will be rejected
> because they use too much vertex attribs), then it's fine.
> 
> 
> 	J.A.

CCing Kenneth as this is also related to patch 12.

Summing up, we can:

-  Check registers exhaustion and URB read length and abort linking if
we reach the limit. This is what patches 11 and 12 do.

- Or, we can count dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4,
dmat4x3, and dmat4 vertex attributes as consuming twice the equivalent
single-precision types. Right now we are counting them as if they were
single-precision (spec allows both options[1]). This would require to
add some kind of flags to allow drivers to decide if they count them as
one or two. But if we follow this option, some shaders in i965 that
currently work because they do not exhaust registers neither reach the
URB read length limit, will fail to link if the count those mentioned
double types twice.

So what to do?

[1] "(modify the second paragraph, p. 87) ... exceeds
MAX_VERTEX_ATTRIBS.  For
    the purposes of this comparison, attribute variables of the type
dvec3,
    dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
count as
    consuming twice as many attributes as equivalent single-precision
types.
    While these types use the same number of generic attributes as
their
    single-precision equivalents, implementations are permitted to
consume two
    single-precision vectors of internal storage for each three- or
    four-component double-precision vector."

	J.A.