[Mesa-dev] [PATCH 11/15] i965: abort linking if we exhaust the registers

Wed May 11 20:05:37 UTC 2016

On Wednesday, May 11, 2016 1:19:01 PM PDT Juan A. Suarez Romero wrote:
> On Fri, 2016-04-29 at 14:23 +0200, Juan A. Suarez Romero wrote:
> > On Fri, 2016-04-29 at 11:15 +0200, Ian Romanick wrote:
> > > 
> > > The driver supports up to 16 vertex attributes.
> > > > 
> > > > ARB_vertex_attrib_64bit
> > > > states that attribute variables of type dvec3, dvec4, dmat2x3,
> > > > dmat2x4,
> > > > dmat3, dmat3x4, dmat4x3, and dmat4 *may* count as consuming twice
> > > > as
> > > > many attributes as equivalent single-precision types.
> > > > 
> > > > 
> > > > I highlight the may, because it is not mandatory. If we count
> > > > those
> > > > types as consuming the same as a single-precision type (which is
> > > > what
> > > > is happening in Mesa), we are consuming 15 attributes, so we are
> > > > under 
> > > > the limit.
> > > This is the thing we need to fix.  Bailing from deep inside the
> > > driver
> > > code generation (which may happen long, long after linking) is not
> > > allowed.  If a shader is not going to work, we are required to
> > > generate
> > > the error in glLinkProgram.
> > > 
> > I'm not sure if I am following you. In which cases there can be code
> > generation after the linking?
> > 
> > On the other side, the error is generated when calling
> > glLinkProgram():
> > it happens inside the stack that follows the call, when the NIR code
> > is
> > transformed in the intermediate BRW code.
> > 
> > So from user pov, the error happens when calling glLinkProgram(), not
> > afterwards.
> > 
> > 
> > > 
> > > > 
> > > > > 
> > > > > But I see couple of drawbacks with this approach:
> > > > >  
> > > > >  
> > > > > - There are tests that under the same conditions (less than the
> > > > limit
> > > > > 
> > > > > if you count those types as occupying the same as single-
> > > > precision, but
> > > > > 
> > > > > beyond the limit if those types are considered as consuming
> > > > twice) they
> > > > > 
> > > > > still works. An example is the attached shader2 test: it
> > > > > requires
> > > > 13
> > > > > 
> > > > > attributes (or 19 counting as twice the mentioned types) and it
> > > > works
> > > > > 
> > > > > fine.
> > > > > 
> > > > - This check affects to all the backends. And there could be some
> > > > backend that works perfectly fine with the current
> > > > implementation,
> > > > which is less conservative. In fact, we have an example: the same
> > > > driver running in vec4 mode (SIMD4x2) works perfectly fine.
> > > I think we can handle this by having a per-type (double, dvec2,
> > > dvec3,
> > > and dvec4) flag to select the double or don't-double behavior.
> > > 
> > Do you mean only dvec3 and dvec4 (and types based on those: dmat2x3,
> > dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4)?
> > 
> > Because only restrict to those types to count them as twice when
> > checking if we bypass the max attributes limit.
> > 
> > Also, do we really need a flag per type? Wouldn't it be enough to
> > have
> > a single flag stating if all of those types are counted twice or not?
> > 
> > Finally, I understand the idea is that the flag would be like a
> > boolean
> > with true (count twice) or false (don't count twice), contained in
> > each
> > backend/driver (like the max attribs limit, for instance).
> > 
> > 
> > Anyway, this would fix the second problem, but not the former: if our
> > gen8 backend counts doubles as twice, it will prevent to run some
> > shaders that otherway would work fine (like the shader2). If this is
> > an
> > acceptable solution (shaders that work fine now will be rejected
> > because they use too much vertex attribs), then it's fine.
> > 
> > 
> > 	J.A.
> 
> 
> CCing Kenneth as this is also related to patch 12.
> 
> 
> Summing up, we can:
> 
> -  Check registers exhaustion and URB read length and abort linking if
> we reach the limit. This is what patches 11 and 12 do.
> 
> - Or, we can count dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4,
> dmat4x3, and dmat4 vertex attributes as consuming twice the equivalent
> single-precision types. Right now we are counting them as if they were
> single-precision (spec allows both options[1]). This would require to
> add some kind of flags to allow drivers to decide if they count them as
> one or two. But if we follow this option, some shaders in i965 that
> currently work because they do not exhaust registers neither reach the
> URB read length limit, will fail to link if the count those mentioned
> double types twice.
> 
> So what to do?

I feel like the best approach is the second one (double counting
dvec[34] and dmat*x[34] types).  Although they only take up a single
VERTEX_ELEMENT structure, they do take up twice as much URB space
(256 bits instead of 128 bits).

As you said in your earlier email, we should only get into trouble
for shaders that use a lot of double attributes.  I doubt that we'll
see many of those in the real world.  Plus, we are specifically allowed
to do that, so conformance tests can't assume things will fit, either.

We can always do better in the future, if we ever need to.

I definitely prefer not to raise linker errors from the backend
compiler.  This should technically work if the precompile is enabled,
but I'd rather avoid it if possible.

Another thought is that if we want to allow more programs, we could
probably safely raise the MAX_VERTEX_ATTRIBS value from 16 to 28 or so.

--Ken
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160511/ae087290/attachment-0001.sig>