[Mesa-dev] [PATCH 11/15] i965: abort linking if we exhaust the registers

Ian Romanick idr at freedesktop.org
Fri Apr 29 09:15:48 UTC 2016


On 04/29/2016 09:32 AM, Juan A. Suarez Romero wrote:
> On Thu, 2016-04-28 at 15:29 +0200, Ian Romanick wrote:
>> On 04/28/2016 01:40 PM, Antia Puentes wrote:
>>>
>>> From: "Juan A. Suarez Romero" <jasuarez at igalia.com>
>>>
>>> Even when the number of vertex attributes is under the limit, for
>>> shaders that use a high number of them, we can quickly exhaust the
>>> number of hardware registers.
>> Were you able to construct a case where this actually occurs?  Limits
>> exposed by the driver and enforced by the GLSL linker should prevent
>> this.
>>
> 
> Yes. See the attached shader1 test that exposes this problem.
> 
> 
> The driver supports up to 16 vertex attributes. ARB_vertex_attrib_64bit
> states that attribute variables of type dvec3, dvec4, dmat2x3, dmat2x4,
> dmat3, dmat3x4, dmat4x3, and dmat4 *may* count as consuming twice as
> many attributes as equivalent single-precision types.
> 
> 
> I highlight the may, because it is not mandatory. If we count those
> types as consuming the same as a single-precision type (which is what
> is happening in Mesa), we are consuming 15 attributes, so we are under 
> the limit.

This is the thing we need to fix.  Bailing from deep inside the driver
code generation (which may happen long, long after linking) is not
allowed.  If a shader is not going to work, we are required to generate
the error in glLinkProgram.

> The issue is that in scalar mode (SIMD8), for each vec4 attribute we
> require 4 registers (or 8 per each dvec4 attribute), so it is easy to
> reach a huge number of registers. Which is the problem the test is
> exposing.
> 
> 
> If we were working on SIMD4x2, this wouldn't happen, as we would 
> require only 1 register per vec4 attribute (or 2 per each dvec4).
> 
> 
> So the problem is a combination of using a high number of attributes
> and SIMD8 mode.
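
To put numbers on the per-attribute costs quoted above, here is a rough
sketch. It is not Mesa code: the enum and the function name are made up
for illustration, and it only encodes the figures from this thread
(4 registers per vec4 and 8 per dvec4 in SIMD8, 1 and 2 in SIMD4x2).

```c
/* Hypothetical helper, not part of Mesa: registers consumed by vertex
 * inputs, using only the per-attribute costs quoted in this thread. */
enum dispatch_mode { MODE_SIMD8, MODE_SIMD4X2 };

static unsigned
vs_input_regs(enum dispatch_mode mode, unsigned num_vec4, unsigned num_dvec4)
{
   /* SIMD8 (scalar):  4 registers per vec4, 8 per dvec4.
    * SIMD4x2 (vec4):  1 register per vec4,  2 per dvec4. */
   const unsigned per_vec4 = (mode == MODE_SIMD8) ? 4 : 1;
   const unsigned per_dvec4 = 2 * per_vec4;

   return num_vec4 * per_vec4 + num_dvec4 * per_dvec4;
}
```

As a made-up worst case (not necessarily what shader1 declares), 15
dvec4 inputs would cost 120 registers in SIMD8 but only 30 in SIMD4x2.
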
> 
> 
> One of the first approaches we took was precisely to consider the
> previous types to consume two attributes, instead of one. In this case,
> the shader1 test would be consuming 29 attributes, so the limit would
> be reached.
> 
> 
> But I see a couple of drawbacks with this approach:
> 
> 
> - There are tests that, under the same conditions (under the limit
> if you count those types as occupying the same as single-precision, but
> beyond the limit if those types are considered as consuming twice),
> still work. An example is the attached shader2 test: it requires 13
> attributes (or 19 counting as twice the mentioned types) and it works
> fine.

I don't see where you get 19.  I get 3 array elements * 2 matrix columns
* 2 for value0, 2 array elements * 3 matrix columns * 2 for value1, and
1 for piglit_vertex.  That's 25.

This overcounts because naive doubling counts each dmat2 column as 2
slots, when we only actually need 1. By doubling only when it's
necessary, that shader would need (3 * 2) + (2 * 3 * 2) + 1 = 19.
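
The two totals can be reproduced with a small sketch. This is not Mesa
code; the function name and parameters are hypothetical. It assumes a
column of one or two doubles fits in the space of a single vec4 slot,
while a column of three or four doubles needs two, and that value1's
columns are of the latter kind (which is what the arithmetic above
implies).

```c
/* Hypothetical helper, not part of Mesa. "rows" is the number of
 * doubles in one matrix column (2 for dmat2, 3 for dmat3, ...). */
static unsigned
double_matrix_slots(unsigned array_len, unsigned columns, unsigned rows,
                    int naive)
{
   /* Naive doubling charges 2 slots per column regardless of size.
    * Otherwise only columns with more than 2 doubles are doubled. */
   const unsigned per_column = (naive || rows > 2) ? 2 : 1;

   return array_len * columns * per_column;
}
```

With value0 taken as dmat2[3], value1 as an array of two 3-column
matrices, and one slot for piglit_vertex, this gives 12 + 12 + 1 = 25
under naive doubling and 6 + 12 + 1 = 19 otherwise.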

> - This check affects all the backends. And there could be some
> backend that works perfectly fine with the current implementation,
> which is less conservative. In fact, we have an example: the same
> driver running in vec4 mode (SIMD4x2) works perfectly fine.

I think we can handle this by having a per-type (double, dvec2, dvec3,
and dvec4) flag to select the double or don't-double behavior.
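
As a rough sketch of what such a flag could look like (all names here
are hypothetical and none of this exists in Mesa as written), a backend
could fill in one boolean per double type and the linker could consult
it when counting slots:

```c
#include <stdbool.h>

/* Hypothetical type tags and option struct; illustrative names only. */
enum double_type {
   TYPE_DOUBLE,
   TYPE_DVEC2,
   TYPE_DVEC3,
   TYPE_DVEC4,
   TYPE_COUNT
};

struct attrib_count_options {
   /* true: the type counts as two attribute slots; false: as one. */
   bool doubles_slots[TYPE_COUNT];
};

static unsigned
attrib_slots(const struct attrib_count_options *opts, enum double_type type)
{
   return opts->doubles_slots[type] ? 2 : 1;
}
```

A backend that is fine with the current counting would leave every
flag false, while a more constrained one could set the flags for dvec3
and dvec4 only.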

> So all in all, the best way we found is to keep how we count vertex
> attributes, and just abort if we exhaust the available registers.
> 
> Ideally, the best approach would be to switch to vec4 mode. But this
> would require supporting gen8+vec4 (we are right now working on support
> for gen7, which uses vec4), and also improving the switch from scalar
> mode to vec4 when compiling the shader.
> 
> 
>         J.A.


