[Mesa-dev] [PATCH 11/15] i965: abort linking if we exhaust the registers
Juan A. Suarez Romero
jasuarez at igalia.com
Fri Apr 29 09:21:24 UTC 2016
On Thu, 2016-04-28 at 15:29 +0200, Ian Romanick wrote:
> On 04/28/2016 01:40 PM, Antia Puentes wrote:
> >
> > From: "Juan A. Suarez Romero" <jasuarez at igalia.com>
> >
> > Even when the number of vertex attributes is under the limit, for
> > shaders that use a high number of them, we can quickly exhaust the
> > number of hardware registers.
> Were you able to construct a case where this actually occurs? Limits
> exposed by the driver and enforced by the GLSL linker should prevent
> this.
>
(Re-sending, because the original email was too big).
Yes. See the attached shader1 test that exposes this problem.
The driver supports up to 16 vertex attributes. ARB_vertex_attrib_64bit
states that attribute variables of type dvec3, dvec4, dmat2x3, dmat2x4,
dmat3, dmat3x4, dmat4x3, and dmat4 *may* count as consuming twice as
many attributes as equivalent single-precision types.
I emphasize the "may" because it is not mandatory. If we count those
types as consuming the same as a single-precision type (which is what
Mesa currently does), we are consuming 15 attributes, so we are under
the limit.
The issue is that in scalar mode (SIMD8), each vec4 attribute requires
4 registers (or 8 for each dvec4 attribute), so it is easy to reach a
huge number of registers. That is the problem the test exposes.
If we were working in SIMD4x2, this wouldn't happen, as we would
require only 1 register per vec4 attribute (or 2 per dvec4).
So the problem is a combination of using a high number of attributes
and SIMD8 mode.
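To make the arithmetic concrete, here is a minimal sketch that uses only
the per-attribute register costs quoted above. The function name, the
table, and the 1 vec4 + 14 dvec4 mix are hypothetical illustrations, not
Mesa code and not the actual contents of shader1.

```python
# Registers needed per vertex attribute, per the figures in this email:
#   SIMD8 (scalar mode): 4 per vec4, 8 per dvec4
#   SIMD4x2 (vec4 mode): 1 per vec4, 2 per dvec4
REGS_PER_ATTR = {
    "SIMD8": {"vec4": 4, "dvec4": 8},
    "SIMD4x2": {"vec4": 1, "dvec4": 2},
}


def attr_registers(mode, n_vec4, n_dvec4):
    """Return the registers consumed by the given attribute mix in `mode`."""
    cost = REGS_PER_ATTR[mode]
    return n_vec4 * cost["vec4"] + n_dvec4 * cost["dvec4"]


# Hypothetical mix (an assumption for illustration): 1 vec4 and 14 dvec4.
print(attr_registers("SIMD8", 1, 14))    # 1*4 + 14*8 = 116
print(attr_registers("SIMD4x2", 1, 14))  # 1*1 + 14*2 = 29
```

The same attribute list costs four times as many registers in SIMD8 as
in SIMD4x2, which is why only the scalar backend runs out.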
One of the first approaches we took was precisely to count the types
listed above as consuming two attributes instead of one. In that case,
the shader1 test would consume 29 attributes, so the limit would be
exceeded.
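The two counting policies can be sketched as follows. This is only an
illustration under stated assumptions: the helper name, the
(type, base slots) representation, and the attribute mixes are
hypothetical. The mixes were chosen only so that the totals match the
figures in this email (15 vs. 29, and 13 vs. 19), and the limit of 16 is
the one quoted above.

```python
# Types that ARB_vertex_attrib_64bit says *may* count twice (list as
# quoted in this email).
DOUBLE_WIDE = {
    "dvec3", "dvec4", "dmat2x3", "dmat2x4",
    "dmat3", "dmat3x4", "dmat4x3", "dmat4",
}

MAX_ATTRIBS = 16  # limit quoted in this email


def count_slots(attrs, double_wide_counts_twice):
    """attrs is a list of (type_name, base_slots), where base_slots is the
    slot count of the equivalent single-precision type."""
    total = 0
    for type_name, base_slots in attrs:
        twice = double_wide_counts_twice and type_name in DOUBLE_WIDE
        total += base_slots * (2 if twice else 1)
    return total


# Hypothetical mixes, not the real contents of the attached tests.
shader1_like = [("vec4", 1)] + [("dvec4", 1)] * 14
shader2_like = [("vec4", 1)] * 7 + [("dvec4", 1)] * 6

for name, attrs in (("shader1-like", shader1_like),
                    ("shader2-like", shader2_like)):
    single = count_slots(attrs, False)
    double = count_slots(attrs, True)
    print(name, single, single <= MAX_ATTRIBS, double, double <= MAX_ATTRIBS)
```

Under the stricter policy both mixes go over the limit, including the
one that corresponds to a test that currently works.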
But I see a couple of drawbacks with this approach:
- There are tests under the same conditions (below the limit if those
types count the same as single-precision ones, but beyond it if they
count twice) that still work. An example is the attached shader2 test:
it requires 13 attributes (or 19 when counting the mentioned types
twice) and it works fine.
- This check affects all the backends, and there could be some backend
that works perfectly fine with the current, less conservative
implementation. In fact, we have an example: the same driver running in
vec4 mode (SIMD4x2) works perfectly fine.
So all in all, the best way we found is to keep how we count vertex
attributes, and just abort if we exhaust the available registers.
Ideally, the best approach would be to switch to vec4 mode. But this
would require supporting gen8+vec4 (right now we are working on support
for gen7, which uses vec4), and also improving the switch from scalar
mode to vec4 when compiling the shader.
J.A.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shader1.shader_test.gz
Type: application/gzip
Size: 2138 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160429/5647ceb1/attachment-0002.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shader2.shader_test.gz
Type: application/gzip
Size: 1781 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160429/5647ceb1/attachment-0003.gz>