[Mesa-dev] Mesa (master): i965/fs: Convert gen7 to using GRFs for texture messages.

Thu Oct 17 09:29:02 CEST 2013

On Thu, Oct 17, 2013 at 1:53 PM, Chia-I Wu <olvaffe at gmail.com> wrote:
> Hi Eric,
>
> On Sat, Oct 12, 2013 at 3:18 AM, Eric Anholt <eric at anholt.net> wrote:
>> Chia-I Wu <olvaffe at gmail.com> writes:
>>
>>> Hi Eric,
>>> The frame rate of Unigine Tropics (with low shader quality) dropped
>>> from 40.8 to 23.5 after this change.
>>
>> Thanks for the note.  I see the regression as well, and I see a shader
>> that's started spilling.  It looks like we can drop the regs_written <=
>> 1 check on gen7+'s pre-regalloc scheduling to fix the problem (the MRF
>> setup thing is no longer an issue, and its presence is now making us
>> pessimize instead of optimize in general in the pre-regalloc
>> scheduling).  I'll want to run a few more tests to make sure that this
>> doesn't regress something else.
> Are you looking at this issue?  The change you suggested does not
> avoid spilling.
>
> I think the problem can be demonstrated with this snippet:
>
>   vec4 val = vec4(0.0);
>
>   vec4 tmp_001 = texture(tex, texcoord * 0.01);
>   val += tmp_001;
>   vec4 tmp_002 = texture(tex, texcoord * 0.02);
>   val += tmp_002;
>   vec4 tmp_003 = texture(tex, texcoord * 0.03);
>   val += tmp_003;
>   ...
>   vec4 tmp_099 = texture(tex, texcoord * 0.99);
>   val += tmp_099;
>   vec4 tmp_100 = texture(tex, texcoord * 1.00);
>   val += tmp_100;
>
>   gl_FragColor = val;
>
> Before the change, the scheduler saw a dependency between any two
> texture() calls (because of the use of MRF).  It was inclined to keep
> the accumulation of tmp_xxx between texture() calls even though the
> accumulation also had a dependency on the last texture() call.
>
> After the change, the dependencies between texture()s are gone.  The
> scheduler sees a chance to move all the high latency texture()
> together and generate something like this:
Ah, I started looking at post-reg-alloc scheduling midway through, so
my reasoning was wrong.  The correct explanation is:

It worked before this change because there were dependencies between
texture() calls, and those texture() calls must thus be scheduled in
that order.  Accumulations were scheduled as soon as they were
available, and thus were intermixed with texture() calls.

It does not work now because the dependencies between texture() calls
are gone.  Since the scheduler schedules in FILO order, texture()
calls are scheduled in reverse order.  Accumulations thus become
available only after all texture() calls are scheduled.

This remains true with the suggested fix (which is still desirable,
but is only a partial fix).  The problem can be demonstrated with the
attached fragment shader.
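
The ordering effect can also be reproduced with a toy list scheduler.
This is only a rough sketch in Python, not Mesa's scheduler: the names
build, schedule and peak_live are made up for illustration, and the pick
rule (a ready accumulation goes first, otherwise the most recently
readied instruction, i.e. FILO) is an assumption chosen to match the
behaviour described above.

```python
def build(n, chain_textures):
    """Return {name: set of prerequisites} for n texture/accumulate pairs.

    chain_textures=True models the old MRF-induced dependency between
    consecutive texture() calls; False models the GRF case.
    """
    deps = {}
    for i in range(1, n + 1):
        tex, add = f"tex{i}", f"add{i}"
        deps[tex] = {f"tex{i - 1}"} if chain_textures and i > 1 else set()
        # val += tmp_i needs tmp_i and the previous value of val.
        deps[add] = {tex} | ({f"add{i - 1}"} if i > 1 else set())
    return deps


def schedule(deps):
    """Toy list scheduler (assumed policy): ready adds first, else FILO."""
    done, order = set(), []
    ready = [name for name, d in deps.items() if not d]  # program order
    while ready:
        adds = [name for name in ready if name.startswith("add")]
        pick = adds[-1] if adds else ready[-1]
        ready.remove(pick)
        done.add(pick)
        order.append(pick)
        # Anything whose prerequisites are now all scheduled becomes ready.
        for name, d in deps.items():
            if name not in done and name not in ready and d <= done:
                ready.append(name)
    return order


def peak_live(order):
    """Most tmp values held at once between their tex and their add."""
    live = peak = 0
    for name in order:
        live += 1 if name.startswith("tex") else -1
        peak = max(peak, live)
    return peak


if __name__ == "__main__":
    for chained in (True, False):
        order = schedule(build(3, chained))
        print(chained, order, peak_live(order))
```

With the texture-to-texture dependencies in place, this model emits
tex1, add1, tex2, add2, ... and never holds more than one tmp value.
Without them it emits tex3, tex2, tex1 and only then the adds, so every
tmp value is live at the same time, which is where the register
pressure comes from.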

>   vec4 tmp_003 = texture(tex, texcoord * 0.03);
>   ...
>   vec4 tmp_099 = texture(tex, texcoord * 0.99);
>   vec4 tmp_100 = texture(tex, texcoord * 1.00);
>
>   val += tmp_001;
>   val += tmp_002;
>   val += tmp_003;
>   ...
>   val += tmp_099;
>   val += tmp_100;
>
> Since there are not enough registers to hold all tmp_xxx, the register
> allocation starts spilling.
>
>>
>> This shader is also in bad shape now that we don't have the redundant
>> MRF move optimization, and we need to look into grf_size > 1 CSE.  That
>> would probably also have avoided the problem on this shader, though the
>> scheduling problem is more general than this one shader.
>
>
>
> --
> olv at LunarG.com
>   val = texture(tex, texcoord * 1.0);

-- 
olv at LunarG.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 465.frag
Type: application/octet-stream
Size: 2964 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20131017/2f1e248a/attachment.obj>