[Mesa-dev] [PATCH] i965/fs: Don't try 16-wide if 8-wide uses more than half the registers.

Paul Berry stereotype441 at gmail.com
Thu Aug 16 09:55:52 PDT 2012


On 15 August 2012 16:37, Kenneth Graunke <kenneth at whitecape.org> wrote:

> 16-wide programs use roughly twice as many registers as 8-wide, and we
> don't support spilling in 16-wide.  So if an 8-wide program uses more
> than half the available GRFs, the 16-wide one almost certainly will fail
> to compile during register allocation.
>
> Not only that, but attempting to compile such shaders is expensive:
> programs that use a lot of registers tend to be quite complex, meaning
> that we spend more time than usual generating and optimizing code.  If
> we fail at register allocation, we've failed at the last step, after
> needlessly burning through a lot of CPU time.
>
> To make things worse, such shader compilation typically happens at the
> first draw call using the shader, so it can cause the GPU to stall.
>
> With all that in mind, it makes sense to short-circuit the 16-wide
> attempt if the 8-wide program uses too many registers.  I've chosen 75
> to be conservative---if we /can/ compile a SIMD16 program, we want to.
>
> Reduces the number of GPU stalls due to fragment shader recompiles
> in Left 4 Dead 2 by about 20%, and reduces the duration of many of the
> remaining stalls by about half.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index e2dafdc..a113105 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2100,7 +2100,10 @@ brw_wm_fs_emit(struct brw_context *brw, struct brw_wm_compile *c,
>        return false;
>     }
>
> -   if (intel->gen >= 5 && c->prog_data.nr_pull_params == 0) {
> +   if (v.grf_used >= 75) {
> +      perf_debug("Too many registers to attempt compiling a 16-wide shader; "
> +                 "falling back to 8-wide at a 10-20%% performance cost.\n");
> +   } else if (intel->gen >= 5 && c->prog_data.nr_pull_params == 0) {
>

It looks like this code will give the perf warning even for situations
where we couldn't do 16-wide anyhow (intel->gen == 4 ||
c->prog_data.nr_pull_params != 0).  To avoid confusing people, we should
probably only give the perf warning if register count is the *only* reason
we can't do 16-wide.
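
Roughly, the ordering I have in mind is to test the hardware/pull-param
conditions first and only then look at the register count. Here is a
standalone, untested sketch of that decision. Note that decide_simd16() and
Simd16Decision are hypothetical stand-ins I made up for illustration, not
driver code; only the 75-register threshold and the gen / nr_pull_params
checks come from the patch.

```cpp
// Hypothetical result type: not part of the i965 driver.
struct Simd16Decision {
   bool try_16_wide;             // attempt the 16-wide compile
   bool warn_register_pressure;  // emit the perf warning
};

// Hypothetical helper mirroring the checks in brw_wm_fs_emit().
// gen, nr_pull_params and grf_used stand in for intel->gen,
// c->prog_data.nr_pull_params and v.grf_used respectively.
static Simd16Decision
decide_simd16(int gen, unsigned nr_pull_params, int grf_used)
{
   const int max_grf_for_16_wide = 75;  // threshold chosen in the patch
   Simd16Decision d = { false, false };

   // First check the conditions that rule out 16-wide regardless of
   // register usage; no warning is given for those.
   if (gen >= 5 && nr_pull_params == 0) {
      if (grf_used >= max_grf_for_16_wide) {
         // Register count is the only thing stopping us, so warn.
         d.warn_register_pressure = true;
      } else {
         d.try_16_wide = true;
      }
   }
   return d;
}
```

In the real function this would presumably just mean nesting the grf_used
check inside the existing "intel->gen >= 5 && ..." branch rather than
putting it in front of it.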


>        c->dispatch_width = 16;
>        fs_visitor v2(c, prog, shader);
>        v2.import_uniforms(&v);
> --
> 1.7.11.4