[Mesa-dev] [PATCH 6/6] i965/gen7: Add instruction latency estimates for untyped atomics and reads.

Paul Berry stereotype441 at gmail.com
Fri Nov 1 10:31:33 PDT 2013


On 29 October 2013 16:37, Francisco Jerez <currojerez at riseup.net> wrote:

> The latency information has been obtained empirically from
> measurements taken on Haswell and Ivy Bridge.
> ---
>  .../drivers/dri/i965/brw_schedule_instructions.cpp | 41 ++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> index 944b5c8..cbfaabe 100644
> --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> @@ -329,6 +329,47 @@ schedule_node::set_latency_gen7(bool is_haswell)
>        latency = 200;
>        break;
>
> +   case SHADER_OPCODE_UNTYPED_ATOMIC:
> +      /* Test code:
> +       *   mov(8)    g112<1>ud       0x00000000ud       { align1 WE_all 1Q };
> +       *   mov(1)    g112.7<1>ud     g1.7<0,1,0>ud      { align1 WE_all };
> +       *   mov(8)    g113<1>ud       0x00000000ud       { align1 WE_normal 1Q };
> +       *   send(8)   g4<1>ud         g112<8,8,1>ud
> +       *             data (38, 5, 6) mlen 2 rlen 1      { align1 WE_normal 1Q };
> +       *
> +       * Running it 100 times as fragment shader on a 128x128 quad
> +       * gives an average latency of 13867 cycles per atomic op,
> +       * standard deviation 3%.  Note that this is a rather
> +       * pessimistic estimate, the actual latency in cases with few
> +       * collisions between threads and favorable pipelining has been
> +       * seen to be reduced by a factor of 100.
> +       */
> +      latency = 14000;
>

Wow, that's a really huge latency.  Given your argument in the comment, I
suspect that in practice shaders that use atomic counters will be much
closer to the "few collisions between threads and favorable pipelining"
case (roughly 140 cycles, going by your factor of 100) than to this
pessimistic estimate.  Personally, I'd be inclined to make the latency the
same as SHADER_OPCODE_UNTYPED_SURFACE_READ.
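
To put rough numbers on that argument, here is a minimal sketch of the
arithmetic.  It is not the actual brw_schedule_instructions.cpp code: the
enum and the function names are hypothetical, and only the constants
(14000, 300, 600, 13867 and the factor of 100) come from the patch.

```cpp
// Illustrative sketch only: hypothetical_opcode, patch_latency() and
// suggested_latency() are made-up names, not part of the i965 driver.
enum hypothetical_opcode {
   UNTYPED_ATOMIC,
   UNTYPED_SURFACE_READ,
};

// Latencies as proposed in the patch under review.
constexpr int patch_latency(hypothetical_opcode op, bool is_haswell)
{
   return op == UNTYPED_ATOMIC ? 14000 : (is_haswell ? 300 : 600);
}

// Alternative suggested in this reply: give atomics the same latency
// as untyped surface reads.
constexpr int suggested_latency(hypothetical_opcode, bool is_haswell)
{
   return is_haswell ? 300 : 600;
}

// Measured average from the patch comment (the contended case).
constexpr int measured_atomic_cycles = 13867;

// The comment says the latency has been seen to drop by a factor of 100
// with few collisions and favorable pipelining.
constexpr int best_case_atomic_cycles = measured_atomic_cycles / 100;

constexpr int distance(int a, int b)
{
   return a > b ? a - b : b - a;
}

// The favorable case is far closer to the surface read latencies than
// to the pessimistic 14000-cycle estimate.
static_assert(distance(best_case_atomic_cycles, 600) <
              distance(best_case_atomic_cycles, 14000),
              "best case should be closer to the surface read latency");
```

Under these assumptions the favorable case comes out at about 138 cycles,
which is the same order of magnitude as the surface read values and two
orders of magnitude below 14000.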

But I'm not an expert on scheduling latencies so I'll defer to Eric and
Matt.  Consider this patch:

Acked-by: Paul Berry <stereotype441 at gmail.com>

I made comments on all the other patches in the series except patch 3.
Patch 3 is:

Reviewed-by: Paul Berry <stereotype441 at gmail.com>


> +      break;
> +
> +   case SHADER_OPCODE_UNTYPED_SURFACE_READ:
> +      /* Test code:
> +       *   mov(8)    g112<1>UD       0x00000000UD       { align1 WE_all 1Q };
> +       *   mov(1)    g112.7<1>UD     g1.7<0,1,0>UD      { align1 WE_all };
> +       *   mov(8)    g113<1>UD       0x00000000UD       { align1 WE_normal 1Q };
> +       *   send(8)   g4<1>UD         g112<8,8,1>UD
> +       *             data (38, 6, 5) mlen 2 rlen 1      { align1 WE_normal 1Q };
> +       *   .
> +       *   . [repeats 8 times]
> +       *   .
> +       *   mov(8)    g112<1>UD       0x00000000UD       { align1 WE_all 1Q };
> +       *   mov(1)    g112.7<1>UD     g1.7<0,1,0>UD      { align1 WE_all };
> +       *   mov(8)    g113<1>UD       0x00000000UD       { align1 WE_normal 1Q };
> +       *   send(8)   g4<1>UD         g112<8,8,1>UD
> +       *             data (38, 6, 5) mlen 2 rlen 1      { align1 WE_normal 1Q };
> +       *
> +       * Running it 100 times as fragment shader on a 128x128 quad
> +       * gives an average latency of 583 cycles per surface read,
> +       * standard deviation 0.9%.
> +       */
> +      latency = is_haswell ? 300 : 600;
> +      break;
> +
>     default:
>        /* 2 cycles:
>         * mul(8) g4<1>F g2<0,1,0>F      0.5F            { align1 WE_normal 1Q };
> --
> 1.8.3.4