[Mesa-dev] [PATCH 1/3] i965: Fix shared local memory size for Gen9+.

Jason Ekstrand jason at jlekstrand.net
Thu Jun 9 14:29:22 UTC 2016


On Jun 9, 2016 1:10 AM, "Kenneth Graunke" <kenneth at whitecape.org> wrote:
>
> Skylake changes the representation of shared local memory size:
>
>  Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
>  -------------------------------------------------------------------
>  Gen7-8 |    0 | none | none |    1 |    2 |     3 |     4 |     5 |
>  -------------------------------------------------------------------
>  Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |
>
> The old formula would substantially underallocate the amount of space.
> This fixes GPU hangs on Skylake when running with full thread counts.
>
> Cc: "12.0" <mesa-stable at lists.freedesktop.org>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/gen7_cs_state.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 750aa2c..aff1f4e 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -150,11 +150,16 @@ brw_upload_cs_state(struct brw_context *brw)
>     assert(prog_data->total_shared <= 64 * 1024);
>     uint32_t slm_size = 0;
>     if (prog_data->total_shared > 0) {
> -      /* slm_size is in 4k increments, but must be a power of 2. */
> -      slm_size = 4 * 1024;
> -      while (slm_size < prog_data->total_shared)
> -         slm_size <<= 1;
> -      slm_size /= 4 * 1024;
> +      /* Shared Local Memory Size is specified as powers of two. */
> +      slm_size = util_next_power_of_two(prog_data->total_shared);
> +
> +      if (brw->gen >= 9) {
> +         /* Use a minimum of 1kB; turn an exponent of 10 (1024 B) into 1. */
> +         slm_size = ffs(MAX2(slm_size, 1024)) - 10;
> +      } else {
> +         /* Use a minimum of 4kB; convert to the pre-Gen9 representation. */
> +         slm_size = MAX2(slm_size, 4096) / 4096;

Are you sure you don't want ffs in both cases?  The table above really
looks like powers of two all around.

const unsigned slm_divisor = brw->gen >= 9 ? 1024 : 4096;
slm_size = DIV_ROUND_UP(prog_data->total_shared, slm_divisor);
slm_size = util_next_power_of_two(slm_size);
slm_size = ffs(slm_size);
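
Spelled out as a standalone sketch, so the values can be checked against the table in the commit message. The names next_pow2 and encode_slm_size are made up for this example; next_pow2 is only a stand-in for util_next_power_of_two, and the open-coded division stands in for DIV_ROUND_UP. It reproduces the table as quoted above and has not been checked against the bspec:

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* ffs() (POSIX) */

/* Hypothetical stand-in for Mesa's util_next_power_of_two():
 * smallest power of two that is >= x (returns 1 for x <= 1). */
static uint32_t next_pow2(uint32_t x)
{
   uint32_t p = 1;
   while (p < x)
      p <<= 1;
   return p;
}

/* Hypothetical helper: turn a byte count into the SLM size field,
 * following the table in the commit message for both generations. */
static uint32_t encode_slm_size(int gen, uint32_t total_shared)
{
   assert(total_shared <= 64 * 1024);
   if (total_shared == 0)
      return 0;

   /* Gen9+ counts in 1 kB units, earlier generations in 4 kB units. */
   const uint32_t divisor = gen >= 9 ? 1024 : 4096;
   uint32_t units = (total_shared + divisor - 1) / divisor;   /* round up */

   /* ffs() is 1-based, so 1 unit -> 1, 2 units -> 2, 4 units -> 3, ... */
   return (uint32_t)ffs((int)next_pow2(units));
}
```

With this, encode_slm_size(9, 64 * 1024) gives 7 and encode_slm_size(8, 64 * 1024) gives 5, matching the last column of the table.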

You may also be able to replace the last two lines with a clever use of
clz.
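
One possible shape for the clz version, as an untested-on-hardware sketch. It assumes a GCC/Clang-style __builtin_clz and a 32-bit unsigned int; slm_field_from_units is a made-up name, and its argument is the unit count left after the DIV_ROUND_UP step:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes ffs(next_power_of_two(units)) for
 * units >= 1, i.e. ceil(log2(units)) + 1, with one count-leading-zeros.
 * Assumes __builtin_clz (GCC/Clang) and a 32-bit unsigned int. */
static uint32_t slm_field_from_units(uint32_t units)
{
   assert(units >= 1);

   /* __builtin_clz(0) is undefined, so handle a single unit separately. */
   if (units == 1)
      return 1;

   /* 32 - clz(units - 1) is ceil(log2(units)) for units > 1;
    * the extra 1 matches the 1-based result of ffs(). */
   return 33 - (uint32_t)__builtin_clz(units - 1);
}
```

The special case for a single unit is the price of dropping util_next_power_of_two, so whether this is actually cleaner than the ffs version is a judgment call.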

If the table in the commit message comes from the bspec, it might be good
to put it here as a comment.

> +      }
>     }
>
>     desc[dw++] =
> --
> 2.8.3