[Mesa-dev] [PATCH 1/3] i965: Fix shared local memory size for Gen9+.

Ilia Mirkin imirkin at alum.mit.edu
Thu Jun 9 14:00:40 UTC 2016


On Jun 9, 2016 4:10 AM, "Kenneth Graunke" <kenneth at whitecape.org> wrote:
>
> Skylake changes the representation of shared local memory size:
>
>  Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
>  -------------------------------------------------------------------
>  Gen7-8 |    0 | none | none |    1 |    2 |     3 |     4 |     5 |
>  -------------------------------------------------------------------
>  Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |
>
> The old formula would substantially underallocate the amount of space.
> This fixes GPU hangs on Skylake when running with full thread counts.
>
> Cc: "12.0" <mesa-stable at lists.freedesktop.org>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/gen7_cs_state.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 750aa2c..aff1f4e 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -150,11 +150,16 @@ brw_upload_cs_state(struct brw_context *brw)
>     assert(prog_data->total_shared <= 64 * 1024);
>     uint32_t slm_size = 0;
>     if (prog_data->total_shared > 0) {
> -      /* slm_size is in 4k increments, but must be a power of 2. */
> -      slm_size = 4 * 1024;
> -      while (slm_size < prog_data->total_shared)
> -         slm_size <<= 1;
> -      slm_size /= 4 * 1024;
> +      /* Shared Local Memory Size is specified as powers of two. */
> +      slm_size = util_next_power_of_two(prog_data->total_shared);
> +
> +      if (brw->gen >= 9) {
> +         /* Use a minimum of 1kB; turn an exponent of 10 (1024 kB) into 1. */
> +         slm_size = ffs(MAX2(slm_size, 1024)) - 10;
> +      } else {
> +         /* Use a minimum of 4kB; convert to the pre-Gen9 representation. */
> +         slm_size = MAX2(slm_size, 4096) / 4096;

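According to your chart, 16 kB on Gen7-8 should end up as 3, but this logic
will produce 16384 / 4096 = 4 (and likewise 8 and 16 for 32 kB and 64 kB).
The old comment said the field was in increments of 4 kB, and the old loop
also produced 4 for 16 kB, so I'm guessing just the chart needs to be
adjusted.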
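
To make the mismatch concrete, here is a minimal standalone sketch of the two
encodings as the patch computes them. The names next_pow2 and encode_slm_size
are hypothetical stand-ins (for util_next_power_of_two and the inline logic in
brw_upload_cs_state), and the gen parameter plays the role of brw->gen; this is
an illustration under those assumptions, not the driver code itself.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* ffs() */

/* Hypothetical stand-in for Mesa's util_next_power_of_two(). */
uint32_t next_pow2(uint32_t v)
{
   uint32_t p = 1;
   while (p < v)
      p <<= 1;
   return p;
}

/* Hypothetical helper mirroring the patched logic; `gen` stands in for
 * brw->gen.  Returns the value written to the SLM size field.
 */
uint32_t encode_slm_size(uint32_t total_shared, int gen)
{
   assert(total_shared <= 64 * 1024);
   if (total_shared == 0)
      return 0;

   uint32_t size = next_pow2(total_shared);

   if (gen >= 9) {
      /* Minimum of 1 kB; ffs() is 1-based, so 2^10 gives 11 and 11 - 10 = 1. */
      if (size < 1024)
         size = 1024;
      return (uint32_t)ffs((int)size) - 10;
   }

   /* Pre-Gen9: minimum of 4 kB, expressed as a count of 4 kB units. */
   if (size < 4096)
      size = 4096;
   return size / 4096;
}
```

With this sketch, encode_slm_size(16 * 1024, 8) evaluates to 4, whereas
encode_slm_size(16 * 1024, 9) evaluates to 5, which does match the Gen9+ row
of the chart.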

> +      }
>     }
>
>     desc[dw++] =
> --
> 2.8.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev