[Mesa-stable] [Mesa-dev] [PATCH 1/3] i965: Fix shared local memory size for Gen9+.
Jason Ekstrand
jason at jlekstrand.net
Thu Jun 9 14:29:22 UTC 2016
On Jun 9, 2016 1:10 AM, "Kenneth Graunke" <kenneth at whitecape.org> wrote:
>
> Skylake changes the representation of shared local memory size:
>
>   Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
>   -------------------------------------------------------------------
>   Gen7-8 |    0 | none | none |    1 |    2 |     3 |     4 |     5 |
>   -------------------------------------------------------------------
>   Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |
>
> The old formula would substantially underallocate the amount of space.
> This fixes GPU hangs on Skylake when running with full thread counts.
>
> Cc: "12.0" <mesa-stable at lists.freedesktop.org>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
> src/mesa/drivers/dri/i965/gen7_cs_state.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 750aa2c..aff1f4e 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -150,11 +150,16 @@ brw_upload_cs_state(struct brw_context *brw)
> assert(prog_data->total_shared <= 64 * 1024);
> uint32_t slm_size = 0;
> if (prog_data->total_shared > 0) {
> - /* slm_size is in 4k increments, but must be a power of 2. */
> - slm_size = 4 * 1024;
> - while (slm_size < prog_data->total_shared)
> - slm_size <<= 1;
> - slm_size /= 4 * 1024;
> + /* Shared Local Memory Size is specified as powers of two. */
> + slm_size = util_next_power_of_two(prog_data->total_shared);
> +
> + if (brw->gen >= 9) {
> + /* Use a minimum of 1kB; turn an exponent of 10 (1024 B) into 1. */
> + slm_size = ffs(MAX2(slm_size, 1024)) - 10;
> + } else {
> + /* Use a minimum of 4kB; convert to the pre-Gen9 representation. */
> + slm_size = MAX2(slm_size, 4096) / 4096;

Are you sure you don't want ffs in both cases? The table above really
looks like powers of two all around.

const unsigned slm_divisor = brw->gen >= 9 ? 1024 : 4096;
slm_size = DIV_ROUND_UP(prog_data->total_shared, slm_divisor);
slm_size = util_next_power_of_two(slm_size);
slm_size = ffs(slm_size);

You may also be able to replace the last two lines with a clever use of
clz.

If the table in the commit message comes from the bspec, it might be good
to put it here as a comment.

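To make the suggestion above easier to check against the table, here is a rough standalone sketch. The function names `encode_slm_size_ffs` and `encode_slm_size_clz` are made up for illustration, `next_power_of_two` is a local stand-in for `util_next_power_of_two`, and `DIV_ROUND_UP` is redefined locally. The clz variant relies on the GCC/Clang builtin `__builtin_clz` and on `unsigned int` being 32 bits wide; it is one possible reading of the clz idea, not necessarily the intended one. The expected values are the ones in the table as quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* ffs() */

/* Local stand-ins so the sketch compiles outside of Mesa. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

static uint32_t next_power_of_two(uint32_t x)
{
   uint32_t p = 1;
   while (p < x)
      p <<= 1;
   return p;
}

/* The four suggested lines, wrapped in a hypothetical helper.
 * gen stands in for brw->gen, total_shared for prog_data->total_shared.
 */
static uint32_t encode_slm_size_ffs(int gen, uint32_t total_shared)
{
   assert(total_shared <= 64 * 1024);
   if (total_shared == 0)
      return 0;

   const unsigned slm_divisor = gen >= 9 ? 1024 : 4096;
   uint32_t slm_size = DIV_ROUND_UP(total_shared, slm_divisor);
   slm_size = next_power_of_two(slm_size);
   return ffs(slm_size);
}

/* Possible clz variant: ffs(next_power_of_two(units)) equals
 * 1 + ceil(log2(units)), and for units >= 2 that is
 * 33 - clz(units - 1) on a 32-bit unsigned int.  units == 1 is
 * special-cased because __builtin_clz(0) is undefined.
 */
static uint32_t encode_slm_size_clz(int gen, uint32_t total_shared)
{
   assert(total_shared <= 64 * 1024);
   if (total_shared == 0)
      return 0;

   const unsigned slm_divisor = gen >= 9 ? 1024 : 4096;
   const uint32_t units = DIV_ROUND_UP(total_shared, slm_divisor);
   if (units == 1)
      return 1;
   return 33 - __builtin_clz(units - 1);
}
```

With these helpers, 4 kB on Gen9 encodes as 3 and 64 kB as 7, while on Gen8 they encode as 1 and 5, which matches the table as written. Note that the Gen7-8 branch in the patch (divide by 4096) yields 1, 2, 4, 8, 16 instead, so the two only agree if the table row is what the hardware actually expects.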
> + }
> }
>
> desc[dw++] =
> --
> 2.8.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev