[Mesa-stable] [PATCH 1/5] i965: Hard code scratch_ids_per_subslice for Cherryview
Juan A. Suarez Romero
jasuarez at igalia.com
Mon Mar 26 15:23:13 UTC 2018
On Wed, 2018-03-07 at 00:16 -0800, Jordan Justen wrote:
> Ken suggested that we might be underallocating scratch space on HD
> 400. Allocating scratch space as though there was actually 8 EUs
> seems to help with a GPU hang seen on synmark CSDof.
>
FYI, in order to pick this commit for the next 17.3 stable release, I also need
to pick:
commit f9d5a7add42af5a2e4410526d1480a08f41317ae
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Tue Oct 31 00:34:32 2017 -0700

    i965: Calculate thread_count in brw_alloc_stage_scratch

Unless you would prefer that I not pick them, I'll add both.
Cheers!
J.A.
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
> Cc: Kenneth Graunke <kenneth at whitecape.org>
> Cc: Eero Tamminen <eero.t.tamminen at intel.com>
> Cc: <mesa-stable at lists.freedesktop.org>
> Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> ---
> src/mesa/drivers/dri/i965/brw_program.c | 44 ++++++++++++++++++++-------------
> 1 file changed, 27 insertions(+), 17 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
> index 527f003977b..c121136c439 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -402,23 +402,33 @@ brw_alloc_stage_scratch(struct brw_context *brw,
>        if (devinfo->gen >= 9)
>           subslices = 4 * brw->screen->devinfo.num_slices;
>
> -      /* WaCSScratchSize:hsw
> -       *
> -       * Haswell's scratch space address calculation appears to be sparse
> -       * rather than tightly packed. The Thread ID has bits indicating
> -       * which subslice, EU within a subslice, and thread within an EU
> -       * it is. There's a maximum of two slices and two subslices, so these
> -       * can be stored with a single bit. Even though there are only 10 EUs
> -       * per subslice, this is stored in 4 bits, so there's an effective
> -       * maximum value of 16 EUs. Similarly, although there are only 7
> -       * threads per EU, this is stored in a 3 bit number, giving an effective
> -       * maximum value of 8 threads per EU.
> -       *
> -       * This means that we need to use 16 * 8 instead of 10 * 7 for the
> -       * number of threads per subslice.
> -       */
> -      const unsigned scratch_ids_per_subslice =
> -         devinfo->is_haswell ? 16 * 8 : devinfo->max_cs_threads;
> +      unsigned scratch_ids_per_subslice;
> +      if (devinfo->is_haswell) {
> +         /* WaCSScratchSize:hsw
> +          *
> +          * Haswell's scratch space address calculation appears to be sparse
> +          * rather than tightly packed. The Thread ID has bits indicating
> +          * which subslice, EU within a subslice, and thread within an EU it
> +          * is. There's a maximum of two slices and two subslices, so these
> +          * can be stored with a single bit. Even though there are only 10 EUs
> +          * per subslice, this is stored in 4 bits, so there's an effective
> +          * maximum value of 16 EUs. Similarly, although there are only 7
> +          * threads per EU, this is stored in a 3 bit number, giving an
> +          * effective maximum value of 8 threads per EU.
> +          *
> +          * This means that we need to use 16 * 8 instead of 10 * 7 for the
> +          * number of threads per subslice.
> +          */
> +         scratch_ids_per_subslice = 16 * 8;
> +      } else if (devinfo->is_cherryview) {
> +         /* For Cherryview, it appears that the scratch addresses for the 6 EU
> +          * devices may still generate compute scratch addresses covering the
> +          * same range as 8 EU.
> +          */
> +         scratch_ids_per_subslice = 8 * 7;
> +      } else {
> +         scratch_ids_per_subslice = devinfo->max_cs_threads;
> +      }
>
>        thread_count = scratch_ids_per_subslice * subslices;
>        break;
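
For anyone reading the quoted hunk without the surrounding file, the selection
logic it introduces can be sketched as a standalone function. This is only a
minimal sketch, not the driver code: struct example_devinfo and the two
example_* functions are hypothetical stand-ins for the devinfo fields the patch
reads, and the constants are the ones given in the patch (16 * 8 for Haswell,
8 * 7 for Cherryview).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few device-info fields used by the patch. */
struct example_devinfo {
   bool is_haswell;
   bool is_cherryview;
   unsigned max_cs_threads;
};

/* Number of scratch IDs to reserve per subslice, following the patch. */
static unsigned
example_scratch_ids_per_subslice(const struct example_devinfo *devinfo)
{
   if (devinfo->is_haswell)
      return 16 * 8;   /* 4-bit EU field times 3-bit thread field */
   if (devinfo->is_cherryview)
      return 8 * 7;    /* size as though there were 8 EUs with 7 threads */
   return devinfo->max_cs_threads;
}

/* Compute-stage thread count used to size the scratch allocation. */
static unsigned
example_cs_thread_count(const struct example_devinfo *devinfo,
                        unsigned subslices)
{
   assert(subslices > 0);
   return example_scratch_ids_per_subslice(devinfo) * subslices;
}
```

Under these assumptions a Cherryview device gets 8 * 7 = 56 scratch IDs per
subslice whatever its real EU count is, where sizing for 6 EUs would give only
6 * 7 = 42.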