[Mesa-stable] [PATCH 1/5] i965: Hard code scratch_ids_per_subslice for Cherryview

Juan A. Suarez Romero jasuarez at igalia.com
Mon Mar 26 15:23:13 UTC 2018


On Wed, 2018-03-07 at 00:16 -0800, Jordan Justen wrote:
> Ken suggested that we might be underallocating scratch space on HD
> 400. Allocating scratch space as though there was actually 8 EUs
> seems to help with a GPU hang seen on synmark CSDof.
> 

FYI, in order to pick this commit for the next 17.3 stable release, I also need
to pick:

commit f9d5a7add42af5a2e4410526d1480a08f41317ae
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Tue Oct 31 00:34:32 2017 -0700

    i965: Calculate thread_count in brw_alloc_stage_scratch
  

Unless you would prefer that I not pick them, I'll add both.


Cheers!


	J.A.

> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
> Cc: Kenneth Graunke <kenneth at whitecape.org>
> Cc: Eero Tamminen <eero.t.tamminen at intel.com>
> Cc: <mesa-stable at lists.freedesktop.org>
> Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_program.c | 44 ++++++++++++++++++++-------------
>  1 file changed, 27 insertions(+), 17 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
> index 527f003977b..c121136c439 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -402,23 +402,33 @@ brw_alloc_stage_scratch(struct brw_context *brw,
>        if (devinfo->gen >= 9)
>           subslices = 4 * brw->screen->devinfo.num_slices;
>  
> -      /* WaCSScratchSize:hsw
> -       *
> -       * Haswell's scratch space address calculation appears to be sparse
> -       * rather than tightly packed.  The Thread ID has bits indicating
> -       * which subslice, EU within a subslice, and thread within an EU
> -       * it is.  There's a maximum of two slices and two subslices, so these
> -       * can be stored with a single bit.  Even though there are only 10 EUs
> -       * per subslice, this is stored in 4 bits, so there's an effective
> -       * maximum value of 16 EUs.  Similarly, although there are only 7
> -       * threads per EU, this is stored in a 3 bit number, giving an effective
> -       * maximum value of 8 threads per EU.
> -       *
> -       * This means that we need to use 16 * 8 instead of 10 * 7 for the
> -       * number of threads per subslice.
> -       */
> -      const unsigned scratch_ids_per_subslice =
> -         devinfo->is_haswell ? 16 * 8 : devinfo->max_cs_threads;
> +      unsigned scratch_ids_per_subslice;
> +      if (devinfo->is_haswell) {
> +         /* WaCSScratchSize:hsw
> +          *
> +          * Haswell's scratch space address calculation appears to be sparse
> +          * rather than tightly packed. The Thread ID has bits indicating
> +          * which subslice, EU within a subslice, and thread within an EU it
> +          * is. There's a maximum of two slices and two subslices, so these
> +          * can be stored with a single bit. Even though there are only 10 EUs
> +          * per subslice, this is stored in 4 bits, so there's an effective
> +          * maximum value of 16 EUs. Similarly, although there are only 7
> +          * threads per EU, this is stored in a 3 bit number, giving an
> +          * effective maximum value of 8 threads per EU.
> +          *
> +          * This means that we need to use 16 * 8 instead of 10 * 7 for the
> +          * number of threads per subslice.
> +          */
> +         scratch_ids_per_subslice = 16 * 8;
> +      } else if (devinfo->is_cherryview) {
> +         /* For Cherryview, it appears that the scratch addresses for the 6 EU
> +          * devices may still generate compute scratch addresses covering the
> +          * same range as 8 EU.
> +          */
> +         scratch_ids_per_subslice = 8 * 7;
> +      } else {
> +         scratch_ids_per_subslice = devinfo->max_cs_threads;
> +      }
>  
>        thread_count = scratch_ids_per_subslice * subslices;
>        break;
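
For readers skimming the hunk above, the selection logic it introduces can be sketched in isolation roughly as follows. This is only an illustrative sketch, not the driver code: `struct scratch_devinfo`, `scratch_ids_per_subslice()` and `scratch_thread_count()` are hypothetical names that stand in for the relevant `devinfo` fields and for the inline code in brw_alloc_stage_scratch(), and the subslice count is simply passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few device-info fields the patch reads. */
struct scratch_devinfo {
   bool is_haswell;
   bool is_cherryview;
   unsigned max_cs_threads;
};

/* Mirrors the if/else chain added by the patch. */
static unsigned
scratch_ids_per_subslice(const struct scratch_devinfo *devinfo)
{
   assert(devinfo != NULL);

   if (devinfo->is_haswell) {
      /* WaCSScratchSize:hsw: 4 bits of EU ID and 3 bits of thread ID,
       * so size for 16 EUs * 8 threads rather than 10 * 7.
       */
      return 16 * 8;
   } else if (devinfo->is_cherryview) {
      /* Size as though there were 8 EUs with 7 threads each, even on
       * the 6 EU parts.
       */
      return 8 * 7;
   }

   return devinfo->max_cs_threads;
}

/* thread_count as computed at the end of the hunk. */
static unsigned
scratch_thread_count(const struct scratch_devinfo *devinfo,
                     unsigned subslices)
{
   return scratch_ids_per_subslice(devinfo) * subslices;
}
```

With this sketch, a Cherryview device yields 56 scratch IDs per subslice regardless of the value in max_cs_threads, which is the point of the patch.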

