[Mesa-dev] [PATCH 1/5] i965: Hard code scratch_ids_per_subslice for Cherryview

Eero Tamminen eero.t.tamminen at intel.com
Wed Mar 7 15:43:02 UTC 2018


Hi,

I tested SynMark CSDof and the GfxBench Aztec Ruins GL & GLES (normal & high)
versions, all of which previously caused GPU hangs.  With this patch the hangs
are gone.

Tested-by: Eero Tamminen <eero.t.tamminen at intel.com>


On 07.03.2018 10:16, Jordan Justen wrote:
> Ken suggested that we might be underallocating scratch space on HD
> 400. Allocating scratch space as though there was actually 8 EUs

s/8/18/?

	- Eero


> seems to help with a GPU hang seen on synmark CSDof.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
> Cc: Kenneth Graunke <kenneth at whitecape.org>
> Cc: Eero Tamminen <eero.t.tamminen at intel.com>
> Cc: <mesa-stable at lists.freedesktop.org>
> Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> ---
>   src/mesa/drivers/dri/i965/brw_program.c | 44 ++++++++++++++++++++-------------
>   1 file changed, 27 insertions(+), 17 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
> index 527f003977b..c121136c439 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -402,23 +402,33 @@ brw_alloc_stage_scratch(struct brw_context *brw,
>         if (devinfo->gen >= 9)
>            subslices = 4 * brw->screen->devinfo.num_slices;
>   
> -      /* WaCSScratchSize:hsw
> -       *
> -       * Haswell's scratch space address calculation appears to be sparse
> -       * rather than tightly packed.  The Thread ID has bits indicating
> -       * which subslice, EU within a subslice, and thread within an EU
> -       * it is.  There's a maximum of two slices and two subslices, so these
> -       * can be stored with a single bit.  Even though there are only 10 EUs
> -       * per subslice, this is stored in 4 bits, so there's an effective
> -       * maximum value of 16 EUs.  Similarly, although there are only 7
> -       * threads per EU, this is stored in a 3 bit number, giving an effective
> -       * maximum value of 8 threads per EU.
> -       *
> -       * This means that we need to use 16 * 8 instead of 10 * 7 for the
> -       * number of threads per subslice.
> -       */
> -      const unsigned scratch_ids_per_subslice =
> -         devinfo->is_haswell ? 16 * 8 : devinfo->max_cs_threads;
> +      unsigned scratch_ids_per_subslice;
> +      if (devinfo->is_haswell) {
> +         /* WaCSScratchSize:hsw
> +          *
> +          * Haswell's scratch space address calculation appears to be sparse
> +          * rather than tightly packed. The Thread ID has bits indicating
> +          * which subslice, EU within a subslice, and thread within an EU it
> +          * is. There's a maximum of two slices and two subslices, so these
> +          * can be stored with a single bit. Even though there are only 10 EUs
> +          * per subslice, this is stored in 4 bits, so there's an effective
> +          * maximum value of 16 EUs. Similarly, although there are only 7
> +          * threads per EU, this is stored in a 3 bit number, giving an
> +          * effective maximum value of 8 threads per EU.
> +          *
> +          * This means that we need to use 16 * 8 instead of 10 * 7 for the
> +          * number of threads per subslice.
> +          */
> +         scratch_ids_per_subslice = 16 * 8;
> +      } else if (devinfo->is_cherryview) {
> +         /* For Cherryview, it appears that the scratch addresses for the 6 EU
> +          * devices may still generate compute scratch addresses covering the
> +          * same range as 8 EU.
> +          */
> +         scratch_ids_per_subslice = 8 * 7;
> +      } else {
> +         scratch_ids_per_subslice = devinfo->max_cs_threads;
> +      }
>   
>         thread_count = scratch_ids_per_subslice * subslices;
>         break;
> 
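For anyone who wants to check the arithmetic outside the driver, the selection
logic in the hunk above can be sketched as a standalone helper. This is only a
sketch: `struct scratch_devinfo` and both function names are hypothetical
stand-ins, not Mesa API, and the struct carries only the three fields the hunk
reads from devinfo.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device-info fields the patch reads. */
struct scratch_devinfo {
   bool is_haswell;
   bool is_cherryview;
   unsigned max_cs_threads;
};

/* Mirrors the if/else chain from the patch. */
static unsigned
scratch_ids_per_subslice(const struct scratch_devinfo *devinfo)
{
   if (devinfo->is_haswell)
      return 16 * 8;   /* 4-bit EU ID times 3-bit thread ID */
   if (devinfo->is_cherryview)
      return 8 * 7;    /* sized as if every subslice had 8 EUs */
   return devinfo->max_cs_threads;
}

/* Mirrors "thread_count = scratch_ids_per_subslice * subslices". */
static unsigned
scratch_thread_count(const struct scratch_devinfo *devinfo,
                     unsigned subslices)
{
   assert(subslices > 0);
   return scratch_ids_per_subslice(devinfo) * subslices;
}
```

Assuming 7 threads per EU, as the 8 * 7 in the patch implies, a 6 EU subslice
would otherwise be sized for 6 * 7 = 42 scratch IDs, while the patch sizes it
for 56.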


