[Mesa-dev] [PATCH 1/5] i965: Hard code scratch_ids_per_subslice for Cherryview
Eero Tamminen
eero.t.tamminen at intel.com
Wed Mar 7 15:43:02 UTC 2018
Hi,
I tested SynMark CSDof and the GfxBench Aztec Ruins GL & GLES (normal &
high) versions, which previously caused GPU hangs. With this patch the
hangs are gone.
Tested-by: Eero Tamminen <eero.t.tamminen at intel.com>
On 07.03.2018 10:16, Jordan Justen wrote:
> Ken suggested that we might be underallocating scratch space on HD
> 400. Allocating scratch space as though there was actually 8 EUs
s/8/18/?
- Eero
> seems to help with a GPU hang seen on synmark CSDof.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
> Cc: Kenneth Graunke <kenneth at whitecape.org>
> Cc: Eero Tamminen <eero.t.tamminen at intel.com>
> Cc: <mesa-stable at lists.freedesktop.org>
> Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> ---
> src/mesa/drivers/dri/i965/brw_program.c | 44 ++++++++++++++++++++-------------
> 1 file changed, 27 insertions(+), 17 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
> index 527f003977b..c121136c439 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -402,23 +402,33 @@ brw_alloc_stage_scratch(struct brw_context *brw,
> if (devinfo->gen >= 9)
> subslices = 4 * brw->screen->devinfo.num_slices;
>
> - /* WaCSScratchSize:hsw
> - *
> - * Haswell's scratch space address calculation appears to be sparse
> - * rather than tightly packed. The Thread ID has bits indicating
> - * which subslice, EU within a subslice, and thread within an EU
> - * it is. There's a maximum of two slices and two subslices, so these
> - * can be stored with a single bit. Even though there are only 10 EUs
> - * per subslice, this is stored in 4 bits, so there's an effective
> - * maximum value of 16 EUs. Similarly, although there are only 7
> - * threads per EU, this is stored in a 3 bit number, giving an effective
> - * maximum value of 8 threads per EU.
> - *
> - * This means that we need to use 16 * 8 instead of 10 * 7 for the
> - * number of threads per subslice.
> - */
> - const unsigned scratch_ids_per_subslice =
> - devinfo->is_haswell ? 16 * 8 : devinfo->max_cs_threads;
> + unsigned scratch_ids_per_subslice;
> + if (devinfo->is_haswell) {
> + /* WaCSScratchSize:hsw
> + *
> + * Haswell's scratch space address calculation appears to be sparse
> + * rather than tightly packed. The Thread ID has bits indicating
> + * which subslice, EU within a subslice, and thread within an EU it
> + * is. There's a maximum of two slices and two subslices, so these
> + * can be stored with a single bit. Even though there are only 10 EUs
> + * per subslice, this is stored in 4 bits, so there's an effective
> + * maximum value of 16 EUs. Similarly, although there are only 7
> + * threads per EU, this is stored in a 3 bit number, giving an
> + * effective maximum value of 8 threads per EU.
> + *
> + * This means that we need to use 16 * 8 instead of 10 * 7 for the
> + * number of threads per subslice.
> + */
> + scratch_ids_per_subslice = 16 * 8;
> + } else if (devinfo->is_cherryview) {
> + /* For Cherryview, it appears that the scratch addresses for the 6 EU
> + * devices may still generate compute scratch addresses covering the
> + * same range as 8 EU.
> + */
> + scratch_ids_per_subslice = 8 * 7;
> + } else {
> + scratch_ids_per_subslice = devinfo->max_cs_threads;
> + }
>
> thread_count = scratch_ids_per_subslice * subslices;
> break;
>
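For illustration, the arithmetic described in the patch comments can be
sketched as standalone C. This is only a sketch under stated assumptions:
the enum and the helper names (sparse_scratch_ids, scratch_ids_per_subslice,
scratch_thread_count) are hypothetical and do not exist in Mesa; only the
constants (4-bit EU field, 3-bit thread field, 8 * 7 for Cherryview) come
from the patch above.

```c
#include <assert.h>

/* Hypothetical helper, not part of Mesa: number of scratch IDs one
 * subslice can generate when the EU index and the thread index are
 * packed into fixed-width bit fields of the thread ID.
 */
static unsigned
sparse_scratch_ids(unsigned eu_bits, unsigned thread_bits)
{
   assert(eu_bits + thread_bits < 32);
   return (1u << eu_bits) * (1u << thread_bits);
}

/* Hypothetical stand-in for the devinfo->is_haswell and
 * devinfo->is_cherryview checks in the patch.
 */
enum platform { PLATFORM_HASWELL, PLATFORM_CHERRYVIEW, PLATFORM_OTHER };

static unsigned
scratch_ids_per_subslice(enum platform p, unsigned max_cs_threads)
{
   switch (p) {
   case PLATFORM_HASWELL:
      /* 4 bits of EU, 3 bits of thread: 16 * 8 rather than 10 * 7. */
      return sparse_scratch_ids(4, 3);
   case PLATFORM_CHERRYVIEW:
      /* Allocate as though every subslice had 8 EUs with 7 threads. */
      return 8 * 7;
   default:
      return max_cs_threads;
   }
}

/* Mirrors "thread_count = scratch_ids_per_subslice * subslices". */
static unsigned
scratch_thread_count(enum platform p, unsigned max_cs_threads,
                     unsigned subslices)
{
   return scratch_ids_per_subslice(p, max_cs_threads) * subslices;
}
```

With these assumptions, a Haswell subslice gets 128 scratch IDs instead of
the 70 a tightly packed 10 * 7 layout would suggest, and a Cherryview
subslice gets 56 regardless of how many EUs are actually present.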