[Mesa-stable] [PATCH 1/5] i965: Hard code scratch_ids_per_subslice for Cherryview
Juan A. Suarez Romero
jasuarez at igalia.com
Wed Mar 28 22:34:25 UTC 2018
On Wed, 2018-03-28 at 14:55 -0700, Jordan Justen wrote:
> On 2018-03-26 08:23:13, Juan A. Suarez Romero wrote:
> > On Wed, 2018-03-07 at 00:16 -0800, Jordan Justen wrote:
> > > Ken suggested that we might be underallocating scratch space on HD
> > > 400. Allocating scratch space as though there were actually 8 EUs
> > > seems to help with a GPU hang seen on synmark CSDof.
> > >
> >
> > FYI, in order to pick this commit for the next 17.3 stable release, I
> > also need to pick:
> >
> > commit f9d5a7add42af5a2e4410526d1480a08f41317ae
> > Author: Jordan Justen <jordan.l.justen at intel.com>
> > Date: Tue Oct 31 00:34:32 2017 -0700
> >
> > i965: Calculate thread_count in brw_alloc_stage_scratch
>
> I believe that this commit led to a regression with compute shaders,
> which was fixed by:
>
> commit a16dc04ad51c32e5c7d136e4dd6273d983385d3f
> Author: Kenneth Graunke <kenneth at whitecape.org>
> Date: Tue Oct 31 00:56:24 2017 -0700
>
> i965: properly initialize brw->cs.base.stage to MESA_SHADER_COMPUTE
>
> You should probably add Ken's a16dc04ad51c before f9d5a7add42a.
>
Thanks a lot! Fortunately, a16dc04ad51c was already nominated and
included in 17.3.0, so it is already in the stable branch.
J.A.
> -Jordan
>
> >
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
> > > Cc: Kenneth Graunke <kenneth at whitecape.org>
> > > Cc: Eero Tamminen <eero.t.tamminen at intel.com>
> > > Cc: <mesa-stable at lists.freedesktop.org>
> > > Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_program.c | 44 ++++++++++++++++++++-------------
> > >  1 file changed, 27 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
> > > index 527f003977b..c121136c439 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_program.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_program.c
> > > @@ -402,23 +402,33 @@ brw_alloc_stage_scratch(struct brw_context *brw,
> > >           if (devinfo->gen >= 9)
> > >              subslices = 4 * brw->screen->devinfo.num_slices;
> > >
> > > -         /* WaCSScratchSize:hsw
> > > -          *
> > > -          * Haswell's scratch space address calculation appears to be sparse
> > > -          * rather than tightly packed. The Thread ID has bits indicating
> > > -          * which subslice, EU within a subslice, and thread within an EU
> > > -          * it is. There's a maximum of two slices and two subslices, so these
> > > -          * can be stored with a single bit. Even though there are only 10 EUs
> > > -          * per subslice, this is stored in 4 bits, so there's an effective
> > > -          * maximum value of 16 EUs. Similarly, although there are only 7
> > > -          * threads per EU, this is stored in a 3 bit number, giving an effective
> > > -          * maximum value of 8 threads per EU.
> > > -          *
> > > -          * This means that we need to use 16 * 8 instead of 10 * 7 for the
> > > -          * number of threads per subslice.
> > > -          */
> > > -         const unsigned scratch_ids_per_subslice =
> > > -            devinfo->is_haswell ? 16 * 8 : devinfo->max_cs_threads;
> > > +         unsigned scratch_ids_per_subslice;
> > > +         if (devinfo->is_haswell) {
> > > +            /* WaCSScratchSize:hsw
> > > +             *
> > > +             * Haswell's scratch space address calculation appears to be sparse
> > > +             * rather than tightly packed. The Thread ID has bits indicating
> > > +             * which subslice, EU within a subslice, and thread within an EU it
> > > +             * is. There's a maximum of two slices and two subslices, so these
> > > +             * can be stored with a single bit. Even though there are only 10 EUs
> > > +             * per subslice, this is stored in 4 bits, so there's an effective
> > > +             * maximum value of 16 EUs. Similarly, although there are only 7
> > > +             * threads per EU, this is stored in a 3 bit number, giving an
> > > +             * effective maximum value of 8 threads per EU.
> > > +             *
> > > +             * This means that we need to use 16 * 8 instead of 10 * 7 for the
> > > +             * number of threads per subslice.
> > > +             */
> > > +            scratch_ids_per_subslice = 16 * 8;
> > > +         } else if (devinfo->is_cherryview) {
> > > +            /* For Cherryview, it appears that the scratch addresses for the 6 EU
> > > +             * devices may still generate compute scratch addresses covering the
> > > +             * same range as 8 EU.
> > > +             */
> > > +            scratch_ids_per_subslice = 8 * 7;
> > > +         } else {
> > > +            scratch_ids_per_subslice = devinfo->max_cs_threads;
> > > +         }
> > >
> > >           thread_count = scratch_ids_per_subslice * subslices;
> > >           break;
>
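For readers following the arithmetic in the quoted patch, here is a minimal standalone sketch of the selection logic, not the driver code itself. The struct `sketch_devinfo` and both function names are hypothetical stand-ins invented for illustration; only the constants (4-bit EU field, 3-bit thread field, 8 * 7 for Cherryview) and the fallback to `max_cs_threads` come from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few devinfo fields the patch reads. */
struct sketch_devinfo {
   bool is_haswell;
   bool is_cherryview;
   unsigned max_cs_threads;
};

/* Number of scratch IDs to reserve per subslice, following the patch. */
static unsigned
sketch_scratch_ids_per_subslice(const struct sketch_devinfo *devinfo)
{
   if (devinfo->is_haswell) {
      /* Sparse thread ID: 4 bits of EU index and 3 bits of thread index,
       * so 16 * 8 = 128 IDs even though only 10 * 7 threads exist.
       */
      return (1u << 4) * (1u << 3);
   } else if (devinfo->is_cherryview) {
      /* Size for 8 EUs * 7 threads, even on 6 EU parts. */
      return 8 * 7;
   }
   return devinfo->max_cs_threads;
}

/* Total compute scratch thread count: per-subslice IDs times subslices. */
static unsigned
sketch_cs_thread_count(const struct sketch_devinfo *devinfo,
                       unsigned subslices)
{
   return sketch_scratch_ids_per_subslice(devinfo) * subslices;
}
```

With an illustrative `max_cs_threads` of 42 (6 EUs * 7 threads, an assumed value), the Cherryview branch still yields 56 IDs per subslice, which is the over-allocation the patch is after.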