[Intel-gfx] [PATCH] drm/i915/gt: Limit VFE threads based on GT

Chris Wilson chris at chris-wilson.co.uk
Thu Jan 7 22:04:46 UTC 2021


Quoting Rodrigo Vivi (2021-01-07 19:50:37)
> On Fri, Oct 16, 2020 at 06:54:11PM +0100, Chris Wilson wrote:
> > MEDIA_STATE_VFE only accepts the 'maximum number of threads' in the
> > range [0, n-1] where n is #EU * (#threads/EU) with the number of threads
> > based on plaform and the number of EU based on the number of slices and
> > subslices. This is a fixed number per platform/gt, so appropriately
> > limit the number of threads we spawn to match the device.
> > 
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2024
> 
> we need to get this closed...

Unfortunately this failed the validation test. And as that test is still
not in CI, I cannot say why. My vote would be to remove the
clear_residuals until it works on all target platforms. Plus we clearly
need a hsw-gt1 in CI.
 
> >       bv->scratch_size = bv->surface_height * bv->surface_width;
> > @@ -244,7 +258,6 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
> >                   u32 urb_size, u32 curbe_size,
> >                   u32 mode)
> >  {
> > -     u32 urb_entries = bv->max_urb_entries;
> >       u32 threads = bv->max_primitives - 1;
> >       u32 *cs = batch_alloc_items(batch, 32, 8);
> >  
> > @@ -254,7 +267,7 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
> >       *cs++ = 0;
> >  
> >       /* number of threads & urb entries for GPGPU vs Media Mode */
> > -     *cs++ = threads << 16 | urb_entries << 8 | mode << 2;
> > +     *cs++ = threads << 16 | 1 << 8 | mode << 2;
> 
> why urb_entries = 1 ?

We only used a single entry. There was no measurable benefit from
assigning more entries, and the importance of any side effects from doing
so unknown.

> the range is 0,64 and 0,128 depending on the sku.
> 
> in general there's a min of 32 URBs

Don't forget num_entries * entry_size must fit within the URB
allocation/allotment.
-Chris


More information about the Intel-gfx mailing list