[Intel-gfx] [PATCH] drm/i915/gt: Limit VFE threads based on GT
Chris Wilson
chris at chris-wilson.co.uk
Thu Jan 7 22:04:46 UTC 2021
Quoting Rodrigo Vivi (2021-01-07 19:50:37)
> On Fri, Oct 16, 2020 at 06:54:11PM +0100, Chris Wilson wrote:
> > MEDIA_STATE_VFE only accepts the 'maximum number of threads' in the
> > range [0, n-1] where n is #EU * (#threads/EU) with the number of threads
> > based on platform and the number of EU based on the number of slices and
> > subslices. This is a fixed number per platform/gt, so appropriately
> > limit the number of threads we spawn to match the device.
> >
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2024
>
> we need to get this closed...
Unfortunately this failed the validation test. And as that test is still
not in CI, I cannot say why. My vote would be to remove the
clear_residuals until it works on all target platforms. Plus we clearly
need a hsw-gt1 in CI.
> > bv->scratch_size = bv->surface_height * bv->surface_width;
> > @@ -244,7 +258,6 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
> > u32 urb_size, u32 curbe_size,
> > u32 mode)
> > {
> > - u32 urb_entries = bv->max_urb_entries;
> > u32 threads = bv->max_primitives - 1;
> > u32 *cs = batch_alloc_items(batch, 32, 8);
> >
> > @@ -254,7 +267,7 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
> > *cs++ = 0;
> >
> > /* number of threads & urb entries for GPGPU vs Media Mode */
> > - *cs++ = threads << 16 | urb_entries << 8 | mode << 2;
> > + *cs++ = threads << 16 | 1 << 8 | mode << 2;
>
> why urb_entries = 1 ?
We only used a single entry. There was no measurable benefit from
assigning more entries, and the importance of any side effects from doing
so is unknown.
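
As a rough illustration of the dword being discussed, here is a minimal
standalone sketch. The field shifts (16, 8, 2) are taken from the diff
above, and the thread limit follows the commit message's
n = #EU * (#threads/EU) with the field holding n - 1. The helper names
vfe_max_threads() and vfe_pack_dw() are hypothetical and are not the
i915 functions.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the i915 code: the commit message gives the
 * valid range as [0, n-1] with n = #EU * (#threads/EU), so the largest
 * value the field accepts is n - 1.
 */
static uint32_t vfe_max_threads(uint32_t eu_count, uint32_t threads_per_eu)
{
	uint32_t n = eu_count * threads_per_eu;

	return n ? n - 1 : 0;
}

/*
 * Pack the "number of threads & urb entries" dword using the shifts
 * shown in the diff: threads << 16 | urb_entries << 8 | mode << 2.
 */
static uint32_t vfe_pack_dw(uint32_t threads, uint32_t urb_entries,
			    uint32_t mode)
{
	return threads << 16 | urb_entries << 8 | mode << 2;
}
```

With made-up inputs of 10 EUs and 7 threads per EU (illustrative numbers
only, not a real SKU), vfe_max_threads() gives 69, and packing that with
a single URB entry and mode 0 gives 0x00450100.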
> the range is 0,64 and 0,128 depending on the sku.
>
> in general there's a min of 32 URBs
Don't forget num_entries * entry_size must fit within the URB
allocation/allotment.
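
The constraint above can be written down as a one-line check. This is
only a sketch: urb_config_fits() is a hypothetical name, and it assumes
the entry size and the allocation are expressed in the same units.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check, not taken from i915: num_entries * entry_size
 * must fit within the URB allocation. Both sizes are assumed to be in
 * the same units.
 */
static bool urb_config_fits(uint32_t num_entries, uint32_t entry_size,
			    uint32_t urb_alloc)
{
	/* Widen to 64 bits so the product cannot wrap around. */
	return (uint64_t)num_entries * entry_size <= urb_alloc;
}
```

Raising the entry count without shrinking the entry size can therefore
push a configuration over the limit even though the count itself is
within the allowed range.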
-Chris