[Intel-gfx] [PATCH] drm/i915/mtl: Increase guard pages when vt-d is enabled
Sripada, Radhakrishna
radhakrishna.sripada at intel.com
Fri Nov 3 15:53:10 UTC 2023
Hi Tvrtko,
> -----Original Message-----
> From: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> Sent: Friday, November 3, 2023 1:30 AM
> To: Sripada, Radhakrishna <radhakrishna.sripada at intel.com>; Hajda, Andrzej
> <andrzej.hajda at intel.com>; intel-gfx at lists.freedesktop.org
> Cc: Chris Wilson <chris.p.wilson at linux.intel.com>
> Subject: Re: [Intel-gfx] [PATCH] drm/i915/mtl: Increase guard pages when vt-d is
> enabled
>
>
> On 02/11/2023 22:14, Sripada, Radhakrishna wrote:
> > Hi Tvrtko,
> >
> >> -----Original Message-----
> >> From: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> >> Sent: Thursday, November 2, 2023 10:41 AM
> >> To: Hajda, Andrzej <andrzej.hajda at intel.com>; Sripada, Radhakrishna
> >> <radhakrishna.sripada at intel.com>; intel-gfx at lists.freedesktop.org
> >> Cc: Chris Wilson <chris.p.wilson at linux.intel.com>
> >> Subject: Re: [Intel-gfx] [PATCH] drm/i915/mtl: Increase guard pages when vt-d is
> >> enabled
> >>
> >>
> >> On 02/11/2023 16:58, Andrzej Hajda wrote:
> >>> On 02.11.2023 17:06, Radhakrishna Sripada wrote:
> >>>> Experiments were conducted with different multipliers to VTD_GUARD macro
> >>>> with multiplier of 185 we were observing occasional pipe faults when
> >>>> running kms_cursor_legacy --run-subtest single-bo
> >>>>
> >>>> There could possibly be an underlying issue that is being
> >>>> investigated, for
> >>>> now bump the guard pages for MTL.
> >>>>
> >>>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2017
> >>>> Cc: Gustavo Sousa <gustavo.sousa at intel.com>
> >>>> Cc: Chris Wilson <chris.p.wilson at linux.intel.com>
> >>>> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada at intel.com>
> >>>> ---
> >>>> drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
> >>>> 1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>> b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>> index 3770828f2eaf..b65f84c6bb3f 100644
> >>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>> @@ -456,6 +456,9 @@ i915_gem_object_pin_to_display_plane(struct
> >>>> drm_i915_gem_object *obj,
> >>>> if (intel_scanout_needs_vtd_wa(i915)) {
> >>>> unsigned int guard = VTD_GUARD;
> >>>> + if (IS_METEORLAKE(i915))
> >>>> + guard *= 200;
> >>>> +
> >>>
> >>> 200 * VTD_GUARD = 200 * 168 * 4K = 131MB
> >>>
> >>> Looks insanely high, 131MB for padding, if this is before and after it
> >>> becomes even 262MB of wasted address per plane. Just signalling, I do
> >>> not know if this actually hurts.
> >>
> >> Yeah this feels crazy. There must be some other explanation which is
> >> getting hidden by the crazy amount of padding so I'd rather we figured
> >> it out.
> >>
> >> With 262MiB per fb how many fit in GGTT before eviction hits? N screens
> >> with double/triple buffering?
> >
> > I believe with this method we will have to limit the no of frame buffers in the system. One alternative
> > that worked is to do a proper clear range for the ggtt instead of doing a nop. Although it adds marginal
> > time during suspend/resume/boot it does not add restrictions to the no of fb's that can be used.
>
> And if we remember the guard pages replaced clearing to scratch, to
> improve suspend resume times, exactly for improving user experience. :(
>
> Luckily there is time to fix this properly on MTL one way or the other.
> Is it just kms_cursor_legacy --run-subtest single-bo that is affected?
I am trying to dump the page table entries at the time of failure for both the frame buffer and, if required,
for the guard pages. Will see if I get some info from there.
Yes, kms_cursor_legacy is the test used to reliably reproduce the issue. Looking at public CI, I also see pipe
errors being reported, with varying frequency, while running kms_cursor_crc, kms_pipe_crc_basic,
and kms_plane_scaling. More details on the occurrences can be found here [1].
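
For reference, the padding arithmetic from the quoted discussion can be sketched roughly as below. This is only an illustration: the 168-page guard and the 4K page size are taken from Andrzej's numbers above, while the 4 GiB GGTT size and the helper names are assumptions made for the example, and the size of the frame buffer itself is ignored.

```c
#include <stdint.h>

/* Values taken from the quoted arithmetic: 200 * 168 * 4K. */
#define GTT_PAGE_SIZE     4096ull
#define VTD_GUARD_PAGES   168ull

/* Assumption for illustration only: a 4 GiB GGTT. */
#define ASSUMED_GGTT_SIZE (4ull << 30)

/* Bytes of guard padding on one side of a frame buffer. */
static uint64_t guard_bytes(uint64_t multiplier)
{
	return multiplier * VTD_GUARD_PAGES * GTT_PAGE_SIZE;
}

/* Bytes of padding if the guard is applied both before and after. */
static uint64_t padding_per_fb(uint64_t multiplier)
{
	return 2 * guard_bytes(multiplier);
}

/* Upper bound on padded frame buffers, counting only the padding. */
static uint64_t max_padded_fbs(uint64_t multiplier)
{
	return ASSUMED_GGTT_SIZE / padding_per_fb(multiplier);
}
```

With a multiplier of 200 this gives 137625600 bytes (131.25 MiB) per side and 262.5 MiB per frame buffer, so under the assumed 4 GiB no more than 15 such frame buffers would fit, before the frame buffers themselves are counted.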
Thanks,
RK
1. http://gfx-ci.igk.intel.com/cibuglog-ng/results/knownfailures?query_key=d9c3297dd17dda35a6c638eb96b3139bd1a6633c
>
> Regards,
>
> Tvrtko
>
>
> >>
> >> Regards,
> >>
> >> Tvrtko
> >>
> >> P.S. Where did the 185 from the commit message come from?
> > 185 came from experiment to increase the guard size. It is not a standard number.
> >
> > Regards,
> > Radhakrishna(RK) Sripada