[PATCH v1 1/1] drm/i915/gt: Increase a time to retry RING_HEAD reset
Gote, Nitin R
nitin.r.gote at intel.com
Thu Dec 5 16:03:38 UTC 2024
Hi Andi,
> -----Original Message-----
> From: Andi Shyti <andi.shyti at linux.intel.com>
> Sent: Thursday, December 5, 2024 6:35 PM
> To: Gote, Nitin R <nitin.r.gote at intel.com>
> Cc: intel-gfx at lists.freedesktop.org; Wilson, Chris P <chris.p.wilson at intel.com>
> Subject: Re: [PATCH v1 1/1] drm/i915/gt: Increase a time to retry RING_HEAD
> reset
>
> Hi Nitin,
>
> On Thu, Dec 05, 2024 at 05:27:36PM +0530, Nitin Gote wrote:
> > Issue is seen again where engine resets fails because the engine
> > resumes from an incorrect RING_HEAD. So, increase a time if at first
> > the write doesn't succeed and retry again.
> >
> > Fixes: 6ef0e3ef2662 ("drm/i915/gt: Retry RING_HEAD reset until it get
> > sticks")
>
> Is this a real fix or is it more of a fine tuning?
It is more of a fine tuning. The issue has been seen again, which is
why I added Fixes: 6ef0e3ef2662.
>
> > Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12806
> > Signed-off-by: Nitin Gote <nitin.r.gote at intel.com>
>
> ...
>
> > @@ -231,7 +231,7 @@ static int xcs_resume(struct intel_engine_cs *engine)
> > set_pp_dir(engine);
> >
> > /* First wake the ring up to an empty/idle ring */
> > - for ((kt) = ktime_get() + (2 * NSEC_PER_MSEC);
> > + for ((kt) = ktime_get() + (50 * NSEC_PER_MSEC);
>
> Where is this 50 coming from?
Well, HEAD is still not 0 even after writing to it, so it could be a
timing issue. I discussed this with Chris, and we thought it is better
to allow 50ms instead of 2ms here to let the HEAD write complete.
I tested this on trybot for Haswell/Ivybridge platforms:
https://patchwork.freedesktop.org/series/141779/
BAT is successful, and the shards issues are not related to this change.
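
To illustrate the retry window discussed above, here is a minimal
user-space sketch of the pattern (write, read back, retry until a
deadline). It is not the driver code: fake_ring, fake_write_head(),
now_ns() and reset_head_with_retry() are hypothetical names made up for
this example, and the "hardware" is simulated by dropping the first few
writes.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_MSEC 1000000LL

/* Hypothetical stand-in for a ring whose HEAD write may not stick at first. */
struct fake_ring {
	uint32_t head;
	int writes_ignored;	/* how many initial writes the fake hardware drops */
};

/*
 * Current time in ns. Uses C11 timespec_get() (wall clock) for portability;
 * the driver uses the monotonic ktime_get() instead.
 */
static int64_t now_ns(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Simulated register write: silently dropped while writes_ignored > 0. */
static void fake_write_head(struct fake_ring *ring, uint32_t val)
{
	if (ring->writes_ignored > 0) {
		ring->writes_ignored--;
		return;
	}
	ring->head = val;
}

/*
 * Write 0 to HEAD and read it back, retrying until the value sticks or
 * timeout_ms has elapsed. Always makes at least one attempt.
 * Returns true if HEAD read back as 0.
 */
static bool reset_head_with_retry(struct fake_ring *ring, int64_t timeout_ms)
{
	int64_t deadline = now_ns() + timeout_ms * NSEC_PER_MSEC;

	do {
		fake_write_head(ring, 0);
		if (ring->head == 0)
			return true;
	} while (now_ns() < deadline);

	return false;
}
```

In this sketch, going from timeout_ms = 2 to 50 only lengthens the
retry window; a write that sticks on the first attempt still returns
immediately.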
>
> Thanks,
> Andi
>
> > ktime_before(ktime_get(), (kt)); cpu_relax()) {
> > /*
> > * In case of resets fails because engine resumes from
> > --
> > 2.25.1