[Intel-gfx] [PATCH v3 2/3] drm/i915/gt: Double check heartbeat timeout before resetting

Chris Wilson chris at chris-wilson.co.uk
Thu Feb 4 13:02:48 UTC 2021


Quoting Mika Kuoppala (2021-02-04 12:57:46)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> 
> > Check that we have actually passed the heartbeat interval since last
> > checking the request before resetting the device.
> >
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2780
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> > index 48a91c0dbad6..93741a65924a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> > @@ -31,7 +31,7 @@ static bool next_heartbeat(struct intel_engine_cs *engine)
> >       delay = msecs_to_jiffies_timeout(delay);
> >       if (delay >= HZ)
> >               delay = round_jiffies_up_relative(delay);
> > -     mod_delayed_work(system_highpri_wq, &engine->heartbeat.work, delay);
> > +     mod_delayed_work(system_highpri_wq, &engine->heartbeat.work, delay + 1);
> >  
> >       return true;
> >  }
> > @@ -103,6 +103,13 @@ static void heartbeat(struct work_struct *wrk)
> >               goto out;
> >  
> >       if (engine->heartbeat.systole) {
> > +             long delay = READ_ONCE(engine->props.heartbeat_interval_ms);
> > +
> > +             /* Safeguard against too-fast worker invocations */
> > +             if (!time_after(jiffies,
> > +                             rq->emitted_jiffies + msecs_to_jiffies(delay)))
> > +                     goto out;
> > +
> >               if (!i915_sw_fence_signaled(&rq->submit)) {
> >                       /*
> >                        * Not yet submitted, system is stalled.
> > @@ -139,6 +146,8 @@ static void heartbeat(struct work_struct *wrk)
> >                                             "stopped heartbeat on %s",
> >                                             engine->name);
> >               }
> > +
> > +             rq->emitted_jiffies = jiffies;
> 
> Would possibly interfere with throttle. But who would get handle to
> internal request.

Indeed. And it changes the meaning of the pretty printing in the debug
message, but I can live with that.
-Chris


More information about the Intel-gfx mailing list