[Intel-gfx] [PATCH] drm/i915: Don't continually defer the hangcheck

Chris Wilson chris at chris-wilson.co.uk
Wed Nov 19 11:13:58 CET 2014


On Wed, Nov 19, 2014 at 11:00:08AM +0100, Daniel Vetter wrote:
> On Fri, Nov 07, 2014 at 05:14:36PM +0200, Mika Kuoppala wrote:
> > Chris Wilson <chris at chris-wilson.co.uk> writes:
> > 
> > > On Fri, Nov 07, 2014 at 04:28:33PM +0200, Mika Kuoppala wrote:
> > >> Chris Wilson <chris at chris-wilson.co.uk> writes:
> > >> 
> > >> > With multiple rings, we may continue to render on the blitter whilst
> > >> > executing an infinite shader on the render ring. As we currently rearm
> > >> > the timer with each execbuf, in this scenario the hangcheck will never
> > >> > fire and we will never detect the lockup on the render ring. Instead,
> > >> > only arm the timer once per hangcheck, so that hangcheck runs more
> > >> > frequently.
> > >> >
> > >> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > >> > Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> > >> > ---
> > >> >  drivers/gpu/drm/i915/i915_irq.c | 9 +++++++--
> > >> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > >> >
> > >> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > >> > index 318a6a0724d0..82b4d742aba5 100644
> > >> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > >> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > >> > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data)
> > >> >  void i915_queue_hangcheck(struct drm_device *dev)
> > >> >  {
> > >> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > >> > +	struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
> > >> > +
> > >> >  	if (!i915.enable_hangcheck)
> > >> >  		return;
> > >> >  
> > >> > -	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
> > >> > -		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
> > >> > +	if (timer_pending(timer))
> > >> > +		return;
> > >> > +
> > >> 
> > >> As this is called from both process and interrupt context, what
> > >> keeps us from messing up the timer bookkeeping? The lock in the timer code?
> > >> 
> > >> I am thinking that the other thread will hit the BUG_ON in add_timer().
> > >
> > > if (!timer_pending(timer))
> > > 	timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES);
> > > mod_timer(timer, timer->expires);
> > > ?
> > 
> > With this changed:
> > 
> > Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225
> 
> Can you please respin this with mod_timer so that I can slurp it in?

I was lazy and didn't look for this msg to reply to:

http://patchwork.freedesktop.org/patch/37111/
1416390439-5724-1-git-send-email-chris at chris-wilson.co.uk
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
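
For readers following the thread, the difference between rearming on every
execbuf and arming once per hangcheck can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: sim_timer,
sim_queue_hangcheck() and sim_run() are hypothetical names invented for the
illustration, time is a plain tick counter rather than jiffies, and the
rounding done by round_jiffies_up() and the locking inside the kernel timer
code are not modelled. It is not the patch itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct timer_list; not the kernel type. */
struct sim_timer {
	bool pending;
	unsigned long expires;
};

enum policy { REARM_ALWAYS, ARM_ONCE };

/* Models the queueing decision discussed in the thread under each policy. */
static void sim_queue_hangcheck(struct sim_timer *t, unsigned long now,
				unsigned long period, enum policy p)
{
	/* ARM_ONCE: keep the existing deadline while the timer is pending. */
	if (p == ARM_ONCE && t->pending)
		return;

	/* REARM_ALWAYS, or timer idle: set a fresh deadline from now. */
	t->expires = now + period;
	t->pending = true;
}

/*
 * Counts how often the simulated hangcheck fires during ticks [0, total)
 * while another ring submits work every submit_interval ticks.
 */
static unsigned int sim_run(enum policy p, unsigned long period,
			    unsigned long submit_interval, unsigned long total)
{
	struct sim_timer t = { false, 0 };
	unsigned int fired = 0;
	unsigned long now;

	for (now = 0; now < total; now++) {
		/* Expire first, then handle this tick's submission. */
		if (t.pending && now >= t.expires) {
			t.pending = false;
			fired++;
		}
		if (now % submit_interval == 0)
			sim_queue_hangcheck(&t, now, period, p);
	}
	return fired;
}
```

In this model, as long as submit_interval is shorter than period,
REARM_ALWAYS pushes the deadline out on every submission and the check never
runs, whereas ARM_ONCE lets it run roughly once per period regardless of how
busy the other ring is.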


