[Intel-gfx] [PATCH 1/2] drm/i915: Detect page faults during hangcheck

Chris Wilson chris at chris-wilson.co.uk
Wed Jan 21 02:05:04 PST 2015


On Fri, Dec 05, 2014 at 10:03:42PM +0100, Daniel Vetter wrote:
> On Fri, Dec 05, 2014 at 02:15:21PM +0000, Chris Wilson wrote:
> > On Sandybridge+, the GPU provides the ERROR register for detecting page
> > faults. Hook this up to our hangcheck so that we can dump the error
> > state soon after such an event occurs. This would be better inside an
> > interrupt handler, but it serves a purpose here as it detects that our
> > initial context setup is invalid...
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_irq.c     | 5 +++++
> >  drivers/gpu/drm/i915/intel_uncore.c | 2 ++
> >  2 files changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 7913a72ce30a..eb2149b941e4 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -2969,6 +2969,11 @@ static void i915_hangcheck_elapsed(unsigned long data)
> >  	if (!i915.enable_hangcheck)
> >  		return;
> >  
> > +	if (INTEL_INFO(dev_priv)->gen >= 6 && I915_READ(ERROR_GEN6)) {
> > +		i915_handle_error(dev, false, "GPU reported a page fault");
> 
> Is the full hangcheck state actually useful for debugging these
> pagefaults? The gpu doesn't seem to fall over completely, so I guess ACTHD
> and friends are somewhere in nirvana.

Yes, it was. In this case ACTHD wasn't in nirvana, but close inspection
showed that the environment must have changed between the execution of
the batch and the GPU fault.
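
For anyone following along without the tree at hand, here is a minimal
userspace sketch of the polling check, not the driver code itself. The
names struct fake_gpu, fake_handle_error() and fake_hangcheck_page_fault()
are hypothetical and exist only for this example. The error_reg field
stands in for I915_READ(ERROR_GEN6), and the handler merely counts and
logs where i915_handle_error() would capture the error state. The sketch
leaves the register untouched after reporting, because what the patch does
beyond the quoted hunk is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the device state used by the real hangcheck. */
struct fake_gpu {
	int gen;                      /* hardware generation, 6 == Sandybridge */
	uint32_t error_reg;           /* simulated ERROR register contents */
	unsigned int errors_reported; /* how often the handler was invoked */
};

/* Stand-in for i915_handle_error(): only counts and logs the event. */
static void fake_handle_error(struct fake_gpu *gpu, const char *msg)
{
	gpu->errors_reported++;
	fprintf(stderr, "gen%d: %s (ERROR=0x%08x)\n",
		gpu->gen, msg, (unsigned int)gpu->error_reg);
}

/*
 * Mirrors the condition in the quoted hunk: only gen6+ has the register,
 * and any nonzero value is treated as a reported fault.
 * Returns true if a fault was reported on this call.
 */
static bool fake_hangcheck_page_fault(struct fake_gpu *gpu)
{
	if (gpu->gen >= 6 && gpu->error_reg != 0) {
		fake_handle_error(gpu, "GPU reported a page fault");
		return true;
	}
	return false;
}
```

With gen >= 6 and a nonzero error_reg the helper reports once per call;
with an older gen or a zero register it stays silent. Nothing in the
sketch clears the register, so a value that stays nonzero is reported
again on every call.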
 
> Enabling the interrupts would definitely be useful though. I think all the
> handler code is written already (perhaps a few missing drm_debug lines),
> it's just that we don't enable these interrupt sources by default.

That would be even better.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

