[Intel-gfx] [PATCH] drm/i915: Report an error when i915.reset prevents a reset

Chris Wilson chris at chris-wilson.co.uk
Mon Jun 22 06:53:39 PDT 2015


On Mon, Jun 22, 2015 at 03:44:51PM +0200, Daniel Vetter wrote:
> On Thu, Jun 18, 2015 at 11:42:08AM +0100, Chris Wilson wrote:
> > If the user disables the GPU reset using the i915.reset parameter and
> > one occurs, report that we failed to reset the GPU. If we return early,
> > as we currently do, then we leave all state intact (with a hung GPU)
> > and clients block forever waiting for their requests to complete.
> > 
> > Testcase: igt/gem_eio
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c     | 1 -
> >  drivers/gpu/drm/i915/i915_drv.c     | 3 ---
> >  drivers/gpu/drm/i915/intel_uncore.c | 3 +++
> >  3 files changed, 3 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 88795d2f1819..c5349fa3fcce 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -165,7 +165,6 @@ static int i915_getparam(struct drm_device *dev, void *data,
> >  		break;
> >  	case I915_PARAM_HAS_GPU_RESET:
> >  		value = i915.enable_hangcheck &&
> > -			i915.reset &&
> >  			intel_has_gpu_reset(dev);
> >  		break;
> >  	default:
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 78ef0bb53c36..25ffe8afe744 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -863,9 +863,6 @@ int i915_reset(struct drm_device *dev)
> >  	bool simulated;
> >  	int ret;
> >  
> > -	if (!i915.reset)
> > -		return 0;
> > -
> >  	intel_reset_gt_powersave(dev);
> >  
> >  	mutex_lock(&dev->struct_mutex);
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > index 4a86cf007aa0..f8e75def1a1d 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -1457,6 +1457,9 @@ static int gen6_do_reset(struct drm_device *dev)
> >  
> >  static int (*intel_get_gpu_reset(struct drm_device *dev))(struct drm_device *)
> >  {
> > +	if (!i915.reset)
> > +		return NULL;
> 
> Maybe a special reset function which always returns -EIO and prints
> something to illuminate the situation to dmesg? Otherwise even more wtf
> material in bugzilla ...

It does report a failure (distinct from reset() itself failing), and reset
should be yet another do-not-touch module option.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
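
For readers comparing the two approaches discussed above, here is a rough
userspace sketch, not kernel code. It contrasts returning NULL from the
reset-function lookup (as the quoted hunk does) with the reviewer's idea of a
stub that always returns -EIO and logs a message. Everything in it is
hypothetical: struct fake_device, param_reset (standing in for i915.reset),
fake_hw_reset, and do_reset are made up for illustration. The choice of
-ENODEV for a missing reset function is also an assumption, since the quoted
hunk does not show the caller.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for struct drm_device and the i915.reset parameter. */
struct fake_device { int unused; };
static bool param_reset = true;

typedef int (*reset_fn)(struct fake_device *);

/* Stand-in for a real per-generation reset such as gen6_do_reset(). */
static int fake_hw_reset(struct fake_device *dev)
{
	(void)dev;
	return 0;
}

/* Reviewer's suggestion: a reset function that always fails and says why. */
static int reset_disabled_stub(struct fake_device *dev)
{
	(void)dev;
	fprintf(stderr, "GPU reset disabled by module parameter\n");
	return -EIO;
}

/* Shape of the quoted hunk: no reset function when the parameter is off. */
static reset_fn get_reset_null(struct fake_device *dev)
{
	(void)dev;
	if (!param_reset)
		return NULL;
	return fake_hw_reset;
}

/* Alternative shape: hand back the always-failing stub instead of NULL. */
static reset_fn get_reset_stub(struct fake_device *dev)
{
	(void)dev;
	if (!param_reset)
		return reset_disabled_stub;
	return fake_hw_reset;
}

/* Assumed caller: a missing reset function is reported as -ENODEV. */
static int do_reset(reset_fn (*get)(struct fake_device *),
		    struct fake_device *dev)
{
	reset_fn reset = get(dev);

	if (reset == NULL)
		return -ENODEV;
	return reset(dev);
}
```

In this sketch both variants surface an error to the caller when the
parameter is off. The difference is the error code and whether anything is
printed, which is the trade-off the review comment raises.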