[Intel-gfx] [PATCH] drm/i915: Report an error when i915.reset prevents a reset

Daniel Vetter daniel at ffwll.ch
Mon Jun 22 07:48:26 PDT 2015


On Mon, Jun 22, 2015 at 02:53:39PM +0100, Chris Wilson wrote:
> On Mon, Jun 22, 2015 at 03:44:51PM +0200, Daniel Vetter wrote:
> > On Thu, Jun 18, 2015 at 11:42:08AM +0100, Chris Wilson wrote:
> > > If the user disables GPU reset using the i915.reset parameter and a
> > > reset is then attempted, report that we failed to reset the GPU. If we
> > > return early, as we currently do, then we leave all state intact (with
> > > a hung GPU) and clients block forever waiting for their requests to
> > > complete.
> > > 
> > > Testcase: igt/gem_eio
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > ---
> > >  drivers/gpu/drm/i915/i915_dma.c     | 1 -
> > >  drivers/gpu/drm/i915/i915_drv.c     | 3 ---
> > >  drivers/gpu/drm/i915/intel_uncore.c | 3 +++
> > >  3 files changed, 3 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > index 88795d2f1819..c5349fa3fcce 100644
> > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > @@ -165,7 +165,6 @@ static int i915_getparam(struct drm_device *dev, void *data,
> > >  		break;
> > >  	case I915_PARAM_HAS_GPU_RESET:
> > >  		value = i915.enable_hangcheck &&
> > > -			i915.reset &&
> > >  			intel_has_gpu_reset(dev);
> > >  		break;
> > >  	default:
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > index 78ef0bb53c36..25ffe8afe744 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -863,9 +863,6 @@ int i915_reset(struct drm_device *dev)
> > >  	bool simulated;
> > >  	int ret;
> > >  
> > > -	if (!i915.reset)
> > > -		return 0;
> > > -
> > >  	intel_reset_gt_powersave(dev);
> > >  
> > >  	mutex_lock(&dev->struct_mutex);
> > > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > > index 4a86cf007aa0..f8e75def1a1d 100644
> > > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > > @@ -1457,6 +1457,9 @@ static int gen6_do_reset(struct drm_device *dev)
> > >  
> > >  static int (*intel_get_gpu_reset(struct drm_device *dev))(struct drm_device *)
> > >  {
> > > +	if (!i915.reset)
> > > +		return NULL;
> > 
> > Maybe a special reset function which always returns -EIO and prints
> > something to dmesg to explain the situation? Otherwise we get even more
> > wtf material in bugzilla ...
> 
> It does report a failure (distinct from reset() itself failing), and
> i915.reset should be yet another do-not-touch module option.

The only failure report I could spot is conditional upon the hang being a
simulated one. But just marking i915.reset as unsafe works for me too, so
I'll amend your patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
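
For readers following the thread, the two options discussed above can be
compared in a small userspace model. This is only a sketch, not driver
code: struct drm_device is a dummy, param_reset stands in for i915.reset,
and fake_hw_reset, reset_disabled, gpu_reset_patch and gpu_reset_stub are
hypothetical names. The -ENODEV returned when no reset function is found
is an assumption made for illustration; -EIO is the value suggested in the
review.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Dummy stand-in for the kernel type; not the real definition. */
struct drm_device { int unused; };

/* Models the i915.reset module parameter (hypothetical name). */
static bool param_reset = true;

typedef int (*reset_fn)(struct drm_device *);

/* Models a hardware-specific reset routine that succeeds. */
static int fake_hw_reset(struct drm_device *dev)
{
	(void)dev;
	return 0;
}

/* Option from the patch: hand back no reset function when disabled. */
static reset_fn get_gpu_reset(struct drm_device *dev)
{
	(void)dev;
	if (!param_reset)
		return NULL;
	return fake_hw_reset;
}

/* Capability query, as used for the HAS_GPU_RESET getparam in this model. */
static bool has_gpu_reset(struct drm_device *dev)
{
	return get_gpu_reset(dev) != NULL;
}

/* Caller for the patch option: a missing function becomes an error
 * instead of a silent "return 0". -ENODEV is assumed for this sketch. */
static int gpu_reset_patch(struct drm_device *dev)
{
	reset_fn reset = get_gpu_reset(dev);

	if (reset == NULL)
		return -ENODEV;
	return reset(dev);
}

/* Option from the review: a stub that always fails and says why. */
static int reset_disabled(struct drm_device *dev)
{
	(void)dev;
	fprintf(stderr, "GPU reset disabled via module parameter\n");
	return -EIO;
}

/* Caller for the stub option: always has a function to call. */
static int gpu_reset_stub(struct drm_device *dev)
{
	reset_fn reset = param_reset ? fake_hw_reset : reset_disabled;

	return reset(dev);
}
```

In this model both options turn a disabled reset into an error rather than
an early success, so waiters are not left blocked. The NULL-returning
option also makes has_gpu_reset() report false without a separate
parameter check, while the stub option trades that for an explicit log
line.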

