[Intel-gfx] [PATCH v3] drm/i915: Rework GPU reset sequence to match driver load & thaw

Mcaulay, Alistair alistair.mcaulay at intel.com
Wed Aug 20 17:21:55 CEST 2014



> -----Original Message-----
> From: Chris Wilson [mailto:chris at chris-wilson.co.uk]
> Sent: Wednesday, August 20, 2014 3:58 PM
> To: Daniel, Thomas
> Cc: Mcaulay, Alistair; intel-gfx at lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH v3] drm/i915: Rework GPU reset sequence to
> match driver load & thaw
> 
> On Wed, Aug 20, 2014 at 02:46:37PM +0000, Daniel, Thomas wrote:
> >
> >
> > > -----Original Message-----
> > > From: Intel-gfx [mailto:intel-gfx-bounces at lists.freedesktop.org] On
> > > Behalf Of alistair.mcaulay at intel.com
> > > Sent: Friday, August 15, 2014 6:52 PM
> > > To: intel-gfx at lists.freedesktop.org
> > > Subject: [Intel-gfx] [PATCH v3] drm/i915: Rework GPU reset sequence
> > > to match driver load & thaw
> > >
> > > From: "McAulay, Alistair" <alistair.mcaulay at intel.com>
> > >
> > > This patch is to address Daniels concerns over different code during reset:
> > >
> > > http://lists.freedesktop.org/archives/intel-gfx/2014-June/047758.htm
> > > l
> > >
> > > "The reason for aiming as hard as possible to use the exact same
> > > code for driver load, gpu reset and runtime pm/system resume is that
> > > we've simply seen too many bugs due to slight variations and unintended
> omissions."
> > >
> > > Tested using igt drv_hangman.
> > >
> > > V2: Cleaner way of preventing check_wedge returning -EAGAIN
> > > V3: Clean the last_context during reset, to ensure do_switch() does
> > > the MI_SET_CONTEXT. As per review.
> > > Signed-off-by: McAulay, Alistair <alistair.mcaulay at intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.c         |  6 +++
> > >  drivers/gpu/drm/i915/i915_drv.h         |  3 ++
> > >  drivers/gpu/drm/i915/i915_gem.c         |  4 +-
> > >  drivers/gpu/drm/i915/i915_gem_context.c | 33 +++-------------
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c     | 67 +++++--------------------------
> --
> > >  drivers/gpu/drm/i915/i915_gem_gtt.h     |  3 +-
> > >  6 files changed, 28 insertions(+), 88 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > b/drivers/gpu/drm/i915/i915_drv.c index 5e4fefd..3bfafe6 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -806,7 +806,13 @@ int i915_reset(struct drm_device *dev)
> > >  			!dev_priv->ums.mm_suspended) {
> > >  		dev_priv->ums.mm_suspended = 0;
> > >
> > > +		/* Used to prevent gem_check_wedged returning -EAGAIN
> > > during gpu reset */
> > > +		dev_priv->gpu_error.reload_in_reset = true;
> > > +
> > >  		ret = i915_gem_init_hw(dev);
> > > +
> > > +		dev_priv->gpu_error.reload_in_reset = false;
> > > +
> > >  		mutex_unlock(&dev->struct_mutex);
> > >  		if (ret) {
> > >  			DRM_ERROR("Failed hw init on reset %d\n", ret); diff
> --git
> > > a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 991b663..116daff 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1217,6 +1217,9 @@ struct i915_gpu_error {
> > >
> > >  	/* For missed irq/seqno simulation. */
> > >  	unsigned int test_irq_rings;
> > > +
> > > +	/* Used to prevent gem_check_wedged returning -EAGAIN during
> > > gpu reset   */
> > > +	bool reload_in_reset;
> > >  };
> > >
> > >  enum modeset_restore {
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c
> > > b/drivers/gpu/drm/i915/i915_gem.c index ef047bc..e7396eb 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -1085,7 +1085,9 @@ i915_gem_check_wedge(struct i915_gpu_error
> > > *error,
> > >  		if (i915_terminally_wedged(error))
> > >  			return -EIO;
> > >
> > > -		return -EAGAIN;
> > > +		/* Check if GPU Reset is in progress */
> > > +		if (!error->reload_in_reset)
> > > +			return -EAGAIN;
> 
> This is silly. You already have the same flag above. Look closer.
> -Chris
> 
> --
> Chris Wilson, Intel Open Source Technology Centre

It is not the same. This is a special case when re-initialising the hw. This flag is to allow gem_init_hw() to complete successfully during reset. 
At any other point during reset, -EAGAIN should be returned.

Alistair.



More information about the Intel-gfx mailing list