[Intel-gfx] [PATCH v3] drm/i915: Rework GPU reset sequence to match driver load & thaw
Mcaulay, Alistair
alistair.mcaulay at intel.com
Wed Aug 20 17:21:55 CEST 2014
> -----Original Message-----
> From: Chris Wilson [mailto:chris at chris-wilson.co.uk]
> Sent: Wednesday, August 20, 2014 3:58 PM
> To: Daniel, Thomas
> Cc: Mcaulay, Alistair; intel-gfx at lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH v3] drm/i915: Rework GPU reset sequence to
> match driver load & thaw
>
> On Wed, Aug 20, 2014 at 02:46:37PM +0000, Daniel, Thomas wrote:
> >
> >
> > > -----Original Message-----
> > > From: Intel-gfx [mailto:intel-gfx-bounces at lists.freedesktop.org] On
> > > Behalf Of alistair.mcaulay at intel.com
> > > Sent: Friday, August 15, 2014 6:52 PM
> > > To: intel-gfx at lists.freedesktop.org
> > > Subject: [Intel-gfx] [PATCH v3] drm/i915: Rework GPU reset sequence
> > > to match driver load & thaw
> > >
> > > From: "McAulay, Alistair" <alistair.mcaulay at intel.com>
> > >
> > > > This patch is to address Daniel's concerns over different code during reset:
> > >
> > > > http://lists.freedesktop.org/archives/intel-gfx/2014-June/047758.html
> > >
> > > > "The reason for aiming as hard as possible to use the exact same
> > > > code for driver load, gpu reset and runtime pm/system resume is that
> > > > we've simply seen too many bugs due to slight variations and unintended
> > > > omissions."
> > >
> > > Tested using igt drv_hangman.
> > >
> > > V2: Cleaner way of preventing check_wedge returning -EAGAIN
> > > V3: Clean the last_context during reset, to ensure do_switch() does
> > > the MI_SET_CONTEXT. As per review.
> > > Signed-off-by: McAulay, Alistair <alistair.mcaulay at intel.com>
> > > ---
> > > drivers/gpu/drm/i915/i915_drv.c | 6 +++
> > > drivers/gpu/drm/i915/i915_drv.h | 3 ++
> > > drivers/gpu/drm/i915/i915_gem.c | 4 +-
> > > drivers/gpu/drm/i915/i915_gem_context.c | 33 +++-------------
> > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 67 +++++----------------------------
> > > drivers/gpu/drm/i915/i915_gem_gtt.h | 3 +-
> > > 6 files changed, 28 insertions(+), 88 deletions(-)
> > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > > index 5e4fefd..3bfafe6 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -806,7 +806,13 @@ int i915_reset(struct drm_device *dev)
> > > !dev_priv->ums.mm_suspended) {
> > > dev_priv->ums.mm_suspended = 0;
> > >
> > > > + /* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */
> > > + dev_priv->gpu_error.reload_in_reset = true;
> > > +
> > > ret = i915_gem_init_hw(dev);
> > > +
> > > + dev_priv->gpu_error.reload_in_reset = false;
> > > +
> > > mutex_unlock(&dev->struct_mutex);
> > > if (ret) {
> > > > DRM_ERROR("Failed hw init on reset %d\n", ret);
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 991b663..116daff 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1217,6 +1217,9 @@ struct i915_gpu_error {
> > >
> > > /* For missed irq/seqno simulation. */
> > > unsigned int test_irq_rings;
> > > +
> > > > + /* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */
> > > + bool reload_in_reset;
> > > };
> > >
> > > enum modeset_restore {
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index ef047bc..e7396eb 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -1085,7 +1085,9 @@ i915_gem_check_wedge(struct i915_gpu_error
> > > *error,
> > > if (i915_terminally_wedged(error))
> > > return -EIO;
> > >
> > > - return -EAGAIN;
> > > + /* Check if GPU Reset is in progress */
> > > + if (!error->reload_in_reset)
> > > + return -EAGAIN;
>
> This is silly. You already have the same flag above. Look closer.
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
It is not the same. This is a special case for re-initialising the hw: the flag exists only to allow i915_gem_init_hw() to complete successfully during reset.
At any other point during reset, -EAGAIN should still be returned.
Alistair.
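
To illustrate the distinction, here is a minimal standalone sketch of the intended behaviour. It is a simplified model, not the driver code: struct model_gpu_error, model_check_wedge() and model_reinit_hw() are hypothetical names, the two "in progress" / "terminally wedged" booleans stand in for i915_reset_in_progress() and i915_terminally_wedged(), and the enclosing reset-in-progress check is assumed, since the quoted hunk does not show it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct i915_gpu_error (only what the sketch needs). */
struct model_gpu_error {
	bool reset_in_progress;  /* models i915_reset_in_progress() */
	bool terminally_wedged;  /* models i915_terminally_wedged() */
	bool reload_in_reset;    /* the flag added by the patch */
};

/* Simplified model of i915_gem_check_wedge() with the patch applied. */
int model_check_wedge(const struct model_gpu_error *error)
{
	if (error->reset_in_progress) {
		if (error->terminally_wedged)
			return -EIO;

		/* Only the hw re-init path is allowed through during reset. */
		if (!error->reload_in_reset)
			return -EAGAIN;
	}

	return 0;
}

/* Models the i915_reset() hunk: the flag is set only around hw re-init. */
int model_reinit_hw(struct model_gpu_error *error)
{
	int ret;

	error->reload_in_reset = true;
	/* Stands in for the wedge checks reached from i915_gem_init_hw(). */
	ret = model_check_wedge(error);
	error->reload_in_reset = false;

	return ret;
}

/* Small self-check of the two cases discussed in this mail. */
void model_self_check(void)
{
	struct model_gpu_error e = { .reset_in_progress = true };

	assert(model_check_wedge(&e) == -EAGAIN); /* ordinary caller during reset */
	assert(model_reinit_hw(&e) == 0);         /* hw re-init during reset */
}
```

In this model an ordinary caller still gets -EAGAIN while the reset is in progress, and only the window around the hw re-init is let through; a terminally wedged GPU still yields -EIO in both cases.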