[Intel-gfx] [PATCH] drm/i915: Prevent lock-cycles between GPU waits and GPU resets
Chris Wilson
chris at chris-wilson.co.uk
Wed Jun 12 11:04:46 UTC 2019
Quoting Mika Kuoppala (2019-06-12 12:00:07)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
>
> > We cannot allow ourselves to wait on the GPU while holding any lock we
>
> s/we/as we?
>
> My english parser is not strong.
>
> > may need to reset the GPU. While there is not an explicit lock between
> > the two operations, lockdep cannot detect the dependency. So let's tell
> > lockdep about the wait/reset dependency with an explicit lockmap.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > ---
> > This is *annoyingly* good at detecting lock cycles in GPU reset.
> > -Chris
> > ---
> > drivers/gpu/drm/i915/gt/intel_reset.c | 5 ++++-
> > drivers/gpu/drm/i915/i915_drv.h | 8 ++++++++
> > drivers/gpu/drm/i915/i915_gem.c | 3 +++
> > drivers/gpu/drm/i915/i915_request.c | 2 ++
> > drivers/gpu/drm/i915/selftests/mock_gem_device.c | 2 ++
> > 5 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > index 60d24110af80..6368b37f26d1 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > @@ -978,10 +978,11 @@ void i915_reset(struct drm_i915_private *i915,
> >
> > might_sleep();
> > GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
> > + lock_map_acquire(&i915->gt.reset_lockmap);
> >
> > /* Clear any previous failed attempts at recovery. Time to try again. */
> > if (!__i915_gem_unset_wedged(i915))
> > - return;
> > + goto unlock;
> >
> > if (reason)
> > dev_notice(i915->drm.dev, "Resetting chip for %s\n", reason);
> > @@ -1029,6 +1030,8 @@ void i915_reset(struct drm_i915_private *i915,
> >
> > finish:
> > reset_finish(i915);
> > +unlock:
> > + lock_map_release(&i915->gt.reset_lockmap);
> > return;
>
> The return patter in this function is rather unorthodox. Might be
> even that I reviewed it. Very close that I fell into trap of thinking
> that you return with lock held.
Sssh. It's a one-off unorthodoxy. Exception to the rule type of thing.
> > taint:
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 0ea7f78ae227..9cfa9500fcc4 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1919,6 +1919,14 @@ struct drm_i915_private {
> > ktime_t last_init_time;
> >
> > struct i915_vma *scratch;
> > +
> > + /*
> > + * We must never wait on the GPU while holding a lock we may
>
> My english parser still expected 'as' somewhere in there.
Both fixes required, thanks.
-Chris
More information about the Intel-gfx
mailing list