[Intel-gfx] [PATCH v4 02/38] drm/i915: Explicit power enable during deferred context initialisation

Imre Deak imre.deak at intel.com
Tue Jan 12 07:59:46 PST 2016


On ti, 2016-01-12 at 16:35 +0100, Daniel Vetter wrote:
> On Tue, Jan 12, 2016 at 02:21:51PM +0000, John Harrison wrote:
> > On 12/01/2016 14:04, Daniel Vetter wrote:
> > > On Tue, Jan 12, 2016 at 11:50:34AM +0000, John Harrison wrote:
> > > > On 12/01/2016 11:28, Chris Wilson wrote:
> > > > > On Tue, Jan 12, 2016 at 11:11:20AM +0000, John Harrison
> > > > > wrote:
> > > > > > On 12/01/2016 00:20, Chris Wilson wrote:
> > > > > > > On Mon, Jan 11, 2016 at 06:42:31PM +0000, John.C.Harrison
> > > > > > > @Intel.com wrote:
> > > > > > > > From: John Harrison <John.C.Harrison at Intel.com>
> > > > > > > > 
> > > > > > > > A later patch in this series re-organises the batch
> > > > > > > > buffer submission
> > > > > > > > code. Part of that is to reduce the scope of a
> > > > > > > > pm_get/put pair.
> > > > > > > > Specifically, they previously wrapped the entire
> > > > > > > > submission path from
> > > > > > > > the very start to the very end, now they only wrap the
> > > > > > > > actual hardware
> > > > > > > > submission part in the back half.
> > > > > > > However, as you haven't fixed the ordering issue that
> > > > > > > requires rpm_get
> > > > > > > before struct_mutex, this is broken.
> > > > > > Why does 'intel_runtime_pm_get' require the struct mutex to
> > > > > > be held?
> > > > > > It has certainly not complained at me about trying to do
> > > > > > stuff
> > > > > > without it.
> > > > > Because it depends upon the struct_mutex and rpm doesn't have
> > > > > sufficient
> > > > > lockdep integration to be able to warn about using rpm from
> > > > > the
> > > > > incorrect contexts.
> > > > Where? What does the 'pm_runtime_get_sync' call turn into?
> > > > There are already
> > > > other places in the driver which call intel_runtime_pm_get()
> > > > immediately
> > > > after grabbing the mutex lock. Also, the description comment
> > > > for _pm_get()
> > > > does not mention anything about mutexes at all.
> > > If you nest rpm_get within dev->struct_mutex that's a bug and
> > > could
> > > deadlock. Where does this happen? And for any such place we need
> > > a new
> > > subtest in pm_rpm.
> > 
> > The first two hits when grepping the driver are in
> > 'i915_gem_seqno_info()'
> > and 'i915_interrupt_info()' in i915_debugfs.c. Both say:
> >     ret = mutex_lock_interruptible(&dev->struct_mutex);
> >     if (ret)
> >         return ret;
> >     intel_runtime_pm_get(dev_priv);
> 
> Yeah that's totally bonkers and will deadlock if the device is
> actually
> suspend.

It won't deadlock, runtime resume doesn't need struct mutex. Runtime
suspend needs it, but we return -EAGAIN from there if it's already
held. That in turn will delay the suspend.

--Imre


More information about the Intel-gfx mailing list