[Intel-gfx] [RFC PATCH v3 4/4] tests/drv_module_reload: add ipvr support

Daniel Vetter daniel at ffwll.ch
Tue Jan 6 23:33:43 PST 2015


On Tue, Jan 06, 2015 at 02:14:27PM +0000, Cheng, Yao wrote:
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel
> > Vetter
> > Sent: Monday, January 5, 2015 16:40
> > To: Cheng, Yao
> > Cc: Daniel Vetter; Thierry Reding; intel-gfx at lists.freedesktop.org; dri-
> > devel at lists.freedesktop.org; Kelley, Sean V; Chehab, John;
> > emil.l.velikov at gmail.com; Jiang, Fei; Beckett, Robert; Barbalho, Rafael
> > Subject: Re: [RFC PATCH v3 4/4] tests/drv_module_reload: add ipvr support
> > 
> > On Sun, Dec 21, 2014 at 02:40:24PM +0000, Cheng, Yao wrote:
> > > > -----Original Message-----
> > > > From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch]
> > > > Sent: Thursday, December 18, 2014 19:21
> > > > To: Thierry Reding
> > > > Cc: Cheng, Yao; intel-gfx at lists.freedesktop.org; dri-
> > > > devel at lists.freedesktop.org; Kelley, Sean V; Chehab, John;
> > > > emil.l.velikov at gmail.com; Jiang, Fei
> > > > Subject: Re: [RFC PATCH v3 4/4] tests/drv_module_reload: add ipvr
> > > > support
> > > >
> > > > On Thu, Dec 18, 2014 at 11:04 AM, Thierry Reding
> > > > <thierry.reding at gmail.com>
> > > > wrote:
> > > > >> I double checked the symptom and found it was a deadlock on
> > > > drm_global_mutex.
> > > > >> When i915_driver_load() registers the platform device while ipvr
> > > > >> module
> > > > is in the system, ipvr's probe() function tries to lock
> > > > drm_global_mutex which was already held by i915.
> > > > >> I think either of the following 2 actions need to be moved to a
> > > > >> bottom half
> > > > e.g. a work queue:
> > > > >>       platform_device_add () call in i915_ved.c (called during
> > > > i915_driver_load())
> > > > >>       drm_dev_register() call during ipvr's probe() Which one
> > > > >> makes more sense? pls kindly advise (I personally prefer the former
> > one.).
> > > > >
> > > > > Yes, that's somewhat ugly, but I don't see a way around that. I'd
> > > > > also think that moving platform_device_add() to a workqueue would
> > > > > be the best option here.
> > > >
> > > > Or we simply kill drm_global_mutex for platform drivers that don't
> > > > use the -
> > > > >probe hook. It should work when they have a correct order betwen
> > > > drm_dev_alloc and _register and all the code in between. So just
> > > > ditch the -
> > > > >load callback in teh ipvr driver and rework the load sequence as
> > > > >suggested
> > > > somewhere else and this is fixed already. No need for bottom halfs I
> > think.
> > >
> > > Daniel, sorry I didn't quite understand "platform drivers that don't
> > > use the probe hook". For initialization, the ipvr platform driver's
> > > probe() is called in following 2 possible paths:
> > > 1. ipvr installed before i915. In this case, ipvr's probe() is called
> > > inside i915_driver_load() and falls into the drm_global_mutex dead lock.
> > > 2. i915 installed before ipvr. In this case, ipvr's probe() is called
> > > without drm_global_mutex held by i915 and no dead lock issue.
> > > If we kill drm_global_mutex, will path 2 run into issue? And in your
> > > suggestion, how to rework the load sequence? Do you mean calling
> > > ipvr's
> > > load() callback directly during platform driver probe()?
> > 
> > Hm right it's not that simple really. What we need in more detail is:
> > - Move the mutex_lock(&drm_global_mutex) out of drm_dev_register into
> >   all the callers. If a driver has a ->load() callback it most likely is
> >   racy with the usual load ordering issues.
> > 
> > - Rework ipvr to no longer have a ->load callback. Insteaed use the
> >   following sequence (in the platform ->probe callback):
> > 
> >   drm_dev_alloc();
> >   ipvr_load();
> >   drm_dev_register();
> > 
> >   With that ordering we don't need the additional guarantees that
> >   drm_global_mutex provides and we can avoid to take that lock around
> >   drm_dev_registrer() call in the ipvr code.
> 
> Thanks for the detailed explanation, Daniel!
> That sounds to be a small refactor on drm core, and need change many drm drivers: nouveau, tegra, udl.
> Should it be a standalone RFC patch?

I think the locking shuffling should be doable in just one patch, but
definitely needs to be split out.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


More information about the Intel-gfx mailing list