[PATCH 0/1] [RFC] DRM locking issues during early open

Dave Airlie airlied at gmail.com
Fri Apr 20 02:40:35 PDT 2012

On Thu, Apr 19, 2012 at 5:22 PM, Andy Whitcroft <apw at canonical.com> wrote:
> We have been carrying a (rather poor) patch for an issue we identified in
> the DRM driver.  This issue is triggered when a DRM device is initialising
> and userspace attempts to open it, typically in response to the sysfs
> device added event.  Basically we allocate the minor numbers making
> the device available, and then call the drm load callback.  Until this
> completes the device is really not ready and these early opens typically
> lead to oopses.
>
> We have been using the following patch to avoid this by marking the minors
> as in error until the load method has completed.  This avoids the early
> open by simply erroring out the opens with EAGAIN.  Obviously we should
> be delaying the open until the load method completes.
>
> I include the existing patch for completeness (it is not really ready for
> merging) to illustrate the issue.  I think it is logical that the open
> should simply be delayed until the load has completed.  I am proposing
> to include a wait queue associated with the idr cache for the drm minors
> which we can use to allow open callers to wait_event_interruptible() on.
> I'll be putting together a prototype shortly and will follow up with it.
>
> Thoughts?

I've just revisited this.  Maybe I'm going insane, but why does
drm_global_mutex not stop this?

drm_get_pci_dev takes drm_global_mutex before calling drm_fill_in_dev
and drm_get_minor.

Now the fops should be pointing at drm_stub_open at this point, as we
won't have switched to the per-device fops yet, and one of the first
things drm_stub_open does is take the drm_global_mutex before doing
the idr lookup.
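
To make the ordering argument concrete, here is a rough userspace model
of it using pthreads.  It is not kernel code: model_dev, model_probe,
model_stub_open and model_race_once are hypothetical names invented for
this sketch, and it assumes the load callback also runs before the
probe path drops the mutex.  Under that assumption an open can see "no
such minor" or a fully loaded device, but never a half-initialised one.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; stands in for drm_global_mutex. */
static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical device state, not a real DRM structure. */
struct model_dev {
	bool minor_registered;	/* minor is visible to lookups */
	bool loaded;		/* load callback has completed */
};

enum open_result { OPEN_NODEV, OPEN_NOT_READY, OPEN_OK };

/* Probe path: hold the mutex across minor registration and load. */
static void model_probe(struct model_dev *dev)
{
	pthread_mutex_lock(&global_mutex);
	dev->minor_registered = true;	/* userspace could now try to open */
	/* ASSUMPTION: the driver load callback runs here, still locked */
	dev->loaded = true;
	pthread_mutex_unlock(&global_mutex);
}

/* Open path: take the same mutex before looking the minor up. */
static enum open_result model_stub_open(struct model_dev *dev)
{
	enum open_result res;

	pthread_mutex_lock(&global_mutex);
	if (!dev->minor_registered)
		res = OPEN_NODEV;
	else if (!dev->loaded)
		res = OPEN_NOT_READY;	/* the early-open window */
	else
		res = OPEN_OK;
	pthread_mutex_unlock(&global_mutex);
	return res;
}

static void *probe_thread(void *arg)
{
	model_probe(arg);
	return NULL;
}

/* Race one probe against one open and report what the open saw. */
static enum open_result model_race_once(void)
{
	struct model_dev dev = { false, false };
	enum open_result res;
	pthread_t t;

	if (pthread_create(&t, NULL, probe_thread, &dev) != 0)
		return OPEN_NODEV;
	res = model_stub_open(&dev);
	pthread_join(t, NULL);
	return res;
}
```

If the model matches the real code paths, model_race_once() should
never return OPEN_NOT_READY, so an oops on early open would point at
some path that does not go through this lock.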

So is the problem opening some sysfs or proc file early?


