[PATCH] drivers/base: use a worker for sysfs unbind

Daniel Vetter daniel at ffwll.ch
Thu Dec 13 09:58:14 UTC 2018


On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote:
> On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> >
> > Drivers might want to remove some sysfs files, which needs the same
> > locks and ends up angering lockdep. Relevant snippet of the stack
> > trace:
> >
> >   kernfs_remove_by_name_ns+0x3b/0x80
> >   bus_remove_driver+0x92/0xa0
> >   acpi_video_unregister+0x24/0x40
> >   i915_driver_unload+0x42/0x130 [i915]
> >   i915_pci_remove+0x19/0x30 [i915]
> >   pci_device_remove+0x36/0xb0
> >   device_release_driver_internal+0x185/0x250
> >   unbind_store+0xaf/0x180
> >   kernfs_fop_write+0x104/0x190
> 
> Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
> source of the lockdep unhappiness?

Yeah I guess I cut out too much of the lockdep splat. It complains about
kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock
class. It's ofc not the same lock, so no real deadlock. Getting the
device_release_driver outside of the callchain under kernfs_fop_write,
which this patch does, "fixes" it. For "fixes" = shut up lockdep.

Other options:
- Annotate the recursion with the usual lockdep annotations. Potentially
  results in lockdep not catching real deadlocks (you can still have other
  loops closing the deadlock, maybe through some subsystem/bus lock).

- Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks
  that know what they're doing), which should be fine if we refcount
  everything properly (bus, driver & device).

- Also note that probably the same bug exists on the bind sysfs interface,
  but we don't use that, so I don't care :-)

- Most of these issues are never visible in normal usage, since normally
  driver bind/unbind is done from a kthread or module load/unload, neither
  of which is running in the context of that kernfs mutex kernfs_fop_write
  holds. That's why I think the task work is the best solution, since it
  changes the locking context of the unbind sysfs to match the locking
  context of module unload and hotunplug. Unfortunately that trick doesn't
  work for the bind sysfs file, since that way we can't thread the errno
  value back to userspace.
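
To make the locking-context argument concrete, here is a minimal userspace
sketch in plain C. It is not the patch and uses no kernel API: the names
write_lock_held, defer(), run_deferred(), release_driver() and
unbind_write() are all hypothetical stand-ins. A boolean models the lock
held across the sysfs write, and a small queue models work that only runs
after the write handler has dropped that lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the lock held across the sysfs write. */
static bool write_lock_held;
static int releases_done;        /* how often the release path ran */
static int releases_under_lock;  /* how often it ran with the lock held */

typedef void (*deferred_fn)(void *arg);

struct deferred_work {
    deferred_fn fn;
    void *arg;
};

#define MAX_PENDING 8
static struct deferred_work pending[MAX_PENDING];
static size_t n_pending;

/* Queue work to run later; returns 0 on success, -1 if the queue is full. */
static int defer(deferred_fn fn, void *arg)
{
    if (n_pending == MAX_PENDING)
        return -1;
    pending[n_pending].fn = fn;
    pending[n_pending].arg = arg;
    n_pending++;
    return 0;
}

/* Run queued work. Must only be called once the write lock is dropped. */
static void run_deferred(void)
{
    assert(!write_lock_held);
    for (size_t i = 0; i < n_pending; i++)
        pending[i].fn(pending[i].arg);
    n_pending = 0;
}

/* Models the driver release path, which takes locks of the same class. */
static void release_driver(void *arg)
{
    (void)arg;
    if (write_lock_held)
        releases_under_lock++;
    releases_done++;
}

/*
 * Models the unbind write handler. With direct == true the release runs
 * inline, nested under the write lock. With direct == false it is only
 * queued, and runs after the lock has been dropped.
 */
static int unbind_write(bool direct)
{
    int ret = 0;

    write_lock_held = true;
    if (direct)
        release_driver(NULL);
    else
        ret = defer(release_driver, NULL);
    write_lock_held = false;

    /* ret is already fixed here, before the deferred release runs. */
    run_deferred();
    return ret;
}
```

Calling unbind_write(true) bumps releases_under_lock, which is the nesting
lockdep complains about; unbind_write(false) does the same release without
it. The return value is decided before the deferred work runs, which is
also why the same trick cannot report an errno from a deferred bind.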

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

