[Intel-gfx] [PATCH 01/44] drivers/base: Always release devres on device_del

Greg Kroah-Hartman gregkh at linuxfoundation.org
Fri Apr 3 14:17:12 UTC 2020


On Fri, Apr 03, 2020 at 03:57:45PM +0200, Daniel Vetter wrote:
> In drm we've added nice drm_device (the main gpu driver thing, which
> also represents the userspace interfaces and has everything else
> dangling off it) init functions using devres, devm_drm_dev_init and
> soon devm_drm_dev_alloc (this patch series adds that).
> 
> A slight trouble is that drm_device itself holds a reference on the
> struct device it's sitting on top (for sysfs links and dmesg debug and
> lots of other things), so there's a reference loop. For real drivers
> this is broken at remove/unplug time, where all devres resources are
> released device_release_driver(), before the final device reference is
> dropped. So far so good.
> 
> There's 2 exceptions:
> - drm/vkms|vgem: Virtual drivers for which we create a fake/virtual
>   platform device to make them look more like normal devices to
>   userspace. These aren't drivers in the driver model sense, we simple
>   create a platform_device and register it.

That's a horrid abuse of platform devices, just use a "virtual" device
please, create/remove it when you need it, and all should be fine.

> - drm/i915/selftests, where we create minimal mock devices, and again
>   the selftests aren't proper drivers in the driver model sense.

Again, virtual devices are best to use for this.

> For these two cases the reference loop isn't broken, because devres is
> only cleaned up when the last device reference is dropped. But that's
> not happening, because the drm_device holds that last struct device
> reference.
> 
> Thus far this wasn't a problem since the above cases simply
> hand-rolled their cleanup code. But I want to convert all drivers over
> to the devm_ versions, hence it would be really nice if these
> virtual/fake/mock uses-cases could also be managed with devres
> cleanup.
> 
> I see three possible approaches:
> 
> - Clean up devres from device_del (or platform_device_unregister) even
>   when no driver is bound. This seems like the simplest solution, but
>   also the one with the widest impact, and what this patch implements.
>   We might want to include more of the cleanup than just
>   devres_release_all, but this is all I need to get my use case going.

After device_del, you should never be using that structure again anyway.
So why is there any "resource leak"?  You can't recycle the structure,
and you can't assign it to anything else, so eventually you have to do
a final put on the thing, which will free up the resources.

And then all should be fine, right?  But, by putting the freeing here,
you can still have a "live" device that thinks it has resources availble
that it can access, but yet they are now gone.  Yeah, it's probably not
ever going to really happen, but the lifecycles of dynamic devices are
tough to "prove" at times, and I worry that freeing things this early is
going to cause odd disconnect issues.

thanks,

greg k-h


More information about the Intel-gfx mailing list