[PATCH v2 02/37] drm/nouveau: handle pci/tegra drm_dev_{alloc, register} from common code

Jason Gunthorpe jgg at nvidia.com
Sun Jul 28 23:04:52 UTC 2024


On Sun, Jul 28, 2024 at 11:34:14PM +0200, Danilo Krummrich wrote:
> On Sun, Jul 28, 2024 at 03:13:08PM -0300, Jason Gunthorpe wrote:

> I think we're on the same page with all that. As clarified in [1], that's not a
> big concern, I was referring to the changes required to integrate the auxbus
> stuff.

Well, I see this thread having the realization that things are not
set up properly to use devres. To be fair, devres creates almost as many
bugs as it solves :\ cleanup.h is possibly a better option for most
simple things, and harder to misuse...
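
For readers unfamiliar with scope-based cleanup, here is a minimal
userspace sketch of the idea. It relies on the GCC/Clang `cleanup`
variable attribute, which the kernel's cleanup.h helpers are built on.
The names `auto_free`, `free_buf`, `buffers_freed` and `parse_len` are
made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter so the effect of the cleanup is observable. */
static int buffers_freed;

/* Cleanup handler: the compiler passes a pointer to the variable. */
static void free_buf(char **p)
{
	assert(p != NULL);
	if (*p) {
		free(*p);
		buffers_freed++;
	}
}

/* Hypothetical macro, similar in spirit to the kernel's __free(). */
#define auto_free __attribute__((cleanup(free_buf)))

/*
 * Copies src into a scratch buffer and returns its length, or -1 on
 * failure. The buffer is released on every return path without an
 * explicit free() or goto-based unwind.
 */
static int parse_len(const char *src, size_t max)
{
	auto_free char *tmp = malloc(max + 1);

	if (!tmp)
		return -1;
	if (strlen(src) > max)
		return -1;	/* early return: tmp is still freed */

	strcpy(tmp, src);
	return (int)strlen(tmp);
}
```

The resource is tied to a lexical scope rather than to the lifetime of
a struct device, which is why this style tends to be harder to misuse
for simple, function-local allocations.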

> > normal (though most subsystems would call that unregister, not put)
> 
> A DRM device is reference counted and can out-live the driver, hence the
> drm_dev_put() call in .remove(). There is also a special drm_dev_unplug()
> function, which does not only unregister the DRM device, but also sets a guard
> to be able to prevent HW accesses after the HW is no longer accessible.

Every subsystem has a refcounted object; struct device is inherently
refcounted. You call the thing the driver calls during .remove()
'unregister' because it is special. Once it returns, the subsystem has
to promise that no more code is running in driver callbacks, and the
driver is permitted to start destroying anything it might need to use
when processing any callbacks.
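
The distinction between 'unregister' and 'put' can be sketched with a
toy userspace model. This is only an illustration under simplifying
assumptions: it is single-threaded, so it does not show the waiting for
in-flight callbacks that a real subsystem must do, and every `toy_*`
name is hypothetical rather than a DRM or driver-core API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy object; not a real kernel structure. */
struct toy_dev {
	int refcount;		/* memory lifetime: freed when it hits 0 */
	bool registered;	/* callbacks may enter the driver only if true */
	int *hw_state;		/* driver-owned resource, destroyed in remove */
};

static int toy_freed;		/* counts frees so lifetime is observable */

static struct toy_dev *toy_alloc(void)
{
	struct toy_dev *d = calloc(1, sizeof(*d));

	if (d)
		d->refcount = 1;
	return d;
}

static void toy_get(struct toy_dev *d)
{
	d->refcount++;
}

static void toy_put(struct toy_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0) {
		free(d);
		toy_freed++;
	}
}

static void toy_register(struct toy_dev *d, int *hw_state)
{
	d->hw_state = hw_state;
	d->registered = true;
}

/*
 * The promise: once this returns, no callback will reach driver state.
 * A real subsystem would also wait here for callbacks already running.
 */
static void toy_unregister(struct toy_dev *d)
{
	d->registered = false;
}

/* A subsystem entry point that wants to call into the driver. */
static int toy_callback(struct toy_dev *d)
{
	if (!d->registered)
		return -ENODEV;
	return *d->hw_state;
}

/* What a driver's .remove() would do in this model. */
static void toy_remove(struct toy_dev *d)
{
	toy_unregister(d);	/* first: fence off all callbacks */
	free(d->hw_state);	/* now safe to destroy driver resources */
	d->hw_state = NULL;
	toy_put(d);		/* last: drop the driver's reference */
}
```

In this model a caller that took an extra reference with toy_get()
before toy_remove() can still call toy_callback() afterwards and gets
-ENODEV instead of touching freed state. The object itself is only
freed by that caller's final toy_put(). Unregister governs who may
still call into the driver; put only governs when the memory goes away.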

This is really tricky and people routinely misunderstand the
requirements and get this wrong. The consequence is UAF problems in
obscure cases with unbind races (that few actually care about), but
getting it right starts with labeling things properly :)

We went through this long ago in RDMA because someone actually had a
use case of live driver unbind. Making that work reliably under a fully
active workload took some thoughtfulness.

Jason

