[PATCH 2/8] drm/xe: convert sysfs over to devm

Rodrigo Vivi rodrigo.vivi at intel.com
Mon Apr 29 18:45:26 UTC 2024


On Mon, Apr 29, 2024 at 04:17:54PM +0100, Matthew Auld wrote:
> On 29/04/2024 14:52, Lucas De Marchi wrote:
> > On Mon, Apr 29, 2024 at 09:28:00AM GMT, Rodrigo Vivi wrote:
> > > On Mon, Apr 29, 2024 at 01:14:38PM +0100, Matthew Auld wrote:
> > > > Hotunplugging the device seems to result in stuff like:
> > > > 
> > > > kobject_add_internal failed for tile0 with -EEXIST, don't try to
> > > > register things with the same name in the same directory.
> > > > 
> > > > We only remove the sysfs as part of drmm, however that is tied to the
> > > > lifetime of the driver instance and not the device underneath. Attempt
> > > > to fix by using devm for all of the remaining sysfs stuff related to the
> > > > device.
> > > 
> > > hmmm... so basically we should use the drmm only for the global module
> > > stuff and the devm for things that are per device?
> > 
> > that doesn't make much sense. drmm is supposed to run when the driver
> > unbinds from the device... basically when all refcounts are gone with
> > drm_dev_put().  Are we keeping a ref we shouldn't?
> 
> It's run when all refcounts are dropped for that particular drm_device, but
> that is separate from the physical device underneath (struct device). For
> example if something has an open driver fd the drmm release action is not
> going to be called until after that is also closed. But in the meantime we
> might have already removed the pci device and re-attached it to a newly
> allocated drm_device/xe_driver instance, like with hotunplug.
> 
> For example, currently we don't even call basic stuff like guc_fini() etc.
> when removing the pci device, but rather when the drm_device is released,
> which sounds quite broken.
> 
> So roughly drmm is for drm_device software level stuff and devm is for stuff
> that needs to happen when removing the device. See also the doc for drmm:
> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_managed.c#L23
> 
> Also: https://docs.kernel.org/gpu/drm-uapi.html#device-hot-unplug

Cc: Aravind and Michal since this likely relates to the FLR discussion...

but it looks to me that we should move more towards devm_ and limit
the use of drmm_ to a few very specific cases...
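
To make the lifetime difference concrete, here is a rough user-space toy
model of the hotunplug case Matthew describes. It is not kernel code and
not what the patch does: every toy_* name, the action list and the single
tile0 flag are made up for illustration. The only assumptions taken from
the thread are that devm-style actions run when the physical device is
removed, and drmm-style actions run only once the last drm_device
reference (e.g. an open fd) is dropped.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_ACTIONS 8

typedef void (*action_fn)(void *data);

/* Release actions, run in reverse order of registration. */
struct action_list {
	action_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	int count;
};

static void add_action(struct action_list *l, action_fn fn, void *data)
{
	assert(l->count < MAX_ACTIONS);
	l->fn[l->count] = fn;
	l->data[l->count] = data;
	l->count++;
}

static void run_actions(struct action_list *l)
{
	while (l->count > 0) {
		l->count--;
		l->fn[l->count](l->data[l->count]);
	}
}

/* Hypothetical stand-in for the device's sysfs directory. */
struct toy_sysfs { int tile0_present; };

/* Hypothetical stand-in for struct device: devm actions run on remove. */
struct toy_device {
	struct toy_sysfs sysfs;
	struct action_list devm;
};

/* Hypothetical stand-in for struct drm_device: drmm actions run on final put. */
struct toy_drm_device {
	int refcount;
	struct action_list drmm;
};

static int toy_sysfs_add_tile0(struct toy_sysfs *s)
{
	if (s->tile0_present)
		return -EEXIST;	/* same name, same directory */
	s->tile0_present = 1;
	return 0;
}

static void toy_sysfs_remove_tile0(void *data)
{
	((struct toy_sysfs *)data)->tile0_present = 0;
}

static void toy_drm_put(struct toy_drm_device *drm)
{
	if (--drm->refcount == 0)
		run_actions(&drm->drmm);
}

/* use_devm selects which lifetime the sysfs cleanup is tied to. */
static int toy_probe(struct toy_device *dev, struct toy_drm_device *drm,
		     int use_devm)
{
	int ret;

	memset(drm, 0, sizeof(*drm));
	drm->refcount = 1;	/* reference held by the driver */
	ret = toy_sysfs_add_tile0(&dev->sysfs);
	if (ret)
		return ret;
	add_action(use_devm ? &dev->devm : &drm->drmm,
		   toy_sysfs_remove_tile0, &dev->sysfs);
	return 0;
}

static void toy_remove(struct toy_device *dev, struct toy_drm_device *drm)
{
	run_actions(&dev->devm);	/* tied to the physical device */
	toy_drm_put(drm);		/* drops only the driver's reference */
}

/* Probe, open an fd, unplug, re-probe while the fd is still open. */
int toy_hotunplug_reprobe(int use_devm)
{
	struct toy_device dev = {0};
	struct toy_drm_device first, second;
	int ret;

	ret = toy_probe(&dev, &first, use_devm);
	if (ret)
		return ret;
	first.refcount++;			/* userspace opens the device */
	toy_remove(&dev, &first);		/* hotunplug */
	ret = toy_probe(&dev, &second, use_devm); /* device re-attached */
	toy_drm_put(&first);			/* fd closed much later */
	return ret;
}
```

In this model toy_hotunplug_reprobe(0) (cleanup registered drmm-style)
fails the second probe with -EEXIST because the old tile0 entry is still
there, while toy_hotunplug_reprobe(1) (devm-style) succeeds. The real
ordering of devm release relative to the remove callback is simplified
here.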

> 
> > 
> > Lucas De Marchi

