[PATCH 2/8] drm/xe: convert sysfs over to devm

Lucas De Marchi lucas.demarchi at intel.com
Mon Apr 29 21:28:42 UTC 2024


On Mon, Apr 29, 2024 at 02:45:26PM GMT, Rodrigo Vivi wrote:
>On Mon, Apr 29, 2024 at 04:17:54PM +0100, Matthew Auld wrote:
>> On 29/04/2024 14:52, Lucas De Marchi wrote:
>> > On Mon, Apr 29, 2024 at 09:28:00AM GMT, Rodrigo Vivi wrote:
>> > > On Mon, Apr 29, 2024 at 01:14:38PM +0100, Matthew Auld wrote:
>> > > > Hotunplugging the device seems to result in stuff like:
>> > > >
>> > > > kobject_add_internal failed for tile0 with -EEXIST, don't try to
>> > > > register things with the same name in the same directory.
>> > > >
>> > > > We only remove the sysfs entries as part of drmm; however, that is tied
>> > > > to the lifetime of the driver instance and not the device underneath.
>> > > > Attempt to fix by using devm for all of the remaining sysfs stuff
>> > > > related to the device.
>> > >
>> > > hmmm... so basically we should use the drmm only for the global module
>> > > stuff and the devm for things that are per device?
>> >
>> > that doesn't make much sense. drmm is supposed to run when the driver
>> > unbinds from the device... basically when all refcounts are gone with
>> > drm_dev_put().  Are we keeping a ref we shouldn't?
>>
>> It's run when all refcounts are dropped for that particular drm_device, but
>> that is separate from the physical device underneath (struct device). For
>> example if something has an open driver fd the drmm release action is not
>> going to be called until after that is also closed. But in the meantime we
>> might have already removed the pci device and re-attached it to a newly
>> allocated drm_device/xe_driver instance, like with hotunplug.
>>
>> For example, currently we don't even call basic stuff like guc_fini() etc.
>> when removing the pci device, but rather when the drm_device is released,
>> which sounds quite broken.
>>
>> So roughly drmm is for drm_device software level stuff and devm is for stuff
>> that needs to happen when removing the device. See also the doc for drmm:
>> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_managed.c#L23
>>
>> Also: https://docs.kernel.org/gpu/drm-uapi.html#device-hot-unplug

yeah... I think you convinced me
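
To spell out the two lifetimes described above, here is a rough user-space
model. It is not kernel code: every name in it (model_drm_device,
model_drm_dev_put, model_rebind_hits_eexist) is hypothetical and only mimics
the ordering. It assumes that devm-style actions run at unbind and that
drmm-style actions run on the last reference drop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these are kernel APIs. */
struct model_drm_device {
	int refcount;        /* stands in for the drm_device refcount */
	bool drmm_released;  /* set once drmm-style actions have run */
};

/* Drop one reference; drmm-style actions run only on the last put. */
static void model_drm_dev_put(struct model_drm_device *drm)
{
	assert(drm->refcount > 0);
	if (--drm->refcount == 0)
		drm->drmm_released = true;
}

/*
 * Model unbind followed by an immediate re-probe of the same device.
 * Returns true if "tile0" would still exist at re-probe time, i.e. the
 * -EEXIST case from the commit message.
 */
static bool model_rebind_hits_eexist(bool sysfs_uses_devm, bool fd_open)
{
	struct model_drm_device drm = { .refcount = 1 }; /* driver's reference */
	bool tile0_present = true;                       /* created at probe */

	if (fd_open)
		drm.refcount++;          /* an open driver fd holds a reference */

	/* Unbind: devm-style actions run now, whatever the refcount is. */
	if (sysfs_uses_devm)
		tile0_present = false;

	/* The driver drops its reference; drmm runs only if it was the last. */
	model_drm_dev_put(&drm);
	if (drm.drmm_released && !sysfs_uses_devm)
		tile0_present = false;

	/* Re-probe happens here, possibly while the old fd is still open. */
	return tile0_present;
}
```

In this model only the combination of drmm-managed sysfs and an open fd
leaves tile0 behind at re-probe time; with devm it is always gone by then.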

>
>Cc: Aravind and Michal since this likely relates to the FLR discussion...
>
>but it looks to me that we should move more towards the devm_ and limit
>the usage of drmm_ to some very specific cases...

agreed,

Lucas De Marchi
