Amdgpu module is referenced even after unbinding the vtcon
Thomas Zimmermann
tzimmermann at suse.de
Thu Jan 26 12:54:20 UTC 2023
Hi
On 26.01.23 at 13:45, Christian König wrote:
> On 26.01.23 at 13:40, Thomas Zimmermann wrote:
>> Hi
>>
>> On 26.01.23 at 10:49, Slivka, Danijel wrote:
>>> [AMD Official Use Only - General]
>>>
>>> Hi Thomas,
>>>
>>> I have checked what you mentioned.
>>> When loading amdgpu we call drm_client_init() during fbdev setup
>>> [1], the refcnt for drm_kms_helper increases from 3 -> 4.
>>> When we unbind vtcon, refcnt for drm_kms_helper drops 4 -> 3, but the
>>> drm_client_release() [2] is not called.
>>> The drm_client_release() is called only when unloading the amdgpu
>>> driver.
>>>
>>> Is this expected?
>>>
>>> There is a comment for drm_client_release() regarding fbdev:
>>> * This function should only be called from the unregister callback. An exception
>>> * is fbdev which cannot free the buffer if userspace has open file descriptors.
>>>
>>> Could this be relevant for our use case, even though Application/X/GDM
>>> are stopped at that point and no fd should be open?
>>
>> This looks like the bug to me.
>>
>> I'm not sure why the client code takes the module reference in the
>> first place. Drivers invoke the client interface directly. Shouldn't
>> that imply that they have a module reference already?
>
> It's not the client code that takes the module reference, it's the
> DMA-buf code.
>
> As far as we have narrowed this down, GDM/X inspects the existing
> configuration during startup and, while doing so, exports the BO
> initially created by fbdev as a DMA-buf (probably to give it to EGL or
> something like this). This DMA-buf export is what adds the module
> reference.
>
> The problem is that when GDM/X exits, the DMA-buf should be destroyed
> again, but it isn't: obj->handle_count never drops to zero because the
> drm_client interface keeps the handle around even after creating the DRM
> framebuffer object.
OK, thanks. I saw your patch to address the problem. Let me give it a test.
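The reference chain described above can be restated as a toy userspace model. This is only a sketch, not kernel code: every toy_* name is hypothetical, and it assumes that the GEM object caches one reference on its exported DMA-buf for as long as handle_count is non-zero, and that the DMA-buf pins the exporting module until its last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are the real kernel structures. */
struct toy_module { int refcnt; };

struct toy_dma_buf {
    int refcnt;                /* references on the exported buffer */
    struct toy_module *owner;  /* module pinned while the buffer exists */
};

struct toy_gem_object {
    int handle_count;            /* open handles (userspace or in-kernel client) */
    struct toy_dma_buf *dma_buf; /* cached export, NULL if never exported */
    struct toy_dma_buf storage;  /* backing storage, avoids malloc here */
};

static void toy_dma_buf_put(struct toy_dma_buf *buf)
{
    assert(buf->refcnt > 0);
    if (--buf->refcnt == 0)
        buf->owner->refcnt--;    /* last reference gone: unpin the module */
}

static void toy_handle_create(struct toy_gem_object *obj)
{
    obj->handle_count++;
}

/* Assumption: the cached export is dropped only with the last handle. */
static void toy_handle_put(struct toy_gem_object *obj)
{
    assert(obj->handle_count > 0);
    if (--obj->handle_count == 0 && obj->dma_buf) {
        struct toy_dma_buf *buf = obj->dma_buf;
        obj->dma_buf = NULL;
        toy_dma_buf_put(buf);
    }
}

/* Export: the first export pins the module; the caller gets its own ref. */
static struct toy_dma_buf *toy_export(struct toy_gem_object *obj,
                                      struct toy_module *mod)
{
    if (!obj->dma_buf) {
        obj->storage.refcnt = 1; /* reference held by the object's cache */
        obj->storage.owner = mod;
        mod->refcnt++;
        obj->dma_buf = &obj->storage;
    }
    obj->dma_buf->refcnt++;      /* reference held by the exported fd */
    return obj->dma_buf;
}

/* Returns the module refcount left after userspace has exited. */
static int toy_simulate(bool client_keeps_handle)
{
    struct toy_module mod = { 0 };
    struct toy_gem_object obj = { 0 };

    toy_handle_create(&obj);     /* in-kernel client allocates the BO */
    if (!client_keeps_handle)
        toy_handle_put(&obj);    /* hypothetical: drop it once the fb exists */

    toy_handle_create(&obj);     /* GDM/X obtains a handle to the fbdev BO */
    struct toy_dma_buf *fd = toy_export(&obj, &mod);

    toy_dma_buf_put(fd);         /* GDM/X exits: fd closed ... */
    toy_handle_put(&obj);        /* ... and its handle closed */

    return mod.refcnt;
}
```

In this model toy_simulate(true) leaves the module refcount at 1, which mirrors the stuck reference reported in the thread, while toy_simulate(false) returns 0 because the last handle drop releases the cached export.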
Best regards
Thomas
>
> Regards,
> Christian.
>
>>
>> Best regards
>> Thomas
>>
>>>
>>> Thank you,
>>> BR,
>>> Danijel
>>>
>>>> -----Original Message-----
>>>> From: Thomas Zimmermann <tzimmermann at suse.de>
>>>> Sent: Wednesday, January 25, 2023 8:48 PM
>>>> To: Christian König <ckoenig.leichtzumerken at gmail.com>
>>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Slivka, Danijel
>>>> <Danijel.Slivka at amd.com>; dri-devel
>>>> <dri-devel at lists.freedesktop.org>; Sharma,
>>>> Shashank <Shashank.Sharma at amd.com>
>>>> Subject: Re: Amdgpu module is referenced even after unbinding the vtcon
>>>>
>>>> Hi Christian
>>>>
>>>> On 24.01.23 at 15:12, Christian König wrote:
>>>>> Hi Thomas,
>>>>>
>>>>> we ran into a problem with the general fbcon/fbdev implementation and
>>>>> thought that you might have some idea.
>>>>>
>>>>> What happens is the following:
>>>>> 1. We load amdgpu and get our normal fbcon.
>>>>> 2. fbcon allocates a dumb BO as backing store for the console.
>>>>> 3. GDM/X/Applications start, new framebuffers are created, BOs
>>>>> imported, exported etc...
>>>>> 4. Somehow X or GDM iterates over all the framebuffer objects the
>>>>> kernel knows about and exports them as DMA-buf.
>>>>> 5. Application/X/GDM are stopped, handles closed, framebuffers
>>>>> released etc...
>>>>> 6. We unbind vtcon.
>>>>>
>>>>> At this point the amdgpu module usually has a reference count of 0 and
>>>>> can be unloaded, but since GDM/X/Whoever iterated over all the known
>>>>> framebuffers and exported them as DMA-buf (for whatever reason idk) we
>>>>> now still have an exported DMA-buf and with it a reference to the
>>>>> module.
>>>>>
>>>>> Any idea how we could prevent that?
>>>>
>>>> Here's another stab in the dark.
>>>>
>>>> The big difference between old-style fbdev and the new one is that the
>>>> old fbdev setup (e.g., radeon) allocates a GEM object and puts together
>>>> the fbdev data structures from the BO in a fairly hackish way. The new
>>>> style uses an in-kernel client with a file to allocate the BO via dumb
>>>> buffers, and holds a reference to the DRM module.
>>>>
>>>> Maybe the reference comes from the in-kernel DRM client itself. [1]
>>>> Check if the client resources get released [2] when you unbind vtcon.
>>>>
>>>> Best regards
>>>> Thomas
>>>>
>>>> [1]
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_client.c#L87
>>>> [2]
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_client.c#L160
>>>>
>>>>>
>>>>> Thanks,
>>>>> Christian.
>>>>
>>>> --
>>>> Thomas Zimmermann
>>>> Graphics Driver Developer
>>>> SUSE Software Solutions Germany GmbH
>>>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>>>> (HRB 36809, AG Nürnberg)
>>>> Geschäftsführer: Ivo Totev
>>
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev