Amdgpu module is references even after unbinding the vtcon

Thomas Zimmermann tzimmermann at suse.de
Thu Jan 26 12:54:20 UTC 2023


Hi

Am 26.01.23 um 13:45 schrieb Christian König:
> Am 26.01.23 um 13:40 schrieb Thomas Zimmermann:
>> Hi
>>
>> Am 26.01.23 um 10:49 schrieb Slivka, Danijel:
>>> Hi Thomas,
>>>
>>> I have checked what you mentioned.
>>> When loading amdgpu we call drm_client_init() during fbdev setup
>>> [1], and the refcnt for drm_kms_helper increases from 3 -> 4.
>>> When we unbind vtcon, the refcnt for drm_kms_helper drops from 4 -> 3,
>>> but drm_client_release() [2] is not called.
>>> drm_client_release() is called only when unloading the amdgpu
>>> driver.
>>>
>>> Is this expected?
>>>
>>> There is a comment for drm_client_release() regarding fbdev:
>>>   * This function should only be called from the unregister callback. An exception
>>>   * is fbdev which cannot free the buffer if userspace has open file descriptors.
>>>
>>> Could this be relevant for our use case, even though
>>> Application/X/GDM are stopped at that point and no fd should be open?
>>
>> This looks like the bug to me.
>>
>> I'm not sure why the client code takes the module reference in the
>> first place. Drivers invoke the client interface directly. Shouldn't
>> that imply that they have a module reference already?
> 
> It's not the client code that takes the module reference, it's the
> DMA-buf code.
> 
> As far as we have narrowed this down, GDM/X inspects the existing
> configuration during startup. While doing so, it exports the BO initially
> created by fbdev as a DMA-buf (probably to give it to EGL or something
> like this). This DMA-buf export is what adds the module reference.
> 
> The problem now is that when GDM/X exits, the DMA-buf should be destroyed
> again, but it isn't: obj->handle_count never drops to zero because the
> drm_client interface keeps its handle around even after creating the DRM
> framebuffer object.
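
The mechanism described above can be sketched as a toy model. This is not kernel code: the class names, the `run_scenario` helper and the `client_keeps_handle` flag are made up for illustration, and dropping the client handle early is only an assumption about what a fix could look like.

```python
class Module:
    """Stand-in for a kernel module with a plain reference counter."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 0


class GemObject:
    """Toy buffer object: tracks handles and an optional DMA-buf export."""

    def __init__(self, module):
        self.module = module
        self.handle_count = 0
        self.exported = False

    def handle_create(self):
        self.handle_count += 1

    def handle_delete(self):
        assert self.handle_count > 0
        self.handle_count -= 1
        # As described above: the export is only torn down once the
        # last handle is gone, and only then is the module released.
        if self.handle_count == 0 and self.exported:
            self.exported = False
            self.module.refcnt -= 1

    def export_dmabuf(self):
        # The export pins the module for as long as it exists.
        if not self.exported:
            self.exported = True
            self.module.refcnt += 1


def run_scenario(client_keeps_handle):
    """Return the module refcount left over after GDM/X has exited."""
    amdgpu = Module("amdgpu")
    bo = GemObject(amdgpu)
    bo.handle_create()        # in-kernel fbdev client allocates the dumb BO
    if not client_keeps_handle:
        bo.handle_delete()    # assumption: handle dropped once the framebuffer exists
    bo.handle_create()        # GDM/X gets its own handle to the fbdev BO
    bo.export_dmabuf()        # ... and exports it as DMA-buf
    bo.handle_delete()        # GDM/X exits and closes its handle
    return amdgpu.refcnt
```

In this model `run_scenario(True)` leaves one module reference behind, while `run_scenario(False)` ends with none, which matches the behaviour described above.
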

OK, thanks. I saw your patch to address the problem. Let me give it a test.
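
For testing, the refcnt values mentioned above can be read from sysfs before and after unbinding vtcon. A minimal sketch, assuming the kernel exposes /sys/module/<name>/refcnt (which requires module unloading support); the helper name `module_refcnt` and its `sysfs_root` parameter are only illustrative.

```python
from pathlib import Path


def module_refcnt(name, sysfs_root="/sys"):
    """Return the refcount of a loaded module, or None if it is not exposed."""
    path = Path(sysfs_root) / "module" / name / "refcnt"
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        # Module not loaded, or the kernel does not expose refcnt.
        return None
```

Comparing `module_refcnt("amdgpu")` and `module_refcnt("drm_kms_helper")` before and after the unbind should show whether the reference is actually dropped.
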

Best regards
Thomas

> 
> Regards,
> Christian.
> 
>>
>> Best regards
>> Thomas
>>
>>>
>>> Thank you,
>>> BR,
>>> Danijel
>>>
>>>> -----Original Message-----
>>>> From: Thomas Zimmermann <tzimmermann at suse.de>
>>>> Sent: Wednesday, January 25, 2023 8:48 PM
>>>> To: Christian König <ckoenig.leichtzumerken at gmail.com>
>>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Slivka, Danijel
>>>> <Danijel.Slivka at amd.com>; dri-devel 
>>>> <dri-devel at lists.freedesktop.org>; Sharma,
>>>> Shashank <Shashank.Sharma at amd.com>
>>>> Subject: Re: Amdgpu module is references even after unbinding the vtcon
>>>>
>>>> Hi Christian
>>>>
>>>> Am 24.01.23 um 15:12 schrieb Christian König:
>>>>> Hi Thomas,
>>>>>
>>>>> we ran into a problem with the general fbcon/fbdev implementation and
>>>>> thought that you might have some idea.
>>>>>
>>>>> What happens is the following:
>>>>> 1. We load amdgpu and get our normal fbcon.
>>>>> 2. fbcon allocates a dumb BO as backing store for the console.
>>>>> 3. GDM/X/Applications start, new framebuffers are created, BOs
>>>>> imported, exported etc...
>>>>> 4. Somehow X or GDM iterates over all the framebuffer objects the
>>>>> kernel knows about and exports them as DMA-buf.
>>>>> 5. Application/X/GDM are stopped, handles closed, framebuffers
>>>>> released etc...
>>>>> 6. We unbind vtcon.
>>>>>
>>>>> At this point the amdgpu module usually has a reference count of 0 and
>>>>> can be unloaded, but since GDM/X/Whoever iterated over all the known
>>>>> framebuffers and exported them as DMA-buf (for whatever reason idk) we
>>>>> now still have an exported DMA-buf and with it a reference to the 
>>>>> module.
>>>>>
>>>>> Any idea how we could prevent that?
>>>>
>>>> Here's another stab in the dark.
>>>>
>>>> The big difference between old-style fbdev and the new one is that the old fbdev
>>>> setup (e.g., radeon) allocates a GEM object and puts together the fbdev data
>>>> structures from the BO in a fairly hackish way. The new style uses an in-kernel
>>>> client with a file to allocate the BO via dumb buffers; and holds a reference to the
>>>> DRM module.
>>>>
>>>> Maybe the reference comes from the in-kernel DRM client itself. [1] Check if the
>>>> client resources get released [2] when you unbind vtcon.
>>>>
>>>> Best regards
>>>> Thomas
>>>>
>>>> [1]
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_client.c#L87
>>>> [2]
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_client.c#L160
>>>>
>>>>>
>>>>> Thanks,
>>>>> Christian.
>>>>
>>>> -- 
>>>> Thomas Zimmermann
>>>> Graphics Driver Developer
>>>> SUSE Software Solutions Germany GmbH
>>>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>>>> (HRB 36809, AG Nürnberg)
>>>> Geschäftsführer: Ivo Totev
>>
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev