[PATCH 6/6] drm/amdgpu: Fix driver unload issue

Deng, Emily Emily.Deng at amd.com
Tue Mar 30 08:19:54 UTC 2021


[AMD Official Use Only - Internal Distribution Only]

Hi Christian,
     Yes, I agree with both of you. But the issue occurs randomly, only during driver unload, and at a fairly low rate, so it is hard to find where the memory leak is. Could you give some suggestions on how
to debug this issue?


Best wishes
Emily Deng



>-----Original Message-----
>From: Christian König <ckoenig.leichtzumerken at gmail.com>
>Sent: Tuesday, March 30, 2021 3:11 PM
>To: Deng, Emily <Emily.Deng at amd.com>; Chen, Jiansong (Simon)
><Jiansong.Chen at amd.com>; amd-gfx at lists.freedesktop.org
>Subject: Re: [PATCH 6/6] drm/amdgpu: Fix driver unload issue
>
>Good morning,
>
>yes, Jiansong is right: that patch is really not a good idea.
>
>Moving buffers can indeed happen during shutdown while some memory is
>still referenced.
>
>Just ignoring the move is not the right approach; you need to find out why the
>memory is moved in the first place.
>
>You could add something like WARN_ON(adev->shutdown);
>
>Regards,
>Christian.
>
>Am 30.03.21 um 09:05 schrieb Deng, Emily:
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Hi Jiansong,
>>       It does happen; maybe there is a race condition?
>>
>>
>> Best wishes
>> Emily Deng
>>
>>
>>
>>> -----Original Message-----
>>> From: Chen, Jiansong (Simon) <Jiansong.Chen at amd.com>
>>> Sent: Tuesday, March 30, 2021 2:49 PM
>>> To: Deng, Emily <Emily.Deng at amd.com>; amd-gfx at lists.freedesktop.org
>>> Cc: Deng, Emily <Emily.Deng at amd.com>
>>> Subject: RE: [PATCH 6/6] drm/amdgpu: Fix driver unload issue
>>>
>>> [AMD Official Use Only - Internal Distribution Only]
>>>
>>> I still wonder how the issue takes place. According to my limited
>>> knowledge of the driver model, the reference count of the kobject for
>>> the device will not reach zero while there is still some device memory
>>> access, so shutdown should not happen.
>>>
>>> Regards,
>>> Jiansong
>>> -----Original Message-----
>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
>>> Emily Deng
>>> Sent: Tuesday, March 30, 2021 12:42 PM
>>> To: amd-gfx at lists.freedesktop.org
>>> Cc: Deng, Emily <Emily.Deng at amd.com>
>>> Subject: [PATCH 6/6] drm/amdgpu: Fix driver unload issue
>>>
>>> During driver unload there is no need to copy memory, and doing so
>>> triggers call traces. For example, once sa_manager has been freed,
>>> the copy triggers a warning call trace in amdgpu_sa_bo_new.
>>>
>>> Signed-off-by: Emily Deng <Emily.Deng at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> index e00263bcc88b..f0546a489e0d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> @@ -317,6 +317,9 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
>>>  	struct dma_fence *fence = NULL;
>>>  	int r = 0;
>>>
>>> +	if (adev->shutdown)
>>> +		return 0;
>>> +
>>>  	if (!adev->mman.buffer_funcs_enabled) {
>>>  		DRM_ERROR("Trying to move memory with ring turned off.\n");
>>>  		return -EINVAL;
>>> --
>>> 2.25.1
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
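
The WARN_ON(adev->shutdown) debugging aid suggested in the thread can be illustrated with a small userspace sketch. Everything below is a stand-in and an assumption rather than driver code: MOCK_WARN_ON, struct mock_device and mock_copy_mem_to_mem are hypothetical names, and the placement at the top of the copy function is a guess based on where the patch put its own check. In the kernel, the real WARN_ON() would also print a stack trace, which is what would identify the caller that moves memory during shutdown.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's WARN_ON(): it reports the condition
 * and returns it, but does not print the stack trace the real macro emits. */
#define MOCK_WARN_ON(cond) mock_warn_on((cond), #cond, __func__)

static int warn_count; /* how many times the mock warning has fired */

static bool mock_warn_on(bool cond, const char *expr, const char *func)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s hit in %s()\n", expr, func);
	}
	return cond;
}

/* Hypothetical, heavily reduced stand-in for struct amdgpu_device. */
struct mock_device {
	bool shutdown;
	bool buffer_funcs_enabled;
};

/* Loosely mirrors the entry checks of amdgpu_ttm_copy_mem_to_mem(). */
static int mock_copy_mem_to_mem(struct mock_device *adev)
{
	/* Debug aid only: flag the unexpected move, but do not skip it. */
	MOCK_WARN_ON(adev->shutdown);

	if (!adev->buffer_funcs_enabled) {
		fprintf(stderr, "Trying to move memory with ring turned off.\n");
		return -EINVAL;
	}

	return 0; /* the real function would schedule the copy here */
}
```

Unlike the early "return 0" in the patch, this variant keeps the existing behavior and only makes the unexpected call visible, so the root cause can be traced instead of hidden.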
