FW: [PATCH 1/3] drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()

Andrey Grodzovsky andrey.grodzovsky at amd.com
Fri Oct 22 15:09:56 UTC 2021


On 2021-10-21 7:33 p.m., Yu, Lang wrote:
> [AMD Official Use Only]
>
>
>
>> -----Original Message-----
>> From: Grodzovsky, Andrey <Andrey.Grodzovsky at amd.com>
>> Sent: Thursday, October 21, 2021 11:18 PM
>> To: Yu, Lang <Lang.Yu at amd.com>; amd-gfx at lists.freedesktop.org
>> Subject: Re: FW: [PATCH 1/3] drm/amdgpu: fix a potential memory leak in
>> amdgpu_device_fini_sw()
>>
>> On 2021-10-21 3:19 a.m., Yu, Lang wrote:
>>
>>> [AMD Official Use Only]
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Yu, Lang <Lang.Yu at amd.com>
>>>> Sent: Thursday, October 21, 2021 3:18 PM
>>>> To: Grodzovsky, Andrey <Andrey.Grodzovsky at amd.com>
>>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian
>>>> <Christian.Koenig at amd.com>; Huang, Ray <Ray.Huang at amd.com>; Yu, Lang
>>>> <Lang.Yu at amd.com>
>>>> Subject: [PATCH 1/3] drm/amdgpu: fix a potential memory leak in
>>>> amdgpu_device_fini_sw()
>>>>
>>>> amdgpu_fence_driver_sw_fini() should be executed before
>>>> amdgpu_device_ip_fini(), otherwise fence driver resource won't be
>>>> properly freed as adev->rings have been tore down.
>>
>> Cam you clarify more where exactly the memleak happens ?
>>
>> Andrey
> See amdgpu_fence_driver_sw_fini(), ring->fence_drv.fences will only be freed
> when adev->rings[i] is not NULL.
>
> void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev)
> {
> 	unsigned int i, j;
>
> 	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
> 		struct amdgpu_ring *ring = adev->rings[i];
>
> 		if (!ring || !ring->fence_drv.initialized)
> 			continue;
>
> 		if (!ring->no_scheduler)
> 			drm_sched_fini(&ring->sched);
>
> 		for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
> 			dma_fence_put(ring->fence_drv.fences[j]);
> 		kfree(ring->fence_drv.fences);
> 		ring->fence_drv.fences = NULL;
> 		ring->fence_drv.initialized = false;
> 	}
> }
>
> If amdgpu_device_ip_fini() is executed before amdgpu_fence_driver_sw_fini(),
> amdgpu_device_ip_fini() will call gfx_vX_0_sw_fini()
> then call amdgpu_ring_fini() and set adev->rings[i] to NULL.
> Nothing will be freed in amdgpu_fence_driver_sw_fini().
> ring->fence_drv.fences  memory leak happened!
>
> void amdgpu_ring_fini(struct amdgpu_ring *ring)
> {
> 	......
> 	ring->adev->rings[ring->idx] = NULL;
> }
>
> Regards,
> Lang


Got it, Looks good to me.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>

Andrey

>
>>
>>>> Fixes: 72c8c97b1522 ("drm/amdgpu: Split amdgpu_device_fini into early
>>>> and late")
>>>>
>>>> Signed-off-by: Lang Yu <lang.yu at amd.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 41ce86244144..5654c4790773 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -3843,8 +3843,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device
>>>> *adev)
>>>>
>>>> void amdgpu_device_fini_sw(struct amdgpu_device *adev)  {
>>>> -	amdgpu_device_ip_fini(adev);
>>>> 	amdgpu_fence_driver_sw_fini(adev);
>>>> +	amdgpu_device_ip_fini(adev);
>>>> 	release_firmware(adev->firmware.gpu_info_fw);
>>>> 	adev->firmware.gpu_info_fw = NULL;
>>>> 	adev->accel_working = false;
>>>> --
>>>> 2.25.1


More information about the amd-gfx mailing list