[PATCH] drm/amdgpu: flush delete wq after wait fence

Yao, Yiqing(James) yiqinyao at amd.com
Thu May 5 07:39:59 UTC 2022


Dear Paul,

Patch edited:

[why]
A "lru_list not empty" warning is triggered in sw fini during repeated
device bind/unbind. There should be an amdgpu_fence_wait_empty() before
the flush_delayed_work() call, as Christian suggested.

[how]
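Move the flush_delayed_work() call for the TTM BO delayed-delete wq
(together with ttm_bo_lock_delayed_workqueue()) from the start of
amdgpu_device_fini_hw() to after amdgpu_fence_driver_hw_fini().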

Signed-off-by: Yiqing Yao <yiqing.yao at amd.com>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +++++----
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 14c5ccf81e80..92e5ed3ed345 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4003,10 +4003,6 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
  {
      dev_info(adev->dev, "amdgpu: finishing device.\n");
      flush_delayed_work(&adev->delayed_init_work);
-    if (adev->mman.initialized) {
-        flush_delayed_work(&adev->mman.bdev.wq);
-        ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
-    }
      adev->shutdown = true;

      /* make sure IB test finished before entering exclusive mode
@@ -4029,6 +4025,11 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
      }
      amdgpu_fence_driver_hw_fini(adev);

+    if (adev->mman.initialized) {
+        flush_delayed_work(&adev->mman.bdev.wq);
+        ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
+    }
+
      if (adev->pm_sysfs_en)
          amdgpu_pm_sysfs_fini(adev);
      if (adev->ucode_sysfs_en)
-- 
2.25.1


On 5/5/2022 3:15 PM, Paul Menzel wrote:
>> [how]
>> Do flush_delayed_work for ttm bo delayed delete wq after 
>> fence_driver_hw_fini.
>>
>> Signed-off-by: Yiqing Yao <yiqing.yao at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +++++----
>>   1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 14c5ccf81e80..92e5ed3ed345 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4003,10 +4003,6 @@ void amdgpu_device_fini_hw(struct 
>> amdgpu_device *adev)
>>   {
>>       dev_info(adev->dev, "amdgpu: finishing device.\n");
>>       flush_delayed_work(&adev->delayed_init_work);
>> -    if (adev->mman.initialized) {
>> -        flush_delayed_work(&adev->mman.bdev.wq);
>> -        ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
>> -    }
>
> From the commit message, it’s not clear that you remove this here.
>
This block is not removed. It is moved later in the same function, to
after the amdgpu_fence_driver_hw_fini() call.
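
To illustrate why the order matters, here is a minimal userspace sketch.
It is a toy model, not kernel code: every toy_* name is hypothetical, and
it assumes only that a delayed delete can free a BO once its fence has
signaled. Flushing before the fence wait leaves BOs on the LRU; flushing
after the wait empties it.

```c
/* Userspace toy model only; all toy_* names are hypothetical. */
#include <stdbool.h>

#define TOY_NUM_BOS 3

struct toy_bo {
    bool fence_signaled; /* false while the GPU may still use the BO */
    bool on_lru;         /* true until the delayed delete frees it */
};

/* Stand-in for waiting until every outstanding fence has signaled. */
static void toy_fence_wait_empty(struct toy_bo *bos, int n)
{
    for (int i = 0; i < n; i++)
        bos[i].fence_signaled = true;
}

/* Stand-in for flushing the delayed-delete work: only idle BOs are freed. */
static void toy_flush_delayed_delete(struct toy_bo *bos, int n)
{
    for (int i = 0; i < n; i++)
        if (bos[i].fence_signaled)
            bos[i].on_lru = false;
}

/* Counts the BOs that are still on the LRU. */
static int toy_lru_count(const struct toy_bo *bos, int n)
{
    int count = 0;

    for (int i = 0; i < n; i++)
        if (bos[i].on_lru)
            count++;
    return count;
}

/* Runs one teardown and returns how many BOs are left on the LRU. */
int toy_teardown(bool flush_after_wait)
{
    struct toy_bo bos[TOY_NUM_BOS];

    for (int i = 0; i < TOY_NUM_BOS; i++) {
        bos[i].fence_signaled = false;
        bos[i].on_lru = true;
    }

    if (!flush_after_wait)
        toy_flush_delayed_delete(bos, TOY_NUM_BOS); /* old order: too early */

    toy_fence_wait_empty(bos, TOY_NUM_BOS);

    if (flush_after_wait)
        toy_flush_delayed_delete(bos, TOY_NUM_BOS); /* order in this patch */

    return toy_lru_count(bos, TOY_NUM_BOS);
}
```

In this model toy_teardown(false) leaves all three BOs on the LRU, which
corresponds to the warning, while toy_teardown(true) leaves none.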


Thank you for the advice,

Yiqing


