Re: Reply: [PATCH] drm/amdgpu: Make sure ttm delayed work finished
Christian König
christian.koenig at amd.com
Wed Apr 13 08:14:56 UTC 2022
That warning is a bit more than a little annoying.
Before we stop the delayed delete worker we *must* absolutely make sure
that nothing is still running on the hardware. Otherwise we could
easily run into use-after-free issues.
There should be an amdgpu_fence_wait_empty() call somewhere before the
flush_delayed_work() call. If it isn't there, we have a problem
elsewhere.
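
To illustrate why the ordering matters, here is a small user-space model
of the argument. It is only a sketch and not kernel code: every model_*
name is hypothetical and made up for this example, and the "fence" is
just a flag. It assumes the worker frees idle BOs and would reschedule
itself while busy ones remain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a BO waiting on the delayed-destroy list. */
struct model_bo {
	bool fence_signaled;	/* false: hardware may still use the BO */
	bool freed;
};

/* Hypothetical stand-in for the device and its delayed-destroy list. */
struct model_dev {
	struct model_bo *ddestroy;
	size_t count;
};

/*
 * One pass of the modelled delayed-delete worker: free every idle BO.
 * Returns true if busy BOs remain, i.e. the worker would reschedule.
 */
static bool model_delayed_delete(struct model_dev *dev)
{
	bool busy_left = false;
	size_t i;

	for (i = 0; i < dev->count; i++) {
		struct model_bo *bo = &dev->ddestroy[i];

		if (bo->freed)
			continue;
		if (bo->fence_signaled)
			bo->freed = true;
		else
			busy_left = true;
	}
	return busy_left;
}

/* Models waiting for the hardware to go idle: every fence signals. */
static void model_wait_fences(struct model_dev *dev)
{
	size_t i;

	for (i = 0; i < dev->count; i++)
		dev->ddestroy[i].fence_signaled = true;
}

/*
 * Modelled teardown: optionally wait for the fences, then flush once.
 * Returns true if the list is clean after that single flush.
 */
static bool model_teardown(struct model_dev *dev, bool wait_first)
{
	assert(dev != NULL);

	if (wait_first)
		model_wait_fences(dev);
	return !model_delayed_delete(dev);
}
```

In this model, model_teardown(dev, false) leaves a busy BO behind, which
corresponds to the warning, while model_teardown(dev, true) needs only a
single flush and no retry loop.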
Thanks for investigating this,
Christian.
On 13.04.22 at 09:47, Pan, Xinhui wrote:
> [AMD Official Use Only]
>
> The tester's log says it is the drm framebuffer BO that is busy.
>
> I suspect there is simply not enough time for its fence to be signaled,
> since adding a delay also works in my test.
> But the warning is a little annoying.
>
> ________________________________________
> From: Koenig, Christian <Christian.Koenig at amd.com>
> Sent: April 13, 2022 15:30
> To: Pan, Xinhui; amd-gfx at lists.freedesktop.org
> Cc: Deucher, Alexander
> Subject: Re: [PATCH] drm/amdgpu: Make sure ttm delayed work finished
>
> We don't need that.
>
> TTM only reschedules when the BOs are still busy.
>
> And if the BOs are still busy when you unload the driver we have much bigger problems than this TTM worker :)
>
> Regards,
> Christian
>
> ________________________________
> From: Pan, Xinhui <Xinhui.Pan at amd.com>
> Sent: Wednesday, April 13, 2022 05:08
> To: amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>; Pan, Xinhui <Xinhui.Pan at amd.com>
> Subject: [PATCH] drm/amdgpu: Make sure ttm delayed work finished
>
> ttm_device_delayed_workqueue would reschedule itself if there are pending
> BOs to be destroyed. So just one flush + cancel_sync is not enough. We
> still see the lru_list not empty warning.
>
> Fix it by waiting for all BOs to be destroyed.
>
> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 6f47726f1765..e249923eb9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3957,11 +3957,17 @@ static void amdgpu_device_unmap_mmio(struct amdgpu_device *adev)
> */
> void amdgpu_device_fini_hw(struct amdgpu_device *adev)
> {
> + int pending = 1;
> +
> dev_info(adev->dev, "amdgpu: finishing device.\n");
> flush_delayed_work(&adev->delayed_init_work);
> - if (adev->mman.initialized) {
> + while (adev->mman.initialized && pending) {
> flush_delayed_work(&adev->mman.bdev.wq);
> - ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
> + pending = ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
> + if (pending) {
> + ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, true);
> + msleep(((HZ / 100) < 1) ? 1 : HZ / 100);
> + }
> }
> adev->shutdown = true;
>
> --
> 2.25.1
>