[PATCH] drm/amdgpu: Fix module unload hang with RAS enabled

Zhang, Hawking Hawking.Zhang at amd.com
Wed Jan 24 04:24:20 UTC 2024



Reviewed-by: Hawking Zhang <Hawking.Zhang at amd.com>

Regards,
Hawking
-----Original Message-----
From: Joshi, Mukul <Mukul.Joshi at amd.com>
Sent: Wednesday, January 24, 2024 05:01
To: amd-gfx at lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Chai, Thomas <YiPeng.Chai at amd.com>; Joshi, Mukul <Mukul.Joshi at amd.com>
Subject: [PATCH] drm/amdgpu: Fix module unload hang with RAS enabled

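The driver unload hangs because the page retirement kthread cannot be stopped: it sleeps in wait_event_interruptible() waiting for a page retirement event, and the wake-up issued when the kthread is stopped does not satisfy the wait condition, so the thread goes back to sleep. Add kthread_should_stop() to the wait condition so that the kthread wakes up and exits when kthread stop is called during driver unload.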

Fixes: 45c3d468793d ("drm/amdgpu: Prepare for asynchronous processing of umc page retirement")
Signed-off-by: Mukul Joshi <mukul.joshi at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index a32e7eb31354..80816c4ec1f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2670,8 +2670,12 @@ static int amdgpu_ras_page_retirement_thread(void *param)
        while (!kthread_should_stop()) {

                wait_event_interruptible(con->page_retirement_wq,
+                               kthread_should_stop() ||
                                atomic_read(&con->page_retirement_req_cnt));

+               if (kthread_should_stop())
+                       break;
+
                dev_info(adev->dev, "Start processing page retirement. request:%d\n",
                        atomic_read(&con->page_retirement_req_cnt));

--
2.35.1
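
For readers less familiar with this pattern, the wake-up condition in the patch above can be sketched in user space. This is only a rough analogue under stated assumptions, not amdgpu code: a pthread condition variable stands in for page_retirement_wq, a plain boolean stands in for kthread_should_stop(), and every name (retire_ctx, retire_start, retire_request, retire_stop) is hypothetical. The point it illustrates is that the stop flag has to be part of the wait predicate, otherwise a wake-up sent at stop time finds the predicate still false and the worker sleeps again.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the RAS context; not amdgpu code. */
struct retire_ctx {
	pthread_mutex_t lock;
	pthread_cond_t  wq;       /* plays the role of page_retirement_wq */
	int             req_cnt;  /* plays the role of page_retirement_req_cnt */
	bool            stop;     /* plays the role of kthread_should_stop() */
	int             handled;  /* number of requests processed so far */
	pthread_t       thread;
};

static void *retire_thread(void *param)
{
	struct retire_ctx *ctx = param;

	pthread_mutex_lock(&ctx->lock);
	while (!ctx->stop) {
		/* Sleep until there is work OR a stop has been requested. */
		while (!ctx->stop && ctx->req_cnt == 0)
			pthread_cond_wait(&ctx->wq, &ctx->lock);

		/* Woken for shutdown rather than for work: leave the loop. */
		if (ctx->stop)
			break;

		/* "Process" all pending requests. */
		ctx->handled += ctx->req_cnt;
		ctx->req_cnt = 0;
	}
	pthread_mutex_unlock(&ctx->lock);
	return NULL;
}

/* Initialise the context and start the worker; returns 0 on success. */
int retire_start(struct retire_ctx *ctx)
{
	pthread_mutex_init(&ctx->lock, NULL);
	pthread_cond_init(&ctx->wq, NULL);
	ctx->req_cnt = 0;
	ctx->stop = false;
	ctx->handled = 0;
	return pthread_create(&ctx->thread, NULL, retire_thread, ctx);
}

/* Queue one request and wake the worker. */
void retire_request(struct retire_ctx *ctx)
{
	pthread_mutex_lock(&ctx->lock);
	ctx->req_cnt++;
	pthread_cond_signal(&ctx->wq);
	pthread_mutex_unlock(&ctx->lock);
}

/* Rough analogue of stopping a kthread: set the flag, wake, wait for exit. */
int retire_stop(struct retire_ctx *ctx)
{
	pthread_mutex_lock(&ctx->lock);
	ctx->stop = true;
	pthread_cond_signal(&ctx->wq);
	pthread_mutex_unlock(&ctx->lock);
	return pthread_join(ctx->thread, NULL);
}
```

If the "!ctx->stop &&" term is dropped from the inner wait loop, retire_stop() blocks in pthread_join() whenever no request is pending, which mirrors the unload hang described in the commit message. As in the patch, requests that are still queued when the stop arrives are not processed in this sketch.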


