[PATCH 2/8] drm/sched: Always free the job after the timeout

Maíra Canal mcanal at igalia.com
Sat May 3 20:59:53 UTC 2025


Currently, if we add the assertions presented in this commit to the mock
scheduler, we will see the following output:

[15:47:08] ============== [PASSED] drm_sched_basic_tests ==============
[15:47:08] ======== drm_sched_basic_timeout_tests (1 subtest) =========
[15:47:08] # drm_sched_basic_timeout: ASSERTION FAILED at drivers/gpu/drm/scheduler/tests/tests_basic.c:246
[15:47:08] Expected list_empty(&sched->job_list) to be true, but is false
[15:47:08] [FAILED] drm_sched_basic_timeout
[15:47:08] # module: drm_sched_tests

This occurs because `mock_sched_timedout_job()` doesn't properly handle
the hang. From the DRM sched documentation, `drm_sched_stop()` and
`drm_sched_start()` are typically used for reset recovery. If these
functions are not used, the offending job won't be freed and should be
freed by the caller.

Currently, the mock scheduler doesn't use the functions provided by the
API, nor does it handle the freeing of the job. As a result, the job isn't
removed from the job list.

This commit mocks a GPU reset by stopping the scheduler affected by the
reset, waiting a couple of microseconds to mimic a hardware reset, and
then restart the affected scheduler.

Signed-off-by: Maíra Canal <mcanal at igalia.com>
---
 drivers/gpu/drm/scheduler/tests/mock_scheduler.c | 10 ++++++++++
 drivers/gpu/drm/scheduler/tests/tests_basic.c    |  3 +++
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
index f999c8859cf7adb8f06fc8a37969656dd3249fa7..e9af202d84bd55ea5cc048215e39f5407bc84458 100644
--- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
+++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2025 Valve Corporation */
 
+#include <linux/delay.h>
+
 #include "sched_tests.h"
 
 /*
@@ -203,10 +205,18 @@ static struct dma_fence *mock_sched_run_job(struct drm_sched_job *sched_job)
 static enum drm_gpu_sched_stat
 mock_sched_timedout_job(struct drm_sched_job *sched_job)
 {
+	struct drm_mock_scheduler *sched =
+		drm_sched_to_mock_sched(sched_job->sched);
 	struct drm_mock_sched_job *job = drm_sched_job_to_mock_job(sched_job);
 
 	job->flags |= DRM_MOCK_SCHED_JOB_TIMEDOUT;
 
+	drm_sched_stop(&sched->base, &job->base);
+
+	usleep_range(200, 500);
+
+	drm_sched_start(&sched->base, 0);
+
 	return DRM_GPU_SCHED_STAT_NOMINAL;
 }
 
diff --git a/drivers/gpu/drm/scheduler/tests/tests_basic.c b/drivers/gpu/drm/scheduler/tests/tests_basic.c
index 7230057e0594c6246f02608f07fcb1f8d738ac75..8f960f0fd31d0af7873f410ceba2d636f58a5474 100644
--- a/drivers/gpu/drm/scheduler/tests/tests_basic.c
+++ b/drivers/gpu/drm/scheduler/tests/tests_basic.c
@@ -241,6 +241,9 @@ static void drm_sched_basic_timeout(struct kunit *test)
 			job->flags & DRM_MOCK_SCHED_JOB_TIMEDOUT,
 			DRM_MOCK_SCHED_JOB_TIMEDOUT);
 
+	KUNIT_ASSERT_TRUE(test, list_empty(&sched->job_list));
+	KUNIT_ASSERT_TRUE(test, list_empty(&sched->done_list));
+
 	drm_mock_sched_entity_free(entity);
 }
 

-- 
2.49.0



More information about the dri-devel mailing list