[PATCH 1/8] drm/sched: Allow drivers to skip the reset and keep on running

Sat May 3 20:59:52 UTC 2025

When the DRM scheduler times out, it's possible that the GPU isn't hung;
instead, a job may still be running, and there may be no valid reason to
reset the hardware. This can occur in two situations:

  1. The GPU exposes some mechanism that ensures the GPU is still making
     progress. By checking this mechanism, we can safely skip the reset,
     rearm the timeout, and allow the job to continue running until
     completion. This is the case for v3d and Etnaviv.
  2. TDR has fired before the IRQ that signals the fence. Consequently,
     the job actually finishes, but it triggers a timeout before signaling
     the completion fence.

These two scenarios are problematic because we remove the job from the
`sched->pending_list` before calling `sched->ops->timedout_job()`. This
means that when the job finally signals completion (e.g. in the IRQ
handler), the scheduler won't call `sched->ops->free_job()`. As a result,
the job and its resources won't be freed, leading to a memory leak.

To resolve this issue, we create a new `drm_gpu_sched_stat` that allows a
driver to skip the reset. This new status will indicate that the job
should be reinserted into the pending list, and the driver will still
signal its completion.

Signed-off-by: Maíra Canal <mcanal at igalia.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 14 ++++++++++++++
 include/drm/gpu_scheduler.h            |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 829579c41c6b5d8b2abce5ad373c7017469b7680..68ca827d77e32187a034309f881135dbc639a9b4 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -568,6 +568,17 @@ static void drm_sched_job_timedout(struct work_struct *work)
 			job->sched->ops->free_job(job);
 			sched->free_guilty = false;
 		}
+
+		/*
+		 * If the driver indicated that the GPU is still running and wants to skip
+		 * the reset, reinsert the job back into the pending list and realarm the
+		 * timeout.
+		 */
+		if (status == DRM_GPU_SCHED_STAT_RUNNING) {
+			spin_lock(&sched->job_list_lock);
+			list_add(&job->list, &sched->pending_list);
+			spin_unlock(&sched->job_list_lock);
+		}
 	} else {
 		spin_unlock(&sched->job_list_lock);
 	}
@@ -590,6 +601,9 @@ static void drm_sched_job_timedout(struct work_struct *work)
  * This function is typically used for reset recovery (see the docu of
  * drm_sched_backend_ops.timedout_job() for details). Do not call it for
  * scheduler teardown, i.e., before calling drm_sched_fini().
+ *
+ * As it's used for reset recovery, drm_sched_stop() shouldn't be called
+ * if the scheduler skipped the timeout (DRM_SCHED_STAT_RUNNING).
  */
 void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 {
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 1a7e377d4cbb4fc12ed93c548b236970217945e8..fe9043b6d43141bee831b5fc16b927202a507d51 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -389,11 +389,13 @@ struct drm_sched_job {
  * @DRM_GPU_SCHED_STAT_NONE: Reserved. Do not use.
  * @DRM_GPU_SCHED_STAT_NOMINAL: Operation succeeded.
  * @DRM_GPU_SCHED_STAT_ENODEV: Error: Device is not available anymore.
+ * @DRM_GPU_SCHED_STAT_RUNNING: GPU is still running, so skip the reset.
  */
 enum drm_gpu_sched_stat {
 	DRM_GPU_SCHED_STAT_NONE,
 	DRM_GPU_SCHED_STAT_NOMINAL,
 	DRM_GPU_SCHED_STAT_ENODEV,
+	DRM_GPU_SCHED_STAT_RUNNING,
 };
 
 /**

-- 
2.49.0