[PATCH v3 7/8] drm/xe: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the reset
Maíra Canal
mcanal at igalia.com
Wed Jun 18 14:47:48 UTC 2025
Xe can skip the reset if TDR has fired before the free job worker and can
also re-arm the timeout timer in some scenarios. Instead of manipulating
scheduler's internals, inform the scheduler that the job did not actually
timeout and no reset was performed through the new status code
DRM_GPU_SCHED_STAT_NO_HANG.
Note that, in the first case, there is no need to restart submission if it
hasn't been stopped.
Signed-off-by: Maíra Canal <mcanal at igalia.com>
---
drivers/gpu/drm/xe/xe_guc_submit.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 9c7e445b9ea7ce7e3610eadca023e6d810e683e9..f6289eeffd852e40b33d0e455d9bcc21a4fb1467 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1078,12 +1078,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
* list so job can be freed and kick scheduler ensuring free job is not
* lost.
*/
- if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags)) {
- xe_sched_add_pending_job(sched, job);
- xe_sched_submission_start(sched);
-
- return DRM_GPU_SCHED_STAT_RESET;
- }
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags))
+ return DRM_GPU_SCHED_STAT_NO_HANG;
/* Kill the run_job entry point */
xe_sched_submission_stop(sched);
@@ -1261,10 +1257,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
* but there is not currently an easy way to do in DRM scheduler. With
* some thought, do this in a follow up.
*/
- xe_sched_add_pending_job(sched, job);
xe_sched_submission_start(sched);
-
- return DRM_GPU_SCHED_STAT_RESET;
+ return DRM_GPU_SCHED_STAT_NO_HANG;
}
static void __guc_exec_queue_fini_async(struct work_struct *w)
--
2.49.0
More information about the Intel-xe
mailing list