[PATCH v6 7/8] drm/xe: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the reset
Maíra Canal
mcanal at igalia.com
Mon Jul 14 22:07:08 UTC 2025
Xe can skip the reset if TDR has fired before the free job worker and can
also re-arm the timeout timer in some scenarios. Instead of manipulating
scheduler's internals, inform the scheduler that the job did not actually
timeout and no reset was performed through the new status code
DRM_GPU_SCHED_STAT_NO_HANG.
Note that, in the first case, there is no need to restart submission if it
hasn't been stopped.
Signed-off-by: Maíra Canal <mcanal at igalia.com>
Reviewed-by: Matthew Brost <matthew.brost at intel.com>
---
drivers/gpu/drm/xe/xe_guc_submit.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index ca2344b538c30b26542742ae9ea145836c940bbf..b28bb75e11c162d339186059c1a5d92811728742 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1092,12 +1092,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
* list so job can be freed and kick scheduler ensuring free job is not
* lost.
*/
- if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags)) {
- xe_sched_add_pending_job(sched, job);
- xe_sched_submission_start(sched);
-
- return DRM_GPU_SCHED_STAT_RESET;
- }
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags))
+ return DRM_GPU_SCHED_STAT_NO_HANG;
/* Kill the run_job entry point */
xe_sched_submission_stop(sched);
@@ -1275,10 +1271,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
* but there is not currently an easy way to do in DRM scheduler. With
* some thought, do this in a follow up.
*/
- xe_sched_add_pending_job(sched, job);
xe_sched_submission_start(sched);
-
- return DRM_GPU_SCHED_STAT_RESET;
+ return DRM_GPU_SCHED_STAT_NO_HANG;
}
static void __guc_exec_queue_fini_async(struct work_struct *w)
--
2.50.0
More information about the dri-devel
mailing list