[PATCH] drm/xe/guc_submit: improve schedule disable error logging
Matthew Auld
matthew.auld at intel.com
Fri Sep 27 13:35:36 UTC 2024
A few things here. Make the two prints consistent (and distinct), print
the guc_id, and finally dump the CT queues. It should then be possible to
spot the guc_id in the CT queue dump and, for example, see that the host
side has yet to process the response for the schedule disable, or that
the GuC has yet to send it. This should help narrow things down if we
trigger the timeout.
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1638
Signed-off-by: Matthew Auld <matthew.auld at intel.com>
Cc: Matthew Brost <matthew.brost at intel.com>
Cc: Nirmoy Das <nirmoy.das at intel.com>
---
drivers/gpu/drm/xe/xe_guc_submit.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 80062e1d3f66..52ed7c0043f9 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -977,7 +977,12 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
!exec_queue_pending_disable(q) ||
guc_read_stopped(guc), HZ * 5);
if (!ret) {
- drm_warn(&xe->drm, "Schedule disable failed to respond");
+ struct xe_gt *gt = guc_to_gt(guc);
+ struct drm_printer p = xe_gt_err_printer(gt);
+
+ xe_gt_warn(gt, "%s schedule disable failed to respond guc_id=%d",
+ __func__, ge->id);
+ xe_guc_ct_print(&guc->ct, &p, false);
xe_sched_submission_start(sched);
xe_gt_reset_async(q->gt);
return;
@@ -1177,8 +1182,14 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
guc_read_stopped(guc), HZ * 5);
if (!ret || guc_read_stopped(guc)) {
trigger_reset:
- if (!ret)
- xe_gt_warn(guc_to_gt(guc), "Schedule disable failed to respond");
+ if (!ret) {
+ struct xe_gt *gt = guc_to_gt(guc);
+ struct drm_printer p = xe_gt_err_printer(gt);
+
+ xe_gt_warn(gt, "%s schedule disable failed to respond guc_id=%d",
+ __func__, q->guc->id);
+ xe_guc_ct_print(&guc->ct, &p, true);
+ }
set_exec_queue_extra_ref(q);
xe_exec_queue_get(q); /* GT reset owns this */
set_exec_queue_banned(q);
--
2.46.1
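
For illustration only, the triage described in the commit message can be
modelled in plain userspace C. This is a sketch, not driver code: struct
g2h_entry, format_timeout_warning() and diagnose_schedule_disable() are
hypothetical names, and the real CT dump from xe_guc_ct_print() is not a
simple array of ids. The idea is just that the guc_id printed in the warning
is looked up among the responses the host has not yet processed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: a pending G2H (GuC-to-host) entry, reduced to its guc_id. */
struct g2h_entry {
	int guc_id;
};

enum disable_diagnosis {
	HOST_HAS_NOT_PROCESSED_RESPONSE, /* response is still queued in G2H */
	GUC_HAS_NOT_SENT_RESPONSE,       /* nothing queued for this guc_id */
};

/* Build the warning text in the same shape as the prints in the patch. */
static int format_timeout_warning(char *buf, size_t len,
				  const char *func, int guc_id)
{
	assert(buf && len > 0);
	return snprintf(buf, len,
			"%s schedule disable failed to respond guc_id=%d",
			func, guc_id);
}

/* Scan the unprocessed G2H entries for the guc_id named in the warning. */
static enum disable_diagnosis
diagnose_schedule_disable(const struct g2h_entry *pending, size_t count,
			  int guc_id)
{
	for (size_t i = 0; i < count; i++) {
		if (pending[i].guc_id == guc_id)
			return HOST_HAS_NOT_PROCESSED_RESPONSE;
	}
	return GUC_HAS_NOT_SENT_RESPONSE;
}
```

Passing __func__ as the func argument keeps the two call sites distinct in
the log while the rest of the message stays identical, which is what makes
the guc_id easy to search for in the dump.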