[PATCH v3 7/7] drm/xe: Wire devcoredump to LR TDR

John Harrison john.c.harrison at intel.com
Thu Nov 14 02:01:23 UTC 2024


On 11/12/2024 14:01, Matthew Brost wrote:
> LR queues can hang, cause engine reset, or cause IOMMU CAT errors.
> Collect an error capture when this occurs.
>
> v2:
>   - s/queue's/queues (Jonathan)
>
> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> ---
>   drivers/gpu/drm/xe/xe_guc_submit.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 985a21a72da4..29099497429d 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -896,13 +896,17 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
>   					 !exec_queue_pending_disable(q) ||
>   					 xe_guc_read_stopped(guc), HZ * 5);
>   		if (!ret) {
> -			xe_gt_warn(q->gt, "Schedule disable failed to respond\n");
> +			xe_gt_warn(q->gt, "Schedule disable failed to respond, guc_id=%d\n");
Missing the guc id value: the format string now has a %d, but no matching
argument is passed to xe_gt_warn().

John.
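
To illustrate the point above, here is a minimal userspace sketch of the
corrected message. It is not the driver code: warn_to_buf() and
format_sched_disable_warning() are hypothetical stand-ins for xe_gt_warn()
and its call site, and the guc_id parameter is an assumption (in the driver
the value would presumably come from the queue, e.g. q->guc->id). The format
attribute shows how GCC/Clang can flag a %d with no argument via -Wformat.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for xe_gt_warn(): formats into a caller-supplied
 * buffer instead of the kernel log. The format attribute asks the compiler
 * to check the variadic arguments against the format string.
 */
__attribute__((format(printf, 3, 4)))
static int warn_to_buf(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && len > 0);
	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}

/* Corrected form of the message: the %d now has a matching argument. */
static int format_sched_disable_warning(char *buf, size_t len, int guc_id)
{
	return warn_to_buf(buf, len,
			   "Schedule disable failed to respond, guc_id=%d\n",
			   guc_id);
}
```

Dropping the guc_id argument from the warn_to_buf() call reproduces the
problem in the patch: a format-checking compiler warns about it, and at
runtime the %d would read an indeterminate value.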

> +			xe_devcoredump(q, NULL);
>   			xe_sched_submission_start(sched);
>   			xe_gt_reset_async(q->gt);
>   			return;
>   		}
>   	}
>   
> +	if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
> +		xe_devcoredump(q, NULL);
> +
>   	xe_sched_submission_start(sched);
>   }
>   


