[PATCH] drm/xe: Do not print engine reset message on a killed queue

Matthew Brost matthew.brost at intel.com
Fri May 9 06:03:07 UTC 2025


On Thu, May 08, 2025 at 04:03:56PM -0700, John Harrison wrote:
> 
> 
> On 5/8/2025 12:09 PM, Matthew Brost wrote:
> > When an app is killed (ctrl-c), any queues running on the GPU have their
> > preemption timeout set to the minimum value and scheduling is disabled.
> > If the queue has something active on the GPU, it is very likely that the
> > GuC will trigger an engine reset, resulting in the engine reset message
> > being printed even though this is fully expected. Do not print the engine
> > reset message on queues which have been killed.
> > 
> > Reported-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4904
> > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_guc_submit.c | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index 369be36f7dc5..efff462ddd75 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -2005,8 +2005,9 @@ int xe_guc_exec_queue_reset_handler(struct xe_guc *guc, u32 *msg, u32 len)
> >   	if (unlikely(!q))
> >   		return -EPROTO;
> > -	xe_gt_info(gt, "Engine reset: engine_class=%s, logical_mask: 0x%x, guc_id=%d",
> > -		   xe_hw_engine_class_to_str(q->class), q->logical_mask, guc_id);
> > +	if (!exec_queue_killed(q))
> > +		xe_gt_info(gt, "Engine reset: engine_class=%s, logical_mask: 0x%x, guc_id=%d",
> > +			   xe_hw_engine_class_to_str(q->class), q->logical_mask, guc_id);
> Maybe make it an xe_gt_dbg in the case of a killed queue? It is still useful
> to see such messages when triaging CI failures to get an idea of what is
> going on behind the scenes.
> 

I had thought about this; it should be fine as long as it isn't spamming
dmesg on normal production kernels. I would assume xe_gt_dbg messages
would not show up there. I did the same thing for the job timeout message
in this patch, i.e. just dropped it on killed queues. Maybe I should make
that an xe_gt_dbg message too?
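
For illustration only, the debug-level variant suggested above could be
modelled in plain userspace C roughly as follows. This is a sketch, not
driver code: `mock_queue`, `reset_msg_level` and `format_reset_msg` are
hypothetical stand-ins, and in the driver the choice would be between
xe_gt_info and xe_gt_dbg rather than a string tag.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; NOT the real xe driver types or macros. */
enum log_level { LOG_INFO, LOG_DBG };

struct mock_queue {
	bool killed;               /* models exec_queue_killed(q) */
	unsigned int logical_mask;
	int guc_id;
};

/*
 * Pick the level for the engine reset message: debug for killed queues
 * (the reset is expected there), info for everything else.
 */
static enum log_level reset_msg_level(const struct mock_queue *q)
{
	return q->killed ? LOG_DBG : LOG_INFO;
}

/* Format the message with a level tag instead of calling a kernel logger. */
static int format_reset_msg(const struct mock_queue *q, char *buf, size_t len)
{
	const char *tag = reset_msg_level(q) == LOG_DBG ? "dbg" : "info";

	return snprintf(buf, len, "[%s] Engine reset: logical_mask: 0x%x, guc_id=%d",
			tag, q->logical_mask, q->guc_id);
}
```

With this shape the message is never lost for killed queues; it is only
demoted to a level that a production dmesg would normally not show.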

Matt

> John.
> 
> >   	trace_xe_exec_queue_reset(q);
> 

