[PATCH] drm/xe: Do not print engine reset message on a killed queue
Matthew Brost
matthew.brost at intel.com
Thu May 8 22:00:06 UTC 2025
On Thu, May 08, 2025 at 03:36:40PM -0600, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of Matthew Brost
> Sent: Thursday, May 8, 2025 12:09 PM
> To: intel-xe at lists.freedesktop.org
> Subject: [PATCH] drm/xe: Do not print engine reset message on a killed queue
> >
> > When an app is killed (e.g. via ctrl-c), any queues running on the GPU have
> > their preemption timeout set to the minimum value and scheduling is disabled.
> > If the queue has something active on the GPU, it is very likely that the
> > GuC will trigger an engine reset, resulting in the engine reset message
> > being printed even though this is fully expected. Do not print the engine
> > reset message on queues which have been killed.
> >
> > Reported-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4904
> > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_submit.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index 369be36f7dc5..efff462ddd75 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -2005,8 +2005,9 @@ int xe_guc_exec_queue_reset_handler(struct xe_guc *guc, u32 *msg, u32 len)
> > if (unlikely(!q))
> > return -EPROTO;
> >
> > - xe_gt_info(gt, "Engine reset: engine_class=%s, logical_mask: 0x%x, guc_id=%d",
> > - xe_hw_engine_class_to_str(q->class), q->logical_mask, guc_id);
> > + if (!exec_queue_killed(q))
> > + xe_gt_info(gt, "Engine reset: engine_class=%s, logical_mask: 0x%x, guc_id=%d",
> > + xe_hw_engine_class_to_str(q->class), q->logical_mask, guc_id);
> >
> > trace_xe_exec_queue_reset(q);
>
> Hmm... what does this trace do, again? I can't find the declaration of this function in the code for some reason.
Yeah, those are generated by macros in xe_trace.h, so grep is not going to
help you there. These show up in /sys/kernel/debug/tracing/trace if
enabled.
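As a rough illustration of why grep comes up empty, here is a simplified sketch of a macro that builds the trace_ function name by token pasting. This is not the real TRACE_EVENT/DEFINE_EVENT machinery in xe_trace.h, which is far more involved; DEMO_DEFINE_EVENT and every demo_ name below are made up for this example.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for the kernel's tracepoint macros. */
static int demo_events_emitted;

/*
 * Generates a function named trace_<name>. The full identifier never
 * appears literally in the source; it only exists after preprocessing.
 */
#define DEMO_DEFINE_EVENT(name)                                \
	static inline int trace_##name(int guc_id)             \
	{                                                      \
		demo_events_emitted++;                         \
		printf("%s: guc_id=%d\n", #name, guc_id);      \
		return guc_id;                                 \
	}

/* Only "demo_exec_queue_reset" is written here, not the trace_ symbol. */
DEMO_DEFINE_EVENT(demo_exec_queue_reset)
```

A caller would invoke trace_demo_exec_queue_reset(guc_id), yet grepping the source for that exact string only finds the call sites, never a declaration.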
> If it's also in charge of printing additional debug data, then we should probably shove it into the above if statement as well.
I think we still want the trace, as these really show us everything going
on in the KMD. Here we don't want to spam dmesg with an expected user
event, but the trace, which captures everything, should have it.
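To make the split concrete, a minimal sketch of the pattern, assuming a stand-in queue struct and counters in place of dmesg and the trace buffer. None of these demo_ names exist in the driver; the real handler uses exec_queue_killed(), xe_gt_info() and trace_xe_exec_queue_reset() as shown in the diff.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's exec queue; not the real struct. */
struct demo_queue {
	int guc_id;
	bool killed;
};

static int demo_log_count;   /* messages that would reach dmesg */
static int demo_trace_count; /* events that would reach the trace buffer */

static void demo_reset_handler(const struct demo_queue *q)
{
	/* Skip the log message when the reset is the expected result of a kill. */
	if (!q->killed) {
		printf("Engine reset: guc_id=%d\n", q->guc_id);
		demo_log_count++;
	}

	/* The trace event is recorded unconditionally. */
	demo_trace_count++;
}
```

Running the handler once on a killed queue and once on a live one would log a single message but record two trace events.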
Matt
> If not, or if there are other reasons why it needs to be run every time the exec queue reset handler function is executed, then:
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> -Jonathan Cavitt
>
> >
> > --
> > 2.34.1
> >
> >