[PATCH] drm/xe: Resume TDR after GT reset

Matthew Brost matthew.brost at intel.com
Fri Sep 27 07:12:17 UTC 2024


On Thu, Sep 26, 2024 at 07:25:22PM +0200, Nirmoy Das wrote:
> 
> On 7/25/2024 1:59 AM, Matthew Brost wrote:
> > Not restarting the TDR after a GT reset on exec queues which have
> > been restarted can allow jobs to run forever. Fix this by restarting
> > the TDR.
> >
> > Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> 
> Reviewed-by: Nirmoy Das <nirmoy.das at intel.com>
> 

Thanks for the review and helping make sure this didn't get lost on the
list.

Pushed.

Matt

> > ---
> >  drivers/gpu/drm/xe/xe_gpu_scheduler.c | 5 +++++
> >  drivers/gpu/drm/xe/xe_gpu_scheduler.h | 2 ++
> >  drivers/gpu/drm/xe/xe_guc_submit.c    | 1 +
> >  3 files changed, 8 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > index e4ad1d6ce1d5..7f24e58cc992 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > @@ -90,6 +90,11 @@ void xe_sched_submission_stop(struct xe_gpu_scheduler *sched)
> >  	cancel_work_sync(&sched->work_process_msg);
> >  }
> >  
> > +void xe_sched_submission_resume_tdr(struct xe_gpu_scheduler *sched)
> > +{
> > +	drm_sched_resume_timeout(&sched->base, sched->base.timeout);
> > +}
> > +
> >  void xe_sched_add_msg(struct xe_gpu_scheduler *sched,
> >  		      struct xe_sched_msg *msg)
> >  {
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > index 10c6bb9c9386..6aac7fe68673 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > @@ -22,6 +22,8 @@ void xe_sched_fini(struct xe_gpu_scheduler *sched);
> >  void xe_sched_submission_start(struct xe_gpu_scheduler *sched);
> >  void xe_sched_submission_stop(struct xe_gpu_scheduler *sched);
> >  
> > +void xe_sched_submission_resume_tdr(struct xe_gpu_scheduler *sched);
> > +
> >  void xe_sched_add_msg(struct xe_gpu_scheduler *sched,
> >  		      struct xe_sched_msg *msg);
> >  
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index 460808507947..2327e11ae311 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -1768,6 +1768,7 @@ static void guc_exec_queue_start(struct xe_exec_queue *q)
> >  	}
> >  
> >  	xe_sched_submission_start(sched);
> > +	xe_sched_submission_resume_tdr(sched);
> >  }
> >  
> >  int xe_guc_submit_start(struct xe_guc *guc)
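
The failure mode described in the commit message can be sketched as a small user-space model. This is not kernel code: `toy_sched` and every `toy_*` function are hypothetical stand-ins invented for illustration. It assumes that stopping submission for a GT reset leaves the timeout timer cancelled, and that resuming the TDR re-arms it with the full timeout only while a job is still pending.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a scheduler with a timeout (TDR) timer. */
struct toy_sched {
	bool tdr_armed;   /* is the timeout timer pending? */
	long timeout;     /* configured timeout, in ticks */
	long remaining;   /* ticks left before the TDR fires */
	int pending_jobs; /* jobs still outstanding on the hardware */
};

/* Assumed GT reset behaviour: stopping submission cancels the timer. */
static void toy_stop(struct toy_sched *s)
{
	s->tdr_armed = false;
}

/* Restart submission only; nothing here re-arms the timer. */
static void toy_start(struct toy_sched *s)
{
	(void)s;
}

/* Re-arm the timer with the full timeout if any job is still pending. */
static void toy_resume_tdr(struct toy_sched *s)
{
	if (s->pending_jobs > 0) {
		s->tdr_armed = true;
		s->remaining = s->timeout;
	}
}

/* Advance time by `ticks`; returns true if the TDR fired. */
static bool toy_tick(struct toy_sched *s, long ticks)
{
	if (!s->tdr_armed)
		return false;

	s->remaining -= ticks;
	if (s->remaining > 0)
		return false;

	/* Timeout expired: the hung job is cleaned up. */
	s->tdr_armed = false;
	s->pending_jobs = 0;
	return true;
}

/*
 * Run a reset cycle with one hung job outstanding and report whether
 * the TDR catches it within `ticks` after the queue is restarted.
 */
static bool toy_hung_job_caught(bool resume_tdr, long ticks)
{
	struct toy_sched s = {
		.tdr_armed = true,
		.timeout = 5,
		.remaining = 5,
		.pending_jobs = 1,
	};

	toy_stop(&s);
	toy_start(&s);
	if (resume_tdr)
		toy_resume_tdr(&s);

	return toy_tick(&s, ticks);
}
```

In this model, `toy_hung_job_caught(false, ticks)` never reports a timeout no matter how large `ticks` is, while `toy_hung_job_caught(true, ticks)` reports one as soon as `ticks` reaches the configured timeout.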

