[PATCH 1/8] drm/sched: Allow drivers to skip the reset and keep on running
Matthew Brost
matthew.brost at intel.com
Tue May 6 14:32:13 UTC 2025
On Mon, May 05, 2025 at 07:41:09PM -0700, Matthew Brost wrote:
> On Sat, May 03, 2025 at 05:59:52PM -0300, Maíra Canal wrote:
> > When the DRM scheduler times out, it's possible that the GPU isn't hung;
> > instead, a job may still be running, and there may be no valid reason to
> > reset the hardware. This can occur in two situations:
> >
> > 1. The GPU exposes some mechanism that ensures the GPU is still making
> > progress. By checking this mechanism, we can safely skip the reset,
> > rearm the timeout, and allow the job to continue running until
> > completion. This is the case for v3d and Etnaviv.
> > 2. TDR has fired before the IRQ that signals the fence. Consequently,
> > the job actually finishes, but it triggers a timeout before signaling
> > the completion fence.
> >
>
> We have both of these cases in Xe too. We implement the requeuing in Xe
> via driver side function - xe_sched_add_pending_job but this looks
> better and will make use of this.
>
> > These two scenarios are problematic because we remove the job from the
> > `sched->pending_list` before calling `sched->ops->timedout_job()`. This
> > means that when the job finally signals completion (e.g. in the IRQ
> > handler), the scheduler won't call `sched->ops->free_job()`. As a result,
> > the job and its resources won't be freed, leading to a memory leak.
> >
> > To resolve this issue, we create a new `drm_gpu_sched_stat` that allows a
> > driver to skip the reset. This new status will indicate that the job
> > should be reinserted into the pending list, and the driver will still
> > signal its completion.
> >
> > Signed-off-by: Maíra Canal <mcanal at igalia.com>
>
> Reviewed-by: Matthew Brost <matthew.brost at intel.com>
>
Wait - nevermind I think one issue is below.
> > ---
> > drivers/gpu/drm/scheduler/sched_main.c | 14 ++++++++++++++
> > include/drm/gpu_scheduler.h | 2 ++
> > 2 files changed, 16 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 829579c41c6b5d8b2abce5ad373c7017469b7680..68ca827d77e32187a034309f881135dbc639a9b4 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -568,6 +568,17 @@ static void drm_sched_job_timedout(struct work_struct *work)
> > job->sched->ops->free_job(job);
> > sched->free_guilty = false;
> > }
> > +
> > + /*
> > + * If the driver indicated that the GPU is still running and wants to skip
> > + * the reset, reinsert the job back into the pending list and realarm the
> > + * timeout.
> > + */
> > + if (status == DRM_GPU_SCHED_STAT_RUNNING) {
> > + spin_lock(&sched->job_list_lock);
> > + list_add(&job->list, &sched->pending_list);
> > + spin_unlock(&sched->job_list_lock);
> > + }
I think you need to requeue free_job wq here. It is possible the
free_job wq ran, didn't find a job, goes to sleep, then we add a
signaled job here which will never get freed.
Matt
> > } else {
> > spin_unlock(&sched->job_list_lock);
> > }
> > @@ -590,6 +601,9 @@ static void drm_sched_job_timedout(struct work_struct *work)
> > * This function is typically used for reset recovery (see the docu of
> > * drm_sched_backend_ops.timedout_job() for details). Do not call it for
> > * scheduler teardown, i.e., before calling drm_sched_fini().
> > + *
> > + * As it's used for reset recovery, drm_sched_stop() shouldn't be called
> > + * if the scheduler skipped the timeout (DRM_SCHED_STAT_RUNNING).
> > */
> > void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> > {
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 1a7e377d4cbb4fc12ed93c548b236970217945e8..fe9043b6d43141bee831b5fc16b927202a507d51 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -389,11 +389,13 @@ struct drm_sched_job {
> > * @DRM_GPU_SCHED_STAT_NONE: Reserved. Do not use.
> > * @DRM_GPU_SCHED_STAT_NOMINAL: Operation succeeded.
> > * @DRM_GPU_SCHED_STAT_ENODEV: Error: Device is not available anymore.
> > + * @DRM_GPU_SCHED_STAT_RUNNING: GPU is still running, so skip the reset.
> > */
> > enum drm_gpu_sched_stat {
> > DRM_GPU_SCHED_STAT_NONE,
> > DRM_GPU_SCHED_STAT_NOMINAL,
> > DRM_GPU_SCHED_STAT_ENODEV,
> > + DRM_GPU_SCHED_STAT_RUNNING,
> > };
> >
> > /**
> >
> > --
> > 2.49.0
> >
More information about the Intel-xe
mailing list