[PATCH] drm/sched: Drain all entities in DRM sched run job worker

Dave Airlie airlied at gmail.com
Fri Jan 26 02:45:58 UTC 2024


Just FYI, I'm pulling this into drm-fixes as is, since it fixes the
regression and avoids the revert. Please keep discussing until we are
sure things are right; we can deal with any fixes in a follow-up
patch.

Dave.

On Fri, 26 Jan 2024 at 03:32, Matthew Brost <matthew.brost at intel.com> wrote:
>
> On Thu, Jan 25, 2024 at 10:24:24AM +0100, Vlastimil Babka wrote:
> > On 1/24/24 22:08, Matthew Brost wrote:
> > > All entities must be drained in the DRM scheduler run job worker to
> > > avoid the following case: an entity is found ready, no job is found
> > > ready on that entity, and the run job worker goes idle while other
> > > entities and jobs are ready. Draining all ready entities (i.e. looping
> > > over all ready entities) in the run job worker ensures all jobs that
> > > are ready will be scheduled.
> > >
> > > Cc: Thorsten Leemhuis <regressions at leemhuis.info>
> > > Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov at gmail.com>
> > > Closes: https://lore.kernel.org/all/CABXGCsM2VLs489CH-vF-1539-s3in37=bwuOWtoeeE+q26zE+Q@mail.gmail.com/
> > > Reported-and-tested-by: Mario Limonciello <mario.limonciello at amd.com>
> > > Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3124
> > > Link: https://lore.kernel.org/all/20240123021155.2775-1-mario.limonciello@amd.com/
> > > Reported-by: Vlastimil Babka <vbabka at suse.cz>
> >
> > Can change to Reported-and-tested-by: Vlastimil Babka <vbabka at suse.cz>
> >
>
> +1, got it.
>
> Matt
>
> > Thanks!
> >
> > > Closes: https://lore.kernel.org/dri-devel/05ddb2da-b182-4791-8ef7-82179fd159a8@amd.com/T/#m0c31d4d1b9ae9995bb880974c4f1dbaddc33a48a
> > > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> > > ---
> > >  drivers/gpu/drm/scheduler/sched_main.c | 15 +++++++--------
> > >  1 file changed, 7 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 550492a7a031..85f082396d42 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -1178,21 +1178,20 @@ static void drm_sched_run_job_work(struct work_struct *w)
> > >     struct drm_sched_entity *entity;
> > >     struct dma_fence *fence;
> > >     struct drm_sched_fence *s_fence;
> > > -   struct drm_sched_job *sched_job;
> > > +   struct drm_sched_job *sched_job = NULL;
> > >     int r;
> > >
> > >     if (READ_ONCE(sched->pause_submit))
> > >             return;
> > >
> > > -   entity = drm_sched_select_entity(sched);
> > > +   /* Find entity with a ready job */
> > > +   while (!sched_job && (entity = drm_sched_select_entity(sched))) {
> > > +           sched_job = drm_sched_entity_pop_job(entity);
> > > +           if (!sched_job)
> > > +                   complete_all(&entity->entity_idle);
> > > +   }
> > >     if (!entity)
> > > -           return;
> > > -
> > > -   sched_job = drm_sched_entity_pop_job(entity);
> > > -   if (!sched_job) {
> > > -           complete_all(&entity->entity_idle);
> > >             return; /* No more work */
> > > -   }
> > >
> > >     s_fence = sched_job->s_fence;
> > >
> >
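
For readers following the thread, here is a minimal user-space sketch of
the behaviour the patch changes. It is a toy model, not the kernel code:
struct toy_entity and the toy_* helpers are hypothetical stand-ins for
drm_sched_entity, drm_sched_select_entity() and
drm_sched_entity_pop_job(), and the "waiting" flag is an assumed
simplification of an entity being parked on an unsignalled dependency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a scheduler entity; not the kernel structure. */
struct toy_entity {
	int queued;        /* jobs waiting in this entity's queue */
	bool head_blocked; /* head job still waits on a dependency */
	bool waiting;      /* entity parked until its dependency signals */
};

/* Return the first entity that looks ready, or NULL if none does. */
static struct toy_entity *toy_select_entity(struct toy_entity *e, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (e[i].queued > 0 && !e[i].waiting)
			return &e[i];
	return NULL;
}

/* Try to pop one job; returns false when the head job cannot run yet. */
static bool toy_pop_job(struct toy_entity *ent)
{
	assert(ent->queued > 0);
	if (ent->head_blocked) {
		ent->waiting = true; /* skipped by later selections */
		return false;
	}
	ent->queued--;
	return true;
}

/* Old behaviour: look at one entity, give up if it has no runnable job. */
static bool toy_run_once_old(struct toy_entity *e, size_t n)
{
	struct toy_entity *ent = toy_select_entity(e, n);

	if (!ent)
		return false;
	return toy_pop_job(ent);
}

/* Patched behaviour: keep selecting until a job is found or none is ready. */
static bool toy_run_once_new(struct toy_entity *e, size_t n)
{
	struct toy_entity *ent;
	bool got_job = false;

	while (!got_job && (ent = toy_select_entity(e, n)))
		got_job = toy_pop_job(ent);
	return got_job;
}
```

With two entities where the first has a blocked head job and the second
has a runnable one, toy_run_once_old() returns without scheduling
anything, while toy_run_once_new() moves on and picks up the second
entity's job.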