[Intel-xe] [RFC PATCH 01/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread

Mon Jul 31 00:56:31 UTC 2023

On Fri, Jun 09, 2023 at 08:58:39AM +0200, Boris Brezillon wrote:
> Hi Matthew,
> 
> On Mon,  3 Apr 2023 17:22:02 -0700
> Matthew Brost <matthew.brost at intel.com> wrote:
> 
> > -static int drm_sched_main(void *param)
> > +static void drm_sched_main(struct work_struct *w)
> >  {
> > -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
> > +	struct drm_gpu_scheduler *sched =
> > +		container_of(w, struct drm_gpu_scheduler, work_run);
> >  	int r;
> >  
> > -	sched_set_fifo_low(current);
> > -
> > -	while (!kthread_should_stop()) {
> > -		struct drm_sched_entity *entity = NULL;
> > +	while (!READ_ONCE(sched->pause_run_wq)) {
> 
> During an informal discussion on IRC I mentioned that this loop might
> become problematic if all the 1:1 entities share the same wq
> (especially if it's an ordered wq), and one of them is getting passed a
> lot of requests. Just wanted to tell you that we've hit that case in
> PowerVR:
> 
> Geometry and fragment queues get passed X requests respectively, each
> pair of request corresponding to a rendering operation. Because we're
> using an ordered wq (which I know we shouldn't do, and I intend to
> fix that, but I think it shows the problem exists by making it more
> visible), all geometry requests get submitted first, then come the
> fragment requests. It turns out the submission time is non-negligible
> compared to the geometry job execution time, and geometry jobs end up
> generating data for the fragment jobs that are not consumed fast enough
> by the fragment job to allow the following geom jobs to re-use the same
> portion of memory, leading to on-demand allocation of extra memory
> chunks which wouldn't happen if submissions were interleaved.
> 
> I know you were not fundamentally opposed to killing this loop and doing
> one iteration at a time (you even provided a patch doing that), just
> wanted to share my findings to prove this is not just a theoretical
> issue, and the lack of fairness in the submission path can cause trouble
> in practice.
> 
> Best Regards,
> 
> Boris
> 

Thanks for the info Boris, about to revive this series in a non-RFC form.

This loop seems controversial, let me drop it. Going to cook up a patch
for the Xe branch and get this merged for CI / UMD benchmarks to absorb
and if there any noticeable differences.

Also be on the lookout for a new rev of this series hopefully this week.

Matt

> > +		struct drm_sched_entity *entity;
> >  		struct drm_sched_fence *s_fence;
> >  		struct drm_sched_job *sched_job;
> >  		struct dma_fence *fence;
> > -		struct drm_sched_job *cleanup_job = NULL;
> > +		struct drm_sched_job *cleanup_job;
> >  
> > -		wait_event_interruptible(sched->wake_up_worker,
> > -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
> > -					 (!drm_sched_blocked(sched) &&
> > -					  (entity = drm_sched_select_entity(sched))) ||
> > -					 kthread_should_stop());
> > +		cleanup_job = drm_sched_get_cleanup_job(sched);
> > +		entity = drm_sched_select_entity(sched);
> >  
> >  		if (cleanup_job)
> >  			sched->ops->free_job(cleanup_job);
> >  
> > -		if (!entity)
> > +		if (!entity) {
> > +			if (!cleanup_job)
> > +				break;
> >  			continue;
> > +		}
> >  
> >  		sched_job = drm_sched_entity_pop_job(entity);
> >  
> >  		if (!sched_job) {
> >  			complete_all(&entity->entity_idle);
> > +			if (!cleanup_job)
> > +				break;
> >  			continue;
> >  		}
> >  
> > @@ -1055,14 +1083,14 @@ static int drm_sched_main(void *param)
> >  					  r);
> >  		} else {
> >  			if (IS_ERR(fence))
> > -				dma_fence_set_error(&s_fence->finished, PTR_ERR(fence));
> > +				dma_fence_set_error(&s_fence->finished,
> > +						    PTR_ERR(fence));
> >  
> >  			drm_sched_job_done(sched_job);
> >  		}
> >  
> >  		wake_up(&sched->job_scheduled);
> >  	}
> > -	return 0;
> >  }