[RFC PATCH 01/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread

Boris Brezillon boris.brezillon at collabora.com
Fri Jun 9 06:58:39 UTC 2023


Hi Matthew,

On Mon,  3 Apr 2023 17:22:02 -0700
Matthew Brost <matthew.brost at intel.com> wrote:

> -static int drm_sched_main(void *param)
> +static void drm_sched_main(struct work_struct *w)
>  {
> -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
> +	struct drm_gpu_scheduler *sched =
> +		container_of(w, struct drm_gpu_scheduler, work_run);
>  	int r;
>  
> -	sched_set_fifo_low(current);
> -
> -	while (!kthread_should_stop()) {
> -		struct drm_sched_entity *entity = NULL;
> +	while (!READ_ONCE(sched->pause_run_wq)) {

During an informal discussion on IRC I mentioned that this loop might
become problematic if all the 1:1 entities share the same wq
(especially if it's an ordered wq) and one of them gets passed a lot
of requests. I just wanted to let you know that we've now hit that
case in PowerVR:

The geometry and fragment queues each get passed X requests, with each
pair of requests corresponding to one rendering operation. Because
we're using an ordered wq (which I know we shouldn't do, and I intend
to fix that, but I think it makes the problem more visible and shows
it really exists), all the geometry requests get submitted first, and
only then come the fragment requests. It turns out the submission time
is non-negligible compared to the geometry job execution time, so
geometry jobs end up generating data for the fragment jobs faster than
the fragment jobs can consume it. The following geometry jobs thus
can't re-use the same portion of memory, leading to on-demand
allocation of extra memory chunks that wouldn't be needed if
submissions were interleaved.
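
To make it a bit more concrete, here is a toy illustration (not actual
PowerVR code, all names made up) of why the ordered wq makes things
worse: work items on an ordered wq are strictly serialized, so
whichever one runs first drains its whole queue, exactly like the
while() loop above does per scheduler:

#include <linux/workqueue.h>

static struct workqueue_struct *shared_wq;
static struct work_struct geom_work, frag_work;

static void geom_work_fn(struct work_struct *w)
{
	/* Drains the whole geometry queue before returning. */
}

static void frag_work_fn(struct work_struct *w)
{
	/* Only starts once geom_work_fn() has returned. */
}

static void toy_submit(void)
{
	/* Ordered wq: at most one work item runs at a time, in
	 * queueing order, so geometry and fragment submissions
	 * can never interleave. */
	shared_wq = alloc_ordered_workqueue("toy-sched", 0);

	INIT_WORK(&geom_work, geom_work_fn);
	INIT_WORK(&frag_work, frag_work_fn);

	queue_work(shared_wq, &geom_work);
	queue_work(shared_wq, &frag_work);
}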

I know you were not fundamentally opposed to killing this loop and
doing one iteration at a time (you even provided a patch doing that);
I just wanted to share my findings to show this is not just a
theoretical issue, and that the lack of fairness in the submission
path can cause trouble in practice.
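
For the record, the single-iteration version I have in mind (probably
close to what your other patch did; rough, untested sketch on top of
this patch, and I'm guessing the run_wq field name) would be something
along these lines:

static void drm_sched_main(struct work_struct *w)
{
	struct drm_gpu_scheduler *sched =
		container_of(w, struct drm_gpu_scheduler, work_run);
	struct drm_sched_entity *entity;
	struct drm_sched_job *cleanup_job;

	if (READ_ONCE(sched->pause_run_wq))
		return;

	cleanup_job = drm_sched_get_cleanup_job(sched);
	entity = drm_sched_select_entity(sched);

	if (cleanup_job)
		sched->ops->free_job(cleanup_job);

	if (entity) {
		/* ... pop and run a single job, as in your patch ... */
	}

	/* Re-queue ourselves instead of looping, so work items from
	 * other schedulers sharing the wq get a chance to run in
	 * between. */
	if (cleanup_job || entity)
		queue_work(sched->run_wq, &sched->work_run);
}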

Best Regards,

Boris

> +		struct drm_sched_entity *entity;
>  		struct drm_sched_fence *s_fence;
>  		struct drm_sched_job *sched_job;
>  		struct dma_fence *fence;
> -		struct drm_sched_job *cleanup_job = NULL;
> +		struct drm_sched_job *cleanup_job;
>  
> -		wait_event_interruptible(sched->wake_up_worker,
> -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
> -					 (!drm_sched_blocked(sched) &&
> -					  (entity = drm_sched_select_entity(sched))) ||
> -					 kthread_should_stop());
> +		cleanup_job = drm_sched_get_cleanup_job(sched);
> +		entity = drm_sched_select_entity(sched);
>  
>  		if (cleanup_job)
>  			sched->ops->free_job(cleanup_job);
>  
> -		if (!entity)
> +		if (!entity) {
> +			if (!cleanup_job)
> +				break;
>  			continue;
> +		}
>  
>  		sched_job = drm_sched_entity_pop_job(entity);
>  
>  		if (!sched_job) {
>  			complete_all(&entity->entity_idle);
> +			if (!cleanup_job)
> +				break;
>  			continue;
>  		}
>  
> @@ -1055,14 +1083,14 @@ static int drm_sched_main(void *param)
>  					  r);
>  		} else {
>  			if (IS_ERR(fence))
> -				dma_fence_set_error(&s_fence->finished, PTR_ERR(fence));
> +				dma_fence_set_error(&s_fence->finished,
> +						    PTR_ERR(fence));
>  
>  			drm_sched_job_done(sched_job);
>  		}
>  
>  		wake_up(&sched->job_scheduled);
>  	}
> -	return 0;
>  }

