[RFC v4 10/16] drm/sched: Free all finished jobs at once
Philipp Stanner
phasta at mailbox.org
Mon May 12 12:56:29 UTC 2025
On Fri, 2025-04-25 at 11:20 +0100, Tvrtko Ursulin wrote:
> To implement fair scheduling we will need as accurate a view as
> possible into per-entity GPU time utilisation. Because sched fence
> execution times are only adjusted for accuracy in the free worker, we
> need to process completed jobs as soon as possible so that the metric
> is up to date when viewed from the submission side of things.
>
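(As a side note on the rationale above, here is a minimal sketch of how
I read it, assuming the fair scheduling patches derive per-entity GPU
time from the sched fence timestamps; the variable name below is
illustrative, only dma_fence_timestamp() and the s_fence timestamps
come from the existing code:

	/* Per-job GPU time; only accurate once the free worker has
	 * fixed up scheduled.timestamp to the previous job's finish
	 * time, hence the wish to free finished jobs promptly. */
	ktime_t gpu_time =
		ktime_sub(dma_fence_timestamp(&job->s_fence->finished),
			  job->s_fence->scheduled.timestamp);

So the sooner finished jobs are processed, the fresher that figure is
for the submission side.)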
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
> Cc: Christian König <christian.koenig at amd.com>
> Cc: Danilo Krummrich <dakr at kernel.org>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Philipp Stanner <phasta at kernel.org>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 15 ++-------------
> 1 file changed, 2 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 8950c7705f57..22428a1569dd 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -865,13 +865,12 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>   * drm_sched_get_finished_job - fetch the next finished job to be destroyed
>   *
>   * @sched: scheduler instance
> - * @have_more: are there more finished jobs on the list
>   *
>   * Returns the next finished job from the pending list (if there is one)
>   * ready for it to be destroyed.
>   */
>  static struct drm_sched_job *
> -drm_sched_get_finished_job(struct drm_gpu_scheduler *sched, bool *have_more)
> +drm_sched_get_finished_job(struct drm_gpu_scheduler *sched)
>  {
>  	struct drm_sched_job *job, *next;
>
> @@ -886,7 +885,6 @@ drm_sched_get_finished_job(struct drm_gpu_scheduler *sched, bool *have_more)
>  		/* cancel this job's TO timer */
>  		cancel_delayed_work(&sched->work_tdr);
> 
> -		*have_more = false;
>  		next = list_first_entry_or_null(&sched->pending_list,
>  						typeof(*next), list);
>  		if (next) {
> @@ -896,10 +894,6 @@ drm_sched_get_finished_job(struct drm_gpu_scheduler *sched, bool *have_more)
>  			next->s_fence->scheduled.timestamp =
>  				dma_fence_timestamp(&job->s_fence->finished);
> 
> -			if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> -				     &next->s_fence->finished.flags))
> -				*have_more = true;
> -
>  			/* start TO timer for next job */
>  			drm_sched_start_timeout(sched);
>  		}
> @@ -958,14 +952,9 @@ static void drm_sched_free_job_work(struct work_struct *w)
>  	struct drm_gpu_scheduler *sched =
>  		container_of(w, struct drm_gpu_scheduler, work_free_job);
>  	struct drm_sched_job *job;
> -	bool have_more;
> 
> -	job = drm_sched_get_finished_job(sched, &have_more);
> -	if (job) {
> +	while ((job = drm_sched_get_finished_job(sched)))
>  		sched->ops->free_job(job);
> -		if (have_more)
> -			__drm_sched_run_free_queue(sched);
> -	}
Are there any have_more users left after that?
Removing here what was added earlier in the series makes it, IMO,
questionable whether that improvement was worth adding in the first
place.
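For reference, if I am reading the end state right, after this patch
the worker boils down to roughly the following (a sketch reconstructed
from the diff, assuming no other have_more users remain elsewhere in
the series):

	static void drm_sched_free_job_work(struct work_struct *w)
	{
		struct drm_gpu_scheduler *sched =
			container_of(w, struct drm_gpu_scheduler,
				     work_free_job);
		struct drm_sched_job *job;

		/* Free every already-finished job in one pass. */
		while ((job = drm_sched_get_finished_job(sched)))
			sched->ops->free_job(job);

		drm_sched_run_job_queue(sched);
	}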
P.
> 
>  	drm_sched_run_job_queue(sched);
>  }