[PATCH] drm/sched: Avoid double re-lock on the job free path

Tvrtko Ursulin tvrtko.ursulin at igalia.com
Fri Jul 18 10:18:58 UTC 2025


On 18/07/2025 10:41, Philipp Stanner wrote:
> On Fri, 2025-07-18 at 10:35 +0100, Tvrtko Ursulin wrote:
>>
>> On 18/07/2025 10:31, Philipp Stanner wrote:
>>> On Fri, 2025-07-18 at 08:13 +0100, Tvrtko Ursulin wrote:
>>>>
>>>> On 16/07/2025 21:44, Maíra Canal wrote:
>>>>> Hi Tvrtko,
>>>>>
>>>>> On 16/07/25 11:46, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 16/07/2025 15:30, Maíra Canal wrote:
>>>>>>> Hi Tvrtko,
>>>>>>>
>>>>>>> On 16/07/25 10:49, Tvrtko Ursulin wrote:
>>>>>>>>
>>>>>>>> On 16/07/2025 14:31, Maíra Canal wrote:
>>>>>>>>> Hi Tvrtko,
>>>>>>>>>
>>>>>>>>> On 16/07/25 05:51, Tvrtko Ursulin wrote:
>>>>>>>>>> Currently the job free work item will lock sched-
>>>>>>>>>>> job_list_lock
>>>>>>>>>> first time
>>>>>>>>>> to see if there are any jobs, free a single job, and
>>>>>>>>>> then lock
>>>>>>>>>> again to
>>>>>>>>>> decide whether to re-queue itself if there are more
>>>>>>>>>> finished jobs.
>>>>>>>>>>
>>>>>>>>>> Since drm_sched_get_finished_job() already looks at
>>>>>>>>>> the second job
>>>>>>>>>> in the
>>>>>>>>>> queue we can simply add the signaled check and have
>>>>>>>>>> it return the
>>>>>>>>>> presence
>>>>>>>>>> of more jobs to be freed to the caller. That way the
>>>>>>>>>> work item
>>>>>>>>>> does not
>>>>>>>>>> have to lock the list again and repeat the signaled
>>>>>>>>>> check.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Tvrtko Ursulin
>>>>>>>>>> <tvrtko.ursulin at igalia.com>
>>>>>>>>>> Cc: Christian König <christian.koenig at amd.com>
>>>>>>>>>> Cc: Danilo Krummrich <dakr at kernel.org>
>>>>>>>>>> Cc: Maíra Canal <mcanal at igalia.com>
>>>>>>>>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>>>>>>>>> Cc: Philipp Stanner <phasta at kernel.org>
>>>>>>>>>> ---
>>>>>>>>>> v2:
>>>>>>>>>>     * Improve commit text and kerneldoc. (Philipp)
>>>>>>>>>>     * Rename run free work helper. (Philipp)
>>>>>>>>>>
>>>>>>>>>> v3:
>>>>>>>>>>     * Rebase on top of Maira's changes.
>>>>>>>>>> ---
>>>>>>>>>>     drivers/gpu/drm/scheduler/sched_main.c | 53
>>>>>>>>>> +++++++++
>>>>>>>>>> +----------------
>>>>>>>>>>     1 file changed, 21 insertions(+), 32 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>> b/drivers/gpu/
>>>>>>>>>> drm/ scheduler/sched_main.c
>>>>>>>>>> index e2cda28a1af4..5a550fd76bf0 100644
>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>> @@ -349,34 +349,13 @@ static void
>>>>>>>>>> drm_sched_run_job_queue(struct
>>>>>>>>>> drm_gpu_scheduler *sched)
>>>>>>>>>>     }
>>>>>>>>>>     /**
>>>>>>>>>> - * __drm_sched_run_free_queue - enqueue free-job
>>>>>>>>>> work
>>>>>>>>>> - * @sched: scheduler instance
>>>>>>>>>> - */
>>>>>>>>>> -static void __drm_sched_run_free_queue(struct
>>>>>>>>>> drm_gpu_scheduler
>>>>>>>>>> *sched)
>>>>>>>>>> -{
>>>>>>>>>> -    if (!READ_ONCE(sched->pause_submit))
>>>>>>>>>> -        queue_work(sched->submit_wq, &sched-
>>>>>>>>>>> work_free_job);
>>>>>>>>>> -}
>>>>>>>>>> -
>>>>>>>>>> -/**
>>>>>>>>>> - * drm_sched_run_free_queue - enqueue free-job work
>>>>>>>>>> if ready
>>>>>>>>>> + * drm_sched_run_free_queue - enqueue free-job work
>>>>>>>>>>      * @sched: scheduler instance
>>>>>>>>>>      */
>>>>>>>>>>     static void drm_sched_run_free_queue(struct
>>>>>>>>>> drm_gpu_scheduler
>>>>>>>>>> *sched)
>>>>>>>>>>     {
>>>>>>>>>> -    struct drm_sched_job *job;
>>>>>>>>>> -
>>>>>>>>>> -    job = list_first_entry_or_null(&sched-
>>>>>>>>>>> pending_list,
>>>>>>>>>> -                       struct drm_sched_job, list);
>>>>>>>>>> -    if (job && dma_fence_is_signaled(&job->s_fence-
>>>>>>>>>>> finished))
>>>>>>>>>> -        __drm_sched_run_free_queue(sched);
>>>>>>>>>
>>>>>>>>> I believe we'd still need this chunk for
>>>>>>>>> DRM_GPU_SCHED_STAT_NO_HANG
>>>>>>>>> (check the comment in
>>>>>>>>> drm_sched_job_reinsert_on_false_timeout()). How
>>>>>>>>
>>>>>>>> You mean the "is there a signaled job in the list check"
>>>>>>>> is needed
>>>>>>>> for drm_sched_job_reinsert_on_false_timeout()? Hmm why?
>>>>>>>> Worst case
>>>>>>>> is a false positive wakeup on the free worker, no?
>>>>>>>
>>>>>>> Correct me if I'm mistaken, we would also have a false
>>>>>>> positive wake-up
>>>>>>> on the run_job worker, which I believe it could be
>>>>>>> problematic in the
>>>>>>> cases that we skipped the reset because the job is still
>>>>>>> running.
>>>>>>
>>>>>> Run job worker exits when it sees no free credits so I don't
>>>>>> think
>>>>>> there is a problem. What am I missing?
>>>>>>
>>>>>
>>>>> I was the one missing the code in `drm_sched_can_queue()`.
>>>>> Sorry for the
>>>>> misleading comments. This is:
>>>>>
>>>>> Reviewed-by: Maíra Canal <mcanal at igalia.com>
>>>>
>>>> No worries, and thanks!
>>>>
>>>> Philipp - are you okay with this version? V2 was done to address
>>>> your
>>>> feedback so that should be good now.
>>>
>>> Was just giving it another spin when you wrote. (a [PATCH v3]
>>> would've
>>> been neat for identification, though – I almost pulled the wrong
>>> patch
>>> from the archive *wink*)
>>
>> Oops, my bad.
>>
>>> LGTM, improves things, can be merged.
>>>
>>> However, we had to merge Lin Cao's bug fix [1] recently. That one
>>> is
>>> now in drm-misc-fixes, and your patch should go to drm-misc-next.
>>> This
>>> would cause a conflict once the two branches meet.
>>>
>>> So I suggest that we wait with this non-urgent patch until drm-
>>> misc-
>>> fixes / Linus's -rc gets merged into drm-misc-next, and then we
>>> apply
>>> it. Should be next week or the week after AFAIK.
>>>
>>> Unless somebody has a better idea, of course?
>>
>> Lin's patch touches sched_entity.c only and mine only sched_main.c -
>> ie.
>> no conflict AFAICT?
> 
> Aaahhh, I had a hallucination ^^'
> 
> It doesn't apply to drm-misc-fixes, but that is because fixes misses
> changes that yours is based on. Because Lin's patch was the last thing
> I touched on that branch I seem to have jumped to that conclusion.
> 
> Should be fine, then. My bad.
> 
> Will apply.

Thank you!

This enables me to send out a rebase of the fair DRM scheduler series soon.

Regards,

Tvrtko

>>> Remind me in case I forget.
>>>
>>>
>>> P.
>>>
>>> [1]
>>> https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/15f77764e90a713ee3916ca424757688e4f565b9
>>>
>>>
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>>>
>>>
>>
> 



More information about the dri-devel mailing list