[PATCH] drm/sched: Avoid double re-lock on the job free path
Tvrtko Ursulin
tvrtko.ursulin at igalia.com
Fri Jul 18 07:13:14 UTC 2025
On 16/07/2025 21:44, Maíra Canal wrote:
> Hi Tvrtko,
>
> On 16/07/25 11:46, Tvrtko Ursulin wrote:
>>
>> On 16/07/2025 15:30, Maíra Canal wrote:
>>> Hi Tvrtko,
>>>
>>> On 16/07/25 10:49, Tvrtko Ursulin wrote:
>>>>
>>>> On 16/07/2025 14:31, Maíra Canal wrote:
>>>>> Hi Tvrtko,
>>>>>
>>>>> On 16/07/25 05:51, Tvrtko Ursulin wrote:
>>>>>> Currently the job free work item will lock sched->job_list_lock a
>>>>>> first time to see if there are any jobs, free a single job, and then
>>>>>> lock again to decide whether to re-queue itself if there are more
>>>>>> finished jobs.
>>>>>>
>>>>>> Since drm_sched_get_finished_job() already looks at the second job in
>>>>>> the queue we can simply add the signaled check and have it return the
>>>>>> presence of more jobs to be freed to the caller. That way the work
>>>>>> item does not have to lock the list again and repeat the signaled
>>>>>> check.
>>>>>>
>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
>>>>>> Cc: Christian König <christian.koenig at amd.com>
>>>>>> Cc: Danilo Krummrich <dakr at kernel.org>
>>>>>> Cc: Maíra Canal <mcanal at igalia.com>
>>>>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>>>>> Cc: Philipp Stanner <phasta at kernel.org>
>>>>>> ---
>>>>>> v2:
>>>>>> * Improve commit text and kerneldoc. (Philipp)
>>>>>> * Rename run free work helper. (Philipp)
>>>>>>
>>>>>> v3:
>>>>>> * Rebase on top of Maira's changes.
>>>>>> ---
>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 53 ++++++++++----------------
>>>>>> 1 file changed, 21 insertions(+), 32 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index e2cda28a1af4..5a550fd76bf0 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -349,34 +349,13 @@ static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
>>>>>> }
>>>>>> /**
>>>>>> - * __drm_sched_run_free_queue - enqueue free-job work
>>>>>> - * @sched: scheduler instance
>>>>>> - */
>>>>>> -static void __drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
>>>>>> -{
>>>>>> - if (!READ_ONCE(sched->pause_submit))
>>>>>> - queue_work(sched->submit_wq, &sched->work_free_job);
>>>>>> -}
>>>>>> -
>>>>>> -/**
>>>>>> - * drm_sched_run_free_queue - enqueue free-job work if ready
>>>>>> + * drm_sched_run_free_queue - enqueue free-job work
>>>>>> * @sched: scheduler instance
>>>>>> */
>>>>>> static void drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
>>>>>> {
>>>>>> - struct drm_sched_job *job;
>>>>>> -
>>>>>> - job = list_first_entry_or_null(&sched->pending_list,
>>>>>> - struct drm_sched_job, list);
>>>>>> - if (job && dma_fence_is_signaled(&job->s_fence->finished))
>>>>>> - __drm_sched_run_free_queue(sched);
>>>>>
>>>>> I believe we'd still need this chunk for DRM_GPU_SCHED_STAT_NO_HANG
>>>>> (check the comment in drm_sched_job_reinsert_on_false_timeout()). How
>>>>
>>>> You mean the "is there a signaled job in the list" check is needed
>>>> for drm_sched_job_reinsert_on_false_timeout()? Hmm, why? Worst case
>>>> is a false positive wakeup on the free worker, no?
>>>
>>> Correct me if I'm mistaken, but we would also have a false positive
>>> wake-up on the run_job worker, which I believe could be problematic in
>>> the cases where we skipped the reset because the job is still running.
>>
>> The run_job worker exits when it sees no free credits, so I don't think
>> there is a problem. What am I missing?
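To spell out why a spurious wakeup of the run_job worker is harmless: it bails out before touching anything unless some entity has a job that fits within the free credits. A paraphrased sketch of that entry path (illustrative only, not a verbatim quote of the current code):

static void drm_sched_run_job_work(struct work_struct *w)
{
	struct drm_gpu_scheduler *sched =
		container_of(w, struct drm_gpu_scheduler, work_run_job);
	struct drm_sched_entity *entity;

	/*
	 * drm_sched_select_entity() only hands back an entity whose next
	 * job fits within the scheduler's free credits (the
	 * drm_sched_can_queue() check), so an extra queue_work() just
	 * ends up here and returns without doing any work.
	 */
	entity = drm_sched_select_entity(sched);
	if (!entity)
		return;

	/* ... normal submission path continues from here ... */
}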
>>
>
> I was the one missing the code in `drm_sched_can_queue()`. Sorry for the
> misleading comments. This is:
>
> Reviewed-by: Maíra Canal <mcanal at igalia.com>
No worries, and thanks!
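To make the end result easy to see at a glance, the reworked free worker is shaped roughly like the sketch below. This is only an illustration of the idea in the commit message, not a verbatim quote of the patch; in particular the have_more out-parameter name just stands in for the "presence of more jobs to be freed" value that drm_sched_get_finished_job() now reports to the caller.

static void drm_sched_free_job_work(struct work_struct *w)
{
	struct drm_gpu_scheduler *sched =
		container_of(w, struct drm_gpu_scheduler, work_free_job);
	struct drm_sched_job *job;
	bool have_more;

	/*
	 * A single pass under job_list_lock both hands back the job to
	 * free and reports whether more signaled jobs remain, so the
	 * worker can re-queue itself without locking and peeking again.
	 */
	job = drm_sched_get_finished_job(sched, &have_more);
	if (job) {
		sched->ops->free_job(job);
		if (have_more)
			drm_sched_run_free_queue(sched);
	}

	drm_sched_run_job_queue(sched);
}

The second lock/peek round-trip disappears, while the worst case remains a spurious wakeup of a worker that finds nothing to do and returns.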
Philipp - are you okay with this version? V2 was done to address your
feedback, so this should be good now.
Regards,
Tvrtko