[PATCH 2/3] drm/amdgpu: Pop jobs from the queue more robustly
Tvrtko Ursulin
tvrtko.ursulin at igalia.com
Fri Feb 14 10:34:45 UTC 2025
On 14/02/2025 10:31, Christian König wrote:
> Am 14.02.25 um 11:21 schrieb Tvrtko Ursulin:
>>
>> Hi Christian,
>>
>> On 11/02/2025 10:21, Christian König wrote:
>>> Am 11.02.25 um 11:08 schrieb Philipp Stanner:
>>>> On Tue, 2025-02-11 at 09:22 +0100, Christian König wrote:
>>>>> Am 06.02.25 um 17:40 schrieb Tvrtko Ursulin:
>>>>>> Replace a copy of DRM scheduler's to_drm_sched_job with a copy of a
>>>>>> newly
>>>>>> added __drm_sched_entity_queue_pop.
>>>>>>
>>>>>> This allows breaking the hidden dependency that queue_node has to
>>>>>> be the
>>>>>> first element in struct drm_sched_job.
>>>>>>
>>>>>> A comment is also added with a reference to the mailing list
>>>>>> discussion
>>>>>> explaining the copied helper will be removed when the whole broken
>>>>>> amdgpu_job_stop_all_jobs_on_sched is removed.
>>>>>>
>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
>>>>>> Cc: Christian König <christian.koenig at amd.com>
>>>>>> Cc: Danilo Krummrich <dakr at kernel.org>
>>>>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>>>>> Cc: Philipp Stanner <phasta at kernel.org>
>>>>>> Cc: "Zhang, Hawking" <Hawking.Zhang at amd.com>
>>>>> Reviewed-by: Christian König <christian.koenig at amd.com>
>>>> I think this v3 has been supplanted by a v4 by now.
>>>
>>> I've seen the larger v4 series as well, but at least that patch here looks identical on first glance. So my rb still counts.
>>
>> Is it okay for you to merge the whole series (including this single amdgpu patch) via drm-misc?
>
> I can do that, but don't want the scheduler maintainer want to pick them up?
Sorry that was some bad and unclear English. :(
It is as you suggest - what I meant was, is it okay from your point of
view that the whole series is merged via drm-misc? I assume Philipp
would indeed be the one to merge it, once all patches get r-b-ed.
Regards,
Tvrtko
>>>> @Tvrtko: btw, do you create patches with
>>>> git format-patch -v4 ?
>>>>
>>>> That way the v4 label will be included in all patch titles, too, not
>>>> just the cover letter. That makes searching etc. easier in large
>>>> inboxes
>>>>
>>>> P.
>>>>
>>>>>> ---
>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 22 +++++++++++++++++++-
>>>>>> --
>>>>>> 1 file changed, 19 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>> index 100f04475943..22cb48bab24d 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>> @@ -411,8 +411,24 @@ static struct dma_fence *amdgpu_job_run(struct
>>>>>> drm_sched_job *sched_job)
>>>>>> return fence;
>>>>>> }
>>>>>> -#define to_drm_sched_job(sched_job) \
>>>>>> - container_of((sched_job), struct drm_sched_job,
>>>>>> queue_node)
>>>>>> +/*
>>>>>> + * This is a duplicate function from DRM scheduler
>>>>>> sched_internal.h.
>>>>>> + * Plan is to remove it when amdgpu_job_stop_all_jobs_on_sched is
>>>>>> removed, due
>>>>>> + * latter being incorrect and racy.
>>>>>> + *
>>>>>> + * See
>>>>>> https://lore.kernel.org/amd-gfx/44edde63-7181-44fb- a4f7-94e50514f539 at amd.com/
>>>>>> + */
>>>>>> +static struct drm_sched_job *
>>>>>> +__drm_sched_entity_queue_pop(struct drm_sched_entity *entity)
>>>>>> +{
>>>>>> + struct spsc_node *node;
>>>>>> +
>>>>>> + node = spsc_queue_pop(&entity->job_queue);
>>>>>> + if (!node)
>>>>>> + return NULL;
>>>>>> +
>>>>>> + return container_of(node, struct drm_sched_job,
>>>>>> queue_node);
>>>>>> +}
>>>>>> void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler
>>>>>> *sched)
>>>>>> {
>>>>>> @@ -425,7 +441,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct
>>>>>> drm_gpu_scheduler *sched)
>>>>>> struct drm_sched_rq *rq = sched->sched_rq[i];
>>>>>> spin_lock(&rq->lock);
>>>>>> list_for_each_entry(s_entity, &rq->entities, list)
>>>>>> {
>>>>>> - while ((s_job =
>>>>>> to_drm_sched_job(spsc_queue_pop(&s_entity->job_queue)))) {
>>>>>> + while ((s_job =
>>>>>> __drm_sched_entity_queue_pop(s_entity))) {
>>>>>> struct drm_sched_fence *s_fence =
>>>>>> s_job->s_fence;
>>>>>> dma_fence_signal(&s_fence-
>>>>>>> scheduled);
>>>
>>
>
More information about the amd-gfx
mailing list