[PATCH 4/4] drm/scheduler: do not keep a copy of sched list

Nirmoy nirmodas at amd.com
Tue Dec 10 15:08:37 UTC 2019


I think amdgpu_ctx_init() should check for num_scheds and not call 
drm_sched_entity_init()

if its zero.

On 12/10/19 3:47 PM, Nirmoy wrote:
>
> On 12/10/19 2:00 PM, Christian König wrote:
>> Am 10.12.19 um 13:53 schrieb Nirmoy Das:
>>> entity should not keep copy and maintain sched list for
>>> itself.
>>>
>>> Signed-off-by: Nirmoy Das <nirmoy.das at amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_entity.c | 10 +---------
>>>   1 file changed, 1 insertion(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
>>> b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index f9b6ce29c58f..a5f729f421f8 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -67,17 +67,10 @@ int drm_sched_entity_init(struct 
>>> drm_sched_entity *entity,
>>>       entity->guilty = guilty;
>>>       entity->num_sched_list = num_sched_list;
>>>       entity->priority = priority;
>>> -    entity->sched_list =  kcalloc(num_sched_list,
>>> -                      sizeof(struct drm_gpu_scheduler *), GFP_KERNEL);
>>> +    entity->sched_list =  sched_list;
>>
>> Maybe make this "num_sched_list > 1 ? sched_list : NULL" to avoid 
>> accidentally dereferencing a stale pointer to the stack.
> Do you mean "num_sched_list >= 1 ? sched_list : NULL"
>>
>>>   -    if(!entity->sched_list)
>>> -        return -ENOMEM;
>>>         init_completion(&entity->entity_idle);
>>> -
>>> -    for (i = 0; i < num_sched_list; i++)
>>> -        entity->sched_list[i] = sched_list[i];
>>> -
>>>       if (num_sched_list)
>>
>> That check can actually be dropped as well. We return -EINVAL when 
>> the num_sched_list is zero.
>
> This  one took me some time to understand. So we don't really return 
> -EINVAL if num_sched_list == 0
>
> if (!(entity && sched_list && (num_sched_list == 0 || sched_list[0])))
>                 return -EINVAL;
>
> This is coming from below patch. Are we suppose to tolerate IPs with 
> uninitialized sched so that ctx creation dosn't return error ?
>
> commit 1decbf6bb0b4dc56c9da6c5e57b994ebfc2be3aa
> Author: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
> Date:   Wed Jan 30 02:53:19 2019 +0100
>
>     drm/sched: Fix entities with 0 rqs.
>
>     Some blocks in amdgpu can have 0 rqs.
>
>     Job creation already fails with -ENOENT when entity->rq is NULL,
>     so jobs cannot be pushed. Without a rq there is no scheduler to
>     pop jobs, and rq selection already does the right thing with a
>     list of length 0.
>
>     So the operations we need to fix are:
>       - Creation, do not set rq to rq_list[0] if the list can have 
> length 0.
>       - Do not flush any jobs when there is no rq.
>       - On entity destruction handle the rq = NULL case.
>       - on set_priority, do not try to change the rq if it is NULL.
>
>>
>> Regards,
>> Christian.
>>
>>>           entity->rq = 
>>> &entity->sched_list[0]->sched_rq[entity->priority];
>>>   @@ -312,7 +305,6 @@ void drm_sched_entity_fini(struct 
>>> drm_sched_entity *entity)
>>>         dma_fence_put(entity->last_scheduled);
>>>       entity->last_scheduled = NULL;
>>> -    kfree(entity->sched_list);
>>>   }
>>>   EXPORT_SYMBOL(drm_sched_entity_fini);
>>


More information about the amd-gfx mailing list