[PATCH 4/4] drm/scheduler: do not keep a copy of sched list
Nirmoy
nirmodas at amd.com
Tue Dec 10 15:08:37 UTC 2019
I think amdgpu_ctx_init() should check for num_scheds and not call
drm_sched_entity_init()
if its zero.
On 12/10/19 3:47 PM, Nirmoy wrote:
>
> On 12/10/19 2:00 PM, Christian König wrote:
>> Am 10.12.19 um 13:53 schrieb Nirmoy Das:
>>> entity should not keep copy and maintain sched list for
>>> itself.
>>>
>>> Signed-off-by: Nirmoy Das <nirmoy.das at amd.com>
>>> ---
>>> drivers/gpu/drm/scheduler/sched_entity.c | 10 +---------
>>> 1 file changed, 1 insertion(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
>>> b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index f9b6ce29c58f..a5f729f421f8 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -67,17 +67,10 @@ int drm_sched_entity_init(struct
>>> drm_sched_entity *entity,
>>> entity->guilty = guilty;
>>> entity->num_sched_list = num_sched_list;
>>> entity->priority = priority;
>>> - entity->sched_list = kcalloc(num_sched_list,
>>> - sizeof(struct drm_gpu_scheduler *), GFP_KERNEL);
>>> + entity->sched_list = sched_list;
>>
>> Maybe make this "num_sched_list > 1 ? sched_list : NULL" to avoid
>> accidentally dereferencing a stale pointer to the stack.
> Do you mean "num_sched_list >= 1 ? sched_list : NULL"
>>
>>> - if(!entity->sched_list)
>>> - return -ENOMEM;
>>> init_completion(&entity->entity_idle);
>>> -
>>> - for (i = 0; i < num_sched_list; i++)
>>> - entity->sched_list[i] = sched_list[i];
>>> -
>>> if (num_sched_list)
>>
>> That check can actually be dropped as well. We return -EINVAL when
>> the num_sched_list is zero.
>
> This one took me some time to understand. So we don't really return
> -EINVAL if num_sched_list == 0
>
> if (!(entity && sched_list && (num_sched_list == 0 || sched_list[0])))
> return -EINVAL;
>
> This is coming from below patch. Are we suppose to tolerate IPs with
> uninitialized sched so that ctx creation dosn't return error ?
>
> commit 1decbf6bb0b4dc56c9da6c5e57b994ebfc2be3aa
> Author: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
> Date: Wed Jan 30 02:53:19 2019 +0100
>
> drm/sched: Fix entities with 0 rqs.
>
> Some blocks in amdgpu can have 0 rqs.
>
> Job creation already fails with -ENOENT when entity->rq is NULL,
> so jobs cannot be pushed. Without a rq there is no scheduler to
> pop jobs, and rq selection already does the right thing with a
> list of length 0.
>
> So the operations we need to fix are:
> - Creation, do not set rq to rq_list[0] if the list can have
> length 0.
> - Do not flush any jobs when there is no rq.
> - On entity destruction handle the rq = NULL case.
> - on set_priority, do not try to change the rq if it is NULL.
>
>>
>> Regards,
>> Christian.
>>
>>> entity->rq =
>>> &entity->sched_list[0]->sched_rq[entity->priority];
>>> @@ -312,7 +305,6 @@ void drm_sched_entity_fini(struct
>>> drm_sched_entity *entity)
>>> dma_fence_put(entity->last_scheduled);
>>> entity->last_scheduled = NULL;
>>> - kfree(entity->sched_list);
>>> }
>>> EXPORT_SYMBOL(drm_sched_entity_fini);
>>
More information about the amd-gfx
mailing list