[PATCH] drm/sched: Avoid infinite waits in the drm_sched_entity_destroy() path

Steven Price steven.price at arm.com
Thu Oct 1 15:07:13 UTC 2020


On 01/10/2020 15:12, Boris Brezillon wrote:
> If we don't initialize the entity to idle, and the entity is never
> scheduled before being destroyed, we end up with an infinite wait in the
> destroy path.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon at collabora.com>

This seems reasonable to me. In theory you could trigger this by
opening the device, submitting a job and closing it again very quickly
(i.e. if drm_sched_main() never actually enters the while loop).
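
To illustrate the failure mode, here is a minimal userspace sketch of the
completion semantics involved. It is a toy model, not the scheduler code:
every toy_* name is hypothetical, only the completion's "done" counter is
modelled (no wait queue, no locking), and the blocking wait in the destroy
path is replaced by a non-blocking probe so the example cannot hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a kernel completion: only the 'done' counter is kept. */
struct toy_completion {
	unsigned int done;
};

/* Hypothetical entity carrying just the "entity is idle" completion. */
struct toy_entity {
	struct toy_completion entity_idle;
};

/* A freshly initialised completion has not been signalled yet. */
static void toy_init_completion(struct toy_completion *c)
{
	c->done = 0;
}

/* Signal the completion once. */
static void toy_complete(struct toy_completion *c)
{
	c->done++;
}

/*
 * Non-blocking probe: returns true (and consumes one signal) if a
 * blocking wait would return immediately, false if it would sleep.
 */
static bool toy_try_wait(struct toy_completion *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}

/* start_idle == true mimics the extra complete() call added by the patch. */
static void toy_entity_init(struct toy_entity *e, bool start_idle)
{
	assert(e != NULL);
	toy_init_completion(&e->entity_idle);
	if (start_idle)
		toy_complete(&e->entity_idle);
}

/*
 * Returns true if the destroy path would get past its idle wait for an
 * entity that was never picked up by the scheduler thread.
 */
static bool toy_entity_destroy_would_finish(struct toy_entity *e)
{
	assert(e != NULL);
	return toy_try_wait(&e->entity_idle);
}
```

In this model, an entity set up with toy_entity_init(&e, false) and never
scheduled makes toy_entity_destroy_would_finish(&e) return false, which
corresponds to the infinite wait; with start_idle set to true it returns
true.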

You should CC some other folk as this doesn't just affect Panfrost.

Reviewed-by: Steven Price <steven.price at arm.com>

> ---
> This is something I noticed while debugging another issue on panfrost
> causing the scheduler to be in a weird state where new entities were no
> longer scheduled. This was causing all userspace threads trying to close
> their DRM fd to be blocked in kernel space waiting for this "entity is
> idle" event. I don't know if that fix is legitimate (now that we fixed
> the other bug we don't seem to end up in that state anymore), but I
> thought I'd share it anyway.
> ---
>   drivers/gpu/drm/scheduler/sched_entity.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 146380118962..f8ec277a6aa8 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -73,6 +73,9 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>   
>   	init_completion(&entity->entity_idle);
>   
> +	/* We start in an idle state. */
> +	complete(&entity->entity_idle);
> +
>   	spin_lock_init(&entity->rq_lock);
>   	spsc_queue_init(&entity->job_queue);
>   
> 


