[PATCH 2/2] drm/scheduler: stop setting rq to NULL

Christian König ckoenig.leichtzumerken at gmail.com
Thu Aug 2 07:26:04 UTC 2018


Am 02.08.2018 um 08:47 schrieb Nayan Deshmukh:
>
>
> On Thu, Aug 2, 2018 at 12:12 PM Christian König 
> <ckoenig.leichtzumerken at gmail.com 
> <mailto:ckoenig.leichtzumerken at gmail.com>> wrote:
>
>     Am 02.08.2018 um 00:25 schrieb Andrey Grodzovsky:
>     > Thinking again about this change and 53d5f1e drm/scheduler: move
>     idle
>     > entities to scheduler with less load v2
>     >
>     > Looks to me like use case for which fc9a539 drm/scheduler: add NULL
>     > pointer check for run queue (v2) was done
>     >
>     > will not work anymore.
>     >
>     > First of all in drm_sched_entity_push_job, 'if (!entity->rq)' will
>     > never be true any more since we stopped entity->rq to NULL in
>     >
>     > drm_sched_entity_flush.
>
>     Good point, going to remove that.
>
>     >
>     > Second of all, even after we removed the entity from rq in
>     > drm_sched_entity_flush to terminate any subsequent submissions
>     >
>     > to the queue the other thread doing push job can just acquire
>     again a
>     > run queue
>     >
>     > from drm_sched_entity_get_free_sched and continue submission
>
> Hi Christian
>
>     That is actually desired.
>
>     When another process is now using the entity to submit jobs adding it
>     back to the rq is actually the right thing to do cause the entity is
>     still in use.
>
> I am not aware of the usecase so I might just be rambling. but if the 
> thread/process that created the entity has called 
> drm_sched_entity_flush then we shouldn't allow other processes to push 
> jobs to that entity.

Why not?

Calling flush() only means that we should wait for all waiting jobs to 
be send to the hardware. It doesn't mean that we need to stop accepting 
new jobs.

It is perfectly possible that a child process has inherited a file 
descriptor from it's parent and calls flush because it exits, while the 
parent is still using the file descriptor.

The only tricky part is how to handle it when a process is killed, but I 
think we came to good conclusion there as well.

We remove the entity from the scheduling when the last process which 
submitted jobs is killed or if there is no such process any more and the 
current flushing process is killed.

Christian.

>
> Nayan
>
>
>     Christian.
>
>     > so you can't substitute ' if (!entity->rq)' to 'if
>     > (list_empty(&entity->list))'.
>     >
>     > What about adding a 'stopped' flag to drm_sched_entity to be set in
>     > drm_sched_entity_flush and in
>     >
>     > drm_sched_entity_push_job check for  'if (!entity->stopped)'
>     instead
>     > of  ' if (!entity->rq)' ?
>     >
>     > Andrey
>     >
>     >
>     > On 07/30/2018 07:03 AM, Christian König wrote:
>     >> We removed the redundancy of having an extra scheduler field, so we
>     >> can't set the rq to NULL any more or otherwise won't know which
>     >> scheduler to use for the cleanup.
>     >>
>     >> Just remove the entity from the scheduling list instead.
>     >>
>     >> Signed-off-by: Christian König <christian.koenig at amd.com
>     <mailto:christian.koenig at amd.com>>
>     >> ---
>     >>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 35
>     >> +++++++------------------------
>     >>   1 file changed, 8 insertions(+), 27 deletions(-)
>     >>
>     >> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>     >> b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>     >> index f563e4fbb4b6..1b733229201e 100644
>     >> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>     >> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>     >> @@ -198,21 +198,6 @@ int drm_sched_entity_init(struct
>     >> drm_sched_entity *entity,
>     >>   }
>     >>   EXPORT_SYMBOL(drm_sched_entity_init);
>     >>   -/**
>     >> - * drm_sched_entity_is_initialized - Query if entity is
>     initialized
>     >> - *
>     >> - * @sched: Pointer to scheduler instance
>     >> - * @entity: The pointer to a valid scheduler entity
>     >> - *
>     >> - * return true if entity is initialized, false otherwise
>     >> -*/
>     >> -static bool drm_sched_entity_is_initialized(struct
>     drm_gpu_scheduler
>     >> *sched,
>     >> -                        struct drm_sched_entity *entity)
>     >> -{
>     >> -    return entity->rq != NULL &&
>     >> -        entity->rq->sched == sched;
>     >> -}
>     >> -
>     >>   /**
>     >>    * drm_sched_entity_is_idle - Check if entity is idle
>     >>    *
>     >> @@ -224,7 +209,8 @@ static bool drm_sched_entity_is_idle(struct
>     >> drm_sched_entity *entity)
>     >>   {
>     >>       rmb();
>     >>   -    if (!entity->rq || spsc_queue_peek(&entity->job_queue)
>     == NULL)
>     >> +    if (list_empty(&entity->list) ||
>     >> +        spsc_queue_peek(&entity->job_queue) == NULL)
>     >>           return true;
>     >>         return false;
>     >> @@ -279,8 +265,6 @@ long drm_sched_entity_flush(struct
>     >> drm_sched_entity *entity, long timeout)
>     >>       long ret = timeout;
>     >>         sched = entity->rq->sched;
>     >> -    if (!drm_sched_entity_is_initialized(sched, entity))
>     >> -        return ret;
>     >>       /**
>     >>        * The client will not queue more IBs during this fini,
>     consume
>     >> existing
>     >>        * queued IBs or discard them on SIGKILL
>     >> @@ -299,7 +283,7 @@ long drm_sched_entity_flush(struct
>     >> drm_sched_entity *entity, long timeout)
>     >>       last_user = cmpxchg(&entity->last_user,
>     current->group_leader,
>     >> NULL);
>     >>       if ((!last_user || last_user == current->group_leader) &&
>     >>           (current->flags & PF_EXITING) && (current->exit_code ==
>     >> SIGKILL))
>     >> -        drm_sched_entity_set_rq(entity, NULL);
>     >> +        drm_sched_rq_remove_entity(entity->rq, entity);
>     >>         return ret;
>     >>   }
>     >> @@ -320,7 +304,7 @@ void drm_sched_entity_fini(struct
>     >> drm_sched_entity *entity)
>     >>       struct drm_gpu_scheduler *sched;
>     >>         sched = entity->rq->sched;
>     >> -    drm_sched_entity_set_rq(entity, NULL);
>     >> +    drm_sched_rq_remove_entity(entity->rq, entity);
>     >>         /* Consumption of existing IBs wasn't completed. Forcefully
>     >>        * remove them here.
>     >> @@ -416,15 +400,12 @@ void drm_sched_entity_set_rq(struct
>     >> drm_sched_entity *entity,
>     >>       if (entity->rq == rq)
>     >>           return;
>     >>   -    spin_lock(&entity->rq_lock);
>     >> -
>     >> -    if (entity->rq)
>     >> -        drm_sched_rq_remove_entity(entity->rq, entity);
>     >> +    BUG_ON(!rq);
>     >>   +    spin_lock(&entity->rq_lock);
>     >> +    drm_sched_rq_remove_entity(entity->rq, entity);
>     >>       entity->rq = rq;
>     >> -    if (rq)
>     >> -        drm_sched_rq_add_entity(rq, entity);
>     >> -
>     >> +    drm_sched_rq_add_entity(rq, entity);
>     >>       spin_unlock(&entity->rq_lock);
>     >>   }
>     >>   EXPORT_SYMBOL(drm_sched_entity_set_rq);
>     >
>
>
>
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20180802/e8b2fc4f/attachment-0001.html>


More information about the dri-devel mailing list