[PATCH 2/2] drm/scheduler: stop setting rq to NULL
Christian König
christian.koenig at amd.com
Fri Aug 3 14:06:40 UTC 2018
On 02.08.2018 at 16:11, Andrey Grodzovsky wrote:
>
> On 08/02/2018 02:47 AM, Nayan Deshmukh wrote:
>>
>>
>> On Thu, Aug 2, 2018 at 12:12 PM Christian König
>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>
>> On 02.08.2018 at 00:25, Andrey Grodzovsky wrote:
>> > Thinking again about this change and 53d5f1e "drm/scheduler: move
>> > idle entities to scheduler with less load v2", it looks to me like
>> > the use case for which fc9a539 "drm/scheduler: add NULL pointer
>> > check for run queue (v2)" was done will no longer work.
>> >
>> > First of all, in drm_sched_entity_push_job, 'if (!entity->rq)' will
>> > never be true any more since we stopped setting entity->rq to NULL
>> > in drm_sched_entity_flush.
>>
>> Good point, going to remove that.
>>
>
> So basically we don't need that 'if (...) { return; }' in
> drm_sched_entity_push_job any more?
Yes, exactly.
>
>>
>> >
>> > Second of all, even though we removed the entity from the rq in
>> > drm_sched_entity_flush to terminate any subsequent submissions to
>> > the queue, another thread doing push_job can just acquire a run
>> > queue again from drm_sched_entity_get_free_sched and continue
>> > submitting.
>>
>> Hi Christian
>>
>> That is actually desired.
>>
>> When another process is now using the entity to submit jobs, adding
>> it back to the rq is actually the right thing to do, because the
>> entity is still in use.
>>
>
> Yes, no problem if it's another process. But what about another thread
> from the same process? Is it a possible use case that two threads from
> the same process submit to the same entity concurrently? If so, and we
> specifically kill one, the other will not stop even if we want it to,
> because the current code just makes it acquire a rq for itself.
Well, you can't kill a single thread of a process (you can only
interrupt it); a SIGKILL will always kill the whole process.
>
>> I am not aware of the use case, so I might just be rambling, but if
>> the thread/process that created the entity has called
>> drm_sched_entity_flush, then we shouldn't allow other processes to
>> push jobs to that entity.
>>
>> Nayan
>>
>>
>> Christian.
>>
> We don't really know who created the entity; the creation could be by
> X itself, with the entity then passed to another process for use. See
> 'drm/scheduler: only kill entity if last user is killed v2': the idea
> is that if, by the time you want to kill this entity, another process
> (not another thread from your process, i.e. current->group_leader !=
> last_user in drm_sched_entity_flush) has already started to use it,
> just let it be.
Yes, exactly that's the idea.
Christian.
>
> Andrey
>>
>>
>> > so you can't just substitute 'if (!entity->rq)' with 'if
>> > (list_empty(&entity->list))'.
>> >
>> > What about adding a 'stopped' flag to drm_sched_entity, set in
>> > drm_sched_entity_flush, and having drm_sched_entity_push_job bail
>> > out on 'if (entity->stopped)' instead of 'if (!entity->rq)'?
>> >
>> > Andrey
>> >
>> >
>> > On 07/30/2018 07:03 AM, Christian König wrote:
>> >> We removed the redundancy of having an extra scheduler field, so we
>> >> can't set the rq to NULL any more, or we otherwise won't know which
>> >> scheduler to use for the cleanup.
>> >>
>> >> Just remove the entity from the scheduling list instead.
>> >>
>> >> Signed-off-by: Christian König <christian.koenig at amd.com>
>> >> ---
>> >>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 35 +++++++------------------
>> >>  1 file changed, 8 insertions(+), 27 deletions(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> >> index f563e4fbb4b6..1b733229201e 100644
>> >> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> >> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> >> @@ -198,21 +198,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>> >>  }
>> >>  EXPORT_SYMBOL(drm_sched_entity_init);
>> >> -/**
>> >> - * drm_sched_entity_is_initialized - Query if entity is initialized
>> >> - *
>> >> - * @sched: Pointer to scheduler instance
>> >> - * @entity: The pointer to a valid scheduler entity
>> >> - *
>> >> - * return true if entity is initialized, false otherwise
>> >> -*/
>> >> -static bool drm_sched_entity_is_initialized(struct drm_gpu_scheduler *sched,
>> >> -					    struct drm_sched_entity *entity)
>> >> -{
>> >> -	return entity->rq != NULL &&
>> >> -		entity->rq->sched == sched;
>> >> -}
>> >> -
>> >>  /**
>> >>   * drm_sched_entity_is_idle - Check if entity is idle
>> >>   *
>> >> @@ -224,7 +209,8 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
>> >>  {
>> >>  	rmb();
>> >> -	if (!entity->rq || spsc_queue_peek(&entity->job_queue) == NULL)
>> >> +	if (list_empty(&entity->list) ||
>> >> +	    spsc_queue_peek(&entity->job_queue) == NULL)
>> >>  		return true;
>> >>  	return false;
>> >> @@ -279,8 +265,6 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
>> >>  	long ret = timeout;
>> >>  	sched = entity->rq->sched;
>> >> -	if (!drm_sched_entity_is_initialized(sched, entity))
>> >> -		return ret;
>> >>  	/**
>> >>  	 * The client will not queue more IBs during this fini, consume existing
>> >>  	 * queued IBs or discard them on SIGKILL
>> >> @@ -299,7 +283,7 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
>> >>  	last_user = cmpxchg(&entity->last_user, current->group_leader, NULL);
>> >>  	if ((!last_user || last_user == current->group_leader) &&
>> >>  	    (current->flags & PF_EXITING) && (current->exit_code == SIGKILL))
>> >> -		drm_sched_entity_set_rq(entity, NULL);
>> >> +		drm_sched_rq_remove_entity(entity->rq, entity);
>> >>  	return ret;
>> >>  }
>> >> @@ -320,7 +304,7 @@ void drm_sched_entity_fini(struct drm_sched_entity *entity)
>> >>  	struct drm_gpu_scheduler *sched;
>> >>  	sched = entity->rq->sched;
>> >> -	drm_sched_entity_set_rq(entity, NULL);
>> >> +	drm_sched_rq_remove_entity(entity->rq, entity);
>> >>  	/* Consumption of existing IBs wasn't completed. Forcefully
>> >>  	 * remove them here.
>> >> @@ -416,15 +400,12 @@ void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
>> >>  	if (entity->rq == rq)
>> >>  		return;
>> >> -	spin_lock(&entity->rq_lock);
>> >> -
>> >> -	if (entity->rq)
>> >> -		drm_sched_rq_remove_entity(entity->rq, entity);
>> >> +	BUG_ON(!rq);
>> >> +	spin_lock(&entity->rq_lock);
>> >> +	drm_sched_rq_remove_entity(entity->rq, entity);
>> >>  	entity->rq = rq;
>> >> -	if (rq)
>> >> -		drm_sched_rq_add_entity(rq, entity);
>> >> -
>> >> +	drm_sched_rq_add_entity(rq, entity);
>> >>  	spin_unlock(&entity->rq_lock);
>> >>  }
>> >>  EXPORT_SYMBOL(drm_sched_entity_set_rq);
>> >
>>
>