[PATCH 2/2] drm/scheduler: stop setting rq to NULL
Christian König
ckoenig.leichtzumerken at gmail.com
Mon Aug 6 08:14:53 UTC 2018
On 03.08.2018 at 20:41, Andrey Grodzovsky wrote:
>
>
>
> On 08/06/2018 08:44 AM, Christian König wrote:
>> On 03.08.2018 at 16:54, Andrey Grodzovsky wrote:
>>>
>>> [SNIP]
>>>
>>>>>
>>>>>>
>>>>>> >
>>>>>> > Second of all, even after we removed the entity from rq in
>>>>>> > drm_sched_entity_flush to terminate any subsequent submissions
>>>>>> >
>>>>>> > to the queue the other thread doing push job can just
>>>>>> acquire again a
>>>>>> > run queue
>>>>>> >
>>>>>> > from drm_sched_entity_get_free_sched and continue submission
>>>>>>
>>>>>> Hi Christian
>>>>>>
>>>>>> That is actually desired.
>>>>>>
>>>>>> When another process is now using the entity to submit jobs,
>>>>>> adding it back to the rq is actually the right thing to do because
>>>>>> the entity is still in use.
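
For reference, this is roughly the push job path we are talking about,
simplified and written from memory, so not the exact code; the
'reschedule' condition in particular is abbreviated here:

#include <drm/gpu_scheduler.h>

/*
 * Rough sketch, not the exact upstream drm_sched_entity_push_job().
 */
static void sketch_push_job(struct drm_sched_job *sched_job,
                            struct drm_sched_entity *entity)
{
        bool first = spsc_queue_count(&entity->job_queue) == 0;
        bool reschedule = first; /* the real check also looks at idleness etc. */

        if (reschedule) {
                /* Pick the least loaded run queue again; this can happen
                 * even after drm_sched_entity_flush() removed the entity
                 * from its old rq. */
                struct drm_sched_rq *rq =
                        drm_sched_entity_get_free_sched(entity);

                spin_lock(&entity->rq_lock);
                if (rq != entity->rq)
                        entity->rq = rq;
                spin_unlock(&entity->rq_lock);
        }

        first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
        if (first) {
                /* First job after the queue was empty: add the entity back
                 * to its run queue -- the "adding it back" step above. */
                spin_lock(&entity->rq_lock);
                drm_sched_rq_add_entity(entity->rq, entity);
                spin_unlock(&entity->rq_lock);
        }
}
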
>>>>>>
>>>>>
>>>>> Yes, no problem if it's another process. But what about another
>>>>> thread from the same process? Is it a possible use case that two
>>>>> threads from the same process submit to the same entity
>>>>> concurrently? If so, and we specifically kill one, the other will
>>>>> not stop even if we want it to, because the current code just makes
>>>>> it acquire an rq for itself.
>>>>
>>>> Well, you can't kill a single thread of a process (you can only
>>>> interrupt it); a SIGKILL always kills the whole process.
>>>
>>> Is the following scenario possible and acceptable?
>>> Two threads from the same process work on the same queue. Thread A is
>>> currently in drm_sched_entity_flush->wait_event_timeout (the process
>>> is getting shut down because of a SIGKILL sent by the user), while
>>> thread B is still inside drm_sched_entity_push_job, before 'if
>>> (reschedule)'. A stops waiting because the queue became empty and
>>> then removes the entity's queue from the scheduler's run queue, while
>>> B takes the 'reschedule' branch because it evaluates to true ('first'
>>> is true and so are all the other conditions), acquires a new rq,
>>> later adds the entity back to a scheduler (maybe a different one) and
>>> keeps submitting jobs as much as it likes, and can then be stuck for
>>> up to 'timeout' in its own drm_sched_entity_flush waiting for them.
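
If I read that scenario correctly, the interleaving would look roughly
like this (simplified):

  Thread A (got SIGKILL, flushing)       Thread B (pushing jobs)
  --------------------------------       --------------------------------
  drm_sched_entity_flush()
    wait_event_timeout(...) returns      drm_sched_entity_push_job()
      because job_queue became empty       'reschedule' evaluates to true
    drm_sched_rq_remove_entity()           drm_sched_entity_get_free_sched()
                                           entity->rq = <new rq>
                                           spsc_queue_push(), first == true
                                           drm_sched_rq_add_entity()
                                           ... keeps submitting jobs ...
  B's own flush may then wait up to 'timeout' for those jobs.
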
>>
>> I'm not 100% sure, but I don't think that can happen.
>>
>> See, flushing the fd is done while dropping the fd, which happens only
>> after all threads of the process in question have been killed.
>
> Yeah, this fd handling is indeed a lot of gray area for me, but as far
> as I remember flushing is done for each thread when it exits (possibly
> due to a signal).
> Now, signal interception and processing (as a result of which .flush
> will get called if a SIGKILL was received) is done at certain points,
> one of which is when returning from an ioctl.
> So if the first thread was at the very end of the CS ioctl when the
> SIGKILL was received while the other one was at the beginning, then I
> think we might see something like the scenario above.
SIGKILL isn't processed as long as any thread of the application is
still inside the kernel. That's why we have wait_event_killable().
So I don't think that the scenario above is possible, but I'm really not
100% sure either.
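
For context, here is roughly what the flush side does; again a
simplified sketch from memory, not the exact upstream
drm_sched_entity_flush():

static long sketch_entity_flush(struct drm_sched_entity *entity, long timeout)
{
        struct drm_gpu_scheduler *sched = entity->rq->sched;
        long ret = timeout;

        if (current->flags & PF_EXITING) {
                /* The process is already exiting: give the queued jobs a
                 * bounded amount of time to drain. */
                if (timeout)
                        ret = wait_event_timeout(sched->job_scheduled,
                                        !spsc_queue_count(&entity->job_queue),
                                        timeout);
        } else {
                /* Normal close: wait until the entity is idle, but let a
                 * SIGKILL abort the wait -- the wait_event_killable()
                 * mentioned above. */
                wait_event_killable(sched->job_scheduled,
                                    !spsc_queue_count(&entity->job_queue));
        }

        /* The step under discussion: take the entity off its run queue so
         * that nothing further gets scheduled for it. */
        spin_lock(&entity->rq_lock);
        drm_sched_rq_remove_entity(entity->rq, entity);
        spin_unlock(&entity->rq_lock);

        return ret;
}
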
Christian.
>
> Andrey
>
>>
>> Otherwise the flushing wouldn't make too much sense. In other words,
>> imagine an application where a thread does a write() on an fd and is
>> then killed.
>>
>> The idea of the flush is to preserve the data, and that won't work if
>> this isn't correctly ordered.
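
As an illustration of that ordering, this is roughly how the flush is
wired up on the driver side; helper and field names here are from
memory and may not match the driver exactly:

/* Simplified sketch of the fops .flush callback on fd drop. */
static int sketch_driver_flush(struct file *f, fl_owner_t id)
{
        struct drm_file *file_priv = f->private_data;
        struct amdgpu_fpriv *fpriv = file_priv->driver_priv;

        /* Called when the fd is dropped: wait for the jobs this process
         * already pushed, so the work it "wrote" is preserved just like a
         * flushed write(). */
        amdgpu_ctx_mgr_entity_flush(&fpriv->ctx_mgr);
        return 0;
}

static const struct file_operations sketch_driver_fops = {
        .owner = THIS_MODULE,
        .flush = sketch_driver_flush,
        /* open/release/unlocked_ioctl/... omitted */
};
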
>>
>>> My understanding was that entity->last was introduced to force
>>> immediate termination of job submissions by any thread from the
>>> terminating process.
>>
>> We could consider reordering that once more. Going to play out all
>> scenarios in my head over the weekend :)
>>
>> Christian.
>>
>>>
>>> Andrey
>>
>
>
>