[PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
Christian König
ckoenig.leichtzumerken at gmail.com
Mon Apr 30 18:29:11 UTC 2018
Am 30.04.2018 um 18:10 schrieb Andrey Grodzovsky:
>
>
> On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
>> On 04/30, Andrey Grodzovsky wrote:
>>> What about changing PF_SIGNALED to PF_EXITING in
>>> drm_sched_entity_do_release
>>>
>>> - if ((current->flags & PF_SIGNALED) && current->exit_code ==
>>> SIGKILL)
>>> + if ((current->flags & PF_EXITING) && current->exit_code ==
>>> SIGKILL)
>> let me repeat, please don't use task->exit_code. And in fact this
>> check is racy
>>
>> But this doesn't matter. Say, we can trivially add
>> SIGNAL_GROUP_KILLED_BY_SIGKILL,
>> or do something else,
>
>
> Can you explain where is the race and what is a possible alternative
> then ?
The race is that the release doesn't necessarily comes from the
process/context which used the fd.
E.g. it is just called when the last reference count goes away, but that
can be anywhere not related to the original process using it, e.g. in a
kernel thread or a debugger etc...
The approach with the flush is indeed a really nice idea and I bite
myself to not had that previously as well.
Christian.
>
>> but I fail to understand what are you trying to do. Suppose
>> that the check above is correct in that it is true iff the task is
>> exiting and
>> it was killed by SIGKILL. What about the "else" branch which does
>>
>> r = wait_event_killable(sched->job_scheduled, ...)
>>
>> ?
>>
>> Once again, fatal_signal_pending() (or even signal_pending()) is not
>> well defined
>> after the exiting task passes exit_signals().
>>
>> So wait_event_killable() can fail because fatal_signal_pending() is
>> true; and this
>> can happen even if it was not killed.
>>
>> Or it can block and SIGKILL won't be able to wake it up.
>>
>>> If SIGINT was sent then it's SIGINT,
>> Yes, but see above. in this case fatal_signal_pending() will be
>> likely true so
>> wait_event_killable() will fail unless condition is already true.
>
> My bad, I didn't show the full intended fix, it was just a snippet to
> address the differentiation between exiting
> do to SIGKILL and any other exit, I also intended to change
> wait_event_killable to wait_event_timeout.
>
> Andrey
>
>>
>> Oleg.
>>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
More information about the amd-gfx
mailing list