[Intel-gfx] [PATCH igt] core/sighelper: Interrupt everyone in the process group

Mon Jan 11 04:25:07 PST 2016

On 11/01/16 09:06, Daniel Vetter wrote:
> On Mon, Jan 11, 2016 at 08:54:59AM +0000, Chris Wilson wrote:
>> On Mon, Jan 11, 2016 at 08:57:33AM +0100, Daniel Vetter wrote:
>>> On Fri, Jan 08, 2016 at 08:44:29AM +0000, Chris Wilson wrote:
>>>> Some stress tests create both the signal helper and a lot of competing
>>>> processes. In these tests, the parent is just waiting upon the children,
>>>> and the intention is not to keep waking up the waiting parent, but to
>>>> keep interrupting the children (as we hope to trigger races in our
>>>> kernel code). kill(-pid) sends the signal to all members of the process
>>>> group, not just the target pid.
>>>
>>> I don't really have any clue about unix pgroups, but the -pid disappeared
>>> compared to the previous version.
>>
>> -getppid().
>>
>> I felt it was clearer to pass along the "negative pid = process group"
>> after setting up the process group.
>
> Oh, I was blind ... Yeah looks better, but please add a bigger comment
> around that code explaining why we need a group and why we use SIG_CONT.
> With that acked-by: me.
>
> Cheers, Daniel
>
>>>> We also switch from using SIGUSR1 to SIGCONT to paper over a race
>>>> condition when forking children that saw the default signal action being
>>>> run (and thus killing the child).
>>>
>>> I thought I fixed that race by first installing the new signal handler,
>>> then forking. Ok, rechecked and it's the SYS_getpid stuff, so another
>>> race. Still I thought signal handlers would survive a fork?
>>
>> So did irc. They didn't appear to as the children would sporadically
>> die with SIGUSR1.
>
> Could be that libc is doing something funny, iirc they have piles of fork
> helpers to make fork more reliable (breaking locks and stuff like that),
> but then in turn break the abstraction.
> -Daniel

You could use killpg(pgrp, sig) rather than kill(), just to make it 
clearer that the target is a process group, rather than people having to 
know about the "negative pid" semantics.

I don't think SIGCHLD is a good idea; it has kernel-defined semantics 
beyond just sending a signal. And it may not be delivered at all, if the 
disposition is not "caught". SIGUSR1 was the right thing, really; so it 
would be better to work out how to make that work properly, rather than 
change to a different one.

Signal handlers are (supposed to be) inherited across fork(); signal 
disposition is also inherited, and the set of pending signals of a new 
process is (supposed to be) empty. OTOH a signal can be delivered to the 
child before it returns from the fork(), which may be a bit surprising.

I think the safest way to avoid unexpected signals around a fork() is:

parent calls sigprocmask() to block all interesting signals
parent calls fork() --> child inherits mask
parent calls sigprocmask() to restore the previous mask

	child updates handlers if required
	child calls sigprocmask() to unblock signals

.Dave.