[Intel-gfx] [PATCH igt] core/sighelper: Interrupt everyone in the process group
Chris Wilson
chris at chris-wilson.co.uk
Mon Jan 11 05:41:03 PST 2016
On Mon, Jan 11, 2016 at 01:29:12PM +0000, Dave Gordon wrote:
> On 11/01/16 12:34, Chris Wilson wrote:
> >On Mon, Jan 11, 2016 at 12:25:07PM +0000, Dave Gordon wrote:
> >>On 11/01/16 09:06, Daniel Vetter wrote:
> >>>On Mon, Jan 11, 2016 at 08:54:59AM +0000, Chris Wilson wrote:
> >>>>On Mon, Jan 11, 2016 at 08:57:33AM +0100, Daniel Vetter wrote:
> >>>>>On Fri, Jan 08, 2016 at 08:44:29AM +0000, Chris Wilson wrote:
> >>>>>>Some stress tests create both the signal helper and a lot of competing
> >>>>>>processes. In these tests, the parent is just waiting upon the children,
> >>>>>>and the intention is not to keep waking up the waiting parent, but to
> >>>>>>keep interrupting the children (as we hope to trigger races in our
> >>>>>>kernel code). kill(-pid) sends the signal to all members of the process
> >>>>>>group, not just the target pid.
> >>>>>
> >>>>>I don't really have any clue about unix pgroups, but the -pid disappeared
> >>>>>compared to the previous version.
> >>>>
> >>>>-getppid().
> >>>>
> >>>>I felt it was clearer to pass along the "negative pid = process group"
> >>>>after setting up the process group.
> >>>
> >>>Oh, I was blind ... Yeah looks better, but please add a bigger comment
> >>>around that code explaining why we need a group and why we use SIG_CONT.
> >>>With that acked-by: me.
> >>>
> >>>Cheers, Daniel
> >>>
> >>>>>>We also switch from using SIGUSR1 to SIGCONT to paper over a race
> >>>>>>condition when forking children that saw the default signal action being
> >>>>>>run (and thus killing the child).
> >>>>>
> >>>>>I thought I fixed that race by first installing the new signal handler,
> >>>>>then forking. Ok, rechecked and it's the SYS_getpid stuff, so another
> >>>>>race. Still I thought signal handlers would survive a fork?
> >>>>
> >>>>So did irc. They didn't appear to as the children would sporadically
> >>>>die with SIGUSR1.
> >>>
> >>>Could be that libc is doing something funny, iirc they have piles of fork
> >>>helpers to make fork more reliable (breaking locks and stuff like that),
> >>>but then in turn break the abstraction.
> >>>-Daniel
> >>
> >>You could use killpg(pgrp, sig) rather than kill(), just to make it
> >>clearer that the target is a process group, rather than people
> >>having to know about the "negative pid" semantics.
> >>
> >>I don't think SIGCHLD is a good idea; it has kernel-defined
> >>semantics beyond just sending a signal. And it may not be delivered
> >>at all, if the disposition is not "caught". SIGUSR1 was the right
> >>thing, really; so it would be better to work out how to make that
> >>work properly, rather than change to a different one.
> >
> >SIGCONT not SIGCHLD. And the deposition is supposed to be fully under
> >our control any way.
>
> Oops, yes, I meant SIGCONT has kernel-defined semantics, etc ...
>
> Catching SIGCONT is ... unusual. Because you can't catch SIGSTOP,
> you don't normally have any reason to catch SIGCONT.
>
> Actually, SIGCONT is even more bizarre than SIGCHLD as sending a
> SIGCONT to a process can result in a SIGCHLD being sent to its
> parent.
Really? That's a nuiscance but nothing more. I'm only trying to paper
over a bug here :)
> >>Signal handlers are (supposed to be) inherited across fork(); signal
> >>disposition is also inherited, and the set of pending signals of a
> >>new process is (supposed to be) empty. OTOH a signal can be
> >>delivered to the child before it returns from the fork(), which may
> >>be a bit surprising.
> >>
> >>I think the safest way to avoid unexpected signals around a fork() is:
> >>
> >>parent calls sigprocmask() to block all interesting signals
> >>parent calls fork() --> child inherits mask
> >>parent calls sigprocmask() to restore the previous mask
> >
> >I tried that.
> >-Chris
>
> Are we using signal(2) to install the handlers? 'Cos that's archaic
> and has known unfixable race conditions. The Linux kernel supplies
> SysV signal semantics by default, which means the disposition gets
> reset before the handler is called, so a double signal kills the
> program. The glibc signal(3) wrapper provides BSD semantics which
> are slightly less problematic; but libc5 signal(3) implements SysV.
We are using both, but in for the sighelper interrupt we are using
signal - but a long time before we fork the test children.
Worth a shot as much as anything else...
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list