[systemd-devel] BUG: several bugs in core/main.c (v218)

Wed Jan 28 17:57:15 PST 2015

On Tue, Jan 27, 2015 at 01:57:58AM +0100, Lennart Poettering wrote:
> On Mon, 26.01.15 23:45, Tomasz Pawlak (tomazzi at wp.pl) wrote:
> 
> > > Actually it *is* protected, see kill(2). Signals are ignored for PID 1
> > > unless it installed handlers for them. Nevertheless, we probably want to
> > > abort on SIGSEGV and similar and not continue, so we shouldn't ever run
> > > without the handlers installed.
> > 
> > Actually this is not what kill(2) says: it says that indeed, the
> > signals are not delivered to PID1 to prevent accidental
> > termination. This however *does not* mean that You are allowed to
> > ignore the signals, because by doing so You can run the process into
> > an undefined state.
> 
> We are not ignoring them, we just let the kernel deal with them for
> us, instead of doing so on our own. The kernel will OOPS when init
> dies abnormally.
Ah, OK. So the kernel simply kills PID1 on a fatal signal. I thought
that it would let the process continue, which sounds bizarre, now that
I think about it.

> And that's actually pretty OK behaviour in
> that case. Of course, it would be slightly better if only PID 1
> crashed rather than the whole system, as we can arrange with the crash
> handler, but either way you have to reboot anyway to get back to a
> working system.
Agreed. Nothing to fix here.

Zbyszek

> > > We shouldn't really ever fail to install the handlers, so this is a
> > > rather academic exercise. I guess we can add an assert_se() around
> > > it.
> >
> > This is a very bad and dangerous assumption, as sigaction may fail
> > for various reasons, like wrong/malformed args, internal kernel
> > problems or just random memory faults (which are very unlikely with
> > ECC RAM, but not so unlikely on consumer-grade hardware or on
> > embedded systems). No offense, but the discussion is indeed becoming
> > academic when You are trying to prove that it's not necessary to
> > check the return value from a call to an external function which has
> > defined error codes.
> > 
> > Systemd is not just another user space application - it is going
> > to be one of the most important parts of the system - so please -
> > such excuses should not even appear in this mailing list.
> 
> Well, we are pretty careful usually when it comes to checking return
> values. But this case is simply one of those cases where you can only
> choose between:
> 
> a) if you cannot install the crash handler, continue and let the
>    kernel do its normal crash handling.
> 
> b) if you cannot install the crash handler, abort immediately.
> 
> I fail to see how b) would be that much better than a) here. It just
> replaces one way to die with another way to die. Sure, our crash
> handling way to die is nicer than the kernel's own, but allowing the
> system to boot up in the first place is certainly even nicer!
> 
> Lennart
> 
> -- 
> Lennart Poettering, Red Hat
>