[systemd-devel] BUG: several bugs in core/main.c (v218)

Mon Jan 26 16:57:58 PST 2015

On Mon, 26.01.15 23:45, Tomasz Pawlak (tomazzi at wp.pl) wrote:

> > Actually it *is* protected, see kill(2). Signals are ignored for PID 1
> > unless it installed handlers for them. Nevertheless, we probably want to
> > abort on SIGSEGV and similar and not continue, so we shouldn't ever run
> > without the handlers installed.
> 
> Actually this is not what kill(2) says: it says that indeed, the
> signals are not delivered to PID1 to prevent accidental
> termination. This however *does not* mean that You are allowed to
> ignore the signals, because by doing so You can run the process into
> an undefined state.

We are not ignoring them, we just let the kernel deal with them for
us, instead of doing so on our own. The kernel will OOPS when init
dies abnormally. And that's actually pretty OK behaviour in that
case. Of course, it would be slightly better if only PID 1 crashed
rather than the whole system, which is what our crash handler
arranges, but either way you have to reboot to get back to a
working system.

> > We shouldn't really ever fail to install the handlers, so this is a
> > rather academic exercise. I guess we can add an assert_se() around
> > it.
>
> This is a very bad and dangerous assumption, as the sigaction may
> fail due to various reasons, like wrong/malformed args, internal
> kernel problems or just random memory faults (which are very
> unlikely in ECC RAM, but not so unlikely on consumer-grade hardware
> or on embedded systems).  No offense, but the discussion is indeed
> becoming academic when You are trying to prove that it's not
> necessary to check the return value from a call to an external
> function which has defined error codes.
> 
> Systemd is not just another user space application - it is going
> to be one of the most important parts of the system - so please -
> such excuses should not even appear in this mailing list.

Well, we are usually pretty careful when it comes to checking return
values. But this is simply one of those cases where you can only
choose between:

a) if you cannot install the crash handler, continue and let the
   kernel do its normal crash handling.

b) if you cannot install the crash handler, abort immediately.

I fail to see how b) would be that much better than a) here. It just
replaces one way to die with another way to die. Sure, our crash
handling way to die is nicer than the kernel's own, but allowing the
system to boot up in the first place is certainly even nicer!
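
To make the two options concrete, here is a minimal sketch. It is not
the actual code in core/main.c: crash_handler(),
install_crash_handlers() and setup_crash_handling() are hypothetical
names, and the handler body is only a placeholder. It checks the
return value of sigaction(2) and then applies either policy a) or
policy b):

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical placeholder handler. Only async-signal-safe calls
 * such as _exit() may be used inside a signal handler. */
void crash_handler(int sig) {
        (void) sig;
        _exit(EXIT_FAILURE);
}

/* Try to install `handler` for every signal in `sigs`.
 * Returns 0 if all calls succeeded, otherwise the negated errno
 * of the first sigaction() call that failed. */
int install_crash_handlers(const int *sigs, size_t n, void (*handler)(int)) {
        struct sigaction sa;
        int first_error = 0;

        assert(sigs != NULL);

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = handler;
        sigemptyset(&sa.sa_mask);

        for (size_t i = 0; i < n; i++)
                if (sigaction(sigs[i], &sa, NULL) < 0 && first_error == 0)
                        first_error = -errno;

        return first_error;
}

/* Policy a): abort_on_failure == false, log the error and carry on,
 *            leaving crash handling to the kernel.
 * Policy b): abort_on_failure == true, give up immediately.
 * Returns true if the handlers are in place. */
bool setup_crash_handling(const int *sigs, size_t n, bool abort_on_failure) {
        int r = install_crash_handlers(sigs, n, crash_handler);

        if (r < 0) {
                fprintf(stderr, "Failed to install crash handler: %s\n",
                        strerror(-r));
                if (abort_on_failure)
                        abort();
                return false;
        }

        return true;
}
```

A caller would pass something like { SIGSEGV, SIGILL, SIGFPE, SIGBUS,
SIGABRT } as the signal list. With abort_on_failure set to false the
failure is still reported, so the return value is not silently
dropped, but boot-up continues.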

Lennart

-- 
Lennart Poettering, Red Hat