[systemd-devel] BUG: several bugs in core/main.c (v218)

Tomasz Pawlak tomazzi at wp.pl
Mon Jan 26 14:45:23 PST 2015


Dnia Poniedziałek, 26 Stycznia 2015 07:58 Zbigniew Jędrzejewski-Szmek <zbyszek at in.waw.pl> napisał(a) 
> On Sun, Jan 25, 2015 at 03:37:09AM +0100, Tomasz Pawlak wrote:
> > core/main.c:1519
> >       /* Make sure we leave a core dump without panicing the
> >        * kernel. */
> >       if (getpid() == 1) {
> >               install_crash_handler();
> > 
> >               r = mount_cgroup_controllers(arg_join_controllers);
> >               if (r < 0)
> >                       goto finish;
> >       }
> > 
> > core/main.c:226
> > static void install_crash_handler(void) {
> >       struct sigaction sa = {
> >               .sa_handler = crash,
> >               .sa_flags = SA_NODEFER,
> >       };
> > 
> >       sigaction_many(&sa, SIGNALS_CRASH_HANDLER, -1);
> > }
> > 
> > /shared/util.c:2207
> > int sigaction_many(const struct sigaction *sa, ...) {
> >       va_list ap;
> >       int r = 0, sig;
> > 
> >       va_start(ap, sa);
> >       while ((sig = va_arg(ap, int)) > 0)
> >               if (sigaction(sig, sa, NULL) < 0)
> >                       r = -errno;
> >       va_end(ap);
> > 
> >       return r;
> > }
> > 
> > shared/def.h:40
> > #define SIGNALS_CRASH_HANDLER SIGSEGV,SIGILL,SIGFPE,SIGBUS,SIGQUIT,SIGABRT
> > 
> > 
> > BUGS:
> > 1. Ignoring return value from sigaction_many(): all sig handlers can silently fail to install, thus leaving the whole process unprotected
> 
> Actually it *is* protected, see kill(2). Signals are ignored for PID 1
> unless it installed handlers for them. Nevertheless, we probably want to
> abort on SIGSEGV and similar and not continue, so we shouldn't ever run
> without the handlers installed.

Actually this is not what kill(2) says: it says that indeed, the signals are not delivered to PID1 to prevent accidential termination. This however *does not* mean that You are allowed to ignore the signals, because by doing so You can run the process into undefined state.

> 
> We shouldn't really ever fail to install the handlers, so this is a
> rather academic exercise. I guess we can add an assert_se() around it.
> 
This very bad and dangerus assumption, as the sigaction may fail due to various reasons, like wrong/malformed args, internal kernel problems or just random memory faults (which are very unlikely in ECC RAM, but not so unlikely on customer grade hardware or on embedded systems).
No ofense, but the discussion is indeed becoming academic when You are trying to prove that it's not necessary to check return value from a call to external function which has defined error codes.

Systemd is is not just another user space application - it is going to be one of the most important parts of the system - so please - such excuses should not even appear in this mailing list.

> > 2. SIGSEGV should be catched by a handler running on a separate stack (SA_ONSTACK) - otherwise it can cause segfault itself, when the first SIGSEGV which triggered the handler is caused by stack overflow.
> 
> That's a good point.
> 
> > 
> > 3. SA_NODEFER makes no sense, since the handler is expected to catch only first critical signal. With SA_NODEFER nesting of signals is possible, what can cause unpredictable results, including uncatched stack overflow caused by the handler itself.
> 
> Your analysis sounds correct here too.
> 
> A patch would be welcome.
> 
> Zbyszek

Regarding a patch: 
By accident (I think) my previous e-mail was not attached to this thread, You can find it here:
http://lists.freedesktop.org/archives/systemd-devel/2015-January/027395.html
I've desribed one of possible solutions there.

I'm writting an alternative, systemd-compatible solution (it's not a fork in the general sense, because the code is written from ground-up, only the interfaces are the same). The reason for this is that I can see many problems in how the exceptions are handled in this project, and I'm going to use a completely experimental error handling system, which is based on so-called c-exceptions (context switching, with the ability to partially restart the process without termination). It would need a 
lot of work to write a "patch" for systemd and of course a lot of tests.

The bugs I've reported here are just result of my analysis of systemd code (of course, I have to read every single line of code to understand how to implement ifaces) 

Regards.

Tomek




More information about the systemd-devel mailing list