[systemd-devel] systemd39: journald segfault brings down some user services

Lennart Poettering lennart at poettering.net
Thu Feb 9 11:12:55 PST 2012


On Fri, 03.02.12 11:53, warpme (warpme at o2.pl) wrote:

> Hi,
> 
> I have a question about the latest systemd39 and the newly introduced journald.
> I'm on ArchLinux (kernel 3.0.18). Since upgrading from systemd 37 to 39 I have been observing decreased system stability.
> It manifests as random service outages.
> Over the last few days I have had 2 such cases.
> In both of them, careful log analysis suggests that the sequence of events was as follows:
> 
> 1. segfault in systemd-journald
> 2. it looks like one of the user processes received a restart event because
> of point 1

So the problem goes like this: if the journal dies, all processes which have
stdout/stderr connected to the journal will get a SIGPIPE or EPIPE on
the next write to it. SIGPIPE by default terminates the process, and
hence a huge chunk of the system goes down when the journal goes
away (for example D-Bus, which then recursively causes a lot of other
processes to die). I have now changed things in 41 so that by default
all services get SIGPIPE set to SIG_IGN, so that the SIGPIPE won't
happen anymore. (SIGPIPE in this case is exclusively useful in shell
pipelines, and daemons aren't shell pipelines, hence ignoring
SIGPIPE for all daemons is a good idea, we believe.)
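
As a rough illustration of the difference between the two dispositions
(this is not systemd code; the helper name run_child and the child script
are made up for this sketch), the following starts a child process that
writes to a pipe whose read end is already closed, once with SIGPIPE at
SIG_DFL and once at SIG_IGN. It assumes a POSIX system:

```python
import signal
import subprocess
import sys

# Hypothetical child program: write to a pipe whose read end is already
# closed, roughly what a service sees once the journal has gone away.
CHILD = """
import errno, os, signal, sys
signal.signal(signal.SIGPIPE, getattr(signal, sys.argv[1]))
r, w = os.pipe()
os.close(r)                      # the reader (the "journal") disappears
try:
    os.write(w, b"log line\\n")
except OSError as e:
    # Only reached when SIGPIPE is ignored: write() fails with EPIPE.
    sys.exit(0 if e.errno == errno.EPIPE else 2)
sys.exit(3)                      # not expected: the write succeeded
"""


def run_child(disposition: str) -> int:
    """Run the child with SIGPIPE set to 'SIG_DFL' or 'SIG_IGN'.

    Returns the child's exit status; a negative value means the child
    was killed by that signal number.
    """
    result = subprocess.run([sys.executable, "-c", CHILD, disposition])
    return result.returncode


if __name__ == "__main__":
    print("SIG_DFL:", run_child("SIG_DFL"))  # killed by SIGPIPE
    print("SIG_IGN:", run_child("SIG_IGN"))  # survives, sees EPIPE
```

With the default disposition the child is killed by the signal; with
SIG_IGN the write merely fails with EPIPE, which the process can handle
or ignore and keep running.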

Now, of course, the journal shouldn't crash in the first place. This bug
is still something to fix, but so far nobody has managed to get me a
backtrace of it. If the journal itself crashes, a coredump will be placed in
/var/lib/systemd/coredump/. It would be great if somebody could generate
a backtrace from that!

Lennart

-- 
Lennart Poettering - Red Hat, Inc.

