[systemd-devel] System stability when journald locks up

Lennart Poettering lennart at poettering.net
Tue May 29 13:29:10 PDT 2012


On Mon, 28.05.12 20:33, Marti Raudsepp (marti at juffo.org) wrote:

> Hi list,
> 
> Long story short, I believe there are two problems with journald:
> 1) journald gets stuck in an infinite loop, trying to send the message
> "Dropping message, as we can't find a place to store the data" to
> somewhere -- occurs in v44 and v183 (Arch Linux)

Hmm, that message should go to kmsg. Are you saying that, for you, kmsg
gets looped back to the journal and we don't detect that properly? I am
pretty sure I made sure that couldn't happen. I guess I need to have
another look at this.

> 2) A journald problem can effectively lock up the whole system. I
> agree that reliable logging is a worthwhile goal, but it shouldn't
> compromise the reliability of the whole system. Are there any plans to
> address this failure mode?
> I'm sure there are other ways journald can get stuck -- attaching
> a debugger or trying to write to a crashed hard drive or network file
> system for instance.

Hmm, so we already did a lot of work to make things less problematic if
the logger dies (for example IgnoreSIGPIPE=), but I guess there is more
to fix here. The journal is still very new. I think so far it is quite
stable, but there is definitely more work necessary to make it rock
solid in all corner cases.
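
To illustrate the failure mode that IgnoreSIGPIPE= is meant to soften,
here is a minimal Python sketch. It is not systemd code: the function
name write_to_dead_reader is made up for this example, and a plain pipe
with its read end closed stands in for a logger that has died. With
SIGPIPE at its default disposition the writing process is killed; with
SIGPIPE ignored the write merely fails with EPIPE and the process can
carry on.

```python
import os
import signal


def write_to_dead_reader(ignore_sigpipe: bool) -> str:
    """Fork a child that writes to a pipe with no reader; report how it ended.

    Hypothetical helper for illustration only. The closed read end plays
    the role of a logger process that has gone away.
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # nobody will ever read from this pipe again

    pid = os.fork()
    if pid == 0:
        # Child: choose the SIGPIPE disposition explicitly, since Python
        # itself ignores SIGPIPE by default at interpreter startup.
        disposition = signal.SIG_IGN if ignore_sigpipe else signal.SIG_DFL
        signal.signal(signal.SIGPIPE, disposition)
        try:
            os.write(write_fd, b"log line\n")
        except BrokenPipeError:
            # SIGPIPE ignored: write() fails with EPIPE, the process survives.
            os._exit(0)
        os._exit(1)  # not expected: the write should never succeed

    # Parent: collect the child's fate.
    os.close(write_fd)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    return "exited with code %d" % os.WEXITSTATUS(status)


if __name__ == "__main__":
    print("SIGPIPE ignored:", write_to_dead_reader(True))
    print("SIGPIPE default:", write_to_dead_reader(False))
```

Note that this only covers a logger that has died. A logger that is
alive but stuck never triggers EPIPE at all; the writer just blocks
once the buffer fills up, which is the harder case raised above.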

Lennart

-- 
Lennart Poettering - Red Hat, Inc.

