[systemd-devel] is the watchdog useful?

Vito Caputo vcaputo at pengaru.com
Thu Oct 24 21:56:55 UTC 2019


On Thu, Oct 24, 2019 at 10:45:32AM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> On Tue, Oct 22, 2019 at 04:35:13AM -0700, Vito Caputo wrote:
> > On Tue, Oct 22, 2019 at 10:51:49AM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > > On Tue, Oct 22, 2019 at 12:34:45PM +0200, Umut Tezduyar Lindskog wrote:
> > > > I am curious Zbigniew of how you find out if the coredump was on a starved
> > > > process?
> > > 
> > > A very common case is systemd-journald which gets SIGABRT when in a
> > > read() or write() or similar syscall. Another case is when
> > > systemd-udevd workers get ABRT when doing open() on a device.
> > > 
> > 
> > In the case of journald, is it really in read()/write() syscalls you're
> > seeing the SIGABRTs?
> 
> I was sloppy here — it's not read/write, but various other syscalls.
> In particular clone(), which makes sense, because it involves memory
> allocation.
> 

That's interesting; journald doesn't call clone() very often.  I
presume that's due to the offline thread creation?  Things could be
refactored a bit to create the offline thread once, at journal file
open, and keep it around.
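
Roughly what I have in mind, as an untested sketch and not journald's
actual code: every name below (OfflineWorker, offline_worker_start(),
and so on) is made up for illustration, and offline_stub() just stands
in for the real offlining work.  The thread is created once and parked
on a condition variable, so later offline requests would not need
clone() at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this only illustrates the pattern. */
typedef struct OfflineWorker {
        pthread_t thread;
        pthread_mutex_t lock;
        pthread_cond_t wake;     /* signalled on a new request or on shutdown */
        pthread_cond_t idle;     /* signalled whenever a request has finished */
        bool pending, busy, quit;
        unsigned completed;      /* number of work() invocations finished */
        void (*work)(void *userdata);
        void *userdata;
} OfflineWorker;

static void *offline_worker_main(void *arg) {
        OfflineWorker *w = arg;

        pthread_mutex_lock(&w->lock);
        for (;;) {
                while (!w->pending && !w->quit)
                        pthread_cond_wait(&w->wake, &w->lock);
                if (!w->pending)         /* quit requested, nothing left to do */
                        break;
                w->pending = false;
                w->busy = true;
                pthread_mutex_unlock(&w->lock);
                w->work(w->userdata);    /* the slow part runs without the lock */
                pthread_mutex_lock(&w->lock);
                w->busy = false;
                w->completed++;
                pthread_cond_broadcast(&w->idle);
        }
        pthread_mutex_unlock(&w->lock);
        return NULL;
}

/* Call once, e.g. at file open.  Returns 0 or an errno-style error code. */
static int offline_worker_start(OfflineWorker *w, void (*work)(void *), void *userdata) {
        int r;

        w->pending = w->busy = w->quit = false;
        w->completed = 0;
        w->work = work;
        w->userdata = userdata;
        pthread_mutex_init(&w->lock, NULL);
        pthread_cond_init(&w->wake, NULL);
        pthread_cond_init(&w->idle, NULL);
        r = pthread_create(&w->thread, NULL, offline_worker_main, w);
        if (r != 0) {
                pthread_cond_destroy(&w->idle);
                pthread_cond_destroy(&w->wake);
                pthread_mutex_destroy(&w->lock);
        }
        return r;
}

/* Ask the existing thread to run work(); requests made while one is
 * already pending are coalesced into a single run. */
static void offline_worker_request(OfflineWorker *w) {
        pthread_mutex_lock(&w->lock);
        assert(!w->quit);
        w->pending = true;
        pthread_cond_signal(&w->wake);
        pthread_mutex_unlock(&w->lock);
}

/* Block until no request is pending or in progress. */
static void offline_worker_wait(OfflineWorker *w) {
        pthread_mutex_lock(&w->lock);
        while (w->pending || w->busy)
                pthread_cond_wait(&w->idle, &w->lock);
        pthread_mutex_unlock(&w->lock);
}

/* Call once, e.g. at file close.  A still-pending request is run first. */
static void offline_worker_stop(OfflineWorker *w) {
        pthread_mutex_lock(&w->lock);
        w->quit = true;
        pthread_cond_signal(&w->wake);
        pthread_mutex_unlock(&w->lock);
        pthread_join(w->thread, NULL);
        pthread_cond_destroy(&w->idle);
        pthread_cond_destroy(&w->wake);
        pthread_mutex_destroy(&w->lock);
}

/* Stand-in for the real offlining work: just counts how often it ran. */
static void offline_stub(void *userdata) {
        (*(unsigned *) userdata)++;
}
```

The caller would do offline_worker_start() at open, then
offline_worker_request() wherever a thread is spawned today, and
offline_worker_stop() at close.  The cost is one idle thread per open
file, which may or may not be acceptable.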

I'd expect the mmap-related file-backed page faults to be more
problematic as it's most of what journald is doing and requires both
memory and IO.

Maybe clone() is getting delayed by another source of contention?  It
is a somewhat unusual syscall.

Regards,
Vito Caputo

