[systemd-devel] is the watchdog useful?

Zbigniew Jędrzejewski-Szmek zbyszek at in.waw.pl
Thu Oct 31 18:04:52 UTC 2019


On Thu, Oct 31, 2019 at 06:30:33PM +0100, Lennart Poettering wrote:
> On Mo, 21.10.19 17:50, Zbigniew Jędrzejewski-Szmek (zbyszek at in.waw.pl) wrote:
> 
> > In principle, the watchdog for services is nice. But in practice it seems
> > to bring only grief. The Fedora bugtracker is full of automated reports of ABRTs,
> > and of those that were fired by the watchdog, pretty much 100% are bogus, in
> > the sense that the machine was resource starved and the watchdog fired.
> >
> > There are a few downsides to the watchdog killing the service:
> > 1. if it is something like logind, it is possible that it will cause user-visible
> > failure of other services
> > 2. restarting of the service causes additional load on the machine
> > 3. coredump handling causes additional load on the machine, quite significant
> > 4. those failures are reported in bugtrackers and waste everyone's time.
> >
> > I had the following ideas:
> > 1. disable coredumps for watchdog abrts: systemd could set some flag
> > on the unit or otherwise notify systemd-coredump about this, and it could just
> > log the occurrence but not dump the core file.
> > 2. generally disable watchdogs and make them opt in. We have 'systemd-analyze service-watchdogs',
> > and we could make the default configurable to "yes|no".
> >
> > What do you think?
> 
> Isn't this more a reason to substantially increase the watchdog
> interval by default? i.e. 30min if needed?

Yep, there was a proposal like that. I want to make it 1h in Fedora.
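
For illustration only, a per-unit override along those lines could be a
drop-in like the sketch below. The unit (systemd-logind.service) and the
file name watchdog.conf are just examples picked for this sketch, not the
actual Fedora change; WatchdogSec= is the standard [Service] setting.

```ini
# /etc/systemd/system/systemd-logind.service.d/watchdog.conf
# Hypothetical drop-in: unit and file name are illustrative assumptions.
[Service]
# Raise the service watchdog timeout to one hour.
WatchdogSec=1h
```

After adding such a file, "systemctl daemon-reload" followed by a restart
of the unit would be needed for the new interval to take effect.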

Zbyszek


