[systemd-devel] is the watchdog useful?

Lennart Poettering lennart at poettering.net
Thu Oct 31 17:30:33 UTC 2019


On Mo, 21.10.19 17:50, Zbigniew Jędrzejewski-Szmek (zbyszek at in.waw.pl) wrote:

> In principle, the watchdog for services is nice. But in practice it seems
> to bring only grief. The Fedora bugtracker is full of automated reports of ABRTs,
> and of those that were fired by the watchdog, pretty much 100% are bogus, in
> the sense that the machine was resource starved and the watchdog fired.
>
> There are a few downsides to the watchdog killing the service:
> 1. if it is something like logind, it is possible that it will cause user-visible
> failure of other services
> 2. restarting of the service causes additional load on the machine
> 3. coredump handling causes additional load on the machine, quite significant
> 4. those failures are reported in bugtrackers and waste everyone's time.
>
> I had the following ideas:
> 1. disable coredumps for watchdog abrts: systemd could set some flag
> on the unit or otherwise notify systemd-coredump about this, and it could just
> log the occurrence but not dump the core file.
> 2. generally disable watchdogs and make them opt in. We have 'systemd-analyze service-watchdogs',
> and we could make the default configurable to "yes|no".
>
> What do you think?

Isn't this more a reason to substantially increase the watchdog
interval by default? i.e. 30min if needed?
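
To make the interval concrete, here is a minimal sketch of the service
side of the watchdog protocol. It assumes the service was started by
systemd with WatchdogSec= set, so that WATCHDOG_USEC and NOTIFY_SOCKET
are present in its environment. The helper names keepalive_interval and
notify are made up for illustration and are not systemd APIs, and the
WATCHDOG_PID check is left out for brevity.

```python
import os
import socket


def keepalive_interval(environ=os.environ):
    """Return seconds between WATCHDOG=1 pings, or None if no watchdog is set."""
    usec = environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    # Ping at half the configured timeout, as sd_watchdog_enabled(3) recommends.
    return int(usec) / 1_000_000 / 2


def notify(message, environ=os.environ):
    """Send one datagram to the manager's notification socket, if there is one."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a socket in the Linux abstract namespace.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), path)
    return True
```

A service would call notify("WATCHDOG=1") from its main loop every
keepalive_interval() seconds. With WatchdogSec=30min that is one ping
every 15 minutes, so a resource-starved machine has far more slack
before the watchdog fires.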

Lennart

--
Lennart Poettering, Berlin

