[systemd-devel] Hear opinions about changing watchdog timeout value during service running

Tue May 24 21:34:40 UTC 2016

Hi,

On Tuesday 2016-05-24 12:12:56 +0200, Lennart Poettering wrote:
> Also, we already have NotifyAccess= already, which I think is enough.

NotifyAccess= slipped my mind, so agreed.

However, watchdogs being a safety (vs security;) feature, I still feel
that an upper bound should be configurable. A process having this
feature enabled should still be able to guarantee(!) that it gets
restarted, no matter what(!) error condition in the service occurs.
That's the main purpose of a watchdog, and of a system-wide safety
concept that ultimately chains into a hardware watchdog and reset logic.

However, if there is no upper bound, an error in a service might result
in the service accidentally setting the WatchdogSec to e.g. -1, 0 or
just something very large. In the latter case the watchdog is
effectively turned off, and the safety promise is broken, even if that
is very unlikely ;).

Clamping the input to a lower bound (>0) and an upper bound fixes this
issue.
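
The bounding described above can be sketched as follows. This is only
an illustration, not systemd code: the function name, its parameters,
and the idea of a separately configured maximum are all assumptions
made for the example. A request of 0 or below falls back to the value
configured in the unit file, and anything above the maximum is capped.

```python
# Hypothetical sketch, not systemd's implementation. All names are
# assumptions made for illustration.
USEC_PER_SEC = 1_000_000


def clamp_watchdog_usec(requested_usec, configured_usec, max_usec):
    """Return the watchdog timeout to apply, in microseconds.

    requested_usec  -- value the service asked for at runtime
    configured_usec -- WatchdogSec= from the unit file (fallback)
    max_usec        -- configured upper bound (hypothetical setting)
    """
    if requested_usec <= 0:
        # Lower bound: 0 or a negative value would disable the
        # watchdog, so keep the configured timeout instead.
        return min(configured_usec, max_usec)
    # Upper bound: a huge value would effectively disable the
    # watchdog, so cap it at the configured maximum.
    return min(requested_usec, max_usec)


configured = 30 * USEC_PER_SEC
maximum = 60 * USEC_PER_SEC
for request in (-1, 0, 10 * USEC_PER_SEC, 2**63):
    print(request, "->", clamp_watchdog_usec(request, configured, maximum))
```

With such a bound in place, a misbehaving service can shorten or
moderately extend its timeout, but can never push it past the maximum.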

I don't mean to be rude by being this persistent. I work in an
industrial environment and safety is one of the primary objectives here,
so I take this very seriously. That's also why I wrote the patch for
user-slice watchdog support a few weeks ago (pull-request #3073
'implement sending WATCHDOG=1 notification messages from systemd
itself'). I still need to adapt it to your suggested changes, but that
may take some time as I have to do it in my spare time...

That said, I don't strongly object if you leave it as is; a seriously
safety-critical service then simply must not be allowed to use this
feature (by disabling NotifyAccess).

Best regards,

David