[systemd-devel] systemd issues related to watchdog

Lennart Poettering lennart at poettering.net
Wed Mar 21 20:49:13 UTC 2018


On Mi, 21.03.18 23:34, prashantkumar dhotre (prashantkumardhotre at gmail.com) wrote:

> Hi systemd-experts
> 
> I am seeing a few issues related to the watchdog in systemd version 230.
> Could you please help me with the queries below?
> 
> 
> 1) How do I test the hardware watchdog settings RuntimeWatchdogSec and
> ShutdownWatchdogSec?

See the various suggestions in the responses to:

https://lists.freedesktop.org/archives/systemd-devel/2018-February/040428.html

> 2) If I enable RuntimeWatchdogSec, should I also run watchdog.service, which
> runs /usr/sbin/watchdog?

No, you should not. There can only be one consumer of each
/dev/watchdog device, and if that's systemd then nobody else will get access.
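
To check which process currently holds the device, one option is to scan
the open file descriptors under /proc. A minimal sketch, assuming a Linux
/proc layout; the function name watchdog_holders is made up for
illustration, and it needs to run as root to see the descriptors of PID 1:

```shell
# Print "PID TARGET" for every process that has a /dev/watchdog* node open.
# The optional first argument overrides the proc root (default: /proc).
watchdog_holders() {
    proc="${1:-/proc}"
    for fd in "$proc"/[0-9]*/fd/*; do
        # Unreadable or vanished descriptors are skipped silently.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            /dev/watchdog*)
                pid=${fd#"$proc"/}      # e.g. "1/fd/42"
                echo "${pid%%/*} $target"
                ;;
        esac
    done
    return 0
}
```

Calling watchdog_holders without arguments scans the running system. If
RuntimeWatchdogSec= is in effect, I would expect PID 1 to show up here.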

> 3) What is the config to set, say, a 2 min timeout for shutdown/reboot?
> ShutdownWatchdogSec does not do that.

ShutdownWatchdogSec= applies to the last phase of shutdown only, i.e. to
the phase where all services are already shut down, and only the final
unmounting and killing of whatever remains is done.

To apply a timeout for the first phase of shutdown, use JobTimeoutSec=
and JobTimeoutAction= in shutdown.target or so. See systemd.unit(5)
for details.
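
As a rough sketch, such a timeout could be set with a drop-in for
shutdown.target. The helper name write_shutdown_timeout, the file name
10-timeout.conf and the 2min/reboot-force values are assumptions chosen
for illustration, not something systemd mandates:

```shell
# Write a drop-in that bounds the first shutdown phase.
# The optional first argument overrides the drop-in directory.
write_shutdown_timeout() {
    dir="${1:-/etc/systemd/system/shutdown.target.d}"
    mkdir -p "$dir" || return 1
    # JobTimeoutSec=/JobTimeoutAction= belong in the [Unit] section.
    cat > "$dir/10-timeout.conf" <<'EOF'
[Unit]
JobTimeoutSec=2min
JobTimeoutAction=reboot-force
EOF
}
```

Run as root, "write_shutdown_timeout && systemctl daemon-reload" should
make systemd pick up the drop-in.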

> 4) On the console of my device, when I reboot it, I sometimes see the
> message "553.001000] Uhhuh. NMI rece".
> 
> What does this indicate? Does it indicate expiry of the hw watchdog
> timer?

That's generated by the kernel, and is something the kernel folks
should be able to help you with.

> 5) How do I see the current setting of the hw watchdog timer?
> 
> wdctl is not working.

Hmm, yeah, it's an exclusive-use device. However, watchdog drivers
export their current settings in /sys, too:

grep . /sys/class/watchdog/*/*
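
If you only care about a few attributes per device, something like the
following might be easier to read. This is a sketch that assumes the
kernel exposes the watchdog sysfs attributes (identity, state, timeout,
timeleft); which ones exist depends on the kernel configuration and the
driver. The function name show_watchdogs is made up for illustration:

```shell
# Print selected sysfs attributes for every watchdog device.
# The optional first argument overrides the sysfs class directory.
show_watchdogs() {
    base="${1:-/sys/class/watchdog}"
    for dev in "$base"/watchdog*; do
        [ -d "$dev" ] || continue
        for attr in identity state timeout timeleft; do
            # Attributes the driver does not provide are skipped.
            if [ -r "$dev/$attr" ]; then
                printf '%s %s=%s\n' "${dev##*/}" "$attr" \
                    "$(cat "$dev/$attr" 2>/dev/null)"
            fi
        done
    done
    return 0
}

show_watchdogs
```

With more than one device listed, comparing the timeout values should
show which one is running with the 1 min 4 sec setting.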

> 5) reboot-force seems to be overwriting the hardware watchdog timeout
> value.

Yes, with the ShutdownWatchdogSec= setting mentioned above.

> I have changed reboot.target to set JobTimeoutSec=5sec.
> When the system boots up I see that the hardware watchdog is set to 1 min 4 sec,
> but when 'systemctl reboot' times out, reboot-force is invoked and that
> overwrites the hardware watchdog timeout value with 4 min.
> Is this a bug, or am I missing some config?
> 
> Note that i have not set ShutdownWatchdogSec= in
> /etc/systemd/system.conf

The default for ShutdownWatchdogSec= is 10min, and most likely your hw
can't do that, hence the next closest 4min is set instead.

> 6) RuntimeWatchdogSec does not seem to work.
> I have set it to 90 sec, but I see my system getting rebooted due to the
> hardware watchdog being triggered, as I see 'NMI received' on my console.
> This does not happen if I only have watchdog.service
> (running /usr/sbin/watchdog) and do not set RuntimeWatchdogSec.
> This observation indicates that RuntimeWatchdogSec does not seem to do
> what it is supposed to do.

Hmm, if you "strace -p 1", do you see the watchdog ping ioctls
happening?

IIUC you have two watchdog devices. systemd can only manage one. Is it
possible that the other might be causing this?

> 7) The hw watchdog timeout seems to be non-configurable and always
> hardcoded to 1 min 4 sec.
> Neither the RuntimeWatchdogSec nor the ShutdownWatchdogSec setting has
> any effect on the hw watchdog timeout.
> I see the hw watchdog NMI at reboot 1 min 4 sec after the reboot command,
> even if ShutdownWatchdogSec is configured to 10 min.
> Is this a systemd bug?

Similar here, is it possible that the other watchdog is causing this?

Lennart

-- 
Lennart Poettering, Red Hat

