[systemd-devel] Health check for a service managed by systemd

Fri Jul 26 13:45:01 UTC 2019

On Fri, Jul 26, 2019 at 4:37 PM Debraj Manna <subharaj.manna at gmail.com>
wrote:

> Can we make use of the watchdog & systemd-notify functionality of
> systemd? I mean something like this.
>
> [Unit]
> Description=Test service
> After=network.target
>
> [Service]
> Type=notify
> # test.sh wrapper script to call the service
> ExecStart=/opt/test/test.sh
> Restart=always
> RestartSec=1
> TimeoutSec=5
> WatchdogSec=5
>
> [Install]
> WantedBy=multi-user.target
>
> Then in test.sh can we do something like
>
> #!/bin/bash
> trap 'kill $(jobs -p)' EXIT
>
> # Start the actual service
> /opt/test/service &
> PID=$!
>
> /bin/systemd-notify --ready
> while(true); do
>     FAIL=0
>     kill -0 $PID
>     if [[ $? -ne 0 ]]; then FAIL=1; fi
>
> #    curl http://localhost/test/
> #    if [[ $? -ne 0 ]]; then FAIL=1; fi
>
>     if [[ $FAIL -eq 0 ]]; then /bin/systemd-notify WATCHDOG=1; fi
>
>     sleep 1
> done
>

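That doesn't look nice. It might technically work, but it isn't any better
than a standalone periodic check script. On top of that, the script calls
--ready without knowing whether the service is actually ready.
/bin/systemd-notify as an external binary doesn't work very well either: it
is a short-lived process, so it may already have exited by the time systemd
handles the message, and systemd may then fail to attribute it to the unit.
And the way you implement the PID existence check means even a completely
crashed/exited daemon won't get restarted until the watchdog timeout
expires, because the wrapper only stops pinging instead of exiting.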
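
To illustrate two of the points above (readiness and immediate exit), here
is a minimal sketch of how such a wrapper could be structured. It is not a
recommendation and does not fix the attribution problem with the external
systemd-notify binary; the unit would most likely also need
NotifyAccess=all, since the notifications come from a child process. The
names supervise, NOTIFY and the probe function are hypothetical and do not
come from the original script.

```shell
#!/bin/sh
# Sketch only. NOTIFY is an assumed override hook so the notify command can
# be swapped out; it defaults to the real systemd-notify binary.
NOTIFY=${NOTIFY:-systemd-notify}

# supervise PROBE INTERVAL COMMAND [ARGS...]
#   PROBE    name of a command or shell function that returns 0 when healthy
#   INTERVAL seconds between probes (should be well below WatchdogSec)
#   COMMAND  the actual daemon to run
supervise() {
    local probe=$1 interval=$2
    shift 2

    "$@" &                      # start the daemon
    local daemon=$!

    (
        # Readiness: report READY only after the probe first succeeds.
        until "$probe"; do sleep "$interval"; done
        $NOTIFY --ready
        # Liveness: ping only while the probe keeps succeeding; once it
        # fails, the pings stop and the watchdog timeout takes over.
        while "$probe"; do
            $NOTIFY WATCHDOG=1
            sleep "$interval"
        done
    ) &
    local pinger=$!

    # Block on the daemon itself, so a crash or exit ends the wrapper
    # right away and Restart= can act without waiting for the watchdog.
    local status=0
    wait "$daemon" || status=$?
    kill "$pinger" 2>/dev/null || true
    wait "$pinger" 2>/dev/null || true
    return "$status"
}
```

In the wrapper this might be called as: supervise probe_http 1
/opt/test/service, followed by exit $?, where probe_http is a hypothetical
function wrapping the curl check from the quoted script. If the probe never
succeeds, READY is never sent and the unit's start timeout applies.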

Consider something already made for this purpose, such as Monit.

-- 
Mantas Mikulėnas