[systemd-devel] OnFailure=

Jakob Schürz wertstoffe at nurfuerspam.de
Wed Mar 7 23:37:48 UTC 2018


Hi there!

I built a test unit:

# cat test@.service
[Unit]
Description=Testservice notification
OnFailure=notification-telegram@%n.service

[Service]
Type=simple
Restart=on-failure
#RestartSec=2
ExecStart=/bin/%i
SyslogIdentifier=test@%i.service
StartLimitBurst=5
StartLimitInterval=10


And the notification unit, notification-telegram@.service:

# cat notification-telegram@.service
[Unit]
Description=Send failure-notification about %i to telegram

[Service]
User=jakob
ExecStart=/bin/bash -c "/usr/local/bin/ntfy -b telegram send \"FAILED\n$(systemctl status %i)\""

When I start the test unit with systemctl start test@false, I get 5
messages in Telegram.

The log is:
Mär 08 00:31:53 aldebaran systemd[1]: Started Testservice notification.
Mär 08 00:31:53 aldebaran systemd[1]: test@false.service: Main process exited, code=exited, status=1/FAILURE
Mär 08 00:31:53 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
Mär 08 00:31:53 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Service hold-off time over, scheduling restart.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Scheduled restart job, restart counter is at 1.
Mär 08 00:31:54 aldebaran systemd[1]: Stopped Testservice notification.
Mär 08 00:31:54 aldebaran systemd[1]: Started Testservice notification.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Main process exited, code=exited, status=1/FAILURE
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Service hold-off time over, scheduling restart.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Scheduled restart job, restart counter is at 2.
Mär 08 00:31:54 aldebaran systemd[1]: Stopped Testservice notification.
Mär 08 00:31:54 aldebaran systemd[1]: Started Testservice notification.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Main process exited, code=exited, status=1/FAILURE
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Service hold-off time over, scheduling restart.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Scheduled restart job, restart counter is at 3.
Mär 08 00:31:54 aldebaran systemd[1]: Stopped Testservice notification.
Mär 08 00:31:54 aldebaran systemd[1]: Started Testservice notification.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Main process exited, code=exited, status=1/FAILURE
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Service hold-off time over, scheduling restart.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Scheduled restart job, restart counter is at 4.
Mär 08 00:31:54 aldebaran systemd[1]: Stopped Testservice notification.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Start request repeated too quickly.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
Mär 08 00:31:54 aldebaran systemd[1]: Failed to start Testservice notification.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.


As you can see, the unit from OnFailure= is triggered 5 times, not just
once at the "Failed to start Testservice notification" point.
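
The number of triggers can also be counted straight from the journal
instead of by reading the log by eye. This is only a minimal sketch: the
helper name count_onfailure_triggers is made up for illustration, and it
assumes the journal input contains nothing but this one test run.

```shell
# Count how often systemd logged that it triggered OnFailure= dependencies.
# Reads journal text on stdin and prints the number of matching lines.
# (Hypothetical helper name, not part of systemd.)
count_onfailure_triggers() {
    grep -c 'Triggering OnFailure= dependencies'
}
```

Possible usage, restricted to the current boot:

    journalctl -b -u test@false.service --no-pager | count_onfailure_triggers

For the run shown above this should report 5, matching the number of
Telegram messages.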

The man-page says:

OnFailure=
           A space-separated list of one or more units that are activated
           when this unit enters the "failed" state. A service unit using
           Restart= enters the failed state only after the start limits
           are reached.


But in this test case, the unit listed in OnFailure= is triggered every
time the unit fails, before it is restarted and fails again. Only after
5 attempts (= StartLimitBurst) does the unit enter the failed state. That
should be the only time OnFailure= is triggered.

My systemd version is 237-3 from Debian.

Should I file a bug at bugs.freedesktop.org?

Jakob

