[systemd-devel] difference in restart behavior between 219 and 220?

Vadim Berezniker vadim at berezniker.com
Thu Jul 9 14:56:55 PDT 2015


Hello,

I have a scenario where a service restarts after a dependency failure
in 219, but not in 220.
I'm wondering if it's a systemd issue or a problem with my configuration.

I've created a simple, but somewhat contrived repro.
Service A fails on first attempt, gets restarted and succeeds.
Service B depends on service A.
I attempt to start service B.

In systemd 219, B gets restarted when A succeeds, but in 220 B remains failed.
I've attached the logs with debugging enabled for both units.

# /etc/systemd/system/a.service
[Unit]
Description=A

[Service]
ExecStartPre=/bin/sh -x -c "[ -f /tmp/success ] || (touch /tmp/success
&& sleep 10)"
ExecStart=/bin/true
TimeoutStartSec=5
Restart=on-failure
RestartSec=5
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target



# /etc/systemd/system/b.service
[Unit]
Description=B
After=a.service
Requires=a.service

[Service]
ExecStart=/bin/true
RemainAfterExit=yes
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
-------------- next part --------------
systemd219 systemd[1]: Trying to enqueue job b.service/start/replace
systemd219 systemd[1]: Installed new job b.service/start as 3454
systemd219 systemd[1]: Installed new job a.service/start as 3455
systemd219 systemd[1]: Enqueued job b.service/start as 3454
systemd219 systemd[1]: About to execute: /bin/sh -x -c '[ -f /tmp/success ] || (touch oldcoreos
systemd219 systemd[1]: Forked /bin/sh as 1502
systemd219 systemd[1]: a.service changed dead -> start-pre
systemd219 systemd[1]: Starting A...
systemd219 systemd[1502]: Executing: /bin/sh -x -c '[ -f /tmp/success ] || (touch /tmpoldcoreos
systemd219 sh[1502]: + '[' -f /tmp/success ']'
systemd219 sh[1502]: + touch /tmp/success
systemd219 sh[1502]: + sleep 10
systemd219 systemd[1]: a.service start-pre operation timed out. Terminating.
systemd219 systemd[1]: a.service changed start-pre -> final-sigterm
systemd219 systemd[1]: Child 1502 belongs to a.service
systemd219 systemd[1]: a.service: control process exited, code=killed status=15
systemd219 systemd[1]: a.service got final SIGCHLD for state final-sigterm
systemd219 systemd[1]: a.service changed final-sigterm -> failed
systemd219 systemd[1]: Job a.service/start finished, result=failed
systemd219 systemd[1]: Failed to start A.
systemd219 systemd[1]: Job b.service/start finished, result=dependency
systemd219 systemd[1]: Dependency failed for B.
systemd219 systemd[1]: Job b.service/start failed with result 'dependency'.
systemd219 systemd[1]: Unit a.service entered failed state.
systemd219 systemd[1]: a.service failed.
systemd219 systemd[1]: a.service changed failed -> auto-restart
systemd219 systemd[1]: a.service: cgroup is empty
systemd219 systemd[1]: a.service: cgroup is empty
systemd219 systemd[1]: a.service holdoff time over, scheduling restart.
systemd219 systemd[1]: Trying to enqueue job a.service/restart/fail
systemd219 systemd[1]: Installed new job a.service/restart as 3718
systemd219 systemd[1]: Installed new job b.service/restart as 3803
systemd219 systemd[1]: Enqueued job a.service/restart as 3718
systemd219 systemd[1]: a.service scheduled restart job.
systemd219 systemd[1]: Job b.service/restart finished, result=done
systemd219 systemd[1]: Converting job b.service/restart -> b.service/start
systemd219 systemd[1]: a.service changed auto-restart -> dead
systemd219 systemd[1]: Job a.service/restart finished, result=done
systemd219 systemd[1]: Converting job a.service/restart -> a.service/start
systemd219 systemd[1]: About to execute: /bin/sh -x -c '[ -f /tmp/success ] || (touch oldcoreos
systemd219 systemd[1]: Forked /bin/sh as 1558
systemd219 systemd[1]: a.service changed dead -> start-pre
systemd219 systemd[1]: Starting A...
systemd219 systemd[1]: Child 1558 belongs to a.service
systemd219 systemd[1]: a.service: control process exited, code=exited status=0
systemd219 systemd[1]: a.service got final SIGCHLD for state start-pre
systemd219 systemd[1]: About to execute: /bin/true
systemd219 systemd[1]: Forked /bin/true as 1561
systemd219 systemd[1]: a.service changed start-pre -> running
systemd219 systemd[1]: Job a.service/start finished, result=done
systemd219 systemd[1]: Started A.
systemd219 systemd[1]: Child 1561 belongs to a.service
systemd219 systemd[1]: a.service: main process exited, code=exited, status=0/SUCCESS
systemd219 systemd[1]: a.service changed running -> exited
systemd219 systemd[1]: a.service: cgroup is empty
systemd219 systemd[1]: About to execute: /bin/true
systemd219 systemd[1]: Forked /bin/true as 1563
systemd219 systemd[1]: b.service changed dead -> running
systemd219 systemd[1]: Job b.service/start finished, result=done
systemd219 systemd[1]: Started B.
systemd219 systemd[1]: Starting B...
systemd219 systemd[1]: Child 1563 belongs to b.service
systemd219 systemd[1]: b.service: main process exited, code=exited, status=0/SUCCESS
systemd219 systemd[1]: b.service changed running -> exited
systemd219 systemd[1]: b.service: cgroup is empty
systemd219 sh[1558]: + '[' -f /tmp/success ']'
-------------- next part --------------
systemd220 systemd[1]: b.service: Trying to enqueue job b.service/start/replace
systemd220 systemd[1]: a.service: Installed new job a.service/start as 4846
systemd220 systemd[1]: b.service: Installed new job b.service/start as 4761
systemd220 systemd[1]: b.service: Enqueued job b.service/start as 4761
systemd220 systemd[1]: a.service: About to execute: /bin/sh -x -c '[ -f /tmp/success ] || (touch /tmp/success && sleep 10)'
systemd220 systemd[1]: a.service: Forked /bin/sh as 2032
systemd220 systemd[1]: a.service: Changed dead -> start-pre
systemd220 systemd[1]: Starting A...
systemd220 systemd[2032]: a.service: Executing: /bin/sh -x -c '[ -f /tmp/success ] || (touch /tmp/success && sleep 10)'
systemd220 sh[2032]: + '[' -f /tmp/success ']'
systemd220 sh[2032]: + touch /tmp/success
systemd220 sh[2032]: + sleep 10
systemd220 systemd[1]: a.service: Start-pre operation timed out. Terminating.
systemd220 systemd[1]: a.service: Changed start-pre -> final-sigterm
systemd220 systemd[1]: a.service: Child 2032 belongs to a.service
systemd220 systemd[1]: a.service: Control process exited, code=killed status=15
systemd220 systemd[1]: a.service: Got final SIGCHLD for state final-sigterm.
systemd220 systemd[1]: a.service: Changed final-sigterm -> failed
systemd220 systemd[1]: a.service: Job a.service/start finished, result=failed
systemd220 systemd[1]: Failed to start A.
systemd220 systemd[1]: b.service: Job b.service/start finished, result=dependency
systemd220 systemd[1]: Dependency failed for B.
systemd220 systemd[1]: b.service: Job b.service/start failed with result 'dependency'.
systemd220 systemd[1]: a.service: Unit entered failed state.
systemd220 systemd[1]: a.service: Failed with result 'timeout'.
systemd220 systemd[1]: a.service: Changed failed -> auto-restart
systemd220 systemd[1]: a.service: cgroup is empty
systemd220 systemd[1]: a.service: Failed to send unit change signal for a.service: Transport endpoint is not connected
systemd220 systemd[1]: a.service: Service hold-off time over, scheduling restart.
systemd220 systemd[1]: a.service: Trying to enqueue job a.service/restart/fail
systemd220 systemd[1]: a.service: Installed new job a.service/restart as 5190
systemd220 systemd[1]: a.service: Enqueued job a.service/restart as 5190
systemd220 systemd[1]: a.service: Scheduled restart job.
systemd220 systemd[1]: a.service: Changed auto-restart -> dead
systemd220 systemd[1]: a.service: Job a.service/restart finished, result=done
systemd220 systemd[1]: a.service: Converting job a.service/restart -> a.service/start
systemd220 systemd[1]: a.service: About to execute: /bin/sh -x -c '[ -f /tmp/success ] || (touch /tmp/success && sleep 10)'
systemd220 systemd[1]: a.service: Forked /bin/sh as 2132
systemd220 systemd[1]: a.service: Changed dead -> start-pre
systemd220 systemd[1]: Starting A...
systemd220 systemd[1]: a.service: Child 2132 belongs to a.service
systemd220 systemd[1]: a.service: Control process exited, code=exited status=0
systemd220 systemd[1]: a.service: Got final SIGCHLD for state start-pre.
systemd220 systemd[1]: a.service: About to execute: /bin/true
systemd220 systemd[1]: a.service: Forked /bin/true as 2136
systemd220 systemd[1]: a.service: Changed start-pre -> running
systemd220 systemd[1]: a.service: Job a.service/start finished, result=done
systemd220 systemd[1]: Started A.
systemd220 systemd[1]: a.service: Child 2136 belongs to a.service
systemd220 systemd[1]: a.service: Main process exited, code=exited, status=0/SUCCESS
systemd220 systemd[1]: a.service: Changed running -> exited
systemd220 systemd[1]: a.service: cgroup is empty
systemd220 systemd[1]: a.service: cgroup is empty
systemd220 systemd[1]: a.service: cgroup is empty
systemd220 systemd[1]: a.service: cgroup is empty
systemd220 sh[2132]: + '[' -f /tmp/success ']'


More information about the systemd-devel mailing list