[systemd-devel] Restart=always and "ExecStartPre"

Michal Schmidt mschmidt at redhat.com
Wed Sep 28 13:48:44 PDT 2011


On Wed, 28 Sep 2011 17:49:54 +0200 Reindl Harald wrote:
> why does systemd not restart a killed service if the
> "ExecStartPre"-process is still running, see below - at my opinion
> after "killall afpd" the service should be restarted and in a perfect
> case even if "ExecStartPre"-process dies
> 
> systemd-26-10.fc15.x86_64

I see at least three issues here. See below.

> [root@testserver:~]$ cat /lib/systemd/system/netatalk.service
> [Unit]
> Description=Apple-File-Server
> After=syslog.target network.target avahi-daemon.service
> [Service]
> Type=forking
> PIDFile=/var/run/netatalk.pid
> ExecStartPre=/usr/sbin/cnid_metad -l log_note

issue #1:
ExecStartPre is not supposed to fork off daemons. A future version of
systemd might even enforce this rule. The service should be split into
two.
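
A rough, untested sketch of what such a split could look like. The
unit name cnid_metad.service and its Description are made up for
illustration, cnid_metad is assumed to daemonize itself (hence
Type=forking), and the afpd command line is simply copied from the
status output quoted further down, since the original ExecStart line
was not shown:

    # cnid_metad.service (hypothetical unit name)
    [Unit]
    Description=Netatalk CNID metadata daemon
    After=syslog.target network.target

    [Service]
    # Assumption: cnid_metad forks into the background on its own.
    Type=forking
    ExecStart=/usr/sbin/cnid_metad -l log_note
    Restart=always

    # netatalk.service
    [Unit]
    Description=Apple-File-Server
    # Pull in the helper daemon and start it first.
    Requires=cnid_metad.service
    After=syslog.target network.target avahi-daemon.service cnid_metad.service

    [Service]
    Type=forking
    PIDFile=/var/run/netatalk.pid
    # Assumption: command line taken from the cgroup listing.
    ExecStart=/usr/sbin/afpd -P /var/run/netatalk.pid -F /etc/netatalk/afpd.conf
    Restart=always

With this layout each daemon gets its own cgroup with a single
top-level process, so each one can be tracked and restarted
independently.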

> ...
> [root@testserver:~]$ systemctl status netatalk.service
> netatalk.service - Apple-File-Server
>           Loaded: loaded (/lib/systemd/system/netatalk.service)
>           Active: active (running) since Wed, 28 Sep 2011 17:45:55
>...
> Main PID: 1812 (code=exited, status=0/SUCCESS)

issue #2:
This Main PID looks like stale information from a previous run of the
service. It is a minor bug in systemd that this field is not reset.

There are two reasons why systemd failed to detect the new main PID:
 - issue #3: afpd has a racy daemonization sequence. It writes its PID
   file too late. My recently proposed patch "service: delayed main PID
   guessing" should be able to work around it.
 - Given that systemd could not read the information from the PID file,
   it tried to guess the main PID, but it also failed, because there
   are two top-level processes in the cgroup:

> CGroup: name=systemd:/system/netatalk.service
> ├ 1999 /usr/sbin/cnid_metad -l log_note
> └ 2002 /usr/sbin/afpd -P /var/run/netatalk.pid -F /etc/netatalk/afpd.conf

systemd could not tell which one of them is the main PID.

When the main PID is not known, the only way to detect the death of the
service is to watch for the cgroup getting empty.

Michal

