[systemd-devel] option to wait for pid file to appear

Thu May 17 10:07:50 UTC 2018

On Thu, 17 May 2018, Igor Bukanov wrote:
> Hi,
> 
> I have a service unit for nginx that uses Type=forking and PIDFile.
> That works, but occasionally I see in the log a message like
> 
> nginx.service: PID file /run/nginx/nginx.pid not readable (yet?) after
> start: No such file or directory
> 
> After investigating this further I see the reason for this is that
> nginx creates the pid file in a child process, not in the parent (at
> least this is how I interpret their code). As the parent may exit and
> systemd may respond to it before the child writes the pid, that leads
> to the above message.

The message is essentially harmless, at least when you're talking about 
initial service startup.

It _is_ better for the PID file to be written out before the initial 
process exits, but systemd will handle things correctly even if they 
happen the other way around. Essentially the service won't be considered 
to have completed activation until both events occur. If one or the other 
takes too long (i.e. longer than the TimeoutStartSec= setting), the unit 
will fail.

I think the only time this is actually a problematic situation is if the 
service is in a fully activated state, and the main process goes away 
before its replacement writes out a new PID file. There's a window there 
in which systemd can conclude that the main process has simply exited of 
its own accord. In this case it will stop the unit.

On the other hand, if this sequence of events was initiated because the 
admin explicitly reloaded the service through systemd, then again systemd 
will, as far as I know, be happy with these two events happening in the 
wrong order. I'd need to check the code more thoroughly to be sure of this 
though.

For what it's worth, there is an "introduce Type=pid-file" item in the 
systemd TODO that seems like it would help with these kinds of problematic 
services. I see value in having something that simply waits for a PID file 
to appear, and doesn't care whether any processes exited in the meantime.
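
To make the "simply wait for a PID file to appear" idea concrete, here is a 
rough shell sketch of that waiting logic. It is only an illustration of the 
concept, not how systemd does or would implement it; the function name 
wait_for_pidfile, its arguments and the one-second polling interval are all 
made up for this example.

```shell
#!/bin/sh

# wait_for_pidfile PATH TIMEOUT_SECONDS   (hypothetical helper)
# Poll until PATH exists and holds a numeric PID, then print that PID.
# Give up with a non-zero status after roughly TIMEOUT_SECONDS.
# Whether any process exited in the meantime is deliberately ignored.
wait_for_pidfile() {
    pidfile=$1
    remaining=$2
    while :; do
        if [ -s "$pidfile" ]; then
            pid=
            # read fails on a file without a trailing newline but still
            # fills in $pid, so its exit status is ignored here.
            read -r pid <"$pidfile" || true
            case $pid in
                ''|*[!0-9]*) ;;               # not a PID (yet), keep waiting
                *) echo "$pid"; return 0 ;;
            esac
        fi
        [ "$remaining" -gt 0 ] || return 1    # timed out
        sleep 1
        remaining=$((remaining - 1))
    done
}
```

For example, "wait_for_pidfile /run/nginx/nginx.pid 90" would print the PID 
once nginx's child has written it, or fail after about 90 seconds, which 
plays the role TimeoutStartSec= has for a real unit.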

As an example use-case, authors of scripts currently need to use 
systemd-notify and NotifyAccess= to implement reliable service startup 
notification. With Type=pid-file, however, they could just do:

  #!/bin/sh

  # Run various setup stuff, possibly "exit 1"
  # if there's a problem ...

  # All good, tell systemd we're active
  echo $$ >/run/some-script.pid

  # ...
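
For comparison, the approach available today looks roughly like the sketch 
below. It assumes the unit is configured with Type=notify; the 
NotifyAccess=all value is also an assumption (systemd-notify runs as a 
separate child of the script, so notifications from non-main processes 
generally need to be allowed). The helper name notify_ready and the script 
path in the comment are hypothetical.

```shell
#!/bin/sh

# Unit settings this sketch assumes (hypothetical example values):
#   [Service]
#   Type=notify
#   NotifyAccess=all
#   ExecStart=/usr/local/bin/some-script

notify_ready() {
    # systemd sets NOTIFY_SOCKET for services that may send notifications.
    # Skip the call when the script is run by hand, outside systemd.
    if [ -n "${NOTIFY_SOCKET:-}" ]; then
        systemd-notify --ready
    else
        echo "NOTIFY_SOCKET not set, skipping readiness notification" >&2
    fi
}

# Run various setup stuff, possibly "exit 1"
# if there's a problem ...

# All good, tell systemd we're active
notify_ready

# ...
```

This works, but it needs the extra unit settings and a helper binary, which 
is the overhead a plain PID-file write would avoid.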

As it happens, I actually have some code that implements Type=pid-file, but I 
never got around to finishing the cleanup and opening a pull request. I 
should probably dig it out and do that.