[systemd-devel] [PATCH 1/4] WIP: service: add watchdog timestamp

Michael Olbrich m.olbrich at pengutronix.de
Thu Nov 3 03:31:19 PDT 2011


Hi,

On Wed, Nov 02, 2011 at 02:34:04AM +0100, Lennart Poettering wrote:
> Sorry for the delay in reviewing. We've been busy with getting F16 out
> of the door and didn't want to add major new features into systemd at
> that time. But now git is open for new features again.
> 
> When Kay and I first saw your patch we initially were inclined to say
> "no" to this, since we didn't want to have monitoring capabilities in
> systemd. It is our intention to provide the right hooks to do
> monitoring, but not do the monitoring itself. i.e. provide the data that
> monitoring tools need, but insist that they need to run outside of
> system. Or with other words: we never ever want to see code in systemd
> that is an HTTP client to check whether apache is still responding.

I've been thinking on how to actually do some kind of monitoring for
services that can't (for whatever reason) do the sd_notify() stuff by
themselves. One of the use-cases is actually to monitor a web-server or an
attached fastCGI service. What I came up with is this:

- start the monitor process instead of the actual service.
- fork
- exec the actual service in the parent process
  - is this enough to now break any main-pid detection?
- in the child process:
  while (true) {
    if (service_is_ok())
      sd_notify(...)
    sleep()
  }

It's not all that elegant, but it works well enough. It would of course be
nicer to have a ExecWatchdog= or something like that. Would you accept a
patch for that?

> However, your patch is actually quite minimal and very generic, and
> reuses the existing notification channel for the watchdog messages which
> are just trivial extensions of it. That makes it a lot more interesting,
> in particular because this means systemd never has to ask the clients,
> but they have to query systemd. So, yupp, we are convinced, and would
> like to merge this. 

Great.

> A few review nitpicks:
> 
> > --- a/src/sd-daemon.h
> > +++ b/src/sd-daemon.h
> > @@ -217,6 +217,11 @@ int sd_is_mq(int fd, const char *path);
> >       MAINPID=...  The main pid of a daemon, in case systemd did not
> >                    fork off the process itself. Example: "MAINPID=4711"
> >  
> > +     WATCHDOG=1   Tells systemd to update the watchdog timestamp.
> > +                  Services using this feature should do this in
> > +                  regular intervals. A watchdog framework can use the
> > +                  timestamps to detect failed services.
> > +
> >    Daemons can choose to send additional variables. However, it is
> >    recommended to prefix variable names not listed above with X_.
> 
> Please add the same to the man page of sd_notify().

ok.

> Otherwise looks flawless!

Great. I'll try to send an updated version soon.

Regards,
Michael

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


More information about the systemd-devel mailing list