[systemd-devel] [PATCH] systemd restart patch

Lennart Poettering lennart at poettering.net
Fri Jul 1 17:21:16 PDT 2011


On Fri, 01.07.11 15:35, Cristian Patrascu (Cristian.Patrascu at windriver.com) wrote:

Heya!

(BTW, got your first post of this patch too, just didn't find the time
to review the patch, sorry. It was still in my mail queue however.)

>      After "x" restarts have happened, the "OnFailure" unit will start
> (if specified) but only after all unsuccessfull restarts (after "x"
> restarts reached).

Patch looks pretty good. A few comments though.
> 
> This patch is made for systemd-29, on master branch with SHA1 ID:
> 3661ac04b4f2840d3345605aa35963bbde3c469d
> 
> ________systemd-restart-on-fail.patch : _______________________________
> 
> Index: systemd-29/src/dbus-service.c
> ===================================================================
> --- systemd-29.orig/src/dbus-service.c	2011-06-22 10:47:14.000000000 +0300
> +++ systemd-29/src/dbus-service.c	2011-06-22 13:48:51.292321742 +0300
> @@ -42,6 +42,8 @@
>          "<property name=\"PIDFile\" type=\"s\" access=\"read\"/>\n"   \
>          "<property name=\"NotifyAccess\" type=\"s\" access=\"read\"/>\n" \
>          "<property name=\"RestartUSec\" type=\"t\" access=\"read\"/>\n" \
> +        "<property name=\"MaxRestartRetries\" type=\"i\" access=\"read\"/>\n" \
> +        "<property name=\"RestartRetry\" type=\"i\"
> access=\"read\"/>\n" \

I Think it would make sense to make both of these unsigned, in order to
avoid confusion whether these actually ever can be negative.

If I read your patch correctly right now MaxRestartRetries=-1 means that
there is no limit on the number of retries. I'd use 0 for that instead
(0 is not needed to indicate disabling, since we can indicate that with
Restart=no already.)

> +        s->restart_retries = DEFAULT_RESTART_RETRIES;
> +        s->restart_crt_retry = 0;

We generally try not to abbrievate variables unnecessarily, so I'd suggest
to call this variable "restart_current_retry" or so.

>  #ifdef HAVE_SYSV_COMPAT
>          s->sysv_start_priority = -1;
>          s->sysv_start_priority_from_rcnd = -1;
> @@ -1151,6 +1153,22 @@
>                  if (s->meta.default_dependencies)
>                          if ((r = service_add_default_dependencies(s))<  0)
>                                  return r;
> +
> +                /* If the option "RestartRetries=" is set then the service will be "revived" */
> +                if (s->restart_retries != DEFAULT_RESTART_RETRIES){
> +                        /* If the option "Restart=" is not set, then for the service to be "revived"
> +                           we must set "Restart=" to "on-failure" */
> +                        if (s->restart == SERVICE_RESTART_NO){
> +                                s->restart = SERVICE_RESTART_ON_FAILURE;
> +                        }

Please follow coding style. We generally place no {} brackets around
single line blocks.

>          case SERVICE_AUTO_RESTART:
> -                log_info("%s holdoff time over, scheduling restart.", u->meta.id);
> +                if (0<= s->restart_retries){

Please follow coding style, compare variables with constants not
constants with variables. Also, please place spaces before the <= and
the {.

> 
> +        if (u->meta.type == UNIT_SERVICE&&
> +                u->service.restart_crt_retry<= u->service.restart_retries)
> +                return;
> +

Huhm, I wonder if we could fine a better place for this, not sure where though.

>          log_info("Triggering OnFailure= dependencies of %s.", u->meta.id);
> 
>          SET_FOREACH(other, u->meta.dependencies[UNIT_ON_FAILURE], i) {
> @@ -1245,6 +1249,17 @@
>                          log_notice("Unit %s entered failed state.", u->meta.id);
>                          unit_trigger_on_failure(u);
>                  }
> +
> +                /* When a unit is of type service and has finished,
> +                   reset restart-retry counter, if applicable */
> +                if (ns != os&&
> +                    u->meta.type == UNIT_SERVICE&&
> +                    UNIT_IS_INACTIVE_OR_FAILED(ns)&&
> +                    u->service.restart_crt_retry>  u->service.restart_retries ){
> +                            log_debug("Unit %s entered inactive or failed, restart retries at max, reset counter (%d ->  0)!",
> +                                      u->meta.id, u->service.restart_crt_retry, u->service.restart_retries);
> +                            u->service.restart_crt_retry = 0;

Hmm, I think it would be nicer to reset this counter in service.c only, not
in the generic code. I'd like to avoid that we access too many
service-private fields from generic unit code.

>  #include "execute.h"
>  #include "condition.h"
> 
> +/* default = infinite retries (= systemd behaviour not patched) */
> +#define DEFAULT_RESTART_RETRIES -1
> +

Hee, if I commit this, then it won't be patched anymore, so the comment
should not refer to it being unpatched ;-)

Patch looks quite OK in general.

Thanks for the work,

Lennart

-- 
Lennart Poettering - Red Hat, Inc.


More information about the systemd-devel mailing list