[systemd-devel] [PATCH] core: collapse JOB_RELOAD on an inactive unit into JOB_NOP

Thu Aug 14 16:59:34 PDT 2014

On Wed, 16.07.14 04:15, Jon Severinsson (jon at severinsson.net) wrote:

Sorry for the late review, but this stuff is a bit harder to grok, so I
needed some time to have a closer look.

> Before this commit systemctl reload on an inactive unit with a queued
> start job would block until the unit had started if the unit supported
> reload, but return failure immediately if the unit didn't.

This sounds like correct behaviour to me.

1) I think the rule should be to return only after the unit is in the
desired state, and the desired operation or something equivalent has
been executed. This is useful, so that callers can rely on the fact that
whatever they wanted is actually in effect after returning. The strict
focus is here on the *effect*, the *resulting state* of things.

2) A second rule should be that the operation should fail if the unit is
not in an appropriate state, and is not in process of being brought into
the right state.

The combination of these rules is that we try to be "positive": unless
there's no way to bring the unit into the state the admin wants, we will
not fail, but wait.

Due to rule 2 "systemctl reload" should fail on inactive units. However
if a start job is already queued, we should try to be "positive", and
wait for it to complete.

> Additionally systemctl reload-or-try-restart (and systemctl force-reload)
> would block until the unit had started if the unit supported reload, but
> return *success* immediately if the unit didn't.

This actually also sounds like correct behaviour to me.

"try-restart" means that we try to restart a service, but only if it is
already running. If we are not running this is not considered an
error. If you then add the "reload-or-" to the mix, this means that we
will try to reload it if the unit supports it, instead of restarting it.

So, if a start is already queued, we'll wait for it to finish, so that
the configuration is fully in effect afterwards. But if the thing is not
running at all, then we will not consider that a problem at all and
return success.

This is really what you want actually, because this allows admins (and
scripts) to change some daemon configuration, and then finish off by
issuing "reload-or-try-restart" on the daemon, so that the changes take
effect, and then immediately talk to the daemon, knowing that if it was
running, then the new configuration is definitely in effect. If it
wasn't, then of course we can't talk to the service, but that's OK of
course.

> Finaly reload on an inactive unit without a queued start job would
> reurn failure, but try-restart on the same job would return success.

Well, the "try-" prefix is supposed to be the thing "apply only if
already running, but don't consider it an error if it isn't". We could of
course add "try-reload" to the mix, so that you have a "reload" version
that will not fail if the daemon is not running, but I think that's not
particularly interested....

> This behaviour is unintuitive and inconsistent, so make reload of inactive
> units behave just like try-restart.

Well, it might appear surprising in some ways, but there's a lot of
sense in it, and I think what matters is really to look at the effect of
calls, and make guarantees about that.

> Also fixes deadlocks on boot if a unit calls systemctl reload,
> reload-or-try-restart or force-reload on a unit ordered later
> in the boot sequence during startup.

Well, for cases like that, please consider using --no-block, which will
just queue the job, but not wait for it. Or use
--job-mode=ignore-dependencies or so, even though that's a horrid
invention...

Lennart

-- 
Lennart Poettering, Red Hat