[systemd-devel] [PATCH] core: collapse JOB_RELOAD on an inactive unit into JOB_NOP

Thu Aug 14 19:09:43 PDT 2014

On Fri, 2014-08-15 at 01:59 +0200, Lennart Poettering wrote:
> On Wed, 16.07.14 04:15, Jon Severinsson (jon at severinsson.net) wrote:
> > Before this commit systemctl reload on an inactive unit with a queued
> > start job would block until the unit had started if the unit supported
> > reload, but return failure immediately if the unit didn't.
> 
> This sounds like correct behaviour to me.
> 
> 1) I think the rule should be to return only after the unit is in the
> desired state, and the desired operation or something equivalent has
> been executed. This is useful, so that callers can rely on the fact that

What is your "desired state" for reload then? To me the obvious natural
meaning is that after reload has completed, it is guaranteed the service
will not process any requests with outdated configuration from before
reload was issued. Do you want to have reload imply that AND "service is
currently running"? If so, I think such a combination is better done
with "systemctl reload && systemctl start"; getting the correct
semantics for the reload part alone is harder if systemctl only
implements the combined operation.

> > Additionally systemctl reload-or-try-restart (and systemctl force-reload)
> > would block until the unit had started if the unit supported reload, but
> > return *success* immediately if the unit didn't.
> 
> This actually also sounds like correct behaviour to me.
> 
> "try-restart" means that we try to restart a service, but only if it is
> already running. If we are not running this is not considered an
> error. If you then add the "reload-or-" to the mix, this means that we
> will try to reload it if the unit supports it, instead of restarting it.
> 
> So, if a start is already queued, we'll wait for it to finish, so that
> the configuration is fully in effect afterwards. But if the thing is not
> running at all, then we will not consider that a problem at all and
> return success.

I don't think this makes sense. Reload is a "softer" alternative; if I
say "reload-or-try-restart", that should be considered as asking less
from systemd - full restart is not necessary, just reload is enough if
that option is available. Asking for less should not be a reason to
block longer!

> This is really what you want actually, because this allows admins (and
> scripts) to change some daemon configuration, and then finish off by
> issuing "reload-or-try-restart" on the daemon, so that the changes take
> effect, and then immediately talk to the daemon, knowing that if it was
> running, then the new configuration is definitely in effect. If it
> wasn't, then of course we can't talk to the service, but that's OK of
> course.

Exactly, this is a common usecase. But note that this case supports the
changed semantics of the patch, and does not support your view!

After an operation that changes configuration,
"reload-or-try-restart" (or just "reload" if we know reload is in fact
supported by the daemon and sufficient for the configuration change in
question) should guarantee that further requests use the new
configuration. It should not unnecessarily block until the daemon is
actually running if it's being started; anything which relies on it
being up should test for that separately, independently of configuration
change handling. And there's no reason why implementing lighter-weight
reloads to replace full restarts should lead to systemd blocking longer.

> > Finaly reload on an inactive unit without a queued start job would
> > reurn failure, but try-restart on the same job would return success.
> 
> Well, the "try-" prefix is supposed to be the thing "apply only if
> already running, but don't consider it an error if it isn't". We could of
> course add "try-reload" to the mix, so that you have a "reload" version
> that will not fail if the daemon is not running, but I think that's not
> particularly interested....

I think such "try-reload" semantics are the actually useful use-case,
and what just "reload" should do as the non-try semantics do not seem
that useful.

> > This behaviour is unintuitive and inconsistent, so make reload of inactive
> > units behave just like try-restart.
> 
> Well, it might appear surprising in some ways, but there's a lot of
> sense in it, and I think what matters is really to look at the effect of
> calls, and make guarantees about that.

try-restart: guarantee that no old daemon instance is running at all
reload: guarantee that the daemon is not running with old configuration,
at least to the degree its ExecReload functionality supports

> > Also fixes deadlocks on boot if a unit calls systemctl reload,
> > reload-or-try-restart or force-reload on a unit ordered later
> > in the boot sequence during startup.
> 
> Well, for cases like that, please consider using --no-block, which will
> just queue the job, but not wait for it. Or use
> --job-mode=ignore-dependencies or so, even though that's a horrid
> invention...

That's not safe - with --no-block the daemon may still process requests
with outdated configuration after the configuration change operation
returns. The above semantics for "reload" make this work completely
naturally without such problems. The problems are only caused by the
inappropriate blocking for unit start in current systemd; the start time
of the daemon is unrelated to the configuration change operations, and
the added blocking has no benefit whatsoever, it only causes problems.
Waiting for start should be done with a separate operation if needed.