[systemd-devel] [PATCH] core: collapse JOB_RELOAD on an inactive unit into JOB_NOP

Fri Aug 15 03:50:35 PDT 2014

On Fri, 15.08.14 05:09, Uoti Urpala (uoti.urpala at pp1.inet.fi) wrote:

> > > Before this commit systemctl reload on an inactive unit with a queued
> > > start job would block until the unit had started if the unit supported
> > > reload, but return failure immediately if the unit didn't.
> > 
> > This sounds like correct behaviour to me.
> > 
> > 1) I think the rule should be to return only after the unit is in the
> > desired state, and the desired operation or something equivalent has
> > been executed. This is useful, so that callers can rely on the fact that
> 
> What is your "desired state" for reload then?  

*operating* with the new configuration loaded.

I call this being "positive", i.e. try to make the best of things, and
if we know things will improve, then wait for that, and return
then. Or in other words: if there's the chance we can do something that
is what the user asked for, or equivalent to it, then do that, but not
just give up early, because right now we coudln't fulfill what is
requested, even though we *know* that in a very short time we could.

> To me the obvious natural meaning is that after reload has completed,
> it is guaranteed the service will not process any requests with
> outdated configuration from before reload was issued. Do you want to
> have reload imply that AND "service is currently running"? If so, I
> think such a combination is better done with "systemctl reload &&
> systemctl start"; getting the correct semantics for the reload part
> alone is harder if systemctl only implements the combined operation.

Hmm? That's what "systemctl reload-or-restart" does. But it's really not
about this. it's about trying to give the caller guarantees that
something is in a desired state after the operation returned, if there's
any chance for it.

> > > Additionally systemctl reload-or-try-restart (and systemctl force-reload)
> > > would block until the unit had started if the unit supported reload, but
> > > return *success* immediately if the unit didn't.
> > 
> > This actually also sounds like correct behaviour to me.
> > 
> > "try-restart" means that we try to restart a service, but only if it is
> > already running. If we are not running this is not considered an
> > error. If you then add the "reload-or-" to the mix, this means that we
> > will try to reload it if the unit supports it, instead of restarting it.
> > 
> > So, if a start is already queued, we'll wait for it to finish, so that
> > the configuration is fully in effect afterwards. But if the thing is not
> > running at all, then we will not consider that a problem at all and
> > return success.
> 
> I don't think this makes sense. Reload is a "softer" alternative; if I
> say "reload-or-try-restart", that should be considered as asking less
> from systemd - full restart is not necessary, just reload is enough if
> that option is available. Asking for less should not be a reason to
> block longer!

"try-restart" also waits for any already queued start job to finish, if
there's any, before it returns, hence "reload-or-try-restart" and
"try-restart" actually block for the same time... Not following here...

> > This is really what you want actually, because this allows admins (and
> > scripts) to change some daemon configuration, and then finish off by
> > issuing "reload-or-try-restart" on the daemon, so that the changes take
> > effect, and then immediately talk to the daemon, knowing that if it was
> > running, then the new configuration is definitely in effect. If it
> > wasn't, then of course we can't talk to the service, but that's OK of
> > course.
> 
> Exactly, this is a common usecase. But note that this case supports the
> changed semantics of the patch, and does not support your view!
> 
> After an operation that changes configuration,
> "reload-or-try-restart" (or just "reload" if we know reload is in fact
> supported by the daemon and sufficient for the configuration change in
> question) should guarantee that further requests use the new
> configuration. It should not unnecessarily block until the daemon is
> actually running if it's being started; anything which relies on it
> being up should test for that separately, independently of configuration
> change handling. And there's no reason why implementing lighter-weight
> reloads to replace full restarts should lead to systemd blocking
> longer.

No. It *should* wait until any start request that is already queued is
complete... Again, it's about the *outcome*, and about being *positive*,
really. Nothing else. We should not make things fail, if we already
*know* that there's alerady something going on that makes the thing we
are trying to do succeed.

I think most of the confusion here comes from the fact that sysv service
restarts don't care about ordering at all, really, and we do. But the
answer to that is not to weaken the current strong semantics of
blocking, but simply not to request blocking operation at all, i.e. use
--no-block, and just queue things in, instead of waiting for them.

Note that on FEdora the sysv /sbin/service glue actually adds in
--no-block for many cases, too, due to the stricter requirements of
systemd there. 

> > > Finaly reload on an inactive unit without a queued start job would
> > > reurn failure, but try-restart on the same job would return success.
> > 
> > Well, the "try-" prefix is supposed to be the thing "apply only if
> > already running, but don't consider it an error if it isn't". We could of
> > course add "try-reload" to the mix, so that you have a "reload" version
> > that will not fail if the daemon is not running, but I think that's not
> > particularly interested....
> 
> I think such "try-reload" semantics are the actually useful use-case,
> and what just "reload" should do as the non-try semantics do not seem
> that useful.

Well, I disagree on this. I think it's a good idea to make things fail
if people ask a service to reload that isn't actually running. If people
want to tape ove that case (which is totally acceptable, and also has
good usecases), then they should use "try-reload", and all is good.

But I think this discussion is really moot. There are two usecases here,
both valid, we just have different opinions which one is more relevant.

At this time, these two verbs have been defined like this since systemd
was introduced, and we actually inherited their names from sysv, so we
really shouldn't change their effect anymore.
> 
> > > This behaviour is unintuitive and inconsistent, so make reload of inactive
> > > units behave just like try-restart.
> > 
> > Well, it might appear surprising in some ways, but there's a lot of
> > sense in it, and I think what matters is really to look at the effect of
> > calls, and make guarantees about that.
> 
> try-restart: guarantee that no old daemon instance is running at all
> reload: guarantee that the daemon is not running with old configuration,
> at least to the degree its ExecReload functionality supports

The way I see it "reload" should result in the daemon running with the
new config, if possible. And only fail if that's not possible. 

> > > Also fixes deadlocks on boot if a unit calls systemctl reload,
> > > reload-or-try-restart or force-reload on a unit ordered later
> > > in the boot sequence during startup.
> > 
> > Well, for cases like that, please consider using --no-block, which will
> > just queue the job, but not wait for it. Or use
> > --job-mode=ignore-dependencies or so, even though that's a horrid
> > invention...
> 
> That's not safe - with --no-block the daemon may still process requests
> with outdated configuration after the configuration change operation
> returns. The above semantics for "reload" make this work completely
> naturally without such problems. The problems are only caused by the
> inappropriate blocking for unit start in current systemd; the start time
> of the daemon is unrelated to the configuration change operations, and
> the added blocking has no benefit whatsoever, it only causes problems.
> Waiting for start should be done with a separate operation if needed.

Can you elaborate on what the actual reason this popped up? I am really
not convinced that the semantics you propose would be in any way an
improvement, since they are much weaker. Where exactly would this be
beneficial?

Again, sysv made no restrictions at all, when you could run
services. You could invoke /sbin/service at *any* time and it would do
something. It was left to the caller to know when exactly it was OK to
call it, and when it wasn't. Just using "systemctl" instead of
"/sbin/service" won't give you the same semantics here, as we actually
care about the ordering of things at *execution time*, rather than
*install time*. To accomodate for this change of semantics, we actually
added --job-mode=ignore-dependencies (on request of the suse guys iirc),
that will bring back the exact semantics of sysv on systemd, where the
job you queue and wait for gets immeidately executed. If you have
deadlocks caused by systemctl's default more strict behaviour, and don't
want to change the invocation logic of your scripts, because of compat
issues, then this is really what you should use.

(though again, I actually believe that --job-mode=ignore-depdencncies is
an awful invention, and I would never use it myself...)

Lennart

-- 
Lennart Poettering, Red Hat