[systemd-devel] Deadlocks with reloading jobs which are part of current transaction [was: [PATCH] Avoid reloading services when shutting down]

Wed Feb 4 07:38:38 PST 2015

On Wed, 04.02.15 15:25, Martin Pitt (martin.pitt at ubuntu.com) wrote:

> Lennart Poettering [2015-02-04 13:27 +0100]:
> > On Wed, 04.02.15 08:56, Martin Pitt (martin.pitt at ubuntu.com) wrote:
> > 
> > > Lennart Poettering [2015-02-03 21:40 +0100]:
> > > > It's really about synchronous waiting on jobs. If you synchronously
> > > > wait for completion of jobs that are ordered against the job your are
> > > > part of yourself, then things will deadlock.
> > > 
> > > Indeed. The problem is that if you reload e. g. postfix from a DHCP or
> > > "network up/down" hook, such a script doesn't have the slightest idea
> > > whether it was run because the network changed at runtime (i. e. udev
> > > event or the user just selected a new network) or whether it happens
> > > as part of a systemd transaction (boot/shutdown). In the former case
> > > you do want to block, in the latter case you mustn't.
> > 
> > I don't see why you'd want to block in either case. Networking is
> > asynchronous anyway, there's really no point in blocking here. What
> > does this give you?
> 
> So far, if a hook or shell script calls e. g. "service foo restart",
> it can count on the service actually being reloaded after it finishes,
> and thus you can then interact with it immediately. That's not
> important for many scripts, but we can't just always make "service foo
> reload" async by using --no-block, as that would break compatibility
> with scripts which do rely on that behaviour. So we do need to
> conditionalize it to avoid the deadlocks under systemd.

Sure, I can only recommend again: in the the glue code that calls out
to "systemctl" from "service", you can add the code to use --no-block
or --job-mode=ignore-dependencies , if you notice you are in shutdown
mode...

> > Modern code, that talks directly to systemctl, that uses hook scripts,
> > should always use --no-block for these cases.
> 
> Yeah, for sure. We just don't have that luxury easily, as first these
> scripts don't call systemctl but "service" (for the Debianists: or
> invoke-rc.d, but same difference), it will take some time to find
> all these occurrences, and people will hate us for having to add stuff
> like
> 
>   if (running under systemd)
>        systemctl --no-block reload foo
>   else
>        service foo reload

I am not proposing anything like that.

Your /usr/sbin/service binary translates sysv requests into systemctl
requests, right? In that code, you can easily impliy --no-block or
--job-mode=ignore-dependencies under the right circumstances.

There's *no* reason to patch that into all scripts, just patch that
into your "service" binary...

> > I mean, you must have some code in Debian that forwards old sysv
> > restart/reload requests to systemctl. That's probably going to be in
> > /usr/sbin/service and maybe a few other places. Why can't you just add
> > the --no-block logic there, conditionalized to shutdown/reboot?
> 
> Well, we can; it's just as much an unreliable hack as the two existing
> patches :-) and I was wondering if we can't solve this in a more
> generic/distro agnostic way.

No it's not as broken. Because if you use "systemctl" then, you will
always get the strict and right systemd semantics. Only your sysv glue
code will degrade things to be closer to sysv then.

/usr/bin/service is distro-specifc, there's no way around that. Some
distros have it as C binary, others as shell script, all with slightly
different semantics. If you make it translate sysv scripts into
systemctl invocation, then that's precisely where you should add in
--no-block or --job-mode=ignore-dependencies.

> > >  - Don't enqueue a reload if the service to be reloaded isn't running.
> > >    E. g. postfix.service "inactive/dead" in
> > >    https://bugs.debian.org/635777 or smbd.service "start/waiting" in
> > >    https://launchpad.net/bugs/1417010.  This would completely avoid
> > >    the deadlock in most situations already, and doesn't change the
> > >    semantics for working use cases either, so this should even be
> > >    applicable for upstream?
> > 
> > No, this would open up the door for races. The guarantee we give
> > around blocking operations, is that by the time they return the
> > operation or an equivalent has been executed. More specifically, if
> > you start a service, and it is in "starting", and then issue a
> > "reload" or "restart", and it returns you *know* that the
> > configuration that was on disk at the time you issued the "reload" or
> > "restart" -- or a newer one -- is in place. If you'd suppress the
> > reload/restart in this case, then you will not get that guarantee,
> > because the configuration ultimately loaded might be the one from the
> > time the daemon was first put into starting mode at.
> 
> Hm, I don't quite understand this. If you reload a service which isn't
> currently running, systemctl will fail. It's just currently going to
> enqueue the reload request, and executing the job will fail due to the
> "not active" check in unit_reload(). The problem with that is just
> that systemctl doesn't fail immediately with "the unit is not active",
> but blocks on the queued request. So if you don't have pending jobs,
> the net result is the same, but with pending jobs you can easily get a
> deadlock.

Again, if a "start" job is already queued and in progress of being
dispatched, we need to queue the reload job, to get the right
guarantess.

It's not that hard to see, is it?

> > Or in other words: if a script does this:
> > 
> >    # sed -i -e 's/something/somethingelse/' /etc/mydaemon.conf
> >    # systemctl restart mydaemon.service
> > 
> > Then we *must* give the guarantee that when the second command returns
> > the change just made to /etc/mydaemon.conf is in place, and not a
> > previous config loaded.
> 
> Yes, that's fine. I only said "reload" above, as that's what
> DHCP/network scripts usually do.
> 
> > (Now, this example is about restarts, but reload is pretty much the
> > same
> 
> reload on an inactive unit will fail, restart will start it.

Yes, but again: regardless if the unit is currently starting, already
started or currently reloading, we must properly enqueue a new job, to
give the right guarantess. 

We must execute the jobs in order. If there's start job in progress,
or a reload job, and we queue a reload job, then we need to wait for
the start job or reload job to finish, to begin with the new reload
job. Otherwise you cannot give the guarantee I pointed out above.

> > Also: *why* for heaven's sake? Just add --no-block when you are
> > running from these contexts, and all is good. I really don't get why
> > you don't want to do that.
> 
> Well, I do, but special-casing that on boot/shutdown isn't enough (you
> can have other bigger transactions which lock up); and it's rather
> expensive to do this via multiple CLI tools and D-BUS calls rather
> than in the manager itself. So I was pondering how to get this effect
> much more efficiently in manager.c instead of the shell wrappers
> calling systemctl.

Well, universal deadlock detection is not possible. That would mean
solving the halting problem, which is impossible.

I mean, if you want precise sysv semantics you can just use
--job-mode=ignore-dependencies always, since sysv ignore all deps too
when sysv scripts are involved.

And two bus calls instead of one, what's the problem with that? You
are not winning speed records with shell in the loop anyway, and two
or one bus call are entirely irrelevant.

> > On Fedora, these hook scripts use --no-block correctly, and I am
> > completely puzzled why you don't want this on Debian!
> 
> Mostly because we don't currently have the luxury of having only
> systemd :-)

Dude! Come on!

As I keep repeating: please add this to the sysv compat glue, not to
PID 1! Add it to your service command, and all is good. 

> Anyway, I think your stanza on this is clear. We'll keep the hacks on
> the Debian side, if that's not a problem for other distros.

I'd strongly encourage you to work around this on client side in the
sysv glue, not breaking the guarantess that systemd-aware code needs
when issuing commands.

> Thanks for the discussion and the hint with --no-block!

Please, think about this. Fix this in the sysv client code once, don't
break proper systemd clients forever. 

If you break the engine like this this will reflect back on us
upstream. On Fedora and everywhere else this:

          # sed -i -e s/foobar/waldo/ /etc/mydaemon.conf
          # systemctl reload mydaemon.conf

will have the excepected effect of making sure that the config changes
made to mydaemon.conf are in effect when systemctl reload. With your
hack however that guarantee is lost when the daemon is still in the
process of being started or being reloaded.

You are not helping anyone with this, you are just breaking things.

I really don't udnerstand why you don't want to change the
/usr/sbin/service code. Why precisely are you unwilling to touch that
code?

Lennart

-- 
Lennart Poettering, Red Hat