[systemd-devel] Deadlocks with reloading jobs which are part of current transaction [was: [PATCH] Avoid reloading services when shutting down]

Martin Pitt martin.pitt at ubuntu.com
Tue Feb 3 23:56:36 PST 2015


Hello all,

I changed the Subject: to make this thread a bit easier to find.

Lennart Poettering [2015-02-03 21:40 +0100]:
> It's really about synchronous waiting on jobs. If you synchronously
> wait for completion of jobs that are ordered against the job you are
> part of yourself, then things will deadlock.

Indeed. The problem is that if you reload e.g. postfix from a DHCP or
"network up/down" hook, the script has no way of knowing whether it
was run because the network changed at runtime (i.e. a udev event, or
the user just selected a new network) or as part of a systemd
transaction (boot/shutdown). In the former case you do want to block;
in the latter case you mustn't.

FTR, I'm currently debugging a similar issue on
https://launchpad.net/bugs/1417010 which isn't caught by the current
two Debian patches, so we need a more generic solution anyway.

> Now, regardless which option you choose it's always a good idea to
> keep this change as local as possible. Altering the state engine for
> all operations is the worst solution...

Well, it's a problem which can happen in a lot of scenarios and isn't
specific to any particular kind of service or hook script, so what's
"local" is actually quite hard to define here.

I agree with Michael that involving a lot of shell commands which we
then have to copy to lots of places (and find these places at all) is
also not the best solution. So perhaps we could have some middle
ground here and make systemctl a bit more clever?

 - Don't enqueue a reload if the service to be reloaded isn't running,
   e.g. postfix.service "inactive/dead" in
   https://bugs.debian.org/635777 or smbd.service "start/waiting" in
   https://launchpad.net/bugs/1417010. This would already avoid the
   deadlock in most situations and doesn't change the semantics for
   working use cases either, so it should even be suitable for
   upstream?

And/or

 - systemctl reload/restart could imply --no-block if the service is
   already enqueued in the current transaction. That would avoid this
   deadlock situation in more cases.

With that, the only remaining deadlock case would be trying to reload
an already running service which isn't affected by the current
transaction, but we haven't seen that in practice yet.
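
As a rough illustration only: until systemctl itself behaves this way,
a hook script could approximate both ideas on its own side. The sketch
below assumes a POSIX shell; reload_if_active is a made-up helper
name, and unlike the second proposal it passes --no-block
unconditionally rather than only when the unit is already part of the
current transaction.

```shell
#!/bin/sh
# Hypothetical helper for a DHCP or "network up/down" hook script.
# reload_if_active is an illustrative name, not part of systemd.

reload_if_active() {
    unit="$1"

    # "systemctl is-active" exits 0 when the unit is active and
    # non-zero otherwise (e.g. inactive/dead or still starting),
    # so no reload job is enqueued for a unit that isn't running.
    if systemctl is-active --quiet "$unit"; then
        # --no-block enqueues the reload job and returns immediately,
        # so the hook never waits synchronously for a job that may be
        # ordered after the job the hook itself is part of.
        systemctl --no-block reload "$unit"
    fi
}
```

A hook would then call "reload_if_active postfix.service" instead of
a plain "systemctl reload postfix.service". The price is that the hook
can no longer wait for the reload to finish even when the network
changed at runtime, which is exactly why doing this inside systemctl
would be nicer.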

If you don't want this upstream, I'd keep it as a patch in Debian. But
I can't really imagine that this wouldn't happen in Fedora or other
distros? I mean, things like the ISC DHCP hooks aren't a Debianism,
and a lot of existing software wasn't written with this "be careful on
service reloads and guess whether you need --no-block" approach in
mind, as it has never been a problem with other init systems.

Thanks,

Martin

-- 
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)