[systemd-devel] socket activation systemd holding a reference to an accepted socket

Tue Feb 23 22:58:09 UTC 2016

On Thu, Feb 18, 2016 at 9:08 AM, Lennart Poettering <lennart at poettering.net>
wrote:

> On Wed, 17.02.16 17:44, Ben Woodard (woodard at redhat.com) wrote:
>
> > Is it intentional that systemd holds a reference to a socket it has
> > just accepted even though it just handed the open socket over to a
> > socket-activated service that it has just started?
>
> Yes, we do that currently. While there's currently no strict reason to
> do this, I think we should continue to do so, as there was always the
> plan to provide a bus interface so that services can rerequest their
> activation and fdstore fds at any time. Also, for the listening socket
> case (i.e. Accept=no case) we do the same and have to, hence it kinda
> makes sure we expose the same behaviour here on all kinds of
> sockets.
>

I'm having trouble believing that consistency of behavior is an ideal to
strive for in this case. Consistency with xinetd's behavior would seem to
be a better benchmark in this case.

And I'm not sure that I understand the value of being able to rerequest a
fdstore of fds. To me this sounds like it would be a very rarely used
feature. Could this be made an option that could be enabled when you add
the bus service that allows a service to rerequest their activation and
fdstore fds?

> Did you run into problems with this behaviour?
>

Oh yes. It breaks a large number of management tools that we have to do
various things on clusters. It is a kind of pseudoconcurrency. Think of it
like this:

# compute_nodes: hypothetical space-separated list of node names
for node in $compute_nodes; do
    rsh "$node" daemonize quick-but-not-instant-task
done

With xinetd the daemonization would close the accepted socket.  Then the
foreach loop would nearly instantly move onto the next node. We could zip
through 8000 nodes in a couple of seconds.

With systemd holding onto the socket the "rsh" hangs until the
quick-but-not-instant-task completes. This causes the foreach loop to take
anywhere from 45 minutes to several hours. Because it isn't really rsh
and daemonize, and I just used those to make it easy to understand what is
going on, rewriting several of our tools is non-trivial and would end up
violating all sorts of implicit logical layering within the tools and
libraries that we use to build them.
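
To illustrate the effect in isolation, here is a minimal Python sketch of
the underlying file descriptor semantics on a POSIX system. It is not
systemd's code: the socketpair stands in for the accepted connection, and
the names manager_ref, peer_sees_eof and demo are hypothetical. It only
shows that the remote end does not see EOF while any duplicate of the
accepted descriptor remains open.

```python
import os
import socket


def peer_sees_eof(sock, timeout=0.2):
    """Return True if recv() reports EOF, False if it times out instead."""
    sock.settimeout(timeout)
    try:
        return sock.recv(1) == b""
    except socket.timeout:
        return False


def demo():
    # Stand-in for the rsh client and the connection accepted on its behalf.
    client, accepted = socket.socketpair()

    # Assumption: models a service manager keeping its own reference
    # to the accepted socket after handing it to the service.
    manager_ref = os.dup(accepted.fileno())

    # The service daemonizes and closes its copy of the socket.
    accepted.close()
    eof_while_held = peer_sees_eof(client)

    # Only once the last reference is dropped is the connection torn down.
    os.close(manager_ref)
    eof_after_release = peer_sees_eof(client)

    client.close()
    return eof_while_held, eof_after_release


if __name__ == "__main__":
    print(demo())
```

In this sketch the client times out while the extra reference exists and
sees EOF only after it is closed, which matches the hang described above.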

Where is this in the source code? I've been planning to send you a patch to
change the behavior but I haven't quite connected the dots from where a job
within a transaction is inserted on the run queue to where the fork for the
"ExecStart" actually happens.

-ben
Red Hat Inc.

> Lennart
>
> --
> Lennart Poettering, Red Hat
>