[systemd-devel] [PATCH] core: support Distribute=n to distribute to n workers

Sun Jan 5 13:44:46 PST 2014

On Thu, Dec 19, 2013 at 12:21:26PM -0800, Shawn Landden wrote:
> On Fri, Dec 13, 2013 at 8:23 PM, Shawn Landden <shawn at churchofgit.com> wrote:
> > If Distribute=n, turns SO_REUSEPORT on, and spawns
> > n workers to handle incoming requests.
> >
> > SO_REUSEPORT sockets on the same port must all be created
> > by the same uid, therefore using the option allows
> > other root programs (or programs of the same user
> > if running in --user mode) to "hijack" this port, even
> > after systemd reserves it.
> >
> > We spawn workers at a rate approximately reverse
> > exponentially proportional to the number of incoming connections.
> > Faster based on the time for new workers to start accept()ing
> > and their load, or slower if systemd is under load.

Hi Shawn,
sorry for the delay.

Your patch is nice, but I found three issues:

1. The documentation is still lacking. I made a small patch which extends
   and clarifies the description of Distribute=n a bit, but I think that
   even more explanation should be given [1]. Maybe you could fold it
   into your patch?

2. It is possible that the instance name might be taken. One legitimate
   case would be when the socket is started, some instances are created,
   and the socket is stopped and started again. Then the connection count
   will be reset to 0. The user might also start an instance by hand. Such
   situations should not prevent the connection from being accepted.
   Something similar happens when snapshots are created, and systemd
   loops looking for a free name. The same fallback should be implemented
   here, either with linearly increasing instances, or maybe with random
   numbers in case the instance name is occupied.

3. The strategy of dup()ing the socket doesn't work. I wrote
   a simple server in Python which logs the connections [2], and hooked
   it up to systemd [3-4] (*). If REUSEPORT were working correctly,
   each connection would be handled by just one instance, either created
   previously, or newly created by systemd for this connection. But
   I see the same connection being accept()ed by one of the instances
   and systemd itself spawning a new instance. I'm pretty sure that what
   Lennart wrote before, that you need to create a new socket bound to
   the same port for REUSEPORT to take effect, is true.
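
To illustrate point 3, here is a minimal sketch of what I believe is
needed: every listener is a separately created socket, with SO_REUSEPORT
set before bind(), bound to the same port. A dup()ed descriptor refers
to the same underlying socket, so it does not add a second listener.
The helper name make_reuseport_listener and the loopback address are
my own choices for illustration, not something taken from the patch,
and this assumes a Linux kernel with SO_REUSEPORT support.

```python
import socket


def make_reuseport_listener(port, host="127.0.0.1"):
    """Create a new listening TCP socket with SO_REUSEPORT enabled.

    Hypothetical helper for illustration only; not part of the patch.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set on each socket before it is bound.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(16)
    return sock


if __name__ == "__main__":
    first = make_reuseport_listener(0)      # port 0: kernel picks a free port
    port = first.getsockname()[1]
    second = make_reuseport_listener(port)  # independent socket, same port
    print("two independent listeners on port", port)
    first.close()
    second.close()
```

Both sockets must be created by the same uid, as the patch description
already notes.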

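For the fallback in point 2, something along these lines might do. It
is only a sketch: pick_instance_name, the is_taken callback, the
"prefix@N.service" naming and the retry limits are all assumptions made
for illustration. It tries linearly increasing numbers first and falls
back to random numbers, which combines the two options mentioned above;
either one alone would probably be fine too.

```python
import random


def pick_instance_name(prefix, start, is_taken, max_linear=16, max_random=16):
    """Return a free instance name, or raise if none is found.

    Hypothetical helper: is_taken(name) is assumed to report whether a
    unit with that name already exists.
    """
    # First try linearly increasing instance numbers.
    for n in range(start, start + max_linear):
        name = f"{prefix}@{n}.service"
        if not is_taken(name):
            return name
    # Then fall back to random numbers.
    for _ in range(max_random):
        name = f"{prefix}@{random.getrandbits(32)}.service"
        if not is_taken(name):
            return name
    raise RuntimeError("no free instance name found")


if __name__ == "__main__":
    taken = {"distributed@0.service", "distributed@1.service"}
    print(pick_instance_name("distributed", 0, taken.__contains__))
```
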
[1] http://in.waw.pl/~zbyszek/distribute-n/0001-Fix-Distribute-n-documentation.patch
[2] http://in.waw.pl/~zbyszek/distribute-n/socket_logger.py
[3] http://in.waw.pl/~zbyszek/distribute-n/distributed.socket
[4] http://in.waw.pl/~zbyszek/distribute-n/distributed@.service

(*) In the python script, it seems that print() statements don't reach
    the journal, but systemd.journal.send()s do. I guess I'm missing something.
    But that's why logging is duplicated. If somebody could explain this,
    that would be great.

Zbyszek