[systemd-devel] [PATCH] core: support Distribute=n to distribute to n workers
Zbigniew Jędrzejewski-Szmek
zbyszek at in.waw.pl
Sun Jan 5 13:44:46 PST 2014
On Thu, Dec 19, 2013 at 12:21:26PM -0800, Shawn Landden wrote:
> On Fri, Dec 13, 2013 at 8:23 PM, Shawn Landden <shawn at churchofgit.com> wrote:
> > If Distribute=n, turns SO_REUSEPORT on, and spawns
> > n workers to handle incoming requests.
> >
> > SO_REUSEPORT sockets on the same port must all be created
> > by the same uid, therefore using the option allows
> > other root programs (or programs of the same user
> > if running in --user mode) to "hijack" this port, even
> > after systemd reserves it.
> >
> > We spawn workers at a rate approximately reverse
> > exponentially proportional to the number of incoming connections.
> > Faster based on the time for new workers to start accept()ing
> > and their load, or slower if systemd is under load.
Hi Shawn,
sorry for the delay.
Your patch is nice, but I found three issues:
1. The documentation is still lacking. I made a small patch which extends
and clarifies the description of Distribute=n a bit, but I think that
even more explanation should be given [1]. Maybe you could fold it into
your patch?
2. It is possible that the instance name might be taken. One legitimate
case would be when the socket is started, some instances are created,
and the socket is stopped and started again. Then the connection count
will be reset to 0. The user might also start an instance by hand. Such
situations should not prevent the connection from being accepted.
Something similar happens when snapshots are created, and systemd
loops looking for a free name. The same fallback should be implemented
here, either with linearly increasing instances, or maybe with random
numbers in case the instance name is occupied.
3. The strategy of dup()ing the socket doesn't work. I wrote
a simple server in python which logs the connections [2], and hooked
it up to systemd [3-4] (*). If REUSEPORT were working correctly,
each connection would be handled by just one instance, either created
previously, or newly created by systemd for this connection. But
I see the same connection being accept()ed by one of the instances
and systemd itself spawning a new instance. I'm pretty sure that what
Lennart wrote before, that you need to create a new socket bound to
the same port for REUSEPORT to take effect, is true.
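
To illustrate the difference in point 3, here is a minimal sketch in Python,
assuming Linux 3.9 or later. The helper name make_reuseport_listener is
made up for this example; it is not taken from the patch or from [2]. The
sketch only shows two independently created sockets bound to the same port
next to a dup()ed descriptor. It does not reproduce the test setup.

```python
import socket


def make_reuseport_listener(host, port, backlog=16):
    """Create a brand-new listening socket with SO_REUSEPORT set before bind().

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The option has to be set before bind() to have any effect.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
        sock.listen(backlog)
    except OSError:
        sock.close()
        raise
    return sock


# The first listener lets the kernel pick a free port; the second is a
# separate socket (not a dup() of the first) bound to the same address.
first = make_reuseport_listener("127.0.0.1", 0)
port = first.getsockname()[1]
second = make_reuseport_listener("127.0.0.1", port)

# A dup()ed descriptor, by contrast, refers to the same underlying socket
# and therefore shares a single accept queue with the original.
duplicate = first.dup()
```

With two separate sockets the kernel has two accept queues to distribute
connections across. With a dup()ed descriptor there is still only one
queue, so every holder of the descriptor competes for the same connection.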
[1] http://in.waw.pl/~zbyszek/distribute-n/0001-Fix-Distribute-n-documentation.patch
[2] http://in.waw.pl/~zbyszek/distribute-n/socket_logger.py
[3] http://in.waw.pl/~zbyszek/distribute-n/distributed.socket
[4] http://in.waw.pl/~zbyszek/distribute-n/distributed@.service
(*) In the python script, it seems that print() statements don't reach
the journal, but systemd.journal.send()s do. I guess I'm missing something.
But that's why logging is duplicated. If somebody could explain this,
that would be great.
Zbyszek