[systemd-devel] [PATCH 3/3] socket: Add support for TCP defer accept

Zbigniew Jędrzejewski-Szmek zbyszek at in.waw.pl
Thu Aug 14 09:30:51 PDT 2014


On Tue, Jul 29, 2014 at 11:10:09PM +0530, Susant Sahani wrote:
> TCP_DEFER_ACCEPT Allow a listener to be awakened only when data
> arrives on the socket. If TCP_DEFER_ACCEPT set on a server-side
> listening socket, the TCP/IP stack will not to wait for the final
> ACK packet and not to initiate the process until the first packet
> of real data has arrived. After sending the SYN/ACK, the server will
> then wait for a data packet from a client. Now, only three packets
> will be sent over the network, and the connection establishment delay
> will be significantly reduced.
> ---
>  man/systemd.socket.xml | 16 ++++++++++++++++
>  src/core/dbus-socket.c |  1 +
>  src/core/socket.c      | 11 +++++++++++
>  src/core/socket.h      |  1 +
>  4 files changed, 29 insertions(+)
> 
> diff --git a/man/systemd.socket.xml b/man/systemd.socket.xml
> index e6bbb2e..9ce94aa 100644
> --- a/man/systemd.socket.xml
> +++ b/man/systemd.socket.xml
> @@ -539,6 +539,22 @@
>                          </varlistentry>
>  
>                          <varlistentry>
> +                                <term><varname>DeferAccept=</varname></term>
> +                                <listitem><para>Takes time (in seconds) as argument
> +                                Allow a listener to be awakened only when data arrives on the socket.
> +                                If TCP_DEFER_ACCEPT set on a server-side listening socket,
> +                                the TCP/IP stack will not to wait for the final ACK packet and not to
> +                                initiate the process until the first packet of real data has arrived.
> +                                After sending the SYN/ACK, the server will then wait for a data packet
> +                                from a client. Now, only three packets will be sent over the network,
> +                                and the connection establishment delay will be significantly reduced.
> +                                This controls the TCP_DEFER_ACCEPT socket option (see
> +                                <citerefentry><refentrytitle>socket</refentrytitle><manvolnum>7</manvolnum></citerefentry>
> +                                Defaults to
> +                                <option>disabled</option>.</para></listitem>
I think this needs to be cleaned up to be grammatically correct. Right now
some guesswork is required to gather the true meaning.

Maybe something like this:

---

<para>Takes time (in seconds) as argument. If set, the listening process
will be awakened only when data arrives on the socket, and not immediately
when the connection is established. When this option is set, the
<constant>TCP_DEFER_ACCEPT</constant> socket option will be used
(see
<citerefentry><refentrytitle>tcp</refentrytitle><manvolnum>7</manvolnum></citerefentry>),
and the kernel will ignore initial ACK packets without any data.
The argument specifies the approximate amount
of time the kernel should wait for incoming data before falling
back to the normal behaviour of honouring empty ACK packets.
This option is beneficial for protocols where the client sends the data
first (e.g. HTTP, in contrast to SMTP), because the server
process will not be woken up unnecessarily before it can take any action.
</para>

<para>If the client also uses the <constant>TCP_DEFER_ACCEPT</constant>
option, the latency of the initial connection may be
reduced, because the kernel will send data in the
final packet establishing the connection (the third packet in the
"three-way handshake").</para>

<para>Disabled by default.</para>

---

(Note that TCP_DEFER_ACCEPT is described in tcp(7), not socket(7), and the
description there is so terse that it is nearly useless anyway.)
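
For illustration, here is a minimal sketch of what the option amounts to at
the socket level, outside of systemd. The helper names set_defer_accept()
and get_defer_accept() are made up for this example and are not part of the
patch; the sketch assumes Linux, where TCP_DEFER_ACCEPT is defined in
<netinet/tcp.h>.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the patch): ask the kernel to wake the
 * listener only once data has arrived, waiting roughly 'seconds' before
 * falling back to the normal behaviour.
 * Returns 0 on success, -1 with errno set on failure. */
int set_defer_accept(int fd, int seconds) {
        return setsockopt(fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                          &seconds, sizeof(seconds));
}

/* Hypothetical helper (not from the patch): read the option back.
 * Returns the value reported by the kernel, or -1 on failure. */
int get_defer_accept(int fd) {
        int value = 0;
        socklen_t len = sizeof(value);

        if (getsockopt(fd, IPPROTO_TCP, TCP_DEFER_ACCEPT, &value, &len) < 0)
                return -1;

        return value;
}
```

The value read back may differ from the value that was set, because the
kernel treats the timeout as approximate, which is why the suggested text
above says "approximate amount of time".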

Zbyszek

> +                        </varlistentry>
> +
> +                        <varlistentry>
>                                  <term><varname>Priority=</varname></term>
>                                  <listitem><para>Takes an integer
>                                  argument controlling the priority for
> diff --git a/src/core/dbus-socket.c b/src/core/dbus-socket.c
> index f9ef7ef..1142ca5 100644
> --- a/src/core/dbus-socket.c
> +++ b/src/core/dbus-socket.c
> @@ -101,6 +101,7 @@ const sd_bus_vtable bus_socket_vtable[] = {
>          SD_BUS_PROPERTY("KeepAliveInterval", "t", bus_property_get_usec, offsetof(Socket, keep_alive_interval), SD_BUS_VTABLE_PROPERTY_CONST),
>          SD_BUS_PROPERTY("KeepAliveProbes", "i", bus_property_get_int, offsetof(Socket, keep_alive_cnt), SD_BUS_VTABLE_PROPERTY_CONST),
>          SD_BUS_PROPERTY("FastOpen" , "b", bus_property_get_bool, offsetof(Socket, fast_open), SD_BUS_VTABLE_PROPERTY_CONST),
> +        SD_BUS_PROPERTY("DeferAccept" , "t", bus_property_get_usec, offsetof(Socket, defer_accept), SD_BUS_VTABLE_PROPERTY_CONST),
>          SD_BUS_PROPERTY("Priority", "i", bus_property_get_int, offsetof(Socket, priority), SD_BUS_VTABLE_PROPERTY_CONST),
>          SD_BUS_PROPERTY("ReceiveBuffer", "t", bus_property_get_size, offsetof(Socket, receive_buffer), SD_BUS_VTABLE_PROPERTY_CONST),
>          SD_BUS_PROPERTY("SendBuffer", "t", bus_property_get_size, offsetof(Socket, send_buffer), SD_BUS_VTABLE_PROPERTY_CONST),
> diff --git a/src/core/socket.c b/src/core/socket.c
> index b798d4e..32cadf9 100644
> --- a/src/core/socket.c
> +++ b/src/core/socket.c
> @@ -610,6 +610,11 @@ static void socket_dump(Unit *u, FILE *f, const char *prefix) {
>                          "%sKeepAliveProbes: %u\n",
>                          prefix, s->keep_alive_cnt);
>  
> +        if(s->defer_accept)
> +                fprintf(f,
> +                        "%sDeferAccept: %lo\n",
> +                        prefix, s->defer_accept / USEC_PER_SEC);
> +
>          LIST_FOREACH(port, p, s->ports) {
>  
>                  if (p->type == SOCKET_SOCKET) {
> @@ -831,6 +836,12 @@ static void socket_apply_socket_options(Socket *s, int fd) {
>                          log_warning_unit(UNIT(s)->id, "TCP_FASTOPEN failed: %m");
>          }
>  
> +        if (s->defer_accept) {
> +                int value = s->defer_accept / USEC_PER_SEC;
> +                if (setsockopt(fd, SOL_TCP, TCP_DEFER_ACCEPT, &value, sizeof(value)) < 0)
> +                        log_warning_unit(UNIT(s)->id, "TCP_DEFER_ACCEPT failed: %m");
> +        }
> +
>          if (s->broadcast) {
>                  int one = 1;
>                  if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &one, sizeof(one)) < 0)
> diff --git a/src/core/socket.h b/src/core/socket.h
> index 9cb82fa..7452d27 100644
> --- a/src/core/socket.h
> +++ b/src/core/socket.h
> @@ -104,6 +104,7 @@ struct Socket {
>          usec_t timeout_usec;
>          usec_t keep_alive_time;
>          usec_t keep_alive_interval;
> +        usec_t defer_accept;
>  
>          ExecCommand* exec_command[_SOCKET_EXEC_COMMAND_MAX];
>          ExecContext exec_context;
> -- 
> 1.9.3
> 
> _______________________________________________
> systemd-devel mailing list
> systemd-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> 

