[Spice-devel] Spice network performance and possible optimizations

Frediano Ziglio fziglio at redhat.com
Thu Feb 9 13:52:32 UTC 2017


Recently I was trying to measure some possible problems with our network
implementation.

One question was about TLS. How many more packets/bytes are we using
if TLS is enabled? What's the cost of TLS?

NOTE: the following tests were done using the replay utility, so the
results could differ a bit from real cases because:
- replay goes as fast as it can, so for instance packets could
  be merged by the kernel, decreasing the packet count and slightly
  the bytes spent (this actually makes the improvements below look worse);
- there are fewer channels (not much cursor, sound, etc).
The following tests show the packet count and total bytes sent from
server to client over a real network. I used a direct cable connection
(1 Gbit link) between 2 laptops.

These are the results for TLS (encryption); each line gives the recording,
the configuration, the packet count and the total bytes:

rhel7-recordX.xz tls: 12881 14474175
rhel7-recordX.xz tls: 12577 14724788
rhel7-recordX.xz tls: 12522 13955459
rhel7-recordX.xz tls: 12401 14227387
rhel7-recordX.xz tls opt: 10854 14078531
rhel7-recordX.xz tls opt: 10908 14350148
rhel7-recordX.xz tls opt: 10704 13622323
rhel7-recordX.xz tls opt: 11021 14112498
rhel7-recordX.xz normal: 9764 13384017
rhel7-recordX.xz normal: 9730 13421657
rhel7-recordX.xz normal: 9616 13404023
rhel7-recordX.xz normal: 9540 13157164

"tls" is just using master. "tls opt" is using this optimization:
--- a/server/reds-stream.c
+++ b/server/reds-stream.c
@@ -305,6 +305,18 @@ ssize_t reds_stream_writev(RedsStream *s, const struct iovec *iov, int iovcnt)
         return s->priv->writev(s, iov, iovcnt);
     }
 
+    size_t total;
+    for (total = 0, i = 0; i < iovcnt; ++i) {
+        total += iov[i].iov_len;
+    }
+    if (total <= 1024) {
+        uint8_t buf[1024];
+        for (total = 0, i = 0; i < iovcnt; ++i) {
+            memcpy(buf + total, iov[i].iov_base, iov[i].iov_len);
+            total += iov[i].iov_len;
+        }
+        return reds_stream_write(s, buf, total);
+    }
     for (i = 0; i < iovcnt; ++i) {
         n = reds_stream_write(s, iov[i].iov_base, iov[i].iov_len);
         if (n <= 0)

while "normal" is using no TLS (unencrypted).
You can see that TLS costs about 7.5% more bytes and 30% more packets.
This is reduced to 5.2% and 12% with the small optimization above.
The reason for the optimization is basically that there are a lot of small
writes, each of which ends up calling SSL_write. Looking at some dumps, every
SSL_write adds 29 bytes, so sending a single byte with SSL_write costs 30 bytes
at the network level.
Note that these tests make heavy use of the display channel, which usually uses
big blocks; for sound and other channels the cost is worse. Also, while
unencrypted connections use writev, on encrypted connections every iovec is
converted to a call to SSL_write. This explains the huge difference
(18 percentage points, 30% versus 12%) between the packet counts of the 2
TLS tests.
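
To make the arithmetic above concrete, here is a rough model of the cost of
small writes. It assumes the flat 29 bytes per SSL_write seen in my dumps
(other cipher suites or TLS versions may add a different amount) and that the
coalesced data fits in one TLS record. The function names are made up for
illustration; they are not part of the Spice code.

```c
#include <stddef.h>

/* Bytes added by each SSL_write, as observed in the dumps mentioned
 * above. This is an assumption of the model, not a TLS constant. */
#define TLS_RECORD_OVERHEAD 29

/* Bytes on the wire if every buffer goes through its own SSL_write. */
size_t wire_bytes_separate(const size_t *lens, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += lens[i] + TLS_RECORD_OVERHEAD;
    }
    return total;
}

/* Bytes on the wire if the buffers are first copied into one buffer and
 * sent with a single SSL_write (assumes the sum fits in one record). */
size_t wire_bytes_coalesced(const size_t *lens, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += lens[i];
    }
    return count > 0 ? total + TLS_RECORD_OVERHEAD : 0;
}
```

With three buffers of 4, 2 and 10 bytes the model gives 103 bytes for separate
writes against 45 bytes when coalesced, which is the effect the 1024 byte
buffer in the patch is after.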


The other tests I did were using my cork branch and some other more
experimental patches. The cork branch uses a Linux/BSD feature that helps
send larger TCP packets without the latency cost of the Nagle algorithm.
The experimental patches increase the packet window and avoid some useless
pushes of pipe items.
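
For readers not familiar with corking, this is a minimal sketch of how a socket
can be corked. It assumes the feature meant here is the TCP_CORK socket option
on Linux and TCP_NOPUSH on the BSDs; set_cork is a hypothetical helper, not
the code from the branch.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable (1) or disable (0) corking on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 * While corked, the kernel holds back partial segments; clearing the
 * option flushes whatever is pending. */
int set_cork(int fd, int enabled)
{
#if defined(TCP_CORK)
    /* Linux */
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &enabled, sizeof(enabled));
#elif defined(TCP_NOPUSH)
    /* BSD family */
    return setsockopt(fd, IPPROTO_TCP, TCP_NOPUSH, &enabled, sizeof(enabled));
#else
    /* No corking support known on this platform. */
    (void) fd;
    (void) enabled;
    return -1;
#endif
}
```

The intended pattern is set_cork(fd, 1), then several small writes, then
set_cork(fd, 0) so the data leaves in as few packets as possible.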

win7-boot-record2.xz cork: 537 1582240
win7-boot-record2.xz cork: 681 1823754
win7-boot-record2.xz cork: 524 1583287
win7-boot-record2.xz cork: 538 1582350
win7-boot-record2.xz normal: 1329 1834630
win7-boot-record2.xz normal: 1290 1829094
win7-boot-record2.xz normal: 1289 1830164
win7-boot-record2.xz normal: 1317 1833589
win7-boot-record2.xz normal: 1320 1835705
win7-boot-record2.xz hacks-2: 443 1584991
win7-boot-record2.xz hacks-2: 435 1577067
win7-boot-record2.xz hacks-2: 434 1581417
win7-boot-record2.xz hacks-2: 440 1579405

As you can see, with the cork patches the bytes sent are reduced by about
11% while the packet count is reduced by 56%.
The additional patches ("hacks-2") take the improvements to 14% and 67%.
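
The percentages can be re-derived from the runs listed above with something
like the following sketch. It assumes a plain arithmetic mean over the runs of
each configuration, compared against the mean of the "normal" runs; mean and
reduction_percent are illustrative names only.

```c
#include <stddef.h>

/* Arithmetic mean of n samples (0.0 for an empty set). */
double mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += v[i];
    }
    return n > 0 ? sum / (double) n : 0.0;
}

/* Percentage by which `value` is lower than `baseline`
 * (negative if it is higher). */
double reduction_percent(double baseline, double value)
{
    return 100.0 * (baseline - value) / baseline;
}
```

For example, passing the mean packet count of the "normal" runs as baseline and
the mean of the "hacks-2" runs as value gives the packet reduction for the
additional patches.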

Frediano

