[systemd-devel] kdbus vs. pipe based ipc performance
Stefan Westerfeld
stefan at space.twc.de
Mon Mar 3 13:35:45 PST 2014
Hi!
First of all: I'd really like to see kdbus being used as a general-purpose IPC
layer, so that developers working on client/server software no longer need to
build their own homemade IPC from primitives like sockets.
Now kdbus is advertised as a high-performance IPC solution, and compared to the
traditional dbus approach this may well be true. But are the numbers that
$ test-bus-kernel-benchmark chart
produces impressive? Or to put it another way: will developers working on
client/server software happily accept kdbus because it performs as well as a
homemade IPC solution would? Or does kdbus add overhead to a degree that some
applications can't accept?
To answer this, I wrote a program called "ibench" which passes messages between
a client and a server, but instead of using kdbus it uses traditional pipes. To
simulate main loop integration, it calls poll() wherever a normal client or
server application would go into its main loop and wait to be woken up by
file descriptor activity.
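To make that pattern concrete, the inner loop looks roughly like the following
sketch (this is not the actual ibench.c - error handling, timing and partial
reads/writes are left out, and MSG_SIZE/ROUNDS are arbitrary values chosen for
illustration):

/* Minimal sketch of the ping-pong pattern: one pipe per direction,
 * poll() wherever a real client or server would block in its main loop.
 * Not the real ibench.c; see the download link below for that. */
#include <poll.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define MSG_SIZE 4096     /* arbitrary example values, not ibench defaults */
#define ROUNDS   100000

static void server (int rfd, int wfd)   /* echo every message back */
{
  char buf[MSG_SIZE];
  struct pollfd pfd = { .fd = rfd, .events = POLLIN };
  for (;;)
    {
      poll (&pfd, 1, -1);               /* "main loop" wakeup */
      ssize_t n = read (rfd, buf, sizeof buf);
      if (n <= 0)                       /* client closed its end */
        return;
      write (wfd, buf, n);
    }
}

int main (void)
{
  int c2s[2], s2c[2];
  char buf[MSG_SIZE];
  if (pipe (c2s) < 0 || pipe (s2c) < 0)
    return 1;
  if (fork () == 0)                     /* child acts as the server */
    {
      close (c2s[1]); close (s2c[0]);
      server (c2s[0], s2c[1]);
      _exit (0);
    }
  close (c2s[0]); close (s2c[1]);
  memset (buf, 'x', sizeof buf);
  struct pollfd pfd = { .fd = s2c[0], .events = POLLIN };
  for (int i = 0; i < ROUNDS; i++)      /* client: send, wait, receive */
    {
      write (c2s[1], buf, sizeof buf);
      poll (&pfd, 1, -1);               /* "main loop" wakeup */
      read (s2c[0], buf, sizeof buf);
    }
  close (c2s[1]);                       /* lets the server exit */
  wait (NULL);
  return 0;
}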
Now here are the results I obtained on:
- an AMD Phenom(tm) 9850 Quad-Core Processor
- Fedora 20 64-bit with systemd+kdbus from git
- a system booted with the "kdbus" and "single" kernel arguments
============================================================================
*** single cpu performance: .
    SIZE    COPY   MEMFD KDBUS-MAX   IBENCH  SPEEDUP
       1   32580   16390     32580   192007     5.89
       2   40870   16960     40870   191730     4.69
       4   40750   16870     40750   190938     4.69
       8   40930   16950     40930   191234     4.67
      16   40290   17150     40290   192041     4.77
      32   40220   18050     40220   191963     4.77
      64   40280   16930     40280   192183     4.77
     128   40530   17440     40530   191649     4.73
     256   40610   17610     40610   190405     4.69
     512   40770   16690     40770   188671     4.63
    1024   40670   17840     40670   185819     4.57
    2048   40510   17780     40510   181050     4.47
    4096   39610   17330     39610   154303     3.90
    8192   38000   16540     38000   121710     3.20
   16384   35900   15050     35900    80921     2.25
   32768   31300   13020     31300    54062     1.73
   65536   24300    9940     24300    27574     1.13
  131072   16730    6820     16730    14886     0.89
  262144    4420    4080      4420     6888     1.56
  524288    1660    2040      2040     2781     1.36
 1048576     800     950       950     1231     1.30
 2097152     310     490       490      475     0.97
 4194304     150     240       240      227     0.95
*** dual cpu performance: .
    SIZE    COPY   MEMFD KDBUS-MAX   IBENCH  SPEEDUP
       1   31680   14000     31680   104664     3.30
       2   34960   14290     34960   104926     3.00
       4   34930   14050     34930   104659     3.00
       8   24610   13300     24610   104058     4.23
      16   33840   14740     33840   103800     3.07
      32   33880   14400     33880   103917     3.07
      64   34180   14220     34180   103349     3.02
     128   34540   14260     34540   102622     2.97
     256   37820   14240     37820   102076     2.70
     512   37570   14270     37570    99105     2.64
    1024   37570   14780     37570    96010     2.56
    2048   21640   13330     21640    89602     4.14
    4096   23430   13120     23430    73682     3.14
    8192   34350   12300     34350    59827     1.74
   16384   25180   10560     25180    43808     1.74
   32768   20210    9700     20210    21112     1.04
   65536   15440    7820     15440    10771     0.70
  131072   11630    5670     11630     5775     0.50
  262144    4080    3730      4080     3012     0.74
  524288    1830    2040      2040     1421     0.70
 1048576     810     950       950      631     0.66
 2097152     310     490       490      269     0.55
 4194304     150     240       240      133     0.55
============================================================================
I ran the tests twice - once using the same cpu for client and server (via cpu
affinity) and once using a different cpu for client and server.
The SIZE, COPY and MEMFD columns are produced by "test-bus-kernel-benchmark
chart"; the KDBUS-MAX column is the maximum of the COPY and MEMFD columns, so
it is the effective number of roundtrips that kdbus is able to do at that
SIZE. The IBENCH column is the effective number of roundtrips that ibench can
do at that SIZE.
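As an example of how the SPEEDUP column is derived, take the 4096-byte row of
the single cpu table: KDBUS-MAX = max(39610, 17330) = 39610, and
SPEEDUP = IBENCH / KDBUS-MAX = 154303 / 39610 ~= 3.90.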
For many relevant cases, ibench outperforms kdbus considerably. The SPEEDUP
factor indicates how much faster ibench is than kdbus. For small to medium
array sizes, ibench always wins, sometimes by a lot. For instance, when passing
a 4 KiB array from client to server and back, ibench is 3.90 times faster if
client and server live on the same cpu, and 3.14 times faster if they live on
different cpus.
I'm bringing this up now because it would be sad if kdbus became part of the
kernel and universally available, yet application developers still built their
own protocols for performance reasons. And some of the changes needed to make
kdbus run as fast as ibench may be backward incompatible at some level, so it
may be better to make them now rather than later.
The program "ibench" I wrote to provide a performance comparision for the
"test-bus-kernel-benchmark" program can be downloaded at
http://space.twc.de/~stefan/download/ibench.c
As a final note, ibench also supports using a socketpair() for communication
between client and server via a #define at the top, but pipe() communication
was faster in my test setup.
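For illustration, such a compile-time switch might look roughly like this (a
sketch only - the USE_SOCKETPAIR and make_channel names are made up here, not
taken from ibench.c):

#include <sys/socket.h>
#include <unistd.h>

/* #define USE_SOCKETPAIR */           /* hypothetical switch name */

static int make_channel (int fds[2])   /* fds[0] = read end, fds[1] = write end */
{
#ifdef USE_SOCKETPAIR
  return socketpair (AF_UNIX, SOCK_STREAM, 0, fds);
#else
  return pipe (fds);                   /* the faster choice in the setup above */
#endif
}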
Cu... Stefan
--
Stefan Westerfeld, http://space.twc.de/~stefan