[systemd-devel] kdbus vs. pipe based ipc performance
Kay Sievers
kay at vrfy.org
Mon Mar 3 14:06:03 PST 2014
On Mon, Mar 3, 2014 at 10:35 PM, Stefan Westerfeld <stefan at space.twc.de> wrote:
> First of all: I'd really like to see kdbus used as a general-purpose IPC
> layer, so that developers working on client/server software no longer
> need to build their own homemade IPC from primitives like sockets.
>
> Now kdbus is advertised as a high-performance IPC solution, and compared to
> the traditional dbus approach, this may well be true. But are the numbers that
>
> $ test-bus-kernel-benchmark chart
>
> produces impressive? Or to put it another way: will developers working on
> client/server software happily accept kdbus because it performs as well as a
> homemade IPC solution would? Or does kdbus add overhead to a degree that some
> applications can't accept?
>
> To answer this, I wrote a program called "ibench" which passes messages between
> a client and a server, but instead of using kdbus it uses traditional pipes. To
> simulate main loop integration, it uses poll() in the places where a normal
> client or server application would go into the main loop and wait to be woken
> up by file descriptor activity.
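
For illustration, a minimal pipe ping-pong loop with poll() along the lines
described above might look like the sketch below. This is NOT the actual
ibench.c (see the download link further down); the sizes, names and round
count here are made up, and the timing/measurement part is omitted.

/* cc -O2 -o pingpong pingpong.c */
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MSG_SIZE 4096
#define ROUNDS   100000

/* Wait via poll() -- as a main loop would -- then read exactly n bytes. */
static void read_full(int fd, char *buf, size_t n)
{
    size_t done = 0;

    while (done < n) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        ssize_t r;

        if (poll(&pfd, 1, -1) < 0) {
            perror("poll");
            exit(1);
        }
        r = read(fd, buf + done, n - done);
        if (r <= 0) {
            perror("read");
            exit(1);
        }
        done += (size_t) r;
    }
}

int main(void)
{
    int to_server[2], to_client[2];
    char buf[MSG_SIZE];
    int i;

    if (pipe(to_server) < 0 || pipe(to_client) < 0) {
        perror("pipe");
        return 1;
    }

    if (fork() == 0) {
        /* "server": echo every request back to the client */
        for (i = 0; i < ROUNDS; i++) {
            read_full(to_server[0], buf, sizeof(buf));
            if (write(to_client[1], buf, sizeof(buf)) != (ssize_t) sizeof(buf))
                return 1;
        }
        return 0;
    }

    /* "client": send a request, wait for the reply, repeat */
    memset(buf, 'x', sizeof(buf));
    for (i = 0; i < ROUNDS; i++) {
        if (write(to_server[1], buf, sizeof(buf)) != (ssize_t) sizeof(buf))
            return 1;
        read_full(to_client[0], buf, sizeof(buf));
    }
    printf("%d roundtrips of %d bytes\n", ROUNDS, MSG_SIZE);
    return 0;
}
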
>
> Now here are the results I obtained using
>
> - AMD Phenom(tm) 9850 Quad-Core Processor
> - running Fedora 20 64-bit with systemd+kdbus from git
> - system booted with the "kdbus" and "single" kernel command line arguments
>
> ============================================================================
> *** single cpu performance: .
>
> SIZE COPY MEMFD KDBUS-MAX IBENCH SPEEDUP
>
> 1 32580 16390 32580 192007 5.89
> 2 40870 16960 40870 191730 4.69
> 4 40750 16870 40750 190938 4.69
> 8 40930 16950 40930 191234 4.67
> 16 40290 17150 40290 192041 4.77
> 32 40220 18050 40220 191963 4.77
> 64 40280 16930 40280 192183 4.77
> 128 40530 17440 40530 191649 4.73
> 256 40610 17610 40610 190405 4.69
> 512 40770 16690 40770 188671 4.63
> 1024 40670 17840 40670 185819 4.57
> 2048 40510 17780 40510 181050 4.47
> 4096 39610 17330 39610 154303 3.90
> 8192 38000 16540 38000 121710 3.20
> 16384 35900 15050 35900 80921 2.25
> 32768 31300 13020 31300 54062 1.73
> 65536 24300 9940 24300 27574 1.13
> 131072 16730 6820 16730 14886 0.89
> 262144 4420 4080 4420 6888 1.56
> 524288 1660 2040 2040 2781 1.36
> 1048576 800 950 950 1231 1.30
> 2097152 310 490 490 475 0.97
> 4194304 150 240 240 227 0.95
>
> *** dual cpu performance: .
>
> SIZE COPY MEMFD KDBUS-MAX IBENCH SPEEDUP
>
> 1 31680 14000 31680 104664 3.30
> 2 34960 14290 34960 104926 3.00
> 4 34930 14050 34930 104659 3.00
> 8 24610 13300 24610 104058 4.23
> 16 33840 14740 33840 103800 3.07
> 32 33880 14400 33880 103917 3.07
> 64 34180 14220 34180 103349 3.02
> 128 34540 14260 34540 102622 2.97
> 256 37820 14240 37820 102076 2.70
> 512 37570 14270 37570 99105 2.64
> 1024 37570 14780 37570 96010 2.56
> 2048 21640 13330 21640 89602 4.14
> 4096 23430 13120 23430 73682 3.14
> 8192 34350 12300 34350 59827 1.74
> 16384 25180 10560 25180 43808 1.74
> 32768 20210 9700 20210 21112 1.04
> 65536 15440 7820 15440 10771 0.70
> 131072 11630 5670 11630 5775 0.50
> 262144 4080 3730 4080 3012 0.74
> 524288 1830 2040 2040 1421 0.70
> 1048576 810 950 950 631 0.66
> 2097152 310 490 490 269 0.55
> 4194304 150 240 240 133 0.55
> ============================================================================
>
> I ran the tests twice - once using the same cpu for client and server (via cpu
> affinity) and once using a different cpu for client and server.
>
> The SIZE, COPY and MEMFD columns are produced by "test-bus-kernel-benchmark
> chart"; the KDBUS-MAX column is the maximum of the COPY and MEMFD columns, i.e.
> the effective number of roundtrips that kdbus is able to do at that SIZE. The
> IBENCH column is the effective number of roundtrips that ibench can do at that
> SIZE.
>
> For many relevant cases, ibench outperforms kdbus by a wide margin. The SPEEDUP
> factor indicates how much faster ibench is than kdbus. For small to medium
> message sizes, ibench always wins, sometimes by a lot. For instance, passing a
> 4 KiB array from client to server and back, ibench is 3.90 times faster if
> client and server live on the same cpu, and 3.14 times faster if they live on
> different cpus.
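
(Reading the tables: the SPEEDUP column appears to be IBENCH divided by
KDBUS-MAX; for the 4096-byte row of the single-cpu table quoted above that is

    154303 / 39610 ≈ 3.90

which matches the figure given in the text.)
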
>
> I'm bringing this up now because it would be sad if kdbus became part of the
> kernel and universally available, yet application developers still built their
> own protocols for performance reasons. Also, some of the changes needed to make
> kdbus run as fast as ibench may be backward incompatible at some level, so it
> may be better to make them now rather than later.
>
> The program "ibench" I wrote to provide a performance comparison with the
> "test-bus-kernel-benchmark" program can be downloaded at
>
> http://space.twc.de/~stefan/download/ibench.c
>
> As a final note, ibench also supports using a socketpair() for communication
> between client and server via a #define at the top, but pipe() communication
> was faster in my test setup.
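
For reference, the compile-time switch mentioned above could look roughly like
the sketch below; the macro name USE_SOCKETPAIR is invented here, and the
actual ibench.c may spell it differently.

#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical sketch of such a switch: 0 selects pipe(), 1 selects
 * socketpair().  Note that a socketpair() gives two bidirectional fds,
 * while a pipe() is unidirectional, so a pipe-based setup needs one
 * pipe per direction. */
#define USE_SOCKETPAIR 0

static int make_channel(int fds[2])
{
#if USE_SOCKETPAIR
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
#else
    return pipe(fds);
#endif
}
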
Pipes are not interesting for general-purpose D-Bus IPC; with a pipe
the memory can "move" from one client to the other, because it is no
longer needed in the process that fills the pipe.
Pipes are a model outside of kdbus's focus. Using pipes where pipes are
the appropriate IPC mechanism is just fine; there is no competition, and
being 5 times slower than simple pipes is a very good number for kdbus.
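
For context: the MEMFD column in the benchmark corresponds to handing the
payload over as a sealed in-memory file descriptor instead of copying it
through the bus. The sealing mechanism, roughly as it was later exposed in
mainline via memfd_create(2), works along the lines of the generic sketch
below (this is not the kdbus ioctl interface itself).

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create an anonymous in-memory file, copy the payload in once, then seal
 * it so neither side can change it anymore; a receiver can mmap() it
 * without making a defensive copy. */
int make_sealed_payload(const void *data, size_t size)
{
    int fd = memfd_create("payload", MFD_ALLOW_SEALING);

    if (fd < 0)
        return -1;

    if (write(fd, data, size) != (ssize_t) size ||
        fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) < 0) {
        close(fd);
        return -1;
    }

    return fd; /* hand this fd to the peer(s) */
}

Unlike a pipe, the data is not consumed on delivery: the same sealed fd can
be handed to any number of receivers, which is why a broadcast-capable bus
cannot simply "move" the memory the way a pipe can.
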
Kdbus is a low-level implementation of D-Bus, not much else; it will
not try to cover all sorts of specialized IPC use cases.
Kay