[systemd-devel] [RFC] [PATCH 0/3] journal: Add deferred log processing to reduce synchonous IO overhead

Karol Lewandowski k.lewandowsk at samsung.com
Fri Dec 13 03:46:45 PST 2013


On 12/12/2013 07:05 PM, Lennart Poettering wrote:
> On Thu, 12.12.13 17:42, Karol Lewandowski (lmctlx at gmail.com) wrote:
> 
>>> Hmm, but this wouldn't be any different from old syslog, right? I mean,
>>> old syslog also uses sendv() and recv() on AF_UNIX SOCK_DGRAM sockets...
>>
>> That's true, however, we aren't migrating from syslog, but from
>> Android Logger.  Android provides /dev/log_{main,radio,...} device
>> nodes which are simple in-kernel circular buffers apps can write to at
>> any time.  This was invented when /dev/kmsg wasn't writable and serves
>> more-or-less same purpose.
>>
>> From 2.6.29 it's in mailine kernel under
>> drivers/staging/android/logger.c
> 
> But if the the socket layer is really thaaaaat much slower than android
> logging, then that's kinda sad, and the kernel socket stuff should
> probably be fixed?

It's not that slower when we take throughput as only parameter:

packet   |              |
size [B] | Adnroid [kB] |  DGRAM [kB]
---------+--------------+------------
  64     |    110640    |     84827
 128     |    212602    |    166866
 256     |    423880    |    328440
 512     |    815094    |    644499
1024     |   1543326    |   1246914
2048     |   2751674    |   2359654
4096     |   2821260    |   4145395

However, if take context switches into account picture starts
to look differently:

                 | Android   |   DGRAM
-----------------+-----------+--------------
context switches |  1818     |   580443      (involuntary + voluntary)
                 |           |
system time      | 149.97    |   118.62      [seconds]
user time        |   4.03    |     7.45      [secodns]

(Columns contain cumulative data: client  + server)

Attached raw results and test program.

>>>> This is why I decided to try to make logging completely async and see
>>>> if/what changes.
>>>
>>> Well, if AF_UNIX is slow, then we really should try to fix that in the
>>> kernel instead of bypassing it in userspace... Whether we rely on the
>>> socket buffer of the AF_UNIX socket or put together our own buffer in
>>> /dev/shm shouldn't be much of a difference if they both have the same
>>> size and can take the same number of entries...
>>
>> I do agree.  Truth is I have looked at linux/net/ code once but I
>> didn't grok it well.  I guess it's the time to take a second look. ;)
> 
> Here's another option: extend journald to use kdbus as additional
> transport. This is something we want to do anyway since the kdbus
> transport will attach the metadata we need without race to each
> packet. Given that kdbus ultimately is just a way to write into an
> mmaped tmpfs that some other process owns this should not be much worse
> than the android logger in performance.

I doubt it would help as other side would still be woken up on _every_
message, right?

Cheers,
Karol
-------------- next part --------------
  Client:  /usr/bin/time --verbose ./logger-client
  ================================================

Average stats:
packet size              = Bytes/s
  64 =        113295768 Bytes/s = 110640 kB/s
 128 =        217704976 Bytes/s = 212602 kB/s
 256 =        434053344 Bytes/s = 423880 kB/s
 512 =        834656704 Bytes/s = 815094 kB/s
1024 =       1580365824 Bytes/s = 1543326 kB/s
2048 =       2817714432 Bytes/s = 2751674 kB/s
4096 =       2888970749 Bytes/s = 2821260 kB/s
        Command being timed: "./logger-client"
        User time (seconds): 2.20
        System time (seconds): 74.80
        Percent of CPU this job got: 100%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:17.00
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 624
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 204
        Voluntary context switches: 1064
        Involuntary context switches: 368
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0


  Server:  /usr/bin/time --verbose ./logger-server
  ================================================

 Command being timed: "./logger-server"
        User time (seconds): 1.83
        System time (seconds): 75.17
        Percent of CPU this job got: 30%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 4:08.72
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 480
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 167
        Voluntary context switches: 242
        Involuntary context switches: 144
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0


-------------- next part --------------
  Client:  /usr/bin/time --verbose ./sock-client
  ==============================================

Average stats:
packet size              = Bytes/s
  64 =         86863584 Bytes/s =  84827 kB/s
 128 =        170870896 Bytes/s = 166866 kB/s
 256 =        336323520 Bytes/s = 328440 kB/s
 512 =        659967424 Bytes/s = 644499 kB/s
1024 =       1276840448 Bytes/s = 1246914 kB/s
2048 =       2416285952 Bytes/s = 2359654 kB/s
4096 =       4244884480 Bytes/s = 4145395 kB/s
        Command being timed: "./sock-client"
        User time (seconds): 2.14
        System time (seconds): 46.97
        Percent of CPU this job got: 63%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:17.00
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 636
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 207
        Voluntary context switches: 577696
        Involuntary context switches: 633
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0


  Server:  /usr/bin/time --verbose ./sock-server 
  ==============================================

Command being timed: "./sock-server"
        User time (seconds): 5.31
        System time (seconds): 71.65
        Percent of CPU this job got: 84%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:31.51
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 500
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 172
        Voluntary context switches: 1765
        Involuntary context switches: 349
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0



-------------- next part --------------
A non-text attachment was scrubbed...
Name: android-dgram-test.tar
Type: application/x-tar
Size: 20480 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/systemd-devel/attachments/20131213/fc33b369/attachment-0001.tar>


More information about the systemd-devel mailing list