[systemd-devel] [RFC] [PATCH 0/3] journal: Add deferred log processing to reduce synchonous IO overhead

Karol Lewandowski k.lewandowsk at samsung.com
Thu Dec 12 05:12:53 PST 2013


Folks,

We are trying to apply journald as default (and only) logging system
for our ARM-based handhelds.  There is one major problem, though -
performance.

We have found that journald can consume considerable amount of CPU 
time under moderate load (10-20% of CPU per, for say 100-300msg/s).

We have profiled journald and there are few places that could benefit
from _some_ optimizations but I have come to conclusion it isn't
journald code that is real bottleneck - it's, or seems to be,
synchronous nature of send()/recv().  In default configuration each
send() can cause context-switch which is especially costly on ARM.

This is why I decided to try to make logging completely async and see
if/what changes.

Following patchset (based on top of f20c84c15f3) implements pickup
logic in journald [patch 1] and uncoditionally enables it in client
api [patch 2].

In short - instead of sending DGRAMs I do drop files to /dev/shm
which are then picked up by journald at specified intervals.

(Precisely, I have (ab)used mechanism used to pass fd via DGRAM
 when message is too big.)

Patch 3 contains standalone test program, which floods journald with
specified number of messages per second.

On my development (x86) machine I made following test with good old
time(1):


  # send 2000 messages (each of size 100 bytes) to journal per second
  # for ~5mins
  ./journal-feeder 2000 100 & sleep $((60*5 - 10)) && kill $!


Unpatched systemd:
==================

  /usr/bin/time --verbose ./systemd-journald  & sleep $((60*5)) && kill $(pidof systemd-journald)
        User time (seconds): 16.87
        System time (seconds): 39.38
        Percent of CPU this job got: 18%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 5:00.06
        Major (requiring I/O) page faults: 30
        Minor (reclaiming a frame) page faults: 1061740
        Voluntary context switches: 496757
        Involuntary context switches: 497
        File system inputs: 0
        File system outputs: 50592


Deferred pickup every 5 secs (logs passed via /dev/shm/FILE rather than DGRAMs):
================================================================================

  /usr/bin/time --verbose ./systemd-journald  & sleep $((60*5)) && kill $(pidof systemd-journald)
        User time (seconds): 8.27
        System time (seconds): 8.40
        Percent of CPU this job got: 5%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 5:00.36
        Major (requiring I/O) page faults: 180
        Minor (reclaiming a frame) page faults: 38976
        Voluntary context switches: 157
        Involuntary context switches: 1933
        File system inputs: 8
        File system outputs: 307312


Deferred pickup every 1 sec (logs passed via /dev/shm/FILE rather than DGRAMs):
===============================================================================

  /usr/bin/time --verbose ./systemd-journald  & sleep $((60*5)) && kill $(pidof systemd-journald)
        User time (seconds): 9.54
        System time (seconds): 9.56
        Percent of CPU this job got: 6%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 5:00.15
        Major (requiring I/O) page faults: 180
        Minor (reclaiming a frame) page faults: 39034
        Voluntary context switches: 382
        Involuntary context switches: 2359
        File system inputs: 0
        File system outputs: 307768


This is quite naive test as a far more changes when we pass fd instead
of DGRAM (ucred is NULL, which means we won't gather information from
proc at all).  However, we verified that cost of this is rather minor
and not that important in this test.

Of course, I'm not proposing to include it as is (uncoditionally
enable it for all messages) but I would just like to hear your opinion
about this idea.

If I read these numbers correctly there seems to be considerable
benefit in terms of CPU time.  Something like that wouldn't be useful
for all messages, but I think that gazzilions of debug messages that
our apps produce could be send in non-synchronous and low-overhead
manner.

RFC.

Thanks in advance!


Karol Lewandowski (3):
  journald: Add deferred log processing logic
  journal: Try to pass log via /dev/shm to avoid waking up journal
  journal-feeder: trivial test program to flood journald

 journal-feeder.c                 |  64 +++++++++++++++++++++++
 src/journal/journal-send.c       |  39 ++++++++++++++
 src/journal/journald-gperf.gperf |   2 +
 src/journal/journald-server.c    | 107 +++++++++++++++++++++++++++++++++++++++
 src/journal/journald-server.h    |   5 ++
 src/journal/journald.conf        |   2 +
 6 files changed, 219 insertions(+)
 create mode 100644 journal-feeder.c

-- 
1.8.4.rc3



More information about the systemd-devel mailing list