[systemd-devel] Journalctl performance

Colin Guthrie gmane at colin.guthr.ie
Wed Sep 25 04:10:49 PDT 2013


Hi,

On a relatively average journal it can take a looooooooong time to page
through all the data collected.

With data stored from 5th August to 25th September  and running
"journalctl" and pressing G to jump to the end in less, and it takes
several minutes before the end of the messages are reached with
journalctl taking 100% CPU for most of that time.

This is on a slightly older systemd (approximately similar to the 195
version used in Suse - i.e. a shitton of backported patches!) but even
on my regular-use laptop (with SSD) and logs dating back to Jun 18th it
takes ~1m 30s to do the same. This is ultimately with something
approximating 2.5m lines of output to be paged through (systemd
debugging is on!)

With this kind of performance, it's kinda a hard sell, although I'm not
really sure if I should *expect* any better performance and I appreciate
that listing specific date ranges or particular services can reduce the
amont of data returned and thus speed things up dramitcally. So I guess
my question is is it basically unrealistic to expect better performance
from a simple "just output everything" operation like this? Sadly this
is exactly the type of operation a typical user who is used to syslog
would do with journalctl and thus don't see the beneifts.

Any thoughts on this?

HDD (systemd 195+patches):
[root at marley ~]# du -sh /var/log/journal/
1.5G	/var/log/journal/
[root at marley ~]# date; journalctl | wc -l; date
Wed 25 Sep 11:39:00 BST 2013
1957295
Wed 25 Sep 11:42:16 BST 2013


SSD (systemd 207):
[root at jimmy ~]# du -sh /var/log/journal/
2.0G	/var/log/journal/
[root at jimmy ~]# date; journalctl | wc -l; date
Wed 25 Sep 11:40:18 BST 2013
2391076
Wed 25 Sep 11:42:10 BST 2013


And just for some plain text comparisions on the older, HDD machine:

[root at marley ~]# date; journalctl >/home/journal; date
Wed 25 Sep 11:50:41 BST 2013
Wed 25 Sep 11:53:59 BST 2013
[root at marley ~]# wc -l /home/journal
1957527 /home/journal
[root at marley ~]# date; cat /home/journal >/dev/null; date
Wed 25 Sep 11:54:49 BST 2013
Wed 25 Sep 11:54:50 BST 2013
[root at marley ~]# date; cat /home/journal | gzip >/home/journal.gz; date
Wed 25 Sep 11:55:23 BST 2013
Wed 25 Sep 11:55:28 BST 2013
[root at marley ~]# date; zcat /home/journal.gz >/dev/null; date
Wed 25 Sep 11:55:50 BST 2013
Wed 25 Sep 11:55:51 BST 2013
[root at marley ~]# date; cat /home/journal | xz >/home/journal.xz; date
Wed 25 Sep 11:56:15 BST 2013
Wed 25 Sep 11:58:12 BST 2013
[root at marley ~]# date; xzcat /home/journal.xz >/dev/null; date
Wed 25 Sep 12:01:25 BST 2013
Wed 25 Sep 12:01:27 BST 2013
[root at marley ~]# ls -lh /home/journal*
-rw-r--r-- 1 root root 244M Sep 25 11:53 /home/journal
-rw-r--r-- 1 root root  17M Sep 25 11:55 /home/journal.gz
-rw-r--r-- 1 root root 9.8M Sep 25 11:58 /home/journal.xz

So, 2 seconds to page through 9.8M of compressed data directly with log
files, or ~2 minutes, 30 seconds to page through 1.5GB of journal based
logs to produce the same result.... (and I know the files here will be
"hot" in terms of cache etc which gives them a slightly unfair
advantage, but this would factor into real world usage too)

Now, of course I do know that in the journalctl case, there is both more
to look at, perhaps some old journals that are ultimately analysed at
but never used because they are corrupt or something, and a whole bunch
of other data that is not synthesizable to the journalctl syslog-style
output, but forgetting features and looking at raw numbers and, as I
said above, it's a hard sell on the surface!

Is there something wrong here? Are my numbers unrealistic? Is this
pointing at a larger problem with my setup/usage?

Col


-- 

Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/



More information about the systemd-devel mailing list