[systemd-devel] journald disk space usage

Lennart Poettering lennart at poettering.net
Tue Feb 28 15:56:05 UTC 2017


On Mon, 27.02.17 16:18, Bill Lipa (dojo at masterleep.com) wrote:

> Hello,
> 
> I have a Rails application that produces quite a bit of log output -
> about 500MB per day, maybe 3-4 million lines.  Currently this is going
> into a normal file with daily rotation.
> 
> I tried dumping this into journald via STDOUT so that I could see
> everything in one place.  On a standard Google Cloud Platform
> instance, this used about 10% extra CPU.  I was willing to live with
> that, but more of a problem was the rapid increase in storage used for
> the log.  It was growing at about 10x the rate as a flat file for the
> 2 hours I ran the experiment.  That is, after 2 hours, the usage
> reported by 'sudo journalctl --disk-usage' was over 400MB, which is
> not much less than I would normally see for an entire day's worth of
> logging.
> 
> I am wondering if this is to be expected due to journald's extra
> functionality and complexity, or does this seem incorrect?  I'm using
> systemd 229 on Ubuntu 16.04.

The journal generates substantially more data, simply because we
collect a lot of implicit metadata for each log event. This data is
usually not compressed (we only compress individually large fields,
and usually fields are not individually large). The implicit metadata
means we collect and store roughly 10x as much data. This is easy to
verify:

journalctl -n 1000 | wc -c

vs.

journalctl -n 1000 -o verbose | wc -c

The first command outputs the journal data in syslog-compatible
format, thus lacking all metadata. The second command uses "verbose"
output mode, which includes all metadata. Both cover the 1000 most
recent log events. On my system this yields 101993 and 971434 bytes,
respectively.
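
The comparison above can be turned into a single number. The
following is a minimal sketch, not part of systemd: overhead_ratio is
a hypothetical helper name, and it only divides the two byte counts
that "wc -c" reports.

```shell
# Hypothetical helper (not shipped with systemd): prints how many times
# larger the verbose output is than the syslog-style output.
# usage: overhead_ratio SHORT_BYTES VERBOSE_BYTES
overhead_ratio() {
    awk -v short="$1" -v verbose="$2" \
        'BEGIN { printf "%.1f\n", verbose / short }'
}

# With live data (needs read access to the journal):
#   overhead_ratio "$(journalctl -n 1000 | wc -c)" \
#                  "$(journalctl -n 1000 -o verbose | wc -c)"
```

Called with the two figures quoted above, overhead_ratio 101993 971434
prints 9.5, which matches the "roughly 10x" estimate.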

If you are not interested in the metadata and systemd's indexing you
can of course turn off journald's storage and use something
non-indexed that carries no metadata, such as rsyslog or so.
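
A minimal sketch of such a setup, assuming rsyslog is installed and
reads from the syslog socket that journald forwards to. Both options
are documented in journald.conf(5); whether ForwardToSyslog= has to be
set explicitly depends on the distribution's compiled-in default.

```ini
# /etc/systemd/journald.conf (or a drop-in under journald.conf.d/)
[Journal]
# Do not store log data in the journal; forwarding still works.
Storage=none
# Hand every message to the traditional syslog daemon.
ForwardToSyslog=yes
```

systemd-journald has to be restarted for the change to take effect.
Note that with Storage=none, journalctl no longer has anything to
show.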

Also note that beyond the mere metadata we also tend to collect more
data, simply because we also hook into audit, and every service's
stdout/stderr, as well as early boot logging, which syslog
traditionally didn't do at this level.

Lennart

-- 
Lennart Poettering, Red Hat

