[systemd-devel] Slow startup of systemd-journal on BTRFS

Kai Krakow hurikhan77 at gmail.com
Tue Jun 17 11:26:05 PDT 2014


Goffredo Baroncelli <kreijack at libero.it> schrieb:

> I investigated a bit into why readahead doesn't defrag the journal.
> I also put the mailing list in CC, for the record.
> 
> On 06/17/2014 03:33 AM, Kai Krakow wrote:
> [...]
>> Instead, for me, the readahead collector catches access to my system
>> journal and thus defragments it. That's not the case for your system
>> which explains why it won't be defragmented for you when enabling
>> readahead.
> 
> The readahead program skips files greater than 10M. If you look at its man
> page, it documents the switch --file-size-max. The default value is
> READAHEAD_FILE_SIZE_MAX, which is equal to 10M. So readahead only defrags
> system.journal while the file is smaller than this value, and that happens
> only a few times.
> 
> This behavior seems to me reasonable to avoid blocking a system when a big
> file is touched during the boot.

Ah okay, good catch... Well, systemd does not generate massive amounts of 
logs for me. Maybe you want to play around with "SystemMaxUse" in 
journald.conf to let the file size stay below 10M?
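A journald.conf sketch of what I mean (values are illustrative, not tested recommendations; SystemMaxFileSize caps each individual file, per journald.conf(5)):

```ini
# /etc/systemd/journald.conf
[Journal]
# Cap total journal disk usage
SystemMaxUse=40M
# Cap each journal file so it stays below readahead's
# 10M default (--file-size-max)
SystemMaxFileSize=8M
```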

Actually, for servers using syslog, I set log rotation to kick in after 10M 
worth of logging (or after 7 days, whichever comes first) so that our 
rsync-diffed backups of log files stay small; otherwise the "rsync --backup" 
way of doing backups would start creating huge diffs.
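In logrotate terms, that scheme looks roughly like this (a sketch; the path and retention count are placeholders, only the 10M/weekly thresholds come from above - "maxsize" rotates on size even before the time interval elapses):

```
# /etc/logrotate.d/syslog (sketch)
/var/log/syslog {
    # rotate at 10M, or weekly, whichever comes first
    maxsize 10M
    weekly
    rotate 8
    compress
    missingok
    notifempty
}
```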

So that may actually not be a bad option. I'm not using it on my home system 
because I do snapshot-based backups to btrfs, where diffs are just much 
smaller (using rsync --no-whole-file --inplace).

We recently switched to ZFS-based backup storage for our servers, and in the 
process switched to using "rsync --no-whole-file --inplace", too. So we no 
longer need the small log rotation scheme, but I feel comfortable with it, so 
I stick with it.

BTW: I checked our systemd-enabled servers, and fragmentation of the journal 
files is very high there, too. Way above 5000 extents. The journals are on 
xfs, so it suffers from that problem as well (which has already been pointed 
out multiple times somewhere in this thread). But to my surprise, this does 
not affect the performance of journalctl there. The bad performance occurs 
only with fragmentation on btrfs, not on xfs. IMHO, this pushes the pointer a 
little bit more towards systemd, though I couldn't suggest how to fix it. 
Time for the filesystem hackers to kick in.

BTW: The servers are virtualized, so by default systemd-readahead does 
nothing there because of "ConditionVirtualization" in the service files. 
OTOH, it wouldn't defragment there anyway, because xfs is not supported for 
this in systemd, although it would probably be possible since there is 
"xfs_fsr". But my feeling is that xfs_fsr more or less just locks the file, 
copies it, then replaces the original; it cannot work on files in use. This 
is different for btrfs, which only chokes on defragmenting files that are 
currently mmap'ed (e.g., binaries being executed, read: busy text files). 
Following that thought, I wonder why it seems to defragment system.journal 
for me, because after all I learned here, systemd writes to the files 
mmap'ed.
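For the record, the checks amount to commands like these (not runnable as-is: they need root and the journal path on the live system; filefrag comes from e2fsprogs, and the defrag step only applies on btrfs, where, as far as I understand, readahead uses the same btrfs defrag ioctl for small files):

```
# count extents of the active journal (works on xfs and btrfs alike)
filefrag /var/log/journal/*/system.journal

# on btrfs, defragment it manually
btrfs filesystem defragment /var/log/journal/*/system.journal
```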

-- 
Replies to list only preferred.


