[systemd-devel] Slow startup of systemd-journal on BTRFS

Lennart Poettering lennart at poettering.net
Sun Jun 15 15:34:21 PDT 2014


On Wed, 11.06.14 20:32, Chris Murphy (lists at colorremedies.com) wrote:

> > systemd has a very stupid journal write pattern. It checks if there
> > is space in the file for the write, and if not it fallocates the
> > small amount of space it needs (it does *4 byte* fallocate calls!)

Not really the case. 

http://cgit.freedesktop.org/systemd/systemd/tree/src/journal/journal-file.c#n354

We allocate 8 MB at minimum.
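
To illustrate what allocating in large steps means, here is a minimal sketch
that rounds a requested file size up to the next multiple of an 8 MiB step.
It is not the code from journal-file.c: the names GROW_STEP and
round_up_allocation() are hypothetical, and the exact step size and rounding
rule are assumptions based only on the "8 MB at minimum" statement above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed growth step: 8 MiB. Hypothetical constant, for illustration only. */
#define GROW_STEP (8ULL * 1024ULL * 1024ULL)

/* Hypothetical helper: round the size we need up to a whole number of
 * GROW_STEP units, so that every allocation request covers at least 8 MiB
 * instead of a handful of bytes. */
static uint64_t round_up_allocation(uint64_t needed) {
    /* Guard against overflow in the rounding arithmetic below. */
    assert(needed <= UINT64_MAX - (GROW_STEP - 1));

    if (needed == 0)
        return GROW_STEP;

    return ((needed + GROW_STEP - 1) / GROW_STEP) * GROW_STEP;
}
```

Under these assumptions, a request for 4 more bytes still results in a single
8 MiB allocation request, not a 4-byte one.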

> > and then does the write to it.  All this does is fragment the crap
> > out of the log files because the filesystems cannot optimise the
> > allocation patterns.

Well, it would be good if you'd tell me what to do instead...

I am invoking fallocate() in advance because we write those files with
mmap(), and that of course would otherwise trigger SIGBUS for the most
boring of reasons, such as disk full or quota exceeded. Hence, before we
do anything like that, we invoke fallocate() to ensure that the space is
actually available... As far as I can see, that is pretty much in line
with what fallocate() is supposed to be useful for; the man page says
this explicitly:

     "...After a successful call to posix_fallocate(), subsequent writes
      to bytes in the specified range are guaranteed not to fail because
      of lack of disk space."

Happy to be informed that the man page is wrong. 
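
The pattern described above (reserve the space first, then write through a
mapping) can be sketched roughly as follows. This is a simplified
illustration, not the journal code: write_preallocated() and its parameters
are hypothetical, and it assumes a Linux/glibc system where
posix_fallocate() is available.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reserve `size` bytes for `path`, then copy `msg`
 * into the file through a shared mapping.
 * Returns 0 on success or an errno-style error number on failure. */
static int write_preallocated(const char *path, off_t size, const char *msg) {
    size_t len = strlen(msg);
    assert(size > 0 && (off_t) len <= size);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return errno;

    /* Reserve the blocks up front. A full disk or exhausted quota is
     * reported here as an ordinary error code (posix_fallocate() returns
     * the error number directly and does not set errno) instead of
     * surfacing later as SIGBUS when the mapping is touched. */
    int r = posix_fallocate(fd, 0, size);
    if (r != 0) {
        close(fd);
        return r;
    }

    void *p = mmap(NULL, (size_t) size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        r = errno;
        close(fd);
        return r;
    }

    /* The write itself is a plain memory store into the mapping. */
    memcpy(p, msg, len);

    /* Flush the mapping to the file; only needed here so the sketch
     * has a well-defined end state. */
    r = msync(p, (size_t) size, MS_SYNC) < 0 ? errno : 0;

    munmap(p, (size_t) size);
    close(fd);
    return r;
}
```

Without the posix_fallocate() call, the same memcpy() into a sparse file
could fault if the file system turned out to have no room for the page
being written.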

I am also happy to change our code, if it really is the wrong thing to
do. Note however that I generally favour correctness and relying on
documented behaviour, instead of nebulous optimizations whose effects
might change with different file systems or kernel versions...

> > Yup, it fragments journal files on XFS, too.
> > 
> > http://oss.sgi.com/archives/xfs/2014-03/msg00322.html
> > 
> > IIRC, the systemd developers consider this a filesystem problem and
> > so refused to change the systemd code to be nice to the filesystem
> > allocators, even though they don't actually need to use fallocate...

What? No need to be a dick. Nobody ever pinged me about this. And yeah, I
think I have a very good reason to use fallocate(). In fact, it is the
only reason the man page explicitly mentions.

Lennart

-- 
Lennart Poettering, Red Hat

