[systemd-devel] consider dropping defrag of journals on btrfs

Phillip Susi phill at thesusis.net
Mon Feb 8 15:09:21 UTC 2021


Chris Murphy writes:

> I showed that the archived journals have way more fragmentation than
> active journals. And the fragments in active journals are
> insignificant, and can even be reduced by fully allocating the journal

Then clearly this is a problem with btrfs: it absolutely should not be
making the files more fragmented when asked to defrag them.

> file to final size rather than appending - which has a good chance of
> fragmenting the file on any file system, not just Btrfs.

And yet, you just said the active journal had minimal fragmentation.
That seems to mean that the 8 MB fallocates that journald does are
working well.  Sure, you could probably get fewer fragments by
fallocating the whole 128 MB at once, but that comes with tradeoffs
that are not worth it.  One fragment per 8 MB isn't a big deal.  Ideally
a filesystem will manage to do better than that (didn't btrfs have a
persistent reservation system for this purpose?), but it certainly
should not commonly do worse.
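
To make the two allocation strategies concrete, here is a minimal
sketch in Python.  It is an illustration only, not journald's code:
the function name grow_in_chunks and its parameters are hypothetical,
and the only system interface it relies on is posix_fallocate via
os.posix_fallocate.  With a 128 MB final size and 8 MB steps it makes
16 calls, so "one fragment per 8 MB" would mean at most 16 fragments,
while a single call for the whole size gives the filesystem the chance
to find one contiguous region up front.

```python
import os

MIB = 1024 * 1024


def grow_in_chunks(path, final_size, chunk):
    """Preallocate `path` up to `final_size` bytes, `chunk` bytes at a time.

    Hypothetical helper for illustration; returns the number of
    posix_fallocate calls that were made.
    """
    calls = 0
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        offset = 0
        while offset < final_size:
            # The last step may be shorter than a full chunk.
            length = min(chunk, final_size - offset)
            os.posix_fallocate(fd, offset, length)
            offset += length
            calls += 1
    finally:
        os.close(fd)
    return calls


# Incremental growth, as discussed above:
#   grow_in_chunks("test.journal", 128 * MIB, 8 * MIB)
# Everything at once:
#   grow_in_chunks("test.journal", 128 * MIB, 128 * MIB)
```

Passing the chunk size equal to the final size turns the loop into a
single allocation, so the same helper covers both cases being compared.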

> Further, even *despite* this worse fragmentation of the archived
> journals, bcc-tools fileslower shows no meaningful latency as a
> result. I wrote this in the previous email. I don't understand what
> you want me to show you.

*Of course* it showed no meaningful latency, because you did the test on
an SSD, which has no meaningful latency penalty due to fragmentation.
The question is how bad it is on an HDD.

> And since journald offers no ability to disable the defragment on
> Btrfs, I can't really do a longer term A/B comparison can I?

You proposed a patch to disable it.  Test before and after the patch.
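
One possible way to collect the before and after numbers is to record
the extent count of each journal file with filefrag (from e2fsprogs)
under both builds.  The sketch below is an assumption about how such a
comparison could be scripted, not something taken from this thread:
the helper names are hypothetical, and the parser expects filefrag's
usual one-line summary of the form "<path>: N extents found".

```python
import re
import subprocess

# Matches the summary filefrag prints, e.g. "foo.journal: 12 extents found".
_EXTENTS = re.compile(r": (\d+) extents? found")


def parse_extent_count(filefrag_output):
    """Extract the extent count from filefrag's summary line."""
    match = _EXTENTS.search(filefrag_output)
    if match is None:
        raise ValueError("no extent summary in: %r" % filefrag_output)
    return int(match.group(1))


def extent_count(path):
    """Run filefrag on `path` and return its extent count.

    Requires the filefrag binary to be installed and enough privileges
    to read the file's extent map.
    """
    result = subprocess.run(
        ["filefrag", path], check=True, capture_output=True, text=True
    )
    return parse_extent_count(result.stdout)
```

Running extent_count over the archived journals with and without the
patch applied, on an HDD, would give the A/B comparison asked for here.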

> I did provide data. That you don't like what the data shows: archived
> journals have more fragments than active journals, is not my fault.
> The existing "optimization" is making things worse, in addition to
> adding a pile of unnecessary writes upon journal rotation.

If it is making things worse, that is definitely a bug in btrfs.  It
might be nice to avoid the writes on SSD though, since there is no
benefit there.

> Conversely, you have not provided data proving that nodatacow
> fallocated files on Btrfs are any more fragmented than fallocated
> files on ext4 or XFS.

That's a fair point: if btrfs isn't any worse than other filesystems,
then why is it the only one that gets a defrag?


