[systemd-devel] consider dropping defrag of journals on btrfs
Chris Murphy
lists at colorremedies.com
Thu Feb 4 06:11:27 UTC 2021
On Wed, Feb 3, 2021 at 9:46 AM Lennart Poettering
<lennart at poettering.net> wrote:
>
> Performance is terrible if cow is used on journal files while we write
> them.
I've run datacow journals for a year on NVMe. The latency is so low that it doesn't matter.
> It would be great if we could turn datacow back on once the files are
> archived, and then take benefit of compression/checksumming and
> stuff. not sure if there's any sane API for that in btrfs besides
> rewriting the whole file, though. Anyone knows?
Compression results in a completely different encoding and different
extent sizes, so enabling it means a complete rewrite of the whole
file, regardless of cow/nocow status.
Without compression it would still be a rewrite, because a checksummed
(datacow) extent is in effect a different extent type. That is, a
reflink copy of a nodatacow file can only be a nodatacow file, and a
reflink copy of a datacow file can only be a datacow file. Converting
between them is basically 'cp --reflink=never', i.e. a complete
rewrite.
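For illustration, here's a rough C sketch of that conversion, the
userspace equivalent of 'cp --reflink=never' (untested; the paths are
made up and error handling is abbreviated):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* Source is the nodatacow journal; the destination is created in
     * a directory without the nodatacow attribute, so it comes out
     * datacow (and compressed, if the mount enables compression). */
    int src = open("/var/log/journal/system.journal~", O_RDONLY);
    int dst = open("/var/log/journal/system.journal.archived",
                   O_WRONLY | O_CREAT | O_EXCL, 0640);
    if (src < 0 || dst < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* A plain read/write loop forces new extents to be allocated.
     * copy_file_range() or the FICLONE ioctl could reflink instead,
     * which would keep the old extents and their nodatacow status. */
    char buf[1 << 16];
    ssize_t n;
    while ((n = read(src, buf, sizeof(buf))) > 0) {
        if (write(dst, buf, n) != n) {
            perror("write");
            return EXIT_FAILURE;
        }
    }

    close(src);
    close(dst);
    return 0;
}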
But you also get a complete rewrite of extents by submitting the file
for defragmentation, depending on the target extent size.
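For reference, the rewrite-by-defrag path is the
BTRFS_IOC_DEFRAG_RANGE ioctl. A rough sketch (untested, path made up);
extent_thresh is the target extent size below which extents get
rewritten:

#include <fcntl.h>
#include <linux/btrfs.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/var/log/journal/system.journal.archived", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct btrfs_ioctl_defrag_range_args args;
    memset(&args, 0, sizeof(args));
    args.start = 0;
    args.len = (__u64)-1;          /* whole file */
    args.extent_thresh = 32 << 20; /* rewrite extents under 32 MiB */
    /* Setting BTRFS_DEFRAG_RANGE_COMPRESS in args.flags would also
     * recompress, at the cost of rewriting everything. */

    if (ioctl(fd, BTRFS_IOC_DEFRAG_RANGE, &args) < 0)
        perror("BTRFS_IOC_DEFRAG_RANGE");

    close(fd);
    return 0;
}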
It is possible to do what you want by no longer setting nodatacow on
the enclosing directory. Create a zero-length journal file, set
nodatacow on that file, then fallocate it. That gets you a nodatacow
active journal. You can then duplicate it in place under a new name,
and the result will be datacow, and automatically compressed if
compression is enabled.
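A rough sketch of that sequence in C (untested; the path and the 8 MiB
preallocation size are made up):

#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/fs.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    /* Assumes the enclosing directory does NOT carry the nodatacow
     * attribute, so a later in-place copy of this file comes out
     * datacow (and compressed, if the mount enables compression). */
    int fd = open("/var/log/journal/system.journal",
                  O_RDWR | O_CREAT | O_EXCL, 0640);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* chattr +C equivalent: FS_NOCOW_FL only takes effect if set
     * while the file is still empty, before any data extents exist. */
    int flags = 0;
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) == 0) {
        flags |= FS_NOCOW_FL;
        if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
            perror("FS_IOC_SETFLAGS");
    }

    /* Preallocate, as journald does for active journals. */
    if (fallocate(fd, 0, 0, 8 << 20) < 0)
        perror("fallocate");

    close(fd);
    return 0;
}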
But the write hit has already happened, by writing journal data into
this journal file over its lifetime. Just rename it on rotate; that's
the lowest IO impact possible at this point. Defragmenting it means
even more writes, for little if any gain, unless the file is datacow,
which isn't the journald default.
--
Chris Murphy