[systemd-devel] consider dropping defrag of journals on btrfs
Lennart Poettering
lennart at poettering.net
Fri Feb 5 22:55:41 UTC 2021
On Fr, 05.02.21 20:58, Maksim Fomin (maxim at fomin.one) wrote:
> > You know, we issue the btrfs ioctl, under the assumption that if the
> > file is already perfectly defragmented it's a NOP. Are you suggesting
> > it isn't a NOP in that case?
>
> So, what is the reason for defragmenting the journal if BTRFS is
> detected? This does not happen on other filesystems. I have read
> this thread but have not found a clear answer to this question.
btrfs, like any file system, fragments files a bit even with nocow. Without
nocow (i.e. with cow) it fragments files horribly, given our write
pattern (which is: append something to the end, and update a few
pointers in the beginning). By upstream default we set nocow; some
downstreams/users undo that, however. (This is done via tmpfiles,
i.e. journald doesn't actually set nocow ever.)
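To illustrate why that write pattern hurts under cow, here is a toy model,
not btrfs's actual allocator. It assumes every allocation takes the next free
block, that one header block is rewritten after each append, and that nothing
else allocates in between. The names simulate and count_extents are made up
for this sketch.

```python
def count_extents(mapping):
    """Count runs of logical blocks whose physical addresses are consecutive."""
    if not mapping:
        return 0
    extents = 1
    for prev, cur in zip(mapping, mapping[1:]):
        if cur != prev + 1:
            extents += 1
    return extents


def simulate(appends, cow):
    """Toy model: mapping[logical_block] = physical_block (assumption, not btrfs)."""
    next_free = 0
    mapping = [next_free]  # logical block 0 is the header
    next_free += 1
    for _ in range(appends):
        # Append one data block at the end of the file.
        mapping.append(next_free)
        next_free += 1
        # Update a few pointers in the header block.
        if cow:
            # Copy-on-write: the header is rewritten to a fresh location.
            mapping[0] = next_free
            next_free += 1
        # With nocow the header is overwritten in place, mapping unchanged.
    return count_extents(mapping)


if __name__ == "__main__":
    print("nocow extents:", simulate(100, cow=False))
    print("cow extents:  ", simulate(100, cow=True))
```

In this model 100 appends leave a single extent with nocow and 101 with cow.
Real file systems batch writes before allocating, so real numbers will
differ; only the trend is the point.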
When we archive a journal file (i.e. stop writing to it) we know it
will never receive any further writes. It's a good time to undo the
fragmentation (we make no distinction between heavily fragmented,
slightly fragmented or not at all fragmented files here) and thus
make future access behaviour better, given that we'll still access the
file regularly (archiving in journald doesn't mean we stop
reading it, it just means we stop writing it; journalctl always
operates on the full data set). Defragmentation happens in the background
once triggered; it's a simple ioctl you can invoke on a file. If the file
is not fragmented it shouldn't do anything.
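For reference, a minimal sketch of invoking that ioctl on a file. journald
itself is C; this is a Python rendering for illustration only. The request
number is computed here from the _IOW(0x94, 2, struct btrfs_ioctl_vol_args)
definition in linux/btrfs.h, assuming the usual Linux ioctl encoding and a
4096-byte struct. The helper name defrag_file is hypothetical. A null
argument is passed, which as far as I understand asks for the whole file with
default settings.

```python
import fcntl
import os

# Assumption: _IOW(BTRFS_IOCTL_MAGIC=0x94, nr=2, size=4096) on Linux.
BTRFS_IOC_DEFRAG = (1 << 30) | (4096 << 16) | (0x94 << 8) | 2  # 0x50009402


def defrag_file(path):
    """Ask btrfs to defragment one file.

    Returns (True, 0) on success, or (False, errno) if the kernel refuses,
    e.g. because the file is not on btrfs.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.ioctl(fd, BTRFS_IOC_DEFRAG, 0)  # 0 stands in for a NULL argument
        return True, 0
    except OSError as e:
        return False, e.errno
    finally:
        os.close(fd)


if __name__ == "__main__":
    import sys
    for p in sys.argv[1:]:
        print(p, defrag_file(p))
```

On a file system other than btrfs the call is expected to fail with an error
rather than do anything, which matches the point that the ioctl is only
issued when btrfs is detected.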
Other file systems simply have no such ioctl, and they never fragment
as terribly as btrfs can. Hence we don't call that ioctl on them.
I'd be fine to avoid the ioctl if we knew for sure the file is at
worst mildly fragmented, but apparently btrfs is too broken to be able
to implement something like that. I'd even be fine dropping it
entirely, if someone can actually show that the benefits of having the
files unfragmented when archived don't outweigh the downside of
generating some iops when executing the defragmentation, i.e. someone
does some profiling, on both ssd and rotating media. Apparently no one
who cares about this wants to do such research though, and
hence I remain deeply unimpressed. Let's not try to do such
optimizations without any data that actually shows they improve things.
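As a starting point for that kind of measurement, a rough sketch of timing a
cold sequential read of one archived journal file, to be run before and after
defragmentation on each kind of media. This is an assumption-laden harness,
not an established benchmark: the helper name timed_cold_read is made up,
POSIX_FADV_DONTNEED is only advisory, and a sequential read is not the access
pattern journalctl actually uses.

```python
import os
import time


def timed_cold_read(path, chunk=1 << 20):
    """Read a whole file after asking the kernel to drop its cached pages.

    Returns (bytes_read, seconds). Hypothetical helper for illustration.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Advisory only: the kernel may keep pages cached anyway.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        start = time.perf_counter()
        total = 0
        while True:
            buf = os.read(fd, chunk)
            if not buf:
                break
            total += len(buf)
        return total, time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    import sys
    for p in sys.argv[1:]:
        size, secs = timed_cold_read(p)
        print(f"{p}: {size} bytes in {secs:.4f}s")
```

Several runs per file would be needed for anything resembling data, together
with a count of the iops the defragmentation itself generated.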
Lennart
--
Lennart Poettering, Berlin