[systemd-devel] consider dropping defrag of journals on btrfs

Maksim Fomin maxim at fomin.one
Fri Feb 5 20:58:51 UTC 2021


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, February 5, 2021 3:23 PM, Lennart Poettering <lennart at poettering.net> wrote:

> On Do, 04.02.21 12:51, Chris Murphy (lists at colorremedies.com) wrote:
>
> > On Thu, Feb 4, 2021 at 6:49 AM Lennart Poettering
> > lennart at poettering.net wrote:
> >
> > > You want to optimize write patterns I understand, i.e. minimize
> > > iops. Hence start with profiling iops, i.e. what defrag actually costs
> > > and then weigh that against the reduced access time when accessing the
> > > files. In particular on rotating media.
> >
> > A nodatacow journal on Btrfs is no different than a journal on ext4 or
> > xfs. So I don't understand why you think you also need to defragment
> > the file, only on Btrfs. You cannot do better than you already are
> > with a nodatacow file. That file isn't going to get any more fragmented
> > in use than it was at creation.
>
> You know, we issue the btrfs ioctl, under the assumption that if the
> file is already perfectly defragmented it's a NOP. Are you suggesting
> it isn't a NOP in that case?

So, what is the reason for defragmenting the journal when btrfs is detected? This does not happen on other filesystems. I have read this thread but have not found a clear answer to this question.

> > But it gets worse. The way systemd-journald is submitting the journals
> > for defragmentation is making them more fragmented than just leaving
> > them alone.
>
> Sounds like a bug in btrfs? systemd is not the place to hack around
> btrfs bugs?

I would say it depends on whether the defragmentation issues are a feature of btrfs. As Chris mentioned, if the root fs is snapshotted, 'defragmenting' the journal can actually increase fragmentation. This is an example of a problem caused by a feature (not a bug) of btrfs. For example, my 'system.journal' file is currently 16 MB and according to filefrag it has 1608 extents (a consequence of the snapshotted rootfs?). That looks like too many, unless I am missing some technical detail (perhaps a filefrag 'extent' is not a real extent on this fs?). Even if it is a bug in btrfs, it would make sense to temporarily disable the 'defragment only on btrfs' policy in systemd.
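For anyone who wants to compare extent counts before and after journald touches a file, here is a minimal sketch in Python. It assumes filefrag (from e2fsprogs) is installed and prints its usual summary line of the form "<path>: N extents found". The helper names parse_extent_count and count_extents are hypothetical, chosen only for illustration, and nothing here is part of systemd.

```python
import re
import subprocess

# Matches the summary line filefrag prints, e.g.
# "system.journal: 1608 extents found" or "file: 1 extent found".
_EXTENTS_RE = re.compile(r":\s*(\d+) extents? found\s*$")


def parse_extent_count(filefrag_output: str) -> int:
    """Return the extent count from filefrag's summary line (hypothetical helper)."""
    for line in filefrag_output.splitlines():
        match = _EXTENTS_RE.search(line)
        if match:
            return int(match.group(1))
    raise ValueError("no 'N extents found' line in filefrag output")


def count_extents(path: str) -> int:
    """Run filefrag on path and return the reported extent count.

    Assumes filefrag is on PATH and the caller may read the file.
    """
    result = subprocess.run(
        ["filefrag", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if filefrag fails
    )
    return parse_extent_count(result.stdout)
```

Calling count_extents() on a journal file before and after rotation would show whether the count goes up or down. It only reports what filefrag reports, so it cannot answer whether those extents are 'real' extents on btrfs.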

I am interested in this issue because for some time (probably from late 2017 until late 2019) I had strange issues with systemd-journald crashing at boot time while archiving/defragmenting the journal. The setup was as follows: btrfs on an external hard disk (not an SSD) with full disk encryption. After accidentally disconnecting the disk while it was mounted (though not in every such case), systemd-journald stalled the boot process for a very long time because of the following loop: systemd-journald tries to archive/defragment journal files -> it crashes for some reason -> systemd restarts systemd-journald -> it starts archiving/defragmenting journal files -> it crashes again -> systemd restarts systemd-journald (this is my understanding of the logs after boot). Eventually this loop breaks and the boot process continues. After login I could see that the journal data was fine, or at least there was no evidence of journal data corruption, so I presume the problem was caused by the archiving/defragmentation policy on btrfs. I used this disk with an ext4 filesystem from 2014 to 2017 and never had any problem like that. Eventually I decided to buy a better disk and the problem has not reappeared since, but why systemd defragments the journal only on btrfs has remained a mystery to me.


