<html> <head> <meta http-equiv="content-type" content="text/html; charset=utf-8"> </head> <body bgcolor="#FFFFFF" text="#000000"> <pre><blockquote type="cite"><pre>On Wed, 18.02.15 06:22, Andrei Borzenkov (<a href="http://lists.freedesktop.org/mailman/listinfo/systemd-devel">arvidjaar at gmail.com</a>) wrote:

> On Wed, 18 Feb 2015 01:14:44 +0100
> Zbigniew Jędrzejewski-Szmek <<a href="http://lists.freedesktop.org/mailman/listinfo/systemd-devel">zbyszek at in.waw.pl</a>> wrote:
> 
> > On Tue, Feb 17, 2015 at 08:05:29PM +0100, Goffredo Baroncelli wrote:
> > > Hi Lennart,
> > > 
> > > On 2015-02-16 23:59, Lennart Poettering wrote:
> > > >         * journald now sets the special FS_NOCOW file flag for its
> > > >           journal files. This should improve performance on btrfs, by
> > > >           avoiding heavy fragmentation when journald's write-pattern
> > > >           is used on COW file systems. It degrades btrfs' data
> > > >           integrity guarantees for the files to the same levels as for
> > > >           ext3/ext4 however. This should be OK though as journald does
> > > >           its own data integrity checks and all its objects are
> > > >           checksummed on disk. Also, journald should handle btrfs disk
> > > >           full events a lot more gracefully now, by processing SIGBUS
> > > >           errors, and not relying on fallocate() anymore.
> > > 
> > > If I read correctly the code, the FS_NOCOW is a temporary workaround, i.e.
> > > when the file is closed (or rotated ?) the FS_NOCOW flags is unset again.
> > > It is true ?
> > Yes, but you miss the point in general. FS_NOCOW is set during the
> > entire time when the file is being written to, which could be months,
> > and then it is unset when the file will not be written to anymore. So
> > indeed, the file is not protected by btrfs checksums for the majority
> > of time, but journald does its own checksumming, so the contents are
> > protected in a different way.
> > 
> 
> btrfs checksumming theoretically allows you to transparently recover
> after media corruption if filesystem has redundancy (more than one copy
> of data). Journald checksum will probably detect corruption, but can it
> repair it?

No it cannot. But btrfs checksumming cannot fix things for you either
if you lose non-trivial amounts of data. It might be able to fix a few
bits of errors, but not non-trivial amounts. I mean, that's a simple
property of error correction codes: the more you want to be able to
correct the longer must your checksum be. Neither btrfs' nor
journald's are substantial enough to correct even a sector...

Lennart

-- 
Lennart Poettering, Red Hat
</pre></blockquote>
Hi Lennart,

It's correct that checksums alone are not suitable for recovering a file.
But when using btrfs RAID, checksums are used to determine which copy of
the file is corrupted, and to restore it if a redundant good copy exists.

Setting FS_NOCOW on journal files prevents btrfs from restoring the
journal even when a sane copy exists (for example, after a hardware or
drive failure). That probably means losing important data.

While this IMHO seems like a temporary workaround until btrfs autodefrag
(on a per-file basis) exists, I'd rather see this made configurable, and
surely not the default!

Do you have any further info or opinion on this?

Best regards,
Florian</pre> </body> </html>