<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<pre><blockquote type="cite"><pre>On Wed, 18.02.15 06:22, Andrei Borzenkov (<a href="http://lists.freedesktop.org/mailman/listinfo/systemd-devel">arvidjaar at gmail.com</a>) wrote:
><i> On Wed, 18 Feb 2015 01:14:44 +0100
</i>><i> Zbigniew Jędrzejewski-Szmek <<a href="http://lists.freedesktop.org/mailman/listinfo/systemd-devel">zbyszek at in.waw.pl</a>> wrote:
</i>><i>
</i>><i> > On Tue, Feb 17, 2015 at 08:05:29PM +0100, Goffredo Baroncelli wrote:
</i>><i> > > Hi Lennart,
</i>><i> > >
</i>><i> > > On 2015-02-16 23:59, Lennart Poettering wrote:
</i>><i> > > > * journald now sets the special FS_NOCOW file flag for its
</i>><i> > > > journal files. This should improve performance on btrfs, by
</i>><i> > > > avoiding heavy fragmentation when journald's write-pattern
</i>><i> > > > is used on COW file systems. It degrades btrfs' data
</i>><i> > > > integrity guarantees for the files to the same levels as for
</i>><i> > > > ext3/ext4 however. This should be OK though as journald does
</i>><i> > > > its own data integrity checks and all its objects are
</i>><i> > > > checksummed on disk. Also, journald should handle btrfs disk
</i>><i> > > > full events a lot more gracefully now, by processing SIGBUS
</i>><i> > > > errors, and not relying on fallocate() anymore.
</i>><i> > >
</i>><i> > > If I read correctly the code, the FS_NOCOW is a temporary workaround, i.e.
</i>><i> > > when the file is closed (or rotated ?) the FS_NOCOW flags is unset again.
</i>><i> > > It is true ?
</i>><i> > Yes, but you miss the point in general. FS_NOCOW is set during the
</i>><i> > entire time when the file is being written to, which could be months,
</i>><i> > and then it is unset when the file will not be written to anymore. So
</i>><i> > indeed, the file is not protected by btrfs checksums for the majority
</i>><i> > of time, but journald does its own checksumming, so the contents are
</i>><i> > protected in a different way.
</i>><i> >
</i>><i>
</i>><i> btrfs checksumming theoretically allows you to transparently recover
</i>><i> after media corruption if filesystem has redundancy (more than one copy
</i>><i> of data). Journald checksum will probably detect corruption, but can it
</i>><i> repair it?
</i>
No it cannot.
But btrfs checksumming cannot fix things for you either if you lose
non-trivial amounts of data. It might be able to fix a few bits of
errors, but not non-trivial amounts. I mean, that's a simple property
of error correction codes: the more you want to be able to correct the
longer must your checksum be. Neither btrfs' nor journald's are
substantial enough to correct even a sector...
Lennart
--
Lennart Poettering, Red Hat
</pre></blockquote>
Hi Lennart,
you are right that checksums alone cannot be used to recover a file.
BUT with btrfs RAID, the checksums are used to determine which copy of the data is corrupted
(and to restore it, if a redundant intact copy exists).
Setting FS_NOCOW on the journal files disables btrfs checksumming for them, so btrfs can no longer
tell which copy is intact and cannot restore the journal, even if an intact copy exists
(e.g. after a hardware / drive failure).
That probably means losing important data.
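A rough sketch of the idea, in Python. This is not btrfs code, only an illustration under
simplifying assumptions: the function names are made up, zlib.crc32 stands in for the
crc32c that btrfs uses by default, and "read the first mirror" stands in for whichever
mirror a read happens to hit.

```python
import zlib


def checksum(block):
    # Stand-in for the per-block data checksum (assumption: plain CRC-32 via
    # zlib, used here only for illustration instead of btrfs' crc32c).
    return zlib.crc32(block)


def read_with_repair(copies, stored_sum):
    """Return a verified block from mirrored copies, repairing bad mirrors.

    copies is a mutable list of bytes objects (one per mirror), stored_sum is
    the checksum recorded when the block was written. Returns None if no copy
    matches, i.e. the corruption is detected but cannot be repaired.
    """
    good = None
    for block in copies:
        if checksum(block) == stored_sum:
            good = block
            break
    if good is None:
        return None
    # Overwrite every mismatching mirror with the verified copy.
    for i, block in enumerate(copies):
        if checksum(block) != stored_sum:
            copies[i] = good
    return good


def read_without_checksum(copies):
    # Without a stored checksum (the NOCOW case) there is nothing to compare
    # against, so the first mirror is returned whether it is intact or not.
    return copies[0]


original = b"journal entry 42"
stored = checksum(original)
mirrors = [b"journal entry 4X", original]  # first mirror is corrupted

print(read_without_checksum(list(mirrors)))  # hands back the corrupted copy
print(read_with_repair(mirrors, stored))     # hands back the intact copy
print(mirrors[0] == original)                # first mirror has been repaired
```

The checksum does not correct anything by itself here; it only decides which mirror to trust.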
While this IMHO seems like a temporary workaround until per-file btrfs autodefrag exists,
I'd rather make it configurable, and certainly not the default!
Do you have any further information or opinion on this?
Best regards,
Florian</pre>
</body>
</html>