[systemd-devel] Slow startup of systemd-journal on BTRFS
Kai Krakow
hurikhan77 at gmail.com
Tue Jun 17 14:02:14 PDT 2014
Filipe Brandenburger <filbranden at google.com> wrote:
> On Mon, Jun 16, 2014 at 6:13 PM, cwillu <cwillu at cwillu.com> wrote:
>> For the case of sequential writes (via write or mmap), padding writes
>> to page boundaries would help, if the wasted space isn't an issue.
>> Another approach, again assuming all other writes are appends, would
>> be to periodically (but frequently enough that the pages are still in
>> cache) read a chunk of the file and write it back in-place, with or
>> without an fsync. On the other hand, if you can afford to lose some
>> logs on a crash, not fsyncing/msyncing after each write will also
>> eliminate the fragmentation.
>
> I was wondering if something could be done in btrfs to improve
> performance under this workload... Something like a "defrag on demand"
> for a case where mostly appends are happening.
>
> When there are small appends with fsync/msync, they become new
> fragments (as expected), but once the writes go past a block boundary,
> btrfs could defragment the previous block in background, since it's
> not really expected to change again.
>
> That could potentially achieve performance close to chattr +C without
> the drawbacks of disabling copy-on-write.
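
For what it's worth, the rewrite-in-place idea quoted above could look
roughly like this from the writer's side (untested sketch, error handling
kept minimal; file, offset and length handling are just placeholders for
illustration):

/*
 * After a batch of small appends, read the affected region back while it
 * is still in the page cache and write the same bytes out again in one
 * go, so btrfs can reallocate the region as a single new extent instead
 * of many tiny ones.
 */
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int rewrite_region(int fd, off_t start, size_t len)
{
        char *buf = malloc(len);
        ssize_t n;

        if (!buf)
                return -1;

        n = pread(fd, buf, len, start);
        if (n > 0) {
                /* Rewriting the same data dirties the pages again; the
                 * CoW allocator then writes them back contiguously. */
                if (pwrite(fd, buf, n, start) != n)
                        perror("pwrite");
                /* fsync(fd) here only if losing the rewrite on a crash
                 * actually matters - see the note about fsync above. */
        }
        free(buf);
        return 0;
}

int main(int argc, char **argv)
{
        int fd;

        if (argc < 4) {
                fprintf(stderr, "usage: %s FILE OFFSET LENGTH\n", argv[0]);
                return 1;
        }

        fd = open(argv[1], O_RDWR);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        rewrite_region(fd, strtoull(argv[2], NULL, 0),
                       strtoull(argv[3], NULL, 0));
        close(fd);
        return 0;
}
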
I thought about something like that, too. I'm pretty sure it really doesn't
matter if your 500G image file is split across 10000 extents, as long as at
least chunks of extents are kept together and rebuilt as one extent. That
means: instead of letting autodefrag work on the whole file, just let it
operate on a chunk of it within some sane boundaries (maybe 8MB chunks), of
course without splitting existing extents that already cross a chunk
boundary. That way it would still reduce head movements a lot while
maintaining good performance during defragmentation. Your idea would be the
missing companion to that (it is some sort of slow-growing-file detection).
If I remember correctly, Mac OS X implements a similar adaptive
defragmentation strategy for its HFS+ filesystem, though the underlying
semantics are probably quite different. And it acts upon opening the file
instead of upon writing to it, so it is probably limited to smallish files
only (which I don't think makes much sense on its own: for small files,
locality to semantically similar files is much more important, e.g. files
needed during boot, or files needed for starting a specific application).
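
Just to make the chunk idea concrete: such a chunk-wise defrag can already
be expressed with the existing BTRFS_IOC_DEFRAG_RANGE ioctl by limiting
start/len to one chunk. An untested sketch, assuming the definitions from
<linux/btrfs.h>; the 8MB chunk size and 256K extent threshold are just
example values:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h>

#define CHUNK_SIZE (8ULL * 1024 * 1024)           /* the 8MB from above */

/* Defragment only one CHUNK_SIZE-sized chunk of the file, not all of it. */
static int defrag_chunk(int fd, unsigned long long chunk)
{
        struct btrfs_ioctl_defrag_range_args args;

        memset(&args, 0, sizeof(args));
        args.start = chunk * CHUNK_SIZE;          /* only this chunk...    */
        args.len = CHUNK_SIZE;                    /* ...not the whole file */
        args.extent_thresh = 256 * 1024;          /* leave extents >= 256K
                                                     alone, so large extents
                                                     crossing chunk borders
                                                     are not re-split      */
        args.flags = BTRFS_DEFRAG_RANGE_START_IO; /* start writeback now   */

        return ioctl(fd, BTRFS_IOC_DEFRAG_RANGE, &args);
}

int main(int argc, char **argv)
{
        int fd;

        if (argc < 3) {
                fprintf(stderr, "usage: %s FILE CHUNK-INDEX\n", argv[0]);
                return 1;
        }

        fd = open(argv[1], O_RDWR);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        if (defrag_chunk(fd, strtoull(argv[2], NULL, 0)) < 0)
                perror("BTRFS_IOC_DEFRAG_RANGE");

        close(fd);
        return 0;
}
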
If, after those strategies, it is still important to get your file's chunks
laid out cleanly one after another, one could still run a manual defrag
which does the complete job.
BTW: Is it possible to physically relocate files in btrfs? I think that is
more important than defragmentation. Is such a thing possible with the
defrag IOCTL? My current understanding is that defragmenting a file just
rewrites it as a contiguous block at some more or less random location,
which is not always the best option because it can hurt boot performance a
lot and thus defeats the purpose of what one is trying to achieve. It feels
a bit like playing the lottery when I defragment my boot files only to find
that the boot process is now slower instead of faster. :-\
--
Replies to list only preferred.