[systemd-devel] what does --ephemeral really do on btrfs?

Elias Probst mail at eliasprobst.eu
Fri Jul 3 04:31:55 PDT 2015


On 07/02/2015 07:22 AM, Johannes Ernst wrote:
> Mine takes an awful long time ā€” blocking IO on the device in the awful long meantime ā€” and Iā€™m puzzled why. Does it perhaps copy (deep? references only?) the entire drive? 
> 

I've seen a similar issue here which was caused by a obviously corrupted
btrfs volume.
'systemd-nspawn' and 'btrfs-transacti' consumed a full CPU core and
never stopped doing so - they were just stuck there.

After enabling CONFIG_HANGCHECK_TIMER=y in my Kernel I was able to get
some more information out of it:

kernel: INFO: task kworker/u8:2:213 blocked for more than 120 seconds.
kernel:       Not tainted 4.1.0-gentoo #4
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kernel: kworker/u8:2    D ffff8802344d3978     0   213      2 0x00000000
kernel: Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
kernel:  ffff8802344d3978 ffff8802361ee350 ffff880234ad8390 0000000000000246
kernel:  ffff8802344d4000 ffff8801f5296ae0 ffff8802344d3a80 ffff8802304ac000
kernel:  ffff880001447300 ffff8802344d3998 ffffffff94963329 0000000000000000
kernel: Call Trace:
kernel:  [<ffffffff94963329>] schedule+0x74/0x83
kernel:  [<ffffffff943e478f>] btrfs_tree_lock+0xa7/0x1b7
kernel:  [<ffffffff941346bd>] ? wait_woken+0x74/0x74
kernel:  [<ffffffff943d03eb>] lock_extent_buffer_for_io+0x39/0x199
kernel:  [<ffffffff943d1371>] btree_write_cache_pages+0x306/0x35d
kernel:  [<ffffffff943ad2db>] btree_writepages+0x1e/0x5d
kernel:  [<ffffffff941ad9b9>] do_writepages+0x19/0x27
kernel:  [<ffffffff9420d42d>] __writeback_single_inode+0x4e/0x258
kernel:  [<ffffffff9420d804>] writeback_sb_inodes+0x1cd/0x35a
kernel:  [<ffffffff9420da0b>] __writeback_inodes_wb+0x7a/0xb3
kernel:  [<ffffffff9420dbeb>] wb_writeback+0x10d/0x256
kernel:  [<ffffffff941acb92>] ? bdi_dirty_limit+0x2c/0x8c
kernel:  [<ffffffff9420e27f>] bdi_writeback_workfn+0x1a1/0x352
kernel:  [<ffffffff94116b4b>] process_one_work+0x1d5/0x364
kernel:  [<ffffffff9411756e>] worker_thread+0x2d2/0x3f4
kernel:  [<ffffffff9411729c>] ? cancel_delayed_work_sync+0x10/0x10
kernel:  [<ffffffff9411b44d>] kthread+0xcd/0xd5
kernel:  [<ffffffff94120000>] ?
ftrace_raw_output_sched_migrate_task+0x7f/0x7f
kernel:  [<ffffffff9411b380>] ? kthread_create_on_node+0x17a/0x17a
kernel:  [<ffffffff94967d62>] ret_from_fork+0x42/0x70
kernel:  [<ffffffff9411b380>] ? kthread_create_on_node+0x17a/0x17a
kernel: INFO: task btrfs-transacti:2020 blocked for more than 120 seconds.
kernel:       Not tainted 4.1.0-gentoo #4
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kernel: btrfs-transacti D ffff8802347838a8     0  2020      2 0x00000080
kernel:  ffff8802347838a8 ffff8802361e02d0 ffff88023477a350 0000000000000246
kernel:  ffff880234784000 ffff8801f5296ae0 ffff88023477a350 ffff8802347838f8
kernel:  ffff88023477a350 ffff8802347838c8 ffffffff94963329 0000000000000000
kernel: Call Trace:
kernel:  [<ffffffff94963329>] schedule+0x74/0x83
kernel:  [<ffffffff943e44eb>] btrfs_tree_read_lock+0xc0/0xea
kernel:  [<ffffffff941346bd>] ? wait_woken+0x74/0x74
kernel:  [<ffffffff94395c6d>] btrfs_search_old_slot+0x509/0x7fc
kernel:  [<ffffffff9440587b>] __resolve_indirect_refs+0x192/0x5d9
kernel:  [<ffffffff943e43e5>] ? btrfs_clear_lock_blocking_rw+0x78/0xbe
kernel:  [<ffffffff943d0c46>] ? release_extent_buffer+0x34/0xb1
kernel:  [<ffffffff943d105f>] ? free_extent_buffer+0x6e/0x7a
kernel:  [<ffffffff94406905>] find_parent_nodes+0xb28/0xfaa
kernel:  [<ffffffff94406e19>] __btrfs_find_all_roots+0x92/0xf0
kernel:  [<ffffffff94406ed3>] btrfs_find_all_roots+0x45/0x65
kernel:  [<ffffffff94391387>] ? btrfs_get_tree_mod_seq+0x72/0x86
kernel:  [<ffffffff9440ad70>] btrfs_delayed_qgroup_accounting+0x32b/0x962
kernel:  [<ffffffff943a58e4>] btrfs_run_delayed_refs+0x185/0x210
kernel:  [<ffffffff943b3e0a>] btrfs_commit_transaction+0x47/0xaf0
kernel:  [<ffffffff943b4c5a>] ? start_transaction+0x3a7/0x52a
kernel:  [<ffffffff943b0550>] transaction_kthread+0xfa/0x1d0
kernel:  [<ffffffff943b0456>] ? btrfs_cleanup_transaction+0x498/0x498
kernel:  [<ffffffff9411b44d>] kthread+0xcd/0xd5
kernel:  [<ffffffff94120000>] ?
ftrace_raw_output_sched_migrate_task+0x7f/0x7f
kernel:  [<ffffffff9411b380>] ? kthread_create_on_node+0x17a/0x17a
kernel:  [<ffffffff94967d62>] ret_from_fork+0x42/0x70
kernel:  [<ffffffff9411b380>] ? kthread_create_on_node+0x17a/0x17a


In this state, I was even unable to fully copy away my machines into a
tarball which just ended up in

tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

and

gzip: stdin: unexpected end of file


Didn't get a reply in #btrfs regarding this yet and didn't get around
yet to raise this on the btrfs ML.

I unmounted the machine volume (which was just backed by a raw image on
my ext4 root volume), moved the raw volume file away for possible later
analysis and recreated the volume from scratch (machinectl set-limit 6G).


- Elias

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/systemd-devel/attachments/20150703/f1674941/attachment.sig>


More information about the systemd-devel mailing list