[systemd-devel] [systemd-219] Journal spamming by umount / high CPU usage

Kai Krakow hurikhan77 at gmail.com
Sun Mar 29 08:12:35 PDT 2015


Hello!

I've got a automount point for a daily USB backup job. Due to some 
instabilities of early USB3 chipsets and early USB3 devices, the mounted 
device sometimes wents offline and eventually comes back after a while but 
my backup job is stuck (rsync). I configured rsync to shutdown upon block io 
timeouts. This in turn makes the systemd automounter think (correctly) that 
the mount point is no longer in use - but it cannot be unmounted. The 
problem here is, that it spams the journal with hundreds of messages per 
minute all day long (or as long as I don't reboot):

Mär 29 16:49:00 jupiter umount[19547]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19575]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19577]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19645]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19671]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19719]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19795]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19843]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19853]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19907]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt
Mär 29 16:49:00 jupiter umount[19947]: umount: /mnt/private/usb-backup: 
umount fehlgeschlagen: Die Operation ist nicht erlaubt

The German phrase means "umount failed: operation not permitted". This in 
turn leads to a constant CPU usage of systemd and journald between 30% and 
70% added together and the journal grows to gigabytes, meanwhile "old" 
journals become rotated away so that I currently cannot even look back far 
enough to see the logs right before that repeating incident. The system 
reboots just fine, there's no hanging systemd job during shutdown.

The thing is, the device is back - it switched to another device node, tho 
(sda at boot, not sdf). The systemd automounter happily mounts it when 
changing to the directory, and I can list all my files fine. The backup 
scratch area is dirty/incomplete as expected - but there are no file system 
errors or inconsistencies.

Related dmesg log purely for informational purpose to see the incident in 
chronological order, this in not about discussing a btrfs issue (and it is 
not):

$ dmesg|egrep "sd[af]"
[    2.241559] sd 6:0:0:0: [sda] 3907029164 512-byte logical blocks: (2.00 
TB/1.81 TiB)
[    2.241878] sd 6:0:0:0: [sda] Write Protect is off
[    2.241883] sd 6:0:0:0: [sda] Mode Sense: 03 00 00 00
[    2.242380] sd 6:0:0:0: [sda] No Caching mode page found
[    2.242419] sd 6:0:0:0: [sda] Assuming drive cache: write through
[    2.267163]  sda: sda1
[    2.268617] sd 6:0:0:0: [sda] Attached SCSI disk
[    4.349156] BTRFS: device label usb-backup devid 1 transid 28610 
/dev/sda1
# backup started here:
[13059.571377] BTRFS info (device sda1): force zlib compression
[13059.571382] BTRFS info (device sda1): disk space caching is enabled
# USB hickup here:
[58885.081027] sd 7:0:0:0: [sdf] 3907029164 512-byte logical blocks: (2.00 
TB/1.81 TiB)
[58885.081433] sd 7:0:0:0: [sdf] Write Protect is off
[58885.081437] sd 7:0:0:0: [sdf] Mode Sense: 03 00 00 00
[58885.081979] sd 7:0:0:0: [sdf] No Caching mode page found
[58885.083306] sd 7:0:0:0: [sdf] Assuming drive cache: write through
# backup failed here:
[58885.250710] BTRFS: error (device sda1) in btrfs_commit_transaction:2010: 
errno=-5 IO failure (Error while writing out transaction)
[58885.252190] BTRFS info (device sda1): forced readonly
[58885.252195] BTRFS warning (device sda1): Skipping commit of aborted 
transaction.
[58885.252359] BTRFS: error (device sda1) in cleanup_transaction:1670: 
errno=-5 IO failure
[58885.253900] BTRFS info (device sda1): delayed_refs has NO entry
[58885.253913] BTRFS error (device sda1): commit super ret -5
[58893.553877]  sdf: sdf1
[58893.555061] sd 7:0:0:0: [sdf] Attached SCSI disk
# I've cd'ed into the directory here:
[99457.380783] BTRFS info (device sdf1): force zlib compression
[99457.380788] BTRFS info (device sdf1): disk space caching is enabled


Here's the configuration:

$ fgrep usb-backup /etc/fstab
LABEL=usb-backup /mnt/private/usb-backup btrfs \
    noauto,noatime,compress-force=zlib,subvolid=0,x-systemd.automount \
    0 0

$ systemctl cat mnt-private-usb\\x2dbackup.{automount,mount}
# /run/systemd/generator/mnt-private-usb\x2dbackup.automount
# Automatically generated by systemd-fstab-generator

[Unit]
SourcePath=/etc/fstab
Documentation=man:fstab(5) man:systemd-fstab-generator(8)
Before=local-fs.target
[Automount]
Where=/mnt/private/usb-backup

# /run/systemd/generator/mnt-private-usb\x2dbackup.mount
# Automatically generated by systemd-fstab-generator

[Unit]
SourcePath=/etc/fstab
Documentation=man:fstab(5) man:systemd-fstab-generator(8)

[Mount]
What=/dev/disk/by-label/usb-backup
Where=/mnt/private/usb-backup
Type=btrfs
Options=noauto,noatime,compress-force=zlib,subvolid=0,x-systemd.automount


I wonder if systemd automount should become smarter about handling such 
hickups because they can happen out in the real world. It will probably have 
a similar or same problem with users just disconnecting a device that was 
automounted by systemd. This has not been an issue in earlier versions of 
systemd so this looks like a regression, but I cannot say which version. 
Probably it was still okay with v216 or v217.

I see at least to problems which should be fixed:

  * automounter umount can loop until infinity in case of errors
  * automounter can become very spammy in the journal

The latter one could be fixed with different burst settings maybe tho I 
think that should not be the purpose of that setting for this particular 
problem. The first looks like a regression. OTOH, in previous versions maybe 
it wasn't just so spammy to the journal so I didn't discover the problem.

-- 
Replies to list only preferred.



More information about the systemd-devel mailing list