[systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

Chris Murphy lists at colorremedies.com
Mon Oct 12 20:42:18 UTC 2020


On Mon, Oct 12, 2020 at 1:33 AM Lennart Poettering
<lennart at poettering.net> wrote:
>
> On So, 11.10.20 14:57, Chris Murphy (lists at colorremedies.com) wrote:
>
> > Hi,
> >
> > A Fedora 32 (systemd-245.8-2.fc32) user has a 10-drive Btrfs raid1 set
> > to mount in /etc/fstab:
> >
> > UUID=f89f0a16-<snipped>  /srv   btrfs  defaults,nofail,x-systemd.requires=/  0 0
> >
> > For some reason, systemd is trying to mount this file system before
> > all ten devices are ready. Supposedly this rule applies:
> > https://github.com/systemd/systemd/blob/master/rules.d/64-btrfs.rules.in
>
> udev calls the btrfs ready ioctl whenever a new btrfs fs block device
> shows up. The ioctl will fail as long as not all devices that make up
> the fs have shown up. It succeeds once all devices for the fs are
> there. i.e. for n=10 devices it will return failure 9 times, and
> success the 1 final time.
>
> When precisely it returns success or failure is entirely up to the btrfs kernel
> code. systemd/udev doesn't have any control over that. The udev btrfs
> builtin is too trivial for that: it just calls the ioctl and that
> pretty much is it.
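
To check my reading of the above, here is a toy model in Python of the
behaviour as described. It is only a sketch: simulate_ready_checks is a
made-up name, and the "all devices seen" test is my assumption about what the
description implies, not the kernel's actual logic.

```python
from typing import List


def simulate_ready_checks(total_devices: int) -> List[bool]:
    """Toy model: one 'ready' check each time a member device shows up."""
    seen = 0
    results = []
    for _ in range(total_devices):
        seen += 1  # another member device has appeared
        # Assumption: the check succeeds only once every device is present.
        results.append(seen == total_devices)
    return results


results = simulate_ready_checks(10)
print(results.count(False), results.count(True))  # prints "9 1"
```

If this model is right, nothing should treat the file system as ready until
the tenth device has been scanned.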

What does the kernel message below mean? Does it mean the 'btrfs ready'
ioctl has been called at this moment and the device is ready? That is,
this specific device is ready now, but was not before now?

[   30.923721] kernel: BTRFS: device label BTRFS_RAID1_srv devid 1 transid 60815 /dev/sdg scanned by systemd-udevd (710)

I ask because I see six such lines for this file system before the mount
attempt, and four such lines after it. If "all devices ready" is not
true until the last such line appears, then the mount is happening too
soon for some reason.
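
For anyone who wants to repeat that count on their own logs, a rough Python
sketch follows. The regular expression is only inferred from the one message
quoted above, count_scans is a hypothetical helper, and the mount timestamp
has to be read off the log by hand. It also assumes each message sits on a
single line and that the label contains no spaces.

```python
import re

# Pattern inferred from the single kernel message quoted above (an assumption).
SCAN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+kernel:\s+BTRFS:\s+device\s+label\s+(?P<label>\S+)"
    r"\s+devid\s+(?P<devid>\d+)\s+transid\s+\d+\s+(?P<dev>\S+)\s+scanned\s+by"
)


def count_scans(log_lines, label, mount_ts):
    """Count 'scanned by' messages for `label` before and after `mount_ts`."""
    before = after = 0
    for line in log_lines:
        m = SCAN_RE.search(line)
        if not m or m.group("label") != label:
            continue  # not a scan message for this file system
        if float(m.group("ts")) < mount_ts:
            before += 1
        else:
            after += 1
    return before, after
```

Calling count_scans(lines, "BTRFS_RAID1_srv", t), where t is the timestamp of
the mount attempt, returns a (before, after) pair. A non-zero second number
would suggest the mount ran before every device had been scanned.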


> For historical reasons udev log level is independent from the rest of
> systemd log level. Thus use udev.log_priority=debug to turn on udev
> debug logging.

I'll have him retry with udev.log_priority=debug and if I get a moment
I'll try to reproduce. The difficulty is that the case of truly missing
devices is easy to reproduce and appears to work, whereas in this case
the devices are merely late being scanned for whatever reason (maybe
they take longer to spin up, maybe the HBA they're connected to is just
slow or has a later-loading driver, etc.).


-- 
Chris Murphy

