[systemd-devel] writing a systemd unit for nbd devices

Andrey Borzenkov arvidjaar at gmail.com
Fri Mar 21 06:18:51 PDT 2014


On Fri, Mar 21, 2014 at 4:05 PM, Wouter Verhelst <wouter at debian.org> wrote:
>
> First, let me explain how NBD works.
>
> The client side of the NBD protocol is implemented partially in
> user space, and partially in kernel space. The user space part handles
> connecting and the initial protocol negotiation; but once that has been
> done, nbd-client calls the NBD_DO_IT ioctl() on an open /dev/nbdX file,
> which hands the socket file descriptor to the kernel and which does not
> return until the device is disconnected (with "nbd-client -d", or
> because the link to the server died). As such, the nbd-client process
> needs to continue running while the device is connected.
>
> In addition, nbd-client needs to fork() and open() the /dev/nbdX device
> to support partitioned NBD devices (due to a deadlock issue, that can't
> be done from the initial NBD_DO_IT ioctl handling, so it is done in the
> first open() instead).
>

But the parent nbd-client remains around, right?

> For supporting root-on-NBD in conjunction with systemd, I've already
> added a -systemd-mark option to nbd-client so it will make argv[0][0]
> read as '@' (I think that method is slightly ugly, but that's a
> discussion for another time). In Debian, I've already supported
> root-on-NBD for quite a while with an initramfs script and some code in
> the init script of nbd-client which adds the PID for the root NBD device
> to the list of PIDs that shouldn't be killed; I understand that dracut
> (and hence Fedora as well) have similar support (though I'm not sure how
> well it all works).
>
> For non-root NBD devices, however, the situation would be slightly more
> complex. This is supported in Debian by the package's init script;
> AFAIK, though, no other distribution has support for that in its init
> scripts (or upstart/systemd configuration, yada yada).
>
> Currently, in Debian, the situation is that there is a configuration
> file, /etc/nbd-client, which is sourced in the init script, and which
> contains bash arrays with configuration. The init script then loops over
> those bash arrays and runs the appropriate nbd-client command to connect
> the device. Any actual mounting (etc) of the device, then, is left to
> other init scripts. It expects that filesystems on NBD devices have the
> "_netdev" option listed in their fstab entries, so that they will be
> mounted by the "mountnfs" rather than "mountall" init script.
>
> As can be expected, this took several iterations to get right in all
> corner cases. It seems to be working fine now, however.
>
> When converting this to systemd unit files, from skimming over the
> documentation, I guess I'll need something along the lines of the
> following:
>
> - I will need to create dev-nbd@.device unit files. These unit files would
>   connect the device when needed.

How do you determine when it is needed?

From your description, I'd suggest as a first step a generator that
mostly does what the init script already does:

- create a generator that reads /etc/nbd-client and creates a service
(not device) file for every NBD device. It could be a link to a common
template, or a separate unit - it does not really matter.

- this service will simply start nbd-client as you describe above.

Those services will be of Type=simple; they should probably be ordered
After=network.target and definitely Before=remote-fs-pre.target for
_netdev to work.
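
As a rough sketch of what such a generated unit might contain, the
following Python renders a service file for one device. Only Type=simple
and the two ordering lines come from the suggestion above. The function
name, the dataclass fields, the Description text, the /sbin path and the
nbd-client command line (including -nofork to keep the process in the
foreground) are assumptions for illustration and should be checked
against the installed nbd-client.

```python
from dataclasses import dataclass


@dataclass
class NbdExport:
    """One NBD connection; the field names are hypothetical."""
    device: str  # for example "/dev/nbd0"
    host: str
    port: int


def render_unit(export: NbdExport) -> str:
    """Return the text of a service unit that runs nbd-client for one device."""
    # Assumed command line; adjust to what your nbd-client version accepts.
    exec_start = (
        f"/sbin/nbd-client -nofork {export.host} {export.port} {export.device}"
    )
    return "\n".join([
        "[Unit]",
        f"Description=NBD client for {export.device}",
        "After=network.target",
        "Before=remote-fs-pre.target",
        "",
        "[Service]",
        "Type=simple",
        f"ExecStart={exec_start}",
        "",
    ])
```

A generator would write this text into the output directory that systemd
passes to it as its first argument, one file per configured device.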

> - It may be a good idea to move the configuration from a sourced shell
>   script snippet to "something else". I do want to retain some backwards
>   compatibility, but it's okay if that's just a program interpreting the
>   shell script snippet and outputting something more modern.
>

A generator allows you to retain the existing configuration. And at the
very beginning the generator can be just a shell script (especially if it
does not need to do more than just "ln -s"). But I suspect you will need
to at least pass information about the NBD server, and I presume it may
be different for each device, right?
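
If the generator is not a shell script, it could pull the per-device
settings out of the existing file without executing it. The sketch below
assumes indexed assignments of the form NBD_HOST[0]=..., which is a guess
at the layout of /etc/nbd-client; the variable names, the regular
expression and the function name are illustrative, and real shell quoting
or expansions are not handled.

```python
import re
from collections import defaultdict
from typing import Dict

# Matches lines such as NBD_HOST[0]="server.example" (assumed layout).
ASSIGNMENT = re.compile(r'^\s*(NBD_[A-Z_]+)\[(\d+)\]=(.*)$')


def parse_config(text: str) -> Dict[int, Dict[str, str]]:
    """Group indexed NBD_* assignments by their array index."""
    devices: Dict[int, Dict[str, str]] = defaultdict(dict)
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if not match:
            continue  # comments, blank lines, anything not understood
        name, index, value = match.groups()
        # Drop surrounding whitespace and simple quotes only.
        devices[int(index)][name] = value.strip().strip('"\'')
    return dict(devices)
```

Each entry of the result then carries whatever the server details are for
that device, so they can differ from one device to the next.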

> I do foresee some problems, though, and I'd like to see if these are
> indeed problems or whether I just need to read more documentation. I
> haven't found an easy answer in the documentation that I've read, but
> then maybe I haven't been looking very well.
>
> NBD device nodes are a bit special in that due to the way NBD devices
> are connected, the device must exist at all times, even before it is
> connected; I suspect (though have not actually tried) that systemd will
> only try to "start" a .device unit file if the device node itself is not
> there yet. For NBD, the difference between a connected device and a
> not-connected one can be spotted in the apparent size of the block
> device (the BLKGETSIZE64 ioctl will return 0 for a not-connected device)
> and in the presence (or lack thereof) of a file /sys/block/nbdX/pid (if
> it exists, it contains the PID of the nbd-client process handling the
> connection; if it does not, the device is not connected), not by the
> presence (or lack thereof) of the device node itself.
>
> This is not the case for partitions of NBD devices, however; these will
> only show up after the first open(), as explained above. As such, I
> might need two templates: one which connects the NBD device (for a
> /dev/nbdX device), and one for the partition (/dev/nbdXpY) which simply
> depends on the regular NBD device. However, if I understand correctly,
> it would not seem to be possible to create an nbdX template that does
> not also match nbdXpY.
>

nbdX is unrelated to nbdXpY. If those devices are created by the kernel,
you do not really need to do anything.
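
For reference, the connected/not-connected test described in the quoted
text (presence of /sys/block/nbdX/pid) could be sketched as follows. The
function name and the sysfs_root parameter are assumptions added so the
check can be tried against a scratch directory; the BLKGETSIZE64 variant
is not shown.

```python
from pathlib import Path
from typing import Optional


def nbd_client_pid(device: str, sysfs_root: str = "/sys") -> Optional[int]:
    """Return the PID recorded for a connected NBD device, else None.

    `device` is the bare name, for example "nbd0".
    """
    pid_file = Path(sysfs_root) / "block" / device / "pid"
    try:
        return int(pid_file.read_text().strip())
    except FileNotFoundError:
        # No pid file: the device node may exist, but it is not connected.
        return None
```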

