[systemd-devel] persisting sriov_numvfs

Tom Gundersen teg at jklm.no
Tue Jan 27 04:40:33 PST 2015


Hi Dan,

On Mon, Jan 19, 2015 at 3:18 PM, Dan Kenigsberg <danken at redhat.com> wrote:
> I'm an http://oVirt.org developer, and we plan to (finally) support
> SR-IOV cards natively. Working on this feature, we've noticed that
> something is missing in the platform OS.
>
> If I maintain a host with sr-iov cards, I'd like to use the new kernel
> method of defining how many virtual functions (VFs) are to be exposed by
> each physical function:
>
>     # echo 3 > /sys/class/net/enp2s0f0/device/sriov_numvfs
>
> This spawns 3 new devices, for which udev allocated (on my host) the names
> enp2s16, enp2s16f2 and enp2s16f4.
>
> I can attach these VFs to virtual machines, but I can also use them as
> yet another host NIC. Let's assume that I did the latter, and persisted
> its IP address using initscripts in
> /etc/sysconfig/network-scripts/ifcfg-enp2s16f4.
>
> However, on the next boot, sriov_numvfs is reset to 0, there's no
> device named enp2s16f4, and certainly no IP address assigned to it.
>
> The admin can solve his own private issue by writing a service to start
> after udev allocates device names but before network services kick in,
> and re-apply his "echo" there. But it feels like something that should
> be solved in a more generic fashion. It is also not limited to network
> devices. A similar issue would affect anything that attempts to refer to
> a VF by its name, and survive reboot.
>
> How should this be implemented in the realm of systemd?

Sorry for the delay in getting back to you.

My understanding is that the number of VFs must basically be set once
and not changed after that? It seems that it is possible to change it,
but only at the cost of removing all of them first, which I guess is
not really an option in case they are in use.
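
To make that constraint concrete, here is a minimal shell sketch of
changing the VF count on a PF that may already have VFs. The helper
name set_numvfs is hypothetical and not part of any existing tool; it
assumes the kernel refuses to go directly from one non-zero count to
another, so it writes 0 first, which removes every existing VF.

```shell
# Hypothetical helper: set the VF count via a PF's sriov_numvfs attribute.
# $1 is the attribute path, e.g. /sys/class/net/enp2s0f0/device/sriov_numvfs
# $2 is the desired number of VFs.
set_numvfs() {
    attr=$1
    want=$2
    have=$(cat "$attr") || return 1

    # Nothing to do if the requested count is already in place.
    [ "$have" = "$want" ] && return 0

    # Assumption: a non-zero count cannot be changed to another non-zero
    # count directly, so drop to 0 first. This destroys all current VFs.
    if [ "$have" != 0 ]; then
        echo 0 > "$attr" || return 1
    fi

    # If the goal was 0 we are done; otherwise request the new count.
    [ "$want" = 0 ] || echo "$want" > "$attr"
}
```

Run as root, "set_numvfs /sys/class/net/enp2s0f0/device/sriov_numvfs 3"
would correspond to the echo command quoted above, with the difference
that it also copes with a PF whose VFs are already allocated.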

If that is the case, and what you essentially want is to just override
the kernel default (0 VFs), then I think we can add a feature to
udev's .link files to handle this.

This means the VFs will be allocated very early during boot, as soon
as the PF appears.

On the downside, there is no mechanism to nicely update this setting
at runtime (which may not be a problem if that is not really
supported anyway); you would have to reinsert the PF or reboot the
machine for the .link file to be applied. Moreover, .link files are
specific to network devices, so this will not help you with other
kinds of PFs. I think that may be ok, depending on how common it is to
use this for non-network hardware. If that is a niche use case, it will
always be possible to write a udev rule to achieve the same result as
the .link file (for any kind of hardware); it is just a bit more
cumbersome.
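
For illustration, the udev-rule route for a network PF might look
roughly like what the sketch below writes out. Everything specific in
it is an assumption rather than something from this thread: the helper
name write_sriov_rule, the rule file name, and the ixgbe driver used
in the usage note. The rule matches on ID_NET_DRIVER, which udev's
net_setup_link builtin sets, so the file should sort after
80-net-setup-link.rules. Non-network hardware would need a different
match.

```shell
# Hypothetical helper: write a udev rule that sets sriov_numvfs as soon as
# a network interface bound to the given PF driver is added.
# $1 = rule file to create, $2 = PF driver name, $3 = number of VFs.
write_sriov_rule() {
    rule_file=$1
    driver=$2
    count=$3
    # ATTR{device/sriov_numvfs} refers to the sysfs attribute of the PCI
    # device behind the network interface that triggered the event.
    cat > "$rule_file" <<EOF
ACTION=="add", SUBSYSTEM=="net", ENV{ID_NET_DRIVER}=="$driver", ATTR{device/sriov_numvfs}="$count"
EOF
}
```

A call such as
"write_sriov_rule /etc/udev/rules.d/90-sriov-numvfs.rules ixgbe 3"
would then request 3 VFs for every ixgbe PF on each boot. VFs normally
bind to a separate VF driver, so the rule should not fire again for
the VFs themselves.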

What do you think?

I do agree with Lennart that it would be really nice to treat this
the same as VLANs and allow networkd/NetworkManager to add/remove VFs
independently of each other using the standard netlink mechanism that
is used for all other sorts of network devices. However, this is
something we could do in addition to .link files, so one does not
preclude the other.

Cheers,

Tom

