[systemd-devel] persisting sriov_numvfs

Martin Polednik mpolednik at redhat.com
Tue Jan 27 05:41:50 PST 2015



----- Original Message -----
> From: "Lennart Poettering" <lennart at poettering.net>
> To: "Martin Polednik" <mpolednik at redhat.com>
> Cc: "Andrei Borzenkov" <arvidjaar at gmail.com>, systemd-devel at lists.freedesktop.org, ibarkan at redhat.com
> Sent: Tuesday, January 27, 2015 2:21:21 PM
> Subject: Re: [systemd-devel] persisting sriov_numvfs
> 
> On Tue, 27.01.15 07:35, Martin Polednik (mpolednik at redhat.com) wrote:
> 
> > > > > Hmm, I see. In many ways this feels like VLAN setup from a
> > > > > configuration PoV, right? i.e. you have one hw device the driver
> > > > > creates, and then you configure a couple of additional interfaces on
> > > > > top of it.
> > > > > 
> > > > > This of course then raises the question: shouldn't this functionality
> > > > > be exposed by the kernel the same way as VLANs? i.e. with an
> > > > > rtnetlink-based API to create additional interfaces, instead of /sys?
> > > > > 
> > > > > In systemd I figure the right way to expose this to the user would
> > > > > be via .netdev files, the same way as we expose VLAN devices. Note
> > > > > however that that would be networkd territory.
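(For reference, the existing VLAN exposure in networkd that Lennart
alludes to looks roughly like this; interface name and VLAN id are
arbitrary examples:

    # /etc/systemd/network/vlan10.netdev
    [NetDev]
    Name=vlan10
    Kind=vlan

    [VLAN]
    Id=10

A VF "kind" could in principle be declared the same way, on top of a
physical device.)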
> > > > 
> > > > No, this is not limited to NICs. It is a generic feature that can in
> > > > principle be used with any hardware; there are e.g. FC or FCoE HBAs
> > > > with SR-IOV support. It is true that today it mostly comes with NICs,
> > > > though.
> > > > 
> > > > Any general framework for setting it up should not be tied to a
> > > > specific card type.
> > > 
> > > Well, I doubt that there will be graphics cards that support this,
> > > right? I mean, it's really only network connectivity that can support
> > > a concept like this easily, since you can merge packet streams from
> > > multiple VMs on one connection. However, I am not sure how you would
> > > physically merge VGA streams onto a single VGA connector...
> > > 
> > > If this is about Ethernet, FC, or FCoE, then I still think that the
> > > network management solution should consider this as something you can
> > > configure on physical links, like VLANs. Hence networkd or
> > > NetworkManager and so on should cover it.
> > > 
> > > Lennart
> > 
> > AFAIK some storage cards support this; for GPUs it could be useful for
> > GPGPU applications and such, where you don't care about the physical
> > output but about the processing cores of the GPU itself (I'm not aware
> > of such an implementation yet; NVIDIA seems to be doing something, but
> > the details are nowhere to be found).
> 
> Hmm, so there are three options I think.
> 
> a) Expose this in networkd .netdev files, as I suggested
>    originally. This would be appropriate if we can add and remove VFs
>    freely at any time, without the other VFs being affected. Can you
>    clarify whether going from, let's say, 4 to 5 VFs requires removing
>    all VFs and recreating them? This would be the nicest exposure I
>    think, but it would be specific to networkd.

Removing and recreating the VFs is unfortunately required when changing
their number (both ways - increasing and decreasing the count).

https://www.kernel.org/doc/Documentation/PCI/pci-iov-howto.txt
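
To illustrate (the PCI address 0000:01:00.0 below is just a made-up
example), going from 4 to 5 VFs per the howto means:

    # how many VFs the device supports at most
    cat /sys/bus/pci/devices/0000:01:00.0/sriov_totalvfs

    # the count must first drop to 0; writing a new non-zero value
    # while VFs are enabled fails with -EBUSY
    echo 0 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
    echo 5 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs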

> b) Expose this via udev .link files. This would be appropriate if
>    adding/removing VFs is a one-time thing, when a device pops
>    up. This would be networking-specific and not cover anything else
>    like GPUs or storage. Would still be quite nice, and would probably
>    be the best option after a) if VFs cannot be added/removed
>    dynamically all the time without affecting the other VFs.
> 
> c) Expose this via udev rules files. This would be generic and would
>    work for networking as well as GPUs or storage. It would entail
>    writing out rules files when you want to configure the number of
>    VFs. Care needs to be taken to use the right way to identify
>    devices as they come and go, so that you can apply configuration to
>    them in a stable way. This is somewhat uglier, as we don't really
>    think that udev rules should be used that much for configuration,
>    especially not for configuration written out by programs rather
>    than manually. However, logind already does this, to assign seat
>    identifiers to udev devices to enable multi-seat support.
> 
> A combination of b) for networking and c) for the rest might be an
> option too.

I myself would vote for b) + c), since we want to cover most of the
possible use cases for SR-IOV and MR-IOV, which hopefully share the
interface; adding Dan back to CC as he is the one to speak for networking.
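
For c), a rough sketch of what such a generated rule could look like
(PCI address and VF count are again made-up examples):

    # /etc/udev/rules.d/70-sriov.rules (hypothetical file)
    ACTION=="add", SUBSYSTEM=="pci", KERNEL=="0000:01:00.0", ATTR{sriov_numvfs}="4"

udev's ATTR{...}="..." assignment writes the value to the matching sysfs
attribute when the device appears, which gives exactly the one-shot
setup that b) and c) assume; b) would still need a new .link setting.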

> Lennart
> 
> --
> Lennart Poettering, Red Hat
> 

