[PATCH 0/5] QEMU VFIO live migration

Dr. David Alan Gilbert dgilbert at redhat.com
Wed Mar 27 20:18:54 UTC 2019


* Zhao Yan (yan.y.zhao at intel.com) wrote:
> On Wed, Feb 20, 2019 at 07:42:42PM +0800, Cornelia Huck wrote:
> > > > > >   b) How do we detect if we're migrating from/to the wrong device or
> > > > > > version of device?  Or say to a device with older firmware or perhaps
> > > > > > a device that has less device memory ?  
> > > > > Actually it's still an open for VFIO migration. Need to think about
> > > > > whether it's better to check that in libvirt or qemu (like a device magic
> > > > > along with verion ?).  
> > > 
> > > We must keep the hardware generation is the same with one POD of public cloud
> > > providers. But we still think about the live migration between from the the lower
> > > generation of hardware migrated to the higher generation.
> > 
> > Agreed, lower->higher is the one direction that might make sense to
> > support.
> > 
> > But regardless of that, I think we need to make sure that incompatible
> > devices/versions fail directly instead of failing in a subtle, hard to
> > debug way. Might be useful to do some initial sanity checks in libvirt
> > as well.
> > 
> > How easy is it to obtain that information in a form that can be
> > consumed by higher layers? Can we find out the device type at least?
> > What about some kind of revision?
> hi Alex and Cornelia
> for device compatibility, do you think it's a good idea to use "version"
> and "device version" fields?
> 
> version field: identify live migration interface's version. it can have a
> sort of backward compatibility, like target machine's version >= source
> machine's version. something like that.
> 
> device_version field consists two parts:
> 1. vendor id : it takes 32 bits. e.g. 0x8086.
> 2. vendor proprietary string: it can be any string that a vendor driver
> thinks can identify a source device. e.g. pciid + mdev type.
> "vendor id" is to avoid overlap of "vendor proprietary string".
> 
> 
> struct vfio_device_state_ctl {
>      __u32 version;            /* ro */
>      __u8 device_version[MAX_DEVICE_VERSION_LEN];            /* ro */
>      struct {
>      	__u32 action; /* GET_BUFFER, SET_BUFFER, IS_COMPATIBLE*/
> 	...
>      }data;
>      ...
>  };
> 
> Then, an action IS_COMPATIBLE is added to check device compatibility.
> 
> The flow to figure out whether a source device is migratable to target device
> is like that:
> 1. in source side's .save_setup, save source device's device_version string
> 2. in target side's .load_state, load source device's device version string
> and write it to data region, and call IS_COMPATIBLE action to ask vendor driver
> to check whether the source device is compatible to it.
> 
> The advantage of adding an IS_COMPATIBLE action is that, vendor driver can
> maintain a compatibility table and decide whether source device is compatible
> to target device according to its proprietary table.
> In device_version string, vendor driver only has to describe the source
> device as elaborately as possible and resorts to vendor driver in target side
> to figure out whether they are compatible.

It would also be good if the 'IS_COMPATIBLE' was somehow callable
externally - so we could be able to answer a question like 'can we
migrate this VM to this host' - from the management layer before it
actually starts the migration.

Dave

> Thanks
> Yan
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK


More information about the intel-gvt-dev mailing list