[PATCH 0/5] QEMU VFIO live migration
Zhao Yan
yan.y.zhao at intel.com
Wed Mar 13 01:13:01 UTC 2019
Hi Alex,
Any comments on the sequence below?
Actually, we have some concerns and suggestions about userspace-opaque
migration data.
1. If the data is opaque to userspace, the kernel interface must be tightly
bound to migration.
e.g. the vendor driver has to know that state (running + not logging) should
not return any data, and that state (running + logging) should return a whole
snapshot first and dirty data later. It also has to know that QEMU migration
will not call GET_BUFFER in state (running + not logging); otherwise, it has
to adjust its behavior.
2. The vendor driver cannot ensure that userspace gets all the data it intends
to save in the pre-copy phase.
e.g. in the stop-and-copy phase, the vendor driver first has to check for and
send any data left over from the previous phase.
3. If the whole sequence is tightly bound to live migration, can we remove the
logging state? What about adding two states, migrate-in and migrate-out?
Then there are four states: running, stopped, migrate-in, migrate-out.
migrate-out is for the source side when migration starts. Together with the
running and stopped states, it can substitute for the logging state.
migrate-in is for the target side.
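
To make point 3 concrete, here is a rough Python sketch of how the four states
could be combined on the source side. It is only an illustration: the names
DeviceState and source_phase are made up for this mail, and treating the
states as combinable flag bits is an assumption, not an existing VFIO
interface.

```python
from enum import Flag, auto


class DeviceState(Flag):
    """Hypothetical state bits; names are illustrative, not a real VFIO API."""
    RUNNING = auto()
    STOPPED = auto()
    MIGRATE_OUT = auto()   # source side, set when migration starts
    MIGRATE_IN = auto()    # target side


def source_phase(state):
    """Map a source-side state combination to a migration phase name."""
    if DeviceState.MIGRATE_OUT not in state:
        return "no-migration"
    if DeviceState.RUNNING in state:
        # running + migrate-out plays the role of running + logging
        return "pre-copy"
    if DeviceState.STOPPED in state:
        return "stop-and-copy"
    raise ValueError("migrate-out needs either running or stopped")
```

With this encoding the vendor driver can tell pre-copy from stop-and-copy by
looking at the state alone, without a separate logging bit.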
Thanks
Yan
On Tue, Mar 12, 2019 at 10:57:47AM +0800, Zhao Yan wrote:
> hi Alex
> thanks for your reply.
>
> So, if we choose migration data to be userspace opaque, do you think below
> sequence is the right behavior for vendor driver to follow:
>
> 1. initially the LOGGING state is not set. If userspace calls GET_BUFFER on
> the vendor driver, the vendor driver should reject the call and return 0.
>
> 2. then LOGGING state is set, if userspace calls GET_BUFFER to vendor
> driver,
> a. vendor driver should first query a whole snapshot of device memory
> (let's use this term to represent device's standalone memory for now),
> b. vendor driver returns a chunk of the data just queried to userspace,
> while recording the current pos in the data.
> c. vendor driver finds all data just queried has finished transmitting to
> userspace, and queries only dirty data in device memory now.
> d. vendor driver returns a chunk of the data just queried (this time it is
> dirty data) to userspace while recording the current pos in the data
> e. if all data is transmitted to userspace and GET_BUFFERs still come from
> userspace, vendor driver starts another round of dirty data query.
>
> 3. if the LOGGING state is then unset, and userspace calls GET_BUFFER on the
> vendor driver,
> a. if vendor driver finds there's previously untransmitted data, it returns
> that data until all of it is transmitted.
> b. vendor driver then queries dirty data again and transmits it.
> c. at last, vendor driver queries device config data (which has to be
> queried last and sent once) and transmits it.
>
>
> for the first bullet, if the LOGGING state is set first and migration then
> aborts, vendor driver has to be able to detect that condition. so seemingly,
> vendor driver has to know more about qemu's migration state, like migration
> being called and having failed. Do you think that's acceptable?
>
>
> Thanks
> Yan
>
>
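
The GET_BUFFER sequence quoted above can be modelled roughly as follows. This
is a toy Python sketch, not driver code: VendorDriverSketch, set_logging and
get_buffer are hypothetical names, the device is faked with preloaded byte
strings, and "reject and return 0" is modelled as returning an empty buffer.

```python
class VendorDriverSketch:
    """Toy model of the quoted sequence; all names here are hypothetical."""

    def __init__(self, snapshot, dirty_rounds, config, chunk=4):
        self.snapshot = snapshot                # whole device memory snapshot
        self.dirty_rounds = list(dirty_rounds)  # one entry per dirty query
        self.config = config                    # device config, sent once, last
        self.chunk = chunk                      # max bytes per GET_BUFFER
        self.logging = False
        self.was_logging = False
        self.snapshot_queried = False
        self.final_dirty_done = False
        self.config_done = False
        self.pending = b""                      # data queried but not yet sent
        self.pos = 0                            # current position in pending

    def set_logging(self, on):
        self.logging = on
        self.was_logging = self.was_logging or on

    def _query_dirty(self):
        return self.dirty_rounds.pop(0) if self.dirty_rounds else b""

    def _refill(self):
        """Return the next data to transmit, or None if there is none now."""
        if self.logging:
            if not self.snapshot_queried:       # step 2a: whole snapshot first
                self.snapshot_queried = True
                return self.snapshot
            return self._query_dirty() or None  # steps 2c/2e: dirty rounds
        if not self.was_logging:                # step 1: nothing to return
            return None
        if not self.final_dirty_done:           # step 3b: final dirty query
            self.final_dirty_done = True
            return self._query_dirty()
        if not self.config_done:                # step 3c: config data, once
            self.config_done = True
            return self.config
        return None

    def get_buffer(self):
        # Step 3a falls out naturally: leftover pending data is drained
        # before anything new is queried.
        while self.pos >= len(self.pending):
            data = self._refill()
            if data is None:
                return b""
            self.pending, self.pos = data, 0
        out = self.pending[self.pos:self.pos + self.chunk]
        self.pos += len(out)                    # steps 2b/2d: record pos
        return out
```

Note that even this toy version needs the was_logging flag to tell "LOGGING
never set" apart from "LOGGING set and then unset", which is the kind of
migration knowledge in the vendor driver that the concern above is about.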