[Spice-devel] [Qemu-devel] seamless migration with spice

Yonit Halperin yhalperi at redhat.com
Tue Mar 20 09:47:18 PDT 2012


Hi,
On 03/20/2012 03:58 PM, Gerd Hoffmann wrote:
>    Hi,
>
>> We can either store and migrate the cache, or choose to reset it.
>> In the extinct spice seamless migration solution, the cache was reset.
>
> Hmm, this makes me wonder what the main advantage of seamless migration
> used to be?  image cache was reset, surfaces didn't exist back then.  So
> any image data must be retransmitted anyway, correct?
>
I guess the main advantage for the display channel was that you could 
make sure all the pending messages from the source have reached the 
client, so you can continue the session with the destination without 
sending the client the primary surface.

>> For implementing this approach, I think that the first display channel
>> that handles migration can freeze the source cache, and send
>> SPICE_MSG_DISPLAY_INVAL_ALL_PIXMAPS to the client (together with the
>> corresponding "wait list" - i.e., other display channels' message
>> serials we should wait for before resetting the cache).
>
> Looks sane to me.
>
>> In the old solution, resetting the client side cache was performed only
>> after the channel that froze the cache completely switched to the
>> destination. This required migrating the "wait list" and the last
>> message serial. Then, the freezer channel sent the
>> SPICE_MSG_DISPLAY_INVAL_ALL_PIXMAPS with the MAX(migrated_wait_list,
>> current_cache_wait_list_serial).
>> I'm not sure why the old solution initiated the reset from the
>> destination and not from the source. Maybe for a case that for some
>> reason the client stayed connected to the source and the vm was started
>> on the source???
>
> Could be.  Nowadays qemu informs spice-server whether the migration was
> successful or not, so this should not be needed any more.
>
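To make the MAX(migrated_wait_list, current_cache_wait_list_serial) step concrete, here is a minimal sketch. It assumes a wait list can be modelled as a mapping from a display channel id to the last message serial we must wait for; the function name and the dict representation are hypothetical, not the spice-server data structures.

```python
def merge_wait_lists(migrated, current):
    """Combine two wait lists, keeping the larger serial per channel.

    Both arguments are hypothetical {channel_id: message_serial} dicts.
    """
    merged = dict(migrated)
    for channel_id, serial in current.items():
        # A channel missing from the migrated list just takes the current serial.
        merged[channel_id] = max(serial, merged.get(channel_id, serial))
    return merged


wait_list = merge_wait_lists({1: 10, 2: 7}, {2: 9, 3: 4})
```

The merged list is what would accompany SPICE_MSG_DISPLAY_INVAL_ALL_PIXMAPS, so the client resets its cache only after every listed channel has caught up.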
>> If we choose to restore the complete cache on the destination side we
>> need to:
>> (1) freeze the cache
>> (2) send the cache to the destination. The cache holds the ids of the
>> images stored in the client side cache, and the lru list of them.
>> In addition, for each such image we store the serial of the last message
>> that accessed it from each display channel.
>> (3) start the destination cache in freeze mode
>> (4) Unfreeze the cache after it is restored from the migration data.
>
> Ok.
>
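Steps (1)-(4) above could be sketched roughly as follows. This is only an illustration under assumptions: the class, its method names and the layout of the migration data are made up for the example, and eviction by cache size is left out.

```python
from collections import OrderedDict


class PixmapCache:
    """Toy model of the server-side view of the client image cache."""

    def __init__(self, size, frozen=False):
        self.size = size            # cache size as set by the client
        self.frozen = frozen
        # image_id -> {channel_id: serial of last message that accessed it},
        # ordered from least to most recently used.
        self.items = OrderedDict()

    def touch(self, image_id, channel_id, serial):
        if self.frozen:
            raise RuntimeError("cache is frozen")
        self.items.setdefault(image_id, {})[channel_id] = serial
        self.items.move_to_end(image_id)

    def freeze(self):               # step (1)
        self.frozen = True

    def export(self):               # step (2)
        if not self.frozen:
            raise RuntimeError("freeze the cache before exporting it")
        lru = [(image_id, dict(serials)) for image_id, serials in self.items.items()]
        return {"size": self.size, "lru": lru}

    def restore(self, data):        # step (4)
        self.size = data["size"]
        self.items = OrderedDict((i, dict(s)) for i, s in data["lru"])
        self.frozen = False


src = PixmapCache(size=1024)
src.touch(7, channel_id=0, serial=100)
src.touch(9, channel_id=1, serial=42)
src.touch(7, channel_id=1, serial=43)
src.freeze()
data = src.export()

dst = PixmapCache(size=None, frozen=True)   # step (3)
dst.restore(data)
```

After the restore, the destination has the same lru order and per-channel serials as the source had when it was frozen.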
>> In any case, the migration data should also hold the cache size (which
>> is set by the client upon connection initialization).
>
> Why?  When the client sets it on connection initialization the dest host
> should know it already ...
>
I think that when migrating the channels we can skip the channel
initialization and reuse the same settings, so the destination has to
receive them in the migration data.
>> about it. The only data that should be migrated to the destination
>> server are (1) the dictionary size (also set by the client upon connection)
>
> Same here, not needed I think.
>
>> (2) the last image id in the dictionary (otherwise we should have a
>> message for resetting the dictionary on the client side).
>
> A message would need to be similar to
> SPICE_MSG_DISPLAY_INVAL_ALL_PIXMAPS for display channel synchronization,
> correct?
>
Yes, but it can be simpler for the dictionary, sending the last_image_id 
should be enough.
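
A minimal sketch of why last_image_id is enough, assuming image ids in the dictionary are handed out as a monotonically increasing counter (the class and method names are hypothetical): the destination starts with an empty dictionary but continues numbering after the migrated id, so it never reuses an id the client may still hold.

```python
class ImageDictionary:
    """Toy model of the server-side dictionary id allocator."""

    def __init__(self, last_image_id=0):
        self.last_image_id = last_image_id

    def next_image_id(self):
        self.last_image_id += 1
        return self.last_image_id


src_dict = ImageDictionary()
src_ids = [src_dict.next_image_id() for _ in range(3)]

# Only the counter travels in the migration data.
dst_dict = ImageDictionary(last_image_id=src_dict.last_image_id)
dst_first_id = dst_dict.next_image_id()
```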
>> (C) Surfaces:
>> Again, 2 options:
>> (1) Not migrate anything related to the client's off-screen surfaces.
>> Consequence: we might send the client off-screen surfaces that we have
>> already sent.
>> (2) Migrate the list of surfaces that the client holds and their lossy
>> regions (or just the regions extents, for simplicity).
>
> Do we have any stats on image / surface usage?
>
> I'd expect image cache tends to hold short-living images, whereas
> surfaces tends to hold long-living ones, so preserving surfaces is more
> important than keeping the image cache.  That expectation isn't backed
> by any real data though, and it probably also depends on the guest os
> and driver version ...
>
Well, I wish it was. Unfortunately, almost all of the surfaces, both for
Windows 7 and Linux guests, are short-lived. Moreover, a major part of
the surfaces are not even used for any drawing on the primary surface,
and just make us call update_area (instead of just rendering them on the
guest, as was the case when they were guest-managed bitmaps).
Font-smoothing is one of the triggers of this behavior.
I hope this will change with Xrender and when we upgrade the Windows
driver and support Direct and 3D.
I have an old email with stats about surfaces. I'll send it to you.
>> (D) In order to promise that in flight data from/to the src server won't
>> get lost we still need to assure that the src server is not killed
>> before spice completes its work - and then we are back to the original
>> problem that started this thread.
>
> I think this is needed no matter which way the migration state travels,
> correct?
>
If we take the vmstate approach, we can use the pre_save callback to
send all the pending data, and the post_load callback to complete the
client switch from the src to the destination. But it will be ugly as
long as these callbacks are synchronous (at least the pre_save one).

Regards,
Yonit.
> cheers,
>    Gerd
>


