[Spice-devel] changing the timing of spice client linking in migration (RHBZ #725009)

Wed Aug 17 11:48:33 PDT 2011

On Wed, Aug 17, 2011 at 10:19:27AM +0200, David Jaša wrote:
> On 17.8.2011 09:47, Yonit Halperin wrote:
> > On 08/17/2011 01:54 AM, Marc-André Lureau wrote:
> >> Hi
> >>
> >> I am also unfamiliar with the migration code, in particular the qemu
> >> -> qemu part. It seems to me that no spice transmission occurs, but
> >> only guest memory. Is that correct? How is state of the channel
> >> restored? Perhaps it doesn't need any state transmission, and the
> >> connection of a client to the target is enough to revive the channels
> >> after migration.
> >>
> > Hi,
> > You are right. No Spice transmission occurs. When we supported seamless
> > migration we did transfer spice info. This was done via the client: each
> > source server channel sent opaque migration data to the corresponding
> > channel in the client. The client channel passed this data to the dest
> > server and the dest server restored its state from this data.
> > We don't do it anymore since the current migration notifiers, afaik,
> > don't allow us to hold the target vm stopped till we make sure the
> > target server has restored its state. We can't prevent the target vm
> > from starting => we have synchronization problem. That is why
> > switch_host makes the client reconnect from scratch to the target.
> > 
> 
> Why can't we keep VM stopped? It seems to me that this is precisely the
> root cause of https://bugzilla.redhat.com/show_bug.cgi?id=730645 , so
> until we somehow solve it, our migration UX will be horrible.
> 

I'm not sure I understand the question. The source VM gets stopped because that
is part of the migration process, irrespective of spice. There are two parts
to a migration: the live part, where the source VM is running and the dest is stopped,
and the closing/finishing part (not sure what the catchy name for it is), where both
VMs are stopped (this is required to send the last pages from the source guest).
When that is done, the destination VM is started.
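
To make the two parts concrete, here is a minimal Python sketch of which VM is
running when. It is only an illustration, not qemu code; the Phase names are
made up for this mail.

```python
from enum import Enum


class Phase(Enum):
    LIVE = "live"            # source running, destination stopped
    FINISHING = "finishing"  # both stopped while the last pages are sent
    DONE = "done"            # destination started


def vm_running(phase):
    """Return (source_running, dest_running) for a migration phase."""
    return {
        Phase.LIVE: (True, False),
        Phase.FINISHING: (False, False),
        Phase.DONE: (False, True),
    }[phase]
```

The point is that the destination VM is stopped in every phase except the last
one, regardless of spice.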

So back to your question, the root cause of the regression is that we used to have
a secondary channel through the client (the primary channel is src qemu -> dst qemu),
i.e. src spice-server -> client -> target spice-server.

That channel went away during the upstreaming process that Gerd led. So now we are
left only with the existing src qemu -> dst qemu channel.

Maybe it's possible to fix that, but it will take more thought, and I think what Yonit
is proposing will get us a large part of the way there, namely by reducing the reconnect
latency. The screen resize is another matter and can be solved purely in the client. The
only thing left after that is that we lose the glzdict and cache state.

> David
> 
> >> Thanks a lot Yonit for your clear mail, it helps a lot.
> > :)
> >>
> >> ----- Original Message -----
> >>> qemu
> >>> =====
> >>> ui/spice-core::migration_state_notifier should handle MIG_STATE_ACTIVE
> >>> (for migration start) and MIG_STATE_ERROR/CANCELLED by calling
> >>> spice_server_migrate_start, and spice_server_migrate_end,
> >>> respectively.
> >>> These callbacks are currently declared in spice-experimental.h.
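
A rough Python sketch of the dispatch described above, as I read it. The
MIG_STATE_* names and the two spice_server_migrate_* calls come from the
proposal; the string values, the RecordingServer stand-in, and the mapping of
MIG_STATE_COMPLETED to migrate_end(completed=True) are my assumptions.

```python
# Placeholder values; the real MIG_STATE_* constants live in qemu.
MIG_STATE_ACTIVE = "active"
MIG_STATE_COMPLETED = "completed"
MIG_STATE_ERROR = "error"
MIG_STATE_CANCELLED = "cancelled"


class RecordingServer:
    """Hypothetical stand-in for spice-server that only records calls."""

    def __init__(self):
        self.calls = []

    def migrate_start(self):
        self.calls.append("migrate_start")

    def migrate_end(self, completed):
        self.calls.append(("migrate_end", completed))


def migration_state_notifier(state, server):
    """Map a qemu migration state change onto the spice-server calls."""
    if state == MIG_STATE_ACTIVE:
        server.migrate_start()
    elif state == MIG_STATE_COMPLETED:
        # Assumption: completion is reported through migrate_end as well.
        server.migrate_end(completed=True)
    elif state in (MIG_STATE_ERROR, MIG_STATE_CANCELLED):
        server.migrate_end(completed=False)
```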
> >>
> >> Contrary to Christophe, I don't think we should be afraid of using
> >> those functions, which have not been supported or used for quite
> >> some time, afaik.
> >>
> >>> spice-server
> >>> =============
> >>> (A) Migration source side
> >>>
> >>> * reds::spice_server_migrate_start:
> >>> send SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST.
> >>> We can't use SPICE_MSG_MAIN_MIGRATE_BEGIN since it doesn't
> >>> include the certificate information we need. But we can change it
> >>> to be identical to SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST.
> >>
> >> For the same reason, I guess we can break messages, as long as proper
> >> version/caps checks are performed before client & server receive them.
> >>
> >>> * reds::spice_server_migrate_end(completed)
> >>> - if (completed) => send SPICE_MSG_MIGRATE (flags=0) to all
> >>> connected channels (via Channel->migrate).
> >>> - if (!completed) => send SPICE_MSG_MAIN_MIGRATE_CANCEL
> >>
> >> flags=0 == No NEED_FLUSH or DATA_TRANSFER. ok
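
The source-side behaviour above, sketched in Python as plain data so the two
branches are easy to compare. The message names are the ones from the proposal;
the tuple layout (channel, message, payload) and the function names are only
for illustration and do not exist in spice-server.

```python
def migrate_start_messages():
    """Messages the source sends when migration starts."""
    return [("main", "SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST", None)]


def migrate_end_messages(completed, connected_channels):
    """Messages the source sends when migration ends."""
    if completed:
        # flags=0: neither NEED_FLUSH nor DATA_TRANSFER is requested.
        return [(name, "SPICE_MSG_MIGRATE", {"flags": 0})
                for name in connected_channels]
    return [("main", "SPICE_MSG_MAIN_MIGRATE_CANCEL", None)]
```

For example, with main and display connected, a completed migration yields one
SPICE_MSG_MIGRATE per channel, while a failed one yields a single cancel on the
main channel.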
> >>
> >>> (B) Migration target side
> >>>
> >>> reds identifies it is a migration target when the client connects with
> >>> a connection id != 0.
> >>> When linking to migrated channels, special treatment is required
> >>> (and not the support that is currently coded, since it is for seamless
> >>> migration).
> >>> For example:
> >>> - For the main channel, (1) the network test is not required, and (2)
> >>> there is no need for SPICE_MSG_MAIN_INIT, but rather
> >>> SPICE_MSG_MAIN_MULTI_MEDIA_TIME and SPICE_MSG_MAIN_MOUSE_MODE. This way
> >>> we will also save all the agent work we perform when initializing the
> >>> main channel in the client.
> >>> - For the display channel, we mustn't call display_channel_wait_for_init
> >>> immediately upon link, but we should expect the INIT message to arrive
> >>> later (for setting cache and dictionary sizes).
> >>> - For the playback channel: we still need to send the current playback
> >>> status, as opposed to seamless migration.
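
A small Python sketch of the main-channel case above, assuming the target
decides purely on connection id != 0. The function names and the dict layout
are hypothetical; only the message names are from the proposal.

```python
def is_migration_target(connection_id):
    """A non-zero connection id in the link message marks a migration target."""
    return connection_id != 0


def main_channel_link_plan(connection_id):
    """What the server does for the main channel right after linking."""
    if is_migration_target(connection_id):
        return {
            "network_test": False,
            "messages": ["SPICE_MSG_MAIN_MULTI_MEDIA_TIME",
                         "SPICE_MSG_MAIN_MOUSE_MODE"],
        }
    # Regular fresh connection: network test plus the full init message.
    return {"network_test": True, "messages": ["SPICE_MSG_MAIN_INIT"]}
```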
> >>
> >> It looks to me like you would like to revive the seamless migration.
> >>
> > see the above explanation about seamless.
> >> Wouldn't it be simpler to just leave connection id == 0 for now, and
> >> do regular connection? Wouldn't that also work like "switch-host"?
> >>
> > Since we want to execute the linking to the target before the logical
> > target spice session starts, it is problematic not to let the target
> > server know it is a migration target: one of the problems is that upon
> > connection, the server display channel expects an INIT message from the
> > client. If a timeout occurs, it disconnects. This is not desirable when
> > this is only the initial connection (the one triggered by
> > migrate_start), and the actual communication will start only when
> > migration ends.
> > 
> > Besides, it will also save us time and other artifacts that are a result
> > of executing a fresh connection. E.g., we can avoid the network
> > bandwidth test. Though I guess the bandwidth can change from one host to
> > another... But performing the network test upon linking is more
> > complicated since the client doesn't listen to the socket yet... Maybe we
> > can neglect this for now and assume the same bandwidth for the new host?
> >>> Spice client
> >>> ============
> >>> (A) SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST
> >>> client connects to the target, but still stays connected to the
> >>> source host. It doesn't listen to the target sockets.
> >>> The link message to the target contains the connection_id of the
> >>> connection to the source (this allows the target server to identify
> >>> itself as a migration target).
> >>> For this part we can use most of the code in the class Migrate in
> >>> red_client.cpp
> >>> (B) SPICE_MSG_MIGRATE
> >>> We can use the code in red_channel::handle_migrate to switch the
> >>> channels and start listening to the target.
> >>> The difference is that we should implement the virtual method
> >>> RedChannel::on_migrate differently.
> >>> (1) Each channel must reset all the dynamic data that depends on
> >>> the server. For example: the display channel
> >>> needs to destroy all the surfaces and reset the caches and
> >>> dictionary; the playback and record channels need to stop
> >>> the current session, if there is an active one, etc.
> >>> (2) Each channel should send to the server the initialization
> >>> information it normally sends in RedChannel::on_connect.
> >>>
> >>> (C) SPICE_MSG_MAIN_MIGRATE_CANCEL
> >>> disconnects all the new channels. This code is already implemented
> >>> in spice-client.
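
The three client-side cases above, as a toy Python model of a single channel.
This is not spice-client code; the class, its attributes, and the method names
are invented here to show the ordering only.

```python
class ClientChannel:
    """Toy model of one client channel during migration."""

    def __init__(self, name):
        self.name = name
        self.active_host = "source"
        self.target_linked = False
        self.dynamic_state = {"caches": ["stale entry"]}
        self.sent = []

    def on_switch_host(self, source_connection_id):
        # (A) Link to the target, passing the source connection id,
        # while still talking to (and listening on) the source only.
        self.sent.append(("link", source_connection_id))
        self.target_linked = True

    def on_migrate(self):
        # (B) Switch to the target: drop server-dependent state, then
        # resend what would normally be sent on connect.
        if not self.target_linked:
            raise RuntimeError("no target link to switch to")
        self.dynamic_state.clear()
        self.active_host = "target"
        self.sent.append("init")

    def on_migrate_cancel(self):
        # (C) Drop the link to the target; the source session continues.
        self.target_linked = False
```

Note that nothing server-dependent is reset until (B), so a cancel in (C)
leaves the source session untouched.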
> >>>
> >>> spice-protocol(?)/Backward compatibility
> >>> =========================================
> >>> should we bump the spice protocol version, or use capabilities? (if we
> >>> change the SPICE_MSG_MAIN_MIGRATE_BEGIN structure, there is no question).
> >>
> >>> New Spice-Server with old client will send only
> >>> SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST, and only when migration completes
> >>> (same as today).
> >>> New client with old Spice-server will disconnect from the source and
> >>> will connect to the target upon receiving
> >>> SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST (same as today).
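
The compatibility cases above boil down to one rule, sketched here in Python.
How "new" is detected (a capability bit or a protocol version) is left open;
the two boolean parameters and the return strings are placeholders of mine.

```python
def switch_host_timing(server_is_new, client_is_new):
    """When the client links to the target host."""
    if server_is_new and client_is_new:
        # New flow: link at migration start, switch over at migration end.
        return "at_migration_start"
    # Any old peer: reconnect from scratch once migration completes.
    return "at_migration_end"
```

Only the new/new combination changes behaviour; every mixed pairing behaves
the same as today.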
> >>>
> >>
> >> Preferably, I would introduce SPICE_MSG_MAIN_MIGRATE_BEGIN2 etc. and
> >> deprecate the older messages. From what I understand, we are now
> >> preferably using caps rather than bumping protocol version.
> > o.k. But when do we actually change the protocol version?
> >>
> >> cheers
> >>
> > 
> > _______________________________________________
> > Spice-devel mailing list
> > Spice-devel at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/spice-devel
> 
> -- 
> 
> David Jaša
> 