[Spice-devel] changing the timing of spice client linking in migration (RHBZ #725009)

Alon Levy alevy at redhat.com
Wed Aug 17 13:10:13 PDT 2011


On Wed, Aug 17, 2011 at 10:15:01PM +0300, Yonit Halperin wrote:
> On 08/17/2011 09:48 PM, Alon Levy wrote:
> >On Wed, Aug 17, 2011 at 10:19:27AM +0200, David Jaša wrote:
> >>On 17.8.2011 09:47, Yonit Halperin wrote:
> >>>On 08/17/2011 01:54 AM, Marc-André Lureau wrote:
> >>>>Hi
> >>>>
> >>>>I am also unfamiliar with the migration code, in particular the qemu
> >>>>->   qemu part. It seems to me that no spice transmission occurs, but
> >>>>only guest memory. Is that correct? How is state of the channel
> >>>>restored? Perhaps it doesn't need any state transmission, and the
> >>>>connection of a client to the target is enough to revive the channels
> >>>>after migration.
> >>>>
> >>>Hi,
> >>>You are right. No Spice transmission occurs. When we supported seamless
> >>>migration we did transfer spice info. This was done via the client: each
> >>>source server channel sent opaque migration data to the corresponding
> >>>channel in the client. The client channel passed this data to the dest
> >>>server and the dest server restored its state from this data.
> >>>We don't do it anymore since the current migration notifiers, afaik,
> >>>don't allow us to hold the target vm stopped until we make sure the
> >>>target server has restored its state. We can't prevent the target vm
> >>>from starting => we have a synchronization problem. That is why
> >>>switch_host makes the client reconnect from scratch to the target.
> >>>
> >>
> >>Why can't we keep VM stopped? It seems to me that this is precisely the
> >>root cause of https://bugzilla.redhat.com/show_bug.cgi?id=730645 , so
> >>until we somehow solve it, our migration UX will be horrible.
> >>
> >
> >I'm not sure I understand the question. The source VM gets stopped because it
> >is part of the migration process, irrespective of spice. There are two parts
> >to a migration: the live part, where the source VM is running and the dest is
> >stopped, and the closing/finishing part (not sure what the catchy name for it
> >is) where both VMs are stopped (this is required to send the last pages from
> >the source guest); when it is done the destination VM is started.
> >
> >So back to your question, the root cause of the regression is that we used to have
> >a secondary channel through the client (the primary channel is src qemu ->  dst qemu),
> >i.e. src spice-server ->  client ->  target spice-server.
> >
> >That channel went away during the upstreaming process that Gerd led. So now we are
> >left only with the existing src qemu ->  dst qemu channel.
> Hi,
> This is not accurate. We didn't have such special channels. We used
> our channels as is. E.g., the src display channel sends migration_data
> to the client, and the client sends it through the dest display channel
> to the dest server, which restores the display channel state using
> this data.

My secondary channel is secondary to the primary migration socket. Yes, it
is reusing our existing channels (==sockets) with migration-specific messages,
so there is no contradiction.

> We did have a special channel between the src spice server and the
> target spice server. That channel was used to transfer ticketing
> and other authentication information. Today we bypass the need for
> this channel with the monitor command 'client_migrate_info'.
> 
> Afaik, the real missing link for seamless migration is the ability
> to hold the target vm stopped till we make sure the target spice
> server restored its state (from the data it got from the client).
> And as I stated above, if we can't prevent the target vm from
> starting => we can have a synchronization problem.
> I'm not familiar enough with the current migration code in
> qemu-devel to know whether it is possible and acceptable to hold the
> target vm when notifying on migration completion.
> 

I think you can do that without forcing the target to freeze, but freezing
it is the simplest solution.

> >
> >Maybe it's possible to fix that, but it will take more thought and I think what Yonit
> >is proposing will bring a large part of it, namely reducing the reconnect latency. The
> >screen resize is another matter and can be solved purely in the client. The only thing
> >that is left is that we are losing the glzdict and cache state.
> I think we are also losing the playback/record state, and all the surfaces.
> >
> >
> >>David
> >>
> >>>>Thanks a lot Yonit for your clear mail, it helps a lot.
> >>>:)
> >>>>
> >>>>----- Original Message -----
> >>>>>qemu
> >>>>>=====
> >>>>>ui/spice-core::migration_state_notifier should handle MIG_STATE_ACTIVE
> >>>>>(for migration start) and MIG_STATE_ERROR/CANCELLED by calling
> >>>>>spice_server_migrate_start, and spice_server_migrate_end,
> >>>>>respectively.
> >>>>>These callbacks are currently declared in spice-experimental.h.
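
(To make the hook concrete, here is a rough sketch of what the notifier could
look like -- it assumes the MIG_STATE_* values and the get_migration_state()
helper that qemu's migration code exposes today, spice_server is spice-core's
global SpiceServer pointer, and the two spice_server_migrate_* calls are the
spice-experimental.h entry points; an illustration, not the actual patch:)

static void migration_state_notifier(Notifier *notifier)
{
    int state = get_migration_state();

    if (state == MIG_STATE_ACTIVE) {
        /* migration started: let spice pre-connect the client to the
         * target (ends up as SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST) */
        spice_server_migrate_start(spice_server);
    } else if (state == MIG_STATE_COMPLETED) {
        spice_server_migrate_end(spice_server, 1);
    } else if (state == MIG_STATE_ERROR || state == MIG_STATE_CANCELLED) {
        spice_server_migrate_end(spice_server, 0);
    }
}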
> >>>>
> >>>>Contrary to Christophe, I don't think we should be afraid of using
> >>>>those functions, which, afaik, have not been supported or used for
> >>>>quite some time.
> >>>>
> >>>>>spice-server
> >>>>>=============
> >>>>>(A) Migration source side
> >>>>>
> >>>>>* reds::spice_server_migrate_start:
> >>>>>send SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST.
> >>>>>We can't use SPICE_MSG_MAIN_MIGRATE_BEGIN since it doesn't
> >>>>>include the certificate information we need. But we can change it
> >>>>>to be identical to SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST.
> >>>>
> >>>>For the same reason, I guess we can break messages, as long as proper
> >>>>version/caps checks are performed before client & server receive them.
> >>>>
> >>>>>* reds::spice_server_migrate_end(completed)
> >>>>>- if (completed) => send SPICE_MSG_MIGRATE (flags=0) to all
> >>>>>connected channels (via Channel->migrate).
> >>>>>- if (!completed) => send SPICE_MSG_MAIN_MIGRATE_CANCEL
> >>>>
> >>>>flags=0 == No NEED_FLUSH or DATA_TRANSFER. ok
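
(Roughly, on the server side I'd expect the two entry points to boil down to
something like the sketch below -- reds_send_main_switch_host(),
reds_channels_send_migrate() and reds_send_main_migrate_cancel() are made-up
helper names standing in for the actual marshalling code in reds.c:)

SPICE_GNUC_VISIBLE int spice_server_migrate_start(SpiceServer *s)
{
    /* host/port/cert-subject were stored earlier by the
     * client_migrate_info monitor command */
    reds_send_main_switch_host();   /* SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST */
    return 0;
}

SPICE_GNUC_VISIBLE int spice_server_migrate_end(SpiceServer *s, int completed)
{
    if (completed) {
        /* flags=0: no NEED_FLUSH / DATA_TRANSFER, the client only has to
         * switch its sockets over to the target it already linked to */
        reds_channels_send_migrate(0);       /* SPICE_MSG_MIGRATE */
    } else {
        reds_send_main_migrate_cancel();     /* SPICE_MSG_MAIN_MIGRATE_CANCEL */
    }
    return 0;
}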
> >>>>
> >>>>>(B) Migration target side
> >>>>>
> >>>>>reds identifies it is a migration target when the client connects with
> >>>>>a
> >>>>>connection id != 0.
> >>>>>When linking the migrated channels, special treatment is required
> >>>>>(and not the support that is currently coded, since that is for seamless
> >>>>>migration).
> >>>>>For example:
> >>>>>- For the main channel, (1) network test is not required (2) no need
> >>>>>for
> >>>>>SPICE_MSG_MAIN_INIT, but rather SPICE_MSG_MAIN_MULTI_MEDIA_TIME and
> >>>>>SPICE_MSG_MAIN_MOUSE_MODE. This way we will also save all the agent
> >>>>>work
> >>>>>we perform when initializing the main channel in the client.
> >>>>>- For the display channel, we mustn't call
> >>>>>display_channel_wait_for_init
> >>>>>immediately upon link, but we should expect it to arrive later (for
> >>>>>setting cache and dictionary sizes).
> >>>>>- For playback channel: we still need to send the current playback
> >>>>>status, as opposed to seamless migration.
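
(In other words, the target-side detection itself is trivial -- it hangs off
the connection_id field of the link message; the per-channel behaviour listed
above is the real work. A sketch, with the SpiceLinkMess field access as it
appears in spice-protocol and everything else simplified away:)

/* on the target, while handling a channel link request */
static int link_is_migration_target(const SpiceLinkMess *link_mess)
{
    /* the client reuses the connection_id it got from the source, so a
     * non-zero id on a brand new session means "I am a migration target"
     * and this link is the early one triggered by migrate_start */
    return link_mess->connection_id != 0;
}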
> >>>>
> >>>>It looks to me like you would like to revive the seamless migration.
> >>>>
> >>>see the above explanation about seamless.
> >>>>Wouldn't it be simpler to just leave connection id == 0 for now, and
> >>>>do regular connection? Wouldn't that also work like "switch-host"?
> >>>>
> >>>Since we want to execute the linking to the target before the logical
> >>>target spice session starts, it is problematic not to let the target
> >>>server know it is a migration target: one of the problems is that upon
> >>>connection, the server display channel expects an INIT message from the
> >>>client. If a timeout occurs, it disconnects. This is not desirable when
> >>>this is only the initial connection (the one triggered by
> >>>migrate_start), and the actual communication will start only when
> >>>migration ends.
> >>>
> >>>Besides, it will also spare us the time and other artifacts that result
> >>>from executing a fresh connection. E.g., we can avoid the network
> >>>bandwidth test. Though I guess the bandwidth can change from one host to
> >>>another... But performing the network test upon linking is more
> >>>complicated since the client doesn't listen on the socket yet... Maybe we
> >>>can neglect this for now and assume the same bandwidth for the new host?
> >>>>>Spice client
> >>>>>============
> >>>>>(A) SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST
> >>>>>client connects to the target, but still stays connected to the
> >>>>>source host. It doesn't listen to the target sockets.
> >>>>>The link message to the target contains the connection_id of the
> >>>>>connection to the source (this allows the target server to identify
> >>>>>itself as a migration target).
> >>>>>For this part we can use most of the code in the class Migrate in
> >>>>>red_client.cpp
> >>>>>(B) SPICE_MSG_MIGRATE
> >>>>>We can use the code in red_channel::handle_migrate to switch the
> >>>>>channels and start listening to the target.
> >>>>>The difference is that we should implement differently the virtual
> >>>>>method RedChannel::on_migrate.
> >>>>>(1) Each channel must reset all the dynamic data that depends on
> >>>>>the server. For example: the display channel
> >>>>>needs to destroy all the surfaces and reset the caches and
> >>>>>dictionary; the playback and record channels need to stop
> >>>>>the current session, if there is an active one, etc.
> >>>>>(2) Each channel should send to the server the initialization
> >>>>>information it normally sends in RedChannel::on_connect.
> >>>>>
> >>>>>(C) SPICE_MSG_MAIN_MIGRATE_CANCEL
> >>>>>disconnects all the new channels. This code is already implemented
> >>>>>in spice-client.
> >>>>>
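
(For (B), the per-channel on_migrate override could look roughly like the
following, taking the display channel as the example. destroy_all_surfaces(),
reset_caches(), reset_glz_dictionary() and send_init() are placeholder names;
the real code would reuse whatever on_connect and the existing reset paths
already do in the client:)

void DisplayChannel::on_migrate()
{
    // (1) drop every piece of state that only made sense for the source
    //     server: surfaces, image caches and the glz dictionary are not
    //     transferred by this scheme
    destroy_all_surfaces();
    reset_caches();
    reset_glz_dictionary();

    // (2) then introduce ourselves to the target exactly like a fresh
    //     connection would, so it can set up cache/dictionary sizes
    send_init();
}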
> >>>>>spice-protocol(?)/Backward compatibility
> >>>>>=========================================
> >>>>>should we bump the spice protocol version, or use capabilities? (if we
> >>>>>change the SPICE_MSG_MAIN_MIGRATE_BEGIN structure, there is no question).
> >>>>
> >>>>>New Spice-Server with old client will send only
> >>>>>SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST, and only when migration completes
> >>>>>(same as today).
> >>>>>New client with old Spice-server will disconnect the source and will
> >>>>>connect the target upon receiving SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST
> >>>>>(same as today).
> >>>>>
> >>>>
> >>>>Preferably, I would introduce SPICE_MSG_MAIN_MIGRATE_BEGIN2 etc. and
> >>>>deprecate the older messages. From what I understand, we now prefer
> >>>>using caps rather than bumping the protocol version.
> >>>o.k. But when do we actually change the protocol version?
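
(If we go the caps route, the source side would just gate the new behaviour on
a main-channel capability announced by the client at link time, something like
the sketch below -- every identifier here, including SPICE_MAIN_CAP_SWITCH_HOST2
and the send_* helpers, is invented purely for illustration:)

static void main_channel_start_migration(MainChannel *main_channel)
{
    if (main_channel_has_cap(main_channel, SPICE_MAIN_CAP_SWITCH_HOST2)) {
        /* capable client: new message carrying host/port/cert-subject,
         * sent already at migration start so the client can pre-link */
        send_migrate_begin2(main_channel);
    } else {
        /* old client: keep today's behaviour, plain SWITCH_HOST only
         * once migration has completed */
        send_migrate_switch_host(main_channel);
    }
}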
> >>>>
> >>>>cheers
> >>>>
> >>>
> >>
> >>--
> >>
> >>David Jaša
> >>
> 

