[Spice-devel] [RFC PATCH spice-protocol v2 03/20] Add the StreamInfo message

Frediano Ziglio fziglio at redhat.com
Fri Aug 17 10:36:37 UTC 2018


> On Thu, 2018-08-16 at 16:08 -0500, Jonathon Jongsma wrote:
> > On Thu, 2018-08-16 at 18:26 +0200, Lukáš Hrázký wrote:
> > > The message is meant to contain information related to the stream,
> > > e.g.
> > > now the guest (xrandr) output ID of the streamed monitor.
> > > 
> > > Signed-off-by: Lukáš Hrázký <lhrazky at redhat.com>
> > > ---
> > >  spice/stream-device.h | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > > 
> > > diff --git a/spice/stream-device.h b/spice/stream-device.h
> > > index 6add42b..be22e06 100644
> > > --- a/spice/stream-device.h
> > > +++ b/spice/stream-device.h
> > > @@ -90,6 +90,8 @@ typedef enum StreamMsgType {
> > >      STREAM_TYPE_CURSOR_SET,
> > >      /* guest cursor position */
> > >      STREAM_TYPE_CURSOR_MOVE,
> > > +    /* the stream information message */
> > > +    STREAM_TYPE_INFO,
> > >  } StreamMsgType;
> > >  
> > >  typedef enum StreamCapabilities {
> > > @@ -140,6 +142,15 @@ typedef struct StreamMsgData {
> > >      uint8_t data[0];
> > >  } StreamMsgData;
> > >  
> > > +/* Message containing information about the stream, sent when a new
> > > stream is created.
> > > + */
> > > +typedef struct StreamMsgInfo {
> > > +    /* The xrandr output ID (index of the output in the list) that
> > > is being
> > > +     * streamed, 0-based.
> > > +     */
> > > +    uint32_t guest_output_id;
> > > +} StreamMsgInfo;
> > > +
> > >  /* Tell to stop current stream and possibly start a new one.
> > >   * This message is sent by the host to the guest.
> > >   * Allows to communicate the codecs supported by the clients.
> > 
> > 
> > So, here's where our work starts to overlap a bit (and why it's a bit
> > unfortunate that I haven't gotten around to sending out a proposal for
> > multi-monitor support so far, perhaps).
> 
> It's not much of a problem, I just did it the simple way not knowing
> what the plans were for multi-monitor, but it's not hard to change.
> 
> > The change above assumes that there will be a single output for a
> > single stream device. This is currently true because we only support a
> > single output. For a variety of reasons, I've chosen to implement
> > multimonitor support by multiplexing multiple outputs over a single
> > stream device. I'll share my current working patch for the stream
> > device protocol below.
> > 
> > But first I'll try to explain why I chose to multiplex displays over a
> > single stream device. In general, I think it makes things easier. Some
> > examples:
> > 
> >  * if you use one stream device per output, you need to decide before
> >    you launch the vm how many outputs it will support and allocate
> >    those devices by passing the correct options to qemu. To add an
> >    additional output, you need to re-start qemu with different
> >    parameters.
> 
> This alone is reason enough to do it the way you did it...
> 
> >  * We currently use a hard-coded spice port name (org.spice-
> >    space.stream.0). If we have a separate stream device for each
> >    display, we will have multiple port names. How do those port names
> >    get communicated from the server to the guest?
> >  * If you have multiple stream devices, do we launch a separate spice-
> >    streaming-agent process for each one? Or a single streaming agent
> >    that handles all of them?
> >  * You also have to decide how to handle odd issues such as the fact
> >    that the driver may be perfectly capable of enabling additional
> >    monitors, but we don't have sufficient stream devices for them. So
> >    what do you do when the vm was configured with only a single spice
> >    device, but the user enables an additional display from within the
> >    guest? Now perhaps the guest has two displays enabled, but we have
> >    no way of sending that second display to the client.
> > 

Another solution would be to allocate a maximum number of devices we want
to support and use up to that number, say 8 devices. Obviously this
would be a limitation.

> > So there were a lot of little things like this that led me to decide
> > that using a single stream device was simpler and more flexible. There
> > may have been other reasons that I've forgotten already as well.
> > 
> > The only real downside I could think of was that the stream device
> > protocol would have to change in a non-backward-compatible way (see
> > below). But since the streaming agent is still in early stages, that
> > doesn't seem like a huge deal.
> 
> I'm not sure, we made a release and no disclaimer was made about
> protocol compatibility (only about API/ABI instability), but at this
> stage I'd say we shouldn't worry about compatibility yet, the
> probability of changes is still high.
> 

"have to change in a non-backward-compatible way" is pretty false.
I didn't see in your patch below any need to do it; you can just add
capabilities and messages.
With this reasoning we'd probably have HTTP 187 instead of HTTP 2, and
TCP 934 instead of 6.
Personally I would use the reserved "padding" field in StreamDevHeader
for the output ID.
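
Something along these lines (just a sketch of what I mean; I'm assuming
the header keeps its current layout and only the reserved padding byte
is renamed):

typedef struct StreamDevHeader {
    /* should be STREAM_DEVICE_PROTOCOL */
    uint8_t protocol_version;
    /* former "padding": guest output the message refers to */
    uint8_t output_id;
    /* as defined in StreamMsgType enumeration */
    uint16_t type;
    /* size of the message body following this header */
    uint32_t size;
} StreamDevHeader;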

> > Of course, it's possible that I missed
> > other disadvantages, so let me know any problems you see with this
> > approach.
> 
> I can only think of possible performance issues of a single port with
> 4 multiplexed 4K displays, for example? Not sure of the behaviour and
> whether something would be different if there were 4 ports? Maybe a
> test would be in order before we commit to a solution?
> 

One possible issue with this is serialization. If I need to send 4 encoded
frames, each frame has to be contained in a single message (as is the case
today) and each message is contiguous, so it is possible that before
sending the fourth frame we need to send all the other complete frames,
which can cause delays. Other protocols that implement multiplexing like
this allow the messages/requests to be split (examples are HTTP/2, TDS,
and potentially there are plans for WebSockets). This would potentially
be a change requiring breaking compatibility.
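
To give an idea, a hypothetical chunked data message (not part of the
current protocol, names are only examples) would let frames from
different outputs be interleaved:

/* hypothetical sketch: a frame is split into chunks so the device
 * does not have to send a whole frame contiguously */
typedef struct StreamMsgDataChunk {
    /* output the chunk belongs to */
    uint8_t output_id;
    /* nonzero on the last chunk of a frame */
    uint8_t last_chunk;
    uint8_t padding[2];
    /* chunk payload, length given by the device header size */
    uint8_t data[0];
} StreamMsgDataChunk;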

I suppose the server will have multiple connections with the client
using multiple DisplayChannels (at the protocol level) and each frame
will be sent to the appropriate channel. Or are we going to implement
multiple surfaces per DisplayChannel (in the protocol implementation)?
Maybe I'm asking for too many details about multi-monitor.
But if we need multiple DisplayChannels, and we potentially have the
issue that we need to open them all at the beginning and not later (due
to the login token expiring), won't we have to allocate these channels
statically (like 8) to support them?

I wouldn't be against trying a library like Cap'n Proto and breaking
compatibility; it seems to me that the team has problems dealing with
protocols and compatibility. But choosing a library without understanding
the protocol issues much is not really good either.

> > Anyway, here's the current proof-of-concept changes to the stream
> > device protocol that I'm working with:
> 
> Looks fine to me in general, hopefully Frediano takes a look as well :)
> 
> > diff --git a/spice/stream-device.h b/spice/stream-device.h
> > index 6add42b..57c768f 100644
> > --- a/spice/stream-device.h
> > +++ b/spice/stream-device.h
> > @@ -58,7 +58,7 @@
> >   */
> >  
> >  /* version of the protocol */
> > -#define STREAM_DEVICE_PROTOCOL 1
> > +#define STREAM_DEVICE_PROTOCOL 2
> >  
> >  typedef struct StreamDevHeader {
> >      /* should be STREAM_DEVICE_PROTOCOL */
> > @@ -78,8 +78,8 @@ typedef enum StreamMsgType {
> >      STREAM_TYPE_INVALID = 0,
> >      /* allows to send version information */
> >      STREAM_TYPE_CAPABILITIES,
> > -    /* send screen resolution */
> > -    STREAM_TYPE_FORMAT,
> > +    /* send guest output configuration */
> > +    STREAM_TYPE_OUTPUT_CONFIG,
> 
> So the naming is "output_config" here. I don't mind if we want to call
> each "monitor" an "output" here. At this point I don't have much of an
> opinion, except... (see below)
> 

I don't see why a new message cannot simply be added instead, or a
capability used to extend the current message.
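
Something like this (just a sketch, names are only examples; the new
values are appended at the end of the existing enums so old values keep
their numbers, and STREAM_TYPE_FORMAT is left untouched):

/* appended to StreamCapabilities */
    STREAM_CAP_OUTPUT_CONFIG,

/* appended to StreamMsgType */
    STREAM_TYPE_OUTPUT_CONFIG,

The guest would send the new message only after the capability has been
negotiated, so old agents and servers keep working with the current
STREAM_TYPE_FORMAT message.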

> >      /* stream data */
> >      STREAM_TYPE_DATA,
> >      /* server ask to start a new stream */
> > @@ -115,6 +115,16 @@ typedef struct StreamMsgCapabilities {
> >  
> >  #define STREAM_MSG_CAPABILITIES_MAX_BYTES 1024
> >  
> > +typedef struct StreamMsgOutput {
> > +    uint8_t output_id;
> 
> ... "output_id" is very similar to the "guest_output_id" I used
> throughout the monitor ID series. (though you've got uint8_t here and I
> use uint32_t)
> 
> The most important role of the "output_id" here is being the unique
> identifier of the "Outputs" as you define them here. A major question
> we should answer is whether we want to use this "output_id" also as the
> guest_output_id. At this moment, "guest_output_id" is a sequence
> starting from 0, so "output_id" can easily be equal to that, but in
> essence they are not the same numbers.
> 
> On a side note, a question was already raised about what the
> "guest_output_id"s will be on Wayland. The answer is I have no idea. On
> Wayland it seems each compositor is doing the multi-monitor setup
> itself, which seems like a nightmare for us.
> 

Yes, I think Wayland is obfuscating a lot of stuff in the name of
"security". I really don't understand why a computer user, which is
supposed to represent a human user in the system, has to have far fewer
rights than the real one. I can see all the screens and all the windows
with my eyes, so why is my user on the computer so limited? Why can I
launch lspci or similar, but Wayland filters out a lot of hardware
information?

OT: due to the additional delay introduced by the compositor, it is
recommended to use Xorg for gaming!

> > +    /* screen resolution/stream size */
> > +    uint32_t width;
> > +    uint32_t height;
> > +    /* coordinates of the top-left corner of the screen */
> > +    int32_t x;
> > +    int32_t y;

You need to align the structure (the uint32_t fields start at offset 1).

Should the server compute a kind of fake surface? I was thinking
about the server <-> client protocol for multi-monitor.
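
Something like this (sketch, just adding explicit padding so the 32-bit
fields stay naturally aligned):

typedef struct StreamMsgOutput {
    uint8_t output_id;
    uint8_t padding[3];
    /* screen resolution/stream size */
    uint32_t width;
    uint32_t height;
    /* coordinates of the top-left corner of the screen */
    int32_t x;
    int32_t y;
} StreamMsgOutput;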

> > +} StreamMsgOutput;
> > +
> >  /* Define the format of the stream, start a new stream.
> >   * This message is sent by the guest to the host to
> >   * tell the host the new stream format.
> > @@ -122,21 +132,21 @@ typedef struct StreamMsgCapabilities {
> >   * States allowed: Ready
> >   *   state will change to Streaming.
> >   */
> > -typedef struct StreamMsgFormat {
> > -    /* screen resolution/stream size */
> > -    uint32_t width;
> > -    uint32_t height;
> > +typedef struct StreamMsgOutputConfig {
> >      /* as defined in SpiceVideoCodecType enumeration */
> >      uint8_t codec;
> > -    uint8_t padding1[3];
> > -} StreamMsgFormat;
> > +    uint8_t n_outputs;
> > +    StreamMsgOutput outputs[0];

Alignment issue here as well.

From the protocol perspective, why not send multiple format messages?
Well, I suppose that would need an ID field added.
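
As a sketch of that alternative (one format message per output, only an
ID added to the current StreamMsgFormat, taking a byte from the existing
padding; the field name is just an example):

typedef struct StreamMsgFormat {
    /* screen resolution/stream size */
    uint32_t width;
    uint32_t height;
    /* as defined in SpiceVideoCodecType enumeration */
    uint8_t codec;
    /* guest output this format refers to */
    uint8_t output_id;
    uint8_t padding1[2];
} StreamMsgFormat;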

> > +} StreamMsgOutputConfig;
> > +
> 
> So the "codec" is common to all the outputs and each output has its
> own dimensions. Makes sense...
> 
> Cheers,
> Lukas
> 
> > -/* This message contains just raw data stream.
> > +/* This message contains just raw data stream for the specified
> > output.
> >   * This message is sent by the guest to the host.
> >   *
> >   * States allowed: Streaming
> >   */
> >  typedef struct StreamMsgData {
> > +    uint8_t output_id;

This won't be necessary if we use the "padding" in the header for the
output ID.

> >      uint8_t data[0];
> >  } StreamMsgData;
> >  
> > 
> > 
> > Jonathon

Frediano

