[Spice-devel] [RFC/POC PATCH 00/16] add output_id to monitors_config

Thu Jun 21 12:40:51 UTC 2018

On Thu, 2018-06-21 at 13:42 +0200, Christophe de Dinechin wrote:
> > On 20 Jun 2018, at 14:14, Lukáš Hrázký <lhrazky at redhat.com> wrote:
> > 
> > On Tue, 2018-06-19 at 21:55 +0200, Christophe de Dinechin wrote:
> > > > On 19 Jun 2018, at 17:54, Jonathon Jongsma <jjongsma at redhat.com> wrote:
> > > > 
> > > > On Tue, 2018-06-19 at 15:46 +0200, Christophe de Dinechin wrote:
> > > > > > On 19 Jun 2018, at 14:11, Frediano Ziglio <fziglio at redhat.com>
> > > > > > wrote:
> > > > > > 
> > > > > > > 
> > > > > > > On Fri, 2018-06-15 at 15:12 +0200, Marc-André Lureau wrote:
> > > > > > > > Hi
> > > > > > > > 
> > > > > > > > On Fri, Jun 15, 2018 at 1:01 PM, Lukáš Hrázký <lhrazky at redhat.c
> > > > > > > > om> wrote:
> > > > > > > > > On Fri, 2018-06-15 at 12:24 +0200, Marc-André Lureau wrote:
> > > > > > > > > > Hi
> > > > > > > > > > 
> > > > > > > > > > On Fri, Jun 15, 2018 at 12:16 PM, Lukáš Hrázký <lhrazky at red
> > > > > > > > > > hat.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > On Thu, 2018-06-14 at 21:12 +0200, Marc-André Lureau
> > > > > > > > > > > wrote:
> > > > > > > > > > > > Hi
> > > > > > > > > > > > 
> > > > > > > > > > > > On Tue, Jun 5, 2018 at 5:30 PM, Lukáš Hrázký <lhrazky at r
> > > > > > > > > > > > edhat.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > 
> > > > > > > > > > > > > following is my attempt at solving the ID issues with
> > > > > > > > > > > > > monitors_config
> > > > > > > > > > > > > and streaming. The concept is as follows:
> > > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > Before introducing a new solution, could you describe
> > > > > > > > > > > > the "issues
> > > > > > > > > > > > with
> > > > > > > > > > > > monitors_config and streaming"? I am not following
> > > > > > > > > > > > closely enough
> > > > > > > > > > > > the
> > > > > > > > > > > > mailing list probably, so a recap would be welcome. I'd
> > > > > > > > > > > > like to
> > > > > > > > > > > > understand the shortcomings of the current messages,
> > > > > > > > > > > > and see if we
> > > > > > > > > > > > are
> > > > > > > > > > > > on the same page about it. Thanks!
> > > > > > > > > > > 
> > > > > > > > > > > Right, sorry about that!
> > > > > > > > > > > 
> > > > > > > > > > > The issue is when having a plugin a for the streaming
> > > > > > > > > > > agent that is
> > > > > > > > > > > using a HW-accelerated encoder. The typical case is that
> > > > > > > > > > > you have a
> > > > > > > > > > > QXL
> > > > > > > > > > > monitor in the guest which shows the boot and then you
> > > > > > > > > > > have a
> > > > > > > > > > > physical
> > > > > > > > > > > HW device (either directly assigned to the VM or a vGPU)
> > > > > > > > > > > which is
> > > > > > > > > > > configured in the X server.
> > > > > > > > > > > 
> > > > > > > > > > > Once the X server starts, the QXL monitor goes blank and
> > > > > > > > > > > you get a
> > > > > > > > > > > second monitor on the client with the streamed content
> > > > > > > > > > > from the
> > > > > > > > > > > streaming agent plugin.
> > > > > > > > > > > 
> > > > > > > > > > > The problem is the code and the protocol assumes (amongst
> > > > > > > > > > > other
> > > > > > > > > > > assumptions) that all monitors are configured in X. The
> > > > > > > > > > > monitors on
> > > > > > > > > > > the
> > > > > > > > > > > client are put into an array and the index is used as the
> > > > > > > > > > > ID
> > > > > > > > > > > throughout
> > > > > > > > > > > SPICE. So for the case above, the QXL monitor is ID 0 and
> > > > > > > > > > > the
> > > > > > > > > > > streamed
> > > > > > > > > > > monitor is ID 1.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Since the "streamed monitor" replaces the QXL monitor,
> > > > > > > > > > wouldn't it
> > > > > > > > > > make sense for them to share the same ID? This would be
> > > > > > > > > > handled by the
> > > > > > > > > > server/guest transparently.
> > > > > > > > > 
> > > > > > > > > Well... for one, it does not entirely replace it. From a user
> > > > > > > > > experience PoV, yes, it makes sense to "replace" one monitor
> > > > > > > > > for the
> > > > > > > > > other. We've even considered switching the content of the
> > > > > > > > > display
> > > > > > > > > channel from QXL to the streamed one. But in the system, they
> > > > > > > > > are
> > > > > > > > > different monitors on different channels, so it's a bad idea
> > > > > > > > > to give
> > > > > > > > > them the same ID. And the ID we're talking about is actually
> > > > > > > > > (for the
> > > > > > > > > monitors config messages):
> > > > > > > > 
> > > > > > > > So if it's the same from use PoV, the client API shouldn't need
> > > > > > > > to change.
> > > > > > > 
> > > > > > > I didn't say it's the same from the user PoV, not entirely. We
> > > > > > > can
> > > > > > > assume he want's to either be looking at the QXL monitor or the
> > > > > > > streamed one, i.e.:
> > > > > > > 
> > > > > > > 1. VM is booting, we are showing QXL monitor
> > > > > > > 2. X started, we switch to showing the streamed monitor
> > > > > > 
> > > > > > Note: this is also due to X limitation not supporting multiple
> > > > > > devices,
> > > > > > on Windows probably you'll have both used at the same time and
> > > > > > probably
> > > > > > user will manually disable QXL (as wants to use the other)
> > > > > > 
> > > > > > > 3. The user switches to a different TTY, we switch back to QXL
> > > > > > > etc...
> > > > > > > 
> > > > > > > This is for the simple case of one QXL monitor and one streamed
> > > > > > > monitor. But we are still "hiding" from the user that his VM
> > > > > > > actually
> > > > > > > has two graphics devices, the QXL one and a physical (or virtual)
> > > > > > > GPU.
> > > > > > > 
> > > > > > > I once advocated the approach to switch the monitors for the
> > > > > > > users, but
> > > > > > > now I think it would actually complicate things more and could
> > > > > > > potentially cause problems.
> > > > > > > 
> > > > > > 
> > > > > > I think the possibility to have the switch would be interesting and
> > > > > > help
> > > > > > but I think the actual protocol is limiting and potentially
> > > > > > complicating
> > > > > > stuff.
> > > > > > 
> > > > > > > > The client only cares about about having a specific set of
> > > > > > > > monitors
> > > > > > > > configuration. The way the server/guest handled them is not its
> > > > > > > > business, it expects the best configuration.
> > > > > > 
> > > > > > True, the client should implement the protocol and server should
> > > > > > use
> > > > > > the protocol as best as it can. This does not however mean that the
> > > > > > current protocol is perfect and using it as it is don't limit us
> > > > > > and make the code a bunch of spaghetti code in order to make
> > > > > > protocol
> > > > > > happy.
> > > > > > 
> > > > > > > 
> > > > > > > That is true, but the monitors configuration needs to reflect the
> > > > > > > actual configuration of the guest. If we mix it up for the
> > > > > > > clients
> > > > > > > convenience, problems... :)
> > > > > > > 
> > > > > > > > So far, we have the current simplified model (ie other more
> > > > > > > > complex
> > > > > > > > combinations are not supported):
> > > > > > > > 
> > > > > > > > qxl device, display channel 0,
> > > > > > > > monitor 0:
> > > > > > > > monitor 1:
> > > > > > > > monitor x..
> > > > > > > > 
> > > > > > > > 
> > > > > > > > or
> > > > > > > > 
> > > > > > > > qxl device, display channel 0,
> > > > > > > > monitor 0:
> > > > > > > > qxl device, display channel 1,
> > > > > > > > monitor 0:
> > > > > > > > ..
> > > > > > > > 
> > > > > > > > Feel free to correct me, I haven't been looking into this for
> > > > > > > > many years
> > > > > > > > now.
> > > > > > > 
> > > > > > > Correct.
> > > > > > > 
> > > > > > 
> > > > > > True. Note that this model however have physical (sockets and
> > > > > > connections)
> > > > > > repercussions which could be hard to maintain.
> > > > > > If we decide to keep the current protocol and have one device we
> > > > > > end up
> > > > > > with a huge surface 0 with all monitors in it having to mux and
> > > > > > unmux streamings
> > > > > > and commands in both server and clients having to implement fair
> > > > > > queueing and
> > > > > > synchronization just to maintain an old design.
> > > > > > Said that this approach would be prohibitive and looking at X
> > > > > > implementation
> > > > > > requiring a single surface this means that if streaming devices are
> > > > > > used we must
> > > > > > limit to one QXL with 1 monitor only and "map" all monitors to
> > > > > > different channels
> > > > > > (with one monitor each).
> > > > > > 
> > > > > > > > > 
> > > > > > > > > server -> client: a pair (channel_id, monitor_id)
> > > > > > > > > 
> > > > > > > > > client -> server: channel_id + monitor_id, converted to an
> > > > > > > > > array index
> > > > > > > > > under the assumption I mentioned
> > > > > > > > > 
> > > > > > > > > So we can't really make it the same even if we wanted.
> > > > > > > > 
> > > > > > > > With multi-head/xrandr qxl:
> > > > > > > > 
> > > > > > > > qxl device, display channel 0,
> > > > > > > > monitor 0:
> > > > > > > > monitor 1:
> > > > > > > > monitor x..
> > > > > > > > gpu device, stream channel 0,
> > > > > > > > maps to display channel 0, monitor 0
> > > > > > > > gpu device, stream channel 1,
> > > > > > > > maps to display channel 0, monitor 1
> > > > > > > > ...
> > > > > > > > 
> > > > > > > > With multihead/xrandr qxl & gpu:
> > > > > > > > qxl device, display channel 0,
> > > > > > > > monitor 0:
> > > > > > > > monitor 1:
> > > > > > > > monitor x..
> > > > > > > > gpu device, stream channel 0,
> > > > > > > > maps to display channel 0, monitor 0
> > > > > > > > maps to display channel 0, monitor 1
> > > > > > > > ...
> > > > > > > > 
> > > > > > > > With multiple QXL devices:
> > > > > > > > 
> > > > > > > > qxl device, display channel 0,
> > > > > > > > monitor 0:
> > > > > > > > qxl device, display channel 1,
> > > > > > > > monitor 0:
> > > > > > > > gpu device, stream channel 0,
> > > > > > > > maps to display channel 0, monitor 0
> > > > > > > > gpu device, stream channel 1,
> > > > > > > > maps to display channel 1, monitor 0
> > > > > > > 
> > > > > > > But, regardless of multihead/multiple devices, there is no
> > > > > > > guarantee
> > > > > > > that the number of QXL monitors is going to match the number of
> > > > > > > GPU
> > > > > > > device monitors. In fact, the most probable scenario is one QXL
> > > > > > > monitor
> > > > > > > and multiple GPU device monitors.
> > > > > > > 
> > > > > > 
> > > > > > This is not an issue, would be just a problem of allocating enough
> > > > > > channels to support all monitors we need.
> > > > > > 
> > > > > > > I don't think it makes sense to have more than one QXL monitor
> > > > > > > anyway.
> > > > > > > And if you did, it makes no sense to map them to GPU device
> > > > > > > monitors
> > > > > > > besides the first one, I guess...
> > > > > > > 
> > > > > > > Also, it is unclear what you mean by the "mapping", i.e. how
> > > > > > > exactly
> > > > > > > would you implement it?
> > > > > > > 
> > > > > > 
> > > > > > Simply to the client you present a channel_id, monitor_id which can
> > > > > > be different on the server.
> > > > > 
> > > > > I don’t know if this was Lukas’ question,  but how would we define
> > > > > this mapping.
> > > > > That was my original question to Marc-André.
> > > > > 
> > > > > > 
> > > > > > > > If you want to support QXL devices that should not be
> > > > > > > > associated with
> > > > > > > > a gpu/stream channel, then you should be able to flag them.
> > > > > > 
> > > > > > Not clear what do you want with this. A QXL device should be either
> > > > > > be
> > > > > > seen by the client or not. You need to flag if seen or not.
> > > > > > 
> > > > > > > > 
> > > > > > > > If you need a manual association of stream channel with
> > > > > > > > qxl/monitor,
> > > > > > > > this could be done with a manual configuration.
> > > > > > 
> > > > > > I don't see a case where I would need a manual configuration, I
> > > > > > mean,
> > > > > > should be configurable within the guest disabling/enabling devices
> > > > > > and/or monitors.
> > > > > 
> > > > > What about the case where you have physical monitors also connected
> > > > > to the VM?
> > > > 
> > > > 
> > > > What are the implications? Why should we care about this scenario?
> > > 
> > > I suspect that it would restrict the possible resolutions.
> > > 
> > > As I understand it, with QXL, resolution changes are free, and can be set by resizing the client side. If we have a streaming card, I assume there are more restrictions to possible resolutions, i.e. we can’t pick resolutions the driver won’t accept. If we have a monitor attached, there is a further restriction that the driver will probably not accept a resolution the monitor does not explicitly list in its EDID data. Isn’t that the reason Frediano uses dongles in order to enable more outputs?
> > > 
> > > So I guess having an external monitor is similar to the VGA case you describe below, except that the available resolutions may depend on which display is physically connected, not on some software setup.
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > > 
> > > > > > > > 
> > > > > > > > Imho, we should avoid making things more complex. Testing is
> > > > > > > > hard
> > > > > > > > enough. We should actually take the simplest approach possible
> > > > > > > > (the
> > > > > > > > same decision we took before, it's already a mess to deal with,
> > > > > > > > as you
> > > > > > > > noticed)
> > > > > > 
> > > > > > You are stating that the current situation is complex and also we
> > > > > > need
> > > > > > to keep things simple. Looks like a bit contradictory.
> > > > > > Maybe the current situation is complex, or feel more complex than
> > > > > > it
> > > > > > should be, for the choices done and current implementation.
> > > > > > From my point of view is just that client need to tell "I want this
> > > > > > monitor this size" and "I'm moving the mouse here on this monitor".
> > > > > > The problem is defining "this monitor" at protocol level from
> > > > > > client
> > > > > > to server, including messages to agent.
> > > > > > 
> > > > > > > 
> > > > > > > I fully agree with the notion :) However, we are already making
> > > > > > > things
> > > > > > > much more complex by adding the GPU device streaming feature. It
> > > > > > > breaks
> > > > > > 
> > > > > > I don't think the problem is related to streaming, more about
> > > > > > devices
> > > > > > which are not QXL.
> > > > > 
> > > > > Technically correct, but at the moment, this h/w streaming is the
> > > > > only case, right?
> > > > > 
> > > > > > With only QXL all old assumptions are easy to maintain,
> > > > > > the problem is the new devices. For instance having QXL 0 as Xrandr
> > > > > > ID 0
> > > > > > and QXL 1 as Xrandr ID 1 is pretty easy, is the same driver and
> > > > > > actually
> > > > > > we also manage (at code level!) the guest device driver. Just
> > > > > > imagine if
> > > > > > the driver changes the order or physical device scanning having QXL
> > > > > > 0 as
> > > > > > Xrandr 1 and QXL 1 as Xrandr 0! This would break the
> > > > > > channel_id+monitor_id
> > > > > > formula. This does not happen as we maintain it, but we (SPICE)
> > > > > > don't
> > > > > > maintain Intel/Nvidia/ATI drivers or Xorg/Wayland/Windows!
> > > > > > 
> > > > > > I'm relative new in this group (3 years) compared to these stuff,
> > > > > > some are possibly not in current git repos, but I think that more
> > > > > > or
> > > > > > less the history is:
> > > > > > - before multimonitor. No much issues the monitor was the monitor;
> > > > > > - start adding multi monitor support, only one device was used. The
> > > > > > problem was only sending the ID of the monitor which was 0, 1, ...
> > > > > > Added a monitor_config message in the SPICE protocol from server
> > > > > > to client. The client could resize the monitors sending a message
> > > > > > to the agent through the server, server was not involved. As there
> > > > > > was only a QXL device and no other devices there was no reason to
> > > > > > send
> > > > > > information for channel/qxl device and monitor IDs matched Xrandr
> > > > > > IDs;
> > > > > > - for some reasons on Windows was not great, somebody decided to
> > > > > > use
> > > > > > multiple QXL devices each with one monitor. As a workaround not
> > > > > > having
> > > > > > the channel in the protocol to the agent the channel_id+monitor_id
> > > > > > formula
> > > > > > was introduced to create an ID fitting into the existing one (for
> > > > > > the
> > > > > > agent);
> > > > > > - somebody realized that on Linux was possible to handle resolution
> > > > > > changes talking directly to the device making possible resize
> > > > > > without
> > > > > > using the agent. This was implemented with a change in device
> > > > > > modes
> > > > > > (rom/ram) and an interrupt to the card handled by our Linux kernel
> > > > > > driver. As on Linux there is only one device, spice-server just
> > > > > > "redirects" the message for the agent sent by the client to the
> > > > > > QXL
> > > > > > card. This works on Windows as Windows driver does not handle this
> > > > > > interrupt/feature (or it would replicate monitor settings on all
> > > > > > cards!).
> > > > > > Possibly some wrong assumptions, please correct me.
> > > > > > Note: is not clear to me the history of the mouse events, currently
> > > > > > spice-server sends messages to the agent for the mouse.
> > > > > 
> > > > > Thanks for the write up. It makes sense to me. If anybody knows this
> > > > > to be wrong, please speak up ;-)
> > > > 
> > > > I'm not sure that it's exactly correct, but I think it's probably
> > > > accurate enough to explain things. The point is that it evolved over
> > > > time. 
> > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > > 
> > > > > > > previous assumptions and adds more stuff on top of it. And I
> > > > > > > think your
> > > > > > > suggestion adds even more complexity than necessary. Perhaps not
> > > > > > > in the
> > > > > > > client, but the "mapping" code, which sounds rather non-trivial,
> > > > > > > needs
> > > > > > > to go somewhere, presumably the server?
> > > > > > > 
> > > > > > 
> > > > > > I think too that bending to the old "design" will do stuff more
> > > > > > complex.
> > > > > > 
> > > > > > > It is actually my intention to add as little complexity as
> > > > > > > possible
> > > > > > > with my patches. The fact is, both the current code and the
> > > > > > > protocol is
> > > > > > > not robust enough and need to be fixed. And even if they were
> > > > > > > robust
> > > > > > 
> > > > > > I would not speak about robustness and fix, from my point of view
> > > > > > looks
> > > > > > more a problem of flexibility. Simply we now have new use cases to
> > > > > > support.
> > > > > > 
> > > > > > > enough, the protocol would need to be extended either way. On top
> > > > > > > of
> > > > > > > that, the backwards compatibility turns the solution into a very
> > > > > > > complex one. But if we ever could deprecate and drop the old code
> > > > > > > paths...
> > > > > > > 
> > > > > > 
> > > > > > I have same sensation, not changing protocol will make thing much
> > > > > > more complicated and possibly limited.
> > > > > > 
> > > > > > > > > > > The more pressing issue with this is that a mouse motion
> > > > > > > > > > > event is
> > > > > > > > > > > sent
> > > > > > > > > > > with a display_id 1 from the client, but when it reaches
> > > > > > > > > > > the guest,
> > > > > > > > > > > the
> > > > > > > > > > > correct monitor is ID 0, as it is the only one.
> > > > > > > > > > > 
> > > > > > > > > > > The other thing this breaks is monitors_config messages
> > > > > > > > > > > that
> > > > > > > > > > > enable/disable displays and change the resolution.
> > > > > > > > > > > Besides the same
> > > > > > > > > > > "shift" happening here as well, another problem is the
> > > > > > > > > > > information
> > > > > > > > > > > that
> > > > > > > > > > > the monitors belong to different devices (or channels) is
> > > > > > > > > > > erased when
> > > > > > > > > > > they're all put into the single array (under the
> > > > > > > > > > > assumption there is
> > > > > > > > > > > either one channel with multiple monitors or multiple
> > > > > > > > > > > channels with
> > > > > > > > > > > one
> > > > > > > > > > > monitor each).
> > > > > > > > > > 
> > > > > > > > > > Correct and so far was a fine solution.
> > > > > > > > > 
> > > > > > > > > I respectfully disagree. Doing the channel_id + monitor_id I
> > > > > > > > > mentioned
> > > > > > > > > above is asking for trouble, which is exactly what we have
> > > > > > > > > now.
> > > > > > > > 
> > > > > > > > With the limitation we decided, we avoid the biggest troubles
> > > > > > > > though.
> > > > > > 
> > > > > > We avoid to change the protocol but to implement the new use cases
> > > > > > we introduce potentially many assumptions and code complexity.
> > > > > > Not saying that is hard to discuss new code thinking about design
> > > > > > which nobody wrote down and that is bound to old code which nobody
> > > > > > fully remember.
> > > > > > 
> > > > > > > 
> > > > > > > I take it you were involved with the original implementation
> > > > > > > then...
> > > > > > > I'm sorry for being so critical, it is not my intention to be
> > > > > > > mean :)
> > > > > > > 
> > > > > > > However, I am not criticising the limitations you chose to
> > > > > > > impose, but
> > > > > > > the implementation. I find two problems here:
> > > > > > > 
> > > > > > > 1. Not keeping the IDs of the monitors_config messages as a
> > > > > > > (channel_id, monitor_id) pair and instead abusing the limitation
> > > > > > > you
> > > > > > > introduced to change it to a singe ID. I sort of consider it a
> > > > > > > hard
> > > > > > > rule you always keep your unique IDs intact and always keep them
> > > > > > > with
> > > > > > > the data. That way they're there when you need them and they work
> > > > > > > (because you didn't mess with them :)).
> > > > > > > 
> > > > > > 
> > > > > > I think this single ID came from the history I wrote above (that
> > > > > > could
> > > > > > be partly wrong).
> > > > > > I personally would add channel_id and monitor_id to the new
> > > > > > messages,
> > > > > > this to make possible to send the correct modes to the card in
> > > > > > every
> > > > > > possible case. Saving few bytes for these messages (which are not
> > > > > > that frequent) is not worth it.
> > > > > 
> > > > > This was my suggestion as well, although I was suggesting putting
> > > > > fields in the existing ID, where one of the fields would be 0 in
> > > > > “compatible mode” and non-zero if we have device_id / monitor_id /
> > > > > channel_id information.
> > > > > 
> > > > > At the time, I suggested that the non-zero field would indicate in
> > > > > which boot phase we were (I called that a “generation”), i.e. if it
> > > > > was expected that we were dealing with QXL or some VGPU.
> > > > > 
> > > > > But it’s perfectly fine to add explicit fields. As long as all
> > > > > elements in the chain can agree on the mapping, that’s fine with me.
> > > > > 
> > > > > 
> > > > > > 
> > > > > > > 2. The change in the IDs and the actual creation of the
> > > > > > > monitors_config
> > > > > > > list leaks over the spice-gtk API to the client application (via
> > > > > > > the
> > > > > > > spice_main_update_display{,_enabled}() function). It is
> > > > > > > unnecessary,
> > > > > > > gives more opportunity to the client app to break it, and any
> > > > > > > extension
> > > > > > > of the monitors_config message requires a spice-gtk API change.
> > > > > > > 
> > > > > > > You could have imposed the limitation you did (I'm not entirely
> > > > > > > sure
> > > > > > > what was the reason) and still make the implementation more
> > > > > > > robust by
> > > > > > > not doing the above...
> > > > > > > 
> > > > > > > And for these reasons this patch series is more complicated that
> > > > > > > it
> > > > > > > could have been (though as I said, the protocol extension, i.e.
> > > > > > > adding
> > > > > > > the output_id) would still be necessary.
> > > > > > > 
> > > > > > > I'm not saying this to pick on you, I'm not even sure how you
> > > > > > > were
> > > > > > > involved (didn't go digging who wrote the code), but more as an
> > > > > > > explanation and so that we all can learn from it for the future
> > > > > > > :)
> > > > > > > 
> > > > > > 
> > > > > > Usually complex code is the result of many passages and possibly
> > > > > > shortcuts, usually involving different people and undocumented
> > > > > > stuff. Picking on people is often unfair.
> > > > > 
> > > > > In all fairness, I think Lukas took a triple dose of carefulness to
> > > > > avoid picking on anybody :-) But I agree we don’t want to ignore
> > > > > Marc-André’s input on the topic, nor do we want to hastily discount
> > > > > Lukas’ analysis of the problem.
> > > > > 
> > > > > The insight I got from Marc-André was: remapping monitors on the fly
> > > > > might be easier and more user-friendly.
> > > > > The answer I read from Lukas was that he initially thought the same,
> > > > > and changed his mind. I’d like to know more.
> > > > > 
> > > > > 
> > > > > > 
> > > > > > > > > > Either you do multiple monitor over a single channel, or
> > > > > > > > > > you use
> > > > > > > > > > multiple
> > > > > > > > > > channels, each with one monitor.
> > > > > > > > > 
> > > > > > > > > That is a limitation you can introduce, perhaps there was a
> > > > > > > > > reason for
> > > > > > > > > it. But the code is taking shortcuts and borderline hacks
> > > > > > > > > because of
> > > > > > > > > this, while doing it right wouldn't be much harder. I'm not
> > > > > > > > > looking to
> > > > > > > > > blame or judge anyone, I don't know how this came into place.
> > > > > > > > > But what
> > > > > > > > > it is now is (IMO) a bad design and I feel the need to say it
> > > > > > > > > after
> > > > > > > > > working on it for quite some time ;)
> > > > > > > > 
> > > > > > > > Doing it differently would lead to push the complexity higher
> > > > > > > > up the
> > > > > > > > stack, perhaps even to the user. We wanted to avoid that.
> > > > > > 
> > > > > > I don't see how this could reach the user.
> > > > > 
> > > > > I think your response focuses on the protocol itself. But we know of
> > > > > user issues related to this problem.
> > > > > 
> > > > > One of them is client-side window management. From a user’s
> > > > > perspective, I think that having a single window with a content that
> > > > > changes form text to graphical as we boot is preferable to having a
> > > > > second window that pops up when streaming starts. But even on that
> > > > > simple “user POV” point, there is a debate. I seem to recall that
> > > > > David, for example, considers that a text-only window is a feature.
> > > > > 
> > > > > Still from a user’s perspective, the real issue we are trying to
> > > > > solve is that the mouse cursor position is bad, even in a single-
> > > > > monitor configuration, as soon as streaming device and QXL device
> > > > > have different resolutions, and that there is no obvious way to
> > > > > automatically transfer resolution from VGPU to QXL. That is by far
> > > > > the highest-priority user-visible problem to solve. The next one down
> > > > > the list is to support multi-montor configurations.
> > > > > 
> > > > > Yet a couple of user POV observations:
> > > > > 
> > > > > - Resizing the guest by resizing the client window is a useful
> > > > > feature, that works well with QXL. How do we want it to behave with a
> > > > > hardware VGPU where they may be resolution constraints?
> > > > 
> > > > 
> > > > What sort of resolution constraints are you referring to? 
> > > 
> > > The ones I outlined earlier. Typically, the GPUs that have actual display outputs will only accept resolutions that correspond to what the monitor (or dongle) supports. For streaming-only GPUs, I’m not sure what the exact limits are, but at the moment, I am not aware that it’s possible to select arbitrary resolution, only one among a few standard ones.
> > > 
> > > > 
> > > > It's worth noting that QXL fully supports resizing because it supports
> > > > arbitrary resolution. But other non-QXL devices (generic VGA etc) also
> > > > support resizing via the vdagent. But they only support a fixed number
> > > > of resolutions. For example, when you resize the client window to WxH,
> > > > we end a new monitor config to the server instructing it to resize the
> > > > display to WxH. This message is then sent to the vdagent, which then
> > > > uses the xrandr APIs to resize the guest to WxH. If the guest driver
> > > > doesn't support WxH resolutions, it will generally get set to the
> > > > nearest resolution that it does support (W'xH'). The server then sends
> > > > a monitor config back to the client informing it that the display has
> > > > changed to W'xH' resolution. The client window will adjust itself to
> > > > match that resolution.
> > > 
> > > So I guess we agree that there would be additional constraints with a physical streaming device, and that it’s unlikely we can get arbitrary resolutions.
> > > 
> > > > 
> > > > So, we do already support some amount of resolution constraints, but
> > > > I'm not sure whether the VGPU constraints you're referring to are
> > > > significantly different?
> > > 
> > > I see no real reason why. I was mentioning that mostly as a sort of user impact. If we redefine the monitor config messages, we might want to add a list of target resolutions, and sending the name from the EDID or something like that.
> > 
> > Correct me if I'm wrong, but I think it was Jonathon's point that we
> > already have the case where only a specific set of resolutions is
> > supported covered, by the mechanism that he described.
> 
> Yes, that’s correct, but that’s not addressing the same kind of issues as EDID-style information would.
> 
> - No display name in title bar
> - No DPI mismatch
> - No color correction mismatch (native gamma)
> - No menu with list of supported resolutions

Ok, true, but that's just going off addressing completely different
issues... I'll let others speak up if they find this worthwhile, but
I'm reluctant to go off on a tangent here trying to cater to future
extensions which I'm afraid I'd end up speculating about the protocol
they need.

But I'm not against adding some padding which can be used in the future
for such stuff (see below, I think you didn't understand my suggestion
there), which I believe is sufficient to not have backwards
compatibility trouble and that's all we really need to do now...?

> > 
> > > From a user perspective, it might be nice to have a menu that shows the monitors by name (e.g. “HP LP2065” or “NVIDIA P4 output #3”, stuff like that, and for each of these outputs, a submenu with the list of resolutions. I don’t think we have enough in the protocol to do that, but as we rework the whole chain, it might be a good idea to go beyond just the basics we already have, and add a name and a list of supported resolutions.
> > 
> > As Jonathon said, atm. this is solved by the vd_agent picking the
> > closest resolution to the one requested (as opposed to your solution of
> > giving the client a list of the resolutions to pick from).
> 
> The mechanism Jonathon described only updates the client with a resolution that is actually supported. It does send a list of supported resolutions, so it does not allow building a menu.

(Should have been "It does _not_ send a list ...")

> > 
> > > > 
> > > > 
> > > > 
> > > > > 
> > > > > - What about the case where we connect to a VM that is also connected
> > > > > to a real monitor? Can’t pick up any resolution in that case. What
> > > > > about identifying the monitors guest-side (e.g. should we included
> > > > > some EDID information in the messages to show them client-side as was
> > > > > once discussed? Forgive me if it’s in the patches and I missed it)
> > > > 
> > > > Maybe the above comment is relevant here. But I still wonder why this is a use case we should even care about?
> > > 
> > > Well, imagine you use SPICE to connect to a remote machine which is, say driving some local monitor and some projector. You might want to easily identify which is which from your virt-viewer. I guess you could probably use the in-guest display utility, which generally offer some kind of “identify function” to achieve that as well. But having the EDID name in the title of the window, for example, would be a nice touch.
> > > 
> > > I would not do that “for the sake of it”. But if we are reworking the whole chain from vd-agent to client, and defining V2 of the montior config message, it might be the right time to add it. Or maybe it’s easy enough to add yet another message for that info.
> > > 
> > > 
> > > > 
> > > > > 
> > > > > - What about the full-screen cases where the resolution is really
> > > > > imposed client-side?
> > > > > 
> > > > > To me, those are all important use cases to take into account in the
> > > > > design. At the moment, I am not sure how well the proposed design
> > > > > addresses these issues. For example, I would add fields in new
> > > > > monitor config (or maybe flags shared between old and new) to share
> > > > > restrictions about window resizing (e.g. “resolution imposed by
> > > > > client”, “resolution imposed by guest”, etc)
> > > > 
> > > > That seems overly complex and unnecessary to me, but perhaps I just
> > > > don't understand the situation you're trying to describe.
> > > 
> > > Did the examples above illustrate well enough?
> > > 
> > > Essentially, thinking more about it, I believe that I’d be happy if the new monitor config could pass the following information, which we may not need right away nor even fill right away, but that would help implement future “cosmetic” features.
> > > 
> > > - Some kind of “output name” in ASCII. I think it’s safe to use a char[16] since that’s display names in EDID.
> > > 
> > > - A list of supported resolutions. From VD agent, allows the client to help switching window sizes e.g. with a menu. From client, allows the VD to select W’xH’ that make sense, e.g. avoid sending 4K data to a client that runs on an HD monitor.
> > > 
> > > - A list of supported refresh rates / frame rates. No point in streaming at 120FPS if local screen only displays 60, and may want to avoid mismatched pairs like 60FPS on 75Hz display to avoid flicker.
> > 
> > I think this case in particular is not for the server -> client
> > message, but for the client -> server message (as you seem to talk
> > about the refresh rate on the client).
> 
> As stated above, most of these messages have use cases in both directions.
> 
> In the specific case of refresh rate,
> 
> - client data gives hints on max useful encoding rate
> 
> - guest data lets client avoid flickering / stroboscopic effects
> 
> 
> > 
> > > - DPI information - I’m really annoyed at the moment that my SPCIE windows are systematically twice too small as soon as the client is on a 4K display. That looks so last century ;-) Would help with scaling the mouse correctly as well.
> > > 
> > > - Preferred aspect ratio information.
> > > 
> > > - And if I want to be really picky, Gamma / color correction information if available. 
> > > 
> > > Again, not suggesting we do all that right now, only suggesting we reserve the fields in the new structure, possibly define them as “0” for now, and that’s it. It’s not complicated to do it now, and it opens a number of possibilities for later.
> > 
> > TBH, I think you are sort of complicating the issue too much :)
> 
> I find this statement overly judgemental, even with a smiley ;-)

I'm sorry :) but it was meant to be :) The smileys are there to express
there are no bad feelings and I mean this in a good natured way, but I
do think so. I think this is big enough without adding more stuff on
top.

> Do you really think it is so complicated to add half a dozen fields to a structure instead of one?
> 
> All I did was look at EDID and select information that might be useful later. It’s a very finite list.

Yes, but for what purpose. This is either pure speculation or I and
reviewers need to go out of their way and think through the intended
use cases, which are more or less vague at this stage.

> > I'm not sure what some of the info would be useful for at all and the other
> > information is mostly for additional features which are non-trivial and
> > a question whether they would ever get implemented.
> 
> If you don’t add the data in the structure, then you probably *decide* that the feature will never be implemented, because the features I listed are too marginal to justify a protocol break by themselves.

Interesting point. But adding the padding I mentioned would solve this?

> > 
> > It's again touching the subject of a more extensible protocol, with
> > which you wouldn't have to try and predict what kind of stuff we'll
> > need for the future. There's a very high chance we'll need something
> > completely different and first thing we would do is to rename the
> > zeroed fields you are suggesting to use them for something else.
> 
> How do you compute the “very high chance”? Are you saying there is a high chance that EDID will change by the time we implement that stuff? Or that we will have more urgent need for monitor related data that will force us to evict EDID fields to make room for something else?
> 
> In any case, I am not advocating *against* an extensible format. That’s what I planned for smart streaming metrics. I’m advocating *for* some additional fixed fields in the new monitor config, that’s all. If you want to tack some future-proof extensible mechanism on top of this, go ahead. But since an extensible protocol is more complicated, you should justify why you can’t say today what kind of data you might need. So back to how you compute the “very high chance”, which I find personally hard to believe without good evidence.

Let's call it "gut feeling" :) I don't have a mathematical definition
for you.

I didn't mean EDID would change or anything, but that when actually
implementing the feature, there is the chance I'd need something in the
protocol which we didn't think about when adding the fields.

> > 
> > For that, we can just add some ~20B zeroed padding (or however much we
> > want) to have a space to use in the future…
> 
> Straw-man. EDID is 128 bytes, not 20MB, and I only talked about part of it.

That was 20B, not 20MB. And since you read it as 20MB, you probably
took this as not being serious, which it was... I seriously suggest to
add an arbitrary zeroed space at the end of the message for future use.
It can be however big you think the extra data you are suggesting would
need.

Cheers,
Lukas

> Cheers,
> Christophe
> 
> > 
> > Cheers,
> > Lukas
> > 
> > > > 
> > > > 
> > > > > 
> > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > Yeah, as I said, not sure what the reason was, but IMO you could
> > > > > > > have
> > > > > > > had that and still make it much more robust.
> > > > > > 
> > > > > > I cannot say a car has a bad design because it does not have the
> > > > > > load
> > > > > > of a big truck, just a different use case. We probably started
> > > > > > using
> > > > > > the car (to continue the metaphor) to load some stuff and we are
> > > > > > now
> > > > > > exceeding the limit. At the beginning we though a car was enough :-
> > > > > > )
> > > > > > 
> > > > > > > 
> > > > > > > Oh and BTW, not sure how old this is, but I'm sure I'd have much
> > > > > > > worse
> > > > > > > things to say about the code _I_ wrote say like 8 years ago ;)
> > > > > > > 
> > > > > > > > > > > Hope this explains it well enough. It's all very complex
> > > > > > > > > > > and goes
> > > > > > > > > > > over
> > > > > > > > > > > multiple interfaces (the network protocols as well as the
> > > > > > > > > > > spice-gtk
> > > > > > > > > > > API), so the more brilliant ideas you'll have, the more
> > > > > > > > > > > welcome they
> > > > > > > > > > > will be :)
> > > > > > > > > > 
> > > > > > > > > > If we can avoid modifying all the layers and exposing the
> > > > > > > > > > inner
> > > > > > > > > > details, I would encourage the exploration of other
> > > > > > > > > > solution.
> > > > > > > > > 
> > > > > > > > > I don't see how we can do that... And the inner details are
> > > > > > > > > already
> > > > > > > > > quite exposed, unfortunately :(
> > > > > > > > > 
> > > > > > > > > As I said, ideas welcome.
> > > > > > > > > 
> > > > > > > > > > > > > 1. The streaming-agent sends a new StreamInfo message
> > > > > > > > > > > > > when it
> > > > > > > > > > > > > starts
> > > > > > > > > > > > > streaming. The message contains the output_id of the
> > > > > > > > > > > > > streamed
> > > > > > > > > > > > > monitor.
> > > > > > > > > > > > > The actual value is the index in the list of xrandr
> > > > > > > > > > > > > outputs.
> > > > > > > > > > > > > Basically,
> > > > > > > > > > > > > the number should uniquely identify a monitor in the
> > > > > > > > > > > > > guest
> > > > > > > > > > > > > context under
> > > > > > > > > > > > > X.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 2. The server copies the number into the
> > > > > > > > > > > > > SpiceMsgDisplayMonitorsConfig
> > > > > > > > > > > > > message and sends it to the client as part of the
> > > > > > > > > > > > > monitors_config
> > > > > > > > > > > > > exchange.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 3. The client stores the output_id in it's list of
> > > > > > > > > > > > > monitor_configs. Here
> > > > > > > > > > > > > I had to make significant changes, because the
> > > > > > > > > > > > > monitors_config
> > > > > > > > > > > > > code in
> > > > > > > > > > > > > spice-gtk is very loosely written and also exposes
> > > > > > > > > > > > > quite a few
> > > > > > > > > > > > > unnecessary details to the client app. The client
> > > > > > > > > > > > > sends the
> > > > > > > > > > > > > output_id
> > > > > > > > > > > > > back to the server as part of the monitors_config
> > > > > > > > > > > > > messages
> > > > > > > > > > > > > (VDAgentMonitorsConfigV2) and also uses it for the
> > > > > > > > > > > > > mouse motion
> > > > > > > > > > > > > event,
> > > > > > > > > > > > > which also contains a display ID interpreted by the
> > > > > > > > > > > > > vd_agent. In
> > > > > > > > > > > > > the
> > > > > > > > > > > > > end, the API/ABI towards the client app should remain
> > > > > > > > > > > > > unchanged.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 4. The server passes the output_id in above-mentioned 
> > > > > > > > > > > > > messages to
> > > > > > > > > > > > > the
> > > > > > > > > > > > > vd_agent. The output_id is meaningless in the server
> > > > > > > > > > > > > context.
> > > > > > > > > > > > > (Currently, it doesn't pass the monitors_config
> > > > > > > > > > > > > messages if there
> > > > > > > > > > > > > is a
> > > > > > > > > > > > > QXL device that supports it, though. Needs more
> > > > > > > > > > > > > work.)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 5. vd_agent:
> > > > > > > > > > > > > a) For mouse input, the output_id was passed in the
> > > > > > > > > > > > > original
> > > > > > > > > > > > > message,
> > > > > > > > > > > > > so no change needed here, it works.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > b) If the server sends monitors_config to the guest,
> > > > > > > > > > > > > the
> > > > > > > > > > > > > vdagent will
> > > > > > > > > > > > > prefer to use monitors_configs with the output_ids
> > > > > > > > > > > > > set, if such
> > > > > > > > > > > > > are
> > > > > > > > > > > > > present. In that case, it ignores monitors_configs
> > > > > > > > > > > > > with the
> > > > > > > > > > > > > output_id
> > > > > > > > > > > > > unset. If no output_ids are present, it should
> > > > > > > > > > > > > behave as it
> > > > > > > > > > > > > used to.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > A couple of things to note:
> > > > > > > > > > > > > - While I did copy the VDAgentMonitorsConfig to a
> > > > > > > > > > > > > different
> > > > > > > > > > > > > message for
> > > > > > > > > > > > > backwards compatibility, I didn't do the same for
> > > > > > > > > > > > > SpiceMsgDisplayMonitorsConfig, it remains to be done.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > - I didn't introduce any capabilities to handle the
> > > > > > > > > > > > > compatibility, also
> > > > > > > > > > > > > remains to be done. Hopefully it is also clear it
> > > > > > > > > > > > > will be quite a
> > > > > > > > > > > > > non-trivial job to do that :( Ain't gonna make the
> > > > > > > > > > > > > code prettier
> > > > > > > > > > > > > either.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > For your convenience, you can also pull the branches
> > > > > > > > > > > > > below:
> > > > > > > > > > > > > https://gitlab.freedesktop.org/lukash/spice-protocol/
> > > > > > > > > > > > > tree/monitors-config-poc
> > > > > > > > > > > > > https://gitlab.freedesktop.org/lukash/spice-common/tr
> > > > > > > > > > > > > ee/monitors-config-poc
> > > > > > > > > > > > > https://gitlab.com/lhrazky/spice-streaming-agent/tree
> > > > > > > > > > > > > /monitors-config-poc
> > > > > > > > > > > > > https://gitlab.freedesktop.org/lukash/spice/tree/moni
> > > > > > > > > > > > > tors-config-poc
> > > > > > > > > > > > > https://gitlab.freedesktop.org/lukash/spice-gtk/tree/
> > > > > > > > > > > > > monitors-config-poc
> > > > > > > > > > > > > https://gitlab.freedesktop.org/lukash/vd_agent/tree/m
> > > > > > > > > > > > > onitors-config-poc
> > > > > > > > > > > > > 
> > > > > > > > > > > > > All in all, it's big, complex and invasive. Note I
> > > > > > > > > > > > > did review the
> > > > > > > > > > > > > emergency instructional video [1] and am therefore
> > > > > > > > > > > > > ready for any
> > > > > > > > > > > > > bombardment you'll be sending my way :D (Don't
> > > > > > > > > > > > > hesitate to
> > > > > > > > > > > > > contact me
> > > > > > > > > > > > > with questions either)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Last minute note: I've noticed some of the patches
> > > > > > > > > > > > > are missing
> > > > > > > > > > > > > Signed-off-by line, since they are not for merging,
> > > > > > > > > > > > > should not be
> > > > > > > > > > > > > an
> > > > > > > > > > > > > issue...
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Lukáš Hrázký (16):
> > > > > > > > > > > > > spice-protocol
> > > > > > > > > > > > > Add the StreamInfo message
> > > > > > > > > > > > > Create a version 2 of the VDAgentMonitorsConfig
> > > > > > > > > > > > > message
> > > > > > > > > > > > > spice-common
> > > > > > > > > > > > > add output_id to SpiceMsgDisplayMonitorsConfig
> > > > > > > > > > > > > spice-streaming-agent
> > > > > > > > > > > > > Send a StreamInfo message when starting to stream
> > > > > > > > > > > > > spice-server
> > > > > > > > > > > > > Handle the StreamInfo message from the streaming
> > > > > > > > > > > > > agent
> > > > > > > > > > > > > Use VDAgentMonitorsConfigV2 that contains the
> > > > > > > > > > > > > output_id field
> > > > > > > > > > > > > spice-gtk
> > > > > > > > > > > > > Rework the handling of monitors_config
> > > > > > > > > > > > > Remove the n_display_channels check when sending
> > > > > > > > > > > > > monitors_config
> > > > > > > > > > > > > Use an 'enabled' flag instead of status enum in
> > > > > > > > > > > > > monitors_config
> > > > > > > > > > > > > Use VDAgentMonitorsConfigV2 as the message for
> > > > > > > > > > > > > monitors_config
> > > > > > > > > > > > > Add output_id to the monitors_config
> > > > > > > > > > > > > Use the new output_id as display ID for the mouse
> > > > > > > > > > > > > motion event
> > > > > > > > > > > > > vd_agent
> > > > > > > > > > > > > vdagent: Log the diddly doo X11 error
> > > > > > > > > > > > > Improve/add some logging messages
> > > > > > > > > > > > > Use VDAgentMonitorsConfigV2 instead of
> > > > > > > > > > > > > VDAgentMonitorsConfig
> > > > > > > > > > > > > vdagent: Use output_id from VDAgentMonitorsConfigV2
> > > > > > > > > > > > > 
> > > > > > > > > > > > > [1] https://www.youtube.com/watch?v=IKqXu-5jw60
> > > > > > > > > > > > > 
> > > > > > 
> > > > > > Frediano
> > > > > > _______________________________________________
> > > > > > Spice-devel mailing list
> > > > > > Spice-devel at lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/spice-devel
> > > > > 
> > > > > _______________________________________________
> > > > > Spice-devel mailing list
> > > > > Spice-devel at lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/spice-devel
> > > > 
> > > > _______________________________________________
> > > > Spice-devel mailing list
> > > > Spice-devel at lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/spice-devel
> > > 
> > > _______________________________________________
> > > Spice-devel mailing list
> > > Spice-devel at lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/spice-devel
> 
>